From 5e6902be73485b227606649325567db50f496be9 Mon Sep 17 00:00:00 2001 From: Suzanne Duce Date: Fri, 13 May 2016 09:10:41 +0000 Subject: [PATCH] Added annotation sections and made few minor changes. --- TheJalviewTutorial.tex | 284 +++++++++++++++++++++++++++++++++++------------- 1 file changed, 211 insertions(+), 73 deletions(-) diff --git a/TheJalviewTutorial.tex b/TheJalviewTutorial.tex index 231b364..5bcbf97 100644 --- a/TheJalviewTutorial.tex +++ b/TheJalviewTutorial.tex @@ -105,7 +105,7 @@ Manual Version 1.6 % post CLS lifesci course on 15th January % draft. Remaining items are AACon, RNA visualization/editing and Protein disorder analysis exercises. -18th February 2016 +12th May 2016 \end{center} @@ -151,7 +151,7 @@ colouring.} and includes a javascript API to allow customisable display of alignments for web sites such as Pfam.\footnote{\url{http://pfam.xfam.org}} -Jalview 2.8.2 was released in December 2014. The Jalview Desktop in this version +The Jalview Desktop in this version provides access to protein and nucleic acid sequence, alignment and structure databases, and includes the Jmol\footnote{ Provided under the LGPL licence at \url{http://www.jmol.org}} viewer for molecular structures, and the VARNA\footnote{Provided under GPL licence at \url{http://varna.lri.fr}} program for the visualization of RNA secondary structure. A @@ -188,7 +188,8 @@ between Jalview, TOPALi and AstexViewer.} in 2004, enabling Andrew Waterhouse an Jim Procter to re-engineer the original program to introduce contemporary developments in bioinformatics and take advantage of the latest web and Java technology. Jalview's development has been supported from 2009 -by awards from the BBSRC's Tools and Resources fund, and, since 2014, a Wellcome Trust Biomedical Resource grant\footnote{Wellcome grant number 101651/Z/13/Z}. 
In 2010, 2011, and 2012, Jalview benefitted from the +onwards by BBSRC funding, and since 2014 by a +Wellcome Trust Biomedical Resource grant\footnote{Wellcome grant number 101651/Z/13/Z}. In 2010, 2011, and 2012, Jalview benefitted from the \href{http://code.google.com/soc/}{Google Summer of Code}, when Lauren Lui and Jan Engelhardt introduced new features for handling RNA alignments and secondary structure annotation, in collaboration with Yann Ponty.\footnote{\url{http://www.lix.polytechnique.fr/~ponty/}} @@ -209,19 +210,20 @@ Jalview Java alignment editor"} \newline Michele Clamp, James Cuff, Stephen M. S This tutorial is written in a manual format with short exercises where appropriate, typically at the end of each section. This chapter concerns the basic operation of Jalview and should be sufficient for those who want to -load Jalview (Section \ref{startingjv}), open an alignment (Section \ref{loadingseqs}), perform basic editing and colouring (Section \ref{selectingandediting} and Section \ref{colours}), and produce publication -and presentation quality graphical output (Section \ref{layoutandoutput}). - -Chapter \ref{analysisannotation} covers the additional visualization and -analysis techniques that Jalview provides. This includes working with the -embedded Jmol molecular structure viewer, building and viewing trees and PCA +launch Jalview (Section \ref{startingjv}), open an alignment (Section +\ref{loadingseqs}), perform basic editing (Section +\ref{selectingandediting}), colouring (Section \ref{colours}), and produce +publication and presentation quality graphical output (Section \ref{layoutandoutput}). + +In addition, the manual covers the additional visualization and +analysis techniques available in Jalview. This includes working +with the embedded Jmol molecular structure viewer, building and viewing trees and PCA plots, and using trees for sequence conservation analysis. 
An overview of the Jalview Desktop's webservices is given in Section \ref{jvwebservices}, and the alignment and secondary structure prediction services are described -in detail in Sections \ref{msaservices} and \ref{protsspredservices}. Following -this, Section \ref{featannot} details the creation and visualization of sequence +in detail in Sections \ref{msaservices} and \ref{protsspredservices}. Section \ref{featannot} details the creation and visualization of sequence and alignment annotation, and the retrieval of sequences and annotation from -databases and DAS Servers. Finally, Section \ref{workingwithnuc} discusses +databases and DAS Servers. Section \ref{workingwithnuc} discusses specific features of use when working with nucleic acid sequences, such as translation and linking to protein coding regions, and the display and analysis of RNA secondary structure. @@ -235,7 +237,7 @@ Keystrokes using the special non-symbol keys are represented in the tutorial by enclosing the pressed keys with square brackets ({\em e.g.} [RETURN] or [CTRL]). Keystroke combinations are combined with a `-' symbol ({\em e.g.} [CTRL]-C means -press [CTRL] and the `C' key). +press [CTRL] and the `C' key) simultaneously. Menu options are given as a path from the menu that contains them - for example {\sl File $\Rightarrow$ Input Alignment @@ -276,7 +278,7 @@ These links will launch the latest stable release of Jalview.\par When the application is launched with webstart, two dialogs may appear before the application starts. If your browser is not set up to handle webstart, then clicking the launch link may download a file that needs to be opened -manually, or prompt you to select the correct program to handle the webstart +manually, or prompt you to select the program to handle the webstart file. 
If that is the case, then you will need to locate the {\bf javaws} program on your system\footnote{The file that is downloaded will have a type of {\bf application/x-java-jnlp-file} or {\bf .jnlp}. The {\bf javaws} program that can run @@ -415,7 +417,8 @@ The major features of the Jalview Desktop are illustrated in Figure \ref{anatomy where editing and navigation are performed using the keyboard. The {\bf F2 key} is used to switch between these two modes. With a Mac as the F2 is often assigned to screen brightness, one may often need to type {\bf function - [Fn] key with F2}. + [Fn] key with F2} ({\em i.e.} + [Fn]-F2). \begin{figure}[htb] \begin{center} @@ -563,7 +566,7 @@ URL directly. \subsection{From a File} Jalview can read sequence alignments from a sequence alignment file. This is a -text file, not a word processor document. For entering sequences from a +text file, {\bf not} a word processor document. For entering sequences from a wordprocessor document see Cut and Paste (Section \ref{cutpaste}) below. Select {\sl File $\Rightarrow$ Input Alignment $\Rightarrow$ From File} from the main menu (Figure \ref{loadfile}). You will then get a file selection window where @@ -1324,7 +1327,8 @@ static residue schemes are modified using a dynamic scheme. The individual schem \subsection{Colouring a Group or Selection} Selections or groups can be coloured in two ways. The first is {\sl via} the Alignment Window's {\sl Colour} menu as stated above, - after first ensuring that the {\sl Apply Colour To All Groups} flag is not selected. + after first ensuring that the {\sl Apply Colour To All Groups} flag is {\bf + not} selected. This must be turned {\sl off} specifically as it is {\sl on} by default. When unticked, selections from the Colours menu will only change the colour for residues in the current selection, or the alignment view's ``background colourscheme'' when no selection exists.
@@ -1695,19 +1699,153 @@ Photoshop, Illustrator, Inkscape, Ghostview, Powerpoint (Windows), or Preview (Mac OS X). Zoom in and note that the image has near-infinite resolution.} } -\chapter{Features and Annotation} -\label{annotation} -\section{Features and Annotation} +\chapter{Annotation and Features} \label{featannot} Features and annotations are additional information that is overlaid on the sequences and the alignment. Generally speaking, annotations are associated with columns in the alignment. Features are associated with specific residues in the sequence. -Annotations are shown below the alignment in the annotation panel, and often reflect properties of the alignment as a whole. The Conservation, Consensus and Quality scores are examples of dynamic annotation, so as the alignment changes, they change along with it. Conversely, sequence features are properties of the individual sequences, so they do not change with the alignment, but are shown mapped on to specific residues within the alignment. +Annotations are shown below the alignment in the annotation panel, and often reflect properties of the alignment as a whole. +Conversely, sequence features are properties of the individual sequences, so they do not change with the alignment, +but are shown mapped on to specific residues within the alignment. Features and annotation can be interactively created, or retrieved from external data sources. DAS (the Distributed Annotation System) is the primary source of sequence features, whilst webservices like JNet (see \ref{jpred} above) can be used to analyse a given sequence or alignment and generate annotation for it. + +\section{Conservation, Quality and Consensus Annotation} +\label{annotationintro} +Jalview automatically calculates several quantitative alignment annotations +which are displayed as histograms below the multiple sequence alignment columns.
+Conservation, quality and consensus scores are examples of dynamic +annotation, so as the alignment changes, they change along with it. +The scores can be used in the hybrid colouring options to shade the alignments. +Mousing over a conservation histogram reveals a tooltip with more information. + +These annotations can be hidden and deleted via the context menu linked to the +annotation row; but they are only created on loading an alignment. If they are +deleted then the alignment should be saved and then reloaded to restore them. +Jalview provides a toggle to autocalculate a consensus sequence upon editing. This is normally selected by default, but can be turned off for +large alignments {\sl via} the {\sl Calculate $\Rightarrow$ Autocalculate +Consensus} menu option if the interface is too slow. + +\subsubsection{Conservation Annotation} + +Alignment conservation annotation is a quantitative numerical index reflecting the +conservation of the physico-chemical properties for each column of the alignment. +The calculation is based on the AMAS method of multiple sequence alignment analysis (Livingstone C.D. and Barton G.J. (1993) CABIOS Vol. 9 No. 6 p745-756), +with identities scoring highest, and substitutions within the same physico-chemical class scoring next highest. +The score for each column is shown below the histogram. +The conserved columns with a score of 11 are indicated by '*'. +Columns with a score of 10, which have mutations but in which all properties are conserved, are marked with a '+'. + +\subsubsection{Consensus Annotation} + +Alignment consensus annotation reflects the percentage of the different residues +in each column. By default this calculation includes gaps in columns; gaps can be ignored via the Consensus label context +menu to the left of the consensus bar chart. +The consensus histogram can be overlaid +with a sequence logo that reflects the symbol distribution at each column of +the alignment.
Right click on the Consensus annotation row and select the {\sl Show +Logo} option to display the Consensus profile for the group or alignment. +Sequence logos can be enabled by default for all new alignments {\sl via} the +Visual tab in the Jalview desktop's preferences dialog box. + +\subsubsection{Quality Annotation} + +Alignment quality annotation is an ad-hoc measure of the likelihood of observing +the mutations (if any) in a particular column of the alignment. The quality score is calculated for each column in an alignment by summing, +for all mutations, the ratio of the two BLOSUM 62 scores for a mutation pair and each residue's conserved BLOSUM62 score (which is higher). +This value is normalised for each column, and then plotted on a scale from 0 to 1. + +\subsubsection{Group Associated Annotation} +\label{groupassocannotation} +Group associated consensus and conservation annotation rows reflect the +sequence variation within a particular group. Their calculation is enabled +by selecting the {\sl Group Conservation} or {\sl Group Consensus} options in +the {\sl Annotation $\Rightarrow$ Autocalculated Annotation } submenu of the +alignment window. + +\subsection{Creating User Defined Annotation} + +Annotations are properties that apply to the alignment as a whole and are visualized on rows in the annotation panel. +To create a new annotation row, right click on the annotation label panel and select the {\sl Add New Row} menu option (Figure \ref{newannotrow}). +A dialogue box appears. Enter the label to use for this row and a new row will appear. + +To create a new annotation, first select all the positions to be annotated on the appropriate row. +Right-clicking on this selection brings up the context menu which allows the insertion of graphics for secondary structure ({\sl Helix} or {\sl Sheet}), +text {\sl Label} and the colour in which to present the annotation (Figure \ref{newannot}). 
On selecting {\sl Label} a dialogue box will appear, +requesting the text to place at that position. After the text is entered, the selection can be removed and the annotation becomes clearly +visible\footnote{When annotating a block of positions, the text can be partly obscured by the selection highlight. Pressing the [ESC] key clears +the selection and the label is then visible.}. Annotations can be coloured or deleted as desired. + +\begin{figure}[htbp] +\begin{center} +\includegraphics[width=1.3in]{images/annots1.pdf} +\includegraphics[width=2in]{images/annots2.pdf} +\caption{{\bf Creating a new annotation row.} Annotation rows can be reordered by dragging them to the desired place.} +\label{newannotrow} +\end{center} +\end{figure} + +\begin{figure}[htbp] +\begin{center} +\includegraphics[width=2in]{images/annots3.pdf} +\includegraphics[width=2in]{images/annots4.pdf} +\includegraphics[width=2in]{images/annots5.pdf} +\caption{{\bf Creating a new annotation.} Annotations are created from a selection on the annotation row and can be coloured as desired.} +\label{newannot} +\end{center} +\end{figure} + +\exercise{Annotating Alignments}{ +\exstep{Load the alignment at \textsf{http://www.jalview.org/tutorial/alignment.fa}. +Right-click on the {\sl Conservation} annotation row to +bring up the context menu and select {\sl Add New Row}. A dialogue box will appear asking for {\sl Label for annotation}. +Enter ``Iron binding site'' and click OK. A new, empty, row appears. +} +\exstep{ +Navigate to column 97. Move down and, on the new annotation row called +``Iron binding site'', select column 97. +Right click on this selection and select {\sl Label} from the context menu. +Enter ``Fe'' in the box and click OK. Right-click on the selection again and select {\sl Colour}. +Choose a colour from the colour chooser dialogue +and click OK. Press [ESC] to remove the selection. +} +\exstep{ Select columns 70-77 on the annotation row.
Right-click and choose {\sl Sheet} from the + context menu. You will be prompted for a label. Enter ``B'' and press OK. A new line showing the + sheet as an arrow appears. The colour of the label can be changed but not the colour of the sheet + arrow. +} +\exstep{Right click on the title text of the annotation row that you just created. +Select {\sl Export Annotation} and, in the {\bf Export Annotation} dialog box that will open, select the Jalview format and click +the [To Textbox] button. + +The format for this file is given in the Jalview help. Press [F1] to open it, and find +the ``Annotations File Format'' entry in the ``Alignment Annotations'' section of the contents +pane. } + +\exstep{Export the file to a text editor and edit the file to change the name of the annotation +row. Save the file and drag it onto the alignment view.} +\exstep{Try to add an additional helix somewhere along the row by editing the file and +re-importing it. +{\sl Hint: Use the {\bf Export Annotation} function to view what helix annotation looks like in +a Jalview annotation file.}} +\exstep{Use the {\sl Alignment Window $\Rightarrow$ File $\Rightarrow$ Export Annotations...} +function to export all the alignment's annotation to a file.} +\exstep{Open the exported annotation in a text editor, and use the {\bf Annotation File Format} +documentation to modify the style of the Conservation, Consensus and Quality annotation rows so +they appear as several lines on a single line graph. +{\sl Hint: You need to change the style of the annotation row in the first field of the annotation +row entry in the file, and create an annotation row grouping to overlay the three quantitative +annotation rows.} +} +\label{viewannotfileex}\exstep{Recover or recreate the secondary structure +prediction that you made in exercise \ref{secstrpredex}. Use the {\sl File $\Rightarrow$ Export +Annotation} function to view the Jnet secondary structure prediction annotation row.
Note the +{\bf SEQUENCE\_REF} statements surrounding the row specifying the sequence association for the +annotation. } } + \section{Importing Features from Databases} \label{featuresfromdb} Jalview supports feature retrieval from public databases either directly or {\sl @@ -2145,12 +2283,12 @@ Web Service $\Rightarrow$ Alignment $\Rightarrow$ \ldots} submenu (Figure default settings, use one of the available presets, or customise the parameters with the `{\sl Edit and Run ..}' dialog box. Once the job is submitted, a progress window will appear giving information about the job and any errors that -occur. After successful completion of the job, a new window is opened with the -results, in this case an alignment. By default, the new alignment will be -ordered in the same way as the input sequences; however, many alignment programs -re-order the input to place homologous sequences close together. This ordering -can be recovered using the `Original ordering' entry within the {\sl Calculate -$\Rightarrow$ Sort } sub menu. +occur. After successful completion of the job, a new alignment window is opened +with the results, in this case an alignment. By default, the new alignment will be +ordered in the same way as the input sequences. Note: many alignment +programs re-order the input to place homologous sequences close together; the +original ordering can be recovered using the `Original ordering' entry within +the {\sl Calculate $\Rightarrow$ Sort } sub menu. \begin{figure}[htbp] \begin{center} @@ -2378,7 +2516,7 @@ Jalview provides support for sequence analysis in two ways. A number of analytical methods are `built-in', these are accessed from the {\sl Calculate} alignment window menu. Computationally intensive analyses are run outside Jalview {\sl via} web services - these are typically accessed {\sl via} the {\sl -Web Service} menu, and described in \ref{jvwebservices} and subsequent sections. +Web Service} menu, and described in chapter \ref{jvwebservices}.
In this section, we describe the built-in analysis capabilities common to both the Jalview Desktop and the JalviewLite applet. @@ -2646,30 +2784,30 @@ Consensus} from the pop-up context menu located with consensus annotation row. Quality is a measure of the inverse likelihood of unfavourable mutations in the alignment. Further details on these calculations can be found in the on-line documentation. -These annotations can be hidden and deleted via the context menu linked to the -annotation row; but they are only created on loading an alignment. If they are -deleted then the alignment should be saved and then reloaded to restore them. -Jalview provides a toggle to autocalculate a consensus sequence upon editing. This is normally selected by default, but can be turned off for -large alignments {\sl via} the {\sl Calculate $\Rightarrow$ Autocalculate -Consensus} menu option if the interface is too slow. - -\subsubsection{Group Associated Annotation} -\label{groupassocannotation} -Group associated consensus and conservation annotation rows reflect the -sequence variation within a particular group. Their calculation is enabled -by selecting the {\sl Group Conservation} or {\sl Group Consensus} options in -the {\sl Annotation $\Rightarrow$ Autocalculated Annotation } submenu of the -alignment window. - -\subsubsection{Alignment and Group Sequence Logos} -\label{seqlogos} - -The consensus annotation row that is shown below the alignment can be overlaid -with a sequence logo that reflects the symbol distribution at each column of -the alignment. Right click on the Consensus annotation row and select the {\sl Show -Logo} option to display the Consensus profile for the group or alignment. -Sequence logos can be enabled by default for all new alignments {\sl via} the -Visual tab in the Jalview desktop's preferences dialog box. +% These annotations can be hidden and deleted via the context menu linked to the +% annotation row; but they are only created on loading an alignment. 
If they are +% deleted then the alignment should be saved and then reloaded to restore them. +% Jalview provides a toggle to autocalculate a consensus sequence upon editing. +% This is normally selected by default, but can be turned off for large alignments {\sl via} the {\sl Calculate $\Rightarrow$ Autocalculate +% Consensus} menu option if the interface is too slow. + +% \subsubsection{Group Associated Annotation} +% \label{groupassocannotation} +% Group associated consensus and conservation annotation rows reflect the +% sequence variation within a particular group. Their calculation is enabled +% by selecting the {\sl Group Conservation} or {\sl Group Consensus} options in +% the {\sl Annotation $\Rightarrow$ Autocalculated Annotation } submenu of the +% alignment window. + +% \subsubsection{Alignment and Group Sequence Logos} +% \label{seqlogos} + +% The consensus annotation row that is shown below the alignment can be overlaid +% with a sequence logo that reflects the symbol distribution at each column of +% the alignment. Right click on the Consensus annotation row and select the {\sl +% Show Logo} option to display the Consensus profile for the group or alignment. +% Sequence logos can be enabled by default for all new alignments {\sl via} the +% Visual tab in the Jalview desktop's preferences dialog box. \exercise{Group Conservation Analysis}{ \exstep{Re-use or recreate the alignment and tree which you worked with in the @@ -2722,32 +2860,32 @@ Jalview can calculate optimal pairwise alignments between arbitrary sequences {\ \label{3Dstructure} -This chapter describes the annotation, analysis, and visualization tasks that -the Jalview Desktop can perform. - -Section \ref{wkwithstructure} introduces the structure visualization -capabilities of Jalview. 
In Section \ref{alignanalysis}, you will find -descriptions and exercises on building and displaying trees, PCA analysis, -alignment redundancy removal, pairwise alignments and alignment conservation -analysis. Section \ref{jvwebservices} introduces the various web based services -available to Jalview users, and Section \ref{jabaservices} explains how to -configure the Jalview Desktop for access to new JABAWS servers. +To summarise, section \ref{featannot} describes the mechanisms provided by +Jalview for interactive creation of sequence and alignment annotation, and how they can be +displayed, imported and exported and used to reorder the alignment. Section +\ref{featuresfromdb} discusses the retrieval of database references and +establishment of sequence coordinate systems for the retrieval and display of +features from databases and DAS annotation services. Section \ref{msaservices} describes how to use the range of multiple alignment programs provided by JABAWS, and Section \ref{aacons} introduces JABAWS AACon service for protein multiple alignment conservation analysis. + In Section \ref{alignanalysis}, you will find +descriptions and exercises on building and displaying trees, PCA analysis, +alignment redundancy removal, pairwise alignments and alignment conservation +analysis. +Section \ref{wkwithstructure} introduces the structure visualization +capabilities of Jalview. Section \ref{protsspredservices} explains how to perform protein secondary structure predictions with JPred, and JABAWS protein disorder prediction services are introduced in Section \ref{protdisorderpred}. - -Section \ref{featannot} describes the mechanisms provided by Jalview for -interactive creation of sequence and alignment annotation, and how they can be -displayed, imported and exported and used to reorder the alignment. 
Section -\ref{featuresfromdb} discusses the retrieval of database references and -establishment of sequence coordinate systems for the retrieval and display of -features from databases and DAS annotation services. Section -\ref{workingwithnuc} describes functions and visualization techniques relevant +Section \ref{workingwithnuc} describes functions and visualization techniques relevant to working with nucleotide sequences, coding region annotation and nucleotide sequence alignments. +Section \ref{jvwebservices} introduces the various web based services +available to Jalview users, and Section \ref{jabaservices} explains how to +configure the Jalview Desktop for access to new JABAWS servers. + + % and Section \ref{workingwithrna} covers the visualization, % editing and analysis of RNA secondary structure. @@ -3246,8 +3384,8 @@ Each service operates on sequences in the alignment to identify regions likely to be unstructured or flexible, or alternately, fold to form globular domains. As a consequence, disorder predictor results include both sequence features and sequence associated alignment annotation rows. Section \ref{featannot} describes -the manipulation and display of these data in detail, and {\bf Figure -\ref{alignmentdisorder}} demonstrates how sequence feature shading and +the manipulation and display of these data in detail, and Figure +\ref{alignmentdisorder} demonstrates how sequence feature shading and thresholding (described in Section \ref{featureschemes}) can be used to highlight differences in disorder prediction across aligned sequences. @@ -3291,7 +3429,7 @@ alignment.} } \subsubsection{Navigating Large Sets of Disorder Predictions} -{\bf Figure \ref{alignmentdisorderannot}} shows a single sequence annotated with +Figure \ref{alignmentdisorderannot} shows a single sequence annotated with a range of disorder predictions. Disorder prediction annotation rows are associated with a sequence in the same way as secondary structure prediction results. 
When browsing an alignment containing large numbers of disorder -- 1.7.10.2