From 54e22cdbc8b5102a04423f3d8321021ddbf586fb Mon Sep 17 00:00:00 2001 From: Suzanne Duce Date: Fri, 24 Apr 2015 11:16:22 +0000 Subject: [PATCH] Changes added after the London 2014 workshop --- TheJalviewTutorial.tex | 253 ++++++++++++++++++++++++++---------------------- 1 file changed, 139 insertions(+), 114 deletions(-) diff --git a/TheJalviewTutorial.tex b/TheJalviewTutorial.tex index 52e2343..3f468dc 100644 --- a/TheJalviewTutorial.tex +++ b/TheJalviewTutorial.tex @@ -100,11 +100,11 @@ Dundee, Scotland DD1 5EH, UK \vspace{2in} -Manual Version 1.5.0 +Manual Version 1.5.1 % post CLS lifesci course on 15th January % draft. Remaining items are AACon, RNA visualization/editing and Protein disorder analysis exercises. -12th December 2014 +24th April 2015 \end{center} @@ -147,8 +147,8 @@ visualization, editing and analysis capabilities as the desktop, without the desktop's webservice and figure generation capabilities. It is designed to be embedded in a web page,\footnote{A demonstration version of Jalview (Jalview Micro Edition) also runs on a mobile phone but the functionality is limited to sequence -colouring.} and includes a javascript API to allow customisable display of alignments for web sites such as -{\bf Pfam}.\footnote{\url{http://pfam.xfam.org}} +colouring.} and includes a javascript API to allow customisable display of +alignments for web sites such as Pfam.\footnote{\url{http://pfam.xfam.org}} Jalview 2.8.2 was released in December 2014. 
The Jalview Desktop in this version @@ -335,7 +335,7 @@ when new articles are available from the Jalview Desktop's news channel.} \end{figure} -\exercise{Launching Jalview from the Jalview website}{ +\exercise{Launching Jalview from the Jalview Website}{ \label{start} \exstep{Open the Jalview web site \href{http://www.jalview.org}{(www.jalview.org)} @@ -380,7 +380,7 @@ the website.}} \subsection{Getting Help} \label{gettinghelp} -\subsubsection{Built in documentation} +\subsubsection{Built in Documentation} Jalview has comprehensive on-line help documentation. Select {\sl Help $\Rightarrow$ Documentation} from the main window menu and a new window will open (Figure \ref{help}). The appropriate topic can then be selected from the @@ -397,7 +397,7 @@ the `search' tab and enter keywords in the box which appears. \end{center} \end{figure} -\subsubsection{Email lists} +\subsubsection{Email Lists} The Jalview Discussion list {\tt jalview-discuss@jalview.org} provides a forum for Jalview users and developers to raise problems and exchange ideas - any @@ -484,7 +484,8 @@ the arrow keys ($\uparrow$, $\downarrow$, $\leftarrow$, $\rightarrow$). Rapid movement to specific positions is accomplished as listed below: \begin{list}{$\circ$}{} -\item {\bf Jump to Sequence {\sl n}:} Type a number {\sl n} then press [S] to move to sequence (row) {\sl n} +\item {\bf Jump to Sequence {\sl n}:} Type a number {\sl n} then press [S] to +move to sequence (row) {\sl n}. \item {\bf Jump to Column {\sl n}:} Type a number {\sl n} then press [C] to move to column {\sl n} in the alignment. \item {\bf Jump to Residue {\sl n}:} Type a number {\sl n} then press [P] to move to residue number {\sl n} in the current sequence. \item {\bf Jump to column {\sl m} row {\sl n}:} Type the column number {\sl m}, a comma, the row number {\sl n} and press [RETURN]. @@ -509,8 +510,8 @@ an easy way to access it.)} \exstep{Find the Overview Window, {\sl Views $\Rightarrow$ Overview Window} and open it. 
Move around the alignment by clicking and dragging the red box in the overview window.} -\exstep{Look at the status bar (lower left hand corner of the alignment window) as -you move the mouse over the alignment. It indicates information about the +\exstep{Return to the alignment window. Look at the status bar (lower left hand +corner of the alignment window) as you move the mouse over the alignment. It indicates information about the sequence and residue under the cursor.} \exstep{Press [F2] to enter {\bf Cursor mode}. Use the direction {\bfarrow keys} to move the cursor around the alignment.} @@ -659,7 +660,7 @@ record.} \end{figure} -\exercise{Loading sequences}{ +\exercise{Loading Sequences}{ \label{load} \exstep{Use {\sl Window $\Rightarrow$ Close All} from the Desktop window menu to close all windows.} @@ -711,7 +712,8 @@ another window called Select Database Retrieval Source showing all the database sources. Select the {\bf PFAM seed} database and click ok, then enter the accession -number {\bf PF03460} and click OK. An alignment of about 107 sequences should load. +number {\bf PF03460} and click OK. An alignment of about 174 sequences should +load. Several database IDs or accession numbers can be loaded by using semicolons to separate them.}} @@ -879,7 +881,7 @@ ranges of sequences (Figure \ref{selectrows}). To define a selection in cursor mode (which is enabled by pressing [F2] when the alignment window is selected), navigate to the top left corner of the proposed selection (using the mouse, the arrow keys, or the keystroke commands described in Section \ref{cursormode}). Pressing the [Q] key marks this as the -corner. A red outline appears around the cursor (Figure \ref{cselect}) +corner. A red outline appears around the cursor (Figure \ref{cselect}). Navigate to the bottom right corner of the proposed selection and press the [M] key. This marks the bottom right corner of the selection. 
The selection can then be treated in the same way as if it had been created in normal mode. @@ -904,7 +906,7 @@ simply select the region that is to be kept unselected, and then invert the sele This may also be useful when hiding large regions in an alignment (see Section \ref{hidingregions} below). Instead of selecting the columns and rows that are to be hidden, simply select the region that is to be kept visible, invert the selection, then select {\sl View $\Rightarrow$ Hide -$\Rightarrow$ Selected Region }. +$\Rightarrow$ Selected Region}. \subsection{Creating Groups} Selections are lost as soon as a different region is selected. Groups can be @@ -1067,7 +1069,7 @@ to hide the unselected region. Instead of hiding a group completely, it is sometimes useful to work with just one representative sequence. The {\sl $<$Sequence ID$>$ $\Rightarrow$ Represent group with $<$Sequence ID$>$ } option from the sequence ID pop-up menu enables this variant of the hidden groups function. The remaining representative sequence can be visualized and manipulated like any other. However, any alignment edits that affect the sequence will also affect the whole sequence group. -\exercise{Hiding and revealing regions}{ +\exercise{Hiding and Revealing Regions}{ \exstep{Close all windows, open the PFAM accession PF03460. Select a contiguous set of sequences by clicking and dragging on the sequence ID panel. Right click on the selected sequence IDs to bring up the sequence ID pop-up @@ -1189,7 +1191,7 @@ on the sequence IDs to open the sequence ID pop-up menu, and select {\sl Hide Sequences}). } \exstep{ Select FER3\_RAPSA and FER\_BRANA. Slide the sequences to -the left so the initial {\bf A} lies at column 57 using the $\Rightarrow$ key.} +the right so the initial {\bf A} lies at column 57 using the $\Rightarrow$ key.} \exstep{ Select FER1\_SPIOL, FER1\_ARATH, FER2\_ARATH, Q93Z60\_ARATH and O80429\_MAIZE @@ -1263,9 +1265,11 @@ right of the selected residue. 
\ref{mousealedit}, and recreates the final part of the example ferredoxin alignment from the unaligned sequences using Jalview's keyboard editing mode. -{\bf {\sl Note for Windows Users:}} The [SHIFT]-[SPACE] command has the same effect as -the [CTRL]-[SPACE] command mentioned in this exercise, and you should use -[SHIFT]-[SPACE] in order to avoid opening the window menu.} +{{\bf Note:}} For Mac users, the [CTRL]-[SPACE] command +has the same effect as the [SHIFT]-[SPACE] command mentioned in this exercise. + +Windows users should use [SHIFT]-[SPACE] rather than the [CTRL]-[SPACE] command, +as the latter will close the window.} \exstep{Load the sequence alignment at \textsf{http://www.jalview.org/tutorial/unaligned.fa}, or continue using the @@ -1276,13 +1280,16 @@ previous exercise, then first right click on the sequence ID panel and select Now, enter cursor mode by pressing [F2]} % TODO: BACKSPACE or DELETE WHEN SEQS ARE SELECTED WILL DELETE ALL SEQS JAL-783 \exstep{Insert 58 gaps at the start of the first sequence (FER\_CAPAA). Press {\sl 58} then {\sl [SPACE]}. } -\exstep{Go down one sequence and select rows 2-5 as a block. Click on the second sequence ID (FER\_CAPAN). Hold down shift and click on the fifth (FER1\_PEA). } +\exstep{Go down one sequence and select rows 2-5 as a block. Click on the second sequence ID (FER\_CAPAN). + Hold down shift and click on the fifth (FER1\_PEA). } \exstep{Insert 6 gaps at the start of this group. Go to column 1 row 2 by typing {\sl 1,2} then press {\sl [RETURN]}. Now insert 6 gaps in all the sequences. -Type {\sl 6} then hold down {\sl [CTRL]} and press {\sl [SPACE]}.} \exstep{Now insert one gap at column 34 and another at 38. Insert 3 gaps at 47. -Press {\sl 34C} then {\sl [CTRL]-[SPACE]}. Press {\sl 38C} then [CTRL]-[SPACE]. 
-Press {\sl 47C} then {\sl 3 [CTRL-SPACE]} the first through fourth sequences are -now aligned.} +Type {\sl 6} then hold down {\sl [SHIFT]} and press {\sl [SPACE]}.} +\exstep{Now insert one gap at column 34 and another at 38. Insert 3 gaps at 47. +Press {\sl 34C} then {\sl [SHIFT]-[SPACE]}. Press {\sl 38C} then +[SHIFT]-[SPACE]. +Press {\sl 47C} then {\sl 3 [SHIFT]-[SPACE]}. The first through fourth sequences +are now aligned.} \exstep{The fifth sequence (FER1\_PEA) is poorly aligned. We will delete some gaps and add some new ones. Press {\sl [ESC]} to clear the selection. Navigate to the start of sequence 5 and delete 3 gaps. Press {\sl 1,5 [RETURN]} then {\sl 3 [BACKSPACE]} to delete three gaps. Go to column 31 and delete the gap. Press {\sl 31C [BACKSPACE]} .} \exstep{ Similarly delete the gap now at column 34, then insert two gaps at column 38. Press {\sl 34C [BACKSPACE] 38C 2 [SPACE]}. Delete three gaps at column 44 and insert one at column 47 by pressing {\sl 44C 3 [BACKSPACE] 47C [SPACE]}. The top five sequences are now aligned.} } @@ -1420,7 +1427,8 @@ The residues are coloured according to their physicochemical properties. The phy \subsubsection{Taylor} \parbox[c]{3.5in}{ -This colour scheme was devised by Willie Taylor and an entertaining description of it's origin can be found in Protein Engineering, Vol 10 , 743-746 (1997) +This colour scheme was devised by Willie Taylor and an entertaining description of its origin can be found in Protein Engineering, +Vol 10, 743-746 (1997). } \parbox[c]{3in}{ \includegraphics[width=2.75in]{images/col_taylor.pdf} @@ -1482,13 +1490,12 @@ sequences and alignments. \parbox[c]{3.5in}{ Residues are coloured according to whether the corresponding nucleotide bases are purine (magenta) or pyrimidine (cyan) based. All non ACTG residues are uncoloured. For further information about working with nucleic acid -sequences and alignments, see Section \ref{workingwithnuc} +sequences and alignments, see Section \ref{workingwithnuc}. 
%and Section \ref{workingwithrna} -. } \parbox[c]{3in}{ \includegraphics[width=2.75in]{images/col_purpyr.pdf} } -\subsubsection{RNA Helix colouring} +\subsubsection{RNA Helix Colouring} \parbox[c]{3.5in}{ Columns are coloured according to their assigned RNA helix as defined by a secondary structure annotation line on the alignment. Colours for each helix are randomly assigned, and option only available when an RNA @@ -1508,7 +1515,7 @@ Select the alignment menu option {\sl Colour $\Rightarrow$ ClustalX}. Note the c \exstep{ Colour the alignment using {\sl Colour $\Rightarrow$ Blosum62}. Select a group of around 4 similar sequences. Use the context menu (right click on the group) -option {\sl Selection $\Rightarrow$ Group $\Rightarrow$ Group Colour +option {\sl Selection $\Rightarrow$ Edit New Group $\Rightarrow$ Group Colour $\Rightarrow$ Blosum62} to colour the selection. Notice how some residues which were not coloured are now coloured. The calculations performed for dynamic colouring schemes like Blosum62 are based on the group being coloured, not the @@ -1516,7 +1523,10 @@ whole alignment (this also explains the colouring changes observed in exercise \ref{exselect} during the group selection step). } \exstep{ -Keeping the same selection as before, colour the complete alignment using {\sl Colour $\Rightarrow$ Taylor}. Select the menu option {\sl Colour $\Rightarrow$ By Conservation}. Slide the selector from side to side and observe the changes in the alignment colouring in the selection and in the complete alignment. +Keeping the same selection as before, colour the complete alignment except +the group using {\sl Colour $\Rightarrow$ Taylor}. +Select the menu option {\sl Colour $\Rightarrow$ By Conservation}. +Slide the selector from side to side and observe the changes in the alignment colouring in the selection and in the complete alignment. 
} } @@ -1535,7 +1545,7 @@ This dialogue allows the user to create any number of named colour schemes at wi \end{figure} -\exercise{User defined colour schemes}{ +\exercise{User Defined Colour Schemes}{ \exstep{Load a sequence alignment. Select the alignment menu option {\sl Colour $\Rightarrow$ User Defined}. A dialogue window will open. } \exstep{Click on an amino acid button, then select a colour for that amino acid. Repeat till all amino acids are coloured to your liking. @@ -1700,10 +1710,10 @@ analysis. Section \ref{jvwebservices} introduces the various web based services available to Jalview users, and Section \ref{jabaservices} explains how to configure the Jalview Desktop for access to new JABAWS servers. Section \ref{msaservices} describes how to use the range of multiple alignment -programs provided by JABAWS, and Section \ref{aacons} introduces JABAWS' AACon +programs provided by JABAWS, and Section \ref{aacons} introduces JABAWS AACon service for protein multiple alignment conservation analysis. Section \ref{protsspredservices} explains how to perform protein secondary -structure predictions with JPred, and JABAWS' protein disorder prediction +structure predictions with JPred, and JABAWS protein disorder prediction services are introduced in Section \ref{protdisorderpred}. Section \ref{featannot} describes the mechanisms provided by Jalview for @@ -1718,7 +1728,7 @@ sequence alignments. % and Section \ref{workingwithrna} covers the visualization, % editing and analysis of RNA secondary structure. -\section{Working with structures} +\section{Working with Structures} \label{wkwithstructure} Jalview facilitates the use of protein structures for the analysis of alignments by providing a linked view of structures associated with sequences in @@ -1734,10 +1744,10 @@ PDB format files can be imported directly or structures can be retrieved from the European Protein Databank (PDBe) using the Sequence Fetcher (see \ref{fetchseq}). 
-\subsection{Automatic association of PDB structures with sequences} +\subsection{Automatic Association of PDB Structures with Sequences} Jalview can automatically determine which structures are associated with a sequence in a number of ways. -\subsubsection{Discovery of PDB IDs from sequence database cross-references} +\subsubsection{Discovery of PDB IDs from Sequence Database Cross-references} If a sequence has an ID from a public database that contains cross-references to the PDB, such as Uniprot. Right-click on any sequence ID and select {\sl Structure $\Rightarrow$ Associate Structure with Sequence $\Rightarrow$ Discover PDB IDs } from the context menu (Figure \ref{auto}). Jalview will attempt to associate the @@ -1772,13 +1782,15 @@ associated PDB structures. } } -\caption{{\bf Automatic PDB ID discovery.} The tooltip (left) indicates that no PDB structure has been associated with the sequence. After PDB ID discovery (center) the tool tip now indicates the Uniprot ID and any associated PDB structures (right)} +\caption{{\bf Automatic PDB ID discovery.} The tooltip (left) indicates that no PDB structure has been associated with the sequence. +After PDB ID discovery (center) the tool tip now indicates the Uniprot ID and +any associated PDB structures (right).} \label{auto} \end{center} \end{figure} -\subsubsection{Drag-and-drop association of PDB files with sequences by filename -match} +\subsubsection{Drag-and-Drop Association of PDB Files with Sequences by Filename +Match} \label{multipdbfileassoc} If one or more PDB files stored on your computer are dragged from their location on the file browser onto an alignment window, Jalview will search the alignment @@ -1789,8 +1801,8 @@ for the matches. If no associations are made, then sequences extracted from the structure will be simply added to the alignment. 
However, if only -some of the PDB files are associated, jalview will raise another dialog box giving -you the option to add any remaining sequences from the PDB structure files not present in +some of the PDB files are associated, Jalview will raise another dialog box +giving you the option to add any remaining sequences from the PDB structure files not present in the alignment. This allows you to easily decorate sequences in a newly imported alignment with any corresponding structures you've already collected in a directory accessible from your computer.\footnote{We plan to extend this facility in @@ -1865,7 +1877,7 @@ disabled for the current view. \end{center} \end{figure} -\subsection{Customising structure display} +\subsection{Customising Structure Display} Structure display can be modified using the {\sl Colour} and {\sl View} menus in the structure viewer. The background colour can be modified by selecting the @@ -1882,7 +1894,7 @@ data to be saved as PDB format. The mapping between the structure and the sequence (How well and which parts of the structure relate to the sequence) can be viewed with the {\sl File $\Rightarrow$ View Mapping} menu option. -\subsubsection{Using the Jmol visualization interface } +\subsubsection{Using the Jmol Visualization Interface } Jmol has a comprehensive set of selection and visualization functions that are accessed from the Jmol popup menu (by right-clicking in the Jmol window or by @@ -1929,7 +1941,7 @@ $\Rightarrow$ 1A70}. A structure viewing window appears. 
Rotate the molecule by Verify that the Jmol display is as it was when you just saved the file.} } -\subsection{Superimposing structures} +\subsection{Superimposing Structures} \label{superposestructs} Many comparative biomolecular analysis investigations aim to determine if the biochemical properties of a given molecule are significantly different to its @@ -1951,7 +1963,7 @@ $\Rightarrow$ View all {\bf N} PDB Structures} option (when {\bf {\sl N}} $>$ 1) if the current selection contains two or more sequences with associated structures. -\subsubsection{Obtaining the RMSD for a superposition} +\subsubsection{Obtaining the RMSD for a Superposition} The RMSD (Root Mean Square Deviation) is a measure of how similar the structures are when they are superimposed. Figure \ref{mstrucsuperposition} shows a superposition created during the course of Exercise \ref{superpositionex}. The @@ -1964,8 +1976,8 @@ console.\footnote{The Jalview Java Console is opened from {\sl Tools $\Rightarrow$ Java Console} option in the Desktop's menu bar} This output also includes the precise atom pairs used to superpose structures. -\subsubsection{Choosing which part of the alignment is used for structural -superposition} Jalview uses the visible part of each alignment view to define +\subsubsection{Choosing which part of the Alignment is used for Structural +Superposition} Jalview uses the visible part of each alignment view to define which parts of each molecule are to be superimposed. Hiding a column in a view used for superposition will remove that correspondence from the set, and will exclude it from the superposition and RMSD calculation. 
@@ -2000,8 +2012,8 @@ after the superposition is shown in the Jmol console.} \end{center} \end{figure} -\exercise{Aligning structures using the ferredoxin -sequence alignment.}{\label{superpositionex} +\exercise{Aligning Structures using the Ferredoxin +Sequence Alignment}{\label{superpositionex} \exstep{Continue with the Jalview project created in exercise \ref{viewingstructex}. Use the {\sl Discover PDB IDs} function to retrieve PDB @@ -2025,8 +2037,8 @@ the two structures.}} the small section and with the whole alignment. Which view do you think give the best 3D superposition, and why ?} } -\subsection{Colouring structure data associated with multiple alignments and views} -Normally, the original view from which a particular structure view was +\subsection{Colouring Structure Data Associated with Multiple Alignments and +Views} Normally, the original view from which a particular structure view was opened will be the one used to colour structure data. If alignments involving sequences associated with structure data shown in a Jmol have multiple views, Jalview gives you full control over which alignment, or alignment view, is used to colour the structure @@ -2054,7 +2066,7 @@ $\Rightarrow$ By Sequence} option is selected.} \end{center} \end{figure} -\subsubsection{Colouring complexes} +\subsubsection{Colouring Complexes} \label{complexstructurecolours} The ability to control which multiple alignment view is used to colour structural data is essential when working with data relating to @@ -2064,7 +2076,7 @@ In these situations, each chain identified in the structure may have a different evolutionary history, and a complete picture of functional variation can only be gained by integrating data from different alignments on the same structure view. An example of this is shown in Figure -\ref{mviewalcomplex}, based on data from Song et. al\footnote{Structure of +\ref{mviewalcomplex}, based on data from Song et. 
al.\footnote{Structure of DNMT1-DNA Complex Reveals a Role for Autoinhibition in Maintenance DNA Methylation. Jikui Song, Olga Rechkoblit, Timothy H. Bestor, and Dinshaw J. Patel. {\sl Science} 2011 {\bf 331} 1036-1040 \href{http://www.sciencemag.org/content/331/6020/1036}{DOI:10.1126/science.1195380}} @@ -2080,7 +2092,8 @@ in each component of this protein-DNA complex. Instructions for recreating this \end{center} \end{figure} -\exercise{Colouring a protein complex to explore domain-domain interfaces}{\label{dnmtcomplexex} +\exercise{Colouring a Protein Complex to Explore Domain-Domain +Interfaces}{\label{dnmtcomplexex} \exstep{Download the PDB file at \textsf{\url{http://www.jalview.org/tutorial/DNMT1\_MOUSE.pdb}} to your desktop. @@ -2130,7 +2143,7 @@ in the structure.}} % for one relating to highlighting of positions in the alignment window).} } -\section{Analysis of alignments} +\section{Analysis of Alignments} \label{alignanalysis} Jalview provides support for sequence analysis in two ways. A number of analytical methods are `built-in', these are accessed from the {\sl Calculate} @@ -2188,7 +2201,7 @@ cursor is placed over a sequence. Sequences can be selected by clicking on them. \begin{center} \includegraphics[width=2in]{images/PCA1.pdf} \includegraphics[width=3in]{images/PCA3.pdf} -\caption{{\bf PCA Analysis} } +\caption{{\bf PCA Analysis.} } \label{PCA} \end{center} \end{figure} @@ -2207,7 +2220,7 @@ colour. Note how the colour of the sequence ID label matches both the colour of the partitioned tree and the points in the PCA plot. } } -\subsubsection{PCA data export} +\subsubsection{PCA Data Export} Although the PCA viewer supports export of the current view, the plots produced are rarely suitable for direct publication. The PCA viewer's {\sl File} menu includes a number of options for exporting the PCA matrix and transformed points @@ -2241,7 +2254,8 @@ option. 
Leaf names on imported trees will be matched to the associated alignment \includegraphics[width=2.5in]{images/trees1.pdf} \includegraphics[width=2.5in]{images/trees2.pdf} \includegraphics[width=1.25in]{images/trees4.pdf} -\caption{{\bf Calculating Trees} Jalview provides four built in models for calculating trees. Jalview can also load precalculated trees in Newick format (right).} +\caption{{\bf Calculating Trees} Jalview provides four built in models for calculating trees. +Jalview can also load precalculated trees in Newick format (right).} \label{trees1} \end{center} \end{figure} @@ -2261,7 +2275,8 @@ preserve these. \begin{figure} \begin{center} \includegraphics[width=5in]{images/trees3.pdf} -\caption{{\bf Interactive Trees} The tree level cutoff can be used to designate groups in Jalview} +\caption{{\bf Interactive Trees} The tree level cutoff can be used to designate +groups in Jalview.} \label{trees2} \end{center} \end{figure} @@ -2270,14 +2285,14 @@ preserve these. % move to ch. 3 ? %Both PCA and Tree viewers are linked analysis windows. This means that their selection and display are linked to a particular alignment, and control and reflect the selection state for a particular view. -\subsubsection{Recovering input data for a tree or PCA plot calculation} +\subsubsection{Recovering Input Data for a Tree or PCA Plot Calculation} \parbox[c]{5in}{ The {\sl File $\Rightarrow$ Input Data } option will open a new alignment window containing the original data used to calculate the tree or PCA plot (if available). This function is useful when a tree has been created and then the alignment subsequently changed. } \parbox[c]{1.25in}{\centerline{\includegraphics[width=1.25in]{images/pca_fmenu.pdf} }} -\subsubsection{Changing the associated view for a tree or PCA viewer} +\subsubsection{Changing the Associated View for a Tree or PCA Viewer} \parbox[c]{4in}{ The {\sl View $\Rightarrow$ Associated Nodes With $\Rightarrow$ .. 
} submenu is shown when the viewer is associated with an alignment that is involved in multiple views. Selecting a different view does not affect the tree or PCA data, but will change the colouring and display of selected sequences in the display according to the colouring and selection state of the newly associated view. } \parbox[c]{3in}{\centerline{ @@ -2339,7 +2354,7 @@ The redundancy removal dialog box is opened using the {\sl Edit $\Rightarrow$ Re \caption{The Redundancy Removal dialog box opened from the edit menu. Sequences that exceed the current percentage identity threshold and are to be removed are highlighted in black.} \end{figure} -\exercise{Remove redundant sequences}{ +\exercise{Remove Redundant Sequences}{ \exstep{Re-use or recreate the alignment and tree which you worked with in the tree based conservation analysis exercise (exercise \ref{consanalyexerc})} @@ -2349,7 +2364,7 @@ tree based conservation analysis exercise (exercise \ref{consanalyexerc})} \exstep{Experiment with the redundancy removal and observe the relationship between the percentage identity threshold and the pattern of unlinked nodes in the tree display.} } -\subsection{Subdividing the alignment according to specific mutations} +\subsection{Subdividing the Alignment According to Specific Mutations} It is often necessary to explore variations in an alignment that may correlate with mutations observed in a particular region; for example, sites exhibiting @@ -2368,7 +2383,7 @@ selected region, and Jalview's group based conservation analysis annotation and colourschemes can then be used to reveal any associated pattern of sequence variation across the whole alignment. 
-\subsection{Automated annotation of Alignments and Groups} +\subsection{Automated Annotation of Alignments and Groups} On loading a sequence alignment, Jalview will normally\footnote{Automatic annotation can be turned off in the {\sl Visual } tab in the {\sl Tools @@ -2412,7 +2427,7 @@ Logo} option to display the Consensus profile for the group or alignment. Sequence logos can be enabled by default for all new alignments {\sl via} the Visual tab in the Jalview desktop's preferences dialog box. -\exercise{Group conservation analysis}{ +\exercise{Group Conservation Analysis}{ \exstep{Re-use or recreate the alignment and tree which you worked with in the tree based conservation analysis exercise (exercise \ref{consanalyexerc})} \exstep{Create a new view, and ensure the annotation panel is displayed, and @@ -2473,9 +2488,9 @@ your own server.}, which provides an easily installable system for performing a range of bioinformatics analysis tasks. } \parbox[c]{1.75in}{\includegraphics[width=1.65in]{images/wsmenu.pdf}} -\subsection{One-way web services} +\subsection{One-Way Web Services} -There are two types of one way service in jalview. Database services, +There are two types of one way service in Jalview. Database services, which were introduced in in Section \ref{fetchseq}, provide sequence and alignment data. They can also be used to add sequence IDs to an alignment imported from a local file, prior to further annotation retrieval, as described @@ -2528,7 +2543,7 @@ successfully use web services from Jalview, since it periodically checks the progress of running jobs. 
-\subsection{JABA Web Services for sequence alignment and analysis} +\subsection{JABA Web Services for Sequence Alignment and Analysis} \label{jabaservices} JABA stands for ``JAva Bioinformatics Analysis'', which is a system developed by Peter Troshin and Geoff Barton at the University of Dundee for running @@ -2548,7 +2563,7 @@ need any further help or more information about the services, please go to the %%\item Learn how to install JABA services and configure Jalview to access them %%\end{list} -\subsection{Changing the Web Services menu layout} +\subsection{Changing the Web Services Menu Layout} \label{changewsmenulayout} If you are working with a lot of different JABA services, you may wish to change the way Jalview lays out the web services menu. You can do this from the Web @@ -2603,20 +2618,20 @@ Test results from JABAWS are reported on Jalview's console output (opened from the Tools menu). Tests are re-run every time Jalview starts, and when the [Refresh Services] button is pressed on the Jalview JABAWS configuration panel. -\subsubsection{Resetting the JABA services setting to their defaults} +\subsubsection{Resetting the JABA Services Setting to their Defaults} Once you have configured a JABAWS server and selected the OK button of the preferences menu, the settings will be stored in your Jalview preferences file, along with any preferences regarding the layout of the web services menu. If you should ever need to reset the JABAWS server list to its defaults, use the `Reset Services' button on the Web Services preferences panel. -\subsection{Running your own JABA server} +\subsection{Running your own JABA Server} You can download and run JABA on your own machine using the `VMWare' or VirtualBox virtual machine environments. If you would like to learn how to do this, there are full instructions at the \href{http://www.compbio.dundee.ac.uk/jabaws/}{JABA web site}. 
-\exercise{Installing a JABA Virtual Machine on your computer}{ +\exercise{Installing a JABA Virtual Machine on your Computer}{ \label{jabawsvmex}{\sl This tutorial will demonstrate the simplest way of installing JABA on your computer, and configuring Jalview so it can access the JABA services. @@ -2652,7 +2667,7 @@ for the different services provided by the VM. Make a note of the JABAWS URL -- this will begin with `http:' and end with `/jabaws''.} } -\exercise{Configuring Jalview to access your new JABAWS virtual appliance}{ +\exercise{Configuring Jalview to Access your new JABAWS Virtual Appliance}{ \label{confnewjabawsappl} \exstep{Start Jalview (If you have not done so already).} \exstep{Enable the Jalview Java Console by selecting its option from the Tools @@ -2738,7 +2753,7 @@ $\Rightarrow$ Sort } sub menu. \parbox[c]{2in}{\includegraphics[width=2in]{images/ws3.pdf}} \caption{{\bf Multiple alignment via web services} The appropriate method is selected from the menu (left), a status box appears (centre), and the results -appear in a new window (right)} +appear in a new window (right).} \label{webservices} \end{center} \end{figure} @@ -2753,7 +2768,7 @@ any further optimisation to the existing alignment. The Re-alignment service provided by ClustalW in this case is effectively a simple form of profile alignment. -\subsubsection{Alignments of sequences that include hidden regions} +\subsubsection{Alignments of Sequences that include Hidden Regions} If the view or selected region that is submitted for alignment contains hidden regions, then {\bf only the visible sequences will be submitted to the service}. @@ -2778,7 +2793,7 @@ Web Service $\Rightarrow$ Alignment $\Rightarrow$ Muscle with Defaults}. A windo } -\subsection{Customising the parameters used for alignment} +\subsection{Customising the Parameters used for Alignment} JABA web services allow you to vary the parameters used when performing a bioinformatics analysis. 
For JABA alignment services, this means you are @@ -2791,9 +2806,9 @@ usually able to modify the following types of parameters: \end{list} -\subsubsection{Getting help on the parameters for a service} +\subsubsection{Getting Help on the Parameters for a Service} Each parameter available for a method usually has a short description, which -jalview will display as a tooltip, or as a text pane that can be opened under +Jalview will display as a tooltip, or as a text pane that can be opened under the parameter's controls. In the parameter shown in Figure \ref{clustalwparamdetail}, the description was opened by selecting the button on the left hand side. Online help for the service can also be accessed, by right clicking the button and selecting a URL @@ -2815,7 +2830,7 @@ reasons, each JABA service may provide one or more presets -- which are pre-defined sets of parameters suited for particular types of alignment problem. For instance, the Muscle service provides the following presets: \begin{list}{$\bullet$}{} -\item Huge +\item Large alignments (balanced) \item Protein alignments (fastest speed) \item Nucleotide alignments (fastest speed) \end{list} @@ -2834,7 +2849,7 @@ perform the alignment. Should you try to submit more sequences than a service can handle, then an error message will be shown informing you of the maximum number allowed by the server. -\subsection{User defined Presets} +\subsection{User Defined Presets} Jalview allows you to create your own presets for a particular service. To do this, select the `{\sl Edit settings and run ...}' option for your service, which will open a parameter editing dialog box like the one shown in Figure @@ -2854,7 +2869,7 @@ parameter set's entry in the web services menu. \label{jwsparamsdialog} } \end{figure} -\subsubsection{Saving parameter sets} +\subsubsection{Saving Parameter Sets} When creating a custom parameter set, you will be asked for a file name to save it. 
The location of the file is recorded in the Jalview user preferences in the same way as a custom alignment colourscheme, so when Jalview is launched again, @@ -2900,7 +2915,7 @@ JABA service. % } % } -\section{Protein alignment conservation analysis} +\section{Protein Alignment Conservation Analysis} \label{aacons} The {\sl Web Service $\Rightarrow$ Conservation} menu controls the computation of up to 17 different amino acid conservation measures for the current alignment @@ -2911,13 +2926,13 @@ Proteins: Structure, Function, and Genetics} {\bf 43} 227-241.} as well as an ef score developed by Manning et al. in 2008.\footnote{SMERFS Score Manning et al. {\sl BMC Bioinformatics} 2008, {\bf 9} 51 \href{http://dx.doi.org/10.1186/1471-2105-9-51}{doi:10.1186/1471-2105-9-51}} -\subsubsection{Enabling and disabling AACon calculations} +\subsubsection{Enabling and Disabling AACon Calculations} When the AACon Calculation entry in the {\sl Web Services $\Rightarrow$ Conservation} menu is ticked, AACon calculations will be performed every time the alignment is modified. Selecting the menu item will enable or disable automatic recalculation. -\subsubsection{Configuring which AACon calculations are performed} +\subsubsection{Configuring which AACon Calculations are Performed} The {\sl Web Services $\Rightarrow$ Conservation $\Rightarrow$ Change AACon Settings ...} menu entry will open a web services parameter dialog for the currently configured AACon server. Standard presets are provided for quick and @@ -2926,7 +2941,7 @@ change the way that SMERFS calculations are performed. AACon settings for an alignment are saved in Jalview projects along with the latest calculation results. -\subsubsection{Changing the server used for AACon calculations} +\subsubsection{Changing the Server used for AACon Calculations} If you are working with alignments too large to analyse with the public JABAWS server, then you will most likely have already configured additional JABAWS servers. 
By default, Jalview will choose the first AACon service available from @@ -3028,7 +3043,7 @@ function. The {\sl Web Services $\Rightarrow$ Disorder} menu in the alignment wi allows access to protein disorder prediction services provided by the configured JABAWS servers. -\subsection{Disorder prediction results} +\subsection{Disorder Prediction Results} Each service operates on sequences in the alignment to identify regions likely to be unstructured or flexible, or alternatively, fold to form globular domains. As a consequence, disorder predictor results include both sequence features and @@ -3046,7 +3061,7 @@ highlight differences in disorder prediction across aligned sequences. \end{center} \end{figure} -\subsubsection{Navigating large sets of disorder predictions} +\subsubsection{Navigating Large Sets of Disorder Predictions} {\bf Figure \ref{alignmentdisorderannot}} shows a single sequence annotated with a range of disorder predictions. Disorder prediction annotation rows are @@ -3066,7 +3081,7 @@ zoomed out view of a prediction for a single sequence. The sequence is shaded to \end{figure} -\subsection{Disorder predictors provided by JABAWS 2.0} +\subsection{Disorder Predictors provided by JABAWS 2.0} For full details of each predictor and the results that Jalview can display, please consult \href{http://www.jalview.org/help/html/webServices/proteinDisorder.html}{Jalview's @@ -3198,7 +3213,7 @@ data sources. DAS (the Distributed Annotation System) is the primary source of sequence features, whilst webservices like JNet (see \ref{jpred} above) can be used to analyse a given sequence or alignment and generate annotation for it. -\subsection{Creating sequence features} +\subsection{Creating Sequence Features} Sequence features can be created simply by selecting the area in a sequence (or sequences) to form the feature and selecting {\sl Selection $\Rightarrow$ Create Sequence Feature } from the right-click context menu (Figure \ref{features}).
A dialogue box allows the user to customise the feature with respect to name, group, and colour. The feature is then associated with the sequence. Moving the mouse over a residue associated with a feature brings up a tool tip listing all features associated with the residue. \begin{figure}[htbp] @@ -3211,9 +3226,10 @@ Sequence features can be created simply by selecting the area in a sequence (or \end{center} \end{figure} -Creation of features from a selection spanning multiple sequences results in the creation of one feature per sequence. Each feature remains associated with it's own sequence. +Creation of features from a selection spanning multiple sequences results in the creation of one feature per sequence. +Each feature remains associated with its own sequence. -\subsection{Customising feature display} +\subsection{Customising Feature Display} Feature display can be toggled on or off by selecting the {\sl View $\Rightarrow$ Show Sequence Features} menu option. When multiple features are @@ -3252,10 +3268,10 @@ http://www.sanger.ac.uk/resources/software/gff/spec.html} and its own Jalview Features file format for the import of sequence annotation. Features and alignment annotation are also extracted from other formats such as Stockholm, and AMSA. URL links may also be attached to features. See the online -documentation for more details of the additional capabilities of the jalview +documentation for more details of the additional capabilities of the Jalview features file. -\exercise{Creating features}{ +\exercise{Creating Features}{ \exstep{Open the alignment at \textsf{http://www.jalview.org/tutorial/alignment.fa}. We know that the Cysteine residues at columns 97, 102, 105 and 135 are involved in iron binding so we will create them as features. Navigate to column 97, sequence 1. Select the entire column by clicking in the ruler bar. Then right-click on the selection to bring up the context menu and select {\sl Selection $\Rightarrow$ Create Sequence Feature}. 
A dialogue box will appear. } \exstep{ @@ -3273,7 +3289,7 @@ feature type is now turned off. Click it again and note that the features are now displayed. Close the sequence feature settings box by clicking OK or Cancel.} } -\subsection{Creating user defined annotation} +\subsection{Creating User Defined Annotation} Annotations are properties that apply to the alignment as a whole and are visualized on rows in the annotation panel. To create a new annotation row, right click on the annotation label panel and select the {\sl Add New Row} menu option (Figure \ref{newannotrow}). A dialogue box appears. Enter the label to use for this row and a new row will appear. @@ -3299,7 +3315,7 @@ To create a new annotation, first select all the positions to be annotated on th \end{center} \end{figure} -\exercise{Annotating alignments}{ +\exercise{Annotating Alignments}{ \exstep{Load the alignment at \textsf{http://www.jalview.org/tutorial/alignment.fa}. Right-click on the annotation label for {\sl Conservation} to bring up the context menu and select {\sl Add New Row}. A dialogue box will appear asking for {\sl Label for annotation}. Enter ``Iron binding site" and click OK. A new, empty, row appears. } \exstep{ @@ -3321,7 +3337,7 @@ The format for this file is given in the Jalview help. Press [F1] to open it, an \label{viewannotfileex}\exstep{Recover or recreate the secondary structure prediction that you made in exercise \ref{secstrpredex}. Use the {\sl File $\Rightarrow$ Export Annotation} function to view the Jnet secondary structure prediction annotation row. Note the {\bf SEQUENCE\_REF} statements surrounding the row specifying the sequence association for the annotation. } } -\section{Importing features from databases} +\section{Importing Features from Databases} \label{featuresfromdb} Jalview supports feature retrieval from public databases either directly or {\sl via} the Distributed Annotation System (DAS\footnote{http://www.biodas.org/}). 
@@ -3346,7 +3362,7 @@ rendered relative to the sequence's start position. If the start/end positions do not match the coordinate system from which the features were defined, then the features will be displayed incorrectly. -\subsubsection{Viewing and exporting a sequence's database annotation} +\subsubsection{Viewing and Exporting a Sequence's Database Annotation} You can export all the database cross references and annotation terms shown in the sequence ID tooltip for a sequence by right-clicking and selecting the {\sl @@ -3363,7 +3379,7 @@ pasted into a web page.} \parbox[c]{3in}{ \centerline{\includegraphics[width=2.2in]{images/seqdetailsreport.pdf}}} -\subsubsection{Automatically discovering a sequence's database references} +\subsubsection{Automatically Discovering a Sequence's Database References} Jalview includes a function to automatically verify and update each sequence's start and end numbering against any of the sequence databases that the {\sl Sequence Fetcher} has access to. This function is accessed from the {\sl @@ -3429,22 +3445,31 @@ listed and groups of features from one data source can be selected/deselected by checking the labelled box at the top of the panel. -\subsubsection{The Fetch Uniprot IDs dialog box} +\subsubsection{The Fetch Uniprot IDs Dialog Box} \label{discoveruniprotids} If any sources are selected which refer to Uniprot coordinates as their reference system, then you may be asked if you wish to retrieve Uniprot IDs for your sequence. Pressing OK instructs Jalview to verify the sequences against Uniprot records retrieved using the sequence's ID string. This operates in much the same way as the {\sl Web Service $\Rightarrow$ Fetch Database References } function described in Section \ref{fetchdbrefs}. If a sequence is verified, then the start/end numbering will be adjusted to match the Uniprot record to ensure that features retrieved from the DAS source are rendered at the correct position. 
-\subsubsection{Rate of feature retrieval} +\subsubsection{Rate of Feature Retrieval} Feature retrieval can take some time if a large number of sources is selected and if the alignment contains a large number of sequences. This is because Jalview only queries a particular DAS source with one sequence at a time, to avoid overloading it. As features are retrieved, they are immediately added to the current alignment view. The retrieved features are shown on the sequence and can be customised as described previously. -\exercise{Retrieving features with DAS}{ +\exercise{Retrieving Features with DAS}{ \label{dasfeatretrexcercise} \exstep{Load the alignment at \textsf{http://www.jalview.org/tutorial/alignment.fa}. Select {\sl View $\Rightarrow$ Feature Settings \ldots} from the alignment window menu. Select -the {\sl DAS Settings} tab. A long list of available DAS sources is listed. Select a small number, eg Uniprot, DSSP, signalP and netnglyc. Click OK. A window may prompt whether you wish Jalview to map the sequence IDs onto Uniprot IDs. Click {\sl Yes}. Jalview will start retrieving features. As features become available they will be mapped onto the alignment. } \exstep{If Jalview is taking too long to retrieve features, the process can be cancelled with the {\sl Cancel Fetch} button. Rolling the mouse cursor over the sequences reveals a large number of features annotated in the tool tip. Close the Sequence Feature Settings window. } -\exstep{Move the mouse over the sequence ID panel. Non-positional features such as literature references and protein localisation predictions are given in the tooltip, below any database cross references associated with the sequence.} -\exstep{Search through the alignment to find a feature with a link symbol next to it. Right click to bring up the alignment view popup menu, and find a corresponding entry in the {\sl Link } sub menu. } +the {\sl DAS Settings} tab. A long list of available DAS sources is listed. 
+Select a small number, e.g. Uniprot, DSSP, signalP and netnglyc. Click OK. +A window may prompt whether you wish Jalview to fetch DAS features. Click {\sl +Yes}. +Jalview will start retrieving features. As features become available they will be mapped onto the alignment. } \exstep{If Jalview is taking too long to retrieve features, the process can be cancelled with the {\sl Cancel Fetch} button. +Rolling the mouse cursor over the sequences reveals a large number of features annotated in the tool tip. +Close the Sequence Feature Settings window. } +\exstep{Move the mouse over the sequence ID panel. +Non-positional features such as literature references and protein localisation predictions are given in the tooltip, below any database cross references associated with the sequence.} +\exstep{Search through the alignment to find a feature with a link symbol next to it. +Right click to bring up the alignment view popup menu, and find a corresponding entry in the {\sl Link } sub menu. } % TODO this doesn't work ! \includegraphics[width=.3in]{images/link.pdf} \exstep{ @@ -3460,8 +3485,8 @@ Select {\sl View $\Rightarrow$ Feature Settings\ldots} to reopen the Feature Set } } -\subsection{Colouring features by score or description -text} +\subsection{Colouring Features by Score or Description +Text} \label{featureschemes} Sometimes, you may need to visualize the differences in information carried by sequence features of the same type. This is most often the case when features @@ -3492,7 +3517,7 @@ that feature type - with coloured blocks or text to indicate the colouring style and a greater than ($>$) or less than ($<$) symbol to indicate when a threshold has been defined.
-\subsection{Using features to re-order the alignment} +\subsection{Using Features to Re-order the Alignment} \label{featureordering} The presence of sequence features on certain sequences or in a particular region of an alignment can quantitatively identify important trends in @@ -3510,7 +3535,7 @@ options to re-order the alignment. Finally, if a specific region is selected, then only features found in that region of the alignment will be used to create the new alignment ordering. -% \exercise{Shading and sorting alignments using sequence features}{ +% \exercise{Shading and Sorting Alignments using Sequence Features}{ % \label{shadingorderingfeatsex} % % This exercise is currently not included in the tutorial because no DAS servers @@ -3586,7 +3611,7 @@ table below shows which alignment programs are most appropriate for nucleotide alignment. Generally, all will work, but some may be more suited to your purposes than others. We also note that none of these include support for taking RNA secondary structure prediction into account when aligning -sequences (but will be providing services for this in the future!). +sequences (but we will be providing services for this in the future!). \begin{table}{} \centering \begin{tabular}{|l|c|l|} @@ -3662,7 +3687,7 @@ Views of alignments involving DNA sequences are linked to views of alignments co } -\subsection{Coding regions from EMBL records} +\subsection{Coding Regions from EMBL Records} Many EMBL records that can be retrieved with the sequence fetcher contain exons. Coding regions will be marked as features on the EMBL nucleotide sequence, and @@ -3674,7 +3699,7 @@ EMBL sequence. Jalview utilises cross-reference information in two ways. \subsubsection{Retrieval of Protein or DNA Cross References} The {\sl Calculate $\Rightarrow$ Get Cross References } function is only available when Jalview recognises that there are protein/DNA cross-references present on sequences in the alignment.
When selected, it retrieves the cross references from the alignment's dataset (a set of sequence and annotation metadata shared between alignments) or using the sequence database fetcher. This function can be used for EMBL sequences containing coding regions to open the Uniprot protein products in a new alignment window. The new alignment window that is opened to show the protein products will also allow dynamic highlighting of codon positions in the EMBL record for each residue in the protein product(s). -\subsubsection{Retrieval of protein DAS features on coding regions} +\subsubsection{Retrieval of Protein DAS Features on Coding Regions} The Uniprot cross-references derived from EMBL records can be used by Jalview to visualize protein sequence features directly on nucleotide alignments. This is because the database cross references include the sequence coordinate mapping information to correspond regions on the protein sequence with that of the nucleotide contig. Jalview will use the Uniprot accession numbers associated with the sequence to retrieve features, and then map them onto the nucleotide sequence's coordinate system using the coding region location. @@ -3690,7 +3715,7 @@ here).} \end{center} \end{figure} -\exercise{Visualizing protein features on coding regions} +\exercise{Visualizing Protein Features on Coding Regions} { \exstep{Use the sequence fetcher to retrieve EMBL record D49489.} \exstep{Ensure that {\sl View $\Rightarrow$ Show Sequence Features} is checked and change the alignment view format to Wrapped mode so the distinct exons can be seen.} @@ -3746,7 +3771,7 @@ the VARNA secondary structure viewer for the display of RNA base pair diagrams. It also allows the extraction of RNA secondary structure from 3D data when available. 
-\subsection{Performing RNA secondary structure predictions} +\subsection{Performing RNA Secondary Structure Predictions} Secondary structure consensus calculations can be performed by enabling the VIENNA service {\sl via} the {\sl Web Services $\Rightarrow$ Secondary Structure} menu. These consensus structures are created by analysing the @@ -3778,7 +3803,7 @@ score.} \end{figure} -\exercise{Viewing RNA structures}{ +\exercise{Viewing RNA Structures}{ \label{viewingrnaex} \exstep{Import RF00162 from Rfam (Full).} -- 1.7.10.2