Changed pca exercise.

[jalview-manual.git] / TheJalviewTutorial.tex
diff --git a/TheJalviewTutorial.tex b/TheJalviewTutorial.tex

index 231b364..bb18127 100644
--- a/TheJalviewTutorial.tex
+++ b/TheJalviewTutorial.tex
@@ -101,11 +101,11 @@ Dundee, Scotland DD1 5EH, UK
  
  \vspace{2in}
  
-Manual Version 1.6 
+Manual Version 1.6
  % post CLS lifesci course on 15th January
  % draft. Remaining items are AACon, RNA visualization/editing and Protein disorder analysis exercises.
  
-18th February 2016
+9th June 2016
  
  
  \end{center}
@@ -151,7 +151,7 @@ colouring.} and includes a javascript API to allow customisable display of
  alignments for web sites such as Pfam.\footnote{\url{http://pfam.xfam.org}}
  
  
-Jalview 2.8.2 was released in December 2014. The Jalview Desktop in this version
+The Jalview Desktop in this version
  provides access to protein and nucleic acid sequence, alignment and structure
  databases, and includes the Jmol\footnote{ Provided under the LGPL licence at
  \url{http://www.jmol.org}} viewer for molecular structures, and the VARNA\footnote{Provided under GPL licence at \url{http://varna.lri.fr}} program for the visualization of RNA secondary structure. A
@@ -188,7 +188,8 @@ between Jalview, TOPALi and AstexViewer.} in 2004, enabling Andrew Waterhouse an
  Jim Procter to re-engineer the original program to introduce contemporary developments
  in bioinformatics and take advantage of the latest web and Java technology.
  Jalview's development has been supported from 2009
-by awards from the BBSRC's Tools and Resources fund, and, since 2014, a Wellcome Trust Biomedical Resource grant\footnote{Wellcome grant number 101651/Z/13/Z}. In 2010, 2011, and 2012, Jalview benefitted from the
+onwards by BBSRC funding, and since 2014 by a
+Wellcome Trust Biomedical Resource grant\footnote{Wellcome grant number 101651/Z/13/Z}. In 2010, 2011, and 2012, Jalview benefitted from the
  \href{http://code.google.com/soc/}{Google Summer of Code}, when Lauren Lui and Jan Engelhardt introduced new features for handling RNA alignments and secondary structure annotation, in collaboration with Yann Ponty.\footnote{\url{http://www.lix.polytechnique.fr/~ponty/}}
  
   
@@ -209,19 +210,20 @@ Jalview Java alignment editor"} \newline Michele Clamp, James Cuff, Stephen M. S
  This tutorial is written in a manual format with short exercises where
  appropriate, typically at the end of each section. This chapter concerns the
  basic operation of Jalview and should be sufficient for those who want to
-load Jalview (Section \ref{startingjv}), open an alignment (Section \ref{loadingseqs}), perform basic editing and colouring (Section \ref{selectingandediting} and Section \ref{colours}), and produce publication
-and presentation quality graphical output (Section \ref{layoutandoutput}).
-
-Chapter \ref{analysisannotation} covers the additional visualization and
-analysis techniques that Jalview provides. This includes working with the
-embedded Jmol molecular structure viewer, building and viewing trees and PCA
+launch Jalview (Section \ref{startingjv}), open an alignment (Section
+\ref{loadingseqs}), perform basic editing (Section
+\ref{selectingandediting}) and colouring (Section \ref{colours}), and produce
+publication and presentation quality graphical output (Section \ref{layoutandoutput}).
+
+The manual also covers the additional visualization and
+analysis techniques available in Jalview. This includes working
+with the embedded Jmol molecular structure viewer, building and viewing trees and PCA
  plots, and using trees for sequence conservation analysis. An overview of
  the Jalview Desktop's webservices is given in Section \ref{jvwebservices}, and
  the alignment and secondary structure prediction services are described
-in detail in Sections \ref{msaservices} and \ref{protsspredservices}. Following
-this, Section \ref{featannot} details the creation and visualization of sequence
+in detail in Sections \ref{msaservices} and \ref{protsspredservices}. Section \ref{featannot} details the creation and visualization of sequence
  and alignment annotation, and the retrieval of sequences and annotation from
-databases and DAS Servers. Finally, Section \ref{workingwithnuc} discusses
+databases and DAS Servers. Section \ref{workingwithnuc} discusses
  specific features of use when working with nucleic acid sequences, such as translation and linking to protein
  coding regions, and the display and analysis of RNA secondary structure.
  
@@ -235,7 +237,7 @@ Keystrokes using the special non-symbol keys are represented in the tutorial by
  enclosing the pressed keys with square brackets ({\em e.g.} [RETURN] or [CTRL]).
  
  Keystroke combinations are combined with a `-' symbol ({\em e.g.} [CTRL]-C means
-press [CTRL] and the `C' key).
+press [CTRL] and the `C' key simultaneously).
  
  Menu options are given as a path from the menu
  that contains them - for example {\sl File $\Rightarrow$ Input Alignment
@@ -276,7 +278,7 @@ These links will launch the latest stable release of Jalview.\par
  When the application is launched with webstart, two dialogs may appear before
  the application starts. If your browser is not set up to handle webstart, then
  clicking the launch link may download a file that needs to be opened
-manually, or prompt you to select the correct program to handle the webstart
+manually, or prompt you to select the program to handle the webstart
  file. If that is the case, then you will need to locate the {\bf javaws} program
  on your system\footnote{The file that is downloaded will have a type of {\bf
  application/x-java-jnlp-file} or {\bf .jnlp}. The {\bf javaws} program that can run
@@ -415,7 +417,8 @@ The major features of the Jalview Desktop are illustrated in Figure \ref{anatomy
   where editing and navigation are performed using the keyboard. The {\bf F2 key}
   is used to switch between these two modes. With a Mac as the F2 is
   often assigned to screen brightness, one may often need to  type {\bf function
- [Fn] key with F2}.
+ [Fn] key with F2}, that is,
+ [Fn]-F2.
  
  \begin{figure}[htb]
  \begin{center}
@@ -563,7 +566,7 @@ URL directly.
  
  \subsection{From a File}
  Jalview can read sequence alignments from a sequence alignment file. This is a
-text file, not a word processor document. For entering sequences from a
+text file, {\bf not} a word processor document. For entering sequences from a
  wordprocessor document see Cut and Paste  (Section \ref{cutpaste}) below. Select
  {\sl File $\Rightarrow$ Input Alignment $\Rightarrow$ From File} from the main
  menu (Figure \ref{loadfile}). You will then get a file selection window where
@@ -1324,7 +1327,8 @@ static residue schemes are modified using a dynamic scheme. The individual schem
  \subsection{Colouring a Group or Selection}
  
  Selections or groups can be coloured in two ways. The first is {\sl via} the Alignment Window's {\sl Colour} menu as stated above,
- after first ensuring that the {\sl Apply Colour To All Groups} flag is not selected. 
+ after first ensuring that the {\sl Apply Colour To All Groups} flag is {\bf
+ not} selected.
   This must be turned {\sl off} specifically as it is {\sl on} by default. 
   When unticked, selections from the Colours menu will only change the colour for residues in the current selection, 
   or the alignment view's ``background colourscheme'' when no selection exists.
@@ -1695,19 +1699,174 @@ Photoshop, Illustrator, Inkscape, Ghostview, Powerpoint (Windows), or
  Preview (Mac OS X). Zoom in and note that the image has near-infinite
  resolution.} }
  
-\chapter{Features and Annotation}
-\label{annotation}
-\section{Features and Annotation}
+\chapter{Annotation and Features}
  \label{featannot}
  Features and annotations are additional information that is overlaid on the sequences and the alignment. Generally speaking, annotations are associated with columns in the alignment. Features are associated with specific residues in the sequence. 
  
-Annotations are shown below the alignment in the annotation panel, and often reflect properties of the alignment as a whole.  The Conservation, Consensus and Quality scores are examples of dynamic annotation, so as the alignment changes, they change along with it. Conversely, sequence features are properties of the individual sequences, so they do not change with the alignment, but are shown mapped on to specific residues within the alignment. 
+Annotations are shown below the alignment in the annotation panel, and often reflect properties of the alignment as a whole. 
+Conversely, sequence features are properties of the individual sequences, so they do not change with the alignment, 
+but are shown mapped on to specific residues within the alignment. 
  
  Features and annotation can be interactively created, or retrieved from external
  data sources. DAS (the Distributed Annotation System) is the primary source of
  sequence features, whilst webservices like JNet (see \ref{jpred} above) can be used to analyse a 
  given sequence or alignment and generate annotation for it.
  
+
+\section{Conservation, Quality and Consensus Annotation}
+\label{annotationintro}
+Jalview automatically calculates several quantitative alignment annotations
+which are displayed as histograms below the multiple sequence alignment columns. 
+Conservation, quality and consensus scores are examples of dynamic
+annotation, so as the alignment changes, they change along with it.
+The scores can be used in the hybrid colouring options to shade the alignments. 
+Mousing over a conservation histogram reveals a tooltip with more information.
+
+These annotations can be hidden and deleted via the context menu linked to the
+annotation row; but they are only created on loading an alignment. If they are
+deleted then the alignment should be saved and then reloaded to restore them.
+Jalview provides a toggle to autocalculate a consensus sequence upon editing. This is normally selected by default, but can be turned off for
+large alignments {\sl via} the {\sl Calculate $\Rightarrow$ Autocalculate
+Consensus} menu option if the interface is too slow.
+
+\subsubsection{Conservation Annotation}
+
+Alignment conservation annotation is a quantitative numerical index reflecting the
+conservation of the physico-chemical properties for each column of the alignment. 
+The calculation is based on the AMAS method of multiple sequence alignment analysis (Livingstone C.D. and Barton G.J. (1993) CABIOS Vol. 9 No. 6 p745-756), 
+with identities scoring highest, and substitutions between amino acids in the same physico-chemical class scoring next highest. 
+The score for each column is shown below the histogram. 
+Fully conserved columns, with a score of 11, are indicated by '*'.
+Columns with a score of 10, which contain mutations but in which all properties are conserved, are marked with a '+'.
+
+\subsubsection{Consensus Annotation}
+
+Alignment consensus annotation reflects the percentage of the different residues
+in each column. By default this calculation includes gaps in columns; gaps can be ignored via the Consensus label context 
+menu to the left of the consensus bar chart. 
+The consensus histogram can be overlaid
+with a sequence logo that reflects the symbol distribution at each column of
+the alignment. Right click on the Consensus annotation row and select the {\sl Show
+Logo} option to display the Consensus profile for the group or alignment.
+Sequence logos can be enabled by default for all new alignments {\sl via} the
+Visual tab in the Jalview desktop's preferences dialog box.
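+
+As a worked illustration (a sketch that assumes the consensus value is simply
+the frequency of the most common symbol; the symbols $n_{\max}$, $N$ and $g$
+are introduced here for illustration only), consider a column of $N$ sequences
+containing $g$ gaps, in which the most common residue occurs $n_{\max}$ times:
+\[
+C_{\mathrm{gaps}} = 100 \times \frac{n_{\max}}{N}, \qquad
+C_{\mathrm{no\,gaps}} = 100 \times \frac{n_{\max}}{N - g}.
+\]
+For example, a column of 10 sequences with 6 alanines, 2 glycines and 2 gaps
+would score 60\% when gaps are included and 75\% when they are ignored.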
+
+\subsubsection{Quality Annotation}
+
+Alignment quality annotation is an ad-hoc measure of the likelihood of observing
+the mutations (if any) in a particular column of the alignment. The quality score is calculated for each column in an alignment by summing, 
+for all mutations, the ratio of the two BLOSUM 62 scores for a mutation pair and each residue's conserved BLOSUM62 score (which is higher). 
+This value is normalised for each column, and then plotted on a scale from 0 to 1.
+
+\subsubsection{Group Associated Annotation}
+\label{groupassocannotation}
+Group associated consensus and conservation annotation rows reflect the
+sequence variation within a particular group. Their calculation is enabled
+by selecting the {\sl Group Conservation} or {\sl Group Consensus} options in
+the {\sl Annotation $\Rightarrow$ Autocalculated Annotation } submenu of the
+alignment window. 
+
+\subsection{Creating User Defined Annotation}
+
+Annotations are properties that apply to the alignment as a whole and are visualized on rows in the annotation panel.
+To create a new annotation row, right click on the annotation label panel and select the {\sl Add New Row} menu option (Figure \ref{newannotrow}).
+A dialogue box appears. Enter the label to use for this row and a new row will appear.
+
+To create a new annotation, first select all the positions to be annotated on the appropriate row. 
+Right-clicking on this selection brings up the context menu which allows the insertion of graphics for secondary structure ({\sl Helix} or {\sl Sheet}), 
+text {\sl Label} and the colour in which to present the annotation (Figure \ref{newannot}). On selecting {\sl Label} a dialogue box will appear, 
+requesting the text to place at that position. After the text is entered, the selection can be removed and the annotation becomes clearly 
+visible\footnote{When annotating a block of positions, the text can be partly obscured by the selection highlight. Pressing the  [ESC] key clears 
+the selection and the label is then visible.}. Annotations can be coloured or deleted as desired.
+
+\begin{figure}[htbp]
+\begin{center}
+\includegraphics[width=1.3in]{images/annots1.pdf}
+\includegraphics[width=2in]{images/annots2.pdf}
+\caption{{\bf Creating a new annotation row.} Annotation rows can be reordered by dragging them to the desired place.}
+\label{newannotrow}
+\end{center}
+\end{figure}
+
+\begin{figure}[htbp]
+\begin{center}
+\includegraphics[width=2in]{images/annots3.pdf}
+\includegraphics[width=2in]{images/annots4.pdf}
+\includegraphics[width=2in]{images/annots5.pdf}
+\caption{{\bf Creating a new annotation.} Annotations are created from a selection on the annotation row and can be coloured as desired.}
+\label{newannot}
+\end{center}
+\end{figure}
+
+\subsection{Automated Annotation of Alignments and Groups}
+
+On loading a sequence alignment, Jalview will normally\footnote{Automatic
+annotation can be turned off in the {\sl Visual } tab in the {\sl Tools
+$\Rightarrow$ Preferences } dialog box.} calculate a set of automatic annotation
+rows which are shown below the alignment. For nucleotide sequence alignments,
+only an alignment consensus row will be shown, but for amino acid sequences,
+alignment quality (based on BLOSUM 62) and physicochemical conservation will
+also be shown. Conservation is calculated according to Livingstone and
+Barton\footnote{{\sl ``Protein Sequence Alignments: A Strategy for the
+Hierarchical Analysis of Residue Conservation." } Livingstone C.D. and Barton
+G.J. (1993) {\sl CABIOS } {\bf 9}, 745-756}.
+Consensus is the modal residue (or {\tt +} where two or more residues are equally common).
+The inclusion of gaps in the consensus calculation can be toggled by
+right-clicking on the Consensus label and selecting {\sl Ignore Gaps in
+Consensus} from the pop-up context menu associated with the consensus annotation row.
+Quality is a measure of the inverse likelihood of unfavourable mutations in the alignment. Further details on these
+calculations can be found in the on-line documentation.
+
+
+\exercise{Annotating Alignments}{
+\exstep{Load the alignment at \textsf{http://www.jalview.org/tutorial/alignment.fa}. 
+Right-click on the {\sl Conservation} annotation row to
+bring up the context menu and select {\sl Add New Row}. A dialogue box will appear asking for  {\sl Label for annotation}. 
+Enter ``Iron binding site" and click OK. A new, empty, row appears.
+}
+\exstep{
+Navigate to column 97. Move down and on the new annotation row called
+``Iron binding site", select column 97.
+Right click at this selection and select {\sl Label} from the context menu.
+Enter ``Fe" in the box and click OK. Right-click on the selection again and select {\sl Colour}. 
+Choose a colour from the colour chooser dialogue 
+and click OK. Press [ESC] to remove the selection.
+}
+\exstep{ Select columns 70-77 on the annotation row. Right-click and choose {\sl Sheet} from the
+ context menu. You will be prompted for a label. Enter ``B" and press OK. A new line showing the 
+ sheet as an arrow appears. The colour of the label can be changed but not the colour of the sheet 
+ arrow. 
+}
+\exstep{Right click on the title text of the annotation row that you just created. 
+Select {\sl Export Annotation} and, in the {\bf Export Annotation} dialog box that will open, select the Jalview format and click 
+the [To Textbox] button. 
+
+The format for this file is given in the Jalview help. Press [F1] to open it, and find 
+the ``Annotations File Format'' entry in the ``Alignment Annotations'' section of the contents 
+pane. }
+
+\exstep{Export the file to a text editor and edit the file to change the name of the annotation 
+row. Save the file and drag it onto the alignment view.}
+\exstep{Try to add an additional helix somewhere along the row by editing the file and 
+re-importing it.
+{\sl Hint: Use the {\bf Export Annotation} function to view what helix annotation looks like in 
+a Jalview annotation file.}}
+\exstep{Use the {\sl Alignment Window $\Rightarrow$ File $\Rightarrow$ Export Annotations...} 
+function to export all the alignment's annotation to a file.}
+\exstep{Open the exported annotation in a text editor, and use the {\bf Annotation File Format} 
+documentation to modify the style of the Conservation, Consensus and Quality annotation rows so 
+they appear as several lines on a single line graph.
+{\sl Hint: You need to change the style of annotation row in the first field of the annotation 
+row entry in the file, and create an annotation row grouping to overlay the three quantitative 
+annotation rows.}
+}
+\label{viewannotfileex}\exstep{Recover or recreate the secondary structure
+prediction that you made in exercise \ref{secstrpredex}. Use the {\sl File $\Rightarrow$ Export 
+Annotation} function to view the Jnet secondary structure prediction annotation row. Note the 
+{\bf SEQUENCE\_REF} statements surrounding the row specifying the sequence association for the 
+annotation. } }
+
+
  \section{Importing Features from Databases}
  \label{featuresfromdb}
  Jalview supports feature retrieval from public databases either directly or {\sl
@@ -2023,7 +2182,8 @@ appropriate box. Click on the Feature Colour bar to change the colour if
  desired, add a short description (``One of four Iron binding Cysteines") and press OK. The features will then appear on the sequences. } \exstep{Roll the mouse cursor over the new features. Note that the position given in the tool tip is the residue number, not the column number.  To demonstrate that there is one feature per sequence, clear all selections by pressing [ESC] then insert a gap in sequence 3 at column 95. Roll the mouse over the features and you will see that the feature has moved with the sequence. Delete the gap you created.
  }
  \exstep{
-Add a similar feature to column 102. When the feature dialogue box appears, clicking the Sequence Feature Name box brings up a list of previously described features. Using the same Sequence Feature Name allows the features to be grouped.}
+Add a similar feature to column 102. When the feature dialogue box appears, clicking the Sequence Feature 
+Name box brings up a list of previously described features. Using the same Sequence Feature Name allows the features to be grouped.}
  \exstep{Select {\sl View $\Rightarrow$ Feature Settings\ldots} from the
  alignment window menu. The Sequence Feature Settings window will appear. Move
  this so that you can see the features you have just created. Click the check
@@ -2032,80 +2192,6 @@ feature type is now turned off. Click it again and note that the features are
  now displayed. Close the sequence feature settings box by clicking OK or
  Cancel.} }
  
-\subsection{Creating User Defined Annotation}
-
-Annotations are properties that apply to the alignment as a whole and are visualized on rows in the annotation panel.
-To create a new annotation row, right click on the annotation label panel and select the {\sl Add New Row} menu option (Figure \ref{newannotrow}). A dialogue box appears. Enter the label to use for this row and a new row will appear.
-
-\begin{figure}[htbp]
-\begin{center}
-\includegraphics[width=1.3in]{images/annots1.pdf}
-\includegraphics[width=2in]{images/annots2.pdf}
-\caption{{\bf Creating a new annotation row.} Annotation rows can be reordered by dragging them to the desired place.}
-\label{newannotrow}
-\end{center}
-\end{figure}
-
-To create a new annotation, first select all the positions to be annotated on the appropriate row. Right-clicking on this selection brings up the context menu which allows the insertion of graphics for secondary structure ({\sl Helix} or {\sl Sheet}), text {\sl Label} and the colour in which to present the annotation (Figure \ref{newannot}). On selecting {\sl Label} a dialogue box will appear, requesting the text to place at that position. After the text is entered, the selection can be removed and the annotation becomes clearly visible\footnote{When annotating a block of positions, the text can be partly obscured by the selection highlight. Pressing the  [ESC] key clears the selection and the label is then visible.}. Annotations can be coloured or deleted as desired.
-
-\begin{figure}[htbp]
-\begin{center}
-\includegraphics[width=2in]{images/annots3.pdf}
-\includegraphics[width=2in]{images/annots4.pdf}
-\includegraphics[width=2in]{images/annots5.pdf}
-\caption{{\bf Creating a new annotation.} Annotations are created from a selection on the annotation row and can be coloured as desired.}
-\label{newannot}
-\end{center}
-\end{figure}
-
-\exercise{Annotating Alignments}{
-\exstep{Load the alignment at \textsf{http://www.jalview.org/tutorial/alignment.fa}. 
-Right-click on the {\sl Conservation} annotation row to
-bring up the context menu and select {\sl Add New Row}. A dialogue box will appear asking for  {\sl Label for annotation}. 
-Enter ``Iron binding site" and click OK. A new, empty, row appears.
-}
-\exstep{
-Navigate to column 97. Move down and on the new annotation row called
-``Iron binding site, select column 97.
-Right click at this selection and select {\sl Label} from the context menu.
-Enter ``Fe" in the box and click OK. Right-click on the selection again and select {\sl Colour}. 
-Choose a colour from the colour chooser dialogue 
-and click OK. Press [ESC] to remove the selection.
-}
-\exstep{ Select columns 70-77 on the annotation row. Right-click and choose {\sl Sheet} from the
- context menu. You will be prompted for a label. Enter ``B" and press OK. A new line showing the 
- sheet as an arrow appears. The colour of the label can be changed but not the colour of the sheet 
- arrow. 
-}
-\exstep{Right click on the title text of annotation row that you just created. 
-Select {\sl Export Annotation} and, in the {\bf Export Annotation} dialog box that will open, select the Jalview format and click 
-the [To Textbox] button. 
-
-The format for this file is given in the Jalview help. Press [F1] to open it, and find 
-the ``Annotations File Format'' entry in the ``Alignment Annotations'' section of the contents 
-pane. }
-
-\exstep{Export the file to a text editor and edit the file to change the name of the annotation 
-row. Save the file and drag it onto the alignment view.}
-\exstep{Try to add an additional helix somewhere along the row by editing the file and 
-re-importing it.
-{\sl Hint: Use the {\bf Export Annotation} function to view what helix annotation looks like in 
-a Jalview annotation file.}}
-\exstep{Use the {\sl Alignment Window $\Rightarrow$ File $\Rightarrow$ Export Annotations...} 
-function to export all the alignment's annotation to a file.}
-\exstep{Open the exported annotation in a text editor, and use the {\bf Annotation File Format} 
-documentation to modify the style of the Conservation, Consensus and Quality annotation rows so 
-they appear as several lines on a single line graph.
-{\sl Hint: You need to change the style of annotation row in the first field of the annotation 
-row entry in the file, and create an annotation row grouping to overlay the three quantitative 
-annotation rows.}
-}
-\label{viewannotfileex}\exstep{Recover or recreate the secondary structure
-prediction that you made in exercise \ref{secstrpredex}. Use the {\sl File $\Rightarrow$ Export 
-Annotation} function to view the Jnet secondary structure prediction annotation row. Note the 
-{\bf SEQUENCE\_REF} statements surrounding the row specifying the sequence association for the 
-annotation. } }
-
  \chapter{Multiple Sequence Alignment}
  \label{msaservices}
  Sequences can be aligned using a range of algorithms provided by JABA web
@@ -2145,12 +2231,23 @@ Web Service $\Rightarrow$  Alignment $\Rightarrow$ \ldots} submenu (Figure
  default settings, use one of the available presets, or customise the parameters
  with the `{\sl Edit and Run ..}' dialog box. Once the job is submitted, a
  progress window will appear giving information about the job and any errors that
-occur. After successful completion of the job, a new window is opened with the
-results, in this case an alignment. By default, the new alignment will be
-ordered in the same way as the input sequences; however, many alignment programs
-re-order the input to place homologous sequences close together. This ordering
-can be recovered using the `Original ordering' entry within the {\sl Calculate
-$\Rightarrow$ Sort } sub menu.
+occur. After successful completion of the job, a new alignment window is opened
+with the results, in this case an alignment. By default, the new alignment will be
+ordered in the same way as the input sequences. Note that many alignment
+programs re-order the input during their analysis and place homologous
+sequences close together; the MSA algorithm ordering can be recovered
+using the `Algorithm ordering' entry within the {\sl Calculate $\Rightarrow$
+Sort } sub menu.
+
+\subsubsection{Realignment}
+The re-alignment option is currently only supported by ClustalW and Clustal
+Omega. When performing a re-alignment, Jalview submits the current selection to
+the alignment service complete with any existing gaps. This approach is useful
+when one wishes to align additional sequences to an existing alignment without
+any further optimisation to the existing alignment. The re-alignment service
+provided by ClustalW in this case is effectively a simple form of profile
+alignment.
+
  
  \begin{figure}[htbp]
  \begin{center}
@@ -2164,29 +2261,8 @@ appear in a new window (right).}
  \end{center}
  \end{figure}
  
-\subsubsection{Realignment}
  
-The re-alignment option is currently only supported by ClustalW and Clustal
-Omega. When performing a re-alignment, Jalview submits the current selection to
-the alignment service complete with any existing gaps. This approach is useful
-when one wishes to align additional sequences to an existing alignment without
-any further optimisation to the existing alignment. The re-alignment service
-provided by ClustalW in this case is effectively a simple form of profile
-alignment.
  
-\subsubsection{Alignments of Sequences that include Hidden Regions}
-
-If the view or selected region that is submitted for alignment contains hidden
-regions, then {\bf only the visible sequences will be submitted to the service}.
-Furthermore, each contiguous segment of sequences will be aligned independently
-(resulting in a number of alignment `subjobs' appearing in the status window).
-Finally, the results of each subjob will be concatenated with the hidden regions
-in the input data prior to their display in a new window. This approach ensures
-that 1) hidden column boundaries in the input data are preserved in the
-resulting alignment - in a similar fashion to the constraint that hidden columns
-place on alignment editing (see Section \ref{lockededits} and 2) hidden
-columns can be used to preserve existing parts of an alignment whilst the
-visible parts are locally refined.
  
  \exercise{Multiple Sequence Alignment}{
  \exstep{ Close all windows and open the alignment at {\sf
@@ -2215,6 +2291,20 @@ service again and explore the effect of local alignment on the non-homologous pa
  N-terminal region.} 
  }
  
+\subsubsection{Alignments of Sequences that include Hidden Regions}
+
+If the view or selected region that is submitted for alignment contains hidden
+regions, then {\bf only the visible sequences will be submitted to the service}.
+Furthermore, each contiguous segment of sequences will be aligned independently
+(resulting in a number of alignment `subjobs' appearing in the status window).
+Finally, the results of each subjob will be concatenated with the hidden regions
+in the input data prior to their display in a new window. This approach ensures
+that 1) hidden column boundaries in the input data are preserved in the
+resulting alignment - in a similar fashion to the constraint that hidden columns
+place on alignment editing (see Section \ref{lockededits}) and 2) hidden
+columns can be used to preserve existing parts of an alignment whilst the
+visible parts are locally refined.
+
  
  \subsection{Customising the Parameters used for Alignment}
  
@@ -2378,7 +2468,7 @@ Jalview provides support for sequence analysis in two ways. A number of
  analytical methods are `built-in', these are accessed from the {\sl Calculate}
  alignment window menu. Computationally intensive analyses are run outside
  Jalview {\sl via} web services - these are typically accessed {\sl via} the {\sl
-Web Service} menu, and described in \ref{jvwebservices} and subsequent sections.
+Web Service} menu, and described in chapter \ref{jvwebservices}.
  In this section, we describe the built-in analysis capabilities common to both
  the Jalview Desktop and the JalviewLite applet.
   
@@ -2395,9 +2485,9 @@ how to overcome them, were discussed in Section \ref{memorylimits}.
  \subsubsection{What is PCA?}
  Principal components analysis is a technique for examining the structure of
  complex data sets. The components are a set of dimensions formed from the
-measured values in the data set, and the principle component is the one with the
+measured values in the data set, and the principal component is the one with the
  greatest magnitude, or length. The sets of measurements that differ the most
-should lie at either end of this principle axis, and the other axes correspond
+should lie at either end of this principal axis, and the other axes correspond
  to less extreme patterns of variation in the data set.
  In this case, the components are generated by an eigenvector decomposition of
  the matrix formed from the sum of pairwise substitution scores at each aligned
@@ -2415,9 +2505,23 @@ method to be toggled between SeqSpace and a variant calculation that is detailed
  in Jalview's built in documentation.\footnote{See
  \url{http://www.jalview.org/help/html/calculations/pca.html}.}
  
+
+\exercise{Principal Component Analysis}{ \exstep{Load the alignment at
+\textsf{http://www.jalview.org/tutorial/alignment.fa} }
+\exstep{Select the menu option {\sl Calculate $\Rightarrow$ Principal Component Analysis}.
+A new window will open. Move this window so that the tree, alignment and PCA viewer window are all visible.
+Try rotating the plot by clicking and dragging the mouse on the plot in the PCA window.
+Note that clicking on points in the plot will highlight them on the alignment. } 
+\exstep{ Select {\sl Calculate $\Rightarrow$ Calculate Tree $\Rightarrow$
+Neighbour Joining Using BLOSUM62}. A new tree window will appear.
+Click on the tree window. Careful selection of the tree partition location will divide the alignment into a number of groups, 
+each of a different colour.
+Note how the colour of the sequence ID label matches both the colour of
+the partitioned tree and the points in the PCA plot.} }
+
  \subsubsection{The PCA Viewer}
  
-PCA analysis can be launched from the {\sl Calculate $\Rightarrow$ Principle
+PCA analysis can be launched from the {\sl Calculate $\Rightarrow$ Principal
  Component Analysis} menu option. {\bf PCA requires a selection containing at
  least 4 sequences}.  A window opens containing the PCA tool (Figure \ref{PCA}).
  Each sequence is represented by a small square, coloured by the background
@@ -2426,6 +2530,13 @@ dragging the left mouse button and zoomed using the $\uparrow$ and $\downarrow$
  keys or the scroll wheel of the mouse (if available).  A tool tip appears if the
  cursor is placed over a sequence. Sequences can be selected by clicking on them.
  [CTRL]-Click can be used to select multiple sequences.
+
+Labels can be shown for each sequence by toggling the {\sl View $\Rightarrow$
+Show Labels} menu option, and the plot background colour can be changed {\sl via} the
+{\sl View $\Rightarrow$ Background Colour..} dialog box. A graphical
+representation of the PCA plot can be exported as an EPS or PNG image {\sl via}
+the {\sl File $\Rightarrow$ Save As $\Rightarrow$ \ldots } submenu.
+
  \begin{figure}[hbtp]
  \begin{center}
  \includegraphics[width=2in]{images/PCA1.pdf}
@@ -2435,19 +2546,7 @@ cursor is placed over a sequence. Sequences can be selected by clicking on them.
  \end{center}
  \end{figure}
  
-Labels will be shown for each sequence by toggling the {\sl View $\Rightarrow$
-Show Labels} menu option, and the plot background colour changed {\sl via} the
-{\sl View $\Rightarrow$ Background Colour..} dialog box. A graphical
-representation of the PCA plot can be exported as an EPS or PNG image {\sl via}
-the {\sl File $\Rightarrow$ Save As $\Rightarrow$ \ldots } submenu.
  
-\exercise{Principle Component Analysis}{ \exstep{Load the alignment at
-\textsf{http://www.jalview.org/examples/exampleFile.jar} and press [ESC] to clear any selections. Alternatively, select {\sl Select $\Rightarrow$ Undefine Groups} to remove all groups and colourschemes. } \exstep{Select the menu option {\sl Calculate $\Rightarrow$ Principle Component Analysis}. A new window will open. Move this window so that the tree, alignment and PCA viewer window are all visible. Try rotating the plot by clicking and dragging the mouse on the plot in the PCA window. Note that clicking on points in the plot will highlight them on the alignment and tree. }
-\exstep{ Click on the tree window. Careful selection of the tree partition
-location will divide the alignment into a number of groups, each of a different
-colour. Note how the colour of the sequence ID label matches both the colour of
-the partitioned tree and the points in the PCA plot.
-} }
  
  \subsubsection{PCA Data Export}
  Although the PCA viewer supports export of the current view, the plots produced
@@ -2527,6 +2626,20 @@ The {\sl View $\Rightarrow$ Associated Nodes With $\Rightarrow$ .. } submenu is
  } \parbox[c]{3in}{\centerline{
  \includegraphics[width=2.5in]{images/pca_vmenu.pdf} }}
  
+\subsection{Tree Based Conservation Analysis}
+\label{treeconsanaly}
+
+Trees reflect the pattern of global sequence similarity exhibited by the
+alignment, or region within the alignment, that was used for their calculation.
+The Jalview tree viewer enables sequences to be partitioned into groups based
+on the tree. This is done by clicking within the tree viewer window. Once subdivided, the
+conservation between and within groups can be visually compared in order to
+better understand the pattern of similarity revealed by the tree and the
+variation within the clades partitioned by the grouping. The conservation based
+colourschemes and the group associated conservation and consensus annotation
+(enabled using the alignment window's {\sl View $\Rightarrow$ Autocalculated
+Annotation $\Rightarrow$ Group Conservation} and {\sl Group Consensus} options)
+can help when working with larger alignments.
  
  \exercise{Trees}{
  \exstep{Ensure that you have at least 1G memory available in Jalview
@@ -2536,7 +2649,8 @@ or in the Development section of the Jalview web site
  (\href{http://www.jalview.org/development/development-builds}{http://www.jalview.org/development/development-builds})
  in the ``latest official build'' row in the table, go to the
  ``Webstart'' column, click on ``G2''.)}
-\exstep{Open the alignment at \textsf{http://www.jalview.org/tutorial/alignment.fa}. Select {\sl Calculate $\Rightarrow$ Calculate Tree $\Rightarrow$ Neighbour Joining Using BLOSUM62}. A new tree window will appear.}
+\exstep{Open the alignment at \textsf{http://www.jalview.org/tutorial/alignment.fa}. 
+Select {\sl Calculate $\Rightarrow$ Calculate Tree $\Rightarrow$ Neighbour Joining Using BLOSUM62}. A new tree window will appear.}
  \exstep{Click on the tree window. A cursor will appear. Note that placing this cursor divides the tree into a number of groups by colour. Place the cursor to give about 4 groups, then select {\sl Calculate $\Rightarrow$ Sort $\Rightarrow$ By Tree Order $\Rightarrow$ Neighbour Joining Tree using BLOSUM62 from ... }. The sequences are reordered to match the order in the tree and groups are formed implicitly.}
  \exstep{Select {\sl Calculate $\Rightarrow$ Calculate Tree $\Rightarrow$
  Neighbour Joining Using \% Identity}. A new tree window will appear. The group colouring 
@@ -2554,21 +2668,6 @@ This demonstrates the use of the {\sl Pad Gaps } editing preference, which ensur
  
  }
  
-\subsection{Tree Based Conservation Analysis}
-\label{treeconsanaly}
-
-Trees reflect the pattern of global sequence similarity exhibited by the
-alignment, or region within the alignment, that was used for their calculation.
-The Jalview tree viewer enables sequences to be partitioned into groups based
-on the tree. This is done by clicking within the tree viewer window. Once subdivided, the
-conservation between and within groups can be visually compared in order to
-better understand the pattern of similarity revealed by the tree and the
-variation within the clades partitioned by the grouping. The conservation based
-colourschemes and the group associated conservation and consensus annotation
-(enabled using the alignment window's {\sl View $\Rightarrow$ Autocalculated
-Annotation $\Rightarrow$ Group Conservation} and {\sl Group Consensus} options)
-can help when working with larger alignments.
-
  \exercise{Tree Based Conservation Analysis}{
  \label{consanalyexerc}
  \exstep{Load the PF03460 PFAM seed alignment using the sequence fetcher. Colour it with the {\sl Taylor colourscheme}, and apply {\sl Conservation } shading. }
@@ -2582,6 +2681,7 @@ within the View menu to aid navigation.}
  {\sl Note: You may want to save the alignment and tree as a project file, since
  it is used in the next few exercises. } }
  
+
  \subsection{Redundancy Removal}
  
  The redundancy removal dialog box is opened using the {\sl Edit $\Rightarrow$ Remove Redundancy\ldots} option in the alignment menu. As its menu option placement suggests, this is actually an alignment editing function, but it is convenient to describe it here. The redundancy removal dialog box presents a percentage identity slider which sets the redundancy threshold. Aligned sequences which exhibit a percentage identity greater than the current threshold are highlighted in black. The [Remove] button can then be used to delete these sequences from the alignment as an edit operation\footnote{Which can usually be undone. A future version of Jalview may allow redundant sequences to be hidden, or represented by a chosen sequence, rather than deleted.}.
@@ -2593,20 +2693,6 @@ The redundancy removal dialog box is opened using the {\sl Edit $\Rightarrow$ Re
  \caption{The Redundancy Removal dialog box opened from the edit menu. Sequences that exceed the current percentage identity threshold and are to be removed are highlighted in black.}
  \end{figure}
  
-\exercise{Remove Redundant Sequences}{
-
-\exstep{Re-use or recreate the alignment and tree which you worked with in the
-tree based conservation analysis exercise (exercise \ref{consanalyexerc}). In
-the alignment window, you may need to deselect groups using Esc key.}
-\exstep{In the Edit menu select Remove Redundancy to open the Redundancy
-threshold selection dialog. Adjust the redundancy threshold value, start
-at 50 and increase the value to 65. Sequences selected will change colour in the Sequence ID panel. Select ``Remove'' to
-remove the sequences that are more than 65\% similar under this alignment.}
-\exstep{Select the Tree viewer's {\sl View $\Rightarrow$ Mark Unlinked Leaves} option, and note that the removed sequences are now prefixed with a * in the tree view.}
-\exstep{Use the [Undo] button in the Redundancy threshold selection dialog box
-to recover the sequences. Note that the * symbols disappear from the tree display.}
-\exstep{Experiment with the redundancy removal and observe the relationship between the percentage identity threshold and the pattern of unlinked nodes in the tree display.}
-}
  
  \subsection{Subdividing the Alignment According to Specific Mutations}
  
@@ -2627,49 +2713,53 @@ selected region, and Jalview's group based conservation analysis annotation and
  colourschemes can then be used to reveal any associated pattern of sequence
  variation across the whole alignment.
  
-\subsection{Automated Annotation of Alignments and Groups}
  
-On loading a sequence alignment, Jalview will normally\footnote{Automatic
-annotation can be turned off in the {\sl Visual } tab in the {\sl Tools
-$\Rightarrow$ Preferences } dialog box.} calculate a set of automatic annotation
-rows which are shown below the alignment. For nucleotide sequence alignments,
-only an alignment consensus row will be shown, but for amino acid sequences,
-alignment quality (based on BLOSUM 62) and physicochemical conservation will
-also be shown. Conservation is calculated according to Livingstone and
-Barton\footnote{{\sl ``Protein Sequence Alignments: A Strategy for the
-Hierarchical Analysis of Residue Conservation." } Livingstone C.D. and Barton
-G.J. (1993) {\sl CABIOS } {\bf 9}, 745-756}.
-Consensus is the modal residue (or {\tt +} where there is an equal top residue).
-The inclusion of gaps in the consensus calculation can be toggled by
-right-clicking on the the Consensus label and selecting {\sl Ignore Gaps in
-Consensus} from the pop-up context menu located with consensus annotation row.
-Quality is a measure of the inverse likelihood of unfavourable mutations in the alignment. Further details on these
-calculations can be found in the on-line documentation.
+% These annotations can be hidden and deleted via the context menu linked to the
+% annotation row; but they are only created on loading an alignment. If they are
+% deleted then the alignment should be saved and then reloaded to restore them.
+% Jalview provides a toggle to autocalculate a consensus sequence upon editing.
+% This is normally selected by default, but can be turned off for large alignments {\sl via} the {\sl Calculate $\Rightarrow$ Autocalculate
+% Consensus} menu option if the interface is too slow.
  
-These annotations can be hidden and deleted via the context menu linked to the
-annotation row; but they are only created on loading an alignment. If they are
-deleted then the alignment should be saved and then reloaded to restore them.
-Jalview provides a toggle to autocalculate a consensus sequence upon editing. This is normally selected by default, but can be turned off for
-large alignments {\sl via} the {\sl Calculate $\Rightarrow$ Autocalculate
-Consensus} menu option if the interface is too slow.
+% \subsubsection{Group Associated Annotation}
+% \label{groupassocannotation}
+% Group associated consensus and conservation annotation rows reflect the
+% sequence variation within a particular group. Their calculation is enabled
+% by selecting the {\sl Group Conservation} or {\sl Group Consensus} options in
+% the {\sl Annotation $\Rightarrow$ Autocalculated Annotation } submenu of the
+% alignment window. 
  
-\subsubsection{Group Associated Annotation}
-\label{groupassocannotation}
-Group associated consensus and conservation annotation rows reflect the
-sequence variation within a particular group. Their calculation is enabled
-by selecting the {\sl Group Conservation} or {\sl Group Consensus} options in
-the {\sl Annotation $\Rightarrow$ Autocalculated Annotation } submenu of the
-alignment window. 
+% \subsubsection{Alignment and Group Sequence Logos}
+% \label{seqlogos}
  
-\subsubsection{Alignment and Group Sequence Logos}
-\label{seqlogos}
+% The consensus annotation row that is shown below the alignment can be overlaid
+% with a sequence logo that reflects the symbol distribution at each column of
+% the alignment. Right click on the Consensus annotation row and select the {\sl
+% Show Logo} option to display the Consensus profile for the group or alignment.
+% Sequence logos can be enabled by default for all new alignments {\sl via} the
+% Visual tab in the Jalview desktop's preferences dialog box.
+
+\section{Pairwise Alignments}
+Jalview can calculate optimal pairwise alignments between arbitrary 
+sequences {\sl via} the {\sl Calculate $\Rightarrow$ Pairwise Alignments\ldots} menu option. 
+Global alignments of all pairwise combinations of the selected sequences are performed and the results returned in a text box.
  
-The consensus annotation row that is shown below the alignment can be overlaid
-with a sequence logo that reflects the symbol distribution at each column of
-the alignment. Right click on the Consensus annotation row and select the {\sl Show
-Logo} option to display the Consensus profile for the group or alignment.
-Sequence logos can be enabled by default for all new alignments {\sl via} the
-Visual tab in the Jalview desktop's preferences dialog box.
+
+
+\exercise{Remove Redundant Sequences}{
+
+\exstep{Re-use or recreate the alignment and tree which you worked with in the
+tree based conservation analysis exercise (exercise \ref{consanalyexerc}). In
+the alignment window, you may need to deselect groups using the [ESC] key.}
+\exstep{In the Edit menu select Remove Redundancy to open the Redundancy
+threshold selection dialog. Adjust the redundancy threshold value, starting
+at 50 and increasing it to 65. Selected sequences will change colour in the Sequence ID panel. Select ``Remove'' to
+remove the sequences that are more than 65\% similar under this alignment.}
+\exstep{Select the Tree viewer's {\sl View $\Rightarrow$ Mark Unlinked Leaves} option, and note that the removed sequences are now prefixed with a * in the tree view.}
+\exstep{Use the [Undo] button in the Redundancy threshold selection dialog box
+to recover the sequences. Note that the * symbols disappear from the tree display.}
+\exstep{Experiment with the redundancy removal and observe the relationship between the percentage identity threshold and the pattern of unlinked nodes in the tree display.}
+}
  
  \exercise{Group Conservation Analysis}{
  \exstep{Re-use or recreate the alignment and tree which you worked with in the
@@ -2701,13 +2791,6 @@ both of the columns that you wish to use to subdivide the alignment.}}
  the tree groups made in the previous exercise.}
  }
  
-\subsection{Other Calculations}
-
-
-\subsubsection{Pairwise Alignments}
-
-Jalview can calculate optimal pairwise alignments between arbitrary sequences {\sl via} the {\sl Calculate $\Rightarrow$ Pairwise Alignments\ldots} menu option. Global alignments of all pairwise combinations of the selected sequences are performed and the results returned in a text box.
-
  \begin{figure}[]
  \begin{center}
  \includegraphics[width=4in]{images/pairwise.pdf}
@@ -2716,38 +2799,37 @@ Jalview can calculate optimal pairwise alignments between arbitrary sequences {\
  \end{center}
  \end{figure}
  
-\pagebreak[2]
  
  \chapter{Working with 3D structures}
  \label{3Dstructure}
  
  
-This chapter describes the annotation, analysis, and visualization tasks that
-the Jalview Desktop can perform.
-
-Section \ref{wkwithstructure} introduces the structure visualization
-capabilities of Jalview. In Section \ref{alignanalysis}, you will find
-descriptions and exercises on building and displaying trees, PCA analysis,
-alignment redundancy removal, pairwise alignments and alignment conservation
-analysis. Section \ref{jvwebservices} introduces the various web based services
-available to Jalview users, and Section \ref{jabaservices} explains how to
-configure the Jalview Desktop for access to new JABAWS servers.
+To summarise, Section \ref{featannot} describes the mechanisms provided by
+Jalview for interactive creation of sequence and alignment annotation, and how they can be
+displayed, imported and exported and used to reorder the alignment. Section
+\ref{featuresfromdb} discusses the retrieval of database references and
+establishment of sequence coordinate systems for the retrieval and display of
+features from databases and DAS annotation services.
  Section \ref{msaservices} describes how to use the range of multiple alignment
  programs provided by JABAWS, and Section \ref{aacons} introduces JABAWS AACon
  service for protein multiple alignment conservation analysis.
+ In Section \ref{alignanalysis}, you will find
+descriptions and exercises on building and displaying trees, PCA analysis,
+alignment redundancy removal, pairwise alignments and alignment conservation
+analysis. 
+Section \ref{wkwithstructure} introduces the structure visualization
+capabilities of Jalview.
  Section \ref{protsspredservices} explains how to perform protein secondary
  structure predictions with JPred, and JABAWS protein disorder prediction
  services are introduced in Section \ref{protdisorderpred}.
-
-Section \ref{featannot} describes the mechanisms provided by Jalview for
-interactive creation of sequence and alignment annotation, and how they can be
-displayed, imported and exported and used to reorder the alignment. Section
-\ref{featuresfromdb} discusses the retrieval of database references and
-establishment of sequence coordinate systems for the retrieval and display of
-features from databases and DAS annotation services. Section
-\ref{workingwithnuc} describes functions and visualization techniques relevant
+Section \ref{workingwithnuc} describes functions and visualization techniques relevant
  to working with nucleotide sequences, coding region annotation and nucleotide
  sequence alignments.
+Section \ref{jvwebservices} introduces the various web based services
+available to Jalview users, and Section \ref{jabaservices} explains how to
+configure the Jalview Desktop for access to new JABAWS servers.
+
+ 
  % and Section \ref{workingwithrna} covers the visualization,
  % editing and analysis of RNA secondary structure.
  
@@ -3246,8 +3328,8 @@ Each service operates on sequences in the alignment to identify regions likely
  to be unstructured or flexible, or alternately, fold to form globular domains.
  As a consequence, disorder predictor results include both sequence features and
  sequence associated alignment annotation rows. Section \ref{featannot} describes
-the manipulation and display of these data in detail, and {\bf Figure
-\ref{alignmentdisorder}} demonstrates how sequence feature shading and
+the manipulation and display of these data in detail, and Figure
+\ref{alignmentdisorder} demonstrates how sequence feature shading and
  thresholding (described in Section \ref{featureschemes}) can be used to
  highlight differences in disorder prediction across aligned sequences.
  
@@ -3291,7 +3373,7 @@ alignment.} }
  
  \subsubsection{Navigating Large Sets of Disorder Predictions}
  
-{\bf Figure \ref{alignmentdisorderannot}} shows a single sequence annotated with
+Figure \ref{alignmentdisorderannot} shows a single sequence annotated with
  a range of disorder predictions. Disorder prediction annotation rows are
  associated with a sequence in the same way as secondary structure prediction
  results. When browsing an alignment containing large numbers of disorder