+\subsection{Configuring which AACon Calculations are Performed}
+The {\sl Web Service $\Rightarrow$ Conservation $\Rightarrow$ Change AACon
+Settings ...} menu entry will open a web services parameter dialog for the
+currently configured AACon server. Standard presets are provided for quick and
+more expensive conservation calculations, and parameters are also provided to
+change the way that SMERFS calculations are performed.
+AACon settings for an alignment are saved in Jalview projects along with the
+latest calculation results.
+
+\subsection{Changing the Server used for AACon Calculations}
+If you are working with alignments too large to analyse with the public JABAWS
+server, then you will most likely have already configured additional JABAWS
+servers. By default, Jalview will choose the first AACon service available from
+the list of configured JABAWS servers. You can change the AACon service by
+selecting a different one from the {\sl Web Service $\Rightarrow$
+Conservation $\Rightarrow$ Change AACon Settings} submenu.
+Alternatively, to add a new service, go to the desktop window menu, select {\sl Tools $\Rightarrow$
+Preferences $\Rightarrow$ Web Services tab} and add {\sl New Services URL}, then use the {\sl move up} or {\sl move down} buttons
+to reorder the services.
+
+
+\chapter{Analysis of Alignments}
+\label{alignanalysis}
+Jalview provides support for sequence analysis in two ways. A number of
+analytical methods are `built-in'; these are accessed from the {\sl Calculate}
+alignment window menu. Computationally intensive analyses are run outside
+Jalview {\sl via} web services, and are found under the
+{\sl Web Service} menu. In this chapter, we describe the built-in analysis
+capabilities common to both the Jalview Desktop and JalviewJS.
+
+\section{PCA}
+Principal components analysis calculations create a spatial
+representation of the similarities within the current selection, or within the whole alignment if no selection has been made. After
+the calculation finishes, a 3D viewer displays each sequence as a point in
+3D `similarity space'. Sets of similar sequences tend to lie near each other in
+this space.
+Note: The calculation is computationally expensive, and may fail for very large
+sets of sequences if the JVM runs out of memory. Memory issues, and
+how to overcome them, are discussed in Section \ref{memorylimits}.
+
+\subsubsection{What is PCA?}
+Principal components analysis is a technique for examining the structure of
+complex data sets. The components are a set of dimensions formed from the
+measured values in the data set, and the principal component is the one with the
+greatest magnitude, or length. The sets of measurements that differ the most
+should lie at either end of this principal axis, and the other axes correspond
+to less extreme patterns of variation in the data set.
+In this case, the components are generated by an eigenvector decomposition of
+the matrix formed from the sum of pairwise substitution scores at each aligned
+position between each pair of sequences. The basic method is described in the
+1995 paper by {\sl G. Casari, C. Sander} and {\sl A. Valencia}\footnote{{\sl
+Nature Structural Biology} (1995) {\bf 2}, 171--8.
+PMID: 7749921} and implemented at the SeqSpace server at the EBI.\footnote{See \url{http://www.jalview.org/help/html/calculations/pca.html}.}
+%
+% Jalview provides two different options for the PCA calculation: SeqSpace and
+% Jalview mode. In SeqSpace mode, PCAs are computed using the identity matrix, and
+% gaps are treated as 'the unknown residue' (this actually differs from the
+% original SeqSpace paper, and will be adjusted in a future version of Jalview).
+% In Jalview mode, PCAs are computed using the chosen score matrix - which for
+% protein sequences, defaults to BLOSUM 62, and for nucleotides, is the
+% DNA identity matrix that also treats Us and Ts as identical, to support analysis
+% of both RNA and DNA alignments.
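+
+As a hedged sketch of the decomposition described above (the symbols $S$, $M$,
+$a_{ik}$, $\lambda_m$ and $\mathbf{v}_m$ are notation introduced here for
+illustration, not taken from the Jalview source), let $a_{ik}$ be the symbol in
+sequence $i$ at aligned column $k$ of an alignment with $L$ columns, and let
+$M(x,y)$ be the substitution score for symbols $x$ and $y$. The similarity
+matrix and its eigenvector decomposition can then be written as
+\[
+S_{ij} = \sum_{k=1}^{L} M(a_{ik}, a_{jk}), \qquad
+S\,\mathbf{v}_m = \lambda_m \mathbf{v}_m ,
+\]
+with the eigenvalues ordered so that $\lambda_1 \geq \lambda_2 \geq \ldots$ The
+$i$th entries of the three leading eigenvectors then give, up to a choice of
+scaling, the position of sequence $i$ in the 3D plot. How gaps are scored and
+how the axes are scaled are assumptions left open in this sketch, and may
+differ in Jalview.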
+
+\subsubsection{The PCA Viewer}
+
+PCA can be launched from the {\sl Calculate $\Rightarrow$ Tree or PCA\ldots} menu option.
+{\bf PCA requires at least 4 sequences in the selection} (or in the whole
+alignment if nothing is selected). In the Choose Calculation window, select the {\sl Principal Components Analysis} button and then select {\sl Calculate}
+(Figure \ref{PCA}).
+Each sequence is represented by a small square, coloured by the background
+colour of the sequence ID label. The axes can be rotated by clicking and
+dragging the left mouse button and zoomed using the $\uparrow$ and $\downarrow$
+keys or the scroll wheel of the mouse (if available). A tool tip appears if the
+cursor is placed over a sequence. Sequences can be selected by clicking on them.
+[CTRL]-Click can be used to select multiple sequences.
+
+Labels for each sequence can be shown by toggling the {\sl View $\Rightarrow$
+Show labels} menu option, and the plot background colour can be changed {\sl via} the
+{\sl View $\Rightarrow$ Background Colour\ldots} dialog box. A graphical
+representation of the PCA plot can be exported as an EPS or PNG image {\sl via}
+the {\sl File $\Rightarrow$ Save as $\Rightarrow$ \ldots } submenu.
+
+\exercise{Principal Component Analysis}
+{\label{pcaex}
+\exstep{Load the alignment at
+\textsf{http://www.jalview.org/tutorial/alignment.fa}.}
+\exstep{Select the menu option {\sl Calculate $\Rightarrow$ Tree or PCA\ldots} in the alignment
+window and a dialogue box will open. Select the Principal Component Analysis option
+and then click the Calculate button.}
+\exstep{Move
+this window within the desktop so that the alignment and PCA viewer windows are visible.
+Try rotating the plot by clicking and dragging the mouse on the plot in the PCA window.
+Note that clicking on points in the plot will highlight the sequences on the
+alignment.}
+\exstep{Use the [ESC] key to clear the sequence selection.
+Select {\sl Calculate $\Rightarrow$ Tree or PCA\ldots} in the alignment window. In the dialogue box, select Neighbour
+Joining and in the drop-down list select BLOSUM62. Click the Calculate button
+and a tree window will open.}
+\exstep{Click on the tree to place the
+tree partition so that it divides the tree into a number of groups, each with a
+different (arbitrarily selected) colour.
+Note how the colour of the sequence ID label matches both the colour of
+the partitioned tree and the points in the PCA plot.}
+{\bf See the video at:
+\url{http://www.jalview.org/training/Training-Videos}.}
+}
+
+\begin{figure}[hbtp]
+\begin{center}
+\includegraphics[width=2in]{images/PCA1.pdf}
+\includegraphics[width=3in]{images/PCA3.pdf}
+\caption{{\bf PCA Analysis.} }
+\label{PCA}
+\end{center}
+\end{figure}
+
+
+
+\subsubsection{PCA Data Export}
+Although the PCA viewer supports export of the current view, the plots produced
+are rarely suitable for direct publication. The PCA viewer's {\sl File} menu
+includes a number of options for exporting the PCA matrix and transformed points
+as comma separated value (CSV) files. These files can be imported by tools such
+as {\bf R} or {\bf gnuplot} in order to graph the data.
+
+\section{Trees}
+\label{trees}
+
+Jalview can calculate and display trees, providing interactive tree-based
+grouping of sequences through a tree viewer. All trees are calculated {\sl via}
+the {\sl Calculate $\Rightarrow$ Tree or PCA \ldots} menu option.
+Trees can be calculated from distance matrices determined from \% identity or
+aggregate BLOSUM 62 score using either {\sl Average Distance} (UPGMA) or {\sl
+Neighbour Joining} algorithms. The input data for a tree is either the selected
+region or the whole alignment, excluding any hidden regions.
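+
+As a brief sketch of the two clustering rules (standard textbook forms, written
+in notation introduced here; Jalview's implementation details may differ), let
+$d(i,j)$ be the distance between clusters $i$ and $j$, and $n$ the current
+number of clusters. Average Distance (UPGMA) repeatedly merges the closest pair
+of clusters $A$ and $B$, and updates the distance to every other cluster $C$ as
+a size-weighted average:
+\[
+d(A \cup B, C) = \frac{|A|\,d(A,C) + |B|\,d(B,C)}{|A| + |B|} .
+\]
+Neighbour Joining instead merges the pair that minimises
+\[
+Q(i,j) = (n-2)\,d(i,j) - \sum_{k=1}^{n} d(i,k) - \sum_{k=1}^{n} d(j,k) ,
+\]
+which corrects for clusters that are distant from all others, and so does not
+assume a constant rate of change along every branch.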
+
+On calculating a tree, a new window opens (Figure \ref{trees1}) which contains
+the tree. Various display settings can be found in the tree window {\sl View}
+menu, including font, scaling and label display options. The {\sl File
+$\Rightarrow$ Save As} submenu contains options for image and Newick file
+export. Newick format is a standard file format for trees which allows them to
+be exported to other programs. Jalview can also read in external trees in
+Newick format {\sl via} the {\sl File $\Rightarrow$ Load Associated Tree} menu
+option. Leaf names on imported trees will be matched to the associated
+alignment; unmatched leaves will still be displayed, and can be highlighted using the
+{\sl View $\Rightarrow$ Mark Unlinked Leaves} menu option.
+
+
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=2.5in]{images/trees1.pdf}
+\includegraphics[width=2.5in]{images/trees2.pdf}
+\includegraphics[width=1.25in]{images/trees4.pdf}
+\caption{{\bf Calculating Trees} Jalview provides a range of options
+for calculating trees.
+Jalview can also load precalculated trees in Newick format (right).}
+\label{trees1}
+\end{center}
+\end{figure}
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=5in]{images/trees3.pdf}
+\caption{{\bf Interactive Trees} The tree level cutoff can be used to designate
+groups in Jalview.}
+\label{trees2}
+\end{center}
+\end{figure}
+
+Clicking on the tree brings up a cursor across the height of the tree. The
+sequences are automatically partitioned and coloured (Figure \ref{trees2}). To
+group them together, select the {\sl Calculate $\Rightarrow$ Sort $\Rightarrow$
+By Tree Order $\Rightarrow$ \ldots} alignment window menu option and choose the
+correct tree. The sequences will then be sorted according to the leaf order
+currently shown in the tree view. The coloured background to the sequence IDs
+can be removed with {\sl Select $\Rightarrow$ Undefine Groups} from the
+alignment window menu. Note that tree partitioning will also remove any groups
+and colourschemes on a view, so create a new view ([CTRL-T]) if you wish to
+preserve these.
+
+
+\exercise{Trees}
+{\label{treeex}
+
+\exstep{Open the alignment at
+\textsf{http://www.jalview.org/tutorial/alignment.fa}.}
+
+\exstep{Select {\sl Calculate $\Rightarrow$ Tree or PCA\ldots} in the alignment
+window menu and a dialogue box opens. In the tree section select Neighbour
+Joining, in the drop-down list select BLOSUM62 and click the Calculate
+button. A tree window will open.}
+
+\exstep{Click on the
+tree window and a cursor will appear as a vertical line. Note that clicking
+places this cursor and divides the tree into a number of groups, each highlighted
+with a different colour. Place the cursor to give about 4 groups.}
+
+\exstep{Place the mouse cursor on a node of the tree to open a tool tip. Double click the node to invert the leaves.
+}
+
+\exstep{In the tree window, select {\sl View $\Rightarrow$ Sort Alignment
+by Tree}. The sequences are reordered to match the order in the tree and groups
+ are formed implicitly. Alternatively in the alignment window, select
+{\sl Calculate $\Rightarrow$ Sort $\Rightarrow$ By Tree Order $\Rightarrow$
+ Neighbour Joining Tree using BLOSUM62 from...}.}
+
+\exstep{Select {\sl Calculate $\Rightarrow$ Tree or PCA\ldots} in the alignment
+window. In the dialogue box, select Average Distance and in the drop-down
+list select BLOSUM62. Click the Calculate button and a new
+tree window will appear. The group colouring makes it easy to see the differences between the two
+trees calculated by the different methods.}
+
+\exstep{With no groups selected in the alignment window, select the region from
+sequence 2, column 60 to sequence 12, column 123. Select {\sl Calculate $\Rightarrow$
+Tree or PCA\ldots}, in the dialogue box select Neighbour Joining and
+BLOSUM62, then click the Calculate button.
+ A tree will appear containing 11 sequences. It has been coloured
+ according to the groups already defined from the first tree and is calculated purely from the residues
+ in the selection.}
+
+Comparing the location of individual sequences between the three trees illustrates the importance of selecting appropriate regions of the
+alignment for the calculation of trees.
+
+{\bf See the video at:
+\url{http://www.jalview.org/training/Training-Videos}.}
+
+}
+
+%\subsubsection{Multiple Views and Input Data recovery from PCA and Tree Viewers}
+% move to ch. 3 ?
+%Both PCA and Tree viewers are linked analysis windows. This means that their selection and display are linked to a particular alignment, and control and reflect the selection state for a particular view.
+
+\subsubsection{Recovering input data for a Tree or PCA Plot Calculation}
+\parbox[c]{5in}{
+The {\sl File $\Rightarrow$ Input Data } option will open a new alignment window containing the original data used to calculate the tree or PCA plot (if available). This function is useful when a tree has been created and then the alignment subsequently changed.
+}
+\parbox[c]{1.25in}{\centerline{\includegraphics[width=1.25in]{images/pca_fmenu.pdf}
+}}
+
+\subsubsection{Changing the associated view for a Tree or PCA Viewer}
+\parbox[c]{4in}{
+The {\sl View $\Rightarrow$ Associate Nodes With $\Rightarrow$ .. } submenu is shown when the viewer is associated with an alignment that is involved in multiple views. Selecting a different view does not affect the tree or PCA data, but changes how sequences are coloured and shown as selected in the viewer, so that they follow the colouring and selection state of the newly associated view.
+} \parbox[c]{3in}{\centerline{
+\includegraphics[width=2.5in]{images/pca_vmenu.pdf} }}
+
+\subsection{Tree Based Conservation Analysis}
+\label{treeconsanaly}
+
+Trees reflect the pattern of global sequence similarity exhibited by the
+alignment, or region within the alignment, that was used for their calculation.
+The Jalview tree viewer enables sequences to be partitioned into groups based
+on the tree. This is done by clicking within the tree viewer window. Once subdivided, the
+conservation between and within groups can be visually compared in order to
+better understand the pattern of similarity revealed by the tree and the
+variation within the clades partitioned by the grouping. The conservation based
+colourschemes and the group associated conservation and consensus annotation
+(enabled using the alignment window's {\sl Annotations $\Rightarrow$ Autocalculated
+Annotation $\Rightarrow$ Group Conservation} and {\sl Group Consensus} options)
+can help when working with larger alignments.
+
+
+
+
+%\exercise{Pad Gaps in an Alignment}{
+%\exstep{Open the alignment at
+% \textsf{http://www.jalview.org/tutorial/alignment.fa}. In alignment window, ensure that the {\sl Edit $\Rightarrow$ Pad Gaps } option is {\sl not} ticked, and insert one gap anywhere in the
+%alignment.}
+%\exstep{Select {\sl Calculate $\Rightarrow$ Calculate Tree $\Rightarrow$
+%Neighbour Joining Using BLOSUM62}.
+
+%A warning dialog box {\bf ``Sequences not aligned'' } appears because the
+% sequences input to the tree calculation are of different lengths. }
+
+%\exstep{Select {\sl Edit $\Rightarrow$ tick Pad Gaps } and perform the
+%tree calculation again. This time a new tree should appear - because padding
+%gaps ensures all the sequences are the same length after editing.}
+%{\sl Pad Gaps } option
+%can be set in Preferences using
+%{\sl Tool $\Rightarrow$ Preference $\Rightarrow$ Editing}.
+
+%{\bf See the video at:
+%\url{http://www.jalview.org/training/Training-Videos}.}
+%}
+
+ \exercise{Tree Based Conservation Analysis}{
+\label{consanalyexerc}
+\exstep{Load the PF03460 PFAM seed alignment using the sequence fetcher.
+Select {\sl Colour $\Rightarrow$ Taylor $\Rightarrow$ By Conservation}, set
+ {\sl Conservation} shading threshold at around 20. }
+ \exstep{Build a Neighbour Joining tree by selecting {\sl Calculate
+ $\Rightarrow$ Tree or PCA\ldots} in the alignment window. In the dialogue box, select Neighbour
+Joining and in the drop-down
+list select BLOSUM62, then click the Calculate button.}
+\exstep{Use the cursor to select a point on the tree to partition the
+alignment into groups.}
+\exstep {Select the {\sl View $\Rightarrow$ Sort Alignment By Tree} option in the
+tree window to re-order the sequences in the alignment.
+Examine the variation in colouring between different groups of sequences in the
+ alignment window. }
+\exstep {You may find it easier to browse the alignment if you first uncheck
+ the {\sl Annotations $\Rightarrow$ Show Annotations} option. Open the
+Overview Window from the View menu to aid navigation.}
+
+\exstep{Try changing the colourscheme of the residues in the alignment to
+BLOSUM62 (whilst ensuring that {\sl Apply Colour to All Groups} is selected).}
+{\sl Note: You may want to save the alignment and tree as a project file, since
+it is used in the next set of exercises. }
+
+{\bf See the video at:
+\url{http://www.jalview.org/training/Training-Videos}.}
+}
+
+
+\subsection{Redundancy Removal}
+
+The redundancy removal dialog box is opened using the {\sl Edit $\Rightarrow$ Remove Redundancy\ldots} option
+in the alignment menu. As its menu option placement suggests, this is actually an alignment editing function,
+but it is convenient to describe it here. The redundancy removal dialog box presents a percentage identity
+slider which sets the redundancy threshold. Aligned sequences which exhibit a percentage identity greater
+than the current threshold are highlighted in black. The [Remove] button can then be used to delete these
+sequences from the alignment as an edit operation.
+\begin{figure}
+\begin{center}
+\includegraphics[width=5.5in]{images/redundancy.pdf}
+\end{center}
+\caption{The Redundancy Removal dialog box opened from the edit menu. Sequences that exceed the current percentage identity
+threshold and are to be removed are highlighted in black.}
+\label{removeredundancydialog}
+\end{figure}
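+
+The threshold test can be sketched as follows, in notation introduced here for
+illustration. For two aligned sequences $i$ and $j$,
+\[
+\mathrm{PID}(i,j) = 100 \times
+\frac{\mbox{columns where $i$ and $j$ are identical}}{\mbox{columns compared}} ,
+\]
+and the pair is treated as redundant when $\mathrm{PID}(i,j)$ is greater than
+the slider value $t$. The choice of denominator (for example, whether columns
+containing gaps are counted) is an assumption left open here; it is not
+specified in this section and affects the value obtained. As a worked example,
+two sequences that agree at 52 of 80 compared columns have
+$\mathrm{PID} = 100 \times 52/80 = 65$, which is not greater than a threshold
+of 65, so the pair would not be flagged at that setting.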
+
+
+\subsection{Subdividing the Alignment According to Specific Mutations}
+
+It is often necessary to explore variations in an alignment that may correlate
+with mutations observed in a particular region; for example, sites exhibiting
+single nucleotide polymorphism, or residues involved in substrate recognition in
+an enzyme. One way to do this would be to calculate a tree using the specific
+region, and subdivide it in order to partition the alignment.
+However, calculating a tree can be slow for large alignments, and the tree may
+be difficult to partition when complex mutation patterns are being analysed. The
+{\sl Select $\Rightarrow$ Make groups for selection } function was introduced to
+make this kind of analysis easier. When selected, it will use the characters in
+the currently selected region to subdivide the alignment. For example, if a
+single column is selected, then the alignment (or each group defined on the
+alignment) will be divided into groups based on the residue or nucleotide found
+at that position. These new groups are annotated with the characters in the
+selected region, and Jalview's group based conservation analysis annotation and
+colourschemes can then be used to reveal any associated pattern of sequence
+variation across the whole alignment.
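+
+As a small illustration (a made-up example, not taken from a real alignment),
+suppose a single column is selected and four sequences carry the following
+residues at that position:
+\begin{center}
+\begin{tabular}{ll}
+Sequence & Residue in selected column \\
+\hline
+seq1 & A \\
+seq2 & G \\
+seq3 & A \\
+seq4 & T \\
+\end{tabular}
+\end{center}
+{\sl Make groups for selection} would then be expected to create three groups:
+\{seq1, seq3\} for A, \{seq2\} for G, and \{seq4\} for T.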
+
+
+% These annotations can be hidden and deleted via the context menu linked to the
+% annotation row; but they are only created on loading an alignment. If they are
+% deleted then the alignment should be saved and then reloaded to restore them.
+% Jalview provides a toggle to autocalculate a consensus sequence upon editing.
+% This is normally selected by default, but can be turned off for large alignments {\sl via} the {\sl Calculate $\Rightarrow$ Autocalculate
+% Consensus} menu option if the interface is too slow.
+
+% \subsubsection{Group Associated Annotation}
+% \label{groupassocannotation}
+% Group associated consensus and conservation annotation rows reflect the
+% sequence variation within a particular group. Their calculation is enabled
+% by selecting the {\sl Group Conservation} or {\sl Group Consensus} options in
+% the {\sl Annotation $\Rightarrow$ Autocalculated Annotation } submenu of the
+% alignment window.
+
+% \subsubsection{Alignment and Group Sequence Logos}
+% \label{seqlogos}
+
+% The consensus annotation row that is shown below the alignment can be overlaid
+% with a sequence logo that reflects the symbol distribution at each column of
+% the alignment. Right click on the Consensus annotation row and select the {\sl
+% Show Logo} option to display the Consensus profile for the group or alignment.
+% Sequence logos can be enabled by default for all new alignments {\sl via} the
+% Visual tab in the Jalview desktop's preferences dialog box.
+
+\section{Pairwise Alignments}
+Jalview can calculate optimal pairwise alignments between arbitrary
+sequences {\sl via} the {\sl Calculate $\Rightarrow$ Pairwise Alignments\ldots} menu option.
+Global alignments of all pairwise combinations of the selected sequences are performed and the results returned in a text box.
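+
+A global alignment of this kind is classically computed by dynamic
+programming. As a minimal sketch, assume a substitution score $s(x,y)$ and a
+simple linear gap penalty $g > 0$. Both are notation introduced here; the
+scoring scheme Jalview actually applies is not described in this section and
+may differ, for example by using separate gap opening and extension penalties.
+The best score $F(i,j)$ for aligning the first $i$ residues of sequence $x$
+with the first $j$ residues of sequence $y$ then satisfies
+\[
+F(i,j) = \max \left\{ \begin{array}{l}
+F(i-1,j-1) + s(x_i, y_j) \\
+F(i-1,j) - g \\
+F(i,j-1) - g
+\end{array} \right.
+\]
+with $F(i,0) = -ig$ and $F(0,j) = -jg$. For sequences of lengths $m$ and $n$,
+the optimal alignment score is $F(m,n)$, and the alignment itself is recovered
+by tracing back through the choices that produced each maximum.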
+
+
+
+\exercise{Remove Redundant Sequences}{
+\label{redundantex}
+\exstep{Use the alignment generated in the previous exercise (Exercise
+\ref{consanalyexerc}).
+In the alignment window, you may need to deselect groups using the [ESC] key.}
+
+\exstep{In the {\sl Edit} menu select {\sl Remove Redundancy} to open the
+Redundancy threshold selection dialog. Adjust the redundancy threshold value, starting
+at 50 and increasing the value to 65. The sequences selected for removal will change colour in the Sequence ID panel. Select ``Remove'' to
+remove the sequences that are more than 65\% identical in this alignment.}