+\exstep{Select {\sl File $\Rightarrow$ Export Features\ldots} from the Alignment window. You can choose to export the retrieved features as a GFF file or in Jalview's own Features format.
+% TODO: describe working with features files and GFF
+}
+}
+
+\subsubsection{The Fetch Uniprot IDs Dialog Box}
+\label{discoveruniprotids}
+If any sources are selected which refer to Uniprot coordinates as their reference system,
+then you may be asked if you wish to retrieve Uniprot IDs for your sequence. Pressing OK instructs Jalview to verify the sequences against Uniprot records retrieved using the sequence's ID string. This operates in much the same way as the {\sl Web Service $\Rightarrow$ Fetch Database References } function described in Section \ref{fetchdbrefs}. If a sequence is verified, then the start/end numbering will be adjusted to match the Uniprot record to ensure that features retrieved from the DAS source are rendered at the correct position.
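+
+As a worked illustration with hypothetical numbers: suppose a sequence fragment is numbered 1--80 locally, but is verified as matching residues 41--120 of its Uniprot record. Its start/end numbering is adjusted to 41/120, so a DAS feature reported at Uniprot residue $p = 97$ is drawn on the residue at offset
+\[
+p - \mathit{start} + 1 = 97 - 41 + 1 = 57
+\]
+within the fragment. Without the adjustment, position 97 would lie beyond the end of the 80-residue fragment and the feature could not be placed correctly.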
+
+\subsubsection{Rate of Feature Retrieval}
+Feature retrieval can take some time if a large number of sources is selected and if the alignment
+contains a large number of sequences. This is because Jalview only queries a particular DAS source with one sequence at a time, to avoid overloading it. As features are retrieved, they are immediately added to the current alignment view. The retrieved features are shown on the sequence and can be customised as described previously.
+
+
+\subsection{Colouring Features by Score or Description Text}
+\label{featureschemes}
+Sometimes, you may need to visualize the differences in information carried by
+sequence features of the same type. This is most often the case when features
+of a particular type are the result of a specific type of database query or calculation. Here, they may also carry information within their textual description, or most commonly for calculations, a score related to the property being investigated. Jalview can shade sequence
+features using a graduated colourscheme in order to highlight these variations.
+In order to apply a graduated scheme to a feature type, select the `Graduated
+colour' entry in the Sequence {\sl Feature Type}'s popup menu, which is opened by
+right-clicking the {\sl Feature Type} or {\sl Color} in the {\sl Sequence Feature Settings} dialog box. Two types
+of colouring styles are currently supported: the default is quantitative
+colouring, which shades each feature based on its score, with the highest
+scores receiving the `Max' colour, and the lowest scoring features coloured
+with the `Min' colour. Alternatively, you can select the `Colour by label'
+option to create feature colours according to the description text associated
+with each feature. This is useful for general feature types - such as
+Uniprot's `DOMAIN' feature - where the actual type of domain is given in the
+feature's description.
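+
+As a rough sketch of quantitative colouring, assume (for illustration only; this is not a statement of Jalview's exact implementation) that the colour is interpolated linearly between the `Min' and `Max' colours. For a feature with score $s$, where $s_{\min}$ and $s_{\max}$ are the lowest and highest scores for that feature type, each colour channel $c$ would then be
+\[
+c(s) = c_{\min} + \frac{s - s_{\min}}{s_{\max} - s_{\min}}\,(c_{\max} - c_{\min}).
+\]
+For example, with scores ranging from 0 to 10, a `Min' colour of white $(255, 255, 255)$ and a `Max' colour of red $(255, 0, 0)$, a feature scoring 2.5 lies a quarter of the way along the range and would be shaded $(255, 191.25, 191.25)$, a pale pink.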
+
+Graduated feature colourschemes can also be used to exclude low or
+high-scoring features from the alignment display. This is done by choosing your
+desired threshold type (either above or below), using the drop-down menu in the
+dialog box. Then, adjust the slider or enter a value in the text box to set the
+threshold for displaying this type of feature.
+
+The feature settings dialog box allows you to toggle between a graduated and
+simple feature colourscheme using the pop-up menu for the feature type. When a
+graduated scheme is applied, it will be indicated in the colour column for
+that feature type - with coloured blocks or text to indicate the colouring
+style and a greater than ($>$) or less than ($<$) symbol to indicate when a
+threshold has been defined.
+
+\subsection{Using Features to Re-order the Alignment}
+\label{featureordering}
+The presence of sequence features on certain sequences or in a particular
+region of an alignment can quantitatively identify important trends in
+the aligned sequences. In this case, it is more useful to
+re-order the alignment based on the number of features or their associated scores, rather than simply re-colour the aligned sequences. The sequence feature settings
+dialog box provides two buttons: `Seq sort by Density' and `Seq sort by
+Score', that allow you to reorder the alignment according to the number of
+sequence features present on each sequence, and also according to any scores
+associated with a feature. Each of these buttons uses the currently displayed
+features to determine the ordering, but
+if you wish to re-order the alignment using a single type of feature, then you can do this from the {\sl Feature Type}'s
+popup menu. Simply right-click the type's style in the Sequence Feature Settings dialog
+box, and select one of the {\sl Sort by Score} and {\sl Sort by Density}
+options to re-order the alignment. Finally, if a specific region is selected,
+then only features found in that region of the alignment will be used to
+create the new alignment ordering.
+% \exercise{Shading and Sorting Alignments using Sequence Features}{
+% \label{shadingorderingfeatsex}
+%
+% This exercise is currently not included in the tutorial because no DAS servers
+% currently exist that yield per-residue features for any Uniprot sequence.
+%
+% \exstep{Re-load the alignment from \ref{dasfeatretrexcercise}.
+% }
+% \exstep{Open the
+% feature settings panel, and, after first clearing the current
+% selection, press the {\em Seq Sort by Density} button a few times.}
+% \exstep{Use the DAS fetcher to retrieve the Kyte and Doolittle Hydrophobicity
+% scores for the protein sequences in the alignment.
+% {\sl Hint: the nickname for the das source is `KD$\_$hydrophobicity'.}}
+% \exstep{Change the feature settings so only the hydrophobicity features are
+% displayed. Mouse over the annotation and also export and examine the GFF and
+% Jalview features file to better understand how the hydrophobicity measurements
+% are recorded.}
+% \exstep{Apply a {\sl Graduated Colour} to the hydrophobicity annotation to
+% reveal the variation in average hydrophobicity across the alignment.}
+% \exstep{Select a range of alignment columns, and use one of the sort by feature buttons to order the alignment according to that region's average
+% hydrophobicity.}
+% \exstep{Save the alignment as a project, for use in exercise
+% \ref{threshgradfeaturesex}.} }
+%
+% \exercise{Shading alignments with combinations of graduated feature
+% colourschemes}{
+% \label{threshgradfeaturesex}
+% \exstep{Reusing the annotated alignment from exercise
+% \ref{shadingorderingfeatsex}, experiment with the colourscheme threshold to
+% highlight the most, or least hydrophobic regions. Note how the {\sl Colour} icon for the {\sl Feature Type} changes when you change the threshold type and press OK.}
+% \exstep{Change the colourscheme so
+% that features at the threshold are always coloured grey, and the most
+% hydrophobic residues are coloured red, regardless of the threshold value
+% ({\em hint - there is a switch on the dialog to do this for you}).}
+% \exstep{Enable the Uniprot {\em chain} annotation in the feature settings
+% display and re-order the features so it is visible under the hydrophobicity
+% annotation.}
+% \exstep{Apply a {\sl Graduated Colour} to the {\em chain}
+% annotation so that it distinguishes the different canonical names associated
+% with the mature polypeptide chains.}
+% \exstep{Export the alignment's sequence features using the Jalview and GFF file formats, to see how the different types of graduated feature
+% colour styles are encoded. }
+% }
+
+\subsection{Creating Sequence Features}
+Sequence features can be created simply by selecting the area in a sequence (or sequences) to form the feature and selecting {\sl Selection $\Rightarrow$ Create Sequence Feature } from the right-click context menu (Figure \ref{features}). A dialogue box allows the user to customise the feature with respect to name, group, and colour. The feature is then associated with the sequence. Moving the mouse over a residue associated with a feature brings up a tool tip listing all features associated with the residue.
+
+\begin{figure}[htbp]
+\begin{center}
+\includegraphics[width=2in]{images/feature1.pdf}
+\includegraphics[width=2.5in]{images/feature2.pdf}
+\includegraphics[width=1.5in]{images/feature3.pdf}
+\caption{{\bf Creating sequence features.} Features can readily be created from selections via the context menu and are then displayed on the sequence. }
+\label{features}
+\end{center}
+\end{figure}
+
+Creation of features from a selection spanning multiple sequences results in the creation of one feature per sequence.
+Each feature remains associated with its own sequence.
+
+\subsection{Customising Feature Display}
+
+Feature display can be toggled on or off by selecting the {\sl View
+$\Rightarrow$ Show Sequence Features} menu option. When multiple features are
+present it is usually necessary to customise the display. Jalview allows the
+display, colour, rendering order and transparency of features to be modified
+{\sl via} the {\sl View $\Rightarrow$ Feature Settings\ldots} menu option. This
+brings up a dialogue window (Figure \ref{custfeat}) which allows the
+visibility of individual feature types to be selected, colours changed (by
+clicking on the colour of each sequence feature type) and the rendering order
+modified by dragging feature types to a new position in the list. Dragging the
+slider alters the transparency of the feature rendering. The Feature
+Settings dialog also includes functions for more advanced feature shading
+schemes and buttons for sorting the alignment according to the distribution of
+features. These capabilities are described further in sections
+\ref{featureschemes} and \ref{featureordering}.
+
+\begin{figure}[htbp]
+\begin{center}
+\includegraphics[width=4in]{images/features4.pdf}
+\caption{{\bf Multiple sequence features.} An alignment with JPred secondary structure prediction annotation below it, and many sequence features overlaid onto the aligned sequences. The tooltip lists the features annotating the residue below the mouse-pointer.}
+\end{center}
+\end{figure}
+
+\begin{figure}[htbp]
+\begin{center}
+\includegraphics[width=4in]{images/features5.pdf}
+\caption{{\bf Customising sequence features.} Features can be recoloured, switched on or off and have the rendering order changed. }
+\label{custfeat}
+\end{center}
+\end{figure}
+
+\subsection{Sequence Feature File Formats}
+
+Jalview supports the widely used GFF tab-delimited format\footnote{see
+\url{http://www.sanger.ac.uk/resources/software/gff/spec.html}} and its own Jalview
+Features file format for the import of sequence annotation. Features and
+alignment annotation are also extracted from other formats such as Stockholm
+and AMSA. URL links may also be attached to features. See the online
+documentation for more details of the additional capabilities of the Jalview
+features file.
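+
+As a minimal sketch, a Jalview features file describing a single iron binding site might look like the listing below. The sequence ID \texttt{FER\_CAPAA} and the residue position are illustrative assumptions. The columns are padded with spaces here for readability, but must be separated by single tab characters in a real file. The first line assigns a colour to the feature type. The second line gives the description, sequence ID, sequence index ($-1$ to match by ID), start, end and feature type.
+\begin{verbatim}
+Iron binding site    ff0000
+Iron binding Cys     FER_CAPAA    -1    39    39    Iron binding site
+\end{verbatim}
+A GFF record carries similar information in fixed columns (sequence name, source, feature type, start, end, score, strand and frame). The source name \texttt{manual} below is hypothetical, and the columns are again tab-separated in a real file:
+\begin{verbatim}
+FER_CAPAA    manual    Iron binding site    39    39    .    .    .
+\end{verbatim}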
+
+\exercise{Creating Features}{
+\exstep{Open the alignment at \textsf{http://www.jalview.org/tutorial/alignment.fa}.
+We know that the Cysteine residues at columns 97, 102, 105 and 135 are involved in
+iron binding so we will create them as features. Navigate to column 97, sequence 1.
+Select the entire column by clicking in the ruler bar. Then right-click on the selection
+to bring up the context menu and select {\sl Selection $\Rightarrow$ Create Sequence Feature}.
+A dialogue box will appear.
+}
+\exstep{
+Enter a suitable Sequence Feature Name (e.g. ``Iron binding site'') in the
+appropriate box. Click on the Feature Colour bar to change the colour if
+desired, add a short description (``One of four Iron binding Cysteines'') and press OK. The features will then appear on the sequences. }
+\exstep{Roll the mouse cursor over the new features. Note that the position given in the tool tip is the residue number, not the column number. To demonstrate that there is one feature per sequence, clear all selections by pressing [ESC] then insert a gap in sequence 3 at column 95. Roll the mouse over the features and you will see that the feature has moved with the sequence. Delete the gap you created.
+}
+\exstep{
+Add a similar feature to column 102. When the feature dialogue box appears, clicking the Sequence Feature
+Name box brings up a list of previously described features. Using the same Sequence Feature Name allows the features to be grouped.}
+\exstep{Select {\sl View $\Rightarrow$ Feature Settings\ldots} from the
+alignment window menu. The Sequence Feature Settings window will appear. Move
+this so that you can see the features you have just created. Click the check
+box for ``Iron binding site'' under {\sl Display} and note that display of this
+feature type is now turned off. Click it again and note that the features are
+now displayed. Close the sequence feature settings box by clicking OK or
+Cancel.} }
+
+\chapter{Multiple Sequence Alignment}
+\label{msaservices}
+Sequences can be aligned using a range of algorithms provided by JABA web
+services. These include ClustalW\footnote{{\sl ``CLUSTAL W: improving the
+sensitivity of progressive multiple sequence alignment through sequence
+weighting, position specific gap penalties and weight matrix choice."} Thompson
+JD, Higgins DG, Gibson TJ (1994) {\sl Nucleic Acids Research} {\bf 22},
+4673-80}, Muscle\footnote{{\sl ``MUSCLE: a multiple sequence alignment method
+with reduced time and space complexity"} Edgar, R.C.
+(2004) {\sl BMC Bioinformatics} {\bf 5}, 113}, MAFFT\footnote{{\sl ``MAFFT: a
+novel method for rapid multiple sequence alignment based on fast Fourier
+transform"} Katoh, K., Misawa, K., Kuma, K. and Miyata, T. (2002) {\sl Nucleic
+Acids Research} {\bf 30}, 3059-3066. and {\sl ``MAFFT version 5:
+improvement in accuracy of multiple sequence alignment"} Katoh, K., Kuma, K.,
+Toh, H. and Miyata, T. (2005) {\sl Nucleic Acids Research} {\bf 33}, 511-518.},
+ProbCons\footnote{PROBCONS: Probabilistic Consistency-based Multiple Sequence
+Alignment.
+Do, C.B., Mahabhashyam, M.S.P., Brudno, M., and Batzoglou, S.
+(2005) {\sl Genome Research} {\bf 15} 330-340.}, T-COFFEE\footnote{T-Coffee:
+A novel method for multiple sequence alignments. (2000) Notredame, Higgins and
+Heringa {\sl JMB} {\bf 302} 205-217} and Clustal Omega.\footnote{Fast, scalable
+generation of high-quality protein multiple sequence alignments using Clustal
+Omega. Sievers F, Wilm A, Dineen DG, Gibson TJ, Karplus K, Li W, Lopez R,
+McWilliam H, Remmert M, Soding J, Thompson JD, Higgins DG (2011) {\sl Molecular
+Systems Biology} {\bf 7} 539
+\href{http://dx.doi.org/10.1038/msb.2011.75}{doi:10.1038/msb.2011.75}} Of these,
+T-COFFEE is the slowest, but also the most accurate. ClustalW is historically
+the most widely used. Muscle is faster than ClustalW and probably the most
+accurate for smaller alignments, and MAFFT is probably the best for large
+alignments. However, {\bf Clustal Omega}, which was released in 2011, is
+arguably the fastest and most accurate tool for protein multiple alignment.
+
+
+To run an alignment web service, select the appropriate method from the {\sl
+Web Service $\Rightarrow$ Alignment $\Rightarrow$ \ldots} submenu (Figure
+\ref{webservices}). For each service you may either perform an alignment with
+default settings, use one of the available presets, or customise the parameters
+with the `{\sl Edit settings and run ...}' dialog box. Once the job is submitted, a
+progress window will appear giving information about the job and any errors that
+occur. After successful completion of the job, a new alignment window is opened
+with the results, in this case an alignment. By default, the new alignment will be
+ordered in the same way as the input sequences. Note: many alignment
+programs re-order the input during their analysis and place homologous
+sequences close together. The MSA algorithm ordering can be recovered
+using the `Algorithm ordering' entry within the {\sl Calculate $\Rightarrow$
+Sort} submenu.
+
+\subsubsection{Realignment}
+The re-alignment option is currently only supported by ClustalW and Clustal
+Omega. When performing a re-alignment, Jalview submits the current selection to
+the alignment service complete with any existing gaps. This approach is useful
+when one wishes to align additional sequences to an existing alignment without
+any further optimisation to the existing alignment. The re-alignment service
+provided by ClustalW in this case is effectively a simple form of profile
+alignment.
+
+
+\begin{figure}[htbp]
+\begin{center}
+\parbox[c]{1.5in}{\includegraphics[width=1.5in]{images/ws1.pdf}}
+\parbox[c]{2.5in}{\includegraphics[width=2.5in]{images/ws2.pdf}}
+\parbox[c]{2in}{\includegraphics[width=2in]{images/ws3.pdf}}
+\caption{{\bf Multiple alignment via web services.} The appropriate method is
+selected from the menu (left), a status box appears (centre), and the results
+appear in a new window (right).}
+\label{webservices}
+\end{center}
+\end{figure}
+
+
+
+
+\exercise{Multiple Sequence Alignment}{
+\exstep{ Close all windows and open the alignment at {\sf
+http://www.jalview.org/tutorial/unaligned.fa}. Select {\sl
+Web Service $\Rightarrow$ Alignment $\Rightarrow$ Muscle with Defaults}.
+A window will open giving the job status. After a short time, a second window will open
+ with the results of the alignment.}
+ \exstep{Return to the first sequence alignment window by clicking on
+ the window, and repeat using Clustal and MAFFT (from the {\sl Web
+ Service $\Rightarrow$ Alignment} menu) on the same initial alignment. Compare them and
+ you should notice small differences. }
+\exstep{Select the last three sequences in the MAFFT alignment, and de-align them
+with {\sl Edit $\Rightarrow$ Remove All Gaps}. Press [ESC] to deselect them and then
+submit the view for re-alignment with Clustal.}
+\exstep{Use [CTRL]-Z to recover the alignment of the last three sequences in the MAFFT alignment.
+Once the Clustal re-alignment has completed, compare the results of re-alignment of the
+three sequences with their alignment in the original MAFFT result.}
+\exstep{Select columns 60 to 125 in the original MAFFT alignment and hide them.
+Select {\sl Web Services $\Rightarrow$ Alignment $\Rightarrow$ Mafft with Defaults} to
+submit the visible portion of the alignment to MAFFT. When the web service job pane appears,
+note that there are now two alignment job status panes shown in the window.}
+\exstep{When the MAFFT job has finished, compare the alignment of the N-terminal visible
+region in the result with the corresponding region of the original alignment. If you wish,
+select and hide a few more columns in the N-terminal region, and submit the alignment to the
+service again and explore the effect of local alignment on the non-homologous parts of the
+N-terminal region.}
+}
+
+\subsubsection{Alignments of Sequences that include Hidden Regions}
+
+If the view or selected region that is submitted for alignment contains hidden
+regions, then {\bf only the visible sequences will be submitted to the service}.
+Furthermore, each contiguous segment of sequences will be aligned independently
+(resulting in a number of alignment `subjobs' appearing in the status window).
+Finally, the results of each subjob will be concatenated with the hidden regions
+in the input data prior to their display in a new window. This approach ensures
+that 1) hidden column boundaries in the input data are preserved in the
+resulting alignment - in a similar fashion to the constraint that hidden columns
+place on alignment editing (see Section \ref{lockededits}), and 2) hidden
+columns can be used to preserve existing parts of an alignment whilst the
+visible parts are locally refined.
+
+
+\subsection{Customising the Parameters used for Alignment}
+
+JABA web services allow you to vary the parameters used when performing a
+bioinformatics analysis. For JABA alignment services, this means you are
+usually able to modify the following types of parameters:
+\begin{list}{$\bullet$}{}
+\item Amino acid or nucleotide substitution score matrix
+\item Gap opening and extension penalties
+\item Types of distance metric used to construct guide trees
+\item Number of rounds of re-alignment or alignment optimisation
+\end{list}
+
+
+\subsubsection{Getting Help on the Parameters for a Service}
+Each parameter available for a method usually has a short description, which
+Jalview will display as a tooltip, or as a text pane that can be opened under
+the parameter's controls. In the parameter shown in Figure
+\ref{clustalwparamdetail}, the description was opened by selecting the button on the left-hand side. Online help for the
+service can also be accessed by right-clicking the button and selecting a URL
+from the pop-up menu that opens.
+
+\begin{figure}[htbp]
+\begin{center}
+\includegraphics[width=2.5in]{images/clustalwparamdetail.pdf}
+\caption{{\bf ClustalW parameter slider detail}. From the ClustalW {\sl Clustal $\Rightarrow$ Edit settings and run ...} dialog box. }
+\label{clustalwparamdetail}
+\end{center}
+\end{figure}
+
+\subsection{Alignment Presets}
+The different multiple alignment algorithms available from JABA vary greatly in
+the number of adjustable parameters, and it is often difficult to identify what
+are the best values for the sequences that you are trying to align. For these
+reasons, each JABA service may provide one or more presets -- which are
+pre-defined sets of parameters suited for particular types of alignment
+problem. For instance, the Muscle service provides the following presets:
+\begin{list}{$\bullet$}{}
+\item Large alignments (balanced)
+\item Protein alignments (fastest speed)
+\item Nucleotide alignments (fastest speed)
+\end{list}
+
+The presets are displayed in the JABA web services submenu, and can also be
+accessed from the parameter editing dialog box, which is opened by selecting
+the `{\sl Edit settings and run ...}' option from the web services menu. If you have used
+a preset, then it will be mentioned at the beginning of the job status file shown
+in the web service job progress window.
+
+\subsubsection{Alignment Service Limits}
+Multiple alignment is a computationally intensive calculation. Some JABA
+services and service presets only allow a certain number of sequences to be
+aligned. The precise number will depend on the server that you are using to
+perform the alignment. Should you try to submit more sequences than a service
+can handle, then an error message will be shown informing you of the maximum
+number allowed by the server.
+
+\subsection{User Defined Presets}
+Jalview allows you to create your own presets for a particular service. To do
+this, select the `{\sl Edit settings and run ...}' option for your service,
+which will open a parameter editing dialog box like the one shown in Figure
+\ref{jwsparamsdialog}.
+
+The top row of this dialog allows you to browse the existing presets, and
+when editing a parameter set, allows you to change its nickname. As you
+adjust settings, buttons will appear at the top of the parameters dialog that
+allow you to Revert or Update the currently selected user preset with your changes, Delete the current preset, or Create a new preset, if none exists with the given name. In addition to the parameter set name, you can also provide a short
+description for the parameter set, which will be shown in the tooltip for the
+parameter set's entry in the web services menu.
+
+\begin{figure}[htbp]
+\center{
+\includegraphics[width=3in]{images/jvaliwsparamsbox.pdf}
+\caption{{\bf Jalview's JABA alignment service parameter editing dialog box}.}
+\label{jwsparamsdialog} }
+\end{figure}
+
+\subsubsection{Saving Parameter Sets}
+When creating a custom parameter set, you will be asked for a file name to save
+it. The location of the file is recorded in the Jalview user preferences in the
+same way as a custom alignment colourscheme, so when Jalview is launched again,
+it will show your custom preset amongst the options available for running the
+JABA service.
+
+%
+% \exercise{Creating and using user defined presets}{\label{createandusepreseex}
+% \exstep{Import the file at
+% \textsf{http://www.jalview.org/tutorial/fdx\_unaligned.fa} into jalview.}
+% \exstep{Use the `{\sl Discover Database Ids}' function to recover the PDB cross
+% references for the sequences.}
+% \exstep{Align the sequences using the default ClustalW parameters.}
+% \exstep{Use the `{\sl Edit and run..}'
+% option to open the ClustalW parameters dialog box, and create a new preset using
+% the following settings:
+% \begin{list}{$\bullet$}{}
+% \item BLOSUM matrix (unchanged)
+% \item Gap Opening and End Gap penalties = 0.05
+% \item Gap Extension and Separation penalties = 0.05
+% \end{list}
+%
+% As you edit the parameters, buttons will appear on the dialog box
+% allowing you revert your changes or save your settings as a new parameter
+% set.
+%
+% Before you save your settings, remember to give them a meaningful name by editing
+% the text box at the top of the dialog box.
+% }
+% \exstep{Repeat the alignment using your new parameter set by selecting it from
+% the {\sl ClustalW Presets menu}.}
+% \exstep{These sequences have PDB structures associated with them, so it is
+% possible to compare the quality of the alignments.
+%
+% Use the {\sl View all {\bf N}
+% structures} option to calculate the superposition of 1fdn on 1fxd for both
+% alignments (refer to section \ref{superposestructs} for instructions). Which
+% alignment gives the best RMSD ? }
+% \exstep{Apply the same alignment parameter settings to the example alignment
+% (available from \textsf{http://www.jalview.org/examples/uniref50.fa}).
+%
+% Are there differences ? If not, why not ?
+% }
+% }
+
+\section{Protein Alignment Conservation Analysis}
+\label{aacons}
+The {\sl Web Service $\Rightarrow$ Conservation} menu controls the computation
+of up to 17 different amino acid conservation measures for the current alignment
+view. The JABAWS AACon Alignment Conservation Calculation Service, which is used
+to calculate these scores, provides a variety of standard measures described by
+Valdar in 2002\footnote{Scoring residue conservation. Valdar (2002) {\sl
+Proteins: Structure, Function, and Genetics} {\bf 43} 227-241.} as well as an efficient implementation of the SMERFS
+score developed by Manning et al. in 2008.\footnote{SMERFS Score Manning et al. {\sl BMC
+Bioinformatics} 2008, {\bf 9} 51 \href{http://dx.doi.org/10.1186/1471-2105-9-51}{doi:10.1186/1471-2105-9-51}}
+
+\subsubsection{Enabling and Disabling AACon Calculations}
+When the AACon Calculation entry in the {\sl Web Services $\Rightarrow$
+Conservation} menu is ticked, AACon calculations will be performed every time
+the alignment is modified. Selecting the menu item will enable or disable
+automatic recalculation.
+
+\subsubsection{Configuring which AACon Calculations are Performed}
+The {\sl Web Services $\Rightarrow$ Conservation $\Rightarrow$ Change AACon
+Settings ...} menu entry will open a web services parameter dialog for the
+currently configured AACon server. Standard presets are provided for quick and
+more expensive conservation calculations, and parameters are also provided to
+change the way that SMERFS calculations are performed.
+AACon settings for an alignment are saved in Jalview projects along with the
+latest calculation results.
+
+\subsubsection{Changing the Server used for AACon Calculations}
+If you are working with alignments too large to analyse with the public JABAWS
+server, then you will most likely have already configured additional JABAWS
+servers. By default, Jalview will choose the first AACon service found in the
+list of available JABAWS servers. If others are available, you can switch to
+another AACon service by selecting it from the {\sl Web Services $\Rightarrow$
+Conservation $\Rightarrow$ Switch Server} submenu.
+
+\chapter{Analysis of Alignments}
+\label{alignanalysis}
+Jalview provides support for sequence analysis in two ways. A number of
+analytical methods are `built-in'; these are accessed from the {\sl Calculate}
+alignment window menu. Computationally intensive analyses are run outside
+Jalview {\sl via} web services. These are typically accessed {\sl via} the {\sl
+Web Service} menu, and are described in chapter \ref{jvwebservices}.
+In this chapter, we describe the built-in analysis capabilities common to both
+the Jalview Desktop and the JalviewLite applet.
+
+\section{PCA}
+This calculation creates a spatial representation of the similarities within the
+current selection or the whole alignment if no selection has been made. After
+the calculation finishes, a 3D viewer displays each sequence as a point in
+3D `similarity space'. Sets of similar sequences tend to lie near each other in
+this space.
+Note: The calculation is computationally expensive, and may fail for very large
+sets of sequences if the JVM runs out of memory. Memory issues, and
+how to overcome them, are discussed in Section \ref{memorylimits}.
+
+\subsubsection{What is PCA?}
+Principal components analysis is a technique for examining the structure of
+complex data sets. The components are a set of dimensions formed from the
+measured values in the data set, and the principal component is the one with the
+greatest magnitude, or length. The sets of measurements that differ the most
+should lie at either end of this principal axis, and the other axes correspond
+to less extreme patterns of variation in the data set.
+In this case, the components are generated by an eigenvector decomposition of
+the matrix formed from the sum of pairwise substitution scores at each aligned
+position between each pair of sequences. The basic method is described in the
+1995 paper by {\sl G. Casari, C. Sander} and {\sl A. Valencia} \footnote{{\sl
+Nature Structural Biology} (1995) {\bf 2}, 171-8.
+PMID: 7749921} and implemented at the SeqSpace server at the EBI.
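+
+The construction described above can be written out as follows. This is a sketch only: details such as the treatment of gaps or any centring of the matrix are not specified here. Writing $a_{ik}$ for the residue of sequence $i$ at aligned column $k$, and $S(x, y)$ for the pairwise substitution score, the similarity matrix for $N$ sequences over $L$ columns is
+\[
+M_{ij} = \sum_{k=1}^{L} S(a_{ik}, a_{jk}), \qquad 1 \le i, j \le N,
+\]
+and the components are the eigenvectors $v_m$ satisfying
+\[
+M v_m = \lambda_m v_m .
+\]
+Ordering the eigenvectors by decreasing $|\lambda_m|$ places the principal component first, so the components with the largest eigenvalues capture the most pronounced patterns of variation between the sequences.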
+
+Jalview provides two different options for the PCA calculation. Protein PCAs are
+by default computed using BLOSUM 62 pairwise substitution scores, and nucleic
+acid alignment PCAs are computed using a score model based on the identity
+matrix that also treats Us and Ts as identical, to support analysis of both RNA
+and DNA alignments. The {\sl Change Parameters} menu also allows the calculation
+method to be toggled between SeqSpace and a variant calculation that is detailed
+in Jalview's built in documentation.\footnote{See
+\url{http://www.jalview.org/help/html/calculations/pca.html}.}
+
+
+\exercise{Principal Component Analysis}{ \exstep{Load the alignment at
+\textsf{http://www.jalview.org/tutorial/alignment.fa} }
+\exstep{Select the menu option {\sl Calculate $\Rightarrow$ Principal Component Analysis}.
+A new window will open. Move this window so that the tree, alignment and PCA viewer windows are all visible.
+Try rotating the plot by clicking and dragging the mouse on the plot in the PCA window.
+Note that clicking on points in the plot will highlight them on the alignment. }
+\exstep{ Select {\sl Calculate $\Rightarrow$ Calculate Tree $\Rightarrow$
+Neighbour Joining Using BLOSUM62}. A new tree window will appear.
+Click on the tree window. Careful selection of the tree partition location will divide the alignment into a number of groups,
+each of a different colour.
+Note how the colour of the sequence ID label matches both the colour of
+the partitioned tree and the points in the PCA plot.} }
+
+\subsubsection{The PCA Viewer}
+
+PCA analysis can be launched from the {\sl Calculate $\Rightarrow$ Principal
+Component Analysis} menu option. {\bf PCA requires a selection containing at
+least 4 sequences}. A window opens containing the PCA tool (Figure \ref{PCA}).
+Each sequence is represented by a small square, coloured by the background
+colour of the sequence ID label. The axes can be rotated by clicking and
+dragging the left mouse button and zoomed using the $\uparrow$ and $\downarrow$
+keys or the scroll wheel of the mouse (if available). A tool tip appears if the
+cursor is placed over a sequence. Sequences can be selected by clicking on them.
+[CTRL]-Click can be used to select multiple sequences.
+
+Labels will be shown for each sequence by toggling the {\sl View $\Rightarrow$
+Show Labels} menu option, and the plot background colour changed {\sl via} the
+{\sl View $\Rightarrow$ Background Colour..} dialog box. A graphical
+representation of the PCA plot can be exported as an EPS or PNG image {\sl via}
+the {\sl File $\Rightarrow$ Save As $\Rightarrow$ \ldots } submenu.
+
+\begin{figure}[hbtp]
+\begin{center}
+\includegraphics[width=2in]{images/PCA1.pdf}
+\includegraphics[width=3in]{images/PCA3.pdf}
+\caption{{\bf PCA Analysis.} }
+\label{PCA}
+\end{center}
+\end{figure}
+
+
+
+\subsubsection{PCA Data Export}
+Although the PCA viewer supports export of the current view, the plots produced
+are rarely suitable for direct publication. The PCA viewer's {\sl File} menu
+includes a number of options for exporting the PCA matrix and transformed points
+as comma separated value (CSV) files. These files can be imported by tools such
+as {\bf R} or {\bf gnuplot} in order to graph the data.
+
+\section{Trees}
+\label{trees}
+Jalview can calculate and display trees, providing interactive tree-based
+grouping of sequences through a tree viewer. All trees are calculated {\sl via}
+the {\sl Calculate $\Rightarrow$ Calculate Tree $\Rightarrow$ \ldots} submenu.
+Trees can be calculated from distance matrices determined from \% identity or
+aggregate BLOSUM 62 score using either {\sl Average Distance} (UPGMA) or {\sl
+Neighbour Joining} algorithms. The input data for a tree is either the selected
+region or the whole alignment, excluding any hidden regions.
+
+On calculating a tree, a new window opens (Figure \ref{trees1}) which contains
+the tree. Various display settings can be found in the tree window {\sl View}
+menu, including font, scaling and label display options, and the {\sl File
+$\Rightarrow$ Save As} submenu contains options for image and Newick file
+export. Newick format is a standard file format for trees which allows them to
+be exported to other programs. Jalview can also read in external trees in
+Newick format {\sl via} the {\sl File $\Rightarrow$ Load Associated Tree} menu
+option. Leaf names on imported trees will be matched to the associated alignment
+- unmatched leaves will still be displayed, and can be highlighted using the
+{\sl View $\Rightarrow$ Mark Unlinked Leaves} menu option.
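+
+For reference, a Newick file encodes a tree as nested, comma-separated groups in parentheses, terminated by a semicolon, with optional branch lengths given after a colon. A minimal example for four sequences (the leaf names are hypothetical, and would need to match the sequence IDs in the associated alignment) is:
+\begin{verbatim}
+((seqA:0.12,seqB:0.15):0.08,(seqC:0.20,seqD:0.18):0.05);
+\end{verbatim}
+Here \texttt{seqA} and \texttt{seqB} form one clade and \texttt{seqC} and \texttt{seqD} another.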