+\chapter{Multiple Sequence Alignment}
+\label{msaservices}
+Sequences can be aligned using a range of algorithms provided by JABA web
+services. These include ClustalW\footnote{{\sl ``CLUSTAL W: improving the
+sensitivity of progressive multiple sequence alignment through sequence
+weighting, position specific gap penalties and weight matrix choice."} Thompson
+JD, Higgins DG, Gibson TJ (1994) {\sl Nucleic Acids Research} {\bf 22},
+4673-80}, Muscle\footnote{{\sl ``MUSCLE: a multiple sequence alignment method
+with reduced time and space complexity"} Edgar, R.C.
+(2004) {\sl BMC Bioinformatics} {\bf 5}, 113}, MAFFT\footnote{{\sl ``MAFFT: a
+novel method for rapid multiple sequence alignment based on fast Fourier
+transform"} Katoh, K., Misawa, K., Kuma, K. and Miyata, T. (2002) {\sl Nucleic
+Acids Research} {\bf 30}, 3059-3066. and {\sl ``MAFFT version 5:
+improvement in accuracy of multiple sequence alignment"} Katoh, K., Kuma, K.,
+Toh, H. and Miyata, T. (2005) {\sl Nucleic Acids Research} {\bf 33}, 511-518.},
+ProbCons\footnote{PROBCONS: Probabilistic Consistency-based Multiple Sequence
+Alignment.
+Do, C.B., Mahabhashyam, M.S.P., Brudno, M., and Batzoglou, S.
+(2005) {\sl Genome Research} {\bf 15} 330-340.}, T-COFFEE\footnote{T-Coffee:
+A novel method for multiple sequence alignments. (2000) Notredame, Higgins and
+Heringa {\sl JMB} {\bf 302} 205-217} and Clustal Omega.\footnote{Fast, scalable
+generation of high-quality protein multiple sequence alignments using Clustal
+Omega. Sievers F, Wilm A, Dineen DG, Gibson TJ, Karplus K, Li W, Lopez R,
+McWilliam H, Remmert M, Soding J, Thompson JD, Higgins DG (2011) {\sl Molecular
+Systems Biology} {\bf 7} 539
+\href{http://dx.doi.org/10.1038/msb.2011.75}{doi:10.1038/msb.2011.75}} Of these,
+T-COFFEE is the slowest, but also the most accurate. ClustalW is historically
+the most widely used. Muscle is faster than ClustalW and probably the most
+accurate for smaller alignments, and MAFFT is probably the best for large
+alignments. However, {\bf Clustal Omega}, which was released in 2011, is
+arguably the fastest and most accurate tool for protein multiple alignment.
+
+
+To run an alignment web service, select the appropriate method from the {\sl
+Web Service $\Rightarrow$ Alignment $\Rightarrow$ \ldots} submenu (Figure
+\ref{webservices}). For each service you may either perform an alignment with
+default settings, use one of the available presets, or customise the parameters
+with the `{\sl Edit settings and run ...}' dialog box. Once the job is submitted, a
+progress window will appear giving information about the job and any errors that
+occur. After successful completion of the job, a new alignment window is opened
+with the results, in this case an alignment. By default, the new alignment will be
+ordered in the same way as the input sequences. Note: many alignment
+programs re-order the input during their analysis and place homologous
+sequences close together. The MSA algorithm ordering can be recovered
+using the `Algorithm ordering' entry within the {\sl Calculate $\Rightarrow$
+Sort} submenu.
+
+\subsubsection{Realignment}
+The re-alignment option is currently only supported by ClustalW and Clustal
+Omega. When performing a re-alignment, Jalview submits the current selection to
+the alignment service complete with any existing gaps. This approach is useful
+when one wishes to align additional sequences to an existing alignment without
+any further optimisation to the existing alignment. The re-alignment service
+provided by ClustalW in this case is effectively a simple form of profile
+alignment.
+
+
+\begin{figure}[htbp]
+\begin{center}
+\parbox[c]{1.5in}{\includegraphics[width=1.5in]{images/ws1.pdf}}
+\parbox[c]{2.5in}{\includegraphics[width=2.5in]{images/ws2.pdf}}
+\parbox[c]{2in}{\includegraphics[width=2in]{images/ws3.pdf}}
+\caption{{\bf Multiple alignment via web services} The appropriate method is
+selected from the menu (left), a status box appears (centre), and the results
+appear in a new window (right).}
+\label{webservices}
+\end{center}
+\end{figure}
+
+
+
+
+\exercise{Multiple Sequence Alignment}{
+\exstep{ Close all windows and open the alignment at {\sf
+http://www.jalview.org/tutorial/unaligned.fa}. Select {\sl
+Web Service $\Rightarrow$ Alignment $\Rightarrow$ Muscle with Defaults}.
+A window will open giving the job status. After a short time, a second window will open
+ with the results of the alignment.}
+ \exstep{Return to the first sequence alignment window by clicking on
+ the window, and repeat using Clustal and MAFFT (from the {\sl Web
+ Service $\Rightarrow$ Alignment} menu) on the same initial alignment. Compare them and
+ you should notice small differences. }
+\exstep{Select the last three sequences in the MAFFT alignment, and de-align them
+with {\sl Edit $\Rightarrow$ Remove All Gaps}. Press [ESC] to deselect them and then
+submit the view for re-alignment with Clustal.}
+\exstep{Use [CTRL]-Z to recover the alignment of the last three sequences in the MAFFT alignment.
+Once the Clustal re-alignment has completed, compare the results of re-alignment of the
+three sequences with their alignment in the original MAFFT result.}
+\exstep{Select columns 60 to 125 in the original MAFFT alignment and hide them.
+Select {\sl Web Service $\Rightarrow$ Alignment $\Rightarrow$ Mafft with Defaults} to
+submit the visible portion of the alignment to MAFFT. When the web service job pane appears,
+note that there are now two alignment job status panes shown in the window.}
+\exstep{When the MAFFT job has finished, compare the alignment of the N-terminal visible
+region in the result with the corresponding region of the original alignment. If you wish,
+select and hide a few more columns in the N-terminal region, and submit the alignment to the
+service again and explore the effect of local alignment on the non-homologous parts of the
+N-terminal region.}
+}
+
+\subsubsection{Alignments of Sequences that include Hidden Regions}
+
+If the view or selected region that is submitted for alignment contains hidden
+regions, then {\bf only the visible sequences will be submitted to the service}.
+Furthermore, each contiguous segment of sequences will be aligned independently
+(resulting in a number of alignment `subjobs' appearing in the status window).
+Finally, the results of each subjob will be concatenated with the hidden regions
+in the input data prior to their display in a new window. This approach ensures
+that 1) hidden column boundaries in the input data are preserved in the
+resulting alignment, in a similar fashion to the constraint that hidden columns
+place on alignment editing (see Section \ref{lockededits}), and 2) hidden
+columns can be used to preserve existing parts of an alignment whilst the
+visible parts are locally refined.
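+
+As an illustration of how the subjobs arise, consider a view of $W$ columns in
+which columns $h_1$ to $h_2$ are hidden, with $1 < h_1 \leq h_2 < W$. The
+symbols $W$, $h_1$ and $h_2$ are introduced here only for this sketch. The
+visible columns then form two contiguous segments,
+\[
+[1,\; h_1 - 1] \quad \mathrm{and} \quad [h_2 + 1,\; W],
+\]
+and each segment is aligned as a separate subjob. Hiding columns 60 to 125, as
+in the exercise above, would therefore be expected to give one subjob for
+columns 1 to 59 and another for columns 126 to $W$. More generally, $k$
+separate hidden ranges lying strictly inside the view leave $k + 1$ visible
+segments.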
+
+
+\subsection{Customising the Parameters used for Alignment}
+
+JABA web services allow you to vary the parameters used when performing a
+bioinformatics analysis. For JABA alignment services, this means you are
+usually able to modify the following types of parameters:
+\begin{list}{$\bullet$}{}
+\item Amino acid or nucleotide substitution score matrix
+\item Gap opening and extension penalties
+\item Types of distance metric used to construct guide trees
+\item Number of rounds of re-alignment or alignment optimisation
+\end{list}
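+
+As a rough guide to how the two kinds of gap penalty interact, many alignment
+programs score a gap with an affine cost. The form below is only a sketch: the
+symbols $g$, $o$, $e$ and $n$ are introduced here for illustration, and the
+exact convention (for example, whether the first gap position also incurs $e$)
+varies between programs.
+\[
+g(n) = o + e\,(n - 1)
+\]
+Here $n$ is the length of the gap in columns, $o$ is the penalty for opening
+the gap and $e$ is the penalty for each additional position. Raising $o$
+relative to $e$ tends to favour fewer, longer gaps, whereas lowering it tends
+to favour many short gaps.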
+
+
+\subsubsection{Getting Help on the Parameters for a Service}
+Each parameter available for a method usually has a short description, which
+Jalview will display as a tooltip, or as a text pane that can be opened under
+the parameter's controls. For the parameter shown in Figure
+\ref{clustalwparamdetail}, the description was opened by selecting the button
+on the left-hand side. Online help for the service can also be accessed by
+right-clicking the button and selecting a URL from the pop-up menu that opens.
+
+\begin{figure}[htbp]
+\begin{center}
+\includegraphics[width=2.5in]{images/clustalwparamdetail.pdf}
+\caption{{\bf ClustalW parameter slider detail}. From the ClustalW {\sl Clustal $\Rightarrow$ Edit settings and run ...} dialog box. }
+\label{clustalwparamdetail}
+\end{center}
+\end{figure}
+
+\subsection{Alignment Presets}
+The different multiple alignment algorithms available from JABA vary greatly in
+the number of adjustable parameters, and it is often difficult to identify what
+are the best values for the sequences that you are trying to align. For these
+reasons, each JABA service may provide one or more presets -- which are
+pre-defined sets of parameters suited for particular types of alignment
+problem. For instance, the Muscle service provides the following presets:
+\begin{list}{$\bullet$}{}
+\item Large alignments (balanced)
+\item Protein alignments (fastest speed)
+\item Nucleotide alignments (fastest speed)
+\end{list}
+
+The presets are displayed in the JABA web services submenu, and can also be
+accessed from the parameter editing dialog box, which is opened by selecting
+the `{\sl Edit settings and run ...}' option from the web services menu. If you have used
+a preset, then it will be mentioned at the beginning of the job status file shown
+in the web service job progress window.
+
+\subsubsection{Alignment Service Limits}
+Multiple alignment is a computationally intensive calculation. Some JABA server
+services and service presets only allow a certain number of sequences to be
+aligned. The precise number will depend on the server that you are using to
+perform the alignment. Should you try to submit more sequences than a service
+can handle, then an error message will be shown informing you of the maximum
+number allowed by the server.
+
+\subsection{User Defined Presets}
+Jalview allows you to create your own presets for a particular service. To do
+this, select the `{\sl Edit settings and run ...}' option for your service,
+which will open a parameter editing dialog box like the one shown in Figure
+\ref{jwsparamsdialog}.
+
+The top row of this dialog allows you to browse the existing presets and,
+when editing a parameter set, to change its nickname. As you
+adjust settings, buttons will appear at the top of the parameters dialog that
+allow you to Revert or Update the currently selected user preset with your
+changes, Delete the current preset, or Create a new preset if none exists with
+the given name. In addition to the parameter set name, you can also provide a
+short description for the parameter set, which will be shown in the tooltip for
+the parameter set's entry in the web services menu.
+
+\begin{figure}[htbp]
+\begin{center}
+\includegraphics[width=3in]{images/jvaliwsparamsbox.pdf}
+\caption{{\bf Jalview's JABA alignment service parameter editing dialog box}.}
+\label{jwsparamsdialog}
+\end{center}
+\end{figure}
+
+\subsubsection{Saving Parameter Sets}
+When creating a custom parameter set, you will be asked for a file name under
+which to save it. The location of the file is recorded in the Jalview user
+preferences in the same way as a custom alignment colour scheme, so when Jalview
+is launched again, it will show your custom preset amongst the options
+available for running the JABA service.
+
+%
+% \exercise{Creating and using user defined presets}{\label{createandusepreseex}
+% \exstep{Import the file at
+% \textsf{http://www.jalview.org/tutorial/fdx\_unaligned.fa} into jalview.}
+% \exstep{Use the `{\sl Discover Database Ids}' function to recover the PDB cross
+% references for the sequences.}
+% \exstep{Align the sequences using the default ClustalW parameters.}
+% \exstep{Use the `{\sl Edit and run..}'
+% option to open the ClustalW parameters dialog box, and create a new preset using
+% the following settings:
+% \begin{list}{$\bullet$}{}
+% \item BLOSUM matrix (unchanged)
+% \item Gap Opening and End Gap penalties = 0.05
+% \item Gap Extension and Separation penalties = 0.05
+% \end{list}
+%
+% As you edit the parameters, buttons will appear on the dialog box
+% allowing you to revert your changes or save your settings as a new parameter
+% set.
+%
+% Before you save your settings, remember to give them a meaningful name by editing
+% the text box at the top of the dialog box.
+% }
+% \exstep{Repeat the alignment using your new parameter set by selecting it from
+% the {\sl ClustalW Presets menu}.}
+% \exstep{These sequences have PDB structures associated with them, so it is
+% possible to compare the quality of the alignments.
+%
+% Use the {\sl View all {\bf N}
+% structures} option to calculate the superposition of 1fdn on 1fxd for both
+% alignments (refer to section \ref{superposestructs} for instructions). Which
+% alignment gives the best RMSD ? }
+% \exstep{Apply the same alignment parameter settings to the example alignment
+% (available from \textsf{http://www.jalview.org/examples/uniref50.fa}).
+%
+% Are there differences ? If not, why not ?
+% }
+% }
+
+\section{Protein Alignment Conservation Analysis}
+\label{aacons}
+The {\sl Web Service $\Rightarrow$ Conservation} menu controls the computation
+of up to 17 different amino acid conservation measures for the current alignment
+view. The JABAWS AACon Alignment Conservation Calculation Service, which is used
+to calculate these scores, provides a variety of standard measures described by
+Valdar in 2002\footnote{Scoring residue conservation. Valdar (2002) {\sl
+Proteins: Structure, Function, and Genetics} {\bf 48} 227-241.} as well as an
+efficient implementation of the SMERFS score developed by Manning et al. in
+2008.\footnote{SMERFS Score Manning et al. {\sl BMC
+Bioinformatics} 2008, {\bf 9} 51 \href{http://dx.doi.org/10.1186/1471-2105-9-51}{doi:10.1186/1471-2105-9-51}}
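+
+To give a flavour of what such a measure computes, one of the simplest
+families of scores is based on the Shannon entropy of each alignment column.
+The form below is a minimal sketch: the symbols and the choice of normalisation
+are assumptions made for illustration, and may differ from what the AACon
+service actually implements.
+\[
+S(x) = -\sum_{a} p_a(x) \log_2 p_a(x), \qquad
+C(x) = 1 - \frac{S(x)}{\log_2 \min(N, K)}
+\]
+Here $p_a(x)$ is the fraction of residues of type $a$ in column $x$ (terms with
+$p_a(x) = 0$ contribute nothing), $N \geq 2$ is the number of sequences and
+$K = 20$ is the size of the amino acid alphabet. A fully conserved column has
+$S(x) = 0$ and so $C(x) = 1$. A column split equally between two residue types
+has $S(x) = 1$ bit, giving $C(x) = 1 - 1/\log_2 20 \approx 0.77$ when
+$N \geq 20$.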
+
+\subsubsection{Enabling and Disabling AACon Calculations}
+When the AACon Calculation entry in the {\sl Web Service $\Rightarrow$
+Conservation} menu is ticked, AACon calculations will be performed every time
+the alignment is modified. Selecting the menu item will enable or disable
+automatic recalculation.
+
+\subsubsection{Configuring which AACon Calculations are Performed}
+The {\sl Web Service $\Rightarrow$ Conservation $\Rightarrow$ Change AACon
+Settings ...} menu entry will open a web services parameter dialog for the
+currently configured AACon server. Standard presets are provided for quick and
+more expensive conservation calculations, and parameters are also provided to
+change the way that SMERFS calculations are performed.
+AACon settings for an alignment are saved in Jalview projects along with the
+latest calculation results.
+
+\subsubsection{Changing the Server used for AACon Calculations}
+If you are working with alignments too large to analyse with the public JABAWS
+server, then you will most likely have already configured additional JABAWS
+servers. By default, Jalview will choose the first AACon service found in
+the list of available JABAWS servers. If more than one is available, you can
+switch to another AACon service by selecting it from the {\sl Web Service
+$\Rightarrow$ Conservation $\Rightarrow$ Switch Server} submenu.
+
+\chapter{Analysis of Alignments}
+\label{alignanalysis}
+Jalview provides support for sequence analysis in two ways. A number of
+analytical methods are `built-in'; these are accessed from the {\sl Calculate}
+alignment window menu. Computationally intensive analyses are run outside
+Jalview {\sl via} web services. These are typically accessed {\sl via} the {\sl
+Web Service} menu, and are described in Chapter \ref{jvwebservices}.
+In this chapter, we describe the built-in analysis capabilities common to both
+the Jalview Desktop and the JalviewLite applet.
+
+\section{PCA}
+This calculation creates a spatial representation of the similarities within the
+current selection, or the whole alignment if no selection has been made. After
+the calculation finishes, a 3D viewer displays each sequence as a point in
+3D `similarity space'. Sets of similar sequences tend to lie near each other in