-<html>\r
-<head><title>Tree Calculation</title></head>\r
-<body>\r
-<p><strong>UPGMA tree</strong></p>\r
-<p>If this option is selected then all sequences are used to generate a UPGMA\r
- tree. The pairwise distances used to cluster the sequences are the percentage\r
- mismatch between two sequences. For a reliable phylogenetic tree I recommend\r
- other programs (phylowin, phylip) should be used as they have the speed to use\r
- better distance methods and bootstrapping. Again, plans are afoot for a server\r
- to do this and to be able to read in tree files generated by other programs.\r
- <br>\r
- When the tree has been calculated a new window is displayed showing the tree\r
- with labels on the leaves showing the sequence ids. The user can select the\r
- ids with the mouse and the selected sequences will also be selected in the alignment\r
- window and the PCA window if that analysis has been calculated. </p>\r
-<p>Selecting the 'show distances' checkbox will put branch lengths on the branches.\r
- These branch lengths are the percentage mismatch between two nodes. </p>\r
-<p> </p>\r
-<p><strong>Neighbour Joining tree</strong></p>\r
-<p> The distances between sequences for this tree are generated in the same way\r
- as for the UPGMA tree. The method of clustering is the neighbour joining method\r
- which doesn't just pick the two closest leaves to cluster together but compensates\r
- for long edges by subtracting from the distances the average distance from each\r
- leaf to all the others. <br>\r
- Selection and output options are the same as for the UPGMA tree.<br>\r
-</p>\r
-</body>\r
-</html>\r
+<html>
+<!--
+ * Jalview - A Sequence Alignment Editor and Viewer ($$Version-Rel$$)
+ * Copyright (C) $$Year-Rel$$ The Jalview Authors
+ *
+ * This file is part of Jalview.
+ *
+ * Jalview is free software: you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, either version 3
+ * of the License, or (at your option) any later version.
+ *
+ * Jalview is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR
+ * PURPOSE. See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with Jalview. If not, see <http://www.gnu.org/licenses/>.
+ * The Jalview Authors are detailed in the 'AUTHORS' file.
+ -->
+<head>
+<title>Tree Calculation</title>
+</head>
+<body>
+ <p>
+ <strong>Calculation of trees from alignments</strong>
+ </p>
+ <p>
+ Trees are calculated on either the complete alignment, or just the
+ currently selected group of sequences, via the <a href="calculations.html">calculations dialog</a> opened from the <strong>Calculate→Calculate
+ Tree or PCA...</strong> menu entry. Once calculated, trees are displayed in a new <a
+ href="../calculations/treeviewer.html">tree viewing
+ window</a>. Each calculation combines one of the distance measures
+ described below with one of two tree construction
+ algorithms:
+ </p>
+ <p>
+ <strong>Distance Measures</strong>
+ </p>
+ <p>Trees are calculated on the basis of a measure of similarity
+ between each pair of sequences in the alignment:
+ <ul>
+ <li><strong>PID</strong><br>The percentage identity
+ between the two sequences at each aligned position.
+ <ul>
+ <li>PID = Number of equivalent aligned non-gap symbols *
+ 100 / Smallest number of non-gap positions in either of the two
+ sequences<br> <em>This is essentially the 'number of
+ identical bases (or residues) per 100 base pairs (or
+ residues)'.</em>
+ </li>
+ </ul>
+ <li><strong>BLOSUM62, PAM250, DNA</strong><br />These options
+ use one of the available substitution matrices to compute a sum of
+ scores for the residue pairs at each aligned position.
+ <ul>
+ <li>For details about each model, see the <a
+ href="scorematrices.html">list of built-in score
+ matrices</a>.
+ </li>
+ </ul></li>
+ <li><strong>Sequence Feature Similarity</strong><br>Trees
+ are constructed from a distance matrix formed from Jaccard
+ distances between sequence features observed at each column of the
+ alignment.
+ <ul>
+ <li>Distance at column <em>i</em> = (Total number of
+ features displayed - Number of features in common at <em>i</em>)
+ <br />Distances are summed over all columns and divided by
+ the number of columns. <br />Since the total number of
+ feature types is constant over all columns of the alignment,
+ we do not scale the matrix, so tree distances can be
+ interpreted as the average number of features that differ over
+ all sites in the aligned region.
+ </li>
+
+ </ul> Distances are computed based on the currently displayed feature
+ types. Sequences with similar distributions of features of the
+ same type will be grouped together in trees computed with this
+ metric. <em>This measure was introduced in Jalview 2.9</em></li>
+ </ul>
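+ <p>
+ As an illustration only, the PID formula above can be sketched in
+ Python. The function names, the gap characters ('-' and '.') and the
+ conversion to a distance as 100 - PID are assumptions made for this
+ example and are not taken from Jalview's source code.
+ </p>

```python
def percent_identity(seq_a: str, seq_b: str, gaps: str = "-.") -> float:
    """PID between two aligned sequences of equal length (hypothetical helper)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Count columns where both symbols are non-gap and identical
    # (comparison is case-insensitive, which is an assumption of this sketch).
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a not in gaps and b not in gaps and a == b
    )
    # Denominator: the smaller count of non-gap positions in the two sequences.
    non_gap_a = sum(1 for a in seq_a if a not in gaps)
    non_gap_b = sum(1 for b in seq_b if b not in gaps)
    shortest = min(non_gap_a, non_gap_b)
    return 100.0 * matches / shortest if shortest else 0.0


def pid_distance(seq_a: str, seq_b: str) -> float:
    """Assumed conversion from a similarity to a distance: 100 - PID."""
    return 100.0 - percent_identity(seq_a, seq_b)
```

+ <p>
+ For the aligned pair ACGT-A and ACCTTA, four columns match and the
+ shorter sequence has five non-gap symbols, so this sketch gives a PID
+ of 80.
+ </p>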
+ <p>
+ <strong>Tree Construction Methods</strong>
+ </p>
+ <p>Jalview currently supports two kinds of agglomerative
+ clustering methods. These are not intended to substitute for
+ rigorous phylogenetic tree construction, and may fail on very large
+ alignments.
+ <ul>
+ <li><strong>UPGMA tree</strong><br> UPGMA stands for
+ Unweighted Pair-Group Method using Arithmetic averages. Clusters
+ are iteratively formed and extended by finding a non-member
+ sequence with the lowest average dissimilarity over the cluster
+ members.
+ <p></p></li>
+ <li><strong>Neighbour Joining tree</strong><br> First
+ described in 1987 by Saitou and Nei, this method applies a greedy
+ algorithm to find the tree with the shortest total branch length.<br>
+ This method, as implemented in Jalview, is considerably more
+ expensive than UPGMA.</li>
+ </ul>
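+ <p>
+ The UPGMA description above can be sketched as textbook
+ average-linkage clustering. This is a minimal Python illustration
+ under stated assumptions, not Jalview's implementation: the function
+ name, the nested-tuple tree and the frozenset-keyed distance table are
+ all hypothetical, and branch lengths are omitted.
+ </p>

```python
from itertools import combinations


def upgma(names, dist):
    """Build a nested-tuple tree by average-linkage clustering.

    names: list of sequence ids.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    # Each cluster is (subtree, list of member ids); leaves start as singletons.
    clusters = [(name, [name]) for name in names]

    def average_distance(c1, c2):
        # Unweighted arithmetic mean over all cross-cluster pairs of leaves.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the pair of clusters with the lowest average dissimilarity.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        # Delete the higher index first so the lower index stays valid.
        del clusters[j]
        del clusters[i]
        clusters.append(merged)
    return clusters[0][0]
```

+ <p>
+ With three sequences where A and B are 2 apart and each is 8 from C,
+ the sketch joins A and B first and then attaches C. Recomputing the
+ averages from the leaf pairs on every pass keeps the code short at the
+ cost of speed; it is meant for reading, not for large alignments.
+ </p>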
+ <p>
+ A newly calculated tree will be displayed in a new <a
+ href="../calculations/treeviewer.html">tree viewing
+ window</a>. In addition, a new entry with the same tree viewer window
+ name will be added in the Sort menu so that the alignment can be
+ reordered to reflect the ordering of the leaves of the tree. If the
+ tree was calculated on a selected region of the alignment, then the
+ title of the tree view will reflect this.
+ </p>
+
+ <p>
+ <strong>External Sources for Phylogenetic Trees</strong>
+ </p>
+ <p>
+ A number of programs exist for the reliable construction of
+ phylogenetic trees, which can cope with large numbers of sequences,
+ use better distance methods and can perform bootstrapping. Jalview
+ can read <a
+ href="http://evolution.genetics.washington.edu/phylip/newick_doc.html">Newick</a>
+ format tree files using the 'Load Associated Tree' entry of the
+ alignment's File menu. Sequences in the alignment will be
+ automatically associated with nodes in the tree by matching Sequence
+ IDs to the tree's leaf names.
+ </p>
+
+
+</body>
+</html>