<head><title>Principal Component Analysis</title></head>\r
<body>\r
<p><strong>Principal Component Analysis</strong></p>\r
-<p>This is a method of clustering sequences based on the method developed by G.\r
- Casari, C. Sander and A. Valencia. Structural Biology volume 2, no. 2, February\r
- 1995 . Extra information can also be found at the SeqSpace server at the EBI.\r
- <br>\r
- The version implemented here only looks at the clustering of whole sequences\r
- and not individual positions in the alignment to help identify functional residues.\r
- For large alignments plans are afoot to implement a web service to do this 'residue\r
- space' PCA remotely. </p>\r
-<p>When the Principal component analysis option is selected all the sequences\r
- ( or just the selected ones) are used in the calculation and for large numbers\r
- of sequences this could take quite a time. When the calculation is finished\r
- a new window is displayed showing the projections of the sequences along the\r
- 2nd, 3rd and 4th vectors giving a 3dimensional view of how the sequences cluster.\r
+<p>This calculation creates a spatial representation of the\r
+similarities within a selected group, or all of the sequences in\r
+an alignment. After the calculation finishes, a 3D viewer displays the\r
+set of sequences as points in 'similarity space', and similar\r
+sequences tend to lie near each other in the space.</p>\r
+<p>Note: The calculation is computationally expensive, and may fail for very large sets of sequences,\r
+ usually because the JVM has run out of memory. The next release of\r
+ Jalview will execute this calculation through a web service.</p>\r
+<p>Principal components analysis is a technique for examining the\r
+structure of complex data sets. The components are a set of dimensions\r
+formed from the measured values in the data set, and the principal\r
+component is the one with the greatest magnitude, or length. The\r
+sets of measurements that differ the most should lie at either end of\r
+this principal axis, and the other axes correspond to less extreme\r
+patterns of variation in the data set.\r
</p>\r
-<p>This 3d view can be rotated by holding the left mouse button down in the PCA\r
- window and moving it. The user can also zoom in and out by using the up and\r
- down arrow keys. </p>\r
-<p>Individual points can be selected using the mouse and selected sequences show\r
- up green in the PCA window and the usual grey background/white text in the alignment\r
- and tree windows. </p>\r
-<p>Different eigenvectors can be used to do the projection by changing the selected\r
- dimensions in the 3 menus underneath the 3d window. <br>\r
+\r
+<p>In this case, the components are generated by an eigenvector\r
+decomposition of the matrix formed from the sum of BLOSUM scores at\r
+each aligned position between each pair of sequences. The basic method\r
+is described in the paper by G. Casari, C. Sander and\r
+A. Valencia. Structural Biology volume 2, no. 2, February 1995 (<a\r
+href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=7749921">pubmed</a>)\r
+ and implemented at the SeqSpace server at the EBI.\r
</p>\r
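+<p>As a rough illustration only (this is not Jalview's code), the Python\r
+sketch below builds a pairwise score matrix by summing a substitution score\r
+over every aligned column, then extracts its leading eigenvectors. The names\r
+<code>toy_score</code>, <code>pairwise_score_matrix</code> and\r
+<code>top_components</code> are hypothetical, the identity score is a\r
+stand-in for a real BLOSUM lookup, and power iteration with deflation is\r
+used in place of a full eigenvector decomposition.</p>\r

```python
import math
import random


def toy_score(a, b):
    """Hypothetical stand-in for a BLOSUM lookup: 1 for identical residues, else 0."""
    return 1.0 if a == b else 0.0


def pairwise_score_matrix(seqs, score=toy_score):
    """S[i][j] = sum of substitution scores over all aligned columns of seqs i and j."""
    n = len(seqs)
    return [[sum(score(a, b) for a, b in zip(seqs[i], seqs[j]))
             for j in range(n)] for i in range(n)]


def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def dominant_eigenpair(m, iterations=500, seed=0):
    """Power iteration: approximate the eigenpair of largest magnitude."""
    rng = random.Random(seed)
    v = [rng.random() for _ in m]
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # nothing left to extract
            break
        v = [x / norm for x in w]
    # Rayleigh quotient, normalised in case the loop exited early.
    eigenvalue = (sum(x * y for x, y in zip(v, mat_vec(m, v)))
                  / sum(x * x for x in v))
    return eigenvalue, v


def top_components(m, k):
    """Extract k eigenpairs of a symmetric matrix by repeated deflation."""
    work = [row[:] for row in m]
    n = len(work)
    pairs = []
    for _ in range(k):
        lam, v = dominant_eigenpair(work)
        pairs.append((lam, v))
        # Deflate: subtract the component just found so the next one dominates.
        for i in range(n):
            for j in range(n):
                work[i][j] -= lam * v[i] * v[j]
    return pairs


# Four aligned toy sequences forming two obvious groups.
seqs = ["AAAA", "AAAT", "GGGG", "GGCC"]
S = pairwise_score_matrix(seqs)
pairs = top_components(S, 2)
# One point per sequence: its entries in the first two eigenvectors.
coords = [[v[i] for _, v in pairs] for i in range(len(seqs))]
```

+<p>In this toy example each entry of <code>coords</code> is one sequence's\r
+position along the first two components, and the two sequences in each\r
+group land close together while the groups lie apart.</p>\r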
+\r
+<p><strong>The PCA Viewer</strong></p>\r
+<p>This is an interactive display of the sequences positioned within\r
+ the similarity space. Each sequence point takes the colour of its\r
+ sequence group, white if no colour has been\r
+ defined for the sequence, and green if the sequence is part of\r
+ the currently selected group.\r
+</p>\r
+ <p>The 3D view can be rotated by dragging the mouse with the\r
+ <strong>left mouse button</strong> pressed. The view can also be\r
+ zoomed in and out with the up and down <strong>arrow\r
+ keys</strong>.</p>\r
+<p>A tool tip gives the sequence ID corresponding to a point in the\r
+ space, and clicking a point toggles the selection of the\r
+ corresponding sequence in the alignment window. Rectangular\r
+ region-based selection is also possible by holding the 'S' key whilst\r
+ left-clicking and dragging the mouse over the display.\r
+</p>\r
+<p>Initially, the display shows the first three components of the\r
+ similarity space, but any eigenvector can be used by changing the selected\r
+ dimension for the x, y, or z axis through that axis's menu, located\r
+ below the 3D display.\r
+</p>\r
+\r
</body>\r
</html>\r