3 * Jalview - A Sequence Alignment Editor and Viewer ($$Version-Rel$$)
4 * Copyright (C) $$Year-Rel$$ The Jalview Authors
6 * This file is part of Jalview.
8 * Jalview is free software: you can redistribute it and/or
9 * modify it under the terms of the GNU General Public License
10 * as published by the Free Software Foundation, either version 3
11 * of the License, or (at your option) any later version.
13 * Jalview is distributed in the hope that it will be useful, but
14 * WITHOUT ANY WARRANTY; without even the implied warranty
15 * of MERCHANTABILITY or FITNESS FOR A PARTICULAR
16 * PURPOSE. See the GNU General Public License for more details.
18 * You should have received a copy of the GNU General Public License
19 * along with Jalview. If not, see <http://www.gnu.org/licenses/>.
20 * The Jalview Authors are detailed in the 'AUTHORS' file.
23 <title>The Alignment Annotations File</title>
27 <p><strong>The Alignment Annotations File</strong></p>
<p>Since Jalview 2.08, alignment annotations can be imported onto an
alignment via an annotations file. This is a simple ASCII text file
consisting of tab-delimited records, similar to the <a
href="featuresFormat.html">Sequence Features File</a>, and was introduced
primarily for use with the Jalview applet.</p>
34 <p><strong>Importing annotation files</strong><br/>
Alignment annotation files can be imported into Jalview in the following ways:
<li>From the command line<strong><pre>
-annotations &lt;<em>Annotations filename</em>&gt;</pre></strong></li>
40 <li>Dragging an annotations file onto an alignment window</li>
41 <li>Via the "Load Features / Annotations" entry in the <strong>File</strong>
42 menu of an alignment window.</li>
46 <strong>Exporting annotation files</strong><br /> An annotation file
47 can be created for any alignment view from the "Export
Annotations ..." entry in the <strong>File</strong> menu of an alignment window.
51 <p><strong>THE ANNOTATION FILE FORMAT</strong>
<br/>An annotation file consists of lines, each containing an instruction followed by
tab-delimited fields. Lines starting with "#" are treated as comments and
ignored. The sections below describe the structure of an annotation file.
56 <li><a href="#annheader">JALVIEW_ANNOTATION</a> mandatory header</li>
57 <li><a href="#annrows">LINE_GRAPH, BAR_GRAPH and NO_GRAPH</a> to create annotation rows</li>
58 <li><a href="#combine">COMBINE, COLOUR and GRAPHLINE</a> for thresholds and complex line graphs</li>
59 <li><a href="#annrowprops">ROWPROPERTIES</a> control the display of individual annotation rows</li>
60 <li><a href="#groupdefs">SEQUENCE_GROUP</a> to define groups of sequences for further annotation</li>
61 <li><a href="#groupprops">PROPERTIES</a> to set visualisation properties for sequence groups</li>
62 <li><a href="#seqgrprefs">SEQUENCE_REF and GROUP_REF</a> for attaching annotation to sequences and groups</li>
63 <li><a href="#refsandviews">VIEW_SETREF, VIEW_HIDECOLS and HIDE_INSERTIONS</a>
64 for defining a reference sequence on the alignment and hiding regions
65 based on gaps in a reference sequence</li>
68 At the end of this document, you can also find notes on <a
69 href="#compatibility">compatibility</a> of annotation files across
70 different versions of Jalview. An <a href="#exampleann">example
annotation file</a> is also provided, along with instructions on how to load it.
<p><strong><em><a name="annheader">Header line</a></em></strong><br/>The first non-comment line of a valid annotations file
must begin with:<strong><pre>JALVIEW_ANNOTATION</pre></strong></p>
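<p>The comment and header rules above can be checked with a few lines of Python. This is a minimal sketch, not Jalview's own parser: the function name <em>read_records</em> is hypothetical, and skipping blank lines is an assumption made for convenience.</p>

```python
def read_records(text):
    """Split annotation-file text into records (lists of tab-delimited fields).

    Lines starting with '#' are treated as comments. Blank lines are also
    skipped, which is an assumption of this sketch. The first remaining
    line must begin with JALVIEW_ANNOTATION.
    """
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    if not lines or not lines[0].startswith("JALVIEW_ANNOTATION"):
        raise ValueError("missing JALVIEW_ANNOTATION header")
    # Every line after the header is an instruction plus its fields.
    return [ln.split("\t") for ln in lines[1:]]


if __name__ == "__main__":
    sample = "#a comment\nJALVIEW_ANNOTATION\nLINE_GRAPH\tScore\t1.0|2.0\n"
    print(read_records(sample))
```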
78 <p><strong><em><a name="annrows">LINE_GRAPH, BAR_GRAPH and NO_GRAPH</a></em></strong><br/>
79 Labels, secondary structure, histograms and line graphs are added with a line like <strong><pre><em>GRAPH_TYPE</em>	<em>Label</em>	<em>Description</em> (optional)	<em>Values</em></pre></strong></p>
81 Here, the <em>GRAPH_TYPE</em> field in the first column defines the
82 appearance of the annotation row when rendered by Jalview. The next
83 field is the row <em>label</em> for the annotation. This may be
84 followed by a <em>description</em> for the row, which is shown in a
85 tooltip when the user mouses over the annotation row's label. Since
Jalview 2.7, the description field may also contain HTML tags (in the same
way as a <a href="featuresFile.html">sequence feature's</a> label),
provided the text is enclosed in an &lt;html&gt; element.
90 <ul><em>Please note: URL links embedded in HTML descriptions are not yet supported.</em>
93 <p>The final <em>Values</em>
94 field contains a series of "|" separated value fields. Each
95 value field is itself a comma separated list of fields of a particular
96 type defined by the annotation row's <em>GRAPH_TYPE</em>. The allowed values of
97 <em>GRAPH_TYPE</em> and corresponding interpretation of each <em>Value</em> are shown below:
100 <li><strong>BAR_GRAPH</strong><br> Plots a histogram with labels below each
bar.<br> <em>number</em>,<em>text character</em>,<em>Tooltip text</em>
104 <li><strong>LINE_GRAPH</strong><br> Draws a line between values on the
105 annotation row.<br> <em>number</em>
107 <li><strong>NO_GRAPH</strong><br>For a row consisting of text labels and/or
108 secondary structure symbols.<br><em>{Secondary Structure
Symbol}</em>,<em>text label</em>,<em>Tooltip text</em><br/><br/>The type of secondary structure symbol depends on whether the alignment being annotated is protein or RNA.<br/>For proteins, the structure symbols are <em>H</em> (for
helix) and <em>E</em> (for strand).<br/><br/>For RNA, VIENNA, WUSS or extended notation can be used to specify positions that are paired (e.g. "(|(||)|)" or "|A|A|A|(|a|a|a|)").</li>
Any or all value fields may be left empty. Within a value field, the
BAR_GRAPH text character may be omitted, as may either or both of the text
label and secondary structure symbol of a NO_GRAPH annotation row.</p>
<p>Colour strings can be embedded in a value field by enclosing an RGB triplet in square brackets, which colours that position in the annotation row.
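<p>As an illustration of the <em>Values</em> layout described above, the following Python sketch splits a Values field into per-column sub-fields and assembles a BAR_GRAPH line. The helper names are hypothetical, and the sketch deliberately ignores embedded colour strings, which would need extra handling if they contain commas.</p>

```python
def parse_values(values_field):
    """Split a Values field on '|' and each value field on ','.

    Empty value fields become empty lists. Embedded colour strings in
    square brackets are NOT handled by this sketch.
    """
    return [cell.split(",") if cell else []
            for cell in values_field.split("|")]


def bar_graph_row(label, description, points):
    """Build a BAR_GRAPH line from (number, character) pairs.

    A point of None leaves that column empty; an empty character
    omits the optional text character field.
    """
    cells = []
    for point in points:
        if point is None:
            cells.append("")
            continue
        number, char = point
        cells.append(f"{number},{char}" if char else str(number))
    return "\t".join(["BAR_GRAPH", label, description, "|".join(cells)])


if __name__ == "__main__":
    print(parse_values("1,.|2,*||4"))
    print(bar_graph_row("Bar", "demo", [(1, "."), None, (3, "")]))
```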
<p><strong><a name="combine">COMBINE, COLOUR and GRAPHLINE</a> for line graphs</strong><br/>
<em>LINE_GRAPH</em> type annotations can be given a colour
(specified as a 24-bit RGB triplet, either in hexadecimal or as comma-separated
values), combined onto the same vertical axis, and given ordinate lines
(horizontal lines at a particular vertical axis value) using the
following commands (respectively):
124 <pre>COLOUR	<em>graph_name</em>	<em>colour</em>
125 COMBINE	<em>graph_1_name</em>	<em>graph_2_name</em>
126 GRAPHLINE	<em>graph_name</em>	<em>value</em>	<em>label</em>	<em>colour</em><strong><em>
130 <p><strong><a name="annrowprops">ROWPROPERTIES</a></strong><br/>
131 The visual display properties for a set of annotation rows can be modified using the following tab-delimited line:</p>
<pre>ROWPROPERTIES	<em>Row label</em>	<em>centrelabs=true (or false)</em>	<em>showalllabs=true (default is false)</em>	<em>scaletofit=true (default is false)</em></pre>
133 <p>This sets the visual display properties according to the given values for all the annotation rows with labels matching <em>Row label</em>. The properties mostly affect the display of multi-character column labels, and are as follows:
134 <ul><li><em>centrelabs</em> Centre each label on its column.</li>
135 <li><em>showalllabs</em> Show every column label rather than only the first of a run of identical labels (setting this to true can have a drastic effect on secondary structure rows).</li>
136 <li><em>scaletofit</em> Shrink each label's font size so that the label fits within the column. Useful when annotating an alignment with a specific column numbering system. (<em>Not available in Jalview applet due to AWT 1.1 limitations</em>)</li>
138 <p><strong><a name="groupdefs">SEQUENCE_GROUP</a></strong><br/>
139 Groups of sequences and column ranges can be defined using a tab delimited statement like:</p>
140 <pre>SEQUENCE_GROUP	Group_Name	Group_Start	Group_End	<em>Sequences</em></pre>
<p>Sequences are specified by alignment index. Individual indices and ranges of
sequences are given in a comma-delimited field such as</p>
143 <p>2-5,8-15,20,22</p>
<p>Enter * to select all sequences. </p>
145 <p><strong>Note:</strong> If the alignment indices are not known, enter -1, followed by a tab and then a tab delimited list
146 of sequence IDs. </p>
<p>If a <a href="#seqgrprefs"><strong>SEQUENCE_REF</strong></a> has been defined, then <em>group_start</em> and <em>group_end</em> are
relative to that sequence's residue numbering; otherwise, they
are alignment column indices. </p>
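<p>A SEQUENCE_GROUP statement can be generated programmatically. The sketch below, with hypothetical helper names, collapses a list of sequence indices into the comma-delimited range field, and falls back to <em>-1</em> plus sequence IDs, or to <em>*</em>, as described above.</p>

```python
def compress_indices(indices):
    """Collapse sequence indices into a range field such as 2-5,8-15,20,22."""
    parts = []
    run_start = prev = None
    for i in sorted(set(indices)):
        if prev is not None and i == prev + 1:
            prev = i  # still inside a consecutive run
            continue
        if run_start is not None:
            parts.append(_format_run(run_start, prev))
        run_start = prev = i
    if run_start is not None:
        parts.append(_format_run(run_start, prev))
    return ",".join(parts)


def _format_run(first, last):
    # A run of one index is written as a single number.
    return str(first) if first == last else f"{first}-{last}"


def sequence_group_line(name, start, end, indices=None, ids=None):
    """Build a SEQUENCE_GROUP line.

    ids     -> '-1' followed by tab-delimited sequence IDs
    indices -> comma-delimited index ranges
    neither -> '*'
    """
    fields = ["SEQUENCE_GROUP", name, str(start), str(end)]
    if ids:
        fields += ["-1", *ids]
    elif indices:
        fields.append(compress_indices(indices))
    else:
        fields.append("*")
    return "\t".join(fields)


if __name__ == "__main__":
    print(sequence_group_line("Group_B", 1, 351, indices=[2, 3, 4, 5]))
```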
151 <p><strong><a name="groupprops">PROPERTIES</a></strong><br/>This statement allows various visualisation properties to be assigned to a named group. This takes a series of tab-delimited <em>key</em>=<em>value</em> pairs:</p>
152 <pre>PROPERTIES	Group_name	tab_delimited_key_value_pairs
<p>The currently supported sequence group key-value pairs are:</p>
156 <tbody><tr><td width="50%">Key</td><td>Value</td></tr>
157 <tr><td width="50%">description</td><td>Text - may include simple HTML tags</td></tr>
158 <tr><td width="50%">colour</td><td>A string resolving to a valid Jalview colourscheme (e.g. Helix Propensity)</td></tr>
159 <tr><td width="50%">pidThreshold</td><td>A number from 0-100 specifying the Percent Identity Threshold for colouring columns in the group or alignment</td></tr>
160 <tr><td width="50%">consThreshold</td><td>A number from 0-100 specifying the degree of bleaching applied for conservation colouring</td></tr>
161 <tr><td width="50%">outlineColour</td><td>Line colour used for outlining the group (default is red)</td></tr>
162 <tr><td width="50%">displayBoxes</td><td>Boolean (default true) controlling display of shaded box for each alignment position</td></tr>
163 <tr><td width="50%">displayText</td><td>Boolean (default true) controlling display of text for each alignment position</td></tr>
164 <tr><td width="50%">colourText</td><td>Boolean (default false) specifying whether text should be shaded by applied colourscheme</td></tr>
165 <tr><td width="50%">textCol1</td><td>Colour for text when shown on a light background</td></tr>
166 <tr><td width="50%">textCol2</td><td>Colour for text when shown on a dark background</td></tr>
167 <tr><td width="50%">textColThreshold</td><td>Number from 0-100 specifying switching threshold between light and dark background</td></tr>
<tr><td width="50%">idColour</td><td>Colour for highlighting the Sequence ID labels for this group<br/>If <em>idColour</em> is given but <em>colour</em> is not, then <em>idColour</em> will also be used for the group background colour.</td></tr>
<tr><td width="50%">showunconserved</td><td>Boolean (default false) indicating whether only residues that differ from the current reference or consensus sequence should be shown</td></tr>
170 <tr><td width="50%">hide</td><td>Boolean (default false) indicating whether the rows in this group should be marked as hidden.<br/><em>Note:</em> if the group is sequence associated (specified by SEQUENCE_REF), then all members will be hidden and marked as represented by the reference sequence.</td></tr>
171 <!-- <tr><td width="50%">hidecols</td><td>Boolean (default false) indicating whether columns in this groushould be marked as hidden</td></tr> --></tbody>
174 <p><strong>Specifying colours in PROPERTIES key-value pairs</strong><br/>
175 The <strong>colour</strong> property can take either a colour scheme name,
176 or a single colour specification (either a colour name like 'red' or an RGB
177 triplet like 'ff0066'). If a single colour is specified, then the group
178 will be coloured with that colour.</p>
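<p>The tab-delimited <em>key</em>=<em>value</em> layout of a PROPERTIES statement can be produced from keyword arguments. This is a sketch only: the function name is hypothetical, it does not validate keys against the table above, and writing booleans as lowercase <em>true</em>/<em>false</em> follows the example file rather than a documented rule.</p>

```python
def properties_line(group_name, **props):
    """Build a PROPERTIES line from keyword arguments.

    Booleans are written as lowercase true/false; all other values are
    written with str(). Keys are not checked against the supported set.
    """
    fields = ["PROPERTIES", group_name]
    for key, value in props.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        fields.append(f"{key}={value}")
    return "\t".join(fields)


if __name__ == "__main__":
    print(properties_line("Group_A",
                          description="This is the description",
                          colour="Helix Propensity",
                          pidThreshold=0,
                          displayText=False))
```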
180 <p><strong><a name="seqgrprefs">SEQUENCE_REF and GROUP_REF</a></strong><br/>
By default, annotation is associated with the alignment as a whole.
183 However, it is also possible to have an annotation row associated with
184 a specific sequence, or a sequence group. Clicking the annotation
185 label for sequence or group associated annotation will highlight the
186 associated rows in the alignment, and double clicking will select
187 those rows, allowing further analysis. While group associated
188 annotation remains associated with a particular alignment, sequence
189 associated annotation can move with a sequence - so copying a sequence
190 to another alignment will also copy its associated annotation.
192 <p>You can associate an annotation with a sequence by preceding its
193 definition with the line:
194 <pre>SEQUENCE_REF	<em>seq_name</em>	<em>[startIndex]</em></pre>
195 All Annotations defined after a SEQUENCE_REF command will then be
196 associated with that sequence, and the first field in the Value field
197 list will (optionally) be placed at the <em>startIndex</em>'th column.</p>
<p>Sequence association is turned off for subsequent annotation rows by:
201 <pre>SEQUENCE_REF	ALIGNMENT</pre>
203 <p>Similarly, since Jalview 2.5, group associated annotation can be defined by preceding the row definitions with the line:
204 <pre>GROUP_REF	<em>group_name</em></pre>
205 Group association is turned off for subsequent annotation rows by:
206 <pre>GROUP_REF	<em>ALIGNMENT</em></pre>
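<p>Because SEQUENCE_REF and GROUP_REF stay in effect until they are reset, a generator script may want to bracket its rows explicitly. The following Python sketch (the function name and argument names are hypothetical) emits the scoping lines before a set of annotation rows and resets the scope to ALIGNMENT afterwards.</p>

```python
def scoped_rows(rows, seq_name=None, start_index=None, group_name=None):
    """Wrap annotation row lines in SEQUENCE_REF / GROUP_REF scope.

    The scope is reset to ALIGNMENT after the rows so that later
    annotation is not accidentally attached to the same sequence or group.
    """
    out = []
    if seq_name is not None:
        ref = ["SEQUENCE_REF", seq_name]
        if start_index is not None:
            ref.append(str(start_index))  # optional startIndex field
        out.append("\t".join(ref))
    if group_name is not None:
        out.append("GROUP_REF\t" + group_name)
    out.extend(rows)
    if group_name is not None:
        out.append("GROUP_REF\tALIGNMENT")
    if seq_name is not None:
        out.append("SEQUENCE_REF\tALIGNMENT")
    return out


if __name__ == "__main__":
    row = "LINE_GRAPH\tGreen Values\t1.1|2.2|1.3"
    print("\n".join(scoped_rows([row], seq_name="FER1_MESCR", start_index=5)))
```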
<p><strong><a name="refsandviews">VIEW_SETREF, VIEW_HIDECOLS and HIDE_INSERTIONS</a></strong><br/>
210 Since Jalview 2.8.3, the Annotations file has also supported the definition of views on the alignment, and definition of hidden regions.</p>
212 <em>VIEW_DEF</em> allows the current view to be named according to the
213 first argument after the tab character. If a second argument is
214 provided, then a new view is created with the given name, and
218 <em>VIEW_SETREF</em> takes either a single sequence ID string, or a
219 numeric index (second argument), and attempts to assign a
220 corresponding sequence as the <a href="../features/refsequence.html">reference
221 sequence</a> for the alignment.
<em>VIEW_HIDECOLS</em> takes a single argument consisting of a
comma-separated series of integer pairs like
<em>3-4</em>. These integer pairs define columns (starting from the
left-hand column 0) that should be marked as hidden in the alignment.
<em>HIDE_INSERTIONS</em> takes either a single sequence ID or a
231 numeric index, or no arguments. This command marks all gapped
232 positions in a specified sequence (either the one located by the
233 arguments, the current SEQUENCE_REF, or the reference sequence for the
236 <p><strong><a name="compatibility">COMPATIBILITY NOTES</a></strong><br/>
237 The interpretation of the COMBINE statement in <em>Version 2.8.1</em> was refined
so that only annotation line graphs with the given names and the same
239 <strong>SEQUENCE_REF</strong> and <strong>GROUP_REF</strong> scope are grouped.</p>
242 <p><strong><a name="exampleann">EXAMPLES</a></strong><br/>
243 An example Annotation file is given below. Copy and paste the contents into a text file and load it onto the Jalview example protein alignment.</p>
<pre>#Comment lines follow the hash symbol
JALVIEW_ANNOTATION
SEQUENCE_REF	FER1_MESCR	5
BAR_GRAPH	Bar Graph 1	&lt;html&gt;an &lt;em&gt;html tooltip&lt;/em&gt; for Bar graph 1.&lt;/html&gt;	||-100,-|-200,-|-300,-|-400,-|200,+|300,+|150,+
LINE_GRAPH	Green Values	1.1|2.2|1.3|3.4|0.7|1.4|3.3|2.2|2.1|-1.1|3.2
LINE_GRAPH	Red Values	2.1|3.2|1.3|-1.4|5.5|1.4|1.3|4.2|-1.1|1.1|3.2
BAR_GRAPH	Bar Graph 2	1,.|2,*|3,:|4,.|5,*|4,:|3,.|2|1|1|2|3|4|5|4
NO_GRAPH	Icons 	||||E,Sheet1|E|E||||H,Sheet 2|H|H|H||||||
NO_GRAPH	Purple Letters	m|y|p|r|o|t|e|i|n
COLOUR	Bar Graph 2	blue
COLOUR	Red Values	255,0,0
COLOUR	Green Values	green
COLOUR	Purple Letters	151,52,228
COMBINE	Green Values	Red Values
GRAPHLINE	Red Values	2.6	threshold	black

SEQUENCE_GROUP	Group_A	30	50	*
SEQUENCE_GROUP	Group_B	1	351	2-5
SEQUENCE_GROUP	Group_C	12	14	-1	seq1	seq2	seq3
PROPERTIES	Group_A	description=This is the description	colour=Helix Propensity	pidThreshold=0	outlineColour=red	displayBoxes=true	displayText=false	colourText=false	textCol1=black	textCol2=black	textColThreshold=0
PROPERTIES	Group_B	outlineColour=red
PROPERTIES	Group_C	colour=Clustal