<!--
 * Jalview - A Sequence Alignment Editor and Viewer ($$Version-Rel$$)
 * Copyright (C) $$Year-Rel$$ The Jalview Authors
 *
 * This file is part of Jalview.
 *
 * Jalview is free software: you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation, either version 3
 * of the License, or (at your option) any later version.
 *
 * Jalview is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty
 * of MERCHANTABILITY or FITNESS FOR A PARTICULAR
 * PURPOSE. See the GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with Jalview. If not, see <http://www.gnu.org/licenses/>.
 * The Jalview Authors are detailed in the 'AUTHORS' file.
 -->
23 <title>The Alignment Annotations File</title>
28 <strong>The Alignment Annotations File</strong>
Since version 2.08 of Jalview, alignment annotations can be imported
onto an alignment from an annotations file. This is a simple ASCII
text file of tab-delimited records, similar to the <a
href="featuresFormat.html">Sequence Features File</a>, that was
introduced primarily for use with the Jalview applet.
39 <strong>Importing annotation files</strong><br /> Alignment
40 annotations files are imported into Jalview in the following ways:<br />
<li>From the command line:<strong><pre>
-annotations &lt;<em>Annotations filename</em>&gt;</pre></strong></li>
<li>By dragging an annotations file onto an alignment window</li>
<li>Via the "Load Features / Annotations" entry in
the <strong>File</strong> menu of an alignment window.</li>
<strong>Exporting annotation files</strong><br /> An annotation
file can be created for any alignment view from the "Export
Annotations ..." entry in the <strong>File</strong> menu of an
alignment window.
57 <strong>THE ANNOTATION FILE FORMAT</strong> <br />An annotation
58 file consists of lines containing an instruction followed by tab
59 delimited fields. Any lines starting with "#" are
60 considered comments, and ignored. The sections below describe the
61 structure of an annotation file.
<li><a href="#annheader">JALVIEW_ANNOTATION</a> mandatory header</li>
66 <li><a href="#annrows">LINE_GRAPH, BAR_GRAPH and NO_GRAPH</a>
67 to create annotation rows</li>
68 <li><a href="#combine">COMBINE, COLOUR and GRAPHLINE</a> for
69 thresholds and complex line graphs</li>
70 <li><a href="#annrowprops">ROWPROPERTIES</a> control the
71 display of individual annotation rows</li>
72 <li><a href="#groupdefs">SEQUENCE_GROUP</a> to define groups of
73 sequences for further annotation</li>
74 <li><a href="#groupprops">PROPERTIES</a> to set visualisation
75 properties for sequence groups</li>
<li><a href="#seqgrprefs">SEQUENCE_REF and GROUP_REF</a> for
specifying target sequences and groups for annotation, reference
sequence and column visibility commands.</li>
79 <li><a href="#refsandviews">VIEW_SETREF, VIEW_HIDECOLS and
80 HIDE_INSERTIONS</a> for assigning the reference sequence on the
81 alignment and hiding columns.</li>
At the end of this document, you can also find notes on the <a
href="#compatibility">compatibility</a> of annotation files
across different versions of Jalview. An <a href="#exampleann">example
annotation file</a> is also provided, along with instructions on how to
load it into Jalview.
<strong><em><a name="annheader">Header line</a></em></strong><br />The
first non-comment line of a valid annotations file must begin
with:<strong><pre>JALVIEW_ANNOTATION</pre></strong>
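<p>As a minimal sketch of this rule, a file could open with a
comment followed by the header. The row label and values below are
hypothetical and serve only to give the header something to
introduce:</p>
<pre>
# Comment lines may appear before the header
JALVIEW_ANNOTATION
NO_GRAPH	Notes	a|b|c
</pre>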
98 <strong><em><a name="annrows">LINE_GRAPH,
99 BAR_GRAPH and NO_GRAPH</a></em></strong><br /> Labels, secondary structure,
100 histograms and line graphs are added with a line like <strong><pre>
<em>GRAPH_TYPE</em>	<em>Label</em>	<em>Description</em> (optional)	<em>Values</em></pre></strong>
105 Here, the <em>GRAPH_TYPE</em> field in the first column defines the
106 appearance of the annotation row when rendered by Jalview. The next
107 field is the row <em>label</em> for the annotation. This may be
108 followed by a <em>description</em> for the row, which is shown in a
109 tooltip when the user mouses over the annotation row's label. Since
110 Jalview 2.7, the description field may also contain HTML tags (in
111 the same way as a <a href="featuresFormat.html">sequence
feature's</a> label), provided the text is enclosed in an &lt;html&gt; tag.
115 <em>Please note: URL links embedded in HTML descriptions are
116 not yet supported.</em>
120 The final <em>Values</em> field contains a series of "|"
121 separated value fields. Each value field is itself a comma separated
122 list of fields of a particular type defined by the annotation row's
123 <em>GRAPH_TYPE</em>. The allowed values of <em>GRAPH_TYPE</em> and
124 corresponding interpretation of each <em>Value</em> are shown below:
129 <li><strong>BAR_GRAPH</strong><br> Plots a histogram with
130 labels below each bar.<br> <em>number</em>,<em>text
131 character</em>,<em>Tooltip text</em></li>
132 <li><strong>LINE_GRAPH</strong><br> Draws a line between
133 values on the annotation row.<br> <em>number</em></li>
134 <li><strong>NO_GRAPH</strong><br>For a row consisting of
135 text labels and/or secondary structure symbols.<br> <em>{Secondary
136 Structure Symbol}</em>,<em>text label</em>,<em>Tooltip text</em><br />
<br />The secondary structure symbols available depend on whether
the alignment being annotated is protein or RNA. <br />For
139 proteins, structure symbols are <em>H</em> (for helix) and <em>E</em>
140 (for strand)<br /> <br />For RNA structures, VIENNA, WUSS, and
141 extended notations can be used to specify paired positions.
142 <ul>e.g. "(|(||)|)" or
"|A|A|A|(|a|a|a|)"</ul></li>
Any or all value fields may be left empty, as may the BAR_GRAPH's
text character field, and either or both of the text label and
secondary structure symbol fields of NO_GRAPH type annotation
rows.
<p>Colour strings can be embedded in a value field by enclosing an
RGB triplet in square brackets to colour that position in an
annotation row.</p>
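<p>The value field layouts described above might look like this in
practice. This is an illustrative sketch only: the row labels,
descriptions and values are hypothetical, and writing the bracketed
colour as a hexadecimal triplet in a trailing comma-separated
position is an assumption about how the colour string is
embedded:</p>
<pre>
JALVIEW_ANNOTATION
# BAR_GRAPH values: number,text character,tooltip text
BAR_GRAPH	Scores	Per-column score	1,a,low|2,b,medium|3,c,high
# LINE_GRAPH values: number (the third value field is left empty)
LINE_GRAPH	Trend	0.5|1.5||2.5
# NO_GRAPH values: secondary structure symbol,text label,tooltip text
NO_GRAPH	Structure	H,h1,first helix|H|H||E,s1,first strand|E
# A colour string in square brackets colours a single position
NO_GRAPH	Marks	x,[ff0000]|y|z
</pre>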
<strong><a name="combine">COMBINE, COLOUR and GRAPHLINE</a>
for line graphs</strong><br /> <em>LINE_GRAPH</em> type annotations can be
given a colour (specified as a 24-bit RGB triplet in hexadecimal or
as comma-separated values), combined onto the same vertical axis, and
given ordinate lines (horizontal lines at a particular vertical axis
value) using the following commands, respectively:
162 <pre>COLOUR	<em>graph_name</em>	<em>colour</em>
163 COMBINE	<em>graph_1_name</em>	<em>graph_2_name</em>
GRAPHLINE	<em>graph_name</em>	<em>value</em>	<em>label</em>	<em>colour</em></pre>
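<p>A short sketch of the three commands used together is given
below. The graph names, values and threshold label are hypothetical.
The two line graphs are drawn on one axis, and a horizontal line is
added at 2.0:</p>
<pre>
JALVIEW_ANNOTATION
LINE_GRAPH	Observed	1.0|2.0|3.5|2.5
LINE_GRAPH	Predicted	1.5|1.8|3.0|2.9
# Colour as a hexadecimal triplet, or as comma-separated values
COLOUR	Observed	ff0000
COLOUR	Predicted	0,0,255
# Plot both graphs on the same vertical axis
COMBINE	Observed	Predicted
# Horizontal line at 2.0, labelled "cutoff"
GRAPHLINE	Observed	2.0	cutoff	black
</pre>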
170 <strong><a name="annrowprops">ROWPROPERTIES</a></strong><br /> The
171 visual display properties for a set of annotation rows can be
172 modified using the following tab-delimited line:
<pre>ROWPROPERTIES	<em>Row label</em>	<em>centrelabs=true (or false)</em>	<em>showalllabs=true (default is false)</em>	<em>scaletofit=true (default is false)</em></pre>
177 This sets the visual display properties according to the given
178 values for all the annotation rows with labels matching <em>Row
179 label</em>. The properties mostly affect the display of multi-character
180 column labels, and are as follows:
182 <li><em>centrelabs</em> Centre each label on its column.</li>
183 <li><em>showalllabs</em> Show every column label rather than
184 only the first of a run of identical labels (setting this to true
185 can have a drastic effect on secondary structure rows).</li>
186 <li><em>scaletofit</em> Shrink each label's font size so that
187 the label fits within the column. Useful when annotating an
188 alignment with a specific column numbering system. (<em>Not
189 available in Jalview applet due to AWT 1.1 limitations</em>)</li>
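<p>For instance, a row of multi-character column numbers could be
centred, shown in full and scaled to fit. The row label and values in
this sketch are hypothetical:</p>
<pre>
JALVIEW_ANNOTATION
NO_GRAPH	Numbering	10|11|12|13|14
ROWPROPERTIES	Numbering	centrelabs=true	showalllabs=true	scaletofit=true
</pre>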
193 <strong><a name="groupdefs">SEQUENCE_GROUP</a></strong><br />
194 Groups of sequences and column ranges can be defined using a tab
195 delimited statement like:
<pre>SEQUENCE_GROUP	Group_Name	Group_Start	Group_End	<em>Sequences</em></pre>
<p>Sequences are specified by their alignment index. Ranges and
individual indices can be combined in a comma-delimited field such as</p>
<p>2-5,8-15,20,22</p>
<p>Enter * to select all sequences.</p>
204 <strong>Note:</strong> If the alignment indices are not known, enter
205 -1, followed by a tab and then a tab delimited list of sequence IDs.
208 If a <a href="#seqgrprefs"><strong>SEQUENCE_REF</strong></a> has
209 been defined, then <em>group_start</em> and <em>group_end</em> will
210 be relative to the sequence residue numbering, otherwise the <em>group_start</em>
211 and <em>group_end</em> will be alignment column indices.
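<p>The three ways of listing sequences might be written as follows.
The group names, column ranges and sequence IDs in this sketch are
hypothetical:</p>
<pre>
JALVIEW_ANNOTATION
# Sequences given by alignment index, as ranges and single indices
SEQUENCE_GROUP	Core	10	40	1-3,6
# All sequences
SEQUENCE_GROUP	Everything	1	100	*
# Indices not known: -1, then a tab-delimited list of sequence IDs
SEQUENCE_GROUP	Named	5	20	-1	seqA	seqB
</pre>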
<strong><a name="groupprops">PROPERTIES</a></strong><br />This
statement assigns visualisation properties to a named group. It
takes a series of tab-delimited <em>key</em>=<em>value</em> pairs:
<pre>PROPERTIES	Group_name	tab_delimited_key_value_pairs</pre>
<p>The sequence group key-value pairs currently supported
are:</p>
227 <td width="50%">Key</td>
231 <td width="50%">description</td>
232 <td>Text - may include simple HTML tags</td>
235 <td width="50%">colour</td>
236 <td>A string resolving to a valid Jalview colourscheme
237 (e.g. Helix Propensity)</td>
240 <td width="50%">pidThreshold</td>
241 <td>A number from 0-100 specifying the Percent Identity
242 Threshold for colouring columns in the group or alignment</td>
245 <td width="50%">consThreshold</td>
246 <td>A number from 0-100 specifying the degree of bleaching
247 applied for conservation colouring</td>
250 <td width="50%">outlineColour</td>
251 <td>Line colour used for outlining the group (default is
255 <td width="50%">displayBoxes</td>
256 <td>Boolean (default true) controlling display of shaded
257 box for each alignment position</td>
260 <td width="50%">displayText</td>
261 <td>Boolean (default true) controlling display of text for
262 each alignment position</td>
265 <td width="50%">colourText</td>
266 <td>Boolean (default false) specifying whether text should
267 be shaded by applied colourscheme</td>
270 <td width="50%">textCol1</td>
271 <td>Colour for text when shown on a light background</td>
274 <td width="50%">textCol2</td>
275 <td>Colour for text when shown on a dark background</td>
278 <td width="50%">textColThreshold</td>
279 <td>Number from 0-100 specifying switching threshold
280 between light and dark background</td>
283 <td width="50%">idColour</td>
284 <td>Colour for highlighting the Sequence ID labels for this
285 group<br />If <em>idColour</em> is given but <em>colour</em>
is not, then <em>idColour</em> will also be used for the group colour.
291 <td width="50%">showunconserved</td>
<td>Boolean (default false) indicating whether only residues
that differ from the current reference or consensus sequence
should be shown</td>
297 <td width="50%">hide</td>
298 <td>Boolean (default false) indicating whether the rows in
299 this group should be marked as hidden.<br /> <em>Note:</em>
300 if the group is sequence associated (specified by
301 SEQUENCE_REF), then all members will be hidden and marked as
302 represented by the reference sequence.
<!-- <tr><td width="50%">hidecols</td><td>Boolean (default false) indicating whether columns in this group should be marked as hidden</td></tr> -->
310 <strong>Specifying colours in PROPERTIES key-value pairs</strong><br />
311 The <strong>colour</strong> property can take either a colour scheme
312 name, or a single colour specification (either a colour name like
313 'red' or an RGB triplet like 'ff0066'). If a single colour is
314 specified, then the group will be coloured with that colour.
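<p>As a sketch, the two forms of the <strong>colour</strong>
property could be applied to two groups like this. The group names,
ranges and description text are hypothetical:</p>
<pre>
JALVIEW_ANNOTATION
SEQUENCE_GROUP	Core	10	40	1-3,6
SEQUENCE_GROUP	Flank	41	60	*
# A colour scheme name
PROPERTIES	Core	description=Conserved core	colour=Helix Propensity	outlineColour=blue
# A single colour, given as an RGB triplet
PROPERTIES	Flank	colour=ff0066	displayText=false
</pre>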
318 <strong><a name="seqgrprefs">SEQUENCE_REF and GROUP_REF</a></strong><br />
319 By default, annotation is associated with the alignment as a whole.
320 However, it is also possible to have an annotation row associated
321 with a specific sequence, or a sequence group. Clicking the
322 annotation label for sequence or group associated annotation will
323 highlight the associated rows in the alignment, and double clicking
324 will select those rows, allowing further analysis. While group
325 associated annotation remains associated with a particular
326 alignment, sequence associated annotation can move with a sequence -
327 so copying a sequence to another alignment will also copy its
328 associated annotation.
330 <p>You can associate an annotation with a sequence by preceding
331 its definition with the line:
<pre>SEQUENCE_REF	<em>seq_name</em>	<em>[startIndex]</em></pre>
All annotations defined after a SEQUENCE_REF command will then be
associated with that sequence. If the optional <em>startIndex</em>
is given, the first field in the Values list is placed at the
<em>startIndex</em>'th column.
<p>Sequence associations are turned off for subsequent annotation rows by:
342 <pre>SEQUENCE_REF	ALIGNMENT</pre>
<p>Similarly, since Jalview 2.5, group associated annotation can
be defined by preceding the row definitions with the line:
<pre>GROUP_REF	<em>group_name</em></pre>
Group association is turned off for subsequent annotation rows by:
<pre>GROUP_REF	<em>ALIGNMENT</em></pre>
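<p>The following sketch shows one row attached to a sequence, one
to a group, and one to the alignment as a whole. The sequence ID,
group name, labels and values are hypothetical, and defining the
group before referring to it is a cautious choice rather than a
stated requirement:</p>
<pre>
JALVIEW_ANNOTATION
SEQUENCE_GROUP	Core	10	40	1-3
# Row associated with sequence seqA, starting at column 5
SEQUENCE_REF	seqA	5
NO_GRAPH	Sites	a|b|c
SEQUENCE_REF	ALIGNMENT
# Row associated with the group Core
GROUP_REF	Core
LINE_GRAPH	Core score	0.2|0.4|0.6
GROUP_REF	ALIGNMENT
# Row associated with the whole alignment
LINE_GRAPH	Overall score	0.1|0.3|0.5
</pre>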
<strong><a name="refsandviews">VIEW_SETREF,
VIEW_HIDECOLS and HIDE_INSERTIONS</a></strong><br /> Since Jalview 2.9, the
356 Annotations file has also supported the definition of reference
357 sequences and hidden regions for an alignment view.
360 <em>VIEW_DEF</em> allows the current view to be named according to the
361 first argument after the tab character. If a second argument is
362 provided, then a new view is created with the given name, and
<em>VIEW_SETREF</em><br />Marks the first sequence in the
alignment, or alternatively the one specified by the most recent <em>SEQUENCE_REF</em>
368 statement, as the <a href="../calculations/referenceseq.html">reference
369 sequence</a> for the alignment.
372 <em>HIDE_INSERTIONS</em><br />This command hides all gapped
373 positions in the current target sequence. Any columns already hidden
374 will be re-displayed.<br /> <br>The current target sequence is
375 either the one specified by the most recent <em>SEQUENCE_REF</em>
statement, the alignment's reference sequence, or the first sequence in the alignment.
380 <em>VIEW_HIDECOLS</em><br />Modifies the visibility of columns in
381 the view. The statement is followed by a single argument consisting
382 of a comma separated series of single integers or integer pairs
383 (like <em>3-4</em>). These define columns (starting from the
left-hand column 0) that should be marked as hidden in the alignment view.
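<p>A possible combination of these statements is sketched below.
The sequence ID and column ranges are hypothetical. VIEW_HIDECOLS is
placed last because HIDE_INSERTIONS re-displays any columns that are
already hidden. Whether the two then combine as intended is an
assumption:</p>
<pre>
JALVIEW_ANNOTATION
# Make seqA the target, then mark it as the reference sequence
SEQUENCE_REF	seqA
VIEW_SETREF
# Hide all columns that are gapped in seqA
HIDE_INSERTIONS
SEQUENCE_REF	ALIGNMENT
# Additionally hide columns 0 to 4, 10, and 20 to 25
VIEW_HIDECOLS	0-4,10,20-25
</pre>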
390 <strong><a name="compatibility">COMPATIBILITY NOTES</a></strong><br />
391 The interpretation of the COMBINE statement in <em>Version
392 2.8.1</em> was refined so that only annotation line graphs with the
given names and the same <strong>SEQUENCE_REF</strong> and <strong>GROUP_REF</strong>
399 <strong><a name="exampleann">EXAMPLES</a></strong><br /> An example
400 Annotation file is given below. Copy and paste the contents into a
401 text file and load it onto the Jalview example protein alignment.
<pre>#Comment lines follow the hash symbol
JALVIEW_ANNOTATION
SEQUENCE_REF	FER1_MESCR	5
BAR_GRAPH	Bar Graph 1	&lt;html&gt;an &lt;em&gt;html tooltip&lt;/em&gt; for Bar graph 1.&lt;/html&gt;	||-100,-|-200,-|-300,-|-400,-|200,+|300,+|150,+
LINE_GRAPH	Green Values	1.1|2.2|1.3|3.4|0.7|1.4|3.3|2.2|2.1|-1.1|3.2
LINE_GRAPH	Red Values	2.1|3.2|1.3|-1.4|5.5|1.4|1.3|4.2|-1.1|1.1|3.2
BAR_GRAPH	Bar Graph 2	1,.|2,*|3,:|4,.|5,*|4,:|3,.|2|1|1|2|3|4|5|4
NO_GRAPH	Icons 	||||E,Sheet1|E|E||||H,Sheet 2|H|H|H||||||
NO_GRAPH	Purple Letters	m|y|p|r|o|t|e|i|n
COLOUR	Bar Graph 2	blue
COLOUR	Red Values	255,0,0
COLOUR	Green Values	green
COLOUR	Purple Letters	151,52,228
COMBINE	Green Values	Red Values
GRAPHLINE	Red Values	2.6	threshold	black

SEQUENCE_GROUP	Group_A	30	50	*
SEQUENCE_GROUP	Group_B	1	351	2-5
SEQUENCE_GROUP	Group_C	12	14	-1	seq1	seq2	seq3
PROPERTIES	Group_A	description=This is the description	colour=Helix Propensity	pidThreshold=0	outlineColour=red	displayBoxes=true	displayText=false	colourText=false	textCol1=black	textCol2=black	textColThreshold=0
PROPERTIES	Group_B	outlineColour=red
PROPERTIES	Group_C	colour=Clustal
</pre>