3 * Jalview - A Sequence Alignment Editor and Viewer ($$Version-Rel$$)
4 * Copyright (C) $$Year-Rel$$ The Jalview Authors
6 * This file is part of Jalview.
8 * Jalview is free software: you can redistribute it and/or
9 * modify it under the terms of the GNU General Public License
10 * as published by the Free Software Foundation, either version 3
11 * of the License, or (at your option) any later version.
13 * Jalview is distributed in the hope that it will be useful, but
14 * WITHOUT ANY WARRANTY; without even the implied warranty
15 * of MERCHANTABILITY or FITNESS FOR A PARTICULAR
16 * PURPOSE. See the GNU General Public License for more details.
18 * You should have received a copy of the GNU General Public License
19 * along with Jalview. If not, see <http://www.gnu.org/licenses/>.
20 * The Jalview Authors are detailed in the 'AUTHORS' file.
23 <title>The Alignment Annotations File</title>
28 <strong>The Alignment Annotations File</strong>
31 Since version 2.08, Jalview has been able to import alignment
32 annotations from an annotations file. This is a simple ASCII text
33 file of tab-delimited records, similar to the <a
34 href="featuresFormat.html">Sequence Features File</a>, that was
35 introduced primarily for use with the Jalview applet.
39 <strong>Importing annotation files</strong><br /> Alignment
40 annotations files are imported into Jalview in the following ways:<br />
42 <li>From the command line<strong><pre>
43 --annotations &lt;<em>Annotations filename</em>&gt;</pre></strong></li>
44 <li>By dragging an annotations file onto an alignment window</li>
45 <li>Via the "Load Features / Annotations" entry in
46 the <strong>File</strong> menu of an alignment window.
51 <strong>Exporting annotation files</strong><br /> An annotation
52 file can be created for any alignment view from the "Export
53 Annotations ..." entry in the <strong>File</strong> menu of an alignment window.
57 <strong>THE ANNOTATION FILE FORMAT</strong> <br />An annotation
58 file consists of lines containing an instruction followed by tab
59 delimited fields. Lines starting with "#" are treated as
60 comments and ignored. The sections below describe the
61 structure of an annotation file.
64 <li><a href="#annheader">JALVIEW_ANNOTATION</a> mandatory
66 <li><a href="#annrows">LINE_GRAPH, BAR_GRAPH and NO_GRAPH</a>
67 to create annotation rows</li>
68 <li><a href="#combine">COMBINE, COLOUR and GRAPHLINE</a> for
69 thresholds and complex line graphs</li>
70 <li><a href="#annrowprops">ROWPROPERTIES</a> control the
71 display of individual annotation rows</li>
72 <li><a href="#groupdefs">SEQUENCE_GROUP</a> to define groups of
73 sequences for further annotation</li>
74 <li><a href="#groupprops">PROPERTIES</a> to set visualisation
75 properties for sequence groups</li>
76 <li><a href="#seqgrprefs">SEQUENCE_REF and GROUP_REF</a> for
77 specifying target sequences and groups for annotation, reference
78 sequence and column visibility commands.</li>
79 <li><a href="#refsandviews">VIEW_SETREF, VIEW_HIDECOLS and
80 HIDE_INSERTIONS</a> for assigning the reference sequence on the
81 alignment and hiding columns.</li>
84 At the end of this document, you can also find notes on <a
85 href="#compatibility">compatibility</a> of annotation files
86 across different versions of Jalview. An <a href="#exampleann">example
87 annotation file</a> is also provided along with instructions on how to
92 <strong><em><a name="annheader">Header line</a></em></strong><br />The
93 first non-comment line of a valid annotations file must begin
94 with:<strong><pre>JALVIEW_ANNOTATION</pre></strong>
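<p>As a rough illustration of the rules above, the following Python sketch skips comment lines, checks the header line and splits every remaining line into an instruction and its tab-delimited fields. It is not part of Jalview: the function name <em>read_annotation_records</em> is hypothetical, and skipping blank lines is an assumption rather than documented behaviour.</p>

```python
def read_annotation_records(text):
    """Return a list of (instruction, fields) tuples from annotations file text."""
    records = []
    header_seen = False
    for line in text.splitlines():
        # Lines starting with "#" are comments. Blank lines are skipped as well
        # (an assumption: the format description does not mention them).
        if not line.strip() or line.startswith("#"):
            continue
        if not header_seen:
            # The first non-comment line must begin with the header keyword.
            if not line.startswith("JALVIEW_ANNOTATION"):
                raise ValueError("missing JALVIEW_ANNOTATION header")
            header_seen = True
            continue
        # Every other line is an instruction followed by tab-delimited fields.
        instruction, *fields = line.split("\t")
        records.append((instruction, fields))
    if not header_seen:
        raise ValueError("missing JALVIEW_ANNOTATION header")
    return records
```

<p>Each returned tuple can then be dispatched on its instruction name, for example LINE_GRAPH or SEQUENCE_GROUP.</p>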
98 <strong><em><a name="annrows">LINE_GRAPH,
99 BAR_GRAPH and NO_GRAPH</a></em></strong><br /> Labels, secondary structure,
100 histograms and line graphs are added with a line like <strong><pre>
101 <em>GRAPH_TYPE</em>	<em>Label</em>	<em>Description</em> (optional)	<em>Values</em>
105 Here, the <em>GRAPH_TYPE</em> field in the first column defines the
106 appearance of the annotation row when rendered by Jalview. The next
107 field is the row <em>label</em> for the annotation. This may be
108 followed by a <em>description</em> for the row, which is shown in a
109 tooltip when the user mouses over the annotation row's label. Since
110 Jalview 2.7, the description field may also contain HTML tags (in
111 the same way as a <a href="featuresFormat.html">sequence
112 feature's</a> label), provided the text is enclosed in an &lt;html&gt; element.
115 <em>Please note: URL links embedded in HTML descriptions are
116 not yet supported.</em>
120 The final <em>Values</em> field contains a series of "|"
121 separated value fields. Each value field is itself a comma separated
122 list of fields of a particular type defined by the annotation row's
123 <em>GRAPH_TYPE</em>. The allowed values of <em>GRAPH_TYPE</em> and
124 corresponding interpretation of each <em>Value</em> are shown below:
129 <li><strong>BAR_GRAPH</strong><br> Plots a histogram with
130 labels below each bar.<br> <em>number</em>,<em>text
131 character</em>,<em>Tooltip text</em></li>
132 <li><strong>LINE_GRAPH</strong><br> Draws a line between
133 values on the annotation row.<br> <em>number</em></li>
134 <li><strong>NO_GRAPH</strong><br>For a row consisting of
135 text labels and/or secondary structure symbols.<br> <em>{Secondary
136 Structure Symbol}</em>,<em>text label</em>,<em>Tooltip text</em><br />
137 <br />The available secondary structure symbols depend on
138 whether the alignment being annotated is protein or RNA. <br />For
139 proteins, structure symbols are <em>H</em> (for helix) and <em>E</em>
140 (for strand).<br /> <br />For RNA structures, VIENNA, WUSS, and
141 extended notations can be used to specify paired positions.
142 <ul>e.g. "(|(||)|)" or
143 "|A|A|A|(|a|a|a|)"
146 Any or all value fields may be left empty, as may the BAR_GRAPH's
147 text character field and either or both of the text label and
148 secondary structure symbol fields of a NO_GRAPH type annotation row.
151 <p>Colour strings can be embedded in a value field by enclosing an
152 RGB triplet in square brackets to colour that position in an annotation row.</p>
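<p>The structure of the <em>Values</em> field can be sketched in Python as follows. This is a minimal illustration only, with hypothetical function names: it splits on "|" and "," exactly as described above, treats empty value fields as missing, and makes no attempt to handle colour strings in square brackets or tooltip text that itself contains commas.</p>

```python
def parse_values(values_field):
    """Split a Values field into one list of sub-fields per alignment column."""
    # "|" separates columns; "," separates the sub-fields within one column.
    # An empty column becomes an empty list.
    return [cell.split(",") if cell else [] for cell in values_field.split("|")]


def bar_heights(values_field):
    """Return the numeric part of each BAR_GRAPH value, or None where absent."""
    heights = []
    for subfields in parse_values(values_field):
        # The number is the first sub-field of a BAR_GRAPH value.
        number = subfields[0].strip() if subfields else ""
        heights.append(float(number) if number else None)
    return heights
```

<p>For a NO_GRAPH row the same split applies, but the sub-fields are read as secondary structure symbol, text label and tooltip text instead.</p>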
156 <strong><a name="combine">COMBINE, COLOUR and GRAPHLINE</a>
157 for line graphs</strong><br /> <em>LINE_GRAPH</em> type annotations can be
158 given a colour (specified as a 24-bit RGB triplet in hexadecimal or
159 as comma-separated values), combined onto the same vertical axis, and
160 given ordinate lines (horizontal lines at a particular vertical axis
161 value) using the following commands (respectively):
162 <pre>COLOUR	<em>graph_name</em>	<em>colour</em>
163 COMBINE	<em>graph_1_name</em>	<em>graph_2_name</em>
164 GRAPHLINE	<em>graph_name</em>	<em>value</em>	<em>label</em>	<em>colour</em><strong><em>
170 <strong><a name="annrowprops">ROWPROPERTIES</a></strong><br /> The
171 visual display properties for a set of annotation rows can be
172 modified using the following tab-delimited line:
174 <pre>ROWPROPERTIES	<em>Row label</em>	<em>centrelabs=true (or false)</em>	<em>showalllabs=true (default is false)</em>	<em>scaletofit=true (default is false)</em>
177 This sets the visual display properties according to the given
178 values for all the annotation rows with labels matching <em>Row
179 label</em>. The properties mostly affect the display of multi-character
180 column labels, and are as follows:
182 <li><em>centrelabs</em> Centre each label on its column.</li>
183 <li><em>showalllabs</em> Show every column label rather than
184 only the first of a run of identical labels (setting this to true
185 can have a drastic effect on secondary structure rows).</li>
186 <li><em>scaletofit</em> Shrink each label's font size so that
187 the label fits within the column. Useful when annotating an
188 alignment with a specific column numbering system. (<em>Not
189 available in Jalview applet due to AWT 1.1 limitations</em>)</li>
194 <strong><a name="groupdefs">SEQUENCE_GROUP</a></strong><br />
195 Groups of sequences and column ranges can be defined using a tab
196 delimited statement like:
198 <pre>SEQUENCE_GROUP	Group_Name	Group_Start	Group_End	<em>Sequences</em>
200 <p>The sequences can be defined by alignment index and a range of
201 sequences can be defined in a comma delimited field such as</p>
202 <p>2-5,8-15,20,22</p>
203 <p>Enter * to select all sequences.</p>
204 <p>Set both <em>Group_Start</em> and <em>Group_End</em> to * to include the full sequence(s) range.
206 <strong>Note:</strong> If the alignment indices are not known, enter
207 -1, followed by a tab and then a tab delimited list of sequence IDs.
210 If a <a href="#seqgrprefs"><strong>SEQUENCE_REF</strong></a> has
211 been defined, then <em>Group_Start</em> and <em>Group_End</em> are
212 relative to that sequence's residue numbering; otherwise, <em>Group_Start</em>
213 and <em>Group_End</em> are alignment column indices.
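<p>A SEQUENCE_GROUP statement could be interpreted along the following lines. This Python sketch is illustrative only: the function names and the returned dictionary layout are hypothetical, and treating a range such as 2-5 as inclusive of both ends is an assumption.</p>

```python
def expand_sequence_spec(spec):
    """Expand '2-5,8-15,20,22' into a list of indices; '*' returns None (all)."""
    if spec.strip() == "*":
        return None
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # Assumption: a range includes both of its end points.
            first, last = part.split("-", 1)
            indices.extend(range(int(first), int(last) + 1))
        else:
            indices.append(int(part))
    return indices


def parse_sequence_group(line):
    """Parse one tab-delimited SEQUENCE_GROUP statement into a dictionary."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 5 or fields[0] != "SEQUENCE_GROUP":
        raise ValueError("not a SEQUENCE_GROUP statement")
    name, start, end, sequences = fields[1:5]
    group = {
        "name": name,
        # "*" for start or end means the full range, represented here as None.
        "start": None if start == "*" else int(start),
        "end": None if end == "*" else int(end),
        "sequence_indices": [],
        "sequence_ids": [],
    }
    if sequences == "-1":
        # Indices unknown: the remaining fields are sequence IDs.
        group["sequence_ids"] = fields[5:]
    else:
        group["sequence_indices"] = expand_sequence_spec(sequences)
    return group
```

<p>Whether the start and end values are residue numbers or column indices depends on any preceding SEQUENCE_REF, so the sketch leaves them uninterpreted.</p>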
217 <strong><a name="groupprops">PROPERTIES</a></strong><br />This
218 statement allows various visualisation properties to be assigned to
219 a named group. This takes a series of tab-delimited <em>key</em>=<em>value</em> pairs:
222 <pre>PROPERTIES	Group_name	tab_delimited_key_value_pairs
224 <p>The sequence group key-value pairs currently supported
225 here are:</p>
229 <td width="50%">Key</td>
233 <td width="50%">description</td>
234 <td>Text - may include simple HTML tags</td>
237 <td width="50%">colour</td>
238 <td>A string resolving to a valid Jalview colourscheme
239 (e.g. Helix Propensity)</td>
242 <td width="50%">pidThreshold</td>
243 <td>A number from 0-100 specifying the Percent Identity
244 Threshold for colouring columns in the group or alignment</td>
247 <td width="50%">consThreshold</td>
248 <td>A number from 0-100 specifying the degree of bleaching
249 applied for conservation colouring</td>
252 <td width="50%">outlineColour</td>
253 <td>Line colour used for outlining the group (default is
257 <td width="50%">displayBoxes</td>
258 <td>Boolean (default true) controlling display of shaded
259 box for each alignment position</td>
262 <td width="50%">displayText</td>
263 <td>Boolean (default true) controlling display of text for
264 each alignment position</td>
267 <td width="50%">colourText</td>
268 <td>Boolean (default false) specifying whether text should
269 be shaded by applied colourscheme</td>
272 <td width="50%">textCol1</td>
273 <td>Colour for text when shown on a light background</td>
276 <td width="50%">textCol2</td>
277 <td>Colour for text when shown on a dark background</td>
280 <td width="50%">textColThreshold</td>
281 <td>Number from 0-100 specifying switching threshold
282 between light and dark background</td>
285 <td width="50%">idColour</td>
286 <td>Colour for highlighting the Sequence ID labels for this
287 group<br />If <em>idColour</em> is given but <em>colour</em>
288 is not, then <em>idColour</em> will also be used for the group
293 <td width="50%">showunconserved</td>
294 <td>Boolean (default false) indicating whether only residues
295 that differ from the current reference or consensus sequence
296 should be shown</td>
299 <td width="50%">hide</td>
300 <td>Boolean (default false) indicating whether the rows in
301 this group should be marked as hidden.<br /> <em>Note:</em>
302 if the group is sequence associated (specified by
303 SEQUENCE_REF), then all members will be hidden and marked as
304 represented by the reference sequence.
307 <!-- <tr><td width="50%">hidecols</td><td>Boolean (default false) indicating whether columns in this group should be marked as hidden</td></tr> -->
312 <strong>Specifying colours in PROPERTIES key-value pairs</strong><br />
313 The <strong>colour</strong> property can take either a colour scheme
314 name, or a single colour specification (either a colour name like
315 'red' or an RGB triplet like 'ff0066'). If a single colour is
316 specified, then the group will be coloured with that colour.
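<p>The following Python sketch shows one way a PROPERTIES statement might be read. It is an illustration rather than Jalview's own parser: the function names are hypothetical, the set of boolean keys is taken from the table above, and accepting "true" in any letter case is an assumption. The helper <em>rgb_from_hex</em> only recognises six-digit hexadecimal triplets such as ff0066; colour names and colour scheme names are returned unchanged as text.</p>

```python
import re

# Keys documented above as taking Boolean values.
BOOLEAN_KEYS = {"displayBoxes", "displayText", "colourText", "showunconserved", "hide"}


def parse_properties(line):
    """Return (group_name, properties) for one tab-delimited PROPERTIES statement."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 2 or fields[0] != "PROPERTIES":
        raise ValueError("not a PROPERTIES statement")
    properties = {}
    for pair in fields[2:]:
        key, separator, value = pair.partition("=")
        if not separator:
            continue  # assumption: fields without "=" are ignored
        if key in BOOLEAN_KEYS:
            properties[key] = value.strip().lower() == "true"
        else:
            properties[key] = value
    return fields[1], properties


def rgb_from_hex(value):
    """Return (red, green, blue) for a hex triplet like 'ff0066', else None."""
    if re.fullmatch(r"[0-9a-fA-F]{6}", value):
        return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))
    return None
```

<p>A caller would try <em>rgb_from_hex</em> on the <em>colour</em> value first and fall back to looking it up as a colour name or colour scheme name.</p>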
320 <strong><a name="seqgrprefs">SEQUENCE_REF and GROUP_REF</a></strong><br />
321 By default, annotation is associated with the alignment as a whole.
322 However, it is also possible to have an annotation row associated
323 with a specific sequence, or a sequence group. Clicking the
324 annotation label for sequence or group associated annotation will
325 highlight the associated rows in the alignment, and double clicking
326 will select those rows, allowing further analysis. While group
327 associated annotation remains associated with a particular
328 alignment, sequence associated annotation can move with a sequence -
329 so copying a sequence to another alignment will also copy its
330 associated annotation.
332 <p>You can associate an annotation with a sequence by preceding
333 its definition with the line:
334 <pre>SEQUENCE_REF	<em>seq_name</em>	<em>[startIndex]</em>
336 All annotation rows defined after a SEQUENCE_REF command are then
337 associated with that sequence. If the optional <em>startIndex</em> is
338 given, the first value in the <em>Values</em> field is placed at the
339 <em>startIndex</em>'th column.
342 <p>Sequence associations are turned off for subsequent annotation rows by:
344 <pre>SEQUENCE_REF	ALIGNMENT</pre>
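<p>The way SEQUENCE_REF scopes the annotation rows that follow it can be sketched in Python as below. The function name and return format are hypothetical, and the sketch only tracks which sequence each row would be attached to; it does not build any annotation.</p>

```python
ROW_TYPES = {"BAR_GRAPH", "LINE_GRAPH", "NO_GRAPH"}


def associate_rows(lines):
    """Return (row_label, target) pairs; target is None for alignment-wide rows."""
    current = None  # annotation is associated with the whole alignment by default
    rows = []
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if fields[0] == "SEQUENCE_REF" and len(fields) > 1:
            if fields[1] == "ALIGNMENT":
                current = None  # association switched off again
            else:
                # The start index is optional.
                has_start = len(fields) > 2 and fields[2].strip()
                current = (fields[1], int(fields[2]) if has_start else None)
        elif fields[0] in ROW_TYPES and len(fields) > 1:
            rows.append((fields[1], current))
    return rows
```

<p>Each row simply inherits whichever SEQUENCE_REF was most recently seen, which is why the statement order in the file matters.</p>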
346 <p>Similarly, since Jalview 2.5, group associated annotation can
347 be defined by preceding the row definitions with the line:
348 <pre>GROUP_REF	<em>group_name</em>
350 Group association is turned off for subsequent annotation rows by:
351 <pre>GROUP_REF	<em>ALIGNMENT</em>
356 <strong><a name="refsandviews">VIEW_SETREF,
357 VIEW_HIDECOLS and HIDE_INSERTIONS</a></strong><br /> Since Jalview 2.9, the
358 Annotations file has also supported the definition of reference
359 sequences and hidden regions for an alignment view.
362 <em>VIEW_DEF</em> allows the current view to be named according to the
363 first argument after the tab character. If a second argument is
364 provided, then a new view is created with the given name, and
368 <em>VIEW_SETREF</em><br />Marks the first sequence in the
369 alignment, or alternatively, the one specified by the most recent <em>SEQUENCE_REF</em>
370 statement, as the <a href="../calculations/referenceseq.html">reference
371 sequence</a> for the alignment.
374 <em>HIDE_INSERTIONS</em><br />This command hides all gapped
375 positions in the current target sequence. Any columns already hidden
376 will be re-displayed.<br /> <br />The current target sequence is
377 either the one specified by the most recent <em>SEQUENCE_REF</em>
378 statement, the alignment's reference sequence, or the first sequence in the alignment.
382 <em>VIEW_HIDECOLS</em><br />Modifies the visibility of columns in
383 the view. The statement is followed by a single argument consisting
384 of a comma separated series of single integers or integer pairs
385 (like <em>3-4</em>). These define columns (starting from the
386 left-hand column 0) that should be marked as hidden in the alignment
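<p>The VIEW_HIDECOLS argument can be split into column ranges with a few lines of Python. This is a sketch with a hypothetical function name; it returns each entry as a (first, last) pair of 0-based column numbers exactly as written, with a single integer giving a pair whose two values are equal.</p>

```python
def parse_hidden_columns(argument):
    """Turn '3-4,7,10-12' into a list of (first, last) column pairs."""
    ranges = []
    for part in argument.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas (an assumption)
        first, separator, last = part.partition("-")
        start = int(first)
        end = int(last) if separator else start
        ranges.append((start, end))
    return ranges
```

<p>Keeping the ranges as pairs, rather than expanding them into individual columns, stays close to the form used in the file.</p>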
392 <strong><a name="compatibility">COMPATIBILITY NOTES</a></strong><br />
393 The interpretation of the COMBINE statement in <em>Version
394 2.8.1</em> was refined so that only annotation line graphs with the
395 given names and the same <strong>SEQUENCE_REF</strong> and <strong>GROUP_REF</strong>
401 <strong><a name="exampleann">EXAMPLES</a></strong><br /> An example
402 Annotation file is given below. Copy and paste the contents into a
403 text file and load it onto the Jalview example protein alignment.
<pre>#Comment lines follow the hash symbol
JALVIEW_ANNOTATION
SEQUENCE_REF	FER1_MESCR	5
BAR_GRAPH	Bar Graph 1	&lt;html&gt;an &lt;em&gt;html tooltip&lt;/em&gt; for Bar graph 1.&lt;/html&gt;	||-100,-|-200,-|-300,-|-400,-|200,+|300,+|150,+
LINE_GRAPH	Green Values	1.1|2.2|1.3|3.4|0.7|1.4|3.3|2.2|2.1|-1.1|3.2
LINE_GRAPH	Red Values	2.1|3.2|1.3|-1.4|5.5|1.4|1.3|4.2|-1.1|1.1|3.2
BAR_GRAPH	Bar Graph 2	1,.|2,*|3,:|4,.|5,*|4,:|3,.|2|1|1|2|3|4|5|4
NO_GRAPH	Icons 	||||E,Sheet1|E|E||||H,Sheet 2|H|H|H||||||
NO_GRAPH	Purple Letters	m|y|p|r|o|t|e|i|n
COLOUR	Bar Graph 2	blue
COLOUR	Red Values	255,0,0
COLOUR	Green Values	green
COLOUR	Purple Letters	151,52,228
COMBINE	Green Values	Red Values
GRAPHLINE	Red Values	2.6	threshold	black
SEQUENCE_GROUP	Group_A	30	50	*
SEQUENCE_GROUP	Group_B	1	351	2-5
SEQUENCE_GROUP	Group_C	12	14	-1	seq1	seq2	seq3
PROPERTIES	Group_A	description=This is the description	colour=Helix Propensity	pidThreshold=0	outlineColour=red	displayBoxes=true	displayText=false	colourText=false	textCol1=black	textCol2=black	textColThreshold=0
PROPERTIES	Group_B	outlineColour=red
PROPERTIES	Group_C	colour=Clustal