Sequence Features File
The Sequence features file (which used to be known as the "Groups file" prior to version 2.08) is a simple way of getting your own sequence annotations into Jalview. It was introduced to allow sequence features to be rendered in the Jalview applet, and so is intentionally lightweight and minimal because the applet is often used in situations where data file size must be kept to a minimum, and no XML parser is available.
Features files are imported into Jalview in the following ways:
-features <Features filename>
Sequence Features File Format
A features file is a simple ASCII text file, where each line contains tab-separated text fields. No comments are allowed.
The first set of lines contains type definitions:
Feature label	Feature Colour
A feature type has a text label, and a colour specification. This can be either:
[label|]<mincolor>|<maxcolor>|[absolute|]<minvalue>|<maxvalue>[|<thresholdtype>|[<threshold value>]]
The fields are as follows:
The remaining lines in the file are the sequence annotation definitions, where the features defined above are attached to regions on particular sequences. Each feature can optionally include some descriptive text, which is displayed in a tooltip when the mouse is near the feature on that sequence (and can also be used to generate a colour for the feature).
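Before any feature lines can be interpreted, the colour specifications in the type definitions have to be read. The following is a minimal sketch in Python, not Jalview's own parser. It assumes that a simple colour is either a six-digit hexadecimal triplet or three comma-separated numbers from 0 to 255 (the two forms used in the example file), and it splits a graduated specification purely according to the "|" separated pattern given above. The function names and the sample value "above" for the threshold type are hypothetical, and the sketch does not interpret what each field means.

```python
import re


def parse_simple_colour(spec):
    """Return (r, g, b) for a hex or comma-separated triplet, else None."""
    spec = spec.strip()
    if re.fullmatch(r"[0-9a-fA-F]{6}", spec):
        return tuple(int(spec[i:i + 2], 16) for i in (0, 2, 4))
    parts = [p.strip() for p in spec.split(",")]
    if len(parts) == 3 and all(p.isdigit() for p in parts):
        rgb = tuple(int(p) for p in parts)
        if all(0 <= v <= 255 for v in rgb):
            return rgb
    # Anything else (for example a colour name) is left to the caller.
    return None


def parse_graduated_colour(spec):
    """Split a '|' separated graduated colour spec into named fields."""
    fields = spec.split("|")
    by_label = fields[0] == "label"          # optional leading "label"
    if by_label:
        fields = fields[1:]
    if len(fields) < 4:
        raise ValueError("too few fields in graduated colour: %r" % spec)
    min_colour, max_colour = fields[0], fields[1]
    rest = fields[2:]
    absolute = rest[0] == "absolute"         # optional "absolute" keyword
    if absolute:
        rest = rest[1:]
    if len(rest) < 2:
        raise ValueError("missing min/max value in: %r" % spec)
    return {
        "by_label": by_label,
        "min_colour": min_colour,
        "max_colour": max_colour,
        "absolute": absolute,
        "min_value": float(rest[0]),
        "max_value": float(rest[1]),
        "threshold_type": rest[2] if len(rest) > 2 else None,
        "threshold_value": float(rest[3]) if len(rest) > 3 and rest[3] else None,
    }
```

For instance, parse_simple_colour("0,105,215") gives the triplet (0, 105, 215), while a value such as "red" is returned as None so that the caller can decide how to handle colour names.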
If your sequence annotation is already available in GFF2 (http://gmod.org/wiki/GFF2) or GFF3 (http://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md) format, then you can use it unchanged, provided you first add a line containing only 'GFF' after any Jalview feature colour definitions (this mixed format capability was added in Jalview 2.6). Alternatively, you can use Jalview's own sequence feature annotation format, which additionally allows HTML and URLs to be directly attached to each piece of annotation.
Jalview's sequence feature annotation format
Each feature is specified as a tab-separated series of columns as defined below:
description	sequenceId	sequenceIndex	start	end	featureType	score (optional)
This format allows two alternate ways of referring to a sequence, either by its text ID, or its index (base 0) in an associated alignment. Normally, sequence features are associated with sequences rather than alignments, and the sequenceIndex field is given as "-1". In order to specify a sequence by its index in a particular alignment, the sequenceId should be given as "ID_NOT_SPECIFIED", otherwise the sequenceId field will be used in preference to the sequenceIndex field.
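The column layout and the rule for choosing between sequenceId and sequenceIndex can be sketched as follows. This is an illustrative Python reading of the description above, not Jalview's implementation: the names Feature, parse_feature_line and resolve_sequence_id are hypothetical, and the alignment is assumed to be available as a plain list of sequence IDs in alignment order.

```python
from typing import List, NamedTuple, Optional


class Feature(NamedTuple):
    description: str
    sequence_id: str
    sequence_index: int
    start: int
    end: int
    feature_type: str
    score_text: Optional[str]   # raw text of the optional score column


def parse_feature_line(line: str) -> Feature:
    """Split one tab-separated feature line into its columns."""
    cols = line.rstrip("\r\n").split("\t")
    if len(cols) < 6:
        raise ValueError("expected at least 6 tab-separated columns")
    score_text = cols[6] if len(cols) > 6 else None
    return Feature(cols[0], cols[1], int(cols[2]),
                   int(cols[3]), int(cols[4]), cols[5], score_text)


def resolve_sequence_id(feature: Feature,
                        alignment_ids: List[str]) -> Optional[str]:
    """Prefer the text ID; fall back to the base-0 index only when the
    ID column holds the placeholder "ID_NOT_SPECIFIED"."""
    if feature.sequence_id != "ID_NOT_SPECIFIED":
        return feature.sequence_id
    if 0 <= feature.sequence_index < len(alignment_ids):
        return alignment_ids[feature.sequence_index]
    return None   # neither an ID nor a usable index was given
```

A line such as "desc", "ID_NOT_SPECIFIED", "1", ... would therefore resolve to the second sequence of the alignment, whereas any other sequenceId is used directly and the index is ignored.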
The description may contain simple HTML document body tags if
enclosed by "<html></html>" and these will be
rendered as formatted tooltips in the Jalview Application (the
Jalview applet is not capable of rendering HTML tooltips, so all
formatting tags will be removed).
Attaching Links to Sequence Features
Any anchor tags in an HTML-formatted
description line will be translated into URL links. A link symbol
will be displayed adjacent to any feature which includes links, and
these are made available from the links submenu
of the popup menu which is obtained by right-clicking when a link
symbol is displayed in the tooltip.
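To illustrate how anchor tags in a description can be turned into a list of links, here is a small sketch using Python's standard html.parser module. It is not how Jalview itself extracts links; the class and function names are hypothetical, and the example URL is a placeholder. Descriptions that are not enclosed by "<html></html>" are treated as plain text and yield no links.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, link text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None    # href of the anchor currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(description):
    """Return the links found in an HTML-formatted description."""
    text = description.strip()
    lowered = text.lower()
    if not (lowered.startswith("<html>") and lowered.endswith("</html>")):
        return []            # plain-text description: nothing to extract
    parser = LinkExtractor()
    parser.feed(text)
    parser.close()
    return parser.links
```

A description such as <html>See <a href="http://www.example.org/entry">entry</a></html> would produce one pair holding the URL and the text "entry".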
Non-positional features
Specify the start and end for
a feature to be 0 in order to attach it to the
whole sequence. Non-positional features are shown in a tooltip when
the mouse hovers over the sequence ID panel, and any embedded links
can be accessed from the popup menu.
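The start and end convention can be checked with a couple of lines of code. The sketch below assumes that features are held as simple (description, start, end) tuples, which is a representation chosen here for illustration only.

```python
def is_non_positional(start, end):
    """A feature with start and end both 0 applies to the whole sequence."""
    return start == 0 and end == 0


def split_by_position(features):
    """Separate (description, start, end) tuples into positional and
    non-positional (whole-sequence) features."""
    positional, whole_sequence = [], []
    for feature in features:
        _, start, end = feature
        if is_non_positional(start, end):
            whole_sequence.append(feature)
        else:
            positional.append(feature)
    return positional, whole_sequence
```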
Scores
Scores can be associated with sequence features, and used to sort
sequences or shade the alignment (this was added in Jalview 2.5).
The score field is optional, and malformed scores will be ignored.
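A lenient reading of the score column might look like the sketch below, where anything that cannot be read as a number is simply treated as "no score". The second function shows one possible way of ordering sequences by score (highest single score first). That ordering is an assumption made for illustration and is not a description of how Jalview sorts.

```python
def parse_score(text):
    """Return the score as a float, or None if it is absent or malformed."""
    if text is None:
        return None
    try:
        return float(text.strip())
    except ValueError:
        return None          # malformed scores are ignored


def sort_ids_by_best_score(scored):
    """Order sequence IDs by their highest score, largest first.

    scored: iterable of (sequence_id, score_or_None) pairs.
    Sequences without any usable score are left out.
    """
    best = {}
    for sequence_id, score in scored:
        if score is None:
            continue
        best[sequence_id] = max(score, best.get(sequence_id, score))
    return sorted(best, key=best.get, reverse=True)
```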
Feature annotations can be collected into named groups by prefixing definitions with lines of the form:
startgroup	groupname
.. and subsequently post-fixing the group with:
endgroup	groupname
Feature grouping was introduced in version 2.08, and is used to control whether a set of features is hidden or shown together in the Sequence Feature Settings dialog box.
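Group membership can be tracked with a single variable while reading the file. The following sketch tags each non-group line with the name of the group that encloses it, or None if it is outside any group. The function name is hypothetical, and matching the startgroup and endgroup keywords without regard to case is an assumption of this sketch.

```python
def assign_groups(lines):
    """Return a list of (group_name_or_None, line) for every line that is
    not itself a startgroup/endgroup marker."""
    current_group = None
    tagged = []
    for line in lines:
        cols = line.rstrip("\r\n").split("\t")
        keyword = cols[0].strip().lower()
        if keyword == "startgroup" and len(cols) >= 2:
            current_group = cols[1].strip()
            continue
        if keyword == "endgroup":
            current_group = None
            continue
        tagged.append((current_group, line))
    return tagged
```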
A complete example is shown below:
domain	red
metal ion-binding site	00ff00
transit peptide	0,105,215
chain	225,105,0
modified residue	105,225,35
signal peptide	0,155,165
helix	ff0000
strand	00ff00
coil	cccccc
Your Own description here	FER_CAPAA	-1	3	93	domain
Your Own description here	FER_CAPAN	-1	48	144	chain
Your Own description here	FER_CAPAN	-1	50	140	domain
Your Own description here	FER_CAPAN	-1	136	136	modified residue
Your Own description here	FER1_LYCES	-1	1	47	transit peptide
Your Own description here	Q93XJ9_SOLTU	-1	1	48	signal peptide
Your Own description here	Q93XJ9_SOLTU	-1	49	144	chain
startgroup	secondarystructure
PDB secondary structure annotation	FER1_SPIOL	-1	52	59	strand
PDB secondary structure annotation	FER1_SPIOL	-1	74	80	helix
endgroup	secondarystructure
GFF
FER_CAPAA	GffGroup	domain	3	93	.	.
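To tie the pieces together, the sketch below reads a file laid out like the example: type definitions first, then feature lines (optionally inside groups), then a line containing only GFF followed by GFF rows. It is a rough illustration rather than Jalview's parser. It assumes that fields are separated by tab characters, that a two-column line is a type definition unless it starts with startgroup or endgroup, and that GFF rows can be kept as uninterpreted column lists. SAMPLE is a shortened copy of the example, and all names are hypothetical.

```python
SAMPLE = "\n".join([
    "domain\tred",
    "helix\tff0000",
    "Your Own description here\tFER_CAPAA\t-1\t3\t93\tdomain",
    "startgroup\tsecondarystructure",
    "PDB secondary structure annotation\tFER1_SPIOL\t-1\t74\t80\thelix",
    "endgroup\tsecondarystructure",
    "GFF",
    "FER_CAPAA\tGffGroup\tdomain\t3\t93\t.\t.",
])


def read_features_file(text):
    """Return (colours, features, gff_rows) from features file text."""
    colours = {}     # feature type label -> colour specification string
    features = []    # one dict per Jalview-format feature line
    gff_rows = []    # raw column lists for lines after the GFF marker
    group = None
    in_gff = False
    for raw in text.splitlines():
        if not raw.strip():
            continue
        cols = raw.split("\t")
        if in_gff:
            gff_rows.append(cols)
            continue
        if raw.strip() == "GFF":
            in_gff = True        # everything after this line is GFF
            continue
        keyword = cols[0].strip().lower()
        if keyword == "startgroup" and len(cols) == 2:
            group = cols[1].strip()
        elif keyword == "endgroup":
            group = None
        elif len(cols) == 2:
            colours[cols[0]] = cols[1]
        elif len(cols) >= 6:
            features.append({
                "description": cols[0],
                "sequence_id": cols[1],
                "sequence_index": int(cols[2]),
                "start": int(cols[3]),
                "end": int(cols[4]),
                "type": cols[5],
                "score_text": cols[6] if len(cols) > 6 else None,
                "group": group,
            })
        else:
            raise ValueError("unrecognised line: %r" % raw)
    return colours, features, gff_rows


colours, features, gff_rows = read_features_file(SAMPLE)
```

Run on SAMPLE, this yields two colour definitions, two features (the second tagged with the group secondarystructure) and one GFF row of seven columns.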