Database Reference Fetching

Discovering Database References for Sequences
Database references associated with a sequence are displayed as a list in the tooltip shown when mousing over its sequence ID. Jalview uses these references to retrieve PDB structures and DAS features, and to retrieve sequence cross-references such as the protein products of a DNA sequence.

Initiating reference retrieval
The application provides three ways to access the retrieval function:

Select the Discover PDB IDs option from the structure submenu of the sequence's popup menu.
Choose one of the options from the 'Fetch DB Refs' submenu in the alignment window's Web Services menu:
- Standard Databases will fetch references from the EBI databases plus the currently selected DAS sources.
- The other entries are submenus leading to lists of individual database sources that Jalview can access.
Answer 'Yes' when asked if you wish to retrieve database references for your sequences after initiating a DAS Sequence Feature fetch.

Jalview discovers references for a sequence by generating a set of ID queries from the ID string of each sequence in the alignment. It then queries a subset of the databases it can access and tries to match the alignment sequence to any records retrieved. If a match is found, the sequence is annotated with that database's reference and with any cross-references that its records contain.

The Sequence Identification Process
The method of accession ID discovery is derived from the method that earlier Jalview versions used for Uniprot sequence feature retrieval, and was originally restricted to the identification of valid Uniprot accessions.
Essentially, Jalview will try to retrieve records from a subset of the databases accessible by the sequence fetcher using each sequence's ID string (or each string in the ID separated by the '|' symbol).
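
The query generation described above can be sketched as follows. This is a minimal illustration, not Jalview's actual code: the function name id_queries is hypothetical, and the choices to include the whole ID string as the first query and to drop empty or repeated tokens are assumptions made for the example.

```python
def id_queries(seq_id: str) -> list[str]:
    """Return candidate accession queries for one sequence ID string.

    Hypothetical helper: the whole ID is tried first, followed by each
    '|'-separated token (an assumption about ordering, for illustration).
    """
    candidates = [seq_id] + seq_id.split("|")
    queries = []
    for candidate in candidates:
        token = candidate.strip()
        # Skip empty tokens and duplicates, keeping first-seen order.
        if token and token not in queries:
            queries.append(token)
    return queries


print(id_queries("sp|P12345|AATM_RABIT"))
```

Each returned string would then be submitted as a separate query to the selected databases.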

If a record (or set of records) is retrieved by any query derived from the ID string of a sequence, then the sequence is aligned to the retrieved ones to determine the correct start and end residue positions (which are displayed when the 'Show Full Sequence ID' option is selected). This is important for the correct display of the location of any features associated with that database.

If the alignment reveals differences between the sequence in the alignment and the one in the record, then Jalview will assume that the aligned sequence is not the one in the retrieved record.
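
The accept-or-reject decision and the start/end numbering can be illustrated with a simplified sketch. Jalview aligns the sequence to the retrieved record; the example below substitutes an exact substring search for that alignment, which is an assumption made to keep the code short. The names locate_in_record and GAP_CHARS are hypothetical.

```python
GAP_CHARS = "-. "  # assumed gap symbols; adjust to the alignment's conventions


def locate_in_record(aligned_seq: str, record_seq: str):
    """Return the 1-based (start, end) of the sequence in the record, or None.

    Simplification: an exact, case-insensitive substring search stands in
    for the pairwise alignment. Any difference means the record is rejected.
    """
    residues = "".join(c for c in aligned_seq if c not in GAP_CHARS).upper()
    if not residues:
        return None
    offset = record_seq.upper().find(residues)
    if offset < 0:
        # The sequences differ, so the record is assumed to be a different sequence.
        return None
    return offset + 1, offset + len(residues)


print(locate_in_record("MK-TAY", "GSMKTAYIAK"))  # matches residues 3 to 7
print(locate_in_record("MKTAW", "GSMKTAYIAK"))   # no match
```

A returned pair corresponds to the start and end residue positions that would be assigned to the sequence; None corresponds to the case where the record is not used to annotate it.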

In some cases, the ID used to retrieve records may be out of date, and a dialog box will open indicating that a 100% match between the sequence and the record was identified but that the sequence name is different. In this case, the sequence name can be changed manually (by right-clicking on the sequence ID and selecting Sequence→Edit Name).

Note
Please remember to save your alignment if either the start/end numbering or the sequence IDs were updated during the ID retrieval process.