 * Jalview - A Sequence Alignment Editor and Viewer ($$Version-Rel$$)
 * Copyright (C) $$Year-Rel$$ The Jalview Authors
 * This file is part of Jalview.
 * Jalview is free software: you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation, either version 3
 * of the License, or (at your option) any later version.
 * Jalview is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty
 * of MERCHANTABILITY or FITNESS FOR A PARTICULAR
 * PURPOSE. See the GNU General Public License for more details.
 * You should have received a copy of the GNU General Public License
 * along with Jalview. If not, see <http://www.gnu.org/licenses/>.
 * The Jalview Authors are detailed in the 'AUTHORS' file.
<title>Importing Variants from VCF</title>
<strong>Importing Genomic Variants from VCF</strong>
<p>Jalview can annotate nucleotide sequences associated with
genomic loci with features representing variants imported from VCF
files. This feature, new in Jalview 2.11, is currently tuned to work
best with tab indexed VCF files produced by the GATK Variant
Annotation Pipeline (with or without annotation provided by the
Ensembl Variant Effect Predictor), but other sources of VCF files
If your sequences have genomic loci, then a <strong>Taxon
name</strong> and <strong>chromosome location</strong> should be shown in
the Sequence Details report and the Sequence ID tooltip (provided
you have enabled it via the submenu in the <em><strong>View</strong></em>
menu). Jalview matches the assembly information provided in the VCF
file to the taxon name, using an internal lookup table. If a match
is found, Jalview uses the Ensembl API's lift-over services to
locate your sequences' loci in the reference frame of the VCF file's
assembly. If all goes well, after loading a VCF, Jalview will report
the number of variants added as sequence features via the alignment
window's status bar. These are added by default when loci are
retrieved from Ensembl.
<strong>Working with variants from organisms other than
<li>Look in your VCF file to identify keywords in the
##reference header that define what species and assembly name the
VCF was generated against.</li>
<li>Look at ensembl.org to identify the species' short name,
and the assembly's unique id.</li>
<li>Add mappings to the <strong>VCF_SPECIES</strong> and <strong>VCF_ASSEMBLY</strong>
properties in your .jalview_properties file. For example:<pre>
VCF_SPECIES=1000genomes=homo_sapiens,c_elegans=celegans
VCF_ASSEMBLY=assembly19=GRCh37,hs37=GRCh37</pre><br /> <br />These allow
annotations to be mapped from both Human 1000genomes VCF files and
<strong>Work in Progress!</strong>
<p>VCF support in Jalview is under active development. Please get
in touch via our mailing list if you have any questions or problems,
or if you find it useful!</p>
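<p>To picture how the <strong>VCF_SPECIES</strong> and <strong>VCF_ASSEMBLY</strong>
mappings described above relate keywords in a VCF ##reference header to
Ensembl names, the following Python sketch may help. It is not Jalview's
implementation: the helper names <code>parse_mappings</code> and
<code>lookup</code>, the example header line, and the case-insensitive
substring matching are all assumptions made for illustration only.</p>

```python
def parse_mappings(value):
    """Parse a 'keyword=name,keyword=name' property value into a dict.

    Hypothetical helper; the format is taken from the example
    property values shown in this page.
    """
    mappings = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue
        keyword, sep, name = pair.partition("=")
        if not sep:
            raise ValueError(f"malformed mapping: {pair!r}")
        mappings[keyword.strip()] = name.strip()
    return mappings


def lookup(reference_header, mappings):
    """Return the name mapped to the first keyword found in the header.

    Assumption: a keyword matches if it occurs anywhere in the header,
    ignoring case. Jalview's real matching rule may differ.
    """
    header = reference_header.lower()
    for keyword, name in mappings.items():
        if keyword.lower() in header:
            return name
    return None


# Property values copied from the example in this page.
VCF_SPECIES = parse_mappings("1000genomes=homo_sapiens,c_elegans=celegans")
VCF_ASSEMBLY = parse_mappings("assembly19=GRCh37,hs37=GRCh37")

# Made-up ##reference header line, for illustration only.
header = "##reference=ftp://example.org/1000genomes/hs37d5.fa"

species = lookup(header, VCF_SPECIES)
assembly = lookup(header, VCF_ASSEMBLY)
print(species, assembly)
```

<p>In this sketch, a header that contains none of the configured keywords
yields no match, which corresponds to the case where you would need to
add a new mapping to your .jalview_properties file.</p>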