X-Git-Url: http://source.jalview.org/gitweb/?a=blobdiff_plain;f=wiki%2FGSDI.wiki;h=c5eb5e63754e18e2eee7ad8159714ac9ae2b83c1;hb=d89babefe5728a059761b9053f90d0adadf66dd7;hp=bcb6849d661c8257e963ec7e54f1e9085f27f0a2;hpb=f31f530a515e0856df35eccbf1f18cddb8697952;p=jalview.git
diff --git a/wiki/GSDI.wiki b/wiki/GSDI.wiki
index bcb6849..c5eb5e6 100644
--- a/wiki/GSDI.wiki
+++ b/wiki/GSDI.wiki
@@ -7,24 +7,40 @@ To infer duplication events on a gene tree given a trusted species tree.
 == Usage ==
-{{{
+
 java -Xmx1024m -cp
-path/to/forester.jar org.forester.application.gsdi [-options] [outfile]
-}}}
+path/to/forester.jar org.forester.application.gsdi [-options]
 === Options ===
- * -s: to strip the species tree prior to duplication inference
- * -b: to use SDI algorithm instead of GSDI algorithm
- * -m: use most parimonious duplication model for GSDI: assign nodes as speciations which would otherwise be assiged as unknown because of polytomies in the species tree
- * -q: to allow species tree in other formats than phyloXML (Newick, NHX, Nexus)
+ * -g: to allow stripping of gene tree nodes without a matching species in the species tree
+
+ * -m: to use the most parsimonious duplication model for GSDI: assign nodes as speciations which would otherwise be assigned as potential duplications due to polytomies in the species tree
+ * -q: to allow species trees in formats other than phyloXML (i.e. Newick, NHX, Nexus)
+ * -b: to use the SDIse algorithm instead of the GSDI algorithm (for binary species trees)
+
+==== Gene tree ====
+Must be in phyloXML format, with taxonomy and sequence data in appropriate fields.
 ==== Species tree ====
-In phyloXML format (unless option -q is used), with taxonomy data in appropriate fields. Must be rooted, polytomies are allowed.
+Must be in phyloXML format unless option -q is used.
-==== Gene tree ====
-In phyloXM format, with taxonomy and sequence data in appropriate fields. Must be rooted an binary (no polytomies).
+=== Output ===
+
+Besides the main output of a gene tree with duplications and speciations assigned to all of its internal nodes, this program also produces the following:
+ * a log file, ending in `"_gsdi_log.txt"`
+ * a species tree file which contains only the external nodes which were needed for the reconciliation, ending in `"_species_tree_used.xml"`
+ * if the gene tree contains species with scientific species names such as "Pyrococcus horikoshii strain ATCC 700860" and a mapping cannot be established based on these, GSDI will attempt to map by removing the "strain" (or "subspecies") information; these are listed in a file ending in `"_gsdi_remapped.txt"`.
+
+=== Example ===
+`gsdi -g gene_tree.xml tree_of_life.xml out.xml`
+
+
+
+== Reference ==
+Zmasek CM and Eddy SR "A simple algorithm to infer gene duplication and speciation events on a gene tree" [http://bioinformatics.oxfordjournals.org/content/17/9/821.abstract Bioinformatics, 17, 821-828]
+
 == Download ==