X-Git-Url: http://source.jalview.org/gitweb/?a=blobdiff_plain;f=wiki%2FRIO.wiki;h=1aa1e544ceda59630355e3216cf0a85f102aaac0;hb=d2823363d9efbad977b5029c09c27ee09a4cbece;hp=180a553227c443cd9afb98f8a4a7fdb5fcfa3cca;hpb=2193ea2c5f1e9b6bc98fd07c96007203c4be4a91;p=jalview.git

diff --git a/wiki/RIO.wiki b/wiki/RIO.wiki
index 180a553..1aa1e54 100644
--- a/wiki/RIO.wiki
+++ b/wiki/RIO.wiki
@@ -1,77 +1,3 @@
 #summary resampled inference of orthologs
 
-= RIO: Resampled Inference of Orthologs =
-
-== Purpose ==
-
-RIO (Resampled Inference of Orthologs) is a method for automated phylogenomics based on explicit phylogenetic inference. RIO analyses are performed over resampled phylogenetic trees to estimate the reliability of orthology assignments.
-
-== Usage ==
-{{{
-java -Xmx1024m -cp
-path/to/forester.jar org.forester.application.rio [options] [outfile]
-}}}
-=== Options ===
-
- * -co: cutoff for ortholog output (default: 50)
-
- * -t : file-name for output table
-
- * -q : name for query (sequence/node)
-
- * -s : sort (default: 2)
-
- * -u : to output ultra-paralogs (species specific expansions/paralogs)
-
- * -cu: cutoff for ultra-paralog output (default: 50)
-
-==== Sort ====
-
- * 0: orthologies
- * 1: orthologies > super orthologies
- * 2: super orthologies > orthologies
-
-==== Gene trees ====
-The gene trees ideally are in phyloXML, but can also be in New Hamphshire (Newick) or Nexus format as long as species information can be extracted from the gene names
- (e.g. "HUMAN" from "BCL2_HUMAN").
-
-==== Species tree ====
-Must be in phyloXML format ([http://forester.googlecode.com/files/species.xml example]).
-
-=== Output ===
-
-Besides the main output of a gene tree with duplications and speciations assigned to all of its internal nodes, this program also produces the following:
- * a log file, ending in `"_gsdi_log.txt"` ([http://forester.googlecode.com/files/wnt_gsdi_log.txt example])
- * a species tree file which only contains external nodes with were needed for the reconciliation, ending in `"_species_tree_used.xml"`
- * if the gene tree contains species with scientific species names such as "Pyrococcus horikoshii strain ATCC 700860" and if a mapping cannot be establish based on these, GSDI will attempt to map by removing the "strain" (or "subspecies") information, these will be listed in a file ending in `"_gsdi_remapped.txt"`.
-
-=== Taxonomic mapping between gene and species tree ===
-
-GSDI can establish a taxonomic mapping between gene and species tree on the following three data fields:
- * scientific names (e.g. "Pyrococcus horikoshii")
- * taxonomic identifiers (e.g. "35932" from uniprot or ncbi)
- * taxonomy codes (e.g. "PYRHO")
-
-
-
-=== Example ===
-`gsdi -g -q gene_tree.xml tree_of_life.nwk out.xml`
-
-
-=== Example files ===
- * [http://forester.googlecode.com/files/wnt_gene_tree.xml gene tree]
- * [http://forester.googlecode.com/files/species.xml species tree]
- * [http://forester.googlecode.com/files/wnt_gsdi_log.txt log file (output)]
-
-
-== References ==
-
-Zmasek CM and Eddy SR "RIO: Analyzing proteomes by automated phylogenomics using resampled inference of orthologs" [http://www.biomedcentral.com/1471-2105/3/14/ BMC Bioinformatics 2002, 3:14]
-
-Zmasek CM and Eddy SR "A simple algorithm to infer gene duplication and speciation events on a gene tree" [http://bioinformatics.oxfordjournals.org/content/17/9/821.abstract Bioinformatics, 17, 821-828]
-
-
-
-== Download ==
-
-Download forester.jar here: http://code.google.com/p/forester/downloads/list
\ No newline at end of file
+https://sites.google.com/site/cmzmasek/home/software/forester/rio
\ No newline at end of file