#summary Tutorial for multiple sequence alignments and phylogenetic methods in BioRuby -- under development!

= Introduction =

Tutorial for multiple sequence alignments and phylogenetic methods in !BioRuby -- under development!


= Multiple Sequence Alignments =

== Multiple Sequence Alignment Input and Output ==

=== Reading in a Multiple Sequence Alignment from a File ===

_... to be done_

{{{
#!/usr/bin/env ruby
require 'bio'
}}}

=== Writing a Multiple Sequence Alignment to a File ===

_... to be done_

{{{
#!/usr/bin/env ruby
require 'bio'
}}}

== Calculating Multiple Sequence Alignments ==

!BioRuby can be used to execute a variety of multiple sequence alignment programs (such as [http://mafft.cbrc.jp/alignment/software/ MAFFT], [http://probcons.stanford.edu/ Probcons], [http://www.clustal.org/ ClustalW], [http://www.drive5.com/muscle/ Muscle], and [http://www.tcoffee.org/Projects_home_page/t_coffee_home_page.html T-Coffee]). In the following, examples of using MAFFT and Muscle are shown.

=== MAFFT ===

{{{
#!/usr/bin/env ruby
require 'bio'

# 'seqs' is assumed to hold the unaligned input sequences,
# loaded beforehand (not shown here).

# Calculates the alignment using the MAFFT program on the local
# machine with options '--maxiterate 1000 --localpair'
# and stores the result in 'report'.
options = ['--maxiterate', '1000', '--localpair']
mafft = Bio::MAFFT.new('path/to/mafft', options)
report = mafft.query_align(seqs)

# Accesses the actual alignment.
align = report.alignment

# Prints each sequence to the console.
align.each { |s| puts s.to_s }
}}}

References:

 * Katoh, Toh (2008) "Recent developments in the MAFFT multiple sequence alignment program" Briefings in Bioinformatics 9:286-298
 * Katoh, Toh (2010) "Parallelization of the MAFFT multiple sequence alignment program" Bioinformatics 26:1899-1900

=== Muscle ===

{{{
#!/usr/bin/env ruby
require 'bio'

# 'seqs' is assumed to hold the unaligned input sequences,
# loaded beforehand (not shown here).

# Calculates the alignment using the Muscle program on the local
# machine with options '-quiet -maxiters 64'
# and stores the result in 'report'.
options = ['-quiet', '-maxiters', '64']
muscle = Bio::Muscle.new('path/to/muscle', options)
report = muscle.query_align(seqs)

# Accesses the actual alignment.
align = report.alignment

# Prints each sequence to the console.
align.each { |s| puts s.to_s }
}}}

References:

 * Edgar, R.C. (2004) "MUSCLE: multiple sequence alignment with high accuracy and high throughput" Nucleic Acids Res 32(5):1792-1797

=== Other Programs ===

[http://probcons.stanford.edu/ Probcons], [http://www.clustal.org/ ClustalW], and [http://www.tcoffee.org/Projects_home_page/t_coffee_home_page.html T-Coffee] can be used in the same manner as the programs above.

== Manipulating Multiple Sequence Alignments ==

It is probably a good idea to 'clean up' multiple sequence alignments that are to be used for phylogenetic inference. For instance, columns with more than 50% gaps can be deleted, like so:

_... to be done_

{{{
#!/usr/bin/env ruby
require 'bio'
}}}

----

= Phylogenetic Trees =

== Phylogenetic Tree Input and Output ==

=== Reading in Phylogenetic Trees ===

_... to be done_

{{{
#!/usr/bin/env ruby
require 'bio'
}}}

Also, see: https://www.nescent.org/wg_phyloinformatics/BioRuby_PhyloXML_HowTo_documentation

=== Writing Phylogenetic Trees ===

_... to be done_

{{{
#!/usr/bin/env ruby
require 'bio'
}}}

Also, see: https://www.nescent.org/wg_phyloinformatics/BioRuby_PhyloXML_HowTo_documentation

== Phylogenetic Inference ==

_Currently !BioRuby does not contain wrappers for phylogenetic inference programs, so I am in the process of writing a RAxML wrapper, to be followed by a wrapper for FastME..._

_What about pairwise distance calculation?_

== Maximum Likelihood ==

=== RAxML ===

_... to be done_

{{{
#!/usr/bin/env ruby
require 'bio'
}}}

=== PhyML ===

_... to be done_

{{{
#!/usr/bin/env ruby
require 'bio'
}}}

== Pairwise Distance Based Methods ==

=== FastME ===

_... to be done_

{{{
#!/usr/bin/env ruby
require 'bio'
}}}

=== PHYLIP? ===

== Support Calculation? ==

=== Bootstrap Resampling? ===

----

= Analyzing Phylogenetic Trees =

== PAML ==

== Gene Duplication Inference ==

_need to further test and then import GSoC 'SDI' work..._

== Others? ==
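
----

= Appendix: Sketch of Gap Column Removal =

The clean-up step mentioned under 'Manipulating Multiple Sequence Alignments' (deleting columns with more than 50% gaps) could look roughly like the sketch below. It is not the planned !BioRuby implementation and does not use any !BioRuby API: it works on plain Ruby strings, and the helper name `remove_gappy_columns`, its parameters, and the toy alignment are all made up for illustration. It assumes that '-' is the gap character and that all rows have the same length.

```ruby
#!/usr/bin/env ruby

# Hypothetical helper (not part of BioRuby): removes every alignment column
# in which the fraction of gap characters exceeds max_gap_ratio.
# 'rows' is an array of equal-length aligned sequences given as Strings.
def remove_gappy_columns(rows, max_gap_ratio = 0.5, gap_char = '-')
  return [] if rows.empty?

  length = rows.first.length
  unless rows.all? { |row| row.length == length }
    raise ArgumentError, 'all aligned sequences must have the same length'
  end

  # Collects the indices of the columns to keep.
  keep = (0...length).select do |col|
    gaps = rows.count { |row| row[col] == gap_char }
    gaps.to_f / rows.size <= max_gap_ratio
  end

  # Rebuilds each row from the kept columns only.
  rows.map { |row| keep.map { |col| row[col] }.join }
end

# Toy alignment (made-up data for illustration only).
alignment = [
  'AC-GT-A',
  'A--GTCA',
  'ACTG--A',
  'A--GT-A'
]

cleaned = remove_gappy_columns(alignment)

# Prints each cleaned row to the console.
cleaned.each { |row| puts row }
```

In this toy input, the third and sixth columns consist of 75% gaps and are dropped, while the second column (exactly 50% gaps) is kept because it does not exceed the threshold. For an alignment object like the ones returned in the MAFFT and Muscle examples, the rows could presumably be obtained by converting each sequence with `to_s` first; that conversion step is an assumption and is not shown here.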