X-Git-Url: http://source.jalview.org/gitweb/?a=blobdiff_plain;f=wiki%2FPhyloBioRuby.wiki;h=4b6b45ffa63c3c20de0a76453f618aa7b8714b94;hb=51067a27d785cb31fbd3d8fd81b09b0d140efdff;hp=ab6faf9d1c6fc050e9b264ed01fd4959810a9161;hpb=0a133c612b5c1acc1436a94b0d65af27dd4bba42;p=jalview.git

diff --git a/wiki/PhyloBioRuby.wiki b/wiki/PhyloBioRuby.wiki
index ab6faf9..4b6b45f 100644
--- a/wiki/PhyloBioRuby.wiki
+++ b/wiki/PhyloBioRuby.wiki
@@ -10,7 +10,7 @@ Tutorial for multiple sequence alignments and phylogenetic methods in [http://bi
 
 Eventually, this is expected to be placed on the official !BioRuby page.
 
-Author: [http://www.cmzmasek.net/ Christian M Zmasek], Sanford-Burnham Medical Research Institute
+Author: [https://sites.google.com/site/cmzmasek/ Christian Zmasek], Sanford-Burnham Medical Research Institute
 
 Copyright (C) 2011 Christian M Zmasek. All rights reserved.
 
@@ -23,6 +23,41 @@ Copyright (C) 2011 Christian M Zmasek. All rights reserved.
 
 
 === Reading in a Multiple Sequence Alignment from a File ===
+This automatically determines the format
+{{{
+#!/usr/bin/env ruby
+require 'bio'
+
+seq_ary = Array.new
+ff = Bio::FlatFile.auto('bcl2.fasta')
+ff.each_entry do |entry|
+  seq_ary.push(entry)
+  puts entry.entry_id # prints the identifier of the entry
+  puts entry.definition # prints the definition of the entry
+  puts entry.seq # prints the sequence data of the entry
+end
+
+# Creates a multiple sequence alignment (possibly unaligned) named
+# 'seqs' from array 'seq_ary'.
+seqs = Bio::Alignment.new(seq_ary)
+seqs.each { |seq| puts seq.to_s }
+
+# Writes multiple sequence alignment (possibly unaligned) 'seqs'
+# to a file in PHYLIP format.
+File.open('out0.phylip', 'w') do |f|
+  f.write(seqs.output(:phylip))
+end
+
+# Writes multiple sequence alignment (possibly unaligned) 'seqs'
+# to a file in FASTA format.
+File.open('out0.fasta', 'w') do |f|
+  f.write(seqs.output(:fasta))
+end
+}}}
+
+
+==== ClustalW Format ====
+
 
 The following example shows how to read in a *ClustalW*-formatted multiple sequence alignment.
 {{{
@@ -43,28 +78,32 @@ msa.each do |entry|
 end
 }}}
 
-Blah
+==== FASTA Format ====
+
+The following example shows how to read in a *FASTA*-formatted multiple sequence file. (_This seems a little clumsy, I wonder if there is a more direct way, avoiding the creation of an array.)
 {{{
-# Reads in a Fasta-formatted multiple sequence alignment (which does
+#!/usr/bin/env ruby
+require 'bio'
+
+# Reads in a FASTA-formatted multiple sequence alignment (which does
 # not have to be aligned, though) and stores its sequences in
 # array 'seq_ary'.
 seq_ary = Array.new
-fasta_seqs = Bio::Alignment::MultiFastaFormat.new(File.open('bcl2.fasta').read)
+fasta_seqs = Bio::Alignment::MultiFastaFormat.new(File.open('infile.fasta').read)
 fasta_seqs.entries.each do |seq|
-  seq_ary.push( seq )
+  seq_ary.push(seq)
 end
 
 # Creates a multiple sequence alignment (possibly unaligned) named
 # 'seqs' from array 'seq_ary'.
-seqs = Bio::Alignment.new( seq_ary )
-seqs.each { |seq| puts seq.to_s }
-
+seqs = Bio::Alignment.new(seq_ary)
 
-puts seqs.consensus
+# Prints each sequence to the console.
+seqs.each { |seq| puts seq.to_s }
 
 # Writes multiple sequence alignment (possibly unaligned) 'seqs'
-# to a file in phylip format.
-File.open('out1.phylip', 'w') do |f|
+# to a file in PHYLIP format.
+File.open('outfile.phylip', 'w') do |f|
   f.write(seqs.output(:phylip))
 end
 }}}
@@ -75,6 +114,13 @@ Relevant API documentation:
 * [http://bioruby.open-bio.org/rdoc/classes/Bio/Alignment.html Bio::Alignment]
 * [http://bioruby.open-bio.org/rdoc/classes/Bio/Sequence.html Bio::Sequence]
 
+=== Creating a Multiple Sequence Alignment ===
+
+
+=== Creating a Multiple Sequence Alignment from a Database ===
+
+?
+
 
 === Writing a Multiple Sequence Alignment to a File ===
 