compbio.data.sequence
Class FastaReader

java.lang.Object
  extended by compbio.data.sequence.FastaReader
All Implemented Interfaces:
Iterator<FastaSequence>

public class FastaReader
extends Object
implements Iterator<FastaSequence>

Reads files containing FASTA-formatted sequences. All information in the FASTA header is preserved, including trailing whitespace. All whitespace is removed from the sequence. Examples of valid input:

 
 >zedpshvyzg
 GCQDKNNIAELNEIMGTTRSPSDWQHMKGASPRAEIGLTGKKDSWWRHCCSKEFNKTPPPIHPDMKRWGWMWNRENFEKFLIDNFLNPPCPRLMLTKGTWWRHEDLCHEIFWSTLRWLCLGNQSFSAMIWGHLCECHRMIWWESNEHMFWLKFRRALKKMNSNGPCMGPDNREWMITNRMGKEFCGPAFAGDCQSCWRKCHKTNKICFNEKKGTPTKIDHEQKDIMDILKDIDNHRNWKQCQLWLLTSKSTDQESTTMLTWSTWRDFFIIIKQPFDHKCRGALDANGDFQIAAELKWPAPMIILRQNQKTMHDKSCHHFFTNRCPLMHTTRANDKQCSWHTRKQFICQQDFTTWQHRPDTHRILPSWCMSTRRKNHIKNTPALAFSTCEMGDLPNGWAPGTIILQRQFTQAIKLPQETTGWPRCDPKFDHWNMSKWLRQLLGRDDEMIPPQCD
 
 >xovkactesa
 CPLSKWWNRRAFLSHTANHWMILMTWEGPHDGESKMRIAMMKWSPCKPTMSHFRCGLDAWAEPIRQIACESTFRM
 FCTTPRPIHKLTEMWGHMNGWTGAFCRQLECEWMMPPRHPHPCTSTFNNNKKRLIGQIPNEGKQLFINFQKPQHG
 FSESDIWIWKDNPTAWHEGLTIAGIGDGQHCWNWMPMPWSGAPTSNALIEFWTWLGMIGTRCKTQGMWWDAMNHH
 DQFELSANAHIAAHHMEKKMILKPDDRNLGDDTWMPPGKIWMRMFAKNTNACWPEGCRDDNEEDDCGTHNLHRMC
 
 >ntazzewyvv
 CGCKIF D D NMKDNNRHG TDIKKHGFMH IRHPE KRDDC FDNHCIMPKHRRWGLWD
 EASINM AQQWRSLPPSRIMKLNG       HGCDCMHSHMEAD   DTKQSGIKGTFWNG  HDAQWLCRWG      
 EFITEA WWGRWGAITFFHAH  ENKNEIQECSDQNLKE        SRTTCEIID   TCHLFTRHLDGW 
   RCEKCQANATHMTW ACTKSCAEQW  FCAKELMMN    
   W        KQMGWRCKIFRKLFRDNCWID  FELPWWPICFCCKGLSTKSHSAHDGDQCRRW    WPDCARDWLGPGIRGEF   
   FCTHICQQLQRNFWCGCFRWNIEKRMFEIFDDNMAAHWKKCMHFKFLIRIHRHGPITMKMTWCRSGCCFGKTRRLPDSSFISAFLDPKHHRDGSGMMMWSSEMRSCAIPDPQQAWNQGKWIGQIKDWNICFAWPIRENQQCWATPHEMPSGFHFILEKWDALAHPHMHIRQKKCWAWAFLSLMSSTHSDMATFQWAIPGHNIWSNWDNIICGWPRI
 
    > 12 d t y wi               k       jbke    
   KLSHHDCD
    N
     H
     HSKCTEPHCGNSHQMLHRDP
     CCDQCQSWEAENWCASMRKAILF
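
The normalisation rule described above (header text kept verbatim, all whitespace stripped from the sequence) can be illustrated with a minimal sketch. This is not the FastaReader implementation: the class FastaRuleSketch and its parse method are hypothetical, use only the JDK, and assume that the ">" marker itself is dropped from the header and that a header line may be preceded by whitespace, as in the last example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the compbio implementation. It only illustrates
// the documented rule: header kept verbatim, whitespace removed from sequence.
final class FastaRuleSketch {

    /** Returns {header, sequence} pairs parsed from FASTA-formatted text. */
    static List<String[]> parse(String fasta) {
        List<String[]> records = new ArrayList<>();
        String header = null;
        StringBuilder sequence = new StringBuilder();
        for (String line : fasta.split("\\R")) {
            int marker = line.indexOf('>');
            // Assumption: a header is a line whose first non-blank character is '>'.
            boolean isHeader = marker >= 0 && line.substring(0, marker).trim().isEmpty();
            if (isHeader) {
                if (header != null) {
                    records.add(new String[] {header, sequence.toString()});
                }
                // Everything after '>' is kept as-is, including trailing spaces.
                header = line.substring(marker + 1);
                sequence.setLength(0);
            } else if (header != null) {
                // Spaces, tabs and line breaks inside the sequence are discarded.
                sequence.append(line.replaceAll("\\s+", ""));
            }
        }
        if (header != null) {
            records.add(new String[] {header, sequence.toString()});
        }
        return records;
    }

    public static void main(String[] args) {
        for (String[] record : parse(">demo  \nAC GT\n  T\n")) {
            System.out.println("[" + record[0] + "] " + record[1]);
        }
    }
}
```

Under these assumptions, a sequence split over several lines or padded with spaces yields the same residue string as the single-line form, while the header keeps its original spacing.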
 
 

Version:
1.0 April 2011
Author:
Peter Troshin

Constructor Summary
FastaReader(InputStream inputStream)
          This class does not close the incoming stream, so the client must close it.
FastaReader(String inputFile)
          Header data can contain non-ASCII symbols and is read as UTF-8.
 
Method Summary
 void close()
          Call this method to close the connection to the input file if you want to free up the resources.
 boolean hasNext()
          
 FastaSequence next()
          Reads the next FastaSequence from the input
 void remove()
          Not implemented
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FastaReader

public FastaReader(String inputFile)
            throws FileNotFoundException
Header data can contain non-ASCII symbols and is read as UTF-8.

Parameters:
inputFile - the file containing the list of FASTA formatted sequences to read from
Throws:
FileNotFoundException - if the input file is not found
IllegalStateException - if the close method was called on this instance

FastaReader

public FastaReader(InputStream inputStream)
            throws FileNotFoundException
This class does not close the incoming stream, so the client must close it.

Parameters:
inputStream - the stream of FASTA formatted sequences to read from
Throws:
FileNotFoundException
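
Because the stream passed to this constructor stays open, the caller remains responsible for closing it. The sketch below illustrates that ownership pattern with JDK classes only. StreamOwnershipSketch, TrackedStream and countHeaders are hypothetical names introduced for illustration, and countHeaders merely stands in for a consumer that reads a stream without closing it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical illustration of caller-owned streams; no compbio classes are used.
final class StreamOwnershipSketch {

    /** In-memory stream that records whether close() has been called. */
    static final class TrackedStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackedStream(String text) {
            super(text.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Stand-in consumer: reads the whole stream but deliberately leaves it open. */
    static int countHeaders(InputStream in) {
        Scanner scanner = new Scanner(in, "UTF-8"); // not closed here on purpose
        int count = 0;
        while (scanner.hasNextLine()) {
            if (scanner.nextLine().trim().startsWith(">")) {
                count++;
            }
        }
        return count;
    }

    /** Returns true if the stream stayed open while in use and the caller closed it. */
    static boolean callerClosesStream() {
        TrackedStream stream = new TrackedStream(">a\nMK\n>b\nGC\n");
        boolean openAfterRead = false;
        try (TrackedStream in = stream) {
            countHeaders(in);
            openAfterRead = !in.closed; // the consumer left the stream open
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return openAfterRead && stream.closed; // try-with-resources closed it
    }

    public static void main(String[] args) {
        System.out.println(callerClosesStream());
    }
}
```

Wrapping the stream in try-with-resources on the caller's side keeps the close in the same scope that opened the stream, which matches the division of responsibility stated for this constructor.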
Method Detail

hasNext

public boolean hasNext()

Specified by:
hasNext in interface Iterator<FastaSequence>
Throws:
IllegalStateException - if the close method was called on this instance

next

public FastaSequence next()
Reads the next FastaSequence from the input

Specified by:
next in interface Iterator<FastaSequence>
Throws:
AssertionError - if the header or the sequence is missing
IllegalStateException - if the close method was called on this instance

remove

public void remove()
Not implemented

Specified by:
remove in interface Iterator<FastaSequence>

close

public void close()
Call this method to close the connection to the input file and free its resources. If this method is not called explicitly, the connection is closed at JVM shutdown. No further reading from this FastaReader instance is possible after calling this method.
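
The lifecycle documented for hasNext, next, remove and close can be sketched with a small stand-in iterator. ClosableIteratorSketch is a hypothetical class that performs no FASTA parsing. It only models the stated contract, and it assumes that "not implemented" for remove means throwing UnsupportedOperationException, which the documentation above does not specify.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the lifecycle contract only; it does no FASTA parsing.
final class ClosableIteratorSketch implements Iterator<String> {
    private final Iterator<String> source;
    private boolean closed = false;

    ClosableIteratorSketch(List<String> records) {
        this.source = records.iterator();
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("close() was already called on this instance");
        }
    }

    @Override
    public boolean hasNext() {
        ensureOpen();
        return source.hasNext();
    }

    @Override
    public String next() {
        ensureOpen();
        return source.next();
    }

    /** Assumption: "not implemented" is modelled as UnsupportedOperationException. */
    @Override
    public void remove() {
        throw new UnsupportedOperationException("remove is not supported");
    }

    /** After this call, hasNext() and next() fail with IllegalStateException. */
    void close() {
        closed = true;
    }

    /** Returns true if reading after close() is rejected with IllegalStateException. */
    static boolean rejectsReadsAfterClose() {
        ClosableIteratorSketch reader = new ClosableIteratorSketch(Arrays.asList("r1", "r2"));
        reader.next();
        reader.close();
        try {
            reader.hasNext();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsReadsAfterClose());
    }
}
```

A client following this contract would finish iterating before calling close, typically in a finally block, and would not reuse the instance afterwards.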