.TH "sfetch" 1 "@RELEASEDATE@" "@PACKAGE@ @RELEASE@" "@PACKAGE@ Manual"
sfetch - get a sequence from a flatfile database.
retrieves the sequence named
from a sequence database.
Which database is used is controlled by the
options, or "little databases" and "big
The directory location of "big databases" can
be specified by environment variables,
such as $SWDIR for Swissprot, and $GBDIR
A complete file path must be specified
for "little databases".
By default, if neither option is specified
and the name looks like a Swissprot identifier
(e.g. it has a _ character), the $SWDIR
environment variable is used to attempt
to retrieve the sequence
A variety of other options are available which allow
retrieval of subsequences
retrieval by accession number instead of
reformatting the extracted sequence into a variety
If the database has been GSI indexed, sequence
retrieval will be extremely efficient; otherwise,
retrieval may be painfully slow (the entire
database may have to be read into memory to
is recommended for all large or permanent
This program was originally named
and was renamed because it clashed with a GCG
program of the same name.
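.PP
The database selection rules described above can be sketched in Python.
This is only an illustration under stated assumptions, not the code
sfetch itself uses: the function name resolve_source, its parameters,
and the dictionary keys are hypothetical, and only the three
environment variables named on this page are listed.
The sketch also assumes that a "little database" path takes precedence
when both kinds of database are given, which this page does not state.
.nf
```python
import os

# Environment variables named on this page; the dictionary keys are
# illustrative labels, not the real database codes.
BIG_DB_ENV = {
    "swissprot": "SWDIR",
    "genbank": "GBDIR",
    "wormpep": "WORMDIR",
}


def resolve_source(seqname, little_db=None, big_db=None, env=None):
    """Return (kind, location) for a lookup, or None if nothing applies."""
    env = os.environ if env is None else env
    if little_db is not None:
        # "Little databases" need a complete file path.
        return ("file", little_db)
    if big_db is not None:
        # "Big databases" are located through an environment variable.
        return ("directory", env.get(BIG_DB_ENV[big_db]))
    if "_" in seqname:
        # Default: a name containing "_" looks like a Swissprot identifier.
        return ("directory", env.get("SWDIR"))
    # This page does not say what happens otherwise.
    return None
```
.fi
.PP
For example, with SWDIR set to a hypothetical directory /db/sw,
resolve_source("HBA_HUMAN") would point at that directory because the
name contains an underscore.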
as an accession number, not an identifier.
Retrieve the sequence from a sequence file named
exists, it is used to speed up the retrieval.
Extract a subsequence starting from position
rather than from 1. See
option), then the sequence is extracted as
its reverse complement (the sequence is assumed to be
nucleic acid).
Print brief help; includes version number and summary of
all options, including expert options.
Direct the output to a file named
By default, output goes to stdout.
in the output after extraction. By default, the original
sequence identifier is retained. Useful, for instance,
if retrieving a sequence fragment; the coordinates of
the fragment might be added to the name (this is what Pfam
Extract a subsequence that ends at position
rather than at the end of the sequence. See
option), then the sequence is extracted as
its reverse complement (the sequence is assumed to be
nucleic acid).
(Babelfish). Autodetect and read a sequence file format other than the
default (FASTA). Almost any common sequence file format is recognized
(including Genbank, EMBL, SWISS-PROT, PIR, and GCG unaligned sequence
formats, and Stockholm, GCG MSF, and Clustal alignment formats). See
the printed documentation for a complete list of supported formats.
Retrieve the sequence from the main sequence database
.IR <database> .
For each code, there is an environment
variable that specifies the directory path to that
Recognized codes and their corresponding environment
(Wormpep, $WORMDIR); and
Each database is read in its native flatfile format.
Reformat the extracted sequence into a different format.
(By default, the sequence is extracted from the database
in the same format as the database.) Available formats
.B embl, fasta, genbank, gcg, strider, zuker, ig, pir, squid,
.BI --informat " <s>"
Specify that the sequence file is in format
rather than the default FASTA format.
Common examples include Genbank, EMBL, GCG,
PIR, Stockholm, Clustal, MSF, or PHYLIP;
see the printed documentation for a complete list
of accepted format names.
This option overrides the default format (FASTA)
Babelfish autodetection option.
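.PP
The start and end position options described above can be illustrated
with a short Python sketch.
It is a sketch under stated assumptions, not the actual implementation:
the function name extract and its parameters are hypothetical,
coordinates are assumed to be 1-based and inclusive (the start
defaults to 1 and the end to the end of the sequence),
reverse complementing is assumed to happen when the start position is
greater than the end position,
and only the unambiguous DNA bases are complemented.
.nf
```python
# Complement table for unambiguous DNA bases only (an assumption).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract(seq, start=None, end=None):
    """Return seq[start..end] using 1-based, inclusive coordinates."""
    start = 1 if start is None else start
    end = len(seq) if end is None else end
    if start <= end:
        return seq[start - 1:end]
    # start > end: take the span end..start and reverse-complement it.
    return seq[end - 1:start].translate(COMPLEMENT)[::-1]
```
.fi
.PP
In this sketch, extract("ACGTTGCA", 2, 5) gives "CGTT", and
extract("AACCGGTT", 5, 2) gives "CGGT", the reverse complement of
positions 2 through 5.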
@PACKAGE@ and its documentation are @COPYRIGHT@
HMMER - Biological sequence analysis with profile HMMs
Copyright (C) 1992-1999 Washington University School of Medicine
This source code is distributed under the terms of the
GNU General Public License. See the files COPYING and LICENSE
See COPYING in the source code distribution for more details, or contact me.
Washington Univ. School of Medicine
St Louis, MO 63110 USA
Phone: 1-314-362-7666
Email: eddy@genetics.wustl.edu