.TH "hmmsearch" 1 @RELEASEDATE@ "HMMER @RELEASE@" "HMMER Manual"

.SH NAME
.TP
hmmsearch - search a sequence database with a profile HMM

.SH SYNOPSIS
.B hmmsearch
.I [options]
.I hmmfile
.I seqfile

.SH DESCRIPTION

.B hmmsearch
reads an HMM from
.I hmmfile
and searches
.I seqfile
for significantly similar sequence matches.

.PP
.I seqfile
will be looked for first in the current working directory,
then in a directory named by the environment variable
.I BLASTDB.
This lets users use existing BLAST databases, if BLAST
has been configured for the site.

.PP
.B hmmsearch
may take minutes or even hours to run, depending
on the size of the sequence database. It is a good
idea to redirect the output to a file.
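.PP
As a rough illustration of redirecting the output, the sketch below
builds a command line in the order given in the SYNOPSIS and writes the
program's standard output to a file. It is written in Python purely for
illustration and is not part of HMMER; the helper names and any file
names passed to them are hypothetical, and an hmmsearch executable is
assumed to be on the search path.
.nf
```python
import subprocess


def build_hmmsearch_argv(hmmfile, seqfile, options=()):
    """Return the argument list: hmmsearch [options] hmmfile seqfile."""
    return ["hmmsearch", *options, hmmfile, seqfile]


def run_hmmsearch(hmmfile, seqfile, outfile, options=()):
    """Run hmmsearch and redirect its standard output to outfile.

    Assumes an hmmsearch executable can be found on PATH.
    """
    argv = build_hmmsearch_argv(hmmfile, seqfile, options)
    with open(outfile, "w") as out:
        # check=True raises CalledProcessError if hmmsearch exits non-zero.
        subprocess.run(argv, stdout=out, check=True)
```
.fi
.PP
For example, run_hmmsearch("globin.hmm", "seqs.fa", "globin.out",
options=["-E", "0.01"]) would run the search with an E-value cutoff of
0.01 and leave the report in globin.out (hypothetical file names).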

.PP
The output consists of four sections: a ranked list
of the best scoring sequences, a ranked list of the
best scoring domains, alignments for all the best scoring
domains, and a histogram of the scores.
A sequence score may be higher than a domain score for
the same sequence if there is more than one domain in the sequence;
the sequence score takes into account all the domains.
All sequences scoring above the
.I -E
and
.I -T
cutoffs are shown in the first list, then
.I every
domain found in this list is
shown in the second list of domain hits.
If desired, E-value and bit score thresholds may also be applied
to the domain list using the
.I --domE
and
.I --domT
options.
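.PP
The relationship between the two ranked lists can be pictured with the
following sketch, written in Python purely for illustration. It is not
HMMER's implementation: the record layout, the function name report_hits
and the example data are hypothetical, and treating a hit that lies
exactly on a cutoff as passing is an assumption. The default cutoffs are
the ones documented under OPTIONS and EXPERT OPTIONS.
.nf
```python
import math


def report_hits(sequences, E=10.0, T=-math.inf, domE=math.inf, domT=-math.inf):
    """Return (sequence_list, domain_list) under the given cutoffs.

    Each entry of sequences is a dict with a hypothetical layout:
      {"name": str, "score": float, "evalue": float,
       "domains": [{"score": float, "evalue": float}, ...]}
    """
    # First list: sequences passing both per-sequence cutoffs, best first.
    seq_hits = [s for s in sequences if s["evalue"] <= E and s["score"] >= T]
    seq_hits.sort(key=lambda s: s["score"], reverse=True)

    # Second list: domains of those sequences that pass the per-domain
    # cutoffs; with the defaults, every such domain is kept.
    dom_hits = [
        (s["name"], d)
        for s in seq_hits
        for d in s["domains"]
        if d["evalue"] <= domE and d["score"] >= domT
    ]
    dom_hits.sort(key=lambda hit: hit[1]["score"], reverse=True)
    return seq_hits, dom_hits


# Made-up example: seqA has two domains, so its sequence score exceeds
# either domain score; seqB fails the default E-value cutoff of 10.0.
example = [
    {"name": "seqA", "score": 120.0, "evalue": 1e-30,
     "domains": [{"score": 70.0, "evalue": 1e-18},
                 {"score": 50.0, "evalue": 1e-12}]},
    {"name": "seqB", "score": -5.0, "evalue": 25.0,
     "domains": [{"score": -5.0, "evalue": 25.0}]},
]
```
.fi
.PP
With the defaults, report_hits(example) keeps seqA in the first list and
both of its domains in the second; passing domT=60.0 would drop the
weaker domain from the second list only.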

.SH OPTIONS

.TP
.B -h
Print brief help; includes version number and summary of
all options, including expert options.

.TP
.BI -A " <n>"
Limits the alignment output to the
.I <n>
best scoring domains.
.B -A0
shuts off the alignment output and can be used to reduce
the size of output files.

.TP
.BI -E " <x>"
Set the E-value cutoff for the per-sequence ranked hit list to
.I <x>,
where
.I <x>
is a positive real number. The default is 10.0. Hits with E-values
better than (less than) this threshold will be shown.

.TP
.BI -T " <x>"
Set the bit score cutoff for the per-sequence ranked hit list to
.I <x>,
where
.I <x>
is a real number.
The default is negative infinity; by default, the threshold
is controlled by E-value and not by bit score.
Hits with bit scores better than (greater than) this threshold
will be shown.

.TP
.BI -Z " <n>"
Calculate E-values as if we had seen a sequence database of
.I <n>
sequences. The default is the number of sequences seen in your
database file
.I seqfile.
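.PP
To give a feel for what
.B -Z
changes, the sketch below rescales an E-value from one database size to
another. It assumes that an E-value is a per-sequence P-value multiplied
by the number of sequences, so that it scales linearly with database
size; this page does not state the formula, so treat that as an
assumption. The function name rescale_evalue is hypothetical, and the
code is Python written only for illustration.
.nf
```python
def rescale_evalue(evalue, z_searched, z_assumed):
    """Rescale an E-value from z_searched sequences to z_assumed sequences.

    Assumes E = P * Z, i.e. the E-value grows linearly with database size.
    """
    if z_searched <= 0 or z_assumed <= 0:
        raise ValueError("database sizes must be positive")
    return evalue * z_assumed / z_searched
```
.fi
.PP
Under this assumption, a hit found in a small test database becomes less
significant when it is reported as if a much larger database had been
searched.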

.SH EXPERT OPTIONS

.TP
.B --compat
Use the output format of HMMER 2.1.1, the 1998-2001 public
release; provided so 2.1.1 parsers don't have to be rewritten.

.TP
.BI --cpu " <n>"
Sets the maximum number of CPUs that the program
will run on. The default is to use all CPUs
in the machine. Overrides the HMMER_NCPU
environment variable. Only affects threaded
versions of HMMER (the default on most systems).

.TP
.B --cut_ga
Use Pfam GA (gathering threshold) score cutoffs.
Equivalent
to -T <GA1> --domT <GA2>, but the GA1 and GA2 cutoffs
are read from the HMM file. hmmbuild puts these cutoffs there
if the alignment file was annotated in a Pfam-friendly
alignment format (extended SELEX or Stockholm format) and
the optional GA annotation line was present. If these
cutoffs are not set in the HMM file,
.B --cut_ga
doesn't work.

.TP
.B --cut_tc
Use Pfam TC (trusted cutoff) score cutoffs. Equivalent
to -T <TC1> --domT <TC2>, but the TC1 and TC2 cutoffs
are read from the HMM file. hmmbuild puts these cutoffs there
if the alignment file was annotated in a Pfam-friendly
alignment format (extended SELEX or Stockholm format) and
the optional TC annotation line was present. If these
cutoffs are not set in the HMM file,
.B --cut_tc
doesn't work.

.TP
.B --cut_nc
Use Pfam NC (noise cutoff) score cutoffs. Equivalent
to -T <NC1> --domT <NC2>, but the NC1 and NC2 cutoffs
are read from the HMM file. hmmbuild puts these cutoffs there
if the alignment file was annotated in a Pfam-friendly
alignment format (extended SELEX or Stockholm format) and
the optional NC annotation line was present. If these
cutoffs are not set in the HMM file,
.B --cut_nc
doesn't work.

.TP
.BI --domE " <x>"
Set the E-value cutoff for the per-domain ranked hit list to
.I <x>,
where
.I <x>
is a positive real number.
The default is infinity; by default, all domains in the sequences
that passed the first threshold will be reported in the second list,
so that the number of domains reported in the per-sequence list is
consistent with the number that appear in the per-domain list.

.TP
.BI --domT " <x>"
Set the bit score cutoff for the per-domain ranked hit list to
.I <x>,
where
.I <x>
is a real number. The default is negative infinity;
by default, all domains in the sequences
that passed the first threshold will be reported in the second list,
so that the number of domains reported in the per-sequence list is
consistent with the number that appear in the per-domain list.
.I Important note:
only one domain in a sequence is absolutely controlled by this
parameter, or by
.B --domE.
The second and subsequent domains in a sequence have a de facto
bit score threshold of 0 because of the details of how HMMER
works. HMMER requires at least one pass through the main model
per sequence; to do more than one pass (more than one domain)
the multidomain alignment must have a better score than the
single domain alignment, and hence the extra domains must contribute
positive score. See the User's Guide for more detail.

.TP
.B --forward
Use the Forward algorithm instead of the Viterbi algorithm
to determine the per-sequence scores. Per-domain scores are
still determined by the Viterbi algorithm. Some have argued that
Forward is a more sensitive algorithm for detecting remote
sequence homologues; my experiments with HMMER have not
confirmed this, however.

.TP
.BI --informat " <s>"
Assert that the input
.I seqfile
is in format
.I <s>;
do not run Babelfish format autodetection. This increases
the reliability of the program somewhat, because
the Babelfish can make mistakes; particularly
recommended for unattended, high-throughput runs
of HMMER. Valid format strings include FASTA,
GENBANK, EMBL, GCG, PIR, STOCKHOLM, SELEX, MSF,
CLUSTAL, and PHYLIP. See the User's Guide for a complete
list.

.TP
.B --null2
Turn off the post hoc second null model. By default, each alignment
is rescored by a postprocessing step that takes into account possible
biased composition in either the HMM or the target sequence.
This is almost essential in database searches, especially with
local alignment models. There is a very small chance that this
postprocessing might remove real matches, and
in these cases
.B --null2
may improve sensitivity at the expense of reducing
specificity by letting biased composition hits through.

.TP
.B --pvm
Run on a Parallel Virtual Machine (PVM). The PVM must
already be running. The client program
.B hmmsearch-pvm
must be installed on all the PVM nodes.
Optional PVM support must have been compiled into
HMMER.

.TP
.B --xnu
Turn on XNU filtering of target protein sequences. Has no effect
on nucleic acid sequences. In trial experiments,
.B --xnu
appears to perform less well than the default
post hoc null2 model.

.SH SEE ALSO

.PP
Master man page, with full list of and guide to the individual man
pages: see
.B hmmer(1).
.PP
A User's Guide and tutorial come with the distribution:
.B Userguide.ps
[PostScript] and/or
.B Userguide.pdf
[PDF].
.PP
Finally, all documentation is also available online via WWW:
.B http://hmmer.wustl.edu/

.SH AUTHOR

This software and documentation is:
.nf
@COPYRIGHT@
HMMER - Biological sequence analysis with profile HMMs
Copyright (C) 1992-1999 Washington University School of Medicine
All Rights Reserved

    This source code is distributed under the terms of the
    GNU General Public License. See the files COPYING and LICENSE
    for details.
.fi
See the file COPYING in your distribution for complete details.

.nf
Sean Eddy
HHMI/Dept. of Genetics
Washington Univ. School of Medicine
4566 Scott Ave.
St Louis, MO 63110 USA
Phone: 1-314-362-7666
FAX  : 1-314-362-7855
Email: eddy@genetics.wustl.edu
.fi