A new program, sc_to_e, can be used to calculate expectation values
from the regression coefficients reported from a search. The
expectation value is based on similarity score, sequence length, and

fasta30t7 differs from fasta30t6 in the amount of information provided
with the -m 10 option.

(1) The query and library sequence identifiers are no longer abbreviated.

(2) New information about the program and program version is provided.

The new information provided is:

mp_name: program name (actually argv[0])
mp_ver: main program version (can be different from function version)
mp_argv: command line arguments (duplicates argv[0])

Some statistical information is provided as well:
mp_extrap: XXX YYY - statistics extrapolated from XXX to YYY
mp_stats: indicates type of statistics used for E() value
mp_KS: Kolmogorov-Smirnov statistic

The "mp_" (main program) information is function independent, while the "pg_"
information is produced by a particular comparison function (ssearch,
fastx, fasta, etc). "pg_" should probably be called "fn_", and "mp_"
called "pg_", but I have kept the existing names to remain backwards
compatible.
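
As a rough illustration of how a script might consume these fields, the
sketch below splits "; key: value" lines into "mp_" and "pg_" groups. It
assumes that each field sits on its own line beginning with "; " and that
the key ends at the first ": ", as in the sample run in these notes. The
names parse_m10_fields and parse_extrap are hypothetical and are not part
of the FASTA distribution.

```python
def parse_m10_fields(lines):
    """Group '; key: value' lines by their 'mp_' or 'pg_' prefix.

    Assumption: every field line starts with '; ' and the key ends at the
    first ': '. Lines that do not match, or that use another prefix, are
    ignored.
    """
    groups = {"mp": {}, "pg": {}}
    for line in lines:
        if not line.startswith("; "):
            continue
        key, sep, value = line[2:].partition(": ")
        if not sep:
            continue
        prefix = key.split("_", 1)[0]
        if prefix in groups:
            groups[prefix][key] = value.strip()
    return groups


def parse_extrap(value):
    """Split an mp_extrap value such as '50000 51933' into two integers."""
    start, end = value.split()
    return int(start), int(end)


# Field lines copied from the sample run in these notes.
sample = [
    "; mp_ver: version 3.0t7 November, 1996",
    "; mp_argv: fasta3_t -q -m 10 gtm1_mouse.aa s",
    "; pg_ver: 3.06 Sept, 1996",
    "; mp_extrap: 50000 51933",
]
fields = parse_m10_fields(sample)
print(fields["mp"]["mp_ver"])
print(parse_extrap(fields["mp"]["mp_extrap"]))
```

Splitting at the first ": " only keeps values such as the mp_stats line
intact, since that value itself contains a colon.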

(3) The end of the "parseable" records is denoted with:

(4) There is now a compile-time option, -DM10_CONS, that allows you to
display a final alignment summary:

.::.:- .:: .. :. .:.---: : .--.:. :
.. .--- ..: :: ... :..: .::.:. . .---. . .:
: . . . : .. . :..: .--. . : .:. .. : .

or, if M10_CONS_L is defined (in addition to M10_CONS), the output is:

p==p=-mmmp==mpzmm=pmmmmz=p---=mmm=mmp--p=zm=m
pzmmp---mmzp=m==mzzzm=zp=mz==z=pmzmmz---pmmpmmmp=m
m=mzmmzmpm=mmmmppmmmpmmmm=pp=mp--pmpm=mp=pmzzm=mmp

where '=' indicates identical residues, '-' a gap in one or the other
sequence, 'p' indicates a positive pam value, 'm' indicates a negative
pam value, and 'z' indicates a zero pam value.
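
The symbol legend above maps directly onto a small tally. The following
is a minimal sketch that counts each category in one M10_CONS_L summary
line. The function name tally_cons_l and the category labels are
hypothetical, chosen only for this illustration.

```python
# Meaning of each M10_CONS_L symbol, as listed in the text above.
SYMBOLS = {
    "=": "identical",
    "-": "gap",
    "p": "positive",
    "m": "negative",
    "z": "zero",
}


def tally_cons_l(summary):
    """Count how often each category occurs in an M10_CONS_L summary line.

    Characters outside the five documented symbols are skipped.
    """
    counts = {name: 0 for name in SYMBOLS.values()}
    for ch in summary:
        if ch in SYMBOLS:
            counts[SYMBOLS[ch]] += 1
    return counts


# First summary line from the example above.
line = "p==p=-mmmp==mpzmm=pmmmmz=p---=mmm=mmp--p=zm=m"
counts = tally_cons_l(line)
print(counts)
```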

A typical run now looks like:

>>>gtm1_mouse.aa, 217 aa vs s library
; mp_ver: version 3.0t7 November, 1996
; mp_argv: fasta3_t -q -m 10 gtm1_mouse.aa s
; pg_ver: 3.06 Sept, 1996
; mp_extrap: 50000 51933
; mp_stats: Expectation fit: rho(ln(x))= 5.8855+/-0.000527; mu= 1.5386+/- 0.029; mean_var=73.0398+/-15.283
; mp_KS: 0.0133 (N=29) at 42
>>GTM1_MOUSE GLUTATHIONE S-TRANSFERASE GT8.7 (EC 2.5.1.18) (GST 1-1) (CLASS-MU).
PMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEKF
KLGLDFPNLPYLIDGSHKITQSNAILRYLARKHHLDGETEEERIRADIVE
NQVMDTRMQLIMLCYNPDFEKQKPEFLKTIPEKMKLYSEFLGKRPWFAGD
KVTYVDFLAYDILDQYRMFEPKCLDAFPNLRDFLARFEGLKKISAYMKSS
PMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEKF
KLGLDFPNLPYLIDGSHKITQSNAILRYLARKHHLDGETEEERIRADIVE
NQVMDTRMQLIMLCYNPDFEKQKPEFLKTIPEKMKLYSEFLGKRPWFAGD
KVTYVDFLAYDILDQYRMFEPKCLDAFPNLRDFLARFEGLKKISAYMKSS
>>GTM1_RAT GLUTATHIONE S-TRANSFERASE YB1 (EC 2.5.1.18) (CHAIN 3) (CLASS-MU).
; al_display_start: 1
PMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEKF
KLGLDFPNLPYLIDGSHKITQSNAILRYLARKHHLDGETEEERIRADIVE
NQVMDTRMQLIMLCYNPDFEKQKPEFLKTIPEKMKLYSEFLGKRPWFAGD
KVTYVDFLAYDILDQYRMFEPKCLDAFPNLRDFLARFEGLKKISAYMKSS
; al_display_start: 1
PMILGYWNVRGLTHPIRLLLEYTDSSYEEKRYAMGDAPDYDRSQWLNEKF
KLGLDFPNLPYLIDGSRKITQSNAIMRYLARKHHLCGETEEERIRADIVE
NQVMDNRMQLIMLCYNPDFEKQKPEFLKTIPEKMKLYSEFLGKRPWFAGD
KVTYVDFLAYDILDQYHIFEPKCLDAFPNLKDFLARFEGLKKISAYMKSS
:::::::::::::::::.:::::::::.::::.::::::.::::::::::
::::::::::::::::.::::::::.::::::::: ::::::::::::::
:::::.::::::::::::::::::::::::::::::::::::::::::::
::::::::::::::::..::::::::::::.:::::::::::::::::::

217 residues in 1 query sequences
18531385 residues in 52205 library sequences
Tcomplib (4 proc)[version 3.0t7 November, 1996]
start: Fri Nov 8 18:20:26 1996 done: Fri Nov 8 18:20:41 1996
Scan time: 38.434 Display time: 2.166

Function used was FASTA
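
The alignment in the GTM1_RAT record can be checked by hand. The sketch
below takes the first 50-residue row of each of the two displayed
sequences in that record and counts the columns where the residues
agree. It assumes the two rows are already aligned column by column (no
gap characters appear in them); the function name count_identities is
hypothetical.

```python
def count_identities(row_a, row_b):
    """Count columns where two aligned rows carry the same residue.

    Assumption: both rows have equal length and '-' marks a gap, which
    never counts as an identity.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")


# First row of each displayed sequence in the GTM1_RAT record above.
first_row = "PMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEKF"
second_row = "PMILGYWNVRGLTHPIRLLLEYTDSSYEEKRYAMGDAPDYDRSQWLNEKF"

same = count_identities(first_row, second_row)
print(f"{same} of {len(first_row)} positions identical")
```

For these two rows the count is 46 of 50. The four differing columns
(18, 28, 33 and 40) are the ones marked '.' in the first summary row of
the sample run.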

================================================================

Made changes to complib.c, comp_thr.c, nxgetaa.c to allow the scoring
matrix to be modified in fastx3, fastx3_t.

================================================================

nxgetaa.c now accepts query sequences from "stdin" by using "-" as the
input file name. If DNA sequences are read in this mode, the "-n"

Included code in nxgetaa.c and Makefile.sgi to get around a bug in SGI's
sscanf() that prevented compressed GCG databases from being read properly.