$Name: fa_34_26_5 $ - $Id: readme.pvm_3.4,v 1.3 2001/09/17 21:18:19 wrp Exp $


20-August-2001

The pvm/mpi complib programs have been substantially updated with
release 3.4.  See readme.v34t0 for more information.  With version
3.4, the MPI programs are mp34comp*, mu34comp*, etc.

A major effect of this change is to disable automatic sequence type
(protein/DNA) recognition with pv34compfa/mp34compfa.  By default,
protein libraries are assumed.  Thus, pv34compfa/mp34compfa require
the "-n" command line option when running pv34compfa/mp34compfa on DNA
sequence libraries.  This issue does not occur with the other
programs, which will recognize the appropriate sequence type, because
it is determined by the program (e.g. pv34compfx requires
DNA:protein).

================
pv4comp* - July, August, 2000

As noted in readme.pvm_3.3 - the major problem that users have had
with the PVM/MPI version of the programs is in reading database files
on the nodes.  All previous versions of the program (pvcompfa,
pv3compfa, etc) had the nodes read the databases in parallel. Thus,
the database file had to be visible to the nodes, typically through
NFS on modern clusters of workstations.

This strategy caused some problems. It did not work on beowulf-type
systems, where most of the nodes are in an isolated local network and
do not have NFS access to the outside world.  And it made it
complicated to read more than one database file.  Because specialized
functions were used, the nodes could not read the full set of library
file formats available to the other fasta programs.

These problems have been addressed by significantly changing the the
way the pv4comp*/mp4comp* programs read the second "reference"
library.  With these versions, both databases, but specifically the
reference library, are read by a manager process.  The manager process
then sends the sequences to the workers.  This solves problems with
NFS reads from the workers (they don't do any), and uses exactly the
same functions as the other fasta programs, so the full set of
database formats can be read. In addition, the FASTLIBS database
abbreviations are available. This also should also solve problems with
searches of very long sequences (bacterial genomes); they can now be
broken up into smaller pieces with the -N ##### option, as with
fasta33/tfastx33.

Thus, you are encouraged to use the pv4comp*/mp4comp* versions of the
programs, which should run more like fasta33.

================
Program summary:

Programs to produce conventional scores and alignments:

pv4compfa	protein vs protein, DNA vs DNA
pv4compsw	protein vs protein, DNA vs DNA
pv4compfx/	DNA vs protein
pv4comptfx/y	protein vs DNA

Programs to summarize the effectiveness of a search (require
super-family-labeled databases):

ps4compfa	protein vs protein, DNA vs DNA
ps4compsw	protein vs protein, DNA vs DNA
ps4compfx/	DNA vs protein
ps4comptfx/y	protein vs DNA

Programs to report the scores and alignments of the highest scoring
unrelated sequence (require super-family-labeled databases). These
programs are used to evaluate the super-family labeling.

pu4compfa	protein vs protein, DNA vs DNA
pu4compsw	protein vs protein, DNA vs DNA
pucompfx/	DNA vs protein
pu4comptfx/y	protein vs DNA

================
Release notes:

--> Aug. 4, 2000

Compiled and tested mp4compfa/mp4compsw programs.

--> July 22, 2000

First release of restructured p2_complib.c/p2_workcomp.c, which use
the manager program to read both sequence databases and send the
"reference database" to the workers.