From: pvtroshin
Date: Wed, 21 Jul 2010 16:27:56 +0000 (+0000)
Subject: Add JRonn runner, tester, methods to parse jronn output files.
X-Git-Url: http://source.jalview.org/gitweb/?a=commitdiff_plain;h=48190e32e896a6f674284844a9ae75605aa40157;p=jabaws.git

Add JRonn runner, tester, methods to parse jronn output files.

git-svn-id: link to svn.lifesci.dundee.ac.uk/svn/barton/ptroshin/JABA2@2638 e3abac25-378b-4346-85de-24260fe3988d
---

diff --git a/TODO.txt b/TODO.txt
index a212651..79f6ca1 100644
--- a/TODO.txt
+++ b/TODO.txt
@@ -1,35 +1,40 @@
 TODO:
-DONE: LocalExecutor service must be shutdown on web application undeploy event not on JVM shutdown.
-This is because JVM is still running even after web application is shut down!
-
-
-Good toString method for Limits (test with command line client -limits)
+Add globprot ws
+Add ronn ws
-Add facility to distribute other results of the calculations like the trees and annotation file for probcons.
+USE CASE - TURN ALIGNMENT INTO PROFILE AND SEARCH SEQUENCE DATABASE USECASE
+- Receive user alignment
+- use hmmerbuild to turn it to profile
+- use hmmersearch to search the database
-Use absolute path for web site download links as archives are not included into distro!
+#END OF - TURN ALIGNMENT INTO PROFILE AND SEARCH SEQUENCE DATABASE USECASE
+New data model for representing psiblast,blast,phmmer,jackhmmer results
-(later) Define limits for presets
+new parsers for the above programmes output (Stockholm MSA format?)
-Add documentation on Presets, Parameters and Limits
+Think hard on what to do with large output files?
+e.g. serve the hits table in full, but retrieve alignments on demand.
+What actually neeeds to be sent?
-Explain how to define a limit
+Add facility to distribute other results of the calculations like the trees and
+annotation file for probcons.
-Rearrange web site docs - add links to the top of the page to the topics discussed below
+# END OF SEARCHING SEQUENCE DATABASE USECASE
-Put the documentation for various executables online
-
-Pack the test cases and build file to run them in one of the distributives
+Good toString method for Limits (test with command line client -limits)
-Add test for repeated result collection from cluster engine
+JABA DOCS
+(later) Define limits for presets - if required
+(later) Add documentation on Presets, Parameters and Limits
+(later) Explain how to define a limit
-Add logging facility to WSTester so more details of the problem can be reported
+Add test for repeated result collection from cluster engine
-Make left gradient image one-two characters larger + change "For developer ->For Developers"
+(low) Pack the test cases and build file to run them in one of the distributives
-PART DONE: test local/cluster execution - test Load balancer
+(low) Add logging facility to WSTester so more details of the problem can be reported
 (low) Statistics collector for engines (number of operations/timing)
@@ -37,17 +42,23 @@ PART DONE: test local/cluster execution - test Load balancer
 (later) Implement utility to rerun died tasks
-(later) Progress bars - talk about (1. Proper time assessment vs low hustle option - dumping output to screen)
-
 (later) Add recognition for executables for the different architectures
 (later) use Latch to manage engine reservations. Reservations for parallel jobs?
 DONE
+WILL NOT DO: Progress bars - needs assessing how long does it take to run a task
+DONE: LocalExecutor service must be shutdown on web application undeploy event not on JVM shutdown.
+ This is because JVM is still running even after web application is shut down!
+DONE: Use absolute path for web site download links as archives are not included into distro!
+DONE: Rearrange web site docs - add links to the top of the page to the topics discussed below +DONE: Put the documentation for various executables online +DONE: Make left gradient image one-two characters larger + change "For developer ->For Developers" +DONE: test local/cluster execution - test Load balancer DONE: Improve SimpleWS client so it can be scripted against: give user alignment and accept parameters DONE: Make a configuration tester class (check that configuration files point to the executables, and they can be executed) DONE: PUT CLIENT JAR INTO WEB-INF/LIB - most likely need to get rid of dist prefix in the build file -DONE: Different packages for donwload and dundee. generic vs specific settings. +DONE: Different packages for download and dundee. generic vs specific settings. DONE: JAVADOC DONE: Refactor EngineResourcesLeak tester to hide a map implementation! DONE: Compile executables for linux in the most generic way diff --git a/binaries/jronn3.1.jar b/binaries/jronn3.1.jar new file mode 100644 index 0000000..fa39e98 Binary files /dev/null and b/binaries/jronn3.1.jar differ diff --git a/build.xml b/build.xml index 1448e35..724776f 100644 --- a/build.xml +++ b/build.xml @@ -8,6 +8,7 @@ + @@ -147,18 +148,34 @@ + + Jar file: Minimal WS client jar + + + + + + + + + + + + + + + + + + - + Jar file: Minimal WS client jar - - - + - - diff --git a/conf/Executable.properties b/conf/Executable.properties index 681b3f0..cb3daa1 100644 --- a/conf/Executable.properties +++ b/conf/Executable.properties @@ -2,7 +2,7 @@ ### Clustal configuration ### local.clustalw.bin.windows=binaries/clustalw2.exe local.clustalw.bin=binaries/src/clustalw/src/clustalw2 -#cluster.clustalw.bin=/homes/pvtroshin/workspace/clustengine/binaries/src/clustalw/src/clustalw2 +cluster.clustalw.bin=/homes/pvtroshin/workspace/JABA2/binaries/src/clustalw/src/clustalw2 # Parameters names which come from RunnerConfig -> Parameters.xml file ultimately are all lowercased in comparison! 
 # see engine.client.Util.getExecProperty() method for details
 # So they are case insensitive.
@@ -10,50 +10,61 @@ clustalw.-matrix.path=binaries/matrices
 clustalw.presets.file=conf/settings/ClustalPresets.xml
 clustalw.parameters.file=conf/settings/ClustalParameters.xml
 clustalw.limits.file=conf/settings/ClustalLimits.xml
-#clustalw.cluster.settings=-l h_cpu=24:00:00 -l h_vmem=6000M -l ram=6000M
+clustalw.cluster.settings=-l h_cpu=24:00:00 -l h_vmem=6000M -l ram=6000M
 ### Muscle configuration ###
 local.muscle.bin.windows=binaries/muscle.exe
 local.muscle.bin=binaries/src/muscle/muscle
 # Beware version of muscle on the cluster older and does not support some
 # of the newer version attributed thus, will not work with Muscle.java wrapper!
-cluster.muscle.bin=/homes/pvtroshin/workspace/clustengine/binaries/src/muscle/muscle
+cluster.muscle.bin=/homes/pvtroshin/workspace/JABA2/binaries/src/muscle/muscle
 #The environment variable MUSCLE_MXPATH can be used to specify a path where the matrices are stored
 # e.g. MUSCLE_MXPATH#binaries/matrices - but need to privide absolute path!
 muscle.-matrix.path=binaries/matrices
 muscle.presets.file=conf/settings/MusclePresets.xml
 muscle.parameters.file=conf/settings/MuscleParameters.xml
 muscle.limits.file=conf/settings/MuscleLimits.xml
-#muscle.cluster.settings=-l h_cpu=24:00:00 -l h_vmem=6000M -l ram=6000M
+muscle.cluster.settings=-l h_cpu=24:00:00 -l h_vmem=6000M -l ram=6000M
 ### Mafft configuration ###
 #local.mafft.bin.windows=
 local.mafft.bin=binaries/src/mafft/binaries/mafft
-#cluster.mafft.bin=/homes/pvtroshin/workspace/clustengine/binaries/src/mafft/core/mafft
+cluster.mafft.bin=/homes/pvtroshin/workspace/JABA2/binaries/src/mafft/core/mafft
 mafft.bin.env=MAFFT_BINARIES#binaries/src/mafft/binaries;FASTA_4_MAFFT#binaries/src/fasta34/fasta34;
 mafft.--aamatrix.path=binaries/matrices
 mafft.presets.file=conf/settings/MafftPresets.xml
 mafft.parameters.file=conf/settings/MafftParameters.xml
 mafft.limits.file=conf/settings/MafftLimits.xml
-#mafft.cluster.settings=-l h_cpu=24:00:00 -l h_vmem=6000M -l ram=6000M
+mafft.cluster.settings=-l h_cpu=24:00:00 -l h_vmem=6000M -l ram=6000M
 ### Tcoffee configuration ###
 local.tcoffee.bin=binaries/src/tcoffee/t_coffee_source/t_coffee
-#cluster.tcoffee.bin=/homes/pvtroshin/workspace/clustengine/binaries/src/tcoffee/t_coffee_source/t_coffee
+cluster.tcoffee.bin=/homes/pvtroshin/workspace/JABA2/binaries/src/tcoffee/t_coffee_source/t_coffee
 #/sw/bin/t_coffee
 # Sub matrix support does not work
 #tcoffee.-matrix.path=binaries/matrices
 tcoffee.presets.file=conf/settings/TcoffeePresets.xml
 tcoffee.parameters.file=conf/settings/TcoffeeParameters.xml
 tcoffee.limits.file=conf/settings/TcoffeeLimits.xml
-#tcoffee.cluster.cpunum=4
-#tcoffee.cluster.settings=-q mpi -pe mpi 4 -l h_vmem=1700M -l ram=1700M -l h_cpu=24:00:00
+tcoffee.cluster.cpunum=4
+tcoffee.cluster.settings=-q 64bit-pri.q -pe smp 4 -l h_vmem=1700M -l ram=1700M -l h_cpu=24:00:00
 ### Probcons configuration ###
 #local.probcons.bin.windows=
 local.probcons.bin=binaries/src/probcons/probcons
-#cluster.probcons.bin=/homes/pvtroshin/workspace/clustengine/binaries/src/probcons/probcons
+cluster.probcons.bin=/homes/pvtroshin/workspace/JABA2/binaries/src/probcons/probcons
 #Probcons does not support matrix loading - unrecognised option reported!
 probcons.parameters.file=conf/settings/ProbconsParameters.xml
 probcons.limits.file=conf/settings/ProbconsLimits.xml
-#probcons.cluster.settings=-l h_cpu=24:00:00 -l h_vmem=6000M -l ram=6000M
+probcons.cluster.settings=-l h_cpu=24:00:00 -l h_vmem=6000M -l ram=6000M
+
+### Jronn configuration ###
+local.jronn.bin.windows=D:\\Java\\jdk1.6.0_14\\bin\\java.exe
+local.jronn.bin=/sw/java/latest/bin/java
+cluster.jronn.bin=/sw/java/latest/bin/java
+jronn.jar.file=binaries/jronn3.1.jar
+jronn.parameters.file=conf/settings/JronnParameters.xml
+jronn.limits.file=conf/settings/JronnLimits.xml
+jronn.jvm.options=-Xms32M -Xmx512M
+jronn.cluster.cpunum=4
+jronn.cluster.settings=-q 64bit-pri.q -pe smp 4 -l h_vmem=1700M -l ram=1700M -l h_cpu=24:00:00
diff --git a/conf/log4j.properties b/conf/log4j.properties
index 6d52128..9fe9a23 100644
--- a/conf/log4j.properties
+++ b/conf/log4j.properties
@@ -1,6 +1,6 @@
 ## CHANGE THIS (The root directory where to store all the log files)
-#logDir = .
+logDir = .
## Uncomment to enable JWS2 activity logging to standard out (to the console if available) ## for possible log levels please refer to Log4j documentation http://logging.apache.org/log4j/1.2/manual.html @@ -13,20 +13,20 @@ ## FATAL - log fatal events only ## Uncomment this if you would like the system to log messages into stdout -#log4j.rootLogger=ERROR, stdout -#log4j.appender.stdout=org.apache.log4j.ConsoleAppender -#log4j.appender.stdout.Target=System.out -#log4j.appender.stdout.layout=org.apache.log4j.PatternLayout -#log4j.appender.stdout.layout.ConversionPattern=%m%n +log4j.rootLogger=ERROR, stdout +log4j.appender.stdout=org.apache.log4j.ConsoleAppender +log4j.appender.stdout.Target=System.out +log4j.appender.stdout.layout=org.apache.log4j.PatternLayout +log4j.appender.stdout.layout.ConversionPattern=%m%n ## Uncomment to enable JWS2 activity logging to the file -#log4j.logger.compbio=ERROR, ACTIVITY -#log4j.appender.ACTIVITY=org.apache.log4j.RollingFileAppender -#log4j.appender.ACTIVITY.File=${logDir}/activity.log -#log4j.appender.ACTIVITY.MaxFileSize=10MB -#log4j.appender.ACTIVITY.MaxBackupIndex=10000 -#log4j.appender.ACTIVITY.layout=org.apache.log4j.PatternLayout -#log4j.appender.ACTIVITY.layout.ConversionPattern=%d{MM-dd@HH:mm:ss} %-5p %3x - %m%n +log4j.logger.compbio=ERROR, ACTIVITY +log4j.appender.ACTIVITY=org.apache.log4j.RollingFileAppender +log4j.appender.ACTIVITY.File=${logDir}/activity.log +log4j.appender.ACTIVITY.MaxFileSize=10MB +log4j.appender.ACTIVITY.MaxBackupIndex=10000 +log4j.appender.ACTIVITY.layout=org.apache.log4j.PatternLayout +log4j.appender.ACTIVITY.layout.ConversionPattern=%d{MM-dd@HH:mm:ss} %-5p %3x - %m%n ## Uncomment for separate local engine execution log (debugging only) #log4j.logger.compbio.engine.local.LocalExecutorService=INFO, C diff --git a/conf/settings/JronnLimits.xml b/conf/settings/JronnLimits.xml new file mode 100644 index 0000000..9ea48b0 --- /dev/null +++ b/conf/settings/JronnLimits.xml @@ -0,0 +1,13 @@ + + + 
compbio.runner.disorder.Jronn + + 2000 + 2000 + + + # LocalEngineExecutionLimit # + 5 + 500 + + diff --git a/conf/settings/JronnParameters.xml b/conf/settings/JronnParameters.xml new file mode 100644 index 0000000..27e7a86 --- /dev/null +++ b/conf/settings/JronnParameters.xml @@ -0,0 +1,17 @@ + + + compbio.runner.disorder.Jronn + = + + Disorder probability + The probability of disorder threshold + -d + prog_docs/jronn.html + 0.53 + + Float + 0.01 + 0.99 + + + diff --git a/datamodel/compbio/data/sequence/AnnotatedSequence.java b/datamodel/compbio/data/sequence/AnnotatedSequence.java new file mode 100644 index 0000000..5d09534 --- /dev/null +++ b/datamodel/compbio/data/sequence/AnnotatedSequence.java @@ -0,0 +1,56 @@ +package compbio.data.sequence; + +import java.util.Arrays; + +public class AnnotatedSequence extends FastaSequence { + + private final float[] annotation; + + public AnnotatedSequence(String id, String sequence, float[] annotation) { + super(id, sequence); + this.annotation = annotation; + if (annotation == null || annotation.length != sequence.length()) { + throw new IllegalArgumentException("The length of the annotation (" + + ((annotation != null) ? 
annotation.length : "0") + + ") does not match the length of the sequence (" + + sequence.length() + ")!"); + } + } + + public AnnotatedSequence(FastaSequence fsequence, float[] annotation) { + this(fsequence.getId(), fsequence.getSequence(), annotation); + } + + public float[] getAnnotation() { + return annotation; + } + + @Override + public int hashCode() { + final int prime = 7; + int result = super.hashCode(); + result = prime * result + Arrays.hashCode(annotation); + return result; + } + + @Override + public boolean equals(Object obj) { + if (this == obj) + return true; + if (!super.equals(obj)) + return false; + if (getClass() != obj.getClass()) + return false; + AnnotatedSequence other = (AnnotatedSequence) obj; + if (!Arrays.equals(annotation, other.annotation)) + return false; + return true; + } + + @Override + public String toString() { + return super.toString() + "Annotation:\n " + + Arrays.toString(annotation) + "\n"; + } + +} diff --git a/datamodel/compbio/data/sequence/SequenceUtil.java b/datamodel/compbio/data/sequence/SequenceUtil.java index 1a3ce5b..f7c923a 100644 --- a/datamodel/compbio/data/sequence/SequenceUtil.java +++ b/datamodel/compbio/data/sequence/SequenceUtil.java @@ -1,6 +1,9 @@ -/* Copyright (c) 2009 Peter Troshin +/* + * @(#)SequenceUtil.java 1.0 September 2009 + * + * Copyright (c) 2009 Peter Troshin * - * JAva Bioinformatics Analysis Web Services (JABAWS) @version: 1.0 + * Jalview Web Services version: 2.0 * * This library is free software; you can redistribute it and/or modify it under the terms of the * Apache License version 2 as published by the Apache Software Foundation @@ -10,7 +13,7 @@ * License for more details. * * A copy of the license is in apache_license.txt. It is also available here: - * @see: http://www.apache.org/licenses/LICENSE-2.0.txt + * see: http://www.apache.org/licenses/LICENSE-2.0.txt * * Any republication or derived work distributed in source code form * must include this copyright and license notice. 
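Review note on AnnotatedSequence.java (not part of the commit): the new class keeps a reference to the caller's float[] and hands the same array back from getAnnotation(). The length check in the constructor therefore cannot stop the values from being changed afterwards. The following is a minimal sketch of the same length invariant with defensive copies. It assumes a hypothetical standalone class, AnnotatedSeqSketch, which does not extend the real FastaSequence and is not the author's implementation.

```java
import java.util.Arrays;

// Hypothetical stand-in for AnnotatedSequence (an assumption for illustration);
// the id and sequence are plain fields because FastaSequence is not available here.
final class AnnotatedSeqSketch {

    private final String id;
    private final String sequence;
    private final float[] annotation;

    AnnotatedSeqSketch(String id, String sequence, float[] annotation) {
        // Validate before storing anything, and reject nulls explicitly.
        if (sequence == null || annotation == null
                || annotation.length != sequence.length()) {
            throw new IllegalArgumentException(
                    "The length of the annotation ("
                            + (annotation != null ? annotation.length : 0)
                            + ") does not match the length of the sequence ("
                            + (sequence != null ? sequence.length() : 0) + ")!");
        }
        this.id = id;
        this.sequence = sequence;
        // Copy on the way in so later changes by the caller are not visible.
        this.annotation = annotation.clone();
    }

    String getId() {
        return id;
    }

    String getSequence() {
        return sequence;
    }

    float[] getAnnotation() {
        // Copy on the way out so callers cannot alter the stored values.
        return annotation.clone();
    }

    public static void main(String[] args) {
        float[] scores = {0.48f, 0.42f, 0.42f};
        AnnotatedSeqSketch seq = new AnnotatedSeqSketch("Foobar", "MGD", scores);
        scores[0] = 9f; // does not affect the stored annotation
        System.out.println(seq.getId() + " " + seq.getSequence() + " "
                + Arrays.toString(seq.getAnnotation()));
    }
}
```

Copying costs one array allocation per call, which may or may not matter for long sequences; keeping the shared array, as the commit does, is a valid trade-off if callers are trusted not to modify it.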
@@ -21,6 +24,8 @@ package compbio.data.sequence; import java.io.BufferedReader; import java.io.BufferedWriter; import java.io.Closeable; +import java.io.File; +import java.io.FileInputStream; import java.io.IOException; import java.io.InputStream; import java.io.InputStreamReader; @@ -35,9 +40,8 @@ import java.util.regex.Pattern; /** * Utility class for operations on sequences * - * @author pvtroshin - * - * Date September 2009 + * @author Petr Troshin + * @version 1.0 */ public final class SequenceUtil { @@ -111,8 +115,8 @@ public final class SequenceUtil { /** * @return true is the sequence contains only letters a,c, t, g, u */ - public static boolean isNucleotideSequence(FastaSequence s) { - return isNonAmbNucleotideSequence(s.getSequence()); + public static boolean isNucleotideSequence(final FastaSequence s) { + return SequenceUtil.isNonAmbNucleotideSequence(s.getSequence()); } /** @@ -120,11 +124,11 @@ public final class SequenceUtil { * (!) - B char */ public static boolean isNonAmbNucleotideSequence(String sequence) { - sequence = cleanSequence(sequence); - if (DIGIT.matcher(sequence).find()) { + sequence = SequenceUtil.cleanSequence(sequence); + if (SequenceUtil.DIGIT.matcher(sequence).find()) { return false; } - if (NON_NUCLEOTIDE.matcher(sequence).find()) { + if (SequenceUtil.NON_NUCLEOTIDE.matcher(sequence).find()) { return false; /* * System.out.format("I found the text starting at " + @@ -132,7 +136,7 @@ public final class SequenceUtil { * nonDNAmatcher.end()); */ } - Matcher DNAmatcher = NUCLEOTIDE.matcher(sequence); + final Matcher DNAmatcher = SequenceUtil.NUCLEOTIDE.matcher(sequence); return DNAmatcher.find(); } @@ -144,7 +148,7 @@ public final class SequenceUtil { */ public static String cleanSequence(String sequence) { assert sequence != null; - final Matcher m = WHITE_SPACE.matcher(sequence); + final Matcher m = SequenceUtil.WHITE_SPACE.matcher(sequence); sequence = m.replaceAll("").toUpperCase(); return sequence; } @@ -157,10 +161,10 @@ public 
final class SequenceUtil { * @return cleaned up sequence */ public static String deepCleanSequence(String sequence) { - sequence = cleanSequence(sequence); - sequence = DIGIT.matcher(sequence).replaceAll(""); - sequence = NONWORD.matcher(sequence).replaceAll(""); - Pattern othernonSeqChars = Pattern.compile("[_-]+"); + sequence = SequenceUtil.cleanSequence(sequence); + sequence = SequenceUtil.DIGIT.matcher(sequence).replaceAll(""); + sequence = SequenceUtil.NONWORD.matcher(sequence).replaceAll(""); + final Pattern othernonSeqChars = Pattern.compile("[_-]+"); sequence = othernonSeqChars.matcher(sequence).replaceAll(""); return sequence; } @@ -171,17 +175,17 @@ public final class SequenceUtil { * @return true is the sequence is a protein sequence, false overwise */ public static boolean isProteinSequence(String sequence) { - sequence = cleanSequence(sequence); - if (isNonAmbNucleotideSequence(sequence)) { + sequence = SequenceUtil.cleanSequence(sequence); + if (SequenceUtil.isNonAmbNucleotideSequence(sequence)) { return false; } - if (DIGIT.matcher(sequence).find()) { + if (SequenceUtil.DIGIT.matcher(sequence).find()) { return false; } - if (NON_AA.matcher(sequence).find()) { + if (SequenceUtil.NON_AA.matcher(sequence).find()) { return false; } - Matcher protmatcher = AA.matcher(sequence); + final Matcher protmatcher = SequenceUtil.AA.matcher(sequence); return protmatcher.find(); } @@ -194,20 +198,20 @@ public final class SequenceUtil { * protein or DNA */ public static boolean isAmbiguosProtein(String sequence) { - sequence = cleanSequence(sequence); - if (isNonAmbNucleotideSequence(sequence)) { + sequence = SequenceUtil.cleanSequence(sequence); + if (SequenceUtil.isNonAmbNucleotideSequence(sequence)) { return false; } - if (DIGIT.matcher(sequence).find()) { + if (SequenceUtil.DIGIT.matcher(sequence).find()) { return false; } - if (NON_AA.matcher(sequence).find()) { + if (SequenceUtil.NON_AA.matcher(sequence).find()) { return false; } - if 
(AA.matcher(sequence).find()) { + if (SequenceUtil.AA.matcher(sequence).find()) { return false; } - Matcher amb_prot = AMBIGUOUS_AA.matcher(sequence); + final Matcher amb_prot = SequenceUtil.AMBIGUOUS_AA.matcher(sequence); return amb_prot.find(); } @@ -221,12 +225,13 @@ public final class SequenceUtil { * - the maximum number of characters to write in one line * @throws IOException */ - public static void writeFasta(OutputStream outstream, - List sequences, int width) throws IOException { - OutputStreamWriter writer = new OutputStreamWriter(outstream); - BufferedWriter fastawriter = new BufferedWriter(writer); - for (FastaSequence fs : sequences) { - fastawriter.write(fs.getOnelineFasta()); + public static void writeFasta(final OutputStream outstream, + final List sequences, final int width) + throws IOException { + final OutputStreamWriter writer = new OutputStreamWriter(outstream); + final BufferedWriter fastawriter = new BufferedWriter(writer); + for (final FastaSequence fs : sequences) { + fastawriter.write(fs.getFormatedSequence(width)); } outstream.flush(); fastawriter.close(); @@ -242,28 +247,30 @@ public final class SequenceUtil { * @return list of FastaSequence objects * @throws IOException */ - public static List readFasta(InputStream inStream) + public static List readFasta(final InputStream inStream) throws IOException { - List seqs = new ArrayList(); - InputStreamReader inReader = new InputStreamReader(inStream); - BufferedReader infasta = new BufferedReader(inReader); - Pattern pattern = Pattern.compile("//s+"); + final List seqs = new ArrayList(); + + final BufferedReader infasta = new BufferedReader( + new InputStreamReader(inStream, "UTF8"), 16000); + final Pattern pattern = Pattern.compile("//s+"); String line; String sname = "", seqstr = null; do { line = infasta.readLine(); - if (line == null || line.startsWith(">")) { - if (seqstr != null) + if ((line == null) || line.startsWith(">")) { + if (seqstr != null) { seqs.add(new 
FastaSequence(sname.substring(1), seqstr)); + } sname = line; // remove > seqstr = ""; } else { - String subseq = pattern.matcher(line).replaceAll(""); + final String subseq = pattern.matcher(line).replaceAll(""); seqstr += subseq; } } while (line != null); - inReader.close(); + infasta.close(); return seqs; } @@ -275,17 +282,103 @@ public final class SequenceUtil { * @param sequences * @throws IOException */ - public static void writeFasta(OutputStream os, List sequences) - throws IOException { - OutputStreamWriter outWriter = new OutputStreamWriter(os); - BufferedWriter fasta_out = new BufferedWriter(outWriter); - for (FastaSequence fs : sequences) { + public static void writeFasta(final OutputStream os, + final List sequences) throws IOException { + final OutputStreamWriter outWriter = new OutputStreamWriter(os); + final BufferedWriter fasta_out = new BufferedWriter(outWriter); + for (final FastaSequence fs : sequences) { fasta_out.write(fs.getOnelineFasta()); } fasta_out.close(); outWriter.close(); } + public static List readJRonn(final File result) + throws IOException, UnknownFileFormatException { + InputStream input = new FileInputStream(result); + List sequences = readJRonn(input); + input.close(); + return sequences; + } + + /** + * Reader for JRonn horizontal file format + * + * >Foobar + * + * M G D T T A G + * + * 0.48 0.42 0.42 0.48 0.52 0.53 0.54 + * + * All values are tab delimited + * + * @param inStream + * @return + * @throws IOException + * @throws UnknownFileFormatException + */ + public static List readJRonn(final InputStream inStream) + throws IOException, UnknownFileFormatException { + final List seqs = new ArrayList(); + + final BufferedReader infasta = new BufferedReader( + new InputStreamReader(inStream, "UTF8"), 16000); + + String line; + String sname = ""; + do { + line = infasta.readLine(); + if (line == null || line.isEmpty()) { + // skip empty lines + continue; + } + if (line.startsWith(">")) { + // read name + sname = 
line.trim().substring(1); + // read sequence line + line = infasta.readLine(); + final String sequence = line.replace("\t", ""); + // read annotation line + line = infasta.readLine(); + String[] annotValues = line.split("\t"); + float[] annotation = convertToNumber(annotValues); + if (annotation.length != sequence.length()) { + throw new UnknownFileFormatException( + "File does not look like Jronn horizontally formatted output file!\n" + + JRONN_WRONG_FORMAT_MESSAGE); + } + seqs.add(new AnnotatedSequence(sname, sequence, annotation)); + } + } while (line != null); + + infasta.close(); + return seqs; + } + + private static float[] convertToNumber(String[] annotValues) + throws UnknownFileFormatException { + float[] annotation = new float[annotValues.length]; + try { + for (int i = 0; i < annotation.length; i++) { + annotation[i] = Float.parseFloat(annotValues[i]); + } + } catch (NumberFormatException e) { + throw new UnknownFileFormatException(JRONN_WRONG_FORMAT_MESSAGE, e + .getCause()); + } + return annotation; + } + + private static final String JRONN_WRONG_FORMAT_MESSAGE = "Jronn file must be in the following format:\n" + + ">sequence_name\n " + + "M V S\n" + + "0.43 0.22 0.65\n" + + "Where first line is the sequence name,\n" + + "second line is the tab delimited sequence,\n" + + "third line contains tab delimited disorder prediction values.\n" + + "No lines are allowed between these three. 
Additionally, the number of " + + "sequence residues must be equal to the number of the disorder values."; + /** * Closes the Closable and logs the exception if any * diff --git a/engine/compbio/engine/client/SkeletalExecutable.java b/engine/compbio/engine/client/SkeletalExecutable.java index b4f82fd..adc3e0b 100644 --- a/engine/compbio/engine/client/SkeletalExecutable.java +++ b/engine/compbio/engine/client/SkeletalExecutable.java @@ -31,233 +31,233 @@ import compbio.util.Util; public abstract class SkeletalExecutable implements Executable { - private static final PropertyHelper ph = PropertyHelperManager - .getPropertyHelper(); + protected static final PropertyHelper ph = PropertyHelperManager + .getPropertyHelper(); - private static Logger log = Logger.getLogger(SkeletalExecutable.class); + private static Logger log = Logger.getLogger(SkeletalExecutable.class); - protected String inputFile = "input.txt"; - protected String outputFile = "output.txt"; - protected String errorFile = "error.txt"; + protected String inputFile = "input.txt"; + protected String outputFile = "output.txt"; + protected String errorFile = "error.txt"; - private boolean isInputSet = false; - private boolean isOutputSet = false; - private boolean isErrorSet = false; + private boolean isInputSet = false; + private boolean isOutputSet = false; + private boolean isErrorSet = false; - /** - * This has to allow duplicate parameters as different options may have the - * same value e.g. Muscle -weight1 clustalw -weight2 clustalw - */ - protected CommandBuilder cbuilder; - - public SkeletalExecutable() { - cbuilder = new CommandBuilder(" "); - } + /** + * This has to allow duplicate parameters as different options may have the + * same value e.g. 
Muscle -weight1 clustalw -weight2 clustalw + */ + protected CommandBuilder cbuilder; - public SkeletalExecutable(String parameterKeyValueDelimiter) { - assert parameterKeyValueDelimiter != null; - cbuilder = new CommandBuilder(parameterKeyValueDelimiter); - } + public SkeletalExecutable() { + cbuilder = new CommandBuilder(" "); + } - public SkeletalExecutable setInput(String inFile) { - if (compbio.util.Util.isEmpty(inFile)) { - throw new IllegalArgumentException("Input file must not be NULL"); - } - this.inputFile = inFile; - this.isInputSet = true; - return this; - } + public SkeletalExecutable(String parameterKeyValueDelimiter) { + assert parameterKeyValueDelimiter != null; + cbuilder = new CommandBuilder(parameterKeyValueDelimiter); + } - public SkeletalExecutable setOutput(String outFile) { - if (compbio.util.Util.isEmpty(outFile) - || PathValidator.isAbsolutePath(outFile)) { - throw new IllegalArgumentException( - "Output file must not be NULL and Absolute path could not be used! Please provide the filename only. Value provided: " - + outFile); - } - this.outputFile = outFile; - this.isOutputSet = true; - return this; + public SkeletalExecutable setInput(String inFile) { + if (compbio.util.Util.isEmpty(inFile)) { + throw new IllegalArgumentException("Input file must not be NULL"); } - - public SkeletalExecutable setError(String errFile) { - if (compbio.util.Util.isEmpty(errFile) - || PathValidator.isAbsolutePath(errFile)) { - throw new IllegalArgumentException( - "Error file must not be NULL and Absolute path could not be used! Please provide the filename only. 
Value provided: " - + errFile); - } - this.errorFile = errFile; - this.isErrorSet = true; - return this; + this.inputFile = inFile; + this.isInputSet = true; + return this; + } + + public SkeletalExecutable setOutput(String outFile) { + if (compbio.util.Util.isEmpty(outFile) + || PathValidator.isAbsolutePath(outFile)) { + throw new IllegalArgumentException( + "Output file must not be NULL and Absolute path could not be used! Please provide the filename only. Value provided: " + + outFile); } - - @Override - public CommandBuilder getParameters(ExecProvider provider) { - /* - * Prevent modification of the parameters unintentionally. This is - * important to preserve executable parameters intact as engine could - * add things into the array as it see fit. For instance - * ExecutableWrapper (part of local engines) add command line as the - * first element of an array. - */ - paramValueUpdater(); - return cbuilder; + this.outputFile = outFile; + this.isOutputSet = true; + return this; + } + + public SkeletalExecutable setError(String errFile) { + if (compbio.util.Util.isEmpty(errFile) + || PathValidator.isAbsolutePath(errFile)) { + throw new IllegalArgumentException( + "Error file must not be NULL and Absolute path could not be used! Please provide the filename only. 
Value provided: " + + errFile); } - - @Override - public Executable addParameters(List parameters) { - cbuilder.addParams(parameters); - return this; - } - - public Executable setParameter(String parameter) { - cbuilder.setParam(parameter); - return this; - } - - /** - * This is a generic method of changing values of the parameters with - * properties - * - * This method iterates via commands for an executable finding matches from - * the Executable.properties file and replacing values in CommandBuilder - * with a combination of value from CommandBuilder to merge path from - * properties + this.errorFile = errFile; + this.isErrorSet = true; + return this; + } + + @Override + public CommandBuilder getParameters(ExecProvider provider) { + /* + * Prevent modification of the parameters unintentionally. This is + * important to preserve executable parameters intact as engine could + * add things into the array as it see fit. For instance + * ExecutableWrapper (part of local engines) add command line as the + * first element of an array. 
*/ - void paramValueUpdater() { - for (Parameter command : cbuilder.getCommandList()) { - if (command.value == null) { - continue; - } - String propertyPath = compbio.engine.client.Util.getExecProperty( - command.name + ".path", getType()); - if (Util.isEmpty(propertyPath)) { - continue; - } - if (new File(command.value).isAbsolute()) { - // Matrix can be found so no actions necessary - // This method has been called already and the matrix name - // is modified to contain full path // no further actions is - // necessary - continue; - } - String absMatrixPath = compbio.engine.client.Util - .convertToAbsolute(propertyPath); - command.value = absMatrixPath + File.separator + command.value; - cbuilder.setParam(command); - } + paramValueUpdater(); + return cbuilder; + } + + @Override + public Executable addParameters(List parameters) { + cbuilder.addParams(parameters); + return this; + } + + public Executable setParameter(String parameter) { + cbuilder.setParam(parameter); + return this; + } + + /** + * This is a generic method of changing values of the parameters with + * properties + * + * This method iterates via commands for an executable finding matches from + * the Executable.properties file and replacing values in CommandBuilder + * with a combination of value from CommandBuilder to merge path from + * properties + */ + void paramValueUpdater() { + for (Parameter command : cbuilder.getCommandList()) { + if (command.value == null) { + continue; + } + String propertyPath = compbio.engine.client.Util.getExecProperty( + command.name + ".path", getType()); + if (Util.isEmpty(propertyPath)) { + continue; + } + if (new File(command.value).isAbsolute()) { + // Matrix can be found so no actions necessary + // This method has been called already and the matrix name + // is modified to contain full path // no further actions is + // necessary + continue; + } + String absMatrixPath = compbio.engine.client.Util + .convertToAbsolute(propertyPath); + command.value = 
absMatrixPath + File.separator + command.value; + cbuilder.setParam(command); } - - /** - * This method cannot really tell whether the files has actually been - * created or not. It must be overridden as required. - * - * @see compbio.engine.client.Executable#getCreatedFiles() - */ - @Override - public List getCreatedFiles() { - return Arrays.asList(getOutput(), getError()); + } + + /** + * This method cannot really tell whether the files has actually been + * created or not. It must be overridden as required. + * + * @see compbio.engine.client.Executable#getCreatedFiles() + */ + @Override + public List getCreatedFiles() { + return Arrays.asList(getOutput(), getError()); + } + + @Override + public String getInput() { + return inputFile; + } + + protected boolean isInputSet() { + return isInputSet; + } + + protected boolean isOutputSet() { + return isOutputSet; + } + + protected boolean isErrorSet() { + return isErrorSet; + } + + @Override + public String getOutput() { + return outputFile; + } + + @Override + public String getError() { + return errorFile; + } + + @Override + public String toString() { + String value = "Input: " + this.getInput() + "\n"; + value += "Output: " + this.getOutput() + "\n"; + value += "Error: " + this.getError() + "\n"; + value += "Class: " + this.getClass() + "\n"; + value += "Params: " + cbuilder + "\n"; + return value; + } + + @Override + public Executable loadRunConfiguration(RunConfiguration rconfig) { + if (!compbio.util.Util.isEmpty(rconfig.getOutput())) { + setOutput(rconfig.getOutput()); } - - @Override - public String getInput() { - return inputFile; + if (!compbio.util.Util.isEmpty(rconfig.getError())) { + setError(rconfig.getError()); } - - protected boolean isInputSet() { - return isInputSet; - } - - protected boolean isOutputSet() { - return isOutputSet; - } - - protected boolean isErrorSet() { - return isErrorSet; - } - - @Override - public String getOutput() { - return outputFile; + if 
(!compbio.util.Util.isEmpty(rconfig.getInput())) { + setInput(rconfig.getInput()); } - - @Override - public String getError() { - return errorFile; + this.cbuilder = (CommandBuilder) rconfig.getParameters(); + return this; + } + + @Override + public boolean equals(Object obj) { + if (obj == null) { + return false; } - - @Override - public String toString() { - String value = "Input: " + this.getInput() + "\n"; - value += "Output: " + this.getOutput() + "\n"; - value += "Error: " + this.getError() + "\n"; - value += "Class: " + this.getClass() + "\n"; - value += "Params: " + cbuilder + "\n"; - return value; + if (!(obj instanceof SkeletalExecutable)) { + return false; } - - @Override - public Executable loadRunConfiguration(RunConfiguration rconfig) { - if (!compbio.util.Util.isEmpty(rconfig.getOutput())) { - setOutput(rconfig.getOutput()); - } - if (!compbio.util.Util.isEmpty(rconfig.getError())) { - setError(rconfig.getError()); - } - if (!compbio.util.Util.isEmpty(rconfig.getInput())) { - setInput(rconfig.getInput()); - } - this.cbuilder = (CommandBuilder) rconfig.getParameters(); - return this; + SkeletalExecutable exec = (SkeletalExecutable) obj; + if (!Util.isEmpty(this.inputFile) && !Util.isEmpty(exec.inputFile)) { + if (!this.inputFile.equals(exec.inputFile)) { + return false; + } } - - @Override - public boolean equals(Object obj) { - if (obj == null) { - return false; - } - if (!(obj instanceof SkeletalExecutable)) { - return false; - } - SkeletalExecutable exec = (SkeletalExecutable) obj; - if (!Util.isEmpty(this.inputFile) && !Util.isEmpty(exec.inputFile)) { - if (!this.inputFile.equals(exec.inputFile)) { - return false; - } - } - if (!Util.isEmpty(this.outputFile) && !Util.isEmpty(exec.outputFile)) { - if (!this.outputFile.equals(exec.outputFile)) { - return false; - } - } - if (!Util.isEmpty(this.errorFile) && !Util.isEmpty(exec.errorFile)) { - if (!this.errorFile.equals(exec.errorFile)) { - return false; - } - } - if 
(!this.cbuilder.equals(exec.cbuilder)) { - return false; - } - return true; + if (!Util.isEmpty(this.outputFile) && !Util.isEmpty(exec.outputFile)) { + if (!this.outputFile.equals(exec.outputFile)) { + return false; + } } - - @Override - public int hashCode() { - int code = inputFile.hashCode(); - code += outputFile.hashCode(); - code += errorFile.hashCode(); - code *= this.cbuilder.hashCode(); - return code; + if (!Util.isEmpty(this.errorFile) && !Util.isEmpty(exec.errorFile)) { + if (!this.errorFile.equals(exec.errorFile)) { + return false; + } } - - public String getClusterSettings() { - String settings = ph.getProperty(getType().getSimpleName() - .toLowerCase() - + ".cluster.settings"); - return settings == null ? "" : settings; + if (!this.cbuilder.equals(exec.cbuilder)) { + return false; } - - public abstract Class> getType(); + return true; + } + + @Override + public int hashCode() { + int code = inputFile.hashCode(); + code += outputFile.hashCode(); + code += errorFile.hashCode(); + code *= this.cbuilder.hashCode(); + return code; + } + + public String getClusterSettings() { + String settings = ph.getProperty(getType().getSimpleName() + .toLowerCase() + + ".cluster.settings"); + return settings == null ? 
"" : settings; + } + + public abstract Class> getType(); } diff --git a/runner/compbio/runner/Util.java b/runner/compbio/runner/Util.java index a3c86c6..9475a0b 100644 --- a/runner/compbio/runner/Util.java +++ b/runner/compbio/runner/Util.java @@ -27,6 +27,7 @@ import java.util.List; import org.apache.log4j.Logger; import compbio.data.sequence.Alignment; +import compbio.data.sequence.AnnotatedSequence; import compbio.data.sequence.ClustalAlignmentUtil; import compbio.data.sequence.FastaSequence; import compbio.data.sequence.SequenceUtil; @@ -130,6 +131,23 @@ public final class Util { return ClustalAlignmentUtil.readClustalFile(cfile); } + public static final List readJronnFile( + String workDirectory, String clustFile) + throws UnknownFileFormatException, IOException, + FileNotFoundException, NullPointerException { + assert !compbio.util.Util.isEmpty(workDirectory); + assert !compbio.util.Util.isEmpty(clustFile); + File cfile = new File(compbio.engine.client.Util.getFullPath( + workDirectory, clustFile)); + log.trace("Jronn OUTPUT FILE PATH: " + cfile.getAbsolutePath()); + if (!(cfile.exists() && cfile.length() > 0)) { + throw new FileNotFoundException("Result for the jobId " + + workDirectory + " with file name " + clustFile + + " is not found!"); + } + return SequenceUtil.readJRonn(cfile); + } + public static void writeInput(List sequences, ConfiguredExecutable exec) { diff --git a/runner/compbio/runner/disorder/RonnWrapper.java b/runner/compbio/runner/disorder/Jronn.java similarity index 58% rename from runner/compbio/runner/disorder/RonnWrapper.java rename to runner/compbio/runner/disorder/Jronn.java index 0bd621e..0c4245a 100644 --- a/runner/compbio/runner/disorder/RonnWrapper.java +++ b/runner/compbio/runner/disorder/Jronn.java @@ -18,52 +18,59 @@ package compbio.runner.disorder; +import java.io.File; +import java.io.FileInputStream; import java.io.FileNotFoundException; import java.io.IOException; +import java.io.InputStream; import java.util.Arrays; import 
java.util.List; import org.apache.log4j.Logger; -import compbio.data.sequence.Alignment; +import compbio.data.sequence.AnnotatedSequence; +import compbio.data.sequence.SequenceUtil; import compbio.data.sequence.UnknownFileFormatException; import compbio.engine.client.Executable; -import compbio.engine.client.PipedExecutable; import compbio.engine.client.SkeletalExecutable; import compbio.metadata.Limit; import compbio.metadata.LimitsManager; import compbio.metadata.ResultNotAvailableException; import compbio.runner.Util; -public class RonnWrapper extends SkeletalExecutable implements - PipedExecutable { - /* - * RONN does not accept stdin the file name must be defined as parameter It - * can only analyse ONE sequence per run! (or may be not, but the results - * gets overriden!) FASTA format is accepted. - * - * To run it do the following: - * - * 1) copy ronn executables and task file to work directory - * - * 2) execute run processes one by one for each sequence - */ - - private static final String command = "/homes/pvtroshin/soft/RONNv3_fasta/Ronn_runner.sh"; +/** + * Command line + * + * java -Xmx512 -jar jronn_v3.jar -i=test_seq.txt -n=1 -o=out.txt -s=stat.out + * + * @author pvtroshin + * + */ +public class Jronn extends SkeletalExecutable { - private static Logger log = Logger.getLogger(RonnWrapper.class); + private static Logger log = Logger.getLogger(Jronn.class); // Cache for Limits information - private static LimitsManager limits; + private static LimitsManager limits; public static final String KEY_VALUE_SEPARATOR = Util.SPACE; + public static final String STAT_FILE = "stat.txt"; + + public Jronn() { + addParameters(Arrays.asList("-jar", getLibPath(), "-n=1", "-s=" + + STAT_FILE, "-f=H")); + } @SuppressWarnings("unchecked") @Override - public Alignment getResults(String workDirectory) + public List getResults(String workDirectory) throws ResultNotAvailableException { + List sequences = null; try { - return Util.readClustalFile(workDirectory, 
getOutput()); + InputStream inStream = new FileInputStream(new File(workDirectory, + getOutput())); + sequences = SequenceUtil.readJRonn(inStream); + inStream.close(); } catch (FileNotFoundException e) { log.error(e.getMessage(), e.getCause()); throw new ResultNotAvailableException(e); @@ -77,6 +84,23 @@ public class RonnWrapper extends SkeletalExecutable implements log.error(e.getMessage(), e.getCause()); throw new ResultNotAvailableException(e); } + return sequences; + } + + private static String getLibPath() { + + String settings = ph.getProperty("jronn.jar.file"); + if (compbio.util.Util.isEmpty(settings)) { + throw new NullPointerException( + "Please define jronn.jar.file property in Executable.properties file" + + "and initialize it with the location of jronn jar file"); + } + if (new File(settings).isAbsolute()) { + // Jronn jar can be found so no actions necessary + // no further actions is necessary + return settings; + } + return compbio.engine.client.Util.convertToAbsolute(settings); } @Override @@ -85,18 +109,25 @@ public class RonnWrapper extends SkeletalExecutable implements } @Override - public RonnWrapper setInput(String inFile) { - String input = getInput(); + public Jronn setInput(String inFile) { super.setInput(inFile); + cbuilder.setParam("-i=" + inFile); return this; } @Override - public Limit getLimit(String presetName) { + public Jronn setOutput(String outFile) { + super.setOutput(outFile); + cbuilder.setParam("-o=" + outFile); + return this; + } + + @Override + public Limit getLimit(String presetName) { if (limits == null) { limits = getLimits(); } - Limit limit = null; + Limit limit = null; if (limits != null) { // this returns default limit if preset is undefined! 
limit = limits.getLimitByName(presetName); @@ -112,7 +143,7 @@ public class RonnWrapper extends SkeletalExecutable implements } @Override - public LimitsManager getLimits() { + public LimitsManager getLimits() { // synchronise on static field synchronized (log) { if (limits == null) { @@ -126,4 +157,8 @@ public class RonnWrapper extends SkeletalExecutable implements public Class> getType() { return this.getClass(); } + + public static String getStatFile() { + return STAT_FILE; + } } diff --git a/testsrc/compbio/data/sequence/SequenceUtilTester.java b/testsrc/compbio/data/sequence/SequenceUtilTester.java index 3dd09ef..f2af670 100644 --- a/testsrc/compbio/data/sequence/SequenceUtilTester.java +++ b/testsrc/compbio/data/sequence/SequenceUtilTester.java @@ -36,80 +36,112 @@ import compbio.metadata.AllTestSuit; public class SequenceUtilTester { - @Test() - public void testisNonAmbNucleotideSequence() { - String dnaseq = "atgatTGACGCTGCTGatgtcgtgagtgga"; - assertTrue(SequenceUtil.isNonAmbNucleotideSequence(dnaseq)); - String dirtyDnaseq = "atgAGTggt\taGGTgc\ncgcACTgc gACtcgcGAt cgA "; - assertTrue(SequenceUtil.isNonAmbNucleotideSequence(dirtyDnaseq)); - String nonDna = "atgfctgatgcatgcatgatgctga"; - assertFalse(SequenceUtil.isNonAmbNucleotideSequence(nonDna)); - - nonDna = "atgc1tgatgcatgcatgatgctga"; - assertFalse(SequenceUtil.isNonAmbNucleotideSequence(nonDna)); - - nonDna = "ARLGRVRWTQQRHAEAAVLLQQASDAAPEHPGIALWLGHALEDAGQAEAAAAAYTRAHQL"; - assertFalse(SequenceUtil.isNonAmbNucleotideSequence(nonDna)); - // String ambDna = "AGTCRYMKSWHBVDN"; // see IUPAC Nucleotide Code - assertFalse(SequenceUtil.isNonAmbNucleotideSequence(nonDna)); - - } - - @Test() - public void testCleanSequence() { - String dirtySeq = "atgAGTggt\taGGTgc\ncgcAC\rTgc gACtcgcGAt cgA "; - assertEquals("atgAGTggtaGGTgccgcACTgcgACtcgcGAtcgA".toUpperCase(), - SequenceUtil.cleanSequence(dirtySeq)); - } - - @Test() - public void testDeepCleanSequence() { - String dirtySeq = 
"a!t?g.A;GTggt\ta12GGTgc\ncgc23AC\rTgc gAC<>.,?!|\\|/t@cg-c¬GA=_+(0){]}[:£$&^*\"t cgA "; - assertEquals("atgAGTggtaGGTgccgcACTgcgACtcgcGAtcgA".toUpperCase(), - SequenceUtil.deepCleanSequence(dirtySeq)); + @Test() + public void testisNonAmbNucleotideSequence() { + String dnaseq = "atgatTGACGCTGCTGatgtcgtgagtgga"; + assertTrue(SequenceUtil.isNonAmbNucleotideSequence(dnaseq)); + String dirtyDnaseq = "atgAGTggt\taGGTgc\ncgcACTgc gACtcgcGAt cgA "; + assertTrue(SequenceUtil.isNonAmbNucleotideSequence(dirtyDnaseq)); + String nonDna = "atgfctgatgcatgcatgatgctga"; + assertFalse(SequenceUtil.isNonAmbNucleotideSequence(nonDna)); + + nonDna = "atgc1tgatgcatgcatgatgctga"; + assertFalse(SequenceUtil.isNonAmbNucleotideSequence(nonDna)); + + nonDna = "ARLGRVRWTQQRHAEAAVLLQQASDAAPEHPGIALWLGHALEDAGQAEAAAAAYTRAHQL"; + assertFalse(SequenceUtil.isNonAmbNucleotideSequence(nonDna)); + // String ambDna = "AGTCRYMKSWHBVDN"; // see IUPAC Nucleotide Code + assertFalse(SequenceUtil.isNonAmbNucleotideSequence(nonDna)); + + } + + @Test() + public void testCleanSequence() { + String dirtySeq = "atgAGTggt\taGGTgc\ncgcAC\rTgc gACtcgcGAt cgA "; + assertEquals("atgAGTggtaGGTgccgcACTgcgACtcgcGAtcgA".toUpperCase(), + SequenceUtil.cleanSequence(dirtySeq)); + } + + @Test() + public void testDeepCleanSequence() { + String dirtySeq = "a!t?g.A;GTggt\ta12GGTgc\ncgc23AC\rTgc gAC<>.,?!|\\|/t@cg-c¬GA=_+(0){]}[:£$&^*\"t cgA "; + assertEquals("atgAGTggtaGGTgccgcACTgcgACtcgcGAtcgA".toUpperCase(), + SequenceUtil.deepCleanSequence(dirtySeq)); + } + + @Test() + public void testisProteinSequence() { + String dirtySeq = "atgAGTggt\taGGTgc\ncgcAC\rTgc gACtcgcGAt cgA "; + assertFalse(SequenceUtil.isProteinSequence(dirtySeq)); + String notaSeq = "atgc1tgatgcatgcatgatgctga"; + assertFalse(SequenceUtil.isProteinSequence(notaSeq)); + String AAseq = "ARLGRVRWTQQRHAEAAVLLQQASDAAPEHPGIALWLGHALEDAGQAEAAAAAYTRAHQL"; + assertTrue(SequenceUtil.isProteinSequence(AAseq)); + AAseq += "XU"; + 
assertFalse(SequenceUtil.isProteinSequence(AAseq)); + + } + + @Test() + public void testReadWriteFasta() { + + try { + FileInputStream fio = new FileInputStream( + AllTestSuit.TEST_DATA_PATH + "TO1381.fasta"); + assertNotNull(fio); + List fseqs = SequenceUtil.readFasta(fio); + assertNotNull(fseqs); + assertEquals(3, fseqs.size()); + assertEquals(3, fseqs.size()); + fio.close(); + FileOutputStream fou = new FileOutputStream( + AllTestSuit.TEST_DATA_PATH + "TO1381.fasta.written"); + SequenceUtil.writeFasta(fou, fseqs); + fou.close(); + FileOutputStream fou20 = new FileOutputStream( + AllTestSuit.TEST_DATA_PATH + "TO1381.fasta20.written"); + SequenceUtil.writeFasta(fou20, fseqs, 20); + fou20.close(); + + } catch (FileNotFoundException e) { + e.printStackTrace(); + fail(e.getLocalizedMessage()); + } catch (IOException e) { + e.printStackTrace(); + fail(e.getLocalizedMessage()); } - - @Test() - public void testisProteinSequence() { - String dirtySeq = "atgAGTggt\taGGTgc\ncgcAC\rTgc gACtcgcGAt cgA "; - assertFalse(SequenceUtil.isProteinSequence(dirtySeq)); - String notaSeq = "atgc1tgatgcatgcatgatgctga"; - assertFalse(SequenceUtil.isProteinSequence(notaSeq)); - String AAseq = "ARLGRVRWTQQRHAEAAVLLQQASDAAPEHPGIALWLGHALEDAGQAEAAAAAYTRAHQL"; - assertTrue(SequenceUtil.isProteinSequence(AAseq)); - AAseq += "XU"; - assertFalse(SequenceUtil.isProteinSequence(AAseq)); - + } + + /** + * This test tests the loading of horizontally formatted Jronn output file + */ + @Test + public void loadJronnFile() { + + FileInputStream fio; + try { + fio = new FileInputStream(AllTestSuit.TEST_DATA_PATH + "jronn.out"); + List aseqs = SequenceUtil.readJRonn(fio); + assertNotNull(aseqs); + assertEquals(aseqs.size(), 3); + AnnotatedSequence aseq = aseqs.get(0); + assertNotNull(aseq); + assertNotNull(aseq.getAnnotation()); + //System.out.println(aseq); + assertEquals(aseq.getAnnotation().length, aseq.getSequence() + .length()); + fio.close(); + } catch (FileNotFoundException e) { + 
e.printStackTrace(); + fail(e.getLocalizedMessage()); + } catch (IOException e) { + e.printStackTrace(); + fail(e.getLocalizedMessage()); + } catch (UnknownFileFormatException e) { + e.printStackTrace(); + fail(e.getLocalizedMessage()); } - @Test() - public void testReadWriteFasta() { - - try { - FileInputStream fio = new FileInputStream( - AllTestSuit.TEST_DATA_PATH + "TO1381.fasta"); - assertNotNull(fio); - List fseqs = SequenceUtil.readFasta(fio); - assertNotNull(fseqs); - assertEquals(3, fseqs.size()); - assertEquals(3, fseqs.size()); - fio.close(); - FileOutputStream fou = new FileOutputStream( - AllTestSuit.TEST_DATA_PATH + "TO1381.fasta.written"); - SequenceUtil.writeFasta(fou, fseqs); - fou.close(); - FileOutputStream fou20 = new FileOutputStream( - AllTestSuit.TEST_DATA_PATH + "TO1381.fasta20.written"); - SequenceUtil.writeFasta(fou20, fseqs, 20); - fou20.close(); - - } catch (FileNotFoundException e) { - e.printStackTrace(); - fail(e.getLocalizedMessage()); - } catch (IOException e) { - e.printStackTrace(); - fail(e.getLocalizedMessage()); - } - } + } } diff --git a/testsrc/compbio/runner/disorder/JronnTester.java b/testsrc/compbio/runner/disorder/JronnTester.java new file mode 100644 index 0000000..ba0b171 --- /dev/null +++ b/testsrc/compbio/runner/disorder/JronnTester.java @@ -0,0 +1,332 @@ +/* Copyright (c) 2009 Peter Troshin + * + * JAva Bioinformatics Analysis Web Services (JABAWS) @version: 1.0 + * + * This library is free software; you can redistribute it and/or modify it under the terms of the + * Apache License version 2 as published by the Apache Software Foundation + * + * This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without + * even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the Apache + * License for more details. + * + * A copy of the license is in apache_license.txt. 
It is also available here: + * @see: http://www.apache.org/licenses/LICENSE-2.0.txt + * + * Any republication or derived work distributed in source code form + * must include this copyright and license notice. + */ + +package compbio.runner.disorder; + +import static org.testng.Assert.assertEquals; +import static org.testng.Assert.assertFalse; +import static org.testng.Assert.assertNotNull; +import static org.testng.Assert.assertTrue; +import static org.testng.Assert.fail; + +import java.io.File; +import java.io.FileInputStream; +import java.io.FileNotFoundException; +import java.io.IOException; +import java.text.ParseException; +import java.util.List; + +import javax.xml.bind.ValidationException; + +import org.ggf.drmaa.DrmaaException; +import org.ggf.drmaa.JobInfo; +import org.testng.annotations.BeforeMethod; +import org.testng.annotations.Test; + +import compbio.data.sequence.AnnotatedSequence; +import compbio.engine.AsyncExecutor; +import compbio.engine.Configurator; +import compbio.engine.FilePuller; +import compbio.engine.SyncExecutor; +import compbio.engine.client.ConfExecutable; +import compbio.engine.client.ConfiguredExecutable; +import compbio.engine.client.Executable; +import compbio.engine.client.RunConfiguration; +import compbio.engine.cluster.drmaa.ClusterUtil; +import compbio.engine.cluster.drmaa.JobRunner; +import compbio.engine.cluster.drmaa.StatisticManager; +import compbio.engine.local.LocalRunner; +import compbio.metadata.AllTestSuit; +import compbio.metadata.ChunkHolder; +import compbio.metadata.JobExecutionException; +import compbio.metadata.JobStatus; +import compbio.metadata.JobSubmissionException; +import compbio.metadata.LimitsManager; +import compbio.metadata.PresetManager; +import compbio.metadata.ResultNotAvailableException; +import compbio.metadata.RunnerConfig; +import compbio.util.FileWatcher; +import compbio.util.SysPrefs; + +public class JronnTester { + + public static String test_outfile = "TO1381.jronn.out"; // 
"/homes/pvtroshin/TO1381.clustal.cluster.out + + private Jronn jronn; + + @BeforeMethod(alwaysRun = true) + void init() { + jronn = new Jronn(); + jronn.setInput(AllTestSuit.test_input).setOutput(test_outfile); + } + + @Test(groups = { AllTestSuit.test_group_cluster, + AllTestSuit.test_group_runner }) + public void testRunOnCluster() { + assertFalse(SysPrefs.isWindows, + "Cluster execution can only be in unix environment"); + try { + ConfiguredExecutable confJronn = Configurator + .configureExecutable(jronn, Executable.ExecProvider.Cluster); + JobRunner runner = JobRunner.getInstance(confJronn); + + assertNotNull(runner, "Runner is NULL"); + runner.executeJob(); + // assertNotNull("JobId is null", jobId1); + JobStatus status = runner.getJobStatus(); + assertTrue(status == JobStatus.PENDING + || status == JobStatus.RUNNING, + "Status of the process is wrong!"); + JobInfo info = runner.getJobInfo(); + assertNotNull(info, "JobInfo is null"); + StatisticManager sm = new StatisticManager(info); + assertNotNull(sm, "Statictic manager is null"); + try { + + String exits = sm.getExitStatus(); + assertNotNull("Exit status is null", exits); + // cut 4 trailing zeros from the number + int exitsInt = ClusterUtil.CLUSTER_STAT_IN_SEC.parse(exits) + .intValue(); + assertEquals(0, exitsInt); + System.out.println(sm.getAllStats()); + + } catch (ParseException e) { + e.printStackTrace(); + fail("Parse Exception: " + e.getMessage()); + } + assertFalse(runner.cleanup()); + assertTrue(sm.hasExited()); + assertFalse(sm.wasAborted()); + assertFalse(sm.hasDump()); + assertFalse(sm.hasSignaled()); + + } catch (JobSubmissionException e) { + e.printStackTrace(); + fail("DrmaaException caught:" + e.getMessage()); + } catch (JobExecutionException e) { + e.printStackTrace(); + fail("DrmaaException caught:" + e.getMessage()); + } catch (DrmaaException e) { + e.printStackTrace(); + fail("DrmaaException caught:" + e.getMessage()); + } + } + + /** + * This tests fails from time to time depending on 
the cluster load or some + * other factors. Any client code has to adjust for this issue + */ + @Test(enabled = false, groups = { AllTestSuit.test_group_cluster, + AllTestSuit.test_group_runner }) + public void testRunOnClusterAsync() { + assertFalse(SysPrefs.isWindows, + "Cluster execution can only be in unix environment"); + try { + ConfiguredExecutable confJronn = Configurator + .configureExecutable(jronn, Executable.ExecProvider.Cluster); + AsyncExecutor aengine = Configurator.getAsyncEngine(confJronn); + String jobId = aengine.submitJob(confJronn); + assertNotNull(jobId, "Job ID is NULL"); + // give DRMAA time to start + Thread.sleep(500); + JobStatus status = aengine.getJobStatus(jobId); + while (status != JobStatus.FINISHED + && status != JobStatus.UNDEFINED) { + System.out.println("Job Status: " + status); + Thread.sleep(1000); + status = aengine.getJobStatus(jobId); + } + } catch (JobSubmissionException e) { + e.printStackTrace(); + fail("JobSubmissionException caught: " + e.getMessage()); + } catch (InterruptedException e) { + e.printStackTrace(); + fail(e.getMessage()); + } + } + + @Test(groups = { AllTestSuit.test_group_runner }) + public void testRunLocally() { + try { + ConfiguredExecutable confJronn = Configurator + .configureExecutable(jronn, Executable.ExecProvider.Local); + + // For local execution use relative + LocalRunner lr = new LocalRunner(confJronn); + lr.executeJob(); + ConfiguredExecutable al1 = lr.waitForResult(); + assertNotNull(al1.getResults()); + List al2 = confJronn.getResults(); + assertNotNull(al2); + assertEquals(al2.size(), 3); + assertEquals(al1.getResults(), al2); + } catch (JobSubmissionException e) { + e.printStackTrace(); + fail(e.getLocalizedMessage()); + } catch (ResultNotAvailableException e) { + e.printStackTrace(); + fail(e.getLocalizedMessage()); + } catch (JobExecutionException e) { + e.printStackTrace(); + fail(e.getLocalizedMessage()); + } + } + + @Test(groups = { AllTestSuit.test_group_runner }) + public void readStatistics() { 
+ try { + ConfiguredExecutable confJronn = Configurator + .configureExecutable(jronn, Executable.ExecProvider.Local); + // For local execution use relavive + + AsyncExecutor sexec = Configurator.getAsyncEngine(confJronn); + String jobId = sexec.submitJob(confJronn); + FilePuller fw = FilePuller.newFilePuller(confJronn + .getWorkDirectory() + + File.separator + Jronn.getStatFile(), + FileWatcher.MIN_CHUNK_SIZE_BYTES); + int count = 0; + long position = 0; + fw.waitForFile(4); + while (!(sexec.getJobStatus(jobId) == JobStatus.FINISHED + || sexec.getJobStatus(jobId) == JobStatus.FAILED || sexec + .getJobStatus(jobId) == JobStatus.UNDEFINED) + || fw.hasMoreData()) { + ChunkHolder ch = fw.pull(position); + String chunk = ch.getChunk(); + position = ch.getNextPosition(); + System.out.print(chunk); + count++; + } + assertTrue(count > 1); + ConfiguredExecutable al = sexec.getResults(jobId); + assertNotNull(al.getResults()); + } catch (JobSubmissionException e) { + e.printStackTrace(); + fail(e.getMessage()); + } catch (ResultNotAvailableException e) { + e.printStackTrace(); + fail(e.getMessage()); + } catch (IOException e) { + e.printStackTrace(); + fail(e.getMessage()); + } + } + + @Test(groups = { AllTestSuit.test_group_runner }) + public void testPersistance() { + try { + Jronn jronn = new Jronn(); + jronn.setError("errrr.txt").setInput(AllTestSuit.test_input) + .setOutput("outtt.txt"); + assertEquals(jronn.getInput(), AllTestSuit.test_input); + assertEquals(jronn.getError(), "errrr.txt"); + assertEquals(jronn.getOutput(), "outtt.txt"); + ConfiguredExecutable cJronn = Configurator + .configureExecutable(jronn, Executable.ExecProvider.Local); + + SyncExecutor sexec = Configurator.getSyncEngine(cJronn); + sexec.executeJob(); + ConfiguredExecutable al = sexec.waitForResult(); + assertNotNull(al.getResults()); + // Save run configuration + assertTrue(cJronn.saveRunConfiguration()); + + // See if loaded configuration is the same as saved + RunConfiguration loadedRun = 
RunConfiguration + .load(new FileInputStream(new File(cJronn + .getWorkDirectory(), RunConfiguration.rconfigFile))); + assertEquals( + ((ConfExecutable) cJronn).getRunConfiguration(), + loadedRun); + // Load run configuration as ConfExecutable + ConfiguredExecutable resurrectedCMuscle = (ConfiguredExecutable) cJronn + .loadRunConfiguration(new FileInputStream(new File(cJronn + .getWorkDirectory(), RunConfiguration.rconfigFile))); + assertNotNull(resurrectedCMuscle); + assertEquals(resurrectedCMuscle.getExecutable().getInput(), + AllTestSuit.test_input); + assertEquals(resurrectedCMuscle.getExecutable().getError(), + "errrr.txt"); + assertEquals(resurrectedCMuscle.getExecutable().getOutput(), + "outtt.txt"); + // See in details whether executables are the same + assertEquals(resurrectedCMuscle.getExecutable(), jronn); + + ConfiguredExecutable resJronn = Configurator + .configureExecutable(resurrectedCMuscle.getExecutable(), + Executable.ExecProvider.Local); + + sexec = Configurator.getSyncEngine(resJronn, + Executable.ExecProvider.Local); + sexec.executeJob(); + al = sexec.waitForResult(); + assertNotNull(al); + + } catch (JobSubmissionException e) { + e.printStackTrace(); + fail(e.getMessage()); + } catch (JobExecutionException e) { + e.printStackTrace(); + fail(e.getMessage()); + } catch (FileNotFoundException e) { + e.printStackTrace(); + fail(e.getMessage()); + } catch (IOException e) { + e.printStackTrace(); + fail(e.getMessage()); + } catch (ResultNotAvailableException e) { + e.printStackTrace(); + fail(e.getMessage()); + } + } + + @Test(groups = { AllTestSuit.test_group_runner }) + public void testConfigurationLoading() { + try { + RunnerConfig jronnConfig = ConfExecutable + .getRunnerOptions(Jronn.class); + assertNotNull(jronnConfig); + assertTrue(jronnConfig.getArguments().size() > 0); + + PresetManager jronnPresets = ConfExecutable + .getRunnerPresets(Jronn.class); + assertNotNull(jronnPresets); + assertTrue(jronnPresets.getPresets().size() > 0); + 
jronnPresets.validate(jronnConfig); + + LimitsManager jronnLimits = ConfExecutable + .getRunnerLimits(Jronn.class); + assertNotNull(jronnLimits); + assertTrue(jronnLimits.getLimits().size() > 0); + jronnLimits.validate(jronnPresets); + + } catch (FileNotFoundException e) { + e.printStackTrace(); + fail(e.getLocalizedMessage()); + } catch (IOException e) { + e.printStackTrace(); + fail(e.getLocalizedMessage()); + } catch (ValidationException e) { + e.printStackTrace(); + fail(e.getLocalizedMessage()); + } + } + +} diff --git a/testsrc/testdata/jronn.out b/testsrc/testdata/jronn.out new file mode 100644 index 0000000..2752911 --- /dev/null +++ b/testsrc/testdata/jronn.out @@ -0,0 +1,13 @@ + +>Foobar_dundeefriends +M T A D G P R E L L Q L R A A V R H R P Q D F V A W L M L A D A E L G M G D T T A G E M A V Q R G L A L H P G H P E A V A R L G R V R W T Q Q R H A E A A V L L Q Q A S D A A P E H P G I A L W L G H A L E D A G Q A E A A A A A Y T R A H Q L L P E E P Y I T A Q L L N W R R R L C D W R A L D V L S A Q V R A A V A Q G V G A V E P F A F L S E D A S A A E Q L A C A R T R A Q A I A A S V R P L A P T R V R S K G P L R V G F V S N G F G A H P T G L L T V A L F E A L Q R R Q P D L Q M H L F A T S G D D G S T L R T R L A Q A S T L H D V T A L G H L A T A K H I R H H G I D L L F D L R G W G G G G R P E V F A L R P A P V Q V N W L A Y P G T S G A P W M D Y V L G D A F A L P P A L E P F Y S E H V L R L Q G A F Q P S D T S R V V A E P P S R T Q C G L P E Q G V V L C C F N N S Y K L N P Q S M A R M L A V L R E V P D S V L W L L S G P G E A D A R L R A F A H A Q G V D A Q R L V F M P K L P H P Q Y L A R Y R H A D L F L D T H P Y N A H T T A S D A L W T G C P V L T T P G E T F A A R V A G S L N H H L G L D E M N V A D D A A F V A K A V A L A S D P A A L T A L H A R V D V L R R E S G V F E M D G F A D D F G A L L Q A L A R R H G W L G I +0.39 0.42 0.46 0.45 0.44 0.42 0.41 0.4 0.4 0.39 0.37 0.37 0.37 0.37 0.38 0.38 0.38 0.38 0.38 0.37 0.36 0.35 0.35 0.34 0.35 0.36 0.37 0.37 
0.38 0.41 0.43 0.44 0.45 0.45 0.46 0.46 0.46 0.46 0.47 0.49 0.51 0.51 0.52 0.52 0.52 0.52 0.52 0.51 0.5 0.48 0.49 0.48 0.49 0.48 0.48 0.49 0.5 0.49 0.49 0.47 0.46 0.45 0.45 0.45 0.44 0.43 0.44 0.44 0.46 0.47 0.49 0.49 0.5 0.52 0.52 0.51 0.5 0.49 0.49 0.5 0.52 0.53 0.53 0.55 0.54 0.54 0.53 0.51 0.48 0.46 0.45 0.43 0.41 0.4 0.4 0.4 0.42 0.43 0.43 0.43 0.44 0.44 0.44 0.46 0.46 0.46 0.46 0.47 0.47 0.47 0.47 0.48 0.48 0.48 0.48 0.46 0.45 0.44 0.42 0.39 0.37 0.35 0.32 0.31 0.3 0.28 0.26 0.26 0.25 0.24 0.23 0.23 0.23 0.23 0.24 0.24 0.25 0.26 0.27 0.29 0.3 0.32 0.34 0.34 0.36 0.37 0.38 0.4 0.41 0.42 0.42 0.42 0.42 0.42 0.42 0.43 0.43 0.44 0.44 0.45 0.45 0.45 0.46 0.46 0.47 0.48 0.48 0.5 0.51 0.53 0.54 0.56 0.57 0.57 0.58 0.58 0.58 0.57 0.58 0.58 0.59 0.59 0.59 0.59 0.59 0.59 0.59 0.58 0.59 0.59 0.6 0.6 0.59 0.59 0.59 0.6 0.6 0.6 0.59 0.58 0.58 0.59 0.59 0.58 0.57 0.55 0.53 0.51 0.49 0.45 0.43 0.42 0.4 0.38 0.36 0.34 0.33 0.34 0.34 0.33 0.32 0.3 0.29 0.29 0.28 0.29 0.29 0.29 0.3 0.32 0.34 0.36 0.38 0.41 0.42 0.43 0.43 0.44 0.45 0.46 0.48 0.49 0.52 0.55 0.58 0.58 0.59 0.58 0.58 0.58 0.57 0.56 0.55 0.56 0.55 0.54 0.53 0.52 0.51 0.5 0.49 0.47 0.45 0.44 0.45 0.45 0.46 0.46 0.45 0.45 0.44 0.44 0.44 0.44 0.45 0.45 0.45 0.44 0.44 0.44 0.44 0.43 0.42 0.4 0.39 0.38 0.37 0.38 0.38 0.38 0.37 0.36 0.37 0.36 0.36 0.36 0.37 0.37 0.37 0.36 0.36 0.37 0.38 0.39 0.39 0.4 0.4 0.4 0.39 0.38 0.38 0.36 0.35 0.34 0.34 0.33 0.33 0.33 0.33 0.32 0.31 0.31 0.3 0.3 0.3 0.3 0.3 0.3 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.37 0.37 0.38 0.4 0.42 0.42 0.43 0.44 0.44 0.44 0.44 0.45 0.46 0.47 0.49 0.5 0.52 0.52 0.54 0.56 0.58 0.6 0.62 0.63 0.63 0.63 0.64 0.65 0.65 0.65 0.64 0.62 0.59 0.57 0.54 0.51 0.49 0.46 0.44 0.41 0.39 0.37 0.35 0.32 0.31 0.29 0.27 0.26 0.28 0.29 0.29 0.3 0.3 0.3 0.3 0.3 0.31 0.3 0.3 0.3 0.3 0.29 0.29 0.29 0.3 0.3 0.3 0.28 0.27 0.27 0.28 0.28 0.28 0.29 0.3 0.31 0.33 0.34 0.35 0.36 0.37 0.39 0.39 0.39 0.39 0.38 0.38 0.39 0.38 0.38 0.38 0.39 0.4 0.39 0.38 0.38 0.38 0.37 0.36 0.35 0.33 0.34 
0.34 0.35 0.35 0.35 0.35 0.34 0.34 0.34 0.32 0.3 0.31 0.31 0.31 0.31 0.32 0.33 0.32 0.33 0.32 0.32 0.32 0.34 0.34 0.35 0.36 0.37 0.38 0.4 0.42 0.43 0.44 0.44 0.45 0.44 0.44 0.45 0.46 0.47 0.47 0.47 0.47 0.46 0.48 0.48 0.49 0.48 0.48 0.45 0.43 0.41 0.39 0.38 0.38 0.38 0.39 0.39 0.38 0.38 0.37 0.36 0.36 0.34 0.33 0.33 0.34 0.34 0.36 0.38 0.39 0.4 0.41 0.42 0.42 0.41 0.41 0.42 0.43 0.44 0.44 0.45 0.45 0.44 0.43 0.42 0.41 0.4 0.39 0.37 0.36 0.36 0.35 0.35 0.34 0.33 0.32 0.32 0.32 0.32 0.33 0.33 0.33 0.32 0.31 0.3 0.29 0.29 0.29 0.3 0.3 0.29 0.28 0.27 0.26 0.26 0.25 0.25 0.23 0.21 0.21 0.2 0.21 0.22 0.22 0.23 0.24 0.23 0.2 0.18 0.16 0.15 0.15 +>Foobar +M G D T T A G E M A V Q R G L A L H Q Q R H A E A A V L L Q Q A S D A A P E H P G I A L W L H A L E D A G Q A E A A A A Y T R A H Q L L P E E P Y I T A Q L L N A V A Q G V G A V E P F A F L S E D A S A A E S V R P L A P T R V R S K G P L R V G F V S N G F G A H P T G L L T V A L F E A L Q R R Q P D L Q M H L F A T S G D D G S T L R T R L A Q A S T L H D V T A L G H L A T A K H I R H H G I D L L F D L R G W G G G G R P E V F A L R P A P V Q V N W L A Y P G T S G A P W M D Y V L G D A F A L P P A L E P F Y S E H V L R L Q G A F Q P S D T S R V V A E P P S R T Q C G L P E Q G V V L C C F N N S Y K L N P Q S M A R M L A V L R E V P D S V L W L L S G P G E A D A R L R A F A H A Q G V D A Q R L V F M P K L P H P Q Y L A R Y R H A D L F L D T H P Y N A H T T A S D A L W T G C P V L T T P G E T F A A R V A G S L N H H L G L D E M N V A D D A A F V A K A V A L A S D P A A L T A L H A R V D V L R R E S G V F E M D G F A D D F G A L L Q A L A R R H G W L G I +0.48 0.42 0.42 0.48 0.52 0.53 0.54 0.53 0.52 0.5 0.49 0.49 0.49 0.48 0.47 0.47 0.47 0.49 0.51 0.53 0.54 0.55 0.55 0.54 0.53 0.52 0.51 0.5 0.51 0.52 0.51 0.51 0.52 0.52 0.52 0.51 0.5 0.48 0.45 0.45 0.43 0.43 0.42 0.42 0.43 0.45 0.47 0.47 0.47 0.47 0.47 0.46 0.47 0.47 0.49 0.48 0.47 0.47 0.47 0.47 0.46 0.46 0.46 0.45 0.44 0.42 0.41 0.41 0.41 0.41 0.4 0.4 0.39 0.39 0.4 0.4 0.39 
0.38 0.38 0.38 0.38 0.38 0.38 0.38 0.39 0.41 0.42 0.42 0.43 0.44 0.44 0.44 0.46 0.46 0.48 0.5 0.52 0.55 0.57 0.59 0.6 0.61 0.62 0.63 0.63 0.63 0.63 0.63 0.64 0.64 0.64 0.63 0.62 0.61 0.61 0.61 0.59 0.57 0.55 0.53 0.51 0.49 0.45 0.43 0.42 0.4 0.38 0.36 0.34 0.33 0.34 0.34 0.33 0.32 0.3 0.29 0.29 0.28 0.29 0.29 0.29 0.3 0.32 0.34 0.36 0.38 0.41 0.42 0.43 0.43 0.44 0.45 0.46 0.48 0.49 0.52 0.55 0.58 0.58 0.59 0.58 0.58 0.58 0.57 0.56 0.55 0.56 0.55 0.54 0.53 0.52 0.51 0.5 0.49 0.47 0.45 0.44 0.45 0.45 0.46 0.46 0.45 0.45 0.44 0.44 0.44 0.44 0.45 0.45 0.45 0.44 0.44 0.44 0.44 0.43 0.42 0.4 0.39 0.38 0.37 0.38 0.38 0.38 0.37 0.36 0.37 0.36 0.36 0.36 0.37 0.37 0.37 0.36 0.36 0.37 0.38 0.39 0.39 0.4 0.4 0.4 0.39 0.38 0.38 0.36 0.35 0.34 0.34 0.33 0.33 0.33 0.33 0.32 0.31 0.31 0.3 0.3 0.3 0.3 0.3 0.3 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.37 0.37 0.38 0.4 0.42 0.42 0.43 0.44 0.44 0.44 0.44 0.45 0.46 0.47 0.49 0.5 0.52 0.52 0.54 0.56 0.58 0.6 0.62 0.63 0.63 0.63 0.64 0.65 0.65 0.65 0.64 0.62 0.59 0.57 0.54 0.51 0.49 0.46 0.44 0.41 0.39 0.37 0.35 0.32 0.31 0.29 0.27 0.26 0.28 0.29 0.29 0.3 0.3 0.3 0.3 0.3 0.31 0.3 0.3 0.3 0.3 0.29 0.29 0.29 0.3 0.3 0.3 0.28 0.27 0.27 0.28 0.28 0.28 0.29 0.3 0.31 0.33 0.34 0.35 0.36 0.37 0.39 0.39 0.39 0.39 0.38 0.38 0.39 0.38 0.38 0.38 0.39 0.4 0.39 0.38 0.38 0.38 0.37 0.36 0.35 0.33 0.34 0.34 0.35 0.35 0.35 0.35 0.34 0.34 0.34 0.32 0.3 0.31 0.31 0.31 0.31 0.32 0.33 0.32 0.33 0.32 0.32 0.32 0.34 0.34 0.35 0.36 0.37 0.38 0.4 0.42 0.43 0.44 0.44 0.45 0.44 0.44 0.45 0.46 0.47 0.47 0.47 0.47 0.46 0.48 0.48 0.49 0.48 0.48 0.45 0.43 0.41 0.39 0.38 0.38 0.38 0.39 0.39 0.38 0.38 0.37 0.36 0.36 0.34 0.33 0.33 0.34 0.34 0.36 0.38 0.39 0.4 0.41 0.42 0.42 0.41 0.41 0.42 0.43 0.44 0.44 0.45 0.45 0.44 0.43 0.42 0.41 0.4 0.39 0.37 0.36 0.36 0.35 0.35 0.34 0.33 0.32 0.32 0.32 0.32 0.33 0.33 0.33 0.32 0.31 0.3 0.29 0.29 0.29 0.3 0.3 0.29 0.28 0.27 0.26 0.26 0.25 0.25 0.23 0.21 0.21 0.2 0.21 0.22 0.22 0.23 0.24 0.23 0.2 0.18 0.16 0.15 0.15 + + +>dundeefriends 
+M T A D G P R E L L Q L R A A V R H R P Q D V A W L M L A D A E L G M G D T T A G E M A V Q R G L A L H P G H P E A V A R L G R V R W T Q Q R H A E A A V L L Q Q A S D A A P E H P G I A L W L G H A L E D H Q L L P E E P Y I T A Q L D V L S A Q V R A A V A Q G V G A V E P F A F L S E D A S A A E Q L A C A R T R A Q A I A A S V R P L A P T R V R S K G P L R V G F V S N G F G A H P T G L L T V A L F E A L Q R R Q P D L Q M H L F A T S G D D G S T L R T R L A Q A S T L H D V T A L G H L A T A K H I R H H G I D L L F D L R G W G G G G R P E V F A L R P A P V Q V N W L A Y P G T S G A P W M D Y V L G D A F A L P P A L E P F Y S E H V L R L Q G A F Q P S D T S R V V A E P P S R T Q C G L P E Q G V V L C C F N N S Y K L N P Q S M A R M L A V L R E V P D S V L W L L S G P G E A D A R L R A F A H A Q G V D A Q R L V F M P K L P H P Q Y L A R Y R H A D L F L D T H P Y N A H T T A S D A L W T G C P V L T T P G E T F A A R V A G S L N H H L G L D E M N V A D D A A F V A K A V A L A S D P A A L T A L H A R V D V L R R E S I +0.39 0.42 0.46 0.45 0.44 0.42 0.4 0.37 0.37 0.36 0.35 0.34 0.33 0.33 0.32 0.32 0.32 0.32 0.31 0.3 0.29 0.28 0.28 0.28 0.29 0.3 0.32 0.32 0.35 0.38 0.39 0.41 0.42 0.44 0.45 0.44 0.45 0.47 0.49 0.5 0.51 0.52 0.52 0.52 0.52 0.52 0.51 0.5 0.48 0.49 0.48 0.49 0.48 0.48 0.49 0.5 0.49 0.49 0.47 0.46 0.45 0.45 0.45 0.44 0.43 0.44 0.44 0.46 0.47 0.49 0.49 0.5 0.52 0.52 0.51 0.5 0.49 0.49 0.5 0.52 0.53 0.53 0.55 0.54 0.54 0.53 0.52 0.49 0.46 0.44 0.43 0.41 0.41 0.4 0.4 0.41 0.42 0.41 0.4 0.39 0.38 0.37 0.37 0.37 0.35 0.34 0.35 0.35 0.35 0.36 0.36 0.37 0.38 0.38 0.39 0.38 0.39 0.38 0.39 0.4 0.4 0.41 0.41 0.42 0.42 0.42 0.43 0.44 0.44 0.45 0.45 0.45 0.46 0.45 0.46 0.46 0.47 0.48 0.48 0.5 0.51 0.53 0.54 0.56 0.57 0.57 0.58 0.58 0.58 0.57 0.58 0.58 0.59 0.59 0.59 0.59 0.59 0.59 0.59 0.58 0.59 0.59 0.6 0.6 0.59 0.59 0.59 0.6 0.6 0.6 0.59 0.58 0.58 0.59 0.59 0.58 0.57 0.55 0.53 0.51 0.49 0.45 0.43 0.42 0.4 0.38 0.36 0.34 0.33 0.34 0.34 0.33 0.32 0.3 0.29 0.29 0.28 0.29 
0.29 0.29 0.3 0.32 0.34 0.36 0.38 0.41 0.42 0.43 0.43 0.44 0.45 0.46 0.48 0.49 0.52 0.55 0.58 0.58 0.59 0.58 0.58 0.58 0.57 0.56 0.55 0.56 0.55 0.54 0.53 0.52 0.51 0.5 0.49 0.47 0.45 0.44 0.45 0.45 0.46 0.46 0.45 0.45 0.44 0.44 0.44 0.44 0.45 0.45 0.45 0.44 0.44 0.44 0.44 0.43 0.42 0.4 0.39 0.38 0.37 0.38 0.38 0.38 0.37 0.36 0.37 0.36 0.36 0.36 0.37 0.37 0.37 0.36 0.36 0.37 0.38 0.39 0.39 0.4 0.4 0.4 0.39 0.38 0.38 0.36 0.35 0.34 0.34 0.33 0.33 0.33 0.33 0.32 0.31 0.31 0.3 0.3 0.3 0.3 0.3 0.3 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.37 0.37 0.38 0.4 0.42 0.42 0.43 0.44 0.44 0.44 0.44 0.45 0.46 0.47 0.49 0.5 0.52 0.52 0.54 0.56 0.58 0.6 0.62 0.63 0.63 0.63 0.64 0.65 0.65 0.65 0.64 0.62 0.59 0.57 0.54 0.51 0.49 0.46 0.44 0.41 0.39 0.37 0.35 0.32 0.31 0.29 0.27 0.26 0.28 0.29 0.29 0.3 0.3 0.3 0.3 0.3 0.31 0.3 0.3 0.3 0.3 0.29 0.29 0.29 0.3 0.3 0.3 0.28 0.27 0.27 0.28 0.28 0.28 0.29 0.3 0.31 0.33 0.34 0.35 0.36 0.37 0.39 0.39 0.39 0.39 0.38 0.38 0.39 0.38 0.38 0.38 0.39 0.4 0.39 0.38 0.38 0.38 0.37 0.36 0.35 0.33 0.34 0.34 0.35 0.35 0.35 0.35 0.34 0.34 0.34 0.32 0.3 0.31 0.31 0.31 0.31 0.32 0.33 0.32 0.33 0.32 0.32 0.32 0.34 0.34 0.35 0.36 0.37 0.38 0.4 0.42 0.43 0.44 0.44 0.45 0.44 0.44 0.45 0.46 0.47 0.47 0.47 0.47 0.46 0.48 0.48 0.49 0.48 0.48 0.45 0.43 0.41 0.39 0.38 0.38 0.38 0.39 0.39 0.38 0.38 0.37 0.36 0.36 0.34 0.33 0.33 0.34 0.34 0.36 0.38 0.39 0.4 0.41 0.42 0.42 0.41 0.41 0.42 0.43 0.44 0.44 0.45 0.45 0.44 0.43 0.42 0.41 0.39 0.39 0.38 0.38 0.38 0.37 0.35 0.35 0.34 0.33 0.31 0.3 0.3 0.29 0.28 0.28 0.28 0.26 0.25 +