Java Bioinformatics Analyses Web Services (JABAWS) developers howto

JABAWS How To

What is JABAWS?

JABAWS stands for JAva Bioinformatics Analysis Web Services. It is a collection of web services for multiple sequence alignment. For simplicity we refer to them as JABAWS. It is the successor of Jalview Web Services. JABAWS makes it easy to access well-known multiple sequence alignment programs from Jalview. However, the scope of JABAWS is not limited to multiple sequence alignment programs. Future versions of JABAWS will incorporate protein disorder prediction, BLAST, PSI-BLAST and HMMER database searches, and many other tools. For the list of currently supported programs see below.

Why JABAWS?

JABAWS offers a range of benefits which were not available earlier. In particular, several benefits stem from the fact that JABAWS is easy to deploy. JABAWS can be deployed on nearly any operating system; it can operate on a stand-alone server as well as submit jobs to a cluster. Thanks to DRMAA it integrates well with a large variety of cluster job management systems. Jalview can be configured to submit jobs to different versions of JABAWS, for example to your local lab version or to a publicly available version elsewhere. As JABAWS can be installed in your lab, it eliminates the need to send your private information outside, to one of the publicly accessible servers. JABAWS can run programs with additional parameters defined by you, so you are no longer limited to the defaults. JABAWS is safe to install for public access, as it can limit the size of the tasks it accepts and denies access to resources within the web application folder.

How to deploy JABAWS?

Download the JABAWS Web Application Archive (war file) and deploy it on any Servlet 2.4 compatible container. We have tested deployment on Apache Tomcat version 6. On Windows servers, just drop the JABAWS war file into the web application directory. On Linux, unpack the war file into the web application directory, cd to the <webapplicationpath>/binaries/src directory and run the setexecflag.sh script to set the executable flag for the binaries. Start Tomcat. That is it. This should give you a working JABAWS stack with tasks executed locally on the server. If you have a cluster, you may want to enable JABAWS to submit jobs to it. To find out more about this and about other configuration options, read the manual.

I deployed JABAWS. How do I make sure it is working?

First of all, make sure that the Tomcat server has started successfully. If so, you should see the JABAWS home page when you navigate to your Tomcat JABAWS context path, e.g. http://myhost.compbio.ac.uk:8080/jabaws

If you see it, then it is time to make sure that the web services are working too. Assuming that you have unpacked/deployed JABAWS from the server war file, you will find the test program in the <webapplicationpath>/WEB-INF/lib/jabaws-client.jar file. To run the tests type:

java -jar jabaws-client.jar -h=<Your web application server host name, port and JABAWS context path>

For example to test all JABAWS web services on host myhost.compbio.ac.uk type:

java -jar jabaws-client.jar -h=http://myhost.compbio.ac.uk:8080/jabaws

You can choose a particular web service using the -s option like this:

java -jar jabaws-client.jar -h=http://myhost.compbio.ac.uk:8080/jabaws -s=ClustalWS

This command line assumes that the java executable is in your path and jabaws-client.jar is located in the current directory.

An example of the report the testing tool produces for an operating web service looks like this:

Connecting to service MuscleWS on http://myhost.compbio.ac.uk:8080/jabaws ... OK
Testing alignment with default parameters:
Queering job status...OK
Retrieving results...OK
Testing alignment with presets:
Aligning with preset 'Protein alignment(Fastest speed)'... OK
Aligning with preset 'Nucleotide alignment(Fastest speed)'... OK
Aligning with preset 'Huge alignments (speed-oriented)'... OK
Queering presets...OK
Queering Parameters...OK
Queering Limits...OK
Queering Local Engine Limits...OK
Check is completed service MuscleWS IS WORKING

An example of the response of a web service which is deployed but is not operating is below:

Connecting to service ProbconsWS on http://localhost:8080/ws ... OK
Testing alignment with default parameters:FAILED
Service ProbconsWS IS NOT FUNCTIONAL

If the web server did not respond, the message looks like the following:

Connecting to service TcoffeeWS on http://localhost:8080/ws ... FAILED

Which Alignment programs are supported?

JABAWS provides access to the following programs:

ClustalW (version 2.0.12)
Mafft (version 6.713)
Muscle (version 3.7)
Tcoffee (version 8.14)
Probcons (version 1.12)

I do not use Windows and I am having trouble compiling binaries. Where can I get precompiled binaries for my system?

ClustalW
Mafft
Muscle
Tcoffee
Probcons (Linux I386 | AMD64)

We would, however, recommend compiling the binaries for your system whenever possible. This is likely to give you a significant performance gain.

Can I use a different version of the alignment program with JABAWS?

JABAWS is supplied with binaries and source code of the executables in the versions it supports. So normally you would not need to install your own executables. However, if you prefer a different version of an executable (e.g. an alignment program), you can use it as long as it supports all the functions that the executable supplied with JABAWS supports. This is likely to be the case with a more recent version. If the options supported by your chosen executable differ from those of the standard JABAWS executable, then you need to edit the ExecutableNameParameters.xml configuration file.

Is there documentation for the client library methods?

Yes there is. Javadoc is available for all methods of the library and data structures.

I want to use JABAWS. Which JABAWS distribution should I choose?

There are two main packages you could use:

A client package - for anyone who wants to use JABAWS from their own code, without Jalview.
Web Services package (there are a few platform-specific variants of this one) - for anyone who wants to run their own copy of JABA Web Services.

The client-only package (1) contains code sufficient to connect to a third-party JABAWS installation and use it. This is the package for anyone who wants to connect to and use JABAWS from their own software. The package also includes a command line client tool. Read more about how to use the command line client below. JABAWS services are fully WS-I compliant, so one can use any language to access them. However, the client package offers additional convenience methods which are not available otherwise, for example methods to read Clustal-formatted sequence alignment files and convert them to a List of FastaSequence objects, which JABAWS will be happy to consume. The trade-off is that the client package is written in Java, which may not be the language of your choice.

The Web Services package (2) contains the JABAWS web services. There are versions for Unix/Linux and Windows operating systems. JABAWS will work on any operating system which has a web application server like Tomcat and a GNU-compatible C/C++ compiler. This includes Mac. If you are interested in running JABAWS on a Mac, you would need to recompile the binaries JABAWS depends on, in particular the alignment program executables. You can configure Jalview to use your version of JABAWS, or any combination of a publicly available JABAWS instance with your local one.

Finally, you can download the core JABAWS package, which contains the code for executing programs locally or on a variety of clusters. This is likely to be of interest to developers only.

Can I program against JABAWS?

Yes. The simplest way to do it is to download the client package and use it to access JABAWS. This package contains value objects which you could alternatively generate with wsimport in Java, or a similar tool in other languages. It also offers some additional manually developed methods which further simplify working with JABAWS. For more information please refer to the data model javadoc. Should you wish to generate the code using the wsimport tool instead, you will be able to do so. As JABAWS services are WS-I Basic Profile compliant, they can be accessed in the standard way, like any other web service.

Is there a command line client to JABAWS?

Yes, it comes as part of the client package, which you are welcome to download.

The command line client can be used to align sequences using any of the web services JABAWS supports. The client is OS independent and supports most of the functions which can be accessed programmatically via the JABAWS API. Using this client you can align sequences using presets or custom parameters; please see the examples below. Here is the list of options supported by the command line client.

Usage: <Class or Jar file name> -h=host_and_context -s=serviceName ACTION [OPTIONS]

-h=<host_and_context> - a full URL to the JABAWS web server including context path e.g. http://10.31.10.159:8080/ws
-s=<ServiceName> - one of [MafftWS, MuscleWS, ClustalWS, TcoffeeWS, ProbconsWS]

ACTIONS:
-i=<inputFile> - full path to fasta formatted sequence file, from which to align sequences
-parameters - lists parameters supported by web service
-presets - lists presets supported by web service
-limits - lists web services limits
Please note that if input file is specified other actions are ignored

OPTIONS: (only for use with -i action):
-r=<presetName> - name of the preset to use
-o=<outputFile> - full path to the file where to write an alignment
-f=<parameterInputFile> - the name of the file with the list of parameters to use.
Please note that -r and -f options cannot be used together. Alignment is done with either preset or a parameters from the file, but not both!

Align sequences from input.fasta file using Mafft web service with default settings, print alignment in Clustal format to console.

java -jar jabaws-min-client.jar -h=http://myhost.compbio.ac.uk:8080/jabaws -s=MafftWS -i=d:\input.fasta

The content of the input.fasta file is shown below (please note that the sequences have been trimmed for clarity):

>Foobar
MTADGPRELLQLRAAVRHRPQDFVAWL
>Bar
MGDTTAGEMAVQRGLALHQ
QRHAEAAVLLQQASDAAPE
>Foofriend
MTADGPRELLQLRAAV
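
In the file above, the Bar record is wrapped over two lines but represents a single sequence. The following is a minimal sketch of how FASTA input of this shape can be read; the FastaSketch class and its parse method are hypothetical names used for illustration only and are not the client library's own reader.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the JABAWS client library.
class FastaSketch {

    // Maps each sequence name (the text after '>') to its residues,
    // joining sequence lines that were wrapped over several lines.
    static Map<String, String> parse(List<String> lines) {
        Map<String, StringBuilder> records = new LinkedHashMap<>();
        StringBuilder current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // blank lines carry no information
            }
            if (line.startsWith(">")) {
                current = new StringBuilder();
                records.put(line.substring(1).trim(), current);
            } else if (current != null) {
                current.append(line);
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        records.forEach((name, seq) -> result.put(name, seq.toString()));
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                ">Foobar", "MTADGPRELLQLRAAVRHRPQDFVAWL",
                ">Bar", "MGDTTAGEMAVQRGLALHQ", "QRHAEAAVLLQQASDAAPE",
                ">Foofriend", "MTADGPRELLQLRAAV");
        // Print each sequence name with the number of residues read for it.
        parse(lines).forEach((name, seq) -> System.out.println(name + " " + seq.length()));
    }
}
```

The sketch keeps the records in file order, which is usually what an alignment program expects.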

Align as in the example above, but write the output alignment to the file out.clustal, using the parameters defined in the prm.in file.

java -jar jabaws-min-client.jar -h=http://myhost.compbio.ac.uk:8080/jabaws -s=MafftWS -i=d:\input.fasta -o=d:\out.clustal -f=prm.in

The content of the prm.in file is shown below:

--nofft
--noscore
--fastaparttree
--retree=10
--op=2.2

The format of the file is the same for all JABAWS web services. Parameters are specified in exactly the same way as for the native executables - alignment programs like Mafft etc. So parameters which you can use with the command line version of an alignment program can be used with JABAWS. Most of the settings controlling the alignment process are supported, but the settings controlling output are not. This is because the output has to be handled by JABAWS, so it must remain within its control. In prm.in, parameters are separated by new lines, and the name of a parameter is separated from its value by an equals sign. For a list of parameters supported by a web service see the next example.
java -jar jabaws-min-client.jar -h=http://myhost.compbio.ac.uk:8080/jabaws -s=MafftWS -parameters
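
The parameter file format described above (one parameter per line, with an optional value after an equals sign) is simple enough to read in a few lines of Java. The sketch below is only an illustration of the format; the ParamFileSketch class is a hypothetical name, and this is not how the JABAWS client itself parses the file.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the JABAWS client library.
class ParamFileSketch {

    // Reads one parameter per line. "name=value" becomes name -> value,
    // a bare flag such as "--nofft" becomes name -> "" (empty value).
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq < 0) {
                params.put(line, "");
            } else {
                params.put(line.substring(0, eq), line.substring(eq + 1));
            }
        }
        return params;
    }

    public static void main(String[] args) {
        List<String> prm = List.of("--nofft", "--noscore", "--fastaparttree", "--retree=10", "--op=2.2");
        parse(prm).forEach((name, value) ->
                System.out.println(name + (value.isEmpty() ? " (flag)" : " = " + value)));
    }
}
```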

Can I use the JABAWS command line client to connect to and use JABAWS web services in Dundee and in my local lab?

Yes, just point it to the host you want to use by changing the value of the -h key. For example, if you used the -h=http://myhost.compbio.ac.uk:8080/jabaws server and now want to use another server, change it to -h=http://mylabserver.myuni.edu. This comes in handy if you want to align many sequences or do not want to send some of your data over the internet.

Sometimes users send a very large number of sequences to the JABAWS server, so that it becomes unresponsive. Can I limit the number of sequences users can submit to my server?

Yes, JABAWS can be configured to reject requests based on the number of sequences, or on the number of sequences and their average length, per single alignment request. Look at the Restricting JABAWS section for further details.

My cluster is quite busy, so the waiting times in the task queue are significant. At the same time I have a powerful server with many cores. Can I use the server for small tasks and send really big ones to the cluster?

Yes, you can. For this you need to enable and configure both the cluster and the local engines. Once this is done, decide on the maximum size of a task to be run locally on the server. In JABAWS the size of a task can be defined as a number of sequences and an average sequence length. Edit the "# LocalEngineExecutionLimit #" preset in the <ServiceName>Limits.xml file accordingly.
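
To make the idea of a task size limit concrete, here is a minimal sketch of a size-based choice between the local engine and the cluster. It is not the JABAWS implementation: the class name, the two limit values and the rule that both measures must stay within the limit are assumptions made for illustration, and the real values live in the <ServiceName>Limits.xml file.

```java
import java.util.List;

// Hypothetical illustration of size-based routing; not the JABAWS code.
class EngineChoiceSketch {

    enum Engine { LOCAL, CLUSTER }

    // Assumed example limits standing in for the configured local engine limit.
    static final int MAX_LOCAL_SEQUENCES = 50;
    static final int MAX_LOCAL_AVG_LENGTH = 500;

    // Average sequence length of the task, 0 for an empty task.
    static double averageLength(List<String> sequences) {
        return sequences.stream().mapToInt(String::length).average().orElse(0);
    }

    // A task is run locally only if it is small on both measures (an assumption).
    static Engine choose(List<String> sequences) {
        boolean small = sequences.size() <= MAX_LOCAL_SEQUENCES
                && averageLength(sequences) <= MAX_LOCAL_AVG_LENGTH;
        return small ? Engine.LOCAL : Engine.CLUSTER;
    }

    public static void main(String[] args) {
        List<String> task = List.of("MTADGPRELLQLRAAVRHRPQDFVAWL", "MTADGPRELLQLRAAV");
        System.out.println("Task goes to: " + choose(task));
    }
}
```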

Can I run my own JABAWS if I do not have a cluster?

Yes, JABAWS can be run on a single server. Obviously the capacity will be limited, but may be sufficient for a small lab. Installed on a single server, JABAWS executes tasks in parallel, so the more cores the server has the more requests it will be able to handle.

My cluster uses LSF/PBS/SGE etc. Can I run JABAWS on my cluster?

JABAWS uses the DRMAA v. 1.0 library to send and manage jobs on the cluster. DRMAA supports many different cluster job management systems, namely Sun Grid Engine, Condor, PBS, GridWay, Globus 2/4, PBSPro and LSF. For up-to-date information please consult the DRMAA web site. We found that DRMAA implementations differ from platform to platform, so we tried to use only the basic functions. We have only tested JABAWS on Sun Grid Engine v 6.2. Please let us know if you have any experience of running JABAWS on other platforms.

My cluster does not have shared disk space. Can I run JABAWS?

No, not on the cluster. At the moment, to operate on the cluster JABAWS requires disk space that every cluster node has access to. However, you could still run JABAWS on a single server.

I installed JABAWS in my lab. I would like to define different limits for public and lab users.

Currently only one set of limits is supported per web service. If you need to provide a different quality of service for different groups of users, it is best to make a second JABAWS installation on a different server (or on the same server but in a different context) and define different limits there. The lab users can then use one server, and the public another.

Jalview uses the same code for local task execution as JABAWS. Currently the JABAWS machinery supports limits at the web services level only. If you execute a task locally or submit directly to the cluster, no limits are applied.

Can I run many JABAWS instances on the same server?

Yes. JABAWS is supplied as one Web Application Archive which can be dealt with like any other web application. Make two different contexts on your application server and unpack JABAWS in both of them. For example, if your server name is http://www.align.ac.uk and the context names are public and private, then one group of users could be given the URL http://www.align.ac.uk/public and another http://www.align.ac.uk/private. These contexts will be served by two independent JABAWS instances and can be configured differently. If you keep the local engine enabled, make sure you reduce the number of threads the local engine is allowed to use, to avoid overloading the server. Alternatively, two completely separate web application server instances (e.g. Apache Tomcat) could be used. This will give you better resilience and more flexibility in memory settings.

I deployed JABAWS on Windows. Why do not all alignment programs work?

JABAWS is platform independent, and thus can be deployed on any operating system which supports Java. However, the executables which do the calculations are platform specific. JABAWS uses different versions of the executables for different platforms under the hood. If an executable is not available for a particular platform, then the corresponding service will not function.

If I install JABAWS on Windows, which web services will work?

Of all the supported services, only Clustal and Muscle will work. This is because only these two executables support Windows natively.

There are versions of Mafft and Tcoffee for Windows, so why do you not support them?

Indeed, Mafft and Tcoffee can be executed on the Windows platform, but only in the Linux-like environment provided by Cygwin. Their setup is very complicated and performance will suffer in such an environment. We do not support such a configuration and are unlikely to support it in the future. JABAWS is best installed in a Unix-like environment; once this is done, it can be accessed from any operating system.

What operating systems can JABAWS run on?

JABAWS can run on any operating system that supports Java. However, it is best run on a Unix-like operating system, as all the programs JABAWS uses are available for the Unix platform and only a few are available on Windows. Currently only Clustal and Muscle are available for the Windows platform.

What happens if the number of requests to my JABAWS installation is greater than the server can process?

That depends on your configuration. If only cluster submission is enabled, then tasks will be sent to the cluster and you will experience a longer wait as a result of task queuing. Commonly, the number of tasks that the web server can send to the cluster is unlimited, but the number of tasks that can run in parallel is controlled, hence the queue.
If only local execution is enabled, the number of tasks that can be executed in parallel is limited to the number of threads, which by default equals the number of cores available on the server but can also be defined declaratively. Tasks that cannot be executed immediately wait in the queue, so you can expect a longer execution time. If both local and cluster submission are enabled, then a task that cannot be executed immediately locally is sent to the cluster.
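
The local queueing behaviour described above resembles a fixed-size thread pool from the standard java.util.concurrent package. The sketch below illustrates that general technique only; it is an assumption that it approximates the local engine, and the LocalQueueSketch class and the sleep that stands in for an alignment run are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of a bounded local engine; not the JABAWS code.
class LocalQueueSketch {

    // Runs taskCount tasks on a pool of `threads` workers.
    // Tasks beyond the number of workers wait in the pool's internal queue.
    static List<Integer> runAll(int threads, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                futures.add(pool.submit(() -> {
                    Thread.sleep(50); // stands in for an alignment run
                    return id;
                }));
            }
            List<Integer> finished = new ArrayList<>();
            for (Future<Integer> f : futures) {
                finished.add(f.get()); // blocks until that task is done
            }
            return finished;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Default described in the text: one thread per available core.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(runAll(cores, cores * 2).size() + " tasks finished");
    }
}
```

With twice as many tasks as threads, the second half of the tasks starts only when workers become free, which is the longer execution time mentioned above.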

I would like to keep an eye on who is using JABAWS services on my server.

Enable the Tomcat access log valve. To do this, uncomment the corresponding section of the <tomcat_root>/conf/server.xml configuration file.

The following information will be logged:

Remote IP	Date	Method server_URL protocol	HTTP status	Response size in bytes
10.31.11.159	[10/Feb/2010:16:51:32 +0000]	"POST /jws2/MafftWS HTTP/1.1"	200	2067

This log can be processed by various log analysing programs, such as WebAlizer, Analog and AWStats.
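
For a quick look without a full log analyser, a log line in the format shown above can be picked apart with a regular expression. The following is a minimal sketch that counts requests per service URL; the AccessLogSketch class is a hypothetical name, and the pattern assumes exactly the five fields listed above, separated by tabs or spaces.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; assumes the five-field format shown above.
class AccessLogSketch {

    // Groups: 1 remote IP, 2 date, 3 method, 4 URL, 5 protocol, 6 status, 7 size.
    private static final Pattern LINE = Pattern.compile(
            "^(\\S+)\\s+\\[([^\\]]+)\\]\\s+\"(\\S+)\\s+(\\S+)\\s+(\\S+)\"\\s+(\\d{3})\\s+(\\d+|-)$");

    // Counts how many requests were logged for each URL; other lines are skipped.
    static Map<String, Integer> requestsPerUrl(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line.trim());
            if (m.matches()) {
                counts.merge(m.group(4), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> log = List.of(
                "10.31.11.159\t[10/Feb/2010:16:51:32 +0000]\t\"POST /jws2/MafftWS HTTP/1.1\"\t200\t2067");
        requestsPerUrl(log).forEach((url, n) -> System.out.println(url + " " + n));
    }
}
```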

I would like to know how much CPU time has been consumed and which tasks were the longest.

JABAWS stores cluster task ids for all tasks which were run on the cluster. Using the cluster ids, detailed statistics can be extracted from the cluster accounting system. Because each cluster supported by JABAWS has a different accounting system, it was not possible to provide ready-to-use statistics.
For local execution, the start and finish times in nanoseconds can be found in the STARTED and FINISHED files respectively. In time we will provide tools to extract execution time statistics, so keep the content of your working directory!
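
Until such tools exist, the run time of a local task can be estimated from these two files. The sketch below assumes that each file contains nothing but a single decimal number of nanoseconds; that layout is an assumption, not something this document specifies, and the JobTimingSketch class is a hypothetical name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; assumes STARTED and FINISHED each hold one number.
class JobTimingSketch {

    // Elapsed time in milliseconds from the raw contents of the two files.
    static long elapsedMillis(String startedContent, String finishedContent) {
        long started = Long.parseLong(startedContent.trim());
        long finished = Long.parseLong(finishedContent.trim());
        return TimeUnit.NANOSECONDS.toMillis(finished - started);
    }

    // Same calculation, reading the files from a task's working directory.
    static long elapsedMillis(Path jobDir) throws IOException {
        return elapsedMillis(
                Files.readString(jobDir.resolve("STARTED")),
                Files.readString(jobDir.resolve("FINISHED")));
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.println(elapsedMillis(Paths.get(args[0])) + " ms");
        } else {
            System.out.println("usage: java JobTimingSketch <task working directory>");
        }
    }
}
```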

I noticed that the jobsout, conf and binaries directories are not placed in the WEB-INF directory. Does this mean they are accessible to anyone?

Access to these directories is prohibited to unauthorized users by means of a security constraint defined in the web application descriptor file. There is a special user role called admin who can access these directories. This comes in handy if you would like to keep an eye on any of the task outputs stored in jobsout, or would like to view the configuration files. To access these directories, add an admin user to your application server. The way you do it will depend on where you would like the user passwords to come from and on your web application server. If you use Tomcat, then the simplest way is to use the Tomcat Memory Realm, which is linked to a plain text configuration file. To define the user in the Tomcat server, add an entry to the conf/tomcat-users.xml file:

<role rolename="admin"/>
<user username="admin" password="your password here" roles="admin"/>

Once this is done, make sure the servlet that returns the web application directory listings is enabled. Look in the <tomcatroot>/conf/web.xml file for the following:

<param-name>listings</param-name>
<param-value>true</param-value>

The whole section that defines the default listing servlet is below.

<servlet>
<servlet-name>default</servlet-name>
<servlet-class>org.apache.catalina.servlets.DefaultServlet</servlet-class>
<init-param>
<param-name>debug</param-name>
<param-value>0</param-value>
</init-param>
<init-param>
<param-name>listings</param-name>
<param-value>true</param-value>
</init-param>
<load-on-startup>1</load-on-startup>
</servlet>

These listings are read only by default.

Sometimes something goes wrong with JABAWS, but I cannot figure out why. Can you help?

JABAWS logs all errors to stdout and, if logging is enabled, to a file called activity.log. Stdout is usually recorded in the web server log files. Have a look there and you may find the reason for the problem. If it is still unclear what went wrong, try increasing the logging level. Setting the logging level to TRACE or DEBUG will give you a lot of insight into what goes on behind the scenes. We would need this log if you want us to help you, or if you would like to report a bug. To change the log level, replace the ERROR keyword in the ACTIVITY logger with TRACE.

Connecting to JABAWS

For a complete working example of a JABAWS client please see the compbio.ws.client package, in particular:

compbio.ws.client.Jws2Client - a command line client
compbio.ws.client.WSTester - JABAWS tester

JABAWS source is available from the download page. Please note that for now all the examples are in Java; other languages will follow given sufficient demand.

Download the JABA client library and add it to the class path. The following code excerpt will connect your program to the Clustal web service deployed at the University of Dundee.

import java.net.URL;
import javax.xml.namespace.QName;
import javax.xml.ws.Service;

URL url = new URL("http://www.compbio.dundee.ac.uk/jabaws/ClustalWS?wsdl");
QName qname = new QName("http://msa.data.compbio/01/01/2010/", "ClustalWS");
Service serv = Service.create(url, qname);
MsaWS<T> msaws = serv.getPort(new QName("http://msa.data.compbio/01/01/2010/", "ClustalWSPort"), MsaWS.class);

A more generic connection method would look like this

String qualifiedServiceName = "http://msa.data.compbio/01/01/2010/";
String host = "http://www.compbio.dundee.ac.uk/jaba" ;
URL url = new URL(host + "/" + service.toString() + "?wsdl");
QName qname = new QName(qualifiedServiceName, service.toString());
Service serv = Service.create(url, qname);
MsaWS<T> msaws = serv.getPort(new QName(qualifiedServiceName, service + "Port"), MsaWS.class);

Where service is a value of the enumeration of JABAWS web services. All JABAWS multiple sequence alignment methods conform to the MsaWS specification, thus from the caller's point of view all JABAWS web services can be represented by the MsaWS interface.

Once you have connected, do not forget to disconnect:
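
The naming pattern used by the generic connection code (service name for the WSDL location and the service QName, service name plus "Port" for the port QName) can be checked without contacting a server. The sketch below only builds those names; the EndpointSketch class and its nested enumeration are hypothetical stand-ins written for this example, not the types shipped in the client library.

```java
import javax.xml.namespace.QName;

// Hypothetical illustration of the naming pattern; not the client library's types.
class EndpointSketch {

    // Stand-in for the enumeration of JABAWS web services.
    enum Services { ClustalWS, MuscleWS, MafftWS, TcoffeeWS, ProbconsWS }

    static final String NAMESPACE = "http://msa.data.compbio/01/01/2010/";

    // Location of the WSDL document for a service on the given host and context.
    static String wsdlLocation(String host, Services service) {
        return host + "/" + service + "?wsdl";
    }

    // Qualified name of the service itself.
    static QName serviceName(Services service) {
        return new QName(NAMESPACE, service.toString());
    }

    // Qualified name of the service port: the service name followed by "Port".
    static QName portName(Services service) {
        return new QName(NAMESPACE, service + "Port");
    }

    public static void main(String[] args) {
        String host = "http://www.compbio.dundee.ac.uk/jabaws";
        for (Services s : Services.values()) {
            System.out.println(wsdlLocation(host, s) + " " + portName(s).getLocalPart());
        }
    }
}
```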

((Closeable) msaws).close();

Building web services artifacts

JABAWS services are standard JAX-WS SOAP web services, which are WS-I Basic Profile compatible. This means that you can use whatever tool your language has for working with web services. Below is how you can generate portable artifacts to work with JABAWS from Java. Alternatively, from Java, you could use the client library.

wsimport -keep http://www.compbio.dundee.ac.uk/jabaws/ClustalWS?wsdl

Valid service names are

ClustalWS
MuscleWS
MafftWS
TcoffeeWS
ProbconsWS

Aligning sequences

1) List<FastaSequence> fastalist = SequenceUtil.readFasta(new FileInputStream(file));
2) String jobId = msaws.align(fastalist);
3) Thread.sleep(1000);
4) Alignment alignment = msaws.getResult(jobId);

Line one loads the FASTA sequences from the file.
Line two submits them to the web service represented by the msaws proxy.
Line three waits a second.
Line four retrieves the alignment from the web service, blocking execution until the result is available.
The methods and classes mentioned in the excerpt are available from the JABAWS client library.

Aligning with presets

1) PresetManager<T> presetman = msaws.getPresets();
2) Preset<T> preset = presetman.getPresetByName(presetName);
3) List<FastaSequence> fastalist = SequenceUtil.readFasta(new FileInputStream(file));
4) String jobId = msaws.presetAlign(fastalist, preset);
5) Thread.sleep(1000);
6) Alignment alignment = msaws.getResult(jobId);

Line one obtains the list of presets supported by a web service.
Line two returns a particular Preset by its name.
Lines three to six do the same job as in the default alignment example.

Using custom parameters

Writing alignments to a file

There is a utility method in the client library that does exactly that.

Alignment alignment = align(...);
FileOutputStream outStream = new FileOutputStream(file);
ClustalAlignmentUtil.writeClustalAlignment(outStream, alignment);

I dropped the jaba.war file into the web application directory but nothing happened. What do I do next?

Make sure Tomcat has sufficient access rights to read your war file.
Restart Tomcat; sometimes it will not notice that a new war file has been added without a restart.
If Tomcat still refuses to unpack the war file, unpack it manually into the web application folder. Set the executable flag for the native executables (alignment programs) as described in the How to deploy JABAWS? section. Restart Tomcat.

I removed the JABAWS war file from the webapps directory after it was deployed and my JABAWS web application folder disappeared. Where has it gone?

If you used Tomcat automated deployment for JABAWS, that is, you just dropped a war file into the webapps directory, then Tomcat will automatically undeploy your application when the war file is removed. So do not remove the war file if you intend to keep JABAWS running.

I want to make sure that Tomcat will not undeploy/delete the JABAWS directory from the server. What should I do?

Use an explicit application descriptor. It can come in different flavors; the one I prefer is to drop a context descriptor file into the <tomcatRoot>/conf/Catalina/localhost directory. Name your context file the same as your application folder, e.g. if your JABAWS resides in the webapps/jaba folder, then call the context file jaba.xml. Below is an example of the content this file might have.

<?xml version="1.0" encoding="UTF-8"?>
<Context antiResourceLocking="false" privileged="true" />

This should be sufficient to prevent Tomcat from removing your JABAWS deployment folder. For more information about the Tomcat deployer, read the documentation on the Tomcat web site.