JABAWS How To

Table of contents

About

Installation

Configuration

JABAWS on Apache-Tomcat

JABAWS on VM (Virtual Machine)

About

What is JABAWS?

JABAWS stands for JAva Bioinformatics Analysis Web Services. It is a collection of web services for multiple sequence alignment, which for simplicity we refer to as JABAWS. It is the successor of Jalview Web Services. JABAWS makes it easy to access well-known multiple sequence alignment programs from Jalview. However, the scope of JABAWS is not limited to multiple sequence alignment programs. Future versions of JABAWS will incorporate protein disorder prediction, BLAST, PSIBLAST and HMMER database searches, and many other tools. For the list of currently supported programs see below.

Why JABAWS?

JABAWS has a number of distinct features that are not found in other bioinformatics web service systems. In particular, JABAWS:

  1. Provides uniform remote access to a number of popular command line tools.
    JABAWS enables you to access your favorite research tools anywhere, at any time. For instance, all multiple sequence alignment services can be accessed through a single command line interface, which simplifies their invocation. At the same time, most of the command line options of each program are supported, so you have nearly the same level of control as if you were running them on the command line yourself.
  2. Enables web based or stand-alone applications, like Jalview, to access a variety of bioinformatics analysis methods.
    The JABAWS client library makes it very easy to connect to one or more instances of JABAWS, so if one server is offline, all you need to know is the URL of another server that will do the job for you. Moreover, you are not limited to JABAWS's own client: the JABAWS services are compatible with WS-I Basic Profile v. 1.1, which means that clients can be created for them in almost any programming language.
  3. Can be easily deployed as a server on a variety of platforms, with command line tools run on the same machine or on a cluster. This allows you to keep your private data safe.
    You don't need to send your data over the Internet anymore. Simply download and install JABAWS on a trusted machine in your lab or institute, and use its web address instead of the public JABAWS services. No data will leave your lab any longer! The JABAWS server can run programs on a single machine or on a cluster, and is easy to install. If your server is going to be heavily used, then it is better to configure JABAWS to access your cluster, which is straightforward. JABAWS integrates with a number of cluster job management systems (e.g. GridEngine, PBS, LSF, Condor). It also schedules tasks according to their size, which eliminates scalability issues and lets you focus on your research.
  4. Supports custom parameters, unlike other web services.
    JABAWS includes a comprehensive parameter model and validation mechanism, allowing you to specify additional options and arguments. Want to use the PAM200 substitution matrix, set the number of iterations, or choose a sequence clustering method? No problem - JABAWS lets you do that. You are no longer limited to defaults!

Installation

How to deploy JABAWS?

Download the JABAWS Web Application Archive (war file) and deploy it on any Servlet 2.4 compatible container. We have tested deployment on Apache-Tomcat version 6. On Windows servers, just drop the JABAWS war file into the web application directory. On Linux, unpack the war file into the web application directory, cd to the <webapplicationpath>/binaries/src directory and run the setexecflag.sh script to set the executable flag for the binaries. Start Tomcat. That is it. This should give you a working JABAWS stack with tasks executed locally on the server. If you have a cluster, you may want to enable JABAWS to submit jobs to it. To find out more about this and about other configuration options, read the manual.

I deployed JABAWS. How do I make sure it is working?

First of all, make sure that the Tomcat server has started successfully. If it has, you should see the JABAWS home page when you navigate to your Tomcat JABAWS context path, e.g. http://myhost.compbio.ac.uk:8080/jabaws

If you see it, then it is time to make sure that the web services are working too. Assuming that you have unpacked/deployed JABAWS from the server war file, you should be able to find the test program in the <webapplicationpath>/WEB-INF/lib/jabaws-client.jar file. To run the tests type:

java -jar jabaws-client.jar -h=<Your web application server host name, port and JABAWS context path>

For example to test all JABAWS web services on host myhost.compbio.ac.uk type:

java -jar jabaws-client.jar -h=http://myhost.compbio.ac.uk:8080/jabaws

You can choose a particular web service using the -s option like this:

java -jar jabaws-client.jar -h=http://myhost.compbio.ac.uk:8080/jabaws -s=ClustalWS

This command line assumes that the java executable is in your path and jabaws-client.jar is located in the current directory.

An example of the report the testing tool produces for an operating web service looks like this:

Connecting to service MuscleWS on http://myhost.compbio.ac.uk:8080/jabaws ... OK
Testing alignment with default parameters:
Queering job status...OK
Retrieving results...OK
Testing alignment with presets:
Aligning with preset 'Protein alignment(Fastest speed)'... OK
Aligning with preset 'Nucleotide alignment(Fastest speed)'... OK
Aligning with preset 'Huge alignments (speed-oriented)'... OK
Queering presets...OK
Queering Parameters...OK
Queering Limits...OK
Queering Local Engine Limits...OK
Check is completed service MuscleWS IS WORKING

An example of the response of a web service which is deployed but is not operating is below:

Connecting to service ProbconsWS on http://localhost:8080/ws ... OK
Testing alignment with default parameters:FAILED
Service ProbconsWS IS NOT FUNCTIONAL

If the web server did not respond, the message looks like the following:

Connecting to service TcoffeeWS on http://localhost:8080/ws ... FAILED
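
When several services are tested in one run, it can help to reduce the report to one verdict per service. The following is a minimal sketch, assuming Java 9 or later and assuming that the verdict lines always look like the three examples quoted above. The ReportSummary class and its Status names are hypothetical and are not part of the JABAWS client.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JABAWS client. It only scans report
// text for the three verdict lines quoted in the examples above.
final class ReportSummary {

    enum Status { WORKING, NOT_FUNCTIONAL, NO_RESPONSE }

    // e.g. "Check is completed service MuscleWS IS WORKING"
    private static final Pattern WORKING =
            Pattern.compile("service (\\w+) IS WORKING");
    // e.g. "Service ProbconsWS IS NOT FUNCTIONAL"
    private static final Pattern BROKEN =
            Pattern.compile("Service (\\w+) IS NOT FUNCTIONAL");
    // e.g. "Connecting to service TcoffeeWS on http://localhost:8080/ws ... FAILED"
    private static final Pattern UNREACHABLE =
            Pattern.compile("Connecting to service (\\w+) on \\S+ \\.\\.\\. FAILED");

    /** Maps each service name found in the report to its verdict, in report order. */
    static Map<String, Status> summarise(List<String> reportLines) {
        Map<String, Status> result = new LinkedHashMap<>();
        for (String line : reportLines) {
            record(result, WORKING.matcher(line), Status.WORKING);
            record(result, BROKEN.matcher(line), Status.NOT_FUNCTIONAL);
            record(result, UNREACHABLE.matcher(line), Status.NO_RESPONSE);
        }
        return result;
    }

    private static void record(Map<String, Status> result, Matcher m, Status status) {
        if (m.find()) {
            result.put(m.group(1), status);
        }
    }

    public static void main(String[] args) {
        List<String> report = List.of(
                "Connecting to service ProbconsWS on http://localhost:8080/ws ... OK",
                "Testing alignment with default parameters:FAILED",
                "Service ProbconsWS IS NOT FUNCTIONAL");
        summarise(report).forEach((service, status) ->
                System.out.println(service + " -> " + status));
    }
}
```

The lines passed to summarise would typically be the captured standard output of the testing tool.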

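Which alignment programs are supported?

JABAWS provides access to the following programs: Clustal, Mafft, Muscle, Tcoffee and Probcons, available as the ClustalWS, MafftWS, MuscleWS, TcoffeeWS and ProbconsWS web services.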

I do not use Windows and I am having trouble compiling the binaries. Where can I get pre-compiled binaries for my system?

We would, however, recommend compiling the binaries for your system whenever possible. This is likely to give you a significant performance gain.

Configuration

Can I use a different version of the alignment program with JABAWS?

JABAWS is supplied with binaries and source code of the executables, in the versions it supports. So normally you would not need to install your own executables. However, if you prefer a different version of an executable (e.g. an alignment program), you could use it as long as it supports all the functions that the executable shipped with JABAWS supports. This could be the case with a more recent executable. If the options supported by your chosen executable are different from those of the standard JABAWS executable, then you need to edit the ExecutableNameParameters.xml configuration file.

Is there documentation for the client library methods?

Yes there is. Javadoc is available for all methods of the library and data structures.

I want to use JABAWS. Which JABAWS distribution should I choose?

There are three main packages you could use

  1. A client package - for anyone who wants to use JABAWS from their own code, without Jalview.
  2. Web Services package (there are a few platform specific variants of this one) - for anyone who wants to run their own copy of JABA Web Services.
  3. Virtual Appliance - for anyone who wants to run JABAWS locally, but works on Windows or has configuration problems.

The client-only package (1) contains code sufficient to connect to a third-party JABAWS installation and use it. This is the package for anyone who wants to connect to and use JABAWS from their own software. The package also includes a command line client tool. Read more about how to use the command line client below. JABAWS is fully WS-I compliant, so any language can be used to access it. However, the client package offers additional convenience methods which are not available otherwise, for example methods to read Clustal formatted sequence alignment files and convert them to a List of FastaSequence objects, which JABAWS will be happy to consume. The trade-off is that the client package is written in Java, which may not be the language of your choice.

The Web Services package (2) contains the JABAWS web services. There are versions for Unix/Linux and Windows operating systems. JABAWS will work on any operating system which has a web application server like Tomcat and a GNU compatible C/C++ compiler. This includes Mac. If you are interested in running JABAWS on a Mac, you would need to recompile the binaries JABAWS depends on. You can configure Jalview to use your own JABAWS installation, or any combination of publicly available JABAWS instances and your local one.

Finally, you can download the core JABAWS package, which contains the code for executing programs locally or on a variety of clusters. This is likely to be of interest to developers only.

The Virtual Appliance package (3) contains TurnKey Linux with JABAWS installed. You can use this package as long as you can run a virtual appliance on your computer. You can find out more about the JABAWS virtual appliance in the relevant manual and how-to sections.

Can I program against JABAWS?

Yes. The simplest way to do it is to download the client package and use it to access JABAWS. This package contains value objects which you could alternatively generate with wsimport in Java, or with a similar tool in other languages. It also offers some additional manually developed methods which further simplify working with JABAWS. For more information please refer to the data model javadoc. Should you wish to generate the code with the wsimport tool instead, you will be able to do so: as JABAWS is WS-I Basic Profile compliant, it can be accessed in the standard way, like any other web service.

Is there a command line client to JABAWS?

Yes, it comes as a part of client package which you are welcome to download.

The command line client can be used to align sequences using any of the web services supported by JABAWS. The client is OS independent and supports most of the functions which can be accessed programmatically via the JABAWS API. Using this client you can align sequences using presets or custom parameters; please see the examples below. Here is the list of options supported by the command line client.

Usage: java -jar <path_to_jar_file> -h=host_and_context -s=serviceName ACTION [OPTIONS]
-h=<host_and_context> - a full URL to the JABAWS web server including context path e.g. http://10.31.10.159:8080/ws
-s=<ServiceName> - one of [MafftWS, MuscleWS, ClustalWS, TcoffeeWS, ProbconsWS]


ACTIONS:
-i=<inputFile> - full path to fasta formatted sequence file, from which to align sequences
-parameters - lists parameters supported by web service
-presets - lists presets supported by web service
-limits - lists web services limits
Please note that if input file is specified other actions are ignored


OPTIONS: (only for use with -i action):
-r=<presetName> - name of the preset to use
-o=<outputFile> - full path to the file where to write an alignment
-f=<parameterInputFile> - the name of the file with the list of parameters to use.
Please note that the -r and -f options cannot be used together. Alignment is done with either a preset or parameters from the file, but not both!

Align sequences from input.fasta file using Mafft web service with default settings, print alignment in Clustal format to console.

java -jar jabaws-min-client.jar -h=http://myhost.compbio.ac.uk:8080/jabaws -s=MafftWS -i=d:\input.fasta

The content of the input.fasta file is shown below (please note that the sequences have been trimmed for clarity):

>Foobar
MTADGPRELLQLRAAVRHRPQDFVAWL
>Bar
MGDTTAGEMAVQRGLALHQ
QRHAEAAVLLQQASDAAPE
>Foofriend
MTADGPRELLQLRAAV
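
Note that the sequence Bar above spans two lines. A minimal sketch of how such a file can be read, with each sequence joined into a single string, is given below. It assumes Java 9 or later and plain '>' header lines. The FastaReader class is hypothetical and is not the reader shipped with the JABAWS client package.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone reader, NOT the JABAWS client library code.
final class FastaReader {

    /** Returns sequence name -> sequence, in file order; wrapped sequence lines are joined. */
    static Map<String, String> read(List<String> lines) {
        Map<String, StringBuilder> collected = new LinkedHashMap<>();
        StringBuilder current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // tolerate blank lines between records
            }
            if (line.startsWith(">")) {
                // A header line starts a new record
                current = new StringBuilder();
                collected.put(line.substring(1).trim(), current);
            } else if (current != null) {
                current.append(line);
            } else {
                throw new IllegalArgumentException("sequence data before the first '>' header");
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        collected.forEach((name, sequence) -> result.put(name, sequence.toString()));
        return result;
    }

    public static void main(String[] args) {
        List<String> file = List.of(
                ">Foobar", "MTADGPRELLQLRAAVRHRPQDFVAWL",
                ">Bar", "MGDTTAGEMAVQRGLALHQ", "QRHAEAAVLLQQASDAAPE",
                ">Foofriend", "MTADGPRELLQLRAAV");
        read(file).forEach((name, sequence) ->
                System.out.println(name + ": " + sequence.length() + " residues"));
    }
}
```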

Align as in the above example, but write the output alignment to the file out.clustal, using the parameters defined in the prm.in file

java -jar jabaws-min-client.jar -h=http://myhost.compbio.ac.uk:8080/jabaws -s=MafftWS -i=d:\input.fasta -o=d:\out.clustal -f=prm.in

The content of the prm.in file is shown below:

--nofft
--noscore
--fastaparttree
--retree=10
--op=2.2

The format of the file is the same for all JABAWS web services. In prm.in, parameters are separated by a new line, and the name of a parameter is separated from its value by an equals sign. Parameters are specified in exactly the same way as for the native executables (alignment programs like Mafft etc.), so parameters which you can use with the command line version of an alignment program can be used with JABAWS. Most of the settings controlling the alignment process are supported, but the settings controlling output are not. This is because the output has to be handled by JABAWS, so it must remain within its control. To list the parameters supported by a web service, use the -parameters action:

java -jar jabaws-min-client.jar -h=http://myhost.compbio.ac.uk:8080/jabaws -s=MafftWS -parameters
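
To make the prm.in format concrete, here is a minimal sketch of a reader for it, assuming Java 9 or later: one parameter per line, with an optional value after the first equals sign. The ParameterFile class is hypothetical and only illustrates the format. It is not how JABAWS itself parses or validates the file.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the prm.in layout, not JABAWS code.
final class ParameterFile {

    /**
     * Returns parameter name -> value in file order.
     * Parameters without a value (plain switches) map to an empty string.
     */
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> parameters = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            int equals = line.indexOf('=');
            if (equals < 0) {
                parameters.put(line, "");
            } else {
                parameters.put(line.substring(0, equals), line.substring(equals + 1));
            }
        }
        return parameters;
    }

    public static void main(String[] args) {
        List<String> prm = List.of(
                "--nofft", "--noscore", "--fastaparttree", "--retree=10", "--op=2.2");
        parse(prm).forEach((name, value) ->
                System.out.println(name + (value.isEmpty() ? " (switch)" : " = " + value)));
    }
}
```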

Can I use the JABAWS command line client to connect to and use JABAWS web services both in Dundee and in my local lab?

Yes, just point it to the host you want to use by changing the value of the -h key. For example, if you used the server -h=http://myhost.compbio.ac.uk:8080/jabaws and now want to use another one, change it to -h=http://mylabserver.myuni.edu. This comes in handy if you want to align many sequences or do not want to send some of your data over the Internet.

Sometimes users send a very large number of sequences to the JABAWS server, so that it becomes unresponsive. Can I limit the number of sequences users can submit to my server?

Yes, JABAWS can be configured to reject requests based on the number of sequences, or on the number of sequences and their average length, per single alignment request. Look at the Restricting JABAWS section for further details.

My cluster is quite busy, so the waiting times in the task queue are significant. At the same time I have a powerful server with many cores. Can I use the server for small tasks and send really big ones to the cluster?

Yes, you can. For this you need to enable and configure both the cluster and the local engines. Once this is done, decide on the maximum size of a task to be run locally on the server. In JABAWS the size of a task is defined by the number of sequences and the average sequence length. Edit the "# LocalEngineExecutionLimit #" preset in the <ServiceName>Limits.xml file accordingly.
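
The routing idea can be sketched as follows. This is an illustration only, assuming Java 9 or later. The TaskRouter and LocalLimit names are hypothetical, and the rule used here (a task stays local only if both the number of sequences and the average length are within the limit) is an assumed reading of the description above, not the actual JABAWS implementation.

```java
import java.util.List;

// Hypothetical sketch of size-based routing, not the JABAWS scheduler.
final class TaskRouter {

    enum Engine { LOCAL, CLUSTER }

    /** Assumed stand-in for the values of the LocalEngineExecutionLimit preset. */
    static final class LocalLimit {
        final int maxSequences;
        final int maxAverageLength;

        LocalLimit(int maxSequences, int maxAverageLength) {
            this.maxSequences = maxSequences;
            this.maxAverageLength = maxAverageLength;
        }
    }

    /** Chooses the local engine for small tasks and the cluster for everything else. */
    static Engine route(List<String> sequences, LocalLimit limit) {
        if (sequences.isEmpty()) {
            throw new IllegalArgumentException("no sequences to align");
        }
        long totalLength = 0;
        for (String sequence : sequences) {
            totalLength += sequence.length();
        }
        double averageLength = (double) totalLength / sequences.size();
        boolean small = sequences.size() <= limit.maxSequences
                && averageLength <= limit.maxAverageLength;
        return small ? Engine.LOCAL : Engine.CLUSTER;
    }

    public static void main(String[] args) {
        LocalLimit limit = new LocalLimit(2, 100); // illustrative numbers only
        List<String> task = List.of("MTADGPRELLQLRAAV", "MGDTTAGEMAVQRGLALHQ", "MTADGPRELL");
        System.out.println(route(task, limit));
    }
}
```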

Can I run my own JABAWS if I do not have a cluster?

Yes, JABAWS can be run on a single server. Obviously the capacity will be limited, but may be sufficient for a small lab. Installed on a single server, JABAWS executes tasks in parallel, so the more cores the server has the more requests it will be able to handle.

My cluster uses LSF/PBS/SGE etc. Can I run JABAWS on my cluster?

JABAWS uses the DRMAA v. 1.0 library to send and manage jobs on the cluster. DRMAA supports many different cluster job management systems, namely Sun Grid Engine, Condor, PBS, GridWay, Globus 2/4, PBSPro and LSF. For up-to-date information please consult the DRMAA web site. We found that DRMAA implementations differ from platform to platform, so we tried to use only the basic functions. We have only tested JABAWS on Sun Grid Engine v 6.2. Please let us know if you have any experience of running JABAWS on other platforms.

My cluster does not have shared disk space. Can I run JABAWS?

No, not on the cluster. At the moment, to operate on the cluster JABAWS requires disk space that all cluster nodes have access to. However, you could still run JABAWS on a single server.

I installed JABAWS in my lab. I would like to define different limits for public and lab users.

Currently only one set of limits is supported per web service. If you need to provide a different quality of service for different groups of users, it is best to make a second JABAWS installation on a different server (or on the same server but in a different context) and define different limits there. The lab users could then use one server, and the public another.

I tried aligning 5000 sequences using Jalview on my laptop. It crashed. Why did Jalview allow me to do that?

Jalview uses the same code for local task execution as JABAWS. Currently the JABAWS machinery supports limits at the web service level only. If you execute a task locally or submit it directly to the cluster, no limits are applied.

Can I run many JABAWS instances on the same server?

Yes. JABAWS is supplied as a single Web Application Archive which can be dealt with like any other web application. Make two different contexts on your application server and unpack JABAWS in both of them. For example, if your server name is http://www.align.ac.uk and the context names are public and private, then one group of users could be given the URL http://www.align.ac.uk/public and another http://www.align.ac.uk/private. These contexts will be served by two independent JABAWS instances, which could be configured differently. If you keep the local engine enabled, make sure you reduce the number of threads the local engine is allowed to use, to avoid overloading the server. Alternatively, two completely separate web application server instances (e.g. Apache-Tomcat) could be used. This will give you better resilience and more flexibility in memory settings.

I deployed JABAWS on Windows. Why do not all alignment programs work?

JABAWS is platform independent, and thus can be deployed on any operating system which supports Java. However, the executables which do the calculations are platform specific. JABAWS uses different versions of the executables for different platforms under the hood. If an executable is not available for a particular platform, then the corresponding service will not function.

If I install JABAWS on Windows, which web services will work?

Of all the supported services, only Clustal and Muscle will work. This is because only these two executables support Windows natively.

There are versions of Mafft and Tcoffee for Windows, so why do you not support them?

Indeed, Mafft and Tcoffee can be executed on the Windows platform, but only in the Linux-like environment provided by Cygwin. Their setup is very complicated and performance will suffer in such an environment. We do not support such a configuration and are unlikely to support it in the future. JABAWS is best installed in a Unix-like environment; once this is done, it can be accessed from any operating system.

What operating systems can JABAWS run on?

JABAWS can be run on any operating system that supports Java. However, it is best run on a Unix-like operating system, as all the programs JABAWS uses are available for the Unix platform and only a few are available on Windows. Currently only Clustal and Muscle are available for the Windows platform.

What happens if the number of requests to my JABAWS installation is greater than the server can process?

That depends on your configuration. If only cluster submission is enabled, then tasks will be sent to the cluster and you will experience a longer wait as a result of task queuing. It is common that the number of tasks the web server can send to the cluster is unlimited, but the number of tasks that can be run in parallel is controlled, hence the queue.
If only local execution is enabled, the number of tasks that can be executed in parallel is limited to the number of threads, which by default is equal to the number of cores available on the server, but can also be defined declaratively in the configuration. Tasks that cannot be executed immediately wait in the queue, so you can expect a longer execution time. If both local execution and cluster submission are enabled, then a task that cannot be executed locally immediately is sent to the cluster.
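
The queuing behaviour of local execution can be illustrated with a standard Java fixed-size thread pool. This is a sketch under the assumption that the local engine behaves like such a pool. The LocalQueueSketch class is hypothetical and is not JABAWS code. Tasks submitted beyond the number of worker threads simply wait in the pool's queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a bounded local engine, not JABAWS code.
final class LocalQueueSketch {

    /** Default number of worker threads: one per available core. */
    static int defaultThreads() {
        return Runtime.getRuntime().availableProcessors();
    }

    /**
     * Submits taskCount dummy tasks to a pool of the given size and returns
     * the highest number of tasks that were observed running at once.
     */
    static int peakConcurrency(int taskCount, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                pending.add(pool.submit(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(20); // stands in for running an alignment
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    running.decrementAndGet();
                }));
            }
            for (Future<?> task : pending) {
                task.get(); // wait for queued tasks to finish
            }
            return peak.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("task did not complete", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int threads = defaultThreads();
        System.out.println("threads: " + threads
                + ", peak concurrent tasks: " + peakConcurrency(4 * threads, threads));
    }
}
```

However many tasks are submitted, the peak never exceeds the pool size, which is why reducing the thread count protects a shared server at the cost of longer waits.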

I would like to keep an eye on who is using JABAWS services on my server.

Enable the Tomcat access log valve. To do this, uncomment the following section of the <tomcat_root>/conf/server.xml configuration file.

<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
prefix="localhost_access_log." suffix=".txt" pattern="common" resolveHosts="false"/>

The following information will be logged:

Remote IP Date Method server_URL protocol HTTP status Response size in bytes
10.31.11.159 [10/Feb/2010:16:51:32 +0000] "POST /jws2/MafftWS HTTP/1.1" 200 2067

These logs can be processed by various log analysis programs, such as WebAlizer, Analog and AWStats.
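
For a quick look without a full log analyser, the access log can also be summarised with a few lines of code. The sketch below assumes Java 9 or later and counts requests per service path. The AccessLogStats class is hypothetical. Its regular expression is written to accept the sample line shown above, and also the variant in which two extra fields appear between the remote IP and the date.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for summarising Tomcat access log lines.
final class AccessLogStats {

    // Groups: 1 remote IP, 2 date, 3 method, 4 URL path, 5 HTTP status, 6 response size
    private static final Pattern LINE = Pattern.compile(
            "^(\\S+) (?:\\S+ \\S+ )?\\[([^\\]]+)\\] \"(\\S+) (\\S+) [^\"]*\" (\\d{3}) (\\S+)$");

    /** Counts requests per URL path, e.g. /jws2/MafftWS; unparseable lines are skipped. */
    static Map<String, Integer> requestsPerPath(List<String> logLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : logLines) {
            Matcher m = LINE.matcher(line.trim());
            if (m.matches()) {
                counts.merge(m.group(4), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> log = List.of(
                "10.31.11.159 [10/Feb/2010:16:51:32 +0000] \"POST /jws2/MafftWS HTTP/1.1\" 200 2067");
        requestsPerPath(log).forEach((path, count) ->
                System.out.println(path + ": " + count));
    }
}
```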

I would like to know how much of the CPU time has been consumed and which tasks were the longest.

JABAWS stores the cluster task ids of all tasks which were run on the cluster. Using these ids, detailed statistics can be extracted from the cluster accounting system. Because each cluster supported by JABAWS has a different accounting system, it was not possible to provide ready-to-use statistics.
For local execution, the starting and finishing times in nanoseconds can be found in the STARTED and FINISHED files respectively. In time we will provide tools to extract execution time statistics, so keep the content of your working directory!
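
Until such tools are available, the execution time of a local task can be computed by hand. The sketch below assumes that each of the STARTED and FINISHED files in a task directory contains a single integer, the time in nanoseconds. That file layout is an assumption, so check one of your own task directories first. The TaskTiming class is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; assumes STARTED and FINISHED each hold one integer.
final class TaskTiming {

    /** Elapsed time in nanoseconds from the text content of the two files. */
    static long elapsedNanos(String startedText, String finishedText) {
        return Long.parseLong(finishedText.trim()) - Long.parseLong(startedText.trim());
    }

    /** Elapsed time in nanoseconds for one task directory. */
    static long elapsedNanos(Path taskDirectory) {
        return elapsedNanos(
                read(taskDirectory.resolve("STARTED")),
                read(taskDirectory.resolve("FINISHED")));
    }

    static double toSeconds(long nanos) {
        return nanos / 1_000_000_000.0;
    }

    private static String read(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Usage: java TaskTiming.java <path to a task directory>
        Path taskDirectory = Path.of(args[0]);
        System.out.println(toSeconds(elapsedNanos(taskDirectory)) + " s");
    }
}
```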

I noticed that the jobsout, conf and binaries directories are not placed in the WEB-INF directory. Does this mean they are accessible to anyone?

Access to these directories is denied to unauthorized users by means of a security constraint defined in the web application descriptor file. There is a special user role called admin which can access these directories. This comes in handy if you would like to keep an eye on the task outputs stored in jobsout, or would like to view the configuration files. To access these directories, add an admin user to your application server. The way you do this depends on where you would like the user passwords to come from and on your web application server. If you use Tomcat, then the simplest way is to use the Tomcat Memory Realm, which is linked to a plain text configuration file. To define the user in the Tomcat server, add an entry to the conf/tomcat-users.xml file:

<role rolename="admin"/>
<user username="admin" password="your password here " roles="admin"/>

Once this is done, make sure that the servlet that returns the web application directory listings is enabled. Look in the <tomcatroot>/conf/web.xml file for the following:

<param-name>listings</param-name>
<param-value>true</param-value>

The whole section that defines the default listing servlet is below

<servlet>
<servlet-name>default</servlet-name>
<servlet-class>org.apache.catalina.servlets.DefaultServlet</servlet-class>
<init-param>
<param-name>debug</param-name>
<param-value>0</param-value>
</init-param>
<init-param>
<param-name>listings</param-name>
<param-value>true</param-value>
</init-param>
<load-on-startup>1</load-on-startup>
</servlet>

These listings are read only by default.

Sometimes something goes wrong with JABAWS, but I cannot figure out why. Can you help?

JABAWS logs all errors to stdout and, if logging is enabled, to a file called activity.log. Stdout is usually recorded in the web server log files. Have a look there and you may find the reason for the problem. If it is still unclear what went wrong, try increasing the logging level. Setting the logging level to TRACE or DEBUG will give you a lot of insight into what goes on behind the scenes. We would need this log if you want us to help you, or if you would like to report a bug. To change the log level, replace the ERROR keyword in the ACTIVITY logger with TRACE.

JABAWS on Apache-Tomcat

I dropped the jaba.war file into the web application directory but nothing happened. What do I do next?

  • Make sure Tomcat has sufficient access rights to read your war file.
  • Restart Tomcat; sometimes it will not notice that a new war file has been added without a restart.
  • If Tomcat still refuses to unpack the war file, unpack it manually into the web application folder. Set the executable flag for the native executables (alignment programs) by running the setexecflag.sh script as described in the Installation section. Restart Tomcat.

I removed the JABAWS war file from the webapps directory after it was deployed, and my JABAWS web application folder disappeared. Where has it gone?

If you used Tomcat automated deployment for JABAWS, which is when you just drop a war file into the webapps directory, then Tomcat will automatically undeploy your application if the war file is removed. So do not remove the war file if you intend to keep JABAWS running.

I want to make sure that Tomcat will not undeploy/delete the JABAWS directory from the server. What should I do?

Use an explicit application descriptor. It can come in different flavors; the one I prefer is to drop a context descriptor file into the <tomcatRoot>/conf/Catalina/localhost directory. Name your context file the same as your application folder, e.g. if your JABAWS resides in the webapps/jaba folder, then call the context file jaba.xml. Below is an example of the content this file might have.

<?xml version="1.0" encoding="UTF-8"?>
<Context antiResourceLocking="false" privileged="true" />

This should be sufficient to prevent Tomcat from removing your JABAWS deployment folder. For more information about the Tomcat deployer, read the deployer documentation on the Tomcat web site.

JABAWS on VM (Virtual Machine)

I cannot open the VM using VirtualBox due to a VERR_VMX_MSR_LOCKED_OR_DISABLED exception. Can you help?

The VERR_VMX_MSR_LOCKED_OR_DISABLED exception means that Intel Virtualization Technology is disabled or not supported by your computer. If you have this problem, please make sure you have configured the JABAWS VM with 1 CPU and disabled the VT-x extensions. Alternatively, you can enable the virtualization extensions in the BIOS of your computer. Unfortunately, we cannot give you exact instructions on how to do this, as this depends on your computer's BIOS manufacturer. For Macs it may not be possible at all.

VMware Player - Failed to query source for information. I cannot open the OVF file using VMware Player. Why is this?

At the time of writing, the latest version of VMware Player (3.1.2) supported only the legacy OVF version 0.9, whereas the OVF packaged with the JABAWS VM is version 1.0. Please use the VMX (VMware specific) configuration file with all VMware products.

I want to connect to the Internet from my VM. Can I do that?

By default the JABAWS VM is configured to use host-only networking. This means that the host can communicate with the VM via a network, but no other machines can. Similarly, the VM cannot communicate with any computers other than the host. If you want to connect to the Internet from the VM, configure your VM to use a NAT network. However, you will not be able to connect to the VM from the host in that case. If you want to be able to connect to your VM and let the VM connect to the Internet at the same time, you would have to use a bridged network. In that case you would have to configure the VM IP address manually (unless of course your network has a DHCP server to do that).