Advanced Usage
==============

JABAWS web services are WS-I Basic Profile compliant, which means they can be accessed from any programming language or system that can use standard SOAP web services. The Web Services Description Language (WSDL) file for each service is published on the JABAWS home page, and you can use it to automatically generate service bindings for your program. If you use Java, you may wish to use our client package to access JABAWS. This package is based on the autogenerated source code produced by wsimport, the Java tool for creating web service bindings. In addition, it offers some extra methods that simplify working with JABAWS. For more information please refer to the data model javadoc.

------------

.. _jabaws_wsdl:

Valid WSDL
----------

**Multiple sequence alignment services**

* ClustalOWS - http://www.compbio.dundee.ac.uk/jabaws/ClustalOWS?wsdl
* ClustalWS - http://www.compbio.dundee.ac.uk/jabaws/ClustalWS?wsdl
* MuscleWS - http://www.compbio.dundee.ac.uk/jabaws/MuscleWS?wsdl
* MafftWS - http://www.compbio.dundee.ac.uk/jabaws/MafftWS?wsdl
* TcoffeeWS - http://www.compbio.dundee.ac.uk/jabaws/TcoffeeWS?wsdl
* ProbconsWS - http://www.compbio.dundee.ac.uk/jabaws/ProbconsWS?wsdl
* MSAprobsWS - http://www.compbio.dundee.ac.uk/jabaws/MSAprobsWS?wsdl
* GLprobsWS - http://www.compbio.dundee.ac.uk/jabaws/GLprobsWS?wsdl

**Protein disorder prediction services**

* IUPredWS - http://www.compbio.dundee.ac.uk/jabaws/IUPredWS?wsdl
* GlobPlotWS - http://www.compbio.dundee.ac.uk/jabaws/GlobPlotWS?wsdl
* DisemblWS - http://www.compbio.dundee.ac.uk/jabaws/DisemblWS?wsdl
* JronnWS - http://www.compbio.dundee.ac.uk/jabaws/JronnWS?wsdl

**Amino acid conservation service**

* AAConWS - http://www.compbio.dundee.ac.uk/jabaws/AAConWS?wsdl

**RNA Secondary Structure Prediction**

* RNAalifoldWS - http://www.compbio.dundee.ac.uk/jabaws/RNAalifoldWS?wsdl

Please replace http://www.compbio.dundee.ac.uk/ with your JABAWS instance host name, and jabaws with your JABAWS context name, to access your local version of the JABAWS web services. For example, http://localhost:8080/jabaws would be a valid URL for the default Apache-Tomcat installation and jabaws.war file deployment.

------------

.. _jabaws_config:

JABAWS Configuration
--------------------

There are three parts of the system you can configure: the local engine, the cluster engine, and the paths to the individual executables for each engine. These settings are stored in configuration files within the web application directory (for an overview, take a look at the war file content table) [link].

Initially, JABAWS is configured with only the local engine enabled, with job output written to a directory called "jobsout" within the web application itself. This means that JABAWS will work out of the box, but may not be suitable for serving a whole lab or a university.

------------

.. _jabaws_config_le:

Local Engine Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

The local execution engine configuration is defined in the properties file ``conf/Engine.local.properties``. The supported configuration settings are:

``engine.local.enable=true`` - enable or disable the local engine, valid values true | false

``local.tmp.directory=D:\\clusterengine\\testoutput`` - a directory to use for temporary file storage. Optional, defaults to the Java temporary directory.

``engine.local.thread.number=4`` - number of threads for task execution (valid values are between 1 and 2x, where x is the number of CPU cores available in the system). Optional, defaults to the number of cores when there are 4 cores or fewer, and to the number of cores minus 1 for greater core numbers.

If the local engine is going to be heavily loaded (which is often the case if you do not have a cluster), it is a good idea to increase the amount of memory available to the web application server. If you are using Apache-Tomcat, then you can define its memory settings in the JAVA_OPTS environment variable.
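
As a side note on the ``engine.local.thread.number`` default described above, the rule can be sketched in a few lines of shell. This is only an illustration of the documented rule, not code taken from JABAWS; the function name ``default_threads`` is hypothetical.

.. code:: bash

   # default_threads CORES
   # Hypothetical helper: prints the default local engine thread count
   # as the text describes it: CORES when CORES <= 4, otherwise CORES - 1.
   default_threads() {
       cores=$1
       if [ "$cores" -le 4 ]; then
           echo "$cores"
       else
           echo $((cores - 1))
       fi
   }

For example, ``default_threads 8`` prints ``7``, while ``default_threads 4`` prints ``4``.
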
To specify which JVM to use for Apache-Tomcat, put the full path to the JRE installation in the JAVA_HOME environment variable. (We would recommend using the Sun Java Virtual Machine (JVM) in preference to Open JDK.) Below is an example of code which can be added to the ``/bin/setenv.sh`` script to define which JVM to use and the memory settings for the Tomcat server. The Tomcat server startup script (``catalina.sh``) will execute ``setenv.sh`` automatically on each server start.

.. code:: bash

   export JAVA_HOME=/homes/ws-dev2/jdk1.6.0_17/
   export JAVA_OPTS="-server -Xincgc -Xms512m -Xmx1024m"

------------

.. _jabaws_config_ce:

Cluster Engine Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Supported configuration settings:

``engine.cluster.enable=true`` - enable or disable the cluster engine, valid values true | false, defaults to false

``cluster.tmp.directory=/homes/clustengine/testoutput`` - a directory to use for temporary file storage. This setting is required. The value must be an absolute path to the temporary directory, and it must be different from the directory defined for the local engine. This directory must be accessible from all cluster nodes.

For the cluster engine to work, the SGE_ROOT and LD_LIBRARY_PATH environment variables have to be defined. They tell the cluster engine where to find the DRMAA libraries. These variables should be defined when the web application server starts up, e.g.

.. code:: bash

   SGE_ROOT=/gridware/sge
   LD_LIBRARY_PATH=/gridware/sge/lib/lx24-amd64

Finally, do not forget to configure the executables for cluster execution. They may be the same as for local execution, or they may be different. Please refer to the executable configuration section for further details.

------------

.. _jabaws_config_ec:

Executable Configuration
~~~~~~~~~~~~~~~~~~~~~~~~

All the executable programs are configured in the ``conf/Executable.properties`` file. Each executable is configured with a number of options. They are:

.. code:: bash

   local.X.bin.windows=
   local.X.bin=
   cluster.X.bin=
   X.bin.env=
   X.--aamatrix.path=
   X.presets.file=
   X.parameters.file=
   X.limits.file=
   X.cluster.settings=

Here X is any of the bioinformatics tools available (e.g. clustalw, muscle, mafft, probcons, t-coffee, etc.).

The default JABAWS configuration includes paths to the local executables to be run by the local engine only. All cluster related settings are commented out, but they are there for you as examples. The cluster engine is disabled by default. To configure an executable for cluster execution, uncomment the X.cluster settings and change them appropriately.

By default, limits are set well in excess of what you may want to offer to users outside your lab, to make sure that tasks are never rejected. The default limit is 100000 sequences of 100000 letters on average for all of the JABA web services. You can adjust the limits according to your needs by editing the ``conf/settings/Limit.xml`` files.

After you have completed the editing, your configuration may look like this:

.. code:: bash

   local.mafft.bin=binaries/mafft
   cluster.mafft.bin=/homes/cengine/mafft
   mafft.bin.env=MAFFT_BINARIES#/homes/cengine/mafft;FASTA_4_MAFFT#/bin/fasta34;
   mafft.--aamatrix.path=binaries/matrices
   mafft.presets.file=conf/settings/MafftPresets.xml
   mafft.parameters.file=conf/settings/MafftParameters.xml
   mafft.limits.file=conf/settings/MafftLimits.xml
   mafft.cluster.settings=-q bigmem.q -l h_cpu=24:00:00 -l h_vmem=6000M -l ram=6000M

Please note that relative paths must only be specified for files that reside inside the web application directory; all other paths must be supplied as absolute paths. Furthermore, you should avoid using environment variables within the paths or options, since these will not be evaluated correctly. Instead, please explicitly specify the absolute path to anything normally evaluated from an environment variable at execution time.

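
The ``mafft.bin.env`` value in the example above packs several environment variables into one line, with a semicolon between variables and a hash between each name and its value. The following is a minimal sketch of how such a value could be split for inspection when checking a configuration by hand. It is not how JABAWS itself reads the property, and the function name ``parse_bin_env`` is hypothetical.

.. code:: bash

   # parse_bin_env VALUE
   # Hypothetical helper: prints one NAME=value line for each variable
   # packed into an X.bin.env value of the form NAME#value;NAME#value;
   parse_bin_env() {
       printf '%s\n' "$1" | tr ';' '\n' | while IFS='#' read -r name value; do
           # The trailing semicolon yields an empty entry, which is skipped.
           if [ -n "$name" ]; then
               printf '%s=%s\n' "$name" "$value"
           fi
       done
   }

Calling ``parse_bin_env 'MAFFT_BINARIES#/homes/cengine/mafft;FASTA_4_MAFFT#/bin/fasta34;'`` prints ``MAFFT_BINARIES=/homes/cengine/mafft`` and ``FASTA_4_MAFFT=/bin/fasta34`` on separate lines.
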
If you are using JABAWS to submit jobs to the cluster (with the cluster engine enabled), the executables must be available from all cluster nodes the task can be sent to. Also, paths to the executables on the cluster, e.g. ``cluster..bin``, must be absolute.

Executables can be located anywhere in your system; they do not have to reside on the server, as long as the web application server can access and execute them.

Cluster settings are treated as a black box: the system will just pass whatever is specified in this line directly to the cluster submission library. This is how DRMAA itself treats these settings. More precisely, the DRMAA ``JobTemplate.setNativeSpecification()`` function will be called.

For further details and examples of configuration please refer to the ``Executable.properties`` file supplied with JABAWS.

------------

.. _jabaws_config_env_exe:

Defining Environment Variables for Executables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Environment variables can be defined in the property

.. code:: bash

   x.bin.env

where x is one of the executables supported by JABAWS. Several environment variables can be specified in the same line. For example:

.. code:: bash

   mafft.bin.env=MAFFT_BINARIES#/homes/cengine/mafft;FASTA_4_MAFFT#/bin/fasta34;

The example above defines two environment variables with names ``MAFFT_BINARIES`` and ``FASTA_4_MAFFT`` and values ``/homes/cengine/mafft`` and ``/bin/fasta34`` respectively. A semicolon is used as a separator between different environment variables, whereas a hash is used as a separator between the name and the value of a variable.

------------

.. _jabaws_config_env_mafft:

Configure JABAWS to Work with Mafft
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you use the default configuration you do not need to read any further. The default configuration will work for you without any changes. However, if you want to install Mafft yourself, then there are a couple more steps to do.

The Mafft executable needs to know the location of other files supplied with Mafft.
In addition, some Mafft functions depend on the fasta executable, which is not supplied with Mafft but is a separate package. Mafft needs to know the location of the fasta34 executable.

To let Mafft know where the other files from its package are, change the value of the MAFFT_BINARIES environment variable. To let Mafft know where the fasta34 executable is, set the value of the FASTA_4_MAFFT environment variable to point to the location of the fasta34 program. The latter can be added to the PATH variable instead. If you are using the executables supplied with JABAWS, the path to the Mafft binaries would be like ``/binaries/src/mafft/binaries`` and the path to the fasta34 binary would be ``/binaries/src/fasta34/fasta34``. You can specify the location of the Mafft binaries as well as the fasta34 program elsewhere by providing an absolute path to them. All these settings are defined in the ``conf/Executable.properties`` file.

------------

.. _jabaws_config_env_limit:

Limiting the size of the job accepted by JABAWS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

JABAWS can be configured to reject excessively large tasks. This is useful if you operate a JABAWS service for many users. By defining a maximum allowed task size you can provide an even service for all users and prevent waste of resources on tasks too large to complete successfully. You can define the maximum number of sequences and the maximum average sequence length that JABAWS accepts for each JABA Web Service independently. Furthermore, you can define different limits for different presets of the same web service.

By default limits are disabled. You can enable them by editing the ``conf/Executable.properties`` file. You can adjust the limits according to your needs by editing the ``conf/settings/Limit.xml`` files.

.. _war_precompiled_bin:

Pre-compiled binaries
~~~~~~~~~~~~~~~~~~~~~

.. danger:: improve this bit

**Using a different version of the alignment program with JABAWS**

JABAWS is supplied with binaries and source code of the executables related to the version it supports. So normally you would not need to install your own executables. However, if you have a different version of an executable (e.g. an alignment program) which you prefer, you could use it as long as it supports all the functions the JABAWS executable requires. This could be the case with a more recent executable. If the options supported by your chosen executable are different from those of the standard JABAWS executable, then you need to edit the ``ExecutableNameParameters.xml`` configuration file.

JABAWS comes with pre-compiled x86 Linux binaries, thus on such systems JABAWS should work straight out of the box. If you are in any doubt or experience problems, you may want to make sure that the binaries supplied work under your OS. To do this, just execute each binary without any command line options or input files. If you see an error message complaining about missing libraries or other problems, then you probably need to recompile the binaries [link].

You can try the JABAWS functionality with the JABAWS test client, or have a look at the deploying on Tomcat [link] tips if you experience any problems.

.. note:: You may want to enable logging, as described here [link].

JABAWS's web services use command line programs to do the actual analysis, so it must have access to programs which can be executed on your platform. The native executables bundled with JABAWS for Windows (32-bit) and Linux (i386, 32-bit) should be OK for those systems. The source code for these programs is also provided, so you can recompile for your own architecture [link] and exploit any optimizations that your system can provide. Alternatively, if you already have binaries on your system, then you can simply change the paths in JABAWS's configuration files [link] so these are used instead.

------------

.. _war_recompile_bin:

Recompiling binaries for your system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you have a fully equipped build environment on your (POSIX-like) system, then you should be able to recompile the programs from the source distributions which are included in the JABAWS war file. A script called 'compilebin.sh' is provided to automate this task.

1. In a terminal window, change the working directory to ``binaries/src``

2. Execute the compilebin.sh script:

   .. code:: bash

      chmod +x compilebin.sh; ./compilebin.sh > compilebin.out;

3. Then run:

   .. code:: bash

      chmod +x setexecflag.sh; sh setexecflag.sh

   If any of the binaries was not recompiled, then a 'file not found' error will be raised.

4. Finally, restart your Tomcat server (or the JABAWS application only), and test JABAWS [link] to check that it can use the new binaries.

If you couldn't compile everything, then it may be that your system does not have all the tools required for compiling the programs. At the very least, check that you have gcc, g++ and make installed in your system. If not, install these packages and repeat the compilation steps. You should also review the compilebin.sh output, which was redirected to compilebin.out, and any errors output to the terminal. Finally, try obtaining the pre-compiled binaries [link] for your OS.

------------

.. _war_reusing_bin:

Obtaining or reusing binaries
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You could search for pre-packaged compiled executables in your system package repository or, alternatively, download pre-compiled binaries from each alignment program's home page. Then, either replace the executables supplied with the downloaded ones, or modify the paths defined in ``Executable.properties`` as described below.

.. Below are some suggestions on where you may be able to get the binaries for your system.

If you would like to use the binaries you already have, then you just need to let JABAWS know where they are.
To do this, edit: ``conf/Executable.properties``

When specifying paths to executables that already exist on your system, make sure you provide an absolute path, or one relative to the JABAWS directory inside webapps. For example, the default path for clustalw is defined as ``local.clustalw.bin=binaries/src/clustalw/src/clustalw2``. Alternatively, instead of changing ``Executable.properties``, you could replace the executables bundled with JABAWS with the ones that you have, or make symbolic links to them. Then the default configuration will work for you. More information about the ``Executable.properties`` file is given in the JABAWS Configuration page.

------------

.. _jabaws_config_lb:

Load balancing
--------------

If your cluster is busy and has significant waiting times, you can achieve a faster response by allowing the server machine to calculate small tasks and reserving the cluster for bigger jobs. This works especially well if your server is a powerful machine with many CPUs. To do this, you need to enable and configure both the cluster and the local engines. Once this is done, decide on the maximum size of a task to be run locally on the server. Then edit the "# LocalEngineExecutionLimit #" preset in the ``Limits.xml`` file accordingly. The JABAWS server will then balance the load according to the following rule: if the task size is smaller than the maximum task size for the local engine, and the local engine has idle threads, then it runs the task locally; otherwise it submits the task to the cluster.

------------

.. _war_testing:

Testing the JABAWS Server
-------------------------

.. danger:: improve this bit

Access ``/ServiceStatus`` to test all web services. Each time you access this URL, all services are tested. For a production configuration we recommend prohibiting requests to this URL for non-authenticated users to prevent excessive load on the server.
Alternatively, you can use a command line client (part of the client only package) to test your JABAWS installation as described here. If you downloaded a JABAWS server package, you can use ``/WEB-INF/lib/jaba-client.jar`` to test the JABAWS installation as described here. If you downloaded the source code, then you could run a number of test suites defined in the build.xml Apache Ant file.

First of all, make sure that the Tomcat server has started successfully. If this is the case, then you should see the JABAWS home page when you navigate to your Tomcat JABAWS context path in your browser (e.g. at ``http://myhost.compbio.ac.uk:8080/jabaws`` => ````)

If you see it, then it is time to make sure that the web services are working too. The easiest way to do this is to access the Services Status page available from the main JABAWS web page menu.

**Using JABAWS service status checker**

If you need to monitor web service health automatically, then the best option is to use the service checker that responds with a standard HTTP status code. To access this checker use the following URL:

``/HttpCodeResponseServiceStatus`` or alternatively ``/man_serverwar.jsp``

This page returns code 200 and no page content if all services are operational, and 503 if one of the services has problems. You can also check each web service individually by providing the name of the web service to check at the end of the service checker URL, like this:

``/HttpCodeResponseServiceStatus/ClustalWS``

Upon request, the service status checker will examine the health of the ClustalWS web service only.
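
For automated monitoring, the status codes returned by the service checker (200, 503, and 400 for an invalid service name) can be mapped to messages in a small script. The following is a minimal sketch that assumes ``curl`` is installed and that the checker is reached under the JABAWS context URL, which is an assumption made here for illustration. The function names ``describe_status`` and ``check_jabaws`` are hypothetical.

.. code:: bash

   # describe_status CODE
   # Hypothetical helper: translates the checker's HTTP status code
   # into a short message, following the codes listed in the text.
   describe_status() {
       case "$1" in
           200) echo "all checked services are operational" ;;
           503) echo "at least one service has problems" ;;
           400) echo "service name is not valid" ;;
           *)   echo "unexpected status code: $1" ;;
       esac
   }

   # check_jabaws BASE_URL [SERVICE]
   # Requests the checker with curl and reports the result. BASE_URL is
   # assumed to be the JABAWS context URL; SERVICE is optional.
   check_jabaws() {
       url="$1/HttpCodeResponseServiceStatus${2:+/$2}"
       code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
       describe_status "$code"
   }

A call such as ``check_jabaws http://myhost.compbio.ac.uk:8080/jabaws ClustalWS`` would then check a single service, and omitting the service name would check all of them.
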
If the service name is not valid, then the service checker will return code 400.

**Using command line client**

Alternatively, you should be able to use the test program which can be found in the ``/WEB-INF/lib/jabaws-client.jar`` file. To run the tests type:

.. code:: bash

   java -jar jabaws-client.jar -h=

For example, to test all JABAWS web services on host myhost.compbio.ac.uk type:

.. code:: bash

   java -jar jabaws-client.jar -h=http://myhost.compbio.ac.uk:8080/jabaws

You can choose a particular web service using the -s option like this:

.. code:: bash

   java -jar jabaws-client.jar -h=http://myhost.compbio.ac.uk:8080/jabaws -s=ClustalWS

This command line assumes that the java executable is in your path and that jabaws-client.jar is located in the current directory.

An example of the report that the testing tool produces for an operating web service looks like this:

.. code:: bash

   Connecting to service MuscleWS on http://myhost.compbio.ac.uk:8080/jabaws ... OK
   Testing alignment with default parameters:
   Queering job status...OK
   Retrieving results...OK
   Testing alignment with presets:
   Aligning with preset 'Protein alignment(Fastest speed)'... OK
   Aligning with preset 'Nucleotide alignment(Fastest speed)'... OK
   Aligning with preset 'Huge alignments (speed-oriented)'... OK
   Queering presets...OK
   Queering Parameters...OK
   Queering Limits...OK
   Queering Local Engine Limits...OK
   Check is completed service MuscleWS IS WORKING

An example of the response of a web service which is deployed but is not operating is below:

.. code:: bash

   Connecting to service ProbconsWS on http://localhost:8080/ws ... OK
   Testing alignment with default parameters:FAILED
   Service ProbconsWS IS NOT FUNCTIONAL

If the web server did not respond, the message looks like the following:

.. code:: bash

   Connecting to service TcoffeeWS on http://localhost:8080/ws ... FAILED

------------

.. _war_logging:

JABAWS internal logging
-----------------------

JABAWS can be configured to log what it is doing.
This comes in handy if you would like to see who is using your web services or need to chase some problems. JABAWS uses log4j to do the logging, and an example log4j configuration is bundled with the JABAWS war file. You will find it in the ``/WEB-INF/classes/log4j.properties`` file. All the lines in this file are commented out. The reason why logging is disabled by default is simple: log4j has to know the exact location where the log files are stored, and this is not known until deployment time. To enable logging you need to define the logDir property in log4j.properties and uncomment the section of the file which corresponds to your needs. More information is given in the log4j.properties file itself. Restart Tomcat or the JABAWS web application to apply the settings.

After you have done this, assuming that you did not change the log4j.properties file yourself, you should see the application log file called activity.log. The amount of information logged can be adjusted using different logging levels; it is reduced in the following order of log levels: TRACE, DEBUG, INFO, WARN, ERROR, FATAL.

If you would like to know who is using your services, you might want to enable Tomcat request logging.

------------

.. _war_logging_req:

JABAWS requests logging
~~~~~~~~~~~~~~~~~~~~~~~

Enable the Tomcat log valve. To do this, uncomment the access log valve section of the ``/conf/server.xml`` configuration file.

The following information will be logged:

+--------------+------------------------------+-------------------------------+---------------+------------------------+
| Remote IP    | Date                         | Method server_URL protocol    | HTTP status   | Response size in bytes |
+==============+==============================+===============================+===============+========================+
| 10.31.11.159 | [10/Feb/2010:16:51:32 +0000] | "POST /jws2/MafftWS HTTP/1.1" | 200           | 2067                   |
+--------------+------------------------------+-------------------------------+---------------+------------------------+

These logs can be processed by various log analysis programs, such as WebAlizer, Analog and AWStats [links].

------------

.. _jabaws_config_ga:

JABAWS and Google Analytics
---------------------------

JABAWS reports web services usage to our group Google Analytics (GA) account. JABAWS usage statistics are collected for funding and reporting purposes, and no private information is collected. The data sent by JABAWS is as follows:

1. The IP address of the JABAWS server machine (the server IP can be anonymized, see the ``conf/GA.properties`` config file)
2. The name of the web service that was called.
3. A few details of the system such as JABAWS version, Java version, user language, color depth, screen resolution and character encoding.

Google Analytics can be disabled or adjusted by removing or editing the ``conf/GA.properties`` Google Analytics (GA) settings file. We would appreciate it greatly if you could leave it on!

All calls to GA are very lightweight, are completed asynchronously, create very little overhead and do not influence the server response time or performance.

------------

.. _war_contents:

JABAWS War File Content
-----------------------

+---------------------+--------------------------------------------------------------------------------+
| Directory           | Content description                                                            |
+=====================+================================================================================+
| conf/               | Contains configuration files such as Executable.properties,                    |
|                     | Engine.local.properties, Engine.cluster.properties                             |
+---------------------+--------------------------------------------------------------------------------+
| conf/settings       | Contains individual executable description files. In particular                |
|                     | XXXParameters.xml, XXXPresets.xml, XXXLimits.xml, where XXX is the name        |
|                     | of the executable                                                              |
+---------------------+--------------------------------------------------------------------------------+
| ExecutionStatistics | The database for storing the execution statistics                              |
+---------------------+--------------------------------------------------------------------------------+
| statpages           | Web pages for usage statistics visualization and web service status queries    |
+---------------------+--------------------------------------------------------------------------------+
| jobsout/            | Contains directories generated when running an individual executable,          |
|                     | e.g. input and output files and some other task related data (optional)        |
+---------------------+--------------------------------------------------------------------------------+
| binaries/           | Directory contains native executables - programs, windows binaries (optional)  |
+---------------------+--------------------------------------------------------------------------------+
| binaries/src        | Contains source of native executables and Linux i386 binaries                  |
+---------------------+--------------------------------------------------------------------------------+
| binaries/windows    | Contains binaries for MS Windows operating system                              |
+---------------------+--------------------------------------------------------------------------------+
| binaries/matrices   | Substitution matrices                                                          |
+---------------------+--------------------------------------------------------------------------------+
| WEB-INF             | Web application descriptor                                                     |
+---------------------+--------------------------------------------------------------------------------+
| WEB-INF/lib         | Web application libraries                                                      |
+---------------------+--------------------------------------------------------------------------------+
| WEB-INF/classes     | log4j.properties - log configuration file (optional)                           |
+---------------------+--------------------------------------------------------------------------------+
| static              | Static content such as CSS, JavaScript and image files                         |
+---------------------+--------------------------------------------------------------------------------+

+---------------------+--------------------------------------------------------------------------------+
| **Help Pages**      |                                                                                |
+=====================+================================================================================+
| /                   | help pages, index.html is the starting page                                    |
+---------------------+--------------------------------------------------------------------------------+
| `dm_javadoc`_       | JavaDoc for the JABAWS Data Model                                              |
+---------------------+--------------------------------------------------------------------------------+
| `full_javadoc`_     | JavaDoc for the complete JABAWS                                                |
+---------------------+--------------------------------------------------------------------------------+
| `prog_docs`_        | Documentation for programs that are included in JABAWS                         |
+---------------------+--------------------------------------------------------------------------------+

.. _dm_javadoc: http://www.compbio.dundee.ac.uk/jabaws/dm_javadoc/index.html
.. _full_javadoc: http://www.compbio.dundee.ac.uk/jabaws/full_javadoc/index.html
.. _prog_docs: http://www.compbio.dundee.ac.uk/jabaws/prog_docs/