1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
\r
2 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
\r
3 <html xmlns="http://www.w3.org/1999/xhtml">
\r
5 <meta name="Last-modified" content="Mon, 11 Dec 2010 01:03:33 GMT"/>
\r
<title>Java Bioinformatics Analysis Web Services (JABAWS) Server Configuration Manual</title>
\r
7 <link href="ws.css" rel="stylesheet" type="text/css" media=
\r
8 "screen, projection, handheld, tv" />
\r
9 <link rel="stylesheet" type="text/css" media="print" href=
\r
11 <script type="text/javascript" src="prototype-1.6.0.3.js"></script>
\r
17 <tr><td style="width:130px;"><a href="http://www.dundee.ac.uk"><img class="logo" src="images/uod_lt.gif" alt="University of Dundee" title="University of Dundee" longdesc="http://www.dundee.ac.uk"/></a></td>
\r
18 <td class="bg"><h1><span class="headeru">JA</span>va <span class=
\r
19 "headeru">B</span>ioinformatics <span class="headeru">A</span>nalysis <span class="headeru">W</span>eb <span
\r
20 class="headeru">S</span>ervices</h1></td>
\r
22 </table></div><!-- banner end-->
\r
25 <div id="panel"><a href="index.html">Home</a>
\r
26 <a class="selected" href="manual.html">Manual</a>
\r
28 <a href="manual.html">Quick Start Guide</a>
\r
29 <a href="man_about.html">About</a>
\r
30 <a href="man_servervm.html" title="JABAWS Server as Virtual Appliance">Server VA</a>
\r
31 <a href="man_serverwar.html" title="JABAWS Server as Web Application aRchive">Server WAR</a>
\r
32 <a class="selected" href="man_configuration.html" >Server<br/>
\r
34 <a href="man_client.html" title="JABAWS Command Line Client">CMD Client</a>
\r
35 <a href="man_dev.html" title="Accessing JABAWS from your program">Accessing<br/>
\r
38 <a href="download.html">Download</a>
\r
39 <a href="http://www.compbio.dundee.ac.uk">Barton Group</a>
\r
44 <h2 id="headtitle">JABAWS MANUAL</h2>
\r
46 <h2>JABAWS Configuration </h2>
\r
48 <li><a href="#defjabaconf">JABAWS Configuration </a></li>
\r
49 <li><a href="#locEngConf">Local Engine Configuration</a></li>
\r
50 <li><a href="#clustEngConf">Cluster Engine Configuration</a></li>
\r
51 <li><a href="#exec">Executable Configuration</a></li>
\r
52 <li><a href="#setexecenv">Defining Environment Variables for
\r
53 Executables</a></li>
\r
54 <li><a href="#mafftconf">Configure JABAWS to Work
\r
56 <li><a href="#settinglimit">Limiting the size of the job accepted by JABAWS Server </a></li>
\r
57 <li><a href="#diffbin">Using a different version of the alignment program with JABAWS</a></li>
\r
58 <li><a href="#mixuse">Load balancing </a></li>
\r
59 <li><a href="#confaccessright">Reviewing JABAWS configuration via web browser</a></li>
\r
60 <li><a href="#testingJaba">Testing JABA Web Services</a></li>
\r
61 <li><a href="#logs">JABAWS requests logging </a></li>
\r
62 <li><a href="#logfiles">JABAWS internal logging </a></li>
\r
63 <li><a href="#execstat">Monitoring JABAWS</a></li>
\r
64 <li><a href="#warfile">JABAWS War File Content</a></li>
\r
66 <h3><a name="defjabaconf" id="defjabaconf"></a>JABAWS Configuration </h3>
\r
<p>There are three parts of the system you can configure: the local
engine, the cluster engine, and the paths to the individual executables for
each engine. These settings are stored in configuration files
within the web application directory (for an overview, see the
<a href="#warfile">war file content table</a>).</p>
\r
72 <p>Initially, JABAWS is configured with only the local engine
\r
enabled, with job output written to a directory called "jobsout"
\r
74 within the web application itself. This means that JABAWS will work
\r
75 out of the box, but may not be suitable for serving a whole lab or
\r
77 <h3><a name="locEngConf" id="locEngConf"></a>Local Engine Configuration</h3>
\r
<p>The local execution engine is configured in the
properties file <span class="hightlight">conf/Engine.local.properties</span>. The supported
configuration settings are:<br />
<span class="hightlight">engine.local.enable=true</span> -
enables or disables the local engine; valid values are true | false<br />
\r
85 "hightlight">local.tmp.directory=D:\\clusterengine\\testoutput</span>
\r
- a directory to use for temporary file storage. Optional;
defaults to the Java temporary directory.<br />
<span class="hightlight">engine.local.thread.number=4</span> -
the number of threads for task execution. Valid values are between 1 and
2x, where x is the number of cores available in the system.
Optional; defaults to the number of cores if there are 4 or fewer, and to
the number of cores minus 1 otherwise.</p>
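<p>The default and the valid range for <span class="hightlight">engine.local.thread.number</span> described above can be sketched as follows. This is only an illustration of the rule, not JABAWS code: the function names are hypothetical, and treating both ends of the valid range as inclusive is an assumption.</p>

```python
def default_thread_number(cores):
    """Default rule from the text: all cores up to 4, otherwise cores - 1."""
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return cores if cores <= 4 else cores - 1


def is_valid_thread_number(threads, cores):
    """Valid values lie between 1 and twice the number of cores.

    Assumption: both ends of the range are inclusive.
    """
    return 1 <= threads <= 2 * cores
```

<p>For example, on a machine with 8 cores this sketch gives a default of 7 threads and accepts any value from 1 to 16.</p>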
\r
<p>If the local engine is going to be heavily loaded (which is often the case if you do not have a cluster), it is a good idea to increase
the amount of memory available to the web application server. If
you are using Apache Tomcat, you can define its memory
settings in the JAVA_OPTS environment variable. To specify which
JVM Tomcat should use, put the full path to the JRE
installation in the JAVA_HOME environment variable (we
recommend the Sun Java Virtual Machine (JVM) in preference to
OpenJDK). Below is an example of code that can be added to the <span
class="hightlight">&lt;tomcat_dir&gt;/bin/setenv.sh</span> script
to define which JVM to use and the memory settings for the Tomcat server.
The Tomcat startup script (<span class=
"hightlight">catalina.sh</span>) executes <span class=
"hightlight">setenv.sh</span> automatically on each server
start.<br />
<span class="code">export
JAVA_HOME=/homes/ws-dev2/jdk1.6.0_17/<br />
export JAVA_OPTS="-server -Xincgc -Xms512m -Xmx1024m"</span></p>
\r
112 <h3><a name="clustEngConf" id="clustEngConf"></a>Cluster Engine Configuration</h3>
\r
114 <p>Supported configuration settings:<br />
\r
<span class="hightlight">engine.cluster.enable=true</span> -
enables or disables the cluster engine, true | false, defaults to
\r
"hightlight">cluster.tmp.directory=/homes/clustengine/testoutput</span> -
a directory to use for temporary file storage. Required. The value must be
an absolute path, and it must be different from the directory defined for
the local engine. This directory must be accessible from all cluster nodes.<br />
For the cluster engine to work, the SGE_ROOT and LD_LIBRARY_PATH
environment variables have to be defined. They tell the cluster
engine where to find the DRMAA libraries. These variables
should be defined when the web application server starts up, e.g.</p>
\r
129 <p><span class="code">SGE_ROOT=/gridware/sge<br />
\r
130 LD_LIBRARY_PATH=/gridware/sge/lib/lx24-amd64</span></p>
\r
<p>Finally, do not forget to configure the executables for cluster
execution. They may be the same as those used for local execution, or
they may be different. Please refer to the <a href="#exec">Executable Configuration</a> section
for further details.</p>
\r
137 <h3><a name="exec" id="exec"></a>Executable Configuration</h3>
\r
<p>All executable programs
are configured in the <span class="hightlight">conf/Executable.properties</span> file. Each executable
is configured with a number of options: <span class=
\r
"code">local.X.bin.windows=&lt;path to executable under windows
system, optional&gt;<br />
local.X.bin=&lt;path to the executable under non-windows system,
cluster.X.bin=&lt;path to the executable on the cluster, all
cluster nodes must see it, optional&gt;<br />
X.bin.env=&lt;semicolon separated list of environment variables
for executable, use hash symbol as name value separator,
X.--aamatrix.path=&lt;path to the directory containing
substitution matrices, optional&gt;<br />
X.presets.file=&lt;path to the preset configuration file, optional
X.parameters.file=&lt;path to the parameters configuration file,
X.limits.file=&lt;path to the limits configuration file,
X.cluster.settings=&lt;list of the cluster specific options,
optional&gt;</span></p>
\r
<p>Here X is one of clustal, muscle, mafft, probcons or tcoffee.</p>
<p>The default JABAWS configuration includes only the paths to the local executables
run by the local engine. All cluster-related settings
are commented out, but they are left in place as examples. The cluster
engine is disabled by default. To configure an executable for cluster
execution, uncomment its cluster-related settings (<span class="hightlight">cluster.X.bin</span> and <span class="hightlight">X.cluster.settings</span>) and change them
appropriately.</p>
\r
<p>The default task size limits are deliberately generous; see <a href="#settinglimit">Limiting the size of the job accepted by JABAWS</a> for how to adjust them.<br />
After you have finished editing, your configuration may look like
this:<span class="code">local.mafft.bin.windows=<br />
\r
173 local.mafft.bin=binaries/mafft<br />
\r
174 cluster.mafft.bin=/homes/cengine/mafft<br />
\r
175 mafft.bin.env=MAFFT_BINARIES#/homes/cengine/mafft;FASTA_4_MAFFT#/bin/fasta34;<br />
\r
176 mafft.--aamatrix.path=binaries/matrices<br />
\r
177 mafft.presets.file=conf/settings/MafftPresets.xml<br />
\r
178 mafft.parameters.file=conf/settings/MafftParameters.xml<br />
\r
179 mafft.limits.file=conf/settings/MafftLimits.xml<br />
\r
180 mafft.cluster.settings=-q bigmem.q -l h_cpu=24:00:00 -l
\r
181 h_vmem=6000M -l ram=6000M</span></p>
\r
<p>Please note that relative paths may only be used for
files that reside inside the web application directory; all other paths
must be absolute.</p>
<p>Furthermore, you should avoid using environment variables within paths or options, since these will not be evaluated correctly. Instead, explicitly
specify the absolute path to anything
that would normally be resolved from an environment variable at execution time.</p>
\r
<p>If you are using JABAWS to submit jobs to the cluster (with the
cluster engine enabled), the executables must be available on all
cluster nodes the task can be sent to. In addition, paths to the
executables on the cluster, e.g. <span class=
"hightlight">cluster.&lt;exec_name&gt;.bin</span>, must be
\r
<p>Executables can be located anywhere in your system. They do not
have to reside on the server, as long as the web application server
can access and execute them.</p>
<p>Cluster settings are treated as a black box: the system
passes whatever is specified on this line directly to the
cluster submission library. This is how DRMAA itself treats these
settings. More precisely, the DRMAA <span class="hightlight">JobTemplate.setNativeSpecification()</span> function is called.</p>
\r
206 <h3><a name="setexecenv" />Defining Environment Variables for
\r
<p>Environment variables can be defined in the property <span class=
"code">x.bin.env</span>, where <span class="hightlight">x</span> is
one of the five executables supported by JABAWS. Several environment
variables can be specified on the same line. For example:<br />
\r
214 "code">mafft.bin.env=MAFFT_BINARIES#/homes/cengine/mafft;FASTA_4_MAFFT#/bin/fasta34;</span></p>
\r
<p>The example above defines two environment variables named
MAFFT_BINARIES and FASTA_4_MAFFT, with the values /homes/cengine/mafft
and /bin/fasta34 respectively. A semicolon separates
different environment variables, whereas a hash separates
the name and the value of each variable.</p>
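<p>To make the format concrete, the sketch below splits such a value into names and values. It is an illustration only and not the parser JABAWS itself uses; the function name <span class="hightlight">parse_bin_env</span> is hypothetical, and rejecting entries without a hash is an assumption.</p>

```python
def parse_bin_env(value):
    """Split 'NAME#value;NAME#value;' into a {name: value} dictionary."""
    env = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # skip the empty piece left by a trailing semicolon
        name, separator, variable_value = entry.partition("#")
        if not separator or not name:
            # Assumption: an entry without a name or a hash is an error.
            raise ValueError("expected NAME#value, got %r" % entry)
        env[name] = variable_value
    return env


settings = parse_bin_env(
    "MAFFT_BINARIES#/homes/cengine/mafft;FASTA_4_MAFFT#/bin/fasta34;"
)
```

<p>Here <span class="hightlight">settings</span> maps MAFFT_BINARIES to /homes/cengine/mafft and FASTA_4_MAFFT to /bin/fasta34.</p>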
\r
222 <h3><a name="mafftconf" id="mafftconf"></a>Configure JABAWS to Work
\r
<p>If you use the default configuration, you do not need to read any
further: it will work for you without any
changes. However, if you want to install Mafft yourself, there
are a couple more steps to complete.</p>
\r
<p>The Mafft executable needs to know the location of the other files
supplied with Mafft. In addition, some Mafft functions depend on
the fasta executable, which is not supplied with Mafft but is a
separate package. Mafft needs to know the location of fasta34
\r
<p>To let Mafft know where the other files from its package are,
change the value of the MAFFT_BINARIES environment variable. To let
Mafft know where the fasta34 executable is, set the
FASTA_4_MAFFT environment variable to the location of the
fasta34 program. Alternatively, the location of fasta34 can be added to the PATH variable.
If you are using the executables supplied with JABAWS, the
path to the Mafft binaries would be like <span class=
"hightlight">&lt;relative path to web application
directory&gt;/binaries/src/mafft/binaries</span> and the path to the
fasta34 binary would be <span class="hightlight">&lt;relative path
directory&gt;/binaries/src/fasta34/fasta34</span>. You can also keep
the Mafft binaries and the fasta34 program elsewhere
and provide absolute paths to them. All these settings are
defined in the <span class=
"hightlight">conf/Executable.properties</span> file.</p>
\r
252 <h3><a name="settinglimit" id="settinglimit"></a>Limiting the size of the job accepted by JABAWS </h3>
\r
<p>JABAWS can be configured to reject excessively large tasks. This is useful if you operate a JABAWS service for many users. By defining a maximum allowed task size you can provide an even service for all users and prevent the waste of resources on tasks that are too large to complete successfully. You can define the maximum number of sequences and the maximum average sequence length that JABAWS accepts for each JABA Web Service independently.
Furthermore, you can define different limits for different presets of the same web service.<br />
By default, the limits are set well in excess of what you may want to offer to users outside your lab, to make sure that tasks are never rejected. The default limit is 100000 sequences of 100000 letters on average for all of the JABA web services. You can adjust the limits according to your needs by editing the <span class="hightlight">conf/settings/&lt;X&gt;Limits.xml</span> files.</p>
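<p>The two quantities a limit is defined on, the number of sequences and their average length, can be illustrated with a short sketch. This is not the JABAWS implementation: the function and parameter names are hypothetical, and treating a task that exactly meets a limit as acceptable is an assumption.</p>

```python
def exceeds_limit(sequences, max_sequence_number, max_average_length):
    """Return True if a task is larger than the given limit allows.

    sequences is a list of sequence strings. Assumption: a task is
    rejected only when it is strictly above one of the two maxima.
    """
    if not sequences:
        return False  # an empty task cannot exceed a size limit
    average_length = sum(len(s) for s in sequences) / len(sequences)
    return (len(sequences) > max_sequence_number
            or average_length > max_average_length)
```

<p>With the default limit quoted above, a call such as <span class="hightlight">exceeds_limit(sequences, 100000, 100000)</span> would reject only very large inputs.</p>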
\r
256 <h3><a name="diffbin" id="diffbin"></a>Using a different version of the alignment program with JABAWS</h3>
\r
<p>JABAWS is supplied with binaries and source code for the versions of the executables it supports, so normally you do not need to install your own. However, if you prefer a different version of an executable (e.g. an alignment program), you can use it as long as it supports all the functions that the executable supplied with JABAWS supports. This could be the case with a more recent release. If the options supported by your chosen executable differ from those of the standard JABAWS executable, then you need to edit the <em>ExecutableName</em>Parameters.xml configuration file.</p>
\r
258 <h3><a name="mixuse" id="mixuse"></a>Load balancing </h3>
\r
<p>If your cluster is busy and has significant waiting times, you can achieve a faster response by allowing the server machine to calculate small tasks and reserving the cluster for bigger jobs. This works especially well if your server is a powerful machine with many CPUs. To do this you need to enable and configure both the cluster and the local engines. Once this is done, decide on the maximum size of a task to be run locally on the server. Then edit the <span class="hightlight">"# LocalEngineExecutionLimit #"</span> preset in the <span class="hightlight">&lt;ServiceName&gt;Limits.xml</span> file accordingly. The JABAWS server will then balance the load according to the following rule: if the task size is smaller than the maximum task size for the local engine, and the local engine has idle threads, then the task is calculated locally; otherwise it is submitted to the cluster.</p>
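<p>The load balancing rule can be written down as a small decision function. The sketch below is illustrative only: the names are hypothetical, and representing the task size and the local limit as single numbers is a simplification made for this example.</p>

```python
def choose_engine(task_size, local_max_task_size, idle_local_threads):
    """Apply the rule from the text: run small tasks locally when a
    local thread is free, and send everything else to the cluster."""
    if task_size < local_max_task_size and idle_local_threads > 0:
        return "local"
    return "cluster"
```

<p>Note that a small task still goes to the cluster when all local threads are busy, so the local engine never builds up a queue under this rule.</p>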
\r
260 <h3><a name="confaccessright" id="confaccessright"></a>Reviewing JABAWS configuration via web browser</h3>
\r
<p>Access to the configuration files is denied to unauthorized users by means of a security constraint defined in the web application descriptor file. A special user role called <span class="hightlight">admin</span> is allowed to access these files. This comes in handy if you would like to keep an eye on the task outputs stored in jobsout, or to view the configuration files. To access the configuration files, add an admin user to your application server. How you do this depends on your web application server and on where you would like the user passwords to come from. If you use Tomcat, the simplest way is to use the Tomcat Memory Realm, which is backed by a plain text configuration file. To define the user in Tomcat, add an entry to the <span class="hightlight">conf/tomcat-users.xml</span> file: <span class="code">&lt;role rolename="admin"/&gt;<br />
&lt;user username="admin" password="your password here " roles="admin"/&gt;</span></p>
\r
<p>Once this is done, make sure the servlet that returns the web application directory listings is enabled. Look in the <span class="hightlight">&lt;tomcatroot&gt;/conf/web.xml</span> file for the following: <span class="code">&lt;param-name&gt;listings&lt;/param-name&gt;<br />
&lt;param-value&gt;true&lt;/param-value&gt;</span></p>
\r
265 <p>The whole section that defines default listing servlet is below</p>
\r
<p class="code">&lt;servlet&gt;<br />
&lt;servlet-name&gt;default&lt;/servlet-name&gt;<br />
&lt;servlet-class&gt;org.apache.catalina.servlets.DefaultServlet&lt;/servlet-class&gt;<br />
&lt;init-param&gt;<br />
&lt;param-name&gt;debug&lt;/param-name&gt;<br />
&lt;param-value&gt;0&lt;/param-value&gt;<br />
&lt;/init-param&gt;<br />
&lt;init-param&gt;<br />
&lt;param-name&gt;listings&lt;/param-name&gt;<br />
&lt;param-value&gt;true&lt;/param-value&gt;<br />
&lt;/init-param&gt;<br />
&lt;load-on-startup&gt;1&lt;/load-on-startup&gt;<br />
&lt;/servlet&gt;<br />
\r
<p>These listings are read-only by default.</p>
\r
281 <h3><a name="testingJaba" id="testingJaba"></a>Testing JABA Web Services</h3>
\r
<p>You can use the command line client (part of the client-only
package) to test your JABAWS installation as described <a href="man_client.html">here</a>. If you downloaded a JABAWS
server package, you can use <span class=
"hightlight">&lt;your_jaba_context_name&gt;/WEB-INF/lib/jaba-client.jar</span> to test the JABAWS installation as described <a href=
"man_serverwar.html#usingWsTester">here</a>. If you downloaded the source
code, you can run a number of test suites defined in the
build.xml Apache Ant file.</p>
\r
289 <h3><a name="logs" id="logs"></a>JABAWS requests logging </h3>
\r
<p>To log requests, enable the Tomcat access log valve by uncommenting the following section of the <span class="hightlight">&lt;tomcat_root&gt;/conf/server.xml</span> configuration file.</p>
<p class="code">&lt;Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" <br />
prefix="localhost_access_log." suffix=".txt" pattern="common" resolveHosts="false"/&gt;</p>
\r
293 <p> The following information will be logged:</p>
\r
294 <table width="100%" border="0" style="margin:0">
\r
298 <th>Method server_URL protocol </th>
\r
299 <th>HTTP status </th>
\r
300 <th>Response size in bytes </th>
\r
303 <td>10.31.11.159</td>
\r
304 <td>[10/Feb/2010:16:51:32 +0000]</td>
\r
305 <td>"POST /jws2/MafftWS HTTP/1.1"</td>
\r
<p>These logs can be processed by various log analysis programs, such as <a href="http://www.webalizer.org/">WebAlizer</a>, <a href="http://www.analog.cx/">Analog</a> and <a href="http://awstats.sourceforge.net/">AWStats</a>.</p>
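<p>For a quick look at the log without a full analysis package, a line in the "common" format can also be split with a short script. The sketch below is a minimal example and not part of JABAWS; the function name is hypothetical, and the status code and response size in the sample line are made-up values used only for illustration.</p>

```python
import re

# Common log format: host ident user [time] "request" status size
COMMON_LOG = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_access_line(line):
    """Return the fields of one common-format access log line as a dict."""
    match = COMMON_LOG.match(line)
    if match is None:
        raise ValueError("not a common-format log line: %r" % line)
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A dash in the size column means that no response body was sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


# The status (200) and size (2067) below are illustrative values only.
sample = ('10.31.11.159 - - [10/Feb/2010:16:51:32 +0000] '
          '"POST /jws2/MafftWS HTTP/1.1" 200 2067')
fields = parse_access_line(sample)
```

<p>Counting the requests per service then only requires grouping the parsed lines by the URL in the <span class="hightlight">request</span> field.</p>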
\r
311 <h3><a name="logfiles" id="logfiles"></a>JABAWS internal logging </h3>
\r
<p>JABAWS can be configured to log what it is doing. This comes in
handy if you would like to see who is using your web services or
need to track down a problem. JABAWS uses <a href=
"http://logging.apache.org/log4j/1.2/">log4j</a> for logging;
an example log4j configuration is bundled with the JABAWS war file.
You will find it in the <span class=
"hightlight">/WEB-INF/classes/log4j.properties</span> file. All the
lines in this file are commented out. The reason logging is
disabled by default is simple: log4j has to know the exact
location where the log files should be stored, and this is not known
until deployment time. To enable logging you need to
define the <span class="hightlight">logDir</span> property in <span
class="hightlight">log4j.properties</span> and uncomment the section of
the file that corresponds to your needs. More information is given
in the <span class="hightlight">log4j.properties</span> file
itself. Restart Tomcat or the JABAWS web application to apply
\r
<p>After you have done this, assuming that you did not otherwise change the
log4j.properties file, you should see the application log
file called <span class="hightlight">activity.log</span>. The
amount of information logged can be adjusted using different
logging levels. It decreases in the following order of log levels:
TRACE, DEBUG, INFO, WARN, ERROR, FATAL.</p>
\r
335 <p>If you would like to know who is using your services, you might
\r
336 want to <a href="#logs">enable Tomcat request
\r
338 <h3><a name="execstat" id="execstat"></a>Monitoring JABAWS</h3>
\r
<p>JABAWS stores the cluster task ids for all tasks that were run on the cluster. Using these ids, detailed statistics can be extracted from the cluster accounting system. Because each cluster supported by JABAWS has a different accounting system, it was not possible to provide ready-to-use statistics.<br />
For local execution, the start and finish times in nanoseconds can be found in the STARTED and FINISHED files respectively. In time we will provide tools to extract execution time statistics, so keep the content of your working directory!</p>
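<p>Until such tools are available, the run time of a locally executed task can be estimated from these two files. The sketch below is a minimal example and not a JABAWS tool: the function name is hypothetical, and it assumes that each file contains nothing but a single integer number of nanoseconds, which you should verify against your own job directories.</p>

```python
from pathlib import Path


def elapsed_seconds(job_directory):
    """Return the run time of one local task in seconds.

    Assumption: STARTED and FINISHED each hold a single integer
    timestamp in nanoseconds and nothing else.
    """
    job_directory = Path(job_directory)
    started = int((job_directory / "STARTED").read_text().strip())
    finished = int((job_directory / "FINISHED").read_text().strip())
    return (finished - started) / 1e9
```

<p>Calling the function on each task directory under the job output directory gives a simple list of execution times.</p>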
\r
341 <h3><a name="warfile" id="warfile"></a>JABAWS War File Content</h3>
\r
342 <table width="100%">
\r
344 <th style="width:19%">Directory</th>
\r
345 <th style="width:81%">Content description</th>
\r
349 <td>contains configuration files such as Executable.properties,
\r
350 Engine.local.properties, Engine.cluster.properties</td>
\r
353 <td>conf/settings</td>
\r
354 <td>Contains individual executable description files. In particular
\r
355 XXXParameters.xml, XXXPresets.xml, XXXLimits.xml where XXX is the
\r
356 name of the executable</td>
\r
360 <td>Contains directories generated when running an individual executable. E.g. input and output files and some other task
\r
361 related data. (optional)</td>
\r
365 <td>Directory contains native executables - programs,
\r
366 windows binaries (optional)</td>
\r
369 <td>binaries/src</td>
\r
370 <td>Contains source of native executables and Linux i386
\r
374 <td>binaries/matrices</td>
\r
375 <td>Substitution matrices
\r
376 <!-- what format ? --></td>
\r
380 <td>Web application descriptor</td>
\r
383 <td>WEB-INF/lib</td>
\r
384 <td>Web application libraries</td>
\r
387 <td>WEB-INF/classes</td>
\r
388 <td>log4j.properties - log configuration file (optional)</td>
\r
391 <td colspan="2"><strong>Help Pages</strong> </td>
\r
395 <td>help pages, index.html is the starting page</td>
\r
398 <td>dm_javadoc</td>
\r
399 <td>javadoc for JABAWS client (the link is available from How To
\r
404 <td>documentation for programs that JABAWS uses</td>
\r
408 <td>images referenced by html pages</td>
\r
413 <!-- content end-->
\r
414 <div id="copyright">Last update: 7 January 2011<br />
\r
415 Peter Troshin, Jim Procter and Geoff Barton, The Barton Group, University of
\r
419 <!-- wrapper end-->
\r
423 <!-- Google analitics -->
\r
424 <script type="text/javascript">
\r
425 var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
\r
426 document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
\r
428 <script type="text/javascript">
\r
430 var pageTracker = _gat._getTracker("UA-5356328-1");
\r
431 pageTracker._trackPageview();
\r