1 <?xml version="1.0" encoding="UTF-8"?>
\r
2 <!DOCTYPE html PUBLIC "XHTML 1.0 Strict"
\r
3 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
\r
4 <html xmlns="http://www.w3.org/1999/xhtml">
\r
6 <meta name="Last-modified" content="Mon, 4 Apr 2011 12:00:00 GMT"/>
\r
7 <title>Java Bioinformatics Analyses Web Services (JABAWS) Server Configuration manual</title>
\r
8 <link href="ws.css" rel="stylesheet" type="text/css" media=
\r
9 "screen, projection, handheld, tv" />
\r
10 <link rel="stylesheet" type="text/css" media="print" href=
\r
12 <script type="text/javascript" src="prototype-1.6.0.3.js"></script>
\r
18 <tr><td style="width:130px;"><a href="http://www.dundee.ac.uk"><img src="images/uod_lt.gif" alt="University of Dundee" class="logo" title="University of Dundee" longdesc="http://www.dundee.ac.uk"/></a></td>
\r
19 <td class="bg"><img src="images/jabaws.png" title="JABAWS:MSA" alt="JABAWS:MSA"/></td>
\r
20 <td class="bg"><img src="images/align.png"/></td>
\r
22 </table></div><!-- banner end-->
\r
25 <div id="panel"><a href="index.html">Home</a>
\r
26 <a href="quick_start.html">Getting Started</a>
\r
27 <a class="selected" href="man_about.html">Manual</a>
\r
29 <a href="man_about.html">About</a>
\r
30 <a href="man_servervm.html" title="JABAWS Server as Virtual Appliance">Server VA</a>
\r
31 <a href="man_serverwar.html" title="JABAWS Server as Web Application aRchive">Server WAR</a>
\r
32 <a class="selected" href="man_configuration.html" >Server<br/>
\r
34 <a href="man_client.html" title="JABAWS Command Line Client">CMD Client</a>
\r
35 <a href="man_dev.html" title="Accessing JABAWS from your program">Accessing<br/>
\r
38 <a href="download.html">Download</a>
\r
39 <a href="contacts.html">Contact Us</a>
\r
40 <a href="http://www.compbio.dundee.ac.uk">Barton Group</a>
\r
45 <h2 id="headtitle">JABAWS MANUAL</h2>
\r
47 <h2>JABAWS Configuration </h2>
\r
49 <li><a href="#defjabaconf">JABAWS Configuration </a></li>
\r
50 <li><a href="#locEngConf">Local Engine Configuration</a></li>
\r
51 <li><a href="#clustEngConf">Cluster Engine Configuration</a></li>
\r
52 <li><a href="#exec">Executable Configuration</a></li>
\r
53 <li><a href="#setexecenv">Defining Environment Variables for
\r
54 Executables</a></li>
\r
55 <li><a href="#mafftconf">Configure JABAWS to Work
\r
57 <li><a href="#settinglimit">Limiting the size of the job accepted by JABAWS Server </a></li>
\r
58 <li><a href="#diffbin">Using a different version of the alignment program with JABAWS</a></li>
\r
59 <li><a href="#mixuse">Load balancing </a></li>
\r
60 <li><a href="#confaccessright">Reviewing JABAWS configuration via web browser</a></li>
\r
61 <li><a href="#testingJaba">Testing JABA Web Services</a></li>
\r
62 <li><a href="#logs">JABAWS requests logging </a></li>
\r
63 <li><a href="#logfiles">JABAWS internal logging </a></li>
\r
64 <li><a href="#execstat">Monitoring JABAWS</a></li>
\r
65 <li><a href="#warfile">JABAWS War File Content</a></li>
\r
67 <h3><a name="defjabaconf" id="defjabaconf"></a>JABAWS Configuration </h3>
\r
<p>There are three parts of the system you can configure: the local
engine, the cluster engine, and the paths to the individual executables for
each engine. These settings are stored in configuration files
within the web application directory (for an overview, take a
look at the <a href="#warfile">war file content table</a>).</p>
\r
73 <p>Initially, JABAWS is configured with only the local engine
\r
enabled, with job output written to a directory called "jobsout"
\r
75 within the web application itself. This means that JABAWS will work
\r
76 out of the box, but may not be suitable for serving a whole lab or
\r
78 <h3><a name="locEngConf" id="locEngConf"></a>Local Engine Configuration</h3>
\r
<p>The local execution engine configuration is defined in the
properties file <span class="hightlight">conf/Engine.local.properties</span>. The supported
configuration settings are:<br />
<span class="hightlight">engine.local.enable=true</span> -
enables or disables the local engine; valid values are true | false<br />
\r
86 "hightlight">local.tmp.directory=D:\\clusterengine\\testoutput</span>
\r
- a directory to use for temporary file storage. Optional;
defaults to the Java temporary directory.<br />
<span class="hightlight">engine.local.thread.number=4</span> -
the number of threads used for task execution. Valid values are between 1 and
twice the number of cores available in the system.
Optional; defaults to the number of cores when there are 4 or fewer cores, and to
the number of cores minus 1 otherwise.</p>
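<p>The default and the valid range described above can be illustrated with a small Python sketch. This is not JABAWS code; the function names are hypothetical, and the upper bound is read as twice the number of cores.</p>

```python
import os


def default_thread_number(cores):
    """Default described above: all cores for up to 4 cores, otherwise cores - 1."""
    return cores if cores <= 4 else cores - 1


def is_valid_thread_number(threads, cores):
    """Valid values lie between 1 and twice the number of cores (assumed reading)."""
    return 1 <= threads <= 2 * cores


if __name__ == "__main__":
    cores = os.cpu_count() or 1  # cpu_count() may return None
    print("cores:", cores, "default threads:", default_thread_number(cores))
```

<p>On an 8 core server, for example, this rule gives 7 threads by default, and any value from 1 to 16 would be accepted.</p>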
\r
<p>If the local engine is going to be heavily loaded (which is often the case if you do not have a cluster), it is a good idea to increase
the amount of memory available to the web application server. If
you are using Apache Tomcat, you can define its memory
settings in the JAVA_OPTS environment variable. To specify which
JVM Tomcat should use, put the full path to the JRE
installation in the JAVA_HOME environment variable (we
recommend using the Sun Java Virtual Machine (JVM) in preference to
OpenJDK). Below is an example of code which can be added to the <span
class="hightlight"><tomcat_dir>/bin/setenv.sh</span> script
to define which JVM to use and the memory settings for the Tomcat server.
The Tomcat startup script (<span class=
"hightlight">catalina.sh</span>) will execute <span class=
"hightlight">setenv.sh</span> automatically on each server
start.<br />
\r
109 <span class="code">export
\r
110 JAVA_HOME=/homes/ws-dev2/jdk1.6.0_17/<br />
\r
111 export JAVA_OPTS="-server -Xincgc -Xms512m -Xmx1024m"</span></p>
\r
113 <h3><a name="clustEngConf" id="clustEngConf"></a>Cluster Engine Configuration</h3>
\r
<p>Supported configuration settings:<br />
<span class="hightlight">engine.cluster.enable=true</span> -
enables or disables the cluster engine; valid values are true | false, defaults to
\r
"hightlight">cluster.tmp.directory=/homes/clustengine/testoutput</span> -
a directory to use for temporary file storage. Required. The value must be
an absolute path to the temporary directory, and it
must be different from the directory defined for the local engine. This
directory must be accessible from all cluster nodes.<br />
For the cluster engine to work, the SGE_ROOT and LD_LIBRARY_PATH
environment variables have to be defined. They tell the cluster
engine where to find the DRMAA libraries. These variables
should be defined when the web application server starts up, e.g.</p>
\r
130 <p><span class="code">SGE_ROOT=/gridware/sge<br />
\r
131 LD_LIBRARY_PATH=/gridware/sge/lib/lx24-amd64</span></p>
\r
<p>Finally, do not forget to configure the executables for cluster
execution. They may be the same as those used for local execution, but they
may also be different. Please refer to the executable configuration section
for further details.</p>
\r
138 <h3><a name="exec" id="exec"></a>Executable Configuration</h3>
\r
<p>All the executable programs
are configured in the <span class="hightlight">conf/Executable.properties</span> file. Each executable
is configured with a number of options. They are: <span class=
\r
143 "code">local.X.bin.windows=<path to executable under windows
\r
144 system, optional><br />
\r
145 local.X.bin=<path to the executable under non-windows system,
\r
147 cluster.X.bin=<path to the executable on the cluster, all
\r
148 cluster nodes must see it, optional><br />
\r
149 X.bin.env=<semicolon separated list of environment variables
\r
150 for executable, use hash symbol as name value separator,
\r
152 X.--aamatrix.path=<path to the directory containing
\r
153 substitution matrices, optional><br />
\r
154 X.presets.file=<path to the preset configuration file, optional
\r
156 X.parameters.file=<path to the parameters configuration file,
\r
158 X.limits.file=<path to the limits configuration file,
\r
160 X.cluster.settings=<list of the cluster specific options,
\r
161 optional></span></p>
\r
163 <p>Where X is either clustal, muscle, mafft, probcons or tcoffee. </p>
\r
<p>The default JABAWS configuration includes paths to the local executables
to be run by the local engine only. All cluster-related settings
are commented out, but they are left in place as examples. The cluster
engine is disabled by default. To configure an executable for cluster
execution, uncomment the X.cluster settings and change them
appropriately.</p>
\r
<p>By default, limits are set well in excess of what you may want to offer to users outside your lab, to make sure that tasks are never rejected. The default limit is 100000 sequences of 100000 letters on average for all of the JABA web services. You can adjust the limits according to your needs by editing the <span class="hightlight">conf/settings/<X>Limits.xml</span> files.<br />
After you have completed the editing, your configuration may look like
this:<span class="code">local.mafft.bin.windows=<br />
\r
174 local.mafft.bin=binaries/mafft<br />
\r
175 cluster.mafft.bin=/homes/cengine/mafft<br />
\r
176 mafft.bin.env=MAFFT_BINARIES#/homes/cengine/mafft;FASTA_4_MAFFT#/bin/fasta34;<br />
\r
177 mafft.--aamatrix.path=binaries/matrices<br />
\r
178 mafft.presets.file=conf/settings/MafftPresets.xml<br />
\r
179 mafft.parameters.file=conf/settings/MafftParameters.xml<br />
\r
180 mafft.limits.file=conf/settings/MafftLimits.xml<br />
\r
181 mafft.cluster.settings=-q bigmem.q -l h_cpu=24:00:00 -l
\r
182 h_vmem=6000M -l ram=6000M</span></p>
\r
<p>Please note that relative paths can only be used for
files that reside inside the web application directory; all other paths
must be absolute.</p>
\r
<p>Furthermore, you should avoid using environment variables within the paths or options, since these will not be evaluated correctly. Instead, please explicitly
specify the absolute path to anything
normally evaluated from an environment variable at execution time.</p>
\r
<p>If you are using JABAWS to submit jobs to the cluster (with the
cluster engine enabled), the executables must be available from all
cluster nodes the task can be sent to. In addition, the paths to the
executables on the cluster, e.g. <span class=
"hightlight">cluster.<exec_name>.bin</span>, must be
\r
<p>Executables can be located anywhere in your system; they do not
have to reside on the server as long as the web application server
can access and execute them.</p>
\r
<p>Cluster settings are treated as a black box: the system
passes whatever is specified in this line directly to the
cluster submission library. This is how DRMAA itself treats these
settings. More precisely, the DRMAA <span class="hightlight">JobTemplate.setNativeSpecification()</span> function will be called.</p>
\r
207 <h3><a name="setexecenv" />Defining Environment Variables for
\r
<p>Environment variables can be defined in the property <span class=
"code">x.bin.env</span>, where <span class="hightlight">x</span> is
one of the five executables supported by JABAWS. Several environment
variables can be specified on the same line. For example:<br />
\r
215 "code">mafft.bin.env=MAFFT_BINARIES#/homes/cengine/mafft;FASTA_4_MAFFT#/bin/fasta34;</span></p>
\r
<p>The example above defines two environment variables with the names
MAFFT_BINARIES and FASTA_4_MAFFT and the values /homes/cengine/mafft
and /bin/fasta34 respectively. A semicolon is used as a separator
between different environment variables, whereas a hash is used as a
separator between the name and the value of a variable.</p>
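<p>To make the format concrete, here is a minimal Python sketch that splits such a value into names and values. It only illustrates the syntax described above and is not the parser JABAWS itself uses; the function name <span class="hightlight">parse_bin_env</span> is hypothetical.</p>

```python
def parse_bin_env(value):
    """Split 'NAME#VALUE;NAME#VALUE;' into a dictionary of environment variables."""
    env = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate the trailing semicolon and empty entries
        name, separator, variable_value = entry.partition("#")
        if not separator or not name:
            raise ValueError("expected NAME#VALUE, got: " + repr(entry))
        env[name] = variable_value
    return env


if __name__ == "__main__":
    line = "MAFFT_BINARIES#/homes/cengine/mafft;FASTA_4_MAFFT#/bin/fasta34;"
    for name, variable_value in parse_bin_env(line).items():
        print(name, "=", variable_value)
```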
\r
223 <h3><a name="mafftconf" id="mafftconf"></a>Configure JABAWS to Work
\r
<p>If you use the default configuration, you do not need to read any
further: it will work for you without any
changes. However, if you want to install Mafft yourself, then there
are a couple more steps to do.</p>
\r
<p>The Mafft executable needs to know the location of the other files
supplied with Mafft. In addition, some Mafft functions depend on
the fasta executable, which is not supplied with Mafft, but is a
separate package. Mafft needs to know the location of fasta34
\r
<p>To let Mafft know where the other files from its package are,
change the value of the MAFFT_BINARIES environment variable. To let
Mafft know where the fasta34 executable is, set the value of the
FASTA_4_MAFFT environment variable to point to the location of the
fasta34 program. Alternatively, the fasta34 location can be added to the PATH variable
instead. If you are using the executables supplied with JABAWS, the
path to the Mafft binaries would be like <span class=
\r
244 "hightlight"><relative path to web application
\r
245 directory>/binaries/src/mafft/binaries</span> and the path to
\r
246 fasta34 binary would be <span class="hightlight"><relative path
\r
directory>/binaries/src/fasta34/fasta34</span>. You can also keep
the Mafft binaries and the fasta34 program elsewhere
and provide absolute paths to them. All these settings are
defined in the <span class=
"hightlight">conf/Executable.properties</span> file.</p>
\r
253 <h3><a name="settinglimit" id="settinglimit"></a>Limiting the size of the job accepted by JABAWS </h3>
\r
<p>JABAWS can be configured to reject excessively large tasks. This is useful if you operate a JABAWS service for many users. By defining a maximum allowed task size you can provide an even service for all users and prevent resources from being wasted on tasks too large to complete successfully. You can define the maximum number of sequences and the maximum average sequence length that JABAWS accepts for each JABA Web Service independently.
Furthermore, you can define different limits for different presets of the same web service. <br />
By default, limits are set well in excess of what you may want to offer to users outside your lab, to make sure that tasks are never rejected. The default limit is 100000 sequences of 100000 letters on average for all of the JABA web services. You can adjust the limits according to your needs by editing the <span class="hightlight">conf/settings/<X>Limits.xml</span> files.</p>
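<p>The two quantities a limit is defined on, the number of sequences and the average sequence length, can be illustrated with a short Python sketch. This is not the JABAWS implementation; the function name is hypothetical, and treating a task that sits exactly on a limit as acceptable is an assumption.</p>

```python
def exceeds_limit(sequences, max_sequences, max_average_length):
    """Return True if a task is larger than the given limit.

    sequences is a list of sequence strings. A task exactly on a limit is
    treated as acceptable, which is an assumption of this sketch.
    """
    if not sequences:
        return False
    if len(sequences) > max_sequences:
        return True
    average_length = sum(len(s) for s in sequences) / len(sequences)
    return average_length > max_average_length


if __name__ == "__main__":
    task = ["ACGTACGT", "ACGT", "ACGTAC"]
    # The default limit quoted above: 100000 sequences of 100000 letters on average.
    print(exceeds_limit(task, 100000, 100000))
```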
\r
257 <h3><a name="diffbin" id="diffbin"></a>Using a different version of the alignment program with JABAWS</h3>
\r
<p>JABAWS is supplied with binaries and source code for the versions of the executables that it supports, so normally you do not need to install your own. However, if you prefer a different version of an executable (e.g. an alignment program), you can use it as long as it supports all the functions that the executable supplied with JABAWS supports. This could be the case with a more recent version. If the options supported by your chosen executable differ from those of the standard JABAWS executable, then you need to edit the <em>ExecutableName</em>Parameters.xml configuration file.</p>
\r
259 <h3><a name="mixuse" id="mixuse"></a>Load balancing </h3>
\r
<p>If your cluster is busy and has significant waiting times, you can achieve a faster response by allowing the server machine to calculate small tasks and reserving the cluster for bigger jobs. This works especially well if your server is a powerful machine with many CPUs. To do this, you need to enable and configure both the cluster and the local engines. Once this is done, decide on the maximum size of a task to be run locally on the server. Then edit the <span class="hightlight">"# LocalEngineExecutionLimit #" </span>preset in the<span class="hightlight"> <ServiceName>Limits.xml</span> file accordingly. The JABAWS server will then balance the load according to the following rule: if the task size is smaller than the maximum task size for the local engine, and the local engine has idle threads, the task is calculated locally; otherwise it is submitted to the cluster.</p>
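<p>The balancing rule can be written down as a few lines of Python. This is only a sketch of the rule as stated above, not JABAWS code; the function name and the representation of the task size as a single number are assumptions made for illustration.</p>

```python
def choose_engine(task_size, local_max_task_size, idle_local_threads):
    """Pick an engine following the rule stated above.

    task_size and local_max_task_size use the same (hypothetical) unit;
    idle_local_threads is the number of free threads in the local engine.
    """
    if task_size < local_max_task_size and idle_local_threads > 0:
        return "local"
    return "cluster"


if __name__ == "__main__":
    print(choose_engine(50, 100, 2))   # small task, free threads
    print(choose_engine(50, 100, 0))   # small task, local engine busy
    print(choose_engine(500, 100, 2))  # task too large for the local engine
```

<p>Note that a small task still goes to the cluster when the local engine has no idle threads, so the local engine never builds up its own queue under this rule.</p>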
\r
261 <h3><a name="confaccessright" id="confaccessright"></a>Reviewing JABAWS configuration via web browser</h3>
\r
<p>Access to the configuration files is denied to unauthorized users by means of a security constraint defined in the web application descriptor file. There is a special user role called <span class="hightlight">admin</span> which can access these files. This comes in handy if you would like to keep an eye on any of the task outputs stored in jobsout, or would like to view the configuration files. To access the configuration files, add an admin user to your application server. How you do this depends on your web application server and on where you would like the user passwords to come from. If you use Tomcat, the simplest way is to use the Tomcat Memory Realm, which is linked to a plain text configuration file. To define the user in Tomcat, add an entry to the <span class="hightlight">conf/tomcat-users.xml</span> file. <span class="code"><role rolename="admin"/><br />
\r
263 <user username="admin" password="your password here " roles="admin"/></span></p>
\r
<p>Once this is done, make sure that the servlet which returns the web application directory listings is enabled. Look in the <span class="hightlight"><tomcatroot>/conf/web.xml</span> file for the following: <span class="code"><param-name>listings</param-name><br />
\r
265 <param-value>true</param-value></span></p>
\r
<p>The whole section that defines the default listing servlet is shown below:</p>
\r
267 <p class="code"> <servlet><br />
\r
268 <servlet-name>default</servlet-name><br />
\r
269 <servlet-class>org.apache.catalina.servlets.DefaultServlet</servlet-class><br />
\r
270 <init-param><br />
\r
271 <param-name>debug</param-name><br />
\r
272 <param-value>0</param-value><br />
\r
273 </init-param><br />
\r
274 <init-param><br />
\r
275 <param-name>listings</param-name><br />
\r
276 <param-value>true</param-value><br />
\r
277 </init-param><br />
\r
278 <load-on-startup>1</load-on-startup><br />
\r
279 </servlet><br />
\r
<p>These listings are read-only by default.</p>
\r
282 <h3><a name="testingJaba" id="testingJaba"></a>Testing JABA Web Services</h3>
\r
<p>You can use the command line client (part of the client only
package) to test your JABAWS installation as described <a href="man_client.html">here</a>. If you downloaded the JABAWS
server package, you can use <span class=
"hightlight"><your_jaba_context_name>/WEB-INF/lib/jaba-client.jar</span> to test the JABAWS installation as described <a href=
"man_serverwar.html#usingWsTester">here</a>. If you downloaded the source
code, you can run a number of test suites defined in the
build.xml Apache Ant file.</p>
\r
290 <h3><a name="logs" id="logs"></a>JABAWS requests logging </h3>
\r
<p>To log requests, enable the Tomcat access log valve by uncommenting the following section of the <span class="hightlight"><tomcat_root>/conf/server.xml</span> configuration file.</p>
\r
292 <p class="code"> <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" <br />
\r
293 prefix="localhost_access_log." suffix=".txt" pattern="common" resolveHosts="false"/></p>
\r
294 <p> The following information will be logged:</p>
\r
295 <table width="100%" border="0" style="margin:0">
\r
299 <th>Method server_URL protocol </th>
\r
300 <th>HTTP status </th>
\r
301 <th>Response size in bytes </th>
\r
304 <td>10.31.11.159</td>
\r
305 <td>[10/Feb/2010:16:51:32 +0000]</td>
\r
306 <td>"POST /jws2/MafftWS HTTP/1.1"</td>
\r
<p>These logs can be processed by various log analysis programs, such as <a href="http://www.webalizer.org/">WebAlizer</a>, <a href="http://www.analog.cx/">Analog</a> and <a href="http://awstats.sourceforge.net/">AWStats</a>.</p>
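<p>For a quick look at the access log without a dedicated tool, a few lines of Python are enough. The sketch below assumes that the "common" pattern writes lines in the Common Log Format (host, identity, user, timestamp, request line, status, size). The status and size in the sample line are made-up values for illustration, and the function names are hypothetical.</p>

```python
import re
from collections import Counter

# One line of the Common Log Format: host ident user [time] "request" status size
LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return the fields of one access log line as a dict, or None if it does not match."""
    match = LINE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def requests_per_url(lines):
    """Count the requests for each URL, skipping lines that cannot be parsed."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[record["url"]] += 1
    return counts


if __name__ == "__main__":
    # Status 200 and size 2067 are illustrative values, not real log output.
    sample = '10.31.11.159 - - [10/Feb/2010:16:51:32 +0000] "POST /jws2/MafftWS HTTP/1.1" 200 2067'
    print(parse_line(sample))
    print(requests_per_url([sample, sample, "not a log line"]))
```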
\r
312 <h3><a name="logfiles" id="logfiles"></a>JABAWS internal logging </h3>
\r
<p>JABAWS can be configured to log what it is doing. This comes in
handy if you would like to see who is using your web services or
need to track down problems. JABAWS uses <a href=
"http://logging.apache.org/log4j/1.2/">log4j</a> for logging;
an example log4j configuration is bundled with the JABAWS war file.
You will find it in the <span class=
"hightlight">/WEB-INF/classes/log4j.properties</span> file. All the
lines in this file are commented out. The reason why logging is
disabled by default is simple: log4j has to know the exact
location where the log files should be stored, and this is not known
until deployment time. To enable logging, you need to
define the<span class="hightlight"> logDir</span> property in <span
class="hightlight">log4j.properties</span> and uncomment the section of
the file which corresponds to your needs. More information is given
in the <span class="hightlight">log4j.properties</span> file
itself. Restart Tomcat or the JABAWS web application to apply
\r
<p>After you have done this, assuming that you did not otherwise change the
log4j.properties file, you should see the application log
file called <span class="hightlight">activity.log</span>. The
amount of information logged can be adjusted using the
logging level; it decreases in the following order of log levels:
TRACE, DEBUG, INFO, WARN, ERROR, FATAL.</p>
\r
336 <p>If you would like to know who is using your services, you might
\r
337 want to <a href="#logs">enable Tomcat request
\r
339 <h3><a name="execstat" id="execstat"></a>Monitoring JABAWS</h3>
\r
<p>JABAWS stores the cluster task ids of all tasks that were run on the cluster. Using these ids, detailed statistics can be extracted from the cluster accounting system. Because each cluster supported by JABAWS has a different accounting system, it was not possible to provide ready-to-use statistics.<br />
For local execution, the start and finish times in nanoseconds can be found in the STARTED and FINISHED files respectively. In time we will provide tools to extract execution time statistics, so keep the content of your working directory!</p>
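<p>Until such tools are available, the run time of a locally executed task could be estimated along the lines of the Python sketch below. It assumes that the STARTED and FINISHED files sit in the task's output directory and that each contains a single integer (the time in nanoseconds) as plain text; the exact file layout is not described here, so please check your own files first. The function name is hypothetical.</p>

```python
from pathlib import Path


def elapsed_seconds(job_dir):
    """Return the run time of one task in seconds.

    Assumes job_dir contains STARTED and FINISHED files, each holding a
    single integer nanosecond value as text (an assumption of this sketch).
    """
    job_dir = Path(job_dir)
    started = int((job_dir / "STARTED").read_text().strip())
    finished = int((job_dir / "FINISHED").read_text().strip())
    return (finished - started) / 1e9


if __name__ == "__main__":
    import sys

    # Usage: python elapsed.py <task directory> [<task directory> ...]
    for directory in sys.argv[1:]:
        print(directory, elapsed_seconds(directory))
```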
\r
342 <h3><a name="warfile" id="warfile"></a>JABAWS War File Content</h3>
\r
343 <table width="100%">
\r
345 <th style="width:19%">Directory</th>
\r
346 <th style="width:81%">Content description</th>
\r
350 <td>contains configuration files such as Executable.properties,
\r
351 Engine.local.properties, Engine.cluster.properties</td>
\r
354 <td>conf/settings</td>
\r
355 <td>Contains individual executable description files. In particular
\r
356 XXXParameters.xml, XXXPresets.xml, XXXLimits.xml where XXX is the
\r
357 name of the executable</td>
\r
361 <td>Contains directories generated when running an individual executable. E.g. input and output files and some other task
\r
362 related data. (optional)</td>
\r
366 <td>Directory contains native executables - programs,
\r
367 windows binaries (optional)</td>
\r
370 <td>binaries/src</td>
\r
371 <td>Contains source of native executables and Linux i386
\r
375 <td>binaries/matrices</td>
\r
376 <td>Substitution matrices
\r
377 <!-- what format ? --></td>
\r
381 <td>Web application descriptor</td>
\r
384 <td>WEB-INF/lib</td>
\r
385 <td>Web application libraries</td>
\r
388 <td>WEB-INF/classes</td>
\r
389 <td>log4j.properties - log configuration file (optional)</td>
\r
392 <td colspan="2"><strong>Help Pages</strong> </td>
\r
396 <td>help pages, index.html is the starting page</td>
\r
399 <td>dm_javadoc</td>
\r
400 <td>javadoc for JABAWS client (the link is available from How To
\r
405 <td>documentation for programs that JABAWS uses</td>
\r
409 <td>images referenced by html pages</td>
\r
414 <!-- content end-->
\r
415 <div id="copyright">Last update: 1 April 2011<br />
\r
416 Peter Troshin, Jim Procter and Geoff Barton, The Barton Group, University of
\r
420 <!-- wrapper end-->
\r
424 <!-- Google analitics -->
\r
425 <script type="text/javascript">
\r
426 var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
\r
427 document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
\r
429 <script type="text/javascript">
\r
431 var pageTracker = _gat._getTracker("UA-5356328-1");
\r
432 pageTracker._trackPageview();
\r