 * Jalview - A Sequence Alignment Editor and Viewer ($$Version-Rel$$)
 * Copyright (C) $$Year-Rel$$ The Jalview Authors
 * This file is part of Jalview.
 * Jalview is free software: you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation, either version 3
 * of the License, or (at your option) any later version.
 * Jalview is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty
 * of MERCHANTABILITY or FITNESS FOR A PARTICULAR
 * PURPOSE. See the GNU General Public License for more details.
 * You should have received a copy of the GNU General Public License
 * along with Jalview. If not, see <http://www.gnu.org/licenses/>.
 * The Jalview Authors are detailed in the 'AUTHORS' file.
<title>UniProtKB query fields</title>
<strong>UniProtKB query fields</strong>
Supported query fields for searching specific data in UniProtKB (see
also <a href="uniprotsequencefetcher.html#text-search">query
<table border="1" width="95%">
<td><code>accession:P62988</code></td>
<td>Lists all entries with the primary or secondary accession
number P62988.</td>
<td><code>active:no </code></td>
<td>Lists all obsolete entries.</td>
annotation:(type:non-positional) <br />
annotation:(type:positional) <br /> annotation:(type:mod_res
"Pyrrolidone carboxylic acid" evidence:experimental)
<td>Lists all entries with:
<li>any general annotation (comments [CC])</li>
<li>any sequence annotation (features [FT])</li>
<li>at least one amino acid modified with a Pyrrolidone
carboxylic acid group</li>
<td><code> author:ashburner </code></td>
<td>Lists all entries with at least one reference co-authored
by Michael Ashburner.</td>
<td><code> cdantigen:CD233 </code></td>
<td>Lists all entries whose cluster of differentiation number
is CD233.</td>
citation:("intracellular structural proteins") <br />
citation:(author:ashburner journal:nature) <br /> citation:9169874
<td>Lists all entries with a literature citation:
<li>containing the phrase "intracellular structural
proteins" in either title or abstract</li>
<li>co-authored by Michael Ashburner and published in
Nature</li>
<li>with the PubMed identifier 9169874</li>
<td><code> cluster:UniRef90_A5YMT3 </code></td>
<td>Lists all entries in the UniRef 90% identity cluster
whose representative sequence is UniProtKB entry A5YMT3.</td>
annotation:(type:transmem count:5)<br />
annotation:(type:transmem count:[5 TO *])<br />
annotation:(type:cofactor count:[3 TO *])
<td>Lists all entries with:
<li>exactly 5 transmembrane regions</li>
<li>5 or more transmembrane regions</li>
<li>3 or more Cofactor comments</li>
created:[20121001 TO *]<br /> reviewed:yes AND
created:[current TO *]
<td>Lists all entries created since October 1st 2012.<br />
Lists all new UniProtKB/Swiss-Prot entries in the last release.
database:(type:pfam) <br /> database:(type:pdb 1aut)
<td>Lists all entries with:
<li>a cross-reference to the Pfam database</li>
<li>a cross-reference to the PDB database entry 1aut</li>
<td><code> domain:VWFA </code></td>
<td>Lists all entries with a Von Willebrand factor type A
domain described in the 'Family and Domains' section.</td>
<td><code> ec:3.2.1.23 </code></td>
<td>Lists all beta-galactosidases.</td>
annotation:(type:signal evidence:ECO_0000269)<br />
annotation:(type:mod_res phosphoserine evidence:ECO_0000269)<br />
annotation:(type:function AND evidence:ECO_0000255)
<td>Lists all entries with:
<li>a signal sequence whose positions have been
experimentally proven</li>
<li>experimentally proven phosphoserine sites</li>
<li>a function manually asserted according to rules</li>
<td><code> family:serpin </code></td>
<td>Lists all entries belonging to the Serpin family of
proteins.</td>
<td><code> fragment:yes </code></td>
<td>Lists all entries with an incomplete sequence.</td>
<td><code> gene:HSPC233 </code></td>
<td>Lists all entries for proteins encoded by gene HSPC233.</td>
go:cytoskeleton <br /> go:0015629
<td>Lists all entries associated with:
<li>a GO term containing the word "cytoskeleton"</li>
<li>the GO term Actin cytoskeleton and any subclasses</li>
host:mouse <br /> host:10090 <br /> host:40674
<td>Lists all entries for viruses infecting:
<li>organisms with a name containing the word "mouse"</li>
<li>Mus musculus (Mouse)</li>
<li>all mammals (all taxa classified under the taxonomy
node for Mammalia)</li>
<td><code>id:P00750</code></td>
<td>Returns the entry with the primary accession number
P00750.</td>
<td><code> inn:Anakinra </code></td>
<td>Lists all entries whose "International Nonproprietary
Name" is Anakinra.</td>
<td><code> interactor:P00520 </code></td>
<td>Lists all entries describing interactions with the
protein described by entry P00520.</td>
<td><code> keyword:toxin </code></td>
<td>Lists all entries associated with the keyword Toxin.</td>
<td><code> length:[500 TO 700] </code></td>
<td>Lists all entries describing sequences of length between
500 and 700 residues.</td>
<td>This field is a synonym for the field <code>taxonomy</code>.
<td><code> mass:[500000 TO *] </code></td>
<td>Lists all entries describing sequences with a mass of at
least 500,000 Da.</td>
method:maldi <br /> method:xray
<td>Lists all entries for proteins identified by
matrix-assisted laser desorption/ionization (MALDI) or by
X-ray crystallography, respectively. The <code>method</code> field
searches the names of physico-chemical identification methods in the
'Biophysicochemical properties' subsection of the 'Function'
section and in the 'Publications' and 'Cross-references' sections.
<td><code> mnemonic:ATP6_HUMAN </code></td>
<td>Lists all entries with entry name (ID) ATP6_HUMAN.
Also searches obsolete entry names.</td>
modified:[20120101 TO 20120301]<br /> reviewed:yes AND
modified:[current TO *]
<td>Lists all entries that were last modified between January 1st
and March 1st 2012.<br /> Lists all UniProtKB/Swiss-Prot entries
modified in the last release.
<td><code> name:"prion protein" </code></td>
<td>Lists all entries for prion proteins.</td>
<td><code> organelle:Mitochondrion </code></td>
<td>Lists all entries for proteins encoded by a gene of the
mitochondrial chromosome.</td>
organism:"Ovis aries" <br /> organism:9940 <br />
organism:sheep <br />
<td>Lists all entries for proteins expressed in sheep (first
two examples) or in organisms whose name contains the term "sheep"
(third example).
<td><code> plasmid:ColE1 </code></td>
<td>Lists all entries for proteins encoded by a gene of
plasmid ColE1.</td>
<td><code> proteome:UP000005640 </code></td>
<td>Lists all entries from the human proteome.</td>
<td>proteomecomponent</td>
<td><code> proteomecomponent:"chromosome 1" AND
organism:9606 </code></td>
<td>Lists all entries from human chromosome 1.</td>
<td><code> replaces:P02023 </code></td>
<td>Lists all entries that were created from a merge with
entry P02023.</td>
<td><code> reviewed:yes </code></td>
<td>Lists all UniProtKB/Swiss-Prot entries.</td>
<td><code> scope:mutagenesis </code></td>
<td>Lists all entries containing a reference that was used to
gather information about mutagenesis.</td>
<td><code> sequence:P05067-9 </code></td>
<td>Lists all entries containing a link to isoform 9 of the
sequence described in entry P05067. Allows searching by specific
sequence identifier.</td>
<td>sequence_modified</td>
sequence_modified:[20120101 TO 20120301]<br /> reviewed:yes
AND sequence_modified:[current TO *]
<td>Lists all entries whose sequences were last modified
between January 1st and March 1st 2012.<br /> Lists all
UniProtKB/Swiss-Prot entries whose sequences were modified in
the last release.
<td><code> source:intact </code></td>
<td>Lists all entries containing a GO term whose annotation
source is the IntAct database.</td>
<td><code> strain:wistar </code></td>
<td>Lists all entries containing a reference relevant to
strain wistar.</td>
<td><code> taxonomy:40674 </code></td>
<td>Lists all entries for proteins expressed in Mammals. This
field is used to retrieve entries for all organisms classified
below a given taxonomic node (taxonomy classification).</td>
<td><code> tissue:liver </code></td>
<td>Lists all entries containing a reference describing the
protein sequence obtained from a clone isolated from liver.</td>
<td><code> web:wikipedia </code></td>
<td>Lists all entries for proteins that are described in