From: Jim Procter
Date: Mon, 8 Aug 2022 15:46:04 +0000 (+0100)
Subject: Merge branch 'patch/JAL-4036_uniprot_fts_legacy_endpoint' into develop
X-Git-Tag: Release_2_11_2_6~32
X-Git-Url: http://source.jalview.org/gitweb/?p=jalview.git;a=commitdiff_plain;h=4469e6d2c6a0c05c6278b10b0a157522aaa98091;hp=cd36659dd8a3855b3d1eb61cafbb0bbf604142be

Merge branch 'patch/JAL-4036_uniprot_fts_legacy_endpoint' into develop
---

diff --git a/help/help/html/features/uniprotqueryfields.html b/help/help/html/features/uniprotqueryfields.html
index 182b206..66082f2 100644
--- a/help/help/html/features/uniprotqueryfields.html
+++ b/help/help/html/features/uniprotqueryfields.html
@@ -33,359 +33,238 @@
 syntax).

- Field | Example | Description
- accession | accession:P62988 | Lists all entries with the primary or secondary accession number P62988.
- active | active:no | Lists all obsolete entries.
- annotation | annotation:(type:non-positional); annotation:(type:positional); annotation:(type:mod_res "Pyrrolidone carboxylic acid" evidence:experimental) | Lists all entries with: any general annotation (comments [CC]); any sequence annotation (features [FT]); at least one amino acid modified with a Pyrrolidone carboxylic acid group
- author | author:ashburner | Lists all entries with at least one reference co-authored by Michael Ashburner.
- cdantigen | cdantigen:CD233 | Lists all entries whose cluster of differentiation number is CD233.
- citation | citation:("intracellular structural proteins"); citation:(author:ashburner journal:nature); citation:9169874 | Lists all entries with a literature citation: containing the phrase "intracellular structural proteins" in either title or abstract; co-authored by Michael Ashburner and published in Nature; with the PubMed identifier 9169874
- cluster | cluster:UniRef90_A5YMT3 | Lists all entries in the UniRef 90% identity cluster whose representative sequence is UniProtKB entry A5YMT3.
- count | annotation:(type:transmem count:5); annotation:(type:transmem count:[5 TO *]); annotation:(type:cofactor count:[3 TO *]) | Lists all entries with: exactly 5 transmembrane regions; 5 or more transmembrane regions; 3 or more Cofactor comments
- created | created:[20121001 TO *]; reviewed:yes AND created:[current TO *] | Lists all entries created since October 1st 2012; Lists all new UniProtKB/Swiss-Prot entries in the last release.
- database | database:(type:pfam); database:(type:pdb 1aut) | Lists all entries with: a cross-reference to the Pfam database; a cross-reference to the PDB database entry 1aut
+ rest.uniprot.org field | rest.uniprot.org example | Description
+ accession | accession:P62988 | The old behaviour was to list all entries with primary or secondary accession number P62988. The new behaviour lists only entries whose primary (canonical isoform) accession is P62988. To search over secondary accessions, use the new sec_acc field.
+ active | active:false | Lists all obsolete entries.
+ Refer to the page: Sequence Annotations | | Lists all entries with: 1. any general annotation (comments [CC]); 2. any sequence annotation (features [FT]); 3. at least one amino acid modified with a Pyrrolidone carboxylic acid group
+ lit_author | lit_author:ashburner | Lists all entries with at least one reference co-authored by Michael Ashburner.
+ protein_name | protein_name:CD233 | Lists all entries whose cluster of differentiation number is CD233 (see cdlist.txt).
+ chebi | chebi:18420 | Lists all entries which are associated with the small molecule corresponding to ChEBI identifier 18420, Mg(2+) (see How can I search UniProt for chemical or reaction data?).
+ uniprot_id (/uniref), then uniref_cluster_90 (/uniprotkb) | 1. uniprot_id:A5YMT3 to find cluster UniRef90_P00395; 2. uniref_cluster_90:UniRef90_P00395 | Finds all entries in the UniRef 90% identity cluster whose representative sequence is UniProtKB entry A5YMT3 (about UniRef).
+ xref_count_pdb (or xref_count) | xref_count_pdb:[20 TO *] | Lists all entries with 20 or more cross-references to PDB.
+ date_created | date_created:[2012-10-01 TO *] | Lists all entries created since October 1st 2012.
+ database, xref | 1. database:pfam; 2. xref:pdb-1aut | Lists all entries with: 1. a cross-reference to the Pfam database; 2. a cross-reference to the PDB database entry 1aut (see Databases cross-referenced in UniProtKB and Database mapping)
+ ec | ec:3.2.1.23 | Lists all beta-galactosidases (Enzyme nomenclature database).
+ Refer to the pages: Comments or Sequence Annotations | | Lists all entries with: 1. a signal sequence whose positions have been experimentally proven; 2. experimentally proven phosphoserine sites; 3. a function manually asserted according to rules (see Evidence attribution)
+ existence | existence:3 | See Protein existence criteria.
+ family | family:serpin | Lists all entries belonging to the Serpin family of proteins (Index of protein domains and families).
+ fragment | fragment:true | Lists all entries with an incomplete sequence.
+ gene | gene:HPSE | Lists all entries for proteins encoded by gene HPSE, but also by HPSE2.
+ gene_exact | gene_exact:HPSE | Lists all entries for proteins encoded by gene HPSE, but excluding variations like HPSE2 or HPSE_0.
+ go | go:0015629 | Lists all entries associated with the GO term Actin cytoskeleton and any subclasses.
+ virus_host_name, virus_host_id | virus_host_id:10090 | Lists all entries for viruses infecting Mus musculus (Mouse).
+ accession_id | accession_id:P00750 | Returns the entry with the primary accession number P00750.
+ inchikey | inchikey:WQZGKKKJIJFFOK-GASJEMHNSA-N | Returns entries associated with the small molecule identified by the InChIKey WQZGKKKJIJFFOK-GASJEMHNSA-N, i.e. D-glucopyranose (see How can I search UniProt for chemical or reaction data?). To get the ChEBI identifier for an InChIKey value, one can now use the advanced search builder.
+ protein_name | protein_name:Anakinra | Lists all entries whose protein name includes the "International Nonproprietary Name" Anakinra.
+ interactor | interactor:P00520 | Lists all entries describing interactions with the protein described by entry P00520.
+ keyword | 1. keyword:toxin; 2. keyword:KW-0800 | 1. Lists all entries associated with a keyword matching "Toxin" in its name or description (UniProtKB Keywords); 2. Lists all entries associated with the UniProtKB keyword Toxin.
+ length | length:[500 TO 700] | Lists all entries describing sequences of length between 500 and 700 residues.
+ mass | mass:[500000 TO *] | Lists all entries describing sequences with a mass of at least 500,000 Da.
+ cc_mass_spectrometry | cc_mass_spectrometry:maldi | Lists all entries for proteins identified by: matrix-assisted laser desorption/ionization (MALDI), crystallography (X-Ray). The method field searches names of physico-chemical identification methods in the 'Biophysicochemical properties' subsection of the 'Function' section, the 'Publications' and 'Cross-references' sections.
+ date_modified | date_modified:[2019-01-01 TO 2019-03-01] AND active:true | Lists all active entries that were last modified between January and March 2019.
+ protein_name | protein_name:"prion protein" | Lists all entries for prion proteins.
+ organelle | organelle:Mitochondrion | Lists all entries for proteins encoded by a gene of the mitochondrial chromosome.
+ organism_name, organism_id | 1. organism_name:"Ovis aries"; 2. organism_id:9940; 3. organism_name:sheep | Lists all entries for proteins expressed in sheep (first 2 examples) and organisms whose name contains the term "sheep" (UniProt taxonomy).
+ plasmid | plasmid:ColE1 | Lists all entries for proteins encoded by a gene of plasmid ColE1 (Controlled vocabulary of plasmids).
+ proteome | proteome:UP000005640 | Lists all entries from the human proteome.
+ proteomecomponent | proteomecomponent:"chromosome 1" AND organism_id:9606 | Lists all entries from the human chromosome 1.
+ sec_acc | sec_acc:P02023 | Lists all entries that were created from a merge with entry P02023 (see FAQ).
+ reviewed | reviewed:true | Lists all UniProtKB/Swiss-Prot entries (about UniProtKB).
+ scope | scope:mutagenesis | Lists all entries containing a reference that was used to gather information about mutagenesis (Entry view: "Cited for"; see 'Publications' section of the user manual).
+ sec_acc | sec_acc:P62988 | Lists all entries containing a secondary accession P62988.
+ sequence | accession:P05067-9 AND is_isoform:true | Lists all entries containing a link to isoform 9 of the sequence described in entry P05067. Allows searching by specific sequence identifier.
+ date_sequence_modified | 1. date_sequence_modified:[2012-01-01 TO 2012-03-01]; 2. date_sequence_modified:[2012-01-01 TO 2012-03-01] | 1. Lists all entries whose sequences were last modified between January and March 2012; 2. Lists all UniProtKB/Swiss-Prot entries whose sequences were modified after the start of 2012.
+ strain | strain:wistar | Lists all entries containing a reference relevant to strain wistar (Lists of strains in reference comments and Taxonomy help: organism strains).
+ taxonomy_name, taxonomy_id | 1. taxonomy_name:mammal; 2. taxonomy_id:40674 | Lists all entries for proteins expressed in Mammals. This field is used to retrieve entries for all organisms classified below a given taxonomic node (taxonomy classification).
+ tissue | tissue:liver | Lists all entries containing a reference describing the protein sequence obtained from a clone isolated from liver (Controlled vocabulary of tissues).
+ cc_webresource | cc_webresource:wikipedia | Lists all entries for proteins that are described in Wikipedia.
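Comparing the removed and added rows of this table shows that many legacy fields were simply renamed, and that `yes`/`no` values became `true`/`false`. The renames can be sketched as a small translation helper. This is a minimal Python sketch, assuming queries made of plain `field:value` terms (nested `annotation:(...)` or `citation:(...)` groups are left untouched). The helper name `translate_fields` and the mapping tables are hypothetical, read off the rows above, and are not part of Jalview or the UniProt API.

```python
import re

# One-to-one renames read off the old and new table rows (illustrative, not exhaustive).
RENAMED = {
    "author": "lit_author",
    "created": "date_created",
    "modified": "date_modified",
    "sequence_modified": "date_sequence_modified",
    "replaces": "sec_acc",
    "web": "cc_webresource",
    "id": "accession_id",
    "name": "protein_name",
    "inn": "protein_name",
    "cdantigen": "protein_name",
}

# Legacy fields that split into a *_id and a *_name field in the new API.
SPLIT = {"organism": "organism", "taxonomy": "taxonomy", "host": "virus_host"}

# Fields whose yes/no values became true/false.
BOOLEAN_FIELDS = {"active", "fragment", "reviewed"}
BOOLEAN_VALUES = {"yes": "true", "no": "false"}

# A term is a lowercase field name, a colon, then a quoted phrase or a bare token.
TERM = re.compile(r'\b([a-z_]+):("[^"]*"|\S+)')


def translate_fields(query):
    """Rewrite legacy field names in simple field:value terms."""

    def fix(match):
        field, value = match.group(1), match.group(2)
        if field in BOOLEAN_FIELDS:
            value = BOOLEAN_VALUES.get(value, value)
        if field in SPLIT:
            # Numeric values are taxonomy identifiers, anything else is a name.
            field = SPLIT[field] + ("_id" if value.isdigit() else "_name")
        else:
            field = RENAMED.get(field, field)
        return f"{field}:{value}"

    return TERM.sub(fix, query)
```

For example, `translate_fields("reviewed:yes AND organism:9606")` gives `reviewed:true AND organism_id:9606`. Fields with no direct equivalent (such as `annotation` or `citation`) still need rewriting by hand using the pages the table refers to.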
- domain | domain:VWFA | Lists all entries with a Von Willebrand factor type A domain described in the 'Family and Domains' section.
- ec | ec:3.2.1.23 | Lists all beta-galactosidases.
- evidence | annotation:(type:signal evidence:ECO_0000269); (type:mod_res phosphoserine evidence:ECO_0000269); annotation:(type:function AND evidence:ECO_0000255) | Lists all entries with: a signal sequence whose positions have been experimentally proven; experimentally proven phosphoserine sites; a function manually asserted according to rules
- family | family:serpin | Lists all entries belonging to the Serpin family of proteins.
- fragment | fragment:yes | Lists all entries with an incomplete sequence.
- gene | gene:HSPC233 | Lists all entries for proteins encoded by gene HSPC233.
- go | go:cytoskeleton; go:0015629 | Lists all entries associated with: a GO term containing the word "cytoskeleton"; the GO term Actin cytoskeleton and any subclasses
- host | host:mouse; host:10090; host:40674 | Lists all entries for viruses infecting: organisms with a name containing the word "mouse"; Mus musculus (Mouse); all mammals (all taxa classified under the taxonomy node for Mammalia)
- id | id:P00750 | Returns the entry with the primary accession number P00750.
- inn | inn:Anakinra | Lists all entries whose "International Nonproprietary Name" is Anakinra.
- interactor | interactor:P00520 | Lists all entries describing interactions with the protein described by entry P00520.
- keyword | keyword:toxin | Lists all entries associated with the keyword Toxin.
- length | length:[500 TO 700] | Lists all entries describing sequences of length between 500 and 700 residues.
- lineage | | This field is a synonym for the field taxonomy.
- mass | mass:[500000 TO *] | Lists all entries describing sequences with a mass of at least 500,000 Da.
- method | method:maldi; method:xray | Lists all entries for proteins identified by: matrix-assisted laser desorption/ionization (MALDI), crystallography (X-Ray). The method field searches names of physico-chemical identification methods in the 'Biophysicochemical properties' subsection of the 'Function' section, the 'Publications' and 'Cross-references' sections.
- mnemonic | mnemonic:ATP6_HUMAN | Lists all entries with entry name (ID) ATP6_HUMAN. Searches also obsolete entry names.
- modified | modified:[20120101 TO 20120301]; reviewed:yes AND modified:[current TO *] | Lists all entries that were last modified between January and March 2012; Lists all UniProtKB/Swiss-Prot entries modified in the last release.
- name | name:"prion protein" | Lists all entries for prion proteins.
- organelle | organelle:Mitochondrion | Lists all entries for proteins encoded by a gene of the mitochondrial chromosome.
- organism | organism:"Ovis aries"; organism:9940; organism:sheep | Lists all entries for proteins expressed in sheep (first 2 examples) and organisms whose name contains the term "sheep".
- plasmid | plasmid:ColE1 | Lists all entries for proteins encoded by a gene of plasmid ColE1.
- proteome | proteome:UP000005640 | Lists all entries from the human proteome.
- proteomecomponent | proteomecomponent:"chromosome 1" and organism:9606 | Lists all entries from the human chromosome 1.
- replaces | replaces:P02023 | Lists all entries that were created from a merge with entry P02023.
- reviewed | reviewed:yes | Lists all UniProtKB/Swiss-Prot entries.
- scope | scope:mutagenesis | Lists all entries containing a reference that was used to gather information about mutagenesis.
- sequence | sequence:P05067-9 | Lists all entries containing a link to isoform 9 of the sequence described in entry P05067. Allows searching by specific sequence identifier.
- sequence_modified | sequence_modified:[20120101 TO 20120301]; reviewed:yes AND sequence_modified:[current TO *] | Lists all entries whose sequences were last modified between January and March 2012; Lists all UniProtKB/Swiss-Prot entries whose sequences were modified in the last release.
- source | source:intact | Lists all entries containing a GO term whose annotation source is the IntAct database.
- strain | strain:wistar | Lists all entries containing a reference relevant to strain wistar.
- taxonomy | taxonomy:40674 | Lists all entries for proteins expressed in Mammals. This field is used to retrieve entries for all organisms classified below a given taxonomic node (taxonomy classification).
- tissue | tissue:liver | Lists all entries containing a reference describing the protein sequence obtained from a clone isolated from liver.
- web | web:wikipedia | Lists all entries for proteins that are described in Wikipedia.
-
\ No newline at end of file
+
diff --git a/help/help/html/features/uniprotsequencefetcher.html b/help/help/html/features/uniprotsequencefetcher.html
index 25d1a17..12ac7dc 100644
--- a/help/help/html/features/uniprotsequencefetcher.html
+++ b/help/help/html/features/uniprotsequencefetcher.html
@@ -32,14 +32,13 @@
 allows sequences to be located via gene name, keywords, or even via manual cross-referencing from UniProt or other bioinformatics websites.
-

- Please Note: Versions of Jalview older than 2.11.2.3 may need a configuration change in order to access freetext search. Please see this post: https://discourse.jalview.org/t/uniprot-free-text-search-not-working-in-jalview-2-11-2-2-and-earlier/1825 in Jalview's discussion forum for a workaround.
+ Please Note: UniProt updated their API in July 2022. Versions of Jalview older than 2.11.2.4 will not work with the July 2022 UniProt free text search.
+ The new UniProt API has a different search syntax for ranges of dates and numbers, and different query fields for advanced searches. The general syntax of combining queries remains the same. Because of these differences, your previously saved searches will not appear in the dropdown list next to the search box. If you need to access these old searches they can be found in your ~/.jalview_properties file with the label CACHE.UNIPROT_FTS. If you want to transfer them to the new API search then copy the values to the CACHE.UNIPROT_2022_FTS label (or rename the existing label if the new one does not exist) (see the UniProtKB query fields page). +
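The copy step described above can be sketched in a few lines. This is a minimal Python sketch, assuming that `~/.jalview_properties` holds simple one-line `key=value` entries; the helper name `copy_saved_searches` is hypothetical. It only covers the case where the new label does not exist yet, because the way Jalview separates several cached searches within one value is not stated here, so merging into an existing `CACHE.UNIPROT_2022_FTS` entry is left as a manual step.

```python
OLD_KEY = "CACHE.UNIPROT_FTS"
NEW_KEY = "CACHE.UNIPROT_2022_FTS"


def copy_saved_searches(properties_text):
    """Return the properties text with the old cached searches copied to the new label.

    The text is returned unchanged when the old label is missing or the
    new label already exists.
    """
    lines = properties_text.splitlines()
    # Map each key to the index of its line, skipping comment lines.
    keys = {}
    for index, line in enumerate(lines):
        if "=" in line and not line.lstrip().startswith(("#", "!")):
            keys[line.split("=", 1)[0].strip()] = index
    if OLD_KEY not in keys or NEW_KEY in keys:
        return properties_text
    old_value = lines[keys[OLD_KEY]].split("=", 1)[1]
    lines.append(f"{NEW_KEY}={old_value}")
    return "\n".join(lines) + "\n"
```

It is safest to run this on a backup copy of the file with Jalview closed, and to remember that the copied queries still use the old field names and range syntax until they are edited.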
+ A change in accepted formats for number and date ranges means that ranges should now always be entered as e.g. [1 TO 100] or [2020-01-01 TO 2022-07-26], although a * wildcard can be used for half-open ranges, e.g. [2020-01-01 TO *]. See the UniProtKB query fields page for more examples.
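The date part of this change can be illustrated with a short conversion helper. This is a minimal Python sketch, assuming the legacy queries wrote dates as eight digits (as in `created:[20121001 TO *]`) on the `created`, `modified` and `sequence_modified` fields or their `date_` equivalents. The helper name `hyphenate_date_ranges` is hypothetical. Number ranges such as `mass:[500000 TO *]` and the legacy `current` keyword are deliberately left alone.

```python
import re

# A date field, then a range whose bounds are either 8-digit dates or the * wildcard.
DATE_RANGE = re.compile(
    r"\b((?:date_)?(?:sequence_modified|created|modified)):"
    r"\[(\d{8}|\*) TO (\d{8}|\*)\]"
)


def _iso(bound):
    """Turn YYYYMMDD into YYYY-MM-DD and keep the * wildcard as it is."""
    return bound if bound == "*" else f"{bound[:4]}-{bound[4:6]}-{bound[6:]}"


def hyphenate_date_ranges(query):
    """Rewrite legacy 8-digit date ranges into the hyphenated form."""
    return DATE_RANGE.sub(
        lambda m: f"{m.group(1)}:[{_iso(m.group(2))} TO {_iso(m.group(3))}]",
        query,
    )
```

For example, `hyphenate_date_ranges("created:[20121001 TO *]")` gives `created:[2012-10-01 TO *]`. The field name itself is not changed by this helper.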

 To open the UniProt Sequence Fetcher, select UniProt as the database from any Sequence Fetcher dialog (opened
@@ -78,8 +77,7 @@

  • Complex queries with the UniProt query Syntax The text box also allows complex queries to be entered. The table below provides a brief overview
- of the supported syntax (see query fields for UniProtKB):
+ of the supported syntax (see the UniProtKB query fields page for more details):
@@ -144,7 +142,7 @@
 acids.
-
+
@@ -171,7 +169,7 @@
 like to be displayed or removed.

    The UniProt Free Text Search Interface was introduced in
- Jalview 2.10.0
+ Jalview 2.10.0 and updated to the July 2022 API in Jalview 2.11.2.4

-
\ No newline at end of file
+
diff --git a/resources/fts/uniprot_data_columns-2022.txt b/resources/fts/uniprot_data_columns-2022.txt
new file mode 100644
index 0000000..b4ce8b4
--- /dev/null
+++ b/resources/fts/uniprot_data_columns-2022.txt
@@ -0,0 +1,356 @@
+/*
+ * Jalview - A Sequence Alignment Editor and Viewer ($$Version-Rel$$)
+ * Copyright (C) $$Year-Rel$$ The Jalview Authors
+ *
+ * This file is part of Jalview.
+ *
+ * Jalview is free software: you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, either version 3
+ * of the License, or (at your option) any later version.
+ *
+ * Jalview is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR
+ * PURPOSE. See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with Jalview. If not, see <http://www.gnu.org/licenses/>.
+ * The Jalview Authors are detailed in the 'AUTHORS' file.
+ */
+
+uniprot_data_columns
+#
+_group.id
+_group.name
+_group.sort_order
+g3;Names & Taxonomy;1
+g6;Miscellaneous;6
+g7;Sequences;7
+g8;Function;8
+g9;Interaction;9
+g10;Expression;10
+g11;Gene Ontology (GO);11
+g12;Pathology & Biotech;12
+g13;Subcellular location;13
+g14;PTM / Processing;14
+g15;Structure;15
+g16;Publications;16
+g17;Date of;17
+g18;Family & Domains;18
+g19;2D Gel Databases;1000
+g20;3D Structure Databases;1000
+g21;Chemistry Databases;1000
+g22;Enzyme And Pathway Databases;1000
+g23;Family And Domain Databases;1000
+g24;Gene Expression Databases;1000
+g25;Genetic Variation Databases;1000
+g26;Genome Annotation Databases;1000
+g27;Miscellaneous Databases;1000
+g28;Organism-Specific Databases;1000
+g29;Phylogenomic Databases;1000
+g30;Protein Family/Group Databases;1000
+g31;Protein-Protein Interaction Databases;1000
+g32;Proteomic Databases;1000
+g33;Protocols And Materials Databases;1000
+g34;Ptm Databases;1000
+g35;Sequence Databases;1000
+#
+_data_column.primary_key;id
+_data_column.default_response_page_size;500
+#
+_data_column.name
+_data_column.code|_data_column.alt_code (optional: used for specifying search code when different from original code)
+_data_column.group_id
+_data_column.data_type
+_data_column.min_col_width
+_data_column.max_col_width
+_data_column.preferred_col_width
+_data_column.is_shown_by_default
+_data_column.is_searchable
+ALL;Search All;String;g7;50;1000;95;false;true
+Entry;accession;String;g3;80;150;85;true;true
+Entry name;id|accession_id;String;g3;100;150;105;true;true
+Gene names;gene_names|gene;String;g3;100;1000;145;true;true
+Gene names (primary);gene_primary;String;g3;50;1000;95;false;false
+Gene names (synonym);gene_synonym;String;g3;50;1000;95;false;false
+Gene names (ordered locus);gene_oln;String;g3;50;1000;95;false;false
+Gene names (ORF);gene_orf;String;g3;50;1000;95;false;false
+Organism;organism_name;String;g3;100;1000;200;true;true
+Organism ID;organism_id;int;g3;60;100;80;false;true
+Protein names;protein_name;String;g3;300;1500;500;true;true
+Proteomes;xref_proteomes;String;g3;50;1000;95;false;false
+Taxonomic lineage;lineage|taxonomy_name;String;g3;50;400;95;false;false
+Taxonomic lineage (IDs);lineage_ids|taxonomy_id;String;g3;50;400;95;false;false
+Virus hosts;virus_hosts|virus_host_id;String;g3;50;1000;95;false;true
+Annotation;annotation_score;String;g6;50;1000;95;false;false
+Caution;cc_caution;String;g6;50;1000;95;false;false
+Comment Count;comment_count;String;g6;50;1000;95;false;false
+Features;feature_count;String;g6;50;1000;95;false;false
+Keyword ID;keywordid;String;g6;50;1000;95;false;false
+Keywords;keyword;String;g6;50;1000;95;false;true
+Miscellaneous [CC];cc_miscellaneous;String;g6;50;1000;95;false;false
+Protein existence;protein_existence|existence;String;g6;50;1000;95;false;true
+Reviewed;reviewed;String;g6;50;100;95;true;true
+Tools;tools;String;g6;50;1000;95;false;false
+UniParc;uniparc_id;String;g6;50;1000;95;false;false
+Alternative products;cc_alternative_products;String;g7;50;1000;95;false;false
+Alternative sequence;ft_var_seq;String;g7;50;1000;95;false;false
+Erroneous gene model prediction;error_gmodel_pred;String;g7;50;1000;95;false;false
+Fragment;fragment;String;g7;50;1000;95;false;false
+Gene encoded by;organelle;String;g7;50;1000;95;false;false
+Length;length;int|T|0;g7;50;100;65;true;true
+Mass;mass;int|T|0;g7;50;100;80;false;true
+Mass spectrometry;cc_mass_spectrometry;String;g7;50;1000;95;false;false
+Natural variant;ft_variant;String;g7;50;1000;95;false;false
+Non-adjacent residues;ft_non_cons;String;g7;50;1000;95;false;false
+Non-standard residue;ft_non_std;String;g7;50;1000;95;false;false
+Non-terminal residue;ft_non_ter;String;g7;50;1000;95;false;false
+Polymorphism;cc_polymorphism;String;g7;50;1000;95;false;false
+RNA editing;cc_rna_editing;String;g7;50;1000;95;false;false
+Sequence;sequence;String;g7;50;1000;95;false;false
+Sequence caution;cc_sequence_caution;String;g7;50;1000;95;false;false
+Sequence conflict;ft_conflict;String;g7;50;1000;95;false;false
+Sequence uncertainty;ft_unsure;String;g7;50;1000;95;false;false
+Sequence version;sequence_version;String;g7;50;1000;95;false;false
+Absorption;absorption;String;g8;50;1000;95;false;false
+Active site;ft_act_site;String;g8;50;1000;95;false;false
+Activity regulation;cc_activity_regulation;String;g8;50;1000;95;false;false
+Binding site;ft_binding;String;g8;50;1000;95;false;false
+Calcium binding;ft_ca_bind;String;g8;50;1000;95;false;false
+Catalytic activity;cc_catalytic_activity;String;g8;50;1000;95;false;false
+Cofactor;cc_cofactor;String;g8;50;1000;95;false;false
+DNA binding;ft_dna_bind;String;g8;50;1000;95;false;false
+EC number;ec;String;g8;50;1000;95;false;true
+Function [CC];cc_function;String;g8;50;1000;95;false;false
+Kinetics;kinetics;String;g8;50;1000;95;false;false
+Metal binding;ft_metal;String;g8;50;1000;95;false;false
+Nucleotide binding;ft_np_bind;String;g8;50;1000;95;false;false
+Pathway;cc_pathway;String;g8;50;1000;95;false;false
+pH dependence;ph_dependence;String;g8;50;1000;95;false;false
+Redox potential;redox_potential;String;g8;50;1000;95;false;false
+Rhea ID;rhea;String;g8;50;1000;95;false;false
+Site;ft_site;String;g8;50;1000;95;false;false
+Temperature dependence;temp_dependence;String;g8;50;1000;95;false;false
+Interacts with;cc_interaction;String;g9;50;1000;95;false;false
+Subunit structure[CC];cc_subunit;String;g9;50;1000;95;false;false
+Developmental stage;cc_developmental_stage;String;g10;50;1000;95;false;false
+Induction;cc_induction;String;g10;50;1000;95;false;false
+Tissue specificity;cc_tissue_specificity;String;g10;50;1000;95;false;false
+Gene ontology (biological process);go_p;String;g11;50;1000;95;false;false
+Gene ontology (cellular component);go_c;String;g11;50;1000;95;false;false
+Gene ontology (GO);go;String;g11;50;1000;95;false;true
+Gene ontology (molecular function);go_f;String;g11;50;1000;95;false;false
+Gene ontology IDs;go_id;String;g11;50;1000;95;false;false
+Allergenic properties;cc_allergen;String;g12;50;1000;95;false;false
+Biotechnological use;cc_biotechnology;String;g12;50;1000;95;false;false
+Disruption phenotype;cc_disruption_phenotype;String;g12;50;1000;95;false;false
+Involvement in disease;cc_disease;String;g12;50;1000;95;false;false
+Mutagenesis;ft_mutagen;String;g12;50;1000;95;false;false
+Pharmaceutical use;cc_pharmaceutical;String;g12;50;1000;95;false;false
+Toxic dose;cc_toxic_dose;String;g12;50;1000;95;false;false
+Intramembrane;ft_intramem;String;g13;50;1000;95;false;false
+Subcellular location[CC];cc_subcellular_location;String;g13;50;1000;95;false;false
+Topological domain;ft_topo_dom;String;g13;50;1000;95;false;false
+Transmembrane;ft_transmem;String;g13;50;1000;95;false;false
+Chain;ft_chain;String;g14;50;1000;95;false;false
+Cross-link;ft_crosslnk;String;g14;50;1000;95;false;false
+Disulfide bond;ft_disulfid;String;g14;50;1000;95;false;false
+Glycosylation;ft_carbohyd;String;g14;50;1000;95;false;false
+Initiator methionine;ft_init_met;String;g14;50;1000;95;false;false
+Lipidation;ft_lipid;String;g14;50;1000;95;false;false
+Modified residue;ft_mod_res;String;g14;50;1000;95;false;false
+Peptide;ft_peptide;String;g14;50;1000;95;false;false
+Post-translational modification;cc_ptm;String;g14;50;1000;95;false;false
+Propeptide;ft_propep;String;g14;50;1000;95;false;false
+Signal peptide;ft_signal;String;g14;50;1000;95;false;false
+Transit peptide;ft_transit;String;g14;50;1000;95;false;false
+3D;structure_3d;String;g15;50;1000;95;false;false
+Beta strand;ft_strand;String;g15;50;1000;95;false;false
+Helix;ft_helix;String;g15;50;1000;95;false;false
+Turn;ft_turn;String;g15;50;1000;95;false;false
+PubMed ID;lit_pubmed_id;String;g16;50;1000;95;false;false
+Date of creation;date_created;String;g17;80;150;100;false;true
+Date of last modification;date_modified;String;g17;80;150;100;false;true
+Date of last sequence modification;date_sequence_modified;String;g17;80;150;100;false;true
+Entry version;version;int;g17;80;100;80;false;false
+Coiled coil;ft_coiled;String;g18;50;1000;95;false;false
+Compositional bias;ft_compbias;String;g18;50;1000;95;false;false
+Domain[CC];cc_domain;String;g18;80;1000;95;false;false
+Domain[FT];ft_domain;String;g18;50;1000;95;false;false
+Motif;ft_motif;String;g18;50;1000;95;false;false
+Protein families;protein_families|family;String;g18;50;1000;95;false;true
+Region;ft_region;String;g18;50;1000;95;false;false
+Repeat;ft_repeat;String;g18;50;1000;95;false;false
+Zinc finger;ft_zn_fing;String;g18;50;1000;95;false;false
+COMPLUYEAST-2DPAGE;xref_compluyeast-2dpage;String;g19;50;1000;95;false;false
+DOSAC-COBS-2DPAGE;xref_dosac-cobs-2dpage;String;g19;50;1000;95;false;false
+OGP;xref_ogp;String;g19;50;1000;95;false;false
+REPRODUCTION-2DPAGE;xref_reproduction-2dpage;String;g19;50;1000;95;false;false
+SWISS-2DPAGE;xref_swiss-2dpage;String;g19;50;1000;95;false;false
+UCD-2DPAGE;xref_ucd-2dpage;String;g19;50;1000;95;false;false
+World-2DPAGE;xref_world-2dpage;String;g19;50;1000;95;false;false
+AlphaFoldDB;xref_alphafolddb;String;g20;50;1000;95;false;false
+BMRB;xref_bmrb;String;g20;50;1000;95;false;false
+PCDDB;xref_pcddb;String;g20;50;1000;95;false;false
+PDB;xref_pdb;String;g20;50;1000;95;false;false
+PDBsum;xref_pdbsum;String;g20;50;1000;95;false;false
+SASBDB;xref_sasbdb;String;g20;50;1000;95;false;false
+SMR;xref_smr;String;g20;50;1000;95;false;false
+BindingDB;xref_bindingdb;String;g21;50;1000;95;false;false
+ChEMBL;xref_chembl;String;g21;50;1000;95;false;false
+DrugBank;xref_drugbank;String;g21;50;1000;95;false;false
+DrugCentral;xref_drugcentral;String;g21;50;1000;95;false;false
+GuidetoPHARMACOLOGY;xref_guidetopharmacology;String;g21;50;1000;95;false;false
+SwissLipids;xref_swisslipids;String;g21;50;1000;95;false;false
+BRENDA;xref_brenda;String;g22;50;1000;95;false;false
+BioCyc;xref_biocyc;String;g22;50;1000;95;false;false
+PathwayCommons;xref_pathwaycommons;String;g22;50;1000;95;false;false
+PlantReactome;xref_plantreactome;String;g22;50;1000;95;false;false
+Reactome;xref_reactome;String;g22;50;1000;95;false;false
+SABIO-RK;xref_sabio-rk;String;g22;50;1000;95;false;false
+SIGNOR;xref_signor;String;g22;50;1000;95;false;false
+SignaLink;xref_signalink;String;g22;50;1000;95;false;false
+UniPathway;xref_unipathway;String;g22;50;1000;95;false;false
+CDD;xref_cdd;String;g23;50;1000;95;false;false
+DisProt;xref_disprot;String;g23;50;1000;95;false;false
+Gene3D;xref_gene3d;String;g23;50;1000;95;false;false
+HAMAP;xref_hamap;String;g23;50;1000;95;false;false
+IDEAL;xref_ideal;String;g23;50;1000;95;false;false
+InterPro;xref_interpro;String;g23;50;1000;95;false;false
+PANTHER;xref_panther;String;g23;50;1000;95;false;false
+PIRSF;xref_pirsf;String;g23;50;1000;95;false;false
+PRINTS;xref_prints;String;g23;50;1000;95;false;false
+PROSITE;xref_prosite;String;g23;50;1000;95;false;false
+Pfam;xref_pfam;String;g23;50;1000;95;false;false
+ProDom;xref_prodom;String;g23;50;1000;95;false;false
+SFLD;xref_sfld;String;g23;50;1000;95;false;false
+SMART;xref_smart;String;g23;50;1000;95;false;false
+SUPFAM;xref_supfam;String;g23;50;1000;95;false;false
+TIGRFAMs;xref_tigrfams;String;g23;50;1000;95;false;false
+Bgee;xref_bgee;String;g24;50;1000;95;false;false
+CleanEx;xref_cleanex;String;g24;50;1000;95;false;false
+CollecTF;xref_collectf;String;g24;50;1000;95;false;false
+ExpressionAtlas;xref_expressionatlas;String;g24;50;1000;95;false;false
+Genevisible;xref_genevisible;String;g24;50;1000;95;false;false
+BioMuta;xref_biomuta;String;g25;50;1000;95;false;false
+DMDM;xref_dmdm;String;g25;50;1000;95;false;false
+dbSNP;xref_dbsnp;String;g25;50;1000;95;false;false
+Ensembl;xref_ensembl;String;g26;50;1000;95;false;false
+EnsemblBacteria;xref_ensemblbacteria;String;g26;50;1000;95;false;false
+EnsemblFungi;xref_ensemblfungi;String;g26;50;1000;95;false;false
+EnsemblMetazoa;xref_ensemblmetazoa;String;g26;50;1000;95;false;false
+EnsemblPlants;xref_ensemblplants;String;g26;50;1000;95;false;false
+EnsemblProtists;xref_ensemblprotists;String;g26;50;1000;95;false;false
+GeneID;xref_geneid;String;g26;50;1000;95;false;false
+Gramene;xref_gramene;String;g26;50;1000;95;false;false
+KEGG;xref_kegg;String;g26;50;1000;95;false;false
+MANE-Select;xref_mane-select;String;g26;50;1000;95;false;false
+PATRIC;xref_patric;String;g26;50;1000;95;false;false
+UCSC;xref_ucsc;String;g26;50;1000;95;false;false
+VectorBase;xref_vectorbase;String;g26;50;1000;95;false;false
+WBParaSite;xref_wbparasite;String;g26;50;1000;95;false;false
+WBParaSiteTranscriptProtein;xref_wbparasitetranscriptprotein;String;g26;50;1000;95;false;false
+BioGRID-ORCS;xref_biogrid-orcs;String;g27;50;1000;95;false;false
+ChiTaRS;xref_chitars;String;g27;50;1000;95;false;false
+EvolutionaryTrace;xref_evolutionarytrace;String;g27;50;1000;95;false;false
+GeneWiki;xref_genewiki;String;g27;50;1000;95;false;false
+GenomeRNAi;xref_genomernai;String;g27;50;1000;95;false;false
+PHI-base;xref_phi-base;String;g27;50;1000;95;false;false
+PRO;xref_pro;String;g27;50;1000;95;false;false
+Pharos;xref_pharos;String;g27;50;1000;95;false;false
+RNAct;xref_rnact;String;g27;50;1000;95;false;false
+ArachnoServer;xref_arachnoserver;String;g28;50;1000;95;false;false
+Araport;xref_araport;String;g28;50;1000;95;false;false
+CGD;xref_cgd;String;g28;50;1000;95;false;false
+CTD;xref_ctd;String;g28;50;1000;95;false;false
+ConoServer;xref_conoserver;String;g28;50;1000;95;false;false
+DisGeNET;xref_disgenet;String;g28;50;1000;95;false;false
+EchoBASE;xref_echobase;String;g28;50;1000;95;false;false
+FlyBase;xref_flybase;String;g28;50;1000;95;false;false
+GeneCards;xref_genecards;String;g28;50;1000;95;false;false
+GeneReviews;xref_genereviews;String;g28;50;1000;95;false;false
+HGNC;xref_hgnc;String;g28;50;1000;95;false;false
+HPA;xref_hpa;String;g28;50;1000;95;false;false
+LegioList;xref_legiolist;String;g28;50;1000;95;false;false
+Leproma;xref_leproma;String;g28;50;1000;95;false;false
+MGI;xref_mgi;String;g28;50;1000;95;false;false
+MIM;xref_mim;String;g28;50;1000;95;false;false
+MaizeGDB;xref_maizegdb;String;g28;50;1000;95;false;false
+MalaCards;xref_malacards;String;g28;50;1000;95;false;false
+NIAGADS;xref_niagads;String;g28;50;1000;95;false;false
+OpenTargets;xref_opentargets;String;g28;50;1000;95;false;false
+Orphanet;xref_orphanet;String;g28;50;1000;95;false;false
+PharmGKB;xref_pharmgkb;String;g28;50;1000;95;false;false
+PomBase;xref_pombase;String;g28;50;1000;95;false;false
+PseudoCAP;xref_pseudocap;String;g28;50;1000;95;false;false
+RGD;xref_rgd;String;g28;50;1000;95;false;false
+SGD;xref_sgd;String;g28;50;1000;95;false;false
+TAIR;xref_tair;String;g28;50;1000;95;false;false
+TubercuList;xref_tuberculist;String;g28;50;1000;95;false;false
+VEuPathDB;xref_veupathdb;String;g28;50;1000;95;false;false
+VGNC;xref_vgnc;String;g28;50;1000;95;false;false
+WormBase;xref_wormbase;String;g28;50;1000;95;false;false
+Xenbase;xref_xenbase;String;g28;50;1000;95;false;false
+ZFIN;xref_zfin;String;g28;50;1000;95;false;false
+dictyBase;xref_dictybase;String;g28;50;1000;95;false;false
+euHCVdb;xref_euhcvdb;String;g28;50;1000;95;false;false
+neXtProt;xref_nextprot;String;g28;50;1000;95;false;false
+GeneTree;xref_genetree;String;g29;50;1000;95;false;false
+HOGENOM;xref_hogenom;String;g29;50;1000;95;false;false
+InParanoid;xref_inparanoid;String;g29;50;1000;95;false;false
+KO;xref_ko;String;g29;50;1000;95;false;false
+OMA;xref_oma;String;g29;50;1000;95;false;false
+OrthoDB;xref_orthodb;String;g29;50;1000;95;false;false
+PhylomeDB;xref_phylomedb;String;g29;50;1000;95;false;false
+TreeFam;xref_treefam;String;g29;50;1000;95;false;false
+eggNOG;xref_eggnog;String;g29;50;1000;95;false;false
+Allergome;xref_allergome;String;g30;50;1000;95;false;false
+CAZy;xref_cazy;String;g30;50;1000;95;false;false
+CLAE;xref_clae;String;g30;50;1000;95;false;false
+ESTHER;xref_esther;String;g30;50;1000;95;false;false
+IMGT_GENE-DB;xref_imgt_gene-db;String;g30;50;1000;95;false;false
+MEROPS;xref_merops;String;g30;50;1000;95;false;false
+MoonDB;xref_moondb;String;g30;50;1000;95;false;false
+MoonProt;xref_moonprot;String;g30;50;1000;95;false;false
+PeroxiBase;xref_peroxibase;String;g30;50;1000;95;false;false
+REBASE;xref_rebase;String;g30;50;1000;95;false;false
+TCDB;xref_tcdb;String;g30;50;1000;95;false;false
+UniLectin;xref_unilectin;String;g30;50;1000;95;false;false
+BioGRID;xref_biogrid;String;g31;50;1000;95;false;false
+CORUM;xref_corum;String;g31;50;1000;95;false;false
+ComplexPortal;xref_complexportal;String;g31;50;1000;95;false;false
+DIP;xref_dip;String;g31;50;1000;95;false;false
+ELM;xref_elm;String;g31;50;1000;95;false;false
+IntAct;xref_intact;String;g31;50;1000;95;false;false
+MINT;xref_mint;String;g31;50;1000;95;false;false
+STRING;xref_string;String;g31;50;1000;95;false;false
+CPTAC;xref_cptac;String;g32;50;1000;95;false;false
+EPD;xref_epd;String;g32;50;1000;95;false;false
+MassIVE;xref_massive;String;g32;50;1000;95;false;false
+MaxQB;xref_maxqb;String;g32;50;1000;95;false;false
+PRIDE;xref_pride;String;g32;50;1000;95;false;false
+PaxDb;xref_paxdb;String;g32;50;1000;95;false;false
+PeptideAtlas;xref_peptideatlas;String;g32;50;1000;95;false;false
+ProMEX;xref_promex;String;g32;50;1000;95;false;false
+ProteomicsDB;xref_proteomicsdb;String;g32;50;1000;95;false;false
+TopDownProteomics;xref_topdownproteomics;String;g32;50;1000;95;false;false
+jPOST;xref_jpost;String;g32;50;1000;95;false;false
+ABCD;xref_abcd;String;g33;50;1000;95;false;false
+Antibodypedia;xref_antibodypedia;String;g33;50;1000;95;false;false
+CPTC;xref_cptc;String;g33;50;1000;95;false;false
+DNASU;xref_dnasu;String;g33;50;1000;95;false;false
+CarbonylDB;xref_carbonyldb;String;g34;50;1000;95;false;false
+DEPOD;xref_depod;String;g34;50;1000;95;false;false
+GlyConnect;xref_glyconnect;String;g34;50;1000;95;false;false
+GlyGen;xref_glygen;String;g34;50;1000;95;false;false
+MetOSite;xref_metosite;String;g34;50;1000;95;false;false +PhosphoSitePlus;xref_phosphositeplus;String;g34;50;1000;95;false;false +SwissPalm;xref_swisspalm;String;g34;50;1000;95;false;false +UniCarbKB;xref_unicarbkb;String;g34;50;1000;95;false;false +iPTMnet;xref_iptmnet;String;g34;50;1000;95;false;false +CCDS;xref_ccds;String;g35;50;1000;95;false;false +EMBL;xref_embl;String;g35;50;1000;95;false;false +PIR;xref_pir;String;g35;50;1000;95;false;false +RefSeq;xref_refseq;String;g35;50;1000;95;false;false +# diff --git a/resources/lang/Messages.properties b/resources/lang/Messages.properties index dda5624..ed3e06b 100644 --- a/resources/lang/Messages.properties +++ b/resources/lang/Messages.properties @@ -778,8 +778,10 @@ label.transformed_points_for_params = Transformed points for {0} label.variable_color_for = Variable Feature Colour for {0} label.select_background_colour = Select Background Colour label.invalid_font = Invalid Font +label.search_db_all = Search all of {0} +label.search_db_index = Search {0} index {1} label.separate_multiple_accession_ids = Enter one or more accession IDs separated by a semi-colon ";" -label.separate_multiple_query_values = Enter one or more {0}s separated by a semi-colon ";" +label.separate_multiple_query_values = Enter one or more {0} separated by a semi-colon ";" label.search_all = Enter one or more search values separated by a semi-colon ";" (Note: This searches the entire database) label.replace_commas_semicolons = Replace commas with semi-colons label.parsing_failed_syntax_errors_shown_below_param = Parsing failed. 
Syntax errors shown below {0} diff --git a/resources/lang/Messages_es.properties b/resources/lang/Messages_es.properties index fb87f7d..d8772ef 100644 --- a/resources/lang/Messages_es.properties +++ b/resources/lang/Messages_es.properties @@ -700,6 +700,8 @@ label.transformed_points_for_params = Puntos transformados de {0} label.variable_color_for = Color variable para la característica de {0} label.select_background_colour = Seleccionar color de fondo label.invalid_font = Fuente no válida +label.search_db_all = Buscar en todo {0} +label.search_db_index = Buscar índice {0} {1} label.separate_multiple_accession_ids = Separar los accession id con un punto y coma ";" label.replace_commas_semicolons = Cambiar comas por puntos y comas label.parsing_failed_syntax_errors_shown_below_param = Parseo erróneo. A continuación, se muestras los errores de sintaxis {0} @@ -1139,7 +1141,7 @@ label.threshold_filter=Filtro de Umbral label.add_reference_annotations=Añadir anotaciones de referencia label.hide_insertions=Ocultar Inserciones info.change_threshold_mode_to_enable=Cambiar Modo de Umbral para Habilitar -label.separate_multiple_query_values=Introducir uno o mas {0}s separados por punto y coma ";" +label.separate_multiple_query_values=Introducir uno o mas {0} separados por punto y coma ";" label.fetch_chimera_attributes = Buscar atributos desde Chimera label.fetch_chimera_attributes_tip = Copiar atributo de Chimera a característica de Jalview label.view_rna_structure=Estructura 2D VARNA diff --git a/src/jalview/fts/api/GFTSPanelI.java b/src/jalview/fts/api/GFTSPanelI.java index 974cc88..ddb9959 100644 --- a/src/jalview/fts/api/GFTSPanelI.java +++ b/src/jalview/fts/api/GFTSPanelI.java @@ -152,4 +152,11 @@ public interface GFTSPanelI * checkbox */ public String getAutosearchPreference(); + + /** + * Return the name of the database being searched + * + * @return The database name + */ + public String getDbName(); } diff --git a/src/jalview/fts/core/FTSRestClient.java 
b/src/jalview/fts/core/FTSRestClient.java index ac5b280..62dd13f 100644 --- a/src/jalview/fts/core/FTSRestClient.java +++ b/src/jalview/fts/core/FTSRestClient.java @@ -20,8 +20,6 @@ */ package jalview.fts.core; -import java.util.Locale; - import java.io.BufferedReader; import java.io.IOException; import java.io.InputStream; @@ -29,13 +27,13 @@ import java.io.InputStreamReader; import java.util.ArrayList; import java.util.Collection; import java.util.HashMap; +import java.util.Locale; import java.util.Objects; import jalview.fts.api.FTSDataColumnI; import jalview.fts.api.FTSDataColumnI.FTSDataColumnGroupI; -import jalview.fts.core.FTSDataColumnPreferences.PreferenceSource; -import jalview.fts.service.threedbeacons.TDBeaconsFTSRestClient; import jalview.fts.api.FTSRestClientI; +import jalview.fts.core.FTSDataColumnPreferences.PreferenceSource; /** * Base class providing implementation for common methods defined in @@ -146,7 +144,7 @@ public abstract class FTSRestClient implements FTSRestClientI @Override public String toString() { - return lineData[0]; + return getName(); } @Override diff --git a/src/jalview/fts/core/GFTSPanel.java b/src/jalview/fts/core/GFTSPanel.java index ea206e9..d52ff89 100644 --- a/src/jalview/fts/core/GFTSPanel.java +++ b/src/jalview/fts/core/GFTSPanel.java @@ -21,18 +21,6 @@ package jalview.fts.core; -import jalview.bin.Cache; -import jalview.fts.api.FTSDataColumnI; -import jalview.fts.api.GFTSPanelI; -import jalview.fts.core.FTSDataColumnPreferences.PreferenceSource; -import jalview.gui.Desktop; -import jalview.gui.IProgressIndicator; -import jalview.gui.JvSwingUtils; -import jalview.gui.SequenceFetcher; -import jalview.io.cache.JvCacheableInputBox; -import jalview.util.MessageManager; -import jalview.util.Platform; - import java.awt.BorderLayout; import java.awt.CardLayout; import java.awt.Dimension; @@ -77,6 +65,18 @@ import javax.swing.event.InternalFrameEvent; import javax.swing.table.DefaultTableModel; import 
javax.swing.table.TableColumn; +import jalview.bin.Cache; +import jalview.fts.api.FTSDataColumnI; +import jalview.fts.api.GFTSPanelI; +import jalview.fts.core.FTSDataColumnPreferences.PreferenceSource; +import jalview.gui.Desktop; +import jalview.gui.IProgressIndicator; +import jalview.gui.JvSwingUtils; +import jalview.gui.SequenceFetcher; +import jalview.io.cache.JvCacheableInputBox; +import jalview.util.MessageManager; +import jalview.util.Platform; + /** * This class provides the swing GUI layout for FTS Panel and implements most of * the contracts defined in GFSPanelI @@ -544,7 +544,10 @@ public abstract class GFTSPanel extends JPanel implements GFTSPanelI } txt_search.getComponent().setToolTipText( JvSwingUtils.wrapTooltip(true, tooltipText)); + // if (btn_autosearch.isSelected()) searchAction(true); + + setCmbSearchTargetTooltip(); } } }); @@ -678,6 +681,7 @@ public abstract class GFTSPanel extends JPanel implements GFTSPanelI pnl_actions.add(btn_cancel); pnl_results.add(tabbedPane); + setCmbSearchTargetTooltip(); pnl_inputs.add(cmb_searchTarget); pnl_inputs.add(txt_search.getComponent()); pnl_inputs.add(txt_help); @@ -1085,4 +1089,28 @@ public abstract class GFTSPanel extends JPanel implements GFTSPanelI @Override public abstract void okAction(); + + private void setCmbSearchTargetTooltip() + { + JComboBox cmb = getCmbSearchTarget(); + if (cmb.isEnabled()) + { + boolean isAll = "all" + .equalsIgnoreCase(cmb.getSelectedItem().toString()); + FTSDataColumnI index = (FTSDataColumnI) cmb.getSelectedItem(); + String indexCode = index.getAltCode(); + String dbName = getDbName(); + String message = isAll ? 
MessageManager + .formatMessage("label.search_db_all", new Object[] + { dbName }) + : MessageManager.formatMessage("label.search_db_index", + new Object[] + { dbName, indexCode }); + cmb.setToolTipText(message); + } + else + { + cmb.setToolTipText(""); + } + } } diff --git a/src/jalview/fts/service/pdb/PDBFTSPanel.java b/src/jalview/fts/service/pdb/PDBFTSPanel.java index 33c1b6c..c9d7676 100644 --- a/src/jalview/fts/service/pdb/PDBFTSPanel.java +++ b/src/jalview/fts/service/pdb/PDBFTSPanel.java @@ -21,6 +21,12 @@ package jalview.fts.service.pdb; +import java.util.HashMap; +import java.util.HashSet; +import java.util.Map; + +import javax.help.HelpSetException; + import jalview.fts.api.FTSDataColumnI; import jalview.fts.api.FTSRestClientI; import jalview.fts.core.FTSRestRequest; @@ -31,12 +37,6 @@ import jalview.gui.Help.HelpId; import jalview.gui.SequenceFetcher; import jalview.util.MessageManager; -import java.util.HashMap; -import java.util.HashSet; -import java.util.Map; - -import javax.help.HelpSetException; - @SuppressWarnings("serial") public class PDBFTSPanel extends GFTSPanel { @@ -302,4 +302,9 @@ public class PDBFTSPanel extends GFTSPanel e1.printStackTrace(); } } + + public String getDbName() + { + return "PDB"; + } } \ No newline at end of file diff --git a/src/jalview/fts/service/threedbeacons/TDBeaconsFTSPanel.java b/src/jalview/fts/service/threedbeacons/TDBeaconsFTSPanel.java index 6ca3ca2..253de42 100644 --- a/src/jalview/fts/service/threedbeacons/TDBeaconsFTSPanel.java +++ b/src/jalview/fts/service/threedbeacons/TDBeaconsFTSPanel.java @@ -20,21 +20,13 @@ */ package jalview.fts.service.threedbeacons; -import java.io.BufferedReader; -import java.io.IOException; -import java.io.InputStreamReader; import java.net.HttpURLConnection; -import java.net.MalformedURLException; -import java.net.URL; import java.util.HashMap; import java.util.HashSet; import java.util.Map; import javax.swing.SwingUtilities; -import org.json.JSONArray; -import 
org.json.JSONObject; - import jalview.bin.Console; import jalview.datamodel.AlignmentI; import jalview.fts.api.FTSDataColumnI; @@ -42,12 +34,9 @@ import jalview.fts.api.FTSRestClientI; import jalview.fts.core.FTSRestRequest; import jalview.fts.core.FTSRestResponse; import jalview.fts.core.GFTSPanel; -import jalview.fts.service.pdb.PDBFTSRestClient; import jalview.gui.SequenceFetcher; import jalview.io.DataSourceType; import jalview.io.FileFormat; -import jalview.io.FileFormatI; -import jalview.io.FileLoader; import jalview.io.FormatAdapter; import jalview.util.MessageManager; @@ -297,4 +286,9 @@ public class TDBeaconsFTSPanel extends GFTSPanel // no multiple query support yet return enteredText; } + + public String getDbName() + { + return "3D-Beacons"; + } } diff --git a/src/jalview/fts/service/uniprot/UniProtFTSRestClient.java b/src/jalview/fts/service/uniprot/UniProtFTSRestClient.java index 2606b62..1827293 100644 --- a/src/jalview/fts/service/uniprot/UniProtFTSRestClient.java +++ b/src/jalview/fts/service/uniprot/UniProtFTSRestClient.java @@ -21,6 +21,9 @@ package jalview.fts.service.uniprot; +import java.lang.invoke.MethodHandles; +import java.net.MalformedURLException; +import java.net.URL; import java.util.ArrayList; import java.util.Collection; import java.util.List; @@ -37,31 +40,63 @@ import jalview.bin.Cache; import jalview.bin.Console; import jalview.fts.api.FTSData; import jalview.fts.api.FTSDataColumnI; -import jalview.fts.api.FTSRestClientI; import jalview.fts.core.FTSRestClient; import jalview.fts.core.FTSRestRequest; import jalview.fts.core.FTSRestResponse; +import jalview.util.ChannelProperties; import jalview.util.MessageManager; import jalview.util.Platform; +/* + * 2022-07-20 bsoares + * See https://issues.jalview.org/browse/JAL-4036 + * The new Uniprot API is not dissimilar to the old one, but has some important changes. + * Some group names have changed slightly, some old groups have gone and there are quite a few new groups. 
+ * + * Most changes are mappings of old column ids to new field ids. There are a handful of old + * columns not mapped to new fields, and new fields without an old column. + * [aside: not all possible columns were listed in the resources/fts/uniprot_data_columns.txt file. + * These were presumably additions after the file was created] + * For existing/mapped fields, the same preferences found in the resource file have been migrated to + * the new file with the new field name, id and group. + * + * The new mapped groups and files are stored and read from resources/fts/uniprot_data_columns-2022.txt. + * + * There is now no "sort" query string parameter. + * + * See https://www.uniprot.org/help/api_queries + * + * SIGNIFICANT CHANGE: Pagination is no longer performed using a record offset, but with a "cursor" + * query string parameter that is not really a cursor. The value is an opaque string that is passed (or + * rather a whole URL is passed) in the "Link" header of the HTTP response of the previous page. + * Where such a link is passed it is put into the cursors ArrayList. + * There are @Overridden methods in UniprotFTSPanel. 
+ */ + public class UniProtFTSRestClient extends FTSRestClient { - private static final String DEFAULT_UNIPROT_DOMAIN = "https://legacy.uniprot.org"; + private static final String DEFAULT_UNIPROT_DOMAIN = "https://rest.uniprot.org"; + + private static final String USER_AGENT = ChannelProperties + .getProperty("app_name", "Jalview") + " " + + Cache.getDefault("VERSION", "Unknown") + " " + + MethodHandles.lookup().lookupClass() + " help@jalview.org"; static { Platform.addJ2SDirectDatabaseCall(DEFAULT_UNIPROT_DOMAIN); } - private static FTSRestClientI instance = null; + private static UniProtFTSRestClient instance = null; public final String uniprotSearchEndpoint; public UniProtFTSRestClient() { super(); - uniprotSearchEndpoint = Cache.getDefault("UNIPROT_DOMAIN", - DEFAULT_UNIPROT_DOMAIN) + "/uniprot/"; + this.clearCursors(); + uniprotSearchEndpoint = Cache.getDefault("UNIPROT_2022_DOMAIN", + DEFAULT_UNIPROT_DOMAIN) + "/uniprotkb/search"; } @SuppressWarnings("unchecked") @@ -69,6 +104,12 @@ public class UniProtFTSRestClient extends FTSRestClient public FTSRestResponse executeRequest(FTSRestRequest uniprotRestRequest) throws Exception { + return executeRequest(uniprotRestRequest, null); + } + + public FTSRestResponse executeRequest(FTSRestRequest uniprotRestRequest, + String cursor) throws Exception + { try { String wantedFields = getDataColumnsFieldsAsCommaDelimitedString( @@ -85,11 +126,10 @@ public class UniProtFTSRestClient extends FTSRestClient } else { - query = uniprotRestRequest.getFieldToSearchBy() - .equalsIgnoreCase("Search All") - ? uniprotRestRequest.getSearchTerm() - + " or mnemonic:" - + uniprotRestRequest.getSearchTerm() + query = uniprotRestRequest.getFieldToSearchBy().equalsIgnoreCase( + "Search All") ? 
uniprotRestRequest.getSearchTerm() + // + " or mnemonic:" + // + uniprotRestRequest.getSearchTerm() : uniprotRestRequest.getFieldToSearchBy() + ":" + uniprotRestRequest.getSearchTerm(); } @@ -119,18 +159,62 @@ public class UniProtFTSRestClient extends FTSRestClient WebResource webResource = null; webResource = client.resource(uniprotSearchEndpoint) - .queryParam("format", "tab") - .queryParam("columns", wantedFields) - .queryParam("limit", String.valueOf(responseSize)) - .queryParam("offset", String.valueOf(offSet)) - .queryParam("sort", "score").queryParam("query", query); - if (Console.isDebugEnabled()) + .queryParam("format", "tsv") + .queryParam("fields", wantedFields) + .queryParam("size", String.valueOf(responseSize)) + /* 2022 new api has no "sort" + * .queryParam("sort", "score") + */ + .queryParam("query", query); + if (offSet != 0 && cursor != null && cursor.length() > 0) + // 2022 new api does not do pagination with an offset, it requires a + // "cursor" parameter with a key (given for the next page). 
+ // (see https://www.uniprot.org/help/pagination) { - Console.debug("Uniprot FTS Request: " + webResource.toString()); + webResource = webResource.queryParam("cursor", cursor); } + Console.debug( + "Uniprot FTS Request: " + webResource.getURI().toString()); // Execute the REST request - ClientResponse clientResponse = webResource - .accept(MediaType.TEXT_PLAIN).get(clientResponseClass); + WebResource.Builder wrBuilder = webResource + .accept(MediaType.TEXT_PLAIN); + if (!Platform.isJS()) + /** + * Java only + * + * @j2sIgnore + */ + { + wrBuilder.header("User-Agent", USER_AGENT); + } + ClientResponse clientResponse = wrBuilder.get(clientResponseClass); + + if (!Platform.isJS()) + /** + * Java only + * + * @j2sIgnore + */ + { + if (clientResponse.getHeaders().containsKey("Link")) + { + // extract the URL from the 'Link: ; ref="stuff"' header + String linkHeader = clientResponse.getHeaders().get("Link") + .get(0); + if (linkHeader.indexOf("<") > -1) + { + String temp = linkHeader.substring(linkHeader.indexOf("<") + 1); + if (temp.indexOf(">") > -1) + { + String nextUrl = temp.substring(0, temp.indexOf(">")); + // then get the cursor value from the query string parameters + String nextCursor = getQueryParam("cursor", nextUrl); + setCursor(cursorPage + 1, nextCursor); + } + } + } + } + String uniProtTabDelimittedResponseString = clientResponse .getEntity(String.class); // Make redundant objects eligible for garbage collection to conserve @@ -144,15 +228,27 @@ public class UniProtFTSRestClient extends FTSRestClient throw new Exception(errorMessage); } - int xTotalResults = Platform.isJS() ? 
1 - : Integer.valueOf(clientResponse.getHeaders() - .get("X-Total-Results").get(0)); + // new Uniprot API is not including a "X-Total-Results" header when there + // are 0 results + List resultsHeaders = clientResponse.getHeaders() + .get("X-Total-Results"); + int xTotalResults = 0; + if (Platform.isJS()) + { + xTotalResults = 1; + } + else if (resultsHeaders != null && resultsHeaders.size() >= 1) + { + xTotalResults = Integer.valueOf(resultsHeaders.get(0)); + } clientResponse = null; client = null; return parseUniprotResponse(uniProtTabDelimittedResponseString, uniprotRestRequest, xTotalResults); } catch (Exception e) { + Console.warn("Problem with the query: " + e.getMessage()); + Console.debug("Exception stacktrace:", e); String exceptionMsg = e.getMessage(); if (exceptionMsg.contains("SocketException")) { @@ -352,7 +448,7 @@ public class UniProtFTSRestClient extends FTSRestClient }; } - public static FTSRestClientI getInstance() + public static UniProtFTSRestClient getInstance() { if (instance == null) { @@ -364,7 +460,94 @@ public class UniProtFTSRestClient extends FTSRestClient @Override public String getColumnDataConfigFileName() { - return "/fts/uniprot_data_columns.txt"; + return "/fts/uniprot_data_columns-2022.txt"; + } + + /* 2022-07-20 bsoares + * used for the new API "cursor" pagination. 
See https://www.uniprot.org/help/pagination + */ + private ArrayList cursors; + + private int cursorPage = 0; + + protected int getCursorPage() + { + return cursorPage; + } + + protected void setCursorPage(int i) + { + cursorPage = i; + } + + protected void setPrevCursorPage() + { + if (cursorPage > 0) + cursorPage--; + } + + protected void setNextCursorPage() + { + cursorPage++; + } + + protected void clearCursors() + { + cursors = new ArrayList(10); } -} + protected String getCursor(int i) + { + return cursors.get(i); + } + + protected String getNextCursor() + { + if (cursors.size() < cursorPage + 2) + return null; + return cursors.get(cursorPage + 1); + } + + protected String getPrevCursor() + { + if (cursorPage == 0) + return null; + return cursors.get(cursorPage - 1); + } + + protected void setCursor(int i, String c) + { + cursors.ensureCapacity(i + 1); + while (cursors.size() <= i) + { + cursors.add(null); + } + cursors.set(i, c); + Console.debug( + "Set UniprotFRSRestClient cursors[" + i + "] to '" + c + "'"); + // cursors.add(c); + } + + public static String getQueryParam(String param, String u) + { + if (param == null || u == null) + return null; + try + { + URL url = new URL(u); + String[] kevs = url.getQuery().split("&"); + for (int j = 0; j < kevs.length; j++) + { + String[] kev = kevs[j].split("=", 2); + if (param.equals(kev[0])) + { + return kev[1]; + } + } + } catch (MalformedURLException e) + { + Console.warn("Could not obtain next page 'cursor' value from 'u"); + } + return null; + } +} \ No newline at end of file diff --git a/src/jalview/fts/service/uniprot/UniprotFTSPanel.java b/src/jalview/fts/service/uniprot/UniprotFTSPanel.java index 33ad8c4..aa0942f 100644 --- a/src/jalview/fts/service/uniprot/UniprotFTSPanel.java +++ b/src/jalview/fts/service/uniprot/UniprotFTSPanel.java @@ -21,6 +21,13 @@ package jalview.fts.service.uniprot; +import java.util.HashMap; +import java.util.HashSet; +import java.util.Map; + +import javax.help.HelpSetException; + 
+import jalview.bin.Console; import jalview.fts.api.FTSDataColumnI; import jalview.fts.api.FTSRestClientI; import jalview.fts.core.FTSRestRequest; @@ -31,12 +38,6 @@ import jalview.gui.Help.HelpId; import jalview.gui.SequenceFetcher; import jalview.util.MessageManager; -import java.util.HashMap; -import java.util.HashSet; -import java.util.Map; - -import javax.help.HelpSetException; - @SuppressWarnings("serial") public class UniprotFTSPanel extends GFTSPanel { @@ -46,7 +47,7 @@ public class UniprotFTSPanel extends GFTSPanel private static Map tempUserPrefs = new HashMap<>(); - private static final String UNIPROT_FTS_CACHE_KEY = "CACHE.UNIPROT_FTS"; + private static final String UNIPROT_FTS_CACHE_KEY = "CACHE.UNIPROT_2022_FTS"; private static final String UNIPROT_AUTOSEARCH = "FTS.UNIPROT.AUTOSEARCH"; @@ -69,10 +70,19 @@ public class UniprotFTSPanel extends GFTSPanel @Override public void searchAction(boolean isFreshSearch) { + searchAction(null, isFreshSearch); + } + + public void searchAction(String cursor, boolean isFreshSearch) + { mainFrame.requestFocusInWindow(); if (isFreshSearch) { offSet = 0; + UniProtFTSRestClient c = UniProtFTSRestClient.getInstance(); + c.clearCursors(); + c.setCursorPage(0); + c.setCursor(0, ""); } new Thread() { @@ -97,12 +107,12 @@ public class UniprotFTSPanel extends GFTSPanel request.setSearchTerm(searchTerm); request.setOffSet(offSet); request.setWantedFields(wantedFields); - FTSRestClientI uniProtRestClient = UniProtFTSRestClient + UniProtFTSRestClient uniProtRestClient = UniProtFTSRestClient .getInstance(); FTSRestResponse resultList; try { - resultList = uniProtRestClient.executeRequest(request); + resultList = uniProtRestClient.executeRequest(request, cursor); } catch (Exception e) { setErrorMessage(e.getMessage()); @@ -268,4 +278,72 @@ public class UniprotFTSPanel extends GFTSPanel e1.printStackTrace(); } } + + /* + * 2022-07-20 bsoares + * The new Uniprot API has a strange pagination process described at + * 
https://www.uniprot.org/help/pagination + * When a successful request returns results, with more results past the size + * limit, the response sends a "Link" header with a URL containing the a "cursor" + * parameter with an opaque string that refers to the next page of results. + * These are store as nextCursor in the UniProtFTSRestClient along with the currCursor. + * When navigation across pages occurs these should be swapped around. + */ + @Override + public void refreshPaginatorState() + { + UniProtFTSRestClient c = UniProtFTSRestClient.getInstance(); + setNextPageButtonEnabled(c.getNextCursor() != null); + setPrevPageButtonEnabled(c.getPrevCursor() != null); + } + + @Override + public void prevPageAction() + { + updatePaginatorCart(); + UniProtFTSRestClient c = UniProtFTSRestClient.getInstance(); + String prevCursor = c.getPrevCursor(); + if (prevCursor != null) + { + if (offSet >= pageLimit) + { + offSet -= pageLimit; + } + else + { + // not sure what's happening if we get here though it wouldn't surprise + // me + Console.warn( + "UniprotFTSPanel: prevCursor exists but offset < pageLimit. This probably shouldn't be happening."); + } + c.setPrevCursorPage(); + searchAction(prevCursor, false); + } + else + { + refreshPaginatorState(); + } + } + + @Override + public void nextPageAction() + { + UniProtFTSRestClient c = UniProtFTSRestClient.getInstance(); + String nextCursor = c.getNextCursor(); + if (nextCursor != null) + { + offSet += pageLimit; + c.setNextCursorPage(); + searchAction(nextCursor, false); + } + else + { + refreshPaginatorState(); + } + } + + public String getDbName() + { + return "UniProt"; + } }
    human antigen
citation:(author:Arai author:Chung)
(lit_author:Arai) AND (lit_author:Chung)
All entries with a publication that was coauthored by two specific authors.