X-Git-Url: http://source.jalview.org/gitweb/?a=blobdiff_plain;ds=sidebyside;f=forester%2Fresources%2Fphyloxml_schema%2F1.10%2Fphyloxml.xsd;h=bd54eb5fbeb5f580abc3a5ce1d0a8e121f304042;hb=0b49b8e750b34d28a5989facdd8a7959870de996;hp=d8fba850e2f1f8c2a874fc8c40cfadc6e3b66d18;hpb=cc27e69a8f08da4ee1d0c1823511256dae3dd32f;p=jalview.git
diff --git a/forester/resources/phyloxml_schema/1.10/phyloxml.xsd b/forester/resources/phyloxml_schema/1.10/phyloxml.xsd
index d8fba85..bd54eb5 100644
--- a/forester/resources/phyloxml_schema/1.10/phyloxml.xsd
+++ b/forester/resources/phyloxml_schema/1.10/phyloxml.xsd
@@ -1,592 +1,574 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- phyloXML is an XML language to describe evolutionary trees and associated data. Version: 1.10.
- License: dual-licensed under the LGPL or Ruby's License. Copyright (c) 2008-2011 Christian M Zmasek.
-
-
-
-
-
-
- 'phyloxml' is the name of the root element. Phyloxml contains an arbitrary number of
- 'phylogeny' elements (each representing one phylogeny) possibly followed by elements from other namespaces.
-
-
-
-
-
-
-
-
-
-
- Element Phylogeny is used to represent a phylogeny. The required attribute 'rooted' is used
- to indicate whether the phylogeny is rooted or not. The attribute 'rerootable' can be used to indicate that
- the phylogeny is not allowed to be rooted differently (i.e. because it is associated with root dependent
- data, such as gene duplications). The attribute 'type' can be used to indicate the type of phylogeny (i.e.
- 'gene tree'). It is recommended to use the attribute 'branch_length_unit' if the phylogeny has branch
- lengths. Element clade is used in a recursive manner to describe the topology of a phylogenetic
- tree.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Element Clade is used in a recursive manner to describe the topology of a phylogenetic tree.
- The parent branch length of a clade can be described either with the 'branch_length' element or the
- 'branch_length' attribute (it is not recommended to use both at the same time, though). Usage of the
- 'branch_length' attribute allows for a less verbose description. Element 'confidence' is used to indicate
- the support for a clade/parent branch. Element 'events' is used to describe such events as gene-duplications
- at the root node/parent branch of a clade. Element 'width' is the branch width for this clade (including
- parent branch). Both 'color' and 'width' elements apply for the whole clade unless overwritten in-sub
- clades. Attribute 'id_source' is used to link other elements to a clade (on the xml-level).
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Element Taxonomy is used to describe taxonomic information for a clade. Element 'code' is
- intended to store UniProt/Swiss-Prot style organism codes (e.g. 'APLCA' for the California sea hare 'Aplysia
- californica') or other styles of mnemonics (e.g. 'Aca'). Element 'authority' is used to keep the authority,
- such as 'J. G. Cooper, 1863', associated with the 'scientific_name'. Element 'id' is used for a unique
- identifier of a taxon (for example '6500' with 'ncbi_taxonomy' as 'provider' for the California sea hare).
- Attribute 'id_source' is used to link other elements to a taxonomy (on the xml-level).
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Element Sequence is used to represent a molecular sequence (Protein, DNA, RNA) associated
- with a node. 'symbol' is a short (maximal 20 characters) symbol of the sequence (e.g. 'ACTM') whereas
- 'name' is used for the full name (e.g. 'muscle Actin'). 'gene_name' can be used when protein and gene names differ.
- 'location' is used for the location of a sequence on a genome/chromosome. The actual sequence can be stored with the
- 'mol_seq' element. Attribute 'type' is used to indicate the type of sequence ('dna', 'rna', or 'protein').
- One intended use for 'id_ref' is to link a sequence to a taxonomy (via the taxonomy's 'id_source') in case
- of multiple sequences and taxonomies per node.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Element 'mol_seq' is used to store molecular sequences. The 'is_aligned' attribute is used
- to indicated that this molecular sequence is aligned with all other sequences in the same phylogeny for
- which 'is aligned' is true as well (which, in most cases, means that gaps were introduced, and that all
- sequences for which 'is aligned' is true must have the same length).
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Element Accession is used to capture the local part in a sequence identifier (e.g. 'P17304'
- in 'UniProtKB:P17304', in which case the 'source' attribute would be 'UniProtKB').
-
-
-
-
-
-
-
-
-
-
-
- Used to store accessions to additional resources.
-
-
-
-
-
-
-
-
- This is used describe the domain architecture of a protein. Attribute 'length' is the total
- length of the protein
-
-
-
-
-
-
-
-
- To represent an individual domain in a domain architecture. The name/unique identifier is
- described via the 'id' attribute. 'confidence' can be used to store (i.e.) E-values.
-
-
-
-
-
-
-
-
-
-
-
-
-
- Events at the root node of a clade (e.g. one gene duplication).
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- The names and/or counts of binary characters present, gained, and lost at the root of a
- clade.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- A literature reference for a clade. It is recommended to use the 'doi' attribute instead of
- the free text 'desc' element whenever possible.
-
-
-
-
-
-
-
-
-
- The annotation of a molecular sequence. It is recommended to annotate by using the optional
- 'ref' attribute (some examples of acceptable values for the ref attribute: 'GO:0008270',
- 'KEGG:Tetrachloroethene degradation', 'EC:1.1.1.1'). Optional element 'desc' allows for a free text
- description. Optional element 'confidence' is used to state the type and value of support for a annotation.
- Similarly, optional attribute 'evidence' is used to describe the evidence for a annotation as free text
- (e.g. 'experimental'). Optional element 'property' allows for further, typed and referenced annotations from
- external resources.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Property allows for typed and referenced properties from external resources to be attached
- to 'Phylogeny', 'Clade', and 'Annotation'. The value of a property is its mixed (free text) content.
- Attribute 'datatype' indicates the type of a property and is limited to xsd-datatypes (e.g. 'xsd:string',
- 'xsd:boolean', 'xsd:integer', 'xsd:decimal', 'xsd:float', 'xsd:double', 'xsd:date', 'xsd:anyURI'). Attribute
- 'applies_to' indicates the item to which a property applies to (e.g. 'node' for the parent node of a clade,
- 'parent_branch' for the parent branch of a clade). Attribute 'id_ref' allows to attached a property
- specifically to one element (on the xml-level). Optional attribute 'unit' is used to indicate the unit of
- the property. An example: <property datatype="xsd:integer" ref="NOAA:depth" applies_to="clade"
- unit="METRIC:m"> 200 </property>
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- A uniform resource identifier. In general, this is expected to be an URL (for example, to
- link to an image on a website, in which case the 'type' attribute might be 'image' and 'desc' might be
- 'image of a California sea hare').
-
-
-
-
-
-
-
-
-
-
-
- A general purpose confidence element. For example this can be used to express the bootstrap
- support value of a clade (in which case the 'type' attribute is 'bootstrap').
-
-
-
-
-
-
-
-
-
-
-
- A general purpose identifier element. Allows to indicate the provider (or authority) of an
- identifier.
-
-
-
-
-
-
-
-
-
-
- The geographic distribution of the items of a clade (species, sequences), intended for
- phylogeographic applications. The location can be described either by free text in the 'desc' element and/or
- by the coordinates of one or more 'Points' (similar to the 'Point' element in Google's KML format) or by
- 'Polygons'.
-
-
-
-
-
-
-
-
-
- The coordinates of a point with an optional altitude (used by element 'Distribution').
- Required attributes are the 'geodetic_datum' used to indicate the geodetic datum (also called 'map datum',
- for example Google's KML uses 'WGS84'). Attribute 'alt_unit' is the unit for the altitude (e.g. 'meter').
-
-
-
-
-
-
-
-
-
-
-
-
- A polygon defined by a list of 'Points' (used by element 'Distribution').
-
-
-
-
-
-
-
-
-
- A date associated with a clade/node. Its value can be numerical by using the 'value' element
- and/or free text with the 'desc' element' (e.g. 'Silurian'). If a numerical value is used, it is recommended
- to employ the 'unit' attribute to indicate the type of the numerical value (e.g. 'mya' for 'million years
- ago'). The elements 'minimum' and 'maximum' are used the indicate a range/confidence
- interval
-
-
-
-
-
-
-
-
-
-
-
-
- This indicates the color of a clade when rendered (the color applies to the whole clade
- unless overwritten by the color(s) of sub clades).
-
-
-
-
-
-
-
-
-
-
- This is used to express a typed relationship between two sequences. For example it could be
- used to describe an orthology (in which case attribute 'type' is 'orthology').
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- This is used to express a typed relationship between two clades. For example it could be
- used to describe multiple parents of a clade.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ phyloXML is an XML language to describe evolutionary trees and associated data. Version: 1.10.
+ License: dual-licensed under the LGPL or Ruby's License. Copyright (c) 2008-2011 Christian M Zmasek.
+
+
+
+
+
+
+ 'phyloxml' is the name of the root element. Phyloxml contains an arbitrary number of
+ 'phylogeny' elements (each representing one phylogeny) possibly followed by elements from other namespaces.
+
+
+
+
+
+
+
+
+
+
+ Element Phylogeny is used to represent a phylogeny. The required attribute 'rooted' is used
+ to indicate whether the phylogeny is rooted or not. The attribute 'rerootable' can be used to indicate that
+ the phylogeny is not allowed to be rooted differently (i.e. because it is associated with root dependent
+ data, such as gene duplications). The attribute 'type' can be used to indicate the type of phylogeny (i.e.
+ 'gene tree'). It is recommended to use the attribute 'branch_length_unit' if the phylogeny has branch
+ lengths. Element clade is used in a recursive manner to describe the topology of a phylogenetic
+ tree.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Element Clade is used in a recursive manner to describe the topology of a phylogenetic tree.
+ The parent branch length of a clade can be described either with the 'branch_length' element or the
+ 'branch_length' attribute (it is not recommended to use both at the same time, though). Usage of the
+ 'branch_length' attribute allows for a less verbose description. Element 'confidence' is used to indicate
+ the support for a clade/parent branch. Element 'events' is used to describe such events as gene-duplications
+ at the root node/parent branch of a clade. Element 'width' is the branch width for this clade (including
+ parent branch). Both 'color' and 'width' elements apply for the whole clade unless overwritten in-sub
+ clades. Attribute 'id_source' is used to link other elements to a clade (on the xml-level).
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Element Taxonomy is used to describe taxonomic information for a clade. Element 'code' is
+ intended to store UniProt/Swiss-Prot style organism codes (e.g. 'APLCA' for the California sea hare 'Aplysia
+ californica') or other styles of mnemonics (e.g. 'Aca'). Element 'authority' is used to keep the authority,
+ such as 'J. G. Cooper, 1863', associated with the 'scientific_name'. Element 'id' is used for a unique
+ identifier of a taxon (for example '6500' with 'ncbi_taxonomy' as 'provider' for the California sea hare).
+ Attribute 'id_source' is used to link other elements to a taxonomy (on the xml-level).
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Element Sequence is used to represent a molecular sequence (Protein, DNA, RNA) associated
+ with a node. 'symbol' is a short (maximal ten characters) symbol of the sequence (e.g. 'ACTM') whereas
+ 'name' is used for the full name (e.g. 'muscle Actin'). 'location' is used for the location of a sequence on
+ a genome/chromosome. The actual sequence can be stored with the 'mol_seq' element. Attribute 'type' is used
+ to indicate the type of sequence ('dna', 'rna', or 'protein'). One intended use for 'id_ref' is to link a
+ sequence to a taxonomy (via the taxonomy's 'id_source') in case of multiple sequences and taxonomies per
+ node.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Element 'mol_seq' is used to store molecular sequences. The 'is_aligned' attribute is used
+ to indicated that this molecular sequence is aligned with all other sequences in the same phylogeny for
+ which 'is aligned' is true as well (which, in most cases, means that gaps were introduced, and that all
+ sequences for which 'is aligned' is true must have the same length).
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Element Accession is used to capture the local part in a sequence identifier (e.g. 'P17304'
+ in 'UniProtKB:P17304', in which case the 'source' attribute would be 'UniProtKB').
+
+
+
+
+
+
+
+
+
+
+ This is used describe the domain architecture of a protein. Attribute 'length' is the total
+ length of the protein
+
+
+
+
+
+
+
+
+ To represent an individual domain in a domain architecture. The name/unique identifier is
+ described via the 'id' attribute. 'confidence' can be used to store (i.e.) E-values.
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Events at the root node of a clade (e.g. one gene duplication).
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The names and/or counts of binary characters present, gained, and lost at the root of a
+ clade.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ A literature reference for a clade. It is recommended to use the 'doi' attribute instead of
+ the free text 'desc' element whenever possible.
+
+
+
+
+
+
+
+
+
+ The annotation of a molecular sequence. It is recommended to annotate by using the optional
+ 'ref' attribute (some examples of acceptable values for the ref attribute: 'GO:0008270',
+ 'KEGG:Tetrachloroethene degradation', 'EC:1.1.1.1'). Optional element 'desc' allows for a free text
+ description. Optional element 'confidence' is used to state the type and value of support for a annotation.
+ Similarly, optional attribute 'evidence' is used to describe the evidence for a annotation as free text
+ (e.g. 'experimental'). Optional element 'property' allows for further, typed and referenced annotations from
+ external resources.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Property allows for typed and referenced properties from external resources to be attached
+ to 'Phylogeny', 'Clade', and 'Annotation'. The value of a property is its mixed (free text) content.
+ Attribute 'datatype' indicates the type of a property and is limited to xsd-datatypes (e.g. 'xsd:string',
+ 'xsd:boolean', 'xsd:integer', 'xsd:decimal', 'xsd:float', 'xsd:double', 'xsd:date', 'xsd:anyURI'). Attribute
+ 'applies_to' indicates the item to which a property applies to (e.g. 'node' for the parent node of a clade,
+ 'parent_branch' for the parent branch of a clade). Attribute 'id_ref' allows to attached a property
+ specifically to one element (on the xml-level). Optional attribute 'unit' is used to indicate the unit of
+ the property. An example: <property datatype="xsd:integer" ref="NOAA:depth" applies_to="clade"
+ unit="METRIC:m"> 200 </property>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ A uniform resource identifier. In general, this is expected to be an URL (for example, to
+ link to an image on a website, in which case the 'type' attribute might be 'image' and 'desc' might be
+ 'image of a California sea hare').
+
+
+
+
+
+
+
+
+
+
+
+ A general purpose confidence element. For example this can be used to express the bootstrap
+ support value of a clade (in which case the 'type' attribute is 'bootstrap').
+
+
+
+
+
+
+
+
+
+
+ A general purpose identifier element. Allows to indicate the provider (or authority) of an
+ identifier.
+
+
+
+
+
+
+
+
+
+
+ The geographic distribution of the items of a clade (species, sequences), intended for
+ phylogeographic applications. The location can be described either by free text in the 'desc' element and/or
+ by the coordinates of one or more 'Points' (similar to the 'Point' element in Google's KML format) or by
+ 'Polygons'.
+
+
+
+
+
+
+
+
+
+ The coordinates of a point with an optional altitude (used by element 'Distribution').
+ Required attributes are the 'geodetic_datum' used to indicate the geodetic datum (also called 'map datum',
+ for example Google's KML uses 'WGS84'). Attribute 'alt_unit' is the unit for the altitude (e.g. 'meter').
+
+
+
+
+
+
+
+
+
+
+
+
+ A polygon defined by a list of 'Points' (used by element 'Distribution').
+
+
+
+
+
+
+
+
+
+ A date associated with a clade/node. Its value can be numerical by using the 'value' element
+ and/or free text with the 'desc' element' (e.g. 'Silurian'). If a numerical value is used, it is recommended
+ to employ the 'unit' attribute to indicate the type of the numerical value (e.g. 'mya' for 'million years
+ ago'). The elements 'minimum' and 'maximum' are used the indicate a range/confidence
+ interval
+
+
+
+
+
+
+
+
+
+
+
+
+ This indicates the color of a clade when rendered (the color applies to the whole clade
+ unless overwritten by the color(s) of sub clades).
+
+
+
+
+
+
+
+
+
+
+ This is used to express a typed relationship between two sequences. For example it could be
+ used to describe an orthology (in which case attribute 'type' is 'orthology').
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ This is used to express a typed relationship between two clades. For example it could be
+ used to describe multiple parents of a clade.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+