From fc8a155fc531ebae642d5e2f8710c7c4cbe41b90 Mon Sep 17 00:00:00 2001
From: cmzmasek
Date: Fri, 15 Jul 2016 10:55:41 -0700
Subject: [PATCH] version 1.20

---
 .../resources/phyloxml_schema/1.20/phyloxml.xsd | 217 +++++++++++--------
 1 file changed, 119 insertions(+), 98 deletions(-)

diff --git a/forester/resources/phyloxml_schema/1.20/phyloxml.xsd b/forester/resources/phyloxml_schema/1.20/phyloxml.xsd
index f4e5872..eba4fef 100644
--- a/forester/resources/phyloxml_schema/1.20/phyloxml.xsd
+++ b/forester/resources/phyloxml_schema/1.20/phyloxml.xsd
@@ -19,19 +19,21 @@
+ targetNamespace="http://www.phyloxml.org" elementFormDefault="qualified"
+ attributeFormDefault="unqualified">
- phyloXML is an XML language to describe evolutionary trees and associated data. Version: 1.10.
- License: dual-licensed under the LGPL or Ruby's License. Copyright (c) 2008-2011 Christian M Zmasek.
+ phyloXML is an XML language to describe evolutionary trees and associated
+ data. Version: 1.20. License: dual-licensed under the LGPL or Ruby's License. Copyright (c)
+ 2008-2016 Christian M Zmasek.
- 'phyloxml' is the name of the root element. Phyloxml contains an arbitrary number of
- 'phylogeny' elements (each representing one phylogeny) possibly followed by elements from other namespaces.
-
+ 'phyloxml' is the name of the root element. Phyloxml contains an
+ arbitrary number of 'phylogeny' elements (each representing one phylogeny) possibly
+ followed by elements from other namespaces.
@@ -41,13 +43,14 @@
- Element Phylogeny is used to represent a phylogeny. The required attribute 'rooted' is used
- to indicate whether the phylogeny is rooted or not. The attribute 'rerootable' can be used to indicate that
- the phylogeny is not allowed to be rooted differently (i.e. because it is associated with root dependent
- data, such as gene duplications). The attribute 'type' can be used to indicate the type of phylogeny (i.e.
- 'gene tree'). It is recommended to use the attribute 'branch_length_unit' if the phylogeny has branch
- lengths. Element clade is used in a recursive manner to describe the topology of a phylogenetic
- tree.
+ Element Phylogeny is used to represent a phylogeny. The required
+ attribute 'rooted' is used to indicate whether the phylogeny is rooted or not. The
+ attribute 'rerootable' can be used to indicate that the phylogeny is not allowed to be
+ rooted differently (i.e. because it is associated with root dependent data, such as gene
+ duplications). The attribute 'type' can be used to indicate the type of phylogeny (i.e.
+ 'gene tree'). It is recommended to use the attribute 'branch_length_unit' if the
+ phylogeny has branch lengths. Element clade is used in a recursive manner to describe
+ the topology of a phylogenetic tree.
@@ -56,8 +59,10 @@
-
-
+
+
@@ -69,15 +74,16 @@
- Element Clade is used in a recursive manner to describe the topology of a phylogenetic tree.
- The parent branch length of a clade can be described either with the 'branch_length' element or the
- 'branch_length' attribute (it is not recommended to use both at the same time, though). Usage of the
- 'branch_length' attribute allows for a less verbose description. Element 'confidence' is used to indicate
- the support for a clade/parent branch. Element 'events' is used to describe such events as gene-duplications
- at the root node/parent branch of a clade. Element 'width' is the branch width for this clade (including
- parent branch). Both 'color' and 'width' elements apply for the whole clade unless overwritten in-sub
- clades. Attribute 'id_source' is used to link other elements to a clade (on the xml-level).
-
+ Element Clade is used in a recursive manner to describe the topology of
+ a phylogenetic tree. The parent branch length of a clade can be described either with
+ the 'branch_length' element or the 'branch_length' attribute (it is not recommended to
+ use both at the same time, though). Usage of the 'branch_length' attribute allows for a
+ less verbose description. Element 'confidence' is used to indicate the support for a
+ clade/parent branch. Element 'events' is used to describe such events as
+ gene-duplications at the root node/parent branch of a clade. Element 'width' is the
+ branch width for this clade (including parent branch). Both 'color' and 'width' elements
+ apply for the whole clade unless overwritten in-sub clades. Attribute 'id_source' is
+ used to link other elements to a clade (on the xml-level).
@@ -103,12 +109,14 @@
- Element Taxonomy is used to describe taxonomic information for a clade. Element 'code' is
- intended to store UniProt/Swiss-Prot style organism codes (e.g. 'APLCA' for the California sea hare 'Aplysia
- californica') or other styles of mnemonics (e.g. 'Aca'). Element 'authority' is used to keep the authority,
- such as 'J. G. Cooper, 1863', associated with the 'scientific_name'. Element 'id' is used for a unique
- identifier of a taxon (for example '6500' with 'ncbi_taxonomy' as 'provider' for the California sea hare).
- Attribute 'id_source' is used to link other elements to a taxonomy (on the xml-level).
+ Element Taxonomy is used to describe taxonomic information for a clade.
+ Element 'code' is intended to store UniProt/Swiss-Prot style organism codes (e.g.
+ 'APLCA' for the California sea hare 'Aplysia californica'). Element 'authority' is used
+ to keep the authority, such as 'J. G. Cooper, 1863', associated with the
+ 'scientific_name'. Element 'id' is used for a unique identifier of a taxon (for example
+ '6500' with 'ncbi_taxonomy' as 'provider' for the California sea hare). Attribute
+ 'id_source' is used to link other elements to a taxonomy (on the
+ xml-level).
@@ -190,13 +198,15 @@
- Element Sequence is used to represent a molecular sequence (Protein, DNA, RNA) associated
- with a node. 'symbol' is a short (maximal 20 characters) symbol of the sequence (e.g. 'ACTM') whereas
- 'name' is used for the full name (e.g. 'muscle Actin'). 'gene_name' can be used when protein and gene names differ.
- 'location' is used for the location of a sequence on a genome/chromosome. The actual sequence can be stored with the
- 'mol_seq' element. Attribute 'type' is used to indicate the type of sequence ('dna', 'rna', or 'protein').
- One intended use for 'id_ref' is to link a sequence to a taxonomy (via the taxonomy's 'id_source') in case
- of multiple sequences and taxonomies per node.
+ Element Sequence is used to represent a molecular sequence (Protein,
+ DNA, RNA) associated with a node. 'symbol' is a short (maximal 20 characters) symbol of
+ the sequence (e.g. 'ACTM') whereas 'name' is used for the full name (e.g. 'muscle
+ Actin'). 'gene_name' can be used when protein and gene names differ. 'location' is used
+ for the location of a sequence on a genome/chromosome. The actual sequence can be stored
+ with the 'mol_seq' element. Attribute 'type' is used to indicate the type of sequence
+ ('dna', 'rna', or 'protein'). One intended use for 'id_ref' is to link a sequence to a
+ taxonomy (via the taxonomy's 'id_source') in case of multiple sequences and taxonomies
+ per node.
@@ -222,10 +232,11 @@
- Element 'mol_seq' is used to store molecular sequences. The 'is_aligned' attribute is used
- to indicated that this molecular sequence is aligned with all other sequences in the same phylogeny for
- which 'is aligned' is true as well (which, in most cases, means that gaps were introduced, and that all
- sequences for which 'is aligned' is true must have the same length).
+ Element 'mol_seq' is used to store molecular sequences. The 'is_aligned'
+ attribute is used to indicated that this molecular sequence is aligned with all other
+ sequences in the same phylogeny for which 'is aligned' is true as well (which, in most
+ cases, means that gaps were introduced, and that all sequences for which 'is aligned' is
+ true must have the same length).
@@ -243,8 +254,9 @@
- Element Accession is used to capture the local part in a sequence identifier (e.g. 'P17304'
- in 'UniProtKB:P17304', in which case the 'source' attribute would be 'UniProtKB').
+ Element Accession is used to capture the local part in a sequence
+ identifier (e.g. 'P17304' in 'UniProtKB:P17304', in which case the 'source' attribute
+ would be 'UniProtKB').
@@ -257,16 +269,16 @@
 Used to store accessions to additional resources.
-
+
-
+
- This is used describe the domain architecture of a protein. Attribute 'length' is the total
- length of the protein
+ This is used describe the domain architecture of a protein. Attribute
+ 'length' is the total length of the protein
@@ -275,8 +287,9 @@
- To represent an individual domain in a domain architecture. The name/unique identifier is
- described via the 'id' attribute. 'confidence' can be used to store (i.e.) E-values.
+ To represent an individual domain in a domain architecture. The
+ name/unique identifier is described via the 'id' attribute. 'confidence' can be used to
+ store (i.e.) E-values.
@@ -290,7 +303,8 @@
- Events at the root node of a clade (e.g. one gene duplication).
+ Events at the root node of a clade (e.g. one gene duplication).
+
@@ -313,8 +327,8 @@
- The names and/or counts of binary characters present, gained, and lost at the root of a
- clade.
+ The names and/or counts of binary characters present, gained, and lost
+ at the root of a clade.
@@ -336,8 +350,8 @@
- A literature reference for a clade. It is recommended to use the 'doi' attribute instead of
- the free text 'desc' element whenever possible.
+ A literature reference for a clade. It is recommended to use the 'doi'
+ attribute instead of the free text 'desc' element whenever possible.
@@ -347,13 +361,14 @@
- The annotation of a molecular sequence. It is recommended to annotate by using the optional
- 'ref' attribute (some examples of acceptable values for the ref attribute: 'GO:0008270',
- 'KEGG:Tetrachloroethene degradation', 'EC:1.1.1.1'). Optional element 'desc' allows for a free text
- description. Optional element 'confidence' is used to state the type and value of support for a annotation.
- Similarly, optional attribute 'evidence' is used to describe the evidence for a annotation as free text
- (e.g. 'experimental'). Optional element 'property' allows for further, typed and referenced annotations from
- external resources.
+ The annotation of a molecular sequence. It is recommended to annotate by
+ using the optional 'ref' attribute (some examples of acceptable values for the ref
+ attribute: 'GO:0008270', 'KEGG:Tetrachloroethene degradation', 'EC:1.1.1.1'). Optional
+ element 'desc' allows for a free text description. Optional element 'confidence' is used
+ to state the type and value of support for a annotation. Similarly, optional attribute
+ 'evidence' is used to describe the evidence for a annotation as free text (e.g.
+ 'experimental'). Optional element 'property' allows for further, typed and referenced
+ annotations from external resources.
@@ -369,15 +384,17 @@
- Property allows for typed and referenced properties from external resources to be attached
- to 'Phylogeny', 'Clade', and 'Annotation'. The value of a property is its mixed (free text) content.
- Attribute 'datatype' indicates the type of a property and is limited to xsd-datatypes (e.g. 'xsd:string',
- 'xsd:boolean', 'xsd:integer', 'xsd:decimal', 'xsd:float', 'xsd:double', 'xsd:date', 'xsd:anyURI'). Attribute
- 'applies_to' indicates the item to which a property applies to (e.g. 'node' for the parent node of a clade,
- 'parent_branch' for the parent branch of a clade). Attribute 'id_ref' allows to attached a property
- specifically to one element (on the xml-level). Optional attribute 'unit' is used to indicate the unit of
- the property. An example: <property datatype="xsd:integer" ref="NOAA:depth" applies_to="clade"
- unit="METRIC:m"> 200 </property>
+ Property allows for typed and referenced properties from external
+ resources to be attached to 'Phylogeny', 'Clade', and 'Annotation'. The value of a
+ property is its mixed (free text) content. Attribute 'datatype' indicates the type of a
+ property and is limited to xsd-datatypes (e.g. 'xsd:string', 'xsd:boolean',
+ 'xsd:integer', 'xsd:decimal', 'xsd:float', 'xsd:double', 'xsd:date', 'xsd:anyURI').
+ Attribute 'applies_to' indicates the item to which a property applies to (e.g. 'node'
+ for the parent node of a clade, 'parent_branch' for the parent branch of a clade).
+ Attribute 'id_ref' allows to attached a property specifically to one element (on the
+ xml-level). Optional attribute 'unit' is used to indicate the unit of the property. An
+ example: <property datatype="xsd:integer" ref="NOAA:depth" applies_to="clade"
+ unit="METRIC:m"> 200 </property>
@@ -439,9 +456,9 @@
- A uniform resource identifier. In general, this is expected to be an URL (for example, to
- link to an image on a website, in which case the 'type' attribute might be 'image' and 'desc' might be
- 'image of a California sea hare').
+ A uniform resource identifier. In general, this is expected to be an URL
+ (for example, to link to an image on a website, in which case the 'type' attribute might
+ be 'image' and 'desc' might be 'image of a California sea hare').
@@ -453,8 +470,9 @@
- A general purpose confidence element. For example this can be used to express the bootstrap
- support value of a clade (in which case the 'type' attribute is 'bootstrap').
+ A general purpose confidence element. For example this can be used to
+ express the bootstrap support value of a clade (in which case the 'type' attribute is
+ 'bootstrap').
@@ -466,8 +484,8 @@
- A general purpose identifier element. Allows to indicate the provider (or authority) of an
- identifier.
+ A general purpose identifier element. Allows to indicate the provider
+ (or authority) of an identifier.
@@ -478,10 +496,11 @@
- The geographic distribution of the items of a clade (species, sequences), intended for
- phylogeographic applications. The location can be described either by free text in the 'desc' element and/or
- by the coordinates of one or more 'Points' (similar to the 'Point' element in Google's KML format) or by
- 'Polygons'.
+ The geographic distribution of the items of a clade (species,
+ sequences), intended for phylogeographic applications. The location can be described
+ either by free text in the 'desc' element and/or by the coordinates of one or more
+ 'Points' (similar to the 'Point' element in Google's KML format) or by 'Polygons'.
+
@@ -491,10 +510,10 @@
- The coordinates of a point with an optional altitude (used by element 'Distribution').
- Required attributes are the 'geodetic_datum' used to indicate the geodetic datum (also called 'map datum',
- for example Google's KML uses 'WGS84'). Attribute 'alt_unit' is the unit for the altitude (e.g. 'meter').
-
+ The coordinates of a point with an optional altitude (used by element
+ 'Distribution'). Required attributes are the 'geodetic_datum' used to indicate the
+ geodetic datum (also called 'map datum', for example Google's KML uses 'WGS84').
+ Attribute 'alt_unit' is the unit for the altitude (e.g. 'meter').
@@ -506,8 +525,8 @@
- A polygon defined by a list of 'Points' (used by element 'Distribution').
-
+ A polygon defined by a list of 'Points' (used by element
+ 'Distribution').
@@ -516,11 +535,12 @@
- A date associated with a clade/node. Its value can be numerical by using the 'value' element
- and/or free text with the 'desc' element' (e.g. 'Silurian'). If a numerical value is used, it is recommended
- to employ the 'unit' attribute to indicate the type of the numerical value (e.g. 'mya' for 'million years
- ago'). The elements 'minimum' and 'maximum' are used the indicate a range/confidence
- interval
+ A date associated with a clade/node. Its value can be numerical by using
+ the 'value' element and/or free text with the 'desc' element' (e.g. 'Silurian'). If a
+ numerical value is used, it is recommended to employ the 'unit' attribute to indicate
+ the type of the numerical value (e.g. 'mya' for 'million years ago'). The elements
+ 'minimum' and 'maximum' are used the indicate a range/confidence
+ interval
@@ -533,8 +553,8 @@
- This indicates the color of a clade when rendered (the color applies to the whole clade
- unless overwritten by the color(s) of sub clades).
+ This indicates the color of a clade when rendered (the color applies to
+ the whole clade unless overwritten by the color(s) of sub clades).
@@ -546,8 +566,9 @@
- This is used to express a typed relationship between two sequences. For example it could be
- used to describe an orthology (in which case attribute 'type' is 'orthology').
+ This is used to express a typed relationship between two sequences. For
+ example it could be used to describe an orthology (in which case attribute 'type' is
+ 'orthology').
@@ -572,8 +593,8 @@
- This is used to express a typed relationship between two clades. For example it could be
- used to describe multiple parents of a clade.
+ This is used to express a typed relationship between two clades. For
+ example it could be used to describe multiple parents of a clade.
--
1.7.10.2