Vorba ID - Vamsas Object Request Broker Address ID (name needs to be worked on): Suggest it could be of the form documentRoot/datasetName/SequenceUID for a dataset sequence. Alignment sequence: documentRoot/datasetName/AlignmentId/SequenceUID for an aligned form of a dataset sequence.
contains unassociated trees and a number of analysis sets
Primary Key for vamsas object referencing
Properties. Generally, these are mutable so an application should check them each time. This may change depending on the context of the property.
Contains a named collection of trees. TODO: define a way of referencing leaves of a global tree for any sequence/alignment object. Suggestion 1: Each named tree leaf node has a unique id (which may be unique only in combination with the tree's own Vorba ID). Dataset sequences can be tagged with a property "vamsas:tree_leaf".
Primary Key for vamsas object referencing
node identity and mapping data between tree representations and vamsas document objects
reference to one or more trees containing the node being described.
String uniquely identifying a particular node in the referenced tree according to the format of the tree representation that is referenced.
Primary Key for vamsas object referencing
base type for citing arbitrary links between vamsas objects
Optional human readable description of the relationship
Primary Key for vamsas object referencing
List of one or more vamsas object references
Short name for this node
Descriptive text for this node
Direct associations between this node and any vamsas objects
Primary Key for vamsas object referencing
Named and typed property string. The type specifies how the property will be parsed. Empty property strings are allowed, and can be used to prototype the input to a document.
TODO: specify allowed types
Primitive labelled URI object
The URI
Specify an ordered set of positions and/or regions on the principal dimension of some associated vamsas object. Keeping to jaxb-1.0 specification for the moment - this choice should become a substitution group when we use jaxb-2.0 capable bindings.
a position within the associated object's coordinate system
a region from start to end, with flag for inclusivity of termini
when false, a consecutive range like 'start=1, end=2' means the region lying after position 1 and before position 2
Annotation for a rangeSpec - values can be attached for the whole specification, and to each position within the spec, following the orientation specified by the ordered set of rangeSpec (pos, seg) elements.
Short, meaningful name for the annotation - if this is absent, then the type string should be used in its place.
Human readable description of the annotation
TODO: specify this - we have considered taking the GO evidence codes as a model for assigning a measure of quality to an annotation.
Annotation Element position maps to ordered positions defined by the sequence of rangeType pos positions or concatenated seg start/end segments.
Ordered set of optionally named float values for the whole annotation
Note: These are mutable so an application should check them each time.
Primary Key for vamsas object referencing
Annotations with the same non-empty group name are grouped together
A DAS Feature has both a type and a Type ID. We go the route of requiring the type string to be taken from a controlled vocabulary if an application expects others to make sense of it. The type may be qualified - so uniprot:CHAIN is a valid type name, and considered distinct from someotherDB:CHAIN
Specifies a named and typed value used to perform some data transformation. LATER: experiment with xml validation of property set prototypes for services
Named and typed property string. The type specifies how the property will be parsed.
Empty property strings are allowed, and can be used to prototype the input to a document. TODO: specify allowed types
Selects all or part of a collection of vamsas objects as a named input to some transformation process. Many inputs with the same name imply a group input (such as a collection of sequences)
Reference Frame for rangeType specification
Defines the origin and series of operations applied directly to the object that references it.
Who
With which application
Did what
When
additional information
parameter for the action
bioinformatic objects input to action
Primary Key for vamsas object referencing
A collection of sequences, alignments, trees and other things.
TODO: Add a title field and properties for programs that can present the user with different distinct datasets. For the moment, the program just presents them as a list and perhaps lets the user work out which dataset it wants based on the alignments that it contains. (Dominik and Jim 7th June 2007)
a primary or secondary sequence record from which all other sequences may be derived
Store a list of database references for this sequence record - with optional mapping from database sequence to the given sequence record
the local mapType maps from the parent sequence coordinate frame to the reference frame defined by the dbRef element. The mapped mapType is the mapped range defined on the dbRef element's reference frame. Conventionally, the unit attribute defaults to 1, or will be inferred from the local sequence's dictionary type and any dictionary type associated with the database being mapped to. However, it may be specified explicitly to avoid ambiguity.
TODO Database Naming Convention: either start using LSID (so change type to URI) or leave this as an uncontrolled/unspecified string ID
Version must be specified - TODO: make some specification of the database field from which this accessionId is taken - should that be a special property of the dbRef object?
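The role of the unit attribute in a local/mapped range pair can be illustrated with a minimal sketch. It is not part of the schema or of any vamsas client library: the names Range and map_position are hypothetical, positions are assumed to be 1-based and inclusive, and both ranges are assumed to cover the same number of mapped positions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Hypothetical stand-in for one range of a mapping (not a schema type)."""
    start: int      # assumed 1-based, inclusive
    end: int        # assumed inclusive
    unit: int = 1   # symbol widths per mapped position; defaults to 1

    def positions(self) -> int:
        # Number of mapped positions covered by this range.
        return (self.end - self.start + 1) // self.unit


def map_position(pos: int, local: Range, mapped: Range) -> int:
    """Map a coordinate in the local frame to the mapped frame."""
    if not local.start <= pos <= local.end:
        raise ValueError(f"position {pos} lies outside the local range")
    if local.positions() != mapped.positions():
        raise ValueError("local and mapped ranges cover different numbers of positions")
    index = (pos - local.start) // local.unit   # which mapped position pos falls in
    return mapped.start + index * mapped.unit


# A 9-symbol DNA region (unit 3) mapped onto 3 protein residues (unit 1).
dna = Range(1, 9, unit=3)
protein = Range(1, 3)
print(map_position(4, dna, protein))  # 2
```

With unit=3 on the DNA range, local positions 4, 5 and 6 all fall in the second mapped position, which is why the explicit unit removes ambiguity when dictionary types cannot be inferred.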
Primary Key for vamsas object referencing
explicitly named cross reference to other objects in the document.
Primary Key for vamsas object referencing
symbol class for sequence
A mapping between the specified 'local' and 'mapped' sequence coordinate frames. The step size between each coordinate frame depends on the sequence dictionary types, or is alternatively specified in the optional unit attribute on each range element.
Object on which the local range is defined.
Object on which the mapped range is defined.
Annotate over positions and regions of a dataset sequence
annotation is associated with a particular dataset sequence
This is annotation over the coordinate frame defined by all the columns in the alignment.
TODO: decide if this flag is redundant - when true it would suggest that the annotationElement values together form a graph
annotation is associated with a range on a particular group of alignment sequences
Annotate over positions and regions of the ungapped sequence in the context of the alignment
TODO: decide if this flag is redundant - when true it would suggest that the annotationElement values together form a graph
Primary Key for vamsas object referencing
Dataset Sequence from which this alignment sequence is taken
typical properties may be additional alignment score objects
Primary Key for vamsas object referencing
Primary Key for vamsas object referencing
per-site symbolic and/or quantitative annotation
SecondaryStructure and display character (from Jalview) have been subsumed into the glyph element
Free text at this position
Discrete symbol - possibly graphically represented
specifies the symbol dictionary for this glyph - e.g. utf8 (the default), aasecstr_3 or kd_hydrophobicity - the content is not validated so applications must ensure they gracefully deal with invalid entries here
TODO: specify a minimum list of glyph dictionaries to get us started and provide a way for the vamsasClient to validate their content if regexes are specified
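Since glyph content is not validated by the schema, an application has to decide for itself what to do with unexpected entries. The following is only a sketch of one possible approach, assuming per-dictionary regexes of the kind the TODO above proposes. The GLYPH_PATTERNS table and the read_glyph function are hypothetical, and the pattern shown for aasecstr_3 is an assumed three-symbol alphabet used purely for illustration, not a defined dictionary.

```python
import re
from typing import Optional

# Hypothetical registry of glyph dictionaries. The aasecstr_3 alphabet here
# is an assumption for illustration; the schema does not define it.
GLYPH_PATTERNS = {
    "aasecstr_3": re.compile(r"[HEC]"),
}

DEFAULT_DICTIONARY = "utf8"  # the schema's stated default


def read_glyph(content: str, dictionary: str = DEFAULT_DICTIONARY) -> Optional[str]:
    """Return the glyph content, or None if it fails its dictionary's pattern.

    Dictionaries without a registered pattern (including utf8) cannot be
    checked, so their content is passed through unchanged.
    """
    pattern = GLYPH_PATTERNS.get(dictionary)
    if pattern is None:
        return content
    return content if pattern.fullmatch(content) else None


print(read_glyph("H", "aasecstr_3"))  # H
print(read_glyph("?", "aasecstr_3"))  # None
```

Returning None rather than raising lets a caller skip or grey out a bad glyph while still displaying the rest of the annotation.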
Ordered set of float values - an application may treat the values together as a vector with common support for a set of annotation elements - but this is, again, not validated so applications should deal gracefully with varying numbers of dimensions
position with respect to the coordinate frame defined by a rangeType specification
true means the annotation element appears between the specified position and the next
Primary Key for vamsas object referencing
additional typed properties
Data specific to a particular type and version of vamsas application
Data available to just a particular user
Data available to just a specific instance of the application
VAMSAS/Pierre: Is this data volatile? Application instances may not be accessible after the session has closed - the user may have to be presented with the option of picking up the data in that instance
Version string describing the application specific data storage version used
Canonical name of application
General data container to attach a typed data object to any vamsas object
true implies data will be decompressed with Zip before presenting to application
Type of arbitrary data - TODO: decide format - use (extended) MIME types?
Object the arbitrary data is associated with
Primary Key for vamsas object referencing
Two sets of ranges defined between objects - usually sequences, indicating which regions on each are mapped.
number of dictionary symbol widths involved in each mapped position on this sequence (for example, 3 for a dna sequence exon region that is being mapped to a protein sequence). This is optional, since the unit can usually be inferred from the dictionary type of each sequence involved in the mapping.
Contains lock information: locktype:ApplicationHandle. locktype is 'local' or 'full'. A lock is only valid if the ApplicationHandle resolves to a living application in the vamsas session. A local lock means that the application has locked changes to all local properties on the object.
A full lock means that the application has locked changes to all properties on the object, and any objects that it holds references to.
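The lock string format and validity rule above can be sketched as follows. This is a minimal illustration only: the names Lock, parse_lock and is_valid are hypothetical, and the set of live application handles is assumed to be supplied by whatever session machinery the client uses.

```python
from typing import NamedTuple, Set


class Lock(NamedTuple):
    locktype: str   # 'local' or 'full'
    handle: str     # the ApplicationHandle part


def parse_lock(value: str) -> Lock:
    """Split a 'locktype:ApplicationHandle' string into its two parts."""
    # Split on the first colon only, in case the handle itself contains one.
    locktype, sep, handle = value.partition(":")
    if not sep or locktype not in ("local", "full") or not handle:
        raise ValueError(f"malformed lock string: {value!r}")
    return Lock(locktype, handle)


def is_valid(lock: Lock, live_handles: Set[str]) -> bool:
    """A lock counts only if its handle resolves to a living application."""
    return lock.handle in live_handles


lock = parse_lock("full:app-42")              # "app-42" is a made-up handle
print(lock.locktype, lock.handle)             # full app-42
print(is_valid(lock, {"app-42", "app-7"}))    # True
print(is_valid(lock, {"app-7"}))              # False
```

A stale lock (one whose handle no longer resolves) is simply treated as absent here; whether a client should also remove it from the document is left open.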