NeXML Test Files
From Evolutionary Informatics Working Group
Purposes
These files are for experimenting with representation schemes, and to provide instances for tests.
Representation scheme
- to understand the syntax for metadata, see the metadata representation page (see also what the standard says on referencing blank nodes)
- in the example below, we wish to assign a bootstrap value to an edge in a tree; since nexml does not have a bootstrap attribute, we can use the cdao:has_Support_Value concept like this:
<xml>
<edge id="edge22" about="foo">
  <meta property="cdao:has_Support_Value">0.425</meta>
</edge>
</xml>
Tools
- go to the nexml home for a nexml validator
- go to the nexml home for a nexus-to-nexml converter (accessed via pull-down menu)
- Peter also provides a nexus-to-nexml converter
Test files from the nexml web site
http://nexml-dev.nescent.org/nexml/examples/
- file: http://nexml-dev.nescent.org/nexml/examples/characters.xml
- comment:
- file: http://nexml-dev.nescent.org/nexml/examples/nexml.xml
- comment:
- file: http://nexml-dev.nescent.org/nexml/examples/taxa.xml
- comment:
- file: http://nexml-dev.nescent.org/nexml/examples/timetree.xml
- comment:
- file: http://nexml-dev.nescent.org/nexml/examples/tolweb.xml
- comment:
- file: http://nexml-dev.nescent.org/nexml/examples/trees.xml
- comment:
Test files from the hackathon code repository
- file: http://code.google.com/p/dbhack1/source/browse/trunk/data/nexml/Fang_2003.xml
- uses:
- one use
- another use
- comment: Fang & Douglas, 2003 data encoded as part of the phenoscape project; includes: provenance (specimen), standard character encoding, TAO and PATO links for states
Bootstrap values test files
- uses: test ability to display or process branch support values such as bootstraps
- comment: The newick tree looks like this:
(otuA:4,(((otuB:1,otuC:1):1[0.35],otuD:2):1[0.425],(otuE:1,otuF:1):2[0.65]):1);
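As a rough sketch of how a tool might pull the bracketed support values out of a Newick string like the one above, the following uses a regular expression. The function name `support_values` is hypothetical, and the pattern assumes that supports always appear as `:length[value]`, as they do in this test tree; it is not a general Newick parser.

```python
import re

# The test tree from this page, with bootstrap values in square brackets.
NEWICK = "(otuA:4,(((otuB:1,otuC:1):1[0.35],otuD:2):1[0.425],(otuE:1,otuF:1):2[0.65]):1);"

# Assumption: a support value directly follows a branch length, as ':length[support]'.
SUPPORT_PATTERN = re.compile(r":([0-9.]+)\[([0-9.]+)\]")

def support_values(newick):
    """Return (branch_length, support) pairs in the order they occur in the string."""
    return [(float(length), float(support))
            for length, support in SUPPORT_PATTERN.findall(newick)]

print(support_values(NEWICK))
```

Branches without a bracketed value (for example `otuA:4`) are simply skipped by the pattern.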
Bootstraps represented using the "meta" tag
<xml>
<trees>
  <tree>
    <edge id="edge22" about="this_edge22" length="1" source="node21" target="node22">
      <meta typeof="cdao:Edge" property="cdao:has_Support_Value" datatype="xsd:float">0.425</meta>
    </edge>
    <edge id="edge27" about="this_edge27" length="2" source="node21" target="node27">
      <meta typeof="cdao:Edge" property="cdao:has_Support_Value" datatype="xsd:float">0.65</meta>
    </edge>
    <edge id="edge23" about="this_edge23" length="1" source="node22" target="node23">
      <meta typeof="cdao:Edge" property="cdao:has_Support_Value" datatype="xsd:float">0.35</meta>
    </edge>
  </tree>
</trees>
</xml>
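A consumer of this representation might read the support values back roughly as follows. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function names `local` and `edge_supports` are hypothetical, and the code assumes the literal prefix `cdao:` is used in the `property` attribute, as in the fragment above.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the example above (supports carried by "meta" children).
SNIPPET = """
<trees>
  <tree>
    <edge id="edge22" about="this_edge22" length="1" source="node21" target="node22">
      <meta typeof="cdao:Edge" property="cdao:has_Support_Value" datatype="xsd:float">0.425</meta>
    </edge>
    <edge id="edge27" about="this_edge27" length="2" source="node21" target="node27">
      <meta typeof="cdao:Edge" property="cdao:has_Support_Value" datatype="xsd:float">0.65</meta>
    </edge>
    <edge id="edge23" about="this_edge23" length="1" source="node22" target="node23">
      <meta typeof="cdao:Edge" property="cdao:has_Support_Value" datatype="xsd:float">0.35</meta>
    </edge>
  </tree>
</trees>
"""

def local(tag):
    """Drop a '{namespace}' prefix so the code also works on namespaced files."""
    return tag.rsplit("}", 1)[-1]

def edge_supports(xml_text):
    """Map each edge id to the support value found in its "meta" children."""
    supports = {}
    for edge in ET.fromstring(xml_text).iter():
        if local(edge.tag) != "edge":
            continue
        for meta in edge:
            # Assumption: the property is written with the literal "cdao:" prefix.
            if local(meta.tag) == "meta" and meta.get("property") == "cdao:has_Support_Value":
                supports[edge.get("id")] = float(meta.text)
    return supports

print(edge_supports(SNIPPET))
```

Matching on the prefixed string is a shortcut; a stricter reader would resolve the prefix against the declared namespace before comparing.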
Bootstraps represented without new tags or elements
- file: http://code.google.com/p/dbhack1/source/browse/trunk/data/nexml/03_bootstraps_in_tag.xml
- comment: we don't need an "about" attribute to define the subject, because "typeof" does this; however, an "about" attribute would be allowable
<xml>
<trees>
  <tree>
    <edge id="edge22" length="1" source="node21" target="node22" typeof="cdao:Edge" property="cdao:has_Support_Value" datatype="xsd:float" content="0.425" />
    <edge id="edge27" length="2" source="node21" target="node27" typeof="cdao:Edge" property="cdao:has_Support_Value" datatype="xsd:float" content="0.65" />
    <edge id="edge23" length="1" source="node22" target="node23" typeof="cdao:Edge" property="cdao:has_Support_Value" datatype="xsd:float" content="0.35" />
  </tree>
</trees>
</xml>
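With the attribute-only form, the value sits in the `content` attribute of the edge itself, so reading it back takes a single pass over the edges. The sketch below assumes a fragment without a default namespace, as shown above; `edge_supports_from_attributes` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the example above (supports carried by edge attributes).
SNIPPET = """
<trees>
  <tree>
    <edge id="edge22" length="1" source="node21" target="node22" typeof="cdao:Edge" property="cdao:has_Support_Value" datatype="xsd:float" content="0.425" />
    <edge id="edge27" length="2" source="node21" target="node27" typeof="cdao:Edge" property="cdao:has_Support_Value" datatype="xsd:float" content="0.65" />
    <edge id="edge23" length="1" source="node22" target="node23" typeof="cdao:Edge" property="cdao:has_Support_Value" datatype="xsd:float" content="0.35" />
  </tree>
</trees>
"""

def edge_supports_from_attributes(xml_text):
    """Map each edge id to the float stored in its "content" attribute."""
    root = ET.fromstring(xml_text)
    return {
        edge.get("id"): float(edge.get("content"))
        for edge in root.iter("edge")  # assumes un-namespaced element names
        if edge.get("property") == "cdao:has_Support_Value"
    }

print(edge_supports_from_attributes(SNIPPET))
```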
Labels for internal nodes
- uses: test ability to display or process labels for ancestral (internal) nodes
- comment: the Newick standard allows internal node labels like this:
(otuA:4,(((otuB:1,otuC:1)inodeBC:1,otuD:2)inodeBCD:1,(otuE:1,otuF:1)inodeEF:2)inodeBCDEF:1)inodeABCDEF;
- comment: here is what it looks like
- file: http://code.google.com/p/dbhack1/source/browse/trunk/data/nexml/04_labeled_ancestors.xml
- comment: this is standard nexml
<xml>
<trees>
  <tree>
    <node id="node19" label="inodeABCDEF" root="true"/>
    <node id="node20" label="otuA" otu="otu2"/>
    <node id="node21" label="inodeBCDEF"/>
    <node id="node22" label="inodeBCD"/>
    <node id="node27" label="inodeEF"/>
    <node id="node23" label="inodeBC"/>
    <node id="node26" label="otuD" otu="otu5"/>
  </tree>
</trees>
</xml>
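To check that a tool sees the ancestral labels, one could list the labels of all nodes that are not linked to an OTU. The sketch below treats "no `otu` attribute" as meaning "internal node", which holds for this test fragment but is an assumption rather than a rule of nexml; `internal_node_labels` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the example above.
SNIPPET = """
<trees>
  <tree>
    <node id="node19" label="inodeABCDEF" root="true"/>
    <node id="node20" label="otuA" otu="otu2"/>
    <node id="node21" label="inodeBCDEF"/>
    <node id="node22" label="inodeBCD"/>
    <node id="node27" label="inodeEF"/>
    <node id="node23" label="inodeBC"/>
    <node id="node26" label="otuD" otu="otu5"/>
  </tree>
</trees>
"""

def internal_node_labels(xml_text):
    """Map node id to label for every node that has no "otu" attribute."""
    root = ET.fromstring(xml_text)
    return {
        node.get("id"): node.get("label")
        for node in root.iter("node")
        if node.get("otu") is None  # assumption: tips always carry an otu reference
    }

print(internal_node_labels(SNIPPET))
```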
Ancestral states test files
- uses: test ability to display or process ancestral state data
- comment:
Ancestral states represented RDFa style using meta tags
- file: http://code.google.com/p/dbhack1/source/browse/trunk/data/nexml/05_ancestral_states.xml
- comment: in CDAO, the relation between a TU and a state is mediated by a datum for which the state is a value; expressing it therefore requires two relations: first, the TU has a datum (inversely, the datum belongs to the TU); second, the datum has a state.
<xml>
<trees>
  <tree>
    <node id="node19" label="inodeABCDEF" root="true" about="uniquely_this_node19">
      <meta property="cdao:has_Datum">
        <meta typeof="cdao:CharacterStateDatum" rel="cdao:has_State" resource="#s1"/>
      </meta>
    </node>
    <node id="node20" label="otuA" otu="otu2"/>
    <node id="node21" label="inodeBCDEF" about="uniquely_this_node21">
      <meta property="cdao:has_Datum">
        <meta typeof="cdao:CharacterStateDatum" rel="cdao:has_State" resource="#s1"/>
      </meta>
    </node>
    <node id="node22" label="inodeBCD" about="uniquely_this_node22">
      <meta property="cdao:has_Datum">
        <meta typeof="cdao:CharacterStateDatum" rel="cdao:has_State" resource="#s1"/>
      </meta>
    </node>
    <node id="node27" label="inodeEF" about="uniquely_this_node27">
      <meta property="cdao:has_Datum">
        <meta typeof="cdao:CharacterStateDatum" rel="cdao:has_State" resource="#s3"/>
      </meta>
    </node>
    <node id="node23" label="inodeBC" about="uniquely_this_node23">
      <meta property="cdao:has_Datum">
        <meta typeof="cdao:CharacterStateDatum" rel="cdao:has_State" resource="#s1"/>
      </meta>
    </node>
  </tree>
</trees>
</xml>
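The two-step relation (node has a datum, datum has a state) can be followed in code by descending through the nested "meta" elements. The following is a minimal sketch on a shortened copy of the fragment above; `ancestral_states` is a hypothetical name, and the code assumes un-namespaced elements and the literal `cdao:` prefix.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the fragment above: two annotated ancestors and one tip.
SNIPPET = """
<trees>
  <tree>
    <node id="node19" label="inodeABCDEF" root="true" about="uniquely_this_node19">
      <meta property="cdao:has_Datum">
        <meta typeof="cdao:CharacterStateDatum" rel="cdao:has_State" resource="#s1"/>
      </meta>
    </node>
    <node id="node20" label="otuA" otu="otu2"/>
    <node id="node27" label="inodeEF" about="uniquely_this_node27">
      <meta property="cdao:has_Datum">
        <meta typeof="cdao:CharacterStateDatum" rel="cdao:has_State" resource="#s3"/>
      </meta>
    </node>
  </tree>
</trees>
"""

def ancestral_states(xml_text):
    """Map node label to state id by following has_Datum, then has_State."""
    states = {}
    for node in ET.fromstring(xml_text).iter("node"):
        for holder in node.findall("meta"):
            if holder.get("property") != "cdao:has_Datum":
                continue
            for datum in holder.findall("meta"):
                if datum.get("rel") == "cdao:has_State":
                    # "#s1" is a local reference; strip the leading "#".
                    states[node.get("label")] = datum.get("resource").lstrip("#")
    return states

print(ancestral_states(SNIPPET))
```

Nodes without a datum, such as the tip `otuA` here, are left out of the result.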
Ancestral states represented some other way
- file: http://code.google.com/p/dbhack1/source/browse/trunk/data/nexml/05_...not done
- comment:
<xml>
<trees> <tree> </tree> </trees>
</xml>
Taxonomy references test files
- Description: These are pairs of files. Each file has 6 OTUs, one of which is labeled "S. glaucus". In one file, this is supposed to mean "Scomber glaucus", the Derbio (related to mackerels and tunas), and in the other file it is supposed to mean "Squalus glaucus", the blue shark.
- uses:
- the pair of files together can be used to address the super-tree case of finding matching OTUs
- either file can serve for a visualization case
- either file can serve for a taxonomy resolution case
- either file can be used to resolve synonyms. According to FishBase, both S. glaucus names are junior synonyms: the preferred name for Scomber glaucus is Trachinotus ovatus (English common name Derbio or Pompano); the preferred name for Squalus glaucus is Prionace glauca (common name "blue shark").
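For the synonym case, the expected outcome can be written down as a small lookup table. The sketch below hard-codes only the two synonym pairs stated above; the names `PREFERRED_NAMES` and `preferred_name` are hypothetical, and a real resolver would query a taxonomic name service rather than a fixed table.

```python
# Junior synonym -> preferred name, as given in the notes above.
PREFERRED_NAMES = {
    "Scomber glaucus": "Trachinotus ovatus",
    "Squalus glaucus": "Prionace glauca",
}

def preferred_name(name):
    """Return the preferred name for a known junior synonym, else the name unchanged."""
    return PREFERRED_NAMES.get(name, name)

print(preferred_name("Scomber glaucus"))
print(preferred_name("Squalus glaucus"))
```

Note that the abbreviated label "S. glaucus" cannot be resolved this way, which is the point of the test files: it has to be disambiguated first.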
OTUs with no taxrefs
- description: two files that share the OTU label "S. glaucus" and have no external tax refs by which to disambiguate
- file(s):
- dogfish (genus Squalus): 02_dogfish_no_taxrefs.xml
- mackerel (genus Scomber): 02_mackerel_no_taxrefs.xml
- comment: these files pass the nexml validator
OTUs with tax refs via the TDWG taxon concept
- description: two files that share the OTU label "S. glaucus" and have tax refs that could be used to disambiguate
- file(s):
- dogfish (genus Squalus): 02_dogfish_rdfa_tdwg_lsid_taxrefs.xml
- mackerel (genus Scomber): 02_mackerel_rdfa_tdwg_lsid_taxrefs.xml
- comment: THESE FILES FAIL THE NEXML VALIDATOR AT NEXML.ORG!!
- comment: this has the tax refs in the RDFa syntax that Roger suggests, assigning an LSID value via the TDWG taxon concept
<xml>
<otu id="otu3" label="S. glaucus" about="foo" typeof="tc:TaxonConcept"
     xmlns:tc="http://rs.tdwg.org/ontology/voc/TaxonConcept#"
     xmlns:tn="http://rs.tdwg.org/ontology/voc/TaxonName#">
  <meta rel="tc:hasName">
    <meta typeof="tn:TaxonName" about="urn:lsid:zoobank.org:act:A6AD1B85-C65C-4079-B425-56828418620C">
      <meta property="tn:genusPart">Scomber</meta>
      <meta property="tn:specificEpithet">glaucus</meta>
    </meta>
  </meta>
</otu>
</xml>
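One way to use these annotations for disambiguation is to rebuild the full binomial from the `tn:genusPart` and `tn:specificEpithet` properties. The sketch below does this for the fragment above; `binomial` is a hypothetical name, and the code matches on the literal `tn:` prefix rather than resolving namespaces.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the example above.
SNIPPET = """
<otu id="otu3" label="S. glaucus" about="foo" typeof="tc:TaxonConcept"
     xmlns:tc="http://rs.tdwg.org/ontology/voc/TaxonConcept#"
     xmlns:tn="http://rs.tdwg.org/ontology/voc/TaxonName#">
  <meta rel="tc:hasName">
    <meta typeof="tn:TaxonName" about="urn:lsid:zoobank.org:act:A6AD1B85-C65C-4079-B425-56828418620C">
      <meta property="tn:genusPart">Scomber</meta>
      <meta property="tn:specificEpithet">glaucus</meta>
    </meta>
  </meta>
</otu>
"""

def binomial(otu_xml):
    """Join the genus part and specific epithet found in the nested "meta" elements."""
    otu = ET.fromstring(otu_xml)
    parts = {
        meta.get("property"): (meta.text or "").strip()
        for meta in otu.iter("meta")
        if meta.get("property")  # skip wrappers that only carry rel/typeof
    }
    return parts["tn:genusPart"] + " " + parts["tn:specificEpithet"]

print(binomial(SNIPPET))
```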
OTUs with tax refs in alternative RDFa syntax
- description: two files that share the OTU label "S. glaucus" and have tax refs that could be used to disambiguate
- file(s):
- dogfish (genus Squalus): 02_dogfish_rdfa_2_cdao_lsid_taxrefs.xml
- mackerel (genus Scomber): 02_mackerel_rdfa_2_cdao_lsid_taxrefs.xml
- comment: these files are valid nexml according to the validator
- comment: this syntax may be more consistent with the use of RDFa, but it cannot be extended to multiple triples with the same implicit subject; this is why there is value in having an "annotation" or "any" tag (used like "span" in html docs)
<xml>
<otus id="otus1" xmlns:cdao="http://www.evolutionaryontology.org/cdao/1.0/cdao.owl">
  <otu id="otu3" label="S. glaucus" about="[_:this_otu3]" typeof="cdao:TU"
       property="cdao:has_Taxonomy_Reference"
       resource="http://www.zoobank.org/?urn:lsid:zoobank.org:act:A6AD1B85-C65C-4079-B425-56828418620C" />
</otus>
</xml>
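For the super-tree use case, the taxonomy reference lets a tool decide that two OTUs with the same label are not the same taxon. The sketch below compares the `resource` attributes of two OTUs. The first OTU is copied from the fragment above; the second uses a clearly made-up placeholder reference, since the corresponding value from the other file is not shown on this page. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# OTU copied from the fragment above.
OTU_A = (
    '<otu id="otu3" label="S. glaucus" about="[_:this_otu3]" typeof="cdao:TU" '
    'property="cdao:has_Taxonomy_Reference" '
    'resource="http://www.zoobank.org/?urn:lsid:zoobank.org:act:A6AD1B85-C65C-4079-B425-56828418620C" />'
)

# Hypothetical second OTU: same label, placeholder reference (NOT a real LSID).
OTU_B = (
    '<otu id="otu3" label="S. glaucus" about="[_:this_otu3]" typeof="cdao:TU" '
    'property="cdao:has_Taxonomy_Reference" '
    'resource="urn:example:placeholder-reference-for-the-other-file" />'
)

def taxonomy_reference(otu_xml):
    """Return the taxonomy reference of an OTU, or None if it has none."""
    otu = ET.fromstring(otu_xml)
    if otu.get("property") == "cdao:has_Taxonomy_Reference":
        return otu.get("resource")
    return None

def same_taxon(otu_xml_a, otu_xml_b):
    """True only if both OTUs carry a reference and the references are identical."""
    ref_a = taxonomy_reference(otu_xml_a)
    ref_b = taxonomy_reference(otu_xml_b)
    return ref_a is not None and ref_a == ref_b

print(same_taxon(OTU_A, OTU_B))
```

Exact string comparison is the simplest possible test; it would miss two different URLs that wrap the same LSID, so a more careful matcher would extract and compare the LSID itself.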
OTUs with tax refs in deprecated "dict" syntax
- description: two files that share the OTU label "S. glaucus" and have taxrefs that could allow disambiguation
- file(s):
- dogfish (genus Squalus): 02_dogfish_dict_cdao_lsid_taxrefs.xml
- mackerel (genus Scomber): 02_mackerel_dict_cdao_lsid_taxrefs.xml
- comment: this will validate at nexml-dev.nescent.org (if 'any' element has ' xmlns="" ')
- comment: this has the tax refs in nexml "dict" syntax to assign an LSID value via CDAO property
<xml>
<otu id="otu5" label="S. glaucus">
  <dict xmlns:cdao="http://evolutionaryontology.org/cdao/1.0/cdao.owl" id="dict1">
    <any id="bar" xmlns="">
      <cdao:TU rdf:id="baz">
        <cdao:has_Taxonomy_Reference rdf:resource="http://www.zoobank.org/?lsid=urn:lsid:zoobank.org:act:2CD1E27B-D572-447C-BD0F-2AA998447FBF"/>
      </cdao:TU>
    </any>
  </dict>
</otu>
</xml>
OTUs with tax refs in other syntax
- description: planned, in progress
- file(s):
- dogfish (genus Squalus):
- mackerel (genus Scomber):
- comment:
notes for dict references
Prunella
Oenanthe ZooBank record (urn:lsid:zoobank.org:act:5E78566E-FE5B-4135-ABC0-B70AA36396EF)
Pavonia uBio LSID: urn:lsid:ubio.org:namebank:798505 (nymphalid genus Pavonia Godart 1824, and swampmallow)
Anacharis
Morus
Pieris
Mallotus