Difference between revisions of "CDAO"

From Evolutionary Informatics Working Group
(Abstract)
(initial page layout)
=Framework for a Comparative Data Analysis Ontology=
+
=Comparative Data Analysis Ontology=
* Enrico Pontelli, Julie Thompson, Arlin Stoltzfus
 
* other authors as needed, included in alphabetical or other order
 
  
==Abstract==
+
The material previously on this page has been moved to [[CDAOManuscript]].  
Formal ontologies have the potential to improve the practice of biological data analysis by facilitating: semantic transformation and other forms of automated reasoning; storage, transfer, re-use and integration of data; and validation and other forms of quality assurance.  Here we describe the initial implementation of an ontology for ''Evolutionary Comparative Analysis'', an analysis framework in which similarities and differences of ''OTUs'' ("Operational Taxonomic Units") such as genes or proteins (or genomes, species, etc.) are understood to have emerged from common ''ancestor'' entities by a branching process of ''transitions'' in the ''states'' of ''characters''.  In consultation with a group of domain scientists specializing in phylogenetic analysis software, we documented use cases, developed a concept glossary, and studied related artefacts in order to identify core concepts and relations.  The related artefacts included ontologies, file formats, and database schemata.  While the NEXUS file format and the TreeBase II database schema capture many of the core concepts and relations of comparative data analysis, no formal ontology does so.  Therefore, we implemented a Comparative Data Analysis Ontology (CDAO), using the OWL-DL representation language.  Here we describe the core concepts and relations of CDAO, and an initial evaluation of its implementation.
 
  
==Introduction==
+
This page is for ongoing work and contains links to supporting docs, past work, and sub-topics.
  
===Ontologies and Interoperability===
+
==Test Data Sets==
  
===Comparative Data Analysis===
 
to do: reuse material from the wiki for this section
 
 
===NESCent EvoInfo Working Group===
 
 
==Development Strategy==
 
The strategy devised to develop an ontology involves five distinct operations:
 
# define domain by means of use cases
 
# develop a concept glossary
 
# study available related ontologies (and other artefacts) 
 
# implement the core concepts and relations of the domain
 
# evaluate the effectiveness of the ontology
 
 
The overall strategy is not a simple linear sequence of these operations, since feedback and iteration are required.  Nevertheless, the ordering above is not arbitrary: an initial iteration of some steps must be completed before a subsequent step is attempted.
 
 
===Use Cases===
 
to do:
 
* add some introductory comments
 
* insert re-worded list here from grant proposal
 
 
===Concept Glossary===
 
A concept glossary was developed with the participation of members of the NESCent evoinfo working group.  The current version of the glossary, available at https://www.nescent.org/wg_evoinfo/ConceptGlossary, contains 63 defined terms and 65 undefined terms.  The definitions include some information on subclass and synonymy relationships.
 
 
to do:
 
* add specific examples of how the glossary clarifies core concepts and relations
 
 
===Analysis of Related Artefacts===
 
to do:
 
* add list of related artefacts from wiki
 
* draw conclusions
 
** what overlaps exist
 
** what design principles should be re-used
 
** what design principles should be avoided
 
** what artefacts should be incorporated directly
 
 
===Design Principles===
 
to do:
 
* add list of design principles
 
  
 
==Initial Implementation==
 
  
===Core concepts and relations===
 
some core concepts:
 
* character-state data matrix
 
* phylogeny (history, tree, reconstruction)
 
* transition model
 
* analysis
 
* publication
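
As a rough illustration of how the first two concepts in the list above relate, here is a minimal Python sketch. It is not part of CDAO and does not reflect its OWL-DL classes; every class and method name (<code>OTU</code>, <code>Character</code>, <code>CharacterStateMatrix</code>, <code>Node</code>, <code>set_state</code>, <code>row</code>, <code>leaves</code>) is a hypothetical name chosen for this example only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OTU:
    """An operational taxonomic unit, e.g. a gene, protein, or species (hypothetical class)."""
    label: str


@dataclass(frozen=True)
class Character:
    """A comparable feature with a fixed set of allowed states (hypothetical class)."""
    name: str
    states: tuple


@dataclass
class CharacterStateMatrix:
    """Maps each (OTU, character) pair to one observed state."""
    otus: list
    characters: list
    cells: dict = field(default_factory=dict)

    def set_state(self, otu, character, state):
        # Reject states that the character does not define.
        if state not in character.states:
            raise ValueError(f"{state!r} is not a state of {character.name}")
        self.cells[(otu, character)] = state

    def row(self, otu):
        # States of one OTU, in character order; None marks a missing datum.
        return [self.cells.get((otu, c)) for c in self.characters]


@dataclass
class Node:
    """A tree node: leaves carry an OTU, internal nodes stand for ancestors."""
    otu: OTU = None
    children: list = field(default_factory=list)

    def leaves(self):
        # Collect the OTUs at the tips below this node, left to right.
        if not self.children:
            return [self.otu]
        return [o for child in self.children for o in child.leaves()]


# Illustrative instance data (invented for this sketch).
human, mouse = OTU("human"), OTU("mouse")
site1 = Character("site_1", ("A", "C", "G", "T"))
matrix = CharacterStateMatrix([human, mouse], [site1])
matrix.set_state(human, site1, "A")
matrix.set_state(mouse, site1, "G")
tree = Node(children=[Node(otu=human), Node(otu=mouse)])
```

The sketch only shows that a matrix and a phylogeny share the same OTUs; transition models, analyses, and publications are left out.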
 
 
to do:
 
* explain core concepts
 
* illustrate with parts of ontology
 
  
===Example===
+
==Evaluation==
to do:
 
* give an example with instance data
 
  
===Connections with other ontologies===
 
* explain planned connections with other artefacts
 
** cited references (pubmed biblio item?)
 
** NCBI taxonomy
 
  
==Discussion==
 
  
===Evaluation Strategy===
+
==Meeting Notes==
* semantic transformation projects
 
* other
 
  
===Lessons Learned===
+
===Telecon, 7 March, 2007===
  
==Acknowledgments==
+
present: Brian DeVries, Francisco Prosdocimi, Julie Thompson, Enrico Pontelli, Arlin Stoltzfus
* NESCent evoinfo and phyloinformatics participants
 
* NESCent informatics leadership
 
* funding sources
 
  
==Literature Cited==
+
===another meeting===

Revision as of 11:23, 7 March 2008

Comparative Data Analysis Ontology

The material previously on this page has been moved to CDAOManuscript.

This page is for ongoing work and contains links to supporting docs, past work, and sub-topics.

Test Data Sets

Initial Implementation

Evaluation

Meeting Notes

Telecon, 7 March, 2007

present: Brian DeVries, Francisco Prosdocimi, Julie Thompson, Enrico Pontelli, Arlin Stoltzfus

another meeting