Difference between revisions of "Data Resources"

From Evolutionary Informatics Working Group
Revision as of 14:35, 23 September 2008

Quick analysis of data resources to target for interop hackathon

The idea is that each of these resources has comparative data that we might want to make interoperable.

Questions to ask for each resource

  • What is the scope of the resource?
  • Who controls it?
  • How can users access data?
  • How are data organized? Is there an explicit data model, schema, or format description?
  • Is there evidence that this is an important resource (e.g., registered users, citations)?

Phylogeny Services

Phylogeny.Fr

http://www.phylogeny.fr

Output formats are Newick, NHX, and Phylip, but there is apparently no way to export an alignment and its tree together.
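To make the missing "alignment plus tree" export concrete, here is a minimal sketch of the consistency check such a bundle would need: the leaf labels of a Newick tree should match the row names of the alignment. The function names (`newick_leaf_labels`, `check_bundle`) and the toy data are hypothetical and are not part of Phylogeny.fr; the parser assumes plain Newick with unquoted labels and no NHX comments.

```python
import re


def newick_leaf_labels(newick: str) -> list[str]:
    """Return leaf labels from a simple Newick string.

    Assumes unquoted labels and no bracketed comments (so not full NHX).
    A leaf label is a name that directly follows '(' or ','.
    """
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


def check_bundle(newick: str, alignment: dict[str, str]) -> dict[str, list[str]]:
    """Compare tree leaves with alignment row names and report mismatches."""
    leaves = set(newick_leaf_labels(newick))
    rows = set(alignment)
    return {
        "missing_from_alignment": sorted(leaves - rows),
        "missing_from_tree": sorted(rows - leaves),
    }


# Toy example (made-up labels and sequences).
tree = "((A:0.1,B:0.2):0.05,C:0.3);"
aln = {"A": "ACGT", "B": "AC-T", "D": "ACGA"}
report = check_bundle(tree, aln)
print(report)
```

With the toy data above, the report flags `C` as missing from the alignment and `D` as missing from the tree, which is the kind of mismatch that is easy to introduce when the two files are exported separately.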

Data Resources

TreeFam

TreeFam

  • citations for 2006 paper: 36

Hovergen, hogenom

Hovergen, hogenom

  • hovergen 1994 paper cited 157 times

TreeBase

TreeBaseII has the kind of granular schema that would be a good challenge to try to accommodate using CDAO. What is missing from CDAO (and NeXML) is the notion of a "study" with various types of metadata, including publication/reference metadata (which is somewhat Dublin Core-like).
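As a rough illustration of what a "study" wrapper might carry, the sketch below groups trees and character matrices under one record with Dublin Core-like reference fields. All class and field names here are hypothetical; they are not taken from the TreeBaseII schema, CDAO, or NeXML, and the example values are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Reference:
    """Publication metadata, loosely modeled on Dublin Core elements."""
    title: str
    creators: list[str]
    date: str
    source: str = ""  # e.g., journal name


@dataclass
class Study:
    """A study groups the trees and matrices that come from one publication."""
    study_id: str
    reference: Reference
    trees: list[str] = field(default_factory=list)  # Newick strings
    matrices: dict[str, dict[str, str]] = field(default_factory=dict)  # name -> rows

    def summary(self) -> str:
        """One-line description of the study contents."""
        return (
            f"{self.study_id}: {len(self.trees)} tree(s), "
            f"{len(self.matrices)} matrix(es), ref '{self.reference.title}'"
        )


# Placeholder data, not a real TreeBase record.
ref = Reference(title="Example study", creators=["A. Author"], date="2008")
study = Study(study_id="S0001", reference=ref)
study.trees.append("(A,B,C);")
study.matrices["dna"] = {"A": "ACGT", "B": "ACGA", "C": "ACTT"}
print(study.summary())
```

The point of the sketch is only that trees and matrices hang off a study, and the study carries the citation, so interoperability formats need somewhere to put that outer layer.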

It's not clear which reference to cite. The TreeBase "Intro" page cites about 7 different references up to 2000, including posters and papers. Some of the papers are apparently scientific studies rather than implementation papers. The most likely candidates for a TreeBase citation are as follows (perhaps we should include all of these):

  • Sanderson, et al., 1994 (Am. Jour. Bot.), cited 59 times, can't track citing papers
  • Sanderson, et al., 1993 (Syst. Biol.), cited 29 times, most citing papers are meta-analyses (that's good)
  • Morel, 1996, cited 21 times

Pandit

Pandit

pPOD

pPOD (not really a data resource; it's a database technology project led by computer scientists)

Tree o' Life

Microbes Online

The Arkin lab's MicrobesOnline server has a cool tree-based view of sequence families.

PhylomeDB

PhylomeDB

PhyLoTA

PhyLoTA

PhyloFacts

PhyloFacts

TimeTree

TimeTree

MorphoBank

MorphoBank

MorphBank

MorphBank