TreeBASE Data

From Evolutionary Informatics Working Group
Revision as of 12:14, 10 March 2009 by William.piel@yale.edu

TreeBASE Dump

A Postgres dump for TreeBASE can be obtained [here].

Dump Contents

               List of relations
Schema |         Name         |   Type   | Owner
--------+----------------------+----------+-------
public | edges                | table    | piel
public | ncbi_names           | table    | piel
public | ncbi_nodes           | table    | piel
public | node_path            | table    | piel
public | nodes                | table    | piel
public | nodes_node_id        | sequence | piel
public | study                | table    | piel
public | study_id_seq         | sequence | piel
public | taxa                 | table    | piel
public | taxon_id_seq         | sequence | piel
public | taxon_variant_id_seq | sequence | piel
public | taxon_variants       | table    | piel
public | tb_labels            | table    | piel
public | tb_labels_id         | sequence | piel
public | trees                | table    | piel
public | trees_tree_id        | sequence | piel
(16 rows)

Each "study" record has many "trees" records. Each "trees" record has many "nodes" records, which are wired to each other via the "edges" table; the "node_path" table holds a transitive closure index over those edges. The "tb_labels" table holds the unique taxon labels that appear across all trees, and each "tb_labels" record can point to many "nodes" records.

Each "taxon_variants" record maps to zero or more "tb_labels" records, and each "taxa" record maps to one or more "taxon_variants" records. Each "taxa" record represents a single, normalized taxon, usually a species, but sometimes a subspecies or a higher taxon. Wherever possible, each "taxa" record has an ncbi_taxid, that is, the taxon ID that NCBI uses in its GenBank distribution. These taxids connect "taxa" to the "ncbi_names" table, which in turn uses the "ncbi_nodes" table as a hierarchical classification. This classification has been pre-indexed with left and right IDs so that hierarchical searching is possible.
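
The two indexing ideas described above (a transitive closure table and left/right IDs) can be illustrated with a small, self-contained sketch. It does not use the real dump: it builds toy tables in an in-memory SQLite database so that it runs anywhere. The table names are borrowed from the listing above, but every column name (tax_id, name, left_id, right_id, parent_node_id, child_node_id, distance) and every row is an assumption made for illustration; check the actual schema in the dump before adapting the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Toy stand-in for ncbi_nodes; all column names are hypothetical.
-- A taxon's descendants have left/right IDs nested inside its own.
CREATE TABLE ncbi_nodes (
    tax_id INTEGER PRIMARY KEY, name TEXT,
    left_id INTEGER, right_id INTEGER);
INSERT INTO ncbi_nodes VALUES
    (1, 'Primates',  1, 10),
    (2, 'Hominidae', 2, 7),
    (3, 'Homo',      3, 4),
    (4, 'Pan',       5, 6),
    (5, 'Lemuridae', 8, 9);

-- Toy stand-in for node_path; all column names are hypothetical.
-- One row per (ancestor, descendant) pair, not just per direct edge.
-- Underlying edges: 10->11, 10->12, 11->13, 11->14.
CREATE TABLE node_path (
    parent_node_id INTEGER, child_node_id INTEGER, distance INTEGER);
INSERT INTO node_path VALUES
    (10, 11, 1), (10, 12, 1), (11, 13, 1), (11, 14, 1),
    (10, 13, 2), (10, 14, 2);
""")


def descendants(conn, name):
    """Names of all taxa nested inside the named taxon (left/right lookup)."""
    rows = conn.execute(
        """
        SELECT child.name
        FROM ncbi_nodes AS parent
        JOIN ncbi_nodes AS child
          ON child.left_id > parent.left_id
         AND child.right_id < parent.right_id
        WHERE parent.name = ?
        ORDER BY child.left_id
        """,
        (name,),
    )
    return [row[0] for row in rows]


def subtree(conn, node_id):
    """IDs of all nodes below node_id, read from the closure table."""
    rows = conn.execute(
        "SELECT child_node_id FROM node_path "
        "WHERE parent_node_id = ? ORDER BY child_node_id",
        (node_id,),
    )
    return [row[0] for row in rows]


print(descendants(conn, "Hominidae"))
print(subtree(conn, 10))
```

In both cases a whole subtree is retrieved with a single non-recursive query, which is the point of pre-computing the index. The SQL itself is plain enough that it should carry over to Postgres with the real column names substituted, although parameter placeholders differ between database drivers.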

Example Queries