Domain scientists with an interest in the archiving and re-use of phylogenetic data have called for a reporting standard designated "Minimal Information for a Phylogenetic Analysis", or MIAPA (Leebens-Mack et al. 2006). Ideally the research community would develop, and adhere to, a standard that imposes a minimal reporting burden yet ensures that the reported data can be interpreted and re-used. Such a standard might be adopted by:
- data repositories such as TreeBASE or Dryad
- journals that publish supplementary material for phylogenetic studies
- granting organizations that support phylogenetic studies
- organizations that develop taxonomic nomenclature
Currently MIAPA is aspirational and represents an open call for further work. As a starting point, Leebens-Mack et al. suggest that a study should report its objectives, sequences, taxa, alignment method, alignment, phylogeny inference method, and phylogeny.
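The seven suggested items can be read as a checklist. The sketch below is a hypothetical illustration in Python, not part of any MIAPA specification: the class name `MiapaRecord`, the field names, the helper `missing_items`, and the placeholder values are all assumptions introduced here.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class MiapaRecord:
    """Hypothetical container for the seven items suggested as a starting point."""
    objectives: Optional[str] = None
    sequences: Optional[str] = None          # e.g. a reference to deposited sequences
    taxa: Optional[str] = None
    alignment_method: Optional[str] = None
    alignment: Optional[str] = None          # e.g. a reference to the alignment file
    inference_method: Optional[str] = None
    phylogeny: Optional[str] = None          # e.g. a reference to the tree file


def missing_items(record: MiapaRecord) -> List[str]:
    """Return the names of checklist items that are absent or empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Placeholder values only; they do not describe a real study.
draft = MiapaRecord(
    objectives="placeholder objectives",
    sequences="placeholder sequence reference",
    taxa="placeholder taxon list",
    alignment_method="placeholder alignment method",
    inference_method="placeholder inference method",
)

print(missing_items(draft))
```

A check of this kind only reports whether each item is present. It says nothing about whether the reported content is adequate, which is where an explicit conformance policy would come in.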
Some thoughts on developing MIAPA
Leebens-Mack et al. called for further work, hoping to draw attention to the idea and stimulate effort. However, there has been no further effort to develop MIAPA. The NESCent evolutionary informatics working group invited Dr. Leebens-Mack to speak, and there was general agreement on the value of developing a MIAPA standard and on the importance of supporting such a standard through NeXML and CDAO.
- What might it mean to have an effective MIAPA standard?
  - an explicit (possibly formal) description of the standard, specifying types of data and metadata
  - an explicit conformance policy
  - a controlled vocabulary for data and metadata
  - a file format for MIAPA documents
- What might software support entail?
  - interactive software to facilitate creation of MIAPA-compliant documents
  - a relational mapping of the MIAPA standard
  - a formal taxonomy or ontology of metadata terms
- What might it take to get there?
  - a working group with external funding
  - a consortium with representatives from data resources, publishers, researchers, and programmers
  - user testing at scientific conferences
  - multiple rounds of revision
- What would ease the burden on scientists (the goal behind the "minimal" in MIAPA)?
  - fewer categories of metadata
  - fewer arbitrary restrictions on format
  - familiarity of metadata concepts
  - software support for annotation with a controlled vocabulary
- What makes data reusable?
  - standard formats
  - provenance, ideally provenance that can be traced automatically
  - description of methods sufficient to reproduce results from data
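To make the idea of software support for annotation with a controlled vocabulary more concrete, here is a minimal sketch in Python. No MIAPA vocabulary exists yet, so the categories and terms in `CONTROLLED_VOCABULARY` and the function `check_annotations` are hypothetical stand-ins chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical vocabulary: the categories and terms are illustrative
# assumptions, not drawn from any published MIAPA or CDAO term list.
CONTROLLED_VOCABULARY = {
    "alignment_method": {"manual", "progressive", "iterative"},
    "inference_method": {"parsimony", "distance", "maximum_likelihood", "bayesian"},
}


def check_annotations(annotations: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means every term was recognized."""
    problems = []
    for category, term in annotations.items():
        allowed = CONTROLLED_VOCABULARY.get(category)
        if allowed is None:
            problems.append(f"unknown metadata category: {category}")
        elif term not in allowed:
            problems.append(f"term {term!r} is not in the vocabulary for {category}")
    return problems


# A free-text abbreviation is flagged, while a vocabulary term passes.
print(check_annotations({"inference_method": "ML"}))
print(check_annotations({"inference_method": "maximum_likelihood"}))
```

Interactive annotation software could run a check like this as the author types and suggest the nearest recognized term, so that the vocabulary reduces the burden on scientists rather than adding to it.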
Knowledge Capture Exercise