Education and Outreach

From Evolutionary Informatics Working Group


When and how do we spread the word about evolutionary informatics? How do we catalyze progress? Scientific meetings often offer opportunities for symposia or tutorials. Which funding agencies might support infrastructure development? Are there other organizations with which we should collaborate (e.g., the European POSSUM data standard initiative)?

Goals for the working group

At the first meeting (May 20-23, 2007), we chose some specific outreach tasks that focus on identifying partners and disseminating standards recommendations.

Should we move this information into other topic areas, e.g., the ontology-related partnering into the ontology page?

Task 1: Identify partners and stakeholder organizations for ontology development

Gather information and make contact with potential partners

  1. National Center for Biomedical Ontology (Hilmar)
  2. pPod - Bill Piel is aware of our efforts; Arlin invited Val Tannen to our next meeting
  3. NCBI (pop set), ask Lipman (Arlin)
  4. EBI/EnsEMBL group - Aaron is in contact with this group and they have been apprised of our efforts
  5. TreeBASE - several of us are in regular contact with Bill Piel; he is part of the ontology project from Enrico, Gopal and Arlin
  6. Adam Goldstein (the philosopher, not the disk jockey): see e.g. his blog
  7. PosSUM (European) data standard initiative

Task 2: Collect information on funding opportunities

  1. INTEROP: a new NSF Program Solicitation: Community-based Data Interoperability Networks
  2. NIH Data Ontologies for Biomedical Research (R01) Letters of Intent: December 18, 2007, submission January 18, 2008.
  3. NIH Collaborations with National Centers for Biomedical Computing (R01 and R21 deadlines: 17 May, 17 Jan); specific information on collaborating with NCBO

Task 3: Assess community interest in databasing alignments and trees

  1. Society and Journals of Interest:
    • Sudhir's data might provide a place to start: find the top 6 journals that are publishing phylo analyses, and then contact the editorial boards of these.
    • SSE (Society for the Study of Evolution): Publishes "Evolution";
    • SMBE (Society for Molecular Biology and Evolution): Publishes "Molecular Biology and Evolution";
    • Journal of Molecular Evolution;
    • Molecular Phylogenetics and Evolution;
    • ASN (American Society of Naturalists): Publishes "The American Naturalist";
    • SSB (Society of Systematic Biologists): Publishes "Systematic Biology"
  2. Gather information, write wiki report of interest from journals (Weigang)
    • The following message was sent to Dr Jessica Gurevitch, SSE executive vice president, who in turn forwarded it to the SSE president. Also to George Zhang (SMBE Secretary) and Bill Martin (MBE Editor). No response yet.
On behalf of a NESCent evolutionary informatics working group, I would like to keep you and SSE informed on our work. The main purpose of the EvoInfo working group is to solve the problem of interoperability of evolutionary data. For instance, we all know the problem of different programs (e.g., PHYLIP, PAUP, MrBayes, MEGA, the four most popular phylogenetic software packages) requiring different input formats, and even a supposedly single NEXUS format has arbitrary, mutually incompatible implementations. This greatly reduces the chance of data re-usability, e.g., for verification by reviewers, meta-analysis by others, and updated analyses by the authors themselves.

The working group consists of developers of these major software packages, as well as computer scientists who specialize in interoperability and logic programming. The first meeting took place two weeks ago (5/21-23) at NESCent. The group began working on:

(i) a generalized, abstract data model for evolutionary analysis, (ii) logic-based translation and validation capabilities for data formats (NEXUS, PHYLIP, etc.), and (iii) a description language for evolutionary transition models.

Further information on the activities of the working group is available at

The immediate utility of such a "phylogenetic ontology" approach would be to offer an assurance that the file format used by individual software developers would be readable by other programs. In the long run, we envision providing web-based tools for file format validation and file editing, as a service for individual researchers.

Because interoperability standards and practices must be disseminated and adopted widely to be successful, the working group must be responsive to the research community. As part of a long-term outreach plan, we are making contact with various representatives of the evolutionary research community, including the editorial boards of key journals.

At this moment, we are interested in gauging the level of interest from leading trade journals in promoting file format standards and database archives to standardize and to store data used in evolutionary analyses. How important is it to promote re-use of data published in Evolution? Would it advance the mission of SSE/Evolution for authors to submit supplementary data in standard formats accessible to reviewers and readers? To submit key data sets to an external database archive?

As more genomic data emerge and are analyzed in a comparative framework, we anticipate that phylogenetic data will move to the center stage of biomedical research and be adopted by researchers outside the traditional evolution/ecology/behavior research community.

We value your feedback on these issues, and on other ways that the NESCent working group could partner with SSE/Evolution to achieve common goals. Please feel free to contact me by email or phone.


  3. Presentations at Google Summer of Code (Hilmar and Weigang)
    We presented the results and goals of the EvoInfo working group at the Google Summer of Code meeting in August. We made it clear that many of the GSoC projects were chosen based on the outcomes of the Hackathon and the first EvoInfo working group meeting. I think these presentations made an impact on the next generation of evolutionary informatics students by exposing them to the main informatics challenges (the problem is not a lack of applications, but a lack of coordination, interoperability, and reusability). Most students were not aware of the interoperability problem, and quite a few had written their own tree and NEXUS parsers and viewers. Many of them were new to Bio::Tree and Bio::NEXUS.

Relevant information and links

Note: see the Links page (SideBar) for more information on possible partners, funding sources, etc.

Some activities we know about

  • a course at NESCent that arose out of the phylohackathon (need link)
  • the Woods Hole molecular evolution course?
  • comparative genomics course in Europe somewhere (can't remember)