TreeBASE provides a system for user-initiated archiving of data sets associated with published analyses, but currently it is under-utilized. One thing that TreeBASE needs is input format validation (most input files are not properly formed NEXUS files). What else is needed to make the system friendly and useful for users in different fields? How much new capacity would be needed? How would this be supported?
Analysis of current needs
As one group member puts it, "I want data sets and collections that other people are using, so when X says 'I did this with X data' it is something I can just go get and use."
Actually, we already have something for this: it's called TreeBASE. But it's under-utilized. It is popular in the molecular systematics community and virtually unknown elsewhere. The administrator spends considerable time cleaning up input files that are supposedly in NEXUS format but most often are not.
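To make the input-validation need concrete, here is a minimal sketch of the kind of well-formedness check that could run before a file reaches the administrator. It is not TreeBASE's actual validation. It assumes only the basic NEXUS conventions: the file starts with `#NEXUS`, commands end in semicolons, blocks open with `BEGIN name;` and close with `END;` or `ENDBLOCK;`, and comments sit in square brackets. The function name `check_nexus` is hypothetical, and the sketch ignores nested comments and semicolons inside quoted labels.

```python
import re


def check_nexus(text):
    """Return a list of problems; an empty list means the basic checks passed.

    Hypothetical helper for illustration only: it checks the header and
    BEGIN/END pairing, not the contents of any block.
    """
    problems = []
    # Drop bracketed comments (nested comments are not handled in this sketch).
    body = re.sub(r"\[[^\]]*\]", "", text).strip()

    # The file must open with the #NEXUS token (case-insensitive).
    if body[:6].upper() == "#NEXUS":
        body = body[6:]
    else:
        problems.append("missing #NEXUS header")

    open_block = None  # name of the block we are currently inside, if any
    for command in body.split(";"):
        words = command.split()
        if not words:
            continue
        keyword = words[0].upper()
        if keyword == "BEGIN":
            if open_block is not None:
                problems.append(f"block {open_block} not closed before new BEGIN")
            if len(words) < 2:
                problems.append("BEGIN without a block name")
                open_block = "(unnamed)"
            else:
                open_block = words[1].upper()
        elif keyword in ("END", "ENDBLOCK"):
            if open_block is None:
                problems.append("END without a matching BEGIN")
            open_block = None
        elif open_block is None:
            problems.append(f"command {keyword} outside any block")

    if open_block is not None:
        problems.append(f"block {open_block} never closed")
    return problems
```

A submission form could call `check_nexus` on an uploaded file and show the returned messages to the submitter, so that the most common structural mistakes are caught before a person has to clean them up by hand.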
Goals for the working group
We do not have the resources to work on this directly, but we can make recommendations.
The recently funded pPOD project is working on this.
The BioSQL project is also relevant.