PhyloWS/REST

From Evolutionary Informatics Working Group
Revision as of 05:08, 14 February 2008 by Hlapp (talk) (Phylodb: Phylogenetic Tree Database)

Specification for a REST-based instantiation of the PhyloWS API.

Disclaimer: This is a pre-alpha work in progress and nowhere near finished. Comments are highly appreciated.

Principles

HTTP method    CRUD operation
POST           Create, Update, Delete
GET            Read
PUT            Create, Update
DELETE         Delete

Source: the Wikipedia entry on REST.

  1. RESTful queries should be stateless, so all required input must be provided in a single call, rather than being "accumulated" over multiple calls (if there are multiple pieces of input).
  2. The architecture builds on the principle of viewing data and operations as resources, which get created, modified (deleted, updated), or retrieved.
  3. The allowable HTTP method (GET, POST, PUT, DELETE) depends on the type of CRUD operation being represented by the resource call. For example, resource calls that simply retrieve data without also creating a resource should use the GET method, and not POST.
    Note that a resource may be created virtually; for example, calling a resource might start a calculation and return the results directly (rather than returning a URL to the results, although returning a URL would be more appropriate). Nonetheless, a resource (the calculation) is still being created. Question: is the mapping PUT=create, GET=retrieve, POST=update, DELETE=delete? If so, what is PUT used for? Is a message body being submitted? I don't think so...
  4. The API should also be described as a WSDL document. In fact, WSDL 2.0 allows binding to HTTP methods and supports a RESTful interface.
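
As a rough illustration of principle 3, the method table above can be written as a lookup from HTTP method to the CRUD operations it may represent. This is only a sketch: the names CRUD_BY_METHOD and methods_for are hypothetical, and the mapping simply restates the table, which this draft still treats as an open question.

```python
# Hypothetical lookup restating the HTTP/CRUD table above (not a settled mapping).
CRUD_BY_METHOD = {
    "POST": {"create", "update", "delete"},
    "GET": {"read"},
    "PUT": {"create", "update"},
    "DELETE": {"delete"},
}


def methods_for(operation):
    """Return the HTTP methods that the table allows for a CRUD operation."""
    return sorted(
        method for method, ops in CRUD_BY_METHOD.items() if operation in ops
    )


# A call that only retrieves data should therefore use GET, not POST.
print(methods_for("read"))
print(methods_for("create"))
```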

Basic structure

  1. All resource URIs start with BASE_URL/phylows/. BASE_URL is specific to the implementing service.
  2. Resources marked as optional need not be implemented by a service.
  3. The first path element designates the type of data the resource points to, or the operation.
    • Examples for data: /phylows/tree, /phylows/clade, /phylows/node
    • Examples for operation: /phylows/aggregate, /phylows/find
  4. The second path element depends on whether the resource points to a data resource or to an operation:
    • Data resources: The second path element gives the unique identifier of the data resource. This need not (and arguably should not) be the primary key of the resource in the provider database, but could also be an accession number-type identifier.
    • Operation resources: For operations that act on or return possibly multiple resources (data elements), the second path element gives the data type being acted on or returned. Subsequent path elements specify the query path, and the query string at the end of the URI supplies the query parameters.
      • Examples: /phylows/find/tree, with additional parameters specifying the query, e.g.: /phylows/find/tree/?name=Primates
  5. Operations acting on a single identified data resource use the URI of the data resource, and express the action as the HTTP method, possibly in combination with input parameters.
    • Examples: /phylows/tree/TreeBASE:S123455 with HTTP method DELETE is a request to delete the respective tree. /phylows/node/urn:lsid:phylodb.org:node:123456 with HTTP method DELETE is a request to prune the respective node from the tree it is in.
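
The URI rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the specification: the base URL http://example.org and the helper names data_resource_uri and operation_uri are assumptions, and keeping ":" unescaped in identifiers is a choice made here to match the examples above.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://example.org"  # hypothetical; specific to the implementing service


def data_resource_uri(data_type, identifier, base_url=BASE_URL):
    """URI of one data resource: BASE_URL/phylows/<type>/<identifier>."""
    # Encode the identifier as a single path segment, but leave ':' readable
    # because namespace:ID and LSID identifiers contain colons.
    return f"{base_url}/phylows/{data_type}/{quote(identifier, safe=':')}"


def operation_uri(operation, data_type, base_url=BASE_URL, **params):
    """URI of an operation: BASE_URL/phylows/<operation>/<type>/?<query>."""
    uri = f"{base_url}/phylows/{operation}/{data_type}/"
    if params:
        uri += "?" + urlencode(params)
    return uri


# Operation resource: find trees by name.
print(operation_uri("find", "tree", name="Primates"))

# Action on a single data resource: same URI, action expressed as the HTTP method.
# The request object is only built here, not sent.
delete_request = Request(
    data_resource_uri("tree", "TreeBASE:S123455"), method="DELETE"
)
print(delete_request.get_method(), delete_request.full_url)
```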

Specification

Conventions:

  • <value> is a placeholder and must be replaced with an actual value when accessing the resource (i.e., typically a mandatory parameter)
  • {val1|val2|val3} must be replaced with exactly one of the values (literals) between the curly braces, delimited by the vertical bar
  • [param=<value>] or [param] denote optional parameters (with or without value, respectively)

Phylodb: Phylogenetic Tree Database

Retrieve:

Task: Retrieve tree and tree metadata

  • Query URI: /phylows/tree/<identifier>/?metadata={true|false}&topology={true|false}&[format=<format>]
  • identifier is a valid and unique identifier of the tree, for example a namespace:ID specification, a primary key, or an LSID
  • If metadata is true, all metadata of the tree will be returned. If topology is true, the topology (structure) of the tree will be returned.
  • format designates the desired response format. Example formats are nhx (New Hampshire Extended) and nexml (default). If the data provider doesn't support the requested format, an error will result.
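
A client might assemble the tree retrieval URI as follows. This is a sketch under assumptions: the function name tree_uri and the base URL http://example.org are hypothetical, and the format parameter is simply left out when no format is requested so that the provider's default applies.

```python
from urllib.parse import quote, urlencode


def tree_uri(identifier, metadata, topology, fmt=None,
             base_url="http://example.org"):  # base URL is an assumption
    """Build /phylows/tree/<identifier>/?metadata=...&topology=...[&format=...]."""
    query = {
        "metadata": "true" if metadata else "false",
        "topology": "true" if topology else "false",
    }
    if fmt is not None:
        # Optional parameter: only sent when a specific format is requested.
        query["format"] = fmt
    return (
        f"{base_url}/phylows/tree/{quote(identifier, safe=':')}/"
        f"?{urlencode(query)}"
    )


# Metadata only, in New Hampshire Extended format.
print(tree_uri("TreeBASE:S123455", metadata=True, topology=False, fmt="nhx"))
```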

Task: Retrieve all metadata for a node

  • Query URI: /phylows/node/<identifier>/?[format=<format>]
  • identifier is a valid and unique identifier of the node, for example a namespace:ID specification, a primary key, or an LSID
  • format designates the desired response format. Example formats are nhx (New Hampshire Extended) and nexml (default). If the data provider doesn't support the requested format, an error will result.
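
The node request, including the error case for an unsupported format, could look roughly like this. The specification does not yet say how the error is reported; the sketch assumes it arrives as an HTTP error status. The names node_uri and fetch_node and the base URL http://example.org are hypothetical.

```python
from urllib.error import HTTPError
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def node_uri(identifier, fmt=None, base_url="http://example.org"):
    """Build /phylows/node/<identifier>/ with an optional format parameter."""
    uri = f"{base_url}/phylows/node/{quote(identifier, safe=':')}/"
    if fmt is not None:
        uri += "?" + urlencode({"format": fmt})
    return uri


def fetch_node(identifier, fmt=None):
    """Retrieve node metadata as raw bytes (GET, since nothing is created)."""
    try:
        with urlopen(node_uri(identifier, fmt)) as response:
            return response.read()
    except HTTPError as err:
        # Assumption: an unsupported format is signalled by an HTTP error status.
        raise ValueError(
            f"request for format {fmt!r} failed with HTTP status {err.code}"
        ) from err


print(node_uri("urn:lsid:phylodb.org:node:123456", fmt="nhx"))
```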