Adama¶
Warning
This document is currently a collection of mostly independent pieces of documentation for Adama. In the near future we plan to organize it properly, to function as the one-stop-shop for all Adama documentation.
Contents:¶
Adama Tutorial¶
Warning
This API is preliminary, and it may change.
Follow this tutorial with any HTTP client. Here we’ll use curl.
The languages supported by Adama are currently: Python, Javascript.
The base url of the Adama is https://api.araport.org/community/v0.3. For brevity, in what
follows we use the environment variable API
to refer to this base.
Declare the variable in your shell to be able to copy and paste the
examples:
$ export API=https://api.araport.org/community/v0.3
You need to obtain a token through your method of choice. In what
follows, the environment variable TOKEN
is assumed to be set to
such token. It is convenient to set it as:
$ export TOKEN=my-token
Checking access to Adama¶
A GET
request to $API/status
should return:
$ curl -L -X GET $API/status -H "Authorization: Bearer $TOKEN"
{
"api": "Adama v0.3",
"hash": "6869fde8e2617ab8f8a58c5c09b1512a80185500",
"status": "success"
}
The hash
field points to the git commit of Adama that is currently
serving the response.
Registering a Namespace¶
Namespaces allow Adama to group adapters. Create a new namespace with:
$ curl -X POST $API/namespaces -Fname=tacc -Fdescription="TACC namespace" \
-H "Authorization: Bearer $TOKEN"
{
"result": "https://api.araport.org/community/v0.3/tacc",
"status": "success"
}
Retrieve information about a known namespace from the url
$API/<namespace>
(for example $API/tacc
). Obtain the list of
all registered namespaces with:
$ curl -X GET $API/namespaces -H "Authorization: Bearer $TOKEN"
{
"result": [
{
"description": "TACC namespace",
"name": "tacc",
"url": null
}
],
"status": "success"
}
Delete a namespace with the verb DELETE
to the url
$API/<namespace>
.
Registering an Adapter¶
Adama currently supports two types of adapters: query
and
map_filter
.
A query
adapter receives a request through Adama, performs a query
to an external service and returns the results as JSON objects.
A map_filter
adapter transforms and/or filters JSON objects
returned from an external service.
An adapter can be registered using two methods (or a combination of them):
POST
the code and the metadata. The code can be a single file, tarball, or zip archive.POST
an URL to a git repository containing the code and the metadata.
It is strongly recommended to use the second method, since it makes it easier to share, to modify, and to keep track of changes in the adapters.
We show an example of a query
adapter registered via the first
method, and an example of a map_filter
adapter registered via the
second method.
Writing a query adapter¶
Write a Python module main.py
, with a function search
that
takes a JSON object as argument in the form of a dictionary. Print
JSON objects to standard output, separated by the characters
"---"
.
For example:
# file: main.py
import json
def search(args):
print json.dumps({'obj': 1, 'args': args})
print "---"
print json.dumps({'obj': 2, 'args': args})
This function can be tested in the Python interpreter:
>>> import main
>>> main.search({'x': 5})
{"args": {"x": 5}, "obj": 1}
---
{"args": {"x": 5}, "obj": 2}
Registering¶
To register this adapter with the name example
in the namespace
tacc
, we POST
to $API/tacc/services
with the metadata and
the code. In this example we show only some of the optional fields,
refer to the API docs for the full documentation.
- name (mandatory): the name of the adapter (
example
in this case), - type (mandatory): the type of adapter:
query
, ormap_filter
, - version (optional): version (default
0.1
), - url (mandatory): URL of the external service
(
http://example.com
in this case), - notify (optional): URL to notify with a POST request when the adapter is ready to use,
- code (mandatory): module
main.py
.
Using curl:
$ curl -L -X POST $API/tacc/services \
-F "name=example" -F "type=query" -F "url=http://example.com" \
-F code=@main.py -F "notify=https://my.url" \
-H "Authorization: Bearer $TOKEN"
{
"message": "registration started",
"result": {
"notification": "https://my.url",
"search": "https://api.araport.org/community/v0.3/search",
"list": "https://api.araport.org/community/v0.3/list",
"state": "https://api.araport.org/community/v0.3/example_v0.1"
},
"status": "success"
}
At this point the registration procedure is started in the server. It may take some time, and in the meantime the state of the adapter can be checked with:
$ curl -L -X GET $API/tacc/example_v0.1 \
-H "Authorization: Bearer $TOKEN"
{
"result": {
"msg": "Workers started",
"service": null,
"slot": "busy",
"stage": 4,
"total_stages": 5
},
"status": "success"
}
When ready, Adama will post to the url specified in the notify
parameter (if any), and the adapter can be seen in the directory of
services. To see a list of all the available services:
$ curl -L -X GET $API/tacc/services \
-H "Authorization: Bearer $TOKEN"
{
"result": [
{
"code_dir": "/tmp/tmpolAjqz/user_code",
"description": "",
"json_path": "",
"language": "python",
"main_module": "main",
"metadata": "",
"name": "example",
"namespace": "tacc",
"notify": "https://my.url",
"requirements": [],
"type": "query",
"url": "http://example.com",
"version": "0.1",
"whitelist": [
"localhost",
"example.com"
],
"workers": [
"57a4e10cb84aba5473d81c58011fcb78ce1b2684d67f0c2cc7540be191d4b589"
]
}
],
"status": "success"
}
Delete the service example_v0.1
by using the DELETE
verb to
$API/tacc/example_v0.1
.
Writing a map_filter adapter¶
Start a git repository as:
$ mkdir map_filter_example
$ cd map_filter_example
$ git init
Add the file main.py
with content:
def map_filter(obj):
obj['processed_by'] = 'Adama'
return obj
This module can be tested in the Python interpreter:
>>> import main
>>> main.map_filter({'key': 1})
{'key': 1, 'processed_by': 'Adama'}
Add also the file metadata.yml
with the metadata information:
---
name: map_example
version: 0.1
type: map_filter
main_module: main.py
url: https://api.araport.org/community/v0.3/json
whitelist: ['127.0.0.1']
description: ''
requirements: []
notify: ''
json_path: result
The url https://api.araport.org/community/v0.3/json
returns a sample JSON response:
$ curl https://api.araport.org/community/v0.3/json
{
"result": [
{
"key": 1
},
{
"key": 2
},
{
"key": 3
}
],
"status": "success"
}
The array of objects we want to process is in the field result
, so
we declare it in the json_path
field of the metadata file.
Commit both files into the git repository:
$ git add main.py metadata.yml
$ git commit -m "Add main and metadata"
The git repository has to be made available somewhere. For example, if
using Github with the username waltermoreira
and repository name
map_adapter
, we can register the adapter with:
$ curl -L -X POST $API/tacc/services \
-F "git_repository=https://github.com/waltermoreira/map_adapter.git"
Performing a query¶
Use the adapter example_v0.1
registered in the tacc
namespace
by doing a GET
from $API/tacc/example_v0.1/search
.
For example:
$ curl -L "$API/tacc/example_v0.1/search?word1=hello&word2=world" \
-H "Authorization: Bearer $TOKEN"
{"result": [
{"args": {"worker": "887e5cf7c82f", "word1": "hello", "word2": "world"}, "obj": 1}
, {"args": {"worker": "887e5cf7c82f", "word1": "hello"], "word2": "world"}, "obj": 2}
],
"metadata": {"time_in_main": 0.0001881122589111328},
"status": "success"}
Notice that the result consists of the two objects generated by
main.py
, including the query argument (in this
case containing some extra metadata added by Adama).
Use the adapter map_example_v0.1
in a similar way:
$ curl -L $API/map_example_v5/search \
-H "Authorization: Bearer $TOKEN"
{"result": [
{"processed_by": "Adama", "key": 1}
, {"processed_by": "Adama", "key": 2}
, {"processed_by": "Adama", "key": 3}
],
"metadata": {},
"status": "success"}
Summary¶
Current endpoints for Adama:
$API/status
GET
: get information about Adama server
$API/namespaces
GET
: list namespacesPOST
: create namespace
$API/<namespace>
GET
: get information about a namespaceDELETE
: remove a namespace
$API/<namespace>/services
GET
: list all servicesPOST
: create a service
$API/<namespace>/<service>
GET
: get information about a serviceDELETE
: remove a service
$API/<namespace>/<service>/search
GET
: perform a query
$API/<namespace>/<service>/list
GET
: perform a listing
Documenting Parameters¶
Starting in version 0.3, Adama allows to document the parameters for all the types of adapters.
The user provided documentation is a backward compatible extension of
the metadata file. The syntax for declaring parameters is based in
Swagger 2.0. The full Swagger spec can be used in the
endpoints
field in the file metadata.yml
, but a simpler subset
is also supported, allowing easier documentation of the most common
cases of adapters.
Example¶
The following is a basic example of an Adama adapter with
autogenerated documentation. Refer to the base tutorial for
instructions about registering an adapter, and change the base URL
from api.araport.org
to adama-dev.cloudapp.net
(this is a
sandbox, don’t worry to break things here).
This adapter is hosted at github.com/waltermoreira/sample-parameter-docs.
The git repository contains two files: main.py
and
metadata.yml
. The file metadata.yml
describes an adapter with
four parameters:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 | ---
name: my_adapter
type: query
validate_request: yes
endpoints:
/search:
parameters:
- name: x
description: Parameter X
type: string
required: true
- name: y
description: Parameter Y
type: integer
format: int64
required: false
default: 5
- name: z
description: Parameter Z
type: array
required: true
collectionFormat: multi
items:
type: number
format: double
- name: w
description: Parameter W
type: string
required: true
enum:
- Spam
- Eggs
|
Let’s explain the meaning of the several sections.
Line 6 declares that the service wants Adama to validate the input
before passing the control to the module main.py
. The default is
no
, which allows for backward compatibility and gradual
documentation of existing adapters. When Adama validates the input,
the developer of the adapter can be confident that the arguments for
the function main.search
have the types specified in
metadata.yml
.
For example, the function main.search
in this adapter contains the
lines:
def search(args):
# ...
w = args['w']
assert w in ('Spam', 'Eggs')
# ...
When validate_request: yes
, the developer knows that these two
lines will not raise an error, since the parameter w
is mandatory
and it is one of the values described in the enumeration.
Lines 8-9 declare that we are documenting the endpoint /search
of
the service. Current Adama adapters have a defined list of supported
endpoints (see live API docs), but the list can be expanded in the
future.
Lines 11-14 describe the mandatory parameter x
of type string.
Lines 15-20 describe the optional parameter y
of type integer. In
case it is not passed in the request, the adapter will get the default
value 5.
Lines 21-28 describe a parameter z
of type array of numbers. An array
can be passed in a request in different ways. Line 25 declares that
this adapter accepts a multi
format. This means that a request of
the form:
https://araport-dev.cloudapp.net/.../my_adapter_v0.1/search?z=2&z=3.1&z=10&...
will pass the array [2.0, 3.1, 10.0]
to the function
main.search
. See the Swagger spec for the available ways to pass
arrays in requests.
Lines 29-35 describe a parameter w
with a defined set of possible
values.
Registering the adapter¶
Register the adapter following the instructions in the base tutorial. Here is an example using the tool httpie [1] for performing the requests:
export ADAMA=https://adama-dev.cloudapp.net/community/v0.3
export TOKEN=...my token...
http POST https://$ADAMA/my_namespace/services \
Authorization:"Bearer $TOKEN" \
git_repository=https://github.com/waltermoreira/sample-parameter-docs \
validate_request=yes
The adapter health can be checked with the request:
http https://$ADAMA/my_namespace/my_adapter_v0.1 Authorization:"Bearer $TOKEN"
which should return the a successful response with a lot of “nerd stats”.
Accessing the documentation¶
Once the adapter is successfully registered, the full Swagger
documentation can be accessed in the /docs
endpoint of the
adapter. For example:
# return Swagger documentation in JSON format
http https://$ADAMA/my_namespace/my_service_v0.1/docs \
Authorization:"Bearer $TOKEN"
# return Swagger documentation in YAML format
http https://$ADAMA/my_namespace/my_service_v0.1/docs?format=yaml \
Authorization:"Bearer $TOKEN"
This output is usually not meant for human consumption (although the
YAML output is very readable). The main goal of this endpoint is to
be fed to any Swagger 2.0 complaint browser. Araport near future
plans include to provide a developer console that will include all the
adapters documentation, the Agave API, and the Adama base API. In the
meantime, Adama provides the endpoint /docs/swagger
which is a
basic instance of a Swagger browser.
To interact with the documentation, access with the browser the URL:
https://adama-dev.cloudapp.net/community/v0.3/my_namespace/my_adapter_v0.1/docs/swagger
Footnotes
[1] | httpie is a strongly recommended replacement for curl : http://httpie.org/ |
Metadata format¶
An example of metadata.yml
file:
---
description: "Given a valid AGI locus, fetch coexpressed genes from the ATTED-II database"
url: http://atted.jp/
main_module: main.py
name: atted_coexpressed_by_locus
type: query
version: 0.1
whitelist:
- atted.jp
The file metadata.yml
accepts the following fields:
name
- The name of the adapter
version
- The version of the adapter. Please, use semantic versioning.
type
Type of the adapter. One of:
query
generic
map_filter
passthrough
description
- Free form text to describe the purpose of the adapter.
url
- Url for the third party data source the adapter is accessing, if
any. Depending on the type of the adapter, this may be for
documentation purposes (
query
andgeneric
), or it may be used directly by Adama to access the service on behalf of the user (map_filter
andpassthrough
). whitelist
- An additional list of ip’s or domains that the adapter may need to access.
main_module
- The name (including the path relative to the root of the git
repository) of the main module for this adapter. If omitted, Adama
will search for a module named
main.*
. notify
- An url that will receive a POST with the data of the new registered adapter once it is ready to receive requests.
requirements
- A list of extra modules to add to the adapter at installation time.
These modules should be installable via the standard package manager
of the language used by the adapter (for example:
pip
for Python,gem
for Ruby, etc.) validate_request
- Whether to validate the parameters according to the provided
documentation. By default this option is
no
. If enabled, the parameters of a request are validated before passing control to the user’s code in the adapter. endpoints
- Documentation about the parameters accepted by this adapter (see documenting parameters).
json_path
This field is meaningful only for
map_filter
adapters. If the third party service returns an array of JSON objects to be processed by the adapter, then this field can be empty. Otherwise, if the array is nested inside a JSON object, this field is used to specify how to reach it.For example, if the response of the third party service is the following JSON object:
{ "status": "success", "result": { "data": [1, 2, 3, 4, 5], "name": "integers" } }
and the adapter is interested in the array of integers, then we set:
json_path: result.data
Types of Adapters¶
At the fundamental level, Adama is a builder of webservices. Its task is to abstract the infrastructure necessary to publish a webservice, such as security, scalability, fault tolerance, monitoring, caching, etc. A developer of an adapter can concentrate just in the source and transformations of the data.
To make it easy to develop webservices, Adama provides several types of adapters, trying to minimize the amount of code a developer has to write.
Note
Webservices
A webservice is a function or process (usually located at some URL) that accepts a HTTP request (usually a GET or POST, but in general any HTTP verb) and returns a response. The type of response varies wildy: JSON, text, images, HTML, etc. In Adama there are extra features for dealing with the JSON type, since it’s the preferred format for webservices.
We denote a webservice that returns a response of type \(T\) by
Note
Simple webservices
We call simple webservice to a webservice that restricts the
request types to just a GET
with query parameters. Such
requests have the form:
GET http://example.com?key1=value1&key2=value2&...
In other words, a simple webservice is a service that accepts a set of key/values and returns a response of type \(T\). We denote it with the symbol:
Adama prefers simple webservices that return data as an array of JSON objects: \(\text{SimpleWS}_{[\text{JSON}]}\).
There are two types of adapters that return a simple webservice of type \([JSON]\):
query: a query adapter has the type:
\[\bigl( \{k:v\} \to \text{Stream}(\text{JSON}) \bigr) \to \text{SimpleWS}_{[\text{JSON}]}\]This means that the developer provides a function that accepts a set of key/values and that returns a stream of JSON objects. Given this function, Adama constructs a simple webservice that accepts a
GET
request with query parameters and returns an array of JSON objects.To return a stream of JSON objects, the developer just has to print to standard output, separating each object with the line
---
.map_filter: a map_filter adapter has the type:
\[\bigl( \text{JSON}\to\text{JSON}, \text{SimpleWS}_{[\text{JSON}]} \bigr) \to \text{SimpleWS}_{[\text{JSON}]}\]This adapter takes two arguments: a function that transforms JSON objects, and an existing simple webservice returning an array of JSON objects. Given those parameters, Adama constructs a simple webservice that consists in transforming the output of the original webservice via the provided function.
There are two additional adapters that provide extra functionality, for the cases when returning JSON objects is not feasible:
generic: a generic adapter is similar to a query adapter, but the return type is arbitrary:
\[\bigl( \{k:v\} \to T \bigr) \to \text{SimpleWS}_T\]Also, rather than returning an stream via printing to standard output, a generic adapter simply returns the object of type \(T\).
passthrough: a passthrough adapter makes Adama a proxy for an arbitrary existing webservice. The type is:
\[\bigl( \text{WS}_T \bigr) \to \text{WS}_T\]That is, it takes an existing webservice and it constructs the same webservice, except by changing the URL and by providing extra features such as caching, authentication, etc.
History¶
Warning
This section contains some legacy documents. Most of the information here may be out-of-date or incorrect.
Introduction¶
The Arabidopsis Information Portal project wants to present a unified interface to a wide set of services that provide data for Arabidopsis research.
Problem¶
There is a huge amount of data related to Arabidopsis being produced and being served to the internet. However, the way the data is presented varies wildly across each service. The main aspects in which services vary are:
- The format the data is presented: anything from CSV to binary formats;
- The API used to access the service: anything from a RESTful service to SOAP based service;
- The vocabulary used to denote the objects.
Trying to collect and to unify these aspects is difficult given their dynamic nature. Data formats and interfaces change quickly. A central service would need to be continuously updating its software. Adama tries to solve this problem by engaging the developers of each service in a distributed and federated architecture.
Strategy¶
Adama is a RESTful service that provides two main endpoints:
- Registration interface: a provider of a data source (see Roles) can register a service to be accessible through Adama. The provider’s work (see Developer Role Quickstart) consists in defining a conversion procedure between the Araport language (see Araport Language) and the native format of the data source.
- Query interface: an user of Araport (see User Role Quickstart) can perform queries in the Araport language and direct them to one or more registered services.
In addition to this interface or API, the architecture provides the following features that aim to solve the points in Problem:
- Unified language and API. Araport provides access through a RESTful service, accepting and returning JSON objects. The schema for these objects is defined in the Araport Language, a rationalized and extensible format designed to express queries for all the registered services. This allows a query to be spread to multiple data sources simultaneously and the results to be collected through a single endpoint.
- Distributed responsability. The task of maintaining an up-to-date conversion between the Araport Language and the third party services is distributed among the developers of the latter. Adama provides an extremely easy way to develop the adapters and to test them in complete isolation of the Araport site (see Developer Role Quickstart).
- Security and isolation. The adapters from each developer and for each service are run fully isolated and with a restricted access to the network. This allows to control the security in execution, and to control the network usage (including throttling, caching, etc.).
- Scalability. Adama takes care of the horizontal scaling of the adapters. The scalability can be performed in a dynamic way, depending on the load to particular services. In addition, load balancing to the full Adama API is performed automatically.
- Extra Services. Adama provides extra services for each data source: pagination, count, and caching. These services do not require support from the data source and they do not require extra work from the developer of the adapter.
Architecture¶
Overview¶
The following picture illustrates the architecture of Adama:
The main components are:
- Adama itself is a Python application running over an Apache web server, providing the REST endpoints and business logic. (see API).
- A data store (Redis) collects a registry of adapters.
- A task queue (RabbitMQ) distributes work to the adapters for each
request to the
query
endpoint. - The workers are Docker containers that get created through the
register
endpoint, and get started and shutdown under Adama command, depending on the load of thequery
endpoint. The workers are continuously consuming work from the task queue. - The user module in the picture is the adapter provided by the developer at registration time (see Adapter API).
- The user module, data sources, and 3rd party protocol (colored blue in the picture) specify the parts under control of the developer role (see Developer Role Quickstart).
- Elasticsearch intercepts objects generated by the queries and creates an aggregated database that allows extra capabalities for searching and analyzing the data sources. Data sources can opt-in or opt-out of this functionality at any time.
API¶
Warning
This section is in flux. The API will change before a stable release.
In what follows, assume that $ADAMA
is the base url of the Adama
services.
The resource $ADAMA/register
accepts a GET
and POST
request. The GET
verb returns a list of registered adapters in
the form:
{
"identifier": "foo_v0.1",
"language": "python",
"name": "foo",
"url": "http://example.com",
"version": "0.1",
"workers": [
"ad89eefd01ca4648dc388dd12b95816cc24fe938ca801bd938ef982fa057a489",
"7ec01fefe8c8a54d92a773569e0dc0b38be8b3a9bbdea7a16da05c5a800117ad",
"ca88c73a13c17704e348eca468c101c784654dbe90b1156846e34507d0cccd6a",
"7f3fbb1faadd7493c349dd316ad3d1dcf8775b8851dbf0e9111b00bf64c03612"
]
}
The identifier is unique and can be used in the queries to refer to
the service provided by this adapter. The workers
field contains
internal information about the workers currently running to attend
this adapter (it may be removed from the public API in the
future).
The POST
verb allows to register a new adapter. It accepts the
parameters described below. The type of the parameter and whether
they are mandatory or optional is described besides the paramter name.
name
[form, mandatory]- Name of the service provided by this adapter (together with the version they must form a unique identifier)
version
[form, mandatory]- Version of the service.
url
[form, mandatory]- URL of the data source. The network access for the adapter may be restricted to access only this URL.
description
[form, mandatory]- Human readable description of the service.
requirements
[form, optional]- Comma separated list of third party modules to be installed inside
the workers. They should be accessible in the standard package
repository for the language bein used (i.e.,
pypi
for Python,rubygems
for Ruby, etc.). code
[file, mandatory]- The user’s code for the adapter. See Adapter API for its API and requirements. The code can be provided in a single file, or in a tarball or zip compressed archive. The type is detected automatically.
Workers are started immediately after registration. The response is
the standard (status
, message
, result
) triple (see Agave).
The verbs PUT
and DELETE
will be implemented in the future to
allow administration of already registered adapters.
The resource $ADAMA/query
accepts POST
requests to perform
queries to a selected list of services. The parameter is a JSON
encoded in the body with the schema:
{
"serviceName": "foo_v0.1",
"query": "...Araport Language query..."
}
The serviceName
field can also be a list of multiple services. The
query will be delivered to all of them, and responses will be
collected together. See Araport Language for the schema of the
queries.
Adapter API¶
Warning
This section is in flux. The API will change before a stable release.
An adapter can be written in any of the programming languages supported by Adama. The list initially includes:
Python, Javascript (node.js), Ruby, Java, Lua, Perl.
Other languages can be added in the future by request.
The description that follows is generic, and details for each language will be provided in newer revisions of this document.
An adapter is a module called main
. It contains a function named
process
which accepts a string and returns nothing. The string
argument is a JSON encoded object that will be passed by Adama, and it
will contain the query from the user (in the Araport Language).
The task of the function process
is:
Convert the Araport Language query to the proper format for the 3rd party service.
Send the query to the 3rd party service and retrieve the results.
For each result, convert it to Araport Language and print it to screen (as a JSON encoded value).
The output may use several lines with no restriction. Print the line
---
to separate results. Many results can be generated from every result from the data source.
The function process
in the module main
can be tested by the
developer by simply running it in his or her own system, with no
access to Adama or Araport. As long as the adapter follows the
protocol to print to standard output as JSON, and to separate the
objects with ---
, Adama will be able to capture the results.
Note
By printing to standard output, the adapter is effectively using an asynchronous output model, allowing Adama to start delivering results to the clients as soon as possible. It does not require any effort from the developer. And it is actually easier to code than collecting the results in a temporary container and returning them.
The main
module can have dependencies of two types:
- It can depend on other modules provided by the developer. In such
case, the developer can choose to register the adapter as tarball or
zip compressed archive. The only requirement is that the
main
module has to be at the root level of the compressed archive. Other files or assets needed by the module can be at the same level or in subdirectories. - Or it can depend on modules or packages from the standard package repository for the corresponding language. In this case, the extra modules will be installed at registration time in the workers (see Registration API).
The language of the adapter is detected automatically by looking at
the module main
uploaded during registration.
See Developer Role Quickstart for an example of this API.
Araport Language¶
Todo
To be defined and to be written.
Using and Developing for Adama¶
This section shows very basic and quick examples of using the
register
and query
endpoints of Adama.
Note
These examples are meant to be executed against an Adama
instance and they should work. Currently, we do not have an
official and public accessible instance, but a dedicated
user can build his or her own from
github.com/waltermoreira/adama
, if desired. We expect
to have an alpha release on TBD.
Roles¶
We called developer an individual who has access or knowledge of an existing data source, and he or she wants to register it into Adama. The developer does not need special access to the third party data source, other than the standard API access it already provides. He or she needs an account on Araport with permissions to register adapters. A developer is expected to have a minimum of skills on some programming language (from the list supported by Adama, see programming languages).
We called user an individual who is using Adama through the
query
interface. No special permissions are required. Knowledge
to access a RESTful API is recommended, although command line and web
tools are also planned in the future.
Developer Role Quickstart¶
In this setion we register an adapter for the service Expressolog by
locus at bar.utoronto.ca
.
The service accepts queries as GET
requests to the URL:
http://bar.utoronto.ca/webservices/get_expressologs.php
with parameter request=[{"gene": "...""}]
.
The native output of the service for the gene At2g26230 looks like :
{"At2g26230": [
{"probeset_A": "267374_at",
"gene_B": "Glyma20g17440",
"probeset_B": "Glyma20g17440",
"correlation_coefficient": "0.5264",
"seq_similarity":"67",
"efp_link": "http://bar.utoronto.ca/efp_soybean/cgi-bin
/efpWeb.cgi?dataSource=soybean&primaryGene
=Glyma20g17440&modeInput=Absolute"
},
{"probeset_A": "267374_at",
"gene_B":"Solyc11g006550",
"probeset_B": "Solyc11g006550",
"correlation_coefficient": "0.1768",
"seq_similarity": "68",
"efp_link":"http://bar.utoronto.ca/efp_tomato/cgi-bin
/efpWeb.cgi?dataSource=tomato&primaryGene
=Solyc11g006550&modeInput=Absolute"
}
]
}
The Araport Language query would look like (see Araport Language):
{
"query": {"locus": "At2g26230"},
"countOnly": false,
"pageSize": 100,
"page": 1
}
The transformed Araport Language output will contain two records for each of the previous records:
{
"status": "success",
"message": "",
"result": [
{
"class": "locus_relationship",
"reference": "TAIR10",
"locus": "At2g26230",
"type": "coexpression",
"related_entity": "Solyc11g006550",
"direction": "undirected",
"score": [
{
"correlation_coefficient": 0.1768
}
],
"source": "tomato"
},
{
"class": "locus_relationship",
"reference": "TAIR10",
"locus": "At2g26230",
"type": "similarity",
"related_entity": "Solyc11g006550",
"direction": "undirected",
"score": [
{
"similarity_percentage": 68
}
],
"source": "tomato"
},
{
"class": "locus_relationship",
"reference": "TAIR10",
"locus": "At2g26230",
"type": "coexpression",
"related_entity": "Glyma20g17440",
"direction": "undirected",
"score": [
{
"correlation_coefficient": 0.5264
}
],
"source": "soybean"
},
{
"class": "locus_relationship",
"reference": "TAIR10",
"locus": "At2g26230",
"type": "similarity",
"related_entity": "Glyma20g17440",
"direction": "undirected",
"weight": [
{
"similarity_percentage": 67
}
],
"source": "soybean"
}
]
}
The complete code (in Python) for the adapter can be seen at Adama github repository.
In pseudo-code, the function process
of the module main.py
can
be described as:
def process(args):
# <extract 'locus' from the JSON object 'args'>
# <send request to the expressologs service>
for result in # <results from expressologs>:
obj = # <convert result to Araport format (for type "similarity")>
print obj
print '---'
obj = # <convert result to Araport format (for type "coexpression")>
print obj
print '---'
print 'END'
This is the code which the developer of the adapter for the third party data source will create. The developer can test this function interactively in the Python interpreter, independently of any interaction with Adama, as in:
>>> import main, json
>>> main.process(json.dumps(
... {"query": {"locus": "At2g26230"},
... "countOnly": false,
... "pageSize": 100,
... "page": 1}))
...
If successful, the function will print to standard output a sequence
of JSON objects separated by the lines '---'
, and it will print
the 'END'
string when the stream of results is exhausted. After a
successful testing, the developer will upload this module to Adama by
posting to $ADAMA/register
, for example with the (Python) command:
>>> requests.post('$ADAMA/register',
... data={'name': 'expressologs_by_locus',
... 'version': '0.1',
... 'url': 'http://bar.utoronto.ca/webservices/get_expressologs.php',
... 'requirements': 'requests'},
... files={'code': ('main.py', open('main.py'))})
...
Note that we are assuming the main.py
module (as in the example on
the github repository) is using the module requests
to access the
expressolog service. For such reason, we include it in the
requirements
field, so it is properly installed in the Adama
workers. This example can be found already built in the
examples directory of Adama github repository.
On success, Adama will return the full identifier for this service
(expressologs_by_locus_v0.1
), and it will start workers ready to
attend queries.
User Role Quickstart¶
After a developer registered the expressolog service as in
Developer Role Quickstart, a user can query the service by posting to
$ADAMA/query
:
>>> requests.post('$ADAMA/query',
... data=json.dumps({
... 'serviceName': 'expressologs_by_locus_v0.1',
... 'query': {'locus': 'At2g26230'}}))
...
On success, the response will return a JSON object with the results in the format described in Developer Role Quickstart. As mentioned in Adapter API, Adama will start streaming the results as soon as they are available from the data source. Users can start reading the results using any of the available increamental JSON parsers.
Provenance¶
Starting in version 0.3, Adama allows declaring the provenance for each microservice response.
Provenance for a microservice can be specified in the field
sources:
of the metadata. Its content is a list of objects with
the following fields:
title:
- (mandatory) Name of the source
description:
- (mandatory) Human readable description of the source
language:
- (optional) Language (following RFC 4646)
last_modified:
- (optional) Date of last modification. ISO format preferred (e.g.: “2015-04-17T09:44:56”), but it’s more permissive. Most unambiguous formats are supported.
sponsor_organization_name:
- (mandatory) Sponsor organization
sponsor_uri:
- (optional) Sponsor URI
provider_name:
- (mandatory) Provider name
provider_email:
- (optional) Provider email
uri:
- (optional) Agent URI
license:
- (optional) License
sources:
- (optional) A list of nested sources containing the above fields (recursively including more sources, if necessary). By default it is the empty list.