REST API Reference¶

version 1.7.1

This reference and the REST API itself are still under heavy development and are subject to change at any time. Feedback through our GitHub issues is appreciated!

Table of Contents¶

Introduction
Resource Object Schemas
API Endpoints
Bundle Actions API
Bundle Permissions API
Bundle_Stores API
Bundles API
CLI API
Groups API
OAuth2 API
User API
Users API
Workers API
Worksheet Interpretation API
Worksheet Items API
Worksheet Permissions API
Worksheets API

Introduction¶

We use the JSON API v1.0 specification with the Bulk extension.
- http://jsonapi.org/format/
- https://github.com/json-api/json-api/blob/9c7a03dbc37f80f6ca81b16d444c960e96dd7a57/extensions/bulk/index.md

The following specification will not provide the verbose JSON formats of the API requests and responses, as those can be found in the JSON API specification. Instead:
- Each resource type below (worksheets, bundles, etc.) specifies a list of attributes with their respective data types and semantics, along with a list of relationships.
- Each API call specifies the HTTP request method, the endpoint URI, the HTTP parameters with their respective data types and semantics, and a description of the API call.

How we use JSON API¶

Using the JSON API specification allows us to avoid reinventing the wheel when designing our request and response formats. This also means that this document does not currently specify every detail of the API (for example, Content-Type headers, or the fact that POST requests should contain a single resource object), although we may copy in more details as design and implementation proceed. However, since the JSON API specification has many optional features, we document on a best-effort basis which parts of the specification we use, which parts we do not, and the ways in which our usage is specific to our API.

Top-level JSON structure¶

Every JSON request or response will have at its root a JSON object containing either a “data” field or an “error” field, but not both. Thus the presence of an “error” field will unambiguously indicate an error state.

Response documents may also contain a top-level "meta" field, containing additional constructed data that are not strictly resource objects, such as summations in a search query.

Response documents may also contain a top-level "included" field, discussed below.
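As a rough illustration of the rules above, a client might unwrap a response document as follows. This is a minimal sketch: the helper `unwrap_document` and the exception `ApiError` are hypothetical names chosen for this example, not part of the API.

```python
import json


class ApiError(Exception):
    """Hypothetical exception for documents that carry an "error" field."""


def unwrap_document(raw):
    """Return (data, meta, included) from a JSON response body.

    Raises ApiError when the document has a top-level "error" field.
    """
    doc = json.loads(raw)
    if "error" in doc:
        # The presence of "error" unambiguously indicates an error state.
        raise ApiError(doc["error"])
    # "meta" and "included" are optional, so fall back to empty containers.
    return doc.get("data"), doc.get("meta", {}), doc.get("included", [])


data, meta, included = unwrap_document('{"data": [], "meta": {"results": 3}}')
print(data, meta, included)
```

Checking for "error" first keeps the success path free of any error handling, which matches the "either data or error, but not both" guarantee.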

Primary Data¶

The JSON API standard specifies that the “data” field will contain either a resource object or an array of resource objects, depending on the nature of the request. More specifically, if the client is fetching a single specific resource (e.g. GET /bundles/0x1d09b495), the “data” field will have a single JSON object at its root. If the client intends to fetch a variable number of resources, then the “data” field will have at its root an array of zero or more JSON objects.

The structure of a JSON response with a single resource object will typically look like this:

{
  "data": {
    "type": "bundles",
    "id": "0x1d09b495410249f89dee4465cd21d499",
    "attributes": {
      // ... this bundle's attributes
    },
    "relationships": {
      // ... this bundle's relationships
    }
  }
}
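Because "data" may hold either one resource object or an array of them, client code often normalizes it to a list before processing. The sketch below shows one way to do that; the function name `resource_list` is a hypothetical helper, not something the API provides.

```python
def resource_list(document):
    """Return the primary data of a response document as a list of resource objects."""
    data = document.get("data")
    if data is None:
        return []
    if isinstance(data, dict):
        # Single-resource response, e.g. a fetch of one bundle by UUID.
        return [data]
    # Collection response: already an array of zero or more resource objects.
    return list(data)


single = {"data": {"type": "bundles", "id": "0x1d09b495410249f89dee4465cd21d499"}}
for resource in resource_list(single):
    print(resource["type"], resource["id"])
```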

Note that we use UUIDs as the "id" of a resource when available (i.e. for worksheets, bundles, and groups) and some other unique key for those resources to which we have not prescribed a UUID scheme.

For each of the resource types available in the Worksheets API, we define the schema for its attributes, as well as list what relationships each instance may have defined. Relationships are analogous to relationships in relational databases and ORMs—some may be to-one (such as the "owner" of a bundle) and some may be to-many (such as the "permissions" of a bundle).

We will use the following subset of the relationship object schema for our Worksheets API (Orderly schema):

object {
  object {
    string related;   // URL to GET the related resource
  } links;
  object {            // used to identify resource in includes
    string type;      // type of the related resource
    string id;        // id of the related resource
  } data?;
}
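To make the schema above concrete, the following sketch reads the optional "data" member of a relationship and returns the identifiers it points to. The helper name `related_ids` and the sample values (owner id, link URL) are illustrative assumptions; representing a to-many relationship as an array follows the JSON API specification rather than anything stated here.

```python
def related_ids(resource, name):
    """Return (type, id) pairs for the named relationship, or [] if unavailable."""
    relationship = resource.get("relationships", {}).get(name, {})
    data = relationship.get("data")  # "data" is optional in the schema above
    if data is None:
        return []
    if isinstance(data, dict):
        # A to-one relationship carries a single identifier object.
        data = [data]
    return [(item["type"], item["id"]) for item in data]


bundle = {
    "type": "bundles",
    "id": "0x1d09b495410249f89dee4465cd21d499",
    "relationships": {
        "owner": {
            "links": {"related": "/users/42"},        # illustrative URL
            "data": {"type": "users", "id": "42"},    # illustrative id
        }
    },
}
print(related_ids(bundle, "owner"))
```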

Query Parameters¶

The client may provide additional parameters for requests as query parameters in the request URL. Available parameters will be listed under each API route. In general:
- Boolean query parameters are encoded as 1 for "true" and 0 for "false".
- Some query parameters take multiple values, which can be passed by simply listing the parameter multiple times in the query, e.g. GET /bundles?keywords=hello&keywords=world
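Both conventions can be handled in one small encoder. The sketch below uses only the Python standard library; `encode_query` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def encode_query(params):
    """Encode a dict of parameters using the conventions described above."""
    pairs = []
    for key, value in params.items():
        # A list or tuple means "repeat this parameter once per value".
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            if isinstance(item, bool):
                item = 1 if item else 0  # booleans travel as 1 / 0
            pairs.append((key, item))
    return urlencode(pairs)


print("/bundles?" + encode_query({"keywords": ["hello", "world"],
                                  "include_display_metadata": True}))
```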

Includes¶

The client will often want to fetch data for resources related to the primary resource(s) in the same request: for example, the client may want to fetch a worksheet along with all of its items, as well as data about the bundles and worksheets referenced in the items.

Currently, most of the API endpoints will include related resources automatically. For example, fetching a "worksheet" will also include the "bundles" referenced by the worksheet in the response. These related resource objects will be included in an array in the top-level "included" field of the response object.
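A client can match a relationship's identifier against the "included" array by its (type, id) pair. The following is a minimal sketch with made-up sample data; `resolve` is a hypothetical helper, and the attribute values are placeholders.

```python
def resolve(document, identifier):
    """Look up an included resource by the (type, id) of a relationship identifier."""
    included = {(r["type"], r["id"]): r for r in document.get("included", [])}
    return included.get((identifier["type"], identifier["id"]))


document = {
    "data": {"type": "worksheets", "id": "0x" + "a" * 32,
             "attributes": {"name": "home"}},
    "included": [
        {"type": "bundles", "id": "0x" + "b" * 32,
         "attributes": {"command": "echo hello"}},
    ],
}
bundle = resolve(document, {"type": "bundles", "id": "0x" + "b" * 32})
print(bundle["attributes"]["command"])
```

For responses with many included resources, building the lookup dictionary once and reusing it avoids rescanning the array on every lookup.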

Non-JSON API Endpoints¶

The JSON API specification is not all-encompassing, and there are some cases in our API that fall outside of the specification. We will indicate this explicitly where it applies, and provide an alternative schema for the JSON format where necessary.

Authorization and Authentication¶

The Bundle Service also serves as an OAuth2 Provider.

All requests to protected resources on the Worksheets API must include a valid OAuth bearer token in the HTTP headers:

Authorization: Bearer xxxxtokenxxxx

If the token is expired, does not authorize the application to access the target resource, or is otherwise invalid, the Bundle Service will respond with a 401 Unauthorized or 403 Forbidden status.
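As a sketch of how a client might attach the token, the snippet below builds (but does not send) a request with the standard library. The base URL is a placeholder and the helper names are hypothetical; how the token is obtained is outside the scope of this example.

```python
import urllib.request


def authorized_request(url, token, method="GET"):
    """Build a request carrying the OAuth bearer token in the Authorization header."""
    request = urllib.request.Request(url, method=method)
    request.add_header("Authorization", "Bearer " + token)
    return request


def is_auth_failure(status):
    """True for the two statuses described above for rejected tokens."""
    return status in (401, 403)


# Placeholder host; substitute the address of your own instance.
request = authorized_request("https://example.org/rest/bundles", "xxxxtokenxxxx")
print(request.get_header("Authorization"))
```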

Resource Object Schemas¶

users¶

Name	Type
`id`	String
`user_name`	String
`first_name`	String
`last_name`	String
`affiliation`	String
`url`	Url
`date_joined`	LocalDateTime
`avatar_id`	String
`has_access`	Boolean
`email`	String
`notifications`	Integer
`time_quota`	Integer
`parallel_run_quota`	Integer
`time_used`	Integer
`disk_quota`	Integer
`disk_used`	Integer
`last_login`	LocalDateTime
`is_verified`	Boolean

users¶

Name	Type
`id`	String
`user_name`	String
`first_name`	String
`last_name`	String
`affiliation`	String
`url`	Url
`date_joined`	LocalDateTime
`avatar_id`	String
`has_access`	Boolean
`email`	String
`notifications`	Integer
`time_quota`	Integer
`parallel_run_quota`	Integer
`time_used`	Integer
`disk_quota`	Integer
`disk_used`	Integer
`last_login`	LocalDateTime

bundle-actions¶

Name	Type
`id`	Integer
`uuid`	String
`type`	String
`subpath`	String
`string`	String

BundleDependencySchema¶

Plain (non-JSONAPI) Marshmallow schema for a single bundle dependency. Not defining this as a separate resource with Relationships because we only create a set of dependencies once at bundle creation.

Name	Type
`child_uuid`	String
`child_path`	String
`parent_uuid`	String
`parent_path`	String
`parent_name`	Method
`parent_state`	Method

bundle_locations¶

Name	Type
`id`	String
`bundle_store_uuid`	String
`name`	String
`storage_type`	String
`storage_format`	String
`url`	String

bundle_locations¶

Name	Type
`id`	String
`bundle_uuid`	String
`bundle_store_uuid`	String

bundle-permissions¶

Name	Type
`id`	CompatibleInteger
`bundle`	Relationship(bundles)
`group`	Relationship(groups)
`group_name`	String
`permission`	Integer
`permission_spec`	PermissionSpec

bundles¶

Name	Type
`id`	String
`uuid`	String
`bundle_type`	String
`command`	String
`state`	String
`state_details`	String
`owner`	Relationship(users)
`frozen`	DateTime
`is_anonymous`	Boolean
`storage_type`	String
`is_dir`	Boolean
`metadata`	Dict
`dependencies`	BundleDependencySchema
`children`	Relationship(bundles)
`group_permissions`	Relationship(bundle-permissions)
`host_worksheets`	Relationship(worksheets)
`args`	String
`permission`	Integer
`permission_spec`	PermissionSpec

bundle_stores¶

Name	Type
`id`	String
`uuid`	String
`owner`	Integer
`name`	String
`storage_type`	String
`storage_format`	String
`url`	String
`authentication`	String
`authentication_env`	String

groups¶

Name	Type
`id`	String
`name`	String
`user_defined`	Boolean
`owner`	Relationship(users)
`admins`	Relationship(users)
`members`	Relationship(users)

users¶

Name	Type
`id`	String
`user_name`	String
`first_name`	String
`last_name`	String
`affiliation`	String
`url`	Url
`date_joined`	LocalDateTime
`avatar_id`	String
`has_access`	Boolean

worksheet-items¶

Name	Type
`id`	CompatibleInteger
`worksheet`	Relationship(worksheets)
`subworksheet`	Relationship(worksheets)
`bundle`	Relationship(bundles)
`value`	String
`type`	String
`sort_key`	Integer

worksheet-permissions¶

Name	Type
`id`	CompatibleInteger
`worksheet`	Relationship(worksheets)
`group`	Relationship(groups)
`group_name`	String
`permission`	Integer
`permission_spec`	PermissionSpec

worksheets¶

Name	Type
`id`	String
`uuid`	String
`name`	String
`owner`	Relationship(users)
`title`	String
`frozen`	DateTime
`is_anonymous`	Boolean
`date_created`	DateTime
`date_last_modified`	DateTime
`tags`	List
`group_permissions`	Relationship(worksheet-permissions)
`items`	Relationship(worksheet-items)
`last_item_id`	Integer
`permission`	Integer
`permission_spec`	PermissionSpec


API Endpoints¶

Bundle Actions API¶

`POST /bundle-actions`¶

Sends the message to the worker to do the bundle action, and adds the action string to the bundle metadata.


Bundle Permissions API¶

`POST /bundle-permissions`¶

Bulk set bundle permissions.

A bundle permission created on a bundle-group pair will replace any existing permissions on the same bundle-group pair.
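The sketch below assembles a bulk request body from (bundle, group, permission) triples. It is an assumption-laden illustration: the body layout follows the generic JSON API resource object shape combined with the bundle-permissions schema listed earlier, the exact payload the server accepts is not specified here, and `bulk_permissions_body` is a hypothetical helper. Since a later permission on the same bundle-group pair replaces an earlier one, the helper keeps only the last entry per pair.

```python
import json


def bulk_permissions_body(assignments):
    """Build a bulk body from (bundle_uuid, group_uuid, permission) triples."""
    latest = {}
    for bundle_uuid, group_uuid, permission in assignments:
        # Later entries replace earlier ones for the same bundle-group pair.
        latest[(bundle_uuid, group_uuid)] = permission
    return {
        "data": [
            {
                "type": "bundle-permissions",
                "attributes": {"permission": permission},
                "relationships": {
                    "bundle": {"data": {"type": "bundles", "id": bundle_uuid}},
                    "group": {"data": {"type": "groups", "id": group_uuid}},
                },
            }
            for (bundle_uuid, group_uuid), permission in latest.items()
        ]
    }


bundle_uuid = "0x" + "b" * 32
group_uuid = "0x" + "c" * 32
# The integer permission values here are placeholders, not documented constants.
body = bulk_permissions_body([(bundle_uuid, group_uuid, 1), (bundle_uuid, group_uuid, 2)])
print(json.dumps(body, indent=2))
```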


Bundle_Stores API¶

`GET /bundle_stores`¶

Fetch the bundle stores available to the user. No required arguments.

Query parameters:
- name: (Optional) name of bundle store. If specified, only query information about the bundle store with the given name. If not, return information about all the bundle stores.

Returns a list of bundle stores, each having the following parameters:
- uuid: bundle store UUID
- owner_id: (integer) owner of bundle store
- name: name of bundle store
- storage_type: type of storage being used for bundle store (GCP, AWS, etc)
- storage_format: the format in which storage is being stored (UNCOMPRESSED, COMPRESSED_V1, etc)
- url: a self-referential URL that points to the bundle store.

`POST /bundle_stores`¶

Add a bundle store that the user can access.

JSON parameters:
- name: name of bundle store
- storage_type: type of storage being used for bundle store (GCP, AWS, etc)
- storage_format: the format in which storage is being stored (UNCOMPRESSED, COMPRESSED_V1, etc). If unspecified, an optimal default will be set.
- url: a self-referential URL that points to the bundle store.
- authentication: key for authentication that the bundle store uses.

Returns the data of the created bundle store.

`GET /bundle_stores/<uuid:re:0x[0-9a-f]{32}>`¶

Fetch the bundle store corresponding to the specified uuid.

Returns a single bundle store, with the following parameters:
- uuid: bundle store UUID
- owner_id: owner of bundle store
- name: name of bundle store
- storage_type: type of storage being used for bundle store (GCP, AWS, etc)
- storage_format: the format in which storage is being stored (UNCOMPRESSED, COMPRESSED_V1, etc)
- url: a self-referential URL that points to the bundle store.
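The route pattern `0x[0-9a-f]{32}` only matches full, lowercase UUIDs, so a client may want to validate a UUID before building the path. A minimal sketch, with `bundle_store_path` as a hypothetical helper name:

```python
import re

# Same pattern as in the route: "0x" followed by exactly 32 lowercase hex digits.
UUID_PATTERN = re.compile(r"0x[0-9a-f]{32}")


def bundle_store_path(uuid):
    """Return the path for a single bundle store, rejecting malformed UUIDs."""
    if not UUID_PATTERN.fullmatch(uuid):
        raise ValueError("not a full UUID: %r" % uuid)
    return "/bundle_stores/" + uuid


print(bundle_store_path("0x" + "0123456789abcdef" * 2))
```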

`DELETE /bundle_stores`¶

Delete the specified bundle stores.


Bundles API¶

`GET /bundles/<uuid:re:0x[0-9a-f]{32}>`¶

Fetch bundle by UUID.

Query parameters:

include_display_metadata: 1 to include additional metadata helpful for displaying the bundle info, 0 to omit them. Default is 0.
include: comma-separated list of related resources to include, such as "owner"

`GET /bundles`¶

Fetch bundles in one of the following two ways:

1. By bundle specs OR search keywords. Behavior is undefined when both specs and keywords are provided.

Query parameters:

worksheet: UUID of the base worksheet. Required when fetching by specs.
specs: Bundle spec of bundle to fetch. May be provided multiple times to fetch multiple bundle specs. A bundle spec is either:
1. a UUID (8 or 32 hex characters with a preceding '0x')
2. a bundle name referring to the last bundle with that name on the given base worksheet
3. or a reverse index of the form ^N referring to the Nth-to-last bundle on the given base worksheet.
keywords: Search keyword. May be provided multiple times for multiple keywords. Bare keywords match the names and descriptions of bundles. Examples of other special keyword forms:
- name=<name> : More targeted search using metadata fields.
- size=.sort : Sort by a particular field.
- size=.sort- : Sort by a particular field in reverse.
- size=.sum : Compute total of a particular field.
- .mine : Match only bundles I own.
- .floating : Match bundles that aren't on any worksheet.
- .count : Count the number of bundles.
- .limit=10 : Limit the number of results to the top 10.
include_display_metadata: 1 to include additional metadata helpful for displaying the bundle info, 0 to omit them. Default is 0.
include: comma-separated list of related resources to include, such as "owner"
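The three bundle spec forms listed under specs can be told apart on the client with a few regular expressions. This is only a sketch of the rules as stated: the server remains the authority on how a spec is resolved, lowercase hex digits are assumed (as in the route patterns), and `classify_spec` is a hypothetical helper.

```python
import re

FULL_UUID = re.compile(r"0x[0-9a-f]{32}")
SHORT_UUID = re.compile(r"0x[0-9a-f]{8}")
REVERSE_INDEX = re.compile(r"\^\d+")


def classify_spec(spec):
    """Classify a bundle spec as "uuid", "reverse_index", or "name"."""
    if FULL_UUID.fullmatch(spec) or SHORT_UUID.fullmatch(spec):
        return "uuid"
    if REVERSE_INDEX.fullmatch(spec):
        return "reverse_index"
    # Anything else is treated as a bundle name on the base worksheet.
    return "name"


for spec in ("0x1d09b495", "^2", "my-bundle"):
    print(spec, "->", classify_spec(spec))
```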

When aggregation keywords such as .count are used, the resulting value is returned as:

{
    "meta": {
        "results": <value>
    }
}
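Putting the keyword forms and the aggregate response together, a client might build the query string and read the result like this. The sketch does not contact a server: the response document is a hand-written stand-in, and the helper names are hypothetical.

```python
from urllib.parse import urlencode


def search_query(*keywords):
    """Encode each search keyword as its own repeated "keywords" parameter."""
    return urlencode([("keywords", keyword) for keyword in keywords])


def aggregate_result(document):
    """Read the value returned for aggregation keywords such as .count or .sum."""
    return document["meta"]["results"]


# Total size of the bundles owned by the current user.
print("/bundles?" + search_query(".mine", "size=.sum"))

# Stand-in for a parsed response body; the number is made up.
print(aggregate_result({"meta": {"results": 1024}}))
```

Note that `=` inside a keyword is percent-encoded as `%3D`, so it is not confused with the separator between the parameter name and its value.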

2. By bundle command and/or dependencies (for the --memoized option of the cl [run/mimic] command). When dependencies is not defined, the search results include bundles that match the command only.

Query parameters:
- command : the command of a bundle, as a string
- dependencies : the dependencies of a bundle in the format of '[{"child_path":key1, "parent_uuid":UUID1}, {"child_path":key2, "parent_uuid":UUID2}]'
1. a UUID should be in the format of 32 hex characters with a preceding '0x' (partial UUID is not allowed).
2. the key should be able to uniquely identify a (child_path, parent_uuid) pair in the list.

The returned result will be aggregated in the same way as in 1.
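A sketch of how the command and dependencies parameters might be assembled is shown below. It assumes the dependencies value is a JSON-encoded list, which is how the format string above reads but is not confirmed here; `memoized_query` is a hypothetical helper. Using a dict keyed by child path guarantees that each key identifies exactly one pair.

```python
import json
import re
from urllib.parse import urlencode

FULL_UUID = re.compile(r"0x[0-9a-f]{32}")


def memoized_query(command, dependencies=None):
    """Build the query string for a search by command and optional dependencies.

    `dependencies` maps each child_path key to the parent bundle's full UUID.
    """
    params = [("command", command)]
    if dependencies:
        for parent_uuid in dependencies.values():
            # Partial UUIDs are not allowed for this parameter.
            if not FULL_UUID.fullmatch(parent_uuid):
                raise ValueError("full UUID required: %r" % parent_uuid)
        pairs = [{"child_path": key, "parent_uuid": parent_uuid}
                 for key, parent_uuid in dependencies.items()]
        params.append(("dependencies", json.dumps(pairs)))
    return urlencode(params)


print(memoized_query("python train.py", {"data": "0x" + "a" * 32}))
```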

`POST /bundles`¶

Bulk create bundles.

Query parameters:
- worksheet: UUID of the parent worksheet of the new bundle. The bundle is added to this worksheet unless it is detached or shadowing another bundle. The new bundle also inherits permissions from this worksheet.
- bundle_store: (Optional) UUID of the bundle store that the new bundle should be stored on.
- shadow: UUID of the bundle to "shadow" (the new bundle will be added as an item immediately after this bundle in its parent worksheet).
- detached: 1 if the new bundle should not be added to any worksheet, so that the bundle has no host worksheet (for example, when the user is uploading their avatar as a bundle), or 0 otherwise. Default is 0.
- wait_for_upload: 1 if the bundle state should be initialized to "uploading" regardless of the bundle type, or 0 otherwise. Used when copying bundles from another CodaLab instance; this prevents the new bundles from being executed by the BundleManager. Default is 0.

`PATCH /bundles`¶

Bulk update bundles.

`DELETE /bundles`¶

Delete the bundles specified.

Query parameters:
- force: 1 to allow deletion of bundles that have descendants or that appear across multiple worksheets, or 0 to throw an error if any of the specified bundles have multiple references. Default is 0.
- recursive: 1 to remove all bundles downstream too, or 0 otherwise. Default is 0.
- data-only: 1 to only remove contents of the bundle(s) from the bundle store and leave the bundle metadata intact, or 0 to remove both the bundle contents and the bundle metadata. Default is 0.
- dry-run: 1 to just return list of bundles that would be deleted with the given parameters without actually deleting them, or 0 to perform the deletion. Default is 0.
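Two of these parameter names contain hyphens, so they cannot be used directly as Python keyword arguments. The sketch below maps underscore-style arguments onto the documented names; `delete_query` is a hypothetical helper, and how the bundles to delete are identified in the request is not covered here.

```python
from urllib.parse import urlencode


def delete_query(force=False, recursive=False, data_only=False, dry_run=False):
    """Encode the deletion flags as 1/0 query parameters."""
    flags = {
        "force": force,
        "recursive": recursive,
        "data-only": data_only,  # hyphenated on the wire
        "dry-run": dry_run,      # hyphenated on the wire
    }
    return urlencode({name: int(value) for name, value in flags.items()})


# Preview what a recursive delete would remove without deleting anything.
print(delete_query(recursive=True, dry_run=True))
```

Running with `dry_run=True` first and inspecting the returned list is a cautious way to use the recursive and force options.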

`GET /bundles/locations`¶

Fetch locations of bundles.

Query parameters:
- uuids: List of bundle UUIDs to get the locations for

`GET /bundles/<bundle_uuid:re:0x[0-9a-f]{32}>/locations/`¶

Returns a list of BundleLocations associated with the given bundle.

Query parameters:
- bundle_uuid: Bundle UUID to get the locations for

`POST /bundles/<bundle_uuid:re:0x[0-9a-f]{32}>/locations/`¶

Adds a new BundleLocation to a bundle. If a SAS token is needed, generates an Azure SAS token and connection string. The request body must contain the fields in BundleLocationSchema.

Query parameters:
- need_bypass: (Optional) Bool. If true, it will return a SAS token (for Azure) or a signed URL (for GCS) to bypass server upload.
- is_dir: (Optional) Bool. Whether the uploaded file is a directory.

`GET /bundles/<bundle_uuid:re:%s>/locations/<bundle_store_uuid:re:%s>/`¶

Get info about a specific BundleLocation.

Query parameters:
- bundle_uuid: Bundle UUID to get the location for
- bundle_store_uuid: Bundle Store UUID to get the location for

`POST /bundles/<bundle_uuid:re:0x[0-9a-f]{32}>/state`¶

Updates a bundle state. Used to finalize a bundle's upload status after it is uploaded by the client directly to the bundle store, such as uploading to blob storage and bypassing the server.

Query parameters:
- success: The state of upload.
- state_on_success: (Optional) String. New bundle state if the upload succeeds.
- state_on_failure: (Optional) String. New bundle state if the upload fails.
- error_msg: (Optional) String. Error message if upload fails.

`GET /bundles/<uuid:re:0x[0-9a-f]{32}>/contents/info/<path:path>`¶

Fetch metadata of the bundle contents or a subpath within the bundle.

Query parameters:
- depth: recursively fetch subdirectory info up to this depth. Default is 0.

Response format:

{
  "data": {
      "name": "<name of file or directory>",
      "link": "<string representing target if file is a symbolic link>",
      "type": "<file|directory|link>",
      "size": <size of file in bytes>,
      "perm": <unix permission integer>,
      "contents": [
          {
            "name": ...,
            <each file of directory represented recursively with the same schema>
          },
          ...
      ]
  }
}
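Because each entry in "contents" repeats the same schema, the response can be processed with a short recursive function. The sketch below sums the sizes of all files in a response; it counts only entries whose type is "file", which is a choice made for this example, and the sample data is made up.

```python
def total_file_size(info):
    """Sum "size" over every file in a contents-info node and its descendants."""
    size = info.get("size", 0) if info.get("type") == "file" else 0
    for child in info.get("contents", []):
        # Each child uses the same schema, so recurse into it.
        size += total_file_size(child)
    return size


response = {
    "data": {
        "name": "output",
        "type": "directory",
        "contents": [
            {"name": "stdout", "type": "file", "size": 10, "perm": 420},
            {"name": "logs", "type": "directory", "contents": [
                {"name": "run.log", "type": "file", "size": 32, "perm": 420},
            ]},
        ],
    }
}
print(total_file_size(response["data"]))
```

Entries deeper than the requested depth are simply absent from the response, so the total only covers what was fetched.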

`GET /bundles/<uuid:re:0x[0-9a-f]{32}>/contents/info/`¶

Fetch metadata of the bundle contents or a subpath within the bundle.

Query parameters:
- depth: recursively fetch subdirectory info up to this depth. Default is 0.

Response format:

{
  "data": {
      "name": "<name of file or directory>",
      "link": "<string representing target if file is a symbolic link>",
      "type": "<file|directory|link>",
      "size": <size of file in bytes>,
      "perm": <unix permission integer>,
      "contents": [
          {
            "name": ...,
            <each file of directory represented recursively with the same schema>
          },
          ...
      ]
  }
}

`PATCH /bundles/<uuid:re:0x[0-9a-f]{32}>/contents/filesize/`¶

Fixes the file size field in the index.sqlite file. This only allows the user to increase the file size of a single file.

`PUT /bundles/<uuid:re:0x[0-9a-f]{32}>/netcat/<port:int>/`¶

Send a raw bytestring into the specified port of the running bundle with uuid. Return the response from this bundle.