Reference
Jobs (and all objects they created) are deleted 24 hours after the job was started. Deleted jobs are no longer returned by this endpoint.
Unique identifier of the job, as returned when the query job was started.
One of: a snapshot query that has completed successfully, an incremental query that has completed successfully, or a data access job that has terminated with failure.
If no scope is specified, then the endpoint uses the default scope of the authenticated user. Returns an error if the user has access to several scopes and the scope is not explicitly specified.
Identifies the domain or product that the request pertains to, e.g. canvas.
A list of tables in the given scope.
If data is returned in JSON Lines format (*.jsonl), then the schema applies to the JSON object obtained by combining the sub-objects accessed via the key and value properties of each JSON item.
Assume the schema reads as follows:
{
    "type": "object",
    "properties": {
        "pkey": {
            "type": "integer",
            "format": "int64"
        },
        "prop1": {
            "type": "string"
        },
        "prop2": {
            "type": "integer"
        }
    },
    "additionalProperties": false,
    "required": [
        "pkey",
        "prop1"
    ]
}
Suppose we have the following JSON output:
{ "meta": { "action": "U", ... }, "key": { "pkey": 1 }, "value": { "prop1": "value1", "prop2": 42 } }
{ "meta": { "action": "U", ... }, "key": { "pkey": 2 }, "value": { "prop1": "value2", "prop2": null } }
{ "meta": { "action": "D", ... }, "key": { "pkey": 3 } }
In the example directly above, the first and second items (update records) would both validate against the pre-defined schema.
The validator would check the following synthesized JSON objects:
{ "pkey": 1, "prop1": "value1", "prop2": 42 }
{ "pkey": 2, "prop1": "value2", "prop2": null }
The third item (a delete record) does not have to validate against the schema because it indicates that the client is to remove the item.
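To make the validation procedure concrete, here is a minimal Python sketch that merges the key and value sub-objects of each upserted JSON Lines record and validates the result. It assumes the third-party jsonschema package and an input file named records.jsonl, and it widens prop2 to also permit null (as in the second sample record); none of these are part of the API itself.
import json

from jsonschema import validate  # third-party package: pip install jsonschema

SCHEMA = {
    "type": "object",
    "properties": {
        "pkey": {"type": "integer", "format": "int64"},
        "prop1": {"type": "string"},
        # Widened to also allow null, matching the sample output where
        # prop2 is null; the example schema above uses plain "integer".
        "prop2": {"type": ["integer", "null"]},
    },
    "additionalProperties": False,
    "required": ["pkey", "prop1"],
}

with open("records.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        if record["meta"]["action"] == "D":
            continue  # delete records carry only the key; nothing to validate
        # Combine the key and value sub-objects into the object the schema describes.
        merged = {**record["key"], **record["value"]}
        validate(instance=merged, schema=SCHEMA)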
If data is returned in Comma-Separated Values format (*.csv), then the schema type constraints apply to the CSV key and value columns, but not to the meta columns. For example, assume we have the following CSV output:
meta.action,key.pkey,value.prop1,value.prop2
U,1,"value1",42
U,2,"value2",
D,3,,
Then the schema would read the same as in the JSON example above.
Nested JSON objects are flattened to simple fields, with composite names constructed using the dot notation (parent.child).
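Because of this convention, nested records can be reassembled mechanically from a CSV row. A minimal sketch using only the standard library (the file name data.csv is an assumption):
import csv

def unflatten(row: dict) -> dict:
    """Rebuild nested sub-objects from dot-notation column names
    such as key.pkey or value.prop1."""
    nested = {}
    for column, cell in row.items():
        *parents, leaf = column.split(".")
        target = nested
        for part in parents:
            target = target.setdefault(part, {})
        # Empty cells stand for absent or null values; note that all CSV
        # cells are strings, so the schema's types must be applied separately.
        target[leaf] = cell if cell != "" else None
    return nested

with open("data.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        print(unflatten(row))
        # e.g. {'meta': {'action': 'U'}, 'key': {'pkey': '1'},
        #       'value': {'prop1': 'value1', 'prop2': '42'}}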
If no scope is specified, then the endpoint uses the default scope of the authenticated user. Returns an error if the user has access to several scopes and the scope is not explicitly specified.
Identifies the domain or product that the request pertains to, e.g. canvas.
Canonical name of the table whose schema is to be returned.
The versioned JSON schema specification for the table.
The JSON Schema object to validate against.
The version of the schema.
In contrast to objects, which have a longer lifetime, pre-signed URLs are valid for a shorter duration, typically 15 minutes.
File paths returned by this endpoint do not adhere to any specification. While they may contain auxiliary information such as a job ID or part counter, these are only informative. Downstream systems should not depend on any specific patterns in file names, or make any assumptions about how much data each file contains.
Uniquely identifies the object.
A list of pre-signed URLs.
A dictionary of key-value pairs, each consisting of an ObjectID and the corresponding resource URL.
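As an illustration of the intended usage pattern, the sketch below requests pre-signed URLs for a batch of object IDs and downloads each one immediately, before the URLs expire. The gateway host, the /dap/object/url path, the request and response shapes, and the DAP_TOKEN environment variable are all assumptions for the sake of the example, not a documented contract:
import os

import requests  # third-party package: pip install requests

BASE_URL = "https://api-gateway.instructure.com"  # assumed host
HEADERS = {"Authorization": f"Bearer {os.environ['DAP_TOKEN']}"}

def download_objects(object_ids: list[str]) -> None:
    # Exchange object IDs for short-lived pre-signed URLs
    # (assumed endpoint path and payload shape).
    resp = requests.post(
        f"{BASE_URL}/dap/object/url",
        headers=HEADERS,
        json=[{"id": object_id} for object_id in object_ids],
    )
    resp.raise_for_status()
    urls = resp.json()["urls"]  # assumed: dictionary of object ID -> URL entry

    for object_id, entry in urls.items():
        # Pre-signed URLs embed their own credentials; no auth header needed.
        data = requests.get(entry["url"])
        data.raise_for_status()
        # Derive a local name from the object ID; per the note above, do not
        # parse meaning out of the remote file path.
        with open(object_id.replace("/", "_"), "wb") as out:
            out.write(data.content)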
This is an asynchronous operation. Calling this endpoint will start a new job and return immediately with status information. However, the operation will continue running on the server. The caller can poll the status of the job to find out when it is ready.
If a job with the same query parameters already exists, its details are returned rather than starting a new job.
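A polling loop might look like the following sketch, reusing BASE_URL and HEADERS from the download example above; the /dap/job/{id} path and the status values are likewise assumptions for illustration:
import time

import requests

def wait_for_job(job_id: str, poll_interval: float = 5.0) -> dict:
    """Poll a job until it reaches a terminal state."""
    while True:
        resp = requests.get(f"{BASE_URL}/dap/job/{job_id}", headers=HEADERS)
        resp.raise_for_status()
        job = resp.json()
        if job["status"] in ("complete", "failed"):  # assumed terminal states
            return job
        time.sleep(poll_interval)  # avoid hammering the endpoint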
If no scope is specified, then the endpoint uses the default scope of the authenticated user. Returns an error if the user has access to several scopes and the scope is not explicitly specified.
For incremental queries, the output uses a special metadata field called action to identify whether a record is upserted (inserted or updated) or (hard) deleted (U corresponds to upsert, and D to delete):
{ "meta": { "action": "U", ... }, "key": { "pkey": 1 }, "value": { "prop1": "value1", "prop2": 42 } }
{ "meta": { "action": "U", ... }, "key": { "pkey": 2 }, "value": { "prop1": "value2", "prop2": null } }
{ "meta": { "action": "D", ... }, "key": { "pkey": 3 } }
Upserted records have the primary key fields present in the sub-object key and all other data fields in the sub-object value.
Deleted records only have the primary key fields in the key property, and lack the value property.
Hard deletes are infrequent. They only take place when a record is irreversibly deleted from the source database, e.g. to comply with privacy or legal requirements.
In most cases, records are soft-deleted, i.e. they are updated in such a way as to be understood as deleted or inactive while the record is retained in the database, e.g. by setting a workflow_state column to the value inactive or deleted. In this context, soft deletes are equivalent to an update: they are denoted with a U, and all field values are included in the output.
In the rare event that inserting a record is quickly followed by a hard delete in the source database between two successive incremental queries, a record might appear with a new (so far unseen) key, no value, and an action of D.
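Putting these rules together, a client could apply an incremental result set to a local keyed store as in this sketch (the dictionary stands in for whatever storage the client actually uses; the single-column integer primary key mirrors the example records):
import json

# Local copy of the table, keyed by primary key; in practice this might
# be a database table instead of an in-memory dictionary.
table: dict[int, dict] = {}

def apply_incremental(jsonl_lines) -> None:
    for line in jsonl_lines:
        record = json.loads(line)
        pkey = record["key"]["pkey"]
        if record["meta"]["action"] == "U":
            # Upsert: insert the row or overwrite it with the full set of
            # field values. Soft deletes arrive here too, e.g. with
            # workflow_state set to "inactive" or "deleted".
            table[pkey] = {**record["key"], **record["value"]}
        else:
            # Hard delete ("D"): remove the row. The key may be one never
            # seen before (insert followed by hard delete between queries).
            table.pop(pkey, None)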
For snapshot queries, deleted records are not included in the output:
{ "meta": { ... }, "key": { "pkey": 1 }, "value": { "prop1": "value1", "prop2": 42 } }
{ "meta": { ... }, "key": { "pkey": 2 }, "value": { "prop1": "value2", "prop2": null } }
This is a rate-limited endpoint. If excessive data volume is requested repeatedly using this endpoint (e.g. a full snapshot every hour), future requests may be denied. We encourage making use of incremental queries, which substantially reduce the amount of data returned.
Snapshot queries and incremental queries serve different purposes in data retrieval and management within the DAP environment.
It is recommended to take a snapshot exactly once (as an initialization step) and then use incremental queries thereafter. By using snapshot queries for the initial data load and incremental queries for subsequent updates, users can maintain up-to-date datasets efficiently.
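The recommended workflow can be expressed as: take one snapshot, remember the timestamp it is valid at, and pass that timestamp to each subsequent incremental query, advancing it after every run. A sketch under the same endpoint assumptions as before (the at and until response fields and the since request field are illustrative, as is the accounts table name):
import os

import requests

BASE_URL = "https://api-gateway.instructure.com"  # assumed host
HEADERS = {"Authorization": f"Bearer {os.environ['DAP_TOKEN']}"}

def start_query(table: str, since: str | None = None) -> dict:
    """Start a snapshot query (since=None) or an incremental query.
    The endpoint path and body fields are illustrative assumptions."""
    body = {"format": "jsonl"}
    if since is not None:
        body["since"] = since
    resp = requests.post(
        f"{BASE_URL}/dap/query/canvas/table/{table}/data",
        headers=HEADERS,
        json=body,
    )
    resp.raise_for_status()
    return resp.json()

# One-time initialization: a full snapshot establishes the baseline and
# the timestamp it is valid at (field name "at" is an assumption).
job = start_query("accounts")
last_seen = job["at"]

# Thereafter, on a schedule: fetch only records changed since the last
# query and advance the watermark (field name "until" is an assumption).
job = start_query("accounts", since=last_seen)
last_seen = job["until"]
Each call returns immediately with job details; the job must still be polled to completion (as in the earlier polling sketch) before its result objects can be downloaded.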
Identifies the domain or product that the request pertains to, e.g. canvas.
Canonical name of the table whose data is to be returned.
One of: a snapshot query that has completed successfully, an incremental query that has completed successfully, or a data access job that has terminated with failure.