Transcript REST API Reference
This document is a work in progress. Meanwhile, see gRPC transcoding and consult the gRPC API reference.
Authorization
Calls to the Tiro services must be authorized with an API key or a JWT token
signed with the client's private key. The key or token is supplied in the
HTTP header Authorization: Bearer ACCESS_TOKEN.
Contact tiro@tiro.is to gain access or request an access token.
Server host
https://ritari.talgreinir.is
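As a minimal sketch, the Authorization header and server host above can be combined in Python using only the standard library. The helper name build_request is hypothetical and not part of any Tiro client; the request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://ritari.talgreinir.is"  # server host from this document


def build_request(method, path, token, body=None):
    """Build (but do not send) an authorized request to the REST API.

    build_request is a hypothetical helper used for illustration only.
    """
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        # JSON bodies are sent UTF-8 encoded with an explicit content type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )
```

A request built this way could then be sent with urllib.request.urlopen, given a valid token.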
Submitting transcript jobs
Submit a new media file to be transcribed. Media is submitted via a URL.
POST /v1alpha1/transcriptjob:submit
Request body fields (JSON)
See the gRPC reference for documentation on the fields.
Response fields (JSON)
See the gRPC reference for documentation on the fields.
Examples (cURL)
Example that submits an audio file via URL to be transcribed. When submitting a
URL, the required fields are metadata.languageCode, useUri, uri and
metadata.fileType. Generally available language codes for Icelandic are is-IS
and is-IS-x-exp.
curl -X POST \
-H "Authorization: Bearer $TIRO_TOKEN" \
https://ritari.talgreinir.is/v1alpha1/transcriptjob:submit -d@payload.json | jq
Where payload.json contains:
{
"metadata": {
"fileType": "AUDIO",
"languageCode": "is-IS",
"subject": "Test Spegillinn",
"description": "Test description",
"keywords": [
"keyword1",
"keyword2"
]
},
"useUri": true,
"uri": "https://ruv-vod-app-dcp-v4.secure.footprint.net/opid/vefur/200826thingidamorgun.mp3"
}
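The same payload can be assembled programmatically. The sketch below is one possible way to do it, using a hypothetical helper submit_payload that always sets the four required fields and passes any optional metadata through unchanged. The URL in the usage lines is a placeholder, not a real recording.

```python
import json


def submit_payload(uri, language_code="is-IS", file_type="AUDIO", **metadata):
    """Return a JSON-serializable body for POST /v1alpha1/transcriptjob:submit.

    submit_payload is a hypothetical helper. The required fields
    (metadata.languageCode, metadata.fileType, useUri, uri) are always set.
    """
    meta = {"fileType": file_type, "languageCode": language_code}
    meta.update(metadata)  # optional fields such as subject, description, keywords
    return {"metadata": meta, "useUri": True, "uri": uri}


payload = submit_payload(
    "https://example.com/audio.mp3",  # placeholder URL
    subject="Test Spegillinn",
    keywords=["keyword1", "keyword2"],
)
# ensure_ascii=False keeps Icelandic characters readable in the output file.
print(json.dumps(payload, indent=2, ensure_ascii=False))
```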
This will return a TranscriptJob in the form:
{
"name": "transcriptjob/ea893d3c-...",
"startTime": "2020-08-26T20:18:29.017571Z",
"transcriptMetadata": {
"fileType": "AUDIO",
"languageCode": "is-IS",
"originalUri": "https://ruv-vod-app-dcp-v4.secure.footprint.net/opid/vefur/200826thingidamorgun.mp3",
"subject": "Test Spegillinn",
"description": "Test description",
"keywords": [
"keyword1",
"keyword2"
]
}
}
Transcript jobs for longer media files can take a while to process. To check the
status of a job, send a GET request using the name returned in the response
above:
curl -X GET \
-H "Authorization: Bearer $TIRO_TOKEN" \
https://ritari.talgreinir.is/v1alpha1/transcriptjob/ea893d3c-...
While the job is being processed (in a PROCESSING state), a response similar to
the following is returned, where progressPercent indicates how much of the
audio has been transcribed.
{
"name": "transcriptjob/ea893d3c-...",
"state": "PROCESSING",
"progressPercent": 98,
"startTime": "2020-08-26T20:18:29.017571Z",
"lastUpdatedTime": "2020-08-26T20:21:49.869051Z",
"transcriptMetadata": {
"fileType": "AUDIO",
"languageCode": "is-IS",
"originalUri": "https://ruv-vod-app-dcp-v4.secure.footprint.net/opid/vefur/200826thingidamorgun.mp3",
"subject": "Test Spegillinn",
"description": "Test description",
"keywords": [
"keyword1",
"keyword2"
]
}
}
Once the transcript job has finished successfully (in a SUCCESS state), the
response looks like the following, where the field transcript has been
populated. This name is used to retrieve the contents of the transcript.
{
"name": "transcriptjob/ea893d3c-...",
"state": "SUCCESS",
"progressPercent": 100,
"startTime": "2020-08-26T20:18:29.017571Z",
"lastUpdatedTime": "2020-08-26T20:22:09.344261Z",
"transcript": "transcripts/ea893d3c-...",
"transcriptMetadata": {
"fileType": "AUDIO",
"languageCode": "is-IS",
"originalUri": "https://ruv-vod-app-dcp-v4.secure.footprint.net/opid/vefur/200826thingidamorgun.mp3",
"subject": "Test Spegillinn",
"description": "Test description",
"keywords": [
"keyword1",
"keyword2"
],
"recordingDuration": "792.904s"
}
}
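The status check above lends itself to a polling loop. The following is a sketch under stated assumptions: only the PROCESSING and SUCCESS states appear in this document, so the code treats a missing state (as in the submit response) like PROCESSING and treats any other state as a terminal failure. The function name wait_for_job and its parameters are hypothetical, and fetch_job is supplied by the caller so that the loop itself performs no network access.

```python
import time


def wait_for_job(fetch_job, name, interval_s=10.0, max_polls=360, sleep=time.sleep):
    """Poll a transcript job until it reaches SUCCESS and return the final job.

    fetch_job(name) must return the decoded JSON of GET /v1alpha1/<name>.
    """
    for _ in range(max_polls):
        job = fetch_job(name)
        state = job.get("state")
        if state == "SUCCESS":
            return job
        # A missing state or PROCESSING means the job is still running.
        if state not in (None, "PROCESSING"):
            # Assumption: every other state is terminal and unsuccessful.
            raise RuntimeError(f"job {name} ended in state {state!r}")
        sleep(interval_s)
    raise TimeoutError(f"job {name} did not finish after {max_polls} polls")
```

The transcript field of the returned job then holds the name needed to retrieve the transcript.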
Listing transcripts
List or query transcripts accessible to the authorized user.
GET /v1alpha1/transcripts
Query parameters
Parameter | Description |
---|---|
pageSize | Number of results returned per page |
pageToken | Each response contains a nextPageToken which can be used to list more results |
filter | Filter by metadata attached to the transcripts. See filter description. |
Filter description
Currently only two filters are available: filtering by subject (or title) and by keywords (or tags).
To filter by subject, specify the filter parameter as: metadata.subject CONTAINS "..."
To filter by keywords, specify the filter parameter as: metadata.keywords CONTAINS ["..."]
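Filter expressions contain spaces, quotes and possibly non-ASCII characters, so they must be percent-encoded before being placed in the query string. The sketch below shows one way to do that with Python's urllib.parse. The helper names are hypothetical, and how the service treats quote characters inside the search text is not documented here, so the code does not attempt to escape them.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://ritari.talgreinir.is"


def subject_filter(text):
    """Filter expression matching transcripts whose subject contains text."""
    return f'metadata.subject CONTAINS "{text}"'


def keyword_filter(keyword):
    """Filter expression matching transcripts tagged with a single keyword."""
    return f'metadata.keywords CONTAINS ["{keyword}"]'


def list_url(filter_expr=None, page_size=None, page_token=None):
    """Build the URL for GET /v1alpha1/transcripts with encoded parameters."""
    params = {}
    if filter_expr is not None:
        params["filter"] = filter_expr
    if page_size is not None:
        params["pageSize"] = page_size
    if page_token is not None:
        params["pageToken"] = page_token
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode(params, quote_via=quote)
    return BASE_URL + "/v1alpha1/transcripts" + ("?" + query if query else "")


print(list_url(subject_filter("Kastljós")))
```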
Examples (cURL)
List all transcripts accessible to the authorized user, up to a server-specified default page size:
curl -X GET -H "Content-Type: application/json" \
-H "Authorization: Bearer $ACCESS_TOKEN" \
https://ritari.talgreinir.is/v1alpha1/transcripts | jq
List transcripts that contain a specific string, Kastljós, in the subject:
curl -X GET -H "Content-Type: application/json" \
-H "Authorization: Bearer $ACCESS_TOKEN" \
'https://ritari.talgreinir.is/v1alpha1/transcripts?filter=metadata.subject%20CONTAINS%20%22Kastlj%C3%B3s%22' | jq
These requests return a response with the following structure:
{
"transcripts": [
{
"name": "transcripts/...",
"metadata": {
"fileType": "VIDEO",
"languageCode": "is-IS",
"originalUri": "https://...",
"subject": "...",
"description": "",
"keywords": [
"xyz"
],
"additionalMetadata": {
"abc": "xyz"
},
"recordingDuration": "1463.382s",
"waveformUri": "",
"speakers": {}
},
"segments": [],
"uri": "",
"version": {
"name": "...",
"parent": "...",
"creationTime": "2022-05-31T20:29:04.469478Z"
}
},
...
],
"nextPageToken": "2"
}
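To walk through all pages, the nextPageToken from each response is passed as pageToken in the next request. The generator below sketches that loop. It assumes that a missing or empty nextPageToken marks the last page, which this document does not state explicitly. The name iter_transcripts is hypothetical, and fetch_page is supplied by the caller.

```python
def iter_transcripts(fetch_page):
    """Yield every transcript, following nextPageToken across pages.

    fetch_page(page_token) must return the decoded JSON of one
    GET /v1alpha1/transcripts response; page_token is None for the first page.
    """
    token = None
    while True:
        page = fetch_page(token)
        yield from page.get("transcripts", [])
        token = page.get("nextPageToken")
        if not token:
            # Assumption: no token (or an empty one) means there are no more pages.
            return
```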
Retrieve a transcript
Get the contents and metadata of a transcript identified by the name
TRANSCRIPT_NAME, i.e. the contents of the name field described above.
GET /v1alpha1/TRANSCRIPT_NAME
This endpoint returns the metadata in the same structure as when listing
accessible transcripts, in addition to a segments field which contains the
time-aligned segments. The full text for the transcript is obtained by
concatenating every word in every segment in order.
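The concatenation described above can be sketched in a few lines of Python. The function name transcript_text is hypothetical; the input is the decoded JSON of a retrieved transcript.

```python
def transcript_text(transcript):
    """Join the word field of every word in every segment, in order.

    No separator is inserted: each word already carries the whitespace
    that should follow it.
    """
    return "".join(
        word["word"]
        for segment in transcript.get("segments", [])
        for word in segment.get("words", [])
    )
```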
Examples (cURL)
Retrieve a transcript with the name transcripts/8657e641-...:
curl -X GET -H "Content-Type: application/json" \
-H "Authorization: Bearer $ACCESS_TOKEN" \
'https://ritari.talgreinir.is/v1alpha1/transcripts/8657e641-...' | jq
Example response:
{
"name": "transcripts/8657e641-...",
"metadata": {
"fileType": "VIDEO",
"languageCode": "is-IS",
"originalUri": "...",
"subject": "Example subject",
"description": "",
"keywords": [
"examplekeyword"
],
"additionalMetadata": {
"xyz": "abc"
},
"recordingDuration": "1463.382s",
"waveformUri": "...",
"speakers": {}
},
"segments": [
{
"startTime": "18.415s",
"endTime": "28.196s",
"words": [
{
"startTime": "18.415s",
"endTime": "18.625s",
"word": "Gott "
},
{
"startTime": "18.625s",
"endTime": "18.924s",
"word": "kvöld "
},
{
"startTime": "18.926s",
"endTime": "19.016s",
"word": "og "
},
...
],
"speakerId": ""
},
...,
{
"startTime": "1457.024s",
"endTime": "1463.382s",
"words": [
{
"startTime": "1457.024s",
"endTime": "1457.114s",
"word": "af "
},
{
"startTime": "1457.114s",
"endTime": "1457.294s",
"word": "hverju "
},
...
],
"speakerId": ""
}
],
"uri": "https://...",
"version": {
"name": "435ae148-...",
"parent": "7ba2d6f4-...",
"creationTime": "2022-05-31T20:29:04.469478Z"
}
}
Create a transcript
Create a user-created transcript from caller-supplied text and timestamps.
POST /v1alpha1/transcripts
The body of the request is a Transcript in the same format as returned when
retrieving a transcript. Note that the word field of each word in a segment
also includes any whitespace that should appear before the next word in the
segment.
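When the caller has word timings as plain numbers, a segment in this format can be assembled as sketched below. The helper names seconds and make_segment are hypothetical. The code assumes that a single trailing space follows every word and that a segment spans from its first word's start to its last word's end; the example payloads in this document show segments whose endTime is given separately, so adjust as needed.

```python
def seconds(value):
    """Format a number of seconds as a duration string such as "18.415s"."""
    return f"{value:.3f}s"


def make_segment(words):
    """Build one segment from (start_s, end_s, text) tuples given in order."""
    if not words:
        raise ValueError("a segment needs at least one word")
    return {
        "startTime": seconds(words[0][0]),
        "endTime": seconds(words[-1][1]),
        "words": [
            # Assumption: one trailing space separates each word from the next.
            {"startTime": seconds(start), "endTime": seconds(end), "word": text + " "}
            for start, end, text in words
        ],
    }
```

A list of such segments, together with a metadata object, forms the request body.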
Examples (cURL)
curl -X POST -H "Content-Type: application/json" \
-H "Authorization: Bearer $ACCESS_TOKEN" \
'https://ritari.talgreinir.is/v1alpha1/transcripts' -d@payload.json | jq
where payload.json contains:
{
"metadata": {
"fileType": "AUDIO",
"languageCode": "is-IS",
"subject": "Example subject",
"keywords": [
"examplekeyword"
],
"dictation": true
},
"segments": [
{
"startTime": "18.415s",
"endTime": "28.196s",
"words": [
{
"startTime": "18.415s",
"endTime": "18.625s",
"word": "Gott "
},
{
"startTime": "18.625s",
"endTime": "18.924s",
"word": "kvöld "
},
{
"startTime": "18.926s",
"endTime": "19.016s",
"word": "og "
},
...
]
},
...,
{
"startTime": "1457.024s",
"endTime": "1463.382s",
"words": [
{
"startTime": "1457.024s",
"endTime": "1457.114s",
"word": "af "
},
{
"startTime": "1457.114s",
"endTime": "1457.294s",
"word": "hverju "
},
...
]
}
]
}
Example response:
{"name": "transcripts/8657e641-..."}
The returned name can be used to retrieve this transcript.
Update a transcript
Update the contents and/or metadata of a transcript identified by the name
TRANSCRIPT_NAME, i.e. the contents of the name field described above.
PATCH /v1alpha1/TRANSCRIPT_NAME
The body of the request is a partial Transcript in the same format as returned
when retrieving a transcript. Note that the word field of each word in a
segment also includes any whitespace that should appear before the next word in
the segment. The updatable fields are segments and metadata, and only one has
to be present in the request. The return value is the full updated Transcript.
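A typical edit retrieves a transcript, corrects some words, and sends back only the segments field. The sketch below builds such a PATCH body from a retrieved transcript. The function name replace_word is hypothetical, and the code assumes the complete list of segments is sent back, which is what the example payload in this section suggests but the document does not state outright.

```python
import copy


def replace_word(transcript, old, new):
    """Return a PATCH body in which every word equal to old is replaced by new.

    The input transcript is left unmodified. Trailing whitespace on each
    word is preserved so the spacing of the text does not change.
    """
    segments = copy.deepcopy(transcript["segments"])
    for segment in segments:
        for word in segment["words"]:
            text = word["word"]
            stripped = text.rstrip()
            if stripped == old:
                # Re-attach whatever whitespace followed the original word.
                word["word"] = new + text[len(stripped):]
    # Only the segments field is included; metadata stays untouched.
    return {"segments": segments}
```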
Examples (cURL)
Example that updates only the segments, i.e. the content of the transcript.
curl -X PATCH -H "Content-Type: application/json" \
-H "Authorization: Bearer $ACCESS_TOKEN" \
'https://ritari.talgreinir.is/v1alpha1/transcripts/8657e641-...' -d@payload.json | jq
where payload.json contains:
{
"segments": [
{
"startTime": "18.415s",
"endTime": "28.196s",
"words": [
{
"startTime": "18.415s",
"endTime": "18.625s",
"word": "Vont "
},
{
"startTime": "18.625s",
"endTime": "18.924s",
"word": "kvöld "
},
{
"startTime": "18.926s",
"endTime": "19.016s",
"word": "og "
},
...
]
},
...,
{
"startTime": "1457.024s",
"endTime": "1463.382s",
"words": [
{
"startTime": "1457.024s",
"endTime": "1457.114s",
"word": "af "
},
{
"startTime": "1457.114s",
"endTime": "1457.294s",
"word": "hverju "
},
...
]
}
]
}
Example response:
{
"name": "transcripts/8657e641-...",
"metadata": {
"fileType": "AUDIO",
"languageCode": "is-IS",
"dictation": true,
"dataSource": "DATA_SOURCE_UNSPECIFIED",
"subject": "Example subject",
"description": "",
"keywords": [
"examplekeyword"
],
"additionalMetadata": {},
"recordingDuration": null,
"originalCharLength": 0,
"originalByteLength": 0,
"waveformUri": "",
"speakers": {}
},
"segments": [
{
"startTime": "18.415s",
"endTime": "28.196s",
"words": [
{
"startTime": "18.415s",
"endTime": "18.625s",
"word": "Vont "
},
{
"startTime": "18.625s",
"endTime": "18.924s",
"word": "kvöld "
},
{
"startTime": "18.926s",
"endTime": "19.016s",
"word": "og "
},
...
]
},
...,
{
"startTime": "1457.024s",
"endTime": "1463.382s",
"words": [
{
"startTime": "1457.024s",
"endTime": "1457.114s",
"word": "af "
},
{
"startTime": "1457.114s",
"endTime": "1457.294s",
"word": "hverju "
},
...
]
}
],
"uri": "",
"version": {
"name": "c77689ff-2ced-4fda-868c-3aab9a2b263a",
"parent": "54ac33c9-81b1-4bc8-af47-b790fd5c7224",
"creationTime": "2024-05-16T13:45:28.949372Z"
}
}
Uploading an audio file for a user created transcript
Generate an upload URL using:
POST /v1alpha1/initupload
Body fields (JSON)
Field | Description |
---|---|
resourceName | The name of the transcript (or other resource) for which to generate an upload URL |
Response fields (JSON)
Field | Description |
---|---|
gcsSignedUrl | Temporary URL that can be uploaded to using a PUT request. |
Examples (cURL)
Generate an upload URL:
curl -X POST -H "Content-Type: application/json" \
-H "Authorization: Bearer $TIRO_TOKEN" \
'https://ritari.talgreinir.is/v1alpha1/initupload' -d@-
{"resourceName": "transcripts/8657e641-..."}
which will generate a response containing an upload URL:
{
"gcsSignedUrl": "https://storage.googleapis.com/upload/storage/v1/b/talgreinir-is-transcript-assets/..."
}
The file can then be uploaded to this URL using:
curl -X PUT --data-binary @audio_file.wav \
"https://storage.googleapis.com/upload/storage/v1/b/talgreinir-is-transcript-assets/..."
Once an audio (or video) file has been uploaded for a transcript, the uri
field in the response when retrieving the transcript will contain a temporary
URL.
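The two-step upload can also be sketched in Python. The functions below only construct the two requests (one POST to initupload, one PUT to the signed URL) and do not send them. Their names are hypothetical, and, mirroring the cURL example, no Authorization header is attached to the PUT request.

```python
import json
import urllib.request

BASE_URL = "https://ritari.talgreinir.is"


def init_upload_request(token, resource_name):
    """Build the POST /v1alpha1/initupload request for a resource name."""
    body = json.dumps({"resourceName": resource_name}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/v1alpha1/initupload",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def upload_request(signed_url, media_bytes):
    """Build the PUT request that uploads the media to the signed URL.

    signed_url is the gcsSignedUrl value from the initupload response.
    """
    return urllib.request.Request(signed_url, data=media_bytes, method="PUT")
```

Each request could then be sent with urllib.request.urlopen, reading gcsSignedUrl from the first response before building the second request.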