Transcription and translation transforms
VOD transcription and VOD translation transforms are created through the transform endpoint of the MK.IO API. Both use the #MediaKind.AIPipelinePreset value for the @odata.type attribute.
VOD transcription and VOD translation are only available for MP4 content.
VOD transcription
Configuration parameters
The following table lists the configuration parameters for VOD transcription:
| Parameter | Description |
|---|---|
| @odata.type | The following value must be used: #MediaKind.AIPipelinePreset |
| pipeline name | Predefined_ACSVodTranscription |
| language | Language spoken in the audio to transcribe |
| phrases | Words or phrases expected in the audio. This improves recognition by making these terms more likely to be picked up |
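The parameters in the table map onto the JSON body sent to the transform endpoint. As a rough sketch, the Python helper below assembles that body using the same layout as the transform example in this section. The function name `build_transcription_transform` is hypothetical and not part of the MK.IO API or any SDK, and treating `phrases` as optional is an assumption that this page does not confirm.

```python
import json


def build_transcription_transform(language, phrases=None, description=""):
    """Assemble a request body for a VOD transcription transform.

    Hypothetical helper: it mirrors the JSON layout of the transform
    example in this section and is not part of the MK.IO API or an SDK.
    """
    arguments = [{"name": "language", "value": language}]
    if phrases:
        # Assumption: the phrase list may be left out when it is empty.
        arguments.append({"name": "phrases", "value": list(phrases)})
    return {
        "properties": {
            "description": description,
            "outputs": [
                {
                    "preset": {
                        "@odata.type": "#MediaKind.AIPipelinePreset",
                        "pipeline": {
                            "name": "Predefined_ACSVodTranscription",
                            "arguments": {"VodTranscription": arguments},
                        },
                    }
                }
            ],
        }
    }


if __name__ == "__main__":
    body = build_transcription_transform(
        "en-US",
        phrases=["Monstera deliciosa", "Ficus lyrata"],
        description="Transcription en-US",
    )
    # Print the JSON that could be passed to curl's --data option.
    print(json.dumps(body, indent=2))
```

Building the body from a dictionary and serializing it with `json.dumps` avoids hand-editing nested JSON and handles quoting of phrases automatically.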
Transform example
The example below shows how to configure a Transform that transcribes the audio track using AI models. It specifies en-US as the transcription language and uses a custom phrase list to improve recognition accuracy for domain-specific terms.
Once the transform is in place, it can be used to create a job on a given VOD asset.
curl --request PUT \
  --url 'https://api.mk.io/api/v1/projects/<project_name>/media/transforms/<transform_name>' \
  --header 'accept: application/json' \
  --header 'content-type: application/json' \
  --header 'Authorization: Bearer <bearer_token>' \
  --data '
{
  "properties": {
    "description": "Transcription en-US",
    "outputs": [
      {
        "preset": {
          "@odata.type": "#MediaKind.AIPipelinePreset",
          "pipeline": {
            "name": "Predefined_ACSVodTranscription",
            "arguments": {
              "VodTranscription": [
                {
                  "name": "language",
                  "value": "en-US"
                },
                {
                  "name": "phrases",
                  "value": [
                    "Cyperus papyrus",
                    "Heliotropium indicum",
                    "Zamioculcas zamiifolia",
                    "Monstera deliciosa",
                    "Alocasia odora",
                    "Tillandsia cyanea",
                    "Drosera capensis",
                    "Euphorbia tirucalli",
                    "Ficus lyrata",
                    "Calathea orbifolia"
                  ]
                }
              ]
            }
          }
        }
      }
    ]
  }
}
'
VOD translation
Configuration parameters
The following table lists the configuration parameters for VOD translation:
| Parameter | Description |
|---|---|
| @odata.type | The following value must be used: #MediaKind.AIPipelinePreset |
| pipeline name | Predefined_ACSVodTranslation |
| language | Language spoken in the audio to transcribe |
| targetLanguages | Languages into which the transcription is translated |
| phrases | Words or phrases expected in the audio. This improves recognition by making these terms more likely to be picked up |
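As with transcription, these parameters map onto the JSON body of the transform request. The sketch below assembles a translation body in Python, following the layout of the transform example in this section, including its use of the `VodTranscription` argument group. The function name `build_translation_transform` is hypothetical, and both the optional `phrases` list and the check for at least one target language are assumptions rather than documented API behavior.

```python
import json


def build_translation_transform(language, target_languages, phrases=None, description=""):
    """Assemble a request body for a VOD translation transform.

    Hypothetical helper: it mirrors the JSON layout of the transform
    example in this section and is not part of the MK.IO API or an SDK.
    """
    if isinstance(target_languages, str):
        # Accept a single language tag as well as a list of tags.
        target_languages = [target_languages]
    targets = list(target_languages)
    if not targets:
        # Assumption: translation needs at least one target language.
        raise ValueError("target_languages must not be empty")
    arguments = [
        {"name": "language", "value": language},
        {"name": "targetLanguages", "value": targets},
    ]
    if phrases:
        # Assumption: the phrase list may be left out when it is empty.
        arguments.append({"name": "phrases", "value": list(phrases)})
    return {
        "properties": {
            "description": description,
            "outputs": [
                {
                    "preset": {
                        "@odata.type": "#MediaKind.AIPipelinePreset",
                        "pipeline": {
                            "name": "Predefined_ACSVodTranslation",
                            # Key name taken from the example in this section.
                            "arguments": {"VodTranscription": arguments},
                        },
                    }
                }
            ],
        }
    }


if __name__ == "__main__":
    body = build_translation_transform("en-US", ["pt-PT", "fr-FR", "es-ES"])
    # Print the JSON that could be passed to curl's --data option.
    print(json.dumps(body, indent=2))
```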
Transform example
The example below shows how to configure a Transform that uses AI models to transcribe and translate the audio track. It transcribes the audio in en-US and translates the output into pt-PT, fr-FR, and es-ES. A custom phrase list is also included to improve recognition accuracy for domain-specific terms.
Once the transform is in place, it can be used to create a job on a given VOD asset.
curl --request PUT \
  --url 'https://api.mk.io/api/v1/projects/<project_name>/media/transforms/<transform_name>' \
  --header 'accept: application/json' \
  --header 'content-type: application/json' \
  --header 'Authorization: Bearer <bearer_token>' \
  --data '
{
  "properties": {
    "description": "Transcription en-US, translation fr-FR pt-PT es-ES",
    "outputs": [
      {
        "preset": {
          "@odata.type": "#MediaKind.AIPipelinePreset",
          "pipeline": {
            "name": "Predefined_ACSVodTranslation",
            "arguments": {
              "VodTranscription": [
                {
                  "name": "language",
                  "value": "en-US"
                },
                {
                  "name": "targetLanguages",
                  "value": [
                    "pt-PT",
                    "fr-FR",
                    "es-ES"
                  ]
                },
                {
                  "name": "phrases",
                  "value": [
                    "Cyperus papyrus",
                    "Heliotropium indicum",
                    "Zamioculcas zamiifolia",
                    "Monstera deliciosa",
                    "Alocasia odora",
                    "Tillandsia cyanea",
                    "Drosera capensis",
                    "Euphorbia tirucalli",
                    "Ficus lyrata",
                    "Calathea orbifolia"
                  ]
                }
              ]
            }
          }
        }
      }
    ]
  }
}
'
Track insertion
Once the VOD transcription or VOD translation job is complete, track-insertion transforms can be used to insert the generated VTT files into a previously encoded asset used as the output.