Multi-Language VOD Dubbing Guide

This guide walks you through using MK.IO's AI dubbing pipeline to automatically generate Spanish, German, and French audio tracks from English-language video content.

After following this guide, a single English-language video becomes a multi-language streaming asset containing:

  • Original English video and audio
  • 3 dubbed audio tracks (Spanish, German, French)
  • All audio tracks available for playback selection

Requirements and Limitations

VOD Dubbing supports MP4 content only. For a complete list of supported languages and specifications, see the AI workflows documentation.

Prerequisites

To complete this guide, you need:

  • ✅ Active MK.IO project access
  • ✅ API token (available in Organization Settings → API Tokens)
  • ✅ Postman or equivalent API client
  • ✅ English-language MP4 video (2–10 minutes recommended for testing)
  • ✅ Azure Storage account connected to MK.IO

Starting point: If you already have encoded MP4 files in an asset, skip Steps 1.1 and 1.2, create the output asset in Step 1.3, and then continue with Step 2. Otherwise, start with Step 1.1.


Step 1: Upload and Encode Source Video

1.1 Create an Asset

  1. In the MK.IO dashboard, navigate to Assets → + Add Asset
  2. Select your storage location
  3. Enter the following details:
    • Asset name: english-source-video
    • Container: videos
    • Storage account: (Select your Azure Storage account)
  4. Upload your MP4 file. For this example, use a file named english-video-demo.mp4
  5. Click Upload and wait for completion

1.2 Create an Encoding Transform and Job

  1. Navigate to Video Processing → Transforms → + Create Transform
  2. Configure the transform:
    • Name: encode-streaming
    • Type: Encoding
    • Preset: H.264 Multiple Bitrate 1080p
  3. Click Create
  4. Navigate to Video Processing → Jobs → + Create Job
  5. Configure the job:
    • Name: encode-english-source
    • Transform: encode-streaming
    • Input asset name: Select english-source-video → english-video-demo.mp4
    • Output asset name: english-encoded
  6. Click Create and monitor the job status
  7. Wait for the job status to display Finished

Why encode first? Encoding generates the .ism manifest files required for track insertion operations.

1.3 Create an Asset for Dubbed Tracks

Create an asset to hold the dubbed audio files generated in Step 2:

PUT
https://api.mk.io/api/v1/projects/<PROJECT_NAME>/media/assets/<ASSET_NAME>
Path Parameters:
  • <PROJECT_NAME>: Your unique project identifier
  • <ASSET_NAME>: Asset name (use: dubbed-audio)
Request Body:
{
  "properties": {
    "description": "Dubbed audio output",
    "storageAccountName": "stmkiouk"
  }
}

For this guide, set <ASSET_NAME> to dubbed-audio, and replace the example value stmkiouk with the name of your own connected storage account.
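
The request above can also be scripted. The following is a minimal Python sketch using only the standard library. It builds the request without sending it. The Authorization: Bearer header and the placeholder values my-project and my-storage-account are assumptions for illustration, so check the API reference for the authentication scheme your project uses.

```python
import json
import urllib.request

API_BASE = "https://api.mk.io/api/v1/projects"


def build_asset_request(project, asset_name, storage_account, token):
    """Build (but do not send) the PUT request that creates an asset."""
    url = f"{API_BASE}/{project}/media/assets/{asset_name}"
    body = {
        "properties": {
            "description": "Dubbed audio output",
            "storageAccountName": storage_account,
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            # Assumption: the API token is passed as a bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


# Placeholder project, storage account, and token values.
request = build_asset_request(
    "my-project", "dubbed-audio", "my-storage-account", "<API_TOKEN>"
)
print(request.get_method(), request.full_url)
# To send the request: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before anything reaches the API.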


Step 2: Create Multi-Language Dubs

2.1 Create a Dubbing Transform

A dubbing transform defines the source language and target languages for the AI dubbing pipeline.

Parameters:
  • @odata.type: Must be set to: #MediaKind.AIPipelinePreset
  • pipeline name: Predefined_ACSVodSpeechToSpeech
  • language: Source language code (e.g., en-US)
  • targetLanguages: Array of target language codes for dubbing
  • speakerCount: Number of speakers in the source audio (auto for automatic detection)
  • personalVoice: true to preserve the original speaker's voice characteristics

PUT
https://api.mk.io/api/v1/projects/<PROJECT_NAME>/media/transforms/<TRANSFORM_NAME>
Path Parameters:
  • <PROJECT_NAME>: Your unique project identifier
  • <TRANSFORM_NAME>: Transform name (e.g., dubbing-transform)
Request Body:
{
  "properties": {
    "description": "AI dub: English to Spanish, German, French",
    "outputs": [
      {
        "preset": {
          "@odata.type": "#MediaKind.AIPipelinePreset",
          "pipeline": {
            "name": "Predefined_ACSVodSpeechToSpeech",
            "arguments": {
              "VodSpeechToSpeechTranslation": [
                {
                  "name": "language",
                  "value": "en-US"
                },
                {
                  "name": "targetLanguages",
                  "value": [
                    "es-ES",
                    "de-DE",
                    "fr-FR"
                  ]
                },
                {
                  "name": "speakerCount",
                  "value": "auto"
                },
                {
                  "name": "personalVoice",
                  "value": false
                }
              ]
            }
          }
        }
      }
    ]
  }
}

Important notes:

  • Source language: The language argument ("value": "en-US") specifies the language spoken in the original video
  • Target languages: The targetLanguages array includes all three output languages. A single dubbing job generates all three language dubs simultaneously
  • Personal voice: Setting personalVoice to false disables voice replication, allowing the AI model to create a new synthetic voice across all dubbed languages
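
If you need different language combinations, the transform body can be generated instead of edited by hand. The sketch below reproduces the request body shown above from a source language and a list of target languages. The function name build_dubbing_transform is hypothetical, and the generated description text is illustrative only.

```python
import json


def build_dubbing_transform(source_language, target_languages, personal_voice=False):
    """Return the transform request body for the speech-to-speech dubbing pipeline."""
    arguments = [
        {"name": "language", "value": source_language},
        {"name": "targetLanguages", "value": list(target_languages)},
        {"name": "speakerCount", "value": "auto"},
        {"name": "personalVoice", "value": personal_voice},
    ]
    return {
        "properties": {
            "description": f"AI dub: {source_language} to {', '.join(target_languages)}",
            "outputs": [
                {
                    "preset": {
                        "@odata.type": "#MediaKind.AIPipelinePreset",
                        "pipeline": {
                            "name": "Predefined_ACSVodSpeechToSpeech",
                            "arguments": {"VodSpeechToSpeechTranslation": arguments},
                        },
                    }
                }
            ],
        }
    }


body = build_dubbing_transform("en-US", ["es-ES", "de-DE", "fr-FR"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into Postman as the request body for the PUT call above.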

2.2 Create a Dubbing Job

Create a job to execute the dubbing transform:

PUT
https://api.mk.io/api/v1/projects/<YOUR_PROJECT_NAME>/media/transforms/<TRANSFORM_NAME>/jobs/<JOB_NAME>
Path Parameters:
  • <YOUR_PROJECT_NAME>: Your unique project identifier
  • <TRANSFORM_NAME>: The dubbing transform created in Step 2.1 (e.g., dubbing-transform)
  • <JOB_NAME>: Unique identifier for this dubbing job
Request Body:
{
  "properties": {
    "description": "Generate Spanish, German, French dubs",
    "priority": "Normal",
    "input": {
      "files": [
        "english-video-demo_320x180_400k.mp4"
      ],
      "@odata.type": "#Microsoft.Media.JobInputAsset",
      "assetName": "english-encoded"
    },
    "outputs": [
      {
        "@odata.type": "#Microsoft.Media.JobOutputAsset",
        "assetName": "dubbed-audio"
      }
    ]
  }
}

Configuration details:

  • Input file: Specify one encoded bitrate variant from the source video (e.g., english-video-demo_320x180_400k.mp4). All encoded variants contain the audio track necessary for dubbing
  • Input asset: english-encoded is the asset containing encoded source files
  • Output asset: dubbed-audio is the asset created in Step 1.3, where dubbed audio files will be stored

Monitor progress: In the MK.IO dashboard, navigate to Video Processing → Jobs and wait for the job status to change to Finished.
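
When automating, the same wait can be expressed as a polling loop. The following is a minimal sketch, not a documented client: it takes any callable that returns the current job state, so the way the state is fetched is left open. Only the Finished state appears in this guide; the Error and Canceled states are assumptions, so confirm the actual values in the API reference. In real use, fetch_state would query the job through the API rather than read from a list.

```python
import time


def wait_for_job(fetch_state, poll_seconds=10.0, max_polls=360):
    """Poll until the job reaches a terminal state and return that state.

    fetch_state is any zero-argument callable returning the current state string.
    """
    # Assumption: "Error" and "Canceled" are terminal states alongside "Finished".
    terminal_states = {"Finished", "Error", "Canceled"}
    for _ in range(max_polls):
        state = fetch_state()
        if state in terminal_states:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError("job did not reach a terminal state in time")


# Stand-in for a real status call: a fixed sequence of states.
states = iter(["Queued", "Processing", "Finished"])
final_state = wait_for_job(lambda: next(states), poll_seconds=0)
print(final_state)  # prints "Finished"
```

Passing the fetch function in as a parameter keeps the loop testable without network access.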

Output files: Upon completion, the dubbed-audio asset contains three MP4 files:

  • english-video-demo_320x180_400k.mp4_es-ES.mp4 (Spanish)
  • english-video-demo_320x180_400k.mp4_de-DE.mp4 (German)
  • english-video-demo_320x180_400k.mp4_fr-FR.mp4 (French)
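
The file names above follow a visible pattern: the input file name, an underscore, the target language code, and an .mp4 extension. A small helper can derive them for use in Step 3. This is a sketch based only on the three names listed above; the helper name dubbed_file_names is hypothetical.

```python
def dubbed_file_names(input_file, target_languages):
    """Map each target language code to its expected dubbed output file name."""
    return {language: f"{input_file}_{language}.mp4" for language in target_languages}


names = dubbed_file_names(
    "english-video-demo_320x180_400k.mp4", ["es-ES", "de-DE", "fr-FR"]
)
for language, file_name in names.items():
    print(language, file_name)
```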

Step 3: Insert Audio Tracks into Final Asset

This step adds the dubbed audio tracks to your encoded video asset, making all languages available to viewers.

3.1 Create Track Insertion Transforms

Create three separate transforms, one for each dubbed language. Begin with the Spanish insertion transform:

PUT
https://api.mk.io/api/v1/projects/<YOUR_PROJECT_NAME>/media/transforms/<TRANSFORM_NAME>
Path Parameters:
  • <YOUR_PROJECT_NAME>: Your unique project identifier
  • <TRANSFORM_NAME>: Transform name (e.g., spanish-insert)
Request Body:
{
  "properties": {
    "description": "Insert Spanish audio",
    "outputs": [
      {
        "preset": {
          "tracks": [
            {
              "@odata.type": "#MediaKind.AudioTrack",
              "trackName": "audio-spanish",
              "displayName": "Español (AI Dubbed)",
              "languageCode": "es-ES"
            }
          ],
          "@odata.type": "#MediaKind.TrackInserterPreset"
        },
        "relativePriority": "Normal"
      }
    ]
  }
}

Repeat for German and French:

  • German transform: Update trackName to audio-german, displayName to Deutsch (AI Dubbed), and languageCode to de-DE
  • French transform: Update trackName to audio-french, displayName to Français (AI Dubbed), and languageCode to fr-FR
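
Because the three transforms differ only in three fields, their request bodies can be generated from a small table. The sketch below assumes the transform names german-insert and french-insert, which mirror the spanish-insert example but do not appear in this guide, and generates descriptions in the same style as "Insert Spanish audio".

```python
# Transform name -> (language, trackName, displayName, languageCode).
# german-insert and french-insert are hypothetical names modeled on spanish-insert.
TRACKS = {
    "spanish-insert": ("Spanish", "audio-spanish", "Español (AI Dubbed)", "es-ES"),
    "german-insert": ("German", "audio-german", "Deutsch (AI Dubbed)", "de-DE"),
    "french-insert": ("French", "audio-french", "Français (AI Dubbed)", "fr-FR"),
}


def build_track_insert_transform(language, track_name, display_name, language_code):
    """Return the request body for one track insertion transform."""
    return {
        "properties": {
            "description": f"Insert {language} audio",
            "outputs": [
                {
                    "preset": {
                        "tracks": [
                            {
                                "@odata.type": "#MediaKind.AudioTrack",
                                "trackName": track_name,
                                "displayName": display_name,
                                "languageCode": language_code,
                            }
                        ],
                        "@odata.type": "#MediaKind.TrackInserterPreset",
                    },
                    "relativePriority": "Normal",
                }
            ],
        }
    }


transforms = {
    name: build_track_insert_transform(*fields) for name, fields in TRACKS.items()
}
for name, body in transforms.items():
    track = body["properties"]["outputs"][0]["preset"]["tracks"][0]
    print(name, track["trackName"], track["languageCode"])
```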

3.2 Create Track Insertion Jobs

Create one job for each track insertion transform. Each job inserts the corresponding dubbed audio track into the english-encoded asset:

PUT
https://api.mk.io/api/v1/projects/<YOUR_PROJECT_NAME>/media/transforms/<TRANSFORM_NAME>/jobs/<JOB_NAME>
Path Parameters:
  • <YOUR_PROJECT_NAME>: Your unique project identifier
  • <TRANSFORM_NAME>: The track insertion transform (e.g., spanish-insert)
  • <JOB_NAME>: Unique job identifier (e.g., job-insert-spanish)
Request Body:
{
  "properties": {
    "input": {
      "files": [
        "english-video-demo_320x180_400k.mp4_es-ES.mp4"
      ],
      "@odata.type": "#Microsoft.Media.JobInputAsset",
      "assetName": "dubbed-audio"
    },
    "outputs": [
      {
        "@odata.type": "#Microsoft.Media.JobOutputAsset",
        "assetName": "english-encoded"
      }
    ],
    "priority": "Normal"
  }
}

Configuration details:

  • Input file: The dubbed audio file from Step 2 (e.g., english-video-demo_320x180_400k.mp4_es-ES.mp4 for Spanish)
  • Input asset: dubbed-audio contains the dubbed audio files
  • Output asset: english-encoded is the asset where the track will be inserted

Repeat for German and French:

Update the transform name in the URL, the input file, and the job name for each language:

  • German: Input file english-video-demo_320x180_400k.mp4_de-DE.mp4, job name job-insert-german
  • French: Input file english-video-demo_320x180_400k.mp4_fr-FR.mp4, job name job-insert-french

Monitor progress: Wait for all three jobs to complete (status: Finished).
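
The three insertion jobs can be generated the same way. The sketch below builds the URL and request body for each job without sending anything. The project name my-project is a placeholder, and the transform names german-insert and french-insert are assumptions that follow the spanish-insert example.

```python
API_BASE = "https://api.mk.io/api/v1/projects"
SOURCE_FILE = "english-video-demo_320x180_400k.mp4"

# Language name used in transform and job names -> language code.
LANGUAGES = {"spanish": "es-ES", "german": "de-DE", "french": "fr-FR"}


def build_insert_job(project, language_name, language_code):
    """Return (url, body) for the job that inserts one dubbed audio track."""
    transform_name = f"{language_name}-insert"  # assumed naming pattern
    job_name = f"job-insert-{language_name}"
    url = f"{API_BASE}/{project}/media/transforms/{transform_name}/jobs/{job_name}"
    body = {
        "properties": {
            "input": {
                "files": [f"{SOURCE_FILE}_{language_code}.mp4"],
                "@odata.type": "#Microsoft.Media.JobInputAsset",
                "assetName": "dubbed-audio",
            },
            "outputs": [
                {
                    "@odata.type": "#Microsoft.Media.JobOutputAsset",
                    "assetName": "english-encoded",
                }
            ],
            "priority": "Normal",
        }
    }
    return url, body


jobs = [build_insert_job("my-project", name, code) for name, code in LANGUAGES.items()]
for url, body in jobs:
    print("PUT", url, body["properties"]["input"]["files"][0])
```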

Verification: Navigate to Assets → english-encoded and view the Tracks section. You should now see three audio tracks representing the Spanish, German, and French dubbed content.


Step 4: Configure Streaming

4.1 Create a Streaming Endpoint

  1. In the MK.IO dashboard, navigate to Streaming Endpoints → + Create Streaming Endpoint
  2. Configure the endpoint:
    • Name: production
    • Base URL: content
    • Type: Dedicated
  3. Click Create and then Start

4.2 Create a Streaming Locator

  1. Navigate to Assets and select english-encoded
  2. Assign the endpoint:
    • Select the production endpoint created in Step 4.1
  3. Add a Streaming Locator:
    • Name: live
    • Policy: Predefined_DownloadAndClearStreaming
  4. Copy the playback URLs provided

4.3 Test Multi-Language Playback

  1. Click the embedded player in the asset details
  2. Verify that audio and subtitle track switching functions correctly
  3. Confirm that Spanish, German, and French audio tracks are selectable

Explore more

Now get started automating this process!