
Packaging Phase

This article discusses the packaging phase of encoding. You can find an overview of the entire encoding process in the Job Processing article.

The encoded video, audio, and subtitles are packaged as MPEG DASH, HLS, MPEG2TS, or CMAF, depending on the settings in the job description’s ContentProcessing section.

Adaptive Streaming Format

The adaptive streaming format to be used is selected with the element ContentProcessing/OutputFormat:

{
    "ContentProcessing" : {
        "OutputFormat" : ["Dash", "Hls"],
        ...
    }
}

The supported values are:

Output Format   Description
Dash            MPEG DASH
DashOnDemand    MPEG DASH, On-Demand Profile
Hls             HTTP Live Streaming (HLS)
Cmaf            Common Media Application Format (CMAF) - the recommended option

The values "Dash" and "Hls" can be combined. If both are specified, the video will be packaged twice and both projects will be stored in their respective subfolders in the output folder. If only one value is specified, the project will be stored directly in the output folder. The default naming of the files can be changed using the ContentProcessing/Naming element (see Custom Naming).

DASH

The output folder will contain a DASH manifest (manifest.mpd) and a number of chunk files (.m4s) for each of the streams (video, audio, subtitles). DASH uses the fragmented mp4 (fMP4) format.
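
To illustrate how the manifest ties the chunks together, the following is a hypothetical, heavily simplified MPD excerpt (values such as the codec string and durations are invented for illustration; the actual manifests generated by the service will differ):

<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static"
     profiles="urn:mpeg:dash:profile:isoff-live:2011">
  <Period>
    <AdaptationSet contentType="video" mimeType="video/mp4">
      <!-- $Number$ expands to 1, 2, 3, ... matching the chunk files below -->
      <SegmentTemplate media="video-H264-216-300k_$Number$.m4s"
                       initialization="video-H264-216-300k_init.mp4"
                       startNumber="1" duration="10" timescale="1"/>
      <Representation id="video-H264-216-300k" codecs="avc1.64001F"
                      bandwidth="300000" height="216"/>
    </AdaptationSet>
  </Period>
</MPD>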

Default structure of the output folder for a DASH project:
+
|--- manifest.mpd // (1)
|--- video-H264-216-300k_1.m4s // (2)
|--- video-H264-216-300k_2.m4s
|--- video-H264-216-300k_3.m4s
|--- video-H264-216-300k_4.m4s
|--- video-H264-216-300k_5.m4s
|--- video-H264-216-300k_6.m4s
|--- video-H264-216-300k_7.m4s
|--- video-H264-216-300k_8.m4s
|--- video-H264-216-300k_init.mp4
|--- video-H264-288-400k_1.m4s
|--- video-H264-288-400k_2.m4s
|--- video-H264-288-400k_3.m4s
|--- video-H264-288-400k_4.m4s
|--- video-H264-288-400k_5.m4s
|--- video-H264-288-400k_6.m4s
|--- video-H264-288-400k_7.m4s
|--- video-H264-288-400k_8.m4s
|--- video-H264-288-400k_init.mp4
|--- video-H264-360-800k_1.m4s
|--- video-H264-360-800k_2.m4s
|--- video-H264-360-800k_3.m4s
|--- video-H264-360-800k_4.m4s
|--- video-H264-360-800k_5.m4s
|--- video-H264-360-800k_6.m4s
|--- video-H264-360-800k_7.m4s
|--- video-H264-360-800k_8.m4s
|--- video-H264-360-800k_init.mp4
|--- audio-en_1.m4s // (3)
|--- audio-en_2.m4s
|--- audio-en_3.m4s
|--- audio-en_4.m4s
|--- audio-en_5.m4s
|--- audio-en_6.m4s
|--- audio-en_7.m4s
|--- audio-en_8.m4s
|--- audio-en_init.mp4
|--- subtitle-ts_1.m4s // (4)
|--- subtitle-ts_2.m4s
|--- subtitle-ts_3.m4s
|--- subtitle-ts_4.m4s
|--- subtitle-ts_5.m4s
|--- subtitle-ts_6.m4s
|--- subtitle-ts_7.m4s
|--- subtitle-ts_8.m4s
|--- subtitle-ts_init.mp4
  1. Manifest file

  2. Video chunk, lowest bitrate

  3. Audio chunk, English

  4. Subtitle chunk

HLS

Similarly, the output folder will contain an HLS manifest (manifest.m3u8) and a number of chunk files (.ts, .vtt) for each of the streams (video, audio, subtitles). HLS uses the transport stream (TS) format.
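
For illustration, a master playlist referencing the per-stream playlists shown in the folder structure below might look roughly like this sketch (attribute values are simplified and invented for illustration; actual manifests generated by the service may differ):

#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="English",LANGUAGE="en",URI="audio-en.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="Subtitles",URI="subtitle-ts.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=300000,AUDIO="audio",SUBTITLES="subs"
video-H264-216-300k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=400000,AUDIO="audio",SUBTITLES="subs"
video-H264-288-400k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=800000,AUDIO="audio",SUBTITLES="subs"
video-H264-360-800k.m3u8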

Default structure of the output folder for an HLS project:
+
|--- manifest.m3u8 // (1)
|--- video-H264-216-300k.m3u8
|--- video-H264-216-300k_1.ts // (2)
|--- video-H264-216-300k_2.ts
|--- video-H264-216-300k_3.ts
|--- video-H264-216-300k_4.ts
|--- video-H264-216-300k_5.ts
|--- video-H264-216-300k_6.ts
|--- video-H264-216-300k_7.ts
|--- video-H264-216-300k_8.ts
|--- video-H264-288-400k.m3u8
|--- video-H264-288-400k_1.ts
|--- video-H264-288-400k_2.ts
|--- video-H264-288-400k_3.ts
|--- video-H264-288-400k_4.ts
|--- video-H264-288-400k_5.ts
|--- video-H264-288-400k_6.ts
|--- video-H264-288-400k_7.ts
|--- video-H264-288-400k_8.ts
|--- video-H264-360-800k.m3u8
|--- video-H264-360-800k_1.ts
|--- video-H264-360-800k_2.ts
|--- video-H264-360-800k_3.ts
|--- video-H264-360-800k_4.ts
|--- video-H264-360-800k_5.ts
|--- video-H264-360-800k_6.ts
|--- video-H264-360-800k_7.ts
|--- video-H264-360-800k_8.ts
|--- audio-en.m3u8
|--- audio-en_1.ts // (3)
|--- audio-en_2.ts
|--- audio-en_3.ts
|--- audio-en_4.ts
|--- audio-en_5.ts
|--- audio-en_6.ts
|--- audio-en_7.ts
|--- audio-en_8.ts
|--- subtitle-ts.m3u8
|--- subtitle-ts_1.vtt // (4)
|--- subtitle-ts_2.vtt
|--- subtitle-ts_3.vtt
|--- subtitle-ts_4.vtt
|--- subtitle-ts_5.vtt
|--- subtitle-ts_6.vtt
|--- subtitle-ts_7.vtt
|--- subtitle-ts_8.vtt
  1. Manifest file

  2. Video chunk, lowest bitrate

  3. Audio chunk, English

  4. Subtitle chunk

CMAF

With CMAF, the video material is packaged once using fMP4, but manifests for both DASH and HLS are added, which makes the same chunks usable by players implementing either DASH or HLS. CMAF does not store every chunk in a separate file, but uses byte ranges instead, thus greatly reducing the number of files.
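
For illustration, an HLS media playlist for CMAF addresses byte ranges within a single fMP4 file instead of separate chunk files. The following is a hypothetical sketch (durations and byte offsets are invented; actual manifests will differ):

#EXTM3U
#EXT-X-VERSION:7
#EXT-X-TARGETDURATION:10
#EXT-X-MAP:URI="video-H264-240-300k-video-avc1.mp4",BYTERANGE="1024@0"
#EXTINF:10.0,
#EXT-X-BYTERANGE:524288@1024
video-H264-240-300k-video-avc1.mp4
#EXTINF:10.0,
#EXT-X-BYTERANGE:524288@525312
video-H264-240-300k-video-avc1.mp4
#EXT-X-ENDLIST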

Default structure of the output folder for a CMAF project:
+
|--- manifest.mpd
|--- manifest.m3u8
|--- cmaf +
|         |--- manifest.mpd
|         |--- manifest.m3u8
|         |--- video-H264-240-300k.m3u8
|         |--- video-H264-240-300k_iframes.m3u8
|         |--- video-H264-240-300k-video-avc1.mp4
|         |--- audio-bo.m3u8
|         |--- audio-bo-audio-bo-mp4a.mp4
|         |--- audio-de.m3u8
|         |--- audio-de-audio-de-mp4a.mp4
|         |--- audio-en.m3u8
|         |--- audio-en-audio-en-mp4a.mp4
|         |--- subtitle-ar.m3u8
|         |--- subtitle-ar.vtt
|         |--- subtitle-de.m3u8
|         |--- subtitle-de.vtt
|         |--- subtitle-en.m3u8
|         |--- subtitle-en.vtt
Note
The manifests inside and outside the cmaf folder differ only in the paths to the media files: the top-level manifests include the subfolder in the paths, while the inner manifests do not. Playback is therefore possible with either of the two manifests. The same applies to DashOnDemand.

DashOnDemand

This is a special version of the DASH output using the so-called "On-Demand" profile. (The "Dash" option uses the so-called "Live" profile. The name can be confusing, as the Live profile is commonly used in video-on-demand scenarios as well.) The On-Demand profile keeps all the chunks of a stream in a single file, i.e. there is one file per video bitrate, per audio language, and per subtitle language. The manifest file in the root contains the byte-range access definitions.
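
DashOnDemand is selected like any other output format:

{
    "ContentProcessing" : {
        "OutputFormat" : ["DashOnDemand"],
        ...
    }
}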

Default structure of the output folder for a DashOnDemand project:
+
|--- manifest.mpd
|--- dash +
|         |--- manifest.mpd
|         |--- video-H264-240-300k.mp4
|         |--- audio-bo.mp4
|         |--- audio-de.mp4
|         |--- audio-en.mp4
|         |--- subtitle-ar.mp4
|         |--- subtitle-de.mp4
|         |--- subtitle-en.mp4

MPEG2TS

MPEG2TS, or MPEG-2 Transport Stream, is commonly used for broadcasting. This format encapsulates a single video track, along with multiple audio and subtitle tracks, within a single .ts file. While it does not offer built-in DRM support, it provides a straightforward way to package and transmit several streams together.

Important

MPEG2TS does not support built-in Digital Rights Management (DRM). If content protection is essential for your use case, consider using an alternative format or additional protective measures outside the stream itself.

The output folder for an MPEG2TS project contains a single .ts file that includes all video, audio, and subtitle tracks.

Default structure of the output folder for an MPEG2TS project:

|--- video-H264-240-320k.ts

Note

While MPEG2TS is a versatile format capable of containing numerous streams, it lacks the advanced adaptive streaming features of newer formats such as DASH and HLS. Nonetheless, it is a reliable choice for specific broadcasting scenarios or for platforms that specifically require the .ts format.

Bitrates

In adaptive streaming, several representations of a video are created at different quality levels. The quality levels are defined by frame resolutions: the higher the resolution, the higher the quality and hence the bitrate (the amount of data that needs to be transferred to the player per second). Every representation is segmented into small "chunks" (10 seconds each), and every chunk of every representation is stored in a separate file, e.g. video-H264-360-800k_3.m4s. At runtime, the player dynamically decides, for each point in time, from which representation (bitrate) to download the next chunk. This decision depends on the current CPU and network load, i.e. the player adapts to the conditions (hence "adaptive" streaming).
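
For illustration, the player-side bitrate selection (which happens in the client, not in the Encoding Service) boils down to logic along these lines. This is a minimal sketch; the safety factor and the throughput measurement are assumptions:

def select_representation(bitrates_kbps, throughput_kbps, safety_factor=0.8):
    """Pick the highest bitrate that fits into the measured throughput."""
    # Leave headroom so short throughput dips do not stall playback.
    budget = throughput_kbps * safety_factor
    affordable = [b for b in sorted(bitrates_kbps) if b <= budget]
    # Fall back to the lowest bitrate if even that exceeds the budget.
    return affordable[-1] if affordable else min(bitrates_kbps)

# With ~950 kbps measured, the 800k level exceeds the 80% budget,
# so the player fetches the next chunk from the 400k representation.
print(select_representation([300, 400, 800], 950))  # -> 400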

During encoding, the Encoding Service has to decide which quality levels to generate. If nothing is specified in the job description, it uses a pre-defined default set of quality levels (bitrates), derived from Axinom’s experience of building video delivery solutions. See Default Bitrates for more information.

This default can be overridden using the element ContentProcessing/VideoRepresentations:

{
    "ContentProcessing" : {
        ...
        "VideoRepresentations": [
            {
                "Height": 720,
                "BitrateInKbps": 1500
            },
            {
                "Height": 480,
                "BitrateInKbps": 800
            }
        ]
    }
}

It is enough to provide either Width or Height; the other value is calculated automatically to preserve the aspect ratio. For example, with a 16:9 source, specifying a Height of 720 results in a Width of 1280.

Note
In no case will the Encoding Service produce output with a higher resolution than the input material. Requested quality levels with a resolution higher than the source are ignored. If all the requested levels are higher than the source, a single representation at the highest possible resolution is created.

It is also possible to generate only the single best possible quality level with the option ContentProcessing/UseHighestPossibleBitrate:

{
    "ContentProcessing" : {
        ...
        "UseHighestPossibleBitrate": true
    }
}

The VideoRepresentations element is not necessary in this case, and the default set of bitrates is also not used.

Audio Bitrates

Our service also supports creating multiple audio representations with different bitrates and sound configurations. Please see the corresponding section for more information.

Archiving

A video representation consisting of many small chunks is very convenient for delivering the chunks to the player. However, if the created video files need to be copied elsewhere (e.g. to a disconnected environment, such as onboard an aircraft), the need to copy thousands of small files can slow down the copying process significantly. This effect is strongest for small files, such as audio chunks and especially the tiny subtitle chunks. To reduce the overhead of the later copy operation, the Encoding Service can combine all the chunks into one bigger TAR file (archiving). Several options are supported for archiving, selected with the element ContentProcessing/Archiving.

Tip
If you use the output formats "Cmaf" (recommended) or "DashOnDemand", the chunks are not stored as separate files in the first place, which eliminates the need for TAR archiving.

{
    "ContentProcessing" : {
        ...
        "Archiving": "Split",
        "ArchiveOutputName": "VideoArchive",
        "MaxArchiveSize": 200000,
        "ChecksumFileName": "Checksum"
        ...
    }
}

The following options for "Archiving" are supported: None, Tar, SingleTar, FlatTar, Split.

In a nutshell, when archiving is activated:

  • All files in the output folder are combined into one big archive (or multiple archive volumes for Split).

  • This file is named after the job ID - <jobID>.tar - unless "ArchiveOutputName" is specified (then it is <ArchiveOutputName>.tar).

  • An MD5 hash is calculated for the .tar file and stored in a file with the same name and the extension .md5.

  • Additionally, a text file checksum.md5 is created that lists all files in the archive together with their respective MD5 hashes; this file is added to the archive. With the element "ChecksumFileName", an alternative name for this file can be specified. A sketch of how a recipient might verify the archive follows this list.
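
For example, a recipient of the archive could verify it with a script along these lines. This is a minimal sketch; it assumes the .md5 file begins with the hexadecimal digest and that the default file names described above are used:

import hashlib
import sys

def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hash of a file without loading it into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()

archive = sys.argv[1]  # e.g. VideoArchive.tar
# Assumption: the companion file (e.g. VideoArchive.md5) starts with the digest.
expected = open(archive[:-4] + ".md5").read().split()[0]

print("Archive OK" if md5_of_file(archive) == expected else "Checksum mismatch")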

The difference between the TAR options is important if the output folder contains subfolders, which is the case if ["Dash", "Hls"] is requested as the OutputFormat: the subfolder "Dash" then contains the DASH representation, and the subfolder "Hls" contains the HLS representation.

Tar

Archiving is applied to each subfolder separately. As a result, each subfolder contains its own .tar archive.

Note
MD5 checksums are not implemented for the "Tar" option.
Sample folder structure for the Tar option
+
|--- Dash +
|         |--- VideoArchive.tar
|--- Hls  +
          |--- VideoArchive.tar

SingleTar

Archiving is applied to the output folder as a whole. As a result, a single .tar archive is created that contains the two subfolders and all their content.

Sample folder structure for the SingleTar option
+
|--- VideoArchive.tar
|--- VideoArchive.md5


VideoArchive.tar:
+
|--- Dash +
|         |--- file1.m4s
|         |--- ...
|--- Hls  +
          |--- file1.ts
          |--- ...

FlatTar

A single .tar archive is created containing all the files from all subfolders, as if they were in a single folder.

Sample folder structure for the FlatTar option
+
|--- VideoArchive.tar
|--- VideoArchive.md5


VideoArchive.tar:
+
|--- file1.m4s
|--- file1.ts
|--- ...

Split

With this option, the TAR archive is split into multiple volumes, and you can limit the size of each volume. This avoids the need to handle huge files (a video project can be several gigabytes in size), which can be as cumbersome as handling thousands of small files.

You can experiment to find an optimal volume size; Axinom has found values between 100 MB and 200 MB to work best.

The archive size is controlled by the MaxArchiveSize property of ContentProcessing. It expects a numeric value in bytes. The minimum value is 102400.
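
For example, to cap each volume at roughly 150 MB (157286400 bytes):

{
    "ContentProcessing" : {
        ...
        "Archiving": "Split",
        "MaxArchiveSize": 157286400
    }
}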

Revision History

The table below lists the document versions and any changes to them.

Version   Date              Description
1.0       April 19, 2021    Initial version.
1.1       October 26, 2021  Clarification on SplitTar and MaxArchiveSize
2.0       July 3, 2023      Added note about audio representations