The encoding phase is part of the overall encoding process. In this phase, both the video and the audio are encoded with a specific codec.

Encoding Phase

This article discusses the encoding phase of an encoding job. You can find an overview of the entire encoding process in the Job Processing article.

Supported Input

The Encoding Service supports a variety of video, audio, and subtitle formats. The table below lists the formats known to be supported. Many other formats are likely to be processed correctly as well. Let Axinom know if you have any issues with a specific format.

Video codecs

AVCHD, DNxHD, DVCPRO, DVCPROHD, H.264/AVC, H.265/HEVC, MPEG-1 Video, MPEG-2 Video, MPEG-4 Video, ProRes, Theora, VP6, VP8, VP9

Audio codecs

AAC, AC3, AIFF, FLAC, MP2, MP3, Ogg Vorbis, WAV, WMA


Subtitle formats

WebVTT, TTML, SubRip (SRT), PAC, iTT




You can modify the encoding settings in the Content Processing section of the encoding job request.

It is also possible to skip encoding using the packaging-only mode if the video is already provided in the desired format.


VideoFormat specifies the codec. Supported values are:

  • H264 - H.264 (AVC = Advanced Video Coding) (default, used if omitted)

  • H265 - H.265 (HEVC = High Efficiency Video Coding). It is supported by Safari, but note that entire streams or some representations (1080p or higher) might not be available with H.265 on some older Macs (for example, MacBook Pros made in 2014).

  • DoNotEncode - don’t encode the video at all, see packaging-only mode.
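As a rough illustration of how the setting could be handled on the client side, the Python sketch below builds a ContentProcessing fragment with a chosen VideoFormat. It assumes that VideoFormat sits directly under ContentProcessing, which this article implies but does not show. The helper name build_content_processing is hypothetical and not part of any Axinom API.

```python
# Values listed in this article; H264 is the documented default.
SUPPORTED_VIDEO_FORMATS = ("H264", "H265", "DoNotEncode")


def build_content_processing(video_format=None):
    """Return a ContentProcessing fragment (hypothetical helper).

    Assumption: VideoFormat is a direct child of ContentProcessing.
    """
    if video_format is None:
        # Leaving VideoFormat out relies on the documented H264 default.
        return {"ContentProcessing": {}}
    if video_format not in SUPPORTED_VIDEO_FORMATS:
        raise ValueError(f"Unsupported VideoFormat: {video_format!r}")
    return {"ContentProcessing": {"VideoFormat": video_format}}
```

Calling build_content_processing("H265") yields a fragment that requests H.265, while calling it without arguments leaves VideoFormat out so that the default applies.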

Supported Aspect Ratios

The Encoding Service by default supports a variety of Display Aspect Ratio (DAR) values for input video. The list includes:








When using the aspect ratios 4:3 or 16:9, you may omit "VideoRepresentations" in the job request. Axinom Encoding will then use the default bitrate ladder, which depends on the codec. For all other aspect ratios, you must provide "VideoRepresentations" in the job request.

If you need to use an aspect ratio that is not in the list above, you must explicitly set "ForceAspectRatioToStandard": "False" in the job request.
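The 4:3 and 16:9 rule can be checked before submitting a job. The Python sketch below is only an illustration: it assumes square pixels, so that the display aspect ratio equals width divided by height, and the function names are hypothetical. It does not model the full list of supported aspect ratios or the "ForceAspectRatioToStandard" setting.

```python
from fractions import Fraction

# Aspect ratios for which "VideoRepresentations" may be omitted.
DEFAULT_LADDER_RATIOS = {Fraction(4, 3), Fraction(16, 9)}


def display_aspect_ratio(width, height):
    """Reduce width:height to lowest terms (assumes square pixels)."""
    return Fraction(width, height)


def may_omit_video_representations(width, height):
    """Return True if the default bitrate ladder applies (4:3 or 16:9)."""
    return display_aspect_ratio(width, height) in DEFAULT_LADDER_RATIOS
```

For example, a 1920x1080 source reduces to 16:9, so the default ladder can be used, whereas a 2048x858 source does not reduce to either ratio and needs explicit "VideoRepresentations".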


Creating the video representations is quite straightforward: we find the source video track with the highest bitrate and encode it to the required dimensions and bitrates. Creating the audio representations is more involved. There can be multiple source audio tracks, some embedded in the video and some in separate audio files. The audio files may even have been provided by different studios. As a result, the source audio tracks can have slightly different audio levels, bitrates, or audio channel counts.

The main thing to keep in mind is that we do not upmix audio channels. This means that if you want a 5.1 representation of a language, the source files must include a 5.1 audio track for it. Read more about this below.


The audio codec is chosen automatically and will be:

  1. AAC for mono and stereo sound.

  2. AC3 for 5.1 sound.
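The automatic choice above amounts to a simple lookup. The Python sketch below restates it, using the Sound values "Mono", "Stereo", and "5.1" that appear in the AudioRepresentations example in this article. The function name is hypothetical and the code is illustrative only, not the service's implementation.

```python
# Mapping described above: AAC for mono and stereo, AC3 for 5.1.
AUTOMATIC_AUDIO_CODEC = {"Mono": "AAC", "Stereo": "AAC", "5.1": "AC3"}


def automatic_audio_codec(sound):
    """Return the codec chosen automatically for a Sound value."""
    try:
        return AUTOMATIC_AUDIO_CODEC[sound]
    except KeyError:
        raise ValueError(f"Unknown Sound value: {sound!r}") from None
```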

Audio Representations

Let’s look at an example AudioRepresentations array:

    "ContentProcessing": {
        "AudioRepresentations": [
            {
                "BitrateInKbps": 128,
                "Sound": "Mono",
                "AudioFormat": "AAC",
                "Create": "OnMatchingSound"
            },
            {
                "BitrateInKbps": 256,
                "Sound": "Stereo",
                "AudioFormat": "AAC",
                "Create": "Always"
            },
            {
                "BitrateInKbps": 384,
                "Sound": "5.1",
                "AudioFormat": "AC3",
                "Create": "Always"
            }
        ]
    }

Let’s look at what this particular configuration does (you can of course use your own configuration):

  1. Mono sound has "Create": "OnMatchingSound" which means that mono representations will be created only from mono audio tracks in the source files. All source audio tracks that have more (or fewer) than 1 audio channel will be ignored for these representations. Since mono source audio is not that common, mono representations are usually not created.

  2. Stereo sound has "Create": "Always" which means that stereo representations will be created from source audio tracks that have at least stereo audio. All source audio tracks that have fewer than 2 channels will be ignored for these representations. If there are both stereo and 5.1 tracks provided, we automatically pick the track with the closest-matching audio channel count for a given representation so that there is as little downmixing done as possible.

  3. 5.1 sound also has "Create": "Always" which means that 5.1 representations will similarly be created from source audio tracks that have at least 5.1 audio (e.g. also 7.1). All source audio tracks that have fewer than 6 channels will be ignored for these representations.

  4. You can specify the audio codec by setting AudioFormat to either AAC or AC3 in any of the audio representations. Setting AudioFormat is optional; if it is omitted, AAC is used by default.
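The selection rules in items 1 to 3 can be summarized in a short Python sketch. This is a simplified model under stated assumptions, not the service's actual implementation: it looks at the tracks of a single language, considers only channel counts, and uses the hypothetical names SourceTrack and pick_source_track.

```python
from dataclasses import dataclass
from typing import List, Optional

# Channel counts for the Sound values used in the example above.
CHANNELS = {"Mono": 1, "Stereo": 2, "5.1": 6}


@dataclass
class SourceTrack:
    name: str      # hypothetical label, for illustration only
    channels: int  # number of audio channels in the source track


def pick_source_track(sound: str, create: str,
                      tracks: List[SourceTrack]) -> Optional[SourceTrack]:
    """Pick the source track for one representation, or None to skip it."""
    wanted = CHANNELS[sound]
    if create == "OnMatchingSound":
        # Only tracks with exactly the wanted channel count qualify.
        candidates = [t for t in tracks if t.channels == wanted]
    elif create == "Always":
        # No upmixing: the track needs at least the wanted channel count.
        candidates = [t for t in tracks if t.channels >= wanted]
    else:
        raise ValueError(f"Unknown Create value: {create!r}")
    if not candidates:
        return None
    # The closest channel count means the least downmixing.
    return min(candidates, key=lambda t: t.channels)
```

With one stereo and one 5.1 source track, this model skips the mono representation, takes the stereo track for the stereo representation, and takes the 5.1 track for the 5.1 representation. If only a stereo track is available, the 5.1 representation is skipped because channels are never upmixed.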

Revision History

The table below lists the document versions and any changes to them.

Version Date Description


April 19, 2021

  • Initial version.


October 6, 2021


July 3, 2023