
Axinom Encoding Overview

Axinom Encoding is a video ingestion and processing service. It makes it easy to create video on demand (VoD) content from various source video formats. The service combines strong security and powerful transcoding capabilities, wrapped into a simple HTTP-based API.

The core features of the Encoding Service are:

  • Support for many source video, audio, and subtitles formats

  • Codecs: H.264 (AVC), H.265 (HEVC), AAC

  • Packaging: DASH, HLS, and CMAF for adaptive streaming

  • Ability to acquire and publish files from/to many different storage types, such as FTPS, Amazon S3 or Azure Blob Storage

  • DRM protection using Widevine, PlayReady, and FairPlay

  • Direct integration with Axinom DRM Key Service

The Encoding Service integrates seamlessly with Axinom DRM. However, thanks to its support for industry standards (such as CPIX), it can be used with any other DRM supplier as well.

Summary

Encoding Service is a managed service (SaaS). It is fully stateless and uses a pattern of jobs.

Figure 1. Axinom Encoding Service overview

A typical client interaction with the Encoding Service looks like this:

  • Client sets up an input storage and uploads the source video material there. The Encoding Service will take the video from the input storage.

  • Client sets up an output storage where the Encoding Service shall store the processed output.

  • Client sets up a message queue to which the Encoding Service shall send the job progress messages.

  • Client creates a job definition (JSON) which tells the Encoding Service what exactly to do with the video. The job definition also contains the pointers to the input/output storages and to the message queue.

  • Client submits the job definition to the Encoding API.

  • Client observes the messages arriving in the message queue until the job completes (success/error).

  • Client downloads the produced results from the output storage (unless the output storage is directly used as an origin for the video distribution).
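
The job definition mentioned in the steps above can be sketched from the client's side. This minimal Python sketch only assembles a job definition and serializes it to JSON; it does not call the service. The section names ContentAcquisition, MediaMappings, ContentProcessing, and ContentPublishing are the configuration sections named in this document, and the storage fields mirror the Storage Provider example. The MessagePublishers key, the placement of the storage fields directly inside a section, and the output UriPath are assumptions made for illustration; the authoritative schema is the description of the Encoding API.

```python
import json

# Storage Provider objects with the fields shown in the Storage Providers section.
input_storage = {
    "Provider": "Ftps",
    "UriPath": "ftpes://server.ftp.com/source/dir/",
    "CredentialsName": "user",
    "CredentialsSecret": "pass",
    "CredentialsProtection": "Unencrypted",
}
output_storage = {
    "Provider": "Ftps",
    "UriPath": "ftpes://server.ftp.com/output/dir/",  # hypothetical output folder
    "CredentialsName": "user",
    "CredentialsSecret": "pass",
    "CredentialsProtection": "Unencrypted",
}

# Skeleton job definition. The nesting is an assumption, not the official schema.
job_definition = {
    # Where the service takes the source material from.
    "ContentAcquisition": input_storage,
    # Which files hold video, audio, and subtitles; left empty in this sketch.
    "MediaMappings": {},
    # Encoding, DRM protection, and packaging settings; left empty in this sketch.
    "ContentProcessing": {},
    # Where the service stores the processed output.
    "ContentPublishing": output_storage,
    # Hypothetical key: pointers to the message queue for job progress messages.
    "MessagePublishers": [],
}

job_json = json.dumps(job_definition, indent=2)  # body to submit to the Encoding API
print(job_json)
```

The resulting JSON string is what the client would submit to the Encoding API in the next step.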

See also: description of the Encoding API.

Job Processing Phases Overview

An encoding job is a long-running process (it can take minutes or even hours).

When a job is submitted, the service validates the request body (JSON). It also extracts, decrypts, and validates the supplied Content Keys for encryption, or, in the DRM Managed Mode, acquires the necessary keys from the Key Service. It then returns a response to the client containing the assigned JobID and the ID(s) of the Content Keys that will be used for encryption.

[Sequence diagram. Participants: Client, Input Storage, Encoding API, Encoding Service, Key Service, Message Publisher. Preparation: upload content; set up and listen. Job Initiation: POST /Job; validate request; acquire content keys; content keys; job ID, key IDs; start job execution; job progress events (periodically); Success or FinalError event. Job Cancellation: POST /Job/<job_id>/cancel; job status. Job Status: GET /Reporting/<job_id>; job status.]
Figure 2. Request Processing

The actual job execution happens asynchronously. The job execution involves several distinct phases:

Pre-Validation → Acquisition → Media Mapping → Encoding → DRM Protection → Packaging → Image Extraction → Publishing
Figure 3. Job processing phases
Note
The phases shown above are conceptual. In reality, some of the processing steps happen simultaneously, e.g. encryption and packaging.

The table below briefly describes each processing phase, including the configuration sections and events for each phase.

| Phase | Description | Configuration section(s) | Events |
| --- | --- | --- | --- |
| Pre-Validation | Validating the submitted job description and the credentials supplied for the external storages. | | JobCreated |
| Acquisition | Downloading the content from the input storage to a temporary storage used by the subsequent operations. | ContentAcquisition | AcquisitionProgress, ContentAcquired |
| Media Mapping | The input storage is a folder and can contain multiple files. Based on the supplied settings, the Encoding Service decides which file contains the video stream and which files contain the audio and subtitles streams in which language. | MediaMappings | ContentMapped, ContentPreProcessed |
| Encoding | Encoding the video and audio according to the supplied settings. | ContentProcessing | VideoEncodingStarted, EncodingProgress, EncodingFinished |
| DRM Protection | If required, encrypting the content according to the supplied settings. | ContentProcessing | |
| Packaging | Packaging the encoded video, audio, and subtitles as DASH, HLS, or CMAF, depending on the settings, and, optionally, archiving the output into one or more TAR files. | ContentProcessing | |
| Image Extraction | Encoding Service can extract frames from the video stream at specified time indexes and store them as JPEG images in a specified location. External systems (like Content Management) can use the generated images. Encoding Service can also generate thumbnails (used as preview images) at regular intervals and include them as a part of the DASH representation in accordance with the DASH IF IOP 4.3, section 6.2.6 "Tiles of thumbnail images". | ImageExtraction, Thumbnails | ImagesExtracted |
| Publishing | Publishing the produced output to the specified output storage. | ContentPublishing | ContentPublished, JobSuccess, FinalError |

Once the Publishing phase finishes, the Encoding Service doesn’t keep any data related to the job (except the log files).

Encoding Job Phases

Pre-Validation

To enable faster feedback when wrong credentials are supplied ("fail fast"), the Encoding Service tries to connect to the specified input storage and output storage before it does any further processing. For the output storage, it also uploads a small dummy file to ensure it has the write privileges. The job fails immediately if access doesn’t work.

Note
The Location specified for the Image Extraction is not validated here. The job will also not fail if Image Extraction is configured, but the specified location is not accessible. The extracted images will just not be uploaded.

Acquisition

In the Acquisition phase, the content is downloaded to a temporary storage from the input storage specified by the Storage Provider in the ContentAcquisition section of the job description. The specified credentials shall allow reading the content. If the option "DeleteFilesFromSourceWhenDone" is used, the write/delete privilege is also required.

[Sequence diagram. Participants: Encoding Service, Input Storage, Message Publisher. Messages: download content; loop: content, AcquisitionProgress; ContentAcquired.]
Figure 4. Interactions between the Encoding Service and other systems during the Acquisition Phase

See also: ContentAcquisition section, AcquisitionProgress event, ContentAcquired event.

Media Mapping

To decide which input files represent which streams, the Encoding Service uses the settings in the MediaMappings section. Three layers of settings enable progressively narrower filtering of the input files:

  • General regular expressions that match the files containing video, audio, subtitles, and captions

  • Exact mapping of the specific files to their content type and language

  • Filter for the list of accepted languages.
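
The three layers can be pictured as successive filters. The sketch below is a conceptual Python illustration, not the service's implementation: the file-naming pattern, the function name map_files, its parameters, and the rule that an exact mapping takes precedence over the general pattern are all assumptions. The real setting names are documented in the MediaMappings section.

```python
import re

# Layer 1 (hypothetical pattern): <name>-<type>[-<language>].<extension>
GENERAL_PATTERN = re.compile(
    r".*-(?P<type>video|audio|subtitle|caption)(-(?P<lang>[a-z]{2,3}))?\.\w+$"
)


def map_files(files, exact=None, accepted_languages=None):
    """Return {file name: (content type, language)} after all three layers."""
    exact = exact or {}
    mapped = {}
    for name in files:
        if name in exact:
            # Layer 2: an exact mapping wins over the general pattern (assumption).
            mapped[name] = exact[name]
            continue
        match = GENERAL_PATTERN.match(name)
        if match:
            mapped[name] = (match.group("type"), match.group("lang"))
    if accepted_languages is not None:
        # Layer 3: keep only accepted languages.
        # Streams without a language (such as video) pass through.
        mapped = {
            name: (kind, lang)
            for name, (kind, lang) in mapped.items()
            if lang is None or lang in accepted_languages
        }
    return mapped


files = ["movie-video.mp4", "movie-audio-en.mp4", "movie-audio-de.mp4",
         "commentary.mp4", "notes.txt"]
result = map_files(
    files,
    exact={"commentary.mp4": ("audio", "en")},
    accepted_languages={"en"},
)
print(result)
```

In this example, the German audio track is dropped by the language filter, notes.txt matches no layer, and commentary.mp4 is only picked up because of its exact mapping.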

Read more: Media Mapping

See also: MediaMappings section, ContentMapped event, ContentPreProcessed event

Encoding

In the Encoding phase, the video and audio are encoded using a specific codec. Audio is always encoded using AAC (Advanced Audio Coding, the successor to MP3). For video encoding, H.264/AVC or H.265/HEVC can be selected along with their optimization settings. It is also possible to skip encoding by using a packaging-only mode if the video is already provided in the desired format.

Read more: Encoding

DRM Protection

Axinom Encoding protects the videos for use with the major DRM technologies, such as Widevine, PlayReady, and FairPlay. All of these DRM technologies use AES encryption with a 128-bit Content Key. The Content Key can be handled either in the Direct mode (the Content Key is supplied as a part of the job) or in the Managed mode (credentials for the Key Service are supplied and the Encoding Service acquires the necessary keys on its own). Moreover, the Encoding Service supports multiple keys, which means that you can encrypt different streams with different Content Keys.
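
In the Direct mode, the client brings its own key material. As a minimal Python sketch, a random 128-bit Content Key with a key ID could be generated as shown below. Identifying the key by a random UUID and encoding it as Base64 are assumptions for illustration. The sketch also does not cover how the key has to be protected and placed in the job; that is described in the DRM Protection article.

```python
import base64
import secrets
import uuid


def generate_content_key() -> dict:
    """Create a random 128-bit (16-byte) key and a random key ID."""
    key_bytes = secrets.token_bytes(16)  # 16 bytes * 8 = 128 bits
    return {
        "key_id": str(uuid.uuid4()),  # assumption: keys are identified by a UUID
        "key_base64": base64.b64encode(key_bytes).decode("ascii"),
    }


content_key = generate_content_key()
key_length_bits = len(base64.b64decode(content_key["key_base64"])) * 8
print(content_key["key_id"], key_length_bits, "bits")
```

Calling generate_content_key() once per stream would give each stream its own key, which matches the multi-key scenario mentioned above.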

Read more: DRM Protection

See also: ContentProcessing section

Packaging

The encoded video, audio, and subtitles are further packaged as MPEG-DASH, HLS, or CMAF, depending on the settings in the job description’s ContentProcessing section. For DASH, both the Live and OnDemand profiles are supported. DASH and HLS can be produced simultaneously for better compatibility with end-user devices. Alternatively, CMAF can be used to supply the video content only once while still supporting both DASH- and HLS-compatible players.

While the Encoding Service chooses the bitrates and resolutions to generate for an optimal experience, it is also possible to override the default settings and request an exact set of bitrates.

Read more: Packaging

See also: ContentProcessing section

Image Extraction

Encoding Service can extract frames from the video stream at specified time indexes and store them as JPEG images in a specified location. External systems (like Content Management) can use the generated images.

Encoding Service can also generate thumbnails (used as preview images) at regular intervals and include them as a part of the DASH representation in accordance with the DASH IF IOP 4.3, section 6.2.6 "Tiles of thumbnail images". Any player supporting this DASH standard can display the thumbnails for any time index.

Read more: Image Extraction

See also: ImageExtraction section, Thumbnails section, ImagesExtracted event

Publishing

In the Publishing phase, the encoded and protected content is published to the output storage specified by the Storage Provider in the ContentPublishing section of the job description. The specified credentials shall allow writing the content.

[Sequence diagram. Participants: Encoding Service, Input Storage, Output Storage, Message Publisher. Messages: Encoded & Protected Content; opt [If Required]: Delete Source Files; ContentPublished; JobSuccess or FinalError.]
Figure 5. Interactions between the Encoding Service and other systems in the Publishing Phase

See also: ContentPublishing section, ContentPublished event, JobSuccess event, FinalError event

Job Status and Reporting

As the Encoding Service informs the client about the job progress every time it reaches a certain phase, you can easily keep track of it. You are notified of the events via message publishers that are defined in the job description.
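
A client consuming these messages mostly needs to know which phase an event belongs to and whether the job has ended. The Python sketch below uses only the event names from the phase table in this document. The payload of real messages is not shown here, so the sketch assumes that the event name has already been extracted from each message; the function name follow_job is hypothetical.

```python
# Event name -> phase, taken from the phase table in this document.
EVENT_PHASE = {
    "JobCreated": "Pre-Validation",
    "AcquisitionProgress": "Acquisition",
    "ContentAcquired": "Acquisition",
    "ContentMapped": "Media Mapping",
    "ContentPreProcessed": "Media Mapping",
    "VideoEncodingStarted": "Encoding",
    "EncodingProgress": "Encoding",
    "EncodingFinished": "Encoding",
    "ImagesExtracted": "Image Extraction",
    "ContentPublished": "Publishing",
    "JobSuccess": "Publishing",
    "FinalError": "Publishing",  # listed under Publishing in the table
}

# Events that mark the job's completion (success/error).
TERMINAL_EVENTS = {"JobSuccess", "FinalError"}


def follow_job(event_names):
    """Consume event names in arrival order until the job ends.

    Returns (outcome, phases_seen). The outcome is None if the
    sequence ends before a terminal event arrives.
    """
    phases_seen = []
    for event in event_names:
        phase = EVENT_PHASE.get(event, "Unknown")
        if phase not in phases_seen:
            phases_seen.append(phase)
        if event in TERMINAL_EVENTS:
            return event, phases_seen
    return None, phases_seen


outcome, phases = follow_job(
    ["JobCreated", "ContentAcquired", "ContentMapped", "EncodingFinished",
     "ContentPublished", "JobSuccess"]
)
print(outcome, phases)
```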

You can also retrieve the job status by using the Encoding API endpoint GET /reporting/<job_id>.

It is also possible to acquire a list of all jobs in a month in the same format using the Encoding API endpoint GET /reporting/<year>/<month>.
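
As a small illustration, the two reporting paths can be built like this in Python. The base URL is a placeholder and not a real endpoint, authentication is omitted, and writing the year and month as plain integers is an assumption, because the expected format of these path segments is not stated here.

```python
from urllib.parse import quote, urljoin

# Placeholder, not a real endpoint: substitute the Encoding API base URL.
BASE_URL = "https://encoding.example.com/"


def job_report_url(job_id: str) -> str:
    """URL for GET /reporting/<job_id>."""
    return urljoin(BASE_URL, f"reporting/{quote(job_id, safe='')}")


def monthly_report_url(year: int, month: int) -> str:
    """URL for GET /reporting/<year>/<month>."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return urljoin(BASE_URL, f"reporting/{year}/{month}")


print(job_report_url("my-job-id"))
print(monthly_report_url(2021, 4))
```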

Read more about progress tracking and reporting from the Reporting page.

Storage Providers

For accessing external storage, e.g. input/output storage, the Encoding Service uses the concept of a Storage Provider. The following Storage Providers are supported:

  • Ftps: any FTPS server

  • AmazonS3: Amazon S3

  • AzureBlob: Azure Blob Storage

A Storage Provider is configured with the following parameters:

{
    "Provider": "Ftps",
    "UriPath": "ftpes://server.ftp.com/source/dir/",
    "CredentialsName": "user",
    "CredentialsSecret": "pass",
    "CredentialsProtection": "Unencrypted"
}

The exact interpretation of the "UriPath", "CredentialsName", and "CredentialsSecret" depends on the Storage Provider. See more details under Storage Providers.

To pass the value of "CredentialsSecret" in an encrypted form using the Credentials Protection, set "CredentialsProtection" to "Encrypted".
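
A client that assembles many job descriptions may want to catch typos in these values before submitting a job. The following Python sketch builds a Storage Provider object and checks the "Provider" and "CredentialsProtection" values against those named in this section. The helper name storage_provider is hypothetical, and the sketch does not perform any encryption: with "Encrypted", the caller is expected to pass an already encrypted secret.

```python
import json

SUPPORTED_PROVIDERS = {"Ftps", "AmazonS3", "AzureBlob"}
PROTECTION_MODES = {"Unencrypted", "Encrypted"}


def storage_provider(provider, uri_path, name, secret,
                     protection="Unencrypted"):
    """Build a Storage Provider object, rejecting unknown values early."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    if protection not in PROTECTION_MODES:
        raise ValueError(f"unknown credentials protection: {protection!r}")
    return {
        "Provider": provider,
        "UriPath": uri_path,
        "CredentialsName": name,
        "CredentialsSecret": secret,  # must already be encrypted if protection is "Encrypted"
        "CredentialsProtection": protection,
    }


config = storage_provider(
    "Ftps", "ftpes://server.ftp.com/source/dir/", "user", "pass"
)
print(json.dumps(config, indent=4))
```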

Languages

The Encoding Service supports languages identified by two- or three-letter codes according to ISO 639. See the full list of Supported Languages for reference.

The mapping of the input files to the languages can be done implicitly or explicitly. See the Media Mapping Phase for more details.

If an unsupported language is provided, the Encoding Service will still use it and include it in the stream description in the manifest and in the file names, but some features may not be available. For example, the language name will not be detected and included.
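
Because an unsupported language does not make the job fail, a client may want to check language values on its own before submitting a job. The Python sketch below only checks the shape of a code (two or three letters); it does not know which languages are supported, which is what the list of Supported Languages is for. Accepting lowercase letters only is an assumption.

```python
import re

# Shape of a two- or three-letter language code (lowercase is an assumption).
LANGUAGE_CODE = re.compile(r"[a-z]{2,3}")


def looks_like_language_code(code: str) -> bool:
    """True if the value has the shape of a 2- or 3-letter language code."""
    return LANGUAGE_CODE.fullmatch(code) is not None


for candidate in ["en", "deu", "english", "e"]:
    print(candidate, looks_like_language_code(candidate))
```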

Subtitles & Closed Captions

The Encoding Service supports subtitles and closed captions (CC). Technically, they are the same. The difference is more in their purpose.

HTML5 defines subtitles as a "transcription or translation of the dialog when sound is available but not understood" by the viewer (for example, dialog in a foreign language) and captions as a "transcription or translation of the dialog, sound effects, relevant musical cues, and other relevant audio information when sound is unavailable or not clearly audible" (for example, when audio is muted or the viewer is deaf or hard of hearing).

The input formats supported for the subtitles and closed captions are:

Regardless of the input format, the Encoding Service translates all subtitles and closed captions to WebVTT, as suggested by the DASH standard.

All input files shall be provided in UTF-8.

The formatting inside the WebVTT files is carried over as is. For all other formats, only the time indexes and the text are carried over; format-specific formatting instructions are dropped.
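
To illustrate what remains after such a conversion, the Python sketch below renders a list of cues (start time, end time, text) as a WebVTT document. It shows the target format only and is not the service's converter; the function names and the cue tuple layout are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, hh:mm:ss.ttt."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def build_webvtt(cues) -> str:
    """Render (start_seconds, end_seconds, text) cues as a WebVTT document.

    The cue text must not contain blank lines or the string "-->".
    """
    lines = ["WEBVTT", ""]  # file signature followed by a blank line
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line terminates each cue
    return "\n".join(lines)


vtt = build_webvtt([(1.0, 4.0, "Hello."), (4.5, 6.25, "Welcome back.")])
print(vtt)
```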

Security

The Encoding Service provides credentials protection whenever secrets are passed to the Encoding API. Moreover, as a stateless service, the Encoding Service only maintains the log files, not the output content. It can also remove the source files if you activate that option. Learn more about security in the security section.

Revision History

The table below lists the document versions and any changes to them.

| Version | Date | Description |
| --- | --- | --- |
| 1.0 | October 14, 2020 | Initial version |
| 1.1 | October 21, 2020 | C# code sample added |
| 1.2 | December 14, 2020 | H.265 added to video codecs |
| 2.0 | March 23, 2021 | Extracted information on DRM integration to a separate article |
| 3.0 | April 16, 2021 | Extracted information on specific phases to separate articles |