Publishing

Table of Contents

Overview
Publishing Process Steps
Publish Format
- Payload Data

Publishing is the task that makes entities from one service available to other consumers. Primarily, it makes the data of the Media Service (together with data from related services, such as the Image Service and the Video Service) available in the Catalog Service.

Overview

The management side of a Mosaic solution often consists of multiple services to manage videos, images, and entity metadata. Usually, one service acts as the aggregator of all the details. For the OTT media scenario, this is the Media Service, which manages movies, TV shows, and similar entities together with their references to videos and images. When you publish an entity, this service gathers the metadata from the main entity, its related tables, and all the connected services. It transforms the data into the publishing format and sends it out in a publishing message.

For example, to publish a movie, it acquires the metadata of all the related images from the Image Service, the details of the main video and trailers from the Video Service, and the data from the main movie table and related tables (tags, genres, licensing, etc.) from the Media Service. All that data is aggregated into the desired format and stored as a publishing snapshot. A publishing snapshot captures all the metadata and validation rules at one point in time. The actual publishing can then take place immediately or at some later point in time.

Only the main entities are published. Examples of main entities are movies, TV shows, seasons, and episodes. Moreover, the movie and TV show genres are also published as a genre list. Items that are not used on their own are only published as a part of other entities.

In the OTT scenario, the Media Service sends this metadata as an event message through RabbitMQ. The Catalog Service listens to these messages. When it receives such a message, it updates its database accordingly.

Publishing Process Steps

Publishing involves multiple steps to gather all the required data from the different services. This section walks you through the full publishing process with an example based on a movie publication. The following sequence diagram shows the involved steps:

Figure 1. Sequence diagram for single item publishing

The snapshot creation process is started from the Media Service UI for the movie movie-1. The process can be started to only create the snapshot or to create the snapshot and directly publish the result if the snapshot is valid.

Gather Metadata

The Media Service backend retrieves the movie metadata, including casts, licenses, movie genres, production countries, and tags. It also acquires the related image IDs for the cover and teaser images and the video IDs for the main video and all assigned trailers.

Then, it requests the video metadata for all the video IDs from the Video Service. Based on the returned data, some pre-validations are performed to check whether each video is fully encoded, whether it was QA reviewed, etc.

In parallel, a request is sent to the Image Service to get the image metadata for the cover and teaser images. Pre-validation checks that each image still exists and that its image type matches.

This step gathers not only the metadata that is required to create the publishing snapshot but also all the details that are required for the snapshot validation.
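The gathering step can be illustrated with a minimal Python sketch. It is not the Media Service implementation: the functions fetch_videos and fetch_images, the field names (encoded, qa_reviewed, type), and the in-memory data are hypothetical stand-ins for the Video Service and Image Service requests. The sketch issues both requests in parallel and records the pre-validation findings next to the gathered metadata.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the data held by the Video and Image Services.
VIDEOS = {
    "video-1": {"id": "video-1", "encoded": True, "qa_reviewed": True},
    "video-2": {"id": "video-2", "encoded": False, "qa_reviewed": False},
}
IMAGES = {
    "image-1": {"id": "image-1", "type": "COVER"},
}


def fetch_videos(video_ids):
    """Return metadata per requested video ID (None if the ID is unknown)."""
    return {vid: VIDEOS.get(vid) for vid in video_ids}


def fetch_images(image_ids):
    """Return metadata per requested image ID (None if the ID is unknown)."""
    return {iid: IMAGES.get(iid) for iid in image_ids}


def gather(movie):
    """Collect related metadata and pre-validation findings for one movie."""
    # Both services are queried in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        videos_future = pool.submit(fetch_videos, movie["video_ids"])
        images_future = pool.submit(fetch_images, list(movie["image_ids"].values()))
        videos, images = videos_future.result(), images_future.result()

    findings = []
    for vid, video in videos.items():
        if video is None:
            findings.append(f"video {vid} not found")
            continue
        if not video["encoded"]:
            findings.append(f"video {vid} is not fully encoded")
        if not video["qa_reviewed"]:
            findings.append(f"video {vid} was not QA reviewed")
    for expected_type, iid in movie["image_ids"].items():
        image = images[iid]
        if image is None:
            findings.append(f"image {iid} no longer exists")
        elif image["type"] != expected_type:
            findings.append(f"image {iid} is not of type {expected_type}")
    return {"movie": movie, "videos": videos, "images": images, "findings": findings}


movie = {
    "id": 1,
    "title": "Example",
    "video_ids": ["video-1", "video-2"],
    "image_ids": {"COVER": "image-1", "TEASER": "image-9"},
}
gathered = gather(movie)
```

Keeping the findings together with the gathered data means that the later validation step can work on exactly the state that was read here.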

Create the Snapshot

The publishing snapshot is generated in this step based on the gathered metadata. The format of the snapshot is JSON. It is described in detail below. All the metadata that should be published for one main entity is aggregated as a single publishing JSON document. This includes all the movie metadata as well as the required video and image details.

The JSON data and all the validation results are stored as part of a database snapshot entity. If the metadata of the movie or any related data changes later, the snapshot remains in the state it had when the snapshot creation process was started.
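The point-in-time behavior can be sketched in Python as follows. The Snapshot class, its field names, and the create_snapshot function are assumptions made for illustration; the actual snapshot entity is not specified here. The document is serialized at creation time, so later edits to the source entity cannot leak into the snapshot.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical snapshot entity: frozen, so it cannot be reassigned."""
    content_id: str
    json_data: str             # serialized publishing document
    validation_results: tuple  # findings recorded at creation time
    created_at: datetime


def create_snapshot(entity_type, entity, findings):
    content_id = f"{entity_type}-{entity['id']}"
    document = {
        "content_id": content_id,
        "title": entity["title"],
        "tags": entity["tags"],
    }
    # Serializing right away decouples the snapshot from later edits.
    return Snapshot(
        content_id=content_id,
        json_data=json.dumps(document, sort_keys=True),
        validation_results=tuple(findings),
        created_at=datetime.now(timezone.utc),
    )


movie = {"id": 1, "title": "Avatar", "tags": ["3D"]}
snapshot = create_snapshot("movie", movie, [])

# Later changes to the movie do not affect the stored snapshot.
movie["title"] = "Changed later"
movie["tags"].append("SciFi")
stored = json.loads(snapshot.json_data)
```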

Validate the Snapshot

Before the snapshot data can be published, it must first be validated to ensure that all the requirements are fulfilled.

Validation is done with JSON schema validation based on the created JSON snapshot. Moreover, custom validation rules based on the data from the metadata gathering step are also used. Those are checks for required fields, field length checks, number of related items, status validation, etc. See Publish Validations for more details.

This is all done in the same process to ensure that the snapshot creation and the validation use the same data. The validation results are also stored in the snapshot entity.
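The custom rules can be pictured with a small Python sketch. The concrete rules below (which fields are required, a 100-character title limit, exactly one MAIN video) are hypothetical examples of the rule categories named above, not the actual Publish Validations. JSON schema validation is left out because it needs a schema library outside the standard library.

```python
def validate(document):
    """Return a list of error messages; an empty list means valid."""
    errors = []

    # Required fields (hypothetical selection).
    for field in ("content_id", "title", "videos"):
        if not document.get(field):
            errors.append(f"'{field}' is required")

    # Field length check (hypothetical limit).
    if len(document.get("title", "")) > 100:
        errors.append("'title' must not exceed 100 characters")

    # Number of related items (hypothetical rule: exactly one main video).
    videos = document.get("videos", [])
    main_count = sum(1 for video in videos if video.get("type") == "MAIN")
    if videos and main_count != 1:
        errors.append(f"expected exactly 1 MAIN video, found {main_count}")

    return errors


valid_errors = validate({
    "content_id": "movie-1",
    "title": "Avatar",
    "videos": [{"type": "MAIN"}, {"type": "TRAILER"}],
})
invalid_errors = validate({
    "content_id": "movie-2",
    "title": "",
    "videos": [{"type": "TRAILER"}],
})
```

Returning all messages instead of stopping at the first failure matches the idea of storing the full validation result in the snapshot entity.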

Publishing the Snapshot

After the snapshot is created and validated, it can be published. Depending on how the snapshot creation was started, the publishing can happen automatically (if all validation checks were passed) or manually. In the manual case, a management user can do a final check on the snapshot and decide whether the publishing should be started or not.

You can access snapshots in the UI from the entity (e.g. movie) workflow and from the snapshot registry. The snapshot registry is especially useful if you created a lot of snapshots through bulk operations. If there are no errors, you can trigger the publication. If there are some errors or you want to adjust something, you can create a new snapshot.

The publication process itself uses a messaging-based approach. It sends an event that contains the generated JSON snapshot data through RabbitMQ. The event has a specific routing key (e.g. media_service.movie.published), and all interested consumers can subscribe to it.
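As a rough Python sketch, assembling such an event could look like this. The function build_publish_message is hypothetical. The routing key pattern is derived from the example key above, and the envelope is reduced to the payload field; any further envelope fields are not covered by this sketch.

```python
import json


def build_publish_message(service, entity_type, snapshot_json):
    """Return (routing_key, body) for a publish event."""
    routing_key = f"{service}.{entity_type}.published"
    # Assumption: the snapshot document travels in a "payload" field.
    envelope = {"payload": json.loads(snapshot_json)}
    return routing_key, json.dumps(envelope).encode("utf-8")


routing_key, body = build_publish_message(
    "media_service",
    "movie",
    '{"content_id": "movie-1", "title": "Avatar"}',
)
```

With a RabbitMQ client such as pika, the pair would typically be handed to channel.basic_publish(exchange=..., routing_key=routing_key, body=body). The exchange name depends on the deployment and is not specified here.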

Potential extension point: as all the publishing messages are self-contained, it is also possible to implement scheduled publishing. When the snapshot creation process is started, it could also ask for the desired publishing date and time. When such a date is defined (and the snapshot validation succeeds), the system sends out the message only when that date is reached. This way it is picked up by the consumers of the event only after the time has been reached.
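One way to picture this extension is a queue of pending messages that is checked periodically. The ScheduledPublisher class below is purely illustrative; scheduled publishing is described above only as a possibility. The send callback stands in for whatever actually publishes to RabbitMQ.

```python
import heapq
from datetime import datetime, timezone


class ScheduledPublisher:
    """Holds self-contained messages until their publishing time is reached."""

    def __init__(self, send):
        self._send = send
        self._pending = []  # heap of (publish_at, sequence, routing_key, body)
        self._sequence = 0  # tie-breaker for equal publish times

    def schedule(self, publish_at, routing_key, body):
        heapq.heappush(self._pending, (publish_at, self._sequence, routing_key, body))
        self._sequence += 1

    def release_due(self, now):
        """Send every message whose publishing time has been reached."""
        sent = 0
        while self._pending and self._pending[0][0] <= now:
            _, _, routing_key, body = heapq.heappop(self._pending)
            self._send(routing_key, body)
            sent += 1
        return sent


delivered = []
publisher = ScheduledPublisher(lambda key, body: delivered.append(body))
publisher.schedule(datetime(2030, 1, 2, tzinfo=timezone.utc),
                   "media_service.movie.published", b"later")
publisher.schedule(datetime(2030, 1, 1, tzinfo=timezone.utc),
                   "media_service.movie.published", b"earlier")

# Only the message that is due at this point in time is sent.
released = publisher.release_due(datetime(2030, 1, 1, 12, tzinfo=timezone.utc))
```

Because each message already contains the complete snapshot data, nothing has to be re-read from the source services when the message is finally released.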

The most important subscriber is the Catalog Service. It extracts all the metadata of the entity (e.g. a movie) with all the related data (tags, licenses, cast, production countries, etc.) and details from the related entities (images, videos, etc.) and stores them in the Catalog Service database.
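The consumer side can be sketched in Python with an in-memory SQLite database standing in for the Catalog Service database. The table layout, the function names, and the reduction to title and tags are assumptions for illustration; the real Catalog Service schema is not described here. Each message replaces the previously stored state of the entity in a single transaction.

```python
import json
import sqlite3


def open_catalog():
    """Create a hypothetical, heavily simplified catalog schema."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE movie (content_id TEXT PRIMARY KEY, title TEXT)")
    db.execute("CREATE TABLE movie_tag (content_id TEXT, tag TEXT)")
    return db


def handle_movie_published(db, body):
    """Replace the stored movie and its tags with the published state."""
    payload = json.loads(body)["payload"]
    content_id = payload["content_id"]
    with db:  # one transaction per message
        db.execute(
            "INSERT OR REPLACE INTO movie VALUES (?, ?)",
            (content_id, payload["title"]),
        )
        db.execute("DELETE FROM movie_tag WHERE content_id = ?", (content_id,))
        db.executemany(
            "INSERT INTO movie_tag VALUES (?, ?)",
            [(content_id, tag) for tag in payload.get("tags", [])],
        )


db = open_catalog()
handle_movie_published(db, json.dumps(
    {"payload": {"content_id": "movie-6", "title": "Avatar", "tags": ["3D", "SciFi"]}}
))
# Publishing the same movie again overwrites the earlier state.
handle_movie_published(db, json.dumps(
    {"payload": {"content_id": "movie-6", "title": "Avatar (Extended)", "tags": ["3D"]}}
))
movies = db.execute("SELECT content_id, title FROM movie").fetchall()
tags = db.execute("SELECT tag FROM movie_tag").fetchall()
```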

Other services can also integrate these publishing events. These services could include a reporting service, a recommendations engine, etc.

After the message is sent, the publication status of the snapshot is updated to published.

Unpublish the Snapshot

The counterpart of publishing is unpublishing. Unpublishing sends out an event message that informs all the services listening for it that the mentioned item is now unpublished. The services then remove this item from their storage.
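A consumer-side handler for such a message might look like the following Python sketch. The message shape (a payload that carries only the content_id) and the dictionary used as storage are assumptions; the unpublish message format is not specified in this section. The handler is written so that receiving the same message twice does no harm.

```python
import json

# Hypothetical consumer storage, keyed by content_id.
catalog = {"movie-6": {"title": "Avatar"}, "movie-7": {"title": "Titanic"}}


def handle_unpublished(storage, body):
    """Remove the unpublished item; return True if something was removed."""
    content_id = json.loads(body)["payload"]["content_id"]
    return storage.pop(content_id, None) is not None


first = handle_unpublished(catalog, b'{"payload": {"content_id": "movie-6"}}')
# A repeated delivery of the same message is a harmless no-op.
second = handle_unpublished(catalog, b'{"payload": {"content_id": "movie-6"}}')
```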

Publish Format

The publish format defines the JSON structure that the publishing data must adhere to. The RabbitMQ message has a payload field that contains the snapshot data with the content_id and other relevant data. The exact structure of the payload is defined as a JSON schema in the custom messaging library.

Payload Data

The snapshot payload for a movie that is being published looks like this:

Movie Snapshot Data

{
  "content_id": "movie-6",
  "title": "Avatar",
  "synopsis": "In 2154, humans have depleted Earth's natural resources...",
  "description": "Avatar is a 2009 American epic science fiction film...",
  "original_title": "James Cameron's Avatar",
  "released": "2009-12-10",
  "studio": "20th Century Fox",
  "production_countries": [
    "United States of America",
    "Estonia",
    "Germany",
    "COL",
    "ESP"
  ],
  "genres": [
    "movie_genre-3",
    "movie_genre-18"
  ],
  "cast": [
    "Sam Worthington",
    "Zoe Saldana",
    "Sigourney Weaver"
  ],
  "tags": [
    "3D",
    "SciFi",
    "Highlight"
  ],
  "licenses": [
    {
      "start_time": "2020-08-01T00:00:00+00:00",
      "end_time": "2020-08-30T23:59:59.999+00:00",
      "countries": [
        "ABW",
        "AUT",
        "FIN"
      ]
    }
  ],
  "images": [
    {
      "width": 1800,
      "height": 1012,
      "type": "COVER",
      "path": "/transform/0000000000000000-0000000000000000/9FqubDgdtLaSjXmnBc9UNf.jpg"
    },
    {
      "width": 1800,
      "height": 1012,
      "type": "TEASER",
      "path": "/transform/0000000000000000-0000000000000000/43BncavVQvDmjxiwQtt3kd.jpg"
    }
  ],
  "videos": [
    {
      "type": "MAIN",
      "title": "avatar",
      "is_protected": false,
      "output_format": "DASH",
      "duration": 9720,
      "audio_languages": [
        "en",
        "de"
      ],
      "subtitle_languages": [
        "en",
        "de"
      ],
      "caption_languages": [
        "en",
        "de"
      ],
      "dash_manifest": "https://videoimagedev.blob.core.windows.net/transcoded-videos/QtmGgafYUYmrmxw5apkD66/dash/manifest.mpd"
    },
    {
      "type": "TRAILER",
      "title": "Avatar trailer",
      "is_protected": false,
      "output_format": "DASH",
      "duration": 343,
      "audio_languages": [
        "en"
      ],
      "subtitle_languages": [
        "en",
        "es",
        "de"
      ],
      "caption_languages": [
        "en"
      ],
      "dash_manifest": "https://videoimagedev.blob.core.windows.net/transcoded-videos/SewobHEyxbg3A1y6aWPHve/dash/manifest.mpd"
    },
    {
      "type": "TRAILER",
      "title": "Avatar special",
      "is_protected": false,
      "output_format": "DASH",
      "duration": 973,
      "audio_languages": [
        "en",
        "de"
      ],
      "subtitle_languages": [
        "en",
        "de"
      ],
      "caption_languages": [
        "en",
        "de"
      ],
      "dash_manifest": "https://videoimagedev.blob.core.windows.net/transcoded-videos/83Y9JqrKjiCMLpj1bCnsed/dash/manifest.mpd"
    }
  ]
}

Points of interest:

- The content_id is the field that uniquely identifies a main entity, for example a movie, TV show, season, episode, or collection. Movie and TV show genres are identified by a content_id as well.
- The content_id is formed by combining the entity type name with the unique entity ID in the format <entity-type>-<entity-database-id>. This relies on every service using unique entity type names: no two Mosaic services should define the same type, such as a movie entity type.
- The main video and all the trailers share the same structure. The type property defines whether a video is used as the main video or as a trailer. The video data comes from the Video Service.
- Images are defined similarly to videos: their type property indicates how the image is used (COVER or TEASER in the example above).
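The content_id format and the type property can be illustrated with a few Python helpers. The function names are hypothetical. The parser assumes that the database ID itself contains no hyphen, as in the example values movie-6 and movie_genre-18; IDs that contain hyphens would need a different split rule.

```python
def make_content_id(entity_type, database_id):
    """Build a content_id in the format <entity-type>-<entity-database-id>."""
    return f"{entity_type}-{database_id}"


def parse_content_id(content_id):
    """Split at the last hyphen (assumes the database ID has no hyphen)."""
    entity_type, _, database_id = content_id.rpartition("-")
    if not entity_type or not database_id:
        raise ValueError(f"not a valid content_id: {content_id!r}")
    return entity_type, database_id


def videos_of_type(payload, video_type):
    """Return all videos of the payload with the given type."""
    return [video for video in payload.get("videos", []) if video["type"] == video_type]


payload = {
    "content_id": make_content_id("movie", 6),
    "videos": [
        {"type": "MAIN", "title": "avatar"},
        {"type": "TRAILER", "title": "Avatar trailer"},
        {"type": "TRAILER", "title": "Avatar special"},
    ],
}
entity_type, database_id = parse_content_id(payload["content_id"])
trailer_titles = [video["title"] for video in videos_of_type(payload, "TRAILER")]
```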