What is DRM?
- DRM Approach
- Keys Synchronization
- Disconnected Environments
- DRM Technologies
- Which DRM Technology to Use?
- Next Steps
You have video content and you want to protect it against unauthorized use? Let’s review a typical setup in a video distribution solution:
There is one issue with this solution: it is far too easy to steal the content. An attacker can get access to the content as a regular user, download the content, and then use it in various ways outside of the system, e.g. publish the content on the Internet.
DRM technology can help here. DRM stands for Digital Rights Management. It relies on encryption to protect the content.
In a nutshell, it works like this:
Generate a content key.
Encrypt the video with the content key.
Distribute the video to its consumers via any (unprotected) channel.
Provide access to the content key to the authorized users (and only to them).
Control what the authorized users can do with the video.
The trickiest part is the last one: "control what the users can do". You can’t hand the content keys directly to the users. Otherwise, any authorized user could compromise the system by decrypting the video once and, for example, publishing it in the clear on the Internet.
It is crucial to maintain control over the content keys at all times. A trusted component on the client side is needed to ensure this. This component is called the Content Decryption Module (CDM). Various DRM technologies, such as Widevine, PlayReady, and FairPlay, come with their own CDM, which encapsulates the functions of content decryption and usage rules enforcement. The deeper this CDM resides in the operating system (and even better - in the hardware), the more secure the system is.
Let’s have a closer look at the individual components and how they collaborate in a DRM workflow.
Encryption usually happens as one of the processing steps of the video preparation pipeline and is typically performed by the same software that encodes the video. (The other steps are typically video encoding and packaging for adaptive streaming.)
The de facto standard encryption algorithm for large binary files today is AES (Advanced Encryption Standard). It is a symmetric-key algorithm, which means that the same key is used to encrypt and decrypt the data. Key lengths of 128, 192, or 256 bits are supported. Most modern DRM technologies use AES with 128-bit keys.
During encryption, some metadata is added to the video. This data is called DRM signaling. As a minimum, it identifies the DRM system used (e.g. PlayReady or Widevine). You may also encounter the term PSSH (Protection System Specific Header), the box that carries this signaling in MP4-based formats. DRM signaling tells the player that the video is encrypted and which DRM technology on the device it must use to play the content.
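To make the idea of DRM signaling more concrete, here is a minimal sketch of building and reading a version-0 PSSH box. It assumes the common MP4 box layout (size, type, version and flags, system ID, data) and the publicly documented system IDs for Widevine and PlayReady. The function names and the sample payload are hypothetical, and real PSSH payloads are DRM-specific binary structures that are not parsed here.

```python
import struct
import uuid
from typing import Tuple

# Publicly documented DRM system IDs (only two are listed in this sketch).
WIDEVINE_SYSTEM_ID = uuid.UUID("edef8ba9-79d6-4ace-a3c8-27dcd51d21ed")
PLAYREADY_SYSTEM_ID = uuid.UUID("9a04f079-9840-4286-ab92-e65be0885f95")

SYSTEM_NAMES = {
    WIDEVINE_SYSTEM_ID: "Widevine",
    PLAYREADY_SYSTEM_ID: "PlayReady",
}


def build_pssh_v0(system_id: uuid.UUID, data: bytes) -> bytes:
    """Build a version-0 'pssh' box around an opaque DRM-specific payload."""
    version_and_flags = b"\x00\x00\x00\x00"
    body = version_and_flags + system_id.bytes + struct.pack(">I", len(data)) + data
    # Box header: 4-byte big-endian total size followed by the 4-byte box type.
    return struct.pack(">I", 8 + len(body)) + b"pssh" + body


def parse_pssh_v0(box: bytes) -> Tuple[str, bytes]:
    """Return (DRM system name, opaque payload) for a version-0 'pssh' box."""
    size, box_type = struct.unpack(">I4s", box[:8])
    if box_type != b"pssh" or size != len(box):
        raise ValueError("not a well-formed pssh box")
    if box[8] != 0:
        raise ValueError("only version 0 is handled in this sketch")
    system_id = uuid.UUID(bytes=box[12:28])
    (data_size,) = struct.unpack(">I", box[28:32])
    return SYSTEM_NAMES.get(system_id, "unknown"), box[32:32 + data_size]


# Hypothetical payload; a real one would be produced by the packager.
box = build_pssh_v0(WIDEVINE_SYSTEM_ID, b"init-data")
system_name, payload = parse_pssh_v0(box)
```

A player performs a similar lookup to decide which CDM on the device can handle the content.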
The encryption process needs an encryption key, also called a Content Key. Any cryptographically strong random number with 128 bits can be used as a Content Key. Still, usually a separate service is used to generate a content key - it’s the so-called Key Service. A Key Service shall synchronize the content keys it generates with the License Service (see below), which provides the content keys to the video player in a secure way.
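As a rough illustration of what a Key Service does at its core, the sketch below generates a 128-bit Content Key from a cryptographically strong random source and pairs it with a Key ID (a GUID, as described further down for key synchronization). The function name is hypothetical. Storing the key and synchronizing it with the License Service are left out.

```python
import secrets
import uuid
from typing import Tuple


def generate_content_key() -> Tuple[uuid.UUID, bytes]:
    """Return a fresh (Key ID, Content Key) pair."""
    key_id = uuid.uuid4()                  # public identifier of the key
    content_key = secrets.token_bytes(16)  # 16 bytes = 128 bits, kept secret
    return key_id, content_key


key_id, content_key = generate_content_key()
```

The `secrets` module is used rather than `random` because the latter is not intended for cryptographic purposes.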
It is common to use the CPIX format for content key exchange.
There are various key exchange protocols in use, supported by different encoders, for example, SPEKE (AWS) and Widevine Common Encryption (Google).
Once encrypted, the video can be distributed to its consumers even via insecure channels. The security of the system rests on the cryptographic strength of the AES algorithm and on the secure content key delivery process. (However, content owners still prefer to limit the exposure even of the encrypted content.)
Typically, the encrypted video is distributed to its users via a CDN (Content Delivery Network), which uses the usual application protocols, such as HTTPS, and caches the video at the edge node closest to the consumer.
A License Service is the component which provides the content key for decryption, accompanied by applicable usage rules, to authorized end user devices. The key is provided in an encrypted form as part of a data structure called a DRM license.
In addition to the content key, the DRM license contains usage rules - rules defining what the consumer is able to do with the video. For example, they can define how long the video should be playable or whether it is allowed to store the DRM license persistently on the device or to play the video on analog devices.
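The following sketch shows how the usage rules mentioned above might be modeled. All class and field names are hypothetical. Real DRM licenses are binary, technology-specific structures, and the rules are enforced inside the CDM rather than in application code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class UsageRules:
    expires_at: datetime        # how long the video should be playable
    allow_persistence: bool     # may the license be stored on the device?
    allow_analog_output: bool   # may the video be played on analog devices?


@dataclass(frozen=True)
class DrmLicense:
    key_id: str
    encrypted_content_key: bytes  # never delivered in the clear
    rules: UsageRules


def may_play(lic: DrmLicense, now: datetime) -> bool:
    """Check only the expiration rule; other rules are ignored in this sketch."""
    return now < lic.rules.expires_at


issued_at = datetime(2021, 1, 1, tzinfo=timezone.utc)
rental = DrmLicense(
    key_id="hypothetical-key-id",
    encrypted_content_key=b"opaque-bytes",
    rules=UsageRules(
        expires_at=issued_at + timedelta(hours=48),
        allow_persistence=False,
        allow_analog_output=False,
    ),
)
```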
The DRM license is only usable on the device which generated the original license request.
All DRM technologies have slightly different formats for the DRM license and support different usage rules.
A License Service can have embedded entitlement logic to decide which requests to grant a license for and which to reject. However, it is common to separate the two responsibilities: authorizing the request, and generating a DRM-technology-specific license based on the already issued authorization. In this case, you need an additional component, the Entitlement Service (see below), which generates an Entitlement Message (a token). A License Service expects such a token as part of a license request, validates the token, and generates the DRM License according to the definitions in the Entitlement Message passed inside the token.
Every video solution has some authorization rules. Less restrictive systems would grant a DRM license for every legitimate user of the system. Others would implement complex rules based on the user’s subscriptions, number of used devices, geography, etc. These business rules are typically implemented in a component called the Entitlement Service. Physically, it doesn’t have to be a separate deployable service. The logic can be a part of a bigger service (an endpoint /entitlement) or even a part of the frontend application.
The Entitlement Service decides whether or not to grant a DRM license. If the decision is positive, it also decides which additional restrictions shall be applied. The result is typically represented by a data structure called an Entitlement Message. This message is packed into a token (usually a JWT) and returned to the Frontend. With such a token, the Frontend can request a DRM license from a License Service.
To prevent the token from being manipulated in transit, the Entitlement Service should sign it and the License Service should validate the signature, e.g. using a shared secret.
A DRM-unaware video player that tries to play encrypted content would fail. However, if the video player supports the used DRM technology, it understands the DRM signaling and knows how to request a content key from a License Service.
To fetch a DRM license, the player must send a license request to the License Service. The player asks the CDM to construct this request. The CDM creates a cryptographically protected payload that is individualized and tamper-proof, and the player sends it to the License Service. The response contains the DRM License. It is encrypted in such a way that only the CDM that originally created the license request can decrypt and use it.
For playback, the player downloads the encrypted video segment by segment and hands the segments to the CDM. The decryption of the video frames and the playback are performed by the CDM. As such, neither the player itself nor any part of a custom application ever gets access to the unencrypted video. The CDM also ensures that the usage rules are enforced.
Before the DRM license request can be made, the Frontend shall talk to the Entitlement Service and ask for an Entitlement Message. If the Entitlement Service grants the request, it generates a token, which the Frontend attaches to the DRM license request. Based on this token, the License Service grants the license and translates the rules from the Entitlement Message into DRM-specific usage rules.
Most modern video players support the DRM technologies described here.
The content key for each video shall be known to both the Key Service and the License Service. If both of them run next to each other, the key exchange can be organized easily. They can even share the same key database.
However, if the License Service is deployed to a different security context than the Key Service, you need a concept for keeping the content keys in sync.
If a Key Service generates a new random content key for every request, it should store all generated keys. The keys database should be periodically synchronized with the License Service. For increased security, key synchronization should occur through a different communication channel, separately from the video distribution.
In the Key Seed model, the generated content keys are not fully random, but derived from a single secret value, the so-called Key Seed. A Key Seed shall be generated once and shared between the Key Service and the License Service. A content key is derived from this Key Seed and a Key ID, using a known algorithm.
Key ID is a unique identifier, usually a GUID, assigned to the key. Key ID is public information distributed as a part of video’s metadata. A Key Service and a License Service use the same algorithm to derive the content key from the Key ID:
Content Key = f(Key Seed, Key ID)
The Key Seed itself shall be a cryptographically strong random number. It shall be generated once and made available to both: Key Service and License Service. The Key Seed is kept strictly secret.
With this in place, there is no further need to exchange keys between the Key Service and the License Service. The Key ID used by the Key Service to generate the Content Key is public and can be transferred to the player together with the video. Therefore, the player can always send the same Key ID to the License Service and receive the same Content Key, because the License Service uses the same Key Seed as the Key Service.
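The Key Seed model can be sketched in a few lines. The derivation function used here, HMAC-SHA256 truncated to 128 bits, is only a stand-in for f and an assumption of this example. Real deployments use the derivation algorithm defined by their DRM technology or Key Service, and both sides must use exactly the same one.

```python
import hashlib
import hmac
import secrets
import uuid


def derive_content_key(key_seed: bytes, key_id: uuid.UUID) -> bytes:
    """Content Key = f(Key Seed, Key ID), with f assumed to be HMAC-SHA256."""
    digest = hmac.new(key_seed, key_id.bytes, hashlib.sha256).digest()
    return digest[:16]  # truncate to a 128-bit Content Key


# Generated once and shared between the Key Service and the License Service.
key_seed = secrets.token_bytes(32)

# The Key ID is public and travels with the video.
key_id = uuid.uuid4()

key_at_key_service = derive_content_key(key_seed, key_id)
key_at_license_service = derive_content_key(key_seed, key_id)
```

Both services arrive at the same Content Key without ever exchanging it, which is the point of the model.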
If you can assume that your users are online when they play videos, you can plan a centralized License Service running in a Cloud. It gets more complicated if video playback takes place in a disconnected environment without Internet access.
An environment can be disconnected for technical or security reasons. A good example is an entertainment system on board an aircraft or a train.
In a disconnected environment, you have to ensure that your License Service runs locally, without relying on Internet connectivity. It is usually possible, but brings along additional challenges:
Depending on the server hardware/OS in the disconnected environment, the License Service needs to support various platforms.
Content Key synchronization is more complicated (the Key Seed model can be helpful).
DRM technology providers require regular updates.
Regular software/security updates are mandatory.
Logs/statistics from the License Service should be collected and aggregated centrally.
During the last decade, we witnessed a competition between a few DRM technology vendors. There is no single winner, but there are "the big three": Widevine (Google), PlayReady (Microsoft), and FairPlay (Apple).
Each DRM technology provides an implementation of a CDM and its best integration into the respective client platforms:
Chrome, Firefox, Edge, Opera
All DRM technologies mostly follow the same principles. They provide at least:
Client-side implementation for respective platforms
Interface (specification) for a license request and response
Format (specification) for a DRM license
Format (specification) for DRM encryption and DRM signaling
Third-party vendors and service providers deliver the actual implementations of the components needed for a DRM-enabled system, e.g.:
Each DRM technology vendor runs its own certification program for third-party service providers and their developers. For example, here is a list of Microsoft PlayReady technology partners: https://www.microsoft.com/playready/partners/
A few years ago this would have been a hard question. However, in 2021, the answer is: all of them! If you examine the table above, each platform has its own "native" DRM, e.g. on iOS, it is FairPlay, on Android, Widevine and on the Xbox, PlayReady.
The recommendation for the Frontend is: on each client platform, use the DRM technology native to that platform.
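In code, this recommendation boils down to a lookup from client platform to DRM technology. The sketch below covers only the three platforms named above. The platform identifiers and the function name are hypothetical, and a real Frontend would typically detect the supported DRM at runtime.

```python
# Native DRM per platform, limited to the examples given in the text.
NATIVE_DRM = {
    "ios": "FairPlay",
    "android": "Widevine",
    "xbox": "PlayReady",
}


def select_drm(platform: str) -> str:
    """Return the DRM technology native to the given client platform."""
    try:
        return NATIVE_DRM[platform.lower()]
    except KeyError:
        raise ValueError("no native DRM known for platform: " + platform)


drm_for_iphone = select_drm("iOS")
```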
Until recently, such a recommendation would have meant preparing and storing several copies of the video library, one for each DRM technology, because the encryption schemes and DRM signaling were not compatible. Today, thanks to standards like CMAF (Common Media Application Format), CENC (Common Encryption Scheme), and the CBCS encryption mode, it is possible to encrypt the video only once and use it with multiple DRM technologies. Of course, this requires the License Service to provide DRM licenses in different formats, depending on which client platform initiates the request. This approach is called Multi-DRM and it is being used more and more in the industry.
For the sake of completeness, let’s state here that DRM technologies can also be used on platforms that are non-native for them. This usually requires third-party tools, e.g. SDKs, incurs additional costs, and is often less secure than the native technology. However, it can help to reduce the number of DRM technologies used.