Thursday, April 24, 2003

Cryptography Research Content Protection

[Note - this is an archived version of the original posting from 03:10 AM EDT, Apr 24 2003]

Cryptography Research's content protection system is summarized online and apparently described in more detail in the patent application. The basic idea is to include some executable code with the content which would run in a VM in every content player. This code would then make use of pre-existing low-level crypto functions and keys built into the players to decrypt and play the content.

The main advantage of this approach is that as various content protection schemes are cracked, or keys extracted, future releases of content can be updated to use new schemes and to invalidate the stolen keys. Unlike current approaches, which build the technology into player firmware, this method would be much more flexible and adaptable in the face of ongoing attacks.

Note that there would still be limits on how far protection schemes could be revised, set by the cryptographic primitives which are built into players. If some future design for content protection were invented which relied on the Weil pairing, for example, it could not be implemented if the necessary primitives were not supported on widely used devices.

A disadvantage is that each player must be powerful enough to run this VM program at speed. It's not clear how complex the programs will be or how fast they will have to run. Providing general purpose computing functionality may be more expensive and difficult than implementing a special-purpose standard in firmware and hardware, especially for small portable devices. The CR report estimates that a 1 MIPS processor would be adequate, but it offers no justification for that figure.

All cryptography-based protection schemes have a fundamental flaw, which is that keys "scraped" out of authorized devices can be used to unlock data, which can then be released in unprotected form. CR proposes to address this by watermarking the data. For this to work, the watermark must inherently reveal the key which was used for the decryption. This involves the technique broadly known as traitor tracing. The problem is that if the bad guys have access to a considerable number of stolen keys, it may be intractable to devise a traitor tracing scheme that can identify them.

CR advocates "forensic watermarking". In the longer report (available by email request) they describe this as a system where there are two versions of selected portions of the content - for example, two alternate versions of a particular movie frame. There would be multiple such "polymorphs" throughout the content, and each device would have keys such that for each polymorph it would see only one version. By randomizing and encrypting the frames it can be arranged that the devices can't even tell which frames are polymorphic. The set of keys assigned to a playback device implicitly identifies the device itself, so that if an unprotected version of the movie is released, the specific versions of the polymorphs that are present will reveal which device did the decryption.
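The single-device case described above can be sketched in a few lines of Python. This is only a toy model, not CR's actual scheme: it assumes each device is handed an independent random choice of version (0 or 1) at every polymorph, and all names here (`assign_device_versions`, `trace_single_device`) are made up for illustration.

```python
import random


def assign_device_versions(num_devices, num_polymorphs, seed=0):
    """Give every device a random version (0 or 1) for each polymorph.

    Assumption: versions are independent fair coin flips. The real key
    assignment in the CR design may well be structured differently.
    """
    rng = random.Random(seed)
    return [
        [rng.randint(0, 1) for _ in range(num_polymorphs)]
        for _ in range(num_devices)
    ]


def trace_single_device(leaked_versions, all_devices):
    """Return the indices of devices whose versions match the leak exactly."""
    return [
        index
        for index, versions in enumerate(all_devices)
        if versions == leaked_versions
    ]


# Hypothetical numbers: 1000 devices, 64 polymorphs in the content.
devices = assign_device_versions(num_devices=1000, num_polymorphs=64, seed=1)
leaked_copy = list(devices[417])  # an unprotected copy made with device 417
suspects = trace_single_device(leaked_copy, devices)
print(suspects)
```

With only one device involved, the pattern of versions in the leaked copy acts as a serial number, so an exact match is enough. The harder case is when several devices are combined.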

The obvious attack is to combine the output from multiple devices from which keys have been scraped, but this does not work (up to a point) because even when multiple devices are used, there is still enough information in the output to identify which specific devices were involved. CR gives an example of a 90 minute movie at 30 frames per second (162,000 frames in all), with 1% of the frames being polymorphic - 1620 frames. Suppose an adversary breaks into 4 playback devices and gets their keys in order to identify the polymorph frames. According to the formula derived in the CR report, the manufacturer can still identify those four devices with an error probability of less than 4 x 10^(-10), an extremely good detection rate.
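The arithmetic in CR's example is easy to check, and a small sketch also shows why combining devices does not expose every polymorph. The second function below is not the CR formula (which is not reproduced here). It rests on a stated assumption: each device independently sees version 0 or 1 with equal probability, and colluders can only spot a polymorphic frame where their decrypted versions differ. The function names are hypothetical.

```python
from fractions import Fraction


def polymorphic_frame_count(minutes, frames_per_second, polymorphic_fraction):
    """Number of polymorphic frames in a movie of the given length."""
    total_frames = minutes * 60 * frames_per_second
    return int(total_frames * polymorphic_fraction)


def undetectable_fraction(num_colluders):
    """Fraction of polymorphs on which all colluders hold the same version.

    Assumption: two versions per polymorph, assigned to devices as
    independent fair coin flips. All devices agree with probability
    2 * (1/2)**k, and on those frames comparing outputs reveals nothing.
    """
    return Fraction(2, 2 ** num_colluders)


frames = polymorphic_frame_count(90, 30, Fraction(1, 100))
print(frames)  # 1620, matching the figure in CR's example

for colluders in (1, 2, 4, 5):
    hidden = undetectable_fraction(colluders)
    print(colluders, hidden, float(frames * hidden))
```

Under these assumptions, four colluding devices would on average fail to notice about one in eight of the polymorphic frames, and those unnoticed frames still carry identifying information into the combined output.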

But what happens if you use the CR formula with the assumption that the attacker cracks one more device for a total of 5? Suddenly the system doesn't work so well, and there are over 10^20 possible sets of 5 devices that could produce the combined output! We go from an error probability below 4 x 10^(-10) to more than 10^20 candidate sets with just one more device. This kind of exponential explosion is common to many traitor tracing schemes. The attackers have an inherent mathematical advantage which is very hard to address. All this is glossed over in the CR analysis.

And this is leaving aside the question of whether we can create polymorphic frames at all - two different versions of a movie frame, either of which is equally plausible as an intermediate frame among the others in the sequence. It must be done so that looking at one version of the movie or the other will not call attention in any way to the frames which have been altered like this. And it has to be doable in a completely (or at least largely) automated way, due to the volume of content that would have to be protected.

It also would be desirable to create content that can run on all future players, at a time when many of them may not yet have been built or even designed. For example, a CD pressed today should still play on CD players built 10 years from now. Could the content be unlocked by devices whose keys have not yet been chosen, but still be watermarked so as to reveal those keys? Or would we need a centralized key-creation agency which pre-creates all of the keys which will ever be used in a CD player for the rest of time, and then assigns them to manufacturers who enter the CD business? Most of these traitor tracing and broadcast-encryption schemes have been implemented in closed environments like satellite TV broadcasts, but the complexity of a multi-vendor world and an open-ended future may push this technology beyond its current limitations.

Even in the most optimistic assessment, what we would expect to see is a constant struggle between "protectors" and "unlockers". Each success by the unlockers will be thwarted in the next generation of content released by the protectors. But the old content will still be vulnerable. So we would expect that this approach would yield a time window for protection. At the time it was released, new content would use the latest techniques and be relatively secure. Over time, breaks would occur to which content released in the past would be vulnerable. So content in general would have a limited time in which it was protected. This would be the window in which profitability was possible. Content older than some threshold - weeks? months? years? - would be freely available.
