## Information-theoretic analysis of information hiding (2003)

Venue: IEEE Transactions on Information Theory

Citations: 228 (18 self)

### BibTeX

@ARTICLE{Moulin03information-theoreticanalysis,

author = {Pierre Moulin and Joseph A. O'Sullivan},

title = {Information-theoretic analysis of information hiding},

journal = {IEEE Transactions on Information Theory},

year = {2003},

volume = {49},

pages = {563--593}

}

### Abstract

An information-theoretic analysis of information hiding is presented in this paper, forming the theoretical basis for design of information-hiding systems. Information hiding is an emerging research area which encompasses applications such as copyright protection for digital media, watermarking, fingerprinting, steganography, and data embedding. In these applications, information is hidden within a host data set and is to be reliably communicated to a receiver. The host data set is intentionally corrupted, but in a covert way, designed to be imperceptible to a casual analysis. Next, an attacker may seek to destroy this hidden information, and for this purpose, introduce additional distortion to the data set. Side information (in the form of cryptographic keys and/or information about the host signal) may be available to the information hider and to the decoder. We formalize these notions and evaluate the hiding capacity, which upper-bounds the rates of reliable transmission and quantifies the fundamental tradeoff between three quantities: the achievable information-hiding rates and the allowed distortion levels for the information hider and the attacker. The hiding capacity is the value of a game between the information hider and the attacker. The optimal attack strategy is the solution of a particular rate-distortion problem, and the optimal hiding strategy is the solution to a channel-coding problem. The hiding capacity is derived by extending the Gel’fand–Pinsker theory of communication with side information at the encoder. The extensions include the presence of distortion constraints, side information at the decoder, and unknown communication channel. Explicit formulas for capacity are given in several cases, including Bernoulli and Gaussian problems, as well as the important special case of small distortions. In some cases, including the last two above, the hiding capacity is the same whether or not the decoder knows the host data set. It is shown that many existing information-hiding systems in the literature operate far below capacity.

Index Terms: Channel capacity, cryptography, fingerprinting, game theory, information hiding, network information theory,
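
For orientation, the classical results that the abstract says are being extended can be written down as a short math sketch. The notation below is generic textbook notation chosen here for illustration and is an assumption, not the paper's own: S is a channel state known noncausally at the encoder only, U is an auxiliary random variable, X is the channel input, and Y is the channel output. The second expression is Costa's Gaussian special case, with input power constraint P, Gaussian interference S, and independent Gaussian noise of variance N.

```latex
% Gel'fand-Pinsker capacity: state S known at the encoder, not at the decoder.
% (Generic notation, assumed for illustration; not copied from the paper.)
C_{\mathrm{GP}} \;=\; \max_{p(u,x \mid s)} \bigl[\, I(U;Y) - I(U;S) \,\bigr]

% Costa's "writing on dirty paper" special case:
% Y = X + S + Z, \quad \mathbb{E}[X^2] \le P, \quad Z \sim \mathcal{N}(0, N)
% The capacity does not depend on the power of the interference S.
C_{\mathrm{Costa}} \;=\; \tfrac{1}{2} \log\!\Bigl(1 + \frac{P}{N}\Bigr)
```

Neither expression is the paper's hiding capacity. According to the abstract, the paper's result further accounts for distortion constraints, side information at the decoder, and an attack channel that is not known in advance; those formulas are not reproduced here.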

### Citations

8563 |
Elements of Information Theory
- Cover, Thomas
- 1991
Citation Context ...s distributed according to . Given random variables , , , we denote the entropy of by , the mutual information between and by , and the conditional mutual information between and , conditioned on ,by =-=[31]-=-. The Gaussian distribution with mean and variance is denoted by . Finally, we write as to denote asymptotic equality of two functions and , i.e., A. Description of the Problem There are various formu... |

1476 |
Information Theory and Reliable Communication
- Gallager
- 1968
Citation Context ...their elements to an appropriate set of “well-behaved” functions (including properties such as boundedness and continuity). Under these assumptions, the mutual informations (5.1) and (5.2) are finite =-=[48]-=-, and so is for all admissible . For any , select a finite partition of the sets such that (5.3) (5.4) (5.5) (5.6) for all . The existence of such a partition is guaranteed by our smoothness assumptio... |

806 | Secure Spread Spectrum Watermarking for Multimedia
- Cox, Kilian, Leighton, Shamoon
- 1997
Citation Context ...the owner of the data set embeds a serial number, or fingerprint, that uniquely identifies the user of the dataset and makes it possible to trace any unauthorized use of the data set back to the user =-=[16]-=-, [17]. This application is particularly challenging as it opens up the possibility of a collusion between different users to remove these fingerprints. A different type of application is the embeddin... |

792 |
Communication Theory of Secrecy Systems
- Shannon
- 1949
Citation Context ...n the host data is often the single most important requirement. Moreover, while cryptography has received significant attention in the Information Theory community, following Shannon's landmark paper =-=[19]-=-, information hiding today is an immature subject, both on a mathematical and a technological level. For instance, there is no consensus today about the formulation of system requirements; and there i... |

710 | The Rate-distortion Function for Source Coding with Side Information at The Decoder
- Wyner, Ziv
- 1976
Citation Context ...d Willems [11]. Our initial derivation of hiding capacity was based on the analogy between the watermarking problem and Wyner–Ziv’s problem of source coding with side information at the decoder [31], =-=[45]-=-. This analogy was further developed by Chou et al. [29] and Barron et al. [30]. Four key differences between our setup and Gel’fand and Pinsker’s are • the presence of distortion constraints; • the a... |

671 |
Writing on dirty paper
- Costa
- 1983
Citation Context ...imal attack for Gaussian . The optimal attack is again the Gaussian test channel (5.10). The optimal distribution is the same optimal distribution that achieves capacity in a problem studied by Costa =-=[50]-=-. Costa’s result is an elegant extension of Gel’fand and Pinsker’s results (see Fig. 3) to the case of additive white Gaussian noise channels with input power constraints. The hiding capacity for the ... |

669 |
Cryptography: Theory and Practice
- Stinson
- 1995
Citation Context ...is Bernoulli(D2). The derivation of hiding capacity in the cases D 1 # 1 2 or D 2 # 1 2 is straightforward. The encoding system for X above is the same as Vernam's one-time pad encryption system =-=[32]-=-. The distributions p(Z) and p(Z|X) are identical, which means that this system satisfies Shannon's perfect secrecy condition [19, 32]: observing the data X does not provide the attacker with any info... |

460 |
Dynamic Noncooperative Game Theory
- Başar, Olsder
- 1999
Citation Context ... 2.6: The information-hiding capacity is the supremum of all achievable rates for distortion and attacks in the class . III. THE INFORMATION-HIDING GAME Information hiding can be thought of as a game =-=[39]-=- between two cooperative players (the information hider and the decoder) and an opponent (the attacker). The first party tries to maximize a payoff function, and the opponent tries to minimize it. A n... |

432 |
Techniques for data hiding
- Bender, Gruhl, et al.
- 1996
Citation Context ...ion of . In other problems, only partial information about (e.g., image features [35]) is available at the decoder. Other examples of side information include hash values [36], location of watermarks =-=[37]-=-, [38], and seeds for modulating pseudonoise sequences in spread-spectrum systems [20], [28]. In blind-information-hiding applications, the decoder is not allowed access to any side information, so any... |

285 | Collusion-secure fingerprinting for digital data
- Boneh, Shaw
- 1995
Citation Context ...y degrading the signal, and the fingerprint itself should be imperceptible. Developing a successful fingerprinting system is difficult, because of possible collusion between multiple users [3], [16], =-=[17]-=-. We show later that collusion allows users to compute a good estimate of the host signal, which contains little residual information about the individual messages. Single-user detectors are used to e... |

252 | Multimedia watermarking techniques
- Hartung, Kutter
Citation Context ... and privacy protection [1], [2]. An excellent review of the current state of the art appears in [3], and comprehensive surveys of image and multimedia watermarking techniques are available from [4], =-=[5]-=-. The majority of the papers to date have focused on novel ways to hide information and to detect and/or remove hidden information. However, these papers have lacked a guiding theory describing the fu... |

225 |
Coding for channel with random parameters
- Gel’fand, Pinsker
- 1980
Citation Context ...hievability is proved using a random bin coding technique [23, p. 410]. The decoder uses joint typical set decoding. The proof of Prop. 3.1 is an adaptation of techniques used by Gel'fand and Pinsker =-=[29]-=- to derive the capacity of a discrete memoryless channel with random parameters (state of the channel) that are known at the encoder but not at the decoder, see Fig. 3. The capacity of this channel is... |

202 | Information Hiding - a Survey
- Petitcolas, Anderson, et al.
- 1999
Citation Context ...ng theory describing the fundamental limits of any information-hiding system. The need for practitioners and system designers to understand the nature of these fundamental limits has been recognized =-=[1, 3, 5, 6, 7]-=-. We help to close this gap by providing a theoretical basis for a generic version of the information-hiding problem. We formulate the information-hiding problem as a communication problem and seek t... |

197 | An information-theoretic model for steganography
- Cachin
- 1998
Citation Context ...ssage secret, but its very presence within the host data set should be undetectable. Steganography and related applications have a long, sometimes romantic history dating from ancient times [3], [6], =-=[19]-=-, [20]. This brief discussion suggests that information hiding borrows from a variety of areas, including signal processing, communications, game theory, and cryptography. Indeed, a vast array of tech... |

150 | Watermarking of uncompressed and compressed video
- Hartung, Girod
- 1998
Citation Context ...d private information hiding, the host data themselves are available to the decoder [16]; and in the second case, which we term blind information hiding, no side information at all is available [27], =-=[28]-=-. 1 Our theory quantifies the effect of side information on hiding capacity; the role of side information in watermarking has also been explored by Cox et al. [7], Chou et al. [29], Barron et al. [30]... |

149 | Perceptual watermarks for digital images and video
- Wolfgang, Podilchuk, et al.
- 1999
Citation Context ...elation rule or a modified correlation rule [16]. More elaborate designs of in the image watermarking literature exploit the perceptual characteristics of the human visual system and are not additive =-=[4]-=-, [7]. While such designs make it convenient for the information hider to satisfy distortion constraints, heuristic choices can be largely suboptimal in terms of achievable rates. As our subsequent an... |

141 | Information and Information Stability of Random Variables and Processes. San-Francisco: Holden-Day - Pinsker - 1964 |

130 | Watermarking as communications with side information
- Cox, Miller, et al.
- 1999
Citation Context ...cribing the fundamental limits of any information-hiding system. The need for practitioners and system designers to understand the nature of these fundamental limits has been recognized [1], [3], [5]–=-=[7]-=-. We help to close this gap by providing a theoretical basis for a generic version of the information-hiding problem. We formulate the information-hiding problem as a communication problem and seek th... |

127 |
A coding theorem for the discrete memoryless broadcast channel
- Marton
- 1979
Citation Context ...ork. The analogy between Gel'fand and Pinsker's work and watermarking problems was first identified by Chen [9]. The converse theorem (Proposition 3.2) cannot be proved using the same technique as in =-=[29, 31]-=-, because the attack channel is not known to the encoder, so we need the Markov chain property (U,sX , K) # X # Y (see proof of Theorem 3.3.) Other differences between our setup and Gel'fand and Pinske... |

125 |
Coding for channel with random parameters
- Gel’fand, Pinsker
- 1980
Citation Context ...tributed as i.e., forms a Markov chain. satisfies convexity properties stated in Proposition 4.1. Properties ii), iii), iv), v), and vii) are analogous to those stated in Gel’fand and Pinsker’s paper =-=[43]-=-. The payoff (4.4) is convex in but is nonconcave in . Proposition 4.1 (Convexity Properties of ): i) For fixed and , the payoff (4.4) is convex in . ii) For fixed , and , (4.4) is concave in [46]. ii... |

119 | Reliable Communication Under Channel Uncertainty
- Lapidoth, Narayan
- 1998
Citation Context ... to the decoder and enable the use of randomized codes. This is a standard communication technique which generally leads to improved transmission performance and is used to design combat jamming [32]–=-=[34]-=-. Second, may provide side information about to the decoder. The dependencies between and are modeled using a joint distribution . For instance, may be fully available at the decoder, a common assumpt... |

117 |
On the capacity of computer memory with defects
- Heegard, El Gamal
- 1983
Citation Context ...e random channel parameter (“state” of the channel), and is an auxiliary random variable. These results have been extended by Heegard and El Gamal to find the capacity of computer memory with defects =-=[44]-=-. In the information-hiding problem, the host data plays the role of the random parameter in Gel’fand and Pinsker’s work. The analogy between Gel’fand and Pinsker’s work and watermarking problems was ... |

107 | The Gaussian watermarking game
- Cohen, Lapidoth
- 2002
Citation Context ...ugh the communication system. Related aspects of this problem have also been recently explored by Merhav [8], Steinberg and Merhav [9], Somekh-Baruch and Merhav [10], Willems [11], Cohen and Lapidoth =-=[12]-=-, and Chen and Wornell [13], [14]. Also, see the recent study by Hernández and Pérez-González on decision-theoretic aspects of the watermarking problem [15]. In our generic information-hiding problem,... |

92 | Information hiding-a survey
- Petitcolas, Anderson, et al.
- 1999
Citation Context ...he message secret, but its very presence within the host data set should be undetectable. Steganography and related applications have a long, sometimes romantic history dating from ancient times [3], =-=[6]-=-, [19], [20]. This brief discussion suggests that information hiding borrows from a variety of areas, including signal processing, communications, game theory, and cryptography. Indeed, a vast array o... |

91 |
Communication theory of secrecy systems. Bell System Technical Journal
- Shannon
- 1949
Citation Context ...n the host data is often the single most important requirement. Moreover, while cryptography has received significant attention in the Information Theory community, following Shannon’s landmark paper =-=[25]-=-, information hiding today is an immature subject, both on a mathematical and a technological level. For instance, there is no consensus today about the formulation of system requirements; and there i... |

86 |
Cryptology for digital TV broadcasting
- Macq, Quisquater
- 1995
Citation Context ...ion programs. Here the message is secret in the sense that it should not be decipherable by unauthorized decoders. Other applications of information hiding to television broadcasting are described in =-=[18]-=-. Another, more classical application of information hiding is steganography. Here not only is the message secret, but its very presence within the host data set should be undetectable. Steganography ... |

84 | DCT-based watermark recovering without resorting to the uncorrupted original image
- Piva, Barni, et al.
- 1997
Citation Context ... termed private information hiding, the host data themselves are available to the decoder [16]; and in the second case, which we term blind information hiding, no side information at all is available =-=[27]-=-, [28]. 1 Our theory quantifies the effect of side information on hiding capacity; the role of side information in watermarking has also been explored by Cox et al. [7], Chou et al. [29], Barron et al... |

68 | The duality between information embedding and source coding with side information and some applications
- Barron, Chen, et al.
- 2003
Citation Context ...[28]. 1 Our theory quantifies the effect of side information on hiding capacity; the role of side information in watermarking has also been explored by Cox et al. [7], Chou et al. [29], Barron et al. =-=[30]-=-, Chen and Wornell [14], and Willems [11]. The theory is illustrated using an example based on a Bernoulli process and a Hamming distortion function. In Section V, we extend these results to the case ... |

67 | Spread Spectrum Image Steganography
- Marvel, Boncelet, et al.
Citation Context ...secret, but its very presence within the host data set should be undetectable. Steganography and related applications have a long, sometimes romantic history dating from ancient times [3], [6], [19], =-=[20]-=-. This brief discussion suggests that information hiding borrows from a variety of areas, including signal processing, communications, game theory, and cryptography. Indeed, a vast array of techniques... |

65 | A public key watermark for image verification and authentication
- Wong
- 1998
Citation Context ...16]. In this case, is a function of . In other problems, only partial information about (e.g., image features [35]) is available at the decoder. Other examples of side information include hash values =-=[36]-=-, location of watermarks [37], [38], and seeds for modulating pseudonoise sequences in spread-spectrum systems [20], [28]. In blind-information-hiding applications, the decoder is not allowed access to... |

57 | Statistical analysis of watermarking schemes for copyright protection of images - Hernandez, Perez-Gonzalez - 1999 |

35 | Preprocessed and postprocessed quantization index modulation methods for digital watermarking
- Chen, Wornell
Citation Context ... have briefly described some of these extensions. Much work remains to be done in designing practical information-hiding codes that approach capacity. Recent results have been reported in [14], [51], =-=[53]-=-–[55]. Our analysis has outlined the potential benefits of using randomized codes. Other practical problems include the choice of a suitable distortion measure, which is a holy grail in audio, image, ... |

34 |
An information-theoretic approach to the design of robust digital watermarking systems
- Chen, Wornell
- 1999
Citation Context ... have been used to design algorithms for hiding information (e.g., the spread-spectrum methods popularized by Cox et al. [16], or the dithered quantization methods developed by Chen and Wornell [14], =-=[21]-=-) and for attempting to remove that information (by means of techniques such as compression, signal warping, and addition of noise). Perceptual models for audio, imagery, and video have helped to quan... |

34 | On the duality between distributed source coding and data hiding
- Chou, Pradhan, et al.
- 1999
Citation Context ... is available [27], [28]. 1 Our theory quantifies the effect of side information on hiding capacity; the role of side information in watermarking has also been explored by Cox et al. [7], Chou et al. =-=[29]-=-, Barron et al. [30], Chen and Wornell [14], and Willems [11]. The theory is illustrated using an example based on a Bernoulli process and a Hamming distortion function. In Section V, we extend these ... |

27 | On the Capacity Game of Public Watermarking Systems
- Somekh-Baruch, Merhav
- 2004
Citation Context ...mum rate of reliable communication through the communication system. Related aspects of this problem have also been recently explored by Merhav [8], Steinberg and Merhav [9], Somekh-Baruch and Merhav =-=[10]-=-, Willems [11], Cohen and Lapidoth [12], and Chen and Wornell [13], [14]. Also, see the recent study by Hernández and Pérez-González on decision-theoretic aspects of the watermarking problem [15]. In ... |

27 | Information Theory: Coding Theory for Discrete Memoryless Systems - Csiszár, Körner - 1981 |

27 | A robust optimization solution to the data hiding problem using distributed source coding principles - Chou, Pradhan, et al. - 2000 |

20 |
Digital watermarking: from concepts to realtime video applications
- Busch, Funk, et al.
- 1999
Citation Context ... . In other problems, only partial information about (e.g., image features [35]) is available at the decoder. Other examples of side information include hash values [36], location of watermarks [37], =-=[38]-=-, and seeds for modulating pseudonoise sequences in spread-spectrum systems [20], [28]. In blind-information-hiding applications, the decoder is not allowed access to any side information, so anyone ca... |

19 | Identification in the presence of side information with application to watermarking
- Steinberg, Merhav
- 2001
Citation Context ...tion problem and seek the maximum rate of reliable communication through the communication system. Related aspects of this problem have also been recently explored by Merhav [8], Steinberg and Merhav =-=[9]-=-, Somekh-Baruch and Merhav [10], Willems [11], Cohen and Lapidoth [12], and Chen and Wornell [13], [14]. Also, see the recent study by Hernández and Pérez-González on decision-theoretic aspects of the... |

19 |
Cryptography: Theory and Practice
- Stinson
- 1995
Citation Context ...ent, and . So condition (4.10) is satisfied. The derivation of hiding capacity in the cases or is straightforward. The encoding system for above is the same as Vernam’s one-time pad encryption system =-=[47]-=-. The distributions and are identical, which means that this system satisfies Shannon’s perfect secrecy condition [25], [47]: observing the data does not provide the attacker with any information abou... |

16 | Steganalysis and game equilibria
- Ettinger
- 1998
Citation Context ...and video have helped to quantify the distortions introduced by information-hiding and attack algorithms. Game-theoretic aspects of information hiding have been explored for special cases by Ettinger =-=[22]-=- and O’Sullivan et al. [23], [24]. Cryptographic aspects of information hiding include the use of secret keys to protect the message. It should, however, be clearly recognized that the functional requ... |

16 |
Information-theoretic analysis of steganography
- O’Sullivan, Moulin, et al.
- 1998
Citation Context ...antify the distortions introduced by information-hiding and attack algorithms. Game-theoretic aspects of information hiding have been explored for special cases by Ettinger [22] and O’Sullivan et al. =-=[23]-=-, [24]. Cryptographic aspects of information hiding include the use of secret keys to protect the message. It should, however, be clearly recognized that the functional requirements of cryptography an... |

16 | Iteratively decodable codes for watermarking applications
- Kesal, Mihcak, et al.
- 2000
Citation Context ... briefly described some of these extensions. Much work remains to be done in designing practical information-hiding codes that approach capacity. Recent results have been reported in [14], [51], [53]–=-=[55]-=-. Our analysis has outlined the potential benefits of using randomized codes. Other practical problems include the choice of a suitable distortion measure, which is a holy grail in audio, image, and v... |

12 |
Convex Analysis
- Rockafellar
- 1972
Citation Context ...d be the Cartesian product of at most probability simplices in , where each probability simplex is indexed by a different value of . is a closed, convex, polyhedral set. Its vertices (extremal points =-=[57]-=-) satisfy the following property: the functions are zero/one functions for all . The maximum of a convex function over a convex polyhedral set is attained at a vertex of that set. Due to the distortio... |

11 | Recovery of watermarks from distorted images
- Johnson, Duric, et al.
- 1999
Citation Context ...y available at the decoder, a common assumption in the watermarking literature [3], [4], [16]. In this case, is a function of . In other problems, only partial information about (e.g., image features =-=[35]-=-) is available at the decoder. Other examples of side information include hash values [36], location of watermarks [37], [38], and seeds for modulating pseudonoise sequences in spread-spectrum systems ... |

11 |
Some information theoretic saddlepoints
- Borden, Mason, et al.
- 1985
Citation Context ...me in this case is . It is in the interest of neither party to deviate from a saddlepoint strategy [39]. For examples of saddlepoint strategies in information-theoretic games, see [31, p. 263], [41], =-=[42]-=-. In many games, Nash equilibria and saddlepoints do not exist. Then the information available to each party critically determines the value of the game. If the players choose their actions in a given... |

10 |
Multimedia Data-Embedding and Watermarking Strategies
- Swanson, Kobayashi, et al.
- 1998
Citation Context ... SELECTED AREAS IN COMMUNICATIONS and of the PROCEEDINGS OF THE IEEE were recently devoted to copyright and privacy protection [1], [2]. An excellent review of the current state of the art appears in =-=[3]-=-, and comprehensive surveys of image and multimedia watermarking techniques are available from [4], [5]. The majority of the papers to date have focused on novel ways to hide information and to detect... |

10 |
Information rates of non-Gaussian processes
- Gerrish, Schultheiss
- 1964
Citation Context ...ION ANALYSIS The case of small distortions and is typical of many information-hiding problems. One may wonder whether some simplifications occur in the theory, possibly like in rate-distortion theory =-=[52]-=-. We show that this is indeed the case. We consider the squared-error distortion metric over the real line and show that the hiding capacity is independent of the statistics of , asymptotically as . Th... |

9 |
Quantization index modulation methods: A class of provably good methods for digital watermarking and information embedding
- Chen, Wornell
- 2001
Citation Context ...ated aspects of this problem have also been recently explored by Merhav [8], Steinberg and Merhav [9], Somekh-Baruch and Merhav [10], Willems [11], Cohen and Lapidoth [12], and Chen and Wornell [13], =-=[14]-=-. Also, see the recent study by Hernández and Pérez-González on decision-theoretic aspects of the watermarking problem [15]. In our generic information-hiding problem, a message is to be embedded in a... |

8 |
A complete characterization of minimax, and maximin encoder-decoder policies for communication channels with incomplete statistical description
- Basar
- 1985
Citation Context ...equality holds only at saddlepoints [39]), 3 we have , as one does expect, owing to the additional information available to the attacker. The formulation (3.3) has been used in jamming problems [34], =-=[40]-=- and in recent watermarking problems [10], [12] and yields a lower value of the information-hiding game. An upper value of the game is obtained using the unrealistic (because it is overly optimistic) ... |