Informatica 33 (2009) 49-68

Blind Watermark Estimation Attack for Spread Spectrum Watermarking

Hafiz Malik
Electrical and Computer Engineering Department
University of Michigan-Dearborn, Dearborn, MI 48128, USA
E-mail: hafiz@umich.edu, URL: http://www-personal.engin.umd.umich.edu/~hafiz

Keywords: spread-spectrum watermarking, independent component analysis, blind source separation, watermark estimation, detection, decoding

Received: September 18, 2008

This paper presents an efficient scheme for blind estimation of watermarks embedded using additive watermark embedding methods. The scheme exploits the mutual independence between the host media and the embedded watermark, and the non-Gaussianity of the host media, for watermark estimation. The proposed scheme employs the framework of independent component analysis (ICA) and poses the problem of watermark estimation as a blind source separation (BSS) problem. Analysis of the scheme shows that the proposed detector significantly outperforms existing correlation-based blind detectors traditionally used for SS-based watermarking. The proposed ICA-based blind detection/decoding scheme has been simulated using real-world audio clips. The simulation results show that the proposed ICA-based method can detect and decode the watermark with an extremely low decoding bit error probability (less than 0.01) against common watermarking attacks and benchmark degradations.

Povzetek (summary): A method for watermark detection is described.

1 Introduction

Digital forgeries and unauthorized sharing of digital media have emerged as a growing concern over the last decade. The widespread use of multimedia information is aided by factors such as the growth of the Internet, the proliferation of low-cost and reliable storage devices, the deployment of seamless broadband networks, the availability of state-of-the-art digital media production and editing technologies, and the development of efficient multimedia compression algorithms.
Multimedia piracy has subjected the entertainment industry to enormous annual revenue losses. For example, the music industry alone claims millions of illegal music downloads on the Internet every week. It is therefore imperative to have robust technologies to protect copyrighted digital media from illegal sharing and tampering. Traditional digital data protection techniques, such as encryption and scrambling, cannot provide adequate protection on their own, as these technologies are unable to protect digital content once it is decrypted or unscrambled. Digital watermarking technology complements cryptography by protecting digital content even after it is deciphered [1].

Digital watermarking refers to the process of imperceptibly embedding information (the watermark) into a digital object (the host object). Based on the watermark embedding method used, existing watermarking schemes can be classified into two major categories:

1. blind embedding, in which the watermark embedder does not exploit the host signal information during the watermark embedding process. Watermarking schemes based on spread-spectrum (SS) [1, 2, 3, 4, 5] fall into this category.

2. informed embedding, in which the watermark embedder exploits knowledge of the host signal during the watermark embedding process. Watermarking schemes based on quantization index modulation [1, 6] belong to this category.

Similarly, based on the detection method used, existing watermarking schemes can be classified into two major categories:

1. informed detector, which assumes that the host signal is available at the detector during the watermark detection process, and

2. blind detector, which assumes that the host signal is not available at the detector for watermark detection.

Although the performance expected from a given watermarking system depends on the target application area [1], robustness of the embedded watermark and efficient detection are desirable features of a given watermarking scheme.
In addition, fidelity (or imperceptibility) of the embedded watermark is a requirement of perception-based watermarking schemes [1]. To meet the fidelity requirement, the power of the embedded watermark (watermark strength) is generally kept much lower than the host signal power.

In this paper we consider the additive watermark embedding model, e.g. SS-based watermarking, where the watermark signal is added to the host signal in the marking space to obtain the watermarked signal. Existing watermark detection schemes for SS-based watermarking generally employ a statistical characterization of the host signal to develop an optimal or suboptimal watermark detector [6, 7, 8]. It is important to mention that blind watermark detectors for SS-based watermarking perform poorly because the host signal acts as interference at the blind decoder. A nonzero decoding error probability at the blind watermark decoder, even in the absence of attack-channel distortion, is therefore one of the limitations of existing blind watermark detectors for SS-based watermarking schemes.

This paper presents a novel blind watermark detection method for blind additive watermark embedding schemes [1, 2, 3, 4, 5]. The main motivation of this paper is to design a blind detector for SS-based watermarking schemes that is capable of suppressing host-signal interference (or improving the watermark-to-host ratio) at the detector, hence improving decoding as well as detection performance. Towards this end, the proposed detector uses the ICA framework by posing the watermark detection problem as a blind source separation (BSS) problem. The proposed detector models the received watermarked signal as a linear mixture of underlying independent components (the host signal and the watermark). It also assumes non-Gaussianity of the host signal.
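The two points made above, host interference at a blind correlation decoder and non-Gaussianity of the host, can be illustrated with a small numerical sketch. It is not the detector proposed in this paper. It assumes an i.i.d. zero-mean Laplacian host, a constant perceptual mask `alpha`, and a plain sign-of-correlation decoder; the function names and the values `alpha = 0.3` and `sigma_s = 1.0` are hypothetical choices made only for illustration.

```python
import numpy as np


def correlation_decoder_ber(n_bits=20000, n=1, alpha=0.3, sigma_s=1.0, seed=0):
    """Empirical bit error rate of a blind sign-of-correlation decoder, no attack noise."""
    rng = np.random.default_rng(seed)
    # Assumed host model: i.i.d. zero-mean Laplacian coefficients with variance
    # sigma_s**2 (a Laplacian with scale s has variance 2 * s**2).
    S = rng.laplace(loc=0.0, scale=sigma_s / np.sqrt(2.0), size=(n_bits, n))
    W = rng.choice([-1.0, 1.0], size=(n_bits, n))  # pseudo-random +/-1 sequence
    b = rng.choice([-1.0, 1.0], size=(n_bits, 1))  # one message bit per row
    X = S + alpha * W * b                          # additive embedding, constant mask
    corr = np.sum(X * W, axis=1, keepdims=True)    # host term S*W acts as interference
    b_hat = np.where(corr >= 0.0, 1.0, -1.0)
    return float(np.mean(b_hat != b))


def excess_kurtosis(x):
    """Sample excess kurtosis: 0 for a Gaussian, 3 for a Laplacian."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.mean(x**4) / np.mean(x**2) ** 2 - 3.0)


if __name__ == "__main__":
    alpha, sigma_s = 0.3, 1.0
    ber_1 = correlation_decoder_ber(n=1, alpha=alpha, sigma_s=sigma_s)
    ber_64 = correlation_decoder_ber(n=64, alpha=alpha, sigma_s=sigma_s)
    # Closed-form error probability for n = 1 and a Laplacian host.
    theory_1 = 0.5 * np.exp(-np.sqrt(2.0) * alpha / sigma_s)
    print(f"n=1:  empirical BER {ber_1:.4f}, closed form {theory_1:.4f}")
    print(f"n=64: empirical BER {ber_64:.4f}")

    rng = np.random.default_rng(1)
    print("excess kurtosis, Laplacian:", excess_kurtosis(rng.laplace(size=200_000)))
    print("excess kurtosis, Gaussian: ", excess_kurtosis(rng.normal(size=200_000)))
```

Under these assumptions the decoder makes errors even though no attack noise is added, because the host term in the correlation does not vanish. Spreading one bit over more coefficients lowers the error rate but does not remove the interference. The kurtosis check shows one simple way in which a Laplacian-like host departs from a Gaussian.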
Recently, we have shown in [15, 16, 17] that the watermark estimation problem for SS-based watermarking can be modeled as a BSS problem for an underdetermined mixture of independent sources. Therefore, the ICA framework can be used to estimate the watermark from watermarked signals obtained using the additive embedding model. The proposed ICA-based detector first estimates the hidden independent components (i.e., the watermark and the host signal) from the received watermarked signal using the ICA framework; these estimated components are then used to detect the embedded watermark. We present a theoretical analysis to show that the proposed ICA-based detector performs significantly better than existing watermark detectors that operate without canceling the host signal interference at the detector [6, 7]. Simulation results also show that the proposed detector, in estimation-correlation based detection settings, outperforms the normalized correlation based detector (commonly used for watermark detection in the SS-based watermarking community [1, 2, 3]) operating without host interference suppression.

Simulation results presented in this paper are evaluated against a variety of signal manipulations and degradations applied to the watermarked media. These signal degradations include addition of colored and white noise, resampling, requantization, lossy compression, filtering, time- and frequency-scaling, and StirMark for audio benchmark attacks [18, 19, 20].

The proposed ICA-based watermark detector is applicable to SS-based watermarking of all media types, i.e. audio, video and images. However, in this paper the proposed detector is tested with digital audio (music and voiced speech signals only) as the host media for watermark embedding, detection, and performance analysis.

In the past, the ICA framework has been used for multimedia watermarking [9, 10, 11, 12, 13, 14].
However, existing ICA-based data-hiding schemes are either not applicable to SS-based watermarking [9, 10, 11, 13] or use an informed detection framework for watermark extraction [12, 14], and are therefore not discussed in this manuscript. For example, Yu et al. [14] have proposed an ICA-based watermark detector that can be used for SS-based watermarking, but their detector uses the embedded watermark and private data during the watermark extraction process. Similarly, the ICA-based watermark detector proposed by Sener et al. [12] is also applicable to SS-based watermark detection, but it also requires the original watermark during the watermark detection process; it therefore cannot be used for blind watermark detection/extraction applications.

The rest of the paper is organized as follows: the basics of SS-based watermarking are discussed in Section 2; a brief overview of independent component analysis theory is provided in Section 3. The proposed ICA-based watermark detector, along with its decoding, detection, and maximum watermarking-rate performance analysis, is described in Section 4. Simulation results for the decoding bit error probability performance of the proposed ICA-based watermark detector and a correlation-based detector against different attacks and signal degradations are described in Section 5. Finally, concluding remarks along with future research directions are presented in Section 6.

2 Basics of SS-based watermarking

The SS-based watermarking system can be modeled using a classical secure communication model [1], as shown in Fig. 1. In Fig. 1, $S \in \mathbb{R}^n$ is a vector containing the coefficients of the host signal in the marking space. It is assumed that the coefficients $S_i$, $i = 0, 1, \ldots, n-1$, are independent and identically distributed (i.i.d.) random variables (r.v.) with zero mean and variance $\sigma_s^2$.
A watermark, $V$, is generated using: (1) a message bit, $b \in \{\pm 1\}$, to be embedded into $n$ coefficients of the host signal, (2) a key-dependent pseudo-random sequence $W \in \{\pm 1\}^n$, and (3) a perceptual mask, $\alpha \in \mathbb{R}^n$, estimated based on the human auditory system (HAS) and the host signal $S$, i.e. $\alpha = f(S, \mathrm{HAS})$. We further assume that the watermark sequence $W$ and the host signal coefficients $S$ are mutually independent. The amplitude-modulated watermark is spectrally shaped according to the perceptual mask $\alpha$ to meet the fidelity requirement of perception-based watermarking. The watermarked signal $X$ is obtained by adding the amplitude-modulated watermark $V = \alpha \odot W b$, where $\odot$ denotes the element-wise product of two vectors, to the host signal $S$. The watermarked signal $X$ can be expressed as

$$X = S + V. \tag{1}$$

The embedding distortion, $D_e$, can be expressed as

$$D_e = X - S. \tag{2}$$

Figure 1: Perceptual based data hiding system with blind receiver. [Block diagram: input message, watermark embedder (message encoder, perceptual mask estimation, data embedding), key, host media, adversary attack (attack channel, noise $N$), watermark detector (message detector, message decoder).]

The mean-squared embedding distortion, $d_e$, is expressed as

$$d_e = \frac{1}{n} E\{\|D_e\|^2\} = \frac{1}{n} E\{\|X - S\|^2\} = \frac{1}{n} \|\alpha \odot W b\|^2 = \frac{1}{n} \sum_{i=0}^{n-1} \alpha_i^2 = \sigma_v^2, \tag{3}$$

where $\|\cdot\|$ represents the Euclidean norm, $E\{\cdot\}$ denotes the expected value of a r.v., and $\sigma_v^2$ represents the variance of the watermark $V$.

The signal distortion due to an active adversary attack can be viewed as channel noise, $N$, as shown in Fig. 1. The received watermarked signal at the detector,

$$\tilde{X} = X + N, \tag{4}$$

is processed for watermark detection.

Watermarking schemes based on the blind additive embedding model generally use a probabilistic characterization of the host signal to develop an optimal or suboptimal watermark detector (in the ML sense). Statistical characterizations of real-world host signals are available in the spatial domain as well as in the transform domain. For example, stationary speech samples/coefficients, both in the time domain and in the DWT domain, can be approximated by a Laplacian distribution [21] (see Appendix A for the probability distribution function (pdf) of DWT coefficients), i.e.,

$$f_S(\tau) = \frac{\beta}{2} e^{-\beta |\tau|}, \quad |\tau| < \infty. \tag{5}$$

3. no pre-processing is applied to the watermarked audio to suppress host interference,

4. $W_i$ takes values $\pm 1$ with probability 1/2 each.

In addition, for performance analysis we will consider two information embedding scenarios: (1) one bit $b \in \{\pm 1\}$ of information is embedded in each coefficient of the host signal, $S_i$, and (2) one bit $b \in \{\pm 1\}$ of information is embedded in $|C|$ coefficients of the host signal $S$, where $|C|$ denotes the cardinality of the selected coefficient index set $C$.

Consider the case of one bit embedded per coefficient, i.e. $n = 1$, first. It has been shown in [7] that the ML decoder estimates $\hat{b} = 1$ if $X_0 W_0 > 0$, and an error occurs when $X_0 W_0 < 0$. The average decoding bit error probability, $P_e$, is given by

$$P_e = \Pr\{X_0 W_0 < 0 \mid b = 1\} = \int_{-\infty}^{0} f_S(\tau - \alpha)\, d\tau. \tag{6}$$

Assuming the Laplacian distribution model for the host, for which $\beta = \sqrt{2}/\sigma_s$, it can be shown that

$$P_e = \frac{1}{2} e^{-\beta \alpha} = \frac{1}{2} \exp\left(-\sqrt{2/\lambda_0}\right), \tag{7}$$

where $\lambda_0 = \sigma_s^2/\alpha^2$, which is generally referred to as signal-