Conference papers

A high-capacity watermarking technique for audio signals based on MDCT-domain quantization

Abstract: Watermarking consists in hiding (embedding) binary information within a signal in an imperceptible way; in the present context of audio signals, this means that the mark is inaudible. Watermarking was first used to protect digital content as part of Digital Rights Management (DRM). In this context of secured applications, significant effort was devoted to ensuring the robustness of watermarks against pirate attacks aiming at neutralizing them, rather than to increasing the quantity of embedded information; the bitrate was usually within the range of tens of bits per second (bps) for audio signals. Nowadays, audio watermarking can be used for other kinds of applications, in particular metadata transmission. However, bitrates are usually still quite low, although such applications call for higher bitrates in exchange for lower robustness. In this study we propose a high-capacity watermarking technique for audio signals. This technique is suitable for many uncompressed audio signals, in particular the 16-bit Pulse Code Modulation (PCM) signals widely used in audio-CD and WAV formats. The proposed technique applies Quantization Index Modulation (QIM) to the MDCT (Modified Discrete Cosine Transform) coefficients of the signal. The underlying principle is that, if those coefficients can be significantly modified by quantization in audio compression schemes such as MPEG MP3/AAC without quality impairment, they can also be modified to embed watermark codes. Following audio compression principles, a psychoacoustic model (PAM) is used at the watermark embedder to take into account the behavior of the human auditory system and meet the inaudibility constraint. The PAM is used to estimate an optimal watermarking capacity for each sub-band of each MDCT frame.
The resulting capacity values are transmitted as (watermarked) side-information to the decoder, so that the decoder can retrieve the useful watermarked information in the corresponding sub-band. For this purpose, specific fixed capacities are allotted in the higher sub-band of the spectrum. With this technique, maximal bitrates of about 250 kbps per audio channel can be reached (depending on the audio content), at the expense of robustness: the system can be used for "non-secure" applications where the signal does not suffer any attack other than quantization for uncompressed format conversion. For instance, we use this technique in a watermark-informed source separation system presented at the same congress.
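To make the QIM principle mentioned in the abstract concrete, here is a minimal scalar sketch in Python. It is not the authors' implementation: the function names `qim_embed` and `qim_decode`, the parameters `step` and `capacity_bits`, and the numeric values are all hypothetical, and the PAM-driven choice of step and capacity per sub-band, the MDCT itself, and the side-information channel are not shown. The sketch only illustrates how a message of `capacity_bits` bits can be carried by one coefficient by quantizing it onto a message-dependent grid.

```python
def qim_embed(coeff, message, capacity_bits, step):
    """Move coeff to the nearest point of the grid that encodes message.

    Hypothetical helper: 'step' plays the role of the quantization step
    that an embedder would derive from a psychoacoustic model.
    """
    levels = 1 << capacity_bits            # number of distinct messages
    if not 0 <= message < levels:
        raise ValueError("message does not fit in capacity_bits")
    fine = step / levels                   # spacing between neighboring codewords
    offset = message * fine                # grid shift selected by the message
    return step * round((coeff - offset) / step) + offset


def qim_decode(coeff, capacity_bits, step):
    """Recover the message from a (possibly slightly perturbed) coefficient."""
    levels = 1 << capacity_bits
    fine = step / levels
    return round(coeff / fine) % levels


# Illustrative values only (assumptions, not taken from the paper).
step, capacity_bits = 64.0, 3
original = -1234.56
marked = qim_embed(original, 5, capacity_bits, step)
decoded = qim_decode(marked, capacity_bits, step)
decoded_noisy = qim_decode(marked + 3.9, capacity_bits, step)

# Average payload per MDCT coefficient implied by the quoted peak bitrate,
# assuming CD-rate audio (44.1 kHz) and a critically sampled MDCT
# (as many coefficients per second as time samples).
avg_bits_per_coeff = 250_000 / 44_100
```

In this sketch the embedding error is at most `step / 2`, and decoding survives a perturbation smaller than `step / 2**(capacity_bits + 1)`, which illustrates the capacity versus robustness trade-off the abstract describes. Under the stated 44.1 kHz assumption, the quoted maximum of about 250 kbps per channel corresponds to roughly 5.7 embedded bits per coefficient on average.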
Contributor : Laurent Girin
Submitted on : Tuesday, November 9, 2010 - 6:58:04 PM
Last modification on : Wednesday, October 7, 2020 - 11:40:12 AM


  • HAL Id : hal-00534502, version 1


Jonathan Pinel, Laurent Girin, Cléo Baras, Mathieu Parvaix. A high-capacity watermarking technique for audio signals based on MDCT-domain quantization. 20th International Congress on Acoustics (ICA 2010), Aug 2010, Sydney, Australia. pp.ICA2010. ⟨hal-00534502⟩


