The inaudible signal that measures who's listening
Published by Joseph SARDIN, on
Summary
- An inaudible audio watermark is embedded in TV and radio broadcasts
- It carries the channel ID, date, and time of transmission
- Three rival systems share the global market: Nielsen PPM, Kantar SNAP, Médiamétrie
- Each one picks a different frequency band and a different encoding trick
- The same technology has been quietly repurposed by advertisers to track phones
Turn on your radio. Right now, somewhere in the audio coming out of the speaker, a signal just slipped past. You didn't hear it, and neither did anyone else in the room. But it was there, sitting at the edge of human hearing, carrying three very specific pieces of information: the station's identifier, the date, and the exact time. This signal has a name. It's called audio watermarking. And at this very moment, it's riding inside millions of radio and television streams all over the world.
Why mark the sound itself
The reason this technology exists comes down to a very practical headache. For decades, measuring audiences meant plugging a meter into the back of a TV set and tracking channel changes. Then came cable, satellite, DSL, fiber, every kind of cable box, streaming platforms, replay services. Every new delivery method forced a new adaptation. It became unmanageable.
The clever idea was to mark the audio itself. If every channel embeds a unique signature directly in its sound, then a microphone in the room can figure out what's coming out of the speaker, no matter how it got there. That's the whole principle of audio watermarking.
Three players, three philosophies
Today, three big companies share the global market, and what's revealing is that they've each made very different technical bets.
In the United States, Nielsen Audio runs the most familiar system: the Portable People Meter, or PPM. Developed by Arbitron starting in 1992, and brought under the Nielsen name when Nielsen acquired Arbitron in 2013, the PPM has been deployed across the top 48 radio markets in the country since 2007. The encoding sits in the 1 to 3 kHz band, smack in the middle of the audible range. Instead of placing the signal above human hearing, Nielsen tucks it under the masking threshold of the human ear: a louder sound hides quieter sounds nearby, so the watermark slips into that perceptual blind spot. The modulation looks something like multi-band frequency-shift keying, with brief tone bursts scattered across the speech-frequency spectrum.
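To make "multi-band frequency-shift keying" a little more concrete, here is a toy sketch in Python. It is not Nielsen's encoder, whose details are proprietary. The sample rate, burst length, tone frequencies, and function names are all assumptions picked for illustration. Each bit becomes a short, quiet tone burst at one of two frequencies inside the 1 to 3 kHz band, and the decoder compares the energy at both candidate frequencies with the Goertzel algorithm.

```python
import math

SAMPLE_RATE = 16_000    # Hz, assumed for this sketch
BURST_SAMPLES = 320     # 20 ms per tone burst, assumed
# Hypothetical tone pairs in the 1-3 kHz band: (frequency for bit 0, frequency for bit 1)
TONE_PAIRS = [(1000, 1200), (1600, 1800), (2200, 2400), (2800, 3000)]


def encode(bits, amplitude=0.01):
    """Turn each bit into one quiet tone burst, cycling through the bands."""
    samples = []
    for i, bit in enumerate(bits):
        freq = TONE_PAIRS[i % len(TONE_PAIRS)][bit]
        for n in range(BURST_SAMPLES):
            samples.append(amplitude * math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples


def goertzel_power(block, freq):
    """Energy of `block` at a single frequency (Goertzel algorithm)."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in block:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def decode(samples):
    """Recover one bit per burst by picking the stronger of the two tones."""
    bits = []
    for i in range(len(samples) // BURST_SAMPLES):
        block = samples[i * BURST_SAMPLES:(i + 1) * BURST_SAMPLES]
        f0, f1 = TONE_PAIRS[i % len(TONE_PAIRS)]
        bits.append(1 if goertzel_power(block, f1) > goertzel_power(block, f0) else 0)
    return bits


message = [1, 0, 1, 1, 0, 0, 1, 0]
print(decode(encode(message)) == message)
```

A real encoder would add these bursts to the program audio and shape their level so they stay under the masking threshold. This sketch skips that part and only shows the keying itself.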
In France, Médiamétrie went the opposite direction back in 2008. Their proprietary signal sits high in the spectrum, somewhere between 18 and 22 kHz, just above the hearing threshold of most adults, whose high-frequency sensitivity drops with age, but still well within the bandwidth of standard electret microphones. Too high to be heard, too low to be a true ultrasound the way I described it in my piece on infrasound and inaudible waves. The same Médiamétrie technology has since been exported to Morocco, India (where it powers the world's largest TV panel with 33,000 households), and Norway through a partnership with Kantar.
Kantar Media, the third player, is British-based with a global footprint. Their two proprietary systems are called SNAP and INK. SNAP relies on phase modulation, yet another approach. The technology was selected by SMPTE for standardization in 2017 and is now used by the BBC, by NRJ Audio in France for online content, and is even built natively into AWS Elemental MediaConvert for cloud streaming workflows. More than 2,200 channels worldwide carry a Kantar watermark.
Three companies, three different frequency bands, three different modulation schemes. Audio watermarking isn't a single recipe. It's a family of solutions all wrestling with the same impossible compromise between inaudibility, robustness against compression, and how much data you can squeeze through the channel.
Three pieces of information, and everything else follows
The payload itself is almost always the same. Just three things: the unique identifier of the channel or station, the date, and the time. Some systems swap the timestamp for a relative timecode, which lets them measure non-linear content like podcasts and on-demand video.
That minimal payload turns out to be enough for a lot. By comparing the timestamp inside the watermark with the actual time when the box picked it up, the system figures out whether you watched something live or on replay. If you sit down with a recorded prime-time newscast at 11 PM, the meter understands it's the same program, just consumed three hours later. That's the trick that lets these systems credit live, time-shifted, and replay viewing all from the same signal stream.
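The live-versus-replay comparison is simple enough to sketch in a few lines of Python. Everything named here is hypothetical: the `Detection` record, the `classify` function, the channel identifier, and especially the two-minute tolerance, which I picked arbitrarily to absorb delivery latency. Real meters apply their own crediting rules.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Detection:
    channel_id: str            # identifier decoded from the watermark
    broadcast_time: datetime   # date and time carried by the watermark
    heard_at: datetime         # the meter's own clock when it picked up the code


# Assumed tolerance: any delay below this is treated as live viewing.
LIVE_TOLERANCE = timedelta(minutes=2)


def classify(d: Detection) -> tuple[str, timedelta]:
    """Label a detection as live or time-shifted, and return the delay."""
    delay = d.heard_at - d.broadcast_time
    if delay < -LIVE_TOLERANCE:
        raise ValueError("watermark timestamp is in the future; check the clocks")
    if delay <= LIVE_TOLERANCE:
        return "live", delay
    return "time-shifted", delay


# A newscast aired at 8 PM and watched from a recording at 11 PM.
newscast = Detection(
    channel_id="CH-001",  # made-up identifier
    broadcast_time=datetime(2024, 1, 15, 20, 0),
    heard_at=datetime(2024, 1, 15, 23, 0),
)
print(classify(newscast))
```

The channel identifier says what was watched, and the gap between the two clocks says when. That is all the meter needs to credit the viewing to the right program.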
The marking happens far upstream, right inside each station's broadcast facility, before transmission. From there the watermark rides along with the program, crosses the air in your living room, and finally lands on a microphone tuned to listen for it.
The Voltair affair
If the whole setup sounds clean, recent history begs to differ. Starting around 2014, a technical debate turned into a very public fight in American radio. A company called Telos Alliance, through its 25-Seven subsidiary, released a hardware unit called Voltair that boosted the strength of the Nielsen watermark on a station's outgoing signal. Stations that installed it suddenly saw their measured audience jump by 30 percent or more in a single month, with no programming changes whatsoever.
The explanation goes back to the way Nielsen's psychoacoustic masking works. The watermark needs dense audio content to hide underneath. On a music station, that's easy: there's always energy filling the spectrum. On a talk station full of pauses, breaths, and quiet stretches, the watermark gets weak and sometimes disappears entirely. The mechanical consequence: music stations were systematically over-counted, talk stations under-counted. Voltair leveled that playing field by re-evaluating the masking conditions in real time and letting broadcasters tune the watermark gain more aggressively.
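A deliberately crude model shows why the format matters. This is neither Nielsen's masking algorithm nor Voltair's processing, just a toy with invented numbers. It assumes the watermark is always placed a fixed margin below the program level, and that the meter cannot decode anything under a fixed floor. The frame levels, the margin, and the floor are all made up.

```python
MASKING_MARGIN_DB = 20.0     # assumed: watermark sits this far below the program
DETECTION_FLOOR_DB = -60.0   # assumed: below this level the meter cannot decode


def watermark_level_db(host_db, boost_db=0.0):
    """Level of the watermark for a frame whose program level is host_db."""
    return host_db - MASKING_MARGIN_DB + boost_db


def decodable_share(host_levels_db, boost_db=0.0):
    """Fraction of frames whose watermark stays above the detection floor."""
    ok = sum(
        1 for h in host_levels_db
        if watermark_level_db(h, boost_db) >= DETECTION_FLOOR_DB
    )
    return ok / len(host_levels_db)


# Made-up program levels per frame, in dB relative to full scale.
music = [-18, -20, -17, -19, -21, -18, -20, -19]   # dense and steady
talk = [-20, -45, -22, -70, -48, -21, -65, -44]    # pauses and breaths

print(decodable_share(music))               # 1.0
print(decodable_share(talk))                # 0.375
print(decodable_share(talk, boost_db=6))    # 0.625
```

In this toy, the music station gets a decodable code in every frame while the talk station loses most of them. A 6 dB boost recovers part of the gap, at the cost of pushing the watermark closer to audibility, which is exactly what the argument was about.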
Nielsen pushed back hard, accusing the box of introducing audible artifacts and altering the codes. Numeris, the Canadian audience measurement body, banned Voltair outright in June 2015. By 2016, Nielsen rolled out an enhanced encoding called eCBET that, for all practical purposes, did the same thing Voltair did. The dispute ended in a kind of technical stalemate, but it exposed an uncomfortable truth: audience measurement by watermarking is not a neutral instrument. Depending on programming format, ambient noise, and how well the masking holds up, the numbers can swing wildly. And behind those swings, advertising dollars shift in real time.
The PPM has carried other controversies too. Back in 2008 and 2009, several US states sued Arbitron over panel composition, arguing that the methodology under-sampled Black and Hispanic listeners and unfairly hurt urban-format stations. The disputes reached congressional hearings. The technology that quietly counts what we listen to has always been more political than it looks.
When advertising hijacks the technique
The story takes a darker turn when you look at how the same engineering has been repurposed elsewhere. Around 2014, an ad-tech company called SilverPush had an idea: use the same kind of inaudible audio beacon, but for cross-device tracking. A TV commercial emits an ultrasonic ping. An app installed on the viewer's smartphone, running quietly in the background, hears the ping through the phone's microphone and silently links the phone to the TV. No user awareness required.
A 2017 study presented at the IEEE European Symposium on Security and Privacy identified 234 Android apps embedding the SilverPush SDK or similar tracking kits like Lisnr and Shopkick. Many of them were innocent-looking utilities: fast food, weather, games. The Federal Trade Commission sent warning letters to a dozen developers in 2016, demanding explicit disclosure to users. Google pulled the offending apps from the Play Store. SilverPush officially shut down its tracking arm in 2015. The technology, however, didn't go anywhere.
What this case shows is that audio watermarking is morally neutral on its own. It's just a covert side channel between a speaker and a microphone. Depending on who's running it, that channel becomes either consented and anonymized audience measurement, or opaque commercial surveillance. Same physics, two very different ethical worlds.
A whole layer of the spectrum, always busy
For anyone who records ambient sound for a living, as I do for the BigSoundBank library, this story has a slightly unsettling angle. The useful range for field recording typically caps around 20 kHz. Meanwhile, while you're capturing an urban ambience, radios and televisions in the background are constantly emitting watermarks that sit right at the edge of that range, or buried inside it under the music and voices, completely invisible without a spectrum analyzer.
An entire layer of the audio spectrum is in constant operation, hidden from human hearing but very much present in the air. Every newscast playing in a waiting room, every car radio in a cab, every phone speaker in a coffee shop releases tiny digital messages into the surrounding space, where they may end up in the memory of a 40-gram meter clipped to someone's belt, or in the microphone of an app on a stranger's phone. Out of the statistical aggregate of millions of these micro-events come the audience numbers handed to broadcasters every morning, and sometimes, without anyone realizing, the advertising profiles handed to brands.
Did you know there's a nearly audible signal traveling through your radio and TV streams all the time? Have you ever pulled up a TV recording in a spectrum analyzer just to see what lives in the 1-3 kHz or 18-22 kHz bands?
Joseph SARDIN, Founder of BigSoundBank.com