Global Big Data Conference

Stopping 'them' from spying on you: New AI can block rogue microphones

Posted on: Apr 19, 2022

Ever noticed online ads following you that are eerily close to something you've recently talked about with your friends and family? Microphones are embedded into nearly everything today, from our phones, watches, and televisions to voice assistants, and they are always listening to you. Computers are constantly using neural networks and AI to process your speech, in order to gain information about you. If you wanted to prevent this from happening, how could you go about it?

Back in the day, as portrayed in the hit TV show "The Americans," you would play music with the volume way up or turn on the water in the bathroom. But what if you didn't want to constantly scream over the music to communicate? Columbia Engineering researchers have developed a new system that generates whisper-quiet sounds that you can play in any room, in any situation, to block smart devices from spying on you. And it's easy to implement on hardware like computers and smartphones, giving people agency over protecting the privacy of their voice.

"A key technical challenge to achieving this was to make it all work fast enough," said Carl Vondrick, assistant professor of computer science. "Our algorithm, which manages to block a rogue microphone from correctly hearing your words 80% of the time, is the fastest and the most accurate on our testbed. It works even when we don't know anything about the rogue microphone, such as the location of it, or even the computer software running on it. It basically camouflages a person's voice over-the-air, hiding it from these listening systems, and without inconveniencing the conversation between people in the room."

Staying ahead of conversations

It has been known in theory for some time that automatic speech recognition systems can be corrupted in this way, but doing it fast enough for practical use has remained a major bottleneck. The problem is that a sound that disrupts recognition of a person's speech now, at this specific moment, is not the sound that will disrupt it a second later. As people talk, their voices change constantly as they move from word to word, often very quickly, and that makes it almost impossible for a machine that only reacts to what it has already heard to keep up.

"Our algorithm is able to keep up by predicting the characteristics of what a person will say next, giving it enough time to generate the right whisper to make," said Mia Chiquier, lead author of the study and a Ph.D. student in Vondrick's lab. "So far our method works for the majority of the English language vocabulary, and we plan to apply the algorithm on more languages, as well as eventually make the whisper sound completely imperceptible."

Launching 'predictive attacks'

The researchers needed to design an algorithm that could break neural networks in real time, generate its signal continuously as speech is spoken, and apply to the majority of the vocabulary in a language. While earlier work had met at least one of these three requirements, none had achieved all three. Chiquier's new algorithm uses what she calls "predictive attacks": a signal that can disrupt any word that automatic speech recognition models are trained to transcribe.
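
To see why prediction matters, consider a toy sketch of the timing problem. It is not the researchers' algorithm: the function names, the numeric "speech feature," and the linear forecaster are all hypothetical stand-ins chosen for illustration. The only idea it tries to capture is that a disrupting sound computed from what was just heard plays a few steps late, so it should be tuned to where the speech will be rather than where it was.

```python
# Toy illustration only: every name and number here is hypothetical,
# and the "forecaster" is a simple linear extrapolation, not a neural network.

def reactive_targets(frames, delay):
    """Tune each sound to the frame just heard; it plays `delay` steps later."""
    return {t + delay: frames[t] for t in range(len(frames))}

def predictive_targets(frames, delay):
    """Tune each sound to a forecast of the frame spoken when the sound plays."""
    targets = {}
    for t in range(1, len(frames)):
        step = frames[t] - frames[t - 1]           # recent rate of change
        targets[t + delay] = frames[t] + delay * step
    return targets

def mean_mismatch(frames, targets):
    """Average gap between what a sound was tuned for and what is being said."""
    gaps = [abs(targets[t] - frames[t]) for t in targets if t < len(frames)]
    return sum(gaps) / len(gaps)

frames = [0.5 * t for t in range(20)]  # a steadily changing stand-in for speech
delay = 4                              # steps between hearing and playing

print(mean_mismatch(frames, reactive_targets(frames, delay)))    # 2.0
print(mean_mismatch(frames, predictive_targets(frames, delay)))  # 0.0
```

In this toy setting the reactive sound is always tuned to speech that is four steps out of date, while the predictive one lands on target. Real speech is far less regular than a straight line, which is why a much stronger forecaster would be needed in practice.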
