Slashdot
Compressed VoIP Calls Vulnerable To Bugging
Posted by
kdawson
on Friday June 13, @12:20PM
from the say-that-again-slowly dept.
holy_calamity writes "Security researchers at Johns Hopkins report that a variable bit-rate compression scheme being rolled out on VoIP systems leaves encrypted calls vulnerable to bugging. Simpler syllables are squeezed into smaller data packets, with more complex ones taking up more space; the researchers built software that uses this to spot phrases of interest in encrypted calls simply by measuring packet size."

Easy Solution: (Score:5, Insightful)
Re:Easy Solution: (Score:5, Insightful)
Better solution: Fix the stupid, broken protocol.
For instance, the concept of RSA blinding had to be invented because people discovered that certain bits of the SSL private key can be determined simply by measuring the time it takes to encode messages. This was due to some implementation details inside SSLeay where it switched from one multiplication algorithm to a different one depending on the size of certain numbers in the algorithm.
OAEP had to be invented for similar reasons.
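For anyone who hasn't seen blinding before, here is a rough sketch of the idea in Python. It is not SSLeay's code: the primes, the exponent and the `decrypt_blinded` name are toy values made up for illustration, and real implementations use large keys and constant-time arithmetic. The point is only that the private-key operation runs on a randomized value, so its timing no longer tracks the attacker's chosen input.

```python
import secrets
from math import gcd

# Toy RSA parameters (hypothetical, far too small for real use).
p, q = 61, 53
n = p * q                              # public modulus
e = 17                                 # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (Python 3.8+)

def decrypt_blinded(c: int) -> int:
    """Decrypt c after multiplying it by a random blinding factor."""
    # Pick a random r that is invertible modulo n.
    while True:
        r = secrets.randbelow(n - 2) + 2
        if gcd(r, n) == 1:
            break
    blinded = (c * pow(r, e, n)) % n   # c' = c * r^e mod n
    m_blinded = pow(blinded, d, n)     # (c')^d = m * r mod n
    return (m_blinded * pow(r, -1, n)) % n  # remove r to recover m

message = 65
ciphertext = pow(message, e, n)
print(decrypt_blinded(ciphertext))     # same plaintext as the unblinded path
```

Every call uses a fresh `r`, so two decryptions of the same ciphertext do different work internally.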
"Music in the background" is not a security solution. In fact, that's a freaking joke.
Re:Easy Solution: (Score:5, Funny)
Yes, but a joke you can dance on.
Protocol isn't broken - it's badly mixed (Score:5, Insightful)
Voice codecs are designed to support a given level of audio quality subject to bit rate and computational complexity limitations. Most codecs are fixed-rate, or fixed-rate with silence suppression. Encryption isn't part of their design; it's somebody else's problem, and many VOIP systems aren't encrypted anyway (for instance, connections between an office phone and a PBX usually aren't.) Variable bit rate codecs are sometimes a good choice, depending on the kind of sounds you're trying to compress and the networks you're transmitting them on, and they're at least an alternative to the usual fixed-rate codecs.
Encryption systems usually aren't designed to deal with real-time message streams or timing attacks. Typically VOIP encryption protocols are designed for constant bit rate codec output, which is what most codecs provide, and the codecs usually package up 10, 20, or 30ms audio samples into a data packet for transmission over IP.
The problem occurs when you're choosing your codec and encryption separately, and you take a crypto system designed for fixed-rate codecs and use a variable-bit-rate codec instead. It's difficult to keep people from doing that sort of thing, especially if they're using huge-overhead approaches like VOIP inside IPSEC as opposed to VOIP systems with the crypto built in. It's also difficult to prevent people from making bad choices like that when they're using open-source software applications, as opposed to proprietary phones that only have the small set of codecs the manufacturer built in (typically uncompressed G.711, or G.729 or a GSM codec, all of which are fixed-rate except for silence suppression.)
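To put numbers on the fixed-rate case, here is a quick back-of-the-envelope sketch. The bit rates are the nominal ones for these codecs (64 kbit/s for G.711, 8 kbit/s for G.729), which the comment above doesn't state, and `payload_bytes` is just a helper name chosen for the example. With a fixed-rate codec every packet carries the same number of payload bytes, so packet size tells an eavesdropper nothing about what was said.

```python
def payload_bytes(bitrate_bps: int, frame_ms: int) -> int:
    """Codec payload per packet for a constant-bit-rate codec."""
    return bitrate_bps * frame_ms // 8000  # bits/s * ms / (8 bits * 1000 ms)

# Nominal codec bit rates (assumed, not taken from the comment).
CODECS = {"G.711": 64_000, "G.729": 8_000}

for name, rate in CODECS.items():
    for ms in (10, 20, 30):
        print(f"{name} at {ms} ms per packet: {payload_bytes(rate, ms)} bytes")
```

A variable-bit-rate codec has no such single number per packet interval, which is exactly the signal the attack reads.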
Not really... (Score:4, Informative)
The conclusions do not apply to more standardized codecs like G.711 and G.729a, which use fixed size packets.
The paper itself can be downloaded from here [jhu.edu]. Get it quick, before the IEEE figures this out and makes the author remove it so they can extort their fee.
Do what my grandparents do (Score:5, Interesting)
Random switches between languages would probably confuse the heck out of filters guessing compressed data. That or you could just learn Russian... I don't think they *have* any simple-syllable words in Russian
Re:Do what my grandparents do (Score:5, Funny)
Re:Do what my grandparents do (Score:5, Funny)
Da!
Re:Do what my grandparents do (Score:4, Funny)
Re:Do what my grandparents do (Score:4, Interesting)
Re: (Score:3, Insightful)
Re: (Score:3, Funny)
Evasive, ummm, technology (Score:5, Funny)
FTFA
So, ummm, what we should do to, umm, well, protect ourselves from, ummm, yaknow, eavesdroppers, heh-heh, is well, make sure there's enough, ummmmmmm, yaknow, like extra noise, like, mixed in, dude.
Re:Evasive, ummm, technology (Score:5, Funny)
Oh my god, thats like, totally, like, a great idea, yaknow. I mean, like, they'll never figure out what we're, like, saying, yaknow?
Oryoucouldspeakreallyfastwithoutpausesbetweenwords. Thatwaythey'llneverknowwhatyousaid =)
Or. We. Could. All. Speak. Like. Shatner. Random. Long. Pauses. Genius.
Cheers
It's easy to encrypt your conversations (Score:3, Interesting)
Or maybe you shouldn't say anything on VoIP that you don't want anyone else to hear.
Re: (Score:3)
A couple honest questions...
1) Why do I see so much about wiretapping/bugging VoIP lately? I guess I've always assumed that VoIP was just as vulnerable to bugging as POTS
Bad science (Score:4, Insightful)
Vowels are actually simpler to compress than consonants because of spectral complexity: consonants use many more frequencies, are mostly noise, and have a more "random"-like waveform, which makes them harder to compress. They got it completely backwards.
Then TFA doesn't show a method to magically guess what is being said over an encrypted channel only by looking at the bitrates; it only says that it finds some predetermined pattern in a given set of samples to test against. The whole thing would only be able to answer some very simple questions like "did the words XYZ appear in the conversation? Or did ABC appear in the conversation?", with a rather bad success rate if those words are long and complex enough, which is hardly enough to obtain personal information or otherwise efficiently spy on someone.
Then the whole system has a lot of shortcomings:
- As said before, it assumes that the spy knows exactly which phrase will be said. If the spy doesn't guess exactly which words to search for, the attack fails (the users may be speaking in a foreign language to begin with).
- It assumes that the needle made by a speech generator will be close to what is actually in the haystack. The users may have an accent and pronounce words differently (cf. aluminum vs. aluminium, etc.).
- And worst of all, it assumes that the granularity of the packets will be small enough for the phonemes to influence the bit rate. In reality, short packets carry a big bandwidth overhead, while longer packets increase the latency. But lots of VoIP users are happy with a 500ms latency because it really diminishes the overhead. At 500ms you can have a couple of words in a single packet. The whole packet will tend to have a bandwidth close to the average (there will be small differences between phonemes, but these will all be packed into the same packet and will average out).
- It fails to take into account an interleaved video stream. Video conferencing is really popular, and its bandwidth will completely dwarf the bandwidth used by audio. So unless the VoIP system uses 2 separate streams (some VoIP systems do), and only encrypts at the stream level, and the transmission is happening over an unencrypted channel (no sane person should do that), this method will fail epically.
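To make "finds some predetermined pattern" concrete, here is a deliberately naive sketch of that kind of search. It is not the researchers' method: `best_match`, the sum-of-differences score and all the packet sizes are invented for illustration. It slides a known size template over the observed packet sizes and reports where the two are closest.

```python
def best_match(observed, template):
    """Return (offset, distance) of the window in observed closest to template.

    Distance is the sum of absolute packet-size differences. Returns None
    if observed is shorter than template.
    """
    best = None
    width = len(template)
    for i in range(len(observed) - width + 1):
        window = observed[i:i + width]
        dist = sum(abs(a - b) for a, b in zip(window, template))
        if best is None or dist < best[1]:
            best = (i, dist)
    return best

# Hypothetical packet sizes in bytes: a template for the phrase of interest
# and a stretch of sizes sniffed from an encrypted call.
template = [30, 42, 55, 38]
observed = [20, 21, 33, 31, 41, 56, 37, 22, 20]

print(best_match(observed, template))  # (3, 4): best window starts at index 3
```

Even this toy version shows the limits listed above: it only answers "does something shaped like this template appear?", and a different accent or coarser packets change the sizes it has to match.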
ode-cay (Score:4, Funny)
Re:Randomize the packets slightly (Score:4, Informative)
Time/space attacks are well known. Somebody who actually, hmm, UNDERSTOOD cryptographic security would never have designed the protocol this way in the first place.
The people suggesting that we should just inject noise or background patterns are being ridiculous. Why sacrifice communication quality when there are BETTER ways to fix it? DO IT RIGHT.
Re: (Score:3)
Hahaha! Compressing encrypted data?! My sides are splitting!
In case you can't figure it out: good encryption makes data look completely random. Do you know of any algorithms which compress PURELY RANDOM data? I sure as hell don't.
Re:Here's a thought (Score:5, Insightful)
The point of compression is to take data that's expressed in a way that doesn't maximize entropy and reexpress it in a way that is higher-entropy (more information per bit). As such, maximum-entropy data is, by its nature, incompressible.
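You can see this for yourself with a few lines of Python. This is only a sketch: the pseudo-random bytes stand in for good ciphertext (Python's `random` module is not a cipher), and the sample phrase and the 4096-byte size are arbitrary choices.

```python
import random
import zlib

N = 4096
rng = random.Random(0)  # fixed seed so the run is repeatable

# Stand-in for well-encrypted data: bytes with no visible structure.
random_bytes = bytes(rng.getrandbits(8) for _ in range(N))
# Low-entropy data: the same short phrase over and over.
repetitive = (b"say that again slowly " * 200)[:N]

random_ratio = len(zlib.compress(random_bytes)) / N
repetitive_ratio = len(zlib.compress(repetitive)) / N

print(f"random-looking data: {random_ratio:.2f} of original size")
print(f"repetitive data:     {repetitive_ratio:.2f} of original size")
```

The repetitive input shrinks to a small fraction of its size, while the random-looking input does not shrink at all, which is why compression has to happen before encryption.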
Re:Here's a thought (Score:5, Insightful)
The issue is that VOIP is an application that needs low latency. You have to send the data you have within a specific amount of time (0.1 seconds? something small), and can't wait for the buffer to fill before sending it, whether it's compressed, encrypted, or not. Thus you get packets of different sizes.
This isn't sending the whole conversation at once, this is a constant stream of data with specific requirements on latency.
A solution would be to make each packet the same size by padding it with random data that the other side will discard. But that eliminates some of the benefit of compression.
Maybe just use a fixed bit rate, as opposed to a VBR encoding?
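A minimal sketch of the padding idea, assuming a made-up 64-byte payload size and a 2-byte length prefix (neither comes from any real VoIP protocol, and `pad_frame`/`unpad_frame` are names invented here). The padding has to be applied before encryption so the length prefix is hidden along with the audio.

```python
import os
import struct

PACKET_SIZE = 64  # hypothetical fixed payload size in bytes

def pad_frame(frame: bytes, size: int = PACKET_SIZE) -> bytes:
    """Prefix the frame with its length and fill up to size with random bytes."""
    if len(frame) > size - 2:
        raise ValueError("frame too large for the fixed packet size")
    filler = os.urandom(size - 2 - len(frame))
    return struct.pack("!H", len(frame)) + frame + filler

def unpad_frame(packet: bytes) -> bytes:
    """Recover the original frame and discard the random filler."""
    (length,) = struct.unpack("!H", packet[:2])
    return packet[2:2 + length]

short = pad_frame(b"a" * 12)   # a small VBR frame
long_ = pad_frame(b"b" * 50)   # a large VBR frame
print(len(short), len(long_))  # both are PACKET_SIZE bytes on the wire
```

The cost is the one mentioned above: every packet is as big as the largest frame you allow, so most of the bandwidth saved by the variable-rate codec is given back.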
Re: (Score:3, Insightful)
What idiot modded this up? Encrypted data is (pretty much by definition) incompressible. Encryption works by hiding information and removing redundancy. Compression works by identifying and removing redundancy. The two concepts simply CANNOT BE APPLIED IN
Re:Here's a thought (Score:4, Funny)
Re:Here's a thought (Score:5, Interesting)
Voice data just CAN'T be securely encrypted. That's because the spacetime information HAS to be there because we inherently interpret voice data according to these characteristics. Either you reveal this information in the stream, or you must increase the latency to the point that communication is impossible. If you want security, don't speak, WRITE, and use a cryptosystem that isn't a piece of shit.