Generating Fast MD5 Collisions With ATI Video Cards
An anonymous reader writes "Yesterday at Black Hat USA 2009, a talk entitled
MD5 Chosen-Prefix Collisions on GPUs
(whitepaper) (Both PDFs)
presented an implementation written in assembly language for ATI video cards that achieves
1.6 billion MD5 hash/sec, or 2.2 billion MD5 hash/sec with reversing,
on an ATI Radeon HD 4850 X2. This is faster than the much-publicized 1.4-1.9 billion hash/sec figure that was
supposedly reached on a PlayStation 3 by Nick Breese at Black Hat Europe 2008 (he
later noticed an error in his benchmarking tool). Compared to the cluster of 215 PlayStation 3s that was used to
create a rogue CA in December 2008,
Marc Bevand claimed a cluster of 12 machines with 24 video cards would be
a bit faster, consume 5 times less power, and be 10 times cheaper."
Re:first (Score:5, Funny)
Generated with the help of an ATI card, I assume.
Re:1.6 1.9 (Score:3, Insightful)
or 2.2 billion MD5 hash/sec with reversing
Keep in mind I have absolutely no idea what "reversing" means.
Re: (Score:2)
It means going backwards, or turning something around.
Re: (Score:3, Insightful)
Re: (Score:3, Informative)
The numbers don't add up no matter how I turn them. He claims to be getting 14% more performance from each graphics card than from each PS3.
No. He didn't say that.
The performance comparison was between the cluster of 12 PCs with 24 cards and the cluster of 215 PS3s.
So, how are the numbers supposed to be interpreted?
Why are you interpreting them? They seem pretty clear as written.
I don't understand why anybody still finds it newsworthy when somebody comes up with faster collision attacks against MD5.
It was newsworthy in January when it was first presented to the CAs.
It's newsworthy now because it's a significant per processor performance increase.
If you had read the article and not interjected your flawed interpretation, that would be obvious.
We already know that collisions can be generated for MD5, and that they can be generated fast enough that we have to worry about it. It no longer matters exactly how fast they can be generated. If somebody managed to come up with a practical second preimage attack against MD5, then it would be newsworthy.
It's newsworthy due to the application to certain mathematical processes.
No one
Re: (Score:1)
The Slashdot summary says that. In the actual slides he claims that the PS3 code is about 20 times slower than the people who wrote it said, and that a single graphics card can achieve the same as 20 PS3s.
What was newsworthy at the time was mostly that CAs and browsers were still using a flawed algorithm. As far as I know, most browsers will still accept MD5 signatures. There wasn't much news in the attack, it was we
Re: (Score:1)
Easier Way (Score:5, Insightful)
If all you want is a signed SSL certificate, I suspect it would be easier to bribe an employee at a CA to skip a few steps when validating you.
Re: (Score:2)
Hey, if that's all you want, I'll give you a signed certificate, and my mother will recognize the signature too. No bribe required, but tips will be graciously accepted, of course.
Re: (Score:3, Interesting)
Re: (Score:2)
Enjoy it while it lasts, because I plan to charge exorbitant rates soon, just like Verisign.
Credibility? Fine. Mine vs. Theirs.
Sincerely, Operator Error.
Re:Easier Way (Score:5, Interesting)
It would be harder than you seem to think. It's not just any old fake cert they created. They created a CA certificate. That is, a certificate that can be used to issue other certificates. You can issue as many of these "other" certificates as you want and they will look legitimate.
It's very rare for a real CA to issue a certificate like that. That is the "top of the food chain" in certificates so to speak. You would have to bribe a fairly high level employee to get something like that. They keep those high level keys very well protected and there are only a few people that even have access to them.
Re: (Score:2)
yeah
if by high level you mean just about any of their sysadmins with access to the website? getting access to the actual key is unnecessary. you only need to be able to get something signed without them checking for some fields (ie, existence of CN, or capabilities bits..)
sure you might not be able to bribe verisign (though i doubt that) but in this case you only need to bribe one sysadmin from one of the big-name CAs (any which has a certificate in your browser will do)
Re: (Score:2)
Total bullshit. For signing another CA, the CA can (and will) use the same key as it uses to sign "ordinary" certificates. After all, the difference between a CA and a non-CA certificate is just a flag in the X509v3 extensions in the cert. There is no special "high level key" which is only used for signing a CA certificate. Any key/certificate which builds a certificate chain up to a root cert will do.
Re:Easier Way (Score:5, Funny)
Re: (Score:2)
That's good to know!
Sensible collisions that don't affect size? (Score:5, Interesting)
Somewhat off-topic, but I guess related all the same...
Nobody should use MD5 for authentication and whatnot... and even as a 'checksum' of sorts you have to be careful (i.e. make sure that the source of the MD5 text/file isn't the very same source as the file it was generated for, as a compromised file probably means the MD5 string would be equally compromised).
But I'm curious.. are any of the attacks capable of injecting new data that..
1. doesn't affect filesize - the wiki mentions that successful attacks can prepend and append, but presuming you'd include the file size with the MD5 string, that would be another parameter to check
2. actually does something.. be it useful or nefarious, rather than just crash the app or insert gibberish in a text document, etc.
e.g. if I took the Declaration of Independence as a .txt file, are there any attacks that could subtly, or non-subtly, change the wording without increasing or decreasing the size of the file, and still match an original MD5?
--
On-topic: cool; but not particularly new? Most everybody knows that GPUs are great at taking in a tiny bit of data, crunching it, and spitting a result back out. Kudos for actually writing optimized code for the given platform (in this case an AMD/ATi GPU), but it's still the same number crunching instead of an improved method.. correct?
Re: (Score:2)
Presumably (and I'm making a lot of assumptions here, I don't know enough about the subject), you could just snip the file by however many bytes the process would append to it, so when it does all of the calculations and appends it, it ends up the same size.
Also presumably, it would mean the last few bytes of the text file would be utter garbage.
Re:Sensible collisions that don't affect size? (Score:5, Insightful)
The point of the attack is that you can change the file to whatever you want, prefix some ignored garbage, and end up with a file with the same md5. So yes you could do something useful or nefarious by changing the file usefully or nefariously.
Re: (Score:2, Informative)
What you are describing is a second preimage attack. Nobody has achieved that against MD5. What has been achieved so far has only been collision attacks. The first collision attack against MD5 was demonstrated in 2004. Later some better collision attacks were demonstrated, in which you can choose the prefixes. The chosen prefix attack works in the following way
Re: (Score:2)
Re:Sensible collisions that don't affect size? (Score:5, Insightful)
The attack that is mentioned in the story, the creation of the rogue CA certificate, is an example of a successful MD5 collision attack with a practical application. The "random" garbage was inserted in a part of the certificate signing request which is opaque to the certificate authority. That was also an example of a useful collision attack, so these are actually dangerous (not just pre-image attacks).
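To make the collision vs. second-preimage distinction from this thread concrete, here is a toy sketch. It is not the chosen-prefix attack from the talk (that exploits MD5's internal structure); it is a generic brute-force search against MD5 truncated to 16 bits, and the function names and message formats are made up for illustration. It only shows why finding any two colliding inputs is far cheaper than matching one fixed input.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, nbytes: int = 2) -> bytes:
    """First nbytes of the MD5 digest: a deliberately weak toy hash."""
    return hashlib.md5(data).digest()[:nbytes]


def find_collision(nbytes: int = 2):
    """Birthday search: find ANY two distinct inputs with the same toy hash."""
    seen = {}  # toy digest -> first message that produced it
    for i in count():
        msg = b"msg-%d" % i
        tag = truncated_md5(msg, nbytes)
        if tag in seen:
            return seen[tag], msg, i + 1  # the pair and the number of hashes tried
        seen[tag] = msg


def find_second_preimage(target: bytes, nbytes: int = 2):
    """Find a different input whose toy hash matches one FIXED target."""
    want = truncated_md5(target, nbytes)
    for i in count():
        msg = b"alt-%d" % i
        if msg != target and truncated_md5(msg, nbytes) == want:
            return msg, i + 1  # the match and the number of hashes tried


if __name__ == "__main__":
    a, b, tries = find_collision()
    print("collision after", tries, "hashes:", a, b)
    m, tries2 = find_second_preimage(b"fixed target")
    print("second preimage after", tries2, "hashes:", m)
```

For an n-bit hash the collision search needs on the order of 2^(n/2) hashes while the second-preimage search needs on the order of 2^n, so on a typical run the second number printed is much larger than the first.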
Re: (Score:2)
Signing a hash is a very common method in cryptography. DSA, for example, signs a SHA-1 hash (SHA-2 these days); if you sign the unhashed message, it isn't DSA.
Re:Sensible collisions that don't affect size? (Score:5, Insightful)
I don't think folks have to avoid MD5 as strongly & immediately as you suggest... the attacks are for the most part theoretical or require more compute power / patience than people outside of this blackhat con can muster. It was my understanding the PS3 cluster actually got a cert which could be used nefariously... and this guy showed he could do it cheaper and faster. This is perfectly in line with my understanding: Attacks always get better, they never get worse. So I suppose it is time to work out a migration plan for whatever uses MD5.
On your closing comment: I think the author was suggesting that if people had been paying attention a lot more of them would be using ATI GPGPU clusters for stuff they used to use Vector processors and now use fleets of X86 variants for.
I don't completely disagree with him but there are a lot of small GPU clusters out there and there are a lot of reasons why more people haven't really got with the program. I think the biggest reason is the difficulty developing for GPGPUs. It's not the hardest thing I've ever done but it really takes a deliberate effort to get into a different state of mind. And the ATI SDK just plain sucks. I'll take the performance hit and develop using a C superset with an NVIDIA target. The process can run during that extra time I am not pounding my head against a hard flat surface. Actually now that I think of it, I've just kept a lot of the old FORTRAN code I have and used the NVIDIA kit... rather than porting to the ATI SDK.
Having said that I don't think that this state will last long at all. The rate of increase of performance in GPUs is steeper than that of CPUs; AMD & NVIDIA are really serious about getting into the general compute market (with the same or similar chips to what they already market); The power consumption, cooling, and noise are all really favorable.
I am sort of curious what OpenCL will be like, being a Mac user... but here lately Apple has been going further out of their way to make things suck, so I am not holding my breath.
Re:Sensible collisions that don't affect size? (Score:4, Insightful)
The first collision was demonstrated about five years ago. Anything that relied on collision resistance should have been migrated away from MD5 at least four years ago. The attack in 2004 just wasn't taken seriously enough.
Re: (Score:2)
I did some custom file 'fingerprinting' work some time ago when management didn't want to spring for Tripwire. For each file, the system stored both the md5sum and an shasum in addition to the file size. Figured that it was sufficiently improbable that a single altered file could collide in both hashing functions, particularly without changing in file size.
Granted, a rootkit could probably mess with return values to make it look as though the file hadn't changed at all, but at that point monitoring binaries
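The fingerprinting scheme described in this comment (file size plus two independent digests per file) could look roughly like the sketch below. This is not the commenter's actual tool: the function names are hypothetical, and it assumes "shasum" meant SHA-1, which is that command's default.

```python
import hashlib


def fingerprint(path: str, chunk_size: int = 1 << 16) -> dict:
    """Return the size, MD5 and SHA-1 of a file, reading it once in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()  # assumption: SHA-1; hashlib.sha256() is a one-line swap
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
            size += len(chunk)
    return {"size": size, "md5": md5.hexdigest(), "sha1": sha1.hexdigest()}


def unchanged(path: str, recorded: dict) -> bool:
    """True only if size and both digests still match the recorded fingerprint."""
    return fingerprint(path) == recorded
```

A tampered file would have to keep its length and collide under both hash functions at once to pass this check. As the comment itself notes, that does not help if a rootkit falsifies what the checker reads.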
Re: (Score:2)
If you're using MD5 as a way to verify that the file isn't festooned with viruses. I don't think that was the intention of MD5 from the beginning, though, as it's a pretty useless way of going about it..
Re: (Score:2)
If you want to ensure that it's got no viruses or malicious code in it, then invest in a proper antivirus, keep it up to date, and scan everything you download.
Newly released viruses don't appear in antivirus programs' signature lists.
Re: (Score:2)
good job reading the sentence that came right after the one you quoted. :)
No tech needed (Score:2)
Just add politicians and wait...
Re: (Score:3, Interesting)
Yep, there are both collision checkers and crackers for CUDA too ... ATI is significantly faster though (this kind of computation bound stuff is ideal for them).
So how about NVIDIA? (Score:1)
Not again (Score:2)
...consume 5 times less power, and be 10 times cheaper
*sigh*
Re: (Score:3, Interesting)
It's not an error. "X times less" has meant "1/x times as much" in ordinary language for three centuries.
"Jonathan Swift, for instance, used it in 1711, writing "I am resolved to drink ten times less than before." It wasn't till the 20th century that language commentators - not mathematicians - came up with the notion that "three times closer" and "100 times slower" were illogical and confusing."
from http://www.boston.com/news/globe/ideas/articles/2007/10/21/do_the_math/ [boston.com]
Just because it sounds like it can be misinterpr
Re: (Score:2)
Re: (Score:2)
Could we just drop the deci- prefix and go with bels instead? Deci- isn't an SI prefix and it makes bels, an already hard to understand measurement for stupid^W lay people to not get confused by, something even harder to comprehend for these individuals.
Re: (Score:1)
I don't think changing things at this stage would help. People are generally aware that decibels indicates loudness (although they do seem to consider it an absolute linear scale). Talking about "bels" would make them wonder just what you're on about.
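For anyone tempted to read decibels as a linear scale, a small sketch of the conversion may help. The helper names are made up for illustration; the formulas are the standard ones for power ratios, where 1 bel is a factor of 10 and 1 decibel is a tenth of a bel.

```python
import math


def db_to_power_ratio(db: float) -> float:
    """Power ratio corresponding to a level difference in decibels."""
    return 10 ** (db / 10)


def power_ratio_to_db(ratio: float) -> float:
    """Level difference in decibels for a given power ratio."""
    return 10 * math.log10(ratio)


if __name__ == "__main__":
    for db in (3, 10, 20, 30):
        print(db, "dB =", db / 10, "bel ->", round(db_to_power_ratio(db), 2), "x power")
```

So going from 10 dB to 20 dB is not "twice as much": it is ten times the power.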
Re: (Score:2)
Some prefixes have just become more commonly used, as they lie within the range of human perception and usefulness, and translating to another SI unit is a mental step. Like your lay person wouldn't grasp that the speed of light is 300 megameters per second, since we stop at kilometers in our normal usage. The logarithmic measure of energy keeps the size of the number comfortable, and makes i
Re: (Score:3, Funny)
consume 5 times less power, and be 10 times cheaper
Actually I'm more concerned about the rise of the eco-cracker. The "green cracker" who wants to have a low carbon footprint and crack into your bank account inexpensively.
Re: (Score:2)
People are able to grasp the meaning of the statement, and it's in use by so many people now that I've stopped trying to fight it. After trying to explain so many times why "X times less" is wrong, I've given up. I suggest you do, too.
"Enthused" still annoys me, though.
And "I could care less" pisses me off to no end.
24 video cards... (Score:2)
Re: (Score:2)
RTFS again.
Marc Bevand claimed a cluster of 12 machines with 24 video cards [...]
Re: (Score:2)
So Who Said That ATI Cards Aren't Programmable? (Score:2)
Re:So Who Said That ATI Cards Aren't Programmable? (Score:5, Informative)
ATI cards are programmable; Brook+ is just a little too high level for writing simple computational kernels (you drop too much performance) and CAL too low level for most people (it's basically assembly). So generally people just stick to CUDA, even in the few cases where ATI's architecture is superior.
This problem is ideal for ATI: very little input is necessary (NVIDIA has more texture samplers) and no inter-thread communication is necessary (ATI does not have random writes on its local data share at the moment, making that communication harder than it is with NVIDIA). So basically it just comes down to FLOPS and ATI wins big there.
Basically this was done in CAL because it was done by a hacker and not by an academic researcher (who doesn't really care about performance if he can just as easily get his paper published on a slower GPU with less effort, easier in fact since editors know CUDA).
Re: (Score:2)
So who has been saying all along that GPU compute on ATI cards just isn't up to snuff?
Mainly people who haven't been paying attention to what ATI has been doing since AMD bought it and began merging tech ~3 years ago, along with the usual business/management changes that go with that kind of consolidation. Basically today's ATI isn't the ATI of just a few years ago.
To be fair to those folks, the Radeon HD 4800 series is, roughly speaking, less than 2 years old, with the 4850 X2 being only ~1 year old. Before the HD 4800 series came out (based on the RV770 [wikipedia.org]), which was the *second* generatio
OpenCL Anyone? (Score:2)
It would be very interesting to see if this class of algorithm ports easily to OpenCL - the GPGPU technology built into the upcoming 10.6 version of Mac OS X:
http://www.apple.com/macosx/technology/#opencl [apple.com]
If so, this kind of attack suddenly becomes very easy to gather the compute power for and a lot easier to code as you don't need to do the low-level stuff yourself.
Today's news: CPUs process data! (Score:2, Funny)
Well that explains it! (Score:1)
Huh, so that's who bought all those PS3s.
Is it time for a new math copro war? (Score:2)
It'd be interesting to have a modern-day mathematical monster installed in every PC for a number of different tasks, from 3D rendering to