MD5 Proven Ineffective for App Signatures 117
prostoalex writes "Marc Stevens, Arjen K. Lenstra, and Benne de Weger have released their paper 'Vulnerability of software integrity and code signing applications to chosen-prefix collisions for MD5'. It describes a reproducible attack on MD5 algorithms to fake software signatures. Researchers start off with two simplistic Windows applications — HelloWorld.exe and GoodbyeWorld.exe, and apply a known prefix attack that makes md5() signatures for both of the applications identical. Researchers point out: 'For abusing a chosen-prefix collision on a software integrity protection or a code signing scheme, the attacker should be able to manipulate the files before they are being hashed and/or signed. This may mean that the attacker needs insider access to the party operating the trusted software integrity protection or code signing process.'"
Re:Nothing new (Score:2, Informative)
14 Sep 2005
use two hash functions (Score:5, Informative)
The case that matters is producing a program with the same checksum as a given program, without the ability to manipulate the correct program beforehand. That's still hard.
Nevertheless, code signing mechanisms in general should probably be prepared for flaws in hash functions. It might be best always to use two hash functions and to have some strategy of migrating. That way, if one hash function gets compromised, there is still another one in place and can be used until the original one has been replaced.
Re:Birthday Attack (Score:4, Informative)
This is a REMOTE attack, and reasonably potent (Score:4, Informative)
If you'd read the article, you'd see that one of the (prominent) possible attack scenarios listed is that of software distribution: distribute a good file, with the intent of replacing it later. For example, in debian, even with MD5 checksums on all your data, and tools reporting what's changed during the software update, this would still allow downloading infected files, without noticing.
It's a danger both from malicious distributors, and from hacked distribution sites.
Re:Nothing new (Score:5, Informative)
When you inspect these binaries at bit level, they contain only the "good" or the "bad" version, and some random data appended to it to make the MD5 hash of the files collide. This technique thus also works for file formats which don't have control statements such as "if" or "file starts at offset". See also: http://www.win.tue.nl/hashclash/Nostradamus/ [win.tue.nl], scroll down to: "Didn't Daum and Lucks do something like this in 2005?"
Marc Stevens already constructed these "chosen-prefix" collisions for X.509 Certificates, see the HashClash [win.tue.nl] project page. What's new in these results, is that it did not require massively distributed computing efforts, only one Playstation 3 and less than two days of computation. There is no paper available yet as to how he achieved this major optimization, but his MSc thesis [win.tue.nl] gives a clue: see "future work" at the end of section 7.4.
Re:Nothing new (Score:5, Informative)
X; if (X) then GOOD else EVIL
and
Y; if (X) then GOOD else EVIL
but the evil code would be in the signed good program, it would not be run.
The new attack is different: it is a method to generate blocks GX and EX for two random files such that the files GOOD+GX and EVIL+EX hash to the same checksum.
Re:Nothing new (Score:2, Informative)
This attack means that you get to choose the two files, and the attack generates two blocks to append to the original files which mean they hash to the same value.
So the exploits before have been:
Where [A] and [B] are blocks which collide (and they are aligned on block boundaries for calculating the hash).The new attack is exploited like this::
Where [C] and [D] are generated by this algorithm. This means that on a quick glance, the code doesn't look completely silly and there is no trace of what the hidden content says (just a bit of random stuff at the end of a file which might suggest that it is there)Re:ONE block, surely (Score:5, Informative)
Re:Nothing new (Score:3, Informative)
He *does* need access to good.exe. You can't generate a file that matches a given MD5, what you however can is generate two files that have the same MD5 and different content, both good.exe and evil.exe contain appended data to make the sums match. Its still a weakness, but a much less critical one then being able to generate a file for a given MD5.
Re:So MD5SUM veriefies downloads only (Score:3, Informative)
Re:Nothing new (Score:2, Informative)
Because you want an honest party to verify the "good" one, sign its MD5 with their trusted key, and actually distribute the good one.
Then you can in chosen circumstances replace it with the bad one (on, say, specific installs), and an ordinary audit will see the trusted signature on the package you thoughtfully provided on DVD.
Or think contracts: any signed-MD5 signature for a document in a format that ordinarily includes random-looking garbage is now untrustworthy, because what that person signed may have nothing in common with what you're being shown.
Re:Birthday Attack (Score:1, Informative)
That's the whole point of the parent, isn't it? Unless I'm missing something, the collision is generated by a birthday attack approach and the parent was exactly right. It's why the attacker needs to write both GOOD.EXE and EVIL.EXE; if he could generate an arbitrary md5 sum, he could replace any GOOD file and this would be much worse.
IMHO it's still pretty annoying (though not new).
Re:Bost projects I've seen.. (Score:3, Informative)
As I understand it, the normal way to generate a digital signature is to use a hash algorithm like MD5 or SHA1 and then encrypt the hash with a private key. Then you verify by hashing the file and decrypting the signature with the public key and checking to see if they match. Therefore, distributing signatures instead of hashes is orthogonal to the discussion at hand. If the hash is broken, then the signature is broken, too.
See Wikipedia for more information on digital signatures [wikipedia.org].
Re:Well, duh! (Score:1, Informative)