MD5 Collision Source Code Released 411
SiliconEntity writes "The crypto world was shaken to its roots last year with the announcement of a new algorithm to find collisions in the still widely-used MD5 hash algorithm. Despite considerable work and commentary since then, no source code for finding such collisions has been published. Until today! Patrick Stach has announced the availability of his source code for finding MD5 collisions and MD4 collisions (Coral cache links provided to prevent slashdotting). MD4 collisions can be found in a few seconds (but nobody uses that any more), while MD5 collisions (still being used!) take 45 minutes on a 1.6 GHz P4. At last we will be able to implement various attacks which have been purely hypothetical until now. This more than anything should be the final stake in the heart of MD5, now that anyone can generate collisions whenever they want."
SHA1 (Score:5, Funny)
Re:SHA1 (Score:4, Informative)
Re:SHA1 (Score:2)
Re:SHA1 (Score:5, Informative)
No, see:
http://www.computerworld.com/securitytopics/secur
and
http://www.computerworld.com/softwaretopics/softw
I like this quote:
"SHA-1 is a wounded fish in shark-infested waters, but I'm more worried about MD5 because it's used everywhere," said Niels Ferguson, a cryptographer at Microsoft Corp. "Try to switch away from SHA-1 as quickly as you can, but switch away from MD5 first," he said when asked what recommendations he has regarding the algorithms during a panel discussion at the conference.
Re:SHA1 (Score:4, Insightful)
Someone mod the GP post Funny, before we get more "informative" posts. It looked like a tongue-in-cheek comment to me. I actually laughed. Then I saw the follow-ups from the Unfunny Brigade... It was a joke!
Seriously, who doesn't know about the SHA-1 weakness by now.
Re:SHA1 (Score:5, Informative)
Crypto people are recommending SHA-256 or SHA-512 which is only like SHA-1 in name.
Obviously check your the hash length beforehand and make sure your database column is wide enough.
When migrating existing hashes to the new hash be careful not to store the old hash anywhere -- that can be the weak link in the chain. For example, generating passwords and having the MD5 around lets attackers generate valid inputs and then try them against the more computationally complex hash. It gives them an approach to attacking your stronger hash.
Take a copy of your database and hash all the existing passwords into SHA-512 form, and you'll need some way of distinguishing the MD5-to-SHA512 hashes from the SHA512 hashes, so add a date column with todays date in it. Then write a function "hashString" as a wrapper that can identify when something was hashed, and go down a different branch of code based on that.
The first branch does MD5 then SHA512, the second branch does SHA512, and it does this based on the date column.
And, of course, re-salt both branches.
Re:SHA1 (Score:3, Informative)
hash1 = md5(input1)
hash2 = sha1(input1)
Then to validate you run the input through the same thing. Like I said, the chances of a valid input colliding with both are ignorably slim.
Re:SHA1 (Score:4, Informative)
Look up the Joux Multicollision paper from 2004 CRYPTO. This is a famous result.
This scheme you propose has been broken by Joux.
-Matt
Exactly! (Score:4, Informative)
Because that would require the hashed password and a preimage attack.
See here [cryptography.com].
Summary for those who are lazy:
In conclusion, the value of this attack/exploit is only relative to how the hash function is used in an application. Just because this exploit and source code for it exists, that does not that these hashing algorithms are completely useless.
Re:SHA1 (NO) (Score:4, Informative)
MD5 is dead. SHA-1 is currently dying. They're both based on the same fundamental math, and the attacks on SHA1 are getting stronger and stronger. Now would be a really good time to get off of that entire family of hashes if you can (MD4, MD5, RIPEMD-* SHA-*, etc).
The crypto world is in a little bit of a bind when it comes to strong hashes now. We simply don't have any left that both have a long proven track record of analysis and haven't been seriously compromised. Best bet, IMHO, is to go with a new-blood hash algorithm. They seem to be the answer we're looking for - but of course what they lack is more years of intensive study by the experts for flaws they themselves might harbor.
So if you want to give them a whirl, I'd recommend looking at Tiger and Whirlpool:
http://en.wikipedia.org/wiki/Tiger_(hash) [wikipedia.org]
http://en.wikipedia.org/wiki/Whirlpool_(algorithm
SHA1 is not a good alternative in some cases (Score:4, Interesting)
For the impatient, here is a summary for my recommendations for 2005-2006:
See the paper for mode details.
Managed to get just the last few lines... (Score:4, Funny)
(With sincere apologies to Bryce Jasmer [netfunny.com].)
Re:Managed to get just the last few lines... (Score:5, Funny)
-confused
shaken to our what? (Score:3, Insightful)
Any half-way intelligent cryptographer would have suggested SHA-1, TIGER or perhaps HAVAL since quite some time already.
Tom
Re:shaken to our what? (Score:3, Insightful)
But a household user can crack in an hour on a normal mid-line computer something that "a few years" ago professionals might have recommended. That's the news. If low-end PCs can crack encryption that's only a few years outdated, then the hounds are nipping at the heels of the cyptography industry. And imagine what hackers could do with more powerful computers (yes, I know ther
MD5 is not an encryption algo (Score:2, Insightful)
Re:MD5 is not an encryption algo (Score:2)
It's also the default algorithm to hash passwords (i.e. if you type in your password, it gets hashed into an MD5 sum which is then compared to what the MD5-ed password should be, thereby avoiding plaintext password storage).
Lucklily, most sane systems use salt [wikipedia.org] so this algorithm won't work out of the box.
Re:MD5 is not an encryption algo (Score:5, Informative)
t's also the default algorithm to hash passwords (i.e. if you type in your password, it gets hashed into an MD5 sum which is then compared to what the MD5-ed password should be, thereby avoiding plaintext password storage).
Doesn't matter. This attack has no significant effect on password hashing, with or without salt.
This attack allows you to find a pair of plaintexts that hash to the same value; you don't get to pick either the plaintexts or the hash value. It does not help you find a plaintext that hashes to a given value. To use this to attack an unsalted password hashing system you would need to first generate a collision, then convince the target of your attack to set one of those plaintexts to be his/her password, then you could log in using the other plaintext. But if you can convince the target to use a particular password, why not just use that to log in?
This is not an insignificant cryptologic result, and people should move away from MD-5 as fast as practically possible (actually, people have been moving away from it for years due to some results against MD-4, which MD-5 is very similar to) but it doesn't really have any practical implications right now.
breaking torrents? (Score:5, Insightful)
now if you you were a software company you could put torrents out (I assume they use blocks of MD5sum), and then after the torrent becomes popular start randomly seeding people with blocks that hash correctly but are complete garbage (since you can't pick what exactly you hash). if you do it right you would have games out there that would still mostly run. but would crash, or have garbled game data, etc.
I'm not sure if this is really all that useful. but this exploit certainly seems to make it easy to do.
Re:breaking torrents? (Score:4, Insightful)
Though to be fair, most games seem to come in the form of a compressed archive of some sort -- either a bunch of .rar files (for warez) or a .exe file with .cab files (for Windows installers) or something similar. In that case, the corruption would be detected at installation, though it wouldn't be easy to determine from the torrent exactly which blocks are corrupt.
In short, MD5 being broken (and now the code being available) is very bad. I expect to see the anti-piracy vigilantes jumping on this very quickly to create code to totally break bit torrent and similar things -- it would come up and look like a seed, but would be spewing garbage that the other clients couldn't detect as garbage. Of course, such poisoning would also end up being used to corrupt completely legitimate torrents as well, just because people think it's fun.
(The fix? Update things like bittorrent to use hash routines that haven't been cracked yet, or to use multiple hash routines on the same blocks.)
Re:breaking torrents? (Score:3, Interesting)
now if you you were a software company you could put torrents out (I assume they use blocks of MD5sum), and then after the torrent becomes popular start randomly seeding people with blocks that hash correctly but are complete garbage (since you can't pick what exactly you hash)
Yeah, they could do that. They could even use the same trick as the guys who showed the Postscript exploit and make the software do different things based on which of the collision pair is in the file. Get the right version and t
MD5 and digital signatures (Score:3, Informative)
The main problem with MD5 as it's used today comes when MD5 is used as a component of a digital signature scheme. Most digital signature schemes based on public key crypto work like this:
To verify a digital signature, the following is performed:
Re:MD5 and digital signatures (Score:3, Interesting)
And well, here's a bit of news: someone has done just that with X.509 certificates based on MD5.
Ouch. I'm not sure how I missed that one. The first page of their paper shows that I greatly underestimated the severity of the attack. What I hadn't realized is:
Re:MD5 is not an encryption algo (Score:3, Insightful)
For one: any system that uses one-way hashed password storage or key generation mechanism, as many sites use, is now some percentage easier to violate because multiple strings may resolve to same hash.
It's the case with all one-way hashes that multiple strings hash to the same value. Since an n-bit hash function can only produce 2^n distinct hash values, if there are more than 2^n possible input plaintexts, than multiple plaintexts hash to single values. In fact, since MD5 and similar can hash inputs o
Re:MD5 is not an encryption algo (Score:5, Informative)
Stop right there, because it's quite clear that the one who doesn't understand is you. Nobody has the ability to generate a collision for a given MD5 hash. All we have is the ability to generate two bits of random junk that share the same hash. This makes some attacks possible, but it does not make it possible for you to distribute a malicious version of someone else's software that has the same MD5 hash as their version.
Re:shaken to our what? (Score:2)
SHA-1??? (Score:4, Insightful)
TIGER is good, as is Whirlpool. Whirlpool has the advantage that it uses AES as the basis, and AES is regarded as secure. It was also certified for secure work by NESSIE - a European group trying to do for the EU what NIST does for the US - which means that it's probably certified for use in secure environments in Europe.
According to the Hashing Function Lounge, there are other hashing functions that have not been broken (eg: cellhash and fft-hash) but these are sufficiently obscure that a lack of a known exploit may be through lack of study and not through the presence of good security. It would make them good for beating skript-kiddies, as they won't have the skills to find exploits and those skilled enough at finding them aren't studying those algorithms much. (I don't like security through obscurity, but technically these aren't obscured as anyone CAN study the algorithm.)
Re:SHA-1??? (Score:5, Informative)
How to calculate SHA-2 under GNU/Linux (Score:4, Informative)
A lot of people have the programs md5sum and sha1sum installed, but they often don't have equivalent programs for the SHA-2 family. To calculate those you can use the command:
or you can download a dedicated program: http://www.ouah.org/~ogay/sha2/ [ouah.org].
Re:SHA-1??? (Score:3, Interesting)
Re:SHA-1??? (Score:3, Interesting)
And then, when generating hashes of a 5 gigabyte file on my hard drive (there's been a lot of good stuff on Freshmeat, recently):
So you found a collision, big deal (Score:3, Interesting)
Re:So you found a collision, big deal (Score:5, Informative)
It demonstrates the generation of two postscript files with the same MD5 hash that nevertheless display completely differently.
Re:So you found a collision, big deal (Score:5, Informative)
Re:So you found a collision, big deal (Score:3, Informative)
How?
The collision vulnerabilities do not allow this. They require both the MD5, and the original plaintext, to produce a modified plaintext that has the same MD5.
Think about it - how do you know it's a collision at all, unle
Re:So you found a collision, big deal (Score:4, Insightful)
The actual issues are for document signing; the attacker could give you one document to sign, and use the signature on a different document with the same hash. There are smaller issues in the case where code expects no two documents to have the same hash. Obviously, collisions must exist, but the code to handle the case is likely not to be well-tested, since test cases were previously impractical to find.
You can spoof (almost) arbitrary documents! (Score:5, Interesting)
Most document formats have lots of "dead space", parts you can pretty much modify at will without changing what the user actually sees. Comments in HTML or PostScript. Old junk data in Word documents. Executables can have just about anything you like added if you know your stuff. The MD5 attacks currently available only 128 "dead space" bytes to generate a collision. So far from being a gibberish document, one can generate almost any document you want. This page has a simple example with PostScript files. [cits.rub.de] Both files have the same MD5 hash, but one is a relatively harmless letter of recommendation while the other is a grant of security clearance. Get your boss to sign your letter of recommendation digitally, swap in the security clearance file, and pass it on. This is a Big Deal and a Major Problem.
MD5 broken, not useless (Score:2, Insightful)
md5 is still good for keeping track of if your files have changed. It should not be used for document signing.
Re:MD5 broken, not useless (Score:2)
But only for non-security uses. Don't use MD5 in your intrusion detection system.
So what the hell do I do now? (Score:5, Interesting)
Re:So what the hell do I do now? (Score:2)
Re:So what the hell do I do now? (Score:5, Insightful)
MD5 has not been invalidated for those uses. Checking the MD5 sum of an ISO download is not done for security purposes, it's done so that you can make sure you didn't get a bad byte or two somewhere in that 650MB. I mean, if hackers could upload a malware-filled ISO to the FTP server, they could upload a new MD5SUMS file too, right?
Re:So what the hell do I do now? (Score:5, Informative)
Re:So what the hell do I do now? (Score:3, Insightful)
Heh, sorry, but you lose. I use them for that.
I've never had a problem with an "infected" iso, but I have had one or two that downloaded wrong and flat out failed to work after I burned them. Sure enough, I check the md5 sum and it's not correct- download the iso again from the same site and it works.
So I guess maybe either my hard drive is funky (or some other system failure on one end) or this "packet integrity" isn't so great.
Re:So what the hell do I do now? (Score:2)
Re:So what the hell do I do now? (Score:2)
Re:So what the hell do I do now? (Score:2)
Re:So what the hell do I do now? (Score:4, Informative)
$ rpm --checksig lynx-2.8.5-23.2.x86_64.rpm
lynx-2.8.5-23.2.x86_64.rpm: (sha1) dsa sha1 md5 gpg OK
$
So, even if it is also possible to generate collisions for DSA and GPG keys as well as SHA1 and MD5, the chances of being able to generate a collision for all four checksums/signatures at the same time is quite likely infinitesimally small. And that's just for a random file, things are going to get much more complex if you want that random file to can pass as whatever format the original was supposed to be and actually deliver a payload that might do something useful for the cracker.
Re:So what the hell do I do now? (Score:4, Informative)
So, even if it is also possible to generate collisions for DSA and GPG keys as well as SHA1 and MD5, the chances of being able to generate a collision for all four checksums/signatures at the same time is quite likely infinitesimally small.
Actually, only SHA1 and MD5 need collide. DSA and GPG aren't used for hashing. DSA is used for signing the hashes, and GPG just implements and structures all of the above.
Still, you're right that finding collisions for both MD5 and SHA-1 on the same pair of files is extremely unlikely.
And that's just for a random file, things are going to get much more complex if you want that random file to can pass as whatever format the original was supposed to be and actually deliver a payload that might do something useful for the cracker.
Hammerhead, meet nail.
The cycle begins anew... (Score:2, Funny)
Re:The cycle begins anew... (Score:2)
Should I care? (Score:5, Funny)
Replacement Hash Functions (Score:5, Informative)
http://en.wikipedia.org/wiki/SHA-2 [wikipedia.org]
http://en.wikipedia.org/wiki/WHIRLPOOL [wikipedia.org]
http://en.wikipedia.org/wiki/RIPEMD-160 [wikipedia.org]
Why? (Score:4, Insightful)
I use HMAC-SHA-256 with PasswordMaker.
https://passwordmaker.org/passwordmaker.html [passwordmaker.org]
Re:Why? (Score:5, Insightful)
Re:Why? (Score:2)
Re:Why? (Score:2)
In that case, if MD5 is broken and 'data' can be constructed to give the same MD5 hash, the SHA1 hash provides no extra security. In fact calculating a hash of a hash reduces the security, as if ANY of the algorithms in the chain are broken, the final result can be manipulated easily.
Re:Why? (Score:3, Insightful)
True, but you could use a hash function like SHA1('data').MD5('data'), where the . operator stands for string concatenation.
The reason that this isn't generally done is that should not provide more security than a proper cryptographic hash algorithm that produces hashes as long as the two different hash algorithms concatenated together.
If you want additional collision resistance, just generate a longer hash. I believe this is how people are advised to handle SHA-class algorithms right now.
Re:Why? (Score:3, Insightful)
Hmmm....
I actually like the concatenation more. The reason is that the longer hash solution only provides protection provided that the hash algorythm is not broken. Once it is broken your entire system fails. In other words, the security doesn't have much of a concept of depth.
If you use two independant hash algorythms that are different, then the cha
Re:Why? (Score:4, Informative)
Just like mixing medications can have very bad synergistic side effects, so should encryption or hashing technologies be mixed and matched.
As an example, when DES was first known to be broken, the most intuitive solution would be to double-encrypt the plaintext. However, upon cryptographic analysis, this acutally fails to improve the complexity of an attack (and in some cases may simplify it). Thus, Triple DES.
Be very wary of trying to combine "broken" algorithms in an attempt to gain security, especially if you have no real grounding in cryptanalysis. Vulnerabilities in each have a nasty tendency to either amplify or at least complement each other in highly unpredictible ways.
Remember one of the basic tenets of cryptography: it's easy to create an algorithm that you can't break. But just because you can't think of a way to break it doesn't mean there's not a trivial way to do so.
Re:Why? (Score:3, Interesting)
Your point is well taken. However, in this case, I believe you are mistaken. Indeed 3DES is just an extended form of DES which uses 2 keys and three encryptions. But here you are speaking of access so the analogy is likely to be fatal
Re:Why? (Score:3, Insightful)
Now, with my proposal, one would include independant hashs which would be checked independantly. If either one fails, one assumes that the data has been tampered with. The issue is that it would be difficult to defeat both simultaneously for this specific type of check. Being able to do so on demand while editing the file in a meaningful way might well prove impossible.
Yes, but only if you mean 'might well prove impossible' in the same way that it 'might well prove impossible' to break SHA-1 or MD5. Th
Re:Why? (Score:2)
Every re-hashing increases the collision rate, so generally for highest security, you use one unbroken hashing algorithm. Yes, using more than one hashing algorithm would reduce the chances that your whole line of defense is invalidated one day, but it increases the chance that a single hole could be found.
Re:Why? (Score:2)
If this was in fact more secure than perhaps it would have already been released bundled as one algorithm. Why make people use two sets of algorithms unless the goal is to confuse potential crackers? However, mose uses of MD5 involve the recipient knowing the algorithm used so that wouldn't work.
Re:Why? (Score:3, Informative)
I noticed that NetBSD's source-based package management system, pkgsrc [netbsd.org], already does this using SHA1 and RMD160 (apparently RIPEMD-160 [wikipedia.org] is the official name for the digest). Here's what it looks like in the archive fetching phase of a package installation:
=> Checksum SHA1 OK for unzip-5.52/unzip552.t
Re:Why? (Score:3, Insightful)
As a mathematician, this "seems" like it would be intuitively much more difficult for someone to discover an attack against.
As an engineer, this seems like it is obviously NOT a "good" solution.
If H1() and H2() are both weak, the hybrid H12() is obviously stronger than either, but also fundamentally flawed.
Right now it's time to... (Score:3, Interesting)
In other words, right now it's time to...
...LICK OUT THE KAMS, NOTHERGUCKERS!
bittorrent? (Score:5, Insightful)
Re:bittorrent? (Score:5, Informative)
Re:bittorrent? (Score:2)
Re:bittorrent? (Score:2)
Re:bittorrent? (Score:3, Informative)
WTF this source is useless (Score:4, Interesting)
Re:WTF this source is useless (Score:3, Insightful)
Not all code documentation lives in the source.
Re:WTF this source is useless (Score:4, Informative)
If you RTFRFC [faqs.org] you might notice that those are the variable and section names used in the algorithm specificaiton.
This is misleading - MD5 is still useful (Score:5, Insightful)
Re:This is misleading - MD5 is still useful (Score:3, Insightful)
Yet.
Re:This is misleading - MD5 is still useful (Score:2)
MOD ME DOWN (Score:5, Informative)
The parent comment, which I wrote, was based on a severe misunderstanding of the extent of the capability of the attack. In particular, I didn't realize that the attack could find collisions even with arbitrary, attacker-specified IVs. What that means is that it is indeed possible to generate x.509 certificates containing different keys but the same MD5 hash (and therefore the same signature). In fact, it's been done [win.tue.nl].
Collisions do not mean the end of MD5 (Score:5, Insightful)
No, no, no. This does not allow an attacker to generate any collision they like. They cannot find data that collides with a piece of data I provide them with. All they can do is provide me with 2 pieces of data that happen to collide.
This means that an attacker can theoretically provide 2 different documents to people with the same hash, but they cannot easily produce a document that has the same hash as a document I have written.
(Disclaimer: I haven't actually been able to RTFA (it's
"broken" does not mean broken (Score:5, Informative)
This does not totally invalidate MD5 for verification. This attack still does not let you poison a torrent feed, etc, unless you are the author of the original source data and you engineered the data specifically to be vulnerable to this attack.
Re:"broken" does not mean broken (Score:2)
Will people stop saying this? (Score:3, Insightful)
No, no it won't be. It won't be, because MD5 is useful for many things where the existance of an "easy" (in quotes because easy is relative) method of generating colisions is irrelavant.
It won't even kill off the use of MD5 checksums as a signature for verifying authenticity, because if your data is smaller than the checksum there may not be a colision at all, and an exploit wouldn't matter.
This is an important discovery, but it doesn't make MD5 useless any more than CRC32 is useless.
Got Salt? (Score:2, Insightful)
But aren't most people using MD5 using salted (as opposed to unsalted) hashes? (for those unclear on the difference, a "salted" has basically uses a local seed as part of it's MD5 hash, in addition to the value to be encrypted)
Doesn't seem likely that salted hashes can be easily broken by this technique, although clearly it's a concern that, should the salt value become known, all your passwords, etc, become breakable...
Not -that- bad (Score:3, Interesting)
If we could craft a collision block that contained a specified string at a specified position, that would be another issue altogether.
The ability to find collision blocks easily does suggest that crafted collision blocks might be possible, but for now, you have as good a chance of getting a viable exploit out of
This doesn't mean we shouldn't look to other options for the newest releases of high-security software, but it doesn't mean that the md5 algorithm should be purged from our systems altogether either. It's still extremely valuable at detecting accidental corruption, and useful-with-caveats at detecting malicious corruption (45 minutes to discover a block of data that matches the sum is not really useful in either speed or resulting data for any kind of man in the middle attack, for example, so using md5 to validate network packets is safer than using it to validate disk files).
Of course, the black hats may know more than we do about md5 weaknesses, but 'may know' is just as true of any other algorithm.
MD5 and verification (Score:5, Insightful)
It might only take minutes to calculate two random strings with the same hash, but it would still take a very long time to calculate a second string that collides with a pre-existing string. So even though it is now cryptographically weak, it can still be used effectively to check the integrity of files.
Re:MD5 and verification (Score:2)
Coral cache? (Score:5, Funny)
Im sorry, you must be new here.
For windows users (Score:2)
if(!CryptAcquireContext(&cryptHandle, NULL, NULL, PROV_RSA_FULL, CRYPT_NEWKEYSET ))
with
if(!CryptAcquireContext(&cryptHandle, NULL, NULL, PROV_RSA_FULL, 0 ))
To actually run the program you have to convert your MD5 to four ints. Take your MD5, such as 098f6bcd4621d373cade4e832627b4f6
Divide it in four pieces and convert them to dec.
Hex
=======
098f6bcd
4621d373
cade4e83
2627b4f6
Dec
=======
160394189
1176621939
340356672
Still not a preimage attack (Score:2)
It's the same attack that has already ben spoken about, just now it is available for the masses.
The scope of the attack is one can generate two files having the same MD5 sum, if he can have a random looking section in the middle of the file. i.e. possible in many binary formats but not possible in well formed ASCII text.
What the attack doesn't do is given a MD5 hash being able to find a byte stream that hashes to the same value. So passwords stored as their MD5 sums are still safe, you can't attack t
Weak code. (Score:4, Funny)
Q and A (Score:4, Informative)
Question: Does this mean all my MD5 passwords for all my users can be cracked?
Answer: The short answer is yes, they can be cracked. The long answer is no, not if you used a salt, and the attacker has to get those md5 hashes first. You are not safe you are storing your user's password field input directly to the database ala the php/sql method of:
Question: How should I remedy this?
Answer: Always use a salt or salts. For example in the case above you could use this php/sql method instead:
Question: How/Why is this safer?
Answer: Collisions are only direct input for the md5 function to get the same md5 hash. So in the above case, $password was directly taken from the user and made into a hash. Assuming an attacker got an SQL injection in and grabbed the database, they could run this collision creator on a hash and produce an input that gave that exact hash.
But, this would be much more difficult with any code that introduced a salt. That is why the second code is better, it includes two salts that the attacker (through his SQL injection) is unable to account for.
Re:Q and A (Score:5, Interesting)
This is a very bad password salting scheme and vulnerable to a dictionary attack. Once I have your database and salts, I can run a dictionary of common passwords through your scheme and crack any weak passwords.
You can make things much harder by having your salt change for each password - include the username for example. Now I have to run my entire dictionary through the sha/md5 function for each user. By doing this, you make the attack O(m*n) instead of O(m) (where m = the number of words in my dictionary and n = the number of users).
And as you mentioned in a follow up post, this code only generates documents with identical md5 sums, it does not generate a document with a given sum. So MD5 is broken for document signing and the like, but secure for password hashing for the time being.
Re:Q and A (Score:4, Insightful)
Way back when the hash was unix crypt(), everyone got a little nervous about the idea that someone with a lot of computing power to burn might just fill up a large hard-drive with a map of plaintext->hash for the entire oxford dictionary for example. If they did that, then all they'd need to do subsequently to figure out the plaintext for any of the values they did the crypt()'ing for would be to look through the rive for the crypt value that matched the crypt value in the password file.
The idea of a salt is simple, it changes the hashed result in a computably repeatable fashion, but one that requires that the entire hash has to be performed with the salt in mind in order to get the correct end result.
So using a completely public bit of knowledge is fine. What is preferable is that the salt be random yet repeatable. That's why you see unix passwd files have the salt prepended to the hash (in crypt, it use to be the first 2 characters for example), you generate it randomly once, then whenever you want to see if the plaintext you've been given is the same, you grab the salt, hash the plaintext with it, and then check to see if the result matches the original hash.
In summary, a salt protects you from the problem of a pre-computed hash dictionary, nothing else.
Doesn't Microsoft networking still use MD4? (Score:3, Interesting)
Adi Shamir's Hash Function *IS* 'unbreakable'.... (Score:4, Informative)
(from material at the Pure Crypto Project - http://senderek.de/pcp/ [senderek.de] )
Quote below from http://senderek.de/pcp/pcp-security.html [senderek.de]
The nice thing about Adi Shamir's hash function is that it, as well as the RSA cryptosystem co-created with Rivest and Len Adleman is all based on simple modular exponentiation.
Too bad the Feds consider arbitrary precision mathematics used for encryption purposes to be 'a munition' and 'a controlled export'....
Years ago, they raked Phil Zimmerman [philzimmermann.com] over the coals over his email cryptosystem PGP then (eventually) left him alone.
Can't cryptosavvy individuals secure the details of their affairs with strong encryption WITHOUT being hassled by 'the Man'?...
P.S. However, Rivest came up with a scheme that gives you 'confidientiality *without* encryption' through a scheme he calls Chaffing and Winnowing [mit.edu]
Enjoy!
Re:Torrent Poisoning (Score:2)
Re:The end of edonkey (Score:5, Informative)
It is still very difficult to produce a colliding file given a pre-existing file on the network.
It should also be noted that edonkey splits a file into 9500KB chunks, and then into smaller chunks again, and hashes each one. It would be far more difficult to produce a chunk that causes collisions on all three levels.
Anyway, I expect an eMule extension will come out soon to allow for sharing of SHA1 hashes between clients (if it doesn't already exist).
Re:This cannot be used to crack passwords (Score:3, Informative)
I may not have been clear enough... (Score:4, Interesting)
Or you may have a brain tumor...
The generaly accepted definition for "cracking a password" is, given the encrypted password, being able to generate a password the once entered in the authentication system will generate the same encrypted form.
For the second time, this attack does not permit that. This attack permits build two byte streams that hashes to the same value. So in a password context, assuming the password system permits the entry permits more than 1024 bits of arbitrary data to be entered as password (I don't remind the details well, but I think this is the amount of data that you must be able to change between the two streams) one could generate to "passwords" (being that long they don't qualify for this name anymore IMHO) that would hash to the same value.
How does that amount to an attack on MD5 passwords again? Never at any point the attack had been being able given a unknown to him MD5 hash to produce a stream that would hash to the same value.