Chinese Prof Cracks SHA-1 Data Encryption Scheme
Hades1010 writes to mention an article in the Epoch Times (a Chinese newspaper) about a brilliant Chinese professor who has cracked her fifth encryption scheme in ten years. This one's a doozy, too: she and her team have taken out the SHA-1 scheme, which includes the (highly thought of) MD5 algorithm. As a result, the U.S. government and major corporations will cease using the scheme within the next few years. From the article: " These two main algorithms are currently the crucial technology that electronic signatures and many other password securities use throughout the international community. They are widely used in banking, securities, and e-commerce. SHA-1 has been recognized as the cornerstone for modern Internet security. According to the article, in the early stages of Wang's research, there were other data encryption researchers who tried to crack it. However, none of them succeeded. This is why in 15 years Hash research had become the domain of hopeless research in many scientists' minds. "
Old (Score:5, Informative)
Slashdot editors are idiots. (Score:1, Informative)
Article is a bit confused (Score:5, Informative)
What? (Score:5, Informative)
They also use the word "online" too many times for me to take them seriously. The implication is that because the professor broke SHA-1, my online bank account is going to be drained. Not likely.
Hashing != Encryption (Score:5, Informative)
The original article is full of misstatements like this doozy:
this SHA-1 encryption includes the world's gold standard Message-Digest algorithm 5 (MD5). Before Professor Wang cracked it, the MD5 could only be deciphered by today's fastest supercomputer running codes for more than a million years.
SHA-1 is NOT encryption, and it certainly doesn't "include" MD5. They are two completely different hashing algorithms. Hash algorithms are not "deciphered". Neither of them has been "cracked". They have been found, in theory, to not be as collision-proof as previously thought, but no one has yet found a way to take one block of data and modify it such that it would have an identical hash signature as the original. Both are merely found to be not quite as collision-proof (the most important thing for any hashing algorithm) as previously thought. This is old news.
The original article blows and contains no useful information whatsoever, it was written by someone who hasn't the faintest hint of knowledge about cryptography or mathematics in general.
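To make the "hashing != encryption" point concrete, here is a minimal sketch using Python's standard hashlib module. The message text is just an illustration. It shows that MD5 and SHA-1 are separate functions with different digest sizes, that neither takes a key, and that the digest size stays fixed however large the input is.

```python
import hashlib

message = b"This is a message from Me to You"

# Two unrelated hash functions: neither takes a key, and there is
# nothing to "decipher" afterwards.
md5_digest = hashlib.md5(message).hexdigest()
sha1_digest = hashlib.sha1(message).hexdigest()

# Each hex character encodes 4 bits.
print(len(md5_digest) * 4, "bit MD5 digest:  ", md5_digest)
print(len(sha1_digest) * 4, "bit SHA-1 digest:", sha1_digest)

# The digest length is fixed no matter how large the input is, so the
# input cannot in general be recovered from the digest.
big_digest = hashlib.sha1(message * 100000).hexdigest()
print(len(big_digest) * 4, "bit SHA-1 digest of a much larger input")
```

An encryption scheme, by contrast, needs a key and has to be reversible by whoever holds that key.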
Re:Old (Score:5, Informative)
Here are Wang's papers on cracking hashes, which show the age of the cracks, from her webpage:
1) Xiaoyun Wang, Hongbo Yu, Yiqun Lisa Yin, Efficient Collision Search Attacks on SHA-0, Crypto'05.
2) Xiaoyun Wang, Yiqun Yin, Hongbo Yu, Finding Collisions in the Full SHA-1, Crypto'05.
3) Xiaoyun Wang, Yiqun Yin, Hongbo Yu, Collision Search Attacks on SHA1, 2005.
4) Arjen Lenstra, Xiaoyun Wang, Benne de Weger, Colliding X.509 Certificates, E-print 2005.
5) Xiaoyun Wang, Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD, Crypto'04, E-print.
6) X. Y. Wang, X. J. Lai etc, Cryptanalysis of the Hash Functions MD4 and RIPEMD, Eurocrypt'05.
7) X. Y. Wang, Hongbo Yu, How to Break MD5 and Other Hash Functions, Eurocrypt'05.
I believe at Crypto 2004 she was given a standing ovation for her presentation, which is almost unheard of in the ultra-competitive world of crypto.
Epoch Times (Score:5, Informative)
Far from being a Chinese newspaper it's actually published out of New York, and you might see (Chinese) people handing out copies on the street in your country (I see them in NZ from time to time).
So yeah, it wouldn't surprise me if the article was vague... I'd take it all with a grain of salt.
Re:How long until... (Score:1, Informative)
Snuffle (Score:5, Informative)
Any hash algorithm can be used as a stream cipher: hash the key and take successive values to make a pseudorandom stream, and then XOR it against the plaintext. This is the idea behind Daniel J. Bernstein's Snuffle ciphers [wikipedia.org].
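A rough sketch of that idea, under stated assumptions: this is NOT Snuffle or any real cipher, just a toy that hashes the key together with a counter to get successive keystream blocks and XORs them into the data. The function names, the key and the choice of SHA-256 are all made up for illustration, and there is no nonce, so reusing a key would leak information.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: block i is SHA-256(key || counter i)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_stream(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

key = b"hypothetical shared key"
plaintext = b"attack at dawn"
ciphertext = xor_stream(key, plaintext)
recovered = xor_stream(key, ciphertext)
print(ciphertext.hex(), recovered)
```

The hash only supplies pseudorandom bytes here; the construction around it is what makes it behave like a cipher.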
Published in New Scientist 17 December 2005 (Score:3, Informative)
Busted! A crisis in cryptography [newscientisttech.com]
"LAST year, I walked away saying thank God she didn't get a break in SHA-1," says William Burr. "Well, now she has." Burr, a cryptographer at the National Institute of Standards and Technology in Gaithersburg, Maryland, is talking about Xiaoyun Wang, a Chinese cryptographer with a formidable knack for breaking things. Last year Wang, now at Tsinghua University in Beijing, stunned the cryptographic community by breaking a widely used computer security formula called MD5. This year, to Burr's dismay, she went further. Much further."
Further information on the "crack" (Score:5, Informative)
In other words, this attack is 2^17, or 131,072 times faster than brute forcing the hash, and from what I've read, this is considered pretty impressive stuff. That said, crypto researchers have known for a while that SHA-1 is on its last legs. From Schneier's blog in February, 2005:
Digest Functions In Relation To Encryption (Score:3, Informative)
Re:Anyone have a link to a *coherent* translation? (Score:4, Informative)
http://www.infosec.sdu.edu.cn/people/wangxiaoyun.
The details on the hash collision can be found in the following papers:
Xiaoyun Wang, Yiqun Yin, Hongbo Yu, Finding Collisions in the Full SHA-1,Crypto'05
http://www.infosec.sdu.edu.cn/paper/Finding%20Col
Xiaoyun Wang, Yiqun Yin, Hongbo Yu, Collision Search Attacks on SHA1,2005
http://www.infosec.sdu.edu.cn/paper/Collision%20S
She has also previously found methods for collisions in X.509, MD4/MD5, HAVAL-128, RIPEMD and SHA-0.
However, the problem is not entirely the algorithms. There will always be collisions in hashing algorithms: if you could represent an infinite amount of data in 160/128/whatever bits, there would be no point in having 161/129/whatever bits. The fact that your hard drive is much larger than that is a testament that collisions are unavoidable in any type of algorithm where you try to uniquely represent X bits in Y bits (where X > Y). (Yes, I realize this is a somewhat oversimplified explanation.)
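The pigeonhole argument is easy to see on a deliberately tiny hash. The sketch below assumes a made-up 16-bit hash (the first two bytes of SHA-256, nothing anyone actually uses) and hashes the strings "0", "1", "2", ... until two of them land on the same value. With only 65,536 possible outputs, a repeat is guaranteed within 65,537 inputs, and the birthday effect usually produces one after a few hundred.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes) -> bytes:
    # Hypothetical 16-bit hash: the first 2 bytes of SHA-256.
    return hashlib.sha256(data).digest()[:2]

def find_collision():
    """Hash "0", "1", "2", ... until two inputs share a digest."""
    seen = {}  # digest -> first message that produced it
    for i in count():
        msg = str(i).encode()
        digest = tiny_hash(msg)
        if digest in seen:
            return seen[digest], msg, i + 1
        seen[digest] = msg

a, b, tries = find_collision()
print(a, b, tiny_hash(a).hex(), "after", tries, "messages")
```

Real hashes have the same property; the only thing a longer digest buys is that the search takes far too long to run.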
The problem is in the paradigm in which these algorithms get used. 'One hash to represent them all' is a broken mentality; use multiple hashing algorithms when it matters. While it is indeed possible that the same data can cause a collision in all of the employed algorithms, it's incredibly unlikely, and AFAIK no one has created a PoC where two sets of data produce the same checksum in both MD4 and SHA-0.
It WAS reported on Slashdot two years ago... (Score:3, Informative)
Incredibly old news. EE Times [eetimes.com] reported on it at the time, correctly referring to SHA-1 as a hashing algorithm, nothing more... by itself, anyway.
Re:Old (Score:1, Informative)
Re:Old (Score:5, Informative)
Ummm well...... (Score:3, Informative)
Re:Bullshit propaganda (Score:5, Informative)
It is actually run by the notorious Fa Lun Gong cult. The 'epoch' here refers to the new era the cult is supposed to bring us into, with the leader kind of like Jesus. A lot of the stuff in that outlet, especially the Chinese version, is total crap. Despite its lack of credibility, Epoch Times always seems to have quite a lot of money to burn. You can pick up the latest copy FREE at major convenience shops in your local Chinatown, amongst stuff like Jehovah's Witnesses' pamphlets. I even once found copies of both language versions at a community library here in the UK.
HERE's the coral cache: (Score:3, Informative)
Re:Old (Score:4, Informative)
The problem is that you're essentially creating a new hash function, H(x) = SHA1(x) || SHA256(x) || MD5(x), for which collisions can be computed piece-wise. To compute a collision for H(x), you can always start by creating a sequence of MD5 collisions, and see if any of these are also collisions for SHA-1 and SHA-256---which, I imagine, is more likely than you might think, since SHA1, SHA256, and MD5 all use the same basic design (compared to algorithms like Whirlpool). That won't necessarily work with a single hash function like SHA-512.
Re:Old (Score:2, Informative)
SHA1(m) || MD5(m). The resulting output is 128-bits + 160-bits. Even though the output is 288-bits, it really only gives about 2^70ish security, instead of the expected 144-bits of security.
-mattjf
Wrong, wrong, wrong. (Score:5, Informative)
"According to a Beijing digest, this SHA-1 encryption includes the world's gold standard Message-Digest algorithm 5 (MD5)."
Where do I start? SHA-1 stands for 'Secure Hash Algorithm 1' and is not an encryption scheme. Neither does it include MD5 which is a completely different hash (or message digest) algorithm.
See Schneier - http://www.schneier.com/blog/archives/2005/02/sha1_broken.html [schneier.com]
and http://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html [schneier.com] for actual coverage of the break. "They can find collisions in SHA-1 in 2**69 calculations, about 2,000 times faster than brute force. Right now, that is just on the far edge of feasibility with current technology. Two comparable massive computations illustrate that point." That's down from 2**80, so it's a concern, but not exactly the end of the world.
New apps being written should probably be using SHA-256 (256 bits) rather than SHA-1 (160 bits only).
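The arithmetic behind "about 2,000 times faster" and the SHA-256 suggestion, as a small sketch. The 2**69 figure is the one quoted above; the 2**(n/2) brute-force cost is the generic birthday bound for an n-bit hash.

```python
# Brute-force collision search on an n-bit hash costs about 2**(n/2).
sha1_bits = 160
brute_force_log2 = sha1_bits // 2   # 80
attack_log2 = 69                    # cost quoted from the Schneier post

speedup = 2 ** (brute_force_log2 - attack_log2)
print("speedup over brute force:", speedup)  # 2048

# The same generic bound for SHA-256, for comparison.
sha256_collision_log2 = 256 // 2
print("SHA-256 brute-force collision cost: 2**%d" % sha256_collision_log2)
```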
Re:How long until... (Score:1, Informative)
Re: MD5 is broken and should no longer be used (Score:3, Informative)
It is relatively easy with MD5. It would probably require less than a week of time on a modern computer, possibly only hours.
If you spent 10 million on an SHA-1 cracking box, it's estimated that it would take about 127 days to find two colliding files.
Here is a PDF that's my source [qut.edu.au] for this information.
An additional problem is that you can embed interesting things in .pdf, .ps or even HTML documents. You could embed both the evil code, and the good code. Then use a colliding block someone found a long time ago to choose between the evil code and the good code. So, once even one collision is found, it's possible to leverage that one collision into all kinds of existing documents because of the block nature of the two algorithms.
I expect that .pdf and .ps documents rarely see code review looking for evil code. So it's quite likely something like this would go completely undetected until the evil version was released into the wild, causing a ton of confusion and lost time before someone figured out what was wrong.
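Why one collision can be reused: MD5 and SHA-1 process the input block by block, so once two equal-length blocks bring the internal state to the same value, anything appended afterwards keeps the hashes equal. The sketch below shows this on a made-up iterated hash with a 16-bit state (built from SHA-256 purely for illustration, with padding omitted), since finding a real MD5 collision takes more than a few lines.

```python
import hashlib
from itertools import count

BLOCK = 8  # toy block size in bytes

def compress(state: bytes, block: bytes) -> bytes:
    # Hypothetical 16-bit compression function built from SHA-256.
    return hashlib.sha256(state + block).digest()[:2]

def toy_hash(data: bytes) -> bytes:
    """Iterate over fixed-size blocks, as MD5 and SHA-1 do (padding omitted)."""
    assert len(data) % BLOCK == 0
    state = b"\x00\x00"
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def colliding_blocks(state: bytes):
    """Search for two different blocks that lead to the same next state."""
    seen = {}
    for i in count():
        block = i.to_bytes(BLOCK, "big")
        out = compress(state, block)
        if out in seen:
            return seen[out], block
        seen[out] = block

a, b = colliding_blocks(b"\x00\x00")

# Shared tail; in the document trick it would inspect which block is
# present and show the good or the evil content accordingly.
suffix = b"if evil: run_evil() else: run_good()".ljust(40)

print(toy_hash(a + suffix) == toy_hash(b + suffix))  # True
```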
Re:Not so fast. (Score:4, Informative)
Re:Digest Functions In Relation To Encryption (Score:3, Informative)
This is a message from Me to You, send me some $$$!
If there was a weakness in the hash function you may be able to find another plaintext that generates the same hash code, for instance, the hash function may also return a code of 123456 for the plaintext:
fy87dsf5dkjsf75SI5sdfISAfd576fHFKhsudg6%&FDSHf576
Sounds pretty useful doesn't it! I mean, OH My God! They are going to be able to like break into my online bank account now! Yea right. The "duplicate" plaintext that you may find for a given hash code most likely won't even be recognizable, and certainly wouldn't be in a form that would be useful. For instance, a duplicate plaintext with the same hashcode of a TCP/IP frame wouldn't likely even be in the proper format to be able to be decoded.
Think about it.
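A sketch of what a brute-force "duplicate" looks like, assuming a deliberately weak made-up hash (the first 16 bits of SHA-256) so the search finishes in about 65,000 tries on average. The candidate format is arbitrary.

```python
import hashlib
from itertools import count

def weak_hash(data: bytes) -> bytes:
    # Hypothetical weak hash: the first 2 bytes (16 bits) of SHA-256.
    return hashlib.sha256(data).digest()[:2]

original = b"This is a message from Me to You, send me some $$$!"
target = weak_hash(original)

def second_preimage(target: bytes) -> bytes:
    """Try candidates until one hashes to the target value."""
    for i in count():
        candidate = f"{i:016x}".encode()
        if weak_hash(candidate) == target:
            return candidate

duplicate = second_preimage(target)
print(duplicate, weak_hash(duplicate).hex(), target.hex())
```

The match found this way is indeed meaningless, but only because the search made no attempt to stay meaningful. A search that varies invisible parts of a real document (whitespace, comments, padding) runs just as fast.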
Re:Not so fast. (Score:5, Informative)
WTF? Have you been living in a cave or something?
Crypto mailing lists, newsgroups, and discussion forums talked about almost nothing else for about six months following the announcement that SHA-1 had been broken.
Even the US government, which moves at the speed of a glacier, proposed replacements for SHA-1 in FIPS back in March last year.
http://csrc.nist.gov/publications/drafts.html [nist.gov]
Re:Old (Score:3, Informative)
Re:Multiple hashes (Score:5, Informative)
This exact proposal shows up, like clockwork, literally dozens and dozens of times for each slashdot story about hash functions. Since the number of people who know why this proposal fails is minuscule compared to the number of people who think of the idea, it is literally impossible to respond to all the people who keep suggesting this idea. I mean, even if all of us spent literally every minute of every day responding to people who suggest this idea, we would still not have time to reply to every single post.
Here is an old post [slashdot.org] on slashdot explaining exactly why this idea doesn't work. The post has some details wrong ... for example, the correct security strength of the combined md5+sha1 hash is in reality 2^80 + 160*2^64, which is much weaker than even the already weakened security level cited in the post. However, the general idea is correct, and if you google for the title of the paper cited in that post, you can find much more information.
I hope that this reply helps to educate at least one poster, but judging by the regularity with which this idea keeps reoccurring, it's a little bit like rearranging chairs on the Titanic.
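For scale, a quick calculation that takes the figure in this comment at face value (2^80 + 160*2^64 is the commenter's number, not independently checked here) and compares it with what the 288-bit combined output would naively suggest.

```python
import math

md5_bits, sha1_bits = 128, 160
combined_bits = md5_bits + sha1_bits   # 288-bit concatenated digest

# What you would hope for from a 288-bit hash: a 2**144 birthday bound.
naive_log2 = combined_bits // 2

# Cost quoted in the comment above for the combined md5+sha1 hash.
quoted_cost = 2 ** 80 + 160 * 2 ** 64
quoted_log2 = math.log2(quoted_cost)

print("naive expectation: 2**%d" % naive_log2)
print("quoted cost:       2**%.4f" % quoted_log2)
```

On those numbers, gluing MD5 onto SHA-1 adds almost nothing over the 2**80 you get from an unbroken SHA-1 alone.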
Re:Snuffle (Score:3, Informative)
While you can say that SHA-1 can be used as the basis for a cipher (such as Snuffle), that doesn't change the fact that SHA-1, by itself, is a hash function, not a cipher. SHA-1, by itself, is not an encryption algorithm. But Snuffle may very well be.
Joux's multicollisions attack (Score:3, Informative)
Actually, you don't know what you're talking about. Go read "Multicollisions in Iterated Hash Functions. Application to Cascaded Constructions" by Antoine Joux. Unfortunately, it's not generally available online, but Hal Finney wrote a nice explanation of the problem here [mail-archive.com].
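The core of the multicollision trick can be shown on a toy. The sketch assumes a made-up iterated hash with a 16-bit state (built from SHA-256, padding omitted); the real attack targets MD5 and SHA-1 at vastly higher cost. It finds a colliding pair of blocks, then another pair starting from the resulting state, and so on. k collision searches yield 2^k messages that all share one digest.

```python
import hashlib
from itertools import count, product

BLOCK = 8  # toy block size in bytes

def compress(state: bytes, block: bytes) -> bytes:
    # Hypothetical 16-bit compression function built from SHA-256.
    return hashlib.sha256(state + block).digest()[:2]

def toy_hash(data: bytes) -> bytes:
    state = b"\x00\x00"
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def colliding_blocks(state: bytes):
    """Two different blocks that map `state` to the same next state."""
    seen = {}
    for i in count():
        block = i.to_bytes(BLOCK, "big")
        out = compress(state, block)
        if out in seen:
            return seen[out], block, out
        seen[out] = block

def multicollision(k: int):
    """Chain k single-block collisions into 2**k colliding messages."""
    state = b"\x00\x00"
    pairs = []
    for _ in range(k):
        a, b, state = colliding_blocks(state)
        pairs.append((a, b))
    # Picking either block at each position gives the same final state.
    return [b"".join(choice) for choice in product(*pairs)]

messages = multicollision(4)
digests = {toy_hash(m) for m in messages}
print(len(set(messages)), "distinct messages,", len(digests), "digest")
# 16 distinct messages, 1 digest
```

Against a concatenation of two hashes, the attacker builds a large enough multicollision in the first hash and then looks among those messages for a pair that also collides in the second.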
Re:How long until... (Score:2, Informative)
Without the ability to break things like SHA-1 and RSA encryption, NSA's tremendous rate of information gathering is pointless, because most of the useful stuff is encrypted.
The continued existence and even growth of the NSA is proof that they have ways to break open all that encrypted information they're gathering.
Re:Not so fast. (Score:5, Informative)
True.
Also true AFAIK. I have not heard of anyone breaking those. But I must admit, I don't know if the weaknesses found in SHA-1 apply to other variants of SHA as well.
You are completely mistaken about this part. A chain is not stronger than the weakest link. If you do signatures using SHA-1 and RSA, only one of the two has to be broken to forge a signature. When you sign a message, you put a signature on the output of the hash. If anybody can find another message with the same hash, they can simply put together your signature with the other message, and it will be a valid signature on a message you had never seen.
What could save you is the fact that there are different degrees of brokenness for a hash function. There are three kinds of common attacks to attempt on a hash function. The easiest one is to just generate a collision where you get to choose both messages. Next comes the problem of generating a collision where you are given one of the messages. Finally the hardest case is to be given a hash value and having to generate a message with that hash without having already an example of how to reach that hash value.
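As a rough guide to how far apart these three problems are, here are the generic brute-force costs for an ideal n-bit hash: about 2^(n/2) for a free-choice collision (the birthday bound) and about 2^n for the other two. These are textbook bounds for an unbroken hash, not the cost of any specific published attack, and the function name is made up.

```python
def generic_costs(digest_bits: int) -> dict:
    """Log2 of brute-force work against an ideal hash of this size."""
    return {
        "collision": digest_bits // 2,    # attacker picks both messages
        "second preimage": digest_bits,   # attacker is given one message
        "preimage": digest_bits,          # attacker is given only the hash
    }

for name, bits in [("MD5", 128), ("SHA-1", 160), ("SHA-256", 256)]:
    costs = generic_costs(bits)
    summary = ", ".join(f"{kind} 2^{log2}" for kind, log2 in costs.items())
    print(f"{name}: {summary}")
```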
For MD5 an actual collision has been found, but still no algorithm to find a collision with an arbitrary message. For SHA1 there are AFAIK only demonstrated weaknesses. I have yet to see an actual SHA1 collision.
For signatures it might not be considered enough to just find a collision; after all, you have to match the hash of a message which was already signed. But even though you might feel secure, there are some things to worry about. First of all, once a technique to find collisions has been found, it only takes a little extra work to generate meaningful collisions. This is obvious to people with sufficient knowledge of the field, but others wouldn't believe this until it was actually demonstrated. With MD5 it has been demonstrated how to take two arbitrary plaintext files and generate from them two postscript files containing the two different texts but the same hash. Postscript was obviously chosen because the format contains a Turing complete language and thus was an easy target. But even simpler formats might be targeted with some additional work.
Consider the following scenario: you send a signed email to somebody. You receive a reply saying something like "thank you for your email, but we need the signature on a postscript version, could you please sign the attached file?", and you find attached a postscript file containing the exact text you originally wrote. Would you sign that postscript file?
Re:Not so fast. (Score:3, Informative)
Saying it once more for clarity:
1. You send a digitally signed email A which states, for example, that you do not approve of a particular business proposal.
2. They email you an unsigned postscript file A', which you print out for verification, and it looks just like your email. So you digitally sign it and email it to them.
3. They detach the digital signature from A' and attach it to another postscript file B', which states that you do approve of the proposal. Anyone attempting to verify the signature on B' will think you signed it.
4. You lose your job.
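Steps 2 and 3 can be sketched in a few lines, under heavy assumptions: the "signature" is an HMAC standing in for a real public-key signature, the hash is a made-up 16-bit one so a collision turns up instantly, and the two documents are plain strings with a reference number as the part the attacker varies. The point it illustrates is only that the signature covers the digest, so it carries over to any message with the same digest.

```python
import hashlib
import hmac
from itertools import count

SECRET = b"hypothetical signing key"  # stands in for the victim's private key

def weak_hash(data: bytes) -> bytes:
    # Hypothetical weak hash: the first 2 bytes (16 bits) of SHA-256.
    return hashlib.sha256(data).digest()[:2]

def sign(message: bytes) -> bytes:
    # As in real schemes, the digest is signed, not the message itself.
    return hmac.new(SECRET, weak_hash(message), hashlib.sha256).digest()

def verify(message: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(sign(message), signature)

def find_colliding_pair():
    """Attacker prepares A' (harmless) and B' (harmful) with equal digests."""
    rejects = {}
    for i in range(1024):
        doc = b"I do NOT approve the proposal. ref=%d" % i
        rejects[weak_hash(doc)] = doc
    for j in count():
        doc = b"I approve the proposal. ref=%d" % j
        if weak_hash(doc) in rejects:
            return rejects[weak_hash(doc)], doc

a_prime, b_prime = find_colliding_pair()
signature = sign(a_prime)           # step 2: the victim signs A'
print(a_prime, b_prime)
print(verify(b_prime, signature))   # step 3: True, B' verifies too
```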
Now get this: in actual fact, they don't even NEED a broken digital signature algorithm to trap you this way. It is possible -- not even difficult -- to construct a postscript file so that it prints out one way on a specific printer and a different way on every other printer. Unless you view the postscript code, you'll never know. Remember, postscript is a fully capable programming language, not just a page definition markup scheme.
Not a surprise - here are old references (Score:3, Informative)
Also Bruce Schneier [schneier.com] wrote about it back then.
I guess it takes a while for the US government and Microsoft, et al to take action on the news.