SHA-1 Broken 751

Nanolith writes "From Bruce Schneier's weblog: 'SHA-1 has been broken. Not a reduced-round version. Not a simplified version. The real thing. The research team of Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu (mostly from Shandong University in China) have been quietly circulating a paper announcing their results...'" Note, though, that Schneier also writes "The paper isn't generally available yet. At this point I can't tell if the attack is real, but the paper looks good and this is a reputable research team."
  • by Hulkster ( 722642 ) on Tuesday February 15, 2005 @11:27PM (#11685462) Homepage
    For those interested, here is the actual detailed/lengthy FIPS PUB 180-1 from NIST; [nist.gov] as is typical, Wikipedia has a nice summary, [wikipedia.org] and the W3 folks [w3.org] have a short snippet ...
  • For more info (Score:3, Informative)

    by response3 ( 751852 ) on Tuesday February 15, 2005 @11:28PM (#11685488)
    I'm not sure if this post is news or what, but for more info, click here:

    http://www.itl.nist.gov/fipspubs/fip180-1.htm [nist.gov]
  • Brought to You By (Score:5, Informative)

    by z0ink ( 572154 ) on Tuesday February 15, 2005 @11:29PM (#11685504)
    Same group of people that found the MD5 Hash Collision. Self [slashdot.org] references [slashdot.org] and the MD5 paper [iacr.org].
  • by metlin ( 258108 ) on Tuesday February 15, 2005 @11:32PM (#11685530) Journal
    Which *one* of SHA-2?

    FYI, SHA-224, SHA-256, SHA-384, and SHA-512 are all referred to as SHA-2.
  • by NEOtaku17 ( 679902 ) on Tuesday February 15, 2005 @11:32PM (#11685532) Homepage

    SHA-1 Hash Algorithm [ietf.org] and Source Code [cr0.net].

  • Re:Damn it (Score:3, Informative)

    by ad0gg ( 594412 ) on Tuesday February 15, 2005 @11:34PM (#11685548)
    MD5 was cracked before this.
  • Re:Hmm (Score:2, Informative)

    by defile ( 1059 ) on Tuesday February 15, 2005 @11:34PM (#11685552) Homepage Journal

    If your system has an md5sum command, it will also have a sha1sum command. Same idea, different output: feed each of them a file and they will give you a hash of that file that fits in 128-bits (md5) or 160-bits (sha1).

    In practice, no two files (or portions of a data stream) will have the same hash. Hashing algorithms are used in data integrity checks and authentication.

    An sha1 crack likely means that they found a way to make tampered data still hash to a desired value, maybe.

    sha1 and md5 are generally considered so weak that they should only be used to combat error or accidents, not fraud.
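The md5sum/sha1sum comparison above can be sketched with Python's hashlib (the sample bytes are arbitrary; a file's contents work the same way):

```python
import hashlib

data = b"example file contents"

md5_digest = hashlib.md5(data).digest()
sha1_digest = hashlib.sha1(data).digest()

# MD5 produces a 128-bit (16-byte) digest; SHA-1 a 160-bit (20-byte) one.
print(len(md5_digest) * 8)   # 128
print(len(sha1_digest) * 8)  # 160
```

Feeding either function the same bytes always yields the same fixed-length digest, which is what makes them usable for integrity checks.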

  • by jd ( 1658 ) <imipak@yahoGINSBERGo.com minus poet> on Tuesday February 15, 2005 @11:39PM (#11685595) Homepage Journal
    Whirlpool has the same hash length as SHA-512 and is based on the Rijndael encryption function, which is currently believed "safe enough". As such, I'm going to say that that is the best bet right now.


    The Hashing Function Lounge [terra.com.br] also lists Cellhash, Parallel FFT-Hash, RIPEMD-128, RIPEMD-160, Subhash and Tiger as (so far) unbroken.

  • by Anonymous Coward on Tuesday February 15, 2005 @11:41PM (#11685614)
    Does not affect HMAC. So it does not affect IPSEC and WEP.

    RTFA.
  • by prockcore ( 543967 ) on Tuesday February 15, 2005 @11:43PM (#11685626)

    FYI, SHA-224, SHA-256, SHA-384, and SHA-512 are all referred to as SHA-2.


    Doesn't matter. The only difference is the digest length. The algorithm is the same.
  • by Ant2 ( 252143 ) on Tuesday February 15, 2005 @11:44PM (#11685639)
    Well, for starters, there's:
    SHA-256
    SHA-384
    SHA-512

    The numbers refer to the bit length of the generated hash, called the message digest. SHA-1 produces only a 160-bit digest. But then, you'd know all that if you had RTFA.

    --I wish there was some way to automatically append a line of text to messages posted on slashdot.
  • Re:Yeah... (Score:5, Informative)

    by Ctrl-Z ( 28806 ) <timNO@SPAMtimcoleman.com> on Tuesday February 15, 2005 @11:46PM (#11685650) Homepage Journal
    Well, no. Not exactly. SHA-1 is supposed to be a one-way function, meaning that you can't just reverse the operation. So you can't just "crack" it like solving an equation.

    I'm not sure if you are talking about retrieving the original file from the hash, but if you are, then you don't understand what hash functions are for. In this case, there are an infinite number of combinations of bytes that have the same SHA-1 hash. The goal is to find one that has the same hash value, regardless of whether it is actually the same file. SHA-1 is not a cipher.
  • What a hash is/does (Score:4, Informative)

    by cbr2702 ( 750255 ) on Tuesday February 15, 2005 @11:47PM (#11685662) Homepage
    No, that would be one application of a hash (and not a very good one, because someone wanting to mess with it en route could just re-hash the doctored version and pass on the new hash. What you describe could be a way to check for accidental errors, though.). A hash is a function that, given data, gives a smaller amount of data. This smaller amount of data is then also called the hash of the original data. A good hash function has the property that if you know the hash for a file, you shouldn't be able to come up with another file that has the same hash without a prohibitive amount of work. A hash function is broken if this property stops holding.
  • by Ctrl-Z ( 28806 ) <timNO@SPAMtimcoleman.com> on Tuesday February 15, 2005 @11:50PM (#11685680) Homepage Journal
    Actually, that wouldn't really work in practice. What would stop someone from intercepting it and changing both the message and the hash before you receive it?

    I think what you are thinking of is a digital signature system, where the document is hashed and then the hash is signed. Any tampering would invalidate the signature. The hash is used because it takes a lot of computation to sign an arbitrary-length file, while it takes quite a bit less to sign a short, fixed-length digest like SHA-1's. Since (in theory) the probability of message collision is quite low, the hash is (practically) as good as the real thing for signing.
  • by Anonymous Coward on Tuesday February 15, 2005 @11:50PM (#11685686)
    Blatantly Stolen from blog comments:


    That's 2**11 times fewer operations. Let's say breaking this (2**69 ops) takes the NSA a week. If it had been 2**80, it would have taken 2048 weeks, or 39 years. If it would have taken the NSA (or whomever) a year to break SHA-1 before, it could now be broken in about 4 hours.

    My guess would be it would still take a lot longer than a week - but would now be in the realm of possibility, whereas before it would have been in the lifetime(s) range. However, this is totally a wild-assed-guess, based on the assumption that it was expected to take 100+ years before this to crack.

    Posted by: Randell Jesup at February 15, 2005 09:19 PM
  • by Anonymous Coward on Tuesday February 15, 2005 @11:51PM (#11685689)
    Uh, you could never trust the algorithm "100%"; hashes, by their nature, will always have collisions. The big deal is that brute-forcing the algorithm just got 2**11 times easier to do.
  • Not a problem (yet) (Score:5, Informative)

    by Spy Hunter ( 317220 ) on Tuesday February 15, 2005 @11:52PM (#11685701) Journal
    For password hashes this attack shouldn't be a problem, if it is as described in the article. The attack does only one thing: allows an attacker to generate two streams of data which hash to the same value. This is a problem for digital signatures, because somebody can sign one data stream, then distribute another with the same signature. So the signature doesn't guarantee the data has not been modified. However, this attack does not allow an attacker to magically deduce your password from its hash, or even generate another password that would hash to the same value as yours. So you don't need to immediately jump up and replace SHA-1 wherever you use it.

    OTOH, this attack indicates that other types of attacks may be found sooner than was previously thought. So it is still a good idea to move away from SHA-1 in the medium to long term. Though it's not entirely clear what you should move to. And it is not certain that more attacks will be found soon.

  • Re:Broken or not? (Score:2, Informative)

    by mek2600 ( 677900 ) on Tuesday February 15, 2005 @11:53PM (#11685702) Homepage Journal
    Fucking Google It [fuckinggoogleit.com]
  • by clap_hands ( 320732 ) on Tuesday February 15, 2005 @11:55PM (#11685722) Homepage
    We cannot reasonably move to H(x)=MD5(SHA-1(x)). If you have a pair x, y such that SHA-1(x)=SHA-1(y) (i.e. a collision of SHA-1), then MD5(SHA-1(x))=MD5(SHA-1(y)). So H(x)=H(y) (a collision of H).

    But don't worry (yet). There's still no known practical way to produce SHA-1 collisions.
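The collision-propagation argument above can be demonstrated with a toy inner hash: SHA-1 truncated to 16 bits, purely an illustrative assumption so a collision is cheap to find by brute force (real SHA-1 collisions are nowhere near this easy):

```python
import hashlib
from itertools import count

def inner(x: bytes) -> bytes:
    # Toy stand-in for SHA-1: its output truncated to 16 bits,
    # so a collision can be found by brute force in a moment.
    return hashlib.sha1(x).digest()[:2]

def outer(x: bytes) -> bytes:
    # H(x) = MD5(inner(x)) -- the nested construction from the comment.
    return hashlib.md5(inner(x)).digest()

# Brute-force a collision of the *inner* hash (birthday search).
seen = {}
for i in count():
    m = str(i).encode()
    h = inner(m)
    if h in seen:
        a, b = seen[h], m
        break
    seen[h] = m

assert a != b and inner(a) == inner(b)
# The inner collision propagates through the outer layer unchanged:
assert outer(a) == outer(b)
```

Any collision of the inner function survives every outer layer, which is why nesting hashes adds no collision resistance.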
  • by Anonymous Coward on Wednesday February 16, 2005 @12:00AM (#11685753)
    So even if the SHA-1 family is flawed for the reason given in the article, all is not lost, since they just reduced the work by a couple of orders of magnitude. If you go to 256-bit or 512-bit hashes, you're definitely going to be safe for much, much longer (since even if the given attack works twice as well on 512-bit hashes, you would need many more centuries to try out the same percentage of the keyspace).

    However, the real problem is that today's OpenSSL and libgcrypt (gnupg) libraries don't even have support for SHA-2 (I just tried to find it). So it will actually take quite a while before this can be adopted. And that's the real problem.
  • Re:Damn it (Score:5, Informative)

    by psetzer ( 714543 ) on Wednesday February 16, 2005 @12:00AM (#11685754)
    It's still not really practically breakable unless this is something bigger than what I'm guessing. SHA-0 was broken a few months ago, and MD5 a while before that. What does it mean for you? Not much.

    Some attacker would have to be REALLY dedicated to use this vulnerability to harm you, and they would still require hideous amounts of processor time to mount an effective attack. Digests are a quick and easy way to verify that some message or file is correct. If the hash is signed as well, then you can verify the sender, too. When you download something like a Linux ISO, there is often another file on the server containing the hashes of the files, so you can verify that everything downloaded correctly. If you want to make sure that nobody other than a trusted person modified the files, then that trusted person could encrypt the digest with their private key, allowing anybody with their public key to verify that everything's correct.

    A person can, with a broken hash, create another ISO file, perhaps with malicious code inserted, that has the same digest, meaning you can no longer trust the signed digest. Let's say that this vulnerability reduces the average time needed to find a collision from 2^48 tries via the Birthday paradox (If this isn't a 96-bit hash, then I really need to get more sleep) to 2^32 tries. That's over 65,000 times faster, but you know why I'm not worried? That's still over 4,000,000,000 ISO files that the attacker would have to try before hitting on one that's got the wanted characteristics and the correct digest to boot, and if it requires equivalent memory usage to its time usage, then I'd expect it to use at least 48 gigabytes of memory to store all of the previous attempted hashes. If it takes 15 seconds to compute one digest, then you're looking at a mere 2,000 processor years to find a vulnerability, compared to the much more comfortable 130,000,000 processor years that it would have required using the brute force method.

    Feel better now? If I really got mixed up, and was wrong about the size, then just multiply all the listed times by 2^32, and wake me in 8 trillion AD.

  • by Ray Alloc ( 835739 ) on Wednesday February 16, 2005 @12:02AM (#11685764)
    Check this article [fcw.com]: Federal agencies have been put on notice that National Institute of Standards and Technology officials plan to phase out a widely used cryptographic hash function known as SHA-1 in favor of larger and stronger hash functions such as SHA-256 and SHA-512.
  • Some background (Score:3, Informative)

    by ZakMcCracken ( 753422 ) on Wednesday February 16, 2005 @12:05AM (#11685783)
    Applications that would be broken by this are long-lived cryptographic signatures. Indeed, when a document is "signed", usually only a hash of the document is signed. Finding collisions means one can find two different documents with the same signature.

    This affects all applications using SHA-1 for signatures, that is, signed email (whether PKIX or PGP) and server certificates (which are signed documents). This should be mitigated by the fact that, in order to be really usable in some cases, the collision must also be meaningful. That is, if you find a collision for a signed email but it is meaningless, you won't really be able to use it to spoof an email. Whether collisions can be made "meaningful" depends on the quality of the attack.

    Some applications that should not be broken are the use of SHA-1 for key derivation, i.e. where one uses SHA-1 essentially as the basis of a random function to generate deterministic new keys from a pre-shared key. (I think that's what Schneier meant by HMAC applications.)

    Also, some short-lived signatures should still not be realistically breakable in the time that they would need to be for an attack to be successful; short-lived signatures are typically used in protocols such as IPsec or SSL for authentication. Additionally, to mount an attack on some of these protocols an attacker would need to generate a collision involving "unpredictable" data coming from another party, which the attack may or may not allow.
  • Re:Well (Score:5, Informative)

    by Anthony Liguori ( 820979 ) on Wednesday February 16, 2005 @12:05AM (#11685786) Homepage
    Had to happen, didn't it?

    No algorithm is all-powerful - it only withstands attacks for so long.


    No, it didn't. In fact, this touches the most famous open problem in CS. The theory is that there are problems where checking a solution is easy (it's easy to verify that 2 and 3 are the prime factors of 6 because 2*3 == 6) but where the only known way to find a solution is to try every candidate answer.

    It's not been proven whether finding hash collisions is this type of problem. Moreover, it's never been proven that no efficient solution exists for the problems we believe are hard (the P vs. NP question).

    What's more, it *has* been proven that once we find an efficient solution to one NP-complete problem, we'll instantly have efficient solutions for *every* NP-complete problem.
  • Re:Hmm (Score:5, Informative)

    by infiniti99 ( 219973 ) <justin@affinix.com> on Wednesday February 16, 2005 @12:10AM (#11685805) Homepage
    sha1 and md5 are generally considered so weak that they should only be used to combat error or accidents, not fraud.

    Not true. SHA-1 is the hashing algorithm of practically all common security standards. It's found in SSL/TLS, X.509, PGP (the protocol, not the program, so that means GPG also!), S/MIME, etc. In other words... everything. Replacing this is going to suck. :(
  • by ajs ( 35943 ) <{ajs} {at} {ajs.com}> on Wednesday February 16, 2005 @12:19AM (#11685868) Homepage Journal
    if I understand correctly, SHA-1 is a similar algorithm to MD5, which is commonly used to uniquely identify files

    You do not quite understand correctly. MD5 and SHA-1 are hashing algorithms, and as such it is expected (and accepted) that there are collisions. That is, you might find that your /etc/passwd and /bin/ls files have the same MD5 hash. The value in MD5 and other such hashes is that the probability of that happening is so remote that as a first approximation, comparing hashes is just as good as comparing files.

    That is, you can either keep a backup copy of your filesystem to compare against or you can keep a list of hashes, and mathematically, all this "break" has demonstrated is that the chances are 1:590295810358705651712 not 1:1208925819614629174706176 of a collision. In other words, don't lose sleep.

    Now, for secure cryptographic signatures, the implications are much more unpleasant. It's not the end of the world, but this is that big red light that says: switch to SHA-512 (or something equally secure) ASAP!
  • by Anonymous Coward on Wednesday February 16, 2005 @12:22AM (#11685889)
    You realize that all of the RIPEMD versions have been broken?
    Collision Paper [iacr.org]
    "Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD"
    The team that reportedly just broke SHA-1 is the same one that broke MD5 and the 'similar' algorithms some months ago.
  • by Tony Hoyle ( 11698 ) <tmh@nodomain.org> on Wednesday February 16, 2005 @12:24AM (#11685903) Homepage
    That's not what's been broken. It's impossible to get the cleartext from a hash - that's why it's called one way (there are an infinite number of cleartexts which can generate that hash, so in theory you can get it, but you've got a 1/infinity probability of picking the right one...)

    SHA1 is not 'broken' in any real sense. Someone claims to have reduced the effort to find a collision to 2**69 operations. That's still bloody huge. It'd take your PC a couple of thousand years of hashing to generate a collision.

    Of course if you had a big enough cluster you could get that down to a year or two I guess.

    Man in the middle attacks are *not* what this is about.
  • Re:Not true. (Score:3, Informative)

    by magefile ( 776388 ) on Wednesday February 16, 2005 @12:24AM (#11685908)
    Replying to self, I know, but I found a copy of a November 1996 executive order on Clinton's Presidential Library site that contradicts the GP. http://www.clintonfoundation.org/legacy/111596-executive-order-13026-on-crypto-export-controls.htm [clintonfoundation.org]
  • by Great_Geek ( 237841 ) on Wednesday February 16, 2005 @12:40AM (#11686002)
    Note that what cryptographers consider a "break" is not necessarily the same as what users consider a break. (Neither is more strict, they are just different criteria for different people).

    In this case, the researchers from Shandong University (supposedly) reduced the work required to find a collision from 2**80 to 2**69; this is a major cryptographic result. It is major because SHA-1, as a "cryptographically strong hash", is not supposed to admit any attack better than random search. A factor of 2**11 reduction shows SHA-1 to be very far from ideal; and since lots of clever people have tried to show this, the research team should be proud.

    Does this mean the bad-guy-of-your-choice can now start forging digital contracts? Not yet - there is no guarantee that the collision will be meaningful (at least their earlier papers didn't show that result). For a forgery to be useful, the forger needs to make the fake message say something useful - maybe change the $1 to $1 million, or change the name, or something. A collision at a random place (or a nonsensical string) is essentially useless as a forgery (there may be some interesting DoS attacks, but I am talking about outright forgery, which is the point of hash functions).

    And lastly, 2**69 (roughly 10**21) is still a big number! Assume that some clever people wrote super-duper hand-optimized code that does a whole SHA-1 in a microsecond on a late-model 4 GHz PC; that is 10**6 hashes/sec. A grad student using all the PCs on a campus, say ten thousand, gains another factor of 10**4. This would still take about 10**11 seconds (or roughly 3,000 years). Note that for SHA-0, their break is 2**39 operations, which *is* practical - it would take the grad student only a minute, or a single PC a week.

    This break is not yet *practical* for *most* people. (Would I still use SHA-1? Not in new applications, and I would make sure that existing applications get changed over eventually.)

    Lest I be accused of ignoring the big boys, the equation changes for them. If a Three Letter Agency is willing to invest a lot of money and design some cool chips with awesome parallelism and everything, then each break may take only a week. For example, assume these chips have a bunch of pipes that can each do a hash every nanosecond (10**9 hashes/second). Further, say there are 100 of these pipes per chip, 100 chips per board, 100 boards per rack (or 10**6 pipes per rack). Each rack can then do 10**15 hashes/sec. With such a magical rack, it would take 10**6 seconds (or just under two weeks) to find a collision. This would cost Some Real Dollars, but is it within the budget of some three-letter agency? You bet. Heck, I would be willing to sell you one for under a billion US dollars. On the other hand, for that kind of money, cryptanalysis takes on different textures - why spend a billion to crack SHA-1 when you can buy the right wet-ware unit for a million?
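The back-of-the-envelope figures above can be checked with a few lines of arithmetic, under the same assumptions (10**6 hashes/sec per PC, 10**4 PCs):

```python
ops = 2 ** 69                           # claimed collision cost for SHA-1
rate = 10 ** 6 * 10 ** 4                # campus cluster: 10**10 hashes/sec
years = ops / rate / (3600 * 24 * 365)
print(round(years))                     # ~1,900 years with the exact 2**69
                                        # (~3,000 with the 10**21 rounding)

# SHA-0's 2**39 break on the same cluster really is about a minute's work:
print(2 ** 39 / rate)                   # ~55 seconds
```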
  • Re:Hmm (Score:2, Informative)

    by cecom ( 698048 ) on Wednesday February 16, 2005 @12:42AM (#11686014) Journal
    Surely you mean 'unhashed' values, not value. A 160-bit hash value can map to about 4 billion 'unhashed' values of length 192-bits - good luck finding the right one :-)
  • by AlexCV ( 261412 ) on Wednesday February 16, 2005 @12:43AM (#11686017)
    No. SHA1 can still not be reversed; they found a COLLISION. That is, with 2^69 tries, they can come up with two different inputs that share the same SHA-1 hash - not an input matching a given password's hash.

    For passwords, this is nearly meaningless.

    For digital signatures, it's a different thing.
  • Re:Prison. (Score:2, Informative)

    by Tom7 ( 102298 ) on Wednesday February 16, 2005 @12:47AM (#11686032) Homepage Journal
    Of course not. Wang gave a presentation at CRYPTO 2004 to a stunned audience and standing ovation. WTF is wrong with you?
  • Re:Damn it (Score:1, Informative)

    by Anonymous Coward on Wednesday February 16, 2005 @12:50AM (#11686044)
    Let's say that this vulnerability reduces the average time needed to find a collision from 2^48 tries via the Birthday paradox (If this isn't a 96-bit hash, then I really need to get more sleep) to 2^32 tries. That's over 65,000 times faster, but you know why I'm not worried? That's still over 4,000,000,000 ISO files that the attacker would have to try before hitting on one that's got the wanted characteristics and the correct digest to boot, and if it requires equivalent memory usage to its time usage, then I'd expect it to use at least 48 gigabytes of memory to store all of the previous attempted hashes. If it takes 15 seconds to compute one digest, then you're looking at a mere 2,000 processor years to find a vulnerability, compared to the much more comfortable 130,000,000 processor years that it would have required using the brute force method.

    Except that rather than taking 15 seconds to compute a digest, even an imperfect Python implementation [ibm.com] will compute over 150,000 digests per second. Which means that 2^32 digests can be computed in under 8 hours, with just a desktop PC.

    Feel better now? If I really got mixed up, and was wrong about the size, then just multiply all the listed times by 2^32, and wake me in 8 trillion AD.

    Luckily your math was wrong and SHA-1 is a 160-bit hash. But this is still not particularly good news.

  • by brsmith4 ( 567390 ) <.brsmith4. .at. .gmail.com.> on Wednesday February 16, 2005 @12:50AM (#11686045)
    Pretty fucking hard. The NSA doesn't lend out CPU time on classified supercomputers to anyone but a select few government organizations. As much as I think the congress and various others are in cahoots with the RI/MPAA, the NSA would probably not stand for such a thing.
  • by gnuman99 ( 746007 ) on Wednesday February 16, 2005 @12:51AM (#11686048)
    Of course you can make

    SHA1(data1) == SHA1(data2) where data1 != data2

    because SHA1 maps from a large space to 160 bits. There WILL be collisions for any mapping like that. The question is, can you make

    SHA1(data1) == SHA1(data2) && data1.length == data2.length ?

    Can you make the lengths of the hashed data equal?

    If you cannot, then most of the signed hashes cannot be compromised anyway.

  • Re:Hmm (Score:5, Informative)

    by LnxAddct ( 679316 ) <sgk25@drexel.edu> on Wednesday February 16, 2005 @12:51AM (#11686049)
    Relax... it still takes 2^69 tries. That is 590,295,810,358,705,651,712 hash operations. To brute force sha-1 takes 2**80. This is only 2**11 times faster than a brute force attack... that's 2048 times faster. It's significant, but it's not that big of a deal. It is no more significant than if someone with a 2000-node cluster tried to brute force your hash (which is completely feasible... especially for large government agencies like the NSA). In other words, if you were capable of performing 1 trillion (1,000,000,000,000) hash operations per second, it'd still take nearly 19 years for a collision to be found. I assume the NSA can knock that number down to under 24 hours, but that's expected of them. For anyone else in the world, assuming you're not being followed by the NSA... and god help you if you are... sha-1 will still be fine and the entire internet security infrastructure will not need to be redesigned.
    Regards,
    Steve
  • by daniel de graaf ( 771021 ) on Wednesday February 16, 2005 @12:52AM (#11686052) Homepage
    See the birthday paradox [wikipedia.org].

    With this technique, you can create 2 files, by modifying both of them, that have the same hash. That does not mean that you can create a file with a given hash.
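The birthday-paradox scaling is easy to see empirically on truncated SHA-1 (truncation is purely an illustrative shortcut): a collision in an n-bit hash shows up after roughly 2^(n/2) attempts, not 2^n.

```python
import hashlib
from itertools import count

def find_collision_tries(bits: int) -> int:
    # Count how many inputs we hash before two collide on the first `bits` bits.
    nbytes = bits // 8
    seen = set()
    for i in count(1):
        h = hashlib.sha1(str(i).encode()).digest()[:nbytes]
        if h in seen:
            return i
        seen.add(h)

for bits in (16, 24):
    tries = find_collision_tries(bits)
    # tries lands on the order of 2**(bits/2), far below 2**bits
    print(bits, tries, 2 ** (bits // 2))
```

For full 160-bit SHA-1 the same square-root rule gives the 2^80 generic collision bound the thread keeps citing.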

  • by Pseudonym ( 62607 ) on Wednesday February 16, 2005 @12:53AM (#11686057)
    Essentially, don't sign anything that someone else has given you without changing it in some way, or your signature might also apply to some other document they have chosen.

    Right, this is important.

    Decent digital signature protocols (as opposed to just the algorithms) require that you hash more than just the document. For example, you might pick a small amount of random data ("salt"), add that to the message, hash the combination and sign that. You then put the salt in the signature packet so that your signature can be verified.

    OpenPGP, for example, requires that certain signature subpackets be part of the hash, such as the signature creation time. It probably should require random salt.

  • by Anonymous Coward on Wednesday February 16, 2005 @12:58AM (#11686077)
    No, you are confused.

    The attack lets me generate two strings that have the same SHA-1. But I can't choose _either_ string, nor can I choose what their SHA-1 will be.

    To defeat password authentication, you have to be able to take a given SHA-1, and generate a string that has that SHA-1.

    The latter is a much harder problem; even in the ideal case, where you have a perfect, unbreakable 160-bit hash, the first problem takes ~2^80 operations, while the second takes ~2^160.
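The collision-versus-preimage gap can be measured directly on a truncated hash (16 bits here, an assumption purely for illustration): a collision falls out after roughly 2^8 attempts, while hitting one *given* target value takes on the order of 2^16.

```python
import hashlib
from itertools import count

def h16(x: bytes) -> bytes:
    # Toy 16-bit hash: SHA-1 truncated, so both searches finish quickly.
    return hashlib.sha1(x).digest()[:2]

# Birthday search: find *any* two inputs that collide (~2**8 expected tries).
seen = set()
for collision_tries in count(1):
    d = h16(str(collision_tries).encode())
    if d in seen:
        break
    seen.add(d)

# Preimage search: find an input hashing to one *given* target
# (~2**16 expected tries -- a factor of 2**8 more work).
target = h16(b"some fixed message")
for preimage_tries in count(1):
    if h16(b"guess-" + str(preimage_tries).encode()) == target:
        break

print(collision_tries, preimage_tries)
```

Scaled up to 160 bits, that same gap is 2^80 versus 2^160, which is why the collision attack says nothing about recovering passwords.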
  • by Anonymous Coward on Wednesday February 16, 2005 @01:00AM (#11686092)
    Those two statements don't go together: either it does let you create identical hashes for different data, or it doesn't.

    You're missing an important subtlety:

    This attack makes it computationally feasible to carefully craft two data streams that hash the same. This (theoretically) means I can make two contracts that hash the same, and digitally sign them both. I'll make you agree on one, and then I'll swap it for the other. When you complain, I can successfully repudiate.

    This attack does not do much for making it easier to reproduce a hash for unknown plain text.

    (Of course, I have not read the paper and am only speculating, but I believe that's what the parent to your post meant.)

  • by Spy Hunter ( 317220 ) on Wednesday February 16, 2005 @01:02AM (#11686102) Journal
    There is no logic flaw. Your summary of the first attack is not correct. The attack does not allow you to produce data that has a specified hash value. It allows you to find two sets of data with the same hash, but you don't control the actual hash value. Think of it this way: the process of attacking the hash involves two sets of data; you perform the attack by modifying *both* data sets until their hashes are the same. Now you have two different data sets with the same hash, but the actual hash value is random and not controlled by you.
  • by Lehk228 ( 705449 ) on Wednesday February 16, 2005 @01:10AM (#11686130) Journal
    Both papers were (IIRC) of the form: generate two datasets X and Y with the same hash Z.
    The next step up is, for given data X with hash Z, to determine a Y (not equal to X) which also has hash Z. The ultimate breakage is when, for given data X with hash Z and arbitrary data Y, you can generate M such that Y+M has hash Z. At that point you can substitute a controlled and malicious piece of data for X.
  • by gkwok ( 773963 ) on Wednesday February 16, 2005 @01:11AM (#11686132) Homepage
    Actually, both statements can coexist. In most password systems, the hash of the password itself is not stored; rather, it is a hash of the password concatenated with a string of random characters.

    For example, if my password is "foobar", then the server does not store "8843d7f92416211de9ebb963ff4ce28125932878" as the hash, but perhaps the hash of "foobarDKTUHRAOHL" or "19747e26b86ee7939c046c0171a991926f0e01ae". The salt value "DKTUHRAOHL" is stored on the server and never revealed to anyone. So, even if somebody knows the hash value "19747...e01ae", they cannot come up with another string of characters that hashes to the same value, because even if they could, the value they enter in an attempt to hack my account is appended with "DKTUHRAOHL", rendering (almost certainly) a different hash value.

    Now, if they know the salt value, the problem becomes equivalent to finding a string ending with "DKTUHRAOHL" that hashes to "19747...e01ae." However, if someone has gained access to a properly secured server's salt values, you have a large problem on your hands indeed.
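The salting scheme the parent describes can be sketched in a few lines (the salt and password are just the parent's examples; a real system would draw a random salt per account):

```python
import hashlib

def store(password: str, salt: str) -> str:
    # Server keeps (salt, hash); the hash covers password + salt, never
    # the bare password.
    return hashlib.sha1((password + salt).encode()).hexdigest()

def check(attempt: str, salt: str, stored_hash: str) -> bool:
    # Every login attempt gets the same salt appended before hashing,
    # so a bare-password collision is useless to an attacker.
    return hashlib.sha1((attempt + salt).encode()).hexdigest() == stored_hash

salt = "DKTUHRAOHL"
stored = store("foobar", salt)

print(check("foobar", salt, stored))   # True
print(check("foobar2", salt, stored))  # False
```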

  • Poly1305-AES (Score:5, Informative)

    by D. J. Bernstein ( 313967 ) on Wednesday February 16, 2005 @01:14AM (#11686148)
    For people interested in secret-key message authentication: There are authenticators that are (1) much faster than HMAC-SHA-1 and (2) guaranteed to be as secure as AES. In particular, check out Poly1305-AES [cr.yp.to]. Public-domain Poly1305-AES software is now online, though it isn't nicely packaged yet; if you're interested in further announcements, join the Poly1305 mailing list.

    (This is not meant as a comment on the security of HMAC-SHA-1.)

  • Re:Bittorrent? (Score:3, Informative)

    by ArbitraryConstant ( 763964 ) on Wednesday February 16, 2005 @01:18AM (#11686171) Homepage
    Well... TFA says that a collision in 160-bit SHA-1 can be found with about 2^69 operations, but that's for *any* collision, not whatever message you want. Now, there's a lot of torrents and a lot of digests out there, so if all they want to do is attack any torrent that's out there, it might be practical if they had a large cluster of dedicated hardware... but that seems unlikely. Lawyers are cheaper.

    A move to larger SHA variants or not-currently-broken algorithms might be in order for future revisions though. Things seem to be eroding pretty quickly these days for SHA-1.
  • by SiliconEntity ( 448450 ) * on Wednesday February 16, 2005 @01:27AM (#11686215)
    I don't know about this, but when SHA (the Secure Hash Algorithm) was submitted as an approved algorithm for government use, the NSA reviewed it and suggested a minor change. That modified algorithm is what we now know as SHA-1.

    Not quite right. The NSA invented SHA, but then a couple of years later discovered a weakness and made a slight change, creating SHA-1. The older version is now called SHA-0. According to Schneier's report, SHA-0 was broken even worse by the Chinese team: 2^39 work for a collision vs. 2^69 for SHA-1. So the NSA's change was important and made a major increase in the strength of the algorithm.

    It was a few years before public-sector cryptographers caught on to what the significance of the changes was (I wish I could explain it, but it is beyond me).

    They just added a 1 bit rotate in one phase of the algorithm. Without it, SHA tends to keep the bits lined up and there is less mixing.
  • by gkwok ( 773963 ) on Wednesday February 16, 2005 @01:35AM (#11686253) Homepage
    Oh, yes you're right. I don't know what I was thinking. The purpose of salting is to discourage dictionary attacks, so that would-be attackers cannot compile a list of words and their associated hashes. The randomness of the salt value eliminates anything dictionary-like about the password. Right, there's no reason the salt itself cannot be published; the problem is still equivalent to finding a string ending with a given salt value that hashes to a given hash.
  • by interiot ( 50685 ) on Wednesday February 16, 2005 @01:39AM (#11686284) Homepage
    Adding data2.length to the check is basically just increasing the number of bits used for the hash (and thus reducing the number of chunks of data that hash to the same value, and increasing the amount of time required to find a collision). However, length turns out to be a very poor thing to hash on (non-linearity is prized in hash functions, and you can't get much more linear than the length function). If you're going to set aside more space for a hash value, it's better to use a longer hash or multiple hashes together.
  • by yppiz ( 574466 ) on Wednesday February 16, 2005 @01:40AM (#11686291) Homepage
    Even adding the length restriction, you are still mapping from a large space (the space of all binary strings of length n > 160) to a small space (the space of all binary strings of length 160), so there will still be collisions.

    --Pat / zippy@cs.brandeis.edu
  • Not necessarily (Score:5, Informative)

    by jd ( 1658 ) <imipak@yahoGINSBERGo.com minus poet> on Wednesday February 16, 2005 @01:45AM (#11686324) Homepage Journal
    Many hashing functions operate by simple bit manipulations. There are classes of hash function which use cellular automata instead. I believe there are some which use fast fourier transforms.


    In general, we can say that there are infinitely fewer hashes than there are possible data objects you may wish to hash, and therefore there are infinitely many collisions. We can also say that for an N bit hash, at least one collision must occur over a range of (2^N)+1 values for the initial data object.


    However, if the collisions occur on a totally cyclic basis, it doesn't matter if there's only ever one within that range. You know where it is, without the bother of looking.


    Therefore, the strength of a hash can be measured as a function of two properties:

    • The fewer collisions within one complete cycle, the better.
    • The more random the distance between collisions, the better.


    Bit operations have tended to be used, because they're fast and they allow some control over these two parameters. Other than that, there is no particular merit in using them.


    Cellular automata can produce some excellent one-way functions. Their behaviour can also be far harder to predict, if the algorithm is good. However, they are computationally very expensive and getting a usefully strong algorithm is much harder than with bit manipulations.


    Transforms are not generally considered one-way, because 99.9% of the time they are only useful because they are two-way. I've not really looked into how transform operations are used in hashes, but they presumably have some strengths.


    (Transforms in cryptography, where you want to go from one domain to another and then back again, would make sense. They would also be useful for encryption modes, for generating the new encryption key for the next block.)

  • by Kiryat Malachi ( 177258 ) on Wednesday February 16, 2005 @01:49AM (#11686343) Journal
    Here's the worry.

    Let's say someone trusts a digital signature, signed with SHA-1, to the point of allowing money to be predicated on the validity of this signature. If the message is signed and valid, the payer pays the payee $X dollars, where X is some very large amount.

    Message #1 is generated and sent. It validates.

    The money is paid. At which point the payee produces a second message which hashes the same as the first but claims to be turning down the deal, or modifying the terms of the deal s.t. they don't have to do anything to earn that money, and they claim that's what was actually sent.

    This is a problem, since the break apparently allows the construction of two (relatively) arbitrary message sequences that hash to the same value, which is an easier and much different problem from constructing an arbitrary message that hashes the same as a pre-existing message.
  • Re:Hmm (Score:2, Informative)

    by LnxAddct ( 679316 ) <sgk25@drexel.edu> on Wednesday February 16, 2005 @01:58AM (#11686383)
    They reduced it from 2**80 to 2**69. And the paper hasn't even been released yet, so no one knows if that is even accurate.
    Regards,
    Steve
  • by dantheman82 ( 765429 ) on Wednesday February 16, 2005 @01:59AM (#11686386) Homepage
    I read on one site - in answer to the question "What's the big deal - is 2**69 really all that bad?"

    That's a factor of 2**11 fewer operations. Let's say breaking this (2**69 ops) takes the NSA a week. If it had been 2**80, it would have taken 2048 weeks, or 39 years. If it would have taken the NSA (or whomever) a year to break SHA-1 before, it can now be broken in about 4 hours.

    My guess would be it would still take a lot longer than a week - but would now be in the realm of possibility, whereas before it would have been in the lifetime(s) range. However, this is totally a wild-assed-guess, based on the assumption that it was expected to take 100+ years before this to crack.
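    The arithmetic above is easy to check (plain Python, using the figures quoted in the thread):

```python
brute_force = 2 ** 80   # generic birthday bound for a 160-bit hash
attack = 2 ** 69        # work factor claimed in the new paper

speedup = brute_force // attack
print(speedup)          # 2048, i.e. 2**11

# If brute force previously took a year, the new attack takes roughly:
hours = 365 * 24 / speedup
print(round(hours, 1))  # 4.3
```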
  • by Anonymous Coward on Wednesday February 16, 2005 @02:20AM (#11686472)
    Same researchers announced some vulnerabilities in MD5 and SHA-1:

    See:
    http://www.arnnet.com.au/index.php/id;1503863220;fp;512;fpid;710205681 [arnnet.com.au]
    Researchers have discovered a flaw in the MD5 algorithm that is used to provide a unique signature for data.

    Xiaoyun Wang, a Chinese expert, and three colleagues have discovered the flaw in the hash function algorithm, which is used in applications, such as EMC's Centera content-addressable file store. The flaw was revealed at the Crypto 2004 conference.

    Also:
    http://www.rsasecurity.com/rsalabs/node.asp?id=2738 [rsasecurity.com]

    * The hash function collisions recently discovered have minimal practical impact at this time due to the limitations discussed further. It is not clear that these research results can be turned into practical exploits on most typical uses of these hash functions, so there is no immediate need to replace hash algorithms.
    * As a precaution, applications using a legacy hash function described as vulnerable should upgrade to the NIST-approved SHA1 or SHA2 family of algorithms (RSA Laboratories suggested a migration to SHA1 in 1996).
    * Applications using SHA1 do not appear to be at risk, but conservatively, developers may also consider planning an upgrade to the SHA2 family in the next few years.
  • by Bitsy Boffin ( 110334 ) on Wednesday February 16, 2005 @02:21AM (#11686475) Homepage
    The concept is not the same.

    Encryption is not in any way hashing, and hashing is not in any way encryption.

    A one-way hash cannot be decrypted; that's why it's called one-way. It's simply not possible.

    A hash is not used to "protect" the data you are hashing, it's used to identify the data you are hashing. You can take an unhashed value, hash it, and compare it with another hash value to establish that the two original values were very likely the same value.

    The strength of a hash is simply how likely it is that the two values were the same value, or conversely, how likely it is that they were not.

    When two distinct values have the same resultant hash, we call that a "collision". It should be obvious that there are an infinite number of collisions for a fixed-length hash value - the pigeonhole principle shows you that.

    And SHA1 is not "broken", not yet. To be "broken", we would have to have a feasible way to generate a string of data that, when hashed, produces the same hash as an *already known* hash.

    If we could generate collisions for KNOWN hashes in a reasonable time, then we could use those collisions to falsely authenticate to systems that store a hash of the original (still unknown) value.
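    A minimal illustration of hashing-as-identification (plain Python; the messages are invented for the example):

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Identify data by its SHA-1 digest, without storing the data itself."""
    return hashlib.sha1(data).hexdigest()

original = b"wire $100 to account 12345"
received = b"wire $100 to account 12345"
tampered = b"wire $900 to account 12345"

print(fingerprint(received) == fingerprint(original))  # True: very likely the same data
print(fingerprint(tampered) == fingerprint(original))  # False: definitely different
```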

  • by Theatetus ( 521747 ) on Wednesday February 16, 2005 @03:06AM (#11686630) Journal
    they could concievably glean the cleartext

    No. Hashes like SHA-1 are lossy; there is less information in the hash than in the plaintext. Lost information like that cannot be recovered unless just about everything we know from information theory (and thermodynamics) is wrong.

  • by boots@work ( 17305 ) on Wednesday February 16, 2005 @03:10AM (#11686647)
    A better idea is to use UUIDs [ku.edu], where these problems have already been considered, and systematically handled. On Linux, just read from /proc/sys/kernel/random/uuid.
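    For example, with Python's standard uuid module (the /proc path mentioned above is Linux-specific):

```python
import uuid

# A version-4 UUID is 122 random bits; accidental collisions are negligible.
token = uuid.uuid4()
print(token)

# On Linux, the kernel will hand you one directly, as noted above:
#   with open("/proc/sys/kernel/random/uuid") as f:
#       print(f.read().strip())
```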
  • Re:China's Motive (Score:2, Informative)

    by jameszhou2000 ( 811168 ) on Wednesday February 16, 2005 @03:13AM (#11686660)
    As long as they show the results, the break can be verified.

    In theory, given a hash value, it should take centuries to find a collision.

    If, given a hash value, you can come up with a collision in days or even months on your PC, that means you have broken the code.

    So to believe it is broken, you don't need to know the details of the algorithm.
  • Not a big surprise (Score:2, Informative)

    by Anonymous Coward on Wednesday February 16, 2005 @03:24AM (#11686700)
    More than 6 months ago, papers were going around telling people to stop using MD5 as a secure checksum immediately, and to start moving away from SHA-1, as its security was becoming more vulnerable by the day. Researchers recommended moving to SHA-256 to prevent accidental breach of data security (sooner, rather than later). There have been many different cracks involving SHA-1 (albeit of simplified versions, but you just knew more would eventually be coming). Well, it's here now! SHA-512, here we come!
  • Re:Sigh (Score:4, Informative)

    by Craigj0 ( 10745 ) on Wednesday February 16, 2005 @04:28AM (#11686872)
    For a password, hopefully your system would lock the account long before there are that many failed login attempts. However, if your attacker has that kind of resources, you can assume it is feasible for them to find a hash collision. That's really only significant for governments, multi-national organizations, and other major enterprises, but not for most people.

    That is not how it works. They are using the birthday paradox; that is why brute force is 2^80, not 2^160. Put simply, the birthday paradox finds two pieces of data with the same hash. It does NOT (as so many posts believe) allow you to find a matching hash for a fixed piece of data. (That would take 2^160 - perhaps less with the weaknesses discovered, but nowhere close to 2^69.) Hence this does not allow you to break passwords.
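    The distinction can be demonstrated on a deliberately weakened hash. This sketch (plain Python, truncating SHA-1 to 24 bits purely for illustration) finds *some* colliding pair by the birthday method in roughly 2^12 tries, whereas matching one *fixed* 24-bit value would take around 2^24 tries:

```python
import hashlib
import itertools

def truncated_sha1(data, nbytes=3):
    """SHA-1 cut down to 24 bits - weak on purpose, for demonstration."""
    return hashlib.sha1(data).digest()[:nbytes]

# Birthday search: remember every hash seen until any two inputs collide.
seen = {}
for i in itertools.count():
    msg = b"message-%d" % i
    h = truncated_sha1(msg)
    if h in seen:
        print("collision after %d tries: %r vs %r" % (i + 1, seen[h], msg))
        break
    seen[h] = msg
```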
  • Re:Sigh (Score:4, Informative)

    by hobbicik ( 229710 ) on Wednesday February 16, 2005 @04:35AM (#11686899)

    Let's say I have an ISO disk image. I hack it, and want to modify some of the 'junk' bits using their algorithm. I'd still need to perform 590295810358705651712 hash operations on that image. Computing the hash of a disc is a slow operation. That's not something I could do in a day, a week, or even a few months. Perhaps if I had a massively parallel computer available I could do it, but not as an individual.

    No need to compute the hash of the whole disc. You can calculate the internal state of SHA-1 after processing the whole image except - let's say - the last kilobyte (you do that ONCE) and then find a collision by modifying only this last kilobyte, with a great chance of succeeding. There are 2^8192 variants of the last kilobyte, but only 2^160 variants of the hash - that's why you'll probably succeed.

  • Re:Sigh (Score:3, Informative)

    by fulgan ( 116418 ) on Wednesday February 16, 2005 @04:52AM (#11686960)
    Actually, you're both right and wrong.

    You're right because 2^69 operations is an awful lot of work: as someone on Bruce Schneier's web log said, if you had a processor clocked at 4 GHz capable of testing one hash per cycle, it would still take over 4000 years to break a single hash. Clearly, this isn't feasible today or, at least, not without a lot of resources (huge clusters of code-breaking computers).

    You're wrong because you don't have to parse the whole file: SHA-1 works by dividing the data into chunks of identical size (padding them if necessary; SHA-1 uses blocks of 512 bits) and applying a set of operations to each block in turn, using the state from the previous block as the initial state.

    So if you have a way to create a collision between two hashes, all you have to do to "patch" your ISO image is work on the LAST chunk of data and make sure it ends up with the correct state. You'll have to compute the hash of the full ISO only once per image.
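    The save-the-state-once trick can be illustrated with Python's hashlib, which lets you snapshot a hash's internal state with copy(). This is only a sketch of the idea - the real attack would additionally need colliding final blocks, which hashlib obviously can't produce:

```python
import hashlib

image = b"A" * (1024 * 1024)          # stand-in for a large ISO image
prefix, last_kb = image[:-1024], image[-1024:]

base = hashlib.sha1()
base.update(prefix)                    # the expensive part, done ONCE

# Trying variants of the final kilobyte is now cheap: clone the saved
# internal state instead of rehashing the whole image each time.
for variant in (last_kb, b"B" * 1024, b"C" * 1024):
    h = base.copy()
    h.update(variant)
    print(h.hexdigest())
```

    The first printed digest equals the SHA-1 of the full, unmodified image, since it re-appends the original last kilobyte.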
  • by vagabond_gr ( 762469 ) on Wednesday February 16, 2005 @04:59AM (#11686975)
    For password hashes this attack shouldn't be a problem, if it is as described in the article. The attack does only one thing: it allows an attacker to generate two streams of data which hash to the same value. This is a problem for digital signatures, because somebody can sign one data stream, then distribute another with the same signature. So the signature doesn't guarantee the data has not been modified.

    Even for signatures, it depends on the application. There are two types of collision resistance:
    - Weak collision resistance: Given x, I cannot compute y s.t. H(x)=H(y)
    - Strong collision resistance: I cannot compute arbitrary x,y s.t. H(x)=H(y)

    Usually collision results show that a hash algorithm is not strong resistant.

    So if I want to create random data (a nonce) and sign it there is a problem, I can create x,y with the same signature. However if I want to sign something specific, say an email, then I have to break weak resistance, random x,y won't do since x is unlikely to be the email I wrote.
  • Re:better yet-- (Score:3, Informative)

    by dago ( 25724 ) on Wednesday February 16, 2005 @06:37AM (#11687237)
    You neither, just run ROT-7.5 26 times and it should decrypt any ROT-13 encrypted message.

    Of course, ROT-6.5 is faster (just 2 times).
  • Re:Sigh (Score:4, Informative)

    by mlyle ( 148697 ) on Wednesday February 16, 2005 @06:40AM (#11687246)
    OK, and then let's do some math.

    Let's say you have 2^20 (1048576) machines. Let's say each can do 2^20 hashes per second (this is optimistic). Then it will take you 2^29 seconds to find a hash collision-- this is about 17 years.

    This doesn't even let you collide with an arbitrary thing-- rather, you can provide something to someone to sign, and have another message that hashes to the same thing.

    It is worrisome, though, because perhaps attacks will improve and it'll continue to get cheaper.
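    The numbers above are straightforward to verify in Python:

```python
machines = 2 ** 20        # about a million machines
rate = 2 ** 20            # optimistic hashes per second per machine
work = 2 ** 69            # operations for the reported attack

seconds = work // (machines * rate)
print(seconds == 2 ** 29)                      # True
print(round(seconds / (365 * 24 * 3600), 1))   # about 17 years
```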
  • Re:Sigh (Score:1, Informative)

    by Anonymous Coward on Wednesday February 16, 2005 @06:43AM (#11687255)
    Actually your example is wrong. The 2^69 refers to finding collisions between two random plaintexts, not finding a collision for a given plaintext.
  • by ticktockticktock ( 772894 ) on Wednesday February 16, 2005 @07:09AM (#11687302)
    If your system has an md5sum command, it will also have a sha1sum command. Same idea, different output: feed each of them a file and they will give you a hash of that file that fits in 128 bits (md5) or 160 bits (sha1).

    If you have "openssl [openssl.org]" installed, you could also use openssl to generate the hashes. (supported hashes [openssl.org])

    For md5 hashes:
    $ openssl md5 filenames...
    Output: MD5(filename)= hash

    For sha1 hashes:
    $ openssl sha1 filenames...
    Output: SHA1(filename)= hash

  • by mindstrm ( 20013 ) on Wednesday February 16, 2005 @08:10AM (#11687463)
    Yes, there will always be collisions, a great many collisions, in any hashing scheme. That's not the issue.

    The issue is that you are not supposed to be able to deterministically generate collisions whenever you want. Previously, the only way to find a collision was by brute-force.

    If someone finds a way to generate two streams of data that hash to the same value, that's a problem.
  • Re:Prison. (Score:3, Informative)

    by chialea ( 8009 ) <chialea&gmail,com> on Wednesday February 16, 2005 @10:09AM (#11688102) Homepage
    Hi Tom.

    As you weren't at crypto last year, you missed a bit of a brouhaha where people were considering moving the conference (which is ALWAYS at UCSB) outside of the US. The proximate motivation for this was that there was a grad student who had a paper, and was supposed to present it. When she found out, she got the next embassy appointment to get a visa, which was three weeks before the conference. She went, and they freaked out about the "crypto" thing, and said they'd have to send her work back for review to see if she was a terrorist or some such. They gave her a new appointment a week after the conference. Less than useful, eh?

    The visa people need to remove their heads from their posteriors about published work; it's hard to threaten the US by talking about something published in a very prominent conference at that conference. (Someone's going to do it!)

    Lea
  • by sholden ( 12227 ) on Wednesday February 16, 2005 @11:55AM (#11689058) Homepage
    It's a hashing function, so "broken" means it is possible to generate a collision with less work than brute force (trying all possibilities) requires.

    Of course TFA gives a more precise definition.
  • by Anonymous Coward on Wednesday February 16, 2005 @03:28PM (#11691651)
    Yes, there are, in both, due to the use of the majority function. Actually producing these collisions is, however, left as an exercise to the reader, as the Chinese technique cannot produce pads that short.
  • by wirelessbuzzers ( 552513 ) on Wednesday February 16, 2005 @05:57PM (#11693346)
    It's based heavily on AES's core operations, which would make me feel uneasy. The value of diversity in the underlying techniques for crypto algorithms is exemplified here by how just about every hash we use today fell because of a lack of diversity.

    True. But even if AES were to fall, its core is totally different from 3DES, IDEA, TWOFISH, BLOWFISH, SERPENT, MARS, RC6, TEA and so on. There are probably dozens of different, still-unbroken symmetric ciphers out there, and this doesn't even include stream ciphers like RC4. And the ciphers that I listed don't bear that much relationship: many of them are Feistel ciphers, but that's about the extent of the structural similarity. Even if a weakness were found in the Feistel design, we'd still have AES, IDEA and MARS at least (not sure about RC6).
