Encryption Security The Internet

First Successful Collision Attack On the SHA-1 Hashing Algorithm (google.com) 87

Artem Tashkinov writes: Researchers from Dutch and Singapore universities have successfully carried out an initial attack on the SHA-1 hashing algorithm by finding a collision at the SHA1 compression function. They describe their work in the paper "Freestart collision for full SHA-1". The work paves the way for full SHA-1 collision attacks, and the researchers estimate that such attacks will become reality at the end of 2015. They also created a dedicated web site humorously called The SHAppening.

Perhaps the call to deprecate the SHA-1 standard in 2017 in major web browsers seems belated and this event has to be accelerated.

  • I don't know; every man has his breaking point.

  • ... if one day it gets out that this was discovered a long time ago by certain intelligence agencies.

    • by Lunix Nutcase ( 1092239 ) on Friday October 09, 2015 @11:00AM (#50693603)

      People have been attacking SHA-1 since 2005.

      https://en.wikipedia.org/wiki/... [wikipedia.org]

      No need for any conspiracy since people were warned about potential weaknesses in SHA-1 for a decade.

      • People have been attacking SHA-1 since 2005.

        https://en.wikipedia.org/wiki/... [wikipedia.org]

        No need for any conspiracy since people were warned about potential weaknesses in SHA-1 for a decade.

        There's also no need for any conspiracy when staying ahead of such things (with their nearly unlimited resources and no concern for profit) is part of the reason why such agencies exist...

      • by arglebargle_xiv ( 2212710 ) on Friday October 09, 2015 @09:36PM (#50697219)

        People have been attacking SHA-1 since 2005.
        https://en.wikipedia.org/wiki/... [wikipedia.org]
        No need for any conspiracy since people were warned about potential weaknesses in SHA-1 for a decade.

        It's also important to point out that this is a free-start collision, where the attacker gets to choose the initial values, something that isn't possible with full SHA-1. This makes the attack much, much easier than an attack on full SHA-1. It took nearly a decade to go from the first free-start collision on MD5 to an actual attack, and MD5 was a much weaker function than SHA-1. Their estimate of "end of the year" may be a bit optimistic.
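To make "free-start" concrete, here is a minimal Python sketch of the SHA-1 compression function with the initial chaining value exposed as a parameter. It is not the researchers' attack code and finds no collisions; the names `sha1_compress` and `sha1` are illustrative. With the standard IV it should agree with `hashlib`; a free-start attacker is, in effect, allowed to pick the `iv` argument, which the real standard never permits.

```python
import hashlib
import struct

# Standard SHA-1 initial chaining value, fixed by the specification.
SHA1_IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def _rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def sha1_compress(chaining_value, block):
    """Apply the SHA-1 compression function to one 64-byte block."""
    assert len(block) == 64
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(_rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = chaining_value
    for t in range(80):
        if t < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif t < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif t < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        temp = (_rotl(a, 5) + f + e + k + w[t]) & 0xFFFFFFFF
        a, b, c, d, e = temp, a, _rotl(b, 30), c, d
    # Feed-forward: add the input chaining value to the round output.
    return tuple((x + y) & 0xFFFFFFFF
                 for x, y in zip(chaining_value, (a, b, c, d, e)))


def sha1(message, iv=SHA1_IV):
    """SHA-1 with standard padding, starting from the given chaining value."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", 8 * len(message))
    state = iv
    for i in range(0, len(padded), 64):
        state = sha1_compress(state, padded[i:i + 64])
    return struct.pack(">5I", *state).hex()


# With the standard IV this is ordinary SHA-1.
print(sha1(b"abc") == hashlib.sha1(b"abc").hexdigest())
# A caller-chosen IV (the "free start") gives an unrelated digest.
print(sha1(b"abc", iv=(0, 0, 0, 0, 0)) != sha1(b"abc"))
```

A collision found with a chosen `iv` does not translate directly into two files with the same standard SHA-1 checksum, because real implementations always start from `SHA1_IV`.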

    • ... if one day it gets out that this was discovered a long time ago by certain intelligence agencies.

      Well, wasn't that what happened with Dual_EC_DRBG?

      • by Lunix Nutcase ( 1092239 ) on Friday October 09, 2015 @11:55AM (#50693995)

        Not really. People at Microsoft Research showed it to be broken years before it became a scandal. No one bothered to listen.

        • Not really. People at Microsoft Research showed it to be broken years before it became a scandal. No one bothered to listen.

          This is one major reason I prefer full public disclosure with functioning reference exploits. Those seem much harder to ignore.

        • Not really. People at Microsoft Research showed it to be broken years before it became a scandal. No one bothered to listen.

          I didn't know that. Got a link?

      • by TheRealHocusLocus ( 2319802 ) on Friday October 09, 2015 @12:17PM (#50694177)

        Well, wasn't that what happened with Dual_EC_DRBG?

        We can never know for sure, but empirically, I really don't think Dual_EC_DRBG ever pinged on NSA's --- or any other state intel actor's --- radar. At least not before EC vulnerabilities became public knowledge. Its use by default in the RSA BSafe toolkit meant that products using that toolkit would be vulnerable. And YES, that was a rich prize. BSafe may have been part of a program to seed a backdoor towards, say, a particular target state or industry.

        BUT... there is for me an irreconcilable problem with that theory. I ran an ISP in those crazy early days when administrators were faced with a choice of whether to 'drop in' a BSafe object library under license (prove USA blahdy-blah) or compile the SSLeay/OpenSSL source, which was by no means as smooth and functional as it is today. But even pre-2000 it was obvious that the whole world was going the OpenSSL open source route as soon as it was stable.

        OpenSSL's popularity was increasing by leaps and bounds... and yet the OpenSSL FIPS Object Module v2.0 had a bug that prevented Dual_EC_DRBG from being used [marc.info]. *IF* the back door was being actively exploited by some state actor, they would have noticed this right away and it would have been a trivial matter (and top priority) for some helpful volunteer to emerge from the shadows and toss in a fix for it. Maybe even a soft-sell for epileptic curves. But this did not happen. Ergo, circumstances more closely resemble a situation in which NOBODY, including NSA, cared.

        Remember that intel agencies are padded with the same bloviating internal memos as any organization, and love to take 'credit' for a thing to show their prowess whether or not the thing is actively being used. Maybe a good part of Snowden's trove are empty boasts.

  • by slashdice ( 3722985 ) on Friday October 09, 2015 @10:51AM (#50693529)
    Git uses SHA1 so every git repository should now be considered compromised. Dice is holding an all-hands meeting this afternoon to find a replacement. Since sourceforge supports SVN and CVS, we may use them. They're highly performant, easy to use, and (most importantly) their crypto can't be broken since they don't have any.
    • Re:what about git? (Score:4, Interesting)

      by Immerman ( 2627577 ) on Friday October 09, 2015 @10:58AM (#50693587)

      Har har.
      But seriously, as I recall git doesn't use SHA 1 for security, but just as a really good hashing algorithm.

      • by fisted ( 2295862 )

        With a SHA1 collision you can rewrite a repository's history.

        • Re:what about git? (Score:4, Insightful)

          by NotInHere ( 3654617 ) on Friday October 09, 2015 @11:18AM (#50693727)

          No. That's a second-preimage attack. A collision is when you can choose multiple versions that map to the same hash.

          • by fisted ( 2295862 )

            Well, yes. Then let's say one could introduce a new (chosen) commit that one could secretly amend after the fact. I guess preimage attacks are the logical next step, though.

            • by Anonymous Coward

              If you can somehow make a subtle change that introduces a security flaw and ALSO hashes to the same SHA1 value, there's a three-letter agency or two that would like to make your acquaintance.

          • Yes - so it means that you can rewrite history: by producing an alternative set of commits that match the hashes. They may be junk, but they are an alternative history. 2nd preimage means that you can choose what the alternative commits are: so that you can choose a plausible alternative history.

            • No, the difference between a second preimage and a collision is this: for a second preimage with your hash function HASH, you are given some input sth and HASH(sth), and you want to find sth2 so that HASH(sth) == HASH(sth2). You can choose sth2, either completely or only in parts. Sometimes you only have HASH(sth), but you can never modify sth.

              For a collision, you are only given HASH as a function, but you can choose both sth and sth2, either completely or in parts.

              This means you never can rewrite history with git, if you only
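The distinction between a collision and a second preimage can be sketched on a deliberately tiny hash. The snippet below is an illustration only: `toy_hash` keeps just the first two bytes of SHA-1 so that both searches finish quickly, and the message formats (`msg-N`, `try-N`) are arbitrary choices. It says nothing about the cost of attacking full SHA-1.

```python
import hashlib
from itertools import count


def toy_hash(data, nbytes=2):
    """First nbytes of SHA-1: a deliberately weak hash for illustration."""
    return hashlib.sha1(data).digest()[:nbytes]


def find_collision():
    """Attacker picks BOTH inputs: any two messages with equal toy hashes."""
    seen = {}
    for i in count():
        msg = b"msg-%d" % i
        h = toy_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


def find_second_preimage(target_msg):
    """One input is FIXED: find a different message with the same toy hash."""
    target = toy_hash(target_msg)
    for i in count():
        msg = b"try-%d" % i
        if msg != target_msg and toy_hash(msg) == target:
            return msg, i + 1


m1, m2, collision_tries = find_collision()
m3, preimage_tries = find_second_preimage(b"an existing commit")
print("collision after", collision_tries, "tries")
print("second preimage after", preimage_tries, "tries")
```

For an n-bit hash with no structural weakness, the collision search is expected to take on the order of 2^(n/2) tries (the birthday bound) while the second-preimage search takes on the order of 2^n, which is why a collision result alone does not let anyone replace an object that already exists.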

    • Re:what about git? (Score:5, Insightful)

      by queazocotal ( 915608 ) on Friday October 09, 2015 @10:58AM (#50693589)

      Not quite.
      This is not yet a full attack on SHA-1.
      It cannot - yet - be used to generate a collision for any known hash.
      It is an indication that you should move away from sha-1 as fast as you can.

    • Why can't git be updated to just use another algorithm? I guess that would be a much better solution than just moving to something else.

      • Re:what about git? (Score:5, Informative)

        by Anonymous Coward on Friday October 09, 2015 @11:34AM (#50693863)

        Why can't git be updated to just use another algorithm?

        First off, Linus on the topic of SHA1 safety: (SO link, as the git mailing list links are flaky on me today) [stackoverflow.com]

        The problem is that git uses the SHA1 hash *extensively* for "permanent" identification of things. There's a host of existing usage out there which would need to be updated/converted, and any conversion of an existing repository would completely invalidate any crosslinks/references using the SHA1 format. Also, because git allows shortened hashes to be used for identification, there's no way you can use the length of the hash to tell the difference between two hash formats for a "mixed" repository.
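As a concrete illustration of that identification scheme, a file's object ID in a SHA-1 based git repository is the SHA-1 of a short header followed by the content. The helper name `git_blob_id` is made up for this sketch.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Object ID git assigns to a file (blob) with this exact content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")
print(empty_id)      # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(empty_id[:7])  # a shortened ID is simply a prefix of the full one
```

Because a shortened ID is only a prefix of the hex string, its length carries no information about which hash function produced it, which is the ambiguity described above.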

        That said, it's not really a big deal. Even if you can manufacture a hash collision, there really isn't a good way to use it to attack a (remote) git repository. Even if you could create a file with the same SHA1 hash as a typical file in a git repository, it's highly unlikely to be anything approximating something that's in an appropriate format. The colliding file will be line noise, rather than a compile-able C++ file, for example.

        Moreover, git is set up to use the *previous* version of a file in case two files have the same SHA1 hash. So you can create a SHA1 collision of an existing file ... which is then ignored by git in favor of the other file. The only way around that is if you have admin access to the remote git repository, or can somehow contrive to get your malicious file accepted to the repository prior to the file you're trying to collide with. (In which case, where are you getting the SHA1 you're targeting from?)

        Even then, if someone has a "clean" copy of the file you're colliding with, makes a modification to that and re-commits, your malicious file will be overwritten wholesale by the new version of the non-malicious file (as git commits encode full file changes, rather than file deltas, so the new SHA1 will be encoded as the new version of the old SHA1).

        You might be able to promote a divergence in the code tree due to the different files, but given that everyone in git has a full version of the repository on their disk, it would soon become apparent that something "funky" is going on in the commit history.

        In short, even if you can make deliberate collisions with SHA1, that doesn't change the usefulness (and safety) of SHA1 for git, just like rot13 being a poor encryption doesn't mean you need to use PGP to encode your usenet joke punchlines.

        (BTW, I'm guessing the GP post is supposed to be a joke)

        • by Bengie ( 1121981 )

          even if you can manufacture a hash collision, there really isn't a good way to use it to attack a (remote) git repository.

          If you have $150k to drop on creating a hash-collision, you can afford someone to hack the remote system. Most systems are not properly secured.

          Even then, if someone has a "clean" copy of the file you're colliding with, makes a modification to that and re-commits, your malicious file will be overwritten wholesale by the new version of the non-malicious file

          Same could be said about the malicious file.

        • by Anonymous Coward on Friday October 09, 2015 @12:15PM (#50694165)

          The colliding file will be line noise

          I guess Perl projects using git are in trouble.

          • Given the state of the art in Perl golf, the colliding file might be a condensed implementation of a complete proof to Fermat's Conjecture, or a DNA codemap to cure cancer.

            Or another stupid "JAPH". [wikipedia.org]

        • TL;DR an SHA1 hash collision is only good for password cracking.

        • Why can't git be updated to just use another algorithm?

          First off, Linus on the topic of SHA1 safety: (SO link, as the git mailing list links are flaky on me today) [stackoverflow.com]

          Linus' comment is somewhat outdated.

          For the first type of collision - the inadvertent kind - a check was added to git a very long time ago. It will not let you commit if there is a hash collision. The timestamp is also part of the commit, and as such, the workaround is to simply wait one second and try to commit again.
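The timestamp point can be sketched as follows: a commit's ID is the SHA-1 of a header plus the commit text, and the author and committer lines include a Unix timestamp, so the same change committed one second later gets a different ID. This sketch assumes a parentless commit, a fixed +0000 timezone and the same identity for author and committer; `git_commit_id` is an illustrative helper, not git's own code, and it does not reproduce the collision check itself.

```python
import hashlib

# Well-known ID of the empty tree object in SHA-1 based repositories.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def git_commit_id(tree, author, timestamp, message):
    """ID of a parentless commit with identical author and committer."""
    ident = "%s %d +0000" % (author, timestamp)
    body = "tree %s\nauthor %s\ncommitter %s\n\n%s\n" % (
        tree, ident, ident, message)
    data = body.encode("utf-8")
    return hashlib.sha1(b"commit %d\x00" % len(data) + data).hexdigest()


who = "A U Thor <author@example.com>"  # placeholder identity
first = git_commit_id(EMPTY_TREE, who, 1444388400, "initial commit")
retry = git_commit_id(EMPTY_TREE, who, 1444388401, "initial commit")
print(first != retry)  # one second later, a different commit ID
```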

    • by Anonymous Coward

      There are limits on what it means to break SHA. Your encryption key should look like random gibberish, so if I find other random gibberish that hashes to the same key, then my key looks like your key as far as the SHA hash is concerned. Git hashes the actual content of the commit, so if you find random gibberish that hashes to the same thing as my real commit, you can replace my commit with your gibberish. The real challenge here is that in order for this to not be immediately obvious, you can't use ran

    • Re:what about git? (Score:5, Interesting)

      by John Allsup ( 987 ) <slashdot@chal i s q u e.net> on Friday October 09, 2015 @11:20AM (#50693739) Homepage Journal

      Immerman's point is essentially right. Here is a more thorough opinion.

      Git does not use SHA1 for cryptographic purposes. The use of SHA1 for cryptographic purpose is what should be deprecated. If major git repositories start calculating SHA256 hashes too, and keep an eye out for in the wild collisions, it will probably be ok. Git does not need to be attack resistant like TLS does. In any case, it is worth rejigging the code so that the hash is done via a plugin and can be migrated, if this isn't already done. I haven't read the git source and am not sure, but it would be easy to get it done before it becomes a problem for git. I use md5sum for a lot of applications which don't require security sufficient for cryptographic purposes. Cryptography is the Formula 1 of computation, and just like most vehicles don't need to compete against an F1 car, many of the trickle-down uses of cryptographic hashes will be fine for a while. Git only has an issue if two versions of files in the same repo produce the same hash. In practice that means two compilable source files, rather than arbitrary meaningful input. That makes cracking much harder since you have a language recognition problem bolted onto the frontend of your hash, so most potentially colliding inputs will be excluded by this (if one colliding file is a C file, and the other is bad French poetry, it is clear which is intended -- cryptographic purposes cannot rely upon such applications of commonsense recognition). Do not worry about Git.

      As an exercise, try and write two valid Python3 files between 10 and 30 lines long importing only sys, re and glob, such that they have identical md5sum outputs. By reducing the input space for a hash, you can make collisions less likely. What is important about this attack is that there is a round trip forward through the hash, and then backwards to a different input. By looking at the information discarded by the nonlinear parts of the hashing algorithm (that is, the non-reversible steps) you can start to make meaningful sense of what the hash is doing. Interestingly, if you produce a language specification which permits fewer valid inputs than the number of possible hash outputs, it is in principle possible that no collisions will occur. Indeed it would be a good exercise for a beginning cryptanalyst to try and construct a language such that valid inputs were guaranteed to get different md5sum outputs.

      • Interestingly, if you produce a language specification which permits fewer valid inputs than the number of possible hash outputs, it is in principle possible that no collisions will occur.

        Yes, and knowing each possible valid input would allow you to build a rainbow table to decode each hash back to its original value (and not just to a value that will give you the same hash).

        Indeed it would be a good exercise for a beginning cryptanalyst to try and construct a language such that valid inputs were guaranteed to get different md5sum outputs.

        Only because they would, shortly thereafter, learn that hashes are, in fact, meant to not be reversible. Guaranteeing a 1-to-1 mapping (e.g. no collisions) makes them reversible, negating the point of the hash.
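The lookup idea in this exchange can be sketched for a small, fully enumerable input space. Strictly speaking this is a plain precomputed lookup table rather than a rainbow table (which uses hash chains to trade time for space), and the four-digit PIN "language" is an assumption chosen only to keep the example small.

```python
import hashlib


def build_lookup(valid_inputs):
    """Map each hash back to the single valid input that produces it."""
    return {hashlib.md5(s.encode()).hexdigest(): s for s in valid_inputs}


# Hypothetical restricted language: every four-digit PIN.
pins = ["%04d" % n for n in range(10000)]
table = build_lookup(pins)

digest = hashlib.md5(b"0420").hexdigest()
print(table[digest])             # recovers the original input
print(len(table) == len(pins))   # True when no two valid inputs collide
```

The hash function itself is no easier to invert here; the recovery works only because the set of valid inputs is small enough to enumerate in advance.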

        • by thogard ( 43403 )

          The entropy in hashes must be less than the entropy in the data or it isn't a hash. That means that a hash requires that there be collisions by definition. A good hash will minimize those but there will always be a risk.

          When writing a program that requires a hash, I find it useful to gut the hash function so that if I'm using sha256, I set all the bytes except for one to zero so I see what happens with collisions and can test that functionality. It is amazing how many bugs I've found in protocol implemen
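A minimal sketch of that testing trick, assuming a hypothetical content-addressed store with a pluggable hash function: the weakened hash keeps only the first byte of SHA-256, so at most 256 distinct keys exist and the collision-handling path is exercised almost immediately. `ContentStore`, `HashCollision` and `collides_within` are names invented for this example.

```python
import hashlib


def weak_sha256(data: bytes) -> bytes:
    """SHA-256 with every byte except the first set to zero."""
    digest = hashlib.sha256(data).digest()
    return digest[:1] + bytes(len(digest) - 1)


class HashCollision(Exception):
    """Raised when two different objects map to the same key."""


class ContentStore:
    """Hypothetical content-addressed store keyed by a pluggable hash."""

    def __init__(self, hash_fn):
        self._hash_fn = hash_fn
        self._objects = {}

    def put(self, data: bytes) -> bytes:
        key = self._hash_fn(data)
        existing = self._objects.get(key)
        if existing is not None and existing != data:
            raise HashCollision(key.hex())
        self._objects[key] = data
        return key


def collides_within(store, n):
    """Insert n distinct objects; report whether a collision was detected."""
    try:
        for i in range(n):
            store.put(b"object-%d" % i)
    except HashCollision:
        return True
    return False


# 257 distinct objects cannot fit into 256 possible keys.
print(collides_within(ContentStore(weak_sha256), 257))
```

Swapping `weak_sha256` for the real hash in production leaves the collision branch in place but, in practice, unreachable, which is exactly why it tends to go untested.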

          • This bro knows his stuff.

            Tim, thanks for elaborating on my point; you gave much more detail than I thought was necessary, but I may not have given enough.
    • I know you're joking but... Git uses SHA1 to generate commit hashes, not for encryption. Just so nobody gets confused reading your post.
    • Instead of panicking, maybe you should actually see if this is really a problem. Here is what Linus says:

      - The attacker kind of collision because somebody broke (or brute-forced) SHA1.

      This one is clearly a _lot_ more likely than the inadvertent kind, but by definition it's always a "remote" repository. If the attacker had access to the local repository, he'd have much easier ways to screw you up.

      So in this case, the collision is entirely a non-issue: you'll get a "bad" repository that is different from what the attacker intended, but since you'll never actually use his colliding object, it's _literally_ no different from the attacker just not having found a collision at all, but just using the object you already had (ie it's 100% equivalent to the "trivial" collision of the identical file generating the same SHA1).

  • combine them? (Score:4, Interesting)

    by JigJag ( 2046772 ) on Friday October 09, 2015 @11:17AM (#50693713)

    One thing that always bothered me with announcements like 'MD5 is dead because we can forge collisions' is this: what are the chances that the forgery would pass *both* MD5 and SHA1?

    Say you have a string S and a forged S' so that S != S' and MD5(S) = MD5(S') and let's say you can create S' easily regardless of S. That's the definition of a hash collision and a proof that the algorithm can't be trusted anymore. Surely, the odds that it also satisfies SHA1(S) = SHA1(S') are close enough to impossible, no?

    If that's the case, then sign your certs, code, etc with concat(MD5(S),SHA1(S)) instead of just one broken hash. Yes, two broken hashes are indeed protecting you.
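The proposed construction is easy to write down with Python's `hashlib`. This is only a sketch of the concat(MD5(S), SHA1(S)) idea as stated, with an illustrative function name; it makes no claim about how strong the combination actually is.

```python
import hashlib


def md5_sha1_concat(data: bytes) -> bytes:
    """concat(MD5(S), SHA1(S)): 16 + 20 bytes, i.e. a 288-bit value."""
    return hashlib.md5(data).digest() + hashlib.sha1(data).digest()


tag = md5_sha1_concat(b"certificate bytes go here")
print(len(tag) * 8)  # 288
```

A forgery S' passes this check only if it collides with S under both functions at once, which is the condition being asked about.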

    • Re:combine them? (Score:4, Informative)

      by Sigma 7 ( 266129 ) on Friday October 09, 2015 @12:03PM (#50694079)

      Apparently, concatenation isn't as effective as it could be. It will be at least as strong as either MD5 or SHA1, and while it seems that you'd get a 288 bit hash, it's about as strong as if you had 174 bits.

      It's probably easier to make a 288 bit hash from the start.

      Discussion page: http://crypto.stackexchange.co... [stackexchange.com]

      • by Bengie ( 1121981 )
        I think the point is that finding a collision that will pass both MD5 and SHA1 is harder than finding a collision that only passes SHA1. Even if you're pessimistic, you're at least as strong as your strongest link in this situation.
    • Of course it would be just as easy to add sha256 rather than to add md5. You could then deprecate the sha1 and after a while stop using it at all. If you keep the two hashes separate rather than concatenating them, you can deprecate a weaker one every ten years or so, as needed. Instead of:

      if matches (candidate, md5hash)
      You'd use:
      if matches (candidate, @undeprecated_hashes)
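The pseudocode above could look roughly like this in Python. Everything here is a sketch under assumptions: the record format (a dict from algorithm name to hex digest), the `UNDEPRECATED` list and the rule that at least one non-deprecated hash must be present are all choices made for the example.

```python
import hashlib
import hmac

# Algorithms still trusted; an entry is removed once it is deprecated.
UNDEPRECATED = ("sha256", "sha512")


def matches(candidate: bytes, expected: dict) -> bool:
    """True if candidate agrees with every stored non-deprecated hash.

    Deprecated entries such as "md5" or "sha1" are ignored, and a record
    containing only deprecated hashes never matches.
    """
    checked = 0
    for name in UNDEPRECATED:
        if name in expected:
            actual = hashlib.new(name, candidate).hexdigest()
            if not hmac.compare_digest(actual, expected[name]):
                return False
            checked += 1
    return checked > 0


good = b"release.tar contents"
record = {
    "md5": hashlib.md5(good).hexdigest(),
    "sha256": hashlib.sha256(good).hexdigest(),
}
print(matches(good, record))         # True
print(matches(b"tampered", record))  # False
```

Retiring an algorithm then means deleting one name from `UNDEPRECATED`; stored records do not need to be rewritten as long as they already carry a newer hash.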

    • One thing that always bothered me with announcements like 'MD5 is dead because we can forge collisions' is this: what are the chances that the forgery would pass *both* MD5 and SHA1?

      The increased difficulty of finding a collision comes from storing two separate hashes, not from diversifying algorithms.

      For example, one could just as easily perturb a plaintext in a publicly known deterministic way and then rerun the very same hash algorithm for a similar effect. If you assume both algorithms are no longer sufficiently collision resistant for your needs, then switching it up makes no practical difference.

    • by cdrudge ( 68377 )

      Based on all the published benchmarks I could find, the amount of time it takes to compute the MD5 + SHA1 hashes is approximately the same if not greater than what it takes to compute the SHA256. Why bother to compute a hash with two "broken" algorithms when you can spend less time using an unbroken one?

      Here's the results of running a benchmark test using openssl speed on my i5 laptop:

      Doing md5 41943040 times on 16 size blocks: 41943040 md5's in 17.67s
      Doing sha1 41943040 times on 16 size blocks: 41943040 sh

      • by tlhIngan ( 30335 )

        Perhaps you might be able to optimize things so that both the md5 and sha1 hashes were computed simultaneously as the bytes were read so that they only had to be traversed once. But do you think you'd be able to shave in this example 30% the combined time to equal the sha256 time? And then you'd still be left with two individually broken algorithms.

        The goal of combining isn't to save time, it's because it's what's available. Let's say you have an embedded device, and it can do SHA1 and MD5 already in hardwa

      • by JigJag ( 2046772 )

        the issue in using one hash is still present though. One day if SHA256 is broken, you will be back to the same problem.

        Suppose a (near?) future where SHA256 is widely deployed and just got broken: full collision on demand. That future also means that SHA1 is even more trivially broken and MD5 even more so.

        My point is that it would be harder to conjure S' so that S != S' AND MD5(S) = MD5(S') AND SHA1(S) = SHA1(S') than it would be to have S != S' AND SHA256(S) = SHA256(S')

        For that matter, string 3 of those hash

  • I have nothing more to say.
  • It is called a freestart collision.
    A freestart collision is one where the attacker gets to choose the initialization vector. In maybe all practical applications, it doesn't happen as it is fixed by the standard.

    Unlike MD5, it is still impossible to get two different files that have the same standard SHA-1 checksum.

    And even true collision attacks are quite limited. For many applications (like cracking passwords), what you need is a preimage attack, and neither MD5 nor SHA-1 have one.

    • Re:Weak attack (Score:4, Informative)

      by BronsCon ( 927697 ) <social@bronstrup.com> on Friday October 09, 2015 @01:58PM (#50695039) Journal

      Unlike MD5, it is still impossible to get two different files that have the same standard SHA-1 checksum.

      False. As long as there are potentially more bits in the input than there are in the output (read: the input can be longer than the resulting hash), any hashing algorithm will have collisions. It is the difficulty in generating these collisions that makes the algorithm strong or weak; and they are quite easy to generate for MD5.
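The counting argument can be checked directly on a hash cut down to one byte: with 256 possible outputs, any 257 distinct inputs must contain a collision. The one-byte truncation and the function names are assumptions made so the loop finishes instantly; the same argument applies to the full 160-bit output, only with an infeasibly large number of inputs.

```python
import hashlib


def first_byte_hash(data: bytes) -> int:
    """SHA-1 truncated to its first byte: only 256 possible outputs."""
    return hashlib.sha1(data).digest()[0]


def pigeonhole_collision():
    """Return two distinct inputs that share a truncated hash."""
    seen = {}
    for i in range(257):
        msg = i.to_bytes(2, "big")
        h = first_byte_hash(msg)
        if h in seen:
            return seen[h], msg
        seen[h] = msg
    raise AssertionError("unreachable: 257 inputs, 256 outputs")


a, b = pigeonhole_collision()
print(a != b and first_byte_hash(a) == first_byte_hash(b))  # True
```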

      • by Cramer ( 69040 )

        While true, the issue is one of ease of generating a (meaningful) input that matches the hash. That is, given a hash, one cannot instantly provide a plain text to generate it. (this can only be done today with rainbow tables -- i.e. try everything until there's a match, which is far from "quick".) Nor can one start with a given plaintext and alter it while not altering the hash. (an example of such exists for MD5, thus it's "broken", however, in reality, it is merely "weak" as it's very difficult to do. No

        • While true, the issue is one of ease of generating a (meaningful) input that matches the hash. That is, given a hash, one cannot instantly provide a plain text to generate it.

          Why, yes, that's what my second sentence said.

  • The original Xbox and I have some unfinished business.

  • It's gotten to the point that, in order to encrypt anything safely for a few years, I have to invent a time machine and steal the technology from the future. And kill the inventor, so that he doesn't independently discover it in the original time stream.

    We just got vendors to stop using MD5 and SSL 3.0 about a year ago.
