Become a fan of Slashdot on Facebook

Practical Exploits of Broken MD5 Algorithm 253

Posted by CowboyNeal on Friday September 23, 2005 @05:50AM from the as-news-breaks-so-does-encryption dept.

jose parinas writes "A practical sample of an MD5 exploit can be found, with source code included,in codeproject, a site for .Net programmers. The intent of the demos is to demonstrate a very specific type of attack that exploits the inherent trust of an MD5 hash. It's sort of a semi-social engineering attack. At Microsoft, the MD5 hash functions are banned. The main problem is that the attack is directed to the distribution of software process, as you can understand reading the paper, Considered Harmful Someday. Some open source programs, like RPM, use MD5, and in many open source distributions MD5 is used as check sum."

This discussion has been archived. No new comments can be posted.

Practical Exploits of Broken MD5 Algorithm

Load All Comments

Search 253 Comments Log In/Create an Account

Comments Filter:

Checksums are always going to be vulnerable (Score:3, Insightful)

by LiquidCoooled ( 634315 ) writes: on Friday September 23, 2005 @05:55AM (#13627874) Homepage Journal

If you contract your file from x bytes down to a fixed size, no matter what algorithm you use, you will always have collisions.

Unless you start to give your hash keys as the same size as the original file, there is not anything that can be done about it, ever.

Share
twitter facebook
- Re:Checksums are always going to be vulnerable (Score:5, Insightful)
  
  by Ckwop ( 707653 ) * writes: on Friday September 23, 2005 @06:09AM (#13627912) Homepage
  
  If you contract your file from x bytes down to a fixed size, no matter what algorithm you use, you will always have collisions. Unless you start to give your hash keys as the same size as the original file, there is not anything that can be done about it, ever.
  
  While this technical correct it is slightly misleading. The aim of a hash function is to make it has hard as possible to find a collision. For an n-bit has it takes roughly 2^(n/2) operations to find a collision. Any attack faster than this is considered a break of the hash function. It typically takes less than five minutes to break MD5 so it is horribly broken.
  
  While removing the possibility of collision all together is provably impossible you can design a hash function for which finding a collision is computationally infeasible. The standard size to achieve this today is 256-bits and the better designed functions like Whirlpool use this hash length.
  
  Simon
  
  Parent Share
  twitter facebook
  - Re:Checksums are always going to be vulnerable (Score:4, Interesting)
    
    by gowen ( 141411 ) writes: <gwowen@gmail.com> on Friday September 23, 2005 @06:29AM (#13627978) Homepage Journal
    
    It typically takes less than five minutes to break MD5 so it is horribly broken.
    
    But all that enables you to do is replace an MD5'd file with garbage that happens to have the same MD5 sum. It's hard to deliver a payload when you're limited to tricking a target into downloading what would be (essentially) a random string of ones and zeroes.
    
    Parent Share
    twitter facebook
    - Re:Checksums are always going to be vulnerable (Score:5, Informative)
      
      by Ckwop ( 707653 ) * writes: on Friday September 23, 2005 @06:40AM (#13628013) Homepage
      
      But all that enables you to do is replace an MD5'd file with garbage that happens to have the same MD5 sum. It's hard to deliver a payload when you're limited to tricking a target into downloading what would be (essentially) a random string of ones and zeroes.
      
      At Toorcon this year, Dan Kaminsky showed how to generate two valid, nicely rendered, html files with the same hash . Basically, he injects javascript into the page to remove the rubbish at the begining of the file. But how often to do you view the source of a page you're visiting. It'd be hard for a layperson to notice this. Make no mistake about it, the collision attack is very dangerous.
      
      Simon
      
      Parent Share
      twitter facebook
    - Re:Checksums are always going to be vulnerable (Score:3, Informative)
      
      by evilviper ( 135110 ) writes:
      
      But all that enables you to do is replace an MD5'd file with garbage that happens to have the same MD5 sum.
      It's not nearly as scary as swaping one document for another, or one binary for another, but it's still quite useful.
      
      Think of P2P networks. Gnutella uses SHA1 hashes to verify files, file mirrors, and the final downloaded file. If some RIAA employees could find the SHA1 hash of a very popular song, and generate junk with the same hash, they could have people downloading their junk (pardon the pun) an
- Re:Checksums are always going to be vulnerable (Score:5, Informative)
  
  by Anonymous Coward writes: on Friday September 23, 2005 @06:10AM (#13627917)
  
  This completely misses the point of cryptographic hashes.
  
  The point is that it is supposed to be difficult to find another data set which hashes to the same value without doing a brute-force search. Of course you will get collisions, but the changes are (supposed to be) 1 in 2^80 with MD5 or 1 in 2^128 with SHA-1.
  
  The exploits mentioned above are that the algorithms (MD5 and to some extent SHA-1) have been broken to allow you to construct a piece of data which hashes to the same value as the original. This is VERY different from the fact that you get collisions.
  
  Parent Share
  twitter facebook
  - Re:Checksums are always going to be vulnerable (Score:2)
    
    by nagora ( 177841 ) writes:
    
    The exploits mentioned above are that the algorithms (MD5 and to some extent SHA-1) have been broken to allow you to construct a piece of data which hashes to the same value as the original.
    Well, sort of. It allows you to construct two pieces of data that hash to the same value. It doesn't seem to make it any easier to take an arbitrary original and just magic up another file which has the same hash.
    TWW
    - Re:Checksums are always going to be vulnerable (Score:3, Informative)
      
      by Pieroxy ( 222434 ) writes:
      
      It doesn't seem to make it any easier to take an arbitrary original and just magic up another file which has the same hash.
      
      Yes it does. [slashdot.org]
      - Re:Checksums are always going to be vulnerable (Score:3, Informative)
        
        by Anonymous Coward writes:
        
        No, it doesn't.
        
        You linked an example which takes a document A, and produces two documents A1 and A2 where
        
        A1 looks like A, but A2 does not look like A
        and MD5(A1) = MD5(A2)
        
        BUT, critically, MD5(A1) is not MD5(A)
        
        So neither of the documents created is an imposter. Arbitrary payloads are still protected by MD5. If you don't agree, simply reply with a link to a file that has the MD5 hash 7a0a25a5c71fa2639a41ee743aa5e2b7
        
        No-one can do this yet, and they may never be able to do it. But MD5 has failed one of its origi
        
        Re:Checksums are always going to be vulnerable (Score:3, Insightful)
        
        by ajs ( 35943 ) writes:
        
        You are looking at this incorrectly. The problem is that, given file A, which has hash X, if it is possible to create file B which has hash X in a non-exaustive way, then the hash is weak. Most such techniques work well regardless of file B being a from-scratch random pile of data or a combination of well-formed data plus random data. So, for example, you might have an RPM with 3 files in it. Add a fourth file, and start permuting to find your similar hash.
        
        Now, as you point out, we're not yet to the point t
      - Re:Checksums are always going to be vulnerable (Score:4, Informative)
        
        by baadger ( 764884 ) writes: on Friday September 23, 2005 @09:30AM (#13628511)
        
        No it doesn't.
        
        The Wang vector pair floating about at the moment, when prepended to 2 useable files will produce a MD5 collision of the said files. BUT - as a result of doing this you are also going to corrupt these files and make them unuseable (executeables, MP3's etc, obviously not text documents).
        
        All the proof-of-concept article shows is the two attack vectors by Wang in use with 3 simple programs. You will notice the "md5extractor", which needs to be in place to remove the arbitrary vector data before the evil good.exe becomes dangerous. This exact procedure doesn't apply to most software distribution actually, how are you going to get the extractor on the victims computer in the first place?
        
        This could be a problem is somebody can produce an attack vector pair that does produce a valid executeable/PE header or and MP3 header. But these have structure and leave much less room for the vector, may place restrictions on the payload, and might not even be possible.
        
        The webpage thing described in the comment you link to is pretty harmless. Who the hell usines a MD5 hash on a HTML documents? Misleading documentation? Browser exploits? Unlikely.
        
        The fact remains if you were to try and use this method you would really be doing, and what you will have to do, is nothing more than trick the user by normal means (human failure).
        
        Coincedentally, for use in authentication you would be a fool NOT to be running sanity checks on input anyway. For use in authentication, salted and sanity checked input to MD5 should is still very very safe.
        
        I can't see a reason why P2P applications implemented for networks using MD5 file verification can't start popping off bytes at the beginning of downloads (the first block) and try it with another payload to detect and reject people using this type of multicollision attack. In addition these applications could check for valid MP3, AVI, JPEG, headers etc.
        
        The author of the "MD5 - Someday to be considered harmful" paper is correct. MD5 is risky for some purposes, P2P networks still using MD5 without any smarts may be ruined, but the hash far from dead if used carefully backed up by other checks. What makes people think moving to SHA-1 or Whirlpool is going to solve these problems (OK with SHA-1 different types of attacks) in the longer term?
        
        Relying on the hash mechanism alone is just a bad habit to get into. People are switching because it's just best to play it safe when people (myself included) don't understand the full significance of the attacks produced this year.
        
        Parent Share
        twitter facebook
- Re:Checksums are always going to be vulnerable (Score:2)
  
  by LostCluster ( 625375 ) * writes:
  
  Yes, but ideally the collisions would be random collections of bits. They shouldn't make up anything meaningful, nevermind being a piece of malware.
- Re:Checksums are always going to be vulnerable (Score:2, Insightful)
  
  by LentoMan ( 704115 ) writes:
  
  Use of two or more different checksums algorithms would make it substantially harder to create a file with a fake hash.
  - Re:Checksums are always going to be vulnerable (Score:2)
    
    by ceeam ( 39911 ) writes:
    
    Sure! But the interesting question, IMHO, is _how_much_ more difficult it would be? If both hashes are "patchable" then what would be the difficulty to find a method to make _both_ hashes match simultaneously? (I'm not a cypherpunk, anyone?)
  - Re:Checksums are always going to be vulnerable (Score:3, Insightful)
    
    by m50d ( 797211 ) writes:
    
    No, it wouldn't, *sigh*. Stop posting this, I'm tired of correcting it.
    As another reply said, however, doubling the bit count would improve security. But, a simple double-MD5 would have exactly the same problems. Therefore, the solution is to use a longer and more secure hash, like, for example, SHA-256.
    - Re:Checksums are always going to be vulnerable (Score:2, Informative)
      
      by ocelotbob ( 173602 ) writes:
      
      I think you're missing what the OP is saying here. The OP is suggesting to use something like MD5+SHA1, algorithms with different techniques for generating hashes so that you decrease the probability of creating a collision that works for both, not using something like double MD5
      - Re:Checksums are always going to be vulnerable (Score:2)
        
        by m50d ( 797211 ) writes:
        
        Yes, that's exactly what the OP is saying and it's a stupid idea. By double MD5 I meant a double-length MD5 (twice as many bits in the hash), which although still a stupid idea is nowhere near as stupid as using MD5 and SHA1 together and expecting it to increase security.
        
        Re:Checksums are always going to be vulnerable (Score:3, Informative)
        
        by Tony Hoyle ( 11698 ) writes:
        
        Using multiple hashes is a known way to increase security (some protocols use it) - it *does* work because the number of potential collisions is reduced - if you can create a collission with the MD5 (far from simple, in fact) it's extremely unlikely to *also* collide with the SHA1 hash - the number of plaintexts where they both collide is significantly lower.
- Re:Checksums are always going to be vulnerable (Score:3, Interesting)
  
  by RAMMS+EIN ( 578166 ) writes:
  
  ``If you contract your file from x bytes down to a fixed size, no matter what algorithm you use, you will always have collisions.''
  
  Yes.
  
  ``Unless you start to give your hash keys as the same size as the original file, there is not anything that can be done about it, ever.''
  
  No. If I run deflate or some other compression algorithm on a file, chances are it will come out smaller. Still, the compressed file maps one to one with the original.
- - Re:Checksums are always going to be vulnerable (Score:5, Informative)
    
    by ceeam ( 39911 ) writes: on Friday September 23, 2005 @06:25AM (#13627962)
    
    (sigh) Insightful, my ass... Checksums are NOT reversible. The main trick here is to replace one file with another and leave the hash/checksum the same by patching the fake file. (For practically every file format there exists a spare space where this patching could be done.)
    
    Parent Share
    twitter facebook
    - Re:Checksums are always going to be vulnerable (Score:2, Interesting)
      
      by MaGogue ( 859961 ) writes:
      
      You almost got it right. The reverse problem of a checksum is to produce the original file that will give the checksum.
      
      Actually, the functions are called hash functions, and they can be written as:
      H(file) = h
      H() is the hash function, h is the hash value (checksum) the reverse would be RH()
      RH(h) = file
      Now we can of course prove that reverse is not a unique function, since any fixed-length h has a limited number of possible values, whereas file can be of any length.
      Therefore, reverse has many pos
  - -1, Clueless (Score:2)
    
    by Jussi K. Kojootti ( 646145 ) writes:
    
    ... or maybe you (and the mod weho gave you Insightful) just have a strange sense of humour?
So if you need a freely available hash algorithm (Score:5, Informative)

by MadMoses ( 151207 ) writes: on Friday September 23, 2005 @05:56AM (#13627877) Homepage

...better use Tiger [technion.ac.il] or Whirlpool [terra.com.br] (based on AES). AFAIK there are no known vulnerabilities or attacks for these two yet.

Share
twitter facebook
- Re:So if you need a freely available hash algorith (Score:2, Insightful)
  
  by thomasdn ( 800430 ) writes:
  
  AFAIK there are no known vulnerabilities or attacks for these two yet
  
  I am no cryptography expert so I can not read and understand those algorithms. But the fact that there are no known vilnerabilities for an algorithm doesn't make it secure.
  Maybe they are just not used as much as other well known algorithms. And therefore nobody has found vulnerabilities for them yet?
  - Re:So if you need a freely available hash algorith (Score:5, Informative)
    
    by poopdeville ( 841677 ) writes: on Friday September 23, 2005 @07:14AM (#13628089)
    
    I am no cryptography expert so I can not read and understand those algorithms. But the fact that there are no known vilnerabilities for an algorithm doesn't make it secure. Maybe they are just not used as much as other well known algorithms. And therefore nobody has found vulnerabilities for them yet?
    This is a complicated issue. Generally, the security offered by an encryption algorithm isn't measured by its popular usage, but by the amount of time qualified professional cryptographyers/mathematicians/hackers have studied it without finding a critical vulnerability. My claim is probably too broad: there is no magical formula that determines how secure an algorithm is. But in depth work by professionals does endear confidence in an algorithm.
    As a general rule of thumb, it is wise to use an algorithm that has been seriously studied for 10-20 years. At this point, it is modern enough to withstand modern brute force attacks, and (hopefully) understood well enough to ensure that there are no structural vulnerabilities. If it is much older than that and still studied, it is likely because a flaw has been found and people are trying to push it as far as it goes.
    
    Parent Share
    twitter facebook
  - Secure is as secure does, sir. (Score:2)
    
    by lheal ( 86013 ) writes:
    
    "Secure" is both an adjective and a verb. We usually use it as an adjective, like "orange" or "happy".
    
    I've had several discussions with my wife over the years about something being "orange". She insists that some maize-like color is truly orange, but the color of the University of Illinois' basketball jersey is more of a red.
    
    Choose an algorithm that makes you feel secure. Understand that your chosen algorithm, whatever it may be, is not proved unbreakable. It's just that the work involved for someone to
- Re:So if you need a freely available hash algorith (Score:2)
  
  by El_Muerte_TDS ( 592157 ) writes:
  
  No attack? what about brute force? It's an attack, not a good one, but it is an attack.
  - Re:So if you need a freely available hash algorith (Score:2)
    
    by MadMoses ( 151207 ) writes:
    
    Brute force is always a possibility. But the possibiblity of someone having and willing to spend the resources, technology, money and time to screw you over by brute forcing a 256 bit Whirlpool hash is marginal.
- Re:So if you need a freely available hash algorith (Score:3, Funny)
  
  by commodoresloat ( 172735 ) writes:
  
  ...better use Tiger
  Once again, OSX proves to be more secure!
  *ducks*
- Whirlpool is not based on AES (Score:3, Informative)
  
  by Paul Crowley ( 837 ) writes:
  
  Whirlpool is not "based on AES". It shares a few similiarities (and a designer) but it is a distinct algorithm in its own right. It has a larger block size, different S-boxes, a different linear component, a different key schedule and so on.
  
  I would interpret "based on AES" as meaning it actually uses AES itself (perhaps in tandem Davis-Meyer mode or similar).
  
  I like Whirlpool but it's not fast. I think Tiger is quite a bit faster.
  
  Bernstein's cache timing attacks, and the ever-growing gap between processor
- - Re:So if you need a freely available hash algorith (Score:3, Informative)
    
    by MadMoses ( 151207 ) writes:
    
    Misunderstandment. I never made a statement about MD5 not being "free". But MD5 is vulnerable, that's why I pointed out some alternatives.
- - Re:So if you need a freely available hash algorith (Score:2)
    
    by FidelCatsro ( 861135 ) writes:
    
    There is also no way of knowing if your going to get hit by a bus next month .
    So all we can do is take sensible precautions to make sure it doesn't happen .
hashtrust (Score:4, Interesting)

by gcnaddict ( 841664 ) writes: on Friday September 23, 2005 @05:56AM (#13627878)

Now we know why people distribute modified game ISOs on the net and check it with md5 :P In all technicality, couldnt this mean that someone could land a virus on someone else's machine because the person trusted the hash?

Share
twitter facebook
- Re:hashtrust (Score:2)
  
  by MadMoses ( 151207 ) writes:
  
  Of course it's possible, that's why cryptographic hash algorithm designers try to make it as hard as possible to find an exactly equal hash value for a different file. That doesn't mean it can't be done. It just has to be hard enough that the result (screw some stranger's computer in your example) doesn't warrant the effort you have to put into the attack.
- Re:hashtrust (Score:2)
  
  by Metteyya ( 790458 ) writes:
  
  No, it's extremely hard. Probability of collision AND valid code is too damn low. Although this article proves that, with little social engineering, you can "emulate" (or cheat) a valid-code collision.
  - Re:hashtrust (Score:3, Insightful)
    
    by SillyNickName4me ( 760022 ) writes:
    
    Any practical use of hash algorithms is extremely easy to fool with social engineering. All you have to do is get someone to trust your hash..
  - Re:hashtrust (Score:2, Interesting)
    
    by Anonymous Coward writes:
    
    No, it's extremely hard. Probability of collision AND valid code is too damn low. Although this article proves that, with little social engineering, you can "emulate" (or cheat) a valid-code collision.
    
    No, it looks quite feasible.
    
    You need two ISOs, to each of which you have added a different block of random bytes that happen to hash to the same MD5, and you have also modified the game's setup program so that, depending on which block of bytes is there, it either installs the game normally, or it installs the
- Re:hashtrust (Score:2)
  
  by GauteL ( 29207 ) writes:
  
  Well, it only works on a specially prepared original, so you couldn't just change someone elses ISO into something harmful and still keep the same MD5 hash.
  
  However, you COULD possibly release a "good" ISO first with this preparation and get people to trust that source by word of mouth, and then you could swap this ISO for a bad ISO with the same MD5 sum.
  
  Still quite sinister, but not as easily exploitable as changing someone elses ISO.
But... (Score:2, Interesting)

by Auraiken ( 862386 ) writes:

Isn't this the problem with all algorithms? Only way to know if something is something... is to see something instead of its checksum?
As for md5... with only 32bits, it should've came up with repetitive hashes in end anyways?

I guess since the article explains some issues against md5 security, the only answer would be to trust the source that is supplying the hash in the first place? Coming down to the fact that a system is only as secure as its user?
- Re:But... (Score:2)
  
  by m50d ( 797211 ) writes:
  
  As for md5... with only 32bits, it should've came up with repetitive hashes in end anyways?
  Yes, but the point of a hash is that it's computationally infeasible to find them. If you're distributing a binary with a certain MD5 there's sure to be a file somewhere with the same sum, maybe it's a car manufacturer's specification PDF, but I can't write a trojan and get it to MD5 to the same thing *and* do what I want. Now you can.
- Re:But... (Score:2)
  
  by VolciMaster ( 821873 ) writes:
  
  As for md5... with only 32bits, it should've came up with repetitive hashes in end anyways?
  MD5 is a 160-bit algorithm, not 32.
  only answer would be to trust the source that is supplying the hash
  I have read a lot of Bruce Schneier's writings, and he points out over and over again that security relies on people, and people can fail. It is perfectly possible to create unbreakable algorithms and protocols. Impossible to guess passwords and keys.
  That's when the person wishing to gain access or dupe the user
M$ Antihash (Score:5, Funny)

by CDMA_Demo ( 841347 ) writes: on Friday September 23, 2005 @06:02AM (#13627891) Homepage

At Microsoft, the MD5 hash functions are banned.

they use crc instead!

Share
twitter facebook
- Re:M$ Antihash (Score:2)
  
  by ceeam ( 39911 ) writes:
  
  Oh, here's their CRC function: BYTE CRC(BYTE *p, size_t sz) { BYTE res = 0; while (sz--) { res ^= *p++; } return res; } // I wish I were kidding.. Not in MS's code though, but I've seen exactly this.
  - Re:M$ Antihash (Score:3, Funny)
    
    by DrSkwid ( 118965 ) writes:
    
    and your point is ?
    
    you'll be telling me Huffman encoding is dangerous next !
A quick note (Score:5, Informative)

by Darren Winsper ( 136155 ) writes: on Friday September 23, 2005 @06:06AM (#13627898)

This seems to work on the assumption that you want to do some harm with a program you created yourself, you can't actually take a random RPM and turn it into an evil RPM with the same MD5. So, yes, it's bad, but it's not as bad as you might think.

Share
twitter facebook
- Re:A quick note (Score:3, Funny)
  
  by commodoresloat ( 172735 ) writes:
  
  you can't actually take a random RPM and turn it into an evil RPM
  Sure you can; all you need to do is set the evil bit.
- Re:A quick note (Score:2)
  
  by Antique Geekmeister ( 740220 ) writes:
  
  RPM's are a bit more interesting. The content of the RPM is MD5 signed, but almost all RPM packages are also GPG signed as a package. This means that downloading things from redhat.com or mandrake.com, they have a GPG signature that is checked by most installers as a default and that the package is signed with, swearing that it was compiled by the people you love and trust.
  
  If you're not checking the GPG signatures, you have no idea what is in the package anyway: it could have been built by the 3l33t cracker
  - Re:A quick note (Score:4, Informative)
    
    by provolt ( 54870 ) writes: on Friday September 23, 2005 @07:57AM (#13628186)
    
    The signature doesn't do anything to solve the problem. If I create an evil tarball that has the same hash as the good tarball, the signature will verify properly for both files. When you download the file, you won't know if it's the good or evil one.
    
    GPG signs the hash. With a strong hash function, it's as good as signing the tarball. With a weak hash functions, one signature would match for many files.
    
    Parent Share
    twitter facebook
H(x) == H(y) - H(x + q) == H(y + q) ? (Score:3, Informative)

by strcmp ( 908668 ) writes: on Friday September 23, 2005 @06:06AM (#13627901)

There is a known result about MD5 hash function, is this: If MD5(x) == MD5(y) then MD5(x+q) == MD5(y+q) So, if you have a pair of messages, x and y, with the same MD5 value, you can append a payload q, and the MD5 value keeps the same, the size of q is arbitrary.

Considering this is such a "well known" result, you would think that MD5 should have been abandoned long ago. Is this true for other popular hash functions?

Share
twitter facebook
- Re:H(x) == H(y) - H(x + q) == H(y + q) ? (Score:5, Insightful)
  
  by igb ( 28052 ) writes: on Friday September 23, 2005 @06:12AM (#13627926)
  
  It's almost universal to use the byte count as part of the checking of equivalence, either by storing it as a distinct item or by using it as inital or final salt to the calculation of the hash.
  ian
  
  Parent Share
  twitter facebook
  - Re:H(x) == H(y) - H(x + q) == H(y + q) ? (Score:2)
    
    by Soul-Burn666 ( 574119 ) writes:
    
    Mod parent up!
    
    Most hash functions work by dividing the data to blocks, then hashing a block and combining it with the next block as input for the "small" hash function. So most functions have that property.
    
    However, like the parent stated, the last block of the data is padded with zeros and then the byte count of the original message. In case there's no place, a complete new block is added, padded with zeros and the byte count in the end.
    
    This doesn't work when x and y are of the same length and still have th
  - Said it before, saying it again (Score:2)
    
    by cpeikert ( 9457 ) writes:
    
    The known attacks produce MD5 collisions with the same length!!
    
    Padding and byte-counts and so forth are a red herring and provide no mitigation. The basic colliding messages are all exactly 1 block long. From there you can do message-extension. Any hash function with this so-called Merkle-Damgard structure is vulnerable to length extension attacks. And yes, this is true of Whirlpool too (in contrast to another poster's assertion): if you find two messages with the same hash and the same length, you can
  - Merkle Damgard: Why len(x) and len(y) matter (Score:2, Informative)
    
    by 0ptix ( 649734 ) writes:
    
    AFAIK this is an attack on the underlying Merkel-Damgard paradigm which both SHA-1 and MD5 (amoungst others) employ. The paradigm goes as follows:
    
    IV | Intialisation vector of n-bits
    MB_i | Message Black i of n-bits
- Re:H(x) == H(y) - H(x + q) == H(y + q) ? (Score:5, Interesting)
  
  by Ckwop ( 707653 ) * writes: on Friday September 23, 2005 @06:15AM (#13627935) Homepage
  
  Is this true for other popular hash functions?
  
  No it is not. The newer hashes, such as Whirlpool, do not have this problem. You're correct in saying this is a "well known result" and every cryptographer worth his salt says that this fact constitutes a break of the algorithm. We've known since the middle of the nineties that breaking MD5 was within reach. The fact there has been so much inertia in getting people to change is quite incredible really.
  
  At Toorcon this year, Dan Kaminsky showed a way to create two different webpages that render properly in a browser but have the same MD5 hash. Anybody who thinks this attack is theortical and ignorable is grossly mistaken.
  
  Simon
  
  Parent Share
  twitter facebook
  - Re:H(x) == H(y) - H(x + q) == H(y + q) ? (Score:2)
    
    by cpeikert ( 9457 ) writes:
    
    The newer hashes, such as Whirlpool, do not have this problem.
    
    I don't think you are right about this. Although it does not exactly use the Merkle-Damgard structure, Whirlpool is still iterative. If you find a collision between two short messages of the same length, then Whirlpool's internal state collides. From there you can extend both messages with the same string, maintaining the same internal state, padding and byte counts be damned.
    
    Any iterative hash is going to have this issue. It really seems lik
- Re:H(x) == H(y) - H(x + q) == H(y + q) ? (Score:2)
  
  by m50d ( 797211 ) writes:
  
  Considering this is such a "well known" result, you would think that MD5 should have been abandoned long ago. Is this true for other popular hash functions?
  It's a useful property because it means you can MD5 data piece by piece. I don't want to think about how long it would take to MD5 a 700MB .iso if you had to use the file as a whole, it takes long enough as it is. It shouldn't allow you to break a hash function without other flaws, and I'd expect it to be the case for many hashes.
- Re:H(x) == H(y) - H(x + q) == H(y + q) ? (Score:2)
  
  by kasperd ( 592156 ) writes:
  
  Is this true for other popular hash functions?
  
  First of all the way it is stated is a bit too general. It is not enough for x and y to cause a collision in the final hash, they must cause a collision in the internal state of the hashing algorithm. But all the hash collisions which have been found so far worked by finding a collision in the internal state, in which case the statement is true for any hash algorithm.
  
  You may find algorithms claimed not to be vulnurable to this kind of attack. The reality ju
Not a problem for software distribution.. (Score:4, Interesting)

by hhghghghh ( 871641 ) writes: on Friday September 23, 2005 @06:07AM (#13627903)

This isn't a problem for software distribution, really, since the good.bin file needs to start with a vector designed to enable a collision. A good-faith programmer wouldn't include that vector.

It is a problem for stuff like contracts; you draw up two versions of a contract, a good one and an evil one, let someone sign the good one, and later keep them to the clauses in the evil one.

So while there IS a very big problem, the example is a bit contrived.

Share
twitter facebook
- Re:Not a problem for software distribution.. (Score:3, Insightful)
  
  by provolt ( 54870 ) writes:
  
  This is certainly a problem with software distribution. Here's an example of why. The good-faith programmer creates good.bin. A Bad-faith programmer creates bad.bin with the same hash. All the bad-faith programmer has to do is find a way to swap bad.bin in for good.bin (hacking, misrepresentation, etc).
  
  Now everyone who gets bad.bin will hash the data, verify the signature and see that the signature verifies. When the signature verifies, everyone believes bad.bin is really good.bin.
  - - Re:Not a problem for software distribution.. (Score:2)
      
      by provolt ( 54870 ) writes:
      
      Yup that makes perfect sense. I took a shower right after I posted the comment and I realized that I was wrong. I came back to correct it, but you beat me to it.
      
      It doesn't give you a warm fuzzy about software distribution, but the attack is not as simple as I described.
    - Re:Not a problem for software distribution.. (Score:2)
      
      by provolt ( 54870 ) writes:
      
      Someone needs to mod up the AC. He has a very good point.
Pretty interesting.... (Score:2)

by vidarlo ( 134906 ) writes:

This is kinda interesting. Well, the user need to use my installer, but it might infect them, and they have no way to tell. But the interesting thing is... A couple of script/programs use md5sum, like rpm and such. How much work is it to change into sha1? AFAIK SHA1 is 160bits, whilst md5 is 128bit, so a bit more space in the rpm is needed. Apart from that?

The solution to all this is gpg signing. I've heard little fuss about that... Yes, it is simply a longer signature, making it more difficult to break,
- Re:Pretty interesting.... (Score:2)
  
  by Alioth ( 221270 ) writes:
  
  CentOS (and I suspect RHEL) uses GPG signing for getting updates and installing new packages from their repositories, so at least for CentOS - they are already using GPG signing. The first CentOS install I did was with 4, so I don't know how long they've been using GPG signatures.
- Re:Pretty interesting.... (Score:2)
  
  by provolt ( 54870 ) writes:
  
  The solution is not GPG signing. Because public key signature algorithms (like RSA) are slow they can't really be used on large files. Signing and verifying a 700 MB iso with RSA would take hours even on the fastest desktop computers.
  
  GPG uses a signature algorithm to sign the hash of the data. If the hash function is strong, then it's infeasable to find another file that has the same hash. (Two different messages with the same hash is called a collision.) Researchers have been able to find collisions i
File Integrity Checkers ? (Score:2)

by amodm ( 876842 ) writes:

I wonder how does this affect the file integrity checkers. A lot of these softwares store hashes and use them to verify if a file has changed.

So the next time someone installs a root kit, he just needs to do it in a way TFA points out.
- Re:File Integrity Checkers ? (Score:4, Insightful)
  
  by ArsenneLupin ( 766289 ) writes: on Friday September 23, 2005 @06:36AM (#13627995)
  
  So the next time someone installs a root kit, he just needs to do it in a way TFA points out.
  Wouldn't work. TFA points out a way to generate to packages, a good one, and an evil twin. Both are constructed by the algorithm.
  It does not however show how to make an evil twin for an existing package. This considerably lessens its danger.
  Think about it: the only guy who could pull this off would be the original author. But then, the original author could achieve that same goal much more easily than by "breaking" MD5.
  
  Parent Share
  twitter facebook
  - Re:File Integrity Checkers ? (Score:2, Interesting)
    
    by Shano ( 179535 ) writes:
    
    Not necessarily - it depends on how much the author is trusted to begin with. Certain types of software are very closely checked by the open source community, and any trojan will be discovered if it exists in the package.
    
    Say I write a bit of security software (for which most people take the time to compare checksums). As a relative nobody, lots of people are going to scrutinise the source code before using it. Any new release will also be checked. Only after it's been scrutinised and built up a reason
    - Re:File Integrity Checkers ? (Score:2)
      
      by ArsenneLupin ( 766289 ) writes:
      
      Not necessarily - it depends on how much the author is trusted to begin with. Certain types of software are very closely checked by the open source community, and any trojan will be discovered if it exists in the package.
      And you somehow think that an inactive, but still present trojan would not be discovered? Unlikely.
      And yes, this is exactly what this so called exploit does: it constructs a package which has both codes present at once (the payload), and some small bit of code that examines the "header"
Yadda Yadda (Score:5, Interesting)

by Effugas ( 2378 ) * writes: on Friday September 23, 2005 @06:19AM (#13627951) Homepage

Two pages, same hashes, etc. (This is the guy who wrote the MD5 someday paper.)

http://www.doxpara.com/t1.html [doxpara.com]
http://www.doxpara.com/t2.html [doxpara.com]

Share
twitter facebook
- - Re:Yadda Yadda (Score:3, Informative)
    
    by Jerf ( 17166 ) writes:
    
    Mods, parent is not insightful, parent is wrong. And it's not hard to understand why the parent has a hard time seeing how MD5 is involved when they have an incorrect idea of how it works.
    
    As a bit of a hint to NightHwk1, extremely carefully check the first bytes of those two pages. They are indeed not identical and do indeed hash to the same MD5 value.
    
    The rest of the "trick" of course hinges on the fact that a single bit change to a Turing Machine can completely alter the resulting output. "Trick" is put in
Actually RPM uses MD5 and SHA1 (Score:5, Informative)

by seifried ( 12921 ) writes: on Friday September 23, 2005 @06:29AM (#13627976) Homepage

RPM uses both MD5 and SHA1, the chances of finding a collision that satisfies both hashes is small, even if both MD5 and SHA1 are compromised since the hash the data differently.

rpm -Kvv xorg-x11-libs-6.8.2-37.FC4.49.2.i386.rpm
D: Expected size: 2655615 = lead(96)+sigs(344)+pad(0)+data(2655175)
D: Actual size: 2655615
D: opening db index /var/lib/rpm/Packages rdonly mode=0x0
D: locked db index /var/lib/rpm/Packages
D: opening db index /var/lib/rpm/Pubkeys rdonly mode=0x0
D: read h# 278 Header sanity check: OK
D: ========== DSA pubkey id b44269d0 4f2a6fd2 (h#278)
./updates-released/packages/xorg-x11-libs-6.8.2-37 .FC4.49.2.i386.rpm:
Header V3 DSA signature: OK, key ID 4f2a6fd2
Header SHA1 digest: OK (f37bf5cb97db696f14133b90e23f2455b9f94587)
MD5 digest: OK (8eda29837b6992876bd867df03b3b8af)
V3 DSA signature: OK, key ID 4f2a6fd2
D: closed db index /var/lib/rpm/Pubkeys
D: closed db index /var/lib/rpm/Packages
D: May free Score board((nil))

Share
twitter facebook
- Re:Actually RPM uses MD5 and SHA1 (Score:2)
  
  by m50d ( 797211 ) writes:
  
  RPM uses both MD5 and SHA1, the chances of finding a collision that satisfies both hashes is small, even if both MD5 and SHA1 are compromised since the hash the data differently.
  Not true, firstly the two come from the same "family" and are related, and secondly most of this type of attacks allow you to generate arbitrarily many files matching the hash almost as easily as two. So you use your MD5 attack to generate a large set (2^60 or however many you need) of RPMs matching the MD5, then you use your SHA1
- Re:Actually RPM uses MD5 and SHA1 (Score:2)
  
  by verbatim_verbose ( 411803 ) writes:
  
  Fedora does one better, and GPG signs packages (through yum). I'm not sure if other distributions do this, but I imagine this will be standard policy for most in the future. This provides a reliable way to tell that the package came from who you think it did (ie. the Redhat/Fedora dev crew), whereas the MD5/SHA1 above mainly just to verify file integrity so that a partially broken package (bad download, etc) doesn't hose your system.
SHA1 will fall soon (Score:2)

by m50d ( 797211 ) writes:

Attacks only get better, a theoretical vulnerability is one to worry about as they are almost always followed by a practical exploit like this. Move away from SHA1 before the same thing happens.
Speaking of hashing algorithms... (Score:3, Informative)

by scovetta ( 632629 ) writes: on Friday September 23, 2005 @06:51AM (#13628041) Homepage

The NIST is having a two-day workshop in Gaitherburg, Maryland (USA) on October 31-Nov 1. Xiaoyun Wang will be giving a keynote speech, and there'll be plenty of technical material to go around. The workshop website is: www.csrc.nist.gov/pki/HashWorkshop/program.htm [nist.gov]. I don't work for NIST or anything, but I thought this was interesting and they haven't really done a good job getting the word out about this conference.

Share
twitter facebook
Don't panic yet (Score:4, Insightful)

by Anonymous Coward writes: on Friday September 23, 2005 @06:56AM (#13628051)

While this is a very serious attack, it doesn't yet mean that every use of MD5 is totally unsafe.

It is feasible now to generate 2 different pieces of data with the same MD5 hash. As many file formats allow one to embed invisible 'junk', it is possible to create a 'good' version and an 'evil' version with the same MD5 hash.

BUT it is NOT (yet) feasible to create a piece of data with a given MD5 hash. This means that if you can not modify the 'good' version, you can't create a matching 'evil' version.

An example of where the usage of MD5 isn't broken are *nix passwords. Your password is hashed with MD5 (a salt is added to your password too, but that's not important here), and that hash value is stored. Anyone who can supply a password (doesn't have to be the same one!) which has the same MD5 hash, is allowed access.

So if it would be feasible to generate data with a given MD5 hash, one could easily generate a matching password, when given the MD5 hash (which you can often easily acquire, especially in NIS/yp environments). But luckily, this is not possible ... yet :-)

So you'd better start protecting those hashes (that's what a shadow password file does), en better yet, move to a better algorithm. Like the Blowfish algorithm that OpenBSD has been using for years now.

Share
twitter facebook
- Re:Don't panic yet (Score:3, Insightful)
  
  by Haeleth ( 414428 ) writes:
  
  An example of where the usage of MD5 isn't broken are *nix passwords. Your password is hashed with MD5 (a salt is added to your password too, but that's not important here)
  
  Actually, the salt is extremely important. A large proportion of unsalted MD5 passwords can be cracked trivially by looking them up in pregenerated databases. MD5 alone is broken for passwords - it's only the salt that makes MD5 passwords still "good enough" for use on low-security machines... for the time being.
On Slashdot.. (Score:5, Funny)

by Anonymous Coward writes: on Friday September 23, 2005 @07:06AM (#13628074)

surprisingly many stories hashes to the same value..

Share
twitter facebook
Multiple Checks (Score:2)

by RAMMS+EIN ( 578166 ) writes:

I wonder why we're not using multiple checks on files. In the ports systems of OpenBSD and NetBSD, checksums using several algorithms are stored for each file. Right now, I think only SHA1 and file size are checked, but I think the feasibility of an attack would be severely reduced if another alrorithm (say RMD160) were used instead. The same could be done with other systems (apt, BitTorrent, ...)
TCP (Score:3)

by RAMMS+EIN ( 578166 ) writes: on Friday September 23, 2005 @07:28AM (#13628113) Homepage Journal

And to think that they're still working on getting MD5 digests in TCP packets...the algorithm could be as useless as the checksums they currently use by the time this change would have become widely accepted (now it's probably never going to happen).

Share
twitter facebook
- Re:TCP (Score:3)
  
  by tomstdenis ( 446163 ) writes:
  
  It's ok as a checksum [e.g. non-malicious modification].
  
  If you need say authentication you need a MAC anyways. As far as I know HMAC-MD5 is not immediately attackable by the known attacks [though I wouldn't use it anyways].
  
  Tom
Ahem... (Score:3, Informative)

by tomstdenis ( 446163 ) writes: <tomstdenis@NoSpam.gmail.com> on Friday September 23, 2005 @07:42AM (#13628150) Homepage

said this before... [gentoo.org]

Dan Kaminsky is actually the dude who came up with the Stripwire idea. ... LAST YEAR.

Tom

Share
twitter facebook
- - Re:Ahem... (Score:2)
    
    by tomstdenis ( 446163 ) writes:
    
    If you're going to modify your system anyways (to use a hidden combo trick) you might as well just swap out a new hash.
    
    In the case of portage that could be done in parallel. E.g. keep using the MD5 hash and introduce a SHA-2 + DSA based signature. Then later on down the road make the switch and make the signatures mandatory.
    
    The problem is the Gentoo folk aren't cryptographers. They did MD5 hashing mostly to make sure you grabbed the right file off some random mirror (e.g. to avoid simple typos and what n
    - - Re:Ahem... (Score:2)
        
        by tomstdenis ( 446163 ) writes:
        
        Why would you concatenate two hashes?
        
        My point is if your going to change the protocol in a NON BACKWARDS COMPATIBLE FASHION why not just change it for a proper hash like SHA-2 or Whirlpool?
        
        Tom
There are some limitations to this attack (Score:5, Informative)

by GekkePrutser ( 548776 ) writes: on Friday September 23, 2005 @07:51AM (#13628171)

As far as I know, the technique used for finding these MD5 collisions, cannot be performed with a GIVEN hash. So it's not possible to create, say, a copy of an already available RPM, add malicious code to it, and easily find some data to add to it to generate the same hash. This is not possible.

The only thing the current 'crack' does is create two RANDOM input files that generate the same hashed output. So it's only useful for someone who can control both the 'original' and the 'malicious' version of the data which is being protected by an MD5 hash.

So the dangers here are kind of limited though you could still do a lot of damage with it.

Share
twitter facebook
Educate me please! (Score:2)

by AchilleTalon ( 540925 ) writes:

I just realize I don't know much about this stuff and when time comes to make a choice of an algorithm over another one, I just don't know which one is the better choice.
Anyone here can provide some links/references to appropriate documentation to educate myself?
Banning MD5 is stupid and small minded (Score:4, Insightful)

by ivan256 ( 17499 ) writes: on Friday September 23, 2005 @09:22AM (#13628469)

It's dumb to ban MD5. It is still, and will continue to be a useful tool in situations where a cursory comparison is needed for reasons other than security. It would be silly, stupid even, to use a larger hash that requires more (slower) computation in these situations. Banning MD5 can mean one of two things: Either it's a stupid publicity stunt that will result in slower code overall, or that Microsoft doesn't trust it's developers to be smart enough to know when a particular hash is appropriate.

Share
twitter facebook
About this attack (Score:4, Insightful)

by Ernesto Alvarez ( 750678 ) writes: on Friday September 23, 2005 @11:13AM (#13629182) Homepage Journal

I've been reading the posts in this thread and I've noticed that there are two types of posters here: the ones who got it 100% right, and the clueless ones (there appear to be little or no posters in the middle ground).

Now, the clueless ones are thinking of lots of "attacks" using this vulnerability, some of them really wrong. Since this has the potential of getting lots of people to do stupid things (like not trusting MD5 when they should), let's talk a little bit about the vulnerability and its effects.

First of all: this is not new. There was an article here explaining the same attack a few months ago (about x.509 certificate collisions and how to fake postscript orders, if you know what I am refering to, please post a link).

The attack goes like this:

You have a block B1 that is known to collide with another block B2.
You have some custom made code that looks like this:

-----BEGIN SNEAKY CODE---------------
If DATA[1] = DATA[2] then
do something good
else
do something bad
end
DATA[1]
DATA[2]
-----END SNEAKY CODE-----------------

The trick is that since there's a collision between B1 and B2 and MD5 makes the hash by reading sequentially, the hash for the whole program will be the same whether you fill DATA[1] and DATA[2] with B1 or B2 (in any combination). Since the code is DESIGNED to do different things depending on the collision area, by changing the contents of DATA[1] and DATA[2] you can have programs that do "good" or "bad" things, with the same hash. Please note it's been DESIGNED with that in mind.

From now on I'll talk on absolute terms, while in reality there is a very small probability of things being right for an attack without being planned that way, so keep in mind that before saying "but that's not the whole truth.....".

Now let's discuss what's possible to do and what's not:

1.Oh no! Now, someone will create a virus that has the same hash than my favorite app!

False: the app (or installer) would have to have been designed with that "feature" in advance.

2.MD5 is worthless and should not be used anymore.

False: MD5 is useless in the situation presented above. There are some very good uses of MD5 that are safe (like access control: this attack does nothing practical to you salted MD5 shadow file). MD5 should probably be watched for other undesirable properties, though. An alternate cryptographically secure function should be kept in reserve.

3.I'll use another hash function, I'll be invulnerable to this attack.

(somewhat) False: You'll be invulnerable until someone finds ONE collision in your new hash function (it might take a long time but....). Then you'll be vulnerable again. But now we all know what can be done with ONE collision. What you're thinking is probably good, but it's no silver bullet.

4.Microsoft will forbid the use of MD5 and DES, and use SHA-1 and AES. We should do the same.

(somewhat) True: Not for the reasons you're thinking though. If MS is really doing this, this attack is a lame excuse to do it. MD5 is still useable for some things, and SHA-1 is not much better than MD5 in the things related to this attack. IIRC these collisions were found using an attack derived from an attack on SHA-1. Right now, SHA-1 collisions can be found in 2^63 operations (and the clock is ticking). We should probably consider using a new hash function someday, but leave the decision to the cryptologists. About AES, it's about time. DES can be brute forced in reasonable time, and that's been like that for a few years. 3DES is slow. That's the reason for the AES contest, we should use since we have it.

5.Someone could distribute some sort of binary and the switch it so it does lots of damage to unsuspecting people.

True: That's exactly what the attack is about. Maybe you were wrong to trust [insert a name here].

6.Who should be doing what and when?

If you work in crypto, you probably k
Read the rest of this comment...

Share
twitter facebook
- Re:filesystems... (Score:5, Informative)
  
  by ArsenneLupin ( 766289 ) writes: on Friday September 23, 2005 @06:33AM (#13627985)
  
  Reiserfs (and lots of other filesystems ... and lots of other system components in general) might use hash functions to speed up lookups. These hash functions however do not need to be cryptographically secure. The hash (which is usually very short, so much that brute force would be feasible) is only used as an index into an array of "buckets". Each bucket may contain multiple files, and the system still uses a bit-by-bit comparison on the full names to find the correct entry.
  The point is to reduce the set among which to do an exhaustive search (one small hash bucket versus all known files on the system), and not to verify some kind of signature.
  Any successful attack on the hash would only be useful to make the system slow and unefficient (by making an excessive number of files end up in one bucket), but cannot corrupt it.
  
  Parent Share
  twitter facebook
- Re:compressed content safe (?) (Score:2)
  
  by ssj_195 ( 827847 ) writes:
  
  There is at least some truth to this, I think: creating a file with not only the same md5 as a given file but also the same file-type (e.g. plain-text if the original is plain-text; a valid .zip file if the original is a .zip file; a valid .iso if the original is a .iso, although I'm not sure on this last example as I don't know how strict the "structure" of a .iso is; etc) is far, far harder. In fact, it could potentially even be the case that there does not even exist a file with both the same hash and t
- Re:compressed content safe (?) (Score:2)
  
  by m50d ( 797211 ) writes:
  
  I thought that the known attacks would not work agains compressed content, because you had to add some "random" data to the content and while decompressing the decompressor would give an error. I know it isn't exact, but I thought that md5's of compressed content was safe.
  I see "gzip: trailing garbage ignored" messages all the time when unpacking tarballs, should I be worrying about them? I know most users won't.
  Anyway, if it's zip compression then by design arbitrary content can be added to the start wit
- Re:compressed content safe (?) (Score:2)
  
  by anthony_dipierro ( 543308 ) writes:
  
  Depends on the form of compression, I suppose, but you could probably create a compressed file where the garbage content uncompressed to other garbage content which was then removed (for instance, a compressed tarball where the garbage content was was in a single file which was subsequently ignored). If the decompressor gives an error while decompressing, then it isn't a very good compression algorithm, because that bit stream could have been dedicated to something valid instead of something invalid).
  
  As a
- Re:compressed content safe (?) (Score:2, Informative)
  
  by Shano ( 179535 ) writes:
  
  Lots of file types allow for arbitrary junk at certain places.
  
  For example, a very basic form of steganography: cat a .zip file to the end of a .gif file. The result is a valid file that can be displayed as an image (which ignores trailing junk), and decompressed with zip (which ignores leading junk).
  Most file formats I've written don't care about junk at the end of the file. It'll be stripped off if you load and then save, but the program won't notice or complain. One program even preserves records it
- Why it IS a problem (Score:4, Insightful)
  
  by Moraelin ( 679338 ) writes: on Friday September 23, 2005 @06:58AM (#13628055) Journal
  
  The problem is that a broken algorithm just makes that piece of social engineering a lot easier.
  
  If I just told you you can download the latest auto-installer of the latest WoW patch from www.i-pwn-ur-puter.ru instead through the slow Blizzard installer, you might think "uh, wtf, I think I'll play it safe anyway and get it directly from Blizzard. I trust them more than I trust a warez and script-kiddie site."
  
  Now picture that I tell you "and here's a link to the MD5 sum on Blizzard's site. You can check for yourself that the the file on our site is the original file and it hasn't been tampered with." In fact, I would even _urge_ you to make a habit to check all your downloads against the original MD5 sums, for your own good.
  
  It already looks a lot safer and more legitimate. Well, maybe not to _you_, but to a lot of people it does.
  
  That's the whole problem. That false sense of security makes the "if we can convince you to run our insecure extractor code" part a helluva lot easier.
  
  Parent Share
  twitter facebook
  - Re:Why it IS a problem (Score:4, Insightful)
    
    by archeopterix ( 594938 ) writes: on Friday September 23, 2005 @08:15AM (#13628229) Journal
    
    Now picture that I tell you "and here's a link to the MD5 sum on Blizzard's site. You can check for yourself that the the file on our site is the original file and it hasn't been tampered with." In fact, I would even _urge_ you to make a habit to check all your downloads against the original MD5 sums, for your own good.
    
    It already looks a lot safer and more legitimate. Well, maybe not to _you_, but to a lot of people it does.
    That's the whole problem. That false sense of security makes the "if we can convince you to run our insecure extractor code" part a helluva lot easier.
    
    The scheme described in the article enables you to: take a good file and generate:
    
    A. Another "good" file (one that generates the same exec while extracting).
    B. A "bad" file.
    Such that hash(A)=hash(B)
    For this scheme to work, you would first have to convince Blizzard to use your "good" file A for distribution (more exactly: computing the published hash). Hey Blizzard, I have a file that extracts the same files as your distribution, only has a different hash value, why don't you replace your file with my file? "Helluva lot" easier than just convincing them to distribute your "bad" file? I don't think so.
    
    Parent Share
    twitter facebook
    - Re:Why it IS a problem (Score:2)
      
      by aaronl ( 43811 ) writes:
      
      No, no you would not have to convince the company to do anything. All you have to do is convince the user that your "bad" file is the same as the company's "good" file because they both compute to the same hash. *That* is why this attack is so annoying and dangerous; you can make the file seem legit by generating a "bad" file with the same checksum.
      
      Once your file appears to be legit, since it matches the checksum on the real site, you can get yourself ranked in search engines. Most people find things by
    - And that might just be easier than you think (Score:2)
      
      by Moraelin ( 679338 ) writes:
      
      For a start, it still massively enlarges the number of people you must trust.
      
      E.g., do you automatically trust everyone working at Blizzard? How about some disgruntled temp employee at the publisher? Do you trust everyone at the company who made the install program too? Etc.
      
      Due to the whole "if H(A) = H(B), then H(A+Q) = H(A+Q)", they don't even have to convince Blizzard of anything. As long as I control the A and B versions of the self-extractor module, Blizzard's own content is the Q in that equation. It
- Re:Ah, but ..... (Score:2)
  
  by chefren ( 17219 ) writes:
  
  If the MD5sums for two files don't match, then you can still say the files definitely are not the same. So it's not entirely useless.
  
  But if they DO match? What then? Then you can't say anything. A hashing algorithm really is fairly useless for security once its broken.
- Re:Considered Harmful Considered Harmful (Score:2)
  
  by m50d ( 797211 ) writes:
  
  Already been done, as has "Djikstra considered harmful", shortly after the original one.
- Re:WTF mate? (Score:2)
  
  by RPoet ( 20693 ) writes:
  
  Not a SINGLE post about Bittorrent (which uses MD5) and the **AA's?
  
  May be related to the fact that BitTorrent uses SHA-1, not MD5.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Checksums are always going to be vulnerable (Score:3, Insightful)

Re:Checksums are always going to be vulnerable (Score:5, Insightful)

Re:Checksums are always going to be vulnerable (Score:4, Interesting)

Re:Checksums are always going to be vulnerable (Score:5, Informative)

Re:Checksums are always going to be vulnerable (Score:3, Informative)

Re:Checksums are always going to be vulnerable (Score:5, Informative)

Re:Checksums are always going to be vulnerable (Score:2)

Re:Checksums are always going to be vulnerable (Score:3, Informative)

Re:Checksums are always going to be vulnerable (Score:3, Informative)

Re:Checksums are always going to be vulnerable (Score:3, Insightful)

Re:Checksums are always going to be vulnerable (Score:4, Informative)

Re:Checksums are always going to be vulnerable (Score:2)

Re:Checksums are always going to be vulnerable (Score:2, Insightful)

Re:Checksums are always going to be vulnerable (Score:2)

Re:Checksums are always going to be vulnerable (Score:3, Insightful)

Re:Checksums are always going to be vulnerable (Score:2, Informative)

Re:Checksums are always going to be vulnerable (Score:2)

Re:Checksums are always going to be vulnerable (Score:3, Informative)

Re:Checksums are always going to be vulnerable (Score:3, Interesting)

Re:Checksums are always going to be vulnerable (Score:5, Informative)

Re:Checksums are always going to be vulnerable (Score:2, Interesting)

-1, Clueless (Score:2)

So if you need a freely available hash algorithm (Score:5, Informative)

Re:So if you need a freely available hash algorith (Score:2, Insightful)

Re:So if you need a freely available hash algorith (Score:5, Informative)

Secure is as secure does, sir. (Score:2)

Re:So if you need a freely available hash algorith (Score:2)

Re:So if you need a freely available hash algorith (Score:2)

Re:So if you need a freely available hash algorith (Score:3, Funny)

Whirlpool is not based on AES (Score:3, Informative)

Re:So if you need a freely available hash algorith (Score:3, Informative)

Re:So if you need a freely available hash algorith (Score:2)

hashtrust (Score:4, Interesting)

Re:hashtrust (Score:2)

Re:hashtrust (Score:2)

Re:hashtrust (Score:3, Insightful)

Re:hashtrust (Score:2, Interesting)

Re:hashtrust (Score:2)

But... (Score:2, Interesting)

Re:But... (Score:2)

Re:But... (Score:2)

M$ Antihash (Score:5, Funny)

Re:M$ Antihash (Score:2)

Re:M$ Antihash (Score:3, Funny)

A quick note (Score:5, Informative)

Re:A quick note (Score:3, Funny)

Re:A quick note (Score:2)

Re:A quick note (Score:4, Informative)

H(x) == H(y) - H(x + q) == H(y + q) ? (Score:3, Informative)

Re:H(x) == H(y) - H(x + q) == H(y + q) ? (Score:5, Insightful)

Re:H(x) == H(y) - H(x + q) == H(y + q) ? (Score:2)

Said it before, saying it again (Score:2)

Merkle Damgard: Why len(x) and len(y) matter (Score:2, Informative)

Re:H(x) == H(y) - H(x + q) == H(y + q) ? (Score:5, Interesting)

Re:H(x) == H(y) - H(x + q) == H(y + q) ? (Score:2)

Re:H(x) == H(y) - H(x + q) == H(y + q) ? (Score:2)

Re:H(x) == H(y) - H(x + q) == H(y + q) ? (Score:2)

Not a problem for software distribution.. (Score:4, Interesting)

Re:Not a problem for software distribution.. (Score:3, Insightful)

Re:Not a problem for software distribution.. (Score:2)

Re:Not a problem for software distribution.. (Score:2)

Pretty interesting.... (Score:2)

Re:Pretty interesting.... (Score:2)

Re:Pretty interesting.... (Score:2)

File Integrity Checkers ? (Score:2)

Re:File Integrity Checkers ? (Score:4, Insightful)

Re:File Integrity Checkers ? (Score:2, Interesting)

Re:File Integrity Checkers ? (Score:2)

Yadda Yadda (Score:5, Interesting)

Re:Yadda Yadda (Score:3, Informative)

Actually RPM uses MD5 and SHA1 (Score:5, Informative)

Re:Actually RPM uses MD5 and SHA1 (Score:2)

Re:Actually RPM uses MD5 and SHA1 (Score:2)

SHA1 will fall soon (Score:2)

Speaking of hashing algorithms... (Score:3, Informative)

Don't panic yet (Score:4, Insightful)

Re:Don't panic yet (Score:3, Insightful)

On Slashdot.. (Score:5, Funny)

Multiple Checks (Score:2)

TCP (Score:3)