A Competition To Replace SHA-1
SHA who? writes "In light of recent attacks on SHA-1, NIST is preparing for a competition to augment and revise the current Secure Hash Standard. The public competition will be run much like the development process for the Advanced Encryption Standard, and is expected to take 3 years. As a first step, NIST is publishing draft minimum acceptability requirements, submission requirements, and evaluation criteria for candidate algorithms, and requests public comment by April 27, 2007. NIST has ordered Federal agencies to stop using SHA-1 and instead to use the SHA-2 family of hash functions."
Draft location (Score:5, Informative)
Schneier Proposed this in 2005 (Score:5, Informative)
Re:Generic hashing is impractical (Score:4, Informative)
The idea is that, in a good hash function, each input bit affects all the output bits more or less equally. This is especially true of cryptographic hashes, and for a good reason. The stronger the correlations between input and output, the weaker the hash function.
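Here's a minimal Python sketch of that avalanche property, using SHA-256 as the example hash: flip a single input bit and roughly half of the 256 output bits should change.

import hashlib

def bit_diff(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg = bytearray(b"The quick brown fox jumps over the lazy dog")
h1 = hashlib.sha256(bytes(msg)).digest()

msg[0] ^= 0x01  # flip one input bit
h2 = hashlib.sha256(bytes(msg)).digest()

# For a good hash, roughly half the output bits (~128 of 256) should differ.
print(bit_diff(h1, h2))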
Re:Generic hashing is impractical (Score:5, Informative)
There are a number of different types of collisions as well. Let's assume we have a 256-bit hash. There is the kind of collision where you just find *any* two strings that produce the same hash, which should require on average 2**128 "operations". A harder task is, given a string and its hash, to find another string with the same hash. For a secure 256-bit hash function this will require on average 2**256 "operations".
There are other properties that are important as well. It's a well-established idea: hashes are very, very useful and are used for a lot more than file verification, and we know what properties they need. We are just not very good at producing very good hashes yet.
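To see the birthday effect behind that 2**128 figure, here's a toy Python sketch that truncates SHA-256 to 24 bits; a collision turns up after roughly 2**12 tries rather than 2**24 (the full 256-bit function is what needs 2**128).

import hashlib

def h24(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()[:3]  # 24-bit toy hash

seen = {}
i = 0
while True:
    msg = str(i).encode()
    d = h24(msg)
    if d in seen:
        print(f"collision after {i + 1} tries: {seen[d]!r} and {msg!r}")
        break
    seen[d] = msg
    i += 1
# Expect a hit after on the order of 2**12 = 4096 tries, per the birthday bound.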
Re:Hash functions in common protocols (Score:4, Informative)
Multiple Hash Functions (Score:2, Informative)
By the way, IIRC, OpenBSD and NetBSD include multiple hashes per archive in their ports trees, but use only one for verification.
Re:Generic hashing is impractical (Score:3, Informative)
A hash is a signature of the file; it's designed to give good confidence that a given file you have been supplied matches the one you think was supplied.
The theory is that creating a file that is the same length as the original, is not corrupt (e.g. a zip file still unzips, an executable still runs, a PDF still displays), is different from the original, but still hashes to the same value should be infeasible (not impossible; most cryptography doesn't aim for impossible, and "not practical within a given time frame" is sufficient for most needs).
Another use of hashes is on data storage systems, especially backup systems, where two files with the same hash and length are treated as the same file (so there's no need to write it to tape twice). This way you only have to sort the list of hashes and look for matches, rather than having to diff every file against every other one.
Personally I think I'd rather binary-diff matching hashes just to be safe, but that's time-intensive. The chance of two different files having the same size and SHA-256 hash is less than the chance of your storage device being destroyed (meteorite, fire, flood, plane) before you are able to back up either file.
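A minimal sketch of that backup-dedup idea in Python (the function names and the "written to tape" step are made up for illustration; this is not any particular backup tool):

import hashlib
import os

def file_key(path: str) -> tuple:
    """Key files by (size, SHA-256) as the parent describes."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return (os.path.getsize(path), h.hexdigest())

stored = {}  # (size, digest) -> canonical path

def backup(path: str):
    key = file_key(path)
    if key in stored:
        print(f"{path}: duplicate of {stored[key]}, not written again")
    else:
        stored[key] = path
        print(f"{path}: written to tape")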
Re:Leadtime for security: Is it too late? (Score:3, Informative)
Re:Multiple Hash Functions (Score:5, Informative)
http://www.mail-archive.com/cryptography@metzdowd
Re:Generic hashing is impractical (Score:4, Informative)
This has demonstrated a cryptographic weakness, and there could well be more; look at the research over the years on weakening MD5. Moving to a different algorithm would therefore be advisable.
It doesn't mean you are going to be able to find a collision in any practical amount of time, but it did lower the bar, and by enough that people wanting high-grade protection should switch to a more secure algorithm.
Context-specific data has no place in a hash; it would only weaken it.
Re:How about SHA-512? (Score:3, Informative)
SHA-256 and SHA-512 are different hash functions (same basic design though). On 32-bit boxes SHA-256 is faster, and on 64-bit boxes SHA-512 is faster.
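If you want to check that claim on your own box, a rough (and very platform-dependent) hashlib benchmark might look like this; treat the numbers as a sketch, not a definitive comparison.

import hashlib
import time

data = b"x" * (16 * 1024 * 1024)  # 16 MiB of input

for name in ("sha256", "sha512"):
    start = time.perf_counter()
    hashlib.new(name, data).digest()
    elapsed = time.perf_counter() - start
    print(f"{name}: {len(data) / elapsed / 1e6:.1f} MB/s")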
There is little point to 224 or 384; they're there just for completeness (e.g. to comply with some specs that don't allow arbitrary truncation of a hash).
Tom
Re:Multiple Hash Functions (Score:4, Informative)
1) Would multiple hash functions be harder to fool (i.e. make the system think you got the original, but it's actually a forgery) than one hash function that generated as many bits?
No. In fact, multiple hash functions can perform worse:
"Joux then extended this argument to point out that attempts to increase the security of hash functions by concatenating the outputs of two independent functions don't actually increase their theoretical security. For example, defining H(x) = SHA1(x) || RIPEMD160(x) still gives you only about 160 bits of strength, not 320 as you might have hoped. The reason is because you can find a 2^80 multicollision in SHA1 using only 80*2^80 work at most, by the previous paragraph. And among all of these 2^80 values you have a good chance that two of them will collide in RIPEMD160. So that is the total work to find a collision in the construction."
2) Does using multiple hash functions protect you against the case where one of them gets broken?
Basically, yes. Just note that your total security is no better than the security of the best hash function (as explained in point 1).
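The arithmetic behind the quoted argument is easy to check; here's a quick Python sketch of the birthday-bound numbers (back-of-the-envelope only):

from math import log2

# Among 2**80 values, the expected number of RIPEMD-160 collisions is about
# C(2**80, 2) / 2**160 ~ 0.5, so a collision is likely.
n_values = 2.0 ** 80           # multicollision set built from SHA-1
pairs = n_values * (n_values - 1) / 2
expected = pairs / 2.0 ** 160  # pairs that collide in a 160-bit hash
print(f"expected RIPEMD-160 collisions: {expected:.2f}")  # ~0.50
print(f"log2(total work): ~{log2(80 * 2.0 ** 80):.1f}")   # ~86.3, nowhere near 320 bits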
Re:Generic hashing is impractical (Score:2, Informative)
The point is that you can verify that data is correct with a good amount of confidence, from a relatively small hash code. So I can download a lot of data through, say, bittorrent, and despite the fact that I don't necessarily trust the people I actually download from, I can verify that the hash is right and therefore I am confident that the data I receive is what the original seeder put out: no-one's decided to play games and (say) sneak their CC number grabber into the data.
So what you want is an algorithm which is reasonably easy to run, which SHA-1 is, but where it is not easy to find a collision. For example, if my hash code were simply the total byte sum modulo 1000, then while it would almost certainly catch accidental errors in data, it would be very easy for an attacker to splice his CC number grabber into your data and then fiddle the byte sum back to where it should be.
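To make that concrete, here's a toy Python sketch of the fix-up attack on such a byte-sum checksum (the payload string is just a stand-in):

def checksum(data: bytes) -> int:
    return sum(data) % 1000

original = b"innocent program data"
payload = b"EVIL-CC-GRABBER"
tampered = original + payload

# Append filler bytes until the sum matches again; always possible here,
# since each extra byte can add any value from 0 to 255.
deficit = (checksum(original) - checksum(tampered)) % 1000
tampered += bytes([255] * (deficit // 255) + [deficit % 255])

assert checksum(tampered) == checksum(original)
print("checksum unchanged, payload inserted")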
Your idea pretty clearly shows you have no idea what hashes are used for: there is no point preserving the data structure; it takes a lot of extra space and gives virtually no security. For example, SHA-1 produces a 20-byte hash. I can put something that size up on my personal website without getting huge bandwidth charges even if millions of people want to download it, and then I can distribute my 1GB zipfile by way of people I don't necessarily trust (but who have more bandwidth than I do), and still the eventual recipients can be confident that what they receive is what I sent out.
If I include the virtual FAT table of this zipfile, my hash size goes up by about 500,000 percent (literally), and so do my bandwidth charges. And I get virtually no extra security, because all an attacker has to do beyond finding an SHA-1 collision is ensure that the change doesn't affect the FAT table: i.e. he replaces some suitable virtual file of mine with one of his, keeps the name and size the same, and he's done.
Re:Multiple Hash Functions (Score:2, Informative)
Re:Leadtime for security: Is it too late? (Score:4, Informative)
It's not a practical attack because 2^63 is still a huge number.
It's not a "find a collision to a known string" attack which would be stage 2.
It's not a "find a collision to a known string by appending to a fixed string" attack which would be stage 3.
It is a scratch in the armor that raises doubt about whether there are more powerful attacks, nothing more.
There are strong alternatives like SHA-512 and Whirlpool (AES-based) which it is possible to use today; if you're paranoid, more is better. Is it urgent? Not really; even a practical stage 1 or stage 2 attack would just be "stuff breaks, files corrupt, migrate away". The only one with really nasty consequences is stage 3, with code injection attacks in software and such.
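For reference, switching is a one-liner with Python's hashlib. Note that SHA-512 is always built in, while Whirlpool is only available if the underlying OpenSSL build provides it, so this sketch probes for it rather than assuming it:

import hashlib

data = b"some file contents"
print("sha512:", hashlib.sha512(data).hexdigest())

# Whirlpool support depends on the OpenSSL build behind hashlib.
if "whirlpool" in hashlib.algorithms_available:
    print("whirlpool:", hashlib.new("whirlpool", data).hexdigest())
else:
    print("whirlpool: not provided by this OpenSSL build")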
Re:Multiple Hash Functions (Score:3, Informative)
md5sum foo -> 4f1cbee4972934c3beccc902f18242a7
sha1sum foo -> 3c92a387f898a31d2e8af31caff27c0f8f7a5a3a
md5sha1sum foo -> 4f1cbee4972934c3beccc902f18242a73c92a387f898a31d2e8af31caff27c0f8f7a5a3a
That should definitely not weaken anything; it will require some more CPU and storage, but that's it.
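Since md5sha1sum isn't a standard tool, here's a minimal Python sketch of what that concatenation would look like:

import hashlib
import sys

def md5sha1sum(path: str) -> str:
    """Concatenate the MD5 and SHA-1 hex digests of a file."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest() + sha1.hexdigest()  # 32 + 40 = 72 hex chars

if __name__ == "__main__":
    print(md5sha1sum(sys.argv[1]))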
Re:Leadtime for security: Is it too late? (Score:3, Informative)
"Security through obscurity" means trying to depend on indefensible secrets. The classic example from 19th century crypto theory is that it's stupid to try to keep your crypto algorithm secret, so you should keep keys secret instead.
Security through obscurity leads to worldwide breaks when it fails.
The existing secure hashes have nothing obscure about them. The algorithms are published and open for review. The fact that they're vulnerable to brute force is not being hidden and is the same problem that all the workhorse encryption algorithms have.
"Security through obscurity" would be trying to hide the fact that there's a work factor reduction attack and hoping that nobody rediscovered it.
Re:Schneier Proposed this in 2005 (Score:1, Informative)
Re:Schneier Proposed this in 2005 (Score:3, Informative)
The thing is, these kinds of contests take money and time to get running, and (at least initially) NIST didn't have the resources to get a competition going. So what they did was organize a hash workshop for Halloween 2005, and they had a second one last August following the Crypto conference, where initial planning for the contest took place (a workshop that Schneier didn't bother to attend; I guess he had yet another book to sell).
Re:Wrong (Score:4, Informative)
As for the Chinese attacks, they haven't shown any real applicability to SHA-2 as of yet.
Re:Generic hashing is impractical (Score:2, Informative)