Someone pointed me to this pretty good article about password hashing and HMAC. I must admit that before reading it, I was barely familiar with HMAC. I am, however, familiar enough with one-way hash functions to comment a bit on the article, and since I feel it draws some incorrect conclusions, I also feel I should try to correct them.

The first one is "So the moral of the story is that hashing the secret password directly is a bad idea." I disagree: hashing passwords, with a salt, is an appropriate measure to keep passwords from being compromised in case of a break-in, even if the salt is known. As mentioned in the article, you cannot compute the original password from the hash, even if you have the salt, so it should be made clear that using hash functions with salts is a necessary and sufficient security measure for storing (as opposed to communicating) passwords for login authentication. The reason is that the hash in a password file is there solely to hide the original password, not to ensure authentication (which is done over the terminal, and is vulnerable to other attacks, outside the realm of cryptography). So hashing secrets is good in this case, but you should use a salt.
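To make the salted scheme concrete, here is a minimal sketch of storing and checking a password that way. The function names (new_password_record, check_password) and the choice of SHA-256 with a 16-byte salt are my own assumptions for illustration, not something taken from the article; an existing library would normally do this for you and typically adds more on top.

```python
import hashlib
import hmac
import os


def new_password_record(password: str) -> tuple:
    """Return (salt, digest) to store in place of the password itself."""
    # Hypothetical helper: a fresh random salt per password, stored in the clear.
    salt = os.urandom(16)
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest


def check_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Recompute the salted hash of the attempt and compare it to the stored one."""
    candidate = hashlib.sha256(salt + password.encode("utf-8")).digest()
    # compare_digest compares in constant time; a plain == would also be correct.
    return hmac.compare_digest(candidate, stored_digest)


salt, digest = new_password_record("correct horse")
print(check_password("correct horse", salt, digest))  # True
print(check_password("wrong guess", salt, digest))    # False
```

Only the salt and the digest end up in the password file, and the same password hashed twice gives two different records because the salts differ.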

Hence the general conclusion "don't hash secrets" is a bit overstated: just hashing secrets is good in certain circumstances. When a hash is used as an integrity and authentication scheme, however, especially when communicating over an untrusted network, it's clear that hashes are not sufficient, but for reasons that are beyond the scope of hash functions themselves. Basically, if Alice sends Bob message M with hash H, and lets Bob verify the message by hashing M, Bob is vulnerable to a simple attack by Eve, if she can tamper with the message. Eve can just inject her own message N instead of M, compute a new digest for it, say J, and then communicate both to Bob instead of the original M,H pair. Bob will then hash the message N, get the digest J, and wrongly assume that the message wasn't tampered with.
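The substitution is easy to sketch. In the snippet below the messages, the choice of SHA-256 and the helper names digest and bob_accepts are all made up for illustration:

```python
import hashlib


def digest(message: bytes) -> str:
    """Plain, unkeyed hash of a message."""
    return hashlib.sha256(message).hexdigest()


def bob_accepts(message: bytes, received_digest: str) -> bool:
    """Bob's check: hash what he received and compare with the digest he received."""
    return digest(message) == received_digest


# Alice sends (M, H).
m = b"Pay Bob 10 dollars"
h = digest(m)

# Eve replaces the pair with (N, J). No secret is needed to compute J.
n = b"Pay Eve 1000 dollars"
j = digest(n)

print(bob_accepts(m, h))  # True: the honest pair verifies
print(bob_accepts(n, j))  # True: so does Eve's pair
print(bob_accepts(n, h))  # False: only a mismatched pair is caught
```

The check only catches a message that was changed without its digest being recomputed, which is exactly what Eve avoids.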

HMAC resolves that by adding a shared secret key between Bob and Alice. That key is used in the generation of the hash H, which makes it impossible for Eve to generate a valid hash because she lacks the key Alice and Bob share.
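Here is a small sketch of that idea using the hmac module from the Python standard library. The key, the messages and the helper names sign and verify are invented for the example, and HMAC-SHA256 is just one possible choice of underlying hash:

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag with the shared key and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


# Assumed to be known only to Alice and Bob.
shared_key = b"known only to Alice and Bob"

m = b"Pay Bob 10 dollars"
tag = sign(shared_key, m)

# Eve does not have the key, so the best she can do is sign with a guess.
n = b"Pay Eve 1000 dollars"
forged_tag = sign(b"Eve's guess at the key", n)

print(verify(shared_key, m, tag))         # True
print(verify(shared_key, n, forged_tag))  # False
print(verify(shared_key, n, tag))         # False
```

Unlike a plain hash, Eve cannot produce a tag that Bob will accept for her own message, because every tag depends on the key.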

Finally, I am also surprised to hear that there are pre-image attacks against SHA. The ones that are documented (that I could find) are meet-in-the-middle attacks with "a time complexity of 2^253.5 and space complexity of 2^16" (ref). This basically means that, as far as I know, you cannot derive SHA1(secret || message || ANYTHING) from SHA1(secret || message) in any reasonable time. Then again, maybe I am wrong and misunderstanding the protocol: I'm not a cryptographer. :) (Update: indeed, this is not about a pre-image attack, it's just the way SHA1 works: the algorithm is incremental and processes its input in fixed-size blocks. So what is said is true, but only if the original message is aligned with block boundaries!)
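The incremental behaviour mentioned in the update can be observed with Python's hashlib. This sketch only shows that a hash computation can be resumed from its internal state; it keeps that state in memory through hashlib's own copy() method and does not rebuild it from a published digest, which is what an attacker would actually have to do. The byte strings are placeholders.

```python
import hashlib

secret_and_message = b"secret" + b"message"
extension = b"ANYTHING"

# Hashing everything in one go.
one_shot = hashlib.sha1(secret_and_message + extension).hexdigest()

# Hashing the first part, then resuming from the saved state.
state = hashlib.sha1()
state.update(secret_and_message)
resumed = state.copy()      # snapshot of the state after secret || message
resumed.update(extension)   # continue hashing without seeing the first part again

print(resumed.hexdigest() == one_shot)  # True
```

Whoever holds the state after secret || message can keep hashing without knowing what came before, which is the property the construction SHA1(secret || message) runs into.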

Still, it should be made clear that this attack (if it exists) is designed so that the original message is not tampered with other than by being appended to. So yes: Eve could impersonate Alice here by appending an arbitrary message to the original one. But that doesn't matter, because she doesn't even need to go to all that trouble: if Eve can tamper with the message at all, she can simply inject an entirely new message with an entirely new hash in place of the original. That's the way hash functions work, and it's the reason why MD5 sums associated with archives distributed over the internet are useful to check the integrity of the downloaded file, but will not protect against a malicious attacker. For that, a keyed scheme using a shared secret (or a public key system like PGP) is required to ensure a trust path.

Otherwise it's an interesting article and a good eye-opener, but I prefer it when people understand what the heck is going on instead of relying on overly simple rules of thumb. I encourage people to read about HMAC and hash functions to really understand how they work.

People looking to implement hashing in their application should try reading the theory, then look at this example (the UNIX password scheme), and preferably use an existing library like phpass for PHP. For more information on HMAC, I recommend the Wikipedia article on HMAC. I am not sufficiently familiar with HMAC to suggest an existing library, unfortunately.
