I'm looking at hashing algorithms, but couldn't find an answer.
- Bcrypt uses Blowfish
- Blowfish is better than MD5
- Q: but is Blowfish better than SHA512?
I want to clarify that I understand the difference between hashing and encryption. What prompted me to ask the question this way is this article, where the author refers to bcrypt as "adaptive hashing"
Since bcrypt is based on Blowfish, I was led to think that Blowfish is a hashing algorithm. If it's encryption as answers have pointed out, then seems to me like it shouldn't have a place in this article. What's worse is that he's concluding that bcrypt is the best. What's also confusing me now is that the phpass class (used for password hashing I believe) uses bcrypt (i.e. blowfish, i.e. encryption). Based on this new info you guys are telling me (blowfish is encryption), this class sounds wrong. Am I missing something?
It should suffice to say whether bcrypt or SHA-512 (in the context of an appropriate algorithm like PBKDF2) is good enough. And the answer is yes, either algorithm is secure enough that a breach will occur through an implementation flaw, not cryptanalysis.
If you insist on knowing which is "better", SHA-512 has had in-depth reviews by NIST and others. It's good, but flaws have been recognized that, while not exploitable now, have led to the the SHA-3 competition for new hash algorithms. Also, keep in mind that the study of hash algorithms is "newer" than that of ciphers, and cryptographers are still learning about them.
Even though bcrypt as a whole hasn't had as much scrutiny as Blowfish itself, I believe that being based on a cipher with a well-understood structure gives it some inherent security that hash-based authentication lacks. Also, it is easier to use common GPUs as a tool for attacking SHA-2–based hashes; because of its memory requirements, optimizing bcrypt requires more specialized hardware like FPGA with some on-board RAM.
Note: bcrypt is an algorithm that uses Blowfish internally. It is not an encryption algorithm itself. It is used to irreversibly obscure passwords, just as hash functions are used to do a "one-way hash".
Cryptographic hash algorithms are designed to be impossible to reverse. In other words, given only the output of a hash function, it should take "forever" to find a message that will produce the same hash output. In fact, it should be computationally infeasible to find any two messages that produce the same hash value. Unlike a cipher, hash functions aren't parameterized with a key; the same input will always produce the same output.
If someone provides a password that hashes to the value stored in the password table, they are authenticated. In particular, because of the irreversibility of the hash function, it's assumed that the user isn't an attacker that got hold of the hash and reversed it to find a working password.
Now consider bcrypt. It uses Blowfish to encrypt a magic string, using a key "derived" from the password. Later, when a user enters a password, the key is derived again, and if the ciphertext produced by encrypting with that key matches the stored ciphertext, the user is authenticated. The ciphertext is stored in the "password" table, but the derived key is never stored.
In order to break the cryptography here, an attacker would have to recover the key from the ciphertext. This is called a "known-plaintext" attack, since the attack knows the magic string that has been encrypted, but not the key used. Blowfish has been studied extensively, and no attacks are yet known that would allow an attacker to find the key with a single known plaintext.
So, just like irreversible algorithms based cryptographic digests, bcrypt produces an irreversible output, from a password, salt, and cost factor. Its strength lies in Blowfish's resistance to known plaintext attacks, which is analogous to a "first pre-image attack" on a digest algorithm. Since it can be used in place of a hash algorithm to protect passwords, bcrypt is confusingly referred to as a "hash" algorithm itself.
Assuming that rainbow tables have been thwarted by proper use of salt, any truly irreversible function reduces the attacker to trial-and-error. And the rate that the attacker can make trials is determined by the speed of that irreversible "hash" algorithm. If a single iteration of a hash function is used, an attacker can make millions of trials per second using equipment that costs on the order of $1000, testing all passwords up to 8 characters long in a few months.
If however, the digest output is "fed back" thousands of times, it will take hundreds of years to test the same set of passwords on that hardware. Bcrypt achieves the same "key strengthening" effect by iterating inside its key derivation routine, and a proper hash-based method like PBKDF2 does the same thing; in this respect, the two methods are similar.
So, my recommendation of bcrypt stems from the assumptions 1) that a Blowfish has had a similar level of scrutiny as the SHA-2 family of hash functions, and 2) that cryptanalytic methods for ciphers are better developed than those for hash functions.
Blowfish is not a hashing algorithm. It's an encryption algorithm. What that means is that you can encrypt something using blowfish, and then later on you can decrypt it back to plain text.
SHA512 is a hashing algorithm. That means that (in theory) once you hash the input you can't get the original input back again.
They're 2 different things, designed to be used for different tasks. There is no 'correct' answer to "is blowfish better than SHA512?" You might as well ask "are apples better than kangaroos?"
If you want to read some more on the topic here's some links:
Blowfish isn't better than MD5 or SHA512, as they serve different purposes. MD5 and SHA512 are hashing algorithms, Blowfish is an encryption algorithm. Two entirely different cryptographic functions.
I agree with erickson's answer, with one caveat: for password authentication purposes, bcrypt is far better than a single iteration of SHA-512 - simply because it is far slower. If you don't get why slowness is an advantage in this particular game, read the article you linked to again (scroll down to "Speed is exactly what you don’t want in a password hash function.").
You can of course build a secure password hashing algorithm around SHA-512 by iterating it thousands of times, just like the way PHK's MD5 algorithm works. Ulrich Drepper did exactly this, for glibc's crypt(). There's no particular reason to do this, though, if you already have a tested bcrypt implementation available.
I would recommend Ulrich Drepper's SHA-256/SHA-512 based crypt implementation.
We ported these algorithms to Java, and you can find a freely licensed version of them at ftp://ftp.arlut.utexas.edu/java_hashes/.
Note that most modern (L)Unices support Drepper's algorithm in their /etc/shadow files.