I was wondering about nullok; from the man pages it seemed that it might be a good idea, but I am confused about whether it increases or decreases security. So I guess the edited line should read
Code:
password [success=1 default=ignore] pam_unix.so nullok obscure sha512 rounds=1000
The ancient default line: this is what I found when I looked at my Debian oldstable (Lenny) system, which I am currently preparing to replace. That system is about 1.5 years old.
I installed mcrypt about one month ago. That seems to have replaced the man page for "crypt" with the man page for "mcrypt", so I can't read the notes mentioned above.
I have been wondering whether, by installing some extra crypto stuff (from Lenny repos only) over the past year, such as mcrypt, I have somehow caused my system to become confused about what crypto it knows, thus making it default to md5, or even worse, to unsalted, un-iterated md5, or even worse, to the oldest default, unix crypt. And I can't figure out how to check.
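One way to check, I think: the scheme actually in use for each account can be read off the prefix of the second field of /etc/shadow ($1$ for md5 crypt, $5$ for sha256, $6$ for sha512, a bare 13-character string for old unix crypt). A rough sketch in Python, run as root; the prefix table below is just the usual crypt(3) identifiers as I remember them, so treat this as a sketch rather than gospel:
Code:
#!/usr/bin/env python3
# Rough check of which crypt(3) scheme each account in /etc/shadow uses.
# Run as root, e.g.:  sudo python3 shadow_check.py

PREFIXES = [
    ("$6$", "sha512 crypt"),
    ("$5$", "sha256 crypt"),
    ("$1$", "md5 crypt"),
    ("$2a$", "bcrypt"),
]

with open("/etc/shadow") as shadow:
    for line in shadow:
        user, field = line.split(":")[:2]
        if field in ("", "*") or field.startswith("!"):
            scheme = "locked / no password"
        else:
            scheme = "unrecognised"
            for prefix, name in PREFIXES:
                if field.startswith(prefix):
                    scheme = name
                    break
            if scheme == "unrecognised" and "$" not in field and len(field) == 13:
                scheme = "traditional unix crypt (DES-based)"
            if "rounds=" in field:
                scheme += ", with an explicit rounds= parameter"
        print("%-12s %s" % (user, scheme))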
Another point which worries me: the number 1000 seems to have been pulled out of a hat, and I don't understand why it was chosen. It has small prime factors, which just might be a problem. Crudely, I understand a one-way hash function to be a mapping from the keyspace into a small subset of the keyspace, which should behave like a "generic" such function, so that it is hard for attackers to systematically reverse engineer hash values. So I want results describing a "typical" function from a large keyspace into a (possibly smaller) subset of it.
For comparison: A reference book quotes some results for a randomly chosen function from a message space to itself, with no restriction on the size of the output (hashes):
- We can picture it as a directed graph consisting of cycles with trees attached to various points on each cycle.
- For a message space of size 2^n, there should be about (n/2) log 2 cycles,
- About sqrt(pi)*2^(n/2) output words should belong to some cycle,
- A fraction of about 1/e of the output words should not be the "hash" of any input word; iterating our "hash function" should shrink the variety of words we can obtain to a fraction something like this (a small numeric sketch follows the list):
- 1 round: 0.63212
- 10 rounds: 0.15024
- 100 rounds: 0.01935
- 1000 rounds: 0.00199
(Oddly enough, another Wikipedia article offers proofs of these facts, but I have lost the link.)
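For what it is worth, numbers of roughly that shape can be reproduced from the usual heuristic for the image of an iterated random function: if f_k is the fraction of the space still reachable after k rounds, then f_1 = 1 - 1/e and f_(k+1) = 1 - exp(-f_k). A quick Python sketch (my convention may not be exactly the book's, so the later digits will differ slightly):
Code:
import math

# Heuristic recurrence for the expected fraction of the space still
# reachable after k rounds of a "random" function:
#   f_1 = 1 - 1/e,   f_(k+1) = 1 - exp(-f_k)
def image_fraction(rounds):
    f = 1.0
    for _ in range(rounds):
        f = 1.0 - math.exp(-f)
    return f

for k in (1, 10, 100, 1000):
    print("%5d rounds: %.5f" % (k, image_fraction(k)))
For large k this fraction behaves like roughly 2/k, which is presumably why the 100- and 1000-round figures look like 2/100 and 2/1000.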
If we pretend that our unhashed passwords are always 128 bits long, and that our hash function outputs 128-bit hashes but takes some inputs to the same output, then as I understand it the idea is that the hash function should offer no particular "special structure" an attacker can exploit, so it should behave like a "randomly chosen" function, whose "expected" behavior should be roughly as just described.
One of my concerns: if our hash function behaves like a "generic function", it seems possible that iterating 1000 times might often land us in a small cycle, which would mean further iteration is not having the expected effect; and if the length of that cycle happens to be divisible by 2 or 5 (the small prime factors of 1000), that might further assist an attacker.
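To get a feel for how long those cycles actually are, here is a toy experiment I would try: iterate a deliberately tiny "hash" (SHA-256 truncated to 3 bytes, so a space of 2^24 points) and measure the tail and cycle lengths with Floyd's cycle-finding algorithm. This is only meant to illustrate the "generic random function" picture; as far as I understand it, real sha512crypt also mixes the salt and password into every round, so it does not iterate a single fixed function like this.
Code:
import hashlib

# Toy model: a "random-looking" function on a space of 2^24 points.
def tiny_hash(x):
    return hashlib.sha256(x).digest()[:3]

# Floyd's tortoise-and-hare: returns (tail length, cycle length) of the
# rho-shaped path obtained by iterating tiny_hash from the given seed.
def rho_lengths(seed):
    tortoise = tiny_hash(seed)
    hare = tiny_hash(tiny_hash(seed))
    while tortoise != hare:
        tortoise = tiny_hash(tortoise)
        hare = tiny_hash(tiny_hash(hare))
    # find the tail length mu
    mu = 0
    tortoise = seed
    while tortoise != hare:
        tortoise = tiny_hash(tortoise)
        hare = tiny_hash(hare)
        mu += 1
    # find the cycle length lam
    lam = 1
    hare = tiny_hash(tortoise)
    while tortoise != hare:
        hare = tiny_hash(hare)
        lam += 1
    return mu, lam

for pw in (b"password", b"hunter2", b"correct horse battery staple"):
    mu, lam = rho_lengths(pw)
    print("%-30s tail %6d  cycle %6d" % (pw.decode(), mu, lam))
On a 2^24-point space both lengths come out on the order of sqrt(2^24), i.e. a few thousand, nowhere near as small as 1000; if the random-function picture applies to a 128-bit output, the expected scale would be around 2^64, so 1000 iterations should be extremely unlikely to close a cycle at all.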
Another thought: as I understand it, the general idea of a hash algorithm like md5 is to verify that a long file has not been corrupted. So the hash value has a fixed length, typically shorter than the input file; we assume that because only a tiny fraction of the huge space from which actual files are "randomly selected" by complex human/cpu processes will occur in practice, "collisions" are sufficiently rare that we can ignore them. But disguising user-input passwords for storage in /etc/shadow seems to be a different problem, which might be better solved by using the user-input password as a "seed" (perhaps with some publicly known "salt") for some algorithm blending features of a random sequence generator with a hash function, one which outputs a longer word than the input word and which can be iterated.
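Something along those lines already exists, I think, in the form of key-derivation functions such as PBKDF2: the password plus a public salt seeds an iterated HMAC, and the output can be made longer than the input. A minimal sketch with Python's hashlib (the salt, iteration count and output length here are just placeholders, not recommendations):
Code:
import hashlib, os

password = b"correct horse battery staple"   # placeholder password
salt = os.urandom(16)                         # public, per-user salt
# arguments: hash name, password, salt, iterations, derived key length
derived = hashlib.pbkdf2_hmac("sha512", password, salt, 100000, 64)
print(salt.hex())
print(derived.hex())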