Eric,
...I will definitely change the existing algorithm... // ...Just FYI, here's Firefox's pw strength algorithm...
We started to discuss the password meter under the beta4 thread, but I think this thread is the right one to continue the discussion. Although I understand you won't spend much time on the meter in the near future, I'll post my thoughts just for future reference.
What is a secure password?A secure password should be hard to guess even if knowing the person who uses the password, and it should be hard to crack using brute force attacks (ie. programs that automatically try many combinations of characters). A secure password should:
* Be as long as practically possible:Longer password makes brute force attack exponentialy more difficult. For example, using hexadecimal character space (PasswordMaker default), each additional character multiplies the number of possible combinations by 16.
So while a 200 chars password may be ultimately secure, 20 chars is secure enough to make brute force attacks impossible.
* Use as many different characters as possible:That may seem obvious, but there is a caveat. The sequence ABCDEFG is made up of different characters, but it makes for a very un-secure password. On the other hand QBQWQMQH has one letter in half the positions, but letters in the remaining positions have no logic to them, making the password hard to guess and hard to crack by brute force.
So the question here is not really of using different characters, but of
distribution - using different characters that are distributed more or less evenly, or at least randomly, over the available character space.
But even distribution alone is not enough. There are two classes of passwords that may be long and contain well distributed characters, but are not at all secure: words and keyboard sequences.
Examples: PENGUIN may be nicely distributed, but very easy to guess especially for Linux related sites. QWERTYUIOP may be both long and distrbuted, but by no means random.
Avoiding these two problems isn't as easy as it may seem. In order to detect words a dictionary is required, but a dictionary that gets updated regularly with new words and slang (eg. I'm sure that "batman" and "emoticon" didn't exist 50 years ago). But what about non-English users? In order to truly provide a good solution the dictionary has to cover words of all popular languages. This is impractical.
Similarly, detecting English keyboards sequences may not be very difficult (although detedting both directions - QWERTY and YTREWQ is required). But once again when we consider non-English keyboards, and English non-QWERTY keyboards, things get much more complicated.
Which brings us to the last point:
* Use as wide character space as possible:The number of possible characters directly affects the difficulty of cracking a password. As I mentioned earlier, increasing the length of the password by one results in exponential increase in the number of combinations. The number of available characters is the base of the exponential formula.
So for example with the hexadecimal character space, each additional character in the password multiplies the number of combinations by 16. But a character space that contains English upper case, lower case and digits, multiplies the number of combinations for each additional character by 62.
However, the number of characters alone may not be sufficient. The English upper-case letters make for a space of 26 characters, which is larger than the hexadecimal space. However, the hexadecimal space is more secure as it contains both letters and digits, thus avoiding the problem of dictionary words and (at least for QWERTY keyboards) also the problem of keyboard sequences.
So the quality of the character space should be measured not only by the number of characters, but also by the different categories of characters: letters, digits, and symbols. Note that I don't count upper and lower letters as seperate categories, because the notion of upper/lower case only applies to Latin-based languages.
However this presents another problem - many systems do not allow special symbols in characters. I've seen many web sites and other password protected systems that don't allow any symbols, or allow only underline, dash and maybe another symbol or two.
So I think for practical reasons, aside from the number of characters, the quality of the character space should be measured by the use of at least two of the three categories (letters, digits, symbols).
How do Firefox and PasswordMaker meters fare?Let's look at how Firefox and PasswordMaker do in light of the above analysis. I admit that my analysis may be wrong or incomplete - I'm not an expert in this field - this is all just off the top of my head.
* Firefox:Firefox's algorithm fails in at least four different ways.
First of all it limits the importance of the length to 5. That is, passwords of lengths of 5 and longer are all given the same weight in the final score. So QWERTY and QWERTYUIOP get the same score. This is a serious flaw because brute force attack can easily crack 5 character passwords, and considering Firefox is a local application, brute force attacks are easy and fast.
Another problem is that Firefox gives higher score to using upper case letters, but doesn't test for using ONLY upper case letters. So ABCDE and ABCde get the same (exaggerated 60%) score - although the former is by far easier to guess or crack than the latter.
The third flaw is that Firefox takes a user supplied password, which may very well be a dictionary word or a keyboard sequence. The algorithm detects neither of these. Admittedly solving these problems requires huge efforts, but that doesn't detract from the importance of the flaw.
Finally, the algorithm doesn't really measure distribution, but rather using characters of different categories (ie. upper case, lower case, digits, symbols). So, combined with the previous flaws, the password ABC123 gets a score of 90% (!) although its security strength is next to nothing.
Another
theoretical deficiency is that the algorithm gives higher score to using symbols. In the particular case of Firefox' master password it isn't a real deficiency, because symbols are known to be allowed. However for a more generic algorithm, that has to deal with passwords that can't contain symbols, this is a limitation.
* PasswordMaker:Disclaimer: I haven't looked at the sources, only at the description above.
Obviosuly the algorithm puts great emphasis on the length, which is good. Also it balances the weight of different character categores, so using only upper case letters doesn't fool it as it does Firefox' meter.
However it doesn't measure true distribution of characters, so ABC123 and XBM591 get exactly the same score despite the obvious difference in security. Also ABcd12 gets a very high score (about 80%) although it's not very secure - I'd rate it no more than 50%.
Another problem is that the weighting of different character classes is relative to the length of the password, which causes some strange artifacts. For example the password
12GR)y gets about 80% while
12GR)yFKPOw' gets about 60-65% although it is more secure by several orders of magnitude.
The dictionary and keyboard sequence problems are almost irrelevant for PasswordMaker, as the passwords are hash-generated and so the chance that a password will happen to be a word or keyboard sequence is very small. Also this can be (and is) offset by giving higher scores to passwords that contain digits and symbols.
However there is one case when this can become a real problem, and that is when the user uses a prefix and/or suffix, that comprise all or most of the password. In practice this probably will not be a common scenario but it can happen to some users at least by mistake.
Overall PasswordMaker's meter is much harder to fool than Firefox's, and even when it errs it never (in my experience) gives more than 50% to trivial passwords. I think that 40% for a password such as ABC123 is still too high, but is by far more reasonable than Firefox' 90%.
So how should the perfect meter algorithm work?1.
Normalize the password and the character space -Of course this is only for the purpose of calculating the strength:
1a. Convert all lower case and accented characters to their upper-case non-accented equivalents. This is to ensure that password using Latin characters don't fool the distribution algorithm with combinations such as
ABCabc.
1b. Remove duplicate characters. This ensures that: i) Passwords such as QBQ3Q!QWQ+ don't get penalized as having bad distribution. ii) Character spaces that contain the same characters multiple times don't get awarded as having more space.
2.
Give significant weight to the length of the password - The length makes a password harder to guess or crack, but also improves the distribution of hash-generated passwords. I suggest giving zero score to lengths of 4 or less, and capping the score for length 20 or more.
3.
Give some weight to the quality of of the character space - The quality should be measured based on the number of unique characters, and the inclusion of at least 5 characters of at least two of the categories: letters, digits, numbers. Maybe give a small bonus for using all three categories instead of two. Why only
some weight? Because the character space provides a theoretical maximum distribution, while the actual password may happen to contain only a narrow range of characters.
4.
Give significant weight to the distribution of characters in the password - Combined with the length this determines the security of the password. BUT: the distribution should be calculated separately for each category of characters, in order to detect cases such as
XYZ123 which may have good distribution but isn't very secure.
Hope it all makes sense...
EZ.