Hardened Diceware

Liam Quin @ Wikimedia Commons (CC BY 3.0)

~1,600 words

Published: January 24th 12,017 HE

Last modified: March 19th 12,018 HE

Summary

Assume your adversary is capable of one trillion guesses per second. If the device you store the private key and enter your passphrase on has been hacked, it is trivial to decrypt our communications.

Edward Snowden

Cyber security has been in the news a lot lately—what with sneaky election-interfering Russkies and Ashley Madison leaks and all—which has had me going through my own security hygiene with a critical eye.

One major issue I noticed was my password habits. I stored them all in LastPass, but largely used the same algorithm for generating them: a common series of characters, and then a unique ending string determined by the website in question. This ensured that I could recompute a forgotten password easily enough and that I was using a unique password for every site, but it also made the use of a password manager fairly superfluous and ensured that an attacker who compromised one password could potentially work out the algorithm I’d used and apply it to my other accounts.

Something had to be done. Firstly, I ditched LastPass for KeePass—LastPass’ proprietary nature and recent-ish buyout didn’t sit well with my freetard self, and I wanted something I could store on a USB flash drive for an extra layer of physical security—before moving onto the contents of that KeePass database.

A debate has been raging for a while over the relative merits of passwords (single strings of ideally non-word characters) and passphrases (series of actual words, with optional space separators). The greatest argument in favour of passphrases is their memorability; it’s far easier to remember correct horse battery staple than it is to remember @E;|)3VZk’A%+dy4zbFBPL7wIfgR, which has the same number of characters. This is important as some passwords will be for locations where access to a password manager is not possible; for example, an OS login screen.

The general rule of thumb I’ve seen is to have a passphrase that you use to encrypt your password manager of choice’s database full of all your randomly-generated, unreadable passwords.

Passwords

For the sake of argument, let’s assume for the remainder of this post that all of your passwords are:

strings of 125 characters
randomly generated from the set of every ASCII code between 32–126 (space to ~)
composed by giving every value an equal probability of occurring

This gives us a pool of 95 possible characters, with an entropy of about 6.57 bits each. For a 125-character string, this gives an overall password entropy of $ \log_2\left( 95^{125} \right) \approx 821.23~\text{bits} $. Assuming an adversary with a guess rate of one trillion guesses per second, or 1 teraguess per second (Tg/s), it would take them an average of $ \left( 2^{821.23} \div 1~\text{Tg/s} \right) \div 2 \approx 8.21 \times 10^{234}~\text{s} $ or $ 2.60 \times 10^{227}~\text{years} $ to guess.
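The arithmetic here can be checked with a few lines of Python. This is a minimal sketch rather than anything from the original post: the function names are made up for illustration, the pool is taken to be every ASCII code from 32 to 126 inclusive, and a plain 365-day year is assumed for the conversion to years.

```python
import math

GUESSES_PER_SECOND = 10**12             # the 1 Tg/s adversary assumed in this post
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # assumption: a plain 365-day year


def entropy_bits(pool_size: int, length: int) -> float:
    """Entropy of `length` symbols drawn uniformly from `pool_size` options."""
    return length * math.log2(pool_size)


def average_seconds_to_guess(pool_size: int, length: int) -> float:
    """On average the secret turns up after searching half the keyspace."""
    keyspace = pool_size ** length      # exact integer arithmetic
    return keyspace / (2 * GUESSES_PER_SECOND)


# Every ASCII code from 32 (space) to 126 (~), inclusive.
pool = len(range(32, 127))
bits = entropy_bits(pool, 125)
seconds = average_seconds_to_guess(pool, 125)

print(f"{pool} characters, {bits:.2f} bits")
print(f"{seconds:.3g} s, or {seconds / SECONDS_PER_YEAR:.3g} years")
```

The same two functions work for any pool size and length, so they can be reused for word-based passphrases by treating each word as one symbol.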

Passphrases

That leaves us with the passphrases—our keys to that kingdom. These must necessarily be weaker as they must be composed of genuine words rather than completely random (or pseudo-random) strings of characters. However, very strong passphrase generation methods exist. I will be using Diceware for this exercise, with Random.org providing the dice rolls (each column is a word number). This article from The Intercept gives a good explanation of the Diceware system for those interested, including the calculation that for a 7-word passphrase (with words taken from the standard Diceware word list’s 7,776 words), at 1 Tg/s, it would take an average of 27 million years to guess[…]. Not too bad for a passphrase like bolt vat frisky fob land hazy rigid.
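For illustration, the dice-to-word lookup can be sketched in Python. This is only a sketch under stated assumptions: it swaps Random.org for Python's `secrets` module as the source of rolls, the names `roll_word_index` and `generate_passphrase` are hypothetical, and the word list is a numbered placeholder rather than the real Diceware list.

```python
import math
import secrets

DICE_PER_WORD = 5
WORDLIST_SIZE = 6 ** DICE_PER_WORD      # 7,776 entries, one per five-dice roll

# Placeholder standing in for the real Diceware list (assumption: in practice
# you would load the published 7,776-word list here instead).
PLACEHOLDER_WORDS = [f"word{i:04d}" for i in range(WORDLIST_SIZE)]


def roll_word_index() -> int:
    """Roll five six-sided dice and read them as one base-6 number."""
    index = 0
    for _ in range(DICE_PER_WORD):
        roll = secrets.randbelow(6) + 1  # a fair die face, 1 to 6
        index = index * 6 + (roll - 1)
    return index


def generate_passphrase(words, count=7):
    """Pick `count` words, each from an independent set of rolls."""
    return " ".join(words[roll_word_index()] for _ in range(count))


passphrase = generate_passphrase(PLACEHOLDER_WORDS)
bits = 7 * math.log2(WORDLIST_SIZE)
years = WORDLIST_SIZE ** 7 / (2 * 10**12) / (365 * 24 * 60 * 60)

print(passphrase)
print(f"{bits:.1f} bits, about {years / 1e6:.0f} million years on average at 1 Tg/s")
```

Physical dice remain the method Diceware itself recommends; the software roll here is just a convenient stand-in for showing how five rolls map to one of 7,776 words.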

So you now have your KeePass database full of uncrackable passwords, and a still-pretty-uncrackable passphrase used to get access to it. Job done, right? Well, not quite. The number of situations where KeePass was unavailable and a passphrase was needed grew as I catalogued my behaviour. An average person may have a laptop login, their KeePass authentication, a desktop login, a work desktop login and a phone password that all need passphrases. A more security-conscious person may also have their KeePass database (along with other sensitive authentication material such as PGP keys and revocation certificates) stored in a VeraCrypt volume. For plausible deniability, they may also have another passphrase to open a hidden volume.

As you can see, we’re talking a lot of passphrases here if one wants to follow good practice and not reuse anything. Realistically, one or two passphrases will probably be reused across all logins. This means that, should any of them be compromised, the attacker can gain access into a number of your other systems.

Hardened Diceware

So how about you just use three? That’s 21 words for you to remember; the Lord’s Prayer has 69, and about 1.2 billion Catholics have that memorised. We’ll call these three passphrases $ \left[a, b, c\right] $. For example, say your passphrases were composed of the following words (please don’t actually use these though):

$$ a = \text{foo fii fee faa fuu fyy fff} $$

$$ b = \text{bar bir ber bor bur byr bbr} $$

$$ c = \text{baz biz bez boz buz byz bbz} $$

Firstly, you would never write any of these passphrases down. You memorise them, and they die with you. If, for whatever reason, that is entirely infeasible, they should be kept somewhere far away from you or any system you interact with (for example, placed on a USB flash drive buried in the garden of your childhood home).

Secondly, you record your passphrase-using logins in KeePass (except the KeePass database login) as a string of seven letter-number pairs, for example a1 b5 b4 c2 a0 b6 c4. You can use whatever further obfuscation you like here—Base64, modulo arithmetic, etc.—but I will assume they are stored as plaintext for the purpose of the rest of this post. For each pair in this mask, the letter corresponds to the passphrase to use and the number to the index of the word to use (in this example zero-indexed). So a1 b5 b4 c2 a0 b6 c4 would result in a passphrase of fii byr bur bez foo bbr buz.
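The lookup from mask to passphrase can be sketched as below. Writing real base passphrases into a script would defeat the "never write them down" rule, so treat this purely as an illustration using the dummy words from the example; the `apply_mask` name and the dictionary layout are my own assumptions, and the mask is assumed to be stored as plaintext with zero-based indices.

```python
# Dummy base passphrases from the example above (never store real ones like this).
BASE_PASSPHRASES = {
    "a": "foo fii fee faa fuu fyy fff".split(),
    "b": "bar bir ber bor bur byr bbr".split(),
    "c": "baz biz bez boz buz byz bbz".split(),
}


def apply_mask(mask: str, bases: dict[str, list[str]]) -> str:
    """Turn a mask such as 'a1 b5' into the words it points at."""
    words = []
    for pair in mask.split():
        letter, index = pair[0], int(pair[1:])   # 'b5' -> passphrase b, word 5
        words.append(bases[letter][index])       # zero-indexed, as in the post
    return " ".join(words)


print(apply_mask("a1 b5 b4 c2 a0 b6 c4", BASE_PASSPHRASES))
# prints: fii byr bur bez foo bbr buz
```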

This allows you to use a unique passphrase for each login that requires one, without you having to remember a small novel’s worth of words. The most important will be your KeePass database passphrase, as even if you forget (for example) your desktop OS login you can live-boot into Tails and open it to have a look at the reminder. Most importantly, this means that loss of one or even two passphrases, even the KeePass one, will not compromise the rest of your logins. Say your passphrase for your desktop login uses the mask $ b_3 c_3 c_2 a_5 c_0 b_1 b_5 $ (i.e. bor boz bez fyy baz bir byr) and a man-in-the-middle attack compromises it: the passphrase will be useless for all other logins. Assuming the attacker knows you are using the system outlined in this post (which, if you look at this website’s pageview figures, is unlikely), all he now knows is 7 words that each appear somewhere across your 3 passphrases. Even if the attacker somehow also had access to your KeePass database (and let’s say your laptop login mask was $ a_6 c_2 a_4 a_5 b_1 a_2 b_4 $), the attacker would gain practically no advantage in cracking it from the compromised passphrase. With access to your KeePass database giving him the b3 c3 c2 a5 c0 b1 b5 mask for your desktop, all he now knows is

$$ a = \text{? ? ? ? ? fyy ?} $$

$$ b = \text{? bir ? bor ? byr ?} $$

$$ c = \text{baz ? bez boz ? ? ?} $$

This leaves effectively a 14-word passphrase to guess, with an entropy of about 180.9 bits ($ 14 \times \log_2 7776 $). At 1 Tg/s, that’s approximately $ 4.7 \times 10^{34} $ years (about 47 decillion), on average.
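What the attacker learns from one leaked passphrase plus its mask can be reconstructed mechanically. The sketch below is an illustration only: the `known_words` helper is a hypothetical name, and the entropy figure assumes every still-unknown word was drawn uniformly from the 7,776-word Diceware list.

```python
import math

WORDS_PER_PASSPHRASE = 7
WORDLIST_SIZE = 7776


def known_words(mask: str, passphrase: str) -> dict[str, dict[int, str]]:
    """Map each leaked word back to its slot in base passphrase a, b or c."""
    known: dict[str, dict[int, str]] = {}
    for pair, word in zip(mask.split(), passphrase.split()):
        known.setdefault(pair[0], {})[int(pair[1:])] = word
    return known


# The compromised desktop login from the example above.
leaked = known_words("b3 c3 c2 a5 c0 b1 b5", "bor boz bez fyy baz bir byr")

for letter in "abc":
    slots = leaked.get(letter, {})
    row = [slots.get(i, "?") for i in range(WORDS_PER_PASSPHRASE)]
    print(letter, "=", " ".join(row))

# Words across all three base passphrases that the attacker still has to guess.
unknown = 3 * WORDS_PER_PASSPHRASE - sum(len(s) for s in leaked.values())
remaining_bits = unknown * math.log2(WORDLIST_SIZE)
print(f"{unknown} unknown words, {remaining_bits:.1f} bits")
```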

Fun Bonuses

With this, you could easily implement a duress code; for example, a passphrase that, when entered to decrypt a filesystem, will instead either destroy it or display an alternative view, allowing plausible deniability. The most obvious would be something like a reversed $ a' $ (i.e. $ a_6 a_5 a_4 a_3 a_2 a_1 a_0 $ ), but any arbitrary ordering could be used.

Another bonus is that you could organise to be alerted when an incorrect ordering of an otherwhere-valid passphrase is used, which would allow you to quickly determine which passphrase has been compromised (i.e., if someone tries to log in to your desktop with your laptop passphrase, you know your laptop’s on the wonk).

Can you protect against rubber hose cryptanalysis with this system? This assumes that the secrets you are protecting are worth more than your life, but that your lucid analysis as such may be changed through coercion (being twatted with a hose). It’s not impervious, but you could mitigate it a lot by distributing two of the three passphrases amongst two other trusted parties without ever seeing them yourself. Retrieval could then require visual authentication, whereupon the effects of physical (and possibly also psychological) coercion would be visible, picked up on and the passphrase refused.