Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
The keys to the keydom (bit-player.org)
152 points by breadbox on Oct 27, 2013 | hide | past | favorite | 27 comments


64,081 is really close to 65536 (2^16) -- I wonder if the 64,000 vulnerable TLS keys have anything to do with a bug similar to the 2008 Debian SSL key fiasco, where the only contributor to the key's entropy was the PID (2^15 combinations). It would be interesting to see data about the types of devices that have these weak keys, but then again if there is a correlation its release could be used maliciously in the wild.


The post mentions "Most of them are found in embedded networked devices, such as routers and firewalls". Embedded devices like that have very little entropy on first boot - that could might cause them to generate very poor keys.

I wonder how many of the 60% non-unique keys are embedded devices with hardcoded keys...


"I wonder how many of the 60% non-unique keys are embedded devices with hardcoded keys..."

That wouldn't surprise me much at all.

I've been working on a secure workaround to let an embedded webserver on our consumer-oriented wifi-equipped hardware(1) do OAuth2 authentication (to connect to owner's Twitter/Faceboot/et al. accounts). OAuth relies on SSL/TLS to protect it's password-equivalent tokens on-the-wire. Getting an SSL cert onto the device (at a reasonable cost) so that a phone connecting to it via wifi doesn't throw up big scarey "somebody might be doing _bad_ things!" warnings is not easily solved.

(1) Shameless self promotion: http://holiday.moorescloud.com - The world's most intelligent christmas lights…


I can see the bumper stickers now - "Have you disassembled your firmware today?"

Sigh. I'm not sure what we can do (as an industry). Bad devices (and bad firmware, and bad software) are so widely deployed it seems almost impossible to fix.


This question has been asked in the comments to the article and there is answer: "Heninger et al. did check their data for “Debian weak keys.” They found 4,147 such TLS keys and 53,141 SSH keys."


It's definitely a bug, most likely in the pre-prime generation (not enough entropy).

If we're talking 1024-bit integers created by multiplying two 512-bit primes, the facts are:

1) The number of 512-bit primes is estimated around 11.36×10^150

2) The chance that two truly random primes are the same is one in 3.37*10^75. It is safe to assume it has never happened and never will (at least on our current hardware).


> We’re only at 10^7 so far. As Heninger said in her talk, we have enough 512-bit primes to assign a public encryption key to every atom in the universe, with little worry over possible duplicates.

I find the scale of these numbers absolutely incredible.


I had trouble following this point:

"Even though Euclid’s GCD algorithm is highly efficient, running it on all possible pairings of keys would be a strain. There’s an ingenious shortcut, based on the observation that if Y is relatively prime to each of X1,X2,…,Xn, then it also has no factor in common with the product X1×X2×⋯×Xn. Thus it’s possible to detect the presence of shared factors with just n GCD operations, instead of n2."

Can someone explain that shortcut in more detail?


Well, suppose we have 4 moduli, N1, N2, N3, N4. The naïve way to find all pairwise GCDs is to compute GCD(N1, N2), GCD(N1, N3), GCD(N1, N4), ...., GCD(N3, N4). That's (n^2 - n)/2 GCD computations.

The shortcut is to compute P = N1 * N2 * N3 * N4, GCD(N1, P / N1), GCD(N2, P / N2), GCD(N3, P / N3), GCD(N4, P / N4) --- n GCD computations . This works if all you want to know is whether there exist shared primes among those moduli. Suppose N1 = p1 * p2, and N2 = p2 * p3. N2 * N3 * N4 has the factorization p2 * p3 * p4 * p5 * p6 * p7 --- whatever the other primes are, that product still has the p2 factor in common with N1.

Note that unlike that naïve approach, this does not tell you which exact modulus has the prime in common with (say) N1. One idea is to take p2 and divide every candidate modulus by it until you find a match. This is computationally suboptimal; the better approach is to use product and remainder trees, as nicely explained in the Heninger et al paper [1, §3.3].

[1] https://factorable.net/weakkeys12.extended.pdf


1. Multiply all of the keys together: Y= X1 * X2 * ... * Xn.

2. For each key, Xi, use Euclid's algorithm to find the gcd of Xi and (Y/Xi). If the gcd of Xi is greater than 1, then you've factored Xi.

Even taking into account the arbitrary-size int arithmetic, this is more efficient then performing Euclid's algorithm on each pair (Xi, Xj) because gcds are so much more expensive then multiplications. N multiplications and n gcds on enormous ints is faster than n^2 - n gcds with 1024 bit ints.


Someone please correct me if I'm wrong, but I guess the shortcut is something like this:

  Z = X[1] * X[2] * X[3] * ... * X[n];
  for i in (1..n-1) {
    Z = Z/X[i];
    g = GCD(X[i], Z); 
    if (g > 1) {
      // check each other key to see which have g as a factor
      for j in (j+1..n) {
        if (X[j] % g == 0) {
          log that keys i & j have g as a common factor;
        }
     }
  }
Rather than having to run the inner loop for every iteration of the outer loop, it only has to run in the case where there is a common factor.

For instance, to check for common factors in [4, 13, 10] we initially compute Z = 4 * 13 * 10 = 520. Then we start with the first key, 4. Divide Z by 4 yielding 130. Do 4 and 130 have a common factor? gcd(4,130) = 2, so yes they do. We now check each remaining key individually (13 & 10) for divisibility by 2 and report which one(s) match, (in this case, 10). Then we continue with the second key, 13. Divide Z by 13 yielding 10. Do 13 & 10 have a common factor? No, they don't, and there are no more keys left to check, so we're done.


Yes that's correct, however you need not actually "log that keys i & j have g as a common factor;" if g > 1 then just add x[i] to a list, after checking all in x, then just go through that much smaller list. ( even better would be to just group x[I] in a dictionary based on g)... good example though.


I think it's shorthand English for saying:

Rather than doing n^2 comparisons (ie. every key individually GCD compared against every other key individually = (n keys)x(n-1 other keys) combinations), you can instead do n total GCD comparisons - every key GCD factored a single time against <large product of every other key> (which works out to (n keys)x(1 big product) GCD comparisons)


Ah, I see - thanks to you and the others. I think I just didn't realize that each X was referring to an actual key.


I don't have a source to back this up, but I thought this problem had been tracked down to a faulty Cisco CSPRNG.

(Later: nope, I'm wrong; the P's and Q's paper has Juniper vulnerable as well. The more likely commonality might be solid state devices and cold-start entropy.)


A very interesting read.

The final point is probably the most notable one, that key-generation immediately following a boot up can compromise the PRNG (as we know). Also a cool thought, comparing all those keys. I just compared the few keys I have hanging around (only of order 10, not 4 billion), and none of mine have common factors other than 1. Yay!

Spoofing host keys also sounds like a potential target, so you could spoof the server identity and then decrypt communications without having to deal with the actual key-cracking.


Check if your site uses a known weak key here:

https://factorable.net/keycheck.html

Enter a domain name or an IP address.


I pasted my key in to check. Came out as: "Factorable RSA Key Check Pass! This RSA key is not known to be factorable."

I would have felt uncomfortable checking my publicly available TLS or SSH server because if it comes back a fail, how do I know the site is not logging the data and logging in?


I'm not seeing the risk if it's a publicly available address. Analogy:

They tested every door lock in America.

They found that 64,000 door locks are trivially opened.

They put up a website where you can enter your home address to see if you're vulnerable.

If you enter your home address ("123 Maple Street"), they could lie and tell you that you're secure, but then go and rob you.

The thing is that they already knew whether 123 Maple Street was secure or not. They could have robbed you beforehand.

(Also, let's say that they never tested 123 Maple Street, and they tell you arbitrarily that you're secure. In this case, they still haven't gained any new knowledge.)


This isn't it at all.

They have been handed a giant pile of all keys to all homes in America. They went through the pile and found all but 64,000 keys to be too damaged to be usable. Those 64,000 keys are perfectly good, but they don't know which homes they belong to.

They open up a shop allowing people to come in with their lock to check if their key is one of the 64,000. Do you go to check? Do you drive? Or do you go wearing a mask on your face and take public transit (VPN + tor)?


Unless I misunderstood, they know which keys belongs to which houses. They successfully factored those keys, and unless they threw the metadata away they still know the domain that is using that key.

And if they did throw the metadata away, they still have this pile of 64,000 prime factors. They can ask the domains for their public keys again and test if any of those factors matches.


> they know which keys belongs to which houses

Need to be corrected into "they claim to know which keys belongs to which houses", right?


Well they did collect all the keys from their respective houses – the only thing stopping then knowing which key goes with which house is poor record keeping on their part…


Firstly, just because they know your private key - doesn't mean they can log in - just that they can snoop on the encrypted traffic, which is spectacularly unlikely for researchers at UCSD or UoM.

BUT, if one were to assume the presence of a global passive adversary – they would already have your public key, found via snooping the start of any encrypted connection to your secured service, as well as everybody else's public keys captured the same way – and one would assume that they've got _way_ more compute power available than five bucks worth of AWS EC2 time to factor them with.

Which says to me - check your public keys using the tool. If it says you're vulnerable you're probably hosed already and the only sensible option is to take the service offline until you've generated a new key-pair.

Actually, I guess the proper advice is probably to shut down any 1024 bit key protected services right now, and generate 2048 or more bit keys for them. If you've been running any services with 1024 bit keys where sniffing the cleartext of a session would reveal login credentials to anything important, and that key is a known "weak" key, it's probably time to assume the global passive adversary already has root on that box.

Try running:

> wc .ssh/*.pub

if that's returning key files with character counts below 7 or 8 hundred, run:

> ssh-keygen -l -f .ssh/id_rsa.pub

(where id_rsa.pub is the filename of the "short" public key) to determine the actual key length. (there's probably a better heuristic than my "about 7 or 8 hundred characters", I'm seeing 1024 bit rsa keys in the 200-300 char range and 1024 bit dsa keys in the 500-600char range. There's some "random length filler" in each file for containing the email address. I'm not sure what encoding is used to make ascii-representable 512 bit numbers though. )


"Some 60 percent of the keys retrieved were not unique. Presumably, most of the duplicates are accounted for by organizations that have multiple servers all operating with the same cryptographic credentials, but there were also instances of apparently unaffiliated individuals sharing a key." - how is it possible that 60 percent have the same keys? I also do not understand why a big company would share the same public key across servers? If it is compromised (somebody got the private key) then all its servers are compromised, right?


> I also do not understand why a big company would share the same public key across servers?

They might not be different servers, just different IPs. Multiple external IPs may all terminate at the same piece of hardware (Firewall, VPN device, SSL terminator, etc).


Thanks for posting. Well worth the time spent reading it :)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: