
I'd thank anybody who would post a tutorial/config file for setting up a DNS server (dnsmasq?) that forcefully caches even failed requests for a configurable timeout, with large cache sizes. We might need them in case DNS servers go down under the load of requests from "smart devices" :)
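For the home-network case, a minimal dnsmasq sketch might look like the following. This is an untested illustration, not a recommendation: `use-stale-cache` only exists in recent dnsmasq (2.89+, if I remember right), and `neg-ttl`/`min-cache-ttl` deliberately bend the normal TTL rules, so check your version's man page first.

```
# /etc/dnsmasq.conf (sketch)
cache-size=10000        # large positive cache
neg-ttl=300             # cache negative answers lacking an SOA for 5 min
min-cache-ttl=60        # ignore very short upstream TTLs (RFC-bending)
use-stale-cache         # answer from expired entries if upstream is down
server=1.1.1.1          # upstream resolvers of your choice
server=8.8.8.8
```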


> forcefully caching even failed requests to a configurable timeout

I've been doing ~SRE for 1.5 years and I've worked or helped on 3 outages related to negative DNS caching. Please don't use negative caching if you don't know enough about DNS and can't monitor it.


I wouldn't suggest any ISP should do that (and I am not one), but one could probably host this for personal usage/home networks. If recursive DNS servers go down under the load of "smart devices", having a local copy of the larger set of IPs I usually visit might come in handy (and none of my requests would worsen the server overload).


That is plain caching; my OpenWRT router does this for thousands of records. But negative caching means: "remember this domain doesn't exist and don't retry asking other DNS providers". This is very dangerous.

Your browser AND operating system AND router already provide DNS caching; it's not something the average user should even think about. You might want to consider it when things at your ISP go wrong (hello BT), or when the majority of your computers request the same domains frequently, but then again, your router should do that already.


Well, in retrospect: Cloudflare claims to have enabled SERVFAIL caching to mitigate, and other commenters here say their ISPs did similar. You might want to rethink the negative caching strategy.


This does a good job explaining how SERVFAIL caching works: https://serverfault.com/questions/479367/how-long-a-dns-time...


From your linked Server Fault post, the accepted answer concludes:

"In summary, SERVFAIL is unlikely to be cached, but even if cached, it'll be at most a double- or even a single-digit number of seconds."

That would be fatal right now, wouldn't it? It would mean every major ISP's DNS server is currently forwarding millions of identical DNS resolution requests to the (currently null-routed) Facebook DNS servers. These must number in the millions; heck, every larger website uses FB tracking tools, "like buttons", etc. Are they at least smart enough to throttle based on a domain/IP hash? Otherwise the DNS servers of major ISPs could soon be overloaded, as (constantly failing and thus uncached) requests to FB DNS eat up all their bandwidth/resources.
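The "throttle based on a domain/IP hash" idea could look something like a token bucket keyed by a hash of the query name. This is a toy sketch of the concept, not code from any real resolver; the class name, bucket count, and rates are all made up:

```python
import time
import hashlib

class DomainThrottle:
    """Token-bucket rate limit on outbound upstream queries, keyed by a
    hash of the query name (illustrative sketch only)."""
    def __init__(self, buckets=1024, rate=10.0, burst=20.0):
        self.rate, self.burst = rate, burst
        self.buckets = buckets
        # per-bucket state: (available tokens, last refill timestamp)
        self.state = [(burst, time.monotonic())] * buckets

    def allow(self, qname: str) -> bool:
        i = int(hashlib.sha1(qname.lower().encode()).hexdigest(), 16) % self.buckets
        tokens, last = self.state[i]
        now = time.monotonic()
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1.0:
            self.state[i] = (tokens, now)
            return False        # shed this query instead of hammering upstream
        self.state[i] = (tokens - 1.0, now)
        return True

t = DomainThrottle(rate=2.0, burst=2.0)
results = [t.allow("facebook.com") for _ in range(5)]
# first two calls pass on a fresh bucket, the rest are throttled
```

Hash buckets keep the state bounded even under a flood of unique names, at the cost of occasional collisions between unrelated domains.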


Not that fatal. I think at least some recursive servers will do 'collapsed forwarding', where additional requests to resolve the same name while the first request is in progress will wait for the first request to finish and send the same results to all clients at that point. Although, perhaps that's just wishful thinking on my part.
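The coalescing described above can be sketched in a few lines: the first asker for a name does the upstream work, and concurrent askers wait on the same in-flight result. Everything here (class name, the fake upstream) is illustrative, not taken from any real resolver:

```python
import threading
import time
from concurrent.futures import Future

class CoalescingResolver:
    """'Collapsed forwarding' sketch: concurrent lookups for the same
    name share one in-flight upstream query."""
    def __init__(self, upstream):
        self.upstream = upstream
        self.inflight = {}
        self.lock = threading.Lock()

    def resolve(self, qname):
        with self.lock:
            fut = self.inflight.get(qname)
            leader = fut is None
            if leader:                     # first asker does the work...
                fut = self.inflight[qname] = Future()
        if leader:
            try:
                fut.set_result(self.upstream(qname))
            except Exception as exc:
                fut.set_exception(exc)
            finally:
                with self.lock:
                    del self.inflight[qname]
        return fut.result()                # ...everyone else waits on it

calls = []
release = threading.Event()
def slow_upstream(name):                   # stand-in for a real recursive lookup
    calls.append(name)
    release.wait()                         # hold the query "in flight"
    return "203.0.113.1"

r = CoalescingResolver(slow_upstream)
results = []
workers = [threading.Thread(target=lambda: results.append(r.resolve("example.com")))
           for _ in range(4)]
for w in workers:
    w.start()
time.sleep(0.3)                            # let all four pile onto one query
release.set()
for w in workers:
    w.join()
# all four answers come from a single upstream call
```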

Then you have port limits: usually each request goes out on a new port, so a recursive resolver can only have 64k requests outstanding to any given authoritative (or upstream) server IP, for each IP the recursive uses. Facebook runs with 4 hostnames listed, so that's a limit of 256k requests outstanding, 512k if your recursive does both IPv4 and v6 (and 1M if they're also making WhatsApp requests).
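The arithmetic behind those numbers, using the figures from the paragraph above (these are back-of-envelope bounds, not measurements):

```python
# ~64k ephemeral ports per (local IP, remote IP) pair limits outstanding
# queries; multiply by the number of listed nameserver addresses.
PORTS = 2 ** 16                  # ephemeral port space, approximate upper bound
NS_ADDRS = 4                     # four NS hostnames listed for the domain

per_family = PORTS * NS_ADDRS    # one address family (IPv4 only)
both_families = per_family * 2   # IPv4 + IPv6
print(per_family, both_families) # 262144 524288
```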

DNS services for both domains appear to be back up by the way.

On the authoritative side, it's not too hard to manage this load. If you can't handle the big crush to start with, drop all requests, and then accept all the requests from 1.0.0.0/8, and add one /8 at a time as CPU permits until you're allowing everything. Once you handle the initial crush from a resolver, it should go back to normal load, and there should be some distribution of load across the various /8s. I wouldn't expect it to be evenly distributed, but it should be even enough.
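The slow-start admission idea above can be sketched as a filter that widens the accepted source range one /8 at a time. This is only an illustration of the admission logic; in a real deployment `widen()` would be driven by CPU headroom, and (as noted below) anycast complicates which box sees which resolvers:

```python
class SlowStartFilter:
    """Sketch: drop everything at first, then admit one /8 of source
    addresses at a time as capacity allows (names are made up)."""
    def __init__(self):
        self.allowed = 0                   # accept first octets below this value

    def widen(self):                       # call as CPU permits
        self.allowed = min(256, self.allowed + 1)

    def accept(self, src_ip: str) -> bool:
        return int(src_ip.split(".")[0]) < self.allowed

f = SlowStartFilter()
before = f.accept("8.8.8.8")   # False: initial crush, shed everything
for _ in range(9):
    f.widen()                  # admit 0.0.0.0/8 through 8.0.0.0/8
after = f.accept("8.8.8.8")    # True: 8.x.x.x sources now admitted
```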

Disclosure: I worked at WhatsApp, but left August 2019. I don't know anything about this outage other than idle speculation. I don't know if FB has a procedure to slow start DNS, but the theory is simple; the practice is complicated by the DNS ips being used in Anycast.


> servers will do 'collapsed forwarding', [...] perhaps that's just wishful thinking on my part

I think it is wishful thinking, because that would basically be caching, which is not allowed by the RFC. In 2017 the BIND implementation changed to a default cache time of 1s, which would certainly ease the problem.
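For reference, in BIND this is the `servfail-ttl` option; the fragment below is a sketch from memory (my assumption: modern BIND 9, where the default is 1 second and the value is capped at a few tens of seconds), so verify against your version's ARM before using it:

```
// named.conf (sketch)
options {
    servfail-ttl 5;   // cache SERVFAIL responses for 5 seconds; 0 disables
};
```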

> then you have port limits, usually each request goes out on a new port, a recursive resolver can only have 64k

I'm unsure if this helps or worsens the situation, depending on whether the 'collapsed forwarding'/1s caching is in place. If it is not, ephemeral port exhaustion would kick in, at which point the DNS server will not be able to serve other requests.

> On the authoritative side, it's not too hard to manage this load

Of course not: all you need to do is present any response, which will be cached by downstream resolvers. No smartphone/end-user device will query the authoritative side as long as there is some (even stale) response.


> If this is not the case, ephemeral port exhaustion would kick in, at which point the DNS server will not be able to serve other requests.

You can use the same local ip/port to contact multiple server ip/ports, so filling up connections to FB ips shouldn't prevent you from connecting to others (but there are plenty of ways to do that wrong, I guess)
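Concretely, an unconnected UDP socket bound to one local port can send to any number of remote addresses, so queries stuck waiting on one destination need not consume the local ports used for others. A small demonstration (the remote ports are arbitrary; no real DNS traffic is involved):

```python
import socket

# One local (ip, port), many remote destinations.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 0))        # kernel picks a single local port
local = sock.getsockname()

for port in (5301, 5302, 5303):    # pretend these are three different resolvers
    sock.sendto(b"\x00" * 12, ("127.0.0.1", port))

# the socket still occupies just the one local port after all three sends
print(sock.getsockname() == local)
sock.close()
```

The 64k figure only bites per (local address, remote address) pair when each outstanding query gets its own source port, which resolvers do for spoofing resistance.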

>> On the authoritative side, it's not too hard to manage this load

> Of course not, all you need to do is just present any response which will be cached by downstream resolvers.

You need to present a response before the resolver times out. One can certainly imagine a situation where the incoming packet processing results in enough delay that the responses arrive too late and are discarded. In the right conditions, this queuing delay would never clear and things just get worse. If it doesn't happen, great, but if it does, dropping most of the requests so you can timely handle the few you accept is a good way to get moving.



It’s not so much the load as the DNS servers having to maintain state for all those queries until they time out. That must consume tremendous RAM, and servers that are not event-driven could also be spawning large numbers of threads.


My pihole is rate limiting my partner's phone: over 20k requests in the last hour to facebook et al. The rest of my non-standard DNS is holding up so far; time will tell, though. If the big boys get overwhelmed, you might be correct.



