Cloudflare Is Having Issues (cloudflarestatus.com)
364 points by Kuinox on June 12, 2023 | 184 comments


I suppose this is at the top of HN because of Reddit having an outage and Hacker News being slow. But neither Reddit nor HN use Cloudflare. And if you look at the status page there this isn't our core CDN offering, it's certain products that are affected.


I suppose this is at the top of HN as CloudFlare is quickly becoming a centralized point of failure for big parts of the internet, even without counting the CDN, so large swaths are affected regardless of whether it's just a "minor" outage or not. Hence it's interesting enough to land at the top.


Anecdotally: op guessed right - I was only interested because of the HN and reddit outages


There's also the perception issue that if the host fails, it most likely isn't cloudflare although cloudflare gives a warning.

Note: experienced it from the first row, cloud had an issue in a region with SSL...


Is there some sort of backup/failsafe mechanism for sites that use Cloudflare?


Depends how badly they fail. To take the full advantage of CF you need to keep your DNS with them. That means if you can't configure any changes, you can't quickly move to another provider either. Your only solution at that point is to transfer the whole domain, but that may also require CF's assistance. (Unless they don't handle your apex domain)

Effectively unless they're down for more than a day, you're better off taking the hit and waiting for them to resolve everything.


> Effectively unless they're down for more than a day, you're better off taking the hit and waiting for them to resolve everything.

lol. There's me worrying about a 2 minute downtime once every 10 years


Well I've managed ~53.5 years uptime with zero downtime so far 8)

I'm taking the piss too. Care to explain?

PS For a laugh (and I apologize for going dreadfully off topic), I asked ChatGPT a couple of questions regarding two mins and 10 years, just in case I'd missed a trick with your comment. I got two calculations within both answers that looked the same but have a factor of 10 difference in the result!

https://chat.openai.com/share/7aaa80e5-1c54-4ac2-9ddb-9e488d...

This is where it goes wrong:

"Total minutes in 10 years = 60 minutes/hour * 24 hours/day * 365 days/year * 10 years = 525,600 minutes"

"Total time in 10 years = 60 minutes/hour * 24 hours/day * 365 days/year * 10 years = 5,256,000 minutes"

LO Calc says that =60*24*365*10 = 5,256,000.

The worrying thing for me is that an awful lot of input training data might be badly wrong to cause this result or perhaps I've managed to excise a corner case in which case the training data is a bit too focussed in this particular regard. I suspect that arithmetic errors for these things will be awful because there are so many ways to screw up and the training data will have a lot of errors in it. Combine that with the number of subjects available and it will be a shit show.
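
For what it's worth, the arithmetic is easy to check without a spreadsheet. A minimal Python sketch, assuming 365-day years (leap days ignored, same as the quoted calculations); the function name is just for illustration. It suggests the 525,600 figure is the number of minutes in one year, not ten:

```python
# Sanity check of the quoted arithmetic; assumes 365-day years (no leap days).
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365


def minutes_in_years(years: int) -> int:
    """Total minutes in the given number of 365-day years."""
    return MINUTES_PER_HOUR * HOURS_PER_DAY * DAYS_PER_YEAR * years


minutes_one_year = minutes_in_years(1)
minutes_ten_years = minutes_in_years(10)

# Uptime fraction if the only outage in 10 years lasts 2 minutes.
downtime_minutes = 2
uptime_fraction = 1 - downtime_minutes / minutes_ten_years

print(f"{minutes_one_year:,} minutes in 1 year")
print(f"{minutes_ten_years:,} minutes in 10 years")
print(f"uptime with one 2-minute outage: {uptime_fraction:.5%}")
```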


By the way, when it comes to math like this, wolfram alpha simply can't be beaten: https://www.wolframalpha.com/input?i=2+minutes+of+10+years+p...

As for GPT, it's not about the training data having errors in it - GPT doesn't parrot the training data exactly, there's randomness built into it (otherwise you'd always get the exact same output to the same input). It just generates a random plausible response, it doesn't actually know math.

To highlight this, I've just asked it the exact same thing you did, exact same words, and got an entirely different response (and this time it was correct): https://i.28hours.org/20230613-051307-0ae6.png


We had some automation at work which caused a major problem resulting in a c. 140 second downtime about 3am. Not my area of responsibility, but I think it's the first time since 2005 we've had that type of widespread outage


To use Cloudflare, you switch your domain's DNS server to them. If that broke, you probably can't switch out without incurring the usual DNS propagation delay on your nameserver.

If DNS is up, you can change specific records to point to your origin or another provider instead of the proxy/CDN, provided that it can handle the load and doesn't need a setup similar to Cloudflare (i.e. repointing your nameserver).


Do they support Secondary DNS, e.g. https://support.dnsimple.com/articles/secondary-dns/

?


I wouldn't say it's a centralised point of failure. Many individual sites choose to use it, so it is used by big parts of the internet, but that doesn't make it centralised. Maybe a single point of failure.


It's impossible for a single point of failure not to be centralised. Otherwise it wouldn't be a _single_ point of failure.


Of course it is. If a million websites are built with Flask then a bug in Flask affects all of them. If a million websites individually decide to use Cloudflare then Cloudflare affects them all.

But that doesn't mean Cloudflare is built in a centralised way (I assume it isn't), nor that there's something about the internet that is centralised around Cloudflare. Rather, people are choosing to include it as a dependency in their stack.


In your example Flask would be a single point of failure and could indeed be called decentralized. For example websites can individually patch their Flask installation and come back up one by one, without depending on anyone else (at least once Flask is fixed).

Here with Cloudflare, a single entity is responsible for the fix and will more or less fix the failure for every site that uses it at the same time, by fixing it on their side. And websites individually cannot do anything on their own.

So I would argue it makes sense to call that centralized, at least from a structural/operational perspective.

It is at the very least a form of contraction of the network.


From an operational perspective it's just a supplier like any other. Lots of suppliers are involved in a business's website. That doesn't make it centralised, though?

> It is at the very least a form of contraction of the network.

I think it's that at most. No one has to use them, as they accelerate / enhance open protocols. That is the least lock-in one could hope for, so they don't contract anything in a negative way.

Contrast with, say, Etsy/Shopify, who actively try to replace the open space with closed ones.


It seems like you two are arguing different perspectives on the word centralization without conceding that a term can be relative to a viewpoint.


But every word can be relative to that? I don't see where we'd go with that. I do think it's how the word relates to the topic that's interesting, if that's what you mean, but I think we are trying to get at that (-:


Using a piece of software is not the same as using an online service.

If all those websites are using Flask, that would be the equivalent of centralising on Flask. The opposite of that would be many Flask-compatible yet unrelated other frameworks being used in parallel. A bug in Flask would not affect those not using that very codebase.

The centralisation people are speaking of here is the amount of people all putting their eggs in Cloudflare's basket.


> If all those websites are using Flask, that would be the equivalent of centralising on Flask

I agree on the equivalence between this and Cloudflare use, but not that this should be described as centralisation, which is a particularly potent word on the open web. Popularity isn't the same as centralisation. Each website could rewrite to remove Flask if they wanted.

Another example: if lots of people watch Squid Game, that doesn't indicate a centralisation of television programmes. No options are removed. People are just choosing to do a similar thing, but not in a way that centralises anything.


>It's impossible for a single point of failure not to be centralised.

Is a monoculture centralized? It doesn't have to be, and yet it can all fail due to a single fault.


There are so many websites using it that it is basically the same as being centralized.


So many sites using Flask across the web is centralized?


You're comparing apples to oranges. Flask is software of which all instances run independent of each other while Cloudflare is SaaS where all its sites depend on the same service.

If Cloudflare goes down, so does a significant portion of the web. Flask cannot have an outage like Cloudflare. All they can do is push a faulty update, but even then you can rollback or stay on the old version.


> If Cloudflare goes down, so does a significant portion of the web. Flask cannot have an outage like Cloudflare. All they can do is push a faulty update, but even then you can rollback or stay on the old version.

This is what was on the tip of my tongue. I'd argue you're missing the portion of control in this whole discussion, and how much process you can place in front of a component change.

If I have a flask dependency, I have a lot of control over this dependency. If flask screws up badly, I have many options: I can update. I can not update. I can downgrade. In fact, I could fork flask internally and fix it on my own and either be a good citizen and open up a PR, or I could be something else. I can test all of this in any number of environments before it hits a customer, and even more stages before it hits all customers.

With Cloudflare - or any number of Hosters as well, like AWS, Azure, Google Cloud, I have very little control. If I use Cloudflare as a CDN and Cloudflare goes down, I might not have the capacities at my upstream server to handle the load from all my customers, so I am down as long as Cloudflare is down. I wouldn't have the footprint available necessary to replace AWS in our own private DCs, even if we pooled all spare capacities - and then I'd still have to find a way to exfil our data from a downed AWS. (Which, yes, we have, but it'll take long hours)

And no matter how much I test, if my hoster fucks up, I'm immediately fucked as well, no matter what processes I might have. The only process around this would be provider independence, which is really expensive and a lot of effort even if you just have a luke-warm standby.


You can multihost your static site; you don't have to only use Cloudflare. You can have more than one CDN. Nothing is centralising you or funnelling you somewhere you don't want to go.

And you can do these things in parallel, unlike the Flask example, which you pretty much have to commit to using solely.


If most sites were using Flask and forced to do broken updates, it would be a centralized problem.


BUT THE INTERNET IS SUPPOSED TO ROUTE AROUND THIS STUFF.

Remember when crypto promised a decentralized utopia of distributed systems where any interference wouldn't work, but slowly but surely it all congregated into a couple of oligarchical companies?

Surely by now the technology of the internet has matured enough that these conglomerates are going to do the same but with a bit less fraud involved.


WTF has crypto got to do with centralisation of the Internet? The Internet is well-decentralised, but if every Joe Bloggs has to put their website behind Cloudflare, it's not the Internet's fault, let alone crypto's (?).

Stop putting every-bloody-thing behind Cloudflare, and the problem solves itself. I don't know whether to laugh or cry when I read of someone on HN seriously saying that they need a CDN for their personal website, or that they really need to use AWS or GCP for that matter. We tech workers have lost the plot, and it's our fault it's all in the hands of the few.


I think they're referencing when "distributed" exchanges still went down when particular APIs or domains were unreachable. It ain't fault tolerant unless it's piracy >:D


https://en.wikipedia.org/wiki/Border_Gateway_Protocol

BGP was standardized in 1989 and has been in use since 1994. The technology of the internet has matured in many ways, and remains almost identical in many others. Sometimes Microsoft is right, and backwards compatibility is the killer feature.


The internet does work around issues at the inter-networking layer through BGP and similar protocols, though the same resilience is sadly absent at the higher layers.


I think you are confusing different decades. The early Internet was way more decentralized and democratic.

Cryptocurrencies promised many things but only wanted to replace oligarchical companies with oligarchical miners.


I think they're confusing layers. The internet is more than just http servers. The IP traffic is routed around problems all the time. If the HTTP server is down, that doesn't stop the packets from arriving there (having been routed around dozens of down links and broken routers at any given moment).


True.

But if they fetch a lot of URLs from cloudflare and it takes 30 seconds to answer with a timeout instead of a http/200 under 20ms, then some architectural decisions that were sound under the latter case may make the whole system slow in the former one.


People have started testing network failures but so often fail to test slow network failures. All of a sudden your code has 500x more pending open sockets, both in and out, and the memory spikes and it all goes to hell. Even if the fast fail code path is indeed best effort.
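
A back-of-the-envelope way to see where a number like 500x can come from is Little's law: requests in flight is roughly arrival rate times how long each request stays open. A rough sketch, not a measurement; the 100 req/s rate and the 10-second stall are made-up illustration values, and only the 20 ms healthy latency is borrowed from the comment above:

```python
def in_flight(requests_per_second: float, seconds_open: float) -> float:
    """Little's law: average concurrent requests = arrival rate * time in system."""
    return requests_per_second * seconds_open


RATE = 100.0             # hypothetical steady request rate (req/s)
HEALTHY_LATENCY = 0.020  # 20 ms, the fast case mentioned upthread
SLOW_LATENCY = 10.0      # hypothetical: each request now hangs for 10 s

healthy = in_flight(RATE, HEALTHY_LATENCY)
degraded = in_flight(RATE, SLOW_LATENCY)

print(f"healthy:  ~{healthy:.0f} open sockets")
print(f"degraded: ~{degraded:.0f} open sockets ({degraded / healthy:.0f}x)")
```

With a 30-second timeout instead of 10 seconds, the same formula gives 1500x, which is why the slow path tends to hurt more than the outright failure.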


Basically the old "I Love Lucy" chocolate factory sketch: https://youtu.be/AnHiAWlrYQc


I would love to see graphs showing "length of reply that eventually succeeded" - I suspect that most networks today, if you don't get a response in 5 seconds, you ain't never gonna get anything useful.

In other words, I wonder if going to fail fast would help the health of the Internet more than wait forever timeouts. Might reduce DDoS effects, too.
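
To make "fail fast" concrete, here is a minimal sketch using only the Python standard library. The helper name `fetch_fail_fast` and the 5-second default are assumptions taken from the guess above, not a recommendation:

```python
import urllib.request
from typing import Optional


def fetch_fail_fast(url: str, timeout_s: float = 5.0) -> Optional[bytes]:
    """Return the response body, or None if the request fails or stalls.

    timeout_s bounds each blocking socket operation (connect, each read),
    not the total request time, so a server that trickles bytes slowly
    can still take longer than timeout_s overall.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.read()
    except OSError:
        # URLError, HTTPError, refused connections and socket timeouts
        # are all OSError subclasses; treat them the same: give up quickly.
        return None


# Hypothetical usage (the URL is a placeholder):
#   body = fetch_fail_fast("https://example.com/", timeout_s=5.0)
#   if body is None:
#       ...serve a cached copy or an error page instead of waiting...
```

Whether this helps overall is exactly the open question: it frees the caller's sockets sooner, but it does nothing for the overloaded server unless callers also send less traffic.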


> When we plotted the data geographically and compared it to our total numbers broken out by region, there was a disproportionate increase in traffic from places like Southeast Asia, South America, Africa, and even remote regions of Siberia. Further investigation revealed that, in these places, the average page load time under Feather was over TWO MINUTES! This meant that a regular video page, at over a megabyte, was taking more than TWENTY MINUTES to load! This was the penalty incurred before the video stream even had a chance to show the first frame. Correspondingly, entire populations of people simply could not use YouTube because it took too long to see anything. Under Feather, despite it taking over two minutes to get to the first frame of video, watching a video actually became a real possibility. Over the week, word of Feather had spread in these areas and our numbers were completely skewed as a result. Large numbers of people who were previously unable to use YouTube before were suddenly able to.

https://blog.chriszacharias.com/page-weight-matters


Well a proper system should reduce the timeout for the same domain, the more it gets raised, and set it back to default once stats are good again.

But it's very complicated and costly to setup, so almost nobody does this.


If you're building complicated systems, you should probably reduce traffic to sites that are failing to respond, rather than continuing to send traffic but timing out faster. Depending on what stage in the process the request is failing, it might not make a big difference: in a typical HTTPS exchange, the costs ramp up the farther you go (processing a SYN < processing a client hello < processing a complex request). If it's a simple request, processing the client hello is more expensive than the request, of course.

If you send all the same traffic, and probably more because of retries with shorter and shorter timeouts, chances are you are going to keep the system overloaded, never detect success and never return to default timeouts. Dropping most of the traffic, and then turning it back on when the system recovers can lead to oscillation where the system works enough to drive more traffic that overloads the system etc, but at least you're getting some processing done.
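
The "drop most of the traffic, then turn it back on" idea is usually called a circuit breaker. A minimal sketch, assuming a fixed failure threshold and cooldown (both numbers and all names here are made up for illustration); it is deliberately naive and lets every request through once the cooldown expires, which is the oscillation risk described above:

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Stop calling a failing dependency for a while instead of retrying faster."""

    def __init__(self, failure_threshold: int = 3, cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self.consecutive_failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def allow_request(self) -> bool:
        """True if a request may be sent right now."""
        if self.opened_at is None:
            return True
        # Once the cooldown has passed, let requests through again as probes.
        return self.clock() - self.opened_at >= self.cooldown_s

    def record_success(self) -> None:
        """A request succeeded: close the circuit and reset the counter."""
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        """A request failed: open (or re-open) the circuit past the threshold."""
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

A caller would check `allow_request()` before each outbound call and report the outcome with `record_success()` or `record_failure()`. A production version would probably admit only a trickle of probe requests after the cooldown rather than everything at once.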


Well, you do need jitter, exponential backoff, caching, blacklisting and whitelisting, a stats-based decision tree, etc. That's why it's a complicated and costly problem.

But if you are consuming a lot of API content, if you have crawlers or if you provide features like "get article title/summary/image/thumbs", at some scale it's an important decision to make.
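
Of that list, jitter plus exponential backoff is the cheap part. A minimal sketch of the "full jitter" variant, where each retry waits a random time between zero and an exponentially growing, capped ceiling; the base, cap, and attempt count are arbitrary illustration values:

```python
import random
from typing import List, Optional


def backoff_delays(base_s: float = 0.5, cap_s: float = 30.0, attempts: int = 6,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Delays for successive retries: exponential growth, capped, with full jitter."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        # Ceiling doubles each attempt (0.5, 1, 2, 4, ...) until it hits the cap.
        ceiling = min(cap_s, base_s * 2 ** attempt)
        # Full jitter: pick uniformly in [0, ceiling] so clients don't retry in lockstep.
        delays.append(rng.uniform(0.0, ceiling))
    return delays


for attempt, delay in enumerate(backoff_delays(rng=random.Random(0)), start=1):
    print(f"retry {attempt}: wait {delay:.2f}s")
```

The caching, allow/deny lists, and per-domain statistics are where the real cost is; this only keeps a fleet of clients from hammering a recovering server in sync.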


>I suspect that most networks today,

I guess that comes with the type of end point in that network. A typical website, absolutely, yes, I'd agree. An API end point allowing requests of large data pools that might take a few seconds to generate but yet not a total time out would be acceptable.


I've been experiencing occasional slow-downs and issues on HN since last week. I would presume that it's due to more load (with WWDC last week and now Reddit black-out) but I don't really see this fully-reflected in the comment or up-vote count on the main page (except the recent Vision Pro thread).


Is HN slow because everyone who was normally posting on reddit is now commenting on HN instead?


I think so. I'm trying to move my time over here as much as possible.


Same here. Even the subreddits that are still open, I'd feel dirty reading.


Time to upgrade from a t2.Micro to a t3.Medium, I suppose.


This is at the top of HN because Cloudflare is not delivering a product people pay them for.


An ex admin of Reddit thinks the protest broke the internal cache layer performance: https://tildes.net/~tech/163e/reddit_appears_to_be_down_duri...


That was my guess, not knowing anything about Reddit's architecture. The "this subreddit is private" message probably used to be really rare, so it probably does the auth check for every request. Now every Google result links to "this subreddit is private", and so traffic to that endpoint went up by orders of magnitude. The result is an outage.

If I were an SRE at Reddit, the day I got wind of the "we're making all subreddits private" thing, I'd double check that code-path to see what we were in for on Day 0. However, I am not.


Or maybe, the SREs at Reddit decided this is their contribution to the protest to let it burn


In this economy?


They don't have to let it burn to the ground, but they also don't have to bust their asses to allow a point to be made.


At their paygrade?

Looking at Meta/Google/Apple's layoff packages of 6 months of severance, and factoring in $170k-$389k annual income at Reddit (per levels.fyi), I would hope they have enough savings to live off of for months if not years, to enable them to protest, should they so desire.


6 months severance is exceptional - 3 months is what you get if you’ve been there at least a few years and in good-standing.

And the twentysomething kids on $200k on the west coast won’t be saving that money (if the number of Teslas on the road in Redmond is anything to go by): they have no reason to believe they won’t make the same kind of TC at the next job they apply for.


at this income level, it's not worth losing my job in this economy, not even in a booming economy.


Humans spend the money they have available. All of it.


Whoa tildes are a reddit alternative? I thought it was all about, like, public access unix systems.


Huh?

It’s not quite an alternative. It’s good though and I’d recommend joining once there’s another invite wave.


Is there a known feasible way of receiving an invite at the present time?


tildes.net seems completely unrelated to the ones you are thinking of (the tildeverse)


Ah, thanks.


Anyone have an invite to share for Tildes?


Same here. Looking at the list of topics, it hits a lot of my interests and isn't 100% focused on tech. My contact details are on my website in my profile.


Looks like the email address on your keybase page is unreachable?


Sent!


Would love one too…


Can I get your email address?


tilde at maxg.io :)


Would love one as well.


Sent!


Any chance I could get one too? Email's on my website: https://picheta.me


FYI: I didn't see any notices about disruptions for Workers but we were intermittently unable to use `wrangler publish`: https://i.imgur.com/LUAzoHQ.png


I'm downloading a large app update (MAMP Pro).

It's screaming along at 47 KB/sec.

I'll have it all in about two and a half hours.

Not sure if that is related.


Possibly third parties that Reddit/HN rely on are using Cloudflare?


Given the inaccurate and clickbaity title "Cloudflare is having Issues" (when in reality it's just a few services, not the main stuff Cloudflare is largely known for), I suspect a lot of people are upvoting and not actually following the link. A comprehensive Cloudflare issue would be huge news. The truth here is not huge news IMHO.


The statement “Cloudflare is having issues” is true as soon as count(issues) > 0. So what’s so wrong about the title?

And how do you know how many websites use R2 etc. so that you can jump to the conclusion that it’s not huge news?


> The statement “Cloudflare is having issues” is true as soon as count(issues) > 0.

Pedantically, it's true as soon as "count(issues) > 1"


Exactly. The headline isn't untrue, but it is misleading. Most people hear "Cloudflare is having issues" and assume Cloudflare as a whole, and this is not an unreasonable assumption.

Reductio ad absurdum can be used to demonstrate the logic flaw. Let's assume that one person using Dynamo DB got a 500 error response from two successive API calls (so an error rate of ~4.0e-09). It would technically be true for the headline to say only "AWS is having issues." A headline like that is going to rocket to the top of HN, and it's not going to provide many people with useful information.

It's also silly. There's plenty of room in the headline to say "Cloudflare R2, Stream Live, and others are having issues" instead of "Cloudflare is having issues."


Off topic: Cloudflare, like Google, is bad for the internet, not just for the way they centralize the internet but for the way they begin to conflate themselves with The Internet. Very few sites need Cloudflare, and of those with it equipped, even fewer configure it or use it correctly for their application.


CloudFlare free-tier is incredibly useful for sticking in front of the cheapest bargain basement VPS WordPress host you can find and miraculously still ending up with a somewhat functional small-business website.

It'd be even better if there were competition in this space, but there aren't too many options outside of cloud providers who are likely relying on the fact you might accidentally slip up one day so they can charge your credit card.


Fortunately there are so many cheap options nowadays for fully managed small-business websites which include all the CDN/WAF stuff for you (Squarespace, managed WordPress, etc.). The setup/maintenance time for running your own web server doesn’t make sense unless you already have an “IT” person on the payroll.


With all the politeness and calm I can muster: you do not need Cloudflare for your small-business website.

Again: you do not need Cloudflare for your small-business website.

Do you have any idea how many requests a shitty unoptimised website can serve on commodity hardware? Judging by comments like yours, which seem to be the majority, I wonder if anyone with less than 15 years of experience is still able to write a website serving 10k users a day (1 request every 8 seconds) on a 2 core VPS without needing a CDN.

Let me spoil the black magic only greybeards seem to know: STOP. OVERENGINEERING. You don't need Cloudflare. No one cares about DDoSing your website, you're not Reddit for Heaven's sake.

If you must overengineer, at least stop all rushing to give your custom to the same company, making Cloudflare a de facto monopoly.
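
The "1 request every 8 seconds" figure is easy to reproduce, and just as easy to stress. A rough sketch; the 20 requests per visit and the 10x peak factor are made-up assumptions, not numbers from this thread:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def seconds_between_requests(requests_per_day: int) -> float:
    """Average gap between requests if they were spread evenly over a day."""
    return SECONDS_PER_DAY / requests_per_day


def requests_per_second(visitors_per_day: int, requests_per_visit: int,
                        peak_factor: float = 1.0) -> float:
    """Request rate for a given number of visits, scaled by a peak multiplier."""
    return visitors_per_day * requests_per_visit * peak_factor / SECONDS_PER_DAY


# 10k users a day, one request each: about one request every 8.6 seconds.
print(f"{seconds_between_requests(10_000):.2f} s between requests")

# Hypothetical heavier case: 20 requests per visit, peak hour at 10x the average.
print(f"{requests_per_second(10_000, 20, peak_factor=10):.1f} req/s at peak")
```

Even the pessimistic case lands in the tens of requests per second, which is the point being made about a 2 core VPS. It says nothing about hostile traffic, which is what the replies argue about.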


Using a CDN is actually the opposite of overengineering. Using a CDN is an easy, low cost way to outsource any performance concerns. Overengineering is trying to handle all the potential performance issues yourself, with custom caching, database indexes, code optimization, better hardware, etc. For many sites it is simply an insurance policy. Most drivers don't need auto insurance, until that one day you do.


Exactly this. The amount of cost & time to make a (WordPress/Drupal/random database backed CMS website) site run well on a cheap VM is significantly higher than Cloudflare's free & paid options. Those aren't my first choice in CMS these days but a lot of the small business web still run them & migrating isn't cheap from a technical or user training perspective.

If you happen to have a popular CMS like WordPress on a cheap VM, odds are you are going to get DDoS all the time, even if you only have 100 legit views/day. Cloudflare will reduce this dramatically.

I would guess any site not using Cloudflare (or someone similar) is more likely overengineering. As always though, every case is unique & it depends.


Absolutely bull poop.

First, setting up anything using some third party service like Cloudflare is already too much work and doesn't even work for many people in parts of the world Cloudflare has determined are undesirable.

Second, I can, have and do host popular CMSes on hardware much more modest than Raspberry Pi performance.

Third, "odds are you are going to get DDoS all the time"? Are you a Cloudflare shill? This is nothing but wildly hyperbolic. In a quarter of a century of hosting, I've had to deal with one specific DDoS actor. One.

How are you going to claim that "odds are you are going to get DDoS all the time"? Go ahead, provide evidence, although I'm sure you can't and won't.


Not sure how using Cloudflare is more work than setting up your own LAMP VM or installing WordPress on a SaaS even.

People put up fake WordPress logins as honey pots. I'm not sure what to say to this. If you host a WordPress site you're going to get lots of traffic trying to take your website down unless your provider is helping you block it. If you go outside on a summer day, the sun is going to be shining. I have never had a WordPress site that didn't get a ton of bad traffic. If it lived on a cheap VM with a MySQL database, PHP & WordPress, it was going to be under stress at least a few times a year. Tossing Cloudflare on it takes less than 10 minutes & a few years ago was one of the easiest/cheapest ways to get SSL on it. In my quarter of a century of hosting, I have had a lot of DDoS attacks & none of those sites got over 100k legit users a month. Most also didn't care about users outside their own country.

It doesn't matter to me if you use Cloudflare or someone else. I don't make money off it but I will admit it is one of my favorite providers by far. I do also really like how the executive team is personal, handles themselves online & reaches out to devs.


To host a Wordpress (as an example) site, you need to set up Wordpress and add content. To do anything with Cloudflare, you have to set up an account and configure things. If you don't use Cloudflare, you don't have to set it up. How could that be anything but extra work? Also, it's not standard - it's their own thing. And what happens if you use a VPN, or a provider that they don't like, or if you live in an area that Cloudflare simply doesn't like?

I have no idea how people putting up Wordpress honeypots is related to this discussion, but for everything else, you're advocating treating symptoms and ignoring the problem.

If you, or anyone else, want to run a Wordpress site and you expect a firewall or DDoS service to protect you from stupidity, it might work for a time, but it's not the best idea. If you don't rename your wp-login.php, that's on you. If you install 27 plugins that you don't really need then ignore the fact that they'll need constant updates, that's on you. But those are common sense things - again, the root issue should be addressed, so the symptoms never happen.

Also, if bots banging on your wp-login.php and/or "ton of bad traffic" are what you consider a DDoS, perhaps you really should consider basic site security. We call "a ton of bad traffic" normal.

I'd much rather a site that has fundamentally fewer problems than a poorly configured one that's "protected" by Cloudflare.

Oh - and what does "most also didn't care about users outside their own country" have to do with it? You're advocating FOR the idea of stratifying the Internet? Then I guess you really are a fan of what Cloudflare is doing!


People stick Cloudflare on the front less for the CDN/DDoS protection and more for the web application firewall, because their small business website is almost certainly some incredibly out of date Wordpress install with more vulnerabilities than you can shake a stick at.

Wordpress should integrate one of those "Wordpress to static site" plugins as the default because that's all 80% of the users need.


Yes and let's also not put it past an unscrupulous operator to be willing to pay a few dollars online to take down the website of one of their small-business competitors. It can and does happen.

Folks pointing out that their Raspberry Pi self-hosted static site résumé can handle 500 requests per second are missing the point.


Seriously.

The Wordpress approach of dynamically composing every page server-side made sense:

* Before AJAX made personalizing the 'logged-in experience' easy even on a mainly static site

* When CPUs were so slow that regenerating, say, 1000 static HTML files just because you updated your footer or your "top stories" sidebar would take an annoying amount of time instead of what, 4 seconds now?

* Before spambots essentially made it impossible to host a comments section, and Disqus and the Facebook plugin became the de facto choice for anyone still brave enough to try.

Due to the above, I can't imagine using PHP or even some sexier-today technology to dynamically just-in-time assemble HTML pages that 99-100% of the audience will be viewing statically.


On the other side of the coin - launching an attack that would overwhelm that vps has never been easier. Even if you're serving static pages and your 2 cores can fill the 10G nic with compute to spare, the pipe can be filled with enough inbound garbage that legit requests have their packets dropped. This isn't that hard in 2023. There are dozens of booter services for hire, and if you really are scraping the bottom of the barrel of hosting, some pissed off kid can probably convince their discord buddies to generate enough traffic to knock your site over.


> STOP. OVERENGINEERING.

Please tell this to every junior/cloud developer/architect when it comes to microservices.


Our company was knocked offline by a DDOS attack. It continued all day unabated until we signed up for a free cloudflare account. That was easy.

Flash forward about 8 years and I find myself using cloudflare to stave off the immense suffering of Azure/AWS by using R2 and Pages. As soon as more people find out how easy Cloudflare is and how much it can lube a product deployment, I expect a lot more of the internet to end up there by choice.

Cloudflare deployed my app better/easier than Microsoft was able to deploy it to Azure (Microsoft owns GitHub and Azure and still didn't have their shit ironed out...Cloudflare just works).

I had to tweak them both but Azure took hours and cloudflare was a quick, cleanly documented change.

I wish they offered a service to host my nodejs APIs.


I like Azure, GitHub, AWS for a lot of things but Cloudflare has its niches that are just better.

I haven't read into why Open AI uses Cloudflare instead of Azure services but I find it very interesting considering their Microsoft arrangement.


Sites don't need cloudflare, unless:

- They want to save on bandwidth costs

- They have to deal with some level of DDOS or site scrapers hitting every page at once

- They want to block IPs, geographies, ASNs, etc. without editing a server config

- They want speed & server-sided visitor analytics, email routing, security settings, redirect configuration, and DNS all in one dashboard
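For comparison, here is roughly what the "block IPs" bullet looks like if you do it yourself in application code. This is a minimal sketch, not how Cloudflare implements it: `BLOCKED_NETWORKS` and `is_blocked` are made-up names, and the ranges are reserved documentation prefixes rather than real offenders. Geography and ASN blocking would also need an external IP database, which is not shown.

```python
import ipaddress

# Hypothetical blocklist; these are reserved documentation ranges, not real offenders.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    print(is_blocked("203.0.113.7"))  # True
    print(is_blocked("192.0.2.1"))    # False
```

The catch is that every change to the list means a deploy or a config reload, which is the chore the dashboard takes away.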


You left out this one:

- They want their vendor to arbitrarily deny access to customers and prospects

Maybe they have a 100% success rate at blocking actual threats but they sure do have a lot of false positives. I get blocked or forced into captchas at least three times per week.

I used to report it to the site admins, but among the few that responded, almost none knew how to fix it. I no longer bother, so I suspect that the less clueless admins have no idea how many visitors Cloudflare has driven away.


> I used to report it to the site admins, but among the few that responded, almost none knew how to fix it. I no longer bother, so I suspect that the less clueless admins have no idea how many visitors Cloudflare has driven away.

It's an acceptable cost, versus other ways of dealing with abusive traffic.

Hell, we (sysadmins, back when that was a term people used) used to just blackhole all the IP blocks associated with certain countries—made the logs so very much quieter, and this was back when the whole Internet was a lot quieter to begin with. At least CloudFlare's less blunt than that.

Failing to block abusive traffic can be really expensive. Detecting it is always going to cause false positives. Admins are OK with those as long as they don't cost more than implementing a system with fewer false positives would. A half-percent tax on revenue (to pick a number out of a hat) in the form of lost customers is a reasonable trade-off for a lot of companies. You've got to have pretty serious scale before it's worth investing real money to try to shave that down by a couple tenths of a percent (you'll never get it to zero, and only places like Amazon have the kind of scale that make it worth attempting to closely approach zero)
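To put a rough number on "pretty serious scale", here is a back-of-the-envelope sketch. The figures are assumptions picked for illustration, just like the half-percent above: a better system costing $200,000 a year that recovers 0.2% of revenue (20 basis points).

```python
def break_even_revenue(annual_cost: float, reduction_basis_points: float) -> float:
    """Revenue at which recovering the given share of sales pays for annual_cost.

    One basis point is 0.01%, so 20 basis points is 0.2% of revenue.
    """
    if reduction_basis_points <= 0:
        raise ValueError("reduction must be positive")
    return annual_cost * 10_000 / reduction_basis_points


if __name__ == "__main__":
    # Assumed figures: $200,000/year for the better system, 0.2% of revenue recovered.
    print(break_even_revenue(200_000, 20))  # 100000000.0
```

Under those assumptions you need about $100M in annual revenue just to break even, before counting any abuse the looser system lets through.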


Absolutely. I had a post here about how Cloudflare is locking out Linux users[1] (which was momentarily resolved and they're back to blocking them again), but it was quickly flagged, suppressing any real discussion about it.

I have to assume jgrahamc was one of the flaggers, given their indignant comment at the top of this thread.

[1] https://news.ycombinator.com/item?id=36197401


Hi there! I work at Cloudflare. Our global performance for Linux users on challenge pages looks good at the moment, but I'd love to take a closer look.

Could you send me an email with details you have available, (rayID, IP address + website, or HAR file) at amartinetti at cloudflare.com?


See my comments below. Would appreciate you whitelisting the Opera Mini browser globally. I'll then be happy to take back some of the things I said.


Whitelisting the Opera Mini browser how? Asking Opera Mini to include a signature with requests? Based on user-agent? In any case, browser behavior can and will be copied by bad actors if doing so is a get-past-cloudflare-free-card. That's why it's not just browser, it's browsing habits based on your IP and location in addition to fingerprinting where appropriate in order to allow legitimate users to prove they're human via captcha.

Edit: even if you just mean whitelisting their proxy ip... that doesn't do much good either. It's like asking to whitelist tor - those IPs are blocked because a good amount of spam or malicious traffic originates from them, not because there are x0,000 users on each.


> Whitelisting the Opera Mini browser how?

> That's why it's not just browser, it's browsing habits based on your IP and location in addition to fingerprinting where appropriate

Mini is a barely configurable hosted browser with one IP, one location, and one fingerprint. It doesn't return anywhere near full HTML/JS/CSS, but highly cut-down code generated by Opera's server. Unless somebody has found a way to hack that server, I'm at a loss as to what damage it could cause.

This seems to me a case of sloppy use of overly broad security tools.


How about you actually look at your own logs on your own servers, rather than expecting people you exclude from doing your homework for you?


I've had a couple HN comment threads recently, about Cloudflare blocking me, and I contacted multiple sites about this, but so far no improvement.

Maybe the right people are hearing, but they consider it a necessary evil of negligible impact (accurately or inaccurately). Or they're stuck with Cloudflare, who says they'll take care of it. Or maybe the message is getting lost before it gets to someone who cares.


Could it have been the paragraph you removed for being in bad taste?


No. I had momentarily put up a comment asking for someone to vouch the post, but this is against the rules.


>I get blocked or forced into captchas at least three times per week.

Same.

I get blocked on websites I have accounts on if I use a VPN. I'm not just a visitor, I'm a member, and still get blocked by cloudflare. Other times it's a small or local business and to me the website just looks down. Then, if I bother to think about it and am willing to drop my VPN, suddenly the website works fine.

Treating VPN users as hostile is getting really fuckin old.


VPN users are hostile, just not you. VPNs are abused by bad actors all the time, using a shared IP that you are likely using too. If you had a dedicated IP I'm sure this would be different, though that makes you less anonymous. The shared IP is part of the reason why you get captchas.


It's still a false positive, which is what we're talking about, unless cloudflare itself is the bad actor and they're deliberately blocking normal VPN users. Which I'm sure there's money in, if you dig. Governments for one would happily pay to de-anonymize or block "misbehaving" citizens.


Personally, my favorite was CF emailing me to brag about how many robots/malicious actors they block, right after blocking me for being a robot/malicious. Like... that's sure one way to undermine your numbers...


> - They want their vendor to arbitrarily deny access to customers and prospects

You can turn this behavior off as site owner.


> Maybe they have a 100% success rate at blocking actual threats

They don’t.

There’s a new, working CF WAF bypass published almost every day in the bug bounty hunter circles.


TL;DR: Also off topic.

20 years ago I was beta testing a browser nobody's heard of and happened upon a mainstream PC accessories seller like pcconnection.com or the original cdw.com with a poorly designed site: malformed cookies, important information hidden inside nonstandard tooltips, etc. Don't recall their name, but they had domains for the US and Canada. I emailed them at least twice to point this out, but they blew me off. I was amused when they went out of business a few years later.

When you criticise Cloudflare, the common response is that the site owner has chosen to block this or that. I don't believe for a moment that Cloudflare offers a checkbox for "Spin the busy animation forever, never loading the site and never generating an error". This is what happens when a company dubs itself the Internet Police while not knowing what the f*ck they're doing. In the words of the Joshua AI from WarGames:

"The only winning move is not to play."

I suggest that it's impossible to do what Cloudflare is attempting without false positives, but they haven't figured that out yet. And when you complain in their public forum, do they apologize and send up the flares? No, you're met with arrogance:

https://community.cloudflare.com/t/browser-integrity-check-b...

I wasn't caught in this particular dragnet, but had the same thing happen to me this past winter with the Iceraven browser. I did complain to one of the affected sites, who weren't particularly helpful, and the issue disappeared on its own about 2 months later. Maybe somebody did start a public thread with Cloudflare, but knowing the likely response, I held off restarting that fight.

I use the Opera Mini browser sometimes (not recommended), which apparently has its worldwide server/endpoint in the Netherlands. There used to be a US endpoint but no longer. If you're using Cloudflare to block European traffic for one reason or another, okay, I get it. If you're using Cloudflare to block "unusual traffic" like VPN endpoints (no idea how it actually works), it would take them two minutes to do a reverse lookup on the IP and see

109.211.145.82.in-addr.arpa. 13531 IN PTR h18-05-12.opera-mini.net

and simply whitelist the IP globally. But they haven't...while a little voice nags at me, reminding me this is supposed to be their day job.
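The reverse lookup described above can be sketched with Python's standard library. The address 82.145.211.109 is read off the PTR record quoted above (the octets in the in-addr.arpa name are reversed); the function names are just illustrative.

```python
import ipaddress
import socket
from typing import Optional


def ptr_query_name(ip: str) -> str:
    """Build the in-addr.arpa (or ip6.arpa) name that a PTR lookup queries."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


if __name__ == "__main__":
    print(ptr_query_name("82.145.211.109"))  # 109.211.145.82.in-addr.arpa
```

Calling `reverse_lookup("82.145.211.109")` needs network access, and the answer today may differ from the record quoted above. Also note that a PTR record is set by whoever controls the reverse zone for that address, so anyone actually whitelisting on this basis would want to resolve the returned hostname forward again and check that it maps back to the same IP.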

On some level I'm okay with the status quo and letting my first paragraph scenario play out: smart businesses grow while idiots go bankrupt. With Chrome, Safari, and Edge controlling roughly 90% of the browser market, though, I'm not holding my breath. The rest of me considers this behavior anti-competitive and a growing civil rights issue. Imagine needing a toll road or ferry service to get somewhere but the owner tells you they don't like your car and please get another one. The average person would exclaim, "What? There's nothing wrong with my car!!" Ditto for whatever browser I choose. If it supports TLS 1.3 but doesn't like the web site's HTML/JavaScript/CSS, then maybe I'll consider switching...or maybe I won't. But that should be my choice, not theirs, and definitely not the choice of a middleman with delusions of grandeur. Web sites designed for use by the public need to be accessible by the public.


In the beginning of a project I want velocity and progress. CF gives me:

Easy SSL. Register domain at cost. Throw up static site on S3 equivalent (or app on whichever cloud), add Cloudflare and a couple redirects (no www to www, no ssl to ssl), done. It’s like gmail for simple web hosting. It’s also like early Google, friendly and useful and not trying to squeeze every dollar for shareholders and bonuses.
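The two redirects mentioned above (no www to www, no ssl to ssl) boil down to one canonicalization rule. A minimal sketch of that rule in Python, assuming default ports and a made-up function name; in practice you would configure this in the CDN or web server rather than write it by hand:

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url: str) -> Optional[str]:
    """Return the https://www. form of url, or None if it is already canonical.

    Assumes default ports; any explicit port in the input is dropped.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.startswith("www."):
        host = "www." + host
    target = urlunsplit(("https", host, parts.path, parts.query, parts.fragment))
    return None if target == url else target


if __name__ == "__main__":
    print(canonical_redirect("http://example.com/a?b=1"))  # https://www.example.com/a?b=1
    print(canonical_redirect("https://www.example.com/"))  # None
```

Doing both fixes in one step avoids a chain of two redirects for the worst case (plain http with no www).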

They lend us their very competent team at free or minimal cost. I totally get the single point of failure and monopoly but compared to AWS, Azure and Google complexity it’s a simple web app dream until I have time to care about certs etc.

Competitors need to build an easy alternative and get as good as the CF team. Which is way better than me at futzing with config I only touch once every couple years.


> - They have to deal with some level of DDOS or site scrapers hitting every page at once

Sadly, that is the norm with everything since Shodan made it trivially easy to find website targets. Once your hostname or IP ends up there, you will get blasted with exploits a couple of minutes after a 0-day is published.


I think blaming Shodan/etc is incorrect - Shodan's rise in popularity coincided with the rise of cheap, fast bandwidth at VPS providers enabling you to scan real fuckin fast.

It’s much more efficient to just do “zmap | ./exp” than it is to query shodan, get a limited amount of targets, etc.


It's also popular in the SEO crowd for artificial link building. With such a chunk of the web behind Cloudflare, the presumption is that it's one less fingerprint that is reasonably scalable.

For the most part, the bandwidth doesn't matter, the sites are made for bots.


With the cost of popular cloud vendors, Cloudflare is essential. The first bullet point is enough to drive people there.

You can also manage all of your internet infrastructure (domains, dns, etc) under a single dashboard that doesn't suck. Pry it from my cold dead fingers.


I run several services without cloudflare, but it's so tempting.

Their product is excellent, and so easy to use. You push a button, and suddenly a whole class of difficult problems disappears.

It's very hard to fight against the urge to use it.


This in combination with the above is a huge issue: people are so concerned with the "don't use Cloudflare" angle that they don't put any effort into creating alternative solutions for the problems that exist.

We can dismiss the problems all we want by saying "most people don't need it", but if they're using it, clearly they see a need and this has to be addressed not ignored. Which is why it's so hard to fight against the urge, so go ahead and use it if it saves you time and money. Until real alternatives exist, don't hinder yourself.


> You push a button, and suddenly a whole class of problems disappears.

and a whole new class of problems appears

(my regular cloudflare outage is when they decide to add new "security" that the existing "disable security" page rules don't apply to, great for API consumers)


>You push a button, and suddenly a whole class of difficult problems disappears.

This corroborates something I read that I'm paraphrasing: If you want a successful business, solve a problem and sell it.


As far as I remember it, Cloudflare was one of the first services to offer TLS on their free tier which bought an incredible amount of good will.

Between Cloudflare and LetsEncrypt I’ve never had to pay for an SSL certificate again.

Of course, the easy to use interface was also welcome. Using their CDN for small scale or hobby projects is total overkill though.



I don't like the centralization of Cloudflare, but it's the only way I can host public web pages out of my homelab with a DSL connection. I would rather have a CDN than put my content on a VPS as the servers remain 100% under my control and I can change providers anytime.


And Cloudflare seems to be increasingly denying access to sites from browsers with spoofed or hidden referers, which is an attack on individual privacy.


You can turn this behaviour off, it's called "Browser integrity check".


As a site owner, but not as a site visitor. And without cloudflare, the site owner wouldn't even have the ability to do this.

It also seems to be something they've moved to flipping on by default.


Isn't that the fault of human nature, and not Google or Cloudflare?

Centralization is a problem if the Internet breaks without the "too big to fail" piece. I'd think as long as you can move your content and aim DNS at the new location, you can use those services without being enslaved to them.


Well, if you say that a large part of the internet is using Cloudflare, then by that measure they are The Internet, together with all the rest of the services who are also The Internet.


"AOL is The Internet"

sorry, just had a flashback...


No site NEEDS Cloudflare, but many can benefit from their slew of products. From their CDN to DDoS protection, Workers, and even basic things like DNS management - they provide a really good user experience across the board. Sure, it adds another potential point of failure to the system, but for many - it's a worthy trade-off.


I host a very small blog and I am tempted to use Cloudflare, because it seems latency is very bad from some geographic regions. And I would guess every website wants reasonable latency from a large part of the world, right?


Latency matters much less than people think in many cases; especially if it's mostly a static site.

Page load times are almost always dominated by the megabytes of images and javascript that everyone seems to include now.


Very few people need seat belts. But if you don't have one on when you do you are going to have a bad time.


Where else can I set up a DNS proxy and block all but the countries I want within about 15 minutes for free?


There are very few efficient options outside of cloudflare to effectively protect against bots.

It's a lot of work to detect bots. I use a bunch of tricks on a certain website that return an XML bomb (to discourage bots).

Note: Google's Recaptcha wasn't sufficient


Downvoters have an easier solution? Then please elaborate, we are on a technical forum after all...


I'm sorry they're having trouble, but speaking just for myself, perhaps this means I will be able for a short time to view a very considerable number of web-sites which normally are inaccessible behind a permanent loop of "we must check the security of your connection".


With all due respect, what are you doing? I've literally never encountered such a screen.


In Switzerland, the national rail company's online services were down in the morning and people could not buy tickets.

Then news came out that MasterCard was also having problems during the day.

And then Reddit, HackerNews (it was totally down for me, not just slow), and now Cloudflare!


Welcome to the future!


Is this the reason for access issues to Hacker News at the moment?


I think it's just too many Reddit refugees at once.

PSA: if you browse HN in logged-out mode (e.g. a private tab), you can receive cached versions of HN threads that load much faster. The logged-in experience is currently worse than the logged-out: dynamic content generation for user-account stuff seems to be hitting a slow path on the server.


This isn’t the exact same, but there’s a similarity in order-of-magnitude differences and flow rates: Most professional pilots work for the airlines. When you remove military, it’s a huge majority. Most pilot jobs flow into the airlines. It’s a pretty cohesive labor market.

So, whatever market forces are going on for pilot labor in your little niche job are completely dominated by the airlines. Like you’d think banner towing jobs would be mostly affected by advertising trends, but they’re not. Passenger travel trends are way more important. (This is for the job market, not the business market)

Similarly, the flow from a way bigger site can overwhelm whatever you have going on.


Probably not, at least not directly. HN is hosted on M5 Computer Security at the moment and its DNS is on AWS. The load issues usually happen when a few popular threads have a really high comment count and dang usually has to enable pagination or other things to compensate.


> HN is hosted on M5 Computer Security

Do we know if it is just like, one tall MySQL instance and a few RubyOnRails API instances? Not sure what would be used for "caching" as well.

> The load issues usually happen when a few popular threads have a really high comment count and dang usually has to enable pagination or other things to compensate.

Why not auto-turn it on at like 500 comments or so?


> Do we know if it is just like, one tall MySQL instance and a few RubyOnRails API instances?

Two servers, one active and one standby. Both running Arc on BSD. No MySQL or Ruby.

> Why not auto-turn it on at like 500 comments or so?

Best to ask dang directly on that one (hn@ycombinator.com), as I have never seen the code and responsible adults don't let me near their code.


It always surprises me that a VC that has backed multiple tech unicorns can't run a scalable web site.


Considering all the trash that comes with heading in that direction (as has been repeatedly demonstrated by social media which has started off similarly to HN, only to turn around and shit on the people that helped them get going), I'm perfectly fine with a site that slows down a bit in special circumstances.


It's not like they can go grab employees from one of their funded companies to go work on HN, and they've clearly been happy with the results of their level of engineering investment in HN.

(Besides, as Reddit is showing, do you want your discussion site run like a tech unicorn? Arguably things were better for users in the skeleton crew days)


I always felt like scalability and growth aren't goals here, which is something I like to be honest.


This must be a major funnel for them. The fact they can't even run a robust text-only website ought to be an embarrassment, but here we are. Every time there's a major tech story the site grinds to a halt until one guy jumps onto a server and ninjas it or whatever. Unprofessional to say the least.


It's actually pretty impressive that they are running this from one server in their own programming language


Sure. Maybe they do the same as VCs. Use limited crappy resources in their own non-standard way to cripple growth.


What kind of reaction is that?


HN is written in Arc, an experimental lisp dialect created by Paul Graham. I believe it runs on a single instance? Not sure though.

I believe pagination does turn on automatically at 500 comments.


Looks like it is AWS. Given the fact that both Reddit and HN (and Apple, which is also experiencing issues) are on AWS, it looks suspicious.

Also, new account creation in HN is broken at the moment.


I don't know about Reddit, but I'm fairly sure HN isn't hosted at Amazon.


It was for a really short time last year when there was a bad hardware failure.


I think that is pretty unlikely. HN doesn't use Cloudflare at all as far as I know, and not to downplay the scope of the outage, but Cloudflare's core CDN Offering is fine. Other core services such as Cloudflare Pages/Workers/KV are also fine. This is mostly addon/developer products, the biggest of which (imo) being R2. If you log out, it loads a lot faster (at least the main/cached pages)


The advice in such cases: "logout". Hacker News has a much easier time when serving 'static' content.

...and I had to login just to say this...


HN's slowness is probably a knock-on result of Reddit's outage. It's where a lot of people go to discuss the Reddit shit-tornado.


I am getting 504 errors trying to load hacker news in Germany, I think it's possible it has to do with the cloudflare outage.


Why do you think that? Because of increased load from people visiting HN to discuss it?


No, because 504 means "Gateway Timeout", which means the error came from some gateway in between (like a CDN or reverse proxy) that timed out waiting for the origin server, rather than from the origin itself.
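For reference, the standard reason phrases for the gateway-related status codes can be checked against Python's standard library. A 504 is generated by a gateway or proxy that did not get a timely response from the upstream server it was forwarding to. The helper name below is made up for the example.

```python
from http import HTTPStatus


def reason_phrase(code: int) -> str:
    """Return the standard reason phrase for an HTTP status code."""
    return HTTPStatus(code).phrase


if __name__ == "__main__":
    for code in (502, 503, 504):
        print(code, reason_phrase(code))
    # 502 Bad Gateway
    # 503 Service Unavailable
    # 504 Gateway Timeout
```

Seeing a 504 only tells you there is some proxy layer in front of the site. It does not say which vendor runs that layer, so on its own it is not evidence that Cloudflare is involved.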


On a side note, I recently released durafetch.com which allows you to download your Durable Object state to a local SQLite database.

It allows you to observe and query your state in dev and prod.

Take a look: https://durafetch.com



We use Cloudflare primarily for CDN, but they have issues a few times a year and I am looking for an alternative. Any suggestions?


Amazon Cloudfront with WAF

Azure CDN with Front Door

etc.

etc.


I'm always excited when things go down like this because that means I get to read another postmortem


Sorting comments by date would be helpful on large outage threads like this that are now hours (4) old.


their business model is waiting until you're locked into Enterprise and then forcing you into ridiculous pricing. One account on business has a bill of $150 for Cloudflare Workers. On our Enterprise account they're quoting us $3,800 for the same thing.


Why not do what Orwell advised and call them "problems?"


If a foreign power wanted to amplify chaos online, this seems like an easy pre-scheduled time to do it?

But more likely just kids with endless hacked IoT available to DDoS whatever they feel like.

Before IPv6 there were "internet weather" services, are they still around?


So fed up of this crap. Switching back to fastly


reddit, hn, cloudflare down. This is a bad day for the internet


Lol what? TikTok, Facebook, Google, YouTube, AWS seem to be doing just fine.


But where do the NERDS go?


Twitter, YouTube, Google, AWS, TikTok? Or all the nerds posting screenshots to Twitter or posting screenshots of Twitter to HN and reddit, googling programming related stuff, using services deployed on AWS, and TikTok tech content creators not nerds?


hope they stay down



