For ridiculously easy things like this, I think it's smarter for you to just host it yourself. This way you are not forcing other people to carry your burden.
This is all done 100% within NGINX, and I include a lot of stuff you probably don't want or care about. Other web servers probably have similar capabilities.
Though the usefulness of the host (the client specified it), remote port (it's random), and server ip/port (it's static and not useful to the client) is questionable.
/cdn-cgi/trace isn't available on just cloudflare.com, it's on every site they proxy too (there might be a setting somewhere to disable it, but it's on by default). I've yet to figure out what happens if your site also serves something at that path, but it was surprising to see on a barebones site I had routed through them.
TIL. Looks like they have it documented (since Apr 2022, based on the github history): it cannot be modified or customized, and it sometimes causes problems with SEO crawlers!
There was a great trial CF ran for a long period with a previous version of encrypted SNI (the predecessor of ECH): with a modified OpenSSL you could get encrypted SNI for any site behind CF. But they stopped the experiment, and meanwhile ECH was still not ready.
If anyone is getting something other than "sni=plaintext" for a CF domain besides crypto.cloudflare.com, then please let us know how.
Isn't ECH still off by default in most browsers? The latest info I could find has it implemented but opt-in behind experimental flag in both Firefox and Chrome, and not supported at all in Safari.
That's neat, but that's the opposite of what I'm suggesting. I'm saying if you are hosting web content anyway, then just host a /ip endpoint that returns the IP address. That way you aren't forcing other people to carry your load.
If it's on the public internet, you're not forcing anyone to provide you a service. Standing up reliable infra is neither easy nor free, no matter how trivial deploying it may be.
Ok, I'll bite... What kind of machine do you host this on that can handle 400,000 of these requests per second (with TLS, mind you)? That was the load he mentions it handling in 2021; he stopped mentioning requests-per-day metrics after that.
I think you are misunderstanding... the point of self hosting is that you only need to handle your own load. It only needs to scale to what, 1 request per day?
Wasn't the original target of nginx back in the early 2000s to handle 10k concurrent connections (the C10K problem) with less overhead than the (at the time, standard) Apache?
With that in mind, plus the general advancing of hardware since then, it wouldn't be too surprising if nginx could handle that load easily. It'd just get more difficult if you wanted to put anything else (e.g. metrics) on that endpoint, since afaict nginx doesn't have anything to monitor those things unless you have the Enterprise version of their tools.
In any case you might be misreading the parent's setup here - the recommendation is just that if you're doing this in the context of a webapp, this sort of behavior is trivially easy to bounce back as JSON using nginx.
No, the point is, you host it for your own needs, so you are not using their service. Your implementation is for YOUR needs. You can make it public or not and advertise it or not at your discretion. What we do is make it public, but don't advertise it at all. Our software just uses it when needed.
If your use case calls for 400k reqs/second on what your IP is and you haven't yet figured out how to build out your infrastructure (or pay someone to do it for you), you have bigger problems than figuring out what your public IP is.
400,000 req/s is actually not crazy. Back in the single-core era I was doing 1/10th of that on a production server for a similar workload with a compiled VB / COM+ app. I would wager Nginx can handle that pretty easily on a modern dual- or quad-core system.
TCP connections are identified by a 4-tuple {PeerAIP, PeerBIP, PeerAPort, PeerBPort}; if you hold your IP and port fixed, because you only listen on port 443, you can do 64k connections to each other peer's IP... if the other peer can manage to use them all.
Chances are you'll never run into that limit for this service, because even if there's a ton of users behind the same IP due to NAT, they don't need to hold long sessions, and you can disable HTTP keep-alive and close connections right away. Chances are you'll have some IP diversity in 400k requests/second though.
The second one might be useful to verify one is not using local, ISP-provided DNS, or to see whether a DoH provider effectively geolocates its users, e.g., Cloudflare.
I have something very similar, and all my projects depend on that. I also made a copy of httpbin which I host on a server. Useful for unit testing. And a gateway for hookback urls that I use for relaying/logging oauth and similar things to internal vpns. All very useful gadgets that have been running for over half a decade on arch.
I can't say a thing about efficiency, but for maintainability, I have to disagree. This seems to be a Rust library, and to maintain that you'd need to write a program, compile it every time an update comes along, rerun the recompiled program, ...
With the parent comment example, all you need is nginx, available as a package on all distros, and a single static config file. With an auto-updating package manager and a service manager, one never needs to write anything else apart from that static config file.
Or did you mean something else with "actually maintainable"?
I would trust the rust compile/rerun process a lot more than an auto-updating linux distribution, personally. Sooner or later that auto-updating package manager will break your config file, itself, or both.
The majority of nginx distributions will not get any major bumps during the lifetime of the distros they're in. Whether that's because the distro itself has policies against that (Debian, Ubuntu) or because the distro version expires before nginx' stable release does (most of the ones with a higher update frequency, ie. Fedora) so by the time the version expires, they can just go to the next stable version on the release after. This is pretty much only a risk if your distro is completely rolling release and even those in my experience at least offer an "lts" package of some sort.
So that's not a problem. Besides, the majority of nginx setups don't rely on anything too fancy; most of the stuff that gets changed in updates and affects config files is in the fancier nginx modules.
> The majority of nginx distributions will not get any major bumps during the lifetime of the distros they're in. Whether that's because the distro itself has policies against that (Debian, Ubuntu) or because the distro version expires before nginx' stable release does (most of the ones with a higher update frequency, ie. Fedora) so by the time the version expires, they can just go to the next stable version on the release after.
If you're following that approach then having auto updates enabled is kind of irrelevant/meaningless; at some point you're going to be running a very old (and unsupported) version of your distro with all that that implies.
That is true, yeah. But the library in question will also eventually break the existing API / stop supporting compatible releases with security patches. I assume this nginx config will still work for many releases in the future.
Wow, I didn't realize the little article I wrote for Lifehacker would have such a big impact on this project. Also didn't realize he had passed on the project to Cloudflare. They're a great team so I'm happy it's in their capable hands now!
Not related to the story of scaling the tech but rather to the IP business in 2023:
I’m not affiliated with any of these services, but my go-to has been ip4.me, ip6.me, and ip6only.me because they’re short and memorable and because they acknowledge the IPv4/IPv6 split. The first two domains give you your v4 and v6 IP respectively, and the latter only resolves over IPv6 (useful to ensure your IPv6 is off when using a VPN). You can tack on /api to any of them to get a plaintext response.
A long time ago I spent a bit too long debugging something to later find out the "source port" that displays isn't right! It's still not right! I obviously fixed this by making my own site (https://ip.wtf).
... will do the right thing, or, if your user-agent isn't curl, send a header of "Accept: text/plain" and you'll get the plain text version (see https://ip.wtf/about for more).
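In other words, something like this (going by the description above; exact behavior may differ):

    curl https://ip.wtf                          # plain text, since the user-agent is curl
    curl -H 'Accept: text/plain' https://ip.wtf  # explicit, for clients that aren't curl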
ipinfo.io's API and site are dual-stack. But for an IPv6 connection, you need to use v6.ipinfo.io for the API service.
Even though nobody asked, I have to mention this every time there is a discussion about IPv6 and IP address data. According to domain name records, even if a site supports both IPv4 and IPv6 it defaults to IPv6. I would love to know why.
I have come across a lot of people saying IPinfo doesn't support IPv6. The internet as an ecosystem doesn't completely support IPv6, but we do. So, mentioning it constantly in every discussion is my current approach.
> According to domain name records, even if a site supports both IPv4 and IPv6 it defaults to IPv6. I would love to know why.
Newer is better? Good clients use the 'happy eyeballs' (RFC 8305) mechanism, which is roughly: open v6, wait a bit, open v4, use whichever connection comes back first. There's a bias towards v6 because it's often a more direct connection and may work better, but it was also common to have non-working AAAA records, so a delayed fallback is preferable to the hard failure that used to be common.
Don't forget about the new still-draft-but-already-used HTTPS DNS record type (no, that's not a typo)! I'm still fuzzy on its IP hint value(s) but it can potentially bypass A and AAAA records completely.
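If you want to poke at one, newer dig releases know the mnemonic while older ones need the raw type number; look for ipv4hint=/ipv6hint= in the answer. Assuming a domain that publishes the record, e.g.:

    dig +short HTTPS cloudflare.com
    dig +short TYPE65 cloudflare.com   # same query, for older dig versions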
The issue is that these return a bunch of HTML, so using them from the command line is not ideal.
I made wgetip.com back in 2008 to solve this. icanhaz must've been pretty suboptimal back then to have issues with traffic. wgetip still gets about 3M requests per hour. All from a single $5 droplet.
Slightly related: There used to be a DNS server that you could query a TXT record and the response would include the IP of the server that submitted the query. You could use it to debug DNS issues. I thought it was from DNS-OARC but I can't find it anywhere. Does anyone know a way to accomplish this?
This only seems to work if your client can send requests directly to resolver1.opendns.com. It does not work if you have a server that is intercepting DNS or if you use the system configured DNS. The google version posted below is closer to what I was looking for:
Note that if you leave off the `@service.authoritative.nameserver` portion of all of the above, you get the IP address of the recursive resolver that your machine is configured to use. If you pass this to a service like https://ipinfo.io/, you can use this to figure out which company's DNS service is currently configured, be it your ISP's or an internet company like Cloudflare/Google/Quad9.
Arguably, that is the only valid reason to be using an IP address echoer via DNS, as if you truly wanted your own IP, you could fetch it far easier from a service like icanhazip.com.
Personally, I find Akamai's offering by far the most useful and convenient--it's an A record, so you won't accidentally forget to add "txt" to your dig invocation, and it has a very easy-to-remember name. Google's works too, though it requires remembering the strange domain name. Cloudflare tries to be cute and uses the Chaosnet DNS class, which, while being an incredibly cool reference to internet history, unfortunately is not propagated by most public recursive resolvers. And like the parent comment mentions, OpenDNS somehow blocks requests that pass through a recursive resolver.
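For reference, the usual one-liners, written from memory, so double-check the exact names. The first three ask the provider's own server directly and so report your address; the bare Akamai query goes through your configured resolver and reports that resolver's egress address instead:

    dig +short myip.opendns.com @resolver1.opendns.com       # OpenDNS (must reach their resolver directly)
    dig +short TXT o-o.myaddr.l.google.com @ns1.google.com   # Google
    dig +short whoami.cloudflare TXT CH @1.1.1.1             # Cloudflare (Chaosnet class)
    dig +short whoami.akamai.net                             # Akamai: shows your recursive resolver's IP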
I don’t know of a server that does exactly what you wrote, but I generally troubleshoot DNS issues with a few dig commands.
dig without any extra parameters will resolve via your normal DNS server and display that server's IP.
With @(dns server ip here) you can query a different DNS server to check what its answer is.
With +trace you can go through the complete DNS chain from the root servers down to check for issues in the chain.
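Concretely (example.com and 9.9.9.9 are just placeholders here):

    dig example.com              # uses your configured resolver; the ";; SERVER:" line shows who answered
    dig example.com @9.9.9.9     # ask a specific resolver instead
    dig example.com +trace       # walk the delegation chain down from the root servers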
Sorry if this doesn’t help you or you knew this already.
Why not use a protocol that is specifically designed to report the mapped IP address, like STUN? It's faster, since a UDP exchange is shorter than a 3-way TCP handshake (let alone TCP+TLS).
Here is an implementation which uses parallel STUN queries to report address reliably as fast as possible: https://github.com/Snawoot/myip
But unfortunately I don't have it in a random docker/kubernetes container or random internal server, and those are the contexts where I personally usually want to check my external IP. They usually have curl though.
Because one requires you to merely type a simple URL in your browser that you already likely have open, the other requires you to install Go, clone a git repo, compile the program, install the program, and THEN finally run the program.
This is very cool and something that'd be neat to roll into a toolset. But I think you're missing the point of icanhazip. It's meant to be quick and easily accessible regardless of where you are or what system you're on (assuming there's curl or similar installed).
I might not have permission or it wouldn't be reasonable to install something. And I can link it to customers or give them a simple command to run, and all it returns is the actual info I need from them.
It's a fun project if you want to learn to code, which is why I think there're so many of them around. I built and run Accio127[1][2] for the same reason.
The amount of intel Cloudflare obtains from this would be staggering. Just the TLS signatures alone are a goldmine. Look at the Cisco Mercury product and their TLS database with malware detection. Or how Akamai bought https://www.fingerbank.org/
You got the short end of the stick for $8 reimbursement!
Ah Major Hayden. Great guy. I remember getting a crash course on system administration in Slicehost’s IRC channel because some idiot ran an ancient Java application server on an open port. We got hacked and I got volun-told to clean up the mess because I was the only person running Linux on my laptop.
Everyone was so nice and I learned so much. I owe them all a lot of drinks. :)
curl is being curl here: protocol, host, shortest possible user-agent string, nothing extra. Let's see the reply:
< HTTP/1.1 200 OK
Okay.
< Date: Tue, 01 Aug 2023 06:59:21 GMT
I am not sure, is it really necessary? I will use NTP if I need to know the current GMT. RFC doesn't state this header as mandatory.
< Content-Type: text/plain
< Content-Length: 14
Okay, nice to know.
< Connection: keep-alive
Really? What's a use case here? Do I need to be reminded of my IP again in a few seconds? Or is it in case my IP will quickly change? Oh, never mind...
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: GET
Now, this is a bit ridiculous. Why would a fetch-based browser app rely on a third-party service to determine the client's IP?
< Set-Cookie: [250 bytes of total abomination]
Why, oh why? Why do I need to receive this and keep it somewhere? All I want is to haz IP! Can I only haz IP?
< Server: cloudflare
Okay, a little vanity never killed nobody.
< CF-RAY: 7efc32adfef5c21e-VIE
I know what Cloudflare Ray is. The question is: why do I need it here?
< alt-svc: h3=":443"; ma=86400
Good to know, maybe, but to be honest - this is redundant too.
xxx.xx.xx.xx [my IP address, masked for privacy reasons]
At last! Now I can do my thing with the IP I just haz.
I will not rant here about extra bytes transferred, extra bandwidth congested, extra electricity burned, and so on. Sapienti sat. Two side notes: it replies with HTTP/1.1 to an HTTP/1.0 request, and it still puts an alt-svc header into the https reply.
You do know headers are transferred whether you look at them or not, right? A request to icanhazip.com results in a response that is 90% completely wasted bandwidth.
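Easy enough to check with curl's write-out variables (rough numbers; they vary per request and with HTTP version):

    curl -s -o /dev/null -w 'headers: %{size_header} bytes, body: %{size_download} bytes\n' https://icanhazip.com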
I used to use ipchicken and some others, but now I just type “what is my ip” into google and get it that way. Not suitable for automation but in my case it’s just a quick manual check.
I made a site like this to mess with a guy at work. Everyone knew of IPChicken, and his last name is Herring, so I created IPHerring. I'd put Photoshopped images of him on the site for different holidays and coworkers always thought it was hilarious. They would even send me ideas of what to put next.
Now I sometimes use AI image generation tools to really make it stand out. It's been a lot of fun. A lot of the tech folks in my area are using it now.
I use icanhazip. It is good; just the plain IP address, no HTML and no JSON. It could be done even more simply by just nc'ing to a specified host and port, so no HTTP is needed either; it need not wait for a request and won't need response headers.
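A rough sketch of that, using nmap's ncat on the listening side (the host and port here are made up):

    # server: write the peer's address and close; ncat sets NCAT_REMOTE_ADDR for -c/-e commands
    ncat -lk 3000 -c 'echo "$NCAT_REMOTE_ADDR"'
    # client: no request, no headers
    nc example.com 3000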
I'm so glad the late-00s era of cringy "lolcat english" is over. Perhaps the modern equivalent of excessively pretentious-sounding crypto-bro naming schemes is worse in some ways, but at least I don't have to pause to explain the name when I say it out loud to someone.
Not having to ask some NAT-performing router for the public-facing IP address is one of the good sides of IPv6. I'm wondering what uses there are for such a service, besides dynamic DNS updates, that would need automated queries and result in such an overwhelming amount of traffic...
A great tool I’ve used countless times after learning about it while working at Rackspace a decade ago. It was great to learn about how it’s grown over the years, what an ordeal!
Cloudflare is at the top of the list of custodians I would want if in a similar position, glad it’s found a new steward.
It's sad to think that everyone had to work so hard to solve a problem whose solution is 27 years old. One day we'll just be able to run `ip -6 addr show scope global` and get the correct answer without wasting everyone's bandwidth and compute.
what an amazing letter, thank you for your service!
icanhazip is part of internet lore, and passing it on to Cloudflare is a very noble thing to do! Cloudflare, please stay true to the idealistic principles of simplicity and availability for the service.
> Their sponsorship of icanhazip.com has saved me tens of thousands of dollars per month.
Can someone please explain how returning a few hundred bytes of plaintext response can cost thousands of dollars? Either I'm really bad at estimating things or there's some hidden cost somewhere I'm not seeing.
Also why could it not work fine behind a Cloudflare free tier?
> In 2021, the traffic I once received in a month started arriving in 24 hours. The site went from a billion requests per day to 30-35 billion requests per day over a weekend.
30 billion times 350 bytes (the size of my response) is 10.5 terabytes (base-10) a day. That's 84 terabits per day; 84000/3600/24 ~= 1 gigabit per second if my math is right. A fully saturated 1 gigabit pipe of worldwide internet costs a lot. So, there.
Really interesting. I had never heard of this website and was surprised by the volume of traffic it's getting. (I always used curlmyip.org or whatsmyip.net|.com). I wonder what the rationale is for providing a non-rate-limited service that can be easily abused like this. The volume almost suggests that stuff like IoT devices or other software use this to test internet connectivity. Or botnets?
I believe that building a rate-limiting system that is cheaper than simply echoing back the requestor's IP address is likely to be a complex task. However, I haven't personally done the math to confirm this.
He was using Hetzner and says it was costing him $200/month. He then had to spin up a second server, presumably increasing the cost, and it was still struggling.
From there traffic doubled, and he moved to Cloudflare Workers. And then traffic 30xed.
I suspect it's more about request count (>300k per second) than raw number of bits.
OP said a fully saturated gigabit link, which generally means you pay for transit ("size of the pipe") rather than for what's pushed through it. I haven't been in the market for a long time, but 5-10 years ago the price was on the order of $0.50 per mbps, so $500 for a 1,000 mbps ("gigabit") link. Usually you buy this with a dedicated server or colocation. You wouldn't want to host this in the cloud, because they price-gouge you on bandwidth by charging for the stuff pushed through the pipe.
And that's if you truly expect to saturate it. If you just want to take the risk and go for oversold fiber, you can pay some nominal price for "unmetered bandwidth" (with the caveat that if you're saturating the pipe, your host will probably make you pay for it eventually).
It mentions 2 petabytes of data per month, plus compute for a trillion requests per month. It sounds like all kinds of optimizations were made so probably it can’t be much cheaper. My guess is that the “tens of thousands” is calculated on CF Workers prices though.
nginx config example:

    location = /ip {
        default_type application/json;
        return 200 '{"ip":"$remote_addr","host":"$host","port":"$remote_port","server_ip":"$server_addr","server_port":"$server_port"}\n';
    }

Which will return something like this in JSON format:

    {"ip":"203.0.113.7","host":"example.com","port":"54321","server_ip":"198.51.100.10","server_port":"443"}