Google does this so they have click tracking data. But they don't need to mangle URLs in Chrome because it supports the `ping` attribute on <a> tags [0].

The ping attribute basically adds click tracking as a native browser feature so you don't need to do URL redirects. It also makes these analytics much easier for the site and more mysterious to the user. Looks like most vendors besides Firefox support it. (They were pretty opposed, as I recall.)
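
For illustration, here is a minimal Python sketch of the request a ping-supporting browser sends when such a link is followed, going by the hyperlink auditing section of the WHATWG spec: a POST with a body of `PING` and a `Content-Type` of `text/ping` to each URL listed in the attribute. The function name `build_ping_requests` and the `search.example` host are made up for the example, and the spec's conditional `Ping-From` header is left out to keep it short.

```python
from urllib.parse import urljoin


def build_ping_requests(document_url, href, ping_attr):
    """Describe the background requests implied by <a href=... ping=...>.

    Hypothetical helper: it only models the request shape, it sends nothing.
    """
    target = urljoin(document_url, href)
    requests = []
    # The ping attribute holds a space-separated list of URLs.
    for token in ping_attr.split():
        requests.append({
            "method": "POST",
            "url": urljoin(document_url, token),  # resolved against the page
            "headers": {"Content-Type": "text/ping", "Ping-To": target},
            "body": "PING",
        })
    return requests


if __name__ == "__main__":
    for req in build_ping_requests(
        "https://search.example/results?q=hn",
        "https://news.ycombinator.com/",
        "/url?sa=t&source=web",
    ):
        print(req["method"], req["url"], req["headers"])
```

The link's `href` stays untouched, which is why the user sees a clean URL while the site still gets its click report.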

If you're a Chrome user, there are some extensions that disable ping requests/link auditing [1]. (EDIT: a commenter noted that uBlock Origin already blocks these! So I recommend that over this obscure extension)

[0] https://caniuse.com/ping

[1] https://chrome.google.com/webstore/detail/ping-blocker/jkpoc...



Firefox has a browser.send_pings setting to control this, not sending any pings when it is set to false. This is explicitly a valid browser implementation of the ping attribute:

https://html.spec.whatwg.org/multipage/links.html#hyperlink-...

> 2. Optionally, return. (For example, the user agent might wish to ignore any or all ping URLs in accordance with the user's expressed preferences.)

The problem isn't that Firefox doesn't support the ping attribute, the problem is that Google fails to respect user requests not to track.


I noticed this part of the spec too:

> When the `ping` attribute is present, user agents should clearly indicate to the user that following the hyperlink will also cause secondary requests to be sent in the background, possibly including listing the actual target URLs.

> For example, a visual user agent could include the hostnames of the target ping URLs along with the hyperlink's actual URL in a status bar or tooltip.

Does any browser supporting pings actually do that??

Also the "Note" in that section provides a decent argument for supporting `ping`. Basically, users will have their clicks tracked anyway, but the `ping` attribute provides more transparency and a better user experience. Though the transparency part is debatable given browser implementations.


> Also the "Note" in that section provides a decent argument for supporting `ping`. Basically, users will have their clicks tracked anyway, but the `ping` attribute provides more transparency and a better user experience.

I see this argument used a lot for including user-hostile features in browsers. I don't think it's a good argument since having a browser implementation a) makes this privacy abuse easier and b) legitimizes the practice. Meanwhile as shown in this post, user-preference to disable the feature is ignored and simply worked around by websites.


Third party links won't be "tracked anyway" if you're blocking JS. That's the only reason to have this feature since a site can track activity from its own links via logs.


This is wrong: what Google currently does on Firefox is make you go through a link that logs that you clicked on it, then redirects you to your destination. This doesn't need any JavaScript, it just needs to log your HTTP request, so it should work on any browser (even, say, elinks).
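
A rough Python sketch of that log-then-redirect pattern, with no claim that it matches Google's actual implementation. The `url` query parameter, the `click_log` list, and the `handle_click` name are all assumptions made for the example.

```python
from urllib.parse import parse_qs, urlsplit

click_log = []  # stand-in for the server's access log (hypothetical)


def handle_click(request_path):
    """Return (status, headers) for a tracking redirect such as /url?url=..."""
    query = parse_qs(urlsplit(request_path).query)
    destinations = query.get("url")
    if not destinations:
        return 400, {}
    destination = destinations[0]
    click_log.append(destination)          # the click is recorded here
    return 302, {"Location": destination}  # then the browser is sent onward
```

Nothing here runs in the browser: any client that follows a plain HTTP redirect ends up in the log.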


Exactly. That's why I solved this problem by using DDG.

Although I admit some searches I have to send to Google to get the result I'm looking for.


Seems to me the old Emacs (or vi) saying has changed. Today it is "How do you know someone uses DDG? He'll tell you!"


How do you know someone uses Google? Well, virtually everyone does, so they probably do.


Same for me, but all google searches are done by DDG bangs like 'g!'


Using bangs defeats the purpose of using DDG in the first place.

And DDG tracks your clicks just like Google does.


Not at all. It's the number one reason I use it, because very often I know which site I want to run a query on. And having DDG as the default search engine in the browser gives me the ability to use the URL bar to directly query YouTube with !yt or Wikipedia with !w and a lot more.


>use the URL bar to directly query Youtube with !yt or Wikipedia with !w and a lot more.

My understanding is that both Chrome and Firefox can do this natively, i.e. without DDG.


Yes, you can add these; in Firefox it's called Search Shortcuts and a few are there by default. But there are a few reasons why I prefer DDG:

1. I don't have to add those things in my Browser

2. When I search DDG for something and don't find good stuff, I can just go to the search bar on the DDG page and add any !bang to query a search engine I like (!y for Yahoo, !g for Google, …)

3. No matter which Browser I use on which device, the only thing I have to change is the default Search engine and it's set up for anything I want to search. This is especially nice for me since I use Firefox on my Mac, Brave and Safari on my iPhone and Brave and Chromium on my Windows device


Yeah, me too. Love the bang system. Don't forget to put the exclamation point in front! "!g"


> Don't forget to put the exclamation point in front! "!g"

At least some of the bang keywords work with the "!" on either side. I tried "g!" and "w!" and they work OK.


For all the bangs I use it works either way but yes, the official way is with the exclamation in front


Seems like it's set to false by default. So it's not really a "user request" not to track so much as a "browser request" not to. Reminds me of the situation with the "Do Not Track" header where browsers sending it by default caused the signal to lose all meaning.


That wasn't what caused DNT to fail. What caused it to fail was that websites could decide whether or not to honor it, and honoring it would have meant a reduced ability to spy on people, impacting their income.


That was part of it. Obviously if sites had no choice in the matter then it wouldn't have mattered whether browsers enabled it by default or not.

Since it was a voluntary thing though, browsers sending it by default pretty much destroyed what chance there was of mainstream sites deciding to implement support for it. It's one thing to give up on tracking a small portion of users who explicitly opt-out, and another thing entirely to give up on tracking everyone except for a tiny minority who choose to opt-in.


If browsers weren't sending it by default, it wouldn't have any support because nobody would see enough traffic with it to bother implementing it.

The design is the problem, since websites that don't feel like honoring it don't have to, whether because everyone setting it would bankrupt them or because they can't be bothered to support their unprofitable users.


It's reasonable to assume that users choose a more privacy-focused browser intentionally, meaning that the "default" setting is intended by the user and not a decision that's made for them without their knowledge.


No, I don't think it's reasonable to assume that when the browser that broke DNT was INTERNET EXPLORER. Internet Explorer is perhaps the most notoriously unchosen browser to ever exist.


Well, in the relation between user and browser vendor it is quite a reasonable default. I can always change my mind if I want to give up my privacy.


No tracking by default means it's Opt-In - as it should be.


Arguably, the user requested it by intentionally choosing to use a browser with that default behavior.


Note the caniuse link says "While still in the WHATWG specification, this feature was removed from the W3C HTML5 specification in 2010." Another reason I prefer Firefox. Why implement a rejected non-standard feature whose primary purpose is to enable surveillance?


Seems like it's not meant to enable tracking so much as to improve its performance and UX, as the article demonstrates. (Google tracks clicks from Firefox users just fine without ping, it just does it in a more annoying way.)

Also probably worth noting that the W3C doesn't maintain an HTML standard anymore[1]; the WHATWG standard is the definitive one.

[1]: https://www.w3.org/html/


The fact that google made some aspect of their website behave better in chrome than in firefox is not really an argument that firefox is doing it wrong so much as yet another example of browser wars 2.0


Improving performance for tracking is enabling it. We should fight to get rid of tracking, not make it more performant.


> Why implement a rejected non-standard feature whose primary purpose is to enable surveillance?

Something that's in the spec that matters (WHATWG) but not the one that desperately pretends to still have relevance for HTML though it hasn't since it tried to push XHTML 2 (W3C) isn't “rejected” or “nonstandard” in any meaningful sense.


WHATWG, also known as We Have Aligned Totally With Google...

The company that has an effectively complete control over the "standard" and churns it frequently to discourage competition...


How that came to be is an interesting study in company PR. MSFT arguably should've had a much more prominent advisory position in WHATWG than Google, but WHATWG ended up solidifying in large part to counteract all the non-standard behavior folks experienced trying to develop cross-browser pages in the days when mentioning IE6 would cause a terrified silence to fall on any web dev department.


Why do you call this surveillance? You just voluntarily entered your search terms. Why shouldn't Google know which link you clicked? The main purpose of this information is to improve the service that you are using.


> Why shouldn't Google know which link you clicked?

Why should they? I asked them info. They provided a list of links. Why are they entitled to know which of the links I visited?


Because you have an option of not using Google at all. I think if you are providing a free service, you are at least entitled to know how the users use it.


Because it is surveillance. There’s plenty of surveillance examples with multiple uses and “improving the service” doesn’t blanketedly discount their other uses.

More importantly they make it a pain in the ass to copy the actual URL of a link without actually clicking it. If you right click a search result link then their JS edits the href to the Google tracking link. So you can’t actually examine the entire URL without risking opening it up and being tracked. At best you get whatever preview your browser shows on hover.


Definition of surveillance: close watch kept over someone or something (as by a detective)

Counting the clicks on the search results page is nowhere near. A cashier in the supermarket knows which items you are buying, it doesn't mean that you are under surveillance.

And most of all, if you are worried about surveillance by Google, why use them at all?


> whose primary purpose is to enable surveillance

Google search is good because it tracks what links people click and knows when they come back to go to a different URL on the page. If 99% of people visit the top result for a query, return, then hit the second one, chances are that the top result never answers what the search query asks.
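
To make that signal concrete, here is a toy Python sketch that computes, per result position, how often a click was followed by the user coming back and clicking something else. This is an illustration of the idea only, not a description of Google's ranking; the `bounce_rates` name and the session format are invented for the example.

```python
from collections import Counter


def bounce_rates(sessions):
    """sessions: for one query, each entry is the list of result positions
    a user clicked, in order. Returns {position: share of clicks that were
    followed by a click on another result}."""
    clicks = Counter()
    bounces = Counter()
    for clicked in sessions:
        for i, position in enumerate(clicked):
            clicks[position] += 1
            if i + 1 < len(clicked):
                # The user returned to the results page and tried again.
                bounces[position] += 1
    return {pos: bounces[pos] / clicks[pos] for pos in clicks}


if __name__ == "__main__":
    # Three users try result 1 first; two of them come back for result 2.
    print(bounce_rates([[1, 2], [1, 2], [1], [2]]))
```

A high value for the top position is the "nobody was satisfied with this result" hint described above.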


People here may not like to read "Google" and "good" together, but yours is a description of how things actually work.

Of all the tracking Google does, this is by far the most justified, and least concerning to me. I'd rather log out and search anonymously, if my concern was being put in a bubble, rather than block this kind of feedback for search result quality. Then, again, I primarily use DuckDuckGo and I wonder if they do anything similar.


Yes, ddg sends out beacons to improving.duckduckgo.com when you click as well to do this.

https://i.judge.sh/ragged/Derpy/chrome_5j8fWLGX6J.png

They have an info page on it: https://improving.duckduckgo.com.



> Google search is good because

Google search might have been good over a decade ago but today it's trash.


> Google search might have been good over a decade ago but today it's trash.

IME, it's still consistently far and away better than the alternatives. Part of the difference in perception of quality may be that over time it has come to use more personal signals to zero in on relevant results, and the people that complain about how bad it is overlap considerably with those who actively seek to deny those signals to Google.


> over time it has come to use more personal signals to zero in on relevant results

I am specifically disinterested in existing within an echo chamber.

When I search for a topic, I am looking for information that is most faithful to objective reality. A detailed explanation of the limits of our current understanding, or why my understanding / model is inadequate is orders of magnitude more valuable to me than something that will affirm that I am a smart, special person. Google used to be exceptionally capable of delivering those kinds of results, even if it took some work refining search terms. Over the preceding decade, their effectiveness in this regard has significantly diminished.


good point, that seems plausible. It's an unenviable choice though: good results from incessant surveillance (and nasty link re-writing per topic), or poor results if you use it infrequently.

[I wouldn't know. I've been using DDG for so long now, I can't remember the last time I used Google search. Maybe I've forgotten how much better G is. Truth is, though, DDG does what I need of it. Rarely come away without the answer I want. So no temptation to use Google. At. All.]


I still have the habit of adding !g into my queries when the results are bad.

It has been years since the last time I remember Google actually giving me better results than DDG (except when I only want product sellers; in that case it has been around a year).

Yes, maybe if I let Google see even more of my life, they would be able to get me better results. But they have access to much more than I'm comfortable with already, and the results aren't there.


10 years ago it was probably way worse than it is today, but people have this fantasy of a good old Google that found everything magically.


The first page of "search results" is usually filled with ads and Google scraping and special casing dozens of sites. It is almost at a point where I have to skip to the "second" page just to get to the actual search results.

Also Google broke a lot of search qualifiers for stupid reasons, like the '+' when they created Google+.


Occasionally I accidentally use Google search when I've got an automated Chrome window open, and I find it astounding how unhelpful the results are. Everything that is returned is clearly optimized for what Google thinks readers want in a webpage rather than what actually matches my query. DDG isn't perfect but I find it better respects my queries.


Google is objectively a better search engine than it was 10 or 15 years ago.


10 or 15 years ago, I could actually find what I wanted without it second-guessing my queries and rewriting them into irrelevance.

Now it's absolutely useless for the hard-to-find information that you most need a search engine for. That's not "objectively better" at all.


I wish you could give us a single example.


IC part numbers. Service manuals for various equipment (now all you get are sites which may or may not have one, but are willing to collect your $$$). Error codes (filled with pages of results for a different code).


A solid example? I search for some of these all the time and don't remember any particular issue compared to the past.


I use Google search only for one thing: shopping.

Is this the intention behind everything at Google - advertising/shopping/consumption? If so, congrats to Google I guess.


The ping attribute is blocked by uBlock Origin. It's called hyperlink auditing in the dashboard.


It’s also a notable DDoS amplification vector[1].

[1]: https://securityaffairs.co/wordpress/83890/hacking/ddos-html...


I'm confused as to why ping was used in that situation at all, rather than, for example, just a normal POST request.


Doesn't require running javascript, so presumably the devices could be more efficient in sending them versus XHR/fetch.


The article specifically says the offending pages used JavaScript to add the ping attribute to the <a> tags, so the attack wouldn't have worked against users with JS disabled anyway.


It does use ping on Chrome browsers. For example, this is what the <a> tag looks like on Chromium:

<a href="https://news.ycombinator.com/" data-ved="2ahUKEwiIxrz0jKDzAhUWHcAKHQnnArkQFnoECAcQAx" ping="/url?sa=t&amp;source=web&amp;rct=j&amp;url=https://news.ycombinator.com/&amp;ved=2ahUKEwiIxrz0jKDzAhUWH...">

and this is what it looks like in Firefox:

<a href="https://news.ycombinator.com/" data-ved="2ahUKEwj9i67MjKDzAhXUfMAKHWJcCYsQFnoECA0QAx" onmousedown="return rwt(this,'','','','','AOvVaw3F-2xUE22tTvOxNDwVufx-','','2ahUKEwj9i67MjKDzAhXUfMAKHWJcCYsQFnoECA0QAx','','',event)">

You can see that Chromium based browsers call a ping endpoint whereas Firefox browsers use a mousedown event. This device detection uses the user agent; changing it on Firefox to look like Chrome results in a ping attribute instead of mousedown.
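
As a hedged Python sketch of what that kind of user-agent switch could look like on the server side: the `render_result_link` function, the `/url?url=` endpoint shape, and the `rewriteToRedirect` handler name are all hypothetical, and this is not Google's code.

```python
from html import escape
from urllib.parse import quote


def render_result_link(user_agent, href, text):
    """Pick a click-tracking mechanism from the User-Agent string (illustrative only)."""
    safe_href = escape(href, quote=True)
    if "Chrome/" in user_agent:
        # Browsers assumed to send pings get a clean href plus a ping attribute.
        tracking = 'ping="/url?url={}"'.format(quote(href, safe=""))
    else:
        # Everyone else gets a handler that would swap in a redirect URL on mousedown.
        tracking = 'onmousedown="return rewriteToRedirect(this)"'
    return '<a href="{}" {}>{}</a>'.format(safe_href, tracking, escape(text))
```

Because the choice hinges only on a substring of the header, spoofing the user agent flips the markup, which matches the behavior described above.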


My understanding of the actual amplification vector is that the JS is just obfuscation on top: they could have just as easily deployed static HTML with those attributes.


Wow, I had no idea about this.

At least with the mangled link approach it’s easier to tell that tracking is going on, but that ping attribute seems extraordinarily sneaky to me. I get that it enables “clean” links but the opaque tracking is way worse in my eyes.

Sigh.

Edit: when I think about it, I guess it’s not that dissimilar to what you can do with JS based tracking anyway, so perhaps it’s not really any worse than what already exists. But it still feels wrong for some reason.


For Firefox, there's a relatively popular extension called ClearURLs that sanitizes most URLs to remove tracking, including Google's and Amazon's.
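
For a sense of what "sanitizing" a redirect link involves, here is a minimal Python sketch that pulls the real destination out of a tracking URL. It is not ClearURLs' actual rule set; the `unwrap_redirect` name and the default `url` parameter are assumptions for the example.

```python
from urllib.parse import parse_qs, urlsplit


def unwrap_redirect(link, param="url"):
    """Return the destination carried in a tracking redirect, or the link unchanged."""
    values = parse_qs(urlsplit(link).query).get(param)
    return values[0] if values else link


if __name__ == "__main__":
    print(unwrap_redirect(
        "https://www.google.com/url?sa=t&url=https%3A%2F%2Fnews.ycombinator.com%2F"
    ))
```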


I did notice a bug with this... I had a magic link for authentication in Gmail that used a `+` symbol in a URL, e.g. `http://example.com/token/abcd123+3cf==` and ClearURLs ended up converting the `+` to a `%20`, which caused the server to fail to find the token.
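
One plausible way a `+` ends up as `%20` (an assumption about the cause, not a diagnosis of ClearURLs itself) is a decode/re-encode round trip that applies form-style decoding to the URL. A small Python demonstration using the example token:

```python
from urllib.parse import quote, unquote, unquote_plus

token = "abcd123+3cf=="

# Form-style decoding treats "+" as an encoded space...
decoded = unquote_plus(token)
# ...so re-encoding turns the original "+" into "%20".
broken = quote(decoded, safe="=")

# Plain percent-decoding leaves "+" alone, so the round trip is lossless.
intact = quote(unquote(token), safe="+=")

if __name__ == "__main__":
    print(broken)
    print(intact)
```

The server then looks up a token containing a space instead of a plus sign and finds nothing.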

Otherwise I love ClearURLs.


Or you can add https://raw.githubusercontent.com/DandelionSprout/adfilt/mas... to your uBlock Origin filters


I've tried other uBlock Origin filters for URLs and they weren't as good. Does this work flawlessly with Google, Amazon and other major players?


I've not had any trouble with it.


I see no cases when I, as a user, would want my browser to support the "ping" attribute. It's basically shady as fuck.


It would allow Google or other companies to track which result the user clicked without resorting to more complex JavaScript tracking or redirects.

I bet anyone would prefer the ping method rather than the redirects we see on Firefox that mangle copied urls.

You seem to be of the opinion that no tracking would be better. And that's fine and a popular opinion around here. But that's not an option as Google relies heavily on the clicks as an input for ranking.

So in a context where you consider tracking HAS to happen ping does offer advantages for the user.


This might hold up better if Google's rankings were actually good for all of the spyware they add. Here's a search I performed earlier today:

https://l.sr.ht/u1vK.png

The desired result is indicated in red, well below the page break. All of the other results are blogspam, SEO hacking, and mostly useless "featured snippets" and "people also ask".


Because then you could at least see the URL you're going to instead of the redirect the site is going to use to track you anyway.

This attribute doesn't do anything shady, and would do the opposite if it were actually used. The whole idea is to be able to provide the tracking data the site will get one way or the other, but with the ping attribute, you can do it without mangling the URLs.


If everybody was using "ping" instead of their various other tracking solutions (Javascript, redirecting) then you could just disable it in browser settings.


That's the whole point: to give link tracking a simple consistent interface, which makes it much easier to implement (no JavaScript libraries or proxy URLs) and to disable (a user agent can very easily choose whether to respect the ping).


> which makes it much easier to implement (no JavaScript libraries or proxy URLs)

I don't want it to be easy to implement.

> and to disable (a user agent can very easily choose whether to respect the ping).

This post is about Google working around that and just falling back to the old redirect-based click tracking for browsers that do not enable pings by default.

So having the attribute increased the browser complexity, brought no value to the user and only helped the tracking industry.


I don’t want my clicks to be tracked!


They are tracked regardless. Either through ping or redirect urls.


Then call your ISP! They are tracking them, too!


Okay? Does that mean my search engine should do that too? Two wrongs don't make a right.


You just told your free search engine all the different variations on the search strings you're looking for, basically everything that is on your mind, and you've chosen your search engine based on the quality of the search they provide, and now you don't want them to know which result you chose?

I'd argue that it's the one piece of information they are most entitled to in the tit-for-tat, you help me and I'll help you arrangement you two have.


Sorry, my wording insinuates it's an either or. I meant to point out this is a war that needs to be fought on multiple fronts, and arguably your ISP has the better data (with the worse IT security, to boot)


Even through HTTPS? They'll have the domain and IP address and transfer size, but not the URL or contents of the traffic. (Unless they managed to MITM a trusted certificate somehow.)


HTTPS encrypts the host? Thought you had to know where to go to open that secure transmission. It's enough for your ISP to know you went to "pornhub.com" for example.


Yeah, despite ESNI and DNS-over-HTTPS it's likely an ISP could still effectively track usage of certain large-ish sites by IP address alone. Compare against the anonymity inherent in accessing s3.amazonaws.com/some-bucket/some-path.


They won't know when you use an alt DNS or DoH.


Your ISP needs to know the IP address of the site to route your TCP packets there, and they can easily do a reverse DNS lookup[1] on it. So hiding your DNS query from them won't prevent them from knowing what site you visited.
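
A short Python sketch of that lookup, using only the standard library. `ptr_name` shows the name a PTR query is made for, and `reverse_lookup` performs the query through the system resolver; both function names are made up here, and what comes back depends entirely on how the address owner has set up reverse DNS.

```python
import ipaddress
import socket


def ptr_name(ip):
    """Name that a reverse DNS (PTR) query for this address is sent for."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip):
    """Ask the system resolver for the hostname behind an IP, or None."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:  # covers socket.herror and socket.gaierror
        return None


if __name__ == "__main__":
    print(ptr_name("192.0.2.10"))
    print(reverse_lookup("8.8.8.8"))  # needs network access
```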

[1] https://en.wikipedia.org/wiki/Reverse_DNS_lookup


Exactly. At the end of the day, computers need a public address to find each other. And if you can find it, so can they.


You'll also need ECH, to avoid leaking your TLS handshake's SNI.


So that's what is going on https://news.ycombinator.com/item?id=21427341

I thought I was the only one.


Even without ping, it still doesn’t make sense why they can’t just use a JS event hook on click to fire a track/log request in the background instead of having to do a redirect. They already do this on link copy.


Why can't they just add a click-listener to the links?


This can be blocked easily in a variety of ways with a forward proxy as well.

Google generally does not allow the POST method for user-initiated queries, e.g., from HTML forms. However, the POST method is commonly used for tracking.

Using the web today with a "modern" browser and trying to exercise the slightest amount of control is like being in a Spy-vs-Spy MAD magazine comic strip.


Is this turned on by default in Chromium too?


Not to be facetious but Wow - I'm not sure how we let that "feature" slide past the "privacy" folk.


It didn't slide past. I recall a lot of discussions but those discussing and outraging are not the ones implementing and pushing forward with sheer momentum of technological dominance and money. It's the same for every privacy issue.


This is actually well-known and gets blocked by uBlock Origin and others.


The Google monopoly doesn't give a fuck.


Support for the ping attribute is one of the several reasons why I don't use Chrome/Chromium.



