Google +1 tracks mouse movements? (stackoverflow.com)
97 points by Gullanian on July 23, 2011 | 26 comments


According to the link, this is being used as a source of entropy to generate random numbers. Fascinating.


Seems pretty reasonable to expect it to be this code from the Apache Shindig library:

http://svn.apache.org/repos/asf/shindig/trunk/features/src/m...

From the file: "This code implements a safer random() method that is seeded from screen width/height and (presumably random/unguessable) mouse movement, in an effort to create a better seed for random()."

Its aim is to solve the problem of gadgets that are relying on secret RPC tokens to validate identity.
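A rough sketch of the idea the Shindig comment describes (not its actual source; `mix` and `seedFromEntropy` are illustrative names, and djb2-style mixing is just one plausible choice):

```javascript
// Illustrative sketch, not Shindig's code: fold screen dimensions and mouse
// coordinates into an integer seed for a PRNG.
function mix(seed, value) {
  // djb2-style mixing step; any decent integer hash would do.
  return ((seed * 33) ^ value) >>> 0;
}

function seedFromEntropy(screenW, screenH, mouseSamples) {
  let seed = mix(mix(5381, screenW), screenH);   // fairly static inputs
  for (const [x, y, t] of mouseSamples) {        // presumably unguessable inputs
    seed = mix(mix(mix(seed, x), y), t);
  }
  return seed;
}

// In a browser you would feed this from a mousemove handler, e.g.
//   document.addEventListener('mousemove', e =>
//     samples.push([e.clientX, e.clientY, Date.now()]));
```

The mouse samples matter because screen width/height alone take only a handful of common values and are trivially guessable.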


Aren't there a lot of patterns on how we interact with an interface?

For example, most people on Facebook move their pointers to check their notifications/new messages/friend requests. Wouldn't that produce a lot of "not so random" random numbers?

(I am not looking to start a conversation on how random random numbers are, just curious on why they used this specific technique)


Hah!

That's so simple, yet so brilliant.


Mouse movements have long been used as a source of entropy in desktop applications.


Indeed. PuTTY's key generator has been using it for quite a long time, for example.



I read that PGP did that in the '90s.


I'm not sure how often research papers reflect reality, but Bing may be doing this as well, according to this Microsoft Research paper: http://jeffhuang.com/Final_CursorBehavior_CHI11.pdf


This is a very good paper, but it seems to approach mouse tracking specifically from the search perspective. Perhaps not extremely pertinent, but still super interesting.


Google Analytics does it too, which covers approx. 85% of all web sites you visit.


adblock'd sites don't do it :)


cute, but surely there are less intensive ways of generating random numbers?


I've always been a fan of using electronic noise. It's not complicated to build a device which does this. Weekend project to "hello world" scale if you're not too picky about specifics.

Easiest way is probably with a webcam, which will also give you a pretty good bitrate. The general scheme is to read out the bias noise. You do this by blocking out any incoming light to get dark noise + bias noise. Dark noise is an assumed static shift due to CCD characteristics, so read N frames of this and look for the median signal. Subtract that off and you've got the bias, or at least something that's close enough for a weekend project.

Better version, start reading up on the math of noise sources in whatever device you want to use for a sensor. Also do an analysis of N samples to see how close the result comes to the expectation value given the type of noise involved and what deviation is expected at N samples.
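A sketch of the dark-frame procedure above, with simulated pixel buffers standing in for frames from a lens-capped webcam (all names and shapes here are illustrative):

```javascript
// Per-pixel median over N dark frames estimates the static component of the
// signal; the residual fluctuation in a fresh frame is the noise you keep.
function medianFrame(frames) {
  const n = frames[0].length;
  const out = new Uint8Array(n);
  for (let p = 0; p < n; p++) {
    const vals = frames.map(f => f[p]).sort((a, b) => a - b);
    out[p] = vals[Math.floor(vals.length / 2)];
  }
  return out;
}

function noiseBits(frame, staticLevel) {
  // Subtract the static per-pixel level; keep one least-significant bit of
  // the residual per pixel as the raw entropy stream.
  return Array.from(frame, (v, p) => (v - staticLevel[p]) & 1);
}
```

A real implementation would then whiten the raw bit stream (e.g. by hashing blocks of it), since adjacent pixels can be correlated.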


There's no way to deploy this over the web, unless you use Flash to acquire their webcam video or audio.


Packet latency, for example.

The general idea is that there are many noise sources which technology normally needs to route around. To find random number sources, you reverse this.


Even something as simple as packet latency is really hard to get from JS. The timing functions you get from the browser are way too low resolution. Also, the only network latency you can test for is via an XMLHttpRequest, as that's the only way to do any network communication (minus WebSockets, which aren't generally available).

Aside from that, you get no direct hardware access to measure, and even if you did, there is still the timer resolution problem.

Very recent browsers provide an API to get strong random numbers, but this is even less widely available than web sockets.

So either you take the mouse movement or you use something like Java or Flash, or you use bad numbers.


> *the timing functions you get from the browser are way too low resolution.*

The JS time function returns epoch milliseconds, not seconds. At the scale of network latency (hundreds if not thousands of milliseconds), the lower order bits should be effectively uniformly distributed. Resolution isn't your problem, the number of usable bits per network call is.
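A sketch of that idea (function names are illustrative; it assumes the millisecond clock genuinely updates every millisecond):

```javascript
// Treat only the low-order bits of each round-trip measurement as random;
// the high bits of a timestamp are predictable.
function lowBits(ms, nBits) {
  return ms % (1 << nBits);
}

function mixLatency(pool, beforeMs, afterMs) {
  const latency = afterMs - beforeMs;
  // Fold 4 low bits of the completion timestamp and 4 of the latency
  // into the running entropy pool.
  return ((pool * 31) ^ (lowBits(afterMs, 4) << 4) ^ lowBits(latency, 4)) >>> 0;
}
```

At a few usable bits per network round trip, gathering enough entropy this way is slow, which is the "number of usable bits" problem the comment above points at.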


The problem is that the JS engines in browsers rely on not-so-sophisticated APIs to get to their millisecond values. This means that yes, you do get time in milliseconds, but the value is only updated every 40 to 100 ms.


Perhaps not on modern browsers. In Chromium on Linux, typing this in the dev console:

    x = new Date().getTime();
    for (j = 0; j < 10000; j++);
    y = new Date().getTime();
    for (j = 0; j < 10000; j++);
    z = new Date().getTime();
    alert(x + ' ' + y + ' ' + z);
gives me high resolution timestamps:

1311577733069 1311577733083 1311577733099


I think some browsers on Windows are using QueryPerformanceCounter for timing, which has microsecond precision.


obsessive correction of small, largely irrelevant detail (while upvoting in general agreement): what you subtract (the constant shift, which actually does have some structure) is the bias. what you're left with is the dark noise.

you could also point the camera at a flat white surface - that would give you a lot more noise (so much you probably don't need to worry about subtracting anything, if you just take the least significant bits from each pixel)

[i hope i'm right - i wrote astronomy image reducing software for a living]


You're right, my error.

I haven't used IRAF in a long while, but I'm still going to have to chalk that up to "stupid things I said in the absence of sufficient coffee."


Seems like, overall, it's code to help thwart gaming of the +1 system, as there is already a cottage industry that offers (presumably automated) +1-clicking services for money.



Does the emperor wear any clothes?



