It’s interesting to see Google staying consistently involved in the medical AI space going back to the PaLM / Med-PaLM days. Whatever is driving that commitment, it’s nice to see; there’s a lot of real upside here in terms of improving patient outcomes.
I was a bit surprised that only the 4B MedGemma variant was updated, but after checking the Hugging Face download stats it makes sense. The 4B model has ~350k downloads, while the next most popular variant sits around ~25k.
It turns out the Cursor one is stitching together a ton of open source components already.
That said, I don't really find the critique that models have browser source code in their training data particularly interesting.
If they spat out a full, working implementation in response to a single prompt then sure, I'd be suspicious they were just regurgitating their training data.
But if you watch the transcripts for these kinds of projects you'll see them make thousands of independent changes, reacting to test failures and iterating towards an implementation that matches the overall goals of the project.
The fact that Firefox and Chrome and WebKit are likely buried in the training data somewhere might help them a bit, but it still looks to me more like an independent implementation that's influenced by those and many other sources.
> The fact that Firefox and Chrome and WebKit are likely buried in the training data somewhere might help them a bit, but it still looks to me more like an independent implementation that's influenced by those and many other sources.
They generate a statistically appropriate token based on a very small context window. And they are slightly nerfed not to reproduce everything verbatim because that would bring all sorts of lawsuits.
Of course they are not reproducing Webkit or Blink or Firefox verbatim. However, it's not an "independent implementation". That's why it's "stringing together a bunch of open-source components": https://news.ycombinator.com/item?id=46649586
Edit: also, this "independent implementation" cannot be compiled by their own CI and doesn't work, apparently.
My first thought was that this is the kind of thing an LLM writes and nobody checks. But then I realized any decent LLM would probably have caught that inconsistency.
Yeah, the good ones will use tools and give you a nice markdown table. For any claims like this these days, I trust AI to go through them and give the numbers a reality check. With all the claims made about datacenter resources, power, and water usage, it's darkly funny how bad people are at understanding these things.
The math they do from their assumptions is usually pretty good, and you can tell when they put in the effort, but wow, the models and assumptions themselves are all over the place.
If the AI industry takes a hit because people are returning to offline hobbies, it’s a signal we’ve been building the wrong things.