
The leak last night seems to indicate this will be coding focused.

I'd imagine this must be a big leg up on Anthropic to warrant the "GPT-5" name?



I'm guessing they realized they have to rip off the bandaid and release a GPT-5 at some point, and we're gonna see a relatively incremental improvement.


It's very doubtful that they'd have any kind of magical breakthrough that makes the model anything other than incrementally better right now.


How do you figure? They’ve hinted that the reasoning breakthrough used to achieve gold in the IMO will be here in GPT-5.


What breakthrough? The self-awarded "gold" IMO result was achieved by running the model for over 1hr per question.


That sounds like a breakthrough to me. I don’t think GPT-4 could accomplish the same thing given several hours to try.


Said another way, 30 min less than what humans get? It’s on average 90 min per question.


And how much energy does a human being consume while spending 90 minutes on an IMO question?


Probably more. 200 kcal (a shrinkflated bag of chips) is about 232 watt hours. A typical 4o query is 0.3 to 3 watt hours.

https://epoch.ai/gradient-updates/how-much-energy-does-chatg...


But how much time does that 0.3 watt hour query take to run? They imply that an individual ChatGPT query takes 0.3-3 watt hours, but most queries come back in seconds, so we need to scale that over a whole hour of processing.

Edit: Scrolling down: "one second of H100-time per query, 1500 watts per H100, and a 70% factor for power utilization gets us 1050 watt-seconds of energy", which is how they get down to 0.3 = 1050/60/60.

OK, so if they run it for a full hour it's 1050*60*60 = 3.8 MW? That can't be right.

Edit Edit: Wait, no, it's just 1050 watt hours, right (though let's be honest, the 70% power utilization is a bit goofy - the power is still used)? So it's roughly 4.5x the energy (1050 vs. 232 watt hours) to solve the same question?
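
For anyone who wants to check the unit conversions, here is a rough sketch in Python. It takes the figures quoted in this thread at face value: 200 kcal for the human, and one H100 at 1500 W with a 70% utilization factor for the model. The single-GPU, full-hour scenario is purely an assumption for the sake of comparison, not a known figure for the actual IMO run, and the function names are made up for illustration.

```python
# Back-of-the-envelope energy comparison using the numbers quoted in the thread.
# Assumption: a single H100 running for the whole hour (hypothetical scenario).

JOULES_PER_KCAL = 4184   # 1 kcal = 4184 J
JOULES_PER_WH = 3600     # 1 Wh = 3600 J (1 watt for 3600 seconds)


def kcal_to_wh(kcal):
    """Convert food energy in kilocalories to watt-hours."""
    return kcal * JOULES_PER_KCAL / JOULES_PER_WH


def gpu_energy_wh(seconds, gpu_watts=1500, utilization=0.7):
    """Energy in watt-hours for one GPU running for `seconds`."""
    watt_seconds = gpu_watts * utilization * seconds
    return watt_seconds / 3600


human_wh = kcal_to_wh(200)        # the bag of chips
query_wh = gpu_energy_wh(1)       # one second of H100 time
hour_wh = gpu_energy_wh(3600)     # one full hour of H100 time
ratio = hour_wh / human_wh

print(f"human, 200 kcal:   {human_wh:.1f} Wh")   # 232.4 Wh
print(f"one-second query:  {query_wh:.2f} Wh")   # 0.29 Wh
print(f"one hour on a GPU: {hour_wh:.0f} Wh")    # 1050 Wh
print(f"GPU hour / human:  {ratio:.1f}x")        # 4.5x
```

Note that 1050*60*60 gives watt-seconds (joules), about 3.8 MJ, which is the same thing as 1050 Wh. If the run used many GPUs in parallel, the total would scale accordingly.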


The gold which Google won too, right?


No, Sam explicitly said that breakthrough wouldn't be in GPT-5.


GPT-5 should mean a brand new model/architecture trained from scratch.


It means nothing now.

It's the same as 4G vs 5G. They have a technical definition, but it's all about marketing.


It means 5 is more than 4, and Claude only has a 4. Clearly 5 is better.


Think about it. You walk into a video store, you see 8-Minute Abs sittin' there, there's 7-Minute Abs right beside it. Which one are you gonna pick, man?



