On: Llama 3.1 405B now runs at 969 tokens/s on Cerebra...

brcmthrowaway on Nov 19, 2024:
So out of all the AI chip startups, Cerebras is probably the real deal.
icelancer on Nov 19, 2024:
Groq is legitimate. Cerebras so far doesn't scale (wide) nearly as well as Groq. We'll see how it goes.
hendler on Nov 19, 2024:
Google TPUs, Amazon, a YC-funded ASIC/FPGA company, and a Chinese company all have custom hardware too that might scale well.
throwawaymaths on Nov 19, 2024:
How exactly does Groq scale wide well? Last I heard it took 9 racks to run Llama 2 70B, which is why they throttle your requests.
pama on Nov 19, 2024:
Well, Cerebras pretty much needs a data center simply to fit the 405B model for inference.
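The memory claim above can be sanity-checked with rough arithmetic. This is an editorial back-of-envelope sketch, not from the thread: it assumes fp16 weights (2 bytes per parameter) and the publicly stated 44 GB of on-wafer SRAM for a Cerebras WSE-3; activations, KV cache, and replication for throughput would only push the count higher.

```python
# Back-of-envelope: how many WSE-3 wafers are needed just to hold the
# Llama 3.1 405B weights in on-chip SRAM for inference.
# Assumptions: fp16 weights (2 bytes/param), 44 GB SRAM per wafer.

PARAMS = 405e9               # 405B parameters
BYTES_PER_PARAM = 2          # fp16
WSE3_SRAM_GB = 44            # on-wafer SRAM per WSE-3

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9   # total weight footprint in GB
wafers = -(-weights_gb // WSE3_SRAM_GB)       # ceiling division

print(f"{weights_gb:.0f} GB of weights -> at least {wafers:.0f} wafers")
# -> 810 GB of weights -> at least 19 wafers
```

Under these assumptions, weights alone occupy ~810 GB, requiring roughly 19 wafer-scale systems before accounting for activations or batching, which is consistent with "needs a data center."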
throwawaymaths on Nov 19, 2024:
I guess this just shows the insanity of venture-led AI hardware hype and shady startup messaging practices.
gdiamos on Nov 19, 2024:
Just in time for their IPO.
ipsum2 on Nov 19, 2024:
It got cancelled/postponed.