Hacker News
zcbenz
9 months ago
on:
Apple's MLX adding CUDA support
In the absence of hardware unified memory, CUDA will automatically copy data between the CPU and GPU when page faults occur.
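A rough way to picture the fault-driven copying described above is a toy model in Python. This is only a sketch, not how the CUDA driver is implemented: the `ManagedBuffer` class, the 4 KiB page size, and the rule that a page lives on exactly one side at a time are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # bytes; assumed page granularity for this toy model


@dataclass
class ManagedBuffer:
    """Hypothetical model of a managed allocation, one owner per page."""
    num_pages: int
    residency: list = field(init=False)  # "cpu" or "gpu" for each page
    faults: int = 0        # page faults taken so far
    bytes_moved: int = 0   # bytes migrated in response to faults

    def __post_init__(self):
        # Assumption: every page starts out resident on the CPU.
        self.residency = ["cpu"] * self.num_pages

    def access(self, processor: str, offset: int) -> None:
        """Touch one byte; migrate the containing page if it lives elsewhere."""
        page = offset // PAGE_SIZE
        if self.residency[page] != processor:
            self.faults += 1
            self.bytes_moved += PAGE_SIZE
            self.residency[page] = processor


buf = ManagedBuffer(num_pages=4)

# A pretend GPU kernel reads the buffer in 1 KiB strides. Only the first
# touch of each page faults; later touches hit the migrated page.
for offset in range(0, buf.num_pages * PAGE_SIZE, 1024):
    buf.access("gpu", offset)
gpu_faults = buf.faults

# The CPU then reads the first page, which has to migrate back.
buf.access("cpu", 0)
```

In this model the cost shows up as one fault per page on first touch from the other side, which is the behaviour the replies below are reacting to.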
fenced_load
9 months ago
There is also NVLink-C2C support between Nvidia's CPUs and GPUs, which doesn't require any copy: CPUs and GPUs directly access each other's memory over a coherent bus. IIRC, they have 4 CPU + 4 GPU servers already available.
benreesman
9 months ago
Yeah, NCCL is a whole world and it's not even the only thing involved, but IIRC that's the difference between 8xH100 PCIe and 8xH100 SXM5.
saagarjha
9 months ago
This seems like it would be slow…
freeone3000
9 months ago
Matches my experience. It's memory stalls all over the place, aggravated by the fact that (on 12.3 at least) there wasn't even a prefetcher.
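To see why a missing prefetcher could hurt this much, here is a back-of-the-envelope cost model in Python. It is a sketch under stated assumptions, not a measurement: the function `migration_time_us` is hypothetical, it charges a fixed stall for every faulted page, and it treats a prefetch as one bulk copy with no per-page stall. Real drivers batch faults and migrate in larger granules, so actual numbers will differ.

```python
def migration_time_us(num_pages: int,
                      page_bytes: int,
                      fault_overhead_us: float,
                      bandwidth_bytes_per_us: float,
                      prefetch: bool) -> float:
    """Estimate the time to move num_pages pages across the link.

    Assumption: without prefetch, every page pays a fixed fault stall
    on top of its transfer time. With prefetch, only the transfer time
    is paid because the pages arrive before they are touched.
    """
    transfer_us = num_pages * page_bytes / bandwidth_bytes_per_us
    if prefetch:
        return transfer_us
    return num_pages * fault_overhead_us + transfer_us


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
on_demand = migration_time_us(10, 4096, 20.0, 4096.0, prefetch=False)
prefetched = migration_time_us(10, 4096, 20.0, 4096.0, prefetch=True)
stall_us = on_demand - prefetched  # time spent purely on fault stalls
```

Under these assumptions the gap between the two cases is exactly the per-fault stall multiplied by the number of pages, so it grows linearly with the working set.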
nickysielicki
9 months ago
See also:
https://www.kernel.org/doc/html/v5.0/vm/hmm.html