> But for inference and run-time computations, it could be very interesting to take a model trained with CUDA/PyTorch and export it (maybe with Apache TVM or tensorflow.js) into WebGPU that can run on end-user devices.
In its current state, can you train on PyTorch, export to ONNX, load ONNX in JavaScript/WASM, then use it for WebGPU inference?
I'm not trying to sound obsessed with/married to ONNX, I just thought it was "the standard". Curious to learn about alternatives/what people are doing now, but I fear even talking about what might be done here is discussing the "bleeding edge".
You can also go directly from PyTorch to WebGPU with Apache TVM. (ONNX is also supported, but my understanding is that it's better to go direct). This is an example using an LLM trained with PyTorch (I think) and run in the browser: https://mlc.ai/web-llm/
I can't seem to figure out whether the PR for the WebGPU backend for onnxruntime is supposed to land in a 1.14 release, a 1.15 release, has already landed, isn't yet scheduled to land, etc.: https://github.com/microsoft/onnxruntime/pull/14579
> Official releases of ONNX Runtime are managed by the core ONNX Runtime team. A new release is published approximately every quarter, and the upcoming roadmap can be found here.
> In its current state, can you train on PyTorch, export to ONNX, load ONNX in JavaScript/WASM, then use it for WebGPU inference? I'm not trying to sound obsessed with/married to ONNX, I just thought it was "the standard".
Edit: A quick Google search shows yes: https://onnxruntime.ai/docs/tutorials/web/