Maximizing WebGPU Performance From The Browser
Dec 24, 2024
Distribute AI
WebGPU is a new browser API that lets developers tap the underlying system's GPU for high-performance computation, including AI inference. The distribute.ai Chrome extension leverages browsers with WebGPU support to execute AI inference across a wide range of devices. To maximize performance and compatibility across browsers and operating systems, the extension relies on a few experimental Chrome flags to run at its fullest potential.
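As a quick illustration of the API the extension builds on, here is a minimal sketch that feature-detects WebGPU and acquires a compute-capable device. This uses the standard WebGPU API (the GPUDevice typings come from the @webgpu/types package); it is not the extension's actual code:

```ts
// Minimal sketch: feature-detect WebGPU and acquire a GPU device.
async function getGpuDevice(): Promise<GPUDevice | null> {
  if (!("gpu" in navigator)) {
    return null; // Browser has no WebGPU support at all
  }
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) {
    return null; // No usable GPU, or WebGPU is disabled on this device
  }
  return adapter.requestDevice(); // Device used to run compute passes
}
```

If navigator.gpu is missing or no adapter is returned, the page simply falls back to not running GPU workloads, which is why broad flag support matters for a distributed network.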
Let’s walk through the Chrome flags the distribute.ai extension requests and what each one does.
JavaScript Promise Integration
The JavaScript Promise Integration (JSPI) API lets a WebAssembly application call asynchronous browser APIs without restructuring its code around callbacks: when a wrapped import returns a Promise, the WebAssembly code suspends and resumes once the Promise settles. This means higher throughput and lower latency when running inference.
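The JSPI surface is still experimental and may change, but as of recent Chrome builds it looks roughly like the sketch below. The module bytes and the fetch_weights / run_inference names are hypothetical stand-ins for illustration, not the extension's actual code:

```ts
// JSPI is experimental, so TypeScript's built-in typings don't cover it yet.
const { Suspending, promising } = WebAssembly as any;

declare const wasmBytes: BufferSource; // hypothetical: your compiled module

async function runWithJspi() {
  const imports = {
    env: {
      // Suspending wraps an async JS import: when Wasm calls it, the Wasm
      // stack suspends until the Promise settles - no callback plumbing.
      fetch_weights: new Suspending(async (chunk: number) => {
        const res = await fetch(`/weights/${chunk}.bin`); // hypothetical URL
        return (await res.arrayBuffer()).byteLength;
      }),
    },
  };
  const { instance } = await WebAssembly.instantiate(wasmBytes, imports);
  // promising wraps a Wasm export so JS callers see an async function.
  const runInference = promising(instance.exports.run_inference);
  console.log(await runInference());
}
```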
Experimental WebAssembly
Standard WebAssembly pointers are 32-bit integers, which caps a module's linear memory at 4 GB. This flag enables 64-bit pointers (the Memory64 proposal), lifting that cap so the extension can avoid the bookkeeping needed to work around it and load much larger models.
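A page can probe for Memory64 before relying on it. The sketch below validates a tiny module that declares a 64-bit linear memory, the same trick used by feature-detection libraries such as wasm-feature-detect:

```ts
// Sketch: detect Memory64 by validating a minimal module whose memory
// section sets the 64-bit limits flag (0x04). validate() only returns
// true if the engine understands 64-bit memories.
const memory64Probe = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // "\0asm" magic
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x05, 0x03,             // memory section, 3 bytes of payload
  0x01,                   // one memory declared
  0x04, 0x00,             // flags: 64-bit memory, no max; min = 0 pages
]);

const hasMemory64 = WebAssembly.validate(memory64Probe);
console.log(`Memory64 supported: ${hasMemory64}`);
```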
Unsafe WebGPU Support
Enables bleeding-edge WebGPU features and drivers that aren't yet fully stable. This lets more devices contribute to inference and broadens support for the model types that can run.
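With this flag enabled, more optional adapter features become visible to the page. The sketch below lists what an adapter exposes and requests shader-f16 (half-precision math in shaders, which is particularly useful for inference) when available; which features actually appear depends on the device and driver:

```ts
// Sketch: inspect the optional WebGPU features this adapter exposes.
async function logGpuFeatures() {
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) return;

  for (const feature of adapter.features) {
    console.log("available feature:", feature);
  }

  if (adapter.features.has("shader-f16")) {
    // Features must be requested explicitly to be usable on the device.
    const device = await adapter.requestDevice({
      requiredFeatures: ["shader-f16"],
    });
    console.log("device with f16 shader support ready", device);
  }
}
```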
In future Chromium-based browser releases, these flags will no longer be required, as the features that are experimental today graduate into stable releases.
While these flags are not strictly required for browser-based AI/ML inference in general, they are required to run the larger models that the distribute.ai network supports. These experimental flags keep the distribute.ai extension on the frontier of distributed compute networks.
Read more about experimental ML features in the browser from Chrome, Mozilla, and The WebAssembly Forum.