Sarek DSL
Write high-performance GPU kernels directly in OCaml syntax using a type-safe PPX. No need to learn CUDA C or OpenCL C for your logic.
Unified Runtime
The SPOC framework manages memory transfers, device detection, and kernel execution across multiple hardware backends seamlessly.
Modern Architecture
Optimized for OCaml 5 with support for CUDA, OpenCL, Vulkan, Metal, and parallel CPU execution via Domains.
Recent Developments (2024-2026)
Sarek was recently modernized to take advantage of current OCaml features and GPU APIs:
- OCaml 5.4 Integration: Full support for effects and domains, providing high-performance CPU parallel execution.
- Cross-Platform GPU: Newly added Vulkan and Apple Metal backends for modern desktop and mobile hardware.
- Improved Reliability: A new structured error handling system and comprehensive test coverage.
- Modular Design: Backend implementations are now dynamic plugins, allowing for lightweight and extensible builds.
Quick Start
Sarek is not yet in the official opam repository. Install from source:
git clone https://github.com/mathiasbourgoin/Sarek.git
cd Sarek
opam install . --deps-only -y
dune build
Check out the Getting Started guide to write your first GPU kernel in minutes, browse the Examples to see common patterns, or try the Playground to transpile kernels live in your browser.
New to GPU programming? The interactive Learn course teaches it from scratch — you edit a Sarek kernel and run it on your own GPU straight from the page (via WebGPU), with automatic checking. It builds up from vector addition through generating a Mandelbrot image and writing an image filter.
How it works
Sarek allows you to express parallel logic as standard OCaml functions. These are compiled to native GPU code at runtime.
(* A simple vector addition kernel *)
let vector_add =
  [%kernel
    fun (a : float32 vector) (b : float32 vector) (c : float32 vector) ->
      let idx = get_global_id 0 in
      c.(idx) <- a.(idx) + b.(idx)]
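To make the execution model concrete, here is a minimal CPU-only sketch of what the kernel above computes, written in plain OCaml 5 with only the standard library. It does not use or represent Sarek's API: `add_at` and `vector_add_cpu` are hypothetical names, ordinary `float array` values (64-bit floats) stand in for `float32 vector`, and plain OCaml requires `+.` for float addition. Each index plays the role of one GPU work-item, and the index range is split across domains in the spirit of the parallel CPU execution mentioned earlier.

```ocaml
(* CPU-only illustration of the vector addition kernel.
   Plain OCaml 5 stdlib; none of these names come from Sarek. *)

(* Work done for a single index, mirroring one GPU work-item. *)
let add_at a b c idx = c.(idx) <- a.(idx) +. b.(idx)

(* Hypothetical helper: split [0, n) into contiguous chunks and
   process each chunk in its own domain. *)
let vector_add_cpu ?(num_domains = 4) a b c =
  let n = Array.length c in
  if num_domains < 1 then invalid_arg "vector_add_cpu: num_domains < 1";
  if Array.length a <> n || Array.length b <> n then
    invalid_arg "vector_add_cpu: length mismatch";
  (* Ceiling division so that every index is covered. *)
  let chunk = (n + num_domains - 1) / num_domains in
  let worker d () =
    let lo = d * chunk in
    let hi = min n (lo + chunk) in
    (* Empty when lo >= hi, e.g. for trailing domains on short arrays. *)
    for idx = lo to hi - 1 do
      add_at a b c idx
    done
  in
  List.init num_domains (fun d -> Domain.spawn (worker d))
  |> List.iter Domain.join

let () =
  let a = Array.init 8 float_of_int in
  let b = Array.make 8 1.0 in
  let c = Array.make 8 0.0 in
  vector_add_cpu a b c;
  Array.iter (Printf.printf "%.1f ") c;
  print_newline ()
```

Because each domain writes to a disjoint slice of `c`, no synchronization is needed beyond the final `Domain.join`. This matches the kernel, where every work-item writes only to its own `idx`.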
Project Info
Sarek is the result of over a decade of academic research into high-level parallel programming abstractions. It is currently maintained by Mathias Bourgoin.