Build Notes
Enable exactly one GPU back-end that matches your device (CUBLAS, METAL, OPENCL, …).
For very large models, more VRAM helps, but OptiML's hybrid placement reduces the requirement.
Use quantization to lower memory use and often improve speed on PC-class hardware.
Use release builds (-DCMAKE_BUILD_TYPE=Release) for best performance.
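
To make the quantization note concrete, here is a rough back-of-the-envelope sketch of how weight memory scales with bits per weight. It is not part of OptiML: the function name `weight_memory_gib` and the 7-billion-parameter model size are illustrative assumptions, and the estimate covers weights only (no KV cache, activations, or per-block scale metadata that real quantized formats add).

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Hypothetical helper for illustration: parameters * bits / 8 gives bytes,
    and dividing by 2**30 converts bytes to GiB.
    """
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    n_params = 7e9  # assumed model size, chosen only as an example
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label:>6}: {weight_memory_gib(n_params, bits):.1f} GiB")
```

Going from 16 bits to 4 bits per weight cuts the weight footprint to a quarter under this idealized model. Actual savings are somewhat smaller because quantized formats store extra scaling data alongside the weights.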