Open source cross-platform compiler for compute-intensive loops used in AI algorithms, from Microsoft Research
Updated Oct 10, 2023 - C++
Separating Axis Theorem (SAT) physics engine library, accelerated via a GPGPU API (ROCm/OpenCL/CUDA) or run on the CPU
ROCm Install Utilities: rocminstall.py script to install a specific ROCm release version/revision.
A custom attention framework aimed at maximum context, speed, and usability. Includes a Triton kernel and a couple of benchmarks.
A high-performance ROCm/HIP C++ library for exact modular arithmetic using the Residue Number System (RNS), focused on batched small matrix multiplications and CRT reconstruction.
RC car with object detection on NVIDIA Jetson Nano and AMD Ryzen