Journal articles

Kernel Operations on the GPU, with Autodiff, without Memory Overflows

Abstract : The KeOps library provides fast and memory-efficient GPU support for tensors whose entries are given by a mathematical formula, such as kernel and distance matrices. KeOps alleviates the major bottleneck of tensor-centric libraries for kernel and geometric applications: memory consumption. It also supports automatic differentiation and outperforms standard GPU baselines, including PyTorch CUDA tensors and the Halide and TVM libraries. KeOps combines optimized C++/CUDA schemes with binders for high-level languages: Python (NumPy and PyTorch), Matlab and GNU R. As a result, high-level "quadratic" codes can now scale up to large data sets, with millions of samples processed in seconds. KeOps brings graphics-like performance to kernel methods and is freely available on standard repositories (PyPI, CRAN). To showcase its versatility, we provide tutorials in a wide range of settings online at
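To make the memory bottleneck mentioned in the abstract concrete, the following is a minimal NumPy sketch of a kernel matrix-vector product computed block by block, so that the full N-by-M kernel matrix is never stored. It is an illustration of the general idea only: it does not use the KeOps API and is not the library's C++/CUDA scheme. The function name `gaussian_kernel_matvec`, the Gaussian kernel, and the `sigma` and `block_size` parameters are assumptions chosen for the example.

```python
import numpy as np


def gaussian_kernel_matvec(x, y, b, sigma=1.0, block_size=1024):
    """Compute a_i = sum_j exp(-|x_i - y_j|^2 / (2 sigma^2)) * b_j.

    Hypothetical helper for illustration: only a (block_size, M) slice of
    the kernel matrix exists in memory at any time.
    """
    n = x.shape[0]
    out = np.zeros((n, b.shape[1]), dtype=x.dtype)
    y_sq = (y ** 2).sum(axis=1)  # |y_j|^2, shape (M,)
    for start in range(0, n, block_size):
        xb = x[start:start + block_size]  # (B, D) block of query points
        # Squared distances for this block only, shape (B, M).
        d2 = (xb ** 2).sum(axis=1)[:, None] - 2.0 * xb @ y.T + y_sq[None, :]
        np.maximum(d2, 0.0, out=d2)  # clip tiny negatives from round-off
        out[start:start + block_size] = np.exp(-d2 / (2.0 * sigma ** 2)) @ b
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal((500, 3))
y = rng.standard_normal((400, 3))
b = rng.standard_normal((400, 2))

# Reference: dense kernel matrix, feasible only for small N and M.
dense = np.exp(-((x[:, None, :] - y[None, :, :]) ** 2).sum(-1) / 2.0) @ b
blocked = gaussian_kernel_matvec(x, y, b, sigma=1.0, block_size=64)
max_error = float(np.abs(dense - blocked).max())
```

The dense reference needs memory proportional to N times M, while the blocked version needs memory proportional to `block_size` times M; both produce the same reduction up to floating-point round-off.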
Contributor : Ghislain DURIF
Submitted on : Thursday, April 8, 2021 - 1:24:36 PM
Last modification on : Friday, August 5, 2022 - 10:51:47 AM


Publisher files allowed on an open archive


Distributed under a Creative Commons Attribution 4.0 International License


  • HAL Id : hal-02517462, version 2


Benjamin Charlier, Jean Feydy, Joan Glaunès, François-David Collin, Ghislain Durif. Kernel Operations on the GPU, with Autodiff, without Memory Overflows. Journal of Machine Learning Research, Microtome Publishing, 2021, 22 (74), pp.1-6. ⟨hal-02517462v2⟩


