Data Dependent Kernel Approximation using Pseudo Random Fourier Features

Bharath Bhushan Damodaran 1 Nicolas Courty 2 Philippe-Henri Gosselin 3
1 OBELIX - Environment observation with complex imagery
UBS - Université de Bretagne Sud, IRISA-D5 - SIGNAUX ET IMAGES NUMÉRIQUES, ROBOTIQUE
2 SEASIDE - SEarch, Analyze, Synthesize and Interact with Data Ecosystems
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, UBS - Université de Bretagne Sud
3 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : Kernel methods are powerful and flexible approach to solve many problems in machine learning. Due to the pairwise evaluations in kernel methods, the complexity of kernel computation grows as the data size increases; thus the applicability of kernel methods is limited for large scale datasets. Random Fourier Features (RFF) has been proposed to scale the kernel method for solving large scale datasets by approximating kernel function using randomized Fourier features. While this method proved very popular, still it exists shortcomings to be effectively used. As RFF samples the randomized features from a distribution independent of training data, it requires sufficient large number of feature expansions to have similar performances to kernelized classifiers, and this is proportional to the number samples in the dataset. Thus, reducing the number of feature dimensions is necessary to effectively scale to large datasets. In this paper, we propose a kernel approximation method in a data dependent way, coined as Pseudo Random Fourier Features (PRFF) for reducing the number of feature dimensions and also to improve the prediction performance. The proposed approach is evaluated on classification and regression problems and compared with the RFF, orthogonal random features and Nystr{\"o}m approach
Complete list of metadatas

https://hal.archives-ouvertes.fr/hal-02174326
Contributor : Nicolas Courty <>
Submitted on : Friday, July 5, 2019 - 9:19:13 AM
Last modification on : Sunday, July 7, 2019 - 1:26:23 AM

Links full text

Identifiers

  • HAL Id : hal-02174326, version 1
  • ARXIV : 1711.09783

Citation

Bharath Bhushan Damodaran, Nicolas Courty, Philippe-Henri Gosselin. Data Dependent Kernel Approximation using Pseudo Random Fourier Features. 2019. ⟨hal-02174326⟩

Share

Metrics

Record views

8