Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning

Francis Bach 1
1 WILLOW - Models of visual object recognition and scene understanding
CNRS - Centre National de la Recherche Scientifique : UMR8548, Inria Paris-Rocquencourt, DI-ENS - Département d'informatique de l'École normale supérieure
Abstract : For supervised and unsupervised learning, positive definite kernels allow the use of large and potentially infinite-dimensional feature spaces with a computational cost that depends only on the number of observations. This is usually done through the penalization of predictor functions by Euclidean or Hilbertian norms. In this paper, we explore penalizing by sparsity-inducing norms such as the ℓ1-norm or the block ℓ1-norm. We assume that the kernel decomposes into a large sum of individual basis kernels which can be embedded in a directed acyclic graph; we show that it is then possible to perform kernel selection through a hierarchical multiple kernel learning framework, in time polynomial in the number of selected kernels. This framework is naturally applied to nonlinear variable selection; our extensive simulations on synthetic datasets and datasets from the UCI repository show that efficiently exploring the large feature space through sparsity-inducing norms leads to state-of-the-art predictive performance.
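The decomposition assumption in the abstract can be illustrated with a small sketch. It is not taken from the record and is not presented as the paper's implementation: it assumes one Gaussian kernel per input variable and uses the algebraic identity that the product of (1 + k_j) over variables equals the sum, over all subsets of variables, of the product of the k_j in that subset. The subsets, ordered by inclusion, form a directed acyclic graph. The names `base_kernel`, `subset_kernels`, and `parents`, and the bandwidth `gamma`, are hypothetical.

```python
import itertools
import numpy as np

def base_kernel(a, b, gamma=1.0):
    """Gaussian kernel on one variable; a and b are 1-D arrays (assumed choice)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-gamma * diff ** 2)

def subset_kernels(X, Z, gamma=1.0):
    """Return one basis kernel matrix per subset of variables.

    The kernel for a subset is the elementwise product of the
    single-variable kernels in that subset; the empty subset
    gives the constant kernel.
    """
    p = X.shape[1]
    per_var = [base_kernel(X[:, j], Z[:, j], gamma) for j in range(p)]
    kernels = {}
    for r in range(p + 1):
        for subset in itertools.combinations(range(p), r):
            K = np.ones((X.shape[0], Z.shape[0]))
            for j in subset:
                K = K * per_var[j]
            kernels[subset] = K
    return kernels

def parents(subset):
    """Parents of a node in the subset DAG: drop one variable at a time."""
    return [tuple(v for v in subset if v != j) for j in subset]

# Small check on random data: the sum of all 2^p basis kernels
# equals the product kernel prod_j (1 + k_j).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
kernels = subset_kernels(X, X)
K_sum = sum(kernels.values())
K_prod = np.ones((5, 5))
for j in range(3):
    K_prod = K_prod * (1.0 + base_kernel(X[:, j], X[:, j]))
```

With p variables there are 2^p basis kernels, so enumerating them as above is only feasible for tiny p. That growth is why the abstract stresses a cost that is polynomial in the number of selected kernels rather than in the total number.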
Document type :
Preprints, Working Papers, ...

Contributor : Francis Bach
Submitted on : Tuesday, September 9, 2008 - 8:44:32 AM
Last modification on : Wednesday, January 30, 2019 - 11:07:59 AM
Long-term archiving on : Friday, June 4, 2010 - 11:06:43 AM


  • HAL Id : hal-00319660, version 1
  • ARXIV : 0809.1493



Francis Bach. Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning. 2008. ⟨hal-00319660⟩


