Lund University Publications

Large-scale data-dependent kernel approximation

Ionescu, Catalin; Popa, Alin Ionut and Sminchisescu, Cristian (2017). 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017
Abstract

Learning a computationally efficient kernel from data is an important machine learning problem. The majority of kernels in the literature do not leverage the geometry of the data, and those that do are computationally infeasible for contemporary datasets. Recent advances in approximation techniques have expanded the applicability of the kernel methodology to scale linearly with the data size. Data-dependent kernels, which could leverage this computational advantage, have, however, not yet seen the benefit. Here we derive an approximate large-scale learning procedure for data-dependent kernels that is efficient and performs well in practice. We provide a lemma that can be used to derive the asymptotic convergence of the approximation in the limit of infinite random features and, under certain conditions, an estimate of the convergence speed. We empirically demonstrate that our construction is a valid yet efficient approximation of the data-dependent kernel. For large-scale datasets of millions of datapoints, where the proposed method becomes applicable for the first time, we observe a significant performance boost over baselines consisting of both data-independent kernels and kernel approximations, at comparable computational cost.
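
For context, the random-feature machinery the abstract refers to follows the standard random Fourier features construction (Rahimi and Recht, 2007), in which inner products of randomized cosine features converge to a shift-invariant kernel as the number of features grows. The sketch below illustrates only that data-independent baseline, not the authors' data-dependent construction; the function name, the Gaussian-kernel choice, and the parameter values are assumptions made for the example.

import numpy as np

def random_fourier_features(X, n_features=500, gamma=1.0, seed=None):
    # Map X of shape (n_samples, d) to features Z such that Z @ Z.T
    # approximates the Gaussian kernel exp(-gamma * ||x - y||^2).
    # As n_features -> infinity the approximation converges to the
    # exact kernel; the paper's lemma concerns the analogous limit
    # for data-dependent kernels.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Frequencies are drawn from the kernel's spectral density, which
    # for this Gaussian kernel is N(0, 2 * gamma * I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Usage: a linear model trained on Z costs O(n) in the number of
# datapoints, versus the O(n^2) memory needed for the exact kernel matrix.
X = np.random.default_rng(0).standard_normal((1000, 10))
Z = random_fourier_features(X, n_features=2000, gamma=0.5, seed=0)
K_approx = Z @ Z.T  # entrywise close to the exact Gaussian kernel matrix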

Please use this URL to cite or link to this publication:
author
Ionescu, Catalin; Popa, Alin Ionut and Sminchisescu, Cristian
organization
publishing date
2017
type
Contribution to conference
publication status
published
subject
conference name
20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017
conference location
Fort Lauderdale, United States
conference dates
2017-04-20 - 2017-04-22
external identifiers
  • scopus:85083936904
language
English
LU publication?
yes
id
7e6c6204-9eba-4a31-8301-5303a344a089
alternative location
http://proceedings.mlr.press/v54/ionescu17a.html
date added to LUP
2019-07-08 13:02:15
date last changed
2022-04-26 03:15:29
@misc{7e6c6204-9eba-4a31-8301-5303a344a089,
  abstract     = {{Learning a computationally efficient kernel from data is an important machine learning problem. The majority of kernels in the literature do not leverage the geometry of the data, and those that do are computationally infeasible for contemporary datasets. Recent advances in approximation techniques have expanded the applicability of the kernel methodology to scale linearly with the data size. Data-dependent kernels, which could leverage this computational advantage, have however not yet seen the benefit. Here we derive an approximate large-scale learning procedure for data-dependent kernels that is efficient and performs well in practice. We provide a Lemma that can be used to derive the asymptotic convergence of the approximation in the limit of infinite random features, and, under certain conditions, an estimate of the convergence speed. We empirically prove that our construction represents a valid, yet efficient approximation of the data-dependent kernel. For large-scale datasets of millions of datapoints, where the proposed method is now applicable for the first time, we notice a significant performance boost over both baselines consisting of data independent kernels and of kernel approximations, at comparable computational cost.}},
  author       = {{Ionescu, Catalin and Popa, Alin Ionut and Sminchisescu, Cristian}},
  language     = {{eng}},
  title        = {{Large-scale data-dependent kernel approximation}},
  url          = {{http://proceedings.mlr.press/v54/ionescu17a.html}},
  year         = {{2017}},
}