Large-scale data-dependent kernel approximation

Ionescu, Catalin; Popa, Alin Ionut; Sminchisescu, Cristian

Large-scale data-dependent kernel approximation

Mark

Ionescu, Catalin ; Popa, Alin Ionut and Sminchisescu, Cristian ^LU (2017) 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017

Abstract: Learning a computationally efficient kernel from data is an important machine learning problem. The majority of kernels in the literature do not leverage the geometry of the data, and those that do are computationally infeasible for contemporary datasets. Recent advances in approximation techniques have expanded the applicability of the kernel methodology to scale linearly with the data size. Data-dependent kernels, which could leverage this computational advantage, have however not yet seen the benefit. Here we derive an approximate large-scale learning procedure for data-dependent kernels that is efficient and performs well in practice. We provide a Lemma that can be used to derive the asymptotic convergence of the approximation in... (More); Learning a computationally efficient kernel from data is an important machine learning problem. The majority of kernels in the literature do not leverage the geometry of the data, and those that do are computationally infeasible for contemporary datasets. Recent advances in approximation techniques have expanded the applicability of the kernel methodology to scale linearly with the data size. Data-dependent kernels, which could leverage this computational advantage, have however not yet seen the benefit. Here we derive an approximate large-scale learning procedure for data-dependent kernels that is efficient and performs well in practice. We provide a Lemma that can be used to derive the asymptotic convergence of the approximation in the limit of infinite random features, and, under certain conditions, an estimate of the convergence speed. We empirically prove that our construction represents a valid, yet efficient approximation of the data-dependent kernel. For large-scale datasets of millions of datapoints, where the proposed method is now applicable for the first time, we notice a significant performance boost over both baselines consisting of data independent kernels and of kernel approximations, at comparable computational cost.
(Less)

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/7e6c6204-9eba-4a31-8301-5303a344a089

author

Ionescu, Catalin ; Popa, Alin Ionut and Sminchisescu, Cristian ^LU

organization

publishing date

2017

type

Contribution to conference

publication status

published

subject

Computer Sciences

conference name

20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017

conference location

Fort Lauderdale, United States

conference dates

2017-04-20 - 2017-04-22

external identifiers

scopus:85083936904

language

English

LU publication?

yes

id

7e6c6204-9eba-4a31-8301-5303a344a089

alternative location

http://proceedings.mlr.press/v54/ionescu17a.html

date added to LUP

2019-07-08 13:02:15

date last changed

2025-04-04 14:49:22

@misc{7e6c6204-9eba-4a31-8301-5303a344a089,
  abstract     = {{<p>Learning a computationally efficient kernel from data is an important machine learning problem. The majority of kernels in the literature do not leverage the geometry of the data, and those that do are computationally infeasible for contemporary datasets. Recent advances in approximation techniques have expanded the applicability of the kernel methodology to scale linearly with the data size. Data-dependent kernels, which could leverage this computational advantage, have however not yet seen the benefit. Here we derive an approximate large-scale learning procedure for data-dependent kernels that is efficient and performs well in practice. We provide a Lemma that can be used to derive the asymptotic convergence of the approximation in the limit of infinite random features, and, under certain conditions, an estimate of the convergence speed. We empirically prove that our construction represents a valid, yet efficient approximation of the data-dependent kernel. For large-scale datasets of millions of datapoints, where the proposed method is now applicable for the first time, we notice a significant performance boost over both baselines consisting of data independent kernels and of kernel approximations, at comparable computational cost.</p>}},
  author       = {{Ionescu, Catalin and Popa, Alin Ionut and Sminchisescu, Cristian}},
  language     = {{eng}},
  title        = {{Large-scale data-dependent kernel approximation}},
  url          = {{http://proceedings.mlr.press/v54/ionescu17a.html}},
  year         = {{2017}},
}

Lund University Publications

LUND UNIVERSITY LIBRARIES

Large-scale data-dependent kernel approximation