
Lund University Publications


Sub-linear convergence of a stochastic proximal iteration method in Hilbert space

Eisenmann, Monika ; Stillfjord, Tony and Williamson, Måns (2022) In Computational Optimization and Applications 83(1). p. 181-210
Abstract

We consider a stochastic version of the proximal point algorithm for convex optimization problems posed on a Hilbert space. A typical application of this is supervised learning. While the method is not new, it has not been extensively analyzed in this form. Indeed, most related results are confined to the finite-dimensional setting, where error bounds could depend on the dimension of the space. On the other hand, the few existing results in the infinite-dimensional setting only prove very weak types of convergence, owing to weak assumptions on the problem. In particular, there are no results that show strong convergence with a rate. In this article, we bridge these two worlds by assuming more regularity of the optimization problem, which allows us to prove convergence with an (optimal) sub-linear rate also in an infinite-dimensional setting. In particular, we assume that the objective function is the expected value of a family of convex differentiable functions. While we require that the full objective function is strongly convex, we do not assume that its constituent parts are so. Further, we require that the gradient satisfies a weak local Lipschitz continuity property, where the Lipschitz constant may grow polynomially given certain guarantees on the variance and higher moments near the minimum. We illustrate these results by discretizing a concrete infinite-dimensional classification problem with varying degrees of accuracy.

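As a concrete illustration of the algorithm summarized above, the following is a minimal sketch (not code from the paper) of a stochastic proximal iteration in a finite-dimensional toy setting. The objective is an average of least-squares components f_i(x) = (1/2)(a_i^T x - b_i)^2; each component is merely convex, while the full objective is strongly convex when A has full column rank, in the spirit of the assumptions described in the abstract. The data A and b, the step-size choice eta_k = 1/k, and the iteration count are illustrative assumptions, not the paper's experiment.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic, finite-dimensional least-squares problem (illustrative only).
n_samples, dim = 200, 5
A = rng.standard_normal((n_samples, dim))
b = A @ rng.standard_normal(dim) + 0.01 * rng.standard_normal(n_samples)
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)  # minimizer of the full objective

x = np.zeros(dim)
for k in range(1, 10001):
    i = rng.integers(n_samples)      # sample one component f_i uniformly
    a_i, b_i = A[i], b[i]
    eta = 1.0 / k                    # decaying step size (illustrative choice)
    # Proximal step for the sampled component:
    #   x <- argmin_z  0.5*(a_i . z - b_i)^2 + ||z - x||^2 / (2*eta),
    # whose closed form follows from Sherman-Morrison applied to I + eta*a_i*a_i^T.
    x = x - (eta * (a_i @ x - b_i) / (1.0 + eta * (a_i @ a_i))) * a_i

print("error vs. full least-squares minimizer:", np.linalg.norm(x - x_star))

In the paper's setting the same proximal step is taken in a Hilbert space rather than R^n, and the sub-linear rate refers to the expected distance to the minimizer under the stated regularity assumptions; the toy problem above only illustrates the mechanics of the update.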
Please use this url to cite or link to this publication:
author
Eisenmann, Monika ; Stillfjord, Tony and Williamson, Måns
organization
publishing date
2022
type
Contribution to journal
publication status
published
subject
keywords
Convergence analysis, Convergence rate, Hilbert space, Infinite-dimensional, Stochastic proximal point
in
Computational Optimization and Applications
volume
83
issue
1
pages
30 pages
publisher
Springer
external identifiers
  • scopus:85132209527
ISSN
0926-6003
DOI
10.1007/s10589-022-00380-0
language
English
LU publication?
yes
id
ed64b43e-826d-47e2-8f2c-629a10e3a340
date added to LUP
2022-09-05 11:23:03
date last changed
2022-09-05 11:23:03
@article{ed64b43e-826d-47e2-8f2c-629a10e3a340,
  abstract     = {{We consider a stochastic version of the proximal point algorithm for convex optimization problems posed on a Hilbert space. A typical application of this is supervised learning. While the method is not new, it has not been extensively analyzed in this form. Indeed, most related results are confined to the finite-dimensional setting, where error bounds could depend on the dimension of the space. On the other hand, the few existing results in the infinite-dimensional setting only prove very weak types of convergence, owing to weak assumptions on the problem. In particular, there are no results that show strong convergence with a rate. In this article, we bridge these two worlds by assuming more regularity of the optimization problem, which allows us to prove convergence with an (optimal) sub-linear rate also in an infinite-dimensional setting. In particular, we assume that the objective function is the expected value of a family of convex differentiable functions. While we require that the full objective function is strongly convex, we do not assume that its constituent parts are so. Further, we require that the gradient satisfies a weak local Lipschitz continuity property, where the Lipschitz constant may grow polynomially given certain guarantees on the variance and higher moments near the minimum. We illustrate these results by discretizing a concrete infinite-dimensional classification problem with varying degrees of accuracy.}},
  author       = {{Eisenmann, Monika and Stillfjord, Tony and Williamson, Måns}},
  issn         = {{0926-6003}},
  keywords     = {{Convergence analysis; Convergence rate; Hilbert space; Infinite-dimensional; Stochastic proximal point}},
  language     = {{eng}},
  number       = {{1}},
  pages        = {{181--210}},
  publisher    = {{Springer}},
  journal      = {{Computational Optimization and Applications}},
  title        = {{Sub-linear convergence of a stochastic proximal iteration method in Hilbert space}},
  url          = {{http://dx.doi.org/10.1007/s10589-022-00380-0}},
  doi          = {{10.1007/s10589-022-00380-0}},
  volume       = {{83}},
  year         = {{2022}},
}