Lund University Publications

LUND UNIVERSITY LIBRARIES

On the sign recovery by least absolute shrinkage and selection operator, thresholded least absolute shrinkage and selection operator, and thresholded basis pursuit denoising

Tardivel, Patrick J.C. and Bogdan, Małgorzata LU (2022) In Scandinavian Journal of Statistics 49(4). p.1636-1668
Abstract

Basis pursuit (BP), basis pursuit deNoising (BPDN), and least absolute shrinkage and selection operator (LASSO) are popular methods for identifying important predictors in the high-dimensional linear regression model $Y = X\beta + \varepsilon$. By definition, when $\varepsilon = 0$, BP uniquely recovers $\beta$ when $X\beta = Xb$ and $\beta \neq b$ implies $\|\beta\|_1 < \|b\|_1$ (identifiability condition). Furthermore, LASSO can recover the sign of $\beta$ only under a much stronger irrepresentability condition. Meanwhile, it is known that the model selection properties of LASSO can be improved by hard thresholding its estimates. This article supports these findings by proving that thresholded LASSO, thresholded BPDN, and thresholded BP recover the sign of $\beta$ in both the noisy and noiseless cases if and only if $\beta$ is identifiable and large enough. In particular, if X has iid Gaussian entries and the number of predictors grows linearly with the sample size, then these thresholded estimators can recover the sign of $\beta$ when the signal sparsity is asymptotically below the Donoho–Tanner transition curve. This is in contrast to the regular LASSO, which asymptotically recovers the sign of $\beta$ only when the signal sparsity tends to 0. Numerical experiments show that the identifiability condition, unlike the irrepresentability condition, does not seem to be affected by the structure of the correlations in the X matrix.
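
The hard-thresholding step mentioned in the abstract can be illustrated with a small sketch. It is not the authors' procedure or code: the function names (`lasso_cd`, `hard_threshold`, `sign_vector`, `demo`), the plain coordinate-descent solver, the problem sizes, and the values of the penalty `lam` and the threshold `tau` are all assumptions chosen for illustration. The sketch fits a LASSO estimate, sets entries with small absolute value to zero, and compares the resulting sign vector with that of the true coefficient vector.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator: sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize 0.5 * ||y - X b||_2^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    r = list(y)  # residual y - X b, with b = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # inner product of column j with the residual that excludes coordinate j
            rho = sum(X[i][j] * r[i] for i in range(n)) + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    r[i] -= X[i][j] * delta
                b[j] = new_bj
    return b


def hard_threshold(b, tau):
    """Keep entries whose absolute value exceeds tau; set the rest to zero."""
    return [v if abs(v) > tau else 0.0 for v in b]


def sign_vector(b):
    """Componentwise sign in {-1, 0, 1}."""
    return [(v > 0) - (v < 0) for v in b]


def demo(seed=0):
    # Hypothetical setup: iid Gaussian design, 3 large nonzero coefficients, small noise.
    rng = random.Random(seed)
    n, p = 30, 50
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    beta = [0.0] * p
    beta[0], beta[1], beta[2] = 5.0, -5.0, 5.0
    y = [sum(X[i][j] * beta[j] for j in range(p)) + 0.1 * rng.gauss(0.0, 1.0)
         for i in range(n)]
    b_hat = lasso_cd(X, y, lam=1.0)          # lam is an arbitrary illustrative choice
    b_thr = hard_threshold(b_hat, tau=1.0)   # tau is an arbitrary illustrative choice
    print("plain LASSO recovers sign:      ", sign_vector(b_hat) == sign_vector(beta))
    print("thresholded LASSO recovers sign:", sign_vector(b_thr) == sign_vector(beta))


if __name__ == "__main__":
    demo()
```

In practice a numerical library would replace the pure-Python solver; the version above only avoids external dependencies. Whether either estimator recovers the sign in a given run depends on the design, the signal size, and the chosen `lam` and `tau`.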

author
Tardivel, Patrick J.C. and Bogdan, Małgorzata LU
publishing date
2022
type
Contribution to journal
publication status
published
in
Scandinavian Journal of Statistics
volume
49
issue
4
pages
1636 - 1668
publisher
Wiley-Blackwell
external identifiers
  • scopus:85124070954
ISSN
0303-6898
DOI
10.1111/sjos.12568
language
English
LU publication?
yes
id
bd6d5f68-924d-4416-b5a8-1752db0aa42c
date added to LUP
2022-04-06 11:58:22
date last changed
2023-01-16 10:17:15
@article{bd6d5f68-924d-4416-b5a8-1752db0aa42c,
  abstract     = {{Basis pursuit (BP), basis pursuit deNoising (BPDN), and least absolute shrinkage and selection operator (LASSO) are popular methods for identifying important predictors in the high-dimensional linear regression model $Y = X\beta + \varepsilon$. By definition, when $\varepsilon = 0$, BP uniquely recovers $\beta$ when $X\beta = Xb$ and $\beta \neq b$ implies $\|\beta\|_1 < \|b\|_1$ (identifiability condition). Furthermore, LASSO can recover the sign of $\beta$ only under a much stronger irrepresentability condition. Meanwhile, it is known that the model selection properties of LASSO can be improved by hard thresholding its estimates. This article supports these findings by proving that thresholded LASSO, thresholded BPDN, and thresholded BP recover the sign of $\beta$ in both the noisy and noiseless cases if and only if $\beta$ is identifiable and large enough. In particular, if X has iid Gaussian entries and the number of predictors grows linearly with the sample size, then these thresholded estimators can recover the sign of $\beta$ when the signal sparsity is asymptotically below the Donoho–Tanner transition curve. This is in contrast to the regular LASSO, which asymptotically recovers the sign of $\beta$ only when the signal sparsity tends to 0. Numerical experiments show that the identifiability condition, unlike the irrepresentability condition, does not seem to be affected by the structure of the correlations in the X matrix.}},
  author       = {{Tardivel, Patrick J.C. and Bogdan, Małgorzata}},
  issn         = {{0303-6898}},
  language     = {{eng}},
  number       = {{4}},
  pages        = {{1636--1668}},
  publisher    = {{Wiley-Blackwell}},
  journal      = {{Scandinavian Journal of Statistics}},
  title        = {{On the sign recovery by least absolute shrinkage and selection operator, thresholded least absolute shrinkage and selection operator, and thresholded basis pursuit denoising}},
  url          = {{http://dx.doi.org/10.1111/sjos.12568}},
  doi          = {{10.1111/sjos.12568}},
  volume       = {{49}},
  year         = {{2022}},
}