A power analysis for model-X knockoffs with ℓp-regularized statistics

Weinstein, Asaf; Su, Weijie J.; Bogdan, Malgorzata; Foygel Barber, Rina; Candès, Emmanuel J.

Weinstein, Asaf; Su, Weijie J.; Bogdan, Malgorzata; Foygel Barber, Rina and Candès, Emmanuel J. (2023) In The Annals of Statistics 51(3).

Abstract: Variable selection properties of procedures utilizing penalized-likelihood estimates are a central topic in the study of high-dimensional linear regression problems. Existing literature emphasizes the quality of ranking of the variables by such procedures, as reflected in the receiver operating characteristic curve or in prediction performance. Specifically, recent works have harnessed the modern theory of approximate message-passing (AMP) to obtain, in a particular setting, exact asymptotic predictions of the tradeoff between Type I and Type II errors for selection procedures that rely on ℓp-regularized estimators.

In practice, effective ranking by itself is often not sufficient because some calibration for Type I error is required. In this work, we study theoretically the power of selection procedures that similarly rank the features by the size of an ℓp-regularized estimator, but further use Model-X knockoffs to control the false discovery rate in the realistic situation where no prior information about the signal is available. In analyzing the power of the resulting procedure, we extend existing results in AMP theory to handle the pairing between original variables and their knockoffs. This is used to derive exact asymptotic predictions for power. We apply the general results to compare the power of the knockoffs versions of Lasso and thresholded-Lasso selection, and demonstrate that in the i.i.d. covariate setting under consideration, tuning by cross-validation on the augmented design matrix is nearly optimal. We further demonstrate how the techniques also allow the analysis of the Type S error, and a corresponding notion of power, when selections are supplemented with a decision on the sign of the coefficient.
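The selection procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes covariates with i.i.d. N(0, 1) entries, so that an independent Gaussian draw is a valid Model-X knockoff copy; it uses the Lasso coefficient-difference statistic and the knockoff+ threshold; and it fixes the penalty `lam` by hand rather than tuning it by cross-validation. The function names (`lasso_cd`, `knockoff_threshold`, `knockoff_lasso_select`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n))*||y - X b||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float).copy()           # residual y - X b (b = 0 initially)
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]          # add back feature j's contribution
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]          # subtract the updated contribution
    return b

def knockoff_threshold(W, q):
    """Knockoff+ threshold: smallest t with (1 + #{W <= -t}) / max(1, #{W >= t}) <= q."""
    for t in np.sort(np.abs(W[W != 0])):
        if (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t)) <= q:
            return float(t)
    return float("inf")                  # no feasible threshold: select nothing

def knockoff_lasso_select(X, y, lam, q, rng):
    """Lasso on the augmented design [X, X_knock], then knockoff+ selection."""
    n, p = X.shape
    # Assumption: X has i.i.d. N(0, 1) entries, so an independent draw from
    # the same distribution is a valid Model-X knockoff copy.
    X_knock = rng.standard_normal((n, p))
    b = lasso_cd(np.hstack([X, X_knock]), y, lam)
    W = np.abs(b[:p]) - np.abs(b[p:])    # coefficient-difference statistic
    t = knockoff_threshold(W, q)
    return np.flatnonzero(W >= t), W

# Small simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, p, k = 400, 50, 10
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = 5.0                           # first k features carry signal
y = X @ beta + rng.standard_normal(n)
selected, W = knockoff_lasso_select(X, y, lam=0.2, q=0.2, rng=rng)
print(selected)
```

A positive `W[j]` means the original variable entered the Lasso fit more strongly than its knockoff. The threshold uses the count of large negative `W` values as a conservative estimate of the number of false selections among the large positive ones.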

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/3995461b-b318-436c-b251-6b7be4a6e065

author

Weinstein, Asaf; Su, Weijie J.; Bogdan, Malgorzata; Foygel Barber, Rina and Candès, Emmanuel J.

publishing date

2023-06

type

Contribution to journal

publication status

published

subject

Probability Theory and Statistics

in

The Annals of Statistics

volume

51

issue

3

pages

1005-1029

publisher

Institute of Mathematical Statistics

external identifiers

scopus:85166340998

DOI

10.1214/23-AOS2274

language

English

LU publication?

no

id

3995461b-b318-436c-b251-6b7be4a6e065

date added to LUP

2023-12-11 12:10:32

date last changed

2025-04-04 15:19:37

@article{3995461b-b318-436c-b251-6b7be4a6e065,
  abstract     = {{Variable selection properties of procedures utilizing penalized-likelihood estimates are a central topic in the study of high-dimensional linear regression problems. Existing literature emphasizes the quality of ranking of the variables by such procedures, as reflected in the receiver operating characteristic curve or in prediction performance. Specifically, recent works have harnessed the modern theory of approximate message-passing (AMP) to obtain, in a particular setting, exact asymptotic predictions of the tradeoff between Type I and Type II errors for selection procedures that rely on ℓp-regularized estimators. In practice, effective ranking by itself is often not sufficient because some calibration for Type I error is required. In this work, we study theoretically the power of selection procedures that similarly rank the features by the size of an ℓp-regularized estimator, but further use Model-X knockoffs to control the false discovery rate in the realistic situation where no prior information about the signal is available. In analyzing the power of the resulting procedure, we extend existing results in AMP theory to handle the pairing between original variables and their knockoffs. This is used to derive exact asymptotic predictions for power. We apply the general results to compare the power of the knockoffs versions of Lasso and thresholded-Lasso selection, and demonstrate that in the i.i.d. covariate setting under consideration, tuning by cross-validation on the augmented design matrix is nearly optimal. We further demonstrate how the techniques also allow the analysis of the Type S error, and a corresponding notion of power, when selections are supplemented with a decision on the sign of the coefficient.}},
  author       = {{Weinstein, Asaf and Su, Weijie J. and Bogdan, Malgorzata and Foygel Barber, Rina and Candès, Emmanuel J.}},
  language     = {{eng}},
  number       = {{3}},
  publisher    = {{Institute of Mathematical Statistics}},
  journal      = {{The Annals of Statistics}},
  pages        = {{1005--1029}},
  title        = {{A power analysis for model-X knockoffs with ℓp-regularized statistics}},
  url          = {{http://dx.doi.org/10.1214/23-AOS2274}},
  doi          = {{10.1214/23-AOS2274}},
  volume       = {{51}},
  year         = {{2023}},
}

Lund University Publications
