Covariance assisted screening and estimation

Feb 20, 2014 - 4:15 PM

Date:	Thursday, February 20
Time:	4:10 pm -- 5:00 pm
Place:	Carver 202
Speaker:	Zheng (Tracy) Ke, Department of Statistics, Princeton University, Princeton, NJ

Abstract:

We consider the problem of variable selection in the very challenging regime where the signals (non-zero regression coefficients) are rare and weak and the columns of the design matrix are heavily correlated. We demonstrate that in the presence of Rare/Weak signals, many classical methods and ambitious contemporary algorithms face pitfalls. The situation worsens in the presence of heavy correlations among the design variables.

We propose a new variable selection approach which we call Covariance Assisted Screening and Estimation (CASE). CASE is a two-stage multivariate Screen and Clean algorithm. CASE has two layers of innovation. In the first layer, we alleviate the heavy correlations among the design variables by linear filtering and use the post-filtering model to construct a sparse graph. In the second layer, we use the sparse graph to guide both the screening and the cleaning.
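
The first layer can be illustrated with a small sketch. It is not the speaker's implementation: the toy change-point design, the second-order differencing filter, the tolerance, and all variable names are assumptions chosen for illustration, and boundary rows are simply dropped. The sketch shows a dense Gram matrix becoming sparse after linear filtering, and a graph being read off the support of the filtered matrix.

```python
import numpy as np

n = 8  # hypothetical toy problem size

# Assumed change-point style design: column j is a step that switches on at row j.
X = np.tril(np.ones((n, n)))
G = X.T @ X  # Gram matrix; here G[i, j] = n - max(i, j), so every entry is non-zero

# Assumed linear filter: second-order differencing, one row per interior index.
D = np.zeros((n - 2, n))
for i in range(n - 2):
    D[i, i:i + 3] = [1.0, -2.0, 1.0]

B = D @ G  # post-filtering matrix

tol = 1e-9  # hypothetical tolerance for treating an entry as non-zero
dense_fraction = float(np.mean(np.abs(G) > tol))
sparse_fraction = float(np.mean(np.abs(B) > tol))

# Read an undirected graph off the support of B: connect i and j when B[i, j] != 0.
rows, cols = np.nonzero(np.abs(B) > tol)
edges = sorted({(int(min(i, j)), int(max(i, j))) for i, j in zip(rows, cols) if i != j})

print(dense_fraction, sparse_fraction)
print(edges)
```

In this toy case each row of the filtered matrix keeps a single non-zero entry, so the resulting graph is a simple chain over neighbouring indices, whereas the unfiltered Gram matrix would link every pair of variables.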

We explain how CASE overcomes the well-known computational hurdle of multivariate screening. We also explain how CASE overcomes the so-called challenge of “signal cancellation”, so its success is not tied to strong signals or to any type of incoherence/irrepresentable condition. We set up a theoretical framework in which we show that CASE attains the optimal rate of convergence in terms of Hamming errors. We have successfully applied CASE to long-memory time series and to a change-point model, where the optimality is further investigated with the notion of the phase diagram.
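
For readers unfamiliar with the criterion, the sketch below shows one common way to count Hamming selection errors: the number of coordinates where the sign of the estimate differs from the sign of the true coefficient. The exact definition used in the talk is not given in this abstract, so the function name `hamming_error` and the example vectors are assumptions for illustration only.

```python
import numpy as np

def hamming_error(beta_hat, beta):
    """Count coordinates whose estimated sign differs from the true sign."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    beta = np.asarray(beta, dtype=float)
    return int(np.sum(np.sign(beta_hat) != np.sign(beta)))

# Hypothetical example: one false positive (index 2) and one missed signal (index 3).
beta_true = [0.0, 2.0, 0.0, -1.5, 0.0]
beta_est = [0.0, 1.8, 0.3, 0.0, 0.0]
errors = hamming_error(beta_est, beta_true)
print(errors)
```

Under this definition both a false positive and a missed signal cost one unit, so the count treats the two kinds of selection mistake symmetrically.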