Doctor of Philosophy (PhD)
This dissertation explores the use of the L1 norm in two statistical problems: multiple linear regression and diagnostic checking in time series. In recent years L1 shrinkage methods have become popular in linear regression because they can achieve simultaneous variable selection and parameter estimation. Their objective functions contain a least squares term and an L1 penalty term, which can produce sparse solutions (Fan and Li, 2001). The least absolute shrinkage and selection operator (Lasso) was the first L1-penalized method proposed and has been widely used in practice. However, the Lasso estimator has noticeable bias and can be inconsistent for variable selection. Zou (2006) proposed adaptive Lasso and proved its oracle properties under some regularity conditions. We investigate the performance of adaptive Lasso by applying it to the problem of detecting multiple undocumented change-points in climate data. Artificial factors such as relocation of weather stations, recalibration of measurement instruments and city growth can cause abrupt mean shifts in historical temperature data. These changes do not reflect the true atmospheric evolution and unfortunately are often undocumented for various reasons. It is imperative to locate these abrupt mean shifts so that the raw data can be adjusted to display only the true atmospheric evolution. We build a special linear model that accounts for long-term temperature change (global warming) through a linear trend and is characterized by p = n (the number of variables equals the number of observations). We apply adaptive Lasso to estimate the underlying sparse model and leave the trend parameter unpenalized in the objective function. The Bayesian Information Criterion (BIC) and the CM criterion (Caussinus and Mestre, 2004) are used to select the final model. Multivariate t simultaneous confidence intervals can then be used to screen the change-points detected by adaptive Lasso and so attenuate overestimation.
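As an illustration of the setup described above, one plausible way to write the mean-shift model and the adaptive Lasso criterion is sketched below. The notation (\mu, \beta, \delta_j, w_j, \lambda, \gamma) and the exact parameterization are assumptions made here for illustration. The abstract does not state them, and the dissertation's own formulation may differ.

```latex
% Hypothetical notation: y_t is the temperature at time t = 1, ..., n.
% The linear trend and the n - 1 step indicators give p = n predictor variables.
y_t = \mu + \beta t + \sum_{j=2}^{n} \delta_j \, \mathbf{1}(t \ge j) + \varepsilon_t

% Adaptive Lasso criterion with the trend coefficient \beta left unpenalized.
% Weights follow Zou (2006): w_j = 1 / |\tilde{\delta}_j|^{\gamma},
% where \tilde{\delta}_j is an initial estimate and \gamma > 0.
\min_{\mu, \beta, \delta} \;
  \sum_{t=1}^{n} \Bigl( y_t - \mu - \beta t
    - \sum_{j=2}^{n} \delta_j \, \mathbf{1}(t \ge j) \Bigr)^2
  + \lambda \sum_{j=2}^{n} w_j \, |\delta_j|
```

Under this parameterization a nonzero estimate of \delta_j marks a mean shift at time j, so a sparse solution corresponds to a small number of change-points.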
Because the oracle properties of adaptive Lasso are obtained under the condition of linear independence among the predictor variables, and multicollinearity is not uncommon in real data sets, adaptive Lasso should be used with caution. Zou and Hastie (2005) proposed the elastic net, whose objective function involves both L1 and L2 penalties, and claimed its superiority over Lasso in prediction. This procedure can identify a sparse model because of the L1 penalty and can handle multicollinearity because of the L2 penalty. Although Lasso and elastic net are favored over ordinary least squares and ridge regression because they perform variable selection, in the presence of multicollinearity ridge regression can outperform both Lasso and elastic net in prediction. The salient point is that no regression method dominates in all cases (Fan and Li, 2001; Zou, 2006; Zou and Hastie, 2005). One major flaw of both Lasso and elastic net is the unnecessary bias introduced by penalizing all parameters with the same norm. In this dissertation we propose a general and flexible framework for variable selection and estimation in linear regression. Our objective function automatically allows each parameter to be unpenalized or penalized by the L1 norm, the L2 norm, or both, based on parameter significance and variable correlation. The resulting estimator not only identifies the correct set of significant variables with large probability but also has smaller bias for nonzero parameters. Our procedure is a combinatorial optimization problem that can be solved by exhaustive search or, when exhaustive search is too time-consuming, by a genetic algorithm. Because the aim is a descriptive model, BIC is chosen as the model selection criterion.
Another application of the L1 norm considered in this dissertation is portmanteau testing in time series. The first step in time series regression is to determine whether significant serial correlation is present.
If initial investigations indicate significant serial correlation, the second step is to fit an autoregressive moving average (ARMA) process to parameterize the correlation function. Portmanteau tests are commonly used in these two steps to detect serial correlation or to assess the goodness-of-fit of the ARMA model. For small samples the commonly employed Ljung-Box portmanteau test (Ljung and Box, 1978) can have low power, so a more powerful small-sample test for detecting significant correlation is desirable. We develop such a test by considering the Cauchy estimator of correlation. Whereas the usual sample correlation is estimated through the L2 norm, the Cauchy estimator is based on the L1 norm. Asymptotic properties of the test statistic are obtained. The test compares very favorably with the Box-Pierce and Ljung-Box statistics in detecting autoregressive alternatives.
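For reference, the two benchmark statistics named above can be computed from the usual (L2-based) sample autocorrelations. The following is a minimal sketch in plain Python. The function names are illustrative choices rather than anything from the dissertation, and the Cauchy-based test itself is not reproduced because the abstract does not give its form.

```python
import random


def sample_autocorrelations(x, max_lag):
    """Return the usual sample autocorrelations r_1, ..., r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squared deviations
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def box_pierce(x, max_lag):
    """Box-Pierce statistic: n * sum_k r_k^2."""
    n = len(x)
    r = sample_autocorrelations(x, max_lag)
    return n * sum(rk * rk for rk in r)


def ljung_box(x, max_lag):
    """Ljung-Box statistic: n * (n + 2) * sum_k r_k^2 / (n - k)."""
    n = len(x)
    r = sample_autocorrelations(x, max_lag)
    return n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r, start=1))


if __name__ == "__main__":
    # Illustrative data only: a short AR(1) series with coefficient 0.5.
    random.seed(0)
    series = [0.0]
    for _ in range(29):
        series.append(0.5 * series[-1] + random.gauss(0.0, 1.0))
    print("Box-Pierce:", box_pierce(series, max_lag=5))
    print("Ljung-Box: ", ljung_box(series, max_lag=5))
```

When testing a raw series for serial correlation, both statistics are referred to a chi-square distribution with max_lag degrees of freedom. The Ljung-Box version rescales each squared autocorrelation by (n + 2) / (n - k), which matters most when n is small.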
Shen, Jie, "L1 methods for shrinkage and correlation" (2013). All Dissertations. 1259.