Doctor of Philosophy (PhD)
This dissertation develops a minimum description length (MDL) multiple changepoint detection procedure that allows for prior distributions. MDL methods, which are penalized likelihood techniques with penalties based on data description-length information principles, have been successfully applied to many recent multiple changepoint problems. This work shows how to modify the MDL penalty to account for various prior knowledge. Our motivation lies in climatology. Here, a metadata record, which is a file listing times when a recording station physically moved, instrumentation was changed, etc., sometimes exists. While metadata records are notoriously incomplete, they permit the construction of a prior distribution that helps detect changepoints. This allows both documented and undocumented changepoints to be analyzed in tandem. The method developed here takes into account 1) metadata, 2) reference series, 3) seasonal means, and 4) autocorrelations. Asymptotically, our estimated multiple changepoint configuration of monthly data is shown to be consistent. The methods are illustrated in the analysis of 114 years of monthly temperatures from Tuscaloosa, Alabama. The multivariate aspect of the methods allows maximum and minimum temperatures to be jointly studied. A method for homogenizing daily temperature series is also developed. While daily temperatures have a complex structure, statistical techniques have been accumulating that can now accommodate all of the salient characteristics of daily temperatures. The goal here is to combine these techniques in a reasonable manner for multiple changepoint identification in daily series; computational speed is key as a century of daily data has over 36,000 data points. Autocorrelation aspects are important since correlation can destroy changepoint techniques and sample correlations of day-to-day temperature anomalies are often as large as 0.7.
While homogenized daily temperatures may not be as useful as homogenized monthly or yearly temperatures, homogenization done on a daily scale affords one greater statistical precision. It is relatively easy to visually discern two changepoints (breakpoints) two years apart with daily data, but virtually impossible to see them in annual series. The methods are applied to 46 years of daily data at South Haven, Michigan.
Anuradha Priyadarshani, Hewa, "Bayesian Minimum Description Length Techniques for Multiple Changepoint Detection" (2015). All Dissertations. 1573.