#### Date of Award

12-2016

#### Document Type

Dissertation

#### Degree Name

Doctor of Philosophy (PhD)

#### Legacy Department

Mathematical Sciences

#### Committee Member

Dr. Patrick Gerard, Committee Chair

#### Committee Member

Dr. William Bridges

#### Committee Member

Dr. Colin Gallagher

#### Committee Member

Dr. Julia Sharp

#### Abstract

Multistage sampling is a common sampling technique in many studies. A challenge presented by multistage sampling schemes is that an additional random term must be introduced into the linear model. Observations are identically distributed but not independent, so many traditional kernel smoothing techniques, which assume that the data are independent and identically distributed, may not produce reasonable estimates of the marginal density. Breunig (2001) proposed a method to account for the intra-class correlation, leading to a complex bandwidth expression involving high-order derivatives of a bivariate kernel density estimate. We consider an alternative approach in which the data are grouped into multiple random samples by taking one observation from each class, and a kernel density estimate is constructed for each sample. A weighted average of these kernel density estimates yields a simple expression for the optimal bandwidth that accounts for the intra-class correlation. For unbalanced data, resampling methods are implemented to ensure that each class is included in every random sample. Both simulation and analytical results are provided.

One-sided tolerance intervals are confidence intervals for percentiles. Many authors have proposed methods for estimating one-sided tolerance limits for both random samples and hierarchical data, and most of these methods assume that the population is normally distributed. Since multistage sampling is a popular sampling scheme, we would like methods that avoid such assumptions about the population. We explore non-parametric methods that use bootstrapping and/or kernel density estimation to produce data-driven percentile estimates. One way to account for hierarchical data is to decompose the observations in a manner consistent with the decomposition of the sums of squares in the analysis of a one-way random effects model. We provide a simulation study with two percentiles of interest.
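The grouping idea in the first part of the abstract can be sketched as follows. This is a minimal illustration, not the dissertation's estimator: it assumes balanced data, equal weights across samples, and SciPy's `gaussian_kde` with its default bandwidth rather than the optimal bandwidth derived in the dissertation.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Simulated balanced one-way random effects data:
# 20 classes, 5 observations per class.
n_classes, n_per_class = 20, 5
class_effects = rng.normal(0.0, 1.0, size=n_classes)
data = class_effects[:, None] + rng.normal(0.0, 1.0, size=(n_classes, n_per_class))

# Shuffle within each class, then take column j: each of the n_per_class
# random samples contains exactly one observation from every class, so the
# observations within a sample are independent.
shuffled = np.apply_along_axis(rng.permutation, 1, data)
samples = [shuffled[:, j] for j in range(n_per_class)]

# Equal-weight average of the per-sample kernel density estimates.
grid = np.linspace(data.min() - 1.0, data.max() + 1.0, 200)
density = np.mean([gaussian_kde(s)(grid) for s in samples], axis=0)
```

Because each sub-sample is an i.i.d. sample, standard bandwidth theory applies within each one; the averaged estimate then inherits a simple optimal-bandwidth expression, which is the point of the approach described above.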

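The non-parametric tolerance limits in the second part can be illustrated with a basic bootstrap percentile bound for a simple random sample. This is a generic sketch only: the function name and defaults are hypothetical, it ignores the hierarchical decomposition described in the abstract, and it uses the plain percentile bootstrap rather than the dissertation's methods.

```python
import numpy as np

def bootstrap_upper_tolerance_limit(x, p=0.90, conf=0.95, n_boot=2000, seed=0):
    """Upper confidence bound on the p-th percentile of the sampled
    population, via the bootstrap percentile method. An upper tolerance
    limit with content p and confidence conf is exactly such a bound."""
    rng = np.random.default_rng(seed)
    # Resample the data with replacement, n_boot times.
    boot = rng.choice(x, size=(n_boot, len(x)), replace=True)
    # p-th sample percentile of each bootstrap resample.
    q = np.quantile(boot, p, axis=1)
    # conf-level upper bound on the population percentile.
    return np.quantile(q, conf)

# Example on a standard normal sample (true 90th percentile is about 1.28).
rng = np.random.default_rng(1)
x = rng.normal(size=200)
limit = bootstrap_upper_tolerance_limit(x, p=0.90, conf=0.95)
```

For hierarchical data, resampling at the observation level understates the between-class variability; the decomposition-based approach mentioned in the abstract addresses this, but is beyond this sketch.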
#### Recommended Citation

Wilson, Christopher, "Kernel Smoothing and Tolerance Intervals for Hierarchical Data" (2016). *All Dissertations*. 1816.

https://tigerprints.clemson.edu/all_dissertations/1816