Degree Name

Doctor of Philosophy (PhD)


Gao, Shuhong

Committee Member

Maharaj, Hiren

Committee Member

Dimitrova, Elena


Despite the development of antiviral drugs and the optimization of therapies, the emergence of drug resistance remains one of the most challenging obstacles to the successful treatment of HIV-infected patients. The availability of massive HIV drug resistance data brings not only exciting opportunities for HIV research but also the curse of high dimensionality.
In this thesis, we provide several statistical learning methods for analyzing sequence data from different perspectives. We propose a hierarchical random graph approach to identify possible covariation among residue-specific mutations. Viral progression pathways have been inferred in the literature using an EM-like algorithm, and we present a normalization method that improves the accuracy of parameter estimation. To predict drug resistance from genotypic data, we also build a novel regression model that uses information from progression pathways. Finally, we introduce a computational approach to determining viral fitness, for which our initial computational results closely agree with experimental results.
Work on two other topics is presented in the Appendices. Latent class models find applications in several areas, including the social and biological sciences, but finding explicit maximum likelihood estimates has been elusive. We present a positive solution to a conjecture on a special latent class model proposed by Bernd Sturmfels of UC Berkeley.
Monomial ideals provide ubiquitous links between combinatorics and commutative algebra. Irreducible decomposition of monomial ideals is a basic computational problem with applications in several areas. We present two algorithms for finding irreducible decompositions of monomial ideals.