Date of Award


Document Type


Degree Name

Master of Science (MS)


School of Computing

Committee Chair/Advisor

Dr. Amy Apon

Committee Member

Dr. Nina Hubig

Committee Member

Dr. Brian Dean


Medical coding is the process by which standardized medical codes are assigned to patient health records. This is a complex and challenging task that typically requires an expert human coder to review health records and assign codes from a classification system based on a standard set of rules. Considering the downstream use of these codes in statistical analysis, billing, and patient care, improving the accuracy and efficiency of the medical coding process through automation could have a far-reaching impact on the healthcare domain. Since health records typically consist of a large proportion of free-text documents, this problem has traditionally been approached as a natural language processing (NLP) task. While machine learning-based methods have seen recent popularity on this task, they tend to struggle with codes that are assigned less frequently, for which little or no training data exists. In this thesis, we utilize the open-source programming language for natural language processing, NLP++, and its associated integrated development environment to design and build an automated system to assign International Classification of Diseases (ICD) codes to discharge summaries that functions in the absence of labeled training data. We evaluate our system using the MIMIC-III dataset and compare our results to supervised approaches. Results show that for datasets where labels are sparse, our approach matches state-of-the-art machine learning approaches. It is somewhat less effective for densely labeled datasets, but provides additional support for explainability and adaptability. Overall, our approach presents an effective pathway for code assignment in clinical documents by providing both competitive performance and enhanced explainability.

Included in

Data Science Commons


