Date of Award

12-2022

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Genetics and Biochemistry

Committee Chair/Advisor

Liangjiang Wang

Committee Member

Lukasz Kozubowski

Committee Member

Hong Luo

Committee Member

Trudy Mackay

Abstract

In the central nervous system, synapses are essential junctions that connect neurons and play important roles in neurotransmission and synaptic plasticity. Human synapse genomics poses many challenges, but machine learning techniques, which can mine and interpret large amounts of genomic data, may facilitate functional studies of human synapses. In this study, we have developed machine learning models for human synapse genomics to address several biological problems.

RNA localization plays an important role at the synapse, allowing the local protein synthesis required for synaptic plasticity during brain development. Previous studies in mice and rats investigated the subcellular localization of RNAs and its impact on synaptic plasticity. However, owing to the experimental difficulties of studying the human synaptic transcriptome, the full population of human synaptic RNAs remains largely unknown. We have developed a new machine learning method, PredSynRNA, to predict the synaptic localization of human RNAs by using developmental brain gene expression data. PredSynRNA can be used to predict and prioritize candidate RNAs localized to human synapses, providing valuable targets for experimental investigations in neuronal studies.

Long non-coding RNAs (lncRNAs) are a class of non-coding RNAs (ncRNAs) with little protein-coding potential due to the lack of an open reading frame. LncRNAs are emerging as important regulators in neuronal development, synaptic plasticity, and complex brain disorders. However, only a few synapse-related lncRNAs have been identified and characterized. In this study, we have built a new machine learning method, SynLnc, to predict human synapse-related lncRNAs by mining developmental brain gene expression data with collaborative embedding, a technique commonly used in recommender systems. High-confidence candidate lncRNAs that co-express with known synaptic genes in genomic proximity may be valuable experimental targets for future research.

Liquid-liquid phase separation (LLPS) is a physiological process essential for the formation of membraneless compartments, which are pervasive in cells and synaptic regions. While previous studies attempted to predict phase-separated proteins with conventional feature encoding and laborious feature engineering, natural language processing (NLP) techniques have not been sufficiently applied in this field. In this study, we applied a state-of-the-art deep protein language model framework to predict proteins with LLPS propensity and synaptic functions. The constructed models achieved good performance in both learning tasks, showing the promise of deep sequence representation learning with advanced NLP techniques. Overall, we expect that the models and results will provide valuable information for studying human synapses.

Recommended Citation

Wei, Anqi, "Machine Learning Models for Human Synapse Genomics" (2022). All Dissertations. 3186.
https://tigerprints.clemson.edu/all_dissertations/3186

Additional file C-1.xlsx (74 kB)
Additional file C-2.xlsx (23 kB)
Additional file C-3.xlsx (13 kB)
Additional file C-4.xlsx (108 kB)
Additional file C-5.xlsx (182 kB)
Additional file C-6.xlsx (544 kB)

Included in

Bioinformatics Commons
