Date of Award

January 2022

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Biology

First Advisor

Manu Manu

Abstract

Mammals produce hundreds of billions of new blood cells every day through a process known as hematopoiesis. Hematopoiesis starts with stem cells that develop into all the different types of cells found in blood by changing their genome-wide gene expression. The remodeling of genome-wide gene expression can be primarily attributed to a special class of proteins called transcription factors (TFs) that can activate or repress other genes, including genes encoding TFs. TFs and their targets therefore form recurrent networks called gene regulatory networks (GRNs). GRNs are crucial during physiological developmental processes, such as hematopoiesis, while abnormalities in the regulatory interactions of GRNs can be detrimental to the organism. To this day we do not know all the key components that comprise hematopoietic GRNs or the complete set of their regulatory interactions. Inference of GRNs directly from genetic experiments is low throughput and labor intensive, while computational inference of comprehensive GRNs is challenging due to high processing times. This dissertation focuses on deriving the architecture and the dynamics of hematopoietic GRNs from genome-wide gene expression data obtained from high-resolution time-series experiments. The dissertation also aims to address the technical challenge of speeding up the process of GRN inference. Here GRNs are inferred and modeled using gene circuits, a data-driven method based on ordinary differential equations (ODEs). In gene circuits, the rate of change of a gene product depends on regulatory influences from other genes, encoded as a set of parameters that are inferred from time-series data. A twelve-gene GRN comprising genes encoding key TFs and cytokine receptors involved in erythrocyte-neutrophil differentiation was inferred from a high-resolution time-series dataset of the in vitro differentiation of a multipotential cell line.
The inferred GRN architecture agreed with prior empirical evidence and predicted novel regulatory interactions. The inferred GRN model was also able to predict the outcome of perturbation experiments, suggesting an accurate inference of GRN architecture. The dynamics of the inferred GRN suggested an alternative explanation to the currently accepted sequence of regulatory events during neutrophil differentiation. The analysis of the model implied that two TFs, C/EBPα and Gfi1, initiate cell-fate choice in the neutrophil lineage, while PU.1, believed to be a master regulator of all white blood cells, is activated only later. This inference was confirmed in a single-cell RNA-Seq dataset from mouse bone marrow, in which PU.1 upregulation was preceded by C/EBPα and Gfi1 upregulation. This dissertation also presents an analysis of a high-temporal-resolution genome-wide gene expression dataset of in vitro macrophage-neutrophil differentiation. Analysis of these data reveals that genome-wide gene expression during differentiation is highly dynamic and complex. A large-scale transition is observed around 8 hours and shown to be related to widespread physiological remodeling of the cells. The genes associated with myeloid differentiation mainly change during the first 4 hours, implying that the cell-fate decision takes place within that window. The dissertation also presents a new classification-based model-training technique that addresses the challenge of the high computational cost of inferring GRNs. This method, called Fast Inference of Gene Regulation (FIGR), is demonstrated to be two orders of magnitude faster than global non-linear optimization techniques, and its computational complexity scales much better with GRN size.
This work has demonstrated that relatively large, realistic GRNs can be simulated using a dynamical and mechanistically accurate model coupled to high-resolution time-series data, and that such models can yield novel biological insight. Taken together with the macrophage-neutrophil dataset and the computationally efficient GRN inference methodology, this work should open up new avenues for modeling more comprehensive GRNs in hematopoiesis and the broader field of developmental biology.
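
As an illustration of the gene circuit idea described above, the following minimal sketch simulates a toy two-gene network in which each gene's rate of change is a saturating function of the weighted regulatory inputs from the other genes, minus first-order decay. The sigmoid form is a common formulation in the gene circuit literature and is an assumption here; the abstract does not state the exact equations. All names (`T`, `h`, `R`, `lam`) and parameter values are hypothetical and chosen for illustration only; they are not the twelve-gene model or its inferred parameters.

```python
import math


def sigmoid(u):
    """Saturating regulation-expression function (assumed form)."""
    return 1.0 / (1.0 + math.exp(-u))


def gene_circuit_rhs(x, T, h, R, lam):
    """Rate of change of each gene product.

    dx_i/dt = R_i * sigmoid(sum_j T[i][j] * x_j + h_i) - lam_i * x_i

    T[i][j] > 0 means gene j activates gene i; T[i][j] < 0 means repression.
    """
    n = len(x)
    dxdt = []
    for i in range(n):
        u = sum(T[i][j] * x[j] for j in range(n)) + h[i]
        dxdt.append(R[i] * sigmoid(u) - lam[i] * x[i])
    return dxdt


def simulate(x0, T, h, R, lam, dt=0.01, steps=1000):
    """Integrate the circuit with a simple forward Euler scheme."""
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        dx = gene_circuit_rhs(x, T, h, R, lam)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        trajectory.append(list(x))
    return trajectory


# Hypothetical toy parameters: two genes that activate themselves
# and repress each other.
T = [[2.0, -4.0],
     [-4.0, 2.0]]
h = [0.0, 0.0]      # basal regulatory input
R = [1.0, 1.0]      # maximum synthesis rates
lam = [1.0, 1.0]    # first-order decay rates

traj = simulate([0.6, 0.4], T, h, R, lam)
print("final state:", traj[-1])
```

In an inference setting, the entries of `T`, `h`, `R`, and `lam` would be the unknowns fitted to time-series expression data, so that the signs and magnitudes of `T` describe the inferred network architecture.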
