Date of Award
December 2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Biomedical Engineering
First Advisor
Sandeep K. Singhal
Abstract
Arsenic is a naturally occurring carcinogenic element that is ubiquitous in the environment. It is linked to DNA damage via reactive oxygen species and epigenetic alteration and has been associated with oncogenesis in the lungs, bladder, skin, and liver. Lung cancer is both the most frequently diagnosed cancer worldwide and the deadliest, comprising approximately 12.4% of all cancers globally and causing more deaths than any other cancer. Roughly 90% of cases are attributed to tobacco smoking, which exposes the user to arsenic trioxide, and carcinogenicity is synergistic when exposure includes both smoking and consumption of arsenic-contaminated drinking water. A novel set of 147 genes associated with arsenic exposure and correlated with molecular cancer hallmarks was recently identified and analyzed to develop a risk prediction model for bladder cancer tumorigenesis. This research examines the same 147 genes in publicly available gene expression datasets and uses a logistic regression model to identify the genes most predictive of lung cancer risk. A panel of four genes (ARHGEF10, ADARB1, CRIM1, SEC14L1) showed strong diagnostic capability on both the training and validation datasets, yielding AUCs of 0.894 (95% CI: 0.857–0.931) and 0.831 (95% CI: 0.794–0.868), respectively. The predictive value of the model was then tested with the addition of clinical covariates from the available metadata: smoking status, age, and sex. The model performed best when all three clinical covariates were included, achieving an AUC of 0.929 (95% CI: 0.903–0.955) on training data and 0.871 (95% CI: 0.836–0.907) on validation data.
All models were statistically significant (p < 0.001). These results support the relevance of clinical variables in lung cancer risk prediction, the translational and clinical potential of the model in personalized cancer risk assessment, and the utility of toxicogenomic signals for cross-cancer analysis.
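The general shape of such an analysis (fit a logistic regression on a small gene panel, then report AUC with a 95% confidence interval on training and validation data) can be sketched as follows. This is only an illustrative sketch and not the dissertation's pipeline: the expression values and effect sizes are simulated, the helper names are hypothetical, and the Hanley-McNeil normal approximation is assumed for the interval because the abstract does not state which method was used.

```python
import math
import random

GENES = ["ARHGEF10", "ADARB1", "CRIM1", "SEC14L1"]  # panel named in the abstract


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, rng):
    # Synthetic stand-in for expression data; these weights are made up.
    true_w = [1.0, -1.0, 0.8, -0.6]
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in GENES]
        p = sigmoid(sum(w * v for w, v in zip(true_w, x)))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


def fit_logistic(X, y, lr=0.1, epochs=500):
    # Batch gradient descent on the mean log-loss.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def predict(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def auc(scores, labels):
    # Probability that a random case outscores a random control (ties count 0.5).
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def auc_ci(a, n_pos, n_neg, z=1.96):
    # Hanley-McNeil standard error with a normal-approximation interval.
    q1 = a / (2.0 - a)
    q2 = 2.0 * a * a / (1.0 + a)
    var = (a * (1.0 - a) + (n_pos - 1) * (q1 - a * a)
           + (n_neg - 1) * (q2 - a * a)) / (n_pos * n_neg)
    se = math.sqrt(var)
    return max(0.0, a - z * se), min(1.0, a + z * se)


rng = random.Random(0)
X_train, y_train = simulate(200, rng)
X_valid, y_valid = simulate(200, rng)

w, b = fit_logistic(X_train, y_train)
train_auc = auc(predict(X_train, w, b), y_train)
valid_auc = auc(predict(X_valid, w, b), y_valid)
valid_lo, valid_hi = auc_ci(valid_auc, sum(y_valid), len(y_valid) - sum(y_valid))

print(f"training AUC:   {train_auc:.3f}")
print(f"validation AUC: {valid_auc:.3f} (95% CI: {valid_lo:.3f}-{valid_hi:.3f})")
```

Clinical covariates such as smoking status, age, and sex could be appended as extra columns of each feature vector in the same way; in practice an established statistics library would normally replace the hand-written fitting loop.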
Recommended Citation
Kennedy, Joshua, "Cross Cancer Toxicogenomic Modeling Of Arsenic Exposure Identifies A Four Gene Signature For Lung Cancer Risk" (2025). Theses and Dissertations. 8227.
https://commons.und.edu/theses/8227