Date of Award

January 2022

Document Type


Degree Name

Doctor of Philosophy (PhD)


Petroleum Engineering

First Advisor

Minou Rabiei


The sharp increase in oil and gas production in the Williston Basin of North Dakota since 2006 has resulted in a significant increase in produced water volumes. The primary mechanism for disposing of produced water is injection into the underground Inyan Kara Formation through Class II Saltwater Disposal (SWD) wells. With the number of SWD wells anticipated to increase from 900 to over 1,400 by 2035, and with localized pressurization and other potential issues that could affect the performance of future oil and SWD wells, there was a need for a reliable model to select locations of future SWD wells for optimum performance. Because traditional geological and simulation models are rarely developed for SWD wells, this research focused on developing data-driven proxy models, based on the CRISP-DM (Cross-Industry Standard Process for Data Mining) pipeline, for understanding SWD well performance and optimizing future well locations. The Oil and Gas Division of the North Dakota Industrial Commission (NDIC) was identified as the primary data source. Significant effort went into identifying secondary data sources, extracting the required data from the primary and secondary sources using web scraping, integrating different data types, including spatial data, and creating the final data set. The Orange visual programming application and the Python programming language were used to carry out the required data mining activities. Exploratory Data Analysis and clustering analysis were used to understand the features in the data set and their relationships. Graph Data Science techniques such as Knowledge Graphs and graph-based clustering were used to gain further insights. Machine Learning regression algorithms such as Multiple Linear Regression, k-Nearest Neighbors and Random Forest were used to train models to predict the average monthly barrels of saltwater disposed in a well. Model performance was optimized using the root-mean-square error (RMSE) metric, and the Random Forest model was selected as the final model for deployment to predict the performance of a planned SWD well.
A multi-target regression model was trained using a deep neural network to predict water production in oil and gas wells drilled in McKenzie County, North Dakota.
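
The model-selection step mentioned in the abstract, comparing candidate regressors by RMSE on held-out data, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the dissertation: the data are synthetic, there is a single hypothetical feature, and only two simple standard-library models (a one-feature least-squares fit and a k-nearest-neighbours regressor) stand in for the algorithms named above. No Random Forest is included in this sketch.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def fit_linear(xs, ys):
    """One-feature ordinary least squares; returns a predict function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=5):
    """k-nearest-neighbours regression: mean target of the k closest points."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


# Synthetic, hypothetical data: a nonlinear response with Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(300)]
ys = [50 * math.sin(x) + 5 * x + rng.gauss(0, 2) for x in xs]

# Simple hold-out split: first 200 rows for training, the rest for testing.
train_x, test_x = xs[:200], xs[200:]
train_y, test_y = ys[:200], ys[200:]

models = {
    "linear": fit_linear(train_x, train_y),
    "knn": fit_knn(train_x, train_y, k=5),
}

# Score every candidate on the held-out rows and keep the lowest RMSE.
scores = {
    name: rmse(test_y, [model(x) for x in test_x])
    for name, model in models.items()
}
best = min(scores, key=scores.get)
print(scores, "selected:", best)
```

The same selection loop applies unchanged to any set of fitted models that expose a prediction function, so library estimators could be substituted for the two hand-written ones.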