Date of Award

January 2015

Document Type


Degree Name

Master of Science (MS)


Electrical Engineering

First Advisor

Prakash Ranganathan


With the large-scale deployment of phasor measurement units (PMUs) in the United States, a recurring question has been how to extract useful “information” or “knowledge” from the voluminous data generated by supervisory control and data acquisition (SCADA) systems, PMUs, and advanced metering infrastructure (AMI).

With a sampling rate of 30 to as high as 120 samples per second, PMUs provide fine-grained monitoring of the power grid through time-synchronized measurements of voltage, current, frequency, and phase angle. Running continuously at 30 samples per second, a sensor produces 2,592,000 samples of data every day (30 samples per second over 86,400 seconds). This large volume of data needs to be processed efficiently to extract information for better decision making in a smart grid (SG) network environment.
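The daily figure quoted above follows directly from the reporting rate. The short sketch below reproduces that arithmetic, assuming one continuously running measurement stream; the function name and the intermediate 60 samples-per-second rate are illustrative choices, not taken from the thesis.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def samples_per_day(rate_hz: int) -> int:
    """Samples produced in one day by a stream reporting rate_hz times per second."""
    return rate_hz * SECONDS_PER_DAY


# 30 and 120 are the bounds given in the abstract; 60 is an illustrative midpoint.
for rate in (30, 60, 120):
    print(f"{rate:>3} samples/s -> {samples_per_day(rate):,} samples/day")
```

At the upper bound of 120 samples per second, the same calculation gives four times the daily volume of the 30 samples-per-second case.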

My research presents a flexible software framework for processing streaming data sets in smart-grid applications. The proposed Integrated Software Suite (ISS) mines the data using various clustering algorithms to support better decision making. Decisions based on the proposed methods can help electric grid system operators reduce blackouts, instabilities, and oscillations in the smart grid. The research focuses primarily on integrating density-based clustering (DBSCAN) and variations of k-means clustering to capture specific types of anomalies or faults. A novel method, multi-tier k-means, was developed to cluster the PMU data. Such a grouping scheme will enable system operators to make better decisions. Different fault conditions, such as voltage, current, phase angle, or frequency deviations, and generation and load trips, are investigated, and a comparative analysis of the three methods is presented.
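To illustrate the general idea of grouping PMU samples so that a fault stands apart from normal operation, the sketch below runs plain (single-tier) k-means on a handful of synthetic frequency readings. It is not the multi-tier k-means or the DBSCAN configuration developed in the thesis, whose details are not given here. The data values, the function name, and the heuristic of treating the smaller cluster as the candidate anomaly are all assumptions made for illustration.

```python
from statistics import mean


def kmeans_1d(values, k=2, max_iter=100):
    """Plain Lloyd's k-means on scalar samples (k >= 2); returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated
    return centroids, labels


# Synthetic frequency readings in Hz (hypothetical, not from the thesis data):
# nominal operation near 60 Hz with a short dip at indices 6 to 8.
frequency_hz = [60.00, 60.01, 59.99, 60.02, 59.98, 60.00,
                59.71, 59.69, 59.70, 60.01, 59.99, 60.00]

centroids, labels = kmeans_1d(frequency_hz, k=2)

# Illustrative heuristic: treat the smaller cluster as the candidate anomaly.
minority = min(range(2), key=labels.count)
flagged = [i for i, lab in enumerate(labels) if lab == minority]
print("centroids:", centroids)
print("flagged sample indices:", flagged)
```

A density-based method such as DBSCAN approaches the same task differently: it labels points in sparse regions as noise and does not require the number of clusters to be fixed in advance.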

A collection of forecasting techniques has also been applied to PMU datasets. The datasets considered are from the PJM Corporation and describe the energy demand for 13 states and the District of Columbia (DC). The applicability and suitability of random forest (RF), locally weighted scatterplot smoothing (LOWESS), and seasonal autoregressive integrated moving average (SARIMA) forecasting techniques for PMU data have been investigated. The approaches are tested against standardized error indices, such as mean absolute percentage error (MAPE), mean squared error (MSE), root mean squared error (RMSE), and normal percentage error (PCE), to compare their performance. It is observed that the proposed hybrid combination of RF and SARIMA can be used with good results in day-ahead forecasting of load dispatch.
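Three of the error indices named above have standard textbook definitions, sketched below in plain Python. The load values are made up for illustration and are not PJM data. PCE is omitted because its definition is not given in this abstract.

```python
from math import sqrt


def mape(actual, forecast):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mse(actual, forecast):
    """Mean squared error, in squared units of the input."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error, in the same units as the input."""
    return sqrt(mse(actual, forecast))


# Hypothetical hourly loads in MW, for illustration only.
actual_mw = [100.0, 200.0, 400.0]
forecast_mw = [110.0, 190.0, 400.0]

print(f"MAPE: {mape(actual_mw, forecast_mw):.2f} %")
print(f"MSE:  {mse(actual_mw, forecast_mw):.2f} MW^2")
print(f"RMSE: {rmse(actual_mw, forecast_mw):.2f} MW")
```

MAPE is scale-free, which makes it convenient for comparing forecasts across regions with different load levels, while MSE and RMSE penalize large individual errors more heavily.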