Date of Award

January 2015

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Ronald A. Marsh

Abstract

New technologies have rapidly expanded the ability to gather data from many types of information sources. The ability to store and retrieve large quantities of data from these sources has created a need for computing platforms that can process that data into information. High Performance Computing Cluster (HPCC) systems have been developed to provide the fast processing of large amounts of data that many demanding computing applications require.

Beowulf Clusters combine many separate compute nodes into a tightly coupled parallel HPCC system. How well a Beowulf Cluster HPCC system processes data depends on how quickly its compute nodes can retrieve, share, and store data. When many compute nodes compete to exchange data over limited network connections, network congestion can occur and slow computations.

Given concerns about network performance optimization and the uneven distribution of computational capacity, Beowulf HPCC System Administrators need to be able to evaluate real-time data transfer metrics for congestion within a particular HPCC system. In this thesis, heat maps are created to identify potential InfiniBand network congestion caused by simultaneous data exchanges between compute nodes.
