
CHPC - Research Computing Support for the University

In addition to deploying and operating high performance computational resources and providing advanced user support and training, CHPC serves as an expert team to broadly support the increasingly diverse research computing needs on campus. These needs include support for big data, big data movement, data analytics, security, virtual machines, Windows science application servers, protected environments for data mining and analysis of protected health information, and advanced networking. Visit our Getting Started page for more information.

Uncertainty Quantification of RNA-Seq Co-expression Networks

By Lance Pflieger and Julio Facelli, Department of Biomedical Informatics

Systems biology uses the complex and copious data originating from the “omics” fields to increase understanding of biology by studying interactions among biological entities. Gene co-expression network analysis is a systems biology technique derived from graph theory that uses RNA expression data to infer functionally similar genes or regulatory pathways. It is a computationally intensive process that requires matrix operations on tens of thousands of genes or transcripts. This technique has been useful in drug discovery, in the functional annotation of genes, and in providing insight into disease pathology.
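To make the matrix operations concrete, here is a minimal sketch of how a weighted co-expression network can be built from an expression matrix. It is not the authors' pipeline: the function name `coexpression_adjacency`, the toy data, and the soft-threshold power `beta=6` are assumptions chosen for illustration, and the soft-thresholded absolute correlation is only one common way to define the network.

```python
import numpy as np

def coexpression_adjacency(expr, beta=6):
    """Build a weighted adjacency matrix from a genes x samples array.

    beta is an assumed soft-threshold power, not a value from the article.
    """
    corr = np.corrcoef(expr)      # Pearson correlation between gene rows
    adj = np.abs(corr) ** beta    # soft threshold: weak correlations shrink toward 0
    np.fill_diagonal(adj, 0.0)    # drop self-connections
    return adj

# Toy data (hypothetical): 5 genes measured in 20 samples.
rng = np.random.default_rng(0)
expr = rng.normal(size=(5, 20))
expr[1] = expr[0] + 0.1 * rng.normal(size=20)  # make gene 1 track gene 0

adj = coexpression_adjacency(expr)
connectivity = adj.sum(axis=1)    # total connection strength per gene
```

The correlation step produces a genes x genes matrix, so memory and compute grow with the square of the number of genes, which is why the analysis becomes expensive at tens of thousands of transcripts.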

To assess the effect of the uncertainty inherent in gene expression data, our group used CHPC resources to characterize variation in gene expression estimates and to simulate a large number of co-expression networks based on this variation. The figure shown is a representation of a network generated using WGCNA and expression data from the disease Spinocerebellar Ataxia Type 2 (SCA2). The colors represent highly connected subnetworks of genes, which are used to correlate similar gene clusters with a phenotypic trait. Our results show that uncertainty has a large effect on downstream results, including subnetwork structure, hub gene identification, and enrichment analysis. For instance, we find that the number of subnetworks correlating with the SCA2 phenotype varies from 1 to 6. While a small gene co-expression network analysis can be performed using only modest computational resources, the resources required to perform uncertainty quantification (UQ) using Monte Carlo ensemble methods are several orders of magnitude larger, and resources at that scale are available only at CHPC.
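The Monte Carlo ensemble idea can be sketched in a few lines: draw many perturbed expression matrices, rebuild the network for each draw, and record how a downstream result changes. The sketch below is illustrative only and is not the authors' method. The Gaussian noise model, the function names `hub_gene` and `monte_carlo_hubs`, the power `beta=6`, and the toy inputs are all assumptions, and it tracks only the most connected gene rather than the WGCNA subnetworks discussed above.

```python
import numpy as np

def hub_gene(expr, beta=6):
    """Return the index of the most connected gene in one network."""
    corr = np.corrcoef(expr)                 # gene-gene Pearson correlation
    adj = np.abs(corr) ** beta               # assumed soft-threshold power
    np.fill_diagonal(adj, 0.0)               # drop self-connections
    return int(np.argmax(adj.sum(axis=1)))   # gene with highest connectivity

def monte_carlo_hubs(mean_expr, sd_expr, n_draws=200, seed=0):
    """Fraction of ensemble members in which each gene is the top hub.

    Assumes independent Gaussian uncertainty around each expression
    estimate; the real uncertainty model may differ.
    """
    rng = np.random.default_rng(seed)
    counts = np.zeros(mean_expr.shape[0], dtype=int)
    for _ in range(n_draws):
        draw = rng.normal(mean_expr, sd_expr)  # one perturbed expression matrix
        counts[hub_gene(draw)] += 1
    return counts / n_draws

# Toy inputs (hypothetical): 8 genes, 12 samples, uniform uncertainty.
rng = np.random.default_rng(1)
mean_expr = rng.normal(size=(8, 12))
sd_expr = np.full(mean_expr.shape, 0.5)

hub_freq = monte_carlo_hubs(mean_expr, sd_expr)
```

If one gene is the hub in nearly every draw, that result is robust to the assumed uncertainty; a flat `hub_freq` indicates the hub assignment is unstable. Each draw is independent, so the loop parallelizes naturally across cluster cores.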

System Status

General Environment

last update: 2019-06-19 11:23:03
General Nodes
system      cores (in use/total)   % util.
ember       960/960                100%
kingspeak   848/848                100%
notchpeak   1212/1212              100%
lonepeak    1100/1100              100%

Owner/Restricted Nodes
system      cores (in use/total)   % util.
ash         5240/7300              71.78%
notchpeak   1720/1720              100%
ember       708/1220               58.03%
kingspeak   5832/5916              98.58%
lonepeak    400/400                100%

Protected Environment

last update: 2019-06-19 11:20:04
General Nodes
system      cores (in use/total)   % util.
redwood     60/500                 12%

Owner/Restricted Nodes
system      cores (in use/total)   % util.
redwood     2192/3232              67.82%

Cluster Utilization

Last Updated: 6/19/19