The WFU DEAC Cluster provides the critical infrastructure necessary for researchers to reliably upload research codes, perform large scale computations, store their actively utilized results, and have confidence in the persistence of their data in the event of storage failures. A comprehensive list of these services can be found on the Services page.
Below, you’ll find a comprehensive list of hardware resources that are currently in production within the WFU DEAC Cluster facility.
Current DEAC compute-node configurations:
94 – Cisco B-series Blades – 3,816 cores total
- 27 Ivy Bridge Blades w/20 cores — 128GB RAM
- 24 Haswell Blades w/32 cores — 128GB RAM
- 43 Broadwell Blades w/44 cores — 256GB RAM
- 12 Skylake Blades w/44 cores — 192GB RAM
2 – Cisco GPU Nodes (44 cores each) – 88 cores total
- 2 NVIDIA Tesla P100s per node
- 3,584 CUDA cores per card
Total GPU – 14,336 CUDA cores
Total x86 cores – 3,904 cores
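The GPU and x86 totals above follow directly from the per-node figures listed on this page; a quick shell sanity check:

```shell
# Sanity-check the aggregate core counts using the per-node
# figures from the lists above.
echo "CUDA cores: $((2 * 2 * 3584))"   # 2 nodes x 2 P100s x 3,584 cores each
echo "x86 cores:  $((3816 + 2 * 44))"  # blade total + 2 GPU nodes x 44 cores
```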
197.86 TB for Research Data:
The WFU DEAC cluster utilizes multiple storage devices hosting shared storage via NFS (Network File System). These high-speed storage devices present storage to all cluster nodes while providing flexible configurations, quick snapshot backups, and easy growth. Storage device information is as follows:
NetApp FS8040 Storage Device:
Serves as the principal data store for home directories and actively used research data on the cluster. A Solid State Disk flash pool provides fast reads/writes and higher-performance I/O:
- 14TB via 24 Solid State Drives, 800GB capacity each
- 194TB via 120 SATA drives, 2TB capacity, 7200 RPM
- 177.86TB of usable storage via NFS
AWS hosts all archival storage for the DEAC cluster. AWS Storage Gateway provides access to virtually unlimited capacity backed by EC2 and Glacier storage.
The cluster is directly connected to the Wake Forest University campus core network router through fiber-based, 10-gigabit Ethernet. Cluster nodes are configured to take advantage of Cisco UCS-enabled “usNIC,” a low-latency kernel-bypass interface offering roughly one-third the latency of standard kernel-based networking. The current networking infrastructure, based on Cisco’s 6500 series switches, can support over 1,000 nodes with 10-gigabit Ethernet to every node. Scaling the “access” layer of this network with additional Cisco Catalyst 6500 or Nexus 5000 switch pairs would add support for additional nodes.
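As a rough illustration of how a job would use usNIC, Open MPI exposes it as the `usnic` BTL component; the component name is correct for Open MPI releases that ship usNIC support, but the application name and process count below are hypothetical and the exact flags depend on the installed Open MPI build:

```shell
# Check whether this Open MPI build includes the Cisco usNIC
# transport component (BTL).
ompi_info | grep -i usnic

# Hypothetical launch asking Open MPI to use usNIC kernel bypass,
# with shared-memory and self transports for local ranks.
mpirun --mca btl usnic,sm,self -np 44 ./my_mpi_app
```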
General Notes on HPL:
- High Performance Linpack (HPL) is the benchmark used to rank clusters on the TOP500 list.
- The program was used to benchmark the peak performance of the DEAC cluster.
- For the Basic Linear Algebra Subprograms (BLAS) libraries, ATLAS was built and tuned for the IBM BladeCenter and Cisco UCS blades.
- The module openmpi/1.6-intel was used for spawning multiple processes.
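A typical launch under this setup might look like the sketch below; only the module name comes from the notes above, while the binary name `xhpl` (the conventional name for an HPL build) and the process count are assumptions:

```shell
# Load the MPI stack named in the notes above.
module load openmpi/1.6-intel

# Launch the HPL binary (assumed to be named xhpl, built against
# the tuned ATLAS BLAS) across the UCS cores; HPL reads its problem
# sizes and grid layout from an HPL.dat file in the working directory.
mpirun -np 448 ./xhpl
```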
UCS on HPL:
- HPL was launched to utilize all available UCS blades (448 cores)
- The plot below shows peak performance in GFLOPs as the matrix size N increases.
- Peak performance reached: 3.32 TFLOPS
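HPL derives its reported figure by dividing the standard LU-factorization operation count, (2/3)·N³ + 2·N², by the wall-clock time. The sketch below reproduces that calculation; the matrix order and runtime are hypothetical values chosen only to land near the terascale range reported above:

```shell
# Reproduce HPL's GFLOPS calculation: ((2/3)N^3 + 2N^2) / time.
# N and the runtime here are illustrative, not measured values.
awk 'BEGIN {
    n = 100000        # matrix order N (hypothetical)
    seconds = 200.0   # wall-clock time (hypothetical)
    flops = (2.0 / 3.0) * n^3 + 2.0 * n^2
    printf "%.2f GFLOPS\n", flops / seconds / 1e9
}'
```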