This cluster comprises 2560 hyperthreaded CPU cores (2530 available to users), 9.84TB of RAM (9.72TB available to users), and 20 NVIDIA A40 GPU accelerators. It consists of 20 nodes with the following configuration:
|CPU:||2x AMD EPYC 7543|
|Disk:||2x 3.5TB SSD SATA|
|GPU:||None: kub-a5 – kub-a9; 2x NVIDIA A40 per node: kub-a10 – kub-a14; 1x NVIDIA A40 per node: kub-a15 – kub-a24|
|Network:||2x 10Gbps NIC|
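
The headline totals can be cross-checked against the per-node table. The following is a minimal sketch, not an official tool: the names `THREADS_PER_CPU`, `GPU_GROUPS`, and `cluster_totals` are made up for illustration, and the figure of 64 hardware threads per CPU is taken from the published AMD EPYC 7543 specification (32 cores, 2 threads each) rather than from this page.

```python
# Assumption: 64 hardware threads per AMD EPYC 7543 (32 cores with SMT).
THREADS_PER_CPU = 64
CPUS_PER_NODE = 2  # "2x AMD EPYC 7543" per node

# (first node index, last node index, GPUs per node) for kub-a5 .. kub-a24
GPU_GROUPS = [(5, 9, 0), (10, 14, 2), (15, 24, 1)]


def cluster_totals(groups=GPU_GROUPS):
    """Sum node, hardware-thread, and GPU counts over the node groups."""
    nodes = sum(last - first + 1 for first, last, _ in groups)
    threads = nodes * CPUS_PER_NODE * THREADS_PER_CPU
    gpus = sum((last - first + 1) * per_node for first, last, per_node in groups)
    return {"nodes": nodes, "threads": threads, "gpus": gpus}


print(cluster_totals())  # {'nodes': 20, 'threads': 2560, 'gpus': 20}
```

The result matches the 20 nodes, 2560 hyperthreaded cores, and 20 GPUs quoted above.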
Primary network storage consists of four head nodes, each equipped with an AMD EPYC 7302P, 256GB RAM, and 2x 10Gbps NIC (failover only). It offers 500TB of all-flash (SSD-only) capacity in a RAID 6 equivalent configuration. The filesystem is IBM Spectrum Scale, which is exported to the Kubernetes cluster via NFS version 3.
Storage is not backed up to another location; however, file system snapshots are made daily and 14 snapshots are kept. That is, we are able to restore deleted or overwritten data from up to 14 days back.
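
To judge whether lost data may still be recoverable, the retention window can be worked out as below. This is only a rough sketch under stated assumptions: the helper names are hypothetical, and the exact cut-off depends on when the daily snapshot runs, which this page does not specify.

```python
from datetime import date, timedelta

SNAPSHOTS_KEPT = 14  # daily snapshots retained, as stated above


def oldest_restorable_date(today: date, kept: int = SNAPSHOTS_KEPT) -> date:
    """Approximate date of the oldest snapshot still retained.

    Assumption: one snapshot per calendar day, so the window is `kept` days.
    """
    return today - timedelta(days=kept)


def is_restorable(last_good: date, today: date, kept: int = SNAPSHOTS_KEPT) -> bool:
    """True if data last known good on `last_good` may still be in a snapshot."""
    return oldest_restorable_date(today, kept) <= last_good <= today
```

For example, on 2024-03-15 `is_restorable(date(2024, 3, 1), date(2024, 3, 15))` is true, while data last intact on 2024-02-29 falls outside the window.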