Available Hardware

CERIT-SC operates regular clusters and a secure cluster (more information in the upcoming Secure cluster section). The regular clusters are kuba-cluster (the largest) and kubh-cluster (the HA cluster). Eight powerful kub-c nodes will join kuba-cluster in the near future. The secure cluster is kubas-cluster.

Regular clusters

kuba-cluster

Kuba-cluster consists of 39 nodes (31 currently available) and features 22 NVIDIA A40, 6 NVIDIA A10, and 12 NVIDIA A100 (80GB variant) GPU accelerators. Four of the A100 cards are partitioned with MIG, yielding 12x 10GB and 8x 20GB instances.
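The MIG figures can be cross-checked with simple arithmetic. The sketch below assumes (this is not stated in the text) that each of the four partitioned A100 cards uses the same layout of three 10GB and two 20GB instances; `CARDS` and `LAYOUT_PER_CARD` are illustrative names only.

```python
# Assumption: every MIG-enabled A100 (80GB) card uses the same layout.
CARDS = 4
LAYOUT_PER_CARD = {10: 3, 20: 2}  # instance size in GB -> instances per card


def mig_totals(cards, layout):
    """Return {size_gb: total instance count} across all partitioned cards."""
    return {size: count * cards for size, count in layout.items()}


totals = mig_totals(CARDS, LAYOUT_PER_CARD)
# GPU memory handed out per card under the assumed layout.
memory_per_card = sum(size * count for size, count in LAYOUT_PER_CARD.items())

print(totals)           # {10: 12, 20: 8}
print(memory_per_card)  # 70
```

Under this assumed layout each card exposes 70GB of its 80GB as MIG instances, which agrees with the 12x 10GB and 8x 20GB totals quoted above.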

39x Nodes
CPU: 2x AMD EPYC 7543 32-Core Processor (64 CPUs per node in total)
Memory:
    1024GB: kub-b15, kub-b16
    512GB: all remaining nodes
Disk:
    2x 3.5TB SSD SATA: kub-a5 to kub-a25
    8x 8TB NVME SSD: kub-b1 to kub-b18
GPU:
    None: kub-a5 to kub-a9, kub-b3, kub-b9, kub-b14, kub-b15, kub-b17
    2x NVIDIA A40 per node: kub-a10 to kub-a14, kub-b12
    1x NVIDIA A40 per node: kub-a15 to kub-a20, kub-a22 to kub-a24
    2x NVIDIA A10 per node: kub-b1 to kub-b2, kub-b16
    2x NVIDIA A100 (80GB) per node: kub-b4 to kub-b8
    1x NVIDIA H100 (PCIE/80GB): kub-b10 to kub-b11
    1x NVIDIA L4: kub-b13
Network:
    2x 10Gbps Ethernet: all nodes
    1x 100Gbps Infiniband: kub-b1 to kub-b18
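As a quick consistency check, the node ranges listed in the Disk rows can be expanded and counted. This is a minimal sketch; the helper `expand` is hypothetical and simply assumes node names are a prefix followed by a consecutive number.

```python
def expand(prefix, first, last):
    """Expand an inclusive numeric range into node names, e.g. kub-a5 ... kub-a25."""
    return [f"{prefix}{i}" for i in range(first, last + 1)]


# Ranges taken from the Disk rows of the kuba-cluster table.
sata_nodes = expand("kub-a", 5, 25)   # 2x 3.5TB SSD SATA
nvme_nodes = expand("kub-b", 1, 18)   # 8x 8TB NVME SSD

all_nodes = sata_nodes + nvme_nodes
print(len(sata_nodes), len(nvme_nodes), len(all_nodes))  # 21 18 39
```

The two ranges together give 39 names, matching the stated node count of kuba-cluster.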

kub-c nodes (upcoming)

Eight HPC kub-c nodes, each featuring an NVIDIA H100 GPU accelerator, will join kuba-cluster.

8x Nodes
CPU: 2x AMD EPYC 9454 48-Core Processor (96 CPUs per node in total)
Memory: 1.5TB (all nodes)
Disk: 60TB NVME SSD (all nodes)
GPU: 1x NVIDIA H100 NVL (PCIE/94GB) per node (all nodes)
Network:
    1x 100Gbps Ethernet: all nodes
    1x 200Gbps Infiniband: all nodes
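For capacity planning, the per-node figures can be multiplied out to estimate what the kub-c nodes will add to kuba-cluster. This is a rough sketch based only on the numbers listed above; the dictionary keys are illustrative names, and actual allocatable resources in Kubernetes will be somewhat lower because of system reservations.

```python
NODES = 8
# Per-node figures from the kub-c listing.
PER_NODE = {"cpus": 96, "memory_tb": 1.5, "nvme_tb": 60, "gpus": 1}

# Raw totals across all eight nodes.
added = {name: value * NODES for name, value in PER_NODE.items()}
print(added)  # {'cpus': 768, 'memory_tb': 12.0, 'nvme_tb': 480, 'gpus': 8}
```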

Storage

Primary network storage consists of four head nodes, each equipped with an AMD EPYC 7302P, 256GB RAM, and 2x 10Gbps NICs (failover only). It offers 500TB of all-flash (SSD-only) capacity in a RAID 6 equivalent configuration. The filesystem is IBM Spectrum Scale, exported to the Kubernetes cluster via NFS version 3.

Data Backup

Storage is not backed up to another location, but file system snapshots are taken daily. Deleted or overwritten data can be restored from up to 14 days in the past.
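The 14-day window can be expressed as a simple date check. The sketch below is illustrative only: `is_restorable` is a hypothetical helper, and it assumes the data existed when a daily snapshot was taken. The exact cutoff depends on the snapshot schedule, which is not specified here.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=14)  # restore window stated above


def is_restorable(changed_at, now):
    """True if the deletion/overwrite happened within the retention window."""
    return timedelta(0) <= now - changed_at <= RETENTION


now = datetime(2024, 1, 20)
print(is_restorable(datetime(2024, 1, 10), now))  # True  (10 days ago)
print(is_restorable(datetime(2024, 1, 1), now))   # False (19 days ago)
```

Files that are created and deleted between two consecutive snapshots never appear in any snapshot, so they cannot be recovered this way.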

kubh-cluster

Kubh-cluster consists of 6 nodes dispersed across three locations, with 2 nodes in each: University Campus Bohunice (UKB), University Computer Centre (CPS) at Komenskeho namesti, and the Faculty of Informatics at Botanicka. This makes it a suitable option for HA setups (more information in the upcoming HA setup section).

6x Nodes
CPU: 2x AMD EPYC 7543 32-Core Processor (64 CPUs per node in total)
Memory: 512GB
Disk:
    20TB NVME: kub-h1 to kub-h2
    7TB NVME SSD: kub-h3 to kub-h6
Network: 2x 10Gbps Ethernet (all nodes)

Storage

Storage is provided only locally in each node.

Data Backup

Local storage is not backed up; it is up to the user to arrange backups.

Secure cluster

kubas-cluster

Kubas-cluster consists of 10 nodes and features 5x NVIDIA A40, 2x NVIDIA A100 (80GB variant), 4x NVIDIA P100, 2x NVIDIA H100 (NVL 94GB variant), and on-demand 4x NVIDIA A100 GPU accelerators. Nodes kub-cs1 and kub-cs2 are physically located at a different site than the rest of the cluster, so an HA setup is possible in the secure cluster as well.

10x Nodes
CPU:
    2x AMD EPYC 7543 32-Core Processor (64 CPUs per node in total): kub-as1 to kub-as6
    2x AMD EPYC 9454 48-Core Processor (96 CPUs per node in total): kub-cs1, kub-cs2
    1x Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz 24-Core Processor: kblack
    1x AMD EPYC 7662 64-Core Processor (64 CPUs per node in total): kzia
Memory:
    512GB: kub-as1 to kub-as6
    1.5TB: kub-cs1, kub-cs2
    512GB: kblack
    512GB: kzia
Disk:
    2x 3.5TB SSD SATA: kub-as1 to kub-as6
    60TB NVME SSD: kub-cs1, kub-cs2
    1x 3.6TB SSD: kblack
    1x 1.5TB SSD: kzia
GPU:
    1x NVIDIA A40 per node: kub-as1 to kub-as5
    2x NVIDIA A100 (80GB) per node: kub-as6
    2x NVIDIA H100 NVL (94GB) per node: kub-cs1, kub-cs2
    4x NVIDIA P100: kblack
    On-demand 1-4 NVIDIA A100 (40GB): kzia
Network:
    2x 10Gbps Ethernet: kub-as1 to kub-as6
    1x 100Gbps Ethernet: kub-cs1, kub-cs2
    1x 10Gbps Ethernet: kblack
    1x 10Gbps Ethernet: kzia

Storage

Primary network storage consists of two head nodes, each equipped with an Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz, 192GB RAM, and 1x 10Gbps NIC. It offers 1700TB of capacity on rotational drives only, in a RAID 6 configuration. The filesystem is IBM Spectrum Scale, exported to the Kubernetes cluster via NFS version 3.

Data Backup

Storage is regularly backed up to separate storage at a different location.