Available Hardware
CERIT-SC operates several regular clusters and one secure cluster (see the Secure cluster section below). The regular clusters are kuba-cluster (the largest) and kubh-cluster (the HA cluster). Eight powerful kub-c nodes will join kuba-cluster in the near future. The secure cluster is kubas-cluster.
Regular clusters
kuba-cluster
Kuba-cluster consists of 39 nodes (31 currently available) and features 22 NVIDIA A40, 6 NVIDIA A10, and 12 NVIDIA A100 (80GB variant) GPU accelerators. Four of the A100 cards are partitioned with MIG (Multi-Instance GPU), yielding 12x 10GB and 8x 20GB instances.
39x | Nodes | |
---|---|---|
CPU: | 2x AMD EPYC 7543 32-Core Processor (in total 64CPU per node) | All nodes apart from kub-c |
CPU: | 2x AMD EPYC 9454 48-Core Processor (in total 96CPU per node) | 8 HPC kub-c nodes |
Memory: | 1024GB | kub-b15, kub-b16 |
Memory: | 512GB | All remaining nodes apart from kub-c |
Memory: | 1.5TB | 8 HPC kub-c nodes |
Disk: | 2x 3.5TB SSD SATA | kub-a5 to kub-a25 |
Disk: | 8x 8TB NVME SSD | kub-b1 to kub-b18 |
Disk: | 60TB NVME SSD | 8 HPC kub-c nodes |
GPU: | None | kub-a5 to kub-a9, kub-b3, kub-b9, kub-b14, kub-b15, kub-b17 |
GPU: | 2x NVIDIA A40 per node | kub-a10 to kub-a14, kub-b12 |
GPU: | 1x NVIDIA A40 per node | kub-a15 to kub-a20, kub-a22 to kub-a24 |
GPU: | 2x NVIDIA A10 per node | kub-b1 to kub-b2, kub-b16 |
GPU: | 2x NVIDIA A100 (80GB) per node | kub-b4 to kub-b8 |
GPU: | 1x NVIDIA H100 (PCIE/80GB) | kub-b10 to kub-b11 |
GPU: | 1x NVIDIA L4 | kub-b13 |
GPU: | 1x NVIDIA H100 NVL (PCIE/94GB) per node | 8 HPC kub-c nodes |
Network: | 2x 10Gbps Ethernet | All nodes |
Network: | 1x 100Gbps Infiniband | kub-b1 to kub-b18 |
Network: | 1x 100Gbps Ethernet and 1x 200Gbps Infiniband | 8 HPC kub-c nodes |
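The MIG figures quoted in the kuba-cluster overview (12x 10GB and 8x 20GB from four A100 cards) can be checked with simple arithmetic. The following is a minimal sketch that assumes every MIG-enabled card is split the same way, into three 10GB and two 20GB parts. That per-card layout is an assumption made for illustration; the actual configuration of each card is not stated here.

```python
# Number of A100 cards configured with MIG (from the cluster overview).
MIG_CARDS = 4

# ASSUMPTION: identical layout on every card, part size in GB -> parts per card.
# The real per-card layout is not documented in this section.
ASSUMED_LAYOUT_PER_CARD = {10: 3, 20: 2}


def mig_totals(cards, layout_per_card):
    """Return {part_size_gb: total_parts} across all MIG-enabled cards."""
    return {size: count * cards for size, count in layout_per_card.items()}


totals = mig_totals(MIG_CARDS, ASSUMED_LAYOUT_PER_CARD)
# GPU memory covered by the listed parts, summed over all four cards.
memory_in_parts_gb = sum(size * count for size, count in totals.items())

print(totals)              # {10: 12, 20: 8}
print(memory_in_parts_gb)  # 280
```

Under this assumed layout the totals match the quoted 12x 10GB and 8x 20GB parts.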
Storage
Primary network storage consists of four head nodes, each equipped with an AMD EPYC 7302P, 256GB RAM, and 2x 10Gbps NIC (failover only). It offers 500TB of all-flash capacity (SSD drives only) in a RAID 6 equivalent configuration. The filesystem is IBM Spectrum Scale, exported to the Kubernetes cluster via NFS version 3.
Data Backup
Storage is not backed up to another location, but file system snapshots are taken daily. Deleted or overwritten data can be restored from up to 14 days back.
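To make the 14-day restore window concrete, here is a small sketch that checks whether a file's last intact day still falls inside the window. The function names are hypothetical, and treating both ends of the window as inclusive is an assumption; the exact snapshot time and expiry rules are not specified above.

```python
from datetime import date, timedelta

RETENTION_DAYS = 14  # from the text: data can be restored 14 days into the past


def oldest_restorable(today, retention_days=RETENTION_DAYS):
    """Earliest day for which a daily snapshot is assumed to still exist."""
    return today - timedelta(days=retention_days)


def is_restorable(last_intact, today, retention_days=RETENTION_DAYS):
    """True if the day the data was last intact lies inside the window.

    ASSUMPTION: the window is inclusive on both ends.
    """
    return oldest_restorable(today, retention_days) <= last_intact <= today


today = date(2024, 3, 20)
print(oldest_restorable(today))                  # 2024-03-06
print(is_restorable(date(2024, 3, 10), today))   # True
print(is_restorable(date(2024, 3, 1), today))    # False
```

Data that was last intact more than 14 days ago should be considered lost, since there is no off-site backup for this storage.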
kubh-cluster
Kubh-cluster consists of 6 nodes dispersed across three locations, two nodes each: University Campus Bohunice (UKB), University Computer Centre (CPS) at Komenskeho namesti, and the Faculty of Informatics at Botanicka. This makes it a suitable option for HA setups (see the HA setup section below).
6x | Nodes | |
---|---|---|
CPU: | 2x AMD EPYC 7543 32-Core Processor (in total 64CPU per node) | |
Memory: | 512GB | |
Disk: | 20TB NVME | kub-h1 to kub-h2 |
Disk: | 7TB NVME SSD | kub-h3 to kub-h6 |
Network: | 2x 10Gbps Ethernet | All nodes |
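The value of the three-location layout for HA can be illustrated with a short placement sketch: if replicas are spread one per location, the loss of any single location still leaves the others running. The node-to-site mapping below is hypothetical (this section says there are two nodes per location but not which node is where), and the round-robin logic is only an illustration, not how the cluster scheduler is actually configured.

```python
from itertools import cycle

# HYPOTHETICAL mapping: two nodes per site, but the real assignment of
# node names to locations is not given in this section.
SITES = {
    "UKB": ["kub-h1", "kub-h2"],
    "CPS": ["kub-h3", "kub-h4"],
    "FI": ["kub-h5", "kub-h6"],
}


def spread_replicas(replicas, sites):
    """Assign replicas round-robin across sites, then across nodes in a site."""
    node_iters = {site: cycle(nodes) for site, nodes in sites.items()}
    placement = []
    for _, site in zip(range(replicas), cycle(sites)):
        placement.append((site, next(node_iters[site])))
    return placement


def survivors(placement, failed_site):
    """Replicas that remain after an entire site becomes unavailable."""
    return [p for p in placement if p[0] != failed_site]


placement = spread_replicas(3, SITES)
print(placement)
# [('UKB', 'kub-h1'), ('CPS', 'kub-h3'), ('FI', 'kub-h5')]
for site in SITES:
    print(site, "down ->", len(survivors(placement, site)), "replicas left")
```

With three replicas, any single-site outage leaves two replicas, which is enough for a majority in quorum-based services.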
Storage
Storage is provided only locally in each node.
Data Backup
Local storage is not backed up; users are responsible for backing up their own data.
Secure cluster
kubas-cluster
Kubas-cluster consists of 10 nodes and features 5x NVIDIA A40, 2x NVIDIA A100 (80GB variant), 4x NVIDIA P100, 2x NVIDIA H100 (NVL 94GB variant) and, on demand, 4x NVIDIA A100 GPU accelerators. Nodes kub-cs1 and kub-cs2 are physically located at a different site than the rest of the cluster, so an HA setup is possible in the secure cluster as well.
10x | Nodes | |
---|---|---|
CPU: | 2x AMD EPYC 7543 32-Core Processor (in total 64CPU per node) | kub-as1 to kub-as6 |
CPU: | 2x AMD EPYC 9454 48-Core Processor (in total 96CPU per node) | kub-cs1, kub-cs2 |
CPU: | 1x Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz 24-Core Processor | kblack |
CPU: | 1x AMD EPYC 7662 64-Core Processor (in total 64CPU per node) | kzia |
Memory: | 512GB | kub-as1 to kub-as6 |
Memory: | 1.5TB | kub-cs1, kub-cs2 |
Memory: | 512GB | kblack |
Memory: | 512GB | kzia |
Disk: | 2x 3.5TB SSD SATA | kub-as1 to kub-as6 |
Disk: | 60TB NVME SSD | kub-cs1, kub-cs2 |
Disk: | 1x 3.6TB SSD | kblack |
Disk: | 1x 1.5TB SSD | kzia |
GPU: | 1x NVIDIA A40 per node | kub-as1 to kub-as5 |
GPU: | 2x NVIDIA A100 (80GB) per node | kub-as6 |
GPU: | 2x NVIDIA H100 NVL (94GB) per node | kub-cs1, kub-cs2 |
GPU: | 4x NVIDIA P100 | kblack |
GPU: | on-demand 1-4 NVIDIA A100 (40GB) | kzia |
Network: | 2x 10Gbps Ethernet | kub-as1 to kub-as6 |
Network: | 1x 100Gbps Ethernet | kub-cs1, kub-cs2 |
Network: | 1x 10Gbps Ethernet | kblack |
Network: | 1x 10Gbps Ethernet | kzia |
Storage
Primary network storage consists of two head nodes, each equipped with an Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz, 192GB RAM, and 1x 10Gbps NIC. It offers 1700TB of capacity on rotational drives only, in a RAID 6 configuration. The filesystem is IBM Spectrum Scale, exported to the Kubernetes cluster via NFS version 3.
Data Backup
Storage is regularly backed up to a separate storage system at a different location.