JUWELS Cluster and Booster: Exascale Pathfinder with Modular Supercomputing Architecture at Juelich Supercomputing Centre
DOI: https://doi.org/10.17815/jlsrf-7-183
Abstract
JUWELS is a multi-petaflop modular supercomputer operated by Juelich Supercomputing Centre at Forschungszentrum Juelich as a European and national supercomputing resource for the Gauss Centre for Supercomputing. JUWELS also serves the Earth system modeling community and the AI community within the Helmholtz Association. JUWELS currently consists of two modules. The first module, deployed in 2018, is the Cluster: a BullSequana X1000 system with Intel Xeon Skylake-SP processors and Mellanox EDR InfiniBand. The second module, deployed in 2020, is the Booster: a BullSequana XH2000 system with 2nd generation AMD EPYC processors, NVIDIA Ampere GPUs and NVIDIA/Mellanox HDR InfiniBand. This paper describes the architecture of the system in detail from a user's perspective and provides further insight into the administrative infrastructure used to operate the supercomputer.
License
Copyright (c) 2021 Journal of large-scale research facilities JLSRF
This work is licensed under a Creative Commons Attribution 4.0 International License.
Submission of an article authorizes Forschungszentrum Jülich to publish the accepted version of the article under a Creative Commons Attribution 4.0 (CC BY 4.0) licence. No article processing charges or submission fees are involved.