Driving a Petascale HPC Center with Octoshell Management System



Abstract

Running any computing center is a complex task, and as scale and cost grow, that task becomes a serious challenge. The top supercomputer sites, large in every respect, have therefore always required special approaches to management, control, and maintenance. Today, large HPC centers can host a variety of very different systems containing up to millions of components and serve thousands of users worldwide who run a full range of complex applications. Large volumes of data must be managed in a coordinated way to keep such an information factory functioning. This paper shares the design principles, some implementation details, and the roadmap vision for the Octoshell HPC center management system, which has been developed and is currently used in the everyday practice of the Moscow State University supercomputer center. This open source system currently manages the Lomonosov and Lomonosov-2 supercomputers, with a total peak performance of over 5 PFlops, and provides multiple tools in a single shell for the most typical workflow tasks of both regular users and system administrators.

About the authors

D. Nikitenko

Research Computing Center

Principal contact for editorial correspondence.
Email: dan@parallel.ru
Russian Federation, Moscow, 119991

Vad. Voevodin

Research Computing Center

Principal contact for editorial correspondence.
Email: vadim@parallel.ru
Russian Federation, Moscow, 119991

S. Zhumatiy

Research Computing Center

Principal contact for editorial correspondence.
Email: serg@parallel.ru
Russian Federation, Moscow, 119991


Copyright © Pleiades Publishing, Ltd., 2019