High-Performance I/O Programming Models for Exascale Computing

Abstract: The success of exascale supercomputers depends largely on breakthroughs that meet the increasing demand for high-performance I/O in HPC. Scientists are aggressively exploiting the compute power of petascale supercomputers to run larger-scale, higher-fidelity simulations. At the same time, data-intensive workloads have recently become dominant as well. Such use cases inherently place additional stress on the I/O subsystem, mostly because of the elevated number of I/O transactions.

As a consequence, three critical challenges arise that are of paramount importance at exascale. First, while the concurrency of next-generation supercomputers is expected to increase by up to 1000x, the bandwidth and access latency of the I/O subsystem are projected to remain roughly constant in comparison. Storage is therefore on the verge of becoming a serious bottleneck. Second, although upcoming supercomputers are expected to integrate emerging non-volatile memory technologies to compensate for some of these limitations, existing programming models and interfaces (e.g., MPI-IO) might not provide any clear technical advantage when targeting distributed intra-node storage, let alone byte-addressable persistent memories. Third, even though heterogeneous compute nodes can provide benefits in terms of performance and thermal dissipation, this technological transformation implicitly increases programming complexity, making it difficult for scientific applications to take advantage of these developments.

In this thesis, we explore how programming models and interfaces must evolve to address the aforementioned challenges. We present MPI storage windows, a novel concept that proposes using the MPI one-sided communication model and MPI windows as a unified interface to program memory and storage. We then demonstrate how MPI one-sided communication can benefit data analytics frameworks that follow a decoupled strategy, while integrating seamless fault tolerance and out-of-core execution. Furthermore, we introduce persistent coarrays to enable transparent resiliency in Coarray Fortran, supporting the "failed images" feature recently introduced into the standard. Finally, we propose a global memory abstraction layer, inspired by the memory-mapped I/O mechanism of the OS, to expose different storage technologies using conventional memory operations.

The outcomes of these contributions are expected to have a considerable impact on a wide variety of scientific applications in HPC, on both current and next-generation supercomputers.
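
The memory-mapped I/O mechanism mentioned above can be illustrated with a minimal sketch. It is not the implementation developed in the thesis: it only shows, using Python's standard `mmap` module, how data in a file can be written and read with ordinary buffer operations instead of explicit read/write calls. The function name `write_then_read_mapped` and the little-endian 8-byte layout are assumptions made for this illustration, and the input list is assumed to be non-empty.

```python
import mmap
import os
import struct
import tempfile

DOUBLE_SIZE = 8  # bytes per little-endian double (layout chosen for this sketch)


def write_then_read_mapped(path, values):
    """Store doubles in a file through a memory mapping, then read them back.

    Hypothetical helper for illustration; assumes `values` is non-empty.
    """
    size = len(values) * DOUBLE_SIZE

    # Size the backing file so that the whole range can be mapped.
    with open(path, "wb") as f:
        f.truncate(size)

    # Writes go to the mapped buffer; the OS propagates them to storage.
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), size) as view:
            for i, v in enumerate(values):
                struct.pack_into("<d", view, i * DOUBLE_SIZE, v)
            view.flush()  # request that dirty pages be written back

    # Reads also use plain buffer access on a read-only mapping.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ) as view:
            return [
                struct.unpack_from("<d", view, i * DOUBLE_SIZE)[0]
                for i in range(len(values))
            ]


if __name__ == "__main__":
    fd, tmp_path = tempfile.mkstemp()
    os.close(fd)
    try:
        print(write_then_read_mapped(tmp_path, [1.5, -2.0, 3.25]))
    finally:
        os.remove(tmp_path)
```

The sketch covers a single process and a single local file. The abstraction layer described in the abstract targets multiple storage technologies, which this example does not attempt to model.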