PPFS: A Scale-out Distributed File System for Post-petascale Systems

Author(s)

    • Takatsu Fuyumasa
    • Graduate school of System and Information Engineering, University of Tsukuba
    • Hiraga Kohei
    • Graduate school of System and Information Engineering, University of Tsukuba
    • Tatebe Osamu
    • Faculty of Engineering, Information and Systems, University of Tsukuba

Abstract

The fusion of high-performance computing (HPC) research with big data research, which has become known as the field of extreme big data, exposes a problem: file creation in storage systems such as distributed file systems is not optimized. In particular, large workloads cause many processes to create many files simultaneously when writing checkpoints. The need to improve file creation prompted us to design PPFS, a scale-out distributed file system for post-petascale systems. PPFS consists of PPMDS, a scale-out distributed metadata server, and PPOSS, a scalable distributed storage server for flash storage. PPMDS achieves high file creation performance by using a key-value store for metadata storage and non-blocking distributed transactions to update multiple entries simultaneously. PPOSS depends on PPOST, an object storage system that manages the underlying low-level storage, such as the Fusion IO ioDrive, a flash device connected through PCI Express that supports OpenNVM. The PPFS prototype attains high file creation performance through a file creation optimization, termed bulk creation, that reduces the amount of communication between PPMDS and PPOSS. In addition, to enhance I/O performance when the client process and PPOSS run on the same node, PPOSS accesses the local storage device directly. With a further file creation optimization called object prefetching, the prototype implementation of PPFS achieves 138,000 operations per second for file creation when using five metadata servers and 128 client processes, exceeding the performance of IndexFS by a factor of 2.52. With the local access optimization, PPOSS reached its limit at a block size of 16 KiB, a 1.5-fold improvement over the unoptimized version. Furthermore, this evaluation indicates that PPFS scales well in both file creation and I/O performance, which is required for post-petascale systems.
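
The idea behind bulk creation, as the abstract describes it, is to reduce communication between the metadata server and the storage server. The following is a minimal sketch of that idea only, not the PPFS implementation: the classes `ObjectStore` and `MetadataServer`, the `batch_size` parameter, and the helper `count_requests` are all hypothetical names introduced for illustration. The sketch also refills its pool of object IDs on demand, whereas a prefetching design would presumably refill it ahead of time.

```python
from collections import deque


class ObjectStore:
    """Hypothetical stand-in for an object storage server (not the PPOSS API)."""

    def __init__(self):
        self.next_id = 0
        self.requests = 0  # number of requests received from the metadata server

    def create_objects(self, count):
        """Create `count` empty objects in a single request and return their IDs."""
        self.requests += 1
        ids = list(range(self.next_id, self.next_id + count))
        self.next_id += count
        return ids


class MetadataServer:
    """Hypothetical stand-in for a metadata server (not the PPMDS API)."""

    def __init__(self, store, batch_size):
        self.store = store
        self.batch_size = batch_size
        self.pool = deque()  # object IDs created ahead of use
        self.entries = {}    # path -> object ID, standing in for a key-value store

    def create_file(self, path):
        if path in self.entries:
            raise FileExistsError(path)
        if not self.pool:
            # One bulk request replaces `batch_size` individual requests.
            self.pool.extend(self.store.create_objects(self.batch_size))
        object_id = self.pool.popleft()
        self.entries[path] = object_id
        return object_id


def count_requests(num_files, batch_size):
    """Return how many storage requests are needed to create `num_files` files."""
    store = ObjectStore()
    mds = MetadataServer(store, batch_size)
    for i in range(num_files):
        mds.create_file(f"/ckpt/rank{i}")
    return store.requests
```

In this toy model, creating 1,000 files with `batch_size=1` sends 1,000 requests to the storage side, while `batch_size=100` sends 10. The sketch says nothing about the actual batch sizes or protocol used in PPFS.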

Journal

  • Journal of Information Processing

    Journal of Information Processing 25(0), 438-447, 2017

    Information Processing Society of Japan
