Fault Tolerance Design for Hadoop MapReduce on Gfarm Distributed Filesystem

  • Marilia Melo
    Graduate School of Systems and Information Engineering, University of Tsukuba
  • Osamu Tatebe
    Graduate School of Systems and Information Engineering, University of Tsukuba


Abstract

Many distributed file systems designed for MapReduce, such as the Google File System and HDFS (Hadoop Distributed File System), relax some POSIX requirements to enable high-throughput streaming access. Because of this lack of POSIX compatibility, it is difficult for programs other than MapReduce to access these file systems. It is often necessary to import files into these file systems, process the data, and then export the output to a POSIX-compatible file system, which results in a large number of redundant file operations. To solve this problem, we have proposed the Hadoop-Gfarm plugin [9], which enables MapReduce jobs to be executed directly on top of Gfarm, a globally distributed file system. In this paper we analyse the redundancy and reliability of a fault tolerance design for the Hadoop-Gfarm plugin. Our evaluation shows that the Hadoop-Gfarm plugin offers a reliable solution and performs as well as Hadoop's native HDFS, allowing users to access data through a POSIX-compliant API and avoid redundant copies without sacrificing performance.


Details

  • CRID
    1570291227951630336
  • NII Article ID
    110009588134
  • NII Bibliographic ID
    AN10463942
  • Language of text
    en
  • Data source
    • CiNii Articles

