High-Performance Small-Scale Simulation of Star Clusters Evolution on Cray XD1

Abstract

In this paper, we describe the implementation and performance of an N-body simulation code for a star cluster with 64k stars on a Cray XD1 system with 400 dual-core Opteron processors. There have been many reports on the parallelization of astrophysical N-body simulations. For parallel implementations on more than a few tens of processors, performance has usually been measured for very large numbers of particles. For example, all previous entries for the Gordon Bell Prize used at least 700k particles. The reason for this preference is parallel efficiency: it is very difficult to achieve high performance on large parallel machines if the number of particles is small. However, for many scientifically important problems the calculation cost scales as O(N^3.3), so it is very important to be able to use large machines for relatively small numbers of particles. We achieved 2.03 Tflops, or 57.7% of the theoretical peak performance, using a direct O(N^2) calculation with the individual timestep algorithm on 64k particles. The best efficiency previously reported for a similar calculation with 64k or fewer particles is 7.8% (9 Gflops) on a Cray T3E-900 with 128 processors. Our implementation is based on a highly scalable two-dimensional parallelization scheme, and the low-latency communication network of the Cray XD1 turned out to be essential for achieving this level of performance.
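
To illustrate what the direct O(N^2) calculation in the abstract involves, the following is a minimal serial sketch in Python of pairwise gravitational summation. It is not the paper's code: the function name `direct_accelerations`, the softening parameter `eps`, and the unit choice G = 1 are assumptions made for illustration, and the individual timestep scheme and two-dimensional parallel decomposition described in the abstract are not shown.

```python
import math


def direct_accelerations(pos, mass, eps=1.0e-4, G=1.0):
    """Gravitational accelerations by direct O(N^2) summation.

    pos  : list of (x, y, z) positions
    mass : list of particle masses
    eps  : softening length (illustrative assumption, not from the paper)
    G    : gravitational constant (1.0 in N-body units, also an assumption)
    """
    n = len(pos)
    eps2 = eps * eps
    acc = []
    for i in range(n):
        xi, yi, zi = pos[i]
        ax = ay = az = 0.0
        # Every particle i sums over all other particles j: N*(N-1) pair terms.
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - xi
            dy = pos[j][1] - yi
            dz = pos[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps2
            s = G * mass[j] / (r2 * math.sqrt(r2))
            ax += s * dx
            ay += s * dy
            az += s * dz
        acc.append([ax, ay, az])
    return acc


if __name__ == "__main__":
    # Two unit masses separated by unit distance, no softening.
    print(direct_accelerations([(0, 0, 0), (1, 0, 0)], [1.0, 1.0], eps=0.0))
```

Because every particle interacts with every other particle, the work per full force evaluation grows as N^2, while the number of particles available to distribute across processors grows only as N. That is the tension the abstract describes for small N on large machines.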

Published in

  • IPSJ Transactions on Advanced Computing Systems (ACS)

    IPSJ Transactions on Advanced Computing Systems (ACS) 48(SIG8(ACS18)), 54-61, 2007-05-15

    Information Processing Society of Japan

Codes

  • NII Article ID (NAID)
    110006274062
  • NII Bibliographic ID (NCID)
    AA11833852
  • Text language code
    JPN
  • Material type
    Article
  • ISSN
    1882-7829
  • NDL Article ID
    8837136
  • NDL Call Number
    Z74-C192
  • Data source
    CJP Bibliography  NDL  NII-ELS  IPSJ