-
- KASAGI Akihiko
- Department of Information Engineering, Hiroshima University
-
- NAKANO Koji
- Department of Information Engineering, Hiroshima University
-
- ITO Yasuaki
- Department of Information Engineering, Hiroshima University
抄録
The Hierarchical Memory Machine (HMM) is a theoretical parallel computing model that captures the essence of computation on CUDA-enabled GPUs. The offline permutation is a task to copy numbers stored in an array a of size n to an array b of the same size along a permutation P given in advance. A conventional algorithm can complete the offline permutation by executing b[p[i]] ← a[i] for all i in parallel, where an array p stores the permutation P. We first present that the conventional algorithm runs $D_w(P)+2{n\over w}+3L-3$ time units using n threads on the HMM with width w and latency L, where Dw(P) is the distribution of P. We next show that important regular permutations including transpose, shuffle, and bit-reversal permutations run $2{n\over w}+2{n\over kw}+2L-2$ time units on the HMM with k DMMs. We have implemented permutation algorithms for these regular permutations on GeForce GTX 680 GPU. The experimental results show that these algorithms run much faster than the conventional algorithm. We also present an offline permutation algorithm for any permutation running in $16{n\over w}+16{n\over kw}+16L-16$ time units on the HMM with k DMMs. Quite surprisingly, our offline permutation algorithm on the GPU achieves better performance than the conventional algorithm in random permutation, although the running time has a large constant factor. We can say that the experimental results provide a good example of GPU computation showing that a complicated but ingenious implementation with a larger constant factor in computing time can outperform a much simpler conventional algorithm.
収録刊行物
-
- IEICE Transactions on Information and Systems
-
IEICE Transactions on Information and Systems E97.D (12), 3052-3062, 2014
一般社団法人 電子情報通信学会
- Tweet
詳細情報 詳細情報について
-
- CRID
- 1390282679354823040
-
- NII論文ID
- 130004841763
-
- ISSN
- 17451361
- 09168532
-
- 本文言語コード
- en
-
- データソース種別
-
- JaLC
- Crossref
- CiNii Articles
- KAKEN
-
- 抄録ライセンスフラグ
- 使用不可