Fast and Parallel RankClus Algorithm based on Dynamic Rank Score Tracking

Access this Article

Author(s)

Abstract

<p>Given a bi-type information network, which is an extended model of well-known bipartite graphs, how can clusters be efficiently found in graphs? Graph clustering is now a fundamental tool to understand overviews from graph-structured data. The RankClus framework accurately performs clustering for bi-type information networks using ranking-based graph clustering techniques. It integrates graph ranking algorithms such as PageRank or HITS into graph clustering procedures to improve the clustering quality. However, this integration incurs a high computational cost to handle large bi-type information networks since RankClus repeatedly computes the ranking algorithm for all nodes and edges until the clustering procedure converges. To overcome this runtime limitation, we present a novel RankClus algorithm that reduces the running time for large bi-type information networks. Our proposed method employs dynamic graph processing techniques into the iterative ranking procedures included in RankClus. By dynamically updating ranking results, our proposal reduces the number of computed nodes and edges during repeated ranking procedures. For further improving the efficiency, we also present a parallel implementation of our proposed algorithm by using thread-based parallelization on a modern manycore CPU. We experimentally verify using real-world datasets that our fast and parallel proposed methods successfully reduces the running time while maintaining the clustering quality of RankClus.</p>

Journal

  • Journal of Information Processing

    Journal of Information Processing 28(0), 453-461, 2020

    Information Processing Society of Japan

Codes

Page Top