Decomposing the Web Graph into Parameterized Connected Components

Search this Article

Author(s)

Abstract

We propose a novel method for Web page grouping based only on hyperlink information. Because of the explosive growth of the World Wide Web, page grouping is expected to provide a general grasp of the Web for effective information search and netsurfing. The Web can be regarded as a gigantic digraph where pages are vertices and links are arcs. Our grouping method is a generalization of decomposition into strongly connected components, in which each group is constructed as a subset of a strongly connected component. Moreover, group sizes can be controlled by adjusting a parameter, called the threshold parameter. We call the resulting groups parameterized connected components (PCCs). The algorithm is simple and admits parallelization. Notably, we apply Dijkstra's shortest path algorithm in our grouping method. This paper also includes experimental results for 15 million Web pages, which show the contribution of our method to efficient Web surfer navigation.

Journal

  • IEICE Trans. Information and Systems

    IEICE Trans. Information and Systems 87(2), 380-388, 2004-02-01

    The Institute of Electronics, Information and Communication Engineers

References:  25

Cited by:  1

Codes

  • NII Article ID (NAID)
    110003223360
  • NII NACSIS-CAT ID (NCID)
    AA10826272
  • Text Lang
    ENG
  • Article Type
    Journal Article
  • ISSN
    09168532
  • Data Source
    CJP  CJPref  NII-ELS 
Page Top