A Web Page Segmentation Approach Using Visual Semantics

  • ZENG Jun
    Graduate School of Information Science and Electrical Engineering, Kyushu University
  • FLANAGAN Brendan
    Graduate School of Information Science and Electrical Engineering, Kyushu University
  • HIROKAWA Sachio
    Research Institute for Information Technology, Kyushu University
  • ITO Eisuke
    Research Institute for Information Technology, Kyushu University

Abstract

Web page segmentation has a variety of benefits and potential web applications. Early web page segmentation techniques are based mainly on machine learning algorithms and rule-based heuristics, which cannot be used for large-scale page segmentation. In this paper, we propose a formulated page segmentation method using visual semantics. Instead of analyzing the visual cues of web pages, the method uses three measures to formulate the visual semantics: the layout tree is used to recognize visually similar blocks; the seam degree describes how neatly the blocks are arranged; and content similarity describes the degree of content coherence between blocks. A comparison experiment was conducted using the VIPS algorithm as a baseline. Experimental results show that the proposed method can divide a web page into appropriate semantic segments.
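
The abstract names content similarity as one of the three measures but does not give its formula. As a rough illustration only, the sketch below assumes cosine similarity over term counts, a common way to score how coherent the text of two blocks is; it is not the authors' stated definition. The function names `term_counts` and `content_similarity` and the sample block texts are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def content_similarity(block_a: str, block_b: str) -> float:
    """Cosine similarity of two blocks' term-count vectors.

    Assumption: cosine similarity stands in for the paper's measure,
    which the abstract does not define. Returns a value in [0, 1];
    0.0 is returned when either block has no tokens.
    """
    a, b = term_counts(block_a), term_counts(block_b)
    # Dot product over the terms the two blocks share.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical text taken from three page blocks.
    news_1 = "Kyushu University opens new research institute"
    news_2 = "New research institute opens at Kyushu University"
    footer = "Copyright contact privacy policy sitemap"
    print(content_similarity(news_1, news_2))  # high: mostly shared terms
    print(content_similarity(news_1, footer))  # low: no shared terms
```

Under this assumption, adjacent blocks with a high score would be candidates for the same semantic segment, while a low score would suggest a segment boundary. How the score is combined with the layout tree and the seam degree is not specified in the abstract.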
