ウェブ閲覧における効率的なキーワード抽出とその利用

上村 卓史 (Takashi Uemura); 喜田 拓也 (Takuya Kida); 有村 博紀 (Hiroki Arimura)

Bibliographic Information

Other Title

ウェブエツランニオケルコウリツテキナキーワードチュウシュツトソノリヨウ
Efficient Keyword Extraction and Its Applications to Web Browsing

Abstract

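Web browsers play a major role in efficient information gathering from the Internet. However, searching and browsing an enormous number of documents is still a heavy burden on users. In this paper, with the aim of assisting information gathering, we consider the problem of extracting keywords from the Web pages that a user browses. Here, a keyword is a concatenation of arbitrary words (a word N-gram). To this end, we propose the word N-gram tree, a suffix-tree-based data structure that represents arbitrary word N-grams in linear space, and give a fast extraction algorithm that uses it. We also show how the extracted keywords can be used to realize interfaces that assist the user's Web browsing, such as displaying the extracted keywords, displaying related Web sites, and suggesting candidate search queries.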

Web browsers play an important role in information gathering on the Internet. However, searching and browsing documents are still time-consuming tasks for many users. In this paper, we present an algorithm for extracting keywords from the current Web page. The proposed algorithm uses a data structure called the word N-gram tree, which represents all concatenations of at most N words in an input text. We also present browsing user interfaces based on the extraction: for showing related sites, for finding a summary paragraph, and for suggesting input queries for Web search.

Journal

情報処理学会論文誌データベース（TOD） (IPSJ Transactions on Databases) 1 (1), 49-60, 2008-06-26

Tokyo: 情報処理学会 (Information Processing Society of Japan)

Details

CRID: 1050001337891831808

NII Article ID: 110007990001

NII Book ID: AA11464847

ISSN: 18827799; 18827772; 03875806

NDL BIB ID: 024346672

Web Site: http://id.nii.ac.jp/1001/00017392/; https://ndlsearch.ndl.go.jp/books/R000000004-I024346672

Text Lang: ja

Article Type: article

Data Source

IRDB
NDL
CiNii Articles
KAKEN
