2種類の共起辞書を用いた語彙的言い換えに基づくWeb検索システム A Web Retrieval System Based on Lexical Paraphrasing Using Two Kinds of Co-occurrence Dictionaries

Abstract

This paper proposes a Web retrieval system that accurately and exhaustively collects web pages related to a user-specified topic. When a user enters a character string as a query, the system lexically paraphrases and expands it. Consequently, the system can present more topic-related web pages than conventional search engines do. First, the system extracts nouns, adjectives, verbs, and katakana strings from the query as target words, obtains candidate words for paraphrasing each target word through information retrieval on the Web, and tests the validity of each paraphrase using two kinds of co-occurrence dictionaries. Then, the system expands the initial query by replacing zero or more of the target words with the candidate words judged to be valid. A distinctive point of the system is that the validity test uses not only a co-occurrence dictionary that describes "preceding," "following," and "predicate" relationships between words but also an impression dictionary that describes co-occurrence relationships between words and two contrasting sets of impression words. We also evaluated the system's performance on paraphrasing and on Web information retrieval using seven sample queries. The results demonstrated its effectiveness.
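The query expansion step described in the abstract (replacing zero or more target words with validated candidates) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `expand_query`, the `valid_paraphrases` mapping, and the English sample words are all hypothetical, and the sketch assumes the candidates have already passed the validity test.

```python
from itertools import product


def expand_query(tokens, valid_paraphrases):
    """Return every query obtained by replacing zero or more target words.

    tokens: the initial query as a list of words.
    valid_paraphrases: hypothetical mapping from a target word to the
        candidate words that are assumed to have passed the validity test.
    """
    # For each position, the options are the original word followed by
    # any validated candidates for that word.
    options = [[tok] + list(valid_paraphrases.get(tok, [])) for tok in tokens]
    # The Cartesian product enumerates every combination; the first one
    # is the unchanged initial query (zero replacements).
    return [" ".join(combo) for combo in product(*options)]


# Illustrative data only; the words below do not come from the paper.
queries = expand_query(
    ["cheap", "hotel", "tokyo"],
    {"cheap": ["inexpensive", "budget"], "hotel": ["inn"]},
)
for q in queries:
    print(q)
```

With two candidates for one word and one candidate for another, the sketch yields 3 × 2 × 1 = 6 queries, including the original. A real system working on Japanese text would need morphological analysis to find the target words rather than whitespace-separated tokens.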

Journal

Transactions of the Japanese Society for Artificial Intelligence  

Transactions of the Japanese Society for Artificial Intelligence 23(5), 355-363, 2008 

The Japanese Society for Artificial Intelligence

Codes

  • NII Article ID (NAID) :
    130000098246
  • Text Lang :
    ja
  • ISSN :
    1346-0714
  • Databases :
    J-STAGE 
