Wikipediaマイニングによるシソーラス辞書の構築手法  [in Japanese] Wikipedia Mining to Construct a Thesaurus  [in Japanese]

Access this Article

Search this Article

Author(s)

    • 中山 浩太郎 NAKAYAMA KOTARO
    • 大阪大学大学院情報科学研究科マルチメディア工学専攻 Department of Multimedia Engineering, Graduate School of Information Science and Technology, Osaka University
    • 原 隆浩 HARA TAKAHIRO
    • 大阪大学大学院情報科学研究科マルチメディア工学専攻 Department of Multimedia Engineering, Graduate School of Information Science and Technology, Osaka University
    • 西尾 章治郎 NISHIO SHOJIRO
    • 大阪大学大学院情報科学研究科マルチメディア工学専攻 Department of Multimedia Engineering, Graduate School of Information Science and Technology, Osaka University

Abstract

シソーラス辞書は,情報検索や自然言語処理,対話エージェントなどの研究領域において幅広くその有用性が実証されてきた.しかし,自然言語処理などによる従来のシソーラス辞書自動構築では,形態素解析や同義語・多義語の処理など,語の関連性を解析する前段階の処理において精度低下を招く要因がいくつかある.また,辞書作成時と利用時のタイムラグにより最新の語や概念への対応が困難であるという問題もある.そこで本論文では,これら2 つの問題を解決するために,ここ数年で急速にコンテンツ量を増加させたWiki ベースの百科辞典である「Wikipedia」に対し,Web マイニングの手法を適用することでシソーラス辞書を自動構築する方法を提案する.Thesauri have been widely used in many applications such as information retrieval, natural language processing (NLP), and interactive agents. However, several problems, such as morphological analysis, treatment of synonymous and multisense words, still remain and degrade accuracy on traditional NLP-based thesaurus construction methods. In addition, adding latest/miner words is also a difficult issue on this research area. In this paper, to solve these problems, we propose a web mining method to automatically construct a thesaurus by extracting relations between words from Wikipedia, a wiki-based huge encyclopedia on WWW.

Thesauri have been widely used in many applications such as information retrieval, natural language processing (NLP), and interactive agents. However, several problems, such as morphological analysis, treatment of synonymous and multisense words, still remain and degrade accuracy on traditional NLP-based thesaurus construction methods. In addition, adding latest/miner words is also a difficult issue on this research area. In this paper, to solve these problems, we propose a web mining method to automatically construct a thesaurus by extracting relations between words from Wikipedia, a wiki-based huge encyclopedia on WWW.

Journal

  • IPSJ journal  

    IPSJ journal 47(10), 2917-2928, 2006-10-15 

    Information Processing Society of Japan (IPSJ)

References:  18

You must have a user ID to see the references.If you already have a user ID, please click "Login" to access the info.New users can click "Sign Up" to register for an user ID.

Cited by:  14

You must have a user ID to see the cited references.If you already have a user ID, please click "Login" to access the info.New users can click "Sign Up" to register for an user ID.

Codes

  • NII Article ID (NAID)
    110004822978
  • NII NACSIS-CAT ID (NCID)
    AN00116647
  • Text Lang
    JPN
  • Article Type
    Journal Article
  • ISSN
    1882-7764
  • NDL Article ID
    8540640
  • NDL Source Classification
    ZM13(科学技術--科学技術一般--データ処理・計算機)
  • NDL Call No.
    Z14-741
  • Data Source
    CJP  CJPref  NDL  IPSJ 
Page Top