Extending Various Thesauri by Finding Synonym Sets from a Formal Concept Lattice
- Ikeda Madori, Graduate School of Informatics, Kyoto University
- Yamamoto Akihiro, Graduate School of Informatics, Kyoto University
Abstract
In this paper, we solve the problem of extending various thesauri using a single method. Thesauri should be extended when unregistered terms are identified. Various thesauri are available, each of which is constructed according to a unique design principle. We formalise the extension of one thesaurus as a single classification problem in machine learning, so extending several thesauri means solving several different classification problems. Applying existing classification methods to each thesaurus is time-consuming, particularly if many thesauri must be extended. Thus, we propose a method to reduce the time required to extend multiple thesauri. In the proposed method, we first generate, without using the thesauri, clusters of terms that are candidates for synonym sets. The clusters are obtained by formal concept analysis of the syntactic information of terms in a corpus. Reliable syntactic parsers are easy to use; thus, syntactic information is more readily available for many terms than semantic information. Then, for each thesaurus and each unregistered term, we can quickly search the candidate clusters for a correct synonym set, which makes classification fast. Experimental results demonstrate that the proposed method is faster than existing methods, with comparable classification accuracy.
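The clustering step described in the abstract can be illustrated with a minimal sketch of formal concept analysis. This is not the authors' implementation: the toy context, the feature names such as `obj_of:drive`, and the helper names `formal_concepts` and `candidate_clusters` are all assumptions made for illustration, and the enumeration is a naive one that is only practical for very small inputs. Each formal concept pairs a set of terms (the extent) with the set of syntactic features they all share (the intent); extents with at least two terms are treated here as candidate synonym sets.

```python
from itertools import combinations

# Hypothetical toy context: term -> syntactic features observed in a parsed
# corpus. The terms and features are invented for illustration only.
CONTEXT = {
    "car":   {"obj_of:drive", "obj_of:park"},
    "truck": {"obj_of:drive", "obj_of:park"},
    "bike":  {"obj_of:ride", "obj_of:park"},
    "horse": {"obj_of:ride", "obj_of:feed"},
}


def common_features(terms, context):
    """Features shared by every term in `terms` (all features if empty)."""
    shared = set().union(*context.values())
    for term in terms:
        shared &= context[term]
    return frozenset(shared)


def terms_having(features, context):
    """Terms whose feature set contains every feature in `features`."""
    return frozenset(t for t, fs in context.items() if features <= fs)


def formal_concepts(context):
    """Naively enumerate all (extent, intent) pairs by closing every subset.

    Exponential in the number of terms; fine for a toy context only.
    """
    concepts = set()
    terms = sorted(context)
    for size in range(len(terms) + 1):
        for subset in combinations(terms, size):
            intent = common_features(subset, context)
            extent = terms_having(intent, context)
            concepts.add((extent, intent))
    return concepts


def candidate_clusters(context, min_size=2):
    """Extents with a non-empty intent and at least `min_size` terms."""
    clusters = [
        sorted(extent)
        for extent, intent in formal_concepts(context)
        if len(extent) >= min_size and intent
    ]
    return sorted(clusters, key=lambda c: (len(c), c))


if __name__ == "__main__":
    for cluster in candidate_clusters(CONTEXT):
        print(cluster)
```

Because the clusters are built from the corpus alone, the same candidate list could be reused for every thesaurus, which is the source of the time saving the abstract claims. How the paper selects the correct cluster for an unregistered term is not specified in the abstract and is not sketched here.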
Published in
- 自然言語処理 (Journal of Natural Language Processing) 24 (3), 323-349, 2017
- 一般社団法人 言語処理学会 (The Association for Natural Language Processing)
Details
- CRID: 1390001204475008512
- NII Article ID: 130006078496
- NII Book ID: AN10472659
- ISSN: 21858314, 13407619
- NDL BIB ID: 028334251
- Text language code: en
- Data sources: JaLC, NDL, Crossref, CiNii Articles, KAKEN
- Abstract license flag: Disallowed