Improving semantic similarity measures for word pair comparison

Access this thesis

Institutional repository: The Graduate University for Advanced Studies (SOKENDAI)

Search for this thesis

NDL ONLINE

Author

- Menendez Mora, Raul Ernesto メネンデスモラ, ラウルエルネスト

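Bibliographic details

Title: Improving semantic similarity measures for word pair comparison

Alternative title: 語の比較のための意味的な類似性尺度の改善に関する研究 (A study on improving semantic similarity measures for word comparison)

Author name: Menendez Mora, Raul Ernesto

Alternative author name: メネンデスモラ, ラウルエルネスト

Degree-granting university: The Graduate University for Advanced Studies (総合研究大学院大学)

Degree: Doctorate (Informatics)

Degree number: 甲第1515号 (Kō No. 1515)

Date of degree conferral: 2012-03-23

Notes and abstract

Doctoral thesis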

The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. To achieve its goals, it must be able to define and describe the relations among data (i.e., resources) on the Web. Ontologies are one of the formal representations for organizing information in the Semantic Web; they are also used in artificial intelligence, systems engineering, software engineering, biomedical informatics, library science, enterprise bookmarking, and information architecture as a form of knowledge representation about the world or some part of it. In the Semantic Web context, since many actors provide their own ontologies, ontology matching (or ontology alignment) has taken on a critical role in helping heterogeneous resources interoperate [23].

Ontology matching tools find classes of data that are "semantically equivalent". This process determines correspondences between concepts, which are called alignments [22]. Finding those correspondences implies a semantic similarity assessment between the concepts involved.

The semantic similarity of a word pair is often represented by the similarity between the concepts associated with the words. Several methods have been developed to compute word similarity, most of them operating on taxonomic dictionaries such as WordNet [24] or on external corpora such as the Brown Corpus. However, the majority of them suffer from a serious limitation: they focus only on the semantic information shared by the words, or only on their semantic differences, and the two have rarely been combined in a broader perspective.

In this thesis we developed and applied a model of semantic similarity computation for word pair comparison. This model takes the semantic commonalities and the semantic differences as the core of its approach. By applying the model, five new WordNet-based semantic similarity measures for word pair comparison were created. Four of these measures obtained higher correlation with human judgment than their original expressions, while the fifth remained as competitive as its original version.

We also study WordNet taxonomic properties to extend a corpus-independent information content metric. Applying this new metric to one of the previously developed node-based semantic similarity measures allowed us to obtain the highest correlation with human judgment. This thesis provides a general and extensible approach to semantic similarity computation for word pair comparison.
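The abstract does not give the thesis's formulas, so the following is only a minimal sketch of the general idea of weighing semantic commonalities against semantic differences. It assumes a Tversky-style ratio model over a tiny hand-made is-a taxonomy; the taxonomy, the function names, and the weights `alpha` and `beta` are all hypothetical and are not the author's measures or WordNet data.

```python
# Toy is-a taxonomy (hypothetical; not taken from WordNet): child -> parent.
TOY_TAXONOMY = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def ancestors(concept, taxonomy):
    """Return the set containing `concept` and all of its ancestors."""
    found = set()
    while concept is not None:
        found.add(concept)
        concept = taxonomy[concept]
    return found


def tversky_similarity(a, b, taxonomy, alpha=0.5, beta=0.5):
    """Ratio model: shared features over shared plus weighted distinct features.

    The "features" of a concept are assumed here to be the concept itself
    and its ancestors; alpha and beta weight the two difference sets.
    """
    feats_a = ancestors(a, taxonomy)
    feats_b = ancestors(b, taxonomy)
    common = len(feats_a & feats_b)   # semantic commonalities
    only_a = len(feats_a - feats_b)   # features unique to a
    only_b = len(feats_b - feats_a)   # features unique to b
    return common / (common + alpha * only_a + beta * only_b)


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "car"), ("dog", "dog")]:
        print(pair, round(tversky_similarity(*pair, TOY_TAXONOMY), 3))
```

In this sketch "dog" and "cat" score higher than "dog" and "car" because they share more ancestors and differ in fewer. A measure that counted only the shared ancestors, or only the differing ones, would discard half of that information.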

application/pdf

SOKENDAI 甲第1515号 (Kō No. 1515)
