Japanese Semantic Analysis System SAGE using EDR

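  • 原田 実
    Department of Information Technology, College of Science and Engineering, Aoyama Gakuin University
  • 水野 高宏
    Industrial Engineering Program, Graduate School of Science and Engineering, Aoyama Gakuin University; NTT Data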

Bibliographic Information

Alternative Titles
  • Japanese Semantic Analysis System SAGE using EDR
  • EDR o mochiita Nihongo imi kaiseki shisutemu SAGE

Abstract

Since 1993, the Harada laboratory has been studying the automation of object-oriented analysis, in particular the extraction of object-oriented design elements from problem specifications written in Japanese. As the first step of this process, we developed the semantic analysis system SAGE, which is practical in both performance and accuracy. Given a dependency tree, in which the clauses constituting a sentence are related by dependency arcs, SAGE searches the EDR electronic dictionary and, for every pair of clauses connected by a dependency arc, retrieves the meaning of the principal word in each clause and the deep case between the two words, and assigns a probability to each such meaning-case tuple. SAGE then constructs an interpretation tree by allocating a meaning-case tuple and its probability to each arc in the dependency tree. Next, SAGE searches for the allocation that maximizes the overall evaluation value, defined as the sum of the probabilities of the allocated meanings and cases. Finally, SAGE converts the resulting interpretation tree into a set of semantic frames containing information on each word and its relations to other words. In developing the system, we sped up the construction of the interpretation tree by pruning useless meaning-case tuples to reduce the search space and by using the branch and bound method. Moreover, we improved the accuracy of the analysis by applying the following four methods: (A) when constructing the interpretation tree, assigning zero probability to every combination of word meanings for which there is no "case" information in the concept description dictionary; (B) using empirical rules to infer the deep case from the surface case for each dependency between verb clauses; (C) improving the fit of the sentences retrieved from the corpus by using part-of-speech information; and (D) reducing the number of meaning candidates by using reading information.
As a result, the average time to construct the interpretation of a sentence with nine clauses or fewer was 2 seconds on a PC with a Pentium III processor and 320 MB of memory. The correct answer rate was 82.1% for meanings and 77.8% for cases.
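
The search step described in the abstract (choose one meaning per clause and one deep case per arc so that the summed probability is maximal, pruning zero-probability combinations and bounding the search) can be illustrated with a minimal sketch. This is not the SAGE implementation: the function name `best_interpretation`, the data layout, the concept labels, and the probabilities are all hypothetical, and the upper bound used here (the best probability each remaining arc could contribute) is only one simple choice of bound. Clauses are assumed to be indexed so that every head precedes its dependents, for example by numbering a Japanese sentence from its last clause.

```python
from typing import Dict, List, Optional, Tuple

# Candidates for one arc: (dependent_meaning, head_meaning) -> (deep_case, probability).
# Pairs missing from the table are treated as having probability 0, as in method (A).
ArcTable = Dict[Tuple[str, str], Tuple[str, float]]


def best_interpretation(
    heads: List[Optional[int]],
    meanings: List[List[str]],
    arcs: Dict[int, ArcTable],
) -> Tuple[float, List[str], Dict[int, str]]:
    """Return (score, meaning per clause, deep case per dependent clause).

    heads[i] is the clause that clause i depends on (None for the root).
    If no complete assignment exists, the score is -1.0 and the rest is empty.
    """
    n = len(heads)

    # remaining[i]: optimistic total that the arcs of clauses i..n-1 could still add.
    remaining = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        arc_max = 0.0
        if heads[i] is not None:
            arc_max = max((p for _, p in arcs[i].values()), default=0.0)
        remaining[i] = remaining[i + 1] + arc_max

    best = {"score": -1.0, "meanings": [], "cases": {}}
    chosen: List[str] = []
    cases: Dict[int, str] = {}

    def search(i: int, score: float) -> None:
        # Bound: give up if even the optimistic total cannot beat the best so far.
        if score + remaining[i] <= best["score"]:
            return
        if i == n:
            best.update(score=score, meanings=list(chosen), cases=dict(cases))
            return
        head = heads[i]
        for meaning in meanings[i]:
            gain = 0.0
            if head is not None:
                entry = arcs[i].get((meaning, chosen[head]))
                if entry is None:
                    continue  # zero-probability combination: prune this branch
                cases[i], gain = entry
            chosen.append(meaning)
            search(i + 1, score + gain)
            chosen.pop()
            cases.pop(i, None)

    search(0, 0.0)
    return best["score"], best["meanings"], best["cases"]


# Hypothetical three-clause example: clause 0 is the verb, clauses 1 and 2 depend on it.
heads = [None, 0, 0]
meanings = [["read_text", "foresee"], ["book", "volume"], ["he"]]
arcs = {
    1: {
        ("book", "read_text"): ("object", 0.6),
        ("volume", "read_text"): ("object", 0.3),
        ("book", "foresee"): ("object", 0.1),
    },
    2: {
        ("he", "read_text"): ("agent", 0.7),
        ("he", "foresee"): ("agent", 0.5),
    },
}
score, chosen_meanings, chosen_cases = best_interpretation(heads, meanings, arcs)
```

In this toy input the assignment `read_text` / `book` / `he` wins, and the branches starting from `volume` or `foresee` are cut by the bound before they are fully expanded. A tighter bound, such as the best probability per arc given the head meaning already chosen, would prune more at the cost of extra bookkeeping.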
