An Analysis of Misconceptions in Conversations Using Machine Translation

山下 直美, 石田 亨, 平田 圭二

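Communities that hold discussions through machine translation are increasing, particularly in East Asia. However, because conversations conducted through machine translation contain noise from translation errors, it is difficult for users to build mutual understanding. We focus on the misconceptions that arise in large numbers in conversations mediated by machine translation, and we analyze and discuss how the characteristics of such conversations relate to those misconceptions. The results show that reply messages between users from different countries, where misconceptions occur in large numbers, tend strongly to connect to only a part of the original message without capturing its content. Based on these characteristics, we propose a method for automatically detecting misconceptions caused by translation errors. The proposed method measures the difference between the regular thread (syntactic thread) and a thread based on lexical cohesion (semantic thread), and judges that the larger this difference is, the more likely users are to misunderstand each other about the topic. When we verified the method on actual conversation data, we observed a significant positive correlation between the misconceptions measured by the method and the actual misconceptions, which demonstrates the effectiveness of the method.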

Multilingual communities that use machine translation to overcome language barriers are becoming increasingly common. However, when a large number of translation errors are mixed into a conversation, it becomes difficult for users to fully understand each other. In this paper, we focus on the misconceptions that occur in high volume in actual online conversations conducted through machine translation. We first examine how misconceptions occur by analyzing 1106 messages exchanged on a BBS using machine translation. The analysis indicates that a user who responds via machine translation tends to respond to short phrases of the original message and to stumble over its wording. Next, based on these results, we propose a method that automatically measures the likelihood that a dialogue includes misconceptions. The method estimates this likelihood by calculating the gap between the regular discussion thread (syntactic thread) and the discussion thread based on lexical cohesion (semantic thread). Verification results show a significant positive correlation between the actual misconception frequency and the syntax-semantic gap, which indicates the validity of the method.
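
The abstract does not give the formula for the syntax-semantic gap, so the following Python sketch is only one possible reading of the idea and not the authors' method. It assumes that the syntactic thread is the explicit reply link of each message, that the semantic thread links each message to the earlier message with the highest word overlap (Jaccard similarity, used here as a crude stand-in for lexical cohesion), and that the gap is the fraction of replies for which the two links disagree. The names `Message`, `semantic_parent`, and `thread_gap` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    msg_id: int
    reply_to: Optional[int]  # syntactic parent (explicit reply link); None for a root post
    words: frozenset         # content words, assumed to be tokenized already


def jaccard(a: frozenset, b: frozenset) -> float:
    """Word overlap, an assumed stand-in for a lexical cohesion score."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def semantic_parent(index: int, messages: list) -> Optional[int]:
    """Return the id of the earlier message sharing the most words, or None."""
    best_id, best_score = None, 0.0
    for earlier in messages[:index]:
        score = jaccard(messages[index].words, earlier.words)
        if score > best_score:
            best_id, best_score = earlier.msg_id, score
    return best_id


def thread_gap(messages: list) -> float:
    """Fraction of replies whose semantic parent differs from the reply link."""
    replies = [i for i, m in enumerate(messages) if m.reply_to is not None]
    if not replies:
        return 0.0
    mismatches = sum(
        1 for i in replies
        if semantic_parent(i, messages) != messages[i].reply_to
    )
    return mismatches / len(replies)


# Hypothetical three-message dialogue, listed in posting order.
example = [
    Message(1, None, frozenset({"festival", "food", "kyoto"})),
    Message(2, 1, frozenset({"festival", "food", "summer"})),
    Message(3, 2, frozenset({"kyoto", "temple"})),  # replies to 2 but shares words only with 1
]
print(thread_gap(example))
```

In this toy dialogue, message 3 is posted as a reply to message 2 but is lexically closest to message 1, so one of the two replies disagrees with its reply link. Under the assumptions above, a higher value would flag a dialogue as more likely to contain a misconception about the topic.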
