Spoken Term Detection from Spoken Documents Using Vector Quantization of Acoustic Information

坂本 伊織, 松永 徹, 趙 國, 山下 洋一

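To make effective use of multimedia content that contains speech, information retrieval based on speech recognition is a key technology. Spoken term detection (STD), which locates a given query term in speech data, has been widely studied. This paper proposes an STD method that represents the target spoken documents as VQ code sequences, obtained by vector quantization (VQ) of acoustic information, and matches them against a query term entered as text. The degree of association between VQ codes and phonemes is learned in advance for each speaker, which allows the VQ code sequence of a spoken document to be matched against the phoneme sequence of the query. In evaluation experiments, the method achieved higher detection performance than a conventional method that represents spoken documents as subword sequences. The experiments also show that detection performance improves when multiple detection results, each obtained by matching with association scores trained on a different speech recognition result, are combined.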

Information retrieval based on speech recognition is an important technique for easy access to large amounts of multimedia content that includes speech. Spoken term detection (STD) techniques, which detect a given word or phrase in spoken documents, are being widely developed. This paper proposes a new STD method that matches a text query against vector quantization (VQ) code sequences representing the spoken documents. Co-occurrence scores between VQ codes and subwords are trained in advance for each speaker. Continuous DP matching then detects the subword sequence of the query term in the VQ code sequences, using the co-occurrence score as the local matching score. Evaluation experiments show that the proposed method improves STD performance over a conventional method that represents spoken documents as subword sequences. Fusing multiple detection results obtained with different co-occurrence scores improves STD performance further.
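
As an illustration of the matching step described in the abstract, the following is a minimal sketch of continuous DP spotting over a VQ code sequence. It is not the authors' implementation. The abstract does not give the path constraints, the score normalization, or the form of the co-occurrence table, so everything here is an assumption: the function name `continuous_dp`, the `cooc` dictionary keyed by `(subword, vq_code)`, the negative-log local cost, the `floor` value, and the rule that each frame either stays in the current subword or advances by one. Predecessors are chosen greedily by length-normalized cost, which is a simplification and not a guaranteed optimum.

```python
import math

def continuous_dp(codes, query, cooc, floor=1e-6):
    """Spot the subword sequence `query` in the VQ code sequence `codes`.

    cooc maps (subword, vq_code) to an assumed co-occurrence score in (0, 1];
    missing pairs fall back to `floor`. `query` must be non-empty.
    Returns one (cost, start, end) tuple per frame: the length-normalized
    cost of the chosen alignment of the whole query ending at that frame,
    and the frame where that alignment starts.
    """
    n = len(query)
    # prev[j]: (accumulated cost, start frame, path length) at query position j
    prev = [(math.inf, -1, 0)] * n
    results = []
    for t, code in enumerate(codes):
        cur = []
        for j, sub in enumerate(query):
            # local cost: negative log of the co-occurrence score
            local = -math.log(max(cooc.get((sub, code), 0.0), floor))
            if j == 0:
                # stay in the first subword, or start a new path at frame t
                candidates = [prev[0], (0.0, t, 0)]
            else:
                # stay in subword j, or advance from subword j - 1
                candidates = [prev[j], prev[j - 1]]
            extended = [(c + local, s, k + 1) for c, s, k in candidates]
            # greedy choice by length-normalized cost (a simplification)
            cur.append(min(extended, key=lambda p: p[0] / p[2]))
        cost, start, length = cur[-1]
        results.append((cost / length, start, t))
        prev = cur
    return results

# Toy data (hypothetical): codes 1, 2, 3 co-occur with subwords a, b, c.
cooc = {("a", 1): 0.9, ("b", 2): 0.9, ("c", 3): 0.9}
hits = continuous_dp([5, 5, 1, 2, 2, 3, 5], ["a", "b", "c"], cooc)
best = min(hits, key=lambda h: h[0])
```

On this toy input the lowest-cost hit spans frames 2 to 5. In a detector, frames whose cost falls below a threshold would be reported as candidate occurrences; the threshold is another parameter the abstract does not specify.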
