Abstract
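Using short syllables from a specific speaker as samples, we applied Haar's discrete wavelet transform to units of 1024 data points sampled every 0.2 msec, and analyzed syllables by matching time-series templates whose components are the sums, taken scale by scale, of the absolute values of the wavelet coefficients (WLC); these sums are called SWLC. In the transition regions of continuous speech where the phoneme changes, the ratios SWLC(0.8 msec band)/SWLC(1.6 msec band) and SWLC(1.6 msec band)/SWLC(3.2 msec band) form nodes. In pitch-by-pitch waveform analysis of voiced sounds, we take the WLC of 6.4 msec of data (64 points sampled every 0.1 msec), a span shorter than the pitch period, starting from the peak value; vowels can be discriminated by computing the Hamming distance to sample vowels over the 15 low-resolution WLC components (8, 4, 2, 1; scales of 0.8 msec and above). With these two methods, the phonemes contained in utterances by the same speaker could be analyzed.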
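The SWLC quantities and the two band ratios described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`haar_details`, `swlc`, `band_ratios`) are hypothetical, the orthonormal Haar normalization is one common choice, and the mapping of the "0.8, 1.6 and 3.2 msec bands" onto detail levels 2, 3 and 4 (supports of 4, 8 and 16 samples at 0.2 msec per sample) is an assumption.

```python
import math

def haar_details(samples):
    """Haar detail coefficients per level, finest level first.

    The input length must be a power of two (e.g. 1024 samples).
    """
    n = len(samples)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two (>= 2)")
    approx = [float(v) for v in samples]
    root2 = math.sqrt(2.0)
    levels = []
    while len(approx) > 1:
        evens, odds = approx[0::2], approx[1::2]
        # Detail = scaled difference, approximation = scaled sum of each pair.
        levels.append([(a - b) / root2 for a, b in zip(evens, odds)])
        approx = [(a + b) / root2 for a, b in zip(evens, odds)]
    return levels

def swlc(samples):
    """Sum of absolute wavelet coefficients for each scale (SWLC)."""
    return [sum(abs(c) for c in level) for level in haar_details(samples)]

def band_ratios(samples):
    """Return (SWLC_0.8 / SWLC_1.6, SWLC_1.6 / SWLC_3.2).

    ASSUMPTION: at 0.2 msec per sample, the 0.8, 1.6 and 3.2 msec bands are
    the detail levels whose supports are 4, 8 and 16 samples (indices 1, 2, 3).
    """
    s = swlc(samples)
    if len(s) < 4:
        raise ValueError("need at least 16 samples")

    def ratio(num, den):
        # nan when the coarser band carries no energy
        return num / den if den else float("nan")

    return ratio(s[1], s[2]), ratio(s[2], s[3])
```

Calling `band_ratios` on successive 1024-sample units would give the time series of ratios in which the abstract reports nodes at phoneme transitions; detecting those nodes is not covered by this sketch.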
Speaker-dependent voice recognition was achieved with template matching (TM). To give TM a margin of tolerance, the sums of the absolute values of the wavelet transform coefficients in each scale (SWLCs) are used for vector quantization. Japanese moras are recognized using 204.8 msec (1024 data points sampled every 0.2 msec) as the unit of processing. For segmentation, the ratios SWLC (the 0.8 msec band)/SWLC (the 1.6 msec band) and SWLC (the 1.6 msec band)/SWLC (the 3.2 msec band) form a node in the transition region of the vowels [a, i, u, e, o]. Vowels uttered by the same speaker were recognized by TM with 15 WLCs at low resolution (scales of 0.8 msec and above), where the processing segment is shorter than the pitch period so that the method can be applied to voiced speech sounds. Here, the data were sampled every 0.1 msec, 64 data points were taken starting from each peak of the voice waveform, and each set of data was transformed into Haar discrete wavelet coefficients (WLCs).
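The vowel-matching step can be illustrated with a short Python sketch. A 64-sample frame yields Haar detail levels of 32, 16, 8, 4, 2 and 1 coefficients; at 0.1 msec per sample the last four levels have supports of 0.8 msec and above, which gives the 8 + 4 + 2 + 1 = 15 coefficients mentioned in the abstract. The abstract does not say how the coefficients are turned into a code for the Hamming distance, so the sign-based binarization below is an assumption, and all names (`low_res_code`, `hamming`, `classify`) and the template labels are hypothetical. The frame is assumed to have already been cut out starting at a waveform peak.

```python
import math

def haar_details(samples):
    """Haar detail coefficients per level, finest level first."""
    n = len(samples)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two (>= 2)")
    approx = [float(v) for v in samples]
    root2 = math.sqrt(2.0)
    levels = []
    while len(approx) > 1:
        evens, odds = approx[0::2], approx[1::2]
        levels.append([(a - b) / root2 for a, b in zip(evens, odds)])
        approx = [(a + b) / root2 for a, b in zip(evens, odds)]
    return levels

def low_res_code(frame):
    """15-bit code from the 8+4+2+1 coarsest details of a 64-sample frame.

    ASSUMPTION: each coefficient is binarized by its sign; the abstract
    does not specify the encoding used for the Hamming distance.
    """
    if len(frame) != 64:
        raise ValueError("expected a 64-sample frame (6.4 msec at 0.1 msec)")
    coarse = [c for level in haar_details(frame)[2:] for c in level]
    return tuple(1 if c > 0 else 0 for c in coarse)

def hamming(code_a, code_b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(code_a, code_b))

def classify(frame, templates):
    """Label of the template code closest to the frame in Hamming distance."""
    code = low_res_code(frame)
    return min(templates, key=lambda label: hamming(code, templates[label]))
```

In use, `templates` would map each vowel label to the code of a reference frame from the same speaker, for example `{"a": low_res_code(frame_a), "i": low_res_code(frame_i)}`, and `classify` would be applied to each new pitch-synchronous frame.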
Journal
- IPSJ SIG Notes
- IPSJ SIG Notes 2006(136), 77-82, 2006-12-21
Information Processing Society of Japan (IPSJ)