Read/Search this Article
This paper describes a sound scene database necessary for studies such as sound source localization, sound retrieval, sound recognition and speech recognition in real acoustical environments. Many speech databases have been collected for speech recognition so far. The statistical modeling of speech based on the collected speech databases realizes a drastic improvement of speech recognition performance. However, there are only a few databases available for sound scene data including non-speech sound in real environments. A sound scene database is obviously necessary for studies of acoustical signal processing and sound recognition. This paper reports on d project for collection of the sound scene database supported by Real World Computing Partnership (RWCP). There are many kinds of sound scenes in real environments. The sound scene is denoted by sound sources and room acoustics. The number of combination of the sound sources, source positions and rooms is huge in real acoustical environments. Two approaches are taken to build the sound scene database in the early stage of the project. The first approach is to collect isolated sound sources of many kinds of non-speech sounds and speech sounds. The second approach is to collect impulse responses in various acoustical environments. The sound in the collected environments can be simulated by convolution of the isolated sound sources and impulse responses. In a later stage, the sound scene data in real acoustical environments is planned to be collected using a three dimensional microphone array. In this paper, the plan and progress of our sound scene database project are described.