Developing an SDR Test Collection from Japanese Lecture Audio Data

HANDLE Open Access

Abstract

APSIPA ASC 2009: Asia-Pacific Signal and Information Processing Association, 2009 Annual Summit and Conference. 4-7 October 2009. Sapporo, Japan. Oral session: Initiatives in Spoken Document Processing (6 October 2009).

Lectures are among the most valuable genres of audiovisual data. However, spoken lectures are difficult to reuse because they are hard to browse and to search efficiently. To promote research on spoken lecture retrieval, this paper reports a test collection for its evaluation. The test collection consists of the target spoken documents, about 2,700 lectures (604 hours) taken from the Corpus of Spontaneous Japanese (CSJ), together with 39 retrieval queries, the relevant passages in the target documents for each query, and automatic transcriptions of the target speech data. We report the retrieval performance obtained on the test collection with a standard spoken document retrieval (SDR) method, which serves as a baseline for future SDR studies that use the collection. We also introduce several studies conducted by users of the test collection.

Details

  • CRID
    1050001202963605120
  • NII Article ID
    120006660605
  • HANDLE
    2115/39703
  • Text language code
    en
  • Resource type
    conference paper
  • Data source type
    • IRDB
    • CiNii Articles
