Automatic Extraction of Important Sentences from Lecture Transcription using Statistical Measure based on Discourse Markers and Topic Words (談話標識と話題語に基づく統計的尺度による講演からの重要文抽出) [in Japanese]

Abstract

With the goal of building digital archives of lectures (academic conference presentations), we propose a discourse-marker-based method that exploits the topic structure characteristic of conference presentations to automatically extract important sentences from transcripts (speech recognition results). Candidate section boundaries in spoken language are detected from pause information and linguistic information; the discourse markers that occur frequently in section-initial sentences are then identified, and a statistical importance measure is defined on the basis of these markers. We also examine integrating this measure with an importance measure based on the statistics of topic words (keywords). Using these measures, we evaluate important-sentence extraction accuracy on 14 conference presentations from the CSJ and confirm that (1) the discourse-marker-based method is effective and (2) combining it with the topic-word-based method yields a synergistic effect.

For efficient access to speech media, secondary information is required. We explore automatic extraction of important sentences from lecture presentations. We segment a lecture into units and extract key sentences based on the discourse structure. To detect the boundaries of the units, we use pause information and linguistic information. We also incorporate a second extraction method based on topic-dependent keywords. We evaluate the proposed methods and their combination on 14 lecture transcriptions. The results confirm that both the use of section boundary information and its combination with the keyword-based method are effective.
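
The abstract names two importance measures and their combination but does not give their formulas. The following is a minimal illustrative sketch only, not the authors' method: it assumes pre-tokenized sentences, assumes section-initial flags are already available (boundary detection from pauses and linguistic cues is not implemented here), uses a smoothed log-ratio as a stand-in for the discourse-marker measure, and uses a tf-idf-style weight as a stand-in for the topic-word measure. All function names, the `alpha` mixing weight, and the `k` cutoff are hypothetical.

```python
import math
from collections import Counter


def marker_weights(sentences, section_initial, markers):
    """Assumed measure: smoothed log-ratio of how often a marker occurs in
    section-initial sentences versus in all sentences."""
    n_all = len(sentences)
    n_init = sum(section_initial)
    weights = {}
    for m in markers:
        in_all = sum(m in s for s in sentences)
        in_init = sum(m in s for s, first in zip(sentences, section_initial) if first)
        p_init = (in_init + 1) / (n_init + 2)  # add-one smoothing
        p_all = (in_all + 1) / (n_all + 2)
        weights[m] = math.log(p_init / p_all)
    return weights


def keyword_weights(lecture, background):
    """Assumed measure: term frequency in this lecture times a smoothed
    inverse document frequency over background lectures (sets of tokens)."""
    tf = Counter(tok for s in lecture for tok in s)
    n_docs = len(background) + 1  # background lectures plus this one
    weights = {}
    for tok, count in tf.items():
        df = 1 + sum(tok in doc for doc in background)
        weights[tok] = count * math.log(n_docs / df)
    return weights


def score(sentence, weights):
    """Sum the weights of the distinct tokens in a sentence."""
    return sum(weights.get(tok, 0.0) for tok in set(sentence))


def extract(lecture, marker_w, keyword_w, alpha=0.5, k=2):
    """Rank sentences by a linear mix of the two max-normalised scores and
    return the indices of the top k in original order."""
    def normalise(xs):
        top = max(xs, default=0.0)
        return [x / top if top > 0 else 0.0 for x in xs]

    m = normalise([max(score(s, marker_w), 0.0) for s in lecture])
    kw = normalise([score(s, keyword_w) for s in lecture])
    combined = [alpha * a + (1 - alpha) * b for a, b in zip(m, kw)]
    ranked = sorted(range(len(lecture)), key=lambda i: combined[i], reverse=True)
    return sorted(ranked[:k])
```

Setting `alpha` to 1.0 or 0.0 in this sketch isolates the marker-based or keyword-based score respectively, which is one simple way to compare each measure against the combination.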

Journal

  • IPSJ SIG Notes

    IPSJ SIG Notes 2003(58(2003-SLP-046)), 7-12, 2003-05-27

    Information Processing Society of Japan (IPSJ)

References:  7

Cited by:  8

Codes

  • NII Article ID (NAID)
    110002913794
  • NII NACSIS-CAT ID (NCID)
    AN10442647
  • Text Lang
    JPN
  • Article Type
    Journal Article
  • ISSN
    0919-6072
  • NDL Article ID
    6615430
  • NDL Source Classification
    ZM13 (Science and technology -- General science and technology -- Data processing and computers)
  • NDL Call No.
    Z14-1121
  • Data Source
    CJP  CJPref  NDL  NII-ELS  IPSJ 