A Semi-supervised Learning Method based on Generative/Discriminative Model Combination and its Application to Multi-label Classification (生成・識別モデルの統合に基づく半教師あり学習法とその多重分類への応用) [in Japanese]

Author(s)

    • 藤野 昭典 FUJINO Akinori
    • NTT Communication Science Laboratories, NTT Corporation
    • 上田 修功 UEDA Naonori
    • NTT Communication Science Laboratories, NTT Corporation
    • 磯崎 秀樹 ISOZAKI Hideki
    • NTT Communication Science Laboratories, NTT Corporation

Abstract

We propose a method for designing semi-supervised multi-label classifiers, which select one or more category labels for each data example and are trained on both labeled and unlabeled examples. The proposed method obtains a classifier by combining discriminative models trained on labeled examples with generative models trained on unlabeled examples. To apply the method to multi-label text classification problems, we employ a log-linear model as the discriminative model and a naive Bayes model as the generative model. In experiments on three test collections of real text data, we confirmed that the proposed method yields multi-label classifiers with higher generalization ability than conventional semi-supervised learning methods for log-linear and naive Bayes models.

Journal

  • IPSJ SIG Notes

    IPSJ SIG Notes 2008(85(2008-MPS-071)), 95-98, 2008-09-11

    Information Processing Society of Japan (IPSJ)

References:  8

Codes

  • NII Article ID (NAID)
    110006975922
  • NII NACSIS-CAT ID (NCID)
    AN10505667
  • Text Lang
    JPN
  • Article Type
    Technical Report
  • ISSN
    09196072
  • NDL Article ID
    9668866
  • NDL Source Classification
    ZM13 (Science and technology -- General science and technology -- Data processing and computers)
  • NDL Call No.
    Z14-1121
  • Data Source
    CJP  NDL  NII-ELS  IPSJ 