A Robust K-Means for Document Clustering

Yu, Hengjun, Inoue, Kohei, Hara, Kenji, Urahama, Kiichi

doi:10.12792/jiiae.6.60

Yu, Hengjun
Department of Communication Design Science, Kyushu University

Inoue, Kohei
Department of Communication Design Science, Kyushu University

Hara, Kenji
Department of Communication Design Science, Kyushu University

Urahama, Kiichi
Department of Communication Design Science, Kyushu University

Bibliographic Information

Other Title

Two-step Variable Screening Method for the Mahalanobis-Taguchi Method with Small Training Data

Abstract

We propose a robust K-means clustering algorithm for document clustering. The input is a document-term matrix, and the documents are clustered on the basis of the frequency of the terms that occur in each document. We obtain the robust version by introducing a robust loss function into K-means clustering, and we also propose a feature transform method that improves the performance of document clustering. Experimental results on one of the BBC datasets, which originate from BBC News, show that the proposed method improves both the robustness of K-means to outliers and the performance of document clustering.
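
The abstract does not state which robust loss the authors use, so the following is only a minimal sketch of the general idea of replacing the squared-error loss in K-means with a robust one. It assumes a Welsch-type loss, rho(d) = 1 - exp(-d^2 / (2 c^2)), minimized by iteratively reweighted centroid updates; the function name `robust_kmeans`, the scale parameter `c`, and the `init` argument are hypothetical and are not taken from the paper. The feature transform mentioned in the abstract is not reproduced here. In a document clustering setting, each row of `X` would be the term-frequency vector of one document.

```python
import numpy as np


def robust_kmeans(X, k, c=1.0, n_iter=50, init=None, seed=0):
    """Illustrative robust K-means (assumed Welsch loss, not the paper's method).

    Each point gets a weight exp(-d^2 / (2 c^2)) based on its squared distance
    d^2 to its assigned centre, so distant points (outliers) barely move the
    centre. Plain K-means corresponds to all weights being equal to 1.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    if init is None:
        # Random initial centres drawn from the data (an assumption).
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.asarray(init, dtype=float).copy()

    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centre: (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Robust weights from the distance to the assigned centre.
        w = np.exp(-d2[np.arange(n), labels] / (2.0 * c**2))

        new_centers = centers.copy()
        for j in range(k):
            mask = labels == j
            total = w[mask].sum()
            if total > 0:
                # Weighted mean of the points assigned to cluster j.
                new_centers[j] = (w[mask, None] * X[mask]).sum(axis=0) / total

        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    return labels, centers


# Toy data: two tight groups plus one outlier at (5, 50).
X = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
    [5.0, 50.0],
])
labels, centers = robust_kmeans(X, k=2, c=1.0, init=[[0.0, 0.0], [5.0, 5.0]])
print(labels)
print(centers)
```

In this toy run the outlier is still assigned to the second cluster, but its weight is essentially zero, so the second centre stays close to (5, 5) instead of being dragged toward the outlier as the ordinary cluster mean would be. A known drawback of this kind of loss is sensitivity to initialization, which is why the example passes explicit initial centres.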

Journal

Journal of the Institute of Industrial Applications Engineers, vol. 6, no. 2, pp. 54-59, 2018-04-25

The Institute of Industrial Applications Engineers

Details

CRID: 1050298532705115264
NII Article ID: 40021795824, 120006462272, 40021795810
DOI: 10.12792/jiiae.6.60
ISSN: 21878811, 21881758
HANDLE: 2324/1924408
NDL BIB ID: 029492201, 029492216
Web Site:
- https://ndlsearch.ndl.go.jp/books/R000000004-I029492201
- https://ndlsearch.ndl.go.jp/books/R000000004-I029492216
Text Lang: en
Article Type: journal article
Data Source:
- IRDB
- NDL
- Crossref
- CiNii Articles
- KAKEN
