-
- Yu, Hengjun
- Department of Communication Design Science, Kyushu University
-
- Inoue, Kohei
- Department of Communication Design Science, Kyushu University
-
- Hara, Kenji
- Department of Communication Design Science, Kyushu University
-
- Urahama, Kiichi
- Department of Communication Design Science, Kyushu University
Bibliographic Information
- Other Title
-
- Two-step Variable Screening Method for the Mahalanobis-Taguchi Method with Small Training Data
Abstract
We propose a robust K-means clustering algorithm for document clustering. The input dataset is a document-term matrix, and the documents are clustered on the basis of the frequency of the terms that occur in each document. We introduce a robust loss function into K-means clustering to obtain its robust version, and we also propose a feature transform method for improving the performance of document clustering. Experimental results on one of the BBC datasets, which originate from BBC News, show that the proposed method improves both the robustness of K-means to outliers and the performance of document clustering.
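The abstract does not state which robust loss the authors use, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes a Welsch-type loss, 1 - exp(-d^2 / (2 sigma^2)), minimized by iteratively reweighted centroid updates; the function name `robust_kmeans`, the parameter `sigma`, and the choice of loss are all assumptions made for this example. The proposed feature transform is not shown, because the abstract gives no details about it.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def robust_kmeans(points, centers, sigma=1.0, iters=20):
    """K-means with a Welsch-type robust loss (an assumed choice of loss).

    Each centroid is a weighted mean of its members, where the weight
    exp(-d^2 / (2 * sigma^2)) shrinks toward zero for distant points,
    so outliers have little influence on the centroid.
    """
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid (the loss is monotone in distance).
        labels = [
            min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points
        ]
        # Update step: reweighted mean of the points assigned to each cluster.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if not members:
                continue  # keep the previous centroid for an empty cluster
            weights = [
                math.exp(-sq_dist(p, centers[k]) / (2.0 * sigma ** 2))
                for p in members
            ]
            total = sum(weights)
            if total == 0.0:
                continue  # all weights underflowed; keep the previous centroid
            centers[k] = [
                sum(w * p[d] for w, p in zip(weights, members)) / total
                for d in range(len(centers[k]))
            ]
    return centers, labels
```

In this sketch, each row of `points` would be one document's term-frequency vector from the document-term matrix. A point far from every centroid still receives a cluster label, but its weight is close to zero, so it barely moves the centroid, whereas in standard K-means it would pull the cluster mean toward itself.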
Journal
-
- Journal of the Institute of Industrial Applications Engineers
-
Journal of the Institute of Industrial Applications Engineers 6 (2), 54-59, 2018-04-25
The Institute of Industrial Applications Engineers
Details
-
- CRID
- 1050298532705115264
-
- NII Article ID
- 40021795824
- 120006462272
- 40021795810
-
- ISSN
- 21878811
- 21881758
-
- HANDLE
- 2324/1924408
-
- Text Lang
- en
-
- Article Type
- journal article
-
- Data Source
-
- IRDB
- NDL
- Crossref
- CiNii Articles
- KAKEN