BOOTSTRAPPING K-MEANS CLUSTERING

Abstract

Independent observations X_1, X_2, …, X_n are made on a distribution F on R^d. To divide these observations into k clusters, first choose a vector of optimal cluster centers b_n = (b_{n1}, b_{n2}, …, b_{nk}) to minimize [numerical formula] as a function of a = (a_1, a_2, …, a_k), then assign each observation to its nearest cluster center. Each b_{nj} is the mean of the observations in its cluster. Pollard (1982) obtained a central limit theorem for the means of the k clusters. In this paper, it is shown that the bootstrap distribution of the centered b_n has the same limiting distribution; the argument rests on the asymptotic behavior of empirical processes on Vapnik-Chervonenkis classes in a triangular array setting. Advantages of the bootstrap methods are discussed, and the performance of bootstrap confidence sets is compared with that of Pollard's confidence sets by Monte Carlo simulation.
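
The procedure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the elided criterion is the usual within-cluster sum of squares, W_n(a) = (1/n) Σ_i min_j ||X_i − a_j||², and it swaps in Lloyd's algorithm, which finds only a local minimizer, for the exact optimal centers b_n. The function names (`within_cluster_ss`, `lloyd_kmeans`, `bootstrap_centers`, `basic_intervals`) and the number of bootstrap replicates are hypothetical choices made for illustration.

```python
import numpy as np


def within_cluster_ss(X, centers):
    """Assumed criterion: mean squared distance to the nearest center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()


def lloyd_kmeans(X, init_centers, n_iter=100):
    """Lloyd's algorithm; returns a local (not necessarily global) minimizer."""
    centers = np.array(init_centers, dtype=float)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)  # assign each point to its nearest center
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers


def bootstrap_centers(X, k, n_boot=200, seed=0):
    """Return b_n and bootstrap replicates of sqrt(n) * (b*_n - b_n)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    init = X[rng.choice(n, size=k, replace=False)]
    b_n = lloyd_kmeans(X, init)
    deviations = np.empty((n_boot,) + b_n.shape)
    for r in range(n_boot):
        X_star = X[rng.integers(0, n, size=n)]  # resample with replacement
        # Starting from b_n keeps the cluster labels aligned across replicates.
        b_star = lloyd_kmeans(X_star, b_n)
        deviations[r] = np.sqrt(n) * (b_star - b_n)
    return b_n, deviations


def basic_intervals(b_n, deviations, n, level=0.95):
    """Coordinate-wise basic bootstrap intervals for the cluster centers."""
    alpha = 1.0 - level
    lo_q = np.quantile(deviations, alpha / 2, axis=0)
    hi_q = np.quantile(deviations, 1 - alpha / 2, axis=0)
    return b_n - hi_q / np.sqrt(n), b_n - lo_q / np.sqrt(n)
```

Calling `bootstrap_centers(X, k)` on an `(n, d)` array and passing the result to `basic_intervals` gives one interval per coordinate of each center. These coordinate-wise intervals are a simplification: the abstract refers to confidence sets, which may be joint regions for the whole vector b_n.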

Published in

Journal of the Japanese Society of Computational Statistics

Journal of the Japanese Society of Computational Statistics 3(1), 1-14, 1990-12

Japanese Society of Computational Statistics

Identifiers

  • NII Article ID (NAID):
    110001235576
  • NII Bibliographic ID (NCID):
    AA10823693
  • Text language code:
    ENG
  • ISSN:
    09152350
  • Source database:
    NII-ELS