Parallelization of Locally Linear SVM Using a Randomized Algorithm

河村勇太, 上原邦昭

Linear SVMs can be trained quickly, but their classification accuracy is low on data that is not linearly separable. Kernel SVMs, on the other hand, achieve high classification accuracy, but their training time becomes prohibitive as the feature dimensionality grows. Locally Linear SVM addresses both drawbacks: it achieves classification accuracy comparable to existing kernel SVMs while keeping the computational cost low. In this paper, we improve Locally Linear SVM and propose a parallelization method that further accelerates its training. The method trains on partitioned data, but differences arise between the parameters held by different nodes, which degrades classification accuracy. All-to-all communication between nodes would eliminate this parameter difference, but its communication cost is prohibitive, so the parallelization must limit communication to a small number of rounds. We combine the Cut-And-Stitch algorithm with a randomized algorithm and show experimentally that, with high probability, the training result is almost the same as that of ordinary Locally Linear SVM training.
