Bot Detection Using Machine Learning in a Cloud-Based CAPTCHA Service

荒井 毅, 岡部 寿男, 松本 悦宜, 川村 剛司

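In recent years, damage from unauthorized access using bots, such as password list attacks, has been increasing. CAPTCHA is used to prevent automated access by bots, but CAPTCHAs have grown more difficult in response to increasingly sophisticated bots, and the resulting loss of user convenience is regarded as a problem. One proposed solution is to apply bot detection technology to CAPTCHA and present a difficult CAPTCHA only to accesses with a high risk of bot use. In this study, we examined per-access estimation of bot-use risk for the purpose of changing CAPTCHA difficulty in Capy Puzzle CAPTCHA, a cloud-based CAPTCHA service widely used commercially. We attached a bot/non-bot flag to the access logs of the actual service based on attacks detected in the past, and attempted classification with supervised machine learning. We used XGBoost as the classification model, with features including reverse DNS lookup information, geographic information, UserAgent, and CAPTCHA response time. We also analyzed the False Positive data in the results and detected bot accesses among data to which no bot flag had been attached.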

In recent years, damage caused by unauthorized access using bots has increased. Compared with attacks on conventional login screens, such attacks have a high success rate and are difficult to detect. User convenience is considered to be declining because CAPTCHA difficulty rises as bots become more advanced. As a solution, applying bot detection technologies to CAPTCHA has been proposed. In this research, we focus on Capy Puzzle CAPTCHA, which is widely used as a commercial service. We examined how to estimate the risk of each access in order to change the difficulty of the CAPTCHA. Based on attacks detected in the past, we added a flag to each access indicating whether it came from a bot, and tried to predict the flags using supervised learning. We used XGBoost as the model, with the reverse DNS response, the HTTP User-Agent, and the CAPTCHA response time as features. Moreover, we analyzed the False Positive data and detected bots among the data that had no bot flag.
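The risk-based flow described in the abstract can be illustrated with a minimal sketch. It assumes that a classifier (XGBoost in the paper) has already produced a bot score for each access. The `Access` record, the 0.5 threshold, the function names, and the sample values are all hypothetical and are not taken from the paper or from the Capy service.

```python
from dataclasses import dataclass


@dataclass
class Access:
    access_id: str
    bot_score: float   # assumed classifier output: estimated probability of bot use
    flagged_bot: bool  # label derived from past attack detections


def choose_difficulty(bot_score: float, threshold: float = 0.5) -> str:
    """Return 'hard' for high-risk accesses and 'easy' otherwise (hypothetical policy)."""
    return "hard" if bot_score >= threshold else "easy"


def false_positives(accesses: list[Access], threshold: float = 0.5) -> list[Access]:
    """Accesses predicted as bots but not flagged: candidates for undetected bot use."""
    return [a for a in accesses if a.bot_score >= threshold and not a.flagged_bot]


# Illustrative records only; not real log data.
accesses = [
    Access("a1", 0.92, True),
    Access("a2", 0.81, False),
    Access("a3", 0.07, False),
]

for a in accesses:
    print(a.access_id, choose_difficulty(a.bot_score))

print([a.access_id for a in false_positives(accesses)])
```

In this sketch, the accesses returned by `false_positives` correspond to the unflagged data that the abstract examines for bot use; any real threshold would need to be tuned against the cost of showing a hard CAPTCHA to human users.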
