書誌事項
- タイトル別名
-
- Mining Asynchronous Interesting Sequential Patterns based on Frequency and Self-Information
抄録
In this paper, we propose new methods and gave a system, called IFMAP , for extracting interesting patterns from a long sequential data based on frequency and self-information, and experimentally evaluate the proposed methods in the application of handling a newspaper article corpus.<br>Sequential data mining methods based on frequency have intensively beenstudied so far. These methods, however, are not effective nor valuable for some applications where almost all high-frequent patterns should beregarded just as meaningless noisy patterns.<br> An information-gain concept is quite important in order to restrain these noisy patterns, and was already studied for integrating it with a frequency criteria. Yang et.~al. gave a sequential mining system InfoMiner which can find periodic synchronous patterns being interesting and well-balanced from the both view-points of frequency and self-information. <br> In this paper, we refine and extend the InfoMiner technologies in the following points: firstly, our method can handle ordinary, i.e., asynchronous and non-periodic patterns by using a sliding window mechanism, whereas InfoMiner cannot; secondly we give several combination measures for choosing valuable patterns based on frequency and self-information, while InfoMiner has just one measure which, we show in this paper, is not appropriate nor effective for handling newspaper article corpora; thirdly, we proposed a new unified method for pruning the search space of sequential data mining, which can uniformally be applied to any combination measures proposed here. <br> We conduct experiments for evaluating the effectiveness and efficiency of the proposed method with respect to the runtime and the amount of excluding noisy patterns.
収録刊行物
-
- 人工知能学会論文誌
-
人工知能学会論文誌 25 (3), 464-474, 2010
一般社団法人 人工知能学会
- Tweet
詳細情報 詳細情報について
-
- CRID
- 1390001205106787584
-
- NII論文ID
- 130000263868
-
- ISSN
- 13468030
- 13460714
-
- 本文言語コード
- ja
-
- データソース種別
-
- JaLC
- Crossref
- CiNii Articles
- KAKEN
-
- 抄録ライセンスフラグ
- 使用不可