Abstract
Fast string pattern matching over large-scale data can be achieved with a suffix array (SA), but an SA requires a large amount of space. Various methods for compressing the SA have therefore been proposed. In this paper, we combine an SA with an inverted index, obtaining an index that searches frequently occurring phrases faster than existing indexes. Experiments show that the proposed method performs similarly to a different method we proposed previously 5), and that its search time can be tuned to some extent.
Journal
- IPSJ SIG Notes
- IPSJ SIG Notes 2010-AL-128(1), 1-6, 2010-01-19
Information Processing Society of Japan (IPSJ)