Faster Sparse Suffix Tree Construction on Word RAM

高木 拓也, 上村 卓史, 有村 博紀

長さ N のテキストの K ≤ N 個の索引点に対する接尾辞木を疎接尾辞木 (sparse suffix tree) といい， O(K) 語の領域しか使用しないため，さまざまな応用に用いられている．上村と有村 (Proc. CPM2011, LNCS 6661, 2011) は，長さ N ビットのハフマン圧縮テキストが入力として与えられたとき，その疎接尾辞木を O(σ) 前処理時間と O(K+σ) 語の領域を用いて，オンライン構築するアルゴリズムを与えている．本稿では，ビット並列計算と簡潔トライ構造からなる詰め込み文字列 (packed string) 技法を用いて，長さ O(N) ビットのハフマン圧縮テキストに対して，疎接尾辞木を O(σ) 前処理時間と O(K+σ) 語の領域を用いて， O(⌈N/w⌉√w+K√w) 時間でオンライン構築するアルゴリズムを与える．ここに， w は計算機のレジスタ長 (ビット) であり， σ は符号中の符号語長さの総和である．これは，疎接尾辞木のオンライン構築で初めて， O(N) 時間より少ない計算時間を達成したアルゴリズムである．提案手法は， K≦O(N/√w) のときに従来手法より高速である．また，一般の有限接頭符号上や，単語アルファベット上の符号化テキストに対しても拡張可能である．We present an efficient algorithm on Word RAM for constructing a sparse suffix tree on an encoded text over a regular prefix-code in O(⌈N/w⌉ √w+K√w) time using O(σ)preprocessing and O(K + σ) word space, where N is the length of the text in base letters, K is the length of the text in code words, σ is the size of a base alphabet Σ, σ is the total size of a code alphabet on Σ, and w is a bit-length of a register of Word RAM.

Faster Sparse Suffix Tree Construction on Word RAM

Bibliographic Information

Search this article

Abstract

Journal

Details 詳細情報について

Export

Report a problem

Faster Sparse Suffix Tree Construction on Word RAM

Bibliographic Information

Search this article

Abstract

Journal

Details 詳細情報について

Export

Report a problem

Project list