Online Grammar-Based Self-Index and Its Applications オンライン文法圧縮に基づく自己索引とそのアプリケーション

著者

    • 髙畠, 嘉将

書誌事項

タイトル

Online Grammar-Based Self-Index and Its Applications

タイトル別名

オンライン文法圧縮に基づく自己索引とそのアプリケーション

著者名

髙畠, 嘉将

学位授与大学

九州工業大学

取得学位

博士(情報工学)

学位授与番号

甲情工第322号

学位授与年月日

2017-03-24

注記・抄録

1 Introduction

2 Preliminaries

3 Structure of the OESP-index

4 Online pattern matching for string edit distance with moves

5 siEDM: an efficient string index and search algorithm for edit distance with moves

6 Online grammar compression for frequent pattern discovery

7 Conclusions and future work

Text collections including many repetitions of substrings, called highly repetitive text collections, have been increasing in various fields like the version control system and the genome database. For the efficient use of such texts collections, the importance of the compression algorithm and compressed indexes is rapidly increasing more and more. The grammar-based self-indexes are suitable for the problem of information retrieval on a compressed text because they support the random access on the compressed text. However, the constructing algorithms of existing grammar-based self-indexes are offline, that is, these algorithms require the whole input beforehand. Therefore, the memory consumption depends on the size of input text explicitly. In order to overcome this difficulty, we propose the first online algorithm for grammar-based self-index, called Online Edit-Sensitive Parsing index (OESP-index, for short). The proposed algorithm directly transforms the input text into the corresponding variable-length encoded string reading the input symbol one-by-one. Compared to the existing self-indexes, the memory consumption of our online algorithm depends on the size of output, that is, the size of compressed text. Additionally, we also present three applications based on the grammar-based compression: (i) the online pattern matching problem for string edit-distance with moves called online ESP (OESP); (ii) the string index for edit-distance with moves called siEDM; (iii) the online grammar compression for frequent pattern discovery in smaller space. For these applications, we demonstrate the performance of our algorithm for large-scale data.

九州工業大学博士学位論文 学位記番号:情工博甲第322号 学位授与年月日:平成29年3月24日

平成28年度

1 Introduction||2 Preliminaries||3 Structure of the OESP-index||4 Online pattern matching for string edit distance with moves||5 siEDM: an efficient string index and search algorithm for edit distance with moves||6 Online grammar compression for frequent pattern discovery||7 Conclusions and future work

九州工業大学博士学位論文(要旨)学位記番号:情工博甲第322号 学位授与年月日:平成29年3月24日

目次

  1. 2017-10-02 再収集 (2コマ目)
  2. 2023-03-04 再収集 (3コマ目)
  3. 2023-08-05 再収集 (4コマ目)
  4. 2023-10-11 再収集 (5コマ目)
20アクセス

各種コード

  • NII論文ID(NAID)
    500001036571
  • NII著者ID(NRID)
    • 8000001616786
  • DOI(JaLC)
  • DOI
  • 本文言語コード
    • eng
  • データ提供元
    • 機関リポジトリ
    • NDLデジタルコレクション
ページトップへ