A Flexible Text Matching using Local Alignments (ローカルアラインメントを用いたテキスト間の柔軟な対応付け) [in Japanese]

Author(s)

    • IWAYAMA Makoto (岩山 真)
      Precision and Intelligence Laboratory, Tokyo Institute of Technology
    • SHINMORI Akihiko (新森 昭宏)
      Department of Computational Intelligence and Systems Sciences, Tokyo Institute of Technology

Abstract

We propose a method for aligning one text with another when the partial alignments include crossovers and overlaps, a case that conventional DP matching handles poorly. The method has two characteristics. First, it introduces the concept of local alignment, that is, alignment between substrings of the two texts, and uses dynamic programming (local-alignment DP matching) to enumerate the possible local alignments. Second, it ranks the local alignments and extracts sub-optimal ones in addition to the optimal one for use in the matching. The former retains the advantages of DP matching, namely optimized similarity and reduced computation, and so enumerates local alignments efficiently; the latter yields flexible text matching in which the partial alignments may cross over and overlap. As an application, we align the "claims" with the "embodiments" (the detailed description of the invention) in the full text of published patent applications, and discuss the method's effectiveness.
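
The two ideas in the abstract (local alignment by dynamic programming, then ranking of optimal and sub-optimal alignments) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Smith-Waterman style recurrence over word tokens, hypothetical scoring values (`match=2`, `mismatch=-1`, `gap=-1`), and a simple strategy of re-running the DP with already-used cells blocked to obtain the next-best alignment. The function names and the toy token lists are invented for illustration.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]
# (score, (a_start, a_end), (b_start, b_end)) with half-open token ranges
Hit = Tuple[int, Tuple[int, int], Tuple[int, int]]


def best_local_alignment(a: List[str], b: List[str], blocked: Set[Cell],
                         match: int = 2, mismatch: int = -1, gap: int = -1):
    """Return (score, path) of the best local alignment avoiding blocked cells."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if (i, j) in blocked:
                continue  # cell used by an earlier alignment: score stays 0
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score drops to zero.
    path: List[Cell] = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        path.append((i, j))
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, path


def ranked_local_alignments(a: List[str], b: List[str],
                            k: int = 3, min_score: int = 2) -> List[Hit]:
    """Collect up to k local alignments in descending score order."""
    blocked: Set[Cell] = set()
    hits: List[Hit] = []
    for _ in range(k):
        score, path = best_local_alignment(a, b, blocked)
        if score < min_score or not path:
            break
        (i_end, j_end), (i_start, j_start) = path[0], path[-1]
        hits.append((score, (i_start - 1, i_end), (j_start - 1, j_end)))
        blocked.update(path)  # force the next pass to find a different alignment
    return hits


# Toy example (hypothetical data): the two shared phrases occur in opposite order.
claim = "sensor detects light and motor drives wheel".split()
description = "motor drives wheel while sensor detects light".split()
hits = ranked_local_alignments(claim, description)
for score, (a0, a1), (b0, b1) in hits:
    print(score, claim[a0:a1], description[b0:b1])
```

In the toy example the phrase that comes first in `claim` comes last in `description`, so the two reported alignments cross over. A single global DP alignment, being monotonic, could match only one of the two phrases. Blocking only the cells on each path, rather than whole rows or columns, leaves later alignments free to overlap earlier ones in either text.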

Journal

  • IPSJ SIG Notes

    IPSJ SIG Notes 2002(87(2002-NL-151)), 23-28, 2002-09-17

    Information Processing Society of Japan (IPSJ)

References:  10

Cited by:  1

Codes

  • NII Article ID (NAID)
    110002934362
  • NII NACSIS-CAT ID (NCID)
    AN10115061
  • Text Lang
    JPN
  • Article Type
    Journal Article
  • ISSN
    09196072
  • NDL Article ID
    6313563
  • NDL Source Classification
    ZM13 (Science and Technology -- Science and Technology in General -- Data Processing and Computers)
  • NDL Call No.
    Z14-1121
  • Data Source
    CJP  CJPref  NDL  NII-ELS  IPSJ 