Resolving Overlapping Ambiguities and Selecting Correct Word Sequence in Chinese Using Internet Corpus.

Author(s)

    • Han Dongli
    • The University of Electro-Communications, Department of Computer Science
    • Furugori Teiji
    • The University of Electro-Communications, Department of Computer Science

Abstract

We propose an effective method for resolving overlapping ambiguities that arise in the sentential analysis of Chinese. The method detects the ambiguities with an FBMM scanner, resolves them using the relevancy value (RV), a statistical measure of word co-occurrence derived from textual data on the Internet, and selects the correct word sequence for the sentence being analyzed. When RVs alone are insufficient to resolve the ambiguities and choose the correct word sequence, we also use contextual information. An experiment on selecting the desired sequences shows a success rate of about 85%. This result is convincing and far better than those reported in other comparable studies.
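
The detection step described in the abstract can be illustrated with a minimal sketch. It assumes that "FBMM" refers to forward and backward maximum matching, which the abstract does not spell out, and it flags a sentence as ambiguous when the two scans disagree. The function names, the toy lexicon, and the sample sentence are hypothetical, and the RV-based resolution step is not shown because the abstract does not give its formula.

```python
def forward_mm(text, lexicon, max_len):
    """Greedy longest-match segmentation, scanning left to right."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def backward_mm(text, lexicon, max_len):
    """Greedy longest-match segmentation, scanning right to left."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                words.append(piece)
                j -= size
                break
    words.reverse()
    return words


def scan(text, lexicon):
    """Return both segmentations and whether they disagree (assumed ambiguity signal)."""
    max_len = max(len(w) for w in lexicon)
    fwd = forward_mm(text, lexicon, max_len)
    bwd = backward_mm(text, lexicon, max_len)
    return fwd, bwd, fwd != bwd


# Hypothetical toy lexicon and sentence, not taken from the paper.
LEXICON = {"研究", "研究生", "生命", "起源"}
fwd, bwd, ambiguous = scan("研究生命起源", LEXICON)
print(fwd, bwd, ambiguous)
```

In this toy case the forward scan takes "研究生" first and the backward scan takes "生命" first, so the two candidate sequences overlap on "生". A resolver such as the RV measure would then choose between them.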

Journal

  • Journal of Natural Language Processing

    Journal of Natural Language Processing 8(3), 107-121, 2001

    The Association for Natural Language Processing

Codes

  • NII Article ID (NAID)
    130004292160
  • Text Lang
    ENG
  • ISSN
    1340-7619
  • Data Source
    J-STAGE 