Resolving Overlapping Ambiguities and Selecting Correct Word Sequence in Chinese Using Internet Corpus.
We propose an effective method for resolving overlapping ambiguities that arise in the sentential analysis of Chinese. The method detects ambiguities with an FBMM scanner, resolves them using the relevancy value (RV), a statistical measure of word co-occurrence derived from textual data on the Internet, and selects the correct word sequence for the sentence being analyzed. When RVs alone are insufficient to resolve an ambiguity and choose the correct word sequence, we also use contextual information. In an experiment on selecting the desired sequences, the method achieves a success rate of about 85%. This result is convincing and far better than those in other comparable studies.
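The detection step can be illustrated with a minimal sketch. It assumes that FBMM refers to forward and backward maximum matching, a common dictionary-based segmentation technique; the abstract does not spell out the scanner's details, so the function names, the toy lexicon, and the `max_len` parameter below are all hypothetical. The RV formula is not given here either, so the selection step takes a caller-supplied scoring function as a stand-in rather than implementing RV itself.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation, longest dictionary word first."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            cand = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or cand in lexicon:
                words.append(cand)
                i += size
                break
    return words


def backward_max_match(text, lexicon, max_len=4):
    """Greedy right-to-left segmentation, longest dictionary word first."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            cand = text[j - size:j]
            if size == 1 or cand in lexicon:
                words.append(cand)
                j -= size
                break
    words.reverse()
    return words


def detect_ambiguity(text, lexicon, max_len=4):
    """Return both segmentations and whether they disagree."""
    fwd = forward_max_match(text, lexicon, max_len)
    bwd = backward_max_match(text, lexicon, max_len)
    return fwd, bwd, fwd != bwd


def select_sequence(candidates, score):
    """Pick the candidate with the highest score.

    `score` is a placeholder for a co-occurrence measure such as RV;
    its definition is an assumption left to the caller.
    """
    return max(candidates, key=score)


# Toy lexicon (hypothetical) with a classic overlapping ambiguity.
LEXICON = {"研究", "研究生", "生命", "起源"}
fwd, bwd, ambiguous = detect_ambiguity("研究生命起源", LEXICON)
```

With this toy lexicon the two scans disagree on where the first word ends, which is the kind of overlap a disambiguation measure then has to settle. Passing both candidates to `select_sequence` with a real co-occurrence score would complete the pipeline described in the abstract.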
Journal of Natural Language Processing 8(3), 107-121, 2001
The Association for Natural Language Processing