Abstract
text
During the process of unknown word detection in Chinese word segmentation, many detected word candidates are invalid. These false unknown word candidates deteriorate the overall segmentation accuracy, as it will affect the segmentation accuracy of known words. Therefore, we propose to eliminate as many invalid word candidates as possible by a pruning process. Our experiments show that by cutting down the invalid unknown word candidates, we improve the segmentation accuracy of known words and hence that of the overall segmentation accuracy.