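Character Segmentation and Recognition of Alphanumeric-mixed Documents Based on Pattern Recognition Information (認識情報を利用した英数字混在文書からの文字切出しと認識)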

本郷 保夫

doi:10.1541/ieejeiss1987.122.6_928

Fuji Electric (富士電機)

Bibliographic information

Alternative title

Character Segmentation and Recognition of Alphanumeric-mixed Documents Based on Pattern Recognition Information
Title reading (katakana): ニンシキジョウホウオリヨウシタエイスウジコンザイブンショカラノモジキリダシトニンシキ

Abstract

Japanese OCR systems generally have difficulty reading Japanese documents that also contain alphanumeric text, because proportionally spaced alphanumeric characters are embedded in the fixed-pitch layout of the Japanese text.

This paper describes how to extract character candidates from combinations of small patterns, which may be components of separable Japanese characters or narrow patterns such as alphanumeric characters, and how to select the true character patterns from those candidates. We propose a new segmentation and recognition method for alphanumeric-mixed documents based on pattern recognition information such as similarities, pattern sizes, and character types.

The method was tested on alphanumeric-mixed documents: 51 pages from technical journals and transactions containing 68,867 characters. The resulting segmentation rate was 99.75% and the recognition rate was 99.05%, so we conclude that this method may be applied to Japanese OCR.
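The idea in the abstract of building character candidates from neighboring small patterns and then choosing among them by recognition score can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not spell out; it is a generic dynamic-programming search over candidate groupings. The function name `segment`, the `max_merge` limit, the length-weighted scoring, and the toy similarity table are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def segment(n: int,
            score: Callable[[int, int], float],
            max_merge: int = 3) -> List[Tuple[int, int]]:
    """Group n left-to-right patterns into characters.

    score(i, j) is a hypothetical recognizer similarity in [0, 1] for the
    candidate built from patterns i..j-1. Each candidate is weighted by the
    number of patterns it covers so that merging is not penalized merely
    for producing fewer characters (an assumption of this sketch).
    """
    best = [float("-inf")] * (n + 1)  # best total score covering patterns 0..j-1
    back = [0] * (n + 1)              # start index of the last candidate used
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(max(0, j - max_merge), j):
            total = best[i] + score(i, j) * (j - i)
            if total > best[j]:
                best[j] = total
                back[j] = i
    # Walk the back-pointers to recover the chosen candidates.
    result = []
    j = n
    while j > 0:
        result.append((back[j], j))
        j = back[j]
    return result[::-1]


# Toy data (invented): patterns 0 and 1 are the two halves of a separable
# Japanese character; patterns 2 and 3 are narrow digits.
similarity = {
    (0, 1): 0.4, (1, 2): 0.3, (0, 2): 0.95,
    (2, 3): 0.9, (3, 4): 0.9, (2, 4): 0.2, (1, 3): 0.1,
}


def toy_score(i: int, j: int) -> float:
    # Candidates missing from the table are treated as unrecognizable.
    return similarity.get((i, j), 0.0)


print(segment(4, toy_score))  # [(0, 2), (2, 3), (3, 4)]
```

In this toy run the two halves are merged into one character while the two digits stay separate. A real system would presumably also fold pattern size and character type into the score, as the abstract indicates.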

Journal

電気学会論文誌C (IEEJ Transactions on Electronics, Information and Systems) 122 (6), 928-935, 2002

The Institute of Electrical Engineers of Japan (一般社団法人電気学会)

References (18)

Details

CRID: 1390001204609919104
NII article ID: 130006845174, 10008508962
NII bibliographic ID: AN10065950
DOI: 10.1541/ieejeiss1987.122.6_928
ISSN: 13488155, 03854221
NDL bibliographic ID: 6174783
Web sites:
https://ndlsearch.ndl.go.jp/books/R000000004-I6174783
https://www.jstage.jst.go.jp/article/ieejeiss1987/122/6/122_6_928/_pdf
Data source types:
- JaLC
- NDL
- Crossref
- CiNii Articles
Abstract license flag: Disallowed
