CJKV information processing



Ken Lunde

O'Reilly, 1999

University library holdings: 24



Includes bibliographical references and index



"CJKV Information Processing" is the definitive guide for tackling the difficult issues faced when dealing with complex Asian languages - Chinese, Japanese, Korean, and Vietnamese - in the context of computing or Internet services. Unlike the English alphabet with a mere 26 letters, these complex writing systems use multiple alphabets comprising thousands of characters. Handling such an unwieldy amount of data is formidable and complex. Until now, working with these writing systems was an unattainable task to most, but this book clarifies the issues, even to those who don't understand East Asian languages. This book contains revised information from Ken Lunde's first book, "Understanding Japanese Information Processing", and supplements each chapter with meticulous details about how the Chinese (hanzi), Japanese (kana and kanji), Korean (hangul and hanja), and Vietnamese (Quoc ngu, chu Nom, and chu Han) writing systems have been implemented on contemporary computer systems. This book is unique in that it does not simply rattle off information that can be found in other sources, but rather it provides the reader with hitherto unexplained insights into how these complex writing systems have been adapted for use on computers, and provides the user and developer alike with useful and time-saving tips and techniques. Information on today's hot topics, such as how these writing systems impact contemporary Internet resources like the Web, HTML, XML, Java, and Adobe Acrobat, is also provided.


Foreword
Preface
1. CJKV Information Processing Overview: Multiple Writing Systems; Character Set Standards; Encoding Methods; Input Methods; Typography; Basic Concepts and Terminology
2. Writing Systems: Latin Characters and Transliteration; Zhuyin; Kana; Hangul; Chinese Characters; Non-Chinese Chinese Characters
3. Character Set Standards: Non-Coded Character Set Standards; Coded Character Set Standards; International Character Set Standards; Character Set Standard Oddities; Non-Coded Versus Coded Character Sets; Information Interchange Versus Professional Publishing; Advice to Developers
4. Encoding Methods: Locale-Independent Encoding Methods; Locale-Specific Encoding Methods; Comparing CJKV Encoding Methods; International Encoding Methods; Charset Designations; Code Pages; Code Conversion; Repairing Unreadable CJKV Text; Beware of Little and Big Endian Issues; Advice to Developers
5. Input Methods: Transliteration Techniques; Input Techniques; User Interface Concerns; Keyboard Arrays; Other Input Hardware; Input Method Software
6. Font Formats: Typeface Design Issues; Bitmapped Fonts; Outline Fonts; Ruby Fonts; Host-Based Versus Printer-Resident Fonts; Creating Your Own Fonts; External Character Handling; Advice to Developers
7. Typography: Rules, Rules, Rules...; Typographic Units and Measurements; Horizontal and Vertical Layout; Line Breaking and Word Wrapping; Character Spanning; Alternate Metrics; Kerning; Line Length Issues; Multilingual Text; Glyph Substitution; Annotations; Typographic Software
8. Output Methods: Where Can Fonts Live?; Printer Output; PostScript CJKV Printers; Computer Monitor Output; Other Printing Methods; The Role of Printer Drivers; Output Tips and Tricks; Advice to Developers
9. Information Processing Techniques: Language, Country, and Script Codes; Programming Languages; Code Conversion Algorithms; Java Programming Examples; Miscellaneous Algorithms; Byte Versus Character Handling; Character Sorting; Natural Language Processing; Regular Expressions; Search Engines; Code Processing Tools
10. Operating Systems, Text Editors, and Word Processors: Viewing CJKV Text on Non-CJKV Systems; Operating Systems; Hybrid Environments; Text Editors; Word Processors; Dedicated Word Processors
11. Dictionaries and Dictionary Software: Chinese Character Dictionary Indexes; Character Dictionaries; Other Useful Dictionaries; Dictionary Hardware; Dictionary Software; Machine Translation Software; Machine Translation Services; Learning Aids
12. The Internet: Email; News; FTP and Telnet; Network Domains; Getting Connected; Internet Software
13. The World Wide Web: Content Versus Presentation; Displaying Web Documents; Authoring HTML Documents; Authoring XML Documents; Authoring PDF Documents; Character References; CGI Programming Examples; Shall We Surf?
A. Code Conversion Tables
B. Notation Conversion Table
C. Vendor Character Set Standards
D. Vendor Encoding Methods
E. GB 2312-80 Table
F. GB/T 12345-90 Table
G. CNS 11643-1992 Table
H. Big Five Table
I. Hong Kong GCCS Table
J. JIS X 0208:1997 Table
K. JIS X 0212-1990 Table
L. KS X 1001:1992 Table
M. KS X 1002:1991 Hanja Table
N. Hangul Reading Table
O. TCVN 6056:1995 Table
P. Code Table Indexes
Q. Character Lists and Mapping Tables
R. Chinese Character Lists
S. Single-Byte Code Tables
T. Software and Document Sources
U. Mailing Lists
V. Professional Organizations
W. Perl Code Examples
X. Glossary
Bibliography
Index

Source: Nielsen BookData