Corpus-based research into language : in honour of Jan Aarts
Bibliographic Information
Corpus-based research into language : in honour of Jan Aarts
(Language and computers : studies in practical linguistics, no. 12)
Rodopi, 1994
Available at 40 libraries
Note
Bibliography: p. [255]-276
Description and Table of Contents
Description
For over two decades Jan Aarts has been actively involved in corpus linguistic research. He was the instigator of a large number of projects, and he was responsible for what has become known as the Nijmegen approach to corpus linguistics. It is thanks to him that words like TOSCA and LDB have become household names in the corpus linguistic community.
The present volume has been collected in his honour. The contributions in it cover a wide range of topics in the field of corpus linguistic research, especially those in which Jan Aarts takes a keen interest: corpus encoding and tagging, parsing and databases, and the linguistic exploration of corpus data. The contributions in this volume discuss work done in this field outside Nijmegen, for the obvious reason that we do not wish to present him with a report on work in which he is himself involved.
Table of Contents
Flor AARTS: A tribute to Jan Aarts.
Nelleke OOSTDIJK and Pieter de HAAN: Introduction.
PART I: THE ENCODING AND TAGGING OF CORPORA
  Stig JOHANSSON: Continuity and change in the encoding of computer corpora.
  Sidney GREENBAUM and Ni YIBIN: Tagging the British ICE Corpus: English word classes.
  Geoffrey LEECH, Roger GARSIDE, and Michael BRYANT: The large-scale grammatical tagging of text: Experience with the British National Corpus.
  Willem MEIJS: Computerized lexicons and theoretical models.
  Louise GUTHRIE, Joe GUTHRIE, and Jim COWIE: Resolving lexical ambiguity.
PART II: PARSING AND DATABASES
  Ted BRISCOE: Prospects for practical parsing of unrestricted text: Robust statistical parsing techniques.
  Fred KARLSSON: Robust parsing of unconstrained text.
  Clive SOUTER and Eric ATWELL: Using parsed corpora: A review of current practice.
  Ezra BLACK: An experiment in customizing the Lancaster Treebank.
  Geoffrey SAMPSON: SUSANNE: A Domesday Book of English grammar.
  William GALE and Kenneth CHURCH: What is wrong with adding one?
PART III: LINGUISTIC EXPLORATION OF THE DATA
  Douglas BIBER and Edward FINEGAN: Intra-textual variation within medical research articles.
  Bengt ALTENBERG: On the functions of such in spoken and written English.
  Anna-Brita STENSTROEM and Jan SVARTVIK: Imparsable speech: Repeats and other nonfluencies in spoken English.
References.
List of contributors.
by "Nielsen BookData"