Corpus Linguistics and Japanese Language Research (コーパス言語学と日本語研究)

後藤 斉, Hitosi GOTOO

doi:10.15084/00002182

Tohoku University (東北大学)

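This paper gives a historical overview that contrasts the situation in Britain, where corpus linguistics has developed furthest, with the position of corpus research in Japan, in order to identify the factors behind their different development and to gain some outlook for the future. Corpus linguistics developed in Britain chiefly because it was in line with the prevailing current of linguistic research there; several other language-internal and language-external factors also contributed. In Japan, by contrast, computer-aided language research has a long history, but it did not lead to a refinement of the concept of a corpus, and at present there is no representative corpus that researchers in the humanities can share. Even the inadequate corpora now available can be used for research in areas such as semantics, but it is highly significant that the National Institute for Japanese Language (国立国語研究所) has begun building the Balanced Corpus of Contemporary Written Japanese (現代日本語書き言葉均衡コーパス). To take full advantage of it, however, users too will need to make active efforts of their own.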

Linguistics in Japan has failed to develop corpus-based language studies into corpus linguistics, in spite of the long history of computer-based mathematical linguistics dating from the 1960s and sporadic contacts with English corpus linguistics since the 1980s. This contrasts with the situation in Britain, where corpus linguistics has been established since the early 1980s, with grammatical and lexicological studies as its main foci of interest. It is noteworthy that there is no Japanese corpus available to researchers that could safely be claimed to be representative, so that researchers are now obliged to use a haphazard collection of electronic texts as a corpus. The usefulness of such a corpus is evident, as is shown in a tentative case study, but it is inevitably limited. A representative corpus would serve linguistic research better. The Balanced Corpus of Contemporary Written Japanese project, now being undertaken by the National Institute for Japanese Language, is expected to fill this need and is clearly welcome. It should be noted, however, that in order to take full advantage of a corpus, users will have to make efforts to acquire knowledge of the techniques and basic facts of text processing.
