A Comprehensive Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation

Abstract

Neural machine translation (NMT) has shown very promising results when large parallel corpora are available. However, for low-resource domains, vanilla NMT cannot achieve satisfactory performance because it overfits the small parallel corpora. Two categories of domain adaptation approaches have been proposed for low-resource NMT: adaptation using out-of-domain parallel corpora and adaptation using in-domain monolingual corpora. In this paper, we conduct a comprehensive empirical comparison of the methods in both categories. For domain adaptation using out-of-domain parallel corpora, we further propose a novel domain adaptation method named mixed fine tuning, which combines two existing methods, namely fine tuning and multi-domain NMT. For domain adaptation using in-domain monolingual corpora, we compare two existing methods, namely language model fusion and synthetic data generation. In addition, we propose a method that combines these two categories. We empirically compare all the methods and discuss their benefits and shortcomings. To the best of our knowledge, this is the first work to comprehensively and empirically compare domain adaptation methods for NMT.
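In the mixed fine tuning method named above, a model is first trained on the out-of-domain corpus and then fine-tuned on a mixture of the out-of-domain data and the oversampled in-domain data, with artificial domain tags on the source side as in multi-domain NMT. The sketch below illustrates only the data preparation step under assumed conventions; the file names, the tag strings ("<2in>", "<2out>"), and the helper functions are illustrative, not the paper's released code.

```python
# A minimal sketch of mixed fine tuning data preparation, assuming
# plain-text parallel corpora stored one sentence per line.
import random

def read_corpus(src_path, tgt_path):
    """Load a parallel corpus as a list of (source, target) sentence pairs."""
    with open(src_path, encoding="utf-8") as f_src, \
         open(tgt_path, encoding="utf-8") as f_tgt:
        return [(s.strip(), t.strip()) for s, t in zip(f_src, f_tgt)]

def tag_corpus(pairs, domain_tag):
    """Prepend an artificial domain token to each source sentence,
    following the multi-domain NMT convention."""
    return [(f"{domain_tag} {src}", tgt) for src, tgt in pairs]

def oversample(pairs, target_size):
    """Repeat the smaller in-domain corpus (sampling with replacement)
    until it roughly matches the out-of-domain corpus size."""
    if len(pairs) >= target_size:
        return list(pairs)
    return pairs + random.choices(pairs, k=target_size - len(pairs))

# Hypothetical file names for illustration.
out_domain = tag_corpus(read_corpus("out.src", "out.tgt"), "<2out>")
in_domain = tag_corpus(read_corpus("in.src", "in.tgt"), "<2in>")

# Stage 1: train the NMT model on out_domain until convergence.
# Stage 2: resume (fine-tune) training on the balanced mixture below.
mixed = out_domain + oversample(in_domain, len(out_domain))
random.shuffle(mixed)
```

Oversampling the in-domain corpus keeps the fine-tuning mixture balanced, so the small in-domain data is not drowned out by the much larger out-of-domain data during the second training stage.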
