Parser-based retraining for domain adaptation of probabilistic generators
Hogan, Deirdre; Foster, Jennifer; Wagner, Joachim; van Genabith, Josef
While the effect of domain variation on Penn-Treebank-trained probabilistic parsers has been investigated in previous work, we study its effect on a Penn-Treebank-trained probabilistic generator. We show that applying the generator to data from the British National Corpus results in a performance drop (from a BLEU score of 0.66 on the standard WSJ test set to a BLEU score of 0.54 on our BNC test set). We develop a generator retraining method where the domain-specific training data is automatically produced using state-of-the-art parser output. The retraining method recovers a substantial portion of the performance drop, resulting in a generator which achieves a BLEU score of 0.61 on our BNC test data.
Keyword(s): Machine translating; Penn-Treebank-trained probabilistic generator
Publication Date: 2008
Type: Conference item
Peer-Reviewed: Yes
Language(s): English
Institution: Dublin City University
Funder(s): Enterprise Ireland; Science Foundation Ireland; Irish Research Council for Science Engineering and Technology
Citation(s): Hogan, Deirdre and Foster, Jennifer and Wagner, Joachim and van Genabith, Josef (2008) Parser-based retraining for domain adaptation of probabilistic generators. In: INLG 08 - 5th International Natural Language Generation Conference, 12-14 June 2008, Salt Fork, Ohio, USA.
Publisher(s): Association for Computational Linguistics
File Format(s): application/pdf
First Indexed: 2010-02-17 05:08:27
Last Updated: 2015-03-23 05:23:03