-ing words in RBMT: multilingual evaluation and exploration of pre- and post-processing solutions
Aranberri Monasterio, Nora
This PhD dissertation falls within the domain of machine translation (MT) and focuses on the machine translation of IT-domain -ing words into four target languages: French, German, Japanese and Spanish. -ing words are claimed to be problematic for MT because of their linguistic flexibility, i.e. they can function as nouns, adjectives and verbs. This dissertation investigates how problematic they actually are and explores possible solutions for improving their MT output. A corpus-based approach is used to better represent the domain-specific structures in which -ing words occur. After a significant sample is selected, the -ing words are classified following a functional categorisation presented by Izquierdo (2006). The sample is machine-translated using a customised rule-based MT (RBMT) system. A feature-based human evaluation is then performed in order to obtain information about the specific feature under study. The results showed that 73% of the -ing words were correctly translated in terms of grammaticality and accuracy for German, Japanese and Spanish. The percentage for French was lower, at 52%. These data, combined with a thorough analysis of the MT output, allow for the identification of cross-language and language-specific issues and their characteristics, setting the path for improvement. The approaches to improvement examined cover both the pre- and post-processing stages of automated translation. For pre-processing, controlled language (CL) and automatic source re-writing (ASR) are explored and evaluated. For post-processing, global search and replace (Global S&R) and statistical post-editing (SPE) methods are tested. CL is reported to reduce -ing word ambiguity but not to achieve substantial machine translation improvement. Regex-based implementations of ASR and Global S&R show considerable translation improvements, ranging from 60% to 95%, and minimal degradation, ranging from 0% to 18%. The results for SPE show little improvement, or even degradation, at both sentence and -ing word level.
Keyword(s): Machine translating; -ing words; machine translation; evaluation
Publication Date: 2010
Type: Doctoral thesis
Peer-Reviewed: No
Language(s): English
Institution: Dublin City University
Funder(s): Enterprise Ireland
Citation(s): Aranberri Monasterio, Nora (2010) -ing words in RBMT: multilingual evaluation and exploration of pre- and post-processing solutions. PhD thesis, Dublin City University.
Publisher(s): Dublin City University. School of Applied Language and Intercultural Studies; Dublin City University. Centre for Translation and Textual Studies (CTTS)
File Format(s): application/pdf
Supervisor(s): O'Brien, Sharon
Related Link(s): http://doras.dcu.ie/15093/1/thesis_nora_aranberri_2009.pdf
First Indexed: 2010-03-30 05:08:44
Last Updated: 2015-03-23 05:23:29