Current Search:
All of 'Machine' and 'translating' in all fields;
259 items found
Displaying Results 1 - 25 of 259 on page 1 of 11
Comparing rule-based and data-driven approaches to Spanish-to-Basque machine translation
(2007)
Labaka, Gorka; Stroppa, Nicolas; Way, Andy; Sarasola, Kepa
Abstract:
In this paper, we compare the rule-based and data-driven approaches in the context of Spanish-to-Basque Machine Translation. The rule-based system we consider has been developed specifically for Spanish-to-Basque machine translation, and is tuned to this language pair. By contrast, the data-driven system we use is generic, and has not been specifically designed to deal with Basque. Spanish-to-Basque Machine Translation is a challenge for data-driven approaches for at least two reasons. First, there is a lack of bilingual data on which a data-driven MT system can be trained. Second, Basque is a morphologically-rich agglutinative language and translating to Basque requires generating a large amount of morphological information, a difficult task for a generic system not specifically tuned to Basque. We present the results of a series of experiments, obtained on two different corpora, one being “in-domain” and the other one “out-of-domain” with respect to the data-driven system. We show that ...
http://doras.dcu.ie/15228/
Learning labelled dependencies in machine translation evaluation
(2009)
He, Yifan; Way, Andy
Abstract:
Recently novel MT evaluation metrics have been presented which go beyond pure string matching, and which correlate better than other existing metrics with human judgements. Other research in this area has presented machine learning methods which learn directly from human judgements. In this paper, we present a novel combination of dependency- and machine learning-based approaches to automatic MT evaluation, and demonstrate greater correlations with human judgement than the existing state-of-the-art methods. In addition, we examine the extent to which our novel method can be generalised across different tasks and domains.
http://doras.dcu.ie/15159/
Tuning syntactically enhanced word alignment for statistical machine translation
(2009)
Ma, Yanjun; Lambert, Patrik; Way, Andy
Abstract:
We introduce a syntactically enhanced word alignment model that is more flexible than state-of-the-art generative word alignment models and can be tuned according to different end tasks. First of all, this model takes advantage of both unsupervised and supervised word alignment approaches by obtaining anchor alignments from unsupervised generative models and seeding the anchor alignments into a supervised discriminative model. Second, this model offers the flexibility of tuning the alignment according to different optimisation criteria. Our experiments show that using our word alignment in a Phrase-Based Statistical Machine Translation system yields a 5.38% relative increase on the IWSLT 2007 task in terms of BLEU score.
http://doras.dcu.ie/15158/
Statistical analysis of alignment characteristics for phrase-based machine translation
(2010)
Lambert, Patrik; Petitrenaud, Simon; Ma, Yanjun; Way, Andy
Abstract:
In most statistical machine translation (SMT) systems, bilingual segments are extracted via word alignment. However, there is a lack of systematic study as to what alignment characteristics can benefit MT under specific experimental settings such as the language pair or the corpus size. In this paper we produce a set of alignments by directly tuning the alignment model according to alignment F-score and BLEU score in order to investigate the alignment characteristics that are helpful in translation. We report results for a phrase-based SMT system on Chinese-to-English IWSLT data, and Spanish-to-English European Parliament data. With a statistical analysis of alignment characteristics that are correlated with BLEU score, we give alignment hints to improve BLEU score using a phrase-based SMT system and different types of corpus.
http://doras.dcu.ie/15790/
Lattice score based data cleaning for phrase-based statistical machine translation
(2010)
Jiang, Jie; Way, Andy; Carson-Berndsen, Julie
Abstract:
Statistical machine translation relies heavily on parallel corpora to train its models for translation tasks. While more and more bilingual corpora are readily available, the quality of the sentence pairs should be taken into consideration. This paper presents a novel lattice score-based data cleaning method to select proper sentence pairs from the ones extracted from a bilingual corpus by the sentence alignment methods. The proposed method is carried out as follows: firstly, an initial phrase-based model is trained on the full sentence-aligned corpus; then for each of the sentence pairs in the corpus, word alignments are used to create anchor pairs and source-side lattices; thirdly, based on the translation model, target-side phrase networks are expanded on the lattices and Viterbi searching is used to find approximated decoding results; finally, BLEU score thresholds are used to filter out the low-score sentence pairs for the data cleaning purpose. Our experiments on the FBIS corpus ...
http://doras.dcu.ie/15789/
Combining multi-domain statistical machine translation models using automatic classifiers
(2010)
Banerjee, Pratyush; Du, Jinhua; Li, Baoli; Kumar Naskar, Sudip; Way, Andy; van Genabith, Josef
Abstract:
This paper presents a set of experiments on Domain Adaptation of Statistical Machine Translation systems. The experiments focus on Chinese-English and two domain-specific corpora. The paper presents a novel approach for combining multiple domain-trained translation models to achieve improved translation quality for both domain-specific as well as combined sets of sentences. We train a statistical classifier to classify sentences according to the appropriate domain and utilize the corresponding domain-specific MT models to translate them. Experimental results show that the method achieves a statistically significant absolute improvement of 1.58 BLEU (2.86% relative improvement) score over a translation model trained on combined data, and considerable improvements over a model using multiple decoding paths of the Moses decoder, for the combined domain test set. Furthermore, even for domain-specific test sets, our approach works almost as well as dedicated domain-specific models and pe...
http://doras.dcu.ie/15804/
Accuracy-based scoring for phrase-based statistical machine translation
(2010)
Penkale, Sergio; Ma, Yanjun; Galron, Daniel; Way, Andy
Abstract:
Although the scoring features of state-of-the-art Phrase-Based Statistical Machine Translation (PB-SMT) models are weighted so as to optimise an objective function measuring translation quality, the estimation of the features themselves does not have any relation to such quality metrics. In this paper, we introduce a translation quality-based feature to PB-SMT in a bid to improve the translation quality of the system. Our feature is estimated by averaging the edit-distance between phrase pairs involved in the translation of oracle sentences, chosen by automatic evaluation metrics from the N-best outputs of a baseline system, and phrase pairs occurring in the N-best list. Using our method, we report a statistically significant 2.11% relative improvement in BLEU score for the WMT 2009 Spanish-to-English translation task. We also report that using our method we can achieve statistically significant improvements over the baseline using many other MT evaluation metrics, and a substantial ...
http://doras.dcu.ie/15802/
Experiments on domain adaptation for patent machine translation in the PLuTO project
(2011)
Ceausu, Alexandru; Tinsley, John; Zhang, Jian; Way, Andy
Abstract:
The PLuTO project (Patent Language Translations Online) aims to provide a rapid solution for the online retrieval and translation of patent documents through the integration of a number of existing state-of-the-art components provided by the project partners. The paper presents some of the experiments on patent domain adaptation of the Machine Translation (MT) systems used in the PLuTO project. The experiments use the International Patent Classification for domain adaptation and are focused on the English–French language pair.
http://doras.dcu.ie/16412/
An example-based approach to translating sign language
(2005)
Morrissey, Sara; Way, Andy
Abstract:
Users of sign languages are often forced to use a language in which they have reduced competence simply because documentation in their preferred format is not available. While some research exists on translating between natural and sign languages, we present here what we believe to be the first attempt to tackle this problem using an example-based (EBMT) approach. Having obtained a set of English–Dutch Sign Language examples, we employ an approach to EBMT using the ‘Marker Hypothesis’ (Green, 1979), analogous to the successful system of (Way & Gough, 2003), (Gough & Way, 2004a) and (Gough & Way, 2004b). In a set of experiments, we show that encouragingly good translation quality may be obtained using such an approach.
http://doras.dcu.ie/15297/
Data-driven machine translation for sign languages
(2008)
Morrissey, Sara
Abstract:
This thesis explores the application of data-driven machine translation (MT) to sign languages (SLs). The provision of an SL MT system can facilitate communication between Deaf and hearing people by translating information into the native and preferred language of the individual. We begin with an introduction to SLs, focussing on Irish Sign Language - the native language of the Deaf in Ireland. We describe their linguistics and mechanics including similarities and differences with spoken languages. Given the lack of a formalised written form of these languages, an outline of annotation formats is discussed as well as the issue of data collection. We summarise previous approaches to SL MT, highlighting the pros and cons of each approach. Initial experiments in the novel area of example-based MT for SLs are discussed and an overview of the problems that arise when automatically translating these manual-visual languages is given. Following this we detail our data-driven approach, exa...
http://doras.dcu.ie/570/
Translating with examples: the LFG-DOT models of translation
(2003)
Way, Andy
Abstract:
Machine Translation (MT) systems based on Data-Oriented Parsing (DOP: Bod, 1998) and LFG-DOP (Bod & Kaplan, 1998) may be viewed as instances of Example-Based MT (EBMT). In both approaches, new translations are processed with respect to previously seen translations residing in the system's database. We describe DOT models for translation (Poutsma, 1998, Poutsma 2000) based on DOP. We demonstrate that DOT1 is not guaranteed to produce the correct translation, despite provably deriving the most probable translation. The DOT2 translation model solves most of the problems of DOT1, but suffers from limited compositionality when confronted with certain data. Notwithstanding the success of DOT2, any system based purely on trees will ultimately be found wanting as a general solution to the wide diversity of translation problems, as certain linguistic phenomena require a description at levels deeper than surface syntax. We then show how LFG-DOP can be extended to serve as a novel hy...
http://doras.dcu.ie/15320/
Translating with examples
(2001)
Way, Andy
Abstract:
Machine Translation (MT) systems based on Data-Oriented Parsing (DOP: Bod, 1998) and LFG-DOP (Bod & Kaplan, 1998) may be viewed as instances of Example-Based MT (EBMT). In both approaches, new translations are processed with respect to previously seen translations residing in the system's database. We describe the DOT models of translation (Poutsma 1998; 2000) based on DOP. We demonstrate that DOT1 is not guaranteed to produce the correct translation, despite provably deriving the most probable translation. The DOT2 translation model solves most of the problems of DOT1, but suffers from limited compositionality when confronted with certain data. Notwithstanding the success of DOT2, any system based purely on trees will ultimately be found wanting as a general solution to the wide diversity of translation problems, as certain linguistic phenomena require a description at levels deeper than surface syntax. We then show how LFG-DOP can be extended to serve as a novel hybrid ...
http://doras.dcu.ie/15350/
Assistive translation technology for deaf people: translating into and animating Irish sign language
(2008)
Morrissey, Sara
Abstract:
Machine Translation (MT) for sign languages (SLs) can facilitate communication between Deaf and hearing people by translating information into the native and preferred language of the individuals. In this paper, we discuss automatic translation from English to Irish SL (ISL) in the domain of airport information. We describe our data collection processes and the architecture of the MaTrEx system used for our translation work. This is followed by an outline of the additional animation phase that transforms the translated output into animated ISL. Through a set of experiments, evaluated both automatically and manually, we show that MT has the potential to assist Deaf people by providing information in their first language.
http://doras.dcu.ie/15199/
Combining data-driven MT systems for improved sign language translation
(2007)
Morrissey, Sara; Way, Andy; Stein, Daniel; Bungeroth, Jan; Ney, Hermann
Abstract:
In this paper, we investigate the feasibility of combining two data-driven machine translation (MT) systems for the translation of sign languages (SLs). We take the MT systems of two prominent data-driven research groups, the MaTrEx system developed at DCU and the Statistical Machine Translation (SMT) system developed at RWTH Aachen University, and apply their respective approaches to the task of translating Irish Sign Language and German Sign Language into English and German. In a set of experiments supported by automatic evaluation results, we show that there is a definite value to the prospective merging of MaTrEx’s Example-Based MT chunks and distortion limit increase with RWTH’s constraint reordering.
http://doras.dcu.ie/15229/
Post-Editing Machine Translated Text in a Commercial Setting: Observation and Statistical Analysis
(2010)
Tatsumi, Midori
Abstract:
Machine translation systems, when they are used in a commercial context for publishing purposes, are usually used in combination with human post-editing. Thus understanding human post-editing behaviour is crucial in order to maximise the benefit of machine translation systems. Though there have been a number of studies carried out on human post-editing to date, there is a lack of large-scale studies on post-editing in industrial contexts which focus on the activity in real-life settings. This study observes professional Japanese post-editors’ work and examines the effect of the amount of editing made during post-editing, source text characteristics, and post-editing behaviour, on the amount of post-editing effort. A mixed-method approach was employed to both quantitatively and qualitatively analyse the data and gain detailed insights into the post-editing activity from various viewpoints. The results indicate that a number of factors, such as sentence structure, document component ...
http://doras.dcu.ie/16062/
Optimal bilingual data for French-English PB-SMT
(2009)
Ozdowska, Sylwia; Way, Andy
Abstract:
We investigate the impact of the original source language (SL) on French–English PB-SMT. We train four configurations of a state-of-the-art PB-SMT system based on French–English parallel corpora which differ in terms of the original SL, and conduct experiments in both translation directions. We see that data containing original French and English translated from French is optimal when building a system translating from French into English. Conversely, using data comprising exclusively French and English translated from several other languages is suboptimal regardless of the translation direction. Accordingly, the clamour for more data needs to be tempered somewhat; unless the quality of such data is controlled, more training data can cause translation performance to decrease drastically, by up to 38% relative BLEU in our experiments.
http://doras.dcu.ie/15157/
Joining hands: developing a sign language machine translation system with and for the deaf community
(2007)
Morrissey, Sara; Way, Andy
Abstract:
This paper discusses the development of an automatic machine translation (MT) system for translating spoken language text into signed languages (SLs). The motivation for our work is the improvement of accessibility to airport information announcements for D/deaf and hard of hearing people. This paper demonstrates the involvement of Deaf colleagues and members of the D/deaf community in Ireland in three areas of our research: the choice of a domain for automatic translation that has a practical use for the D/deaf community; the human translation of English text into Irish Sign Language (ISL) as well as advice on ISL grammar and linguistics; and the importance of native ISL signers as manual evaluators of our translated output.
http://doras.dcu.ie/15232/
Building a sign language corpus for use in machine translation
(2010)
Morrissey, Sara; Somers, Harold; Smith, Robert; Gilchrist, Shane; Dandapat, Sandipan
Abstract:
In recent years data-driven methods of machine translation (MT) have overtaken rule-based approaches as the predominant means of automatically translating between languages. A pre-requisite for such an approach is a parallel corpus of the source and target languages. Technological developments in sign language (SL) capturing, analysis and processing tools now mean that SL corpora are becoming increasingly available. With transcription and language analysis tools being mainly designed and used for linguistic purposes, we describe the process of creating a multimedia parallel corpus specifically for the purposes of English to Irish Sign Language (ISL) MT. As part of our larger project on localisation, our research is focussed on developing assistive technology for patients with limited English in the domain of healthcare. Focussing on the first point of contact a patient has with a GP’s office, the medical secretary, we sought to develop a corpus from the dialogue between the two part...
http://doras.dcu.ie/16040/
Investigating the effects of controlled language on the reading and comprehension of machine translated texts: A mixed-methods approach
(2012)
Doherty, Stephen
Abstract:
This study investigates whether the use of controlled language (CL) improves the readability and comprehension of technical support documentation produced by a statistical machine translation system. Readability is operationalised here as the extent to which a text can be easily read in terms of formal linguistic elements; while comprehensibility is defined as how easily a text’s content can be understood by the reader. A biphasic mixed-methods triangulation approach is taken, in which a number of quantitative and qualitative evaluation methods are combined. These include: eye tracking, automatic evaluation metrics (AEMs), retrospective interviews, human evaluations, memory recall testing, and readability indices. A further aim of the research is to investigate what, if any, correlations exist between the various metrics used, and to explore the cognitive framework of the evaluation process. The research finds that the use of CL input results in significantly higher scores for items...
http://doras.dcu.ie/16805/
Example-based machine translation via the web
(2002)
Gough, Nano; Way, Andy; Hearne, Mary
Abstract:
One of the limitations of translation memory systems is that the smallest translation units currently accessible are aligned sentential pairs. We propose an example-based machine translation system which uses a 'phrasal lexicon' in addition to the aligned sentences in its database. These phrases are extracted from the Penn Treebank using the Marker Hypothesis as a constraint on segmentation. They are then translated by three on-line machine translation (MT) systems, and a number of linguistic resources are automatically constructed which are used in the translation of new input. We perform two experiments on testsets of sentences and noun phrases to demonstrate the effectiveness of our system. In so doing, we obtain insights into the strengths and weaknesses of the selected on-line MT systems. Finally, like many example-based machine translation systems, our approach also suffers from the problem of ‘boundary friction’. Where the quality of resulting translations is compro...
http://doras.dcu.ie/15347/
Statistically motivated example-based machine translation using translation memory
(2010)
Dandapat, Sandipan; Morrissey, Sara; Kumar Naskar, Sudip; Somers, Harold
Abstract:
In this paper we present a novel way of integrating Translation Memory into an Example-Based Machine Translation (EBMT) system to deal with the issue of low resources. We have used a dialogue of 380 sentences as the example-base for our system. The translation units in the Translation Memories are automatically extracted based on the aligned phrases (words) of a statistical machine translation (SMT) system. We attempt to use the approach to improve translation from English to Bangla, as many statistical machine translation systems have difficulty with such small amounts of training data. We have found that the approach shows improvement over a baseline SMT system.
http://doras.dcu.ie/16041/
Combining semantic and syntactic generalization in example-based machine translation
(2011)
Ebling, Sarah; Way, Andy; Volk, Martin; Kumar Naskar, Sudip
Abstract:
In this paper, we report our experiments in combining two EBMT systems that rely on generalized templates, Marclator and CMU-EBMT, on an English–German translation task. Our goal was to see whether a statistically significant improvement could be achieved over the individual performances of these two systems. We observed that this was not the case. However, our system consistently outperformed a lexical EBMT baseline system.
http://doras.dcu.ie/16413/
Wrapper syntax for example-based machine translation
(2006)
Owczarzak, Karolina; Mellebeek, Bart; Groves, Declan; van Genabith, Josef; Way, Andy
Abstract:
TransBooster is a wrapper technology designed to improve the performance of wide-coverage machine translation systems. Using linguistically motivated syntactic information, it automatically decomposes source language sentences into shorter and syntactically simpler chunks, and recomposes their translation to form target language sentences. This generally improves both the word order and lexical selection of the translation. To date, TransBooster has been successfully applied to rule-based MT, statistical MT, and multi-engine MT. This paper presents the application of TransBooster to Example-Based Machine Translation. In an experiment conducted on test sets extracted from Europarl and the Penn II Treebank we show that our method can raise the BLEU score up to 3.8% relative to the EBMT baseline. We also conduct a manual evaluation, showing that TransBooster-enhanced EBMT produces a better output in terms of fluency than the baseline EBMT in 55% of the cases and in terms of accuracy in...
http://doras.dcu.ie/15282/
Lexical syntax for statistical machine translation
(2009)
Hassan, Hany
Abstract:
Statistical Machine Translation (SMT) is by far the most dominant paradigm of Machine Translation. This can be justified by many reasons, such as accuracy, scalability, computational efficiency and fast adaptation to new languages and domains. However, current approaches of Phrase-based SMT lack the capabilities of producing more grammatical translations and handling long-range reordering while maintaining the grammatical structure of the translation output. Recently, SMT researchers started to focus on extending Phrase-based SMT systems with syntactic knowledge; however, the previous techniques have limited capabilities due to introducing redundantly ambiguous syntactic structures and using decoders with limited language models, and with a high computational cost. In this thesis, we extend Phrase-based SMT with lexical syntactic descriptions that localize global syntactic information on the word without introducing syntactic redundant ambiguity. We present a novel model of Phrase...
http://doras.dcu.ie/2320/
Constrained word alignment models for statistical machine translation
(2009)
Ma, Yanjun
Abstract:
Word alignment is a fundamental and crucial component in Statistical Machine Translation (SMT) systems. Despite the enormous progress made in the past two decades, this task remains an active research topic simply because the quality of word alignment is still far from optimal. Most state-of-the-art word alignment models are grounded on statistical learning theory treating word alignment as a general sequence alignment problem, where many linguistically motivated insights are not incorporated. In this thesis, we propose new word alignment models with linguistically motivated constraints in a bid to improve the quality of word alignment for Phrase-Based SMT systems (PB-SMT). We start the exploration with an investigation into segmentation constraints for word alignment by proposing a novel algorithm, namely word packing, which is motivated by the fact that one concept expressed by one word in one language can frequently surface as a compound or collocation in another language. Our al...
http://doras.dcu.ie/14866/
Item Type
Book chapter (5)
Conference item (222)
Doctoral thesis (20)
Journal article (10)
Master thesis (research) (2)
Peer Review Status
Peer reviewed (231)
Non peer reviewed (28)
Year
2012 (3)
2011 (16)
2010 (58)
2009 (52)
2008 (20)
2007 (41)
2006 (21)
2005 (9)
2004 (12)
2003 (11)
2002 (8)
2001 (6)
2000 (2)
Language
English (256)
French (3)