Current Search:
All of 'Machine' and 'translating' in all fields;
259 items found
Displaying Results 26 - 50 of 259 on page 2 of 11
Source-side context-informed hypothesis alignment for combining outputs from machine translation systems
(2009)
Du, Jinhua; Ma, Yanjun; Way, Andy
Abstract:
This paper presents a new hypothesis alignment method for combining outputs of multiple machine translation (MT) systems. Traditional hypothesis alignment algorithms such as TER, HMM and IHMM do not directly utilise the context information of the source side but rather address the alignment issues via the output data itself. In this paper, a source-side context-informed (SSCI) hypothesis alignment method is proposed to address the word alignment and word reordering issues. First of all, the source–target word alignment links are produced as the hidden variables by exporting source phrase spans during the translation decoding process. Secondly, a mapping strategy and normalisation model are employed to acquire the 1-to-1 alignment links and build the confusion network (CN). The source-side context-based method outperforms the state-of-the-art TER-based alignment model in our experiments on the WMT09 English-to-French and NIST Chinese-to-English data sets respectively. Experimental ...
http://doras.dcu.ie/15163/
Tracking relevant alignment characteristics for machine translation
(2009)
Lambert, Patrik; Ma, Yanjun; Ozdowska, Sylwia; Way, Andy
Abstract:
In most statistical machine translation (SMT) systems, bilingual segments are extracted via word alignment. In this paper we compare alignments tuned directly according to alignment F-score and BLEU score in order to investigate the alignment characteristics that are helpful in translation. We report results for two different SMT systems (a phrase-based and an n-gram-based system) on Chinese to English IWSLT data, and Spanish to English European Parliament data. We give alignment hints to improve BLEU score, depending on the SMT system used and the type of corpus.
http://doras.dcu.ie/15161/
Labelled dependencies in machine translation evaluation
(2007)
Owczarzak, Karolina; van Genabith, Josef; Way, Andy
Abstract:
We present a method for evaluating the quality of Machine Translation (MT) output, using labelled dependencies produced by a Lexical-Functional Grammar (LFG) parser. Our dependency-based method, in contrast to most popular string-based evaluation metrics, does not unfairly penalize perfectly valid syntactic variations in the translation, and the addition of WordNet provides a way to accommodate lexical variation. In comparison with other metrics on 16,800 sentences of Chinese-English newswire text, our method reaches high correlation with human scores.
http://doras.dcu.ie/15221/
A syntactic skeleton for statistical machine translation
(2006)
Mellebeek, Bart; Owczarzak, Karolina; Groves, Declan; van Genabith, Josef; Way, Andy
Abstract:
We present a method for improving statistical machine translation performance by using linguistically motivated syntactic information. Our algorithm recursively decomposes source language sentences into syntactically simpler and shorter chunks, and recomposes their translation to form target language sentences. This improves both the word order and lexical selection of the translation. We report statistically significant relative improvements of 3.3% BLEU score in an experiment (English→Spanish) carried out on an 800-sentence test set extracted from the Europarl corpus.
http://doras.dcu.ie/15279/
Low-resource machine translation using MATREX: The DCU machine translation system for IWSLT 2009
(2009)
Ma, Yanjun; Okita, Tsuyoshi; Cetinoglu, Ozlem; Du, Jinhua; Way, Andy
Abstract:
In this paper, we give a description of the Machine Translation (MT) system developed at DCU that was used for our fourth participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2009). Two techniques are deployed in our system in order to improve the translation quality in a low-resource scenario. The first technique is to use multiple segmentations in MT training and to utilise word lattices in the decoding stage. The second technique is used to select the optimal training data that can be used to build MT systems. In this year’s participation, we use three different prototype SMT systems, and the outputs from the systems are combined using a standard system combination method. Our system is the top system for the Chinese–English CHALLENGE task in terms of BLEU score.
http://doras.dcu.ie/16162/
Towards a user-friendly webservice architecture for statistical machine translation in the PANACEA project
(2011)
Toral, Antonio; Pecina, Pavel; Way, Andy; Poch, Marc
Abstract:
This paper presents a webservice architecture for Statistical Machine Translation aimed at non-technical users. A workflow editor allows a user to combine different webservices using a graphical user interface. In the current state of this project, the webservices have been implemented for a range of sentential and sub-sentential aligners. A common interface and a common data format allow the user to build workflows exchanging different aligners.
http://doras.dcu.ie/16467/
Towards using web-crawled data for domain adaptation in statistical machine translation
(2011)
Pecina, Pavel; Toral, Antonio; Way, Andy; Papavassiliou, Vassilis; Prokopidis, Prokopis; Giagkou, Maria
Abstract:
This paper reports on the ongoing work focused on domain adaptation of statistical machine translation using domain-specific data obtained by domain-focused web crawling. We present a strategy for crawling monolingual and parallel data and their exploitation for testing, language modelling, and system tuning in a phrase-based machine translation framework. The proposed approach is evaluated on the domains of Natural Environment and Labour Legislation and two language pairs: English–French and English–Greek.
http://doras.dcu.ie/16468/
Bilingually motivated domain-adapted word segmentation for statistical machine translation
(2009)
Ma, Yanjun; Way, Andy
Abstract:
We introduce a word segmentation approach to languages where word boundaries are not orthographically marked, with application to Phrase-Based Statistical Machine Translation (PB-SMT). Instead of using manually segmented monolingual domain-specific corpora to train segmenters, we make use of bilingual corpora and statistical word alignment techniques. First of all, our approach is adapted for the specific translation task at hand by taking the corresponding source (target) language into account. Secondly, this approach does not rely on manually segmented training data so that it can be automatically adapted for different domains. We evaluate the performance of our segmentation approach on PB-SMT tasks from two domains and demonstrate that our approach scores consistently among the best results across different data conditions.
http://doras.dcu.ie/15164/
Using supertags as source language context in SMT
(2009)
Haque, Rejwanul; Naskar, Sudip Kumar; Ma, Yanjun; Way, Andy
Abstract:
Recent research has shown that Phrase-Based Statistical Machine Translation (PB-SMT) systems can benefit from two enhancements: (i) using words and POS tags as context-informed features on the source side; and (ii) incorporating lexical syntactic descriptions in the form of supertags on the target side. In this work we present a novel PB-SMT model that combines these two aspects by using supertags as source language context-informed features. These features enable us to exploit source similarity in addition to target similarity, as modelled by the language model. In our experiments two kinds of supertags are employed: those from Lexicalized Tree-Adjoining Grammar and Combinatory Categorial Grammar. We use a memory-based classification framework that enables the estimation of these features while avoiding problems of sparseness. Despite the differences between these two approaches, the supertaggers give similar improvements. We evaluate the performance of our approach on an English-to...
http://doras.dcu.ie/15160/
Marker-based filtering of bilingual phrase pairs for SMT
(2009)
Sánchez-Martínez, Felipe; Way, Andy
Abstract:
State-of-the-art statistical machine translation systems make use of a large translation table obtained after scoring a set of bilingual phrase pairs automatically extracted from a parallel corpus. The number of bilingual phrase pairs extracted from a pair of aligned sentences grows exponentially as the length of the sentences increases; therefore, the number of entries in the phrase table used to carry out the translation may become unmanageable, especially when online, 'on demand' translation is required in real time. We describe the use of closed-class words to filter the set of bilingual phrase pairs extracted from the parallel corpus by taking into account the alignment information and the type of the words involved in the alignments. On four European language pairs, we show that our simple yet novel approach can filter the phrase table by up to a third yet still provide competitive results compared to the baseline. Furthermore, it provides a nice balance between the ...
http://doras.dcu.ie/15156/
MaTrEx: the DCU machine translation system for ICON 2008
(2008)
Srivastava, Ankit; Haque, Rejwanul; Naskar, Sudip Kumar; Way, Andy
Abstract:
In this paper, we give a description of the machine translation system developed at DCU that was used for our participation in the NLP Tools Contest of the International Conference on Natural Language Processing (ICON 2008). This was our first ever attempt at working on any Indian language. In this participation, we focus on various techniques for word and phrase alignment to improve system quality. For the English-Hindi translation task we exploit source-language reordering. We also carried out experiments combining both in-domain and out-of-domain data to improve the system performance and, as a post-processing step, we transliterate out-of-vocabulary items.
http://doras.dcu.ie/15200/
Using same-language machine translation to create alternative target sequences for text-to-speech synthesis
(2009)
Cahill, Peter; Du, Jinhua; Way, Andy; Carson-Berndsen, Julie
Abstract:
Modern speech synthesis systems attempt to produce speech utterances from an open domain of words. In some situations, the synthesiser will not have the appropriate units to pronounce some words or phrases accurately but it still must attempt to pronounce them. This paper presents a hybrid machine translation and unit selection speech synthesis system. The machine translation system was trained with English as the source and target language. Rather than the synthesiser only saying the input text as would happen in conventional synthesis systems, the synthesiser may say an alternative utterance with the same meaning. This method allows the synthesiser to overcome the problem of insufficient units at runtime.
http://doras.dcu.ie/15182/
Supertagged phrase-based statistical machine translation
(2007)
Hassan, Hany; Sima'an, Khalil; Way, Andy
Abstract:
Until quite recently, extending Phrase-based Statistical Machine Translation (PBSMT) with syntactic structure caused system performance to deteriorate. In this work we show that incorporating lexical syntactic descriptions in the form of supertags can yield significantly better PBSMT systems. We describe a novel PBSMT model that integrates supertags into the target language model and the target side of the translation model. Two kinds of supertags are employed: those from Lexicalized Tree-Adjoining Grammar and Combinatory Categorial Grammar. Despite the differences between these two approaches, the supertaggers give similar improvements. In addition to supertagging, we also explore the utility of a surface global grammaticality measure based on combinatory operators. We perform various experiments on the Arabic to English NIST 2005 test set addressing issues such as sparseness, scalability and the utility of system subcomponents. Our best result (0.4688 BLEU) improves by 6.1% relati...
http://doras.dcu.ie/15218/
Exploiting parallel treebanks to improve phrase-based statistical machine translation
(2007)
Tinsley, John; Hearne, Mary; Way, Andy
Abstract:
We use existing tools to automatically build two parallel treebanks from existing parallel corpora. We then show that combining the data extracted from both the treebanks and the corpora into a single translation model can improve the translation quality in a baseline phrase-based statistical machine translation system.
http://doras.dcu.ie/15266/
Using machine-learning to assign function labels to parser output for Spanish
(2006)
Chrupała, Grzegorz; van Genabith, Josef
Abstract:
Data-driven grammatical function tag assignment has been studied for English using the Penn-II Treebank data. In this paper we address the question of whether such methods can be applied successfully to other languages and treebank resources. In addition to tag assignment accuracy and f-scores we also present results of a task-based evaluation. We use three machine-learning methods to assign Cast3LB function tags to sentences parsed with Bikel’s parser trained on the Cast3LB treebank. The best performing method, SVM, achieves an f-score of 86.87% on gold-standard trees and 66.67% on parser output - a statistically significant improvement of 6.74% over the baseline. In a task-based evaluation we generate LFG functional-structures from the function tag-enriched trees. On this task we achieve an f-score of 75.67%, a statistically significant 3.4% improvement over the baseline.
http://doras.dcu.ie/15270/
Multi-engine machine translation by recursive sentence decomposition
(2006)
Mellebeek, Bart; Owczarzak, Karolina; van Genabith, Josef; Way, Andy
Abstract:
In this paper, we present a novel approach to combine the outputs of multiple MT engines into a consensus translation. In contrast to previous Multi-Engine Machine Translation (MEMT) techniques, we do not rely on word alignments of output hypotheses, but prepare the input sentence for multi-engine processing. We do this by using a recursive decomposition algorithm that produces simple chunks as input to the MT engines. A consensus translation is produced by combining the best chunk translations, selected through majority voting, a trigram language model score and a confidence score assigned to each MT engine. We report statistically significant relative improvements of up to 9% BLEU score in experiments (English→Spanish) carried out on an 800-sentence test set extracted from the Penn-II Treebank.
http://doras.dcu.ie/15281/
Syntactic phrase-based statistical machine translation
(2006)
Hassan, Hany; Hearne, Mary; Way, Andy; Sima'an, Khalil
Abstract:
Phrase-based statistical machine translation (PBSMT) systems represent the dominant approach in MT today. However, unlike systems in other paradigms, it has proven difficult to date to incorporate syntactic knowledge in order to improve translation quality. This paper improves on recent research which uses 'syntactified' target language phrases, by incorporating supertags as constraints to better resolve parse tree fragments. In addition, we do not impose any sentence-length limit, and using a log-linear decoder, we outperform a state-of-the-art PBSMT system by over 1.3 BLEU points (or 3.51% relative) on the NIST 2003 Arabic-English test corpus.
http://doras.dcu.ie/15280/
Improving online machine translation systems
(2005)
Mellebeek, Bart; Khasin, Anna; Owczarzak, Karolina; van Genabith, Josef; Way, Andy
Abstract:
In (Mellebeek et al., 2005), we proposed the design, implementation and evaluation of a novel and modular approach to boost the translation performance of existing, wide-coverage, freely available machine translation systems, based on reliable and fast automatic decomposition of the translation input and corresponding composition of translation output. Despite showing some initial promise, our method did not improve on the baseline Logomedia and Systran MT systems. In this paper, we improve on the algorithm presented in (Mellebeek et al., 2005), and on the same test data, show increased scores for a range of automatic evaluation metrics. Our algorithm now outperforms Logomedia, obtains similar results to SDL and falls tantalisingly short of the performance achieved by Systran.
http://doras.dcu.ie/15296/
TransBooster: boosting the performance of wide-coverage machine translation systems
(2005)
Mellebeek, Bart; Khasin, Anna; van Genabith, Josef; Way, Andy
Abstract:
We propose the design, implementation and evaluation of a novel and modular approach to boost the translation performance of existing, wide-coverage, freely available machine translation systems based on reliable and fast automatic decomposition of the translation input and corresponding composition of translation output. We provide details of our method, and experimental results compared to the MT systems SYSTRAN and Logomedia. While many avenues for further experimentation remain, to date we fall just behind the baseline systems on the full 800-sentence test set, but in certain cases our method causes the translation quality obtained via the MT systems to improve.
http://doras.dcu.ie/15298/
wEBMT: developing and validating an example-based machine translation system using the world wide web
(2003)
Way, Andy; Gough, Nano
Abstract:
We have developed an example-based machine translation (EBMT) system that uses the World Wide Web for two different purposes: First, we populate the system’s memory with translations gathered from rule-based MT systems located on the Web. The source strings input to these systems were extracted automatically from an extremely small subset of the rule types in the Penn-II Treebank. In subsequent stages, the (source, target) translation pairs obtained are automatically transformed into a series of resources that render the translation process more successful. Despite the fact that the output from on-line MT systems is often faulty, we demonstrate in a number of experiments that when used to seed the memories of an EBMT system, they can in fact prove useful in generating translations of high quality in a robust fashion. In addition, we demonstrate the relative gain of EBMT in comparison to on-line systems. Second, despite the perception that the documents available on the Web are of qu...
http://doras.dcu.ie/15318/
OpenMaTrEx: a free/open-source marker-driven example-based machine translation system
(2010)
Dandapat, Sandipan; Forcada, Mikel; Groves, Declan; Penkale, Sergio; Tinsley, John; Way, Andy
Abstract:
We describe OpenMaTrEx, a free/open-source example-based machine translation (EBMT) system based on the marker hypothesis, comprising a marker-driven chunker, a collection of chunk aligners, and two engines: one based on a simple proof-of-concept monotone EBMT recombinator and the other on a Moses-based statistical decoder. OpenMaTrEx is a free/open-source release of the basic components of MaTrEx, the Dublin City University machine translation system.
http://doras.dcu.ie/15797/
Testing students' understanding of complex transfer
(2002)
Way, Andy
Abstract:
Courses on Machine Translation (MT) need to be tailored to different sets of students with differing skills and demands (Kenny & Way, 2001). Nevertheless, any contemporary course on MT ought to equip students with at least a superficial knowledge of the differences between rule-based and statistical MT, direct and indirect approaches, and transfer-based and interlingual systems. With regard to this latter distinction, the issue of complex transfer is an integral component of this section of a course on MT, whether this be to computational linguists, translators or language students. This paper presents a method of assessing the level of understanding of the issues pertaining to complex transfer for final year undergraduates studying a degree programme in Computational Linguistics. The intention is that this methodology may contribute to a suite of exercises which may be used by other instructors in Machine Translation.
http://doras.dcu.ie/15829/
Example-based machine translation of the Basque language
(2006)
Stroppa, Nicolas; Groves, Declan; Way, Andy; Sarasola, Kepa
Abstract:
Basque is both a minority and a highly inflected language with free order of sentence constituents. Machine Translation of Basque is thus both a real need and a test bed for MT techniques. In this paper, we present a modular Data-Driven MT system which includes different chunkers as well as chunk aligners which can deal with the free order of sentence constituents of Basque. We conducted Basque to English translation experiments, evaluated on a large corpus (270,000 sentence pairs). The experimental results show that our system significantly outperforms state-of-the-art approaches according to several common automatic evaluation metrics.
http://doras.dcu.ie/15821/
Deep Syntax in Statistical Machine Translation
(2011)
Graham, Yvette
Abstract:
Statistical Machine Translation (SMT) via deep syntactic transfer employs a three-stage architecture, (i) parse source language (SL) input, (ii) transfer SL deep syntactic structure to the target language (TL), and (iii) generate a TL translation. The deep syntactic transfer architecture achieves a high level of language pair independence compared to other Machine Translation (MT) approaches, as translation is carried out at the more language independent deep syntactic representation. TL word order can be generated independently of SL word order and therefore no reordering model between source and target words is required. In addition, words in dependency relations are adjacent in the deep syntactic structure, allowing the extraction of more general transfer rules, compared to other rules/phrases extracted from the surface form corpus, as such words are often distant in surface form strings, as well as allowing the use of a TL deep syntax language model, which models a deeper notion...
http://doras.dcu.ie/16078/
Integrating source-language context into log-linear models of statistical machine translation
(2011)
Haque, Rejwanul
Abstract:
The translation features typically used in state-of-the-art statistical machine translation (SMT) model dependencies between the source and target phrases, but not among the phrases in the source language themselves. A swathe of research has demonstrated that integrating source context modelling directly into log-linear phrase-based SMT (PB-SMT) and hierarchical PB-SMT (HPB-SMT) can positively influence the weighting and selection of target phrases, and thus improve translation quality. In this thesis we present novel approaches to incorporate source-language contextual modelling into the state-of-the-art SMT models in order to enhance the quality of lexical selection. We investigate the effectiveness of a range of contextual features, including lexical features of neighbouring words, part-of-speech tags, supertags, sentence-similarity features, dependency information, and semantic roles. We explored a series of language pairs featuring typologically different languages,...
http://doras.dcu.ie/16458/
Item Type
Book chapter (5)
Conference item (222)
Doctoral thesis (20)
Journal article (10)
Master thesis (research) (2)
Peer Review Status
Peer reviewed (231)
Non peer reviewed (28)
Year
2012 (3)
2011 (16)
2010 (58)
2009 (52)
2008 (20)
2007 (41)
2006 (21)
2005 (9)
2004 (12)
2003 (11)
2002 (8)
2001 (6)
2000 (2)
Language
English (256)
French (3)