Current Search:
All of 'Machine' and 'translating' in all fields;
259 items found
Displaying Results 226 - 250 of 259 on page 10 of 11
Arabic parsing using grammar transforms
(2010)
Tounsi, Lamia; van Genabith, Josef
Abstract:
We investigate Arabic Context Free Grammar parsing with dependency annotation, comparing lexicalised and unlexicalised parsers. We study how morphosyntactic as well as function tag information percolation in the form of grammar transforms (Johnson, 1998; Kulick et al., 2006) affects the performance of a parser and helps dependency assignment. We focus on the three most frequent functional tags in the Arabic Penn Treebank: subjects, direct objects and predicates. We merge these functional tags with their phrasal categories and (where appropriate) percolate case information to the non-terminal (POS) category to train the parsers. We then automatically enrich the output of these parsers with full dependency information in order to annotate trees with Lexical Functional Grammar (LFG) f-structure equations which produce f-structures, i.e. attribute-value matrices approximating to basic predicate-argument-adjunct structure representations. We present a series of experiments evaluating how ...
http://doras.dcu.ie/15991/
Lemmatization and lexicalized statistical parsing of morphologically rich languages: the case of French
(2010)
Seddah, Djamé; Chrupała, Grzegorz; Cetinoglu, Ozlem; van Genabith, Josef; Candito, Marie
Abstract:
This paper shows that training a lexicalized parser on a lemmatized morphologically-rich treebank such as the French Treebank slightly improves parsing results. We also show that lemmatizing a similarly sized subset of the English Penn Treebank has almost no effect on parsing performance with gold lemmas and leads to a small drop in performance when automatically assigned lemmas and POS tags are used. This highlights two facts: (i) lemmatization helps to reduce lexicon data-sparseness issues for French, (ii) it also makes the parsing process sensitive to correct assignment of POS tags to unknown words.
http://doras.dcu.ie/15987/
"cba to check the spelling" investigating parser performance on discussion forum posts
(2010)
Foster, Jennifer
Abstract:
We evaluate the Berkeley parser on text from an online discussion forum. We evaluate the parser output with and without gold tokens and spellings (using Sparseval and Parseval), and we compile a list of problematic phenomena for this domain. The Parseval f-score for a small development set is 77.56. This increases to 80.27 when we apply a set of simple transformations to the input sentences and to the Wall Street Journal (WSJ) training sections.
http://doras.dcu.ie/15984/
A road map for interoperable language resource metadata
(2010)
Cieri, Christopher; Choukri, Khalid; Calzolari, Nicoletta; Langendoen, D. Terence; Leveling, Johannes; Palmer, Martha; Ide, Nancy; Pustejovsky, James
Abstract:
LRs remain expensive to create and thus rare relative to demand across languages and technology types. The accidental re-creation of an LR that already exists is a nearly unforgivable waste of scarce resources that is unfortunately not so easy to avoid. The number of catalogs the HLT researcher must search, with their different formats, makes it possible to overlook an existing resource. This paper sketches the sources of this problem and outlines a proposal to rectify it, along with a new vision of LR cataloging that will facilitate the documentation and exploitation of a much wider range of LRs than previously considered.
http://doras.dcu.ie/15983/
Finding common ground: towards a surface realisation shared task
(2010)
Belz, Anja; White, Mike; van Genabith, Josef; Hogan, Deirdre; Stent, Amanda
Abstract:
In many areas of NLP reuse of utility tools such as parsers and POS taggers is now common, but this is still rare in NLG. The subfield of surface realisation has perhaps come closest, but at present we still lack a basis on which different surface realisers could be compared, chiefly because of the wide variety of different input representations used by different realisers. This paper outlines an idea for a shared task in surface realisation, where inputs are provided in a common-ground representation formalism which participants map to the types of input required by their system. These inputs are derived from existing annotated corpora developed for language analysis (parsing etc.). Outputs (realisations) are evaluated by automatic comparison against the human-authored text in the corpora as well as by human assessors.
http://doras.dcu.ie/15981/
Handling unknown words in statistical latent-variable parsing models for Arabic, English and French
(2010)
Attia, Mohammed; Foster, Jennifer; Hogan, Deirdre; Le Roux, Joseph; Tounsi, Lamia; van Genabith, Josef
Abstract:
This paper presents a study of the impact of using simple and complex morphological clues to improve the classification of rare and unknown words for parsing. We compare this approach to a language-independent technique often used in parsers which is based solely on word frequencies. This study is applied to three languages that exhibit different levels of morphological expressiveness: Arabic, French and English. We integrate information about Arabic affixes and morphotactics into a PCFG-LA parser and obtain state-of-the-art accuracy. We also show that these morphological clues can be learnt automatically from an annotated corpus.
http://doras.dcu.ie/15980/
An automatically built named entity lexicon for Arabic
(2010)
Attia, Mohammed; Toral, Antonio; Tounsi, Lamia; Monachini, Monica; van Genabith, Josef
Abstract:
We have successfully adapted and extended the automatic Multilingual, Interoperable Named Entity Lexicon approach to Arabic, using Arabic WordNet (AWN) and Arabic Wikipedia (AWK). First, we extract AWN’s instantiable nouns and identify the corresponding categories and hyponym subcategories in AWK. Then, we exploit Wikipedia inter-lingual links to locate correspondences between articles in ten different languages in order to identify Named Entities (NEs). We apply keyword search on AWK abstracts to provide for Arabic articles that do not have a correspondence in any of the other languages. In addition, we perform a post-processing step to fetch further NEs from AWK not reachable through AWN. Finally, we investigate diacritization using matching with geonames databases, MADA-TOKAN tools and different heuristics for restoring vowel marks of Arabic NEs. Using this methodology, we have extracted approximately 45,000 Arabic NEs and built, to the best of our knowledge, the largest, most ma...
http://doras.dcu.ie/15979/
Partial dependency parsing for Irish
(2010)
Ui Dhonnchadha, Elaine; van Genabith, Josef
Abstract:
In this paper we present a partial dependency parser for Irish, in which Constraint Grammar (CG) rules are used to annotate dependency relations and grammatical functions in unrestricted Irish text. Chunking is performed using a regular-expression grammar which operates on the dependency tagged sentences. As this is the first implementation of a parser for unrestricted Irish text (to our knowledge), there were no guidelines or precedents available. Therefore deciding what constitutes a syntactic unit, and how it should be annotated, accounts for a major part of the early development effort. Currently, all tokens in a sentence are tagged for grammatical function and local dependency. Long-distance dependencies, prepositional attachments or coordination are not handled, resulting in a partial dependency analysis. Evaluations show that the partial dependency analysis achieves an f-score of 93.60% on development data and 94.28% on unseen test data, while the chunker achieves an f-score ...
http://doras.dcu.ie/16215/
Design and Characterisation of a Novel Artificial Life System Incorporating Hierarchical Selection
(2010)
Kelly, Ciarán
Abstract:
In this thesis, a minimal artificial chemistry system is presented, which is inspired by the RNA World hypothesis and is loosely based on Holland's Learning Classifier Systems. The Molecular Classifier System (MCS) takes a bottom-up, individual-based approach to building artificial bio-chemical networks. The MCS has been developed to demonstrate the effects of hierarchical selection. Hierarchical selection appears to have been critical for the evolution of complexity in life as we know it yet, to date, no computational artificial life system has investigated the viability of using hierarchical selection as a mechanism for achieving qualitatively similar results. Hierarchy in MCS is enforced by constraining artificial molecules, which are modeled as individuals, to exist within externally provided containers - protocells. This research is focused on the period of time surrounding the conjectured first Major Transition - from individual replicating molecules to populations of mole...
http://doras.dcu.ie/15727/
Automatic F-Structure Annotation from the AP Treebank
(2000)
Sadler, Louisa; van Genabith, Josef; Way, Andy
Abstract:
We present a method for automatically annotating treebank resources with functional structures. The method defines systematic patterns of correspondence between partial PS configurations and functional structures. These are applied to PS rules extracted from treebanks. The set of techniques which we have developed constitute a methodology for corpus-guided grammar development. Despite the widespread belief that treebank representations are not very useful in grammar development, we show that systematic patterns of c-structure to f-structure correspondence can be simply and successfully stated over such rules. The method is partial in that it requires manual correction of the annotated grammar rules.
http://doras.dcu.ie/16169/
Experiments in Structure-Preserving Grammar Compaction
(2000)
Hepple, Mark; van Genabith, Josef
Abstract:
Structure preserving grammar compaction (SPC) is a simple CFG compaction technique originally described in (van Genabith et al., 1999a, 1999b). It works by generalising category labels and in so doing plugs holes in the grammar. To date the method has been tested on small corpora only. In the present research we apply SPC to a large grammar extracted from the Penn Treebank and examine its effects on rule treebank grammar size and on rule accession rates (as an indicator of grammar completeness). Introduction: Treebanks and resources compiled from treebanks are potentially very useful in NLP. Grammars extracted from treebanks, so-called treebank grammars (Charniak, 1996), can form the basis of large coverage NLP systems. Such treebank grammars, however, can suffer from several shortcomings: they commonly feature a large number of flat, highly specific rules that may be rarely used, with ensuing costs for processing (load) under the grammar.
http://doras.dcu.ie/16214/
Taxonomy and evaluation of various collaborative platforms
(2010)
Gupta, Rajat; Aouad, Lamine
http://doras.dcu.ie/16266/
Large-scale induction and evaluation of lexical resources from the Penn-II and Penn-III treebanks
(2005)
O’Donovan, Ruth; Burke, Michael; Cahill, Aoife; van Genabith, Josef; Way, Andy
Abstract:
We present a methodology for extracting subcategorization frames based on an automatic lexical-functional grammar (LFG) f-structure annotation algorithm for the Penn-II and Penn-III Treebanks. We extract syntactic-function-based subcategorization frames (LFG semantic forms) and traditional CFG category-based subcategorization frames as well as mixed function/category-based frames, with or without preposition information for obliques and particle information for particle verbs. Our approach associates probabilities with frames conditional on the lemma, distinguishes between active and passive frames, and fully reflects the effects of long-distance dependencies in the source data structures. In contrast to many other approaches, ours does not predefine the subcategorization frame types extracted, learning them instead from the source data. Including particles and prepositions, we extract 21,005 lemma frame types for 4,362 verb lemmas, with a total of 577 frame types and an average of ...
http://doras.dcu.ie/16178/
CALL for endangered languages: Challenges and rewards
(2003)
Ward, Monica; van Genabith, Josef
Abstract:
The interaction between CALL and Endangered Languages (EL) is an under-researched and under-exploited field. It is perhaps no surprise that this should be the case as CALL in the EL context has to address additional requirements and deal with extra constraints over and above those that prevail in mainstream CALL. This article introduces the topic of Endangered Languages and lists two classifications for Endangered Languages (Terralingua, 2000; Unesco, 1993). It outlines why a language becomes endangered and why it is important to save ELs. It identifies the special constraints that prevail in the EL CALL situation. These constraints determine the EL CALL requirements. In a case study, a software template and suggested syllabus have been developed for the production of CALL materials for ELs. A working example of courseware developed using the template is presented. Finally, the cultural dimension of EL CALL is outlined.
http://doras.dcu.ie/16210/
Metaphors, logic and type theory
(2001)
van Genabith, Josef
http://doras.dcu.ie/16216/
A computational model of the referential semantics of projective prepositions
(2006)
Kelleher, John; van Genabith, Josef
Abstract:
In this paper we present a framework for interpreting locative expressions containing the prepositions in front of and behind. These prepositions have different semantics in the viewer-centred and intrinsic frames of reference (Vandeloise, 1991). We define a model of their semantics in each frame of reference. The basis of these models is a novel parameterized continuum function that creates a 3-D spatial template. In the intrinsic frame of reference the origin used by the continuum function is assumed to be known a priori and object occlusion does not impact on the applicability rating of a point in the spatial template. In the viewer-centred frame the location of the spatial template’s origin is dependent on the user’s perception of the landmark at the time of the utterance and object occlusion is integrated into the model. Where there is an ambiguity with respect to the intended frame of reference, we define an algorithm for merging the spatial templates from the competing frames...
http://doras.dcu.ie/16165/
A three-pass system combination framework by combining multiple hypothesis alignment methods
(2009)
Du, Jinhua; Way, Andy
Abstract:
So far, many effective hypothesis alignment metrics have been proposed and applied to system combination, such as TER, HMM, ITER and IHMM. In addition, Minimum Bayes-risk (MBR) decoding and the confusion network (CN) have become the state-of-the-art techniques in system combination. In this paper, we present a three-pass system combination strategy that can combine hypothesis alignment results derived from different alignment metrics to generate a better translation. First, the different alignment metrics are applied to align the backbone and hypotheses, and an individual CN is built for each alignment result; then we construct a super network by merging the multiple metric-based CNs and generate a consensus output. Finally, a modified consensus network MBR (ConMBR) approach is employed to search for the best translation. Our proposed strategy outperforms the best single CN as well as the best single system in our experiments on the NIST Chinese-to-English test set.
http://doras.dcu.ie/16011/
Closing the gap between stochastic and rule-based LFG grammars
(2010)
Hautli, Annette; Cetinoglu, Ozlem; van Genabith, Josef
Abstract:
Developing large-scale deep grammars in a constraint-based framework such as Lexical Functional Grammar (LFG) is time-consuming and requires significant linguistic insight. Recently, treebank-based constraint-grammar acquisition approaches have been developed as an alternative to hand-crafting such resources. While treebank-based approaches are wide coverage and robust and achieve competitive evaluation results for many languages, the granularity of the linguistic analyses provided by treebank-based resources tends to be less fine-grained than what is offered by state-of-the-art handcrafted grammars. This paper presents an approach to extend the English DCU LFG annotation algorithm with more detailed f-structure information to provide probabilistic treebank-based LFG grammars with rich feature information comparable to that implemented by the hand-crafted English XLE grammar, while maintaining the robustness and the coverage of treebank-based stochastic grammars.
http://doras.dcu.ie/16017/
From treebank resources to LFG F-structures
(2003)
Frank, Anette; Sadler, Louisa; van Genabith, Josef; Way, Andy
Abstract:
We present two methods for automatically annotating treebank resources with functional structures. Both methods define systematic patterns of correspondence between partial PS configurations and functional structures. These are applied to PS rules extracted from treebanks, or directly to constraint set encodings of treebank PS trees.
http://doras.dcu.ie/15824/
Automatic acquisition of Spanish LFG resources from the Cast3LB treebank
(2005)
O’Donovan, Ruth; Cahill, Aoife; van Genabith, Josef; Way, Andy
Abstract:
In this paper, we describe the automatic annotation of the Cast3LB Treebank with LFG f-structures for the subsequent extraction of Spanish probabilistic grammar and lexical resources. We adapt the approach and methodology of Cahill et al. (2004), O’Donovan et al. (2004) and elsewhere for English to Spanish and the Cast3LB treebank encoding. We report on the quality and coverage of the automatic f-structure annotation. Following the pipeline and integrated models of Cahill et al. (2004), we extract wide-coverage probabilistic LFG approximations and parse unseen Spanish text into f-structures. We also extend Bikel’s (2002) Multilingual Parse Engine to include a Spanish language module. Using the retrained Bikel parser in the pipeline model gives the best results against a manually constructed gold standard (73.20% preds-only f-score). We also extract Spanish lexical resources: 4090 semantic form types with 98 frame types. Subcategorised prepositions and particles are included in the fr...
http://doras.dcu.ie/16174/
German particle verbs and pleonastic prepositions
(2006)
Rehbein, Ines; van Genabith, Josef
Abstract:
This paper discusses the behaviour of German particle verbs formed by two-way prepositions in combination with pleonastic PPs including the verb particle as a preposition. These particle verbs have a characteristic feature: some of them license directional prepositional phrases in the accusative, some only allow for locative PPs in the dative, and some particle verbs can occur with PPs in the accusative and in the dative. Directional particle verbs together with directional PPs present an additional problem: the particle and the preposition in the PP seem to provide redundant information. The paper gives an overview of the semantic verb classes influencing this phenomenon, based on corpus data, and explains the underlying reasons for the behaviour of the particle verbs. We also show how the restrictions on particle verbs and pleonastic PPs can be expressed in a grammar theory like Lexical Functional Grammar (LFG).
http://doras.dcu.ie/16172/
Treebank-based acquisition of LFG parsing resources for French
(2008)
Schluter, Natalie; van Genabith, Josef
Abstract:
Motivated by the expense in time and other resources to produce hand-crafted grammars, there has been increased interest in automatically obtained wide-coverage grammars from treebanks for natural language processing. In particular, recent years have seen the growth in interest in automatically obtained deep resources that can represent information absent from simple CFG-type structured treebanks and which are considered to produce more language-neutral linguistic representations, such as dependency syntactic trees. As is often the case in early pioneering work on natural language processing, English has provided the focus of first efforts towards acquiring deep-grammar resources, followed by successful treatments of, for example, German, Japanese, Chinese and Spanish. However, no comparable large-scale automatically acquired deep-grammar resources have been obtained for French to date. The goal of this paper is to present the application of treebank-based language acquisition to th...
http://doras.dcu.ie/16176/
Automatic extraction of Arabic multiword expressions
(2010)
Attia, Mohammed; Tounsi, Lamia; Pecina, Pavel; van Genabith, Josef; Toral, Antonio
Abstract:
In this paper we investigate the automatic acquisition of Arabic Multiword Expressions (MWE). We propose three complementary approaches to extract MWEs from available data resources. The first approach relies on the correspondence asymmetries between Arabic Wikipedia titles and titles in 21 different languages. The second approach collects English MWEs from Princeton WordNet 3.0, translates the collection into Arabic using Google Translate, and utilizes different search engines to validate the output. The third uses lexical association measures to extract MWEs from a large unannotated corpus. We experimentally explore the feasibility of each approach and measure the quality and coverage of the output against gold standards.
http://doras.dcu.ie/16155/
LFG without C-structures
(2010)
Cetinoglu, Ozlem; Foster, Jennifer; Nivre, Joakim; Hogan, Deirdre; Cahill, Aoife; van Genabith, Josef
Abstract:
We explore the use of two dependency parsers, Malt and MST, in a Lexical Functional Grammar parsing pipeline. We compare this to the traditional LFG parsing pipeline which uses constituency parsers. We train the dependency parsers not on classical LFG f-structures but rather on modified dependency-tree versions of these in which all words in the input sentence are represented and multiple heads are removed. For the purposes of comparison, we also modify the existing CFG-based LFG parsing pipeline so that these "LFG-inspired" dependency trees are produced. We find that the differences in parsing accuracy over the various parsing architectures are small.
http://doras.dcu.ie/15982/
Dependency parsing resources for French: Converting acquired lexical functional grammar F-Structure annotations and parsing F-Structures directly
(2009)
Schluter, Natalie; van Genabith, Josef
Abstract:
Recent years have seen considerable success in the generation of automatically obtained wide-coverage deep grammars for natural language processing, given reliable and large CFG-like treebanks. For research within the Lexical Functional Grammar framework, these deep grammars are typically based on an extended PCFG parsing scheme from which dependencies are extracted. However, increasing success in statistical dependency parsing suggests that such deep grammar approaches to statistical parsing could be streamlined. We explore this novel approach to deep grammar parsing within the framework of LFG in this paper, for French, showing that best results (an f-score of 69.46) for the established integrated architecture may be obtained for French.
http://doras.dcu.ie/16170/