Benchmarking of Statistical Dependency Parsers for French, 2010
Cited by 3 (2 self)
We compare the performance of three statistical parsing architectures on the problem of deriving typed dependency structures for French. The architectures are based on PCFGs with latent variables, graph-based dependency parsing and transition-based dependency parsing, respectively. We also study the influence of three types of lexical information: lemmas, morphological features, and word clusters. The results show that all three systems achieve competitive performance, with a best labeled attachment score over 88%. All three parsers benefit from the use of automatically derived lemmas, while morphological features seem to be less important. Word clusters have a positive effect primarily on the latent variable parser.
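The labeled attachment score mentioned above is the fraction of tokens that receive both the correct head and the correct dependency label. The unlabeled variant checks the head only. A minimal sketch follows, assuming one (head, label) pair per token. The function name, the label strings, and the toy sentence are illustrative and not taken from the paper, and conventions such as whether punctuation is scored are not modeled here.

```python
from typing import List, Tuple

# One arc per token: (index of the head token, dependency label).
# Head index 0 stands for the artificial root (an assumption of this sketch).
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) for one sentence or a concatenated corpus."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty analysis")
    # Unlabeled: only the head has to match.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # Labeled: head and label both have to match.
    las_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return uas_hits / n, las_hits / n


# Toy four-token example with hypothetical labels.
gold = [(2, "det"), (3, "suj"), (0, "root"), (3, "obj")]
pred = [(2, "det"), (3, "obj"), (0, "root"), (2, "obj")]
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f} LAS={las:.2f}")
```

In the toy example, three of four heads are correct but only two tokens also carry the right label, so the labeled score is lower than the unlabeled one.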
LFG without C-structures
Cited by 1 (1 self)
We explore the use of two dependency parsers, Malt and MST, in a Lexical Functional Grammar parsing pipeline. We compare this to the traditional LFG parsing pipeline, which uses constituency parsers. We train the dependency parsers not on classical LFG f-structures but rather on modified dependency-tree versions of these in which all words in the input sentence are represented and multiple heads are removed. For the purposes of comparison, we also modify the existing CFG-based LFG parsing pipeline so that these "LFG-inspired" dependency trees are produced. We find that the differences in parsing accuracy across the various parsing architectures are small.
A Graph Based Method for Building Multilingual Weakly Supervised Dependency Parsers
The structure of a sentence can be seen as a spanning tree in a linguistically augmented graph of syntactic nodes. This paper presents an approach to unlabeled dependency parsing based on this view. The first step marks the chunks and the chunk heads of a given sentence and then identifies the intra-chunk dependency relations. The second step learns to identify the inter-chunk dependency relations. For this, we use an initialization technique based on a measure we call Normalized Conditional Mutual Information (NCMI), in addition to a few linguistic constraints. We present results for Hindi: a precision of 80.83% for sentences shorter than 10 words and 66.71% overall. This is significantly better than the baseline in which random initialization is used.
Generative re-ranking model for dependency parsing of Italian sentences
We present a general framework for dependency parsing of Italian sentences based on a combination of discriminative and generative models. We use a state-of-the-art discriminative model to obtain a k-best list of candidate structures for each test sentence, then use the generative model to compute the probability of each candidate and select the most probable one. We present the details of the specific generative model we employed for the EVALITA’09 task. Results show that using the generative model gains around 1% in labeled accuracy (around 7% error reduction) over the discriminative model.
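The re-ranking step described above can be sketched as an argmax over the k-best list under a generative score. The sketch below is only an illustration: the per-arc probability table, the head-tuple encoding of a candidate, and all function names are hypothetical stand-ins, not the generative model used in the paper.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

# A candidate analysis is a tuple of head indices, one per token
# (token positions are 1-based, head 0 is the artificial root).
Candidate = Tuple[int, ...]

# Hypothetical generative model: independent probability per (head, dependent) arc.
TOY_ARC_PROB: Dict[Tuple[int, int], float] = {
    (0, 2): 0.8, (2, 1): 0.7, (2, 3): 0.6,
    (0, 1): 0.1, (1, 2): 0.3,
}


def toy_log_prob(candidate: Candidate) -> float:
    """Sum log arc probabilities; unseen arcs get a small floor value."""
    return sum(
        math.log(TOY_ARC_PROB.get((head, dep), 1e-6))
        for dep, head in enumerate(candidate, start=1)
    )


def rerank(k_best: Sequence[Candidate],
           log_prob: Callable[[Candidate], float]) -> Candidate:
    """Return the candidate with the highest generative log probability."""
    if not k_best:
        raise ValueError("k-best list is empty")
    return max(k_best, key=log_prob)


# The k-best list as a discriminative parser might order it (best guess first).
k_best = [(0, 1, 2), (2, 0, 2)]
best = rerank(k_best, toy_log_prob)
print(best)
```

Here the second candidate is selected, because its arcs are more probable under the toy model even though the discriminative ranking placed it lower.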
The proper place of men and machines in language technology: Processing Russian without any linguistic knowledge
The paper describes several experiments aimed at designing tools for processing Russian texts, namely for part-of-speech tagging, lemmatisation and syntactic parsing, exploiting exclusively statistical approaches without coding any linguistic rules specifically for Russian. While not claiming any new ground for machine learning research, the results demonstrate that it is possible to create state-of-the-art tools for Russian in a very short time using only machine learning and no hard-coded linguistic knowledge. One of the results of this study is a set of publicly available resources which can be used in standard pipelines for processing Russian. However, the results also demonstrate hidden costs associated with the use of purely statistical methods and the need to integrate linguistic parameters into statistical procedures.

