Results 1 - 10
of
12
CoNLL-X shared task on multilingual dependency parsing
- In Proc. of CoNLL
, 2006
"... Each year the Conference on Computational Natural Language Learning (CoNLL) 1 features a shared task, in which participants train and test their systems on exactly the same data sets, in order to better compare systems. The tenth CoNLL (CoNLL-X) saw a shared task on Multilingual Dependency Parsing. ..."
Abstract
-
Cited by 161 (2 self)
- Add to MetaCart
Each year the Conference on Computational Natural Language Learning (CoNLL) 1 features a shared task, in which participants train and test their systems on exactly the same data sets, in order to better compare systems. The tenth CoNLL (CoNLL-X) saw a shared task on Multilingual Dependency Parsing. In this paper, we describe how treebanks for 13 languages were converted into the same dependency format and how parsing performance was measured. We also give an overview of the parsing approaches that participants took and the results that they achieved. Finally, we try to draw general conclusions about multi-lingual parsing: What makes a particular language, treebank or annotation scheme easier or harder to parse and which phenomena are challenging for any dependency parser? Acknowledgement Many thanks to Amit Dubey and Yuval Krymolowski, the other two organizers of the shared task, for discussions, converting treebanks, writing software and helping with the papers. 2
Maltparser: A language-independent system for data-driven dependency parsing
- In Proc. of the Fourth Workshop on Treebanks and Linguistic Theories
, 2005
"... ..."
Incrementality in deterministic dependency parsing
- In Proceedings of the Workshop on Incremental Parsing (ACL
, 2004
"... Deterministic dependency parsing is a robust and efficient approach to syntactic parsing of unrestricted natural language text. In this paper, we analyze its potential for incremental processing and conclude that strict incrementality is not achievable within this framework. However, we also show th ..."
Abstract
-
Cited by 25 (6 self)
- Add to MetaCart
Deterministic dependency parsing is a robust and efficient approach to syntactic parsing of unrestricted natural language text. In this paper, we analyze its potential for incremental processing and conclude that strict incrementality is not achievable within this framework. However, we also show that it is possible to minimize the number of structures that require nonincremental processing by choosing an optimal parsing algorithm. This claim is substantiated with experimental evidence showing that the algorithm achieves incremental parsing for 68.9% of the input when tested on a random sample of Swedish text. When restricted to sentences that are accepted by the parser, the degree of incrementality increases to 87.9%. 1
Talbanken05: A Swedish treebank with phrase structure and dependency annotation
- In Proceedings of the fifth international conference on Language Resources and Evaluation (LREC2006
"... We introduce Talbanken05, a Swedish treebank based on a syntactically annotated corpus from the 1970s, Talbanken76, converted to modern formats. The treebank is available in three different formats, besides the original one: two versions of phrase structure annotation and one dependency-based annota ..."
Abstract
-
Cited by 9 (3 self)
- Add to MetaCart
We introduce Talbanken05, a Swedish treebank based on a syntactically annotated corpus from the 1970s, Talbanken76, converted to modern formats. The treebank is available in three different formats, besides the original one: two versions of phrase structure annotation and one dependency-based annotation, all of which are encoded in XML. In this paper, we describe the conversion process and exemplify the available formats. The treebank is freely available for research and educational purposes. 1.
Discriminative Classifiers for Deterministic Dependency Parsing
- In Proceedings of the 44th AnnualMeeting of the Association for ComputationalLinguistics (ACL
, 2006
"... Deterministic parsing guided by treebankinduced classifiers has emerged as a simple and efficient alternative to more complex models for data-driven parsing. We present a systematic comparison of memory-based learning (MBL) and support vector machines (SVM) for inducing classifiers for deterministic ..."
Abstract
-
Cited by 8 (2 self)
- Add to MetaCart
Deterministic parsing guided by treebankinduced classifiers has emerged as a simple and efficient alternative to more complex models for data-driven parsing. We present a systematic comparison of memory-based learning (MBL) and support vector machines (SVM) for inducing classifiers for deterministic dependency parsing, using data from Chinese, English and Swedish, together with a variety of different feature models. The comparison shows that SVM gives higher accuracy for richly articulated feature models across all languages, albeit with considerably longer training times. The results also confirm that classifier-based deterministic parsing can achieve parsing accuracy very close to the best results reported for more complex parsing models. 1
Discriminative Learning and Spanning Tree Algorithms for Dependency Parsing
, 2006
"... In this thesis we develop a discriminative learning method for dependency parsing using
online large-margin training combined with spanning tree inference algorithms. We will
show that this method provides state-of-the-art accuracy, is extensible through the feature
set and can be implemented effici ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
In this thesis we develop a discriminative learning method for dependency parsing using
online large-margin training combined with spanning tree inference algorithms. We will
show that this method provides state-of-the-art accuracy, is extensible through the feature
set and can be implemented efficiently. Furthermore, we display the language independent
nature of the method by evaluating it on over a dozen diverse languages as well as show its
practical applicability through integration into a sentence compression system.
We start by presenting an online large-margin learning framework that is a generaliza-
tion of the work of Crammer and Singer [34, 37] to structured outputs, such as sequences
and parse trees. This will lead to the heart of this thesis – discriminative dependency pars-
ing. Here we will formulate dependency parsing in a spanning tree framework, yielding
efficient parsing algorithms for both projective and non-projective tree structures. We will
then extend the parsing algorithm to incorporate features over larger substructures with-
out an increase in computational complexity for the projective case. Unfortunately, the
non-projective problem then becomes NP-hard so we provide structurally motivated ap-
proximate algorithms. Having defined a set of parsing algorithms, we will also define a
rich feature set and train various parsers using the online large-margin learning framework.
We then compare our trained dependency parsers to other state-of-the-art parsers on 14
diverse languages: Arabic, Bulgarian, Chinese, Czech, Danish, Dutch, English, German,
Japanese, Portuguese, Slovene, Spanish, Swedish and Turkish.
Having built an efficient and accurate discriminative dependency parser, this thesis will
then turn to improving and applying the parser. First we will show how additional re-
sources can provide useful features to increase parsing accuracy and to adapt parsers to
new domains. We will also argue that the robustness of discriminative inference-based
learning algorithms lend themselves well to dependency parsing when feature representa-
tions or structural constraints do not allow for tractable parsing algorithms. Finally, we
integrate our parsing models into a state-of-the-art sentence compression system to show
its applicability to a real world problem.
Tree Transformations in Data-Driven Dependency Parsing
"... Tesnière (1959) is often regarded as the founder of the modern theoretical tradition of dependency grammar, but it is not one well-defined theory. Instead, it comprises several related theories, having some core notions in common. Such a notion is the directed relations between pairs of lexical elem ..."
Abstract
- Add to MetaCart
Tesnière (1959) is often regarded as the founder of the modern theoretical tradition of dependency grammar, but it is not one well-defined theory. Instead, it comprises several related theories, having some core notions in common. Such a notion is the directed relations between pairs of lexical elements, but the formalism and the criteria for
Predictive Text Entry using Syntax and Semantics
"... Most cellular telephones use numeric keypads, where texting is supported by dictionaries and frequency models. Given a key sequence, the entry system recognizes the matching words and proposes a rankordered list of candidates. The ranking quality is instrumental to an effective entry. This paper des ..."
Abstract
- Add to MetaCart
Most cellular telephones use numeric keypads, where texting is supported by dictionaries and frequency models. Given a key sequence, the entry system recognizes the matching words and proposes a rankordered list of candidates. The ranking quality is instrumental to an effective entry. This paper describes a new method to enhance entry that combines syntax and language models. We first investigate components to improve the ranking step: language models and semantic relatedness. We then introduce a novel syntactic model to capture the word context, optimize ranking, and then reduce the number of keystrokes per character (KSPC) needed to write a text. We finally combine this model with the other components and we discuss the results. We show that our syntax-based model reaches an error reduction in KSPC of 12.4 % on a Swedish corpus over a baseline using word frequencies. We also show that bigrams are superior to all the other models. However, bigrams have a memory footprint that is unfit for most devices. Nonetheless, bigrams can be further improved by the addition of syntactic models with an error reduction that reaches 29.4%. 1
Bootstrapping Lexicalized Models in Memory-Based Dependency Parsing
"... Previous research has shown that a lexicalized parsing model incorporating words but no parts-of-speech can outperform a model involving partsof-speech but no words given enough training data for supervised learning. We show that the same effect can be achieved with a bootstrapping approach, where a ..."
Abstract
- Add to MetaCart
Previous research has shown that a lexicalized parsing model incorporating words but no parts-of-speech can outperform a model involving partsof-speech but no words given enough training data for supervised learning. We show that the same effect can be achieved with a bootstrapping approach, where a mixed model trained on a small treebank is used to parse a larger corpus which is used as training data for the lexicalized model. The results are obtained using a memory-based learning algorithm for deterministic dependency parsing of unrestricted Swedish text.
DEPENDENCY PARSING OF SPOKEN SWEDISH
"... The tremendous improvement in robustness and accuracy of natural language parsing that we have witnessed during the last decade has almost exclusively been concerned with the analysis of written texts. The development of equally accurate syntactic parsers for spoken language is one of the greatest c ..."
Abstract
- Add to MetaCart
The tremendous improvement in robustness and accuracy of natural language parsing that we have witnessed during the last decade has almost exclusively been concerned with the analysis of written texts. The development of equally accurate syntactic parsers for spoken language is one of the greatest challenges for the parsing community. In this paper, we report the first results on parsing spoken Swedish with a data-driven dependency parser previously evaluated on written texts from a wide variety of languages. We compare two different algorithms, one restricted to projective dependency structures and one that allows non-projective structures, and compare the results to those obtained for written Swedish using the same methodology. The results show that parsing accuracy is still lower for spoken language than for written language, although part of the difference can be explained by properties of the transcribed spoken corpus used in the experiments. The results also show that the capacity to derive non-projective dependency structures is more crucial for spoken Swedish than for written Swedish.

