Results 11 - 20
of
52
Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing
"... We show how web mark-up can be used to improve unsupervised dependency parsing. Starting from raw bracketings of four common HTML tags (anchors, bold, italics and underlines), we refine approximate partial phrase boundaries to yield accurate parsing constraints. Conversion procedures fall out of our ..."
Abstract
-
Cited by 8 (4 self)
- Add to MetaCart
We show how web mark-up can be used to improve unsupervised dependency parsing. Starting from raw bracketings of four common HTML tags (anchors, bold, italics and underlines), we refine approximate partial phrase boundaries to yield accurate parsing constraints. Conversion procedures fall out of our linguistic analysis of a newly available million-word hyper-text corpus. We demonstrate that derived constraints aid grammar induction by training Klein and Manning’s Dependency Model with Valence (DMV) on this data set: parsing accuracy on Section 23 (all sentences) of the Wall Street Journal corpus jumps to 50.4%, beating previous state-of-theart by more than 5%. Web-scale experiments show that the DMV, perhaps because it is unlexicalized, does not benefit from orders of magnitude more annotated but noisier data. Our model, trained on a single blog, generalizes to 53.3 % accuracy out-of-domain, against the Brown corpus — nearly 10 % higher than the previous published best. The fact that web mark-up strongly correlates with syntactic structure may have broad applicability in NLP. 1
Semisupervised learning of dependency parsers using generalized expectation criteria
- IN PROC. ACL
, 2009
"... In this paper, we propose a novel method for semi-supervised learning of nonprojective log-linear dependency parsers using directly expressed linguistic prior knowledge (e.g. a noun’s parent is often a verb). Model parameters are estimated using a generalized expectation (GE) objective function that ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
In this paper, we propose a novel method for semi-supervised learning of nonprojective log-linear dependency parsers using directly expressed linguistic prior knowledge (e.g. a noun’s parent is often a verb). Model parameters are estimated using a generalized expectation (GE) objective function that penalizes the mismatch between model predictions and linguistic expectation constraints. In a comparison with two prominent “unsupervised” learning methods that require indirect biasing toward the correct syntactic structure, we show that GE can attain better accuracy with as few as 20 intuitive constraints. We also present positive experimental results on longer sentences in multiple languages.
Discriminative Learning and Spanning Tree Algorithms for Dependency Parsing
, 2006
"... In this thesis we develop a discriminative learning method for dependency parsing using
online large-margin training combined with spanning tree inference algorithms. We will
show that this method provides state-of-the-art accuracy, is extensible through the feature
set and can be implemented effici ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
In this thesis we develop a discriminative learning method for dependency parsing using
online large-margin training combined with spanning tree inference algorithms. We will
show that this method provides state-of-the-art accuracy, is extensible through the feature
set and can be implemented efficiently. Furthermore, we display the language independent
nature of the method by evaluating it on over a dozen diverse languages as well as show its
practical applicability through integration into a sentence compression system.
We start by presenting an online large-margin learning framework that is a generaliza-
tion of the work of Crammer and Singer [34, 37] to structured outputs, such as sequences
and parse trees. This will lead to the heart of this thesis – discriminative dependency pars-
ing. Here we will formulate dependency parsing in a spanning tree framework, yielding
efficient parsing algorithms for both projective and non-projective tree structures. We will
then extend the parsing algorithm to incorporate features over larger substructures with-
out an increase in computational complexity for the projective case. Unfortunately, the
non-projective problem then becomes NP-hard so we provide structurally motivated ap-
proximate algorithms. Having defined a set of parsing algorithms, we will also define a
rich feature set and train various parsers using the online large-margin learning framework.
We then compare our trained dependency parsers to other state-of-the-art parsers on 14
diverse languages: Arabic, Bulgarian, Chinese, Czech, Danish, Dutch, English, German,
Japanese, Portuguese, Slovene, Spanish, Swedish and Turkish.
Having built an efficient and accurate discriminative dependency parser, this thesis will
then turn to improving and applying the parser. First we will show how additional re-
sources can provide useful features to increase parsing accuracy and to adapt parsers to
new domains. We will also argue that the robustness of discriminative inference-based
learning algorithms lend themselves well to dependency parsing when feature representa-
tions or structural constraints do not allow for tractable parsing algorithms. Finally, we
integrate our parsing models into a state-of-the-art sentence compression system to show
its applicability to a real world problem.
Confidence driven unsupervised semantic parsing
- In Proc. of the Meeting of Association for Computational Linguistics (ACL
, 2011
"... Current approaches for semantic parsing take a supervised approach requiring a considerable amount of training data which is expensive and difficult to obtain. This supervision bottleneck is one of the major difficulties in scaling up semantic parsing. We argue that a semantic parser can be trained ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
Current approaches for semantic parsing take a supervised approach requiring a considerable amount of training data which is expensive and difficult to obtain. This supervision bottleneck is one of the major difficulties in scaling up semantic parsing. We argue that a semantic parser can be trained effectively without annotated data, and introduce an unsupervised learning algorithm. The algorithm takes a self training approach driven by confidence estimation. Evaluated over Geoquery, a standard dataset for this task, our system achieved 66 % accuracy, compared to 80 % of its fully supervised counterpart, demonstrating the promise of unsupervised approaches for this task. 1
A Look at Parsing and Its Applications
- In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06
, 2006
"... This paper provides a brief introduction to recent work in statistical parsing and its applications. We highlight successes to date, remaining challenges, and promising future work. ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
This paper provides a brief introduction to recent work in statistical parsing and its applications. We highlight successes to date, remaining challenges, and promising future work.
Products of Random Latent Variable Grammars
"... We show that the automatically induced latent variable grammars of Petrov et al. (2006) vary widely in their underlying representations, depending on their EM initialization point. We use this to our advantage, combining multiple automatically learned grammars into an unweighted product model, which ..."
Abstract
-
Cited by 5 (1 self)
- Add to MetaCart
We show that the automatically induced latent variable grammars of Petrov et al. (2006) vary widely in their underlying representations, depending on their EM initialization point. We use this to our advantage, combining multiple automatically learned grammars into an unweighted product model, which gives significantly improved performance over state-ofthe-art individual grammars. In our model, the probability of a constituent is estimated as a product of posteriors obtained from multiple grammars that differ only in the random seed used for initialization, without any learning or tuning of combination weights. Despite its simplicity, a product of eight automatically learned grammars improves parsing accuracy from 90.2 % to 91.8 % on English, and from 80.3 % to 84.5 % on German. 1
Training a parser for machine translation reordering
- In Proc. of EMNLP
, 2011
"... We propose a simple training regime that can improve the extrinsic performance of a parser, given only a corpus of sentences and a way to automatically evaluate the extrinsic quality of a candidate parse. We apply our method to train parsers that excel when used as part of a reordering component in ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
We propose a simple training regime that can improve the extrinsic performance of a parser, given only a corpus of sentences and a way to automatically evaluate the extrinsic quality of a candidate parse. We apply our method to train parsers that excel when used as part of a reordering component in a statistical machine translation system. We use a corpus of weakly-labeled reference reorderings to guide parser training. Our best parsers contribute significant improvements in subjective translation quality while their intrinsic attachment scores typically regress. 1
Recognizing disfluencies in conversational speech
- IEEE Transactions on Audio, Speech and Language Processing
, 2006
"... Abstract—We present a system for modeling disfluency in conversational speech: repairs, fillers, and self-interruption points (IPs). For each sentence, candidate repair analyses are generated by a stochastic tree adjoining grammar (TAG) noisy-channel model. A probabilistic syntactic language model s ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
Abstract—We present a system for modeling disfluency in conversational speech: repairs, fillers, and self-interruption points (IPs). For each sentence, candidate repair analyses are generated by a stochastic tree adjoining grammar (TAG) noisy-channel model. A probabilistic syntactic language model scores the fluency of each analysis, and a maximum-entropy model selects the most likely analysis given the language model score and other features. Fillers are detected independently via a small set of deterministic rules, and IPs are detected by combining the output of repair and filler detection modules. In the recent Rich Transcription Fall 2004 (RT-04F) blind evaluation, systems competed to detect these three forms of disfluency under two input conditions: a best-case scenario of manually transcribed words and a fully automatic case of automatic speech recognition (ASR) output. For all three tasks and on both types of input, our system was the top performer in the evaluation. Index Terms—Disfluency modeling, natural language processing, rich transcription, speech processing. I.
When is Self-Training Effective for Parsing?
"... Self-training has been shown capable of improving on state-of-the-art parser performance (McClosky et al., 2006) despite the conventional wisdom on the matter and several studies to the contrary (Charniak, 1997; Steedman et al., 2003). However, it has remained unclear when and why selftraining is he ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Self-training has been shown capable of improving on state-of-the-art parser performance (McClosky et al., 2006) despite the conventional wisdom on the matter and several studies to the contrary (Charniak, 1997; Steedman et al., 2003). However, it has remained unclear when and why selftraining is helpful. In this paper, we test four hypotheses (namely, presence of a phase transition, impact of search errors, value of non-generative reranker features, and effects of unknown words). From these experiments, we gain a better understanding of why self-training works for parsing. Since improvements from selftraining are correlated with unknown bigrams and biheads but not unknown words, the benefit of self-training appears most influenced by seeing known words in new combinations. 1

