A Linear Programming Formulation for Global Inference in Natural Language Tasks
In Proceedings of CoNLL-2004, 2004
Cited by 91 (26 self)
The typical processing paradigm in natural language processing is the "pipeline" approach, where learners are used at one level and their outcomes are used as features for a second level of predictions, and so on. In addition to accumulating errors, this sequential processing is clearly a crude approximation to a process in which interactions occur across levels and downstream decisions often interact with previous decisions. This work develops a general...
Effective use of phrases in language modeling to improve information retrieval
In 2004 Symposium on AI & Math, Special Session on Intelligent Text Processing, 2004
Cited by 2 (1 self)
Traditional information retrieval models treat the query as a bag of words, assuming that the occurrence of each query term is independent of the positions and occurrences of others. Several of these traditional models have been extended to incorporate positional information, most often through the inclusion of phrases. This has shown improvements in effectiveness on large, modern test collections. The language modeling approach to information retrieval is attractive because it provides a well-studied theoretical framework that has been successful in other fields. Incorporating positional information into language models is intuitive and has shown significant improvements in several language-modeling applications. However, attempts to integrate positional information into the language-modeling approach to IR have not shown consistent significant improvements. This paper provides a broader exploration of this problem. We apply the backoff technique to incorporate a bigram phrase language model with the traditional unigram one and compare its performance to an interpolation of a conditional bigram model with the unigram model. While this novel application of backoff does not improve effectiveness, we find that our formula for interpolating a conditional bigram model with the unigram model yields significantly different results from prior work. Namely, it shows an 11% relative improvement in average precision on one query set, while yielding no improvement on the other two.
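The interpolation idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formula, so the code below uses a generic linear interpolation of a conditional bigram estimate with a collection-smoothed unigram estimate. The function names and the weights `lam` and `mu` are assumptions made for illustration, not the authors' method or settings.

```python
import math
from collections import Counter

def train_counts(tokens):
    """Collect unigram and bigram counts from a list of tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def p_unigram(w, doc_uni, coll_uni, mu=0.5):
    """Document unigram probability, linearly smoothed with the collection."""
    p_doc = doc_uni[w] / max(sum(doc_uni.values()), 1)
    p_coll = coll_uni[w] / max(sum(coll_uni.values()), 1)
    return (1 - mu) * p_doc + mu * p_coll

def p_interpolated(prev, w, doc_uni, doc_bi, coll_uni, lam=0.3, mu=0.5):
    """Mix the conditional bigram P(w | prev) with the smoothed unigram P(w)."""
    p_bi = doc_bi[(prev, w)] / doc_uni[prev] if doc_uni[prev] else 0.0
    return lam * p_bi + (1 - lam) * p_unigram(w, doc_uni, coll_uni, mu)

def safe_log(p):
    """Natural log that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")

def query_log_likelihood(query, doc, collection, lam=0.3, mu=0.5):
    """Score a non-empty query (list of tokens) against one document."""
    doc_uni, doc_bi = train_counts(doc)
    coll_uni, _ = train_counts(collection)
    score = safe_log(p_unigram(query[0], doc_uni, coll_uni, mu))
    for prev, w in zip(query, query[1:]):
        score += safe_log(
            p_interpolated(prev, w, doc_uni, doc_bi, coll_uni, lam, mu)
        )
    return score

# Toy data (hypothetical): both documents contain the same words,
# but only doc_a contains the query terms in query order.
doc_a = "new york city guide".split()
doc_b = "york new guide city".split()
collection = doc_a + doc_b
query = ["new", "york"]
```

With `lam=0.0` the model reduces to a bag of words and the two toy documents score identically. With a positive `lam`, `doc_a` scores higher because it contains the bigram "new york".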
Global Inference for Entity and Relation Identification via a Linear Programming Formulation
Wen-tau Yih, Machine Learning and Applied Statistics Group
Natural language decisions often involve assigning values to sets of variables, representing low-level decisions and context-dependent disambiguation. In most cases there are complex relationships among these variables, representing dependencies that range from simple statistical correlations to those that are constrained by deeper structural, relational and semantic properties of the text. In this work we study a specific instantiation of this problem in the context of identifying named entities and relations between them in free-form text. Given a collection of discrete random variables representing outcomes of learned local predictors for entities and relations, we seek an optimal global assignment to the variables that respects multiple constraints, including constraints on the type of arguments a relation can take, and the mutual activity of different relations. We develop a linear programming formulation to address this global inference problem and evaluate it in the context of simultaneously learning named entities and relations. We show that global inference improves stand-alone learning; in addition, our approach allows us to efficiently incorporate expressive domain- and task-specific constraints at decision time, yielding, beyond significant improvements in accuracy, a “coherent” quality of inference.
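The global assignment problem described here can be made concrete with a toy sketch. Note the substitution: the abstract describes a linear programming formulation, while the code below simply enumerates all assignments, which is only feasible for tiny instances. The labels, scores, and the single type constraint are hypothetical values chosen for illustration.

```python
import math
from itertools import product

# Hypothetical probabilities from local classifiers for two entities.
entity_scores = [
    {"person": 0.45, "location": 0.55},  # entity 0
    {"person": 0.20, "location": 0.80},  # entity 1
]
# Hypothetical probabilities for the relation between entity 0 and entity 1.
relation_scores = {"born_in": 0.7, "none": 0.3}

# Type constraints: relation label -> required (argument 1 type, argument 2 type).
constraints = {"born_in": ("person", "location")}

def is_consistent(e0, e1, rel):
    """True if the relation's argument types match the entity labels."""
    required = constraints.get(rel)
    return required is None or required == (e0, e1)

def local_inference():
    """Pick each label independently, ignoring the constraints."""
    e0 = max(entity_scores[0], key=entity_scores[0].get)
    e1 = max(entity_scores[1], key=entity_scores[1].get)
    rel = max(relation_scores, key=relation_scores.get)
    return e0, e1, rel

def global_inference():
    """Best-scoring joint assignment among those satisfying the constraints.

    The objective is a sum of log-probabilities, so it is linear in the
    indicator variables an integer linear program would use.
    """
    best, best_score = None, float("-inf")
    for e0, e1, rel in product(entity_scores[0], entity_scores[1], relation_scores):
        if not is_consistent(e0, e1, rel):
            continue
        score = (
            math.log(entity_scores[0][e0])
            + math.log(entity_scores[1][e1])
            + math.log(relation_scores[rel])
        )
        if score > best_score:
            best, best_score = (e0, e1, rel), score
    return best
```

In this toy instance the independent local decisions label entity 0 as a location while also predicting `born_in`, which violates the type constraint. The joint search instead relabels entity 0 as a person so that the whole assignment is coherent.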
In: Proceedings of CoNLL-2000 and LLL-2000, pages 107-110, Lisbon, Portugal, 2000.
In CoNLL, 2000
We study the problem of identifying phrase structure. We formalize it as the problem of combining the outcomes of several different classifiers in a way that provides a coherent inference that satisfies some constraints, and develop two general approaches for it. The first is a Markovian approach that extends standard HMMs to allow the use of a rich observations structure and of general classifiers to model state-observation dependencies. The second is an extension of constraint satisfaction formalisms. We also develop efficient algorithms under both models and study them experimentally in the context of shallow parsing.
1 Identifying Phrase Structure
The problem of identifying phrase structure can be formalized as follows. Given an input string $O = \langle o_1, o_2, \ldots, o_n \rangle$, a phrase is a substring of consecutive input symbols $o_i, o_{i+1}, \ldots, o_j$. Some external mechanism is assumed to consistently (or stochastically) annotate substrings as phrases. Our goal is to come up with a mechanism that, given an input string, identifies the phrases in this string. This is a fundamental task with applications in natural language (Church, 1988; Ramshaw and Marcus, 1995; Muñoz et al., 1999; Cardie and Pierce, 1998).
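One way to picture "combining classifier outcomes under constraints" is the sketch below. It assumes each candidate phrase is a span of token positions with a classifier score, and that the only constraint is that chosen phrases do not overlap. This is an illustrative dynamic program under those assumptions, not the Markovian or constraint satisfaction algorithm developed in the paper; the spans and scores are hypothetical.

```python
def best_phrases(n, candidates):
    """Choose non-overlapping spans with the highest total score.

    n          -- number of tokens in the input string
    candidates -- list of (start, end, score), 0-based inclusive indices
    Returns (total_score, list of chosen (start, end) spans).
    """
    # Group candidate spans by the position where they end.
    ends_at = {}
    for start, end, score in candidates:
        ends_at.setdefault(end, []).append((start, score))

    # best[k] holds the best (score, spans) using only tokens 0..k-1.
    best = [(0.0, [])] * (n + 1)
    for k in range(1, n + 1):
        best[k] = best[k - 1]  # option: token k-1 is outside every phrase
        for start, score in ends_at.get(k - 1, []):
            prev_score, prev_spans = best[start]
            total = prev_score + score
            if total > best[k][0]:
                best[k] = (total, prev_spans + [(start, k - 1)])
    return best[n]

# Hypothetical classifier scores for candidate phrases in a 7-token sentence.
tokens = "the old man saw a big dog".split()
candidates = [
    (0, 2, 0.9),  # "the old man"
    (1, 2, 0.6),  # "old man"
    (0, 1, 0.5),  # "the old"
    (2, 3, 0.3),  # "man saw"
    (4, 6, 0.8),  # "a big dog"
    (5, 6, 0.7),  # "big dog"
]
total, spans = best_phrases(len(tokens), candidates)
```

Here the overlapping candidates compete, and the selection keeps the two spans "the old man" and "a big dog", since no other non-overlapping combination scores higher.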
SUPERVISED CLUSTERING WITH STRUCTURAL SVMs
2009
Supervised clustering is the problem of training clustering methods to produce desirable clusterings. Given sets of items and complete clusterings over these sets, a supervised clustering algorithm learns how to cluster future sets of items in a similar fashion, typically by changing the underlying similarity measure between item pairs. This work presents a general approach for training clustering methods, such as correlation clustering and k-means/spectral clustering, to optimize task-specific performance criteria using structural SVMs. We empirically and theoretically analyze our supervised clustering approach on a variety of datasets and clustering methods. This analysis also leads to general insights about structural SVMs beyond supervised clustering. Specifically, since clustering is an NP-hard task and the corresponding training problem likewise must use approximate inference when training the parameters, we present a detailed theoretical and empirical analysis of the general use of approximations in structural SVM training.

