A Linear Programming Formulation for Global Inference in Natural Language Tasks
In Proceedings of CoNLL-2004, 2004
Abstract

Cited by 149 (40 self)
The typical processing paradigm in natural language processing is the "pipeline" approach, where learners are used at one level, their outcomes are used as features for a second level of predictions, and so on. In addition to accumulating errors, it is clear that this sequential processing is a crude approximation to a process in which interactions occur across levels and downstream decisions often interact with previous decisions. This work develops a general...
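A toy sketch of the pipeline paradigm this abstract critiques may help; all function names, tagging rules, and data here are hypothetical placeholders, not the paper's system. The point is structural: the second-stage predictor consumes the first stage's (possibly wrong) output as a feature, so upstream errors propagate downstream.

```python
def stage1_pos_tag(token):
    """First-stage predictor: a deliberately crude POS heuristic."""
    return "NOUN" if token[0].isupper() else "VERB"

def stage2_is_entity(token, pos_tag):
    """Second-stage predictor: uses the stage-1 output as a feature."""
    return pos_tag == "NOUN" and len(token) > 2

def pipeline(tokens):
    results = []
    for tok in tokens:
        pos = stage1_pos_tag(tok)  # level-1 decision
        # level-2 decision conditioned on level-1 output:
        results.append((tok, pos, stage2_is_entity(tok, pos)))
    return results

print(pipeline(["Rome", "falls"]))
# -> [('Rome', 'NOUN', True), ('falls', 'VERB', False)]
# A stage-1 mistake (e.g., a capitalized sentence-initial verb tagged NOUN)
# flips the stage-2 decision too: the error accumulation described above.
```

A joint formulation, as the paper proposes, would instead score both decisions together rather than committing to stage 1 first.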
Narrowing the modeling gap: A cluster-ranking approach to coreference resolution
 Journal of Artificial Intelligence Research
Abstract

Cited by 11 (2 self)
Traditional learning-based coreference resolvers operate by training the mention-pair model for determining whether two mentions are coreferent or not. Though conceptually simple and easy to understand, the mention-pair model is linguistically rather unappealing and lags far behind the heuristic-based coreference models proposed in the pre-statistical NLP era in terms of sophistication. Two independent lines of recent research have attempted to improve the mention-pair model, one by acquiring the mention-ranking model to rank preceding mentions for a given anaphor, and the other by training the entity-mention model to determine whether a preceding cluster is coreferent with a given mention. We propose a cluster-ranking approach to coreference resolution, which combines the strengths of the mention-ranking model and the entity-mention model, and is therefore theoretically more appealing than both of these models. In addition, we seek to improve cluster rankers via two extensions: (1) lexicalization and (2) incorporating knowledge of anaphoricity by jointly modeling anaphoricity determination and coreference resolution. Experimental results on the ACE data sets demonstrate the superior performance of cluster rankers over competing approaches, as well as the effectiveness of our two extensions.
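A minimal sketch of the cluster-ranking idea described above: for each new mention, score every preceding cluster as a whole and attach the mention to the best-scoring cluster (or start a new one). The scoring function here is a hand-written head-word-match stand-in for the learned model the paper trains; all specifics are illustrative assumptions.

```python
def score(cluster, mention):
    """Toy cluster-level score: head-word overlap with any cluster member.
    A real cluster ranker would learn this from cluster-level features."""
    return sum(1 for m in cluster if m.split()[-1] == mention.split()[-1])

def cluster_rank(mentions):
    clusters = []
    for mention in mentions:
        if clusters:
            # Rank all preceding clusters for this mention.
            best = max(clusters, key=lambda c: score(c, mention))
            if score(best, mention) > 0:  # coreferent: join the best cluster
                best.append(mention)
                continue
        clusters.append([mention])  # discourse-new: start a fresh cluster
    return clusters

print(cluster_rank(["Barack Obama", "the president", "Obama"]))
# -> [['Barack Obama', 'Obama'], ['the president']]
```

Unlike the mention-pair model, the decision here conditions on the whole partially-built cluster, which is what makes the entity-mention/cluster-ranking line of work more expressive.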
Effective use of phrases in language modeling to improve information retrieval
2004 Symposium on AI & Math, Special Session on Intelligent Text Processing, 2004
Abstract

Cited by 3 (1 self)
Traditional information retrieval models treat the query as a bag of words, assuming that the occurrence of each query term is independent of the positions and occurrences of others. Several of these traditional models have been extended to incorporate positional information, most often through the inclusion of phrases. This has shown improvements in effectiveness on large, modern test collections. The language modeling approach to information retrieval is attractive because it provides a well-studied theoretical framework that has been successful in other fields. Incorporating positional information into language models is intuitive and has shown significant improvements in several language-modeling applications. However, attempts to integrate positional information into the language-modeling approach to IR have not shown consistent significant improvements. This paper provides a broader exploration of this problem. We apply the backoff technique to incorporate a bigram phrase language model with the traditional unigram one and compare its performance to an interpolation of a conditional bigram model with the unigram model. While this novel application of backoff does not improve effectiveness, we find that our formula for interpolating a conditional bigram model with the unigram model yields significantly different results from prior work. Namely, it shows an 11% relative improvement in average precision on one query set, while yielding no improvement on the other two.
SUPERVISED CLUSTERING WITH STRUCTURAL SVMs
, 2009
Abstract
Supervised clustering is the problem of training clustering methods to produce desirable clusterings. Given sets of items and complete clusterings over these sets, a supervised clustering algorithm learns how to cluster future sets of items in a similar fashion, typically by changing the underlying similarity measure between item pairs. This work presents a general approach for training clustering methods such as correlation clustering and k-means/spectral clustering to optimize task-specific performance criteria using structural SVMs. We empirically and theoretically analyze our supervised clustering approach on a variety of datasets and clustering methods. This analysis also leads to general insights about structural SVMs beyond supervised clustering. Specifically, since clustering is an NP-hard task and the corresponding training problem must likewise make use of approximate inference during parameter training, we present a detailed theoretical and empirical analysis of the general use of approximations in structural SVM training.
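The correlation-clustering side of this setup can be sketched with a greedy heuristic: given a pairwise similarity function (which the structural-SVM training would learn; here a fixed stand-in), place each item in the cluster whose members it resembles most on average, if that average is positive. The greedy rule is an approximation to the NP-hard objective, echoing the abstract's point about approximate inference; the similarity function and data are illustrative assumptions.

```python
def greedy_correlation_cluster(items, sim):
    """Greedy approximation to correlation clustering under pairwise sim."""
    clusters = []
    for x in items:
        best, best_score = None, 0.0
        for c in clusters:
            s = sum(sim(x, y) for y in c) / len(c)  # mean similarity to cluster
            if s > best_score:
                best, best_score = c, s
        if best is not None:
            best.append(x)       # positive net similarity: merge in
        else:
            clusters.append([x])  # otherwise start a singleton cluster
    return clusters

# Stand-in learned similarity: positive for numbers within 1.5 of each other.
sim = lambda a, b: 1.5 - abs(a - b)
print(greedy_correlation_cluster([1, 2, 10, 11], sim))
# -> [[1, 2], [10, 11]]
```

Supervised clustering, as described above, would adjust the parameters inside `sim` so that this procedure reproduces the training clusterings.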
A Linear Programming Formulation for Global Inference in Natural Language Tasks
Abstract
Given a collection of discrete random variables representing outcomes of learned local predictors in natural language, e.g., named entities and relations, we seek an optimal global assignment to the variables in the presence of general (non-sequential) constraints. Examples of these constraints include the type of arguments a relation can take, the mutual activity of different relations, etc. We develop a linear programming formulation for this problem and evaluate it in the context of simultaneously learning named entities and relations. Our approach allows us to efficiently incorporate domain- and task-specific constraints at decision time, resulting in significant improvements in the accuracy and the "human-like" quality of the inferences.
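A tiny sketch of this global-inference setup: local predictors supply per-variable scores, and we pick the joint assignment maximizing the total score subject to a declarative constraint (here: a "born_in" relation requires a PERSON first argument and a LOCATION second argument). The paper solves this with a linear program; the exhaustive search below over a toy label space illustrates the same objective, and all scores and labels are made up for the example.

```python
from itertools import product

entity_labels = ["PERSON", "LOCATION"]
relation_labels = ["born_in", "none"]

# Hypothetical local-predictor scores for two entities and one relation.
e1_scores = {"PERSON": 0.2, "LOCATION": 0.1}    # weak local evidence
e2_scores = {"PERSON": 0.3, "LOCATION": 0.25}   # weakly prefers PERSON
rel_scores = {"born_in": 0.9, "none": 0.1}      # strong evidence for born_in

def consistent(e1, e2, rel):
    """Constraint: born_in(e1, e2) requires e1 = PERSON and e2 = LOCATION."""
    return rel != "born_in" or (e1 == "PERSON" and e2 == "LOCATION")

best = max(
    (a for a in product(entity_labels, entity_labels, relation_labels)
     if consistent(*a)),
    key=lambda a: e1_scores[a[0]] + e2_scores[a[1]] + rel_scores[a[2]],
)
print(best)
# The strong relation evidence overrides e2's weak local PERSON preference,
# which is exactly the cross-decision interaction global inference captures.
```

The LP formulation replaces this enumeration with indicator variables and linear constraints, so it scales to realistic numbers of entities and relations.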
Presented at NIPS'04 workshop on Learning with Structured Outputs. Piecewise Training with Parameter Independence Diagrams: Comparing Globally and Locally-trained
Abstract
We present a diagrammatic formalism and practical methods for introducing additional independence assumptions into parameter estimation, enabling efficient training of undirected graphical models in locally-normalized pieces. On two real-world data sets we demonstrate our locally-trained linear-chain CRFs outperforming traditional CRFs: training in less than one-fifth the time, and providing a statistically significant gain in accuracy.
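The piecewise idea can be illustrated on a two-node chain: a globally-normalized CRF likelihood requires a partition function over all joint label sequences, while the piecewise objective normalizes each factor locally, so the objective decomposes into independent per-factor terms (which is what makes the pieces fast to train separately). The log-potentials below are arbitrary illustrative numbers, not anything from the paper.

```python
import math
from itertools import product

labels = [0, 1]
# Log-potentials for a 2-node chain: one node factor per position, one edge factor.
node = [{0: 0.5, 1: -0.5}, {0: -0.2, 1: 0.2}]
edge = {(0, 0): 0.3, (0, 1): -0.3, (1, 0): -0.3, (1, 1): 0.3}

def global_log_lik(y):
    """Standard CRF log-likelihood: normalize over all joint sequences."""
    score = lambda a: node[0][a[0]] + node[1][a[1]] + edge[a]
    log_z = math.log(sum(math.exp(score(a)) for a in product(labels, labels)))
    return score(tuple(y)) - log_z

def piecewise_log_lik(y):
    """Piecewise surrogate: each factor normalized locally, terms independent."""
    total = 0.0
    for i in range(2):  # node pieces
        total += node[i][y[i]] - math.log(sum(math.exp(node[i][l]) for l in labels))
    total += edge[tuple(y)] - math.log(   # edge piece
        sum(math.exp(edge[a]) for a in product(labels, labels)))
    return total

y = (0, 1)
print(global_log_lik(y), piecewise_log_lik(y))
# The piecewise objective lower-bounds the global one (the product of local
# partition functions over-counts inconsistent assignments, so it upper-bounds
# the true partition function), which is why training the pieces
# independently is a sensible, much cheaper surrogate.
```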
Wen-tau Yih, Machine Learning and Applied Statistics Group
Abstract
Natural language decisions often involve assigning values to sets of variables, representing low-level decisions and context-dependent disambiguation. In most cases there are complex relationships among these variables, representing dependencies that range from simple statistical correlations to those that are constrained by deeper structural, relational and semantic properties of the text. In this work we study a specific instantiation of this problem in the context of identifying named entities and relations between them in free-form text. Given a collection of discrete random variables representing outcomes of learned local predictors for entities and relations, we seek an optimal global assignment to the variables that respects multiple constraints, including constraints on the type of arguments a relation can take and the mutual activity of different relations. We develop a linear programming formulation to address this global inference problem and evaluate it in the context of simultaneously learning named entities and relations. We show that global inference improves stand-alone learning; in addition, our approach allows us to efficiently incorporate expressive domain- and task-specific constraints at decision time, resulting not only in significant improvements in accuracy but also in "coherent" quality of the inference.