Results 1 – 5 of 5
A Linear Programming Formulation for Global Inference in Natural Language Tasks
In Proceedings of CoNLL-2004, 2004
"... The typical processing paradigm in natural language processing is the "pipeline" approach, where learners are used at one level, their outcomes are used as features for a second level of predictions, and so on. In addition to accumulating errors, it is clear that the sequential ..."
Abstract

Cited by 141 (38 self)
The typical processing paradigm in natural language processing is the "pipeline" approach, where learners are used at one level, their outcomes are used as features for a second level of predictions, and so on. In addition to accumulating errors, it is clear that this sequential processing is a crude approximation of a process in which interactions occur across levels, and downstream decisions often interact with previous decisions. This work develops a general...
Learning and Inference for Information Extraction
, 2005
"... Information extraction is a process that extracts limited semantic concepts from text documents and presents them in an organized way. Unlike several other natural language tasks, information extraction has a direct impact on end-user applications. Despite its importance, information extraction is s ..."
Abstract
Information extraction is a process that extracts limited semantic concepts from text documents and presents them in an organized way. Unlike several other natural language tasks, information extraction has a direct impact on end-user applications. Despite its importance, information extraction is still a difficult task due to the inherent complexity and ambiguity of human languages. Moreover, mutual dependencies between local predictions of the target concepts further increase the difficulty of the task. In order to enhance information extraction technologies, we develop general approaches for two aspects: relational feature generation and global inference with classifiers. It has been quite convincingly argued that relational learning is suitable for training a complicated natural language system. We propose a relational feature generation approach that facilitates relational learning through propositional learning algorithms. In particular, we develop a relational representation language to produce features in a data-driven way. The resulting features capture the relational structures of a given domain, and therefore allow the learning algorithms to effectively learn the relational definitions of target concepts. Although the learned classifier can be used to directly predict the target concepts, conflicts between the labels of different target variables often occur due to imperfect classifiers. We propose
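The feature-generation idea in the abstract above can be illustrated with a minimal sketch: the relational neighborhood of each token (its left and right neighbors and their tags) is flattened into propositional features that an off-the-shelf classifier can consume. All function and feature names here are invented for illustration; they are not the thesis's actual representation language.

```python
# Minimal sketch of propositionalizing relational context.
# Feature names are hypothetical, not from the cited work.

def relational_features(tokens, tags, i):
    """Flatten the relational neighborhood of token i into a feature dict."""
    feats = {"word": tokens[i], "tag": tags[i]}
    if i > 0:                       # left-neighbor relation
        feats["prev_word"] = tokens[i - 1]
        feats["prev_tag"] = tags[i - 1]
    if i < len(tokens) - 1:         # right-neighbor relation
        feats["next_word"] = tokens[i + 1]
        feats["next_tag"] = tags[i + 1]
    return feats

tokens = ["John", "works", "for", "ACME"]
tags = ["NNP", "VBZ", "IN", "NNP"]
print(relational_features(tokens, tags, 0))
```

Each feature dict could then be fed to any propositional learner (e.g., a vectorizer plus a linear classifier), which is the point of the propositionalization step.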
A Linear Programming Formulation for Global Inference in Natural Language Tasks
"... Given a collection of discrete random variables representing outcomes of learned local predictors in natural language, e.g., named entities and relations, we seek an optimal global assignment to the variables in the presence of general (nonsequential) constraints. Examples of these constraints incl ..."
Abstract
Given a collection of discrete random variables representing outcomes of learned local predictors in natural language, e.g., named entities and relations, we seek an optimal global assignment to the variables in the presence of general (non-sequential) constraints. Examples of these constraints include the type of arguments a relation can take, and the mutual activity of different relations, etc. We develop a linear programming formulation for this problem and evaluate it in the context of simultaneously learning named entities and relations. Our approach allows us to efficiently incorporate domain- and task-specific constraints at decision time, resulting in significant improvements in the accuracy and the "human-like" quality of the inferences.
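The global inference problem this abstract describes can be made concrete with a toy example (all labels and scores below are invented): enumerate joint assignments to two entity variables and one relation variable, and pick the highest-scoring assignment that satisfies an argument-type constraint. The paper solves the same optimization at scale with an integer linear program rather than enumeration.

```python
from itertools import product

# Invented local-classifier scores (log-probabilities) for two entities
# and the relation between them.
entity_scores = {
    "E1": {"person": -0.1, "org": -2.5},
    "E2": {"person": -1.8, "org": -0.3},
}
relation_scores = {"works_for": -0.2, "no_rel": -1.2}

# Constraint: works_for requires its arguments to be (person, org).
def consistent(e1, e2, rel):
    return rel != "works_for" or (e1 == "person" and e2 == "org")

# Brute-force global inference: best consistent joint assignment.
best = max(
    (
        (e1, e2, rel)
        for e1, e2, rel in product(
            entity_scores["E1"], entity_scores["E2"], relation_scores
        )
        if consistent(e1, e2, rel)
    ),
    key=lambda a: entity_scores["E1"][a[0]]
    + entity_scores["E2"][a[1]]
    + relation_scores[a[2]],
)
print(best)  # ('person', 'org', 'works_for')
```

Enumeration is exponential in the number of variables; the linear programming formulation exists precisely to make this constrained argmax tractable for realistic sentences.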
Wen-tau Yih, Machine Learning and Applied Statistics Group
"... Natural language decisions often involve assigning values to sets of variables, representing low level decisions and context dependent disambiguation. In most cases there are complex relationships among these variables representing dependencies that range from simple statistical correlations to thos ..."
Abstract
Natural language decisions often involve assigning values to sets of variables, representing low-level decisions and context-dependent disambiguation. In most cases there are complex relationships among these variables, representing dependencies that range from simple statistical correlations to those constrained by deeper structural, relational, and semantic properties of the text. In this work we study a specific instantiation of this problem in the context of identifying named entities and relations between them in free-form text. Given a collection of discrete random variables representing outcomes of learned local predictors for entities and relations, we seek an optimal global assignment to the variables that respects multiple constraints, including constraints on the type of arguments a relation can take, and the mutual activity of different relations. We develop a linear programming formulation to address this global inference problem and evaluate it in the context of simultaneously learning named entities and relations. We show that global inference improves standalone learning; in addition, our approach allows us to efficiently incorporate expressive domain- and task-specific constraints at decision time, resulting not only in significant improvements in accuracy but also in a more "coherent" quality of the inference.