Template-Based Information Extraction without the Templates
Cited by 6 (0 self)
Standard algorithms for template-based information extraction (IE) require predefined template schemas, and often labeled data, to learn to extract their slot fillers (e.g., an embassy is the Target of a Bombing template). This paper describes an approach to template-based IE that removes this requirement and performs extraction without knowing the template structure in advance. Our algorithm instead learns the template structure automatically from raw text, inducing template schemas as sets of linked events (e.g., bombings include detonate, set off, and destroy events) associated with semantic roles. We also solve the standard IE task, using the induced syntactic patterns to extract role fillers from specific documents. We evaluate on the MUC-4 terrorism dataset and show that we induce template structure very similar to hand-created gold structure, and we extract role fillers with an F1 score of .40, approaching the performance of algorithms that require full knowledge of the templates.
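For readers unfamiliar with the metric reported above, F1 is the harmonic mean of precision and recall. The sketch below is only an illustration: the function name `f1_score` and the input values are assumptions, since the abstract reports the F1 score alone and not the underlying precision and recall.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper.
    print(f1_score(0.5, 0.5))
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a system cannot reach a high F1 by maximizing precision or recall alone.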
A Bayesian Model for Unsupervised Semantic Parsing
Cited by 5 (3 self)
We propose a non-parametric Bayesian model for unsupervised semantic parsing. Following Poon and Domingos (2009), we consider a semantic parsing setting where the goal is to (1) decompose the syntactic dependency tree of a sentence into fragments, (2) assign each of these fragments to a cluster of semantically equivalent syntactic structures, and (3) predict predicate-argument relations between the fragments. We use hierarchical Pitman-Yor processes to model statistical dependencies between meaning representations of predicates and those of their arguments, as well as the clusters of their syntactic realizations. We develop a modification of the Metropolis-Hastings split-merge sampler, resulting in an efficient inference algorithm for the model. The method is experimentally evaluated by using the induced semantic representation for the question answering task in the biomedical domain.
A Database of Narrative Schemas
Cited by 3 (0 self)
This paper describes a new language resource of events and semantic roles that characterize real-world situations. Narrative schemas contain sets of related events (edit and publish), a temporal ordering of the events (edit before publish), and the semantic roles of the participants (authors publish books). This type of world knowledge was central to early research in natural language understanding. Scripts were one of the main formalisms, representing common sequences of events that occur in the world. Unfortunately, most of this knowledge was hand-coded and time-consuming to create. Current machine learning techniques, as well as a new approach to learning through coreference chains, have allowed us to automatically extract rich event structure from open domain text in the form of narrative schemas. The narrative schema resource described in this paper contains approximately 5000 unique events combined into schemas of varying sizes. We describe the resource, how it is learned, and a new evaluation of the coverage of these schemas over unseen documents.
Extracting Action and Event Semantics from Web Text
Cited by 2 (1 self)
Most information extraction research identifies the state of the world in text, including the entities and the relationships that exist between them. Much less attention has been paid to the understanding of dynamics, or how the state of the world changes over time. Because intelligent behavior seeks to change the state of the world in rational and utility-maximizing ways, common-sense knowledge about dynamics is essential for intelligent agents. In this paper, we describe a novel system, PREPOST, that tackles the problem of extracting the preconditions and effects of actions and events, two important kinds of knowledge for connecting world state and the actions that affect it. In experiments on Web text, PREPOST is able to improve by 79% over a baseline technique for identifying the effects of actions (64% improvement for preconditions).
Assessing the Role of Discourse References in Entailment Inference
Cited by 1 (1 self)
Discourse references, notably coreference and bridging, play an important role in many text understanding applications, but their impact on textual entailment is yet to be systematically understood. On the basis of an in-depth analysis of entailment instances, we argue that discourse references have the potential of substantially improving textual entailment recognition, and identify a number of research directions towards this goal.
Extracting STRIPS Representations of Actions and Events
In RANLP, 2011
Cited by 1 (1 self)
Knowledge about how the world changes over time is a vital component of commonsense knowledge for Artificial Intelligence (AI) and natural language understanding. Actions and events are fundamental components of any knowledge about changes in the state of the world: the states before and after an event differ in regular and predictable ways. We describe a novel system that tackles the problem of extracting knowledge from text about how actions and events change the world over time. We leverage standard language-processing tools, like semantic role labelers and coreference resolvers, as well as large-corpus statistics like pointwise mutual information, to identify STRIPS representations of actions and events, a type of representation commonly used in AI planning systems. In experiments on Web text, our extractor’s Area under the Curve (AUC) improves by more than 31% over the closest system from the literature for identifying the preconditions and add effects of actions. In addition, we extract significant aspects of STRIPS representations that are missing from previous work, including delete effects and arguments.
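The abstract above names pointwise mutual information (PMI) as one of its large-corpus statistics. As a minimal sketch of the standard definition, PMI(x, y) = log2(P(x, y) / (P(x) P(y))), the function below computes it from raw co-occurrence counts. The function name `pmi`, its parameters, and the toy counts are assumptions for illustration; the abstract does not say how the authors compute or apply the statistic.

```python
import math


def pmi(pair_count: int, x_count: int, y_count: int, total: int) -> float:
    """Pointwise mutual information in bits, estimated from raw counts.

    pair_count: number of observations containing both x and y
    x_count, y_count: number of observations containing x (resp. y)
    total: total number of observations
    """
    if min(pair_count, x_count, y_count, total) <= 0:
        raise ValueError("all counts must be positive")
    # P(x, y) / (P(x) * P(y)) simplifies to pair_count * total / (x_count * y_count).
    return math.log2(pair_count * total / (x_count * y_count))


if __name__ == "__main__":
    # Hypothetical counts: two phrases seen together 40 times in 1000 sentences.
    print(pmi(pair_count=40, x_count=100, y_count=100, total=1000))
```

A positive value means the two items co-occur more often than independence would predict, and a value of zero means they co-occur exactly as often as chance.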
Carnegie Mellon, 2010
We present the PICTOR browser, a visualization designed to facilitate the analysis of quotations about user-specified topics in large collections of news text. PICTOR focuses on quotations because they are a major vehicle of communication in the news genre. It extracts quotes from articles that match a user’s text query, and groups these quotes into “threads” that illustrate the development of subtopics over time. It allows users to rapidly explore the space of relevant quotes by viewing their content and speakers, to examine the contexts in which quotes appear, and to tune how threads are constructed. We offer two case studies demonstrating how PICTOR can support a richer understanding of news events.
Mining Commonsense Knowledge From Personal Stories in Internet Weblogs
Recent advances in automated knowledge base construction have created new opportunities to address one of the hardest challenges in Artificial Intelligence: automated commonsense reasoning. In this paper, we describe our recent efforts in mining commonsense knowledge from the personal stories that people write about their lives in their Internet weblogs. We summarize three preliminary investigations that involve the application of statistical natural language processing techniques to corpora of millions of weblog stories, and outline our current approach to solving a number of outstanding technical challenges.
Automatically Producing Plot Unit Representations for Narrative Text
In the 1980s, plot units were proposed as a conceptual knowledge structure for representing and summarizing narrative stories. Our research explores whether current NLP technology can be used to automatically produce plot unit representations for narrative text. We create a system called AESOP that exploits a variety of existing resources to identify affect states and applies “projection rules” to map the affect states onto the characters in a story. We also use corpus-based techniques to generate a new type of affect knowledge base: verbs that impart positive or negative states onto their patients (e.g., being eaten is an undesirable state, but being fed is a desirable state). We harvest these “patient polarity verbs” from a Web corpus using two techniques: co-occurrence with Evil/Kind Agent patterns, and bootstrapping over conjunctions of verbs. We evaluate the plot unit representations produced by our system on a small collection of Aesop’s fables.
Coreference Based Event-Argument Relation Extraction on Biomedical Text
This paper presents a new approach that exploits coreference information to extract event-argument (E-A) relations from biomedical documents. This approach has two advantages: (1) it can extract a large number of valuable E-A relations based on the concept of salience in discourse (Grosz et al., 1995); (2) it enables us to identify E-A relations over sentence boundaries (cross-links) using transitivity involving coreference relations. We propose two coreference-based models: a pipeline based on Support Vector Machine (SVM) classifiers, and a joint Markov Logic Network (MLN). We show the effectiveness of these models on a biomedical event corpus. Both models outperform the systems without coreference information. Comparing the two models, the joint MLN outperforms the pipeline SVM when gold coreference information is used.
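One plausible reading of the transitivity idea in the abstract above: if an event takes a mention as its argument within a sentence, and that mention corefers with an antecedent elsewhere, the antecedent becomes a candidate cross-sentence argument of the event. The sketch below illustrates only that reading with plain set operations. The function name `cross_sentence_relations`, the tuple-based data layout, and the example strings are all hypothetical, and this is not the paper's SVM pipeline or MLN model.

```python
from typing import Iterable, Set, Tuple

Relation = Tuple[str, str]  # (event, argument mention)


def cross_sentence_relations(
    ea_relations: Set[Relation],
    coref_pairs: Iterable[Tuple[str, str]],
) -> Set[Relation]:
    """Propose (event, antecedent) pairs by following coreference links.

    coref_pairs holds (mention, antecedent) pairs. Relations already present
    in ea_relations are not proposed again.
    """
    # Index the antecedents of each mention for quick lookup.
    antecedents = {}
    for mention, antecedent in coref_pairs:
        antecedents.setdefault(mention, set()).add(antecedent)

    inferred = set()
    for event, mention in ea_relations:
        for antecedent in antecedents.get(mention, ()):
            candidate = (event, antecedent)
            if candidate not in ea_relations:
                inferred.add(candidate)
    return inferred


if __name__ == "__main__":
    # Hypothetical example: "it" is the argument of "inhibits" and corefers
    # with a protein mentioned in an earlier sentence.
    relations = {("inhibits", "it")}
    coref = [("it", "protein X")]
    print(cross_sentence_relations(relations, coref))
```

The function follows a single coreference link per mention; handling longer coreference chains would require first computing the closure of the coreference pairs.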

