Scaling Textual Inference to the Web
Cited by 10 (3 self)
Most Web-based Q/A systems work by finding pages that contain an explicit answer to a question. These systems are helpless if the answer has to be inferred from multiple sentences, possibly on different pages. To solve this problem, we introduce the HOLMES system, which utilizes textual inference (TI) over tuples extracted from text. Whereas previous work on TI (e.g., the literature on textual entailment) has been applied to paragraph-sized texts, HOLMES utilizes knowledge-based model construction to scale TI to a corpus of 117 million Web pages. Given only a few minutes, HOLMES doubles recall for example queries in three disparate domains (geography, business, and nutrition). Importantly, HOLMES’s runtime is linear in the size of its input corpus due to a surprising property of many textual relations in the Web corpus: they are “approximately” functional in a well-defined sense.
Using Wikipedia to Bootstrap Open Information Extraction
Cited by 9 (0 self)
We often use ‘Data Management’ to refer to the manipulation of relational or semi-structured information, but much of the world’s data is unstructured, for example the vast amount of natural-language text on the Web. The ability to manage
Deriving Generalized Knowledge from Corpora using WordNet Abstraction
- Proc. EACL'09
, 2009
Cited by 5 (3 self)
Existing work in the extraction of commonsense knowledge from text has been primarily restricted to factoids that serve as statements about what may possibly obtain in the world. We present an approach to deriving stronger, more general claims by abstracting over large sets of factoids. Our goal is to coalesce the observed nominals for a given predicate argument into a few predominant types, obtained as WordNet synsets. The results can be construed as generically quantified sentences restricting the semantic type of an argument position of a predicate.
Semantic role labeling for open information extraction
- In Proceedings of the First International Workshop on Formalisms and Methodology for Learning by
, 2010
Cited by 3 (1 self)
Open Information Extraction is a recent paradigm for machine reading from arbitrary text. In contrast to existing techniques, which have used only shallow syntactic features, we investigate the use of semantic features (semantic roles) for the task of Open IE. We compare TEXTRUNNER (Banko et al., 2007), a state-of-the-art open extractor, with our novel extractor SRL-IE, which is based on UIUC’s SRL system (Punyakanok et al., 2008). We find that SRL-IE is robust to noisy heterogeneous Web data and outperforms TEXTRUNNER on extraction quality. On the other hand, TEXTRUNNER performs over 2 orders of magnitude faster and achieves good precision in high-locality and high-redundancy extractions. These observations enable the construction of hybrid extractors that output higher-quality results than TEXTRUNNER and quality similar to SRL-IE in much less time.
Discovering commonsense entailment rules implicit in sentences
- In Proc. of the EMNLP 2011 Workshop on Textual Entailment (TextInfer
, 2011
Cited by 1 (1 self)
Reasoning about ordinary human situations and activities requires the availability of diverse types of knowledge, including expectations about the probable results of actions and the lexical entailments for many predicates. We describe initial work to acquire such a collection of conditional (if–then) knowledge by exploiting presuppositional discourse patterns (such as ones involving ‘but’, ‘yet’, and ‘hoping to’) and abstracting the matched material into general rules.
An Analysis of Open Information Extraction based on Semantic Role Labeling
, 2011
Cited by 1 (1 self)
Open Information Extraction extracts relations from text without requiring a pre-specified domain or vocabulary. While existing techniques have used only shallow syntactic features, we investigate the use of semantic role labeling techniques for the task of Open IE. Semantic role labeling (SRL) and Open IE, although developed mostly in isolation, are quite related. We compare SRL-based open extractors, which perform computationally expensive, deep syntactic analysis, with TextRunner, an open extractor that uses shallow syntactic analysis but is able to analyze many more sentences in a fixed amount of time and thus exploit corpus-level statistics. Our evaluation answers questions regarding these systems, including: Can SRL extractors, which are trained on PropBank, cope with heterogeneous text found on the Web? Which extractor attains better precision, recall, F-measure, or running time? How does extractor performance vary for binary, n-ary and nested relations? How much do we gain by running multiple extractors? How do we select the optimal extractor given the amount of data, the available time, and the types of extractions desired?
From generic sentences to scripts
One way to tackle the problem of acquiring general world knowledge to support language understanding and commonsense reasoning is to derive this knowledge by direct interpretation of general statements in ordinary language. One of several problems encountered in such an effort is that general statements frequently involve “donkey anaphora”. Here a “dynamic Skolemization” approach is suggested that avoids dynamic semantics and leads naturally to script- or frame-like representations. Introduction: Long-range goals, and the need for world knowledge. A group of us at the University of Rochester are pursuing the ambitious goal of creating a broadly knowledgeable dialog agent that is motivated by curiosity, by vicarious satisfaction
Edinburgh, Scotland, UK. ©2011 The Association for Computational Linguistics. Order copies of this and other ACL proceedings from:
, 2011
Textual inference and paraphrase have attracted a significant amount of attention in recent years. Many NLP tasks, including question answering, information extraction, and text summarization, can be mapped at least partially onto the recognition of textual entailments and the detection of semantic equivalence between texts. Robust and accurate algorithms and resources for inference and
The Same Semantic Relations Link Structurally Different Realizations of Concepts
, 2009
To make sense of an utterance, people identify in its linear linguistic expression the concepts and the connections between them. A concept normally has a lexical realization; connections between concepts often do not, but they are perceived even without the benefit of lexical cues. Making these connections – called semantic relations in the field of natural language processing – relies on the form and structure of linguistic expressions, and the concepts these expressions evoke. This implies two levels: the level of the text, the linguistic expression with its form and (grammatical) structure, and the level of the concepts which the speaker wants to convey. An overview of the literature shows that semantic relations are, for pragmatic reasons, a means to an end – extract information, explain the links between the head of a phrase and its arguments, and so on – and that is why they are analyzed from the perspective of what they link. At the text level, the process of semantic relation analysis is informed by syntactic elements – noun phrases, verbs and their arguments, clauses and so on – thus differentiating semantic relations based on the complexity of the syntactic constructions in
Prontolearn: Unsupervised . . . GENERATION USING PROBABILISTIC METHODS
, 2010
An ontology is a formal, explicit specification of a shared conceptualization [1, 2]. Formalizing an ontology for a domain is a tedious and cumbersome process, constrained by the knowledge acquisition bottleneck (KAB). A large number of text corpora exist that can be used for classification in order to create ontologies that provide better support for the intended parties. In our research we provide a novel unsupervised bottom-up ontology generation method. This method is based on lexico-semantic structures and Bayesian reasoning to expedite the ontology generation process. This process also provides evidence to domain experts who build ontologies using top-down approaches.

