• Documents
  • Authors
  • Tables
  • Other Seers ▼
    RefSeer AckSeer CollabSeer SeerSeer
  • Log in
  • Sign up
  • MetaCart

CiteSeerX logo

Advanced Search Include Citations
Advanced Search Include Citations | Disambiguate

A latent dirichlet allocation method for selectional preferences (2010)

by A Ritter, Mausam, O Etzioni
Venue:In ACL
Add To MetaCart

Tools

Sorted by:
Results 1 - 10 of 30
Next 10 →

A Semi-Supervised Method to Learn and Construct Taxonomies using the Web

by Zornitsa Kozareva, Eduard Hovy
"... Although many algorithms have been developed to harvest lexical resources, few organize the mined terms into taxonomies. We propose (1) a semi-supervised algorithm that uses a root concept, a basic level concept, and recursive surface patterns to learn automatically from the Web hyponym-hypernym pai ..."
Abstract - Cited by 7 (0 self) - Add to MetaCart
Although many algorithms have been developed to harvest lexical resources, few organize the mined terms into taxonomies. We propose (1) a semi-supervised algorithm that uses a root concept, a basic level concept, and recursive surface patterns to learn automatically from the Web hyponym-hypernym pairs subordinated to the root; (2) a Web based concept positioning procedure to validate the learned pairs ’ is-a relations; and (3) a graph algorithm that derives from scratch the integrated taxonomy structure of all the terms. Comparing results with WordNet, we find that the algorithm misses some concepts and links, but also that it discovers many additional ones lacking in WordNet. We evaluate the taxonomization power of our method on reconstructing parts of the WordNet taxonomy. Experiments show that starting from scratch, the algorithm can reconstruct 62 % of the WordNet taxonomy for the regions tested.

Latent variable models of selectional preference

by Diarmuid Ó Séaghdha - In ACL 2010 , 2010
"... This paper describes the application of so-called topic models to selectional preference induction. Three models related to Latent Dirichlet Allocation, a proven method for modelling document-word cooccurrences, are presented and evaluated on datasets of human plausibility judgements. Compared to pr ..."
Abstract - Cited by 7 (0 self) - Add to MetaCart
This paper describes the application of so-called topic models to selectional preference induction. Three models related to Latent Dirichlet Allocation, a proven method for modelling document-word cooccurrences, are presented and evaluated on datasets of human plausibility judgements. Compared to previously proposed techniques, these models perform very competitively, especially for infrequent predicate-argument combinations where they exceed the quality of Web-scale predictions while using relatively little data. 1

A Bayesian Model for Unsupervised Semantic Parsing

by Ivan Titov, Alexandre Klementiev
"... We propose a non-parametric Bayesian model for unsupervised semantic parsing. Following Poon and Domingos (2009), we consider a semantic parsing setting where the goal is to (1) decompose the syntactic dependency tree of a sentence into fragments, (2) assign each of these fragments to a cluster of s ..."
Abstract - Cited by 5 (3 self) - Add to MetaCart
We propose a non-parametric Bayesian model for unsupervised semantic parsing. Following Poon and Domingos (2009), we consider a semantic parsing setting where the goal is to (1) decompose the syntactic dependency tree of a sentence into fragments, (2) assign each of these fragments to a cluster of semantically equivalent syntactic structures, and (3) predict predicate-argument relations between the fragments. We use hierarchical Pitman-Yor processes to model statistical dependencies between meaning representations of predicates and those of their arguments, as well as the clusters of their syntactic realizations. We develop a modification of the Metropolis-Hastings split-merge sampler, resulting in an efficient inference algorithm for the model. The method is experimentally evaluated by using the induced semantic representation for the question answering task in the biomedical domain. 1

Identifying Functional Relations in Web Text

by Thomas Lin, Oren Etzioni
"... Determining whether a textual phrase denotes a functional relation (i.e., a relation that maps each domain element to a unique range element) is useful for numerous NLP tasks such as synonym resolution and contradiction detection. Previous work on this problem has relied on either counting methods o ..."
Abstract - Cited by 4 (3 self) - Add to MetaCart
Determining whether a textual phrase denotes a functional relation (i.e., a relation that maps each domain element to a unique range element) is useful for numerous NLP tasks such as synonym resolution and contradiction detection. Previous work on this problem has relied on either counting methods or lexico-syntactic patterns. However, determining whether a relation is functional, by analyzing mentions of the relation in a corpus, is challenging due to ambiguity, synonymy, anaphora, and other linguistic phenomena. We present the LEIBNIZ system that overcomes these challenges by exploiting the synergy between the Web corpus and freelyavailable knowledge resources such as Freebase. It first computes multiple typed functionality scores, representing functionality of the relation phrase when its arguments are constrained to specific types. It then aggregates these scores to predict the global functionality for the phrase. LEIBNIZ outperforms previous work, increasing area under the precisionrecall curve from 0.61 to 0.88. We utilize LEIBNIZ to generate the first public repository of automatically-identified functional relations.

Identifying Relations for Open Information Extraction

by Anthony Fader, Stephen Soderland, Oren Etzioni , 2011
"... Open Information Extraction (IE) is the task of extracting assertions from massive corpora without requiring a pre-specified vocabulary. This paper shows that the output of state-ofthe-art Open IE systems is rife with uninformative and incoherent extractions. To overcome these problems, we introduce ..."
Abstract - Cited by 4 (2 self) - Add to MetaCart
Open Information Extraction (IE) is the task of extracting assertions from massive corpora without requiring a pre-specified vocabulary. This paper shows that the output of state-ofthe-art Open IE systems is rife with uninformative and incoherent extractions. To overcome these problems, we introduce two simple syntactic and lexical constraints on binary relations expressed by verbs. We implemented the constraints in the REVERB Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TEXTRUNNER and WOE pos. More than 30 % of REVERB’s extractions are at precision 0.8 or higher— compared to virtually none for earlier systems. The paper concludes with a detailed analysis of REVERB’s errors, suggesting directions for future work.

Structured Relation Discovery using Generative Models

by Limin Yao, Aria Haghighi, Sebastian Riedel, Andrew Mccallum
"... We explore unsupervised approaches to relation extraction between two named entities; for instance, the semantic bornIn relation between a person and location entity. Concretely, we propose a series of generative probabilistic models, broadly similar to topic models, each which generates a corpus of ..."
Abstract - Cited by 4 (2 self) - Add to MetaCart
We explore unsupervised approaches to relation extraction between two named entities; for instance, the semantic bornIn relation between a person and location entity. Concretely, we propose a series of generative probabilistic models, broadly similar to topic models, each which generates a corpus of observed triples of entity mention pairs and the surface syntactic dependency path between them. The output of each model is a clustering of observed relation tuples and their associated textual expressions to underlying semantic relation types. Our proposed models exploit entity type constraints within a relation as well as features on the dependency path between entity mentions. We examine effectiveness of our approach via multiple evaluations and demonstrate 12 % error reduction in precision over a state-of-the-art weakly supervised baseline. al., 2010). We propose a series of generative probabilistic models, broadly similar to standard topic models, which generate a corpus of observed triples of entity mention pairs and the surface syntactic dependency path between them. Our proposed models exploit entity type constraints within a relation as well as features on the dependency path between entity mentions. The output of our approach is a clustering over observed relation paths (e.g. “X was born in Y ” and “X is from Y”) such that expressions in the same cluster bear the same semantic relation type between entities. Past work has shown that standard supervised techniques can yield high-performance relation detection when abundant labeled data exists for a fixed inventory of individual relation types (e.g. leaderOf) (Kambhatla, 2004; Culotta and Sorensen, 2004; Roth and tau Yih, 2002). However, less explored are open-domain approaches where the set of possible relation types are not fixed and little to no labeled is given for each relation type (Banko et 1

Automatic Labelling of Topic Models

by Jey Han Lau, Karl Grieser, David Newman
"... We propose a method for automatically labelling topics learned via LDA topic models. We generate our label candidate set from the top-ranking topic terms, titles of Wikipedia articles containing the top-ranking topic terms, and sub-phrases extracted from the Wikipedia article titles. We rank the lab ..."
Abstract - Cited by 3 (0 self) - Add to MetaCart
We propose a method for automatically labelling topics learned via LDA topic models. We generate our label candidate set from the top-ranking topic terms, titles of Wikipedia articles containing the top-ranking topic terms, and sub-phrases extracted from the Wikipedia article titles. We rank the label candidates using a combination of association measures and lexical features, optionally fed into a supervised ranking model. Our method is shown to perform strongly over four independent sets of topics, significantly better than a benchmark method. 1

Machine reading at the university of washington

by Hoifung Poon, Janara Christensen, Pedro Domingos, Oren Etzioni, Raphael Hoffmann, Chloe Kiddon, Thomas Lin, Xiao Ling, Alan Ritter, Stefan Schoenmackers, Stephen Soderland, Dan Weld, Fei Wu, Congle Zhang - IN NAACL WORKSHOP FAM-LBR , 2010
"... Machine reading is a long-standing goal of AI and NLP. In recent years, tremendous progress has been made in developing machine learning approaches for many of its subtasks such as parsing, information extraction, and question answering. However, existing end-to-end solutions typically require subst ..."
Abstract - Cited by 2 (2 self) - Add to MetaCart
Machine reading is a long-standing goal of AI and NLP. In recent years, tremendous progress has been made in developing machine learning approaches for many of its subtasks such as parsing, information extraction, and question answering. However, existing end-to-end solutions typically require substantial amount of human efforts (e.g., labeled data and/or manual engineering), and are not well poised for Web-scale knowledge acquisition. In this paper, we propose a unifying approach for machine reading by bootstrapping from the easiest extractable knowledge and conquering the long tail via a self-supervised learning process. This self-supervision is powered by joint inference based on Markov logic, and is made scalable by leveraging hierarchical structures and coarse-to-fine inference. Researchers at the University of Washington have taken the first steps in this direction. Our existing work explores the wide spectrum of this vision and shows its promise.

Commonsense from the Web: Relation Properties

by Thomas Lin, Oren Etzioni
"... When general purpose software agents fail, it’s often because they’re brittle and need more background commonsense knowledge. In this paper we present relation properties as a valuable type of commonsense knowledge that can be automatically inferred at scale by reading the Web. People base many comm ..."
Abstract - Cited by 2 (2 self) - Add to MetaCart
When general purpose software agents fail, it’s often because they’re brittle and need more background commonsense knowledge. In this paper we present relation properties as a valuable type of commonsense knowledge that can be automatically inferred at scale by reading the Web. People base many commonsense inferences on their knowledge of relation properties such as functionality, transitivity, and others. For example, all people know that bornIn(Year) satisfies the functionality property, meaning that each person can be born in exactly one year. Thus inferences like "Obama was born in 1961, so he was not born in 2008", which computers do not know, are obvious even to children. We demonstrate scalable heuristics for learning relation functionality from noisy Web text that outperform existing approaches to detecting functionality. The heuristics we use address Web NLP challenges that are also common to learning other relation properties, and can be easily transferred. Each relation property we learn for a Web-scale set of relations will enable computers to solve real tasks, and the data from learning many such properties will be a useful addition to general commonsense knowledge bases.

Unsupervised Learning of Selectional Restrictions and Detection of Argument Coercions

by Kirk Roberts, A M. Harabagiu
"... Metonymic language is a pervasive phenomenon. Metonymic type shifting, or argument type coercion, results in a selectional restriction violation where the argument’s semantic class differs from the class the predicate expects. In this paper we present an unsupervised method that learns the selection ..."
Abstract - Cited by 2 (0 self) - Add to MetaCart
Metonymic language is a pervasive phenomenon. Metonymic type shifting, or argument type coercion, results in a selectional restriction violation where the argument’s semantic class differs from the class the predicate expects. In this paper we present an unsupervised method that learns the selectional restriction of arguments and enables the detection of argument coercion. This method also generates an enhanced probabilistic resolution of logical metonymies. The experimental results indicate substantial improvements the detection of coercions and the ranking of metonymic interpretations. 1
The National Science Foundation
  • About CiteSeerX
  • Submit Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2010 The Pennsylvania State University