Ontology-Aware Partitioning for Knowledge Graph Identification
Knowledge graphs provide a powerful representation of entities and the relationships between them, but automatically constructing such graphs from noisy extractions presents numerous challenges. Knowledge graph identification (KGI) is a technique for knowledge graph construction that jointly reasons about entities, attributes and relations in the presence of uncertain inputs and ontological constraints. Although knowledge graph identification shows promise scaling to knowledge graphs built from millions of extractions, increasingly powerful extraction engines may soon require knowledge graphs built from billions of extractions. One tool for scaling is partitioning extractions to allow reasoning to occur in parallel. We explore approaches which leverage ontological information and distributional information in partitioning. We compare these techniques with hash-based approaches, and show that using a richer partitioning model that incorporates the ontology graph and distribution of extractions provides superior results. Our results demonstrate that partitioning can result in order-of-magnitude speedups without reducing model performance.
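The hash-based baseline mentioned in this abstract can be illustrated with a minimal sketch. It assumes extractions are (subject, relation, object) triples and that partitions are keyed on the subject entity. Both choices, along with the names `stable_bucket` and `hash_partition` and the sample triples, are hypothetical and not taken from the paper.

```python
import hashlib
from collections import defaultdict


def stable_bucket(key: str, num_partitions: int) -> int:
    """Map a string key to a partition index with a process-independent hash."""
    # Python's built-in hash() is salted per process for strings, so use a digest.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def hash_partition(extractions, num_partitions):
    """Group (subject, relation, object) triples by the hash of their subject."""
    partitions = defaultdict(list)
    for subj, rel, obj in extractions:
        partitions[stable_bucket(subj, num_partitions)].append((subj, rel, obj))
    return dict(partitions)


# Hypothetical extractions, for illustration only.
extractions = [
    ("kyrgyzstan", "locatedIn", "asia"),
    ("kyrgyzstan", "hasCapital", "bishkek"),
    ("bishkek", "isA", "city"),
    ("asia", "isA", "continent"),
]
parts = hash_partition(extractions, num_partitions=4)
```

Hashing on one key keeps every extraction about a given subject together, but it ignores which relations and labels interact through ontological constraints, which is the information the richer partitioning models described above are said to exploit.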
Convex inference for community discovery in signed networks
In contrast to traditional social networks, signed networks encode both relations of affinity and disagreement. Community discovery in this kind of network has been successfully addressed using the Potts model, which originated in statistical mechanics to explain the magnetic dipole moments of atomic spins. However, due to the computational complexity of finding an exact solution, it has not yet been applied to many real-world networks. We propose a novel approach to compute an approximate solution to the Potts model applied to community discovery, based on a continuous convex relaxation of the original problem using hinge-loss functions. We show empirically the benefits of the proposed method in comparison with loopy belief propagation in terms of the communities discovered. We illustrate the scalability and effectiveness of our approach by applying it to the network of voters of the European Parliament that we crawled for this study. This large-scale and dense network comprises about 300 voting periods in the current term, involving a total of more than 730 voters. Remarkably, the two major communities are those created by the European/anti-European antagonism, rather than the classical right-left antagonism.
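To make the idea of a hinge-loss relaxation concrete, here is a toy sketch for the two-community case. It is an assumption-laden illustration, not the paper's formulation: each node gets a soft membership in [0, 1], positive edges are penalized by |x_i - x_j|, and negative edges by |x_i + x_j - 1|, each written as a sum of two hinge terms. The function names and the example graph are hypothetical.

```python
def hinge(z: float) -> float:
    """Hinge function max(0, z); convex in z."""
    return max(0.0, z)


def relaxed_energy(x, edges):
    """Sum of hinge penalties over signed edges (i, j, sign).

    x maps each node to a soft membership in [0, 1].
    Every term is a hinge of a linear function of x, so the total is convex.
    """
    total = 0.0
    for i, j, sign in edges:
        if sign > 0:
            # Allies should have equal memberships: |x_i - x_j|.
            total += hinge(x[i] - x[j]) + hinge(x[j] - x[i])
        else:
            # Opponents should have complementary memberships: |x_i + x_j - 1|.
            total += hinge(x[i] + x[j] - 1.0) + hinge(1.0 - x[i] - x[j])
    return total


# Hypothetical signed graph: a and b agree, b and c disagree.
edges = [("a", "b", +1), ("b", "c", -1)]
consistent = {"a": 1.0, "b": 1.0, "c": 0.0}
inconsistent = {"a": 1.0, "b": 0.0, "c": 0.0}
energy_ok = relaxed_energy(consistent, edges)
energy_bad = relaxed_energy(inconsistent, edges)
```

In this toy objective the uninformative point with every membership at 0.5 also has zero energy, so a usable formulation would need additional terms or constraints. The sketch only shows how hinge terms keep the relaxed objective convex.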
ClaimEval: Integrated and Flexible Framework for Claim Evaluation Using Credibility of Sources
The World Wide Web (WWW) has become a rapidly growing platform consisting of numerous sources which provide supporting or contradictory information about claims (e.g., “Chicken meat is healthy”). In order to decide whether a claim is true or false, one needs to analyze the content of different sources of information on the Web, measure the credibility of information sources, and aggregate all this information. This is a tedious process, and Web search engines address only part of the overall problem, viz., producing a list of relevant sources. In this paper, we present ClaimEval, a novel and integrated approach which, given a set of claims to validate, extracts a set of pro and con arguments from Web information sources and jointly estimates the credibility of sources and the correctness of claims. ClaimEval uses Probabilistic Soft Logic (PSL), resulting in a flexible and principled framework which makes it easy to state and incorporate different forms of prior knowledge. Through extensive experiments on real-world datasets, we demonstrate ClaimEval’s capability in determining the validity of a set of claims, resulting in improved accuracy compared to state-of-the-art baselines.
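As a rough illustration of how a PSL-style rule turns soft truth values into a penalty, the sketch below uses the Lukasiewicz conjunction and the distance to satisfaction of an implication. The rule "credible(S) AND supports(S, C) implies true(C)", its truth values, and all function names are hypothetical examples, not rules or code from ClaimEval.

```python
def lukasiewicz_and(a: float, b: float) -> float:
    """Lukasiewicz t-norm: soft conjunction of two truth values in [0, 1]."""
    return max(0.0, a + b - 1.0)


def distance_to_satisfaction(body: float, head: float) -> float:
    """How far the implication body -> head is from being satisfied."""
    return max(0.0, body - head)


# Hypothetical soft truth values for one source and one claim.
credible = 0.9      # credible(source)
supports = 0.8      # supports(source, claim)
claim_true = 0.4    # true(claim)

body = lukasiewicz_and(credible, supports)            # roughly 0.7
penalty = distance_to_satisfaction(body, claim_true)  # roughly 0.3
```

Raising the truth value of the claim, or lowering the credibility of the source, both shrink this penalty, which gives an intuition for how credibility and correctness can be estimated jointly.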
Duality and the Continuous Graphical Model
Inspired by the Linear Programming based algorithms for discrete MRFs, we show how a corresponding infinite-dimensional dual for continuous-state MRFs can be approximated by a hierarchy of tractable relaxations. This hierarchy of dual programs includes as a special case the methods of Peng et al. [17] and Zach & Kohli [33]. We give approximation bounds for the tightness of our construction, study their relationship to discrete MRFs and give a generic optimization algorithm based on Nesterov’s dual-smoothing method [16].
Planned Protest Modeling in News and Social Media
Civil unrest (protests, strikes, and “occupy” events) is a common occurrence in both democracies and authoritarian regimes. The study of civil unrest is a key topic for political scientists as it helps capture an important mechanism by which citizenry express themselves. In countries where civil unrest is lawful, qualitative analysis has revealed that more than 75% of the protests are planned, organized, and/or announced in advance; therefore detecting future time mentions in relevant news and social media is a direct way to develop a protest forecasting system. We develop such a system in this paper, using a combination of key phrase learning to identify what to look for, probabilistic soft logic to reason about location occurrences in extracted results, and time normalization to resolve future tense mentions. We illustrate the application of our system to 10 countries in
Using Semantics & Statistics to Turn Data into Knowledge
Many information extraction and knowledge base construction systems are addressing the challenge of deriving knowledge from text. A key problem in constructing these knowledge bases from sources like the web is overcoming the erroneous and incomplete information found in millions of candidate extractions. To solve this problem, we turn to semantics – using ontological constraints between candidate facts to eliminate errors. In this article, we represent the desired knowledge base as a knowledge graph and introduce the problem of knowledge graph identification, collectively resolving the entities, labels, and relations present in the knowledge graph. Knowledge graph identification requires reasoning jointly over millions of extractions simultaneously, posing a scalability challenge to many approaches. We use proba-
Discovering Evolving Political Vocabulary in Social Media
As a surrogate data source for many real-world phenomena, social media such as Twitter can yield key insight into people’s behavior and their group affiliations and memberships. As an event unfolds on Twitter, the language, hashtags, and vocabulary used to describe it evolve over time, so that it is difficult to capture a priori the composition of a social group of interest using static keywords. Capturing such dynamic compositions is crucial both to understanding the true membership of social groups and to providing high-quality data for downstream applications such as trend forecasting. We propose a novel unsupervised learning algorithm that builds dynamic vocabularies using probabilistic soft logic (PSL), a framework for probabilistic reasoning over relational domains. Using 10 presidential elections from eight countries of Latin