Results 1 - 6 of 6
Hinge-loss Markov random fields and probabilistic soft logic, 2015
Abstract - Cited by 6 (4 self)
A fundamental challenge in developing high-impact machine learning technologies is balancing the ability to model rich, structured domains with the ability to scale to big data. Many important problem areas are both richly structured and large scale, from social and biological networks, to knowledge graphs and the Web, to images, video, and natural language. In this paper, we introduce two new formalisms for modeling structured data, distinguished from previous approaches by their ability to both capture rich structure and scale to big data. The first, hinge-loss Markov random fields (HL-MRFs), is a new kind of probabilistic graphical model that generalizes different approaches to convex inference. We unite three approaches from the randomized algorithms, probabilistic graphical models, and fuzzy logic communities, showing that all three lead to the same inference objective. We then derive HL-MRFs by generalizing this unified objective. The second new formalism, probabilistic soft logic (PSL), is a probabilistic programming language that makes HL-MRFs easy to define using a syntax based on first-order logic. We next introduce an algorithm for inferring most-probable variable assignments (MAP inference) that is much more scalable than general-purpose convex optimization software, because it uses message passing to take advantage of sparse dependency structures. We then show how to learn the parameters of HL-MRFs. The learned HL-MRFs are as accurate as analogous discrete models, but much more scalable. Together, these algorithms enable HL-MRFs and PSL to model rich, structured data at scales not previously possible.
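The convex inference objective described in this abstract is built from hinge-loss potentials over continuous variables in [0, 1]. As an illustration only (not the authors' implementation), the sketch below computes such a potential for the Lukasiewicz relaxation of one hypothetical logical rule; the rule, function name, and weight are assumptions for the example.

```python
# Hypothetical sketch of a single hinge-loss potential for the Lukasiewicz
# relaxation of the rule  Friend(a,b) AND Likes(a,x) -> Likes(b,x).
# All inputs are soft truth values in [0, 1].
def hinge_loss_potential(friend_ab, likes_ax, likes_bx, weight=1.0, p=1):
    # Lukasiewicz body truth is max(0, friend + likes_a - 1); the rule's
    # distance to satisfaction is that truth minus the head, clipped at zero.
    distance = max(0.0, friend_ab + likes_ax - 1.0 - likes_bx)
    # Raising to p in {1, 2} keeps the potential convex in the variables.
    return weight * distance ** p

# A fully satisfied grounding contributes zero energy:
print(hinge_loss_potential(1.0, 1.0, 1.0))  # 0.0
# A violated grounding incurs a convex penalty that MAP inference minimizes:
print(hinge_loss_potential(1.0, 1.0, 0.2))  # 0.8
```

Because every potential has this clipped-linear form, the MAP problem is a convex program, which is what enables the scalable message-passing inference the abstract describes.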
Inferring user preferences by probabilistic logical reasoning over social networks. arXiv preprint arXiv:1411.2679, 2014
Abstract - Cited by 3 (0 self)
We propose a framework for inferring the latent attitudes or preferences of users by performing probabilistic first-order logical reasoning over the social network graph. Our method answers questions about Twitter users like Does this user like sushi? or Is this user a New York Knicks fan? by building a probabilistic model that reasons over user attributes (the user's location or gender) and the social network (the user's friends and spouse), via inferences like homophily (I am more likely to like sushi if my spouse or friends like sushi, and more likely to like the Knicks if I live in New York). The algorithm uses distant supervision, semi-supervised data harvesting, and vector space models to extract user attributes (e.g. spouse, education, location) and preferences (likes and dislikes) from text. The extracted propositions are then fed into a probabilistic reasoner (we investigate both Markov Logic and Probabilistic Soft Logic). Our experiments show that probabilistic logical reasoning significantly improves performance on attribute and relation extraction, and also achieves an F-score of 0.791 at predicting a user's likes or dislikes, significantly better than two strong baselines.
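The homophily inference sketched in this abstract can be illustrated with a toy scoring function that blends a user's own textual evidence with the soft truth values of connected users. Everything here is hypothetical (names, weights, and the linear blend); the paper instead grounds such intuitions as weighted logical rules in a probabilistic reasoner.

```python
# Toy homophily blend (illustrative only): a user's soft belief in an
# attribute, e.g. Likes(user, sushi), is pulled toward the average belief
# over that user's friends and spouse, mimicking rules like
#   Friend(u, v) AND Likes(v, item) -> Likes(u, item).
def homophily_score(own_evidence, neighbor_scores, w_self=0.6, w_social=0.4):
    # own_evidence: soft truth in [0, 1] extracted from the user's own text
    # neighbor_scores: soft truths for the same attribute over the network
    if not neighbor_scores:
        return own_evidence
    social = sum(neighbor_scores) / len(neighbor_scores)
    return w_self * own_evidence + w_social * social

# Weak direct evidence, but sushi-loving friends raise the belief:
print(homophily_score(0.3, [0.9, 0.8, 1.0]))  # about 0.54
```

In the actual framework this weighting is not hand-set: the weights of the logical rules are parameters of the probabilistic model.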
Paired-dual learning for fast training of latent variable hinge-loss MRFs. In Proceedings of the International Conference on Machine Learning, 2015
Abstract - Cited by 2 (2 self)
Latent variables allow probabilistic graphical models to capture nuance and structure in important domains such as network science, natural language processing, and computer vision. Naive approaches to learning such complex models can be prohibitively expensive, because they require repeated inferences to update beliefs about latent variables, so lifting this restriction for useful classes of models is an important problem. Hinge-loss Markov random fields (HL-MRFs) are graphical models that allow highly scalable inference and learning in structured domains, in part by representing structured problems with continuous variables. However, this representation leads to challenges when learning with latent variables. We introduce paired-dual learning, a framework that greatly speeds up training by using tractable entropy surrogates and avoiding repeated inferences. Paired-dual learning optimizes an objective with a pair of dual inference problems. This allows fast, joint optimization of parameters and dual variables. We evaluate on social-group detection, trust prediction in social networks, and image reconstruction, finding that paired-dual learning trains models as accurate as those trained by traditional methods in much less time, often before traditional methods make even a single parameter update.
Statistical Relational Learning with Soft Quantifiers
Abstract
Quantification in statistical relational learning (SRL) is either existential or universal; however, humans may be more inclined to express knowledge using soft quantifiers, such as "most" and "a few". In this paper, we define the syntax and semantics of PSL^Q, a new SRL framework that supports reasoning with soft quantifiers, and present its most probable explanation (MPE) inference algorithm. To the best of our knowledge, PSL^Q is the first SRL framework that combines soft quantifiers with first-order logic rules for modeling uncertain relational data. Our experimental results for link prediction in social trust networks demonstrate that the use of soft quantifiers not only allows for a natural and intuitive formulation of domain knowledge, but also improves the accuracy of inferred results.
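One common way to give semantics to a relative soft quantifier like "most" is a piecewise-linear membership function over the fraction of (soft) true groundings. The sketch below is illustrative only: the function name and the threshold values are assumptions for the example, not the mappings defined in the paper.

```python
# Illustrative semantics for the soft quantifier "most": map the fraction of
# soft truth mass among the grounded instances through a piecewise-linear
# ramp. Thresholds are hypothetical, chosen only for this example.
def most(truth_values, lower=0.3, upper=0.8):
    frac = sum(truth_values) / len(truth_values)
    if frac <= lower:
        return 0.0          # clearly not "most"
    if frac >= upper:
        return 1.0          # clearly "most"
    return (frac - lower) / (upper - lower)  # linear ramp in between

# "Most friends of X trust Y" over four friends' soft trust values:
print(most([1.0, 1.0, 1.0, 0.9]))  # 1.0
print(most([0.0, 0.0, 0.1, 0.0]))  # 0.0
```

A rule body can then use this quantifier truth value in place of a single existential or universal grounding, which is what lets statements like "most" and "a few" enter the logic directly.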
Mining integrated semantic networks for drug repositioning opportunities
Abstract
Current research and development approaches to drug discovery have become less fruitful and more costly. One alternative paradigm is drug repositioning. Many marketed examples of repositioned drugs have been identified through serendipitous or rational observations, highlighting the need for more systematic methodologies to tackle the problem. Systems-level approaches have the potential to enable the development of novel methods to understand the action of therapeutic compounds, but this requires an integrative approach to biological data. Integrated networks can facilitate systems-level analyses by combining multiple sources of evidence to provide a rich description of drugs, their targets, and their interactions. Classically, such networks can be mined manually, where a skilled person identifies portions of the graph (semantic subgraphs) that are indicative of relationships between drugs and highlight possible repositioning opportunities. However, this approach is not scalable. Automated approaches are required to systematically mine integrated networks for these subgraphs and bring them to the
HyPER: A Flexible and Extensible Probabilistic Framework for Hybrid Recommender Systems
Abstract
As the amount of recorded digital information increases, there is a growing need for flexible recommender systems which can incorporate richly structured data sources to improve recommendations. In this paper, we show how a recently introduced statistical relational learning framework can be used to develop a generic and extensible hybrid recommender system. Our hybrid approach, HyPER (HYbrid Probabilistic Extensible Recommender), incorporates and reasons over a wide range of information sources. Such sources include multiple user-user and item-item similarity measures, content, and social information. HyPER automatically learns to balance these different information signals when making predictions. We build our system using a powerful and intuitive probabilistic programming language called probabilistic soft logic [1], which enables efficient and accurate prediction by formulating our custom recommender systems with a scalable class of graphical models known as hinge-loss Markov random fields. We experimentally evaluate our approach on two popular recommendation datasets, showing that HyPER can effectively combine multiple information types for improved performance, and can significantly outperform existing state-of-the-art approaches.