Results 11–20 of 610
Probabilistic classification and clustering in relational data
In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, 2001
Cited by 127 (4 self)
Abstract: Supervised and unsupervised learning methods have traditionally focused on data consisting of independent instances of a single type. However, many real-world domains are best described by relational models in which instances of multiple types are related to each other in complex ways. For example, in a scientific paper domain, papers are related to each other via citation, and are also related to their authors. In this case, the label of one entity (e.g., the topic of the paper) is often correlated with the labels of related entities. We propose a general class of models for classification and clustering in relational domains that capture probabilistic dependencies between related instances. We show how to learn such models efficiently from data. We present empirical results on two real-world data sets. Our experiments in a transductive classification setting indicate that accuracy can be significantly improved by modeling relational dependencies. Our algorithm automatically induces a very natural behavior, where our knowledge about one instance helps us classify related ones, which in turn help us classify others. In an unsupervised setting, our models produced coherent clusters with a very natural interpretation, even for instance types that do not have any attributes.
A Double-Loop Algorithm to Minimize the Bethe and Kikuchi Free Energies
Neural Computation, 2001
Cited by 126 (5 self)
Abstract: Recent work (Yedidia, Freeman, Weiss [22]) has shown that stable points of belief propagation (BP) algorithms [12] for graphs with loops correspond to extrema of the Bethe free energy [3]. These BP algorithms have been used to obtain good solutions to problems for which alternative algorithms fail to work [4], [5], [10], [11]. In this paper we first obtain the dual energy of the Bethe free energy, which throws light on the BP algorithm. Next we introduce a discrete iterative algorithm which we prove is guaranteed to converge to a minimum of the Bethe free energy. We call this the double-loop algorithm because it contains an inner and an outer loop. It extends a class of mean field theory algorithms developed by [7], [8] and, in particular, [13]. Moreover, the double-loop algorithm is formally very similar to BP, which may help in understanding when BP converges. Finally, we extend all our results to the Kikuchi approximation, which includes the Bethe free energy as a special case [3]. (Yedidia et al. [22] showed that a "generalized belief propagation" algorithm also has its fixed points at extrema of the Kikuchi free energy.) We are able not only to obtain a dual formulation for Kikuchi but also a double-loop discrete iterative algorithm that is guaranteed to converge to a minimum of the Kikuchi free energy. It is anticipated that these double-loop algorithms will be useful for solving optimization problems in computer vision and other applications.
Learning Probabilistic Models of Link Structure
Journal of Machine Learning Research, 2002
Cited by 122 (15 self)
Abstract: Most real-world data is heterogeneous and richly interconnected. Examples include the Web, hypertext, bibliometric data and social networks. In contrast, most statistical learning methods work with "flat" data representations, forcing us to convert our data into a form that loses much of the link structure. The recently introduced framework of probabilistic relational models (PRMs) embraces the object-relational nature of structured data by capturing probabilistic interactions between attributes of related entities. In this paper, we extend this framework by modeling interactions between the attributes and the link structure itself. An advantage of our approach is a unified generative model for both content and relational structure. We propose two mechanisms for representing a probabilistic distribution over link structures: reference uncertainty and existence uncertainty. We describe the appropriate conditions for using each model and present learning algorithms for each. We present experimental results showing that the learned models can be used to predict link structure and, moreover, the observed link structure can be used to provide better predictions for the attributes in the model.
Combining phylogenetic and hidden Markov models in biosequence analysis
J. Comput. Biol., 2004
Cited by 119 (13 self)
Abstract: A few models have appeared in recent years that consider not only the way substitutions occur through evolutionary history at each site of a genome, but also the way the process changes from one site to the next. These models combine phylogenetic models of molecular evolution, which apply to individual sites, and hidden Markov models, which allow for changes from site to site. Besides improving the realism of ordinary phylogenetic models, they are potentially very powerful tools for inference and prediction, for example in gene finding or prediction of secondary structure. In this paper, we review progress on combined phylogenetic and hidden Markov models and present some extensions to previous work. Our main result is a simple and efficient method for accommodating higher-order states in the HMM, which allows for context-sensitive models of substitution, that is, models that consider the effects of neighboring bases on the pattern of substitution. We present experimental results indicating that higher-order states, autocorrelated rates, and multiple functional categories all lead to significant improvements in the fit of a combined phylogenetic and hidden Markov model, with the effect of higher-order states being particularly pronounced.
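As a minimal sketch of the HMM machinery these combined models build on, the following runs the standard forward recursion for a two-state HMM over a four-letter alphabet and checks it against brute-force summation over state paths. Every probability below is invented for illustration; in the paper's setting the emission terms would come from per-site phylogenetic models rather than a fixed table.

```python
import itertools

# Hypothetical two-state HMM over symbols 0..3.
PI = [0.5, 0.5]                        # initial state distribution
A = [[0.9, 0.1],                       # state transition matrix
     [0.2, 0.8]]
E = [[0.4, 0.3, 0.2, 0.1],             # emission probabilities per state
     [0.1, 0.2, 0.3, 0.4]]

def forward_likelihood(obs):
    """P(obs) via the forward recursion (linear in sequence length)."""
    alpha = [PI[s] * E[s][obs[0]] for s in range(2)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(2)) * E[j][o]
                 for j in range(2)]
    return sum(alpha)

def brute_force_likelihood(obs):
    """P(obs) by summing over every state path (exponential cost)."""
    total = 0.0
    for states in itertools.product(range(2), repeat=len(obs)):
        p = PI[states[0]] * E[states[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[states[t - 1]][states[t]] * E[states[t]][obs[t]]
        total += p
    return total
```

Higher-order states, as in the paper, would condition the emission at each site on neighboring symbols as well; the recursion itself is unchanged.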
Extracting places and activities from GPS traces using hierarchical conditional random fields
International Journal of Robotics Research, 2007
Cited by 108 (3 self)
Abstract: Learning patterns of human behavior from sensor data is extremely important for high-level activity inference. We show how to extract a person's activities and significant places from traces of GPS data. Our system uses hierarchically structured conditional random fields to generate a consistent model of a person's activities and places. In contrast to existing techniques, our approach takes high-level context into account in order to detect the significant places of a person. Our experiments show significant improvements over existing techniques. Furthermore, they indicate that our system is able to robustly estimate a person's activities using a model that is trained from data collected by other persons.
Relational dependency networks
Journal of Machine Learning Research, 2007
Cited by 107 (24 self)
Abstract: Recent work on graphical models for relational data has demonstrated significant improvements in classification and inference when models represent the dependencies among instances. Despite its use in conventional statistical models, the assumption of instance independence is contradicted by most relational datasets. For example, in citation data there are dependencies among the topics of a paper's references, and in genomic data there are dependencies among the functions of interacting proteins. In this paper, we present relational dependency networks (RDNs), graphical models that are capable of expressing and reasoning with such dependencies in a relational setting. We discuss RDNs in the context of relational Bayes networks and relational Markov networks and outline the relative strengths of RDNs, namely the ability to represent cyclic dependencies, simple methods for parameter estimation, and efficient structure learning techniques. The strengths of RDNs are due to the use of pseudolikelihood learning techniques, which estimate an efficient approximation of the full joint distribution. We present learned RDNs for a number of real-world datasets and evaluate the models in a prediction context, showing that RDNs identify and exploit cyclic relational dependencies to achieve significant performance gains over conventional conditional models. In addition, we use synthetic data to explore model performance under various relational data characteristics, showing that RDN learning and inference techniques are accurate over a wide range of conditions.
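A toy sketch of the dependency-network idea the abstract describes: each variable keeps only a local conditional given its neighbours (the pseudolikelihood view), and a joint sample is produced by ordered Gibbs sampling over those conditionals. The graph and the conditional below are hypothetical, not taken from the paper.

```python
import random

# Hypothetical three-variable dependency network a - b - c.
NEIGHBOURS = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

def p_true_given(var, state):
    """Hypothetical local conditional: lean toward agreeing neighbours."""
    agree = sum(state[n] for n in NEIGHBOURS[var])
    return (1 + agree) / (2 + len(NEIGHBOURS[var]))

def gibbs_sample(n_sweeps, seed=0):
    """Ordered Gibbs sampling over the local conditionals."""
    rng = random.Random(seed)
    state = {v: 0 for v in NEIGHBOURS}
    for _ in range(n_sweeps):
        for v in sorted(NEIGHBOURS):      # fixed visiting order per sweep
            state[v] = int(rng.random() < p_true_given(v, state))
    return state
```

Because each conditional is estimated separately, the cyclic dependency a–b–a poses no problem at learning time, which is the RDN strength the abstract highlights.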
Decentralised Coordination of Low-Power Embedded Devices Using the Max-Sum Algorithm
In Proceedings of the 7th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 08), 2008
Cited by 88 (28 self)
Abstract: This paper considers the problem of performing decentralised coordination of low-power embedded devices (as is required within many environmental sensing and surveillance applications). Specifically, we address the generic problem of maximising social welfare within a group of interacting agents. We propose a novel representation of the problem, as a cyclic bipartite factor graph, composed of variable and function nodes (representing the agents' states and utilities respectively). We show that such a representation allows us to use an extension of the max-sum algorithm to generate approximate solutions to this global optimisation problem through local decentralised message passing. We empirically evaluate this approach on a canonical coordination problem (graph colouring), and benchmark it against state-of-the-art approximate and complete algorithms (DSA and DPOP). We show that our approach is robust to lossy communication, that it generates solutions closer to those of DPOP than DSA is able to, and that it does so with a communication cost (in terms of total message size) that scales very well with the number of agents in the system (compared to the exponential increase of DPOP). Finally, we describe a hardware implementation of our algorithm operating on low-power Chipcon CC2431 System-on-Chip sensor nodes.
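As a minimal sketch of the max-sum message passing the abstract relies on, the following runs max-sum on a tiny tree-shaped factor graph (two function nodes over three binary variables, with made-up utility tables) and checks the result against brute-force maximisation. On the cyclic factor graphs the paper targets, the same messages are simply iterated and yield approximate solutions.

```python
import itertools

# Hypothetical utilities: U(x1, x2, x3) = f12[x1][x2] + f23[x2][x3].
f12 = [[3.0, 1.0],
       [0.0, 2.0]]
f23 = [[1.0, 4.0],
       [2.0, 0.0]]

def max_sum_map():
    """Max-sum messages into x2, then back-track the maximisers."""
    # message from function f12 to variable x2: maximise out x1
    m_f12 = [max(f12[x1][x2] for x1 in range(2)) for x2 in range(2)]
    # message from function f23 to variable x2: maximise out x3
    m_f23 = [max(f23[x2][x3] for x3 in range(2)) for x2 in range(2)]
    b2 = [m_f12[x2] + m_f23[x2] for x2 in range(2)]   # belief at x2
    x2 = b2.index(max(b2))
    x1 = max(range(2), key=lambda v: f12[v][x2])
    x3 = max(range(2), key=lambda v: f23[x2][v])
    return (x1, x2, x3)

def brute_force_map():
    """Exhaustive search over all joint assignments."""
    return max(itertools.product(range(2), repeat=3),
               key=lambda s: f12[s[0]][s[1]] + f23[s[1]][s[2]])
```

In the decentralised setting each message is computed by the agent that owns the corresponding node, so no agent ever needs the full utility tables of the others.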
Collective segmentation and labeling of distant entities in information extraction
2004
Cited by 87 (17 self)
Abstract: In information extraction, we often wish to identify all mentions of an entity, such as a person or organization. Traditionally, a group of words is labeled as an entity based only on local information. But information from throughout a document can be useful; for example, if the same word is used multiple times, it is likely to have the same label each time. We present a CRF that explicitly represents dependencies between the labels of pairs of similar words in a document. On a standard information extraction data set, we show that learning these dependencies leads to a 13.7% reduction in error on the field that had caused the most repetition errors.
Bethe free energy, Kikuchi approximations and belief propagation algorithms
2000
Cited by 83 (2 self)
Abstract: Belief propagation (BP) was only supposed to work for tree-like networks but works surprisingly well in many applications involving networks with loops, including turbo codes. However, there has been little understanding of the algorithm or the nature of the solutions it finds for general graphs. We show that BP can only converge to a stationary point of an approximate free energy, known as the Bethe free energy in statistical physics. This result characterizes BP fixed points and makes connections with variational approaches to approximate inference. More importantly, our analysis lets us build on the progress made in statistical physics since Bethe's approximation was introduced in 1935. Kikuchi and others have shown how to construct more accurate free energy approximations, of which Bethe's approximation is the simplest. Exploiting the insights from our analysis, we derive generalized belief propagation (GBP) versions of these Kikuchi approximations. These new message passing algorithms can be significantly more accurate than ordinary BP, at an adjustable increase in complexity. We illustrate such a new GBP algorithm on a grid Markov network and show that it gives much more accurate marginal probabilities than those found using ordinary BP.
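As a minimal, hypothetical illustration of the sum-product message passing the abstract analyses, the sketch below runs BP on a three-node chain (a tree, where BP is exact and converges in one pass) and checks the resulting marginal against brute-force enumeration. All potential values are invented, not taken from the paper; on loopy graphs the same messages would be iterated and would only approximate the marginals.

```python
import itertools

# Pairwise Markov chain A - B - C over binary variables (made-up potentials).
phi = {v: [1.0, 2.0] for v in "ABC"}            # unary potentials
psi = [[2.0, 1.0],                              # pairwise potential,
       [1.0, 2.0]]                              # shared by both edges

def brute_force_marginal(var):
    """Marginal of `var` by enumerating all joint states."""
    p = [0.0, 0.0]
    for a, b, c in itertools.product(range(2), repeat=3):
        w = phi["A"][a] * phi["B"][b] * phi["C"][c] * psi[a][b] * psi[b][c]
        p[dict(A=a, B=b, C=c)[var]] += w
    z = sum(p)
    return [x / z for x in p]

def bp_marginal_B():
    """Sum-product messages into B; exact because the graph is a tree."""
    m_ab = [sum(phi["A"][a] * psi[a][b] for a in range(2)) for b in range(2)]
    m_cb = [sum(phi["C"][c] * psi[b][c] for c in range(2)) for b in range(2)]
    belief = [phi["B"][b] * m_ab[b] * m_cb[b] for b in range(2)]
    z = sum(belief)
    return [x / z for x in belief]
```

The GBP algorithms the paper derives replace these single-variable beliefs with beliefs over clusters of variables, following the Kikuchi construction.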