Results 1 - 10
of
11
On the Convexity of Latent Social Network Inference
"... In many real-world scenarios, it is nearly impossible to collect explicit social network data. In such cases, whole networks must be inferred from underlying observations. Here, we formulate the problem of inferring latent social networks based on network diffusion or disease propagation data. We co ..."
Abstract
-
Cited by 9 (2 self)
- Add to MetaCart
In many real-world scenarios, it is nearly impossible to collect explicit social network data. In such cases, whole networks must be inferred from underlying observations. Here, we formulate the problem of inferring latent social networks based on network diffusion or disease propagation data. We consider contagions propagating over the edges of an unobserved social network, where we only observe the times when nodes became infected, but not who infected them. Given such node infection times, we then identify the optimal network that best explains the observed data. We present a maximum likelihood approach based on convex programming with a l1-like penalty term that encourages sparsity. Experiments on real and synthetic data reveal that our method near-perfectly recovers the underlying network structure as well as the parameters of the contagion propagation model. Moreover, our approach scales well as it can infer optimal networks on thousand nodes in a matter of minutes. 1
Time-Varying Dynamic Bayesian Networks
"... Directed graphical models such as Bayesian networks are a favored formalism for modeling the dependency structures in complex multivariate systems such as those encountered in biology and neural science. When a system is undergoing dynamic transformation, temporally rewiring networks are needed for ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
Directed graphical models such as Bayesian networks are a favored formalism for modeling the dependency structures in complex multivariate systems such as those encountered in biology and neural science. When a system is undergoing dynamic transformation, temporally rewiring networks are needed for capturing the dynamic causal influences between covariates. In this paper, we propose time-varying dynamic Bayesian networks (TV-DBN) for modeling the structurally varying directed dependency structures underlying non-stationary biological/neural time series. This is a challenging problem due the non-stationarity and sample scarcity of time series data. We present a kernel reweighted ℓ1-regularized auto-regressive procedure for this problem which enjoys nice properties such as computational efficiency and provable asymptotic consistency. To our knowledge, this is the first practical and statistically sound method for structure learning of TV-DBNs. We applied TV-DBNs to time series measurements during yeast cell cycle and brain response to visual stimuli. In both cases, TV-DBNs reveal interesting dynamics underlying the respective biological systems. 1
Learning Graphical Game Models
"... Graphical games provide compact representation of a multiagent interaction when agents ’ payoffs depend only on actions of agents in their local neighborhood. We formally describe the problem of learning a graphical game model from limited observation of the payoff function, define three performance ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
Graphical games provide compact representation of a multiagent interaction when agents ’ payoffs depend only on actions of agents in their local neighborhood. We formally describe the problem of learning a graphical game model from limited observation of the payoff function, define three performance metrics for evaluating learned games, and investigate several learning algorithms based on minimizing empirical loss. Our first algorithm is a branch-and-bound search, which takes advantage of the structure of the empirical loss function to derive upper and lower bounds on loss at every node of the search tree. We also examine a greedy heuristic and local search algorithms. Our experiments with directed graphical games show that (i) when only a small sample of profile payoffs is available, branch-and-bound significantly outperforms other methods, and has competitive running time, but (ii) when many profiles are observed, greedy is nearly optimal and considerably better than other methods, at a fraction of branch-andbound’s running time. The results are comparable for undirected graphical games and when payoffs are sampled with noise. 1
A Recursive Method for Structural Learning of Directed Acyclic Graphs
"... In this paper, we propose a recursive method for structural learning of directed acyclic graphs (DAGs), in which a problem of structural learning for a large DAG is first decomposed into two problems of structural learning for two small vertex subsets, each of which is then decomposed recursively in ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
In this paper, we propose a recursive method for structural learning of directed acyclic graphs (DAGs), in which a problem of structural learning for a large DAG is first decomposed into two problems of structural learning for two small vertex subsets, each of which is then decomposed recursively into two problems of smaller subsets until none subset can be decomposed further. In our approach, search for separators of a pair of variables in a large DAG is localized to small subsets, and thus the approach can improve the efficiency of searches and the power of statistical tests for structural learning. We show how the recent advances in the learning of undirected graphical models can be employed to facilitate the decomposition. Simulations are given to demonstrate the performance of the proposed method.
Structure Learning on Large Scale Common Sense Statistical Models of Human State
"... Research has shown promise in the design of large scale common sense probabilistic models to infer human state from environmental sensor data. These models have made use of mined and preexisting common sense data and traditional probabilistic machine learning techniques to improve recognition of the ..."
Abstract
- Add to MetaCart
Research has shown promise in the design of large scale common sense probabilistic models to infer human state from environmental sensor data. These models have made use of mined and preexisting common sense data and traditional probabilistic machine learning techniques to improve recognition of the state of everyday human life. In this paper, we demonstrate effective techniques for structure learning on graphical models designed for this domain, improving the SRCS system of (Pentney et al. 2006) by learning additional dependencies between variables. Because the models used for common sense reasoning typically involve a large number of variables, issues of scale arise in searching for additional dependencies; we discuss how we use data mining techniques to address this problem. We show experimentally that these techniques improve the accuracy of state prediction, and that, with a good prior model, the use of a common sense model with structure learning provides better prediction of unlabeled variables as well as labeled variables. The results also demonstrate that it is possible to collect new common sense information about daily life using such a statistical model and labeled data.
Systems biology
"... A boosting approach to structure learning of graphs with and without prior knowledge ..."
Abstract
- Add to MetaCart
A boosting approach to structure learning of graphs with and without prior knowledge
Spatiotemporal Models for Data-Anomaly Detection in Dynamic Environmental Monitoring Campaigns
"... The ecological sciences have benefited greatly from recent advances in wireless sensor technologies. These technologies allow researchers to deploy networks of automated sensors, which can monitor a landscape at very fine temporal and spatial scales. However, these networks are subject to harsh cond ..."
Abstract
- Add to MetaCart
The ecological sciences have benefited greatly from recent advances in wireless sensor technologies. These technologies allow researchers to deploy networks of automated sensors, which can monitor a landscape at very fine temporal and spatial scales. However, these networks are subject to harsh conditions, which lead to malfunctions in individual sensors and failures in network communications. The resulting data streams often exhibit incorrect data measurements and missing values. Identifying and correcting these is time-consuming and error-prone. We present a method for real-time automated data quality control (QC) that exploits the spatial and temporal correlations in the data to distinguish sensor failures from valid observations. The model adapts to each deployment site by learning a Bayesian network structure that captures spatial relationships between sensors, and it extends the structure to a dynamic Bayesian network to incorporate temporal correlations. This model is able to flag faulty observations and predict the true values of the missing or corrupt readings. The performance of the model is evaluated on data collected by the SensorScope Project. The results show that the spatiotemporal model demonstrates clear advantages over models that include only temporal or only spatial correlations, and that the model is capable of accurately imputing corrupted values. Categories and Subject Descriptors: I.2.6 [Artificial Intelligence]: Learning—Parameter learning; I.5.1 [Pattern Recognition]: Models—Statistical; structural; G.3 [Mathematics of Computing]: Probability and Statistics—Distribution functions; Markov processes; multivariate statistics
Bayesian Sparse Factor Models and DAGs Inference and Comparison
"... In this paper we present a novel approach to learn directed acyclic graphs (DAGs) and factor models within the same framework while also allowing for model comparison between them. For this purpose, we exploit the connection between factor models and DAGs to propose Bayesian hierarchies based on spi ..."
Abstract
- Add to MetaCart
In this paper we present a novel approach to learn directed acyclic graphs (DAGs) and factor models within the same framework while also allowing for model comparison between them. For this purpose, we exploit the connection between factor models and DAGs to propose Bayesian hierarchies based on spike and slab priors to promote sparsity, heavy-tailed priors to ensure identifiability and predictive densities to perform the model comparison. We require identifiability to be able to produce variable orderings leading to valid DAGs and sparsity to learn the structures. The effectiveness of our approach is demonstrated through extensive experiments on artificial and biological data showing that our approach outperform a number of state of the art methods. 1
Inferring Networks of Diffusion and Influence
"... Information diffusion and virus propagation are fundamental processes taking place in networks. While it is often possible to directly observe when nodes become infected with a virus or publish the information, observing individual transmissions (who infects whom, or who influences whom) is typicall ..."
Abstract
- Add to MetaCart
Information diffusion and virus propagation are fundamental processes taking place in networks. While it is often possible to directly observe when nodes become infected with a virus or publish the information, observing individual transmissions (who infects whom, or who influences whom) is typically very difficult. Furthermore, in many applications, the underlying network over which the diffusions and propagations spread is actually unobserved. We tackle these challenges by developing a method for tracing paths of diffusion and influence through networks and inferring the networks over which contagions propagate. Given the times when nodes adopt pieces of information or become infected, we identify the optimal network that best explains the observed infection times. Since the optimization problem is NP-hard to solve exactly, we develop an efficient approximation algorithm that scales to large datasets and finds provably near-optimal networks. We demonstrate the effectiveness of our approach by tracing information diffusion in a set of 170 million blogs and news articles over a one year period to infer how information flows through the online media space. We find that the diffusion network of news for the top 1,000 media sites and blogs tends to have a core-periphery structure with a small set of core media sites that diffuse information to the rest of the Web. These sites tend to have stable circles of influence with more general news media sites acting as connectors between them.

