New specifications for exponential random graph models
, 2004
Cited by 59 (15 self)
The most promising class of statistical models for expressing structural properties of social networks observed at one moment in time is the class of Exponential Random Graph Models (ERGMs), also known as p* models. The strong point of these models is that they can represent a variety of structural tendencies, such as transitivity, that define complicated dependence patterns not easily modeled by more basic probability models. Recently, MCMC algorithms have been developed which produce approximate Maximum Likelihood estimators. Applying these models in their traditional specification to observed network data has often led to problems, however, which can be traced back to the fact that important parts of the parameter space correspond to nearly degenerate distributions, which may lead to convergence problems of estimation algorithms and a poor fit to empirical data. This paper proposes new specifications of Exponential Random Graph Models. These specifications represent structural properties such as transitivity and heterogeneity of degrees by more complicated graph statistics than the traditional star and triangle counts. Three kinds of statistics are proposed: geometrically weighted degree distributions, alternating k-triangles, and alternating independent two-paths. Examples are presented both of modeling graphs and digraphs, in which the new specifications lead to much better results than the earlier existing specifications of the ERGM. It is concluded that the new specifications increase the range and applicability of the ERGM as a tool for the statistical analysis of social networks.
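As a rough illustration of the traditional specification this abstract refers to, the sketch below enumerates every undirected graph on a handful of nodes and assigns each a probability proportional to exp(theta_edge * edges + theta_triangle * triangles). The function name, the parameter names, and the choice of edge and triangle counts as the only statistics are assumptions made for the example; the paper's new statistics (geometrically weighted degrees, alternating k-triangles, alternating two-paths) are not implemented here, and exhaustive enumeration is only feasible for very small graphs.

```python
from itertools import combinations, product
from math import exp


def ergm_distribution(n, theta_edge, theta_triangle):
    """Exact distribution of a tiny undirected ERGM with edge and triangle terms.

    Hypothetical helper: enumerates all 2^(n choose 2) graphs, so keep n small.
    """
    dyads = list(combinations(range(n), 2))
    triples = list(combinations(range(n), 3))
    weights = {}
    for present in product((0, 1), repeat=len(dyads)):
        edges = {d for d, x in zip(dyads, present) if x}
        n_edges = len(edges)
        n_triangles = sum(
            1
            for a, b, c in triples
            if (a, b) in edges and (a, c) in edges and (b, c) in edges
        )
        # Unnormalized weight: exp(theta . statistics)
        weights[present] = exp(theta_edge * n_edges + theta_triangle * n_triangles)
    z = sum(weights.values())  # normalizing constant
    return {graph: w / z for graph, w in weights.items()}


# Example: 4 nodes, sparse baseline, mild preference for triangles.
probs = ergm_distribution(4, theta_edge=-1.0, theta_triangle=0.5)
```

Raising the triangle parameter while holding the edge parameter fixed shifts probability toward the complete graph, which gives a small-scale feel for why parts of the parameter space can concentrate mass on very few graphs.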
Network-based marketing: Identifying likely adopters via consumer networks
- Statistical Science
Cited by 48 (10 self)
Network-based marketing refers to a collection of marketing techniques that take advantage of links between consumers to increase sales. We concentrate on the consumer networks formed using direct interactions (e.g., communications) between consumers. We survey the diverse literature on such marketing with an emphasis on the statistical methods used and the data to which these methods have been applied. We also provide a discussion of challenges and opportunities for this burgeoning research topic. Our survey highlights a gap in the literature. Because of inadequate data, prior studies have not been able to provide direct, statistical support for the hypothesis that network linkage can directly affect product/service adoption. Using a new data set that represents the adoption of a new telecommunications service, we show very strong support for the hypothesis. Specifically, we show three main results: (1) “Network neighbors” (those consumers linked to a prior customer) adopt the service at a rate 3–5 times greater than baseline groups selected by the best practices of the firm’s marketing team. In addition, analyzing the network allows the firm to acquire new customers who otherwise would have fallen through the cracks, because they would not have been identified based on traditional attributes. (2) Statistical models, built with a very large amount of geographic, demographic and prior purchase data, are significantly and substantially improved by including network information. (3) More detailed network information allows the ranking of the network neighbors so as to permit the selection of small sets of individuals with very high probabilities of adoption. Key words and phrases: Viral marketing, word of mouth, targeted marketing, network analysis, classification, statistical relational learning.
Dynamic Social Network Analysis using Latent Space Models
- ACM SIGKDD EXPLORATIONS NEWSLETTER
, 2005
Cited by 47 (2 self)
This paper explores two aspects of social network modeling. First, we generalize a successful static model of relationships into a dynamic model that accounts for friendships drifting over time. Second, we show how to make it tractable to learn such models from data, even as the number of entities n gets large. The generalized model associates each entity with a point in p-dimensional Euclidean latent space. The points can move as time progresses but large moves in latent space are improbable. Observed links between entities are more likely if the entities are close in latent space. We show how to make such a model tractable (sub-quadratic in the number of entities) by the use of appropriate kernel functions for similarity in latent space; the use of low dimensional KD-trees; a new efficient dynamic adaptation of multidimensional scaling for a first pass of approximate projection of entities into latent space; and an efficient conjugate gradient update rule for non-linear local optimization in which amortized time per entity during an update is O(log n). We use both synthetic and real-world data on up to 11,000 entities which indicate near-linear scaling in computation time and improved performance over four alternative approaches. We also illustrate the system operating on twelve years of NIPS co-authorship data.
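The two ingredients named in this abstract, links that are more likely between nearby latent points and latent positions that rarely make large moves, can be sketched as follows. The logistic link with a `radius` parameter and the Gaussian random-walk drift with scale `sigma` are assumptions chosen for illustration; the abstract does not state the paper's actual kernel or transition model, and none of the scalability machinery (KD-trees, multidimensional scaling, conjugate gradient) is shown.

```python
from math import dist, exp


def link_probability(z_i, z_j, radius=1.0):
    """Assumed logistic link: probability decreases as latent distance grows."""
    d = dist(z_i, z_j)
    return 1.0 / (1.0 + exp(d - radius))


def drift_log_density(z_prev, z_curr, sigma=0.5):
    """Assumed Gaussian random walk (up to a constant): large moves score lower."""
    d = dist(z_prev, z_curr)
    return -(d * d) / (2.0 * sigma * sigma)


# Two entities close together versus far apart in a 2-dimensional latent space.
p_near = link_probability((0.0, 0.0), (0.1, 0.0))
p_far = link_probability((0.0, 0.0), (3.0, 0.0))
```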
Assessing Degeneracy in Statistical Models of Social Networks
- Journal of the American Statistical Association
, 2003
Cited by 45 (12 self)
This paper presents recent advances in the statistical modeling of random graphs that have an impact on the empirical study of social networks. Statistical exponential family models (Wasserman and Pattison 1996) are a generalization of the Markov random graph models introduced by Frank and Strauss (1986), which in turn are derived from developments in spatial statistics (Besag 1974). These models recognize the complex dependencies within relational data structures. A major barrier to the application of random graph models to social networks has been the lack of a sound statistical theory to evaluate model fit. This problem has at least three aspects: the specification of realistic models, the algorithmic difficulties of the inferential methods, and the assessment of the degree to which the graph structure produced by the models matches that of the data. We discuss these and related issues of model degeneracy and inferential degeneracy for commonly used estimators.
Leveraging relational autocorrelation with latent group models
- In MRDM '05: Proceedings of the 4th international workshop on Multi-relational mining. ACM
Cited by 43 (14 self)
The presence of autocorrelation provides strong motivation for using relational techniques for learning and inference. Autocorrelation is a statistical dependency between the values of the same variable on related entities and is a nearly ubiquitous characteristic of relational data sets. Recent research has explored the use of collective inference techniques to exploit this phenomenon. These techniques achieve significant performance gains by modeling observed correlations among class labels of related instances, but the models fail to capture a frequent cause of autocorrelation: the presence of underlying groups that influence the attributes on a set of entities. We propose a latent group model (LGM) for relational data, which discovers and exploits the hidden structures responsible for the observed autocorrelation among class labels. Modeling the latent group structure improves model performance, increases inference efficiency, and enhances our understanding of the datasets. We evaluate performance on three relational classification tasks and show that LGM outperforms models that ignore latent group structure when there is little known information with which to seed inference.
Recovering temporally rewiring networks: A model-based approach
- In ICML07
, 2007
Cited by 19 (5 self)
A plausible representation of relational information among entities in dynamic systems such as a living cell or a social community is a stochastic network which is topologically rewiring and semantically evolving over time. While there is a rich literature on modeling static or temporally invariant networks, much less has been done toward modeling the dynamic processes underlying rewiring networks, and on recovering such networks when they are not observable. We present a class of hidden temporal exponential random graph models (htERGMs) to study the yet unexplored topic of modeling and recovering temporally rewiring networks from time series of node attributes such as activities of social actors or expression levels of genes. We show that one can reliably infer the latent time-specific topologies of the evolving networks from the observation. We report empirical results on both synthetic data and a Drosophila lifecycle gene expression data set, in comparison with a static counterpart of htERGM.
Random Effects Models for Network Data
, 2003
Cited by 14 (2 self)
One impediment to the statistical analysis of network data has been the difficulty in modeling the dependence among the observations. In the very simple case of binary (0-1) network data, some researchers have parameterized network dependence in terms of exponential family representations. Accurate parameter estimation for such models is quite difficult, and the most commonly used models often display a significant lack of fit. Additionally, such models are generally limited to binary data. In contrast, random effects models have been a widely successful tool in capturing statistical dependence for a variety of data types, and allow for prediction, imputation, and hypothesis testing within a general regression context. We propose novel random effects structures to capture network dependence, which can also provide graphical representations of network structure and variability.
Estimating the integrated likelihood via posterior simulation using the harmonic mean identity
- Bayesian Statistics
, 2007
Cited by 13 (2 self)
The integrated likelihood (also called the marginal likelihood or the normalizing constant) is a central quantity in Bayesian model selection and model averaging. It is defined as the integral over the parameter space of the likelihood times the prior density. The Bayes factor for model comparison and Bayesian testing is a ratio of integrated likelihoods, and the model weights in Bayesian model averaging are proportional to the integrated likelihoods. We consider the estimation of the integrated likelihood from posterior simulation output, aiming at a generic method that uses only the likelihoods from the posterior simulation iterations. The key is the harmonic mean identity, which says that the reciprocal of the integrated likelihood is equal to the posterior harmonic mean of the likelihood. The simplest estimator based on the identity is thus the harmonic mean of the likelihoods. While this is an unbiased and simulation-consistent estimator, its reciprocal can have infinite variance and so it is unstable in general. We describe two methods for stabilizing the harmonic mean estimator. In the first one, the parameter space is reduced in such a way that the modified estimator involves a harmonic mean of heavier-tailed densities, thus resulting in a finite variance estimator. The resulting
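The harmonic mean identity stated in this abstract can be checked on a toy problem. The sketch below assumes a parameter that takes only three values, so that the integrated likelihood is an exact sum; the prior and likelihood numbers are invented for illustration, and the function names are hypothetical. It first verifies the identity exactly, then forms the simple harmonic mean estimator from simulated posterior draws. Neither of the paper's stabilization methods is shown.

```python
import random


def integrated_likelihood(prior, likelihood):
    """Sum over the (discrete) parameter space of likelihood times prior."""
    return sum(p * l for p, l in zip(prior, likelihood))


def posterior_probs(prior, likelihood):
    """Posterior probabilities by Bayes' rule."""
    m = integrated_likelihood(prior, likelihood)
    return [p * l / m for p, l in zip(prior, likelihood)]


def harmonic_mean_estimate(sampled_likelihoods):
    """Harmonic mean of likelihood values evaluated at posterior draws."""
    n = len(sampled_likelihoods)
    return n / sum(1.0 / l for l in sampled_likelihoods)


# Made-up three-point parameter space.
prior = [0.2, 0.5, 0.3]
likelihood = [0.01, 0.20, 0.05]

exact = integrated_likelihood(prior, likelihood)
post = posterior_probs(prior, likelihood)

# Identity: 1 / m equals the posterior expectation of 1 / likelihood.
via_identity = 1.0 / sum(q / l for q, l in zip(post, likelihood))

# Monte Carlo version: draw parameter values from the posterior, keep likelihoods.
rng = random.Random(0)
draws = rng.choices(likelihood, weights=post, k=20000)
estimate = harmonic_mean_estimate(draws)
```

Here the likelihood is bounded away from zero, so the estimator behaves well; the instability the abstract describes arises when posterior draws can land where the likelihood is very small.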
Bilinear Mixed Effects Models for Dyadic Data
, 2003
Cited by 11 (3 self)
This article discusses the use of a symmetric multiplicative interaction effect to capture certain types of third-order dependence patterns often present in social networks and other dyadic datasets. Such an effect, along with standard linear fixed and random effects, is incorporated into a generalized linear model, and a Markov chain Monte Carlo algorithm is provided for Bayesian estimation and inference. In an example analysis of international relations data, accounting for such patterns improves model fit and predictive performance.
Modeling homophily and stochastic equivalence in symmetric relational data
- Neural Information Processing Systems 20
, 2007
Cited by 11 (0 self)
This article discusses a latent variable model for inference and prediction of symmetric relational data. The model, based on the idea of the eigenvalue decomposition, represents the relationship between two nodes as the weighted inner-product of node-specific vectors of latent characteristics. This “eigenmodel” generalizes other popular latent variable models, such as latent class and distance models: It is shown mathematically that any latent class or distance model has a representation as an eigenmodel, but not vice versa. The practical implications of this are examined in the context of three real datasets, for which the eigenmodel has out-of-sample predictive performance as good as or better than that of the other two models.
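The weighted inner product at the core of this abstract can be written in a few lines. The sketch below computes only the latent score for a pair of nodes; the function name and the example vectors are assumptions, and any link function, intercept, or covariate terms the full model may use are left out.

```python
def eigenmodel_score(u_i, u_j, weights):
    """Weighted inner product of two nodes' latent vectors: sum_k w_k * u_ik * u_jk."""
    return sum(w * a * b for w, a, b in zip(weights, u_i, u_j))


# One latent dimension, made-up values.
alike = eigenmodel_score([0.9], [0.8], weights=[1.0])       # positive weight, similar nodes
alike_neg = eigenmodel_score([0.9], [0.8], weights=[-1.0])  # negative weight, similar nodes
unlike_neg = eigenmodel_score([0.9], [-0.8], weights=[-1.0])  # negative weight, opposite nodes
```

With a positive weight, nodes with similar latent values score highly with each other, a homophily-like pattern. With a negative weight the sign flips, so similar nodes score low with each other and high with dissimilar nodes, which a distance-based score cannot express. This is one informal way to see why the weighted form is more general.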

