Results 1–10 of 29
Improving predictive inference under covariate shift by weighting the log-likelihood function
 Journal of Statistical Planning and Inference, 2000
Stratified exponential families: Graphical models and model selection
 Annals of Statistics, 2001
Inference in Curved Exponential Family Models for Networks
 Journal of Computational and Graphical Statistics, 2006
Cited by 47 (9 self)
Abstract
Network data arise in a wide variety of applications. Although descriptive statistics for networks abound in the literature, the science of fitting statistical models to complex network data is still in its infancy. The models considered in this article are based on exponential families; therefore, we refer to them as exponential random graph models (ERGMs). Although ERGMs are easy to postulate, maximum likelihood estimation of parameters in these models is very difficult. In this article, we first review the method of maximum likelihood estimation using Markov chain Monte Carlo in the context of fitting linear ERGMs. We then extend this methodology to the situation where the model comes from a curved exponential family. The curved exponential family methodology is applied to new specifications of ERGMs, proposed by Snijders et al. (2004), having nonlinear parameters to represent structural properties of networks such as transitivity and heterogeneity of degrees. We review the difficult topic of implementing likelihood ratio tests for these models, then apply all these model-fitting and testing techniques to the estimation of linear and nonlinear parameters for a collaboration network between partners in a New England law firm.
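The MCMC-based maximum likelihood approach this abstract describes can be illustrated on the simplest ("edges-only") ERGM. The sketch below is a minimal illustration, not the authors' implementation: it pairs a Metropolis sampler over dyad toggles with a Robbins–Monro stochastic-approximation update of the natural parameter; the function names, gain sequence, and chain length are all assumptions chosen for the toy example.

```python
import math
import random

def edge_count(adj):
    # Network statistic s(y): number of edges in an undirected graph.
    return sum(adj[i][j] for i in range(len(adj)) for j in range(i))

def mcmc_sample(theta, n, steps, rng):
    # Metropolis sampler for the edges-only ERGM P(y) ~ exp(theta * s(y)):
    # propose toggling one dyad, accept with probability min(1, exp(theta * ds)).
    adj = [[0] * n for _ in range(n)]
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        delta = -1 if adj[i][j] else 1  # change in edge count if toggled
        if math.log(rng.random()) < theta * delta:
            adj[i][j] = adj[j][i] = 1 - adj[i][j]
    return adj

def mcmc_mle(s_obs, n, iters=200, rng=None):
    # Robbins-Monro stochastic approximation to the MLE: move theta so that
    # the expected statistic under the model matches the observed statistic.
    rng = rng or random.Random(0)
    n_dyads = n * (n - 1) / 2
    theta = 0.0
    for t in range(1, iters + 1):
        s_sim = edge_count(mcmc_sample(theta, n, 5 * n * n, rng))
        theta += (2.0 / t) * (s_obs - s_sim) / n_dyads
    return theta

# 10 nodes, 45 possible edges, 15 observed edges.
theta_hat = mcmc_mle(s_obs=15, n=10)
```

For this one-statistic model the exact MLE is available in closed form (the logit of the edge density, log(1/3 / (2/3)) ≈ -0.69), which makes the stochastic approximation easy to sanity-check; the general curved-family case treated in the article has no such closed form.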
ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks
 Journal of Statistical Software, 2008
Cited by 24 (5 self)
Abstract
We describe some of the capabilities of the ergm package and the statistical theory underlying it. This package contains tools for accomplishing three important, and interrelated, tasks involving exponential-family random graph models (ERGMs): estimation, simulation, and goodness of fit. More precisely, ergm has the capability of approximating a maximum likelihood estimator for an ERGM given a network data set; simulating new network data sets from a fitted ERGM using Markov chain Monte Carlo; and assessing how well a fitted ERGM does at capturing characteristics of a particular network data set.
Graphical models and exponential families
 In Proceedings of the 14th Annual Conference on Uncertainty in Artificial Intelligence (UAI-98), 1998
Cited by 21 (1 self)
Abstract
We provide a classification of graphical models according to their representation as subfamilies of exponential families. Undirected graphical models with no hidden variables are linear exponential families (LEFs); directed acyclic graphical models and chain graphs with no hidden variables, including Bayesian networks with several families of local distributions, are curved exponential families (CEFs); and graphical models with hidden variables are stratified exponential families (SEFs). An SEF is a finite union of CEFs satisfying a frontier condition. In addition, we illustrate how one can automatically generate independence and non-independence constraints on the distributions over the observable variables implied by a Bayesian network with hidden variables. The relevance of these results for model selection is examined.
Alternating minimization and Boltzmann machine learning
 IEEE Transactions on Neural Networks, 1992
Cited by 16 (2 self)
Abstract
Training a Boltzmann machine with hidden units is appropriately treated in information geometry using the information divergence and the technique of alternating minimization. The resulting algorithm is shown to be closely related to gradient-descent Boltzmann machine learning rules, and the close relationship of both to the EM algorithm is described. An iterative proportional fitting procedure for training machines without hidden units is described and incorporated into the alternating minimization algorithm. The state of the network is described by a binary vector of length n, x = [x_1, ..., x_n], where x_j ∈ {0, 1} specifies whether unit j has value 0 or 1 in the network state. When the network is running, at each time instant one of the units is chosen for updating. Suppose unit i is chosen for updating. It assumes the value 1 with probability ...
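The snippet breaks off at the unit-update probability. The standard Boltzmann machine rule, which is presumably what the truncated formula states, sets unit i to 1 with probability given by the logistic function of its net input. A minimal sketch, with the weight/bias notation (W, b) assumed:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gibbs_update(x, W, b, i, rng):
    # Standard Boltzmann machine update: given the other units, unit i is
    # set to 1 with probability sigmoid(sum_j w_ij x_j + b_i).
    z = b[i] + sum(W[i][j] * x[j] for j in range(len(x)) if j != i)
    x[i] = 1 if rng.random() < sigmoid(z) else 0
    return x

rng = random.Random(0)
x = [1, 0, 1]
W = [[0.0, 0.5, -0.2],
     [0.5, 0.0, 0.3],
     [-0.2, 0.3, 0.0]]  # symmetric weights, zero diagonal
b = [0.1, -0.1, 0.0]
gibbs_update(x, W, b, 1, rng)  # resample unit 1 given units 0 and 2
```

Running this update repeatedly over randomly chosen units is the Gibbs dynamics whose stationary distribution is the Boltzmann distribution the learning rules fit.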
A New Look at the Entropy for Solving Linear Inverse Problems
 IEEE Transactions on Information Theory, 1994
Cited by 14 (4 self)
Abstract
Entropy-based methods are widely used for solving inverse problems, especially when the solution is known to be positive. We address here the linear ill-posed and noisy inverse problems y = Ax + n with a more general convex constraint x ∈ C, where C is a convex set. Although projective methods are well adapted to this context, we study here alternative methods which rely heavily on some "information-based" criteria. Our goal is to shed light on the role played by entropy in this framework, and to present a new and deeper point of view on entropy, using general tools and results of convex analysis and large deviations theory. Then, we present a new and broad scheme of entropy-based inversion of linear noisy inverse problems. This scheme was introduced by Navaza in 1985 [48] in connection with a physical modeling for crystallographic applications, and further studied by Dacunha-Castelle and Gamboa [13]. Important features of this paper are (i) a unified presentation of many well kno...
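As a toy illustration of the entropy criterion discussed in this abstract (not the paper's general scheme), consider the classical maximum-entropy moment problem: among positive vectors x summing to 1 and satisfying one linear constraint a·x = m, the entropy maximizer has the exponential-family form x_j ∝ exp(θ a_j). The sketch below solves for θ by bisection; the tolerance and the loaded-die example are assumptions chosen for illustration.

```python
import math

def maxent_solution(a, m, lo=-50.0, hi=50.0, tol=1e-10):
    # Maximum-entropy distribution x on {0, ..., K-1} subject to
    # sum_j x_j = 1 and sum_j a_j x_j = m.  The solution has the form
    # x_j ~ exp(theta * a_j); theta solves the moment equation, which is
    # monotone in theta, so bisection suffices.
    def moment(theta):
        w = [math.exp(theta * aj) for aj in a]
        Z = sum(w)
        return sum(aj * wj for aj, wj in zip(a, w)) / Z
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if moment(mid) < m:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    w = [math.exp(theta * aj) for aj in a]
    Z = sum(w)
    return [wj / Z for wj in w]

# Toy "die" problem: mean face value constrained to 4.5 instead of 3.5.
x = maxent_solution(a=[1, 2, 3, 4, 5, 6], m=4.5)
```

The constraint a·x = m is the simplest instance of the linear data y = Ax; the paper's point is how far this entropic selection principle extends to noisy, ill-posed A with general convex constraints.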
Likelihood Asymptotics, 1998
Cited by 12 (0 self)
Abstract
The paper gives an overview of modern likelihood asymptotics with emphasis on results and applicability. Only parametric inference in well-behaved models is considered, and the theory discussed leads to highly accurate asymptotic tests for general smooth hypotheses. The tests are refinements of the usual asymptotic likelihood ratio tests, and for one-dimensional hypotheses the test statistic is known as r*, introduced by Barndorff-Nielsen. Examples illustrate the applicability and accuracy as well as the complexity of the required computations. Modern likelihood asymptotics has developed by merging two lines of research: asymptotic ancillarity is the basis of the statistical development, and saddlepoint approximations or Laplace-type approximations have simultaneously developed as the technical foundation. The main results and techniques of these two lines will be reviewed, and a generalization to multidimensional tests is developed. In the final part of the paper further problems and ...
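For a scalar hypothesis, the refinement referred to here is Barndorff-Nielsen's modified directed likelihood root. In commonly used notation (assumed here, since the abstract does not display the formulas), with ℓ the log-likelihood, θ̂ the MLE, and u a correction term built from sample-space derivatives:

```latex
r(\theta) = \operatorname{sign}(\hat\theta - \theta)\sqrt{2\{\ell(\hat\theta) - \ell(\theta)\}},
\qquad
r^{*}(\theta) = r(\theta) + \frac{1}{r(\theta)} \log\frac{u(\theta)}{r(\theta)},
```

where r is the usual directed likelihood root and r* is standard normal to a higher order of accuracy, which is what makes the refined tests so accurate in small samples.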
The Epic Story of Maximum Likelihood, 2008
Cited by 4 (0 self)
Abstract
At a superficial level, the idea of maximum likelihood must be prehistoric: early hunters and gatherers may not have used the words "method of maximum likelihood" to describe their choice of where and how to hunt and gather, but it is hard to believe they would have been surprised if their method had been described in those terms. It seems a simple, even unassailable idea: Who would rise to argue in favor of a method of minimum likelihood, or even mediocre likelihood? And yet the mathematical history of the topic shows this "simple idea" is really anything but simple. Joseph Louis Lagrange, Daniel Bernoulli, Leonhard Euler, Pierre Simon Laplace and Carl Friedrich Gauss are only some of those who explored the topic, not always in ways we would sanction today. In this article, that history is reviewed from well before Fisher to the time of Lucien Le Cam's dissertation. In the process Fisher's unpublished 1930 characterization of conditions for the consistency and efficiency of maximum likelihood estimates is presented, and the mathematical basis of his three proofs discussed. In particular, Fisher's derivation of the information inequality is seen to be derived from his work on the analysis of variance, and his later approach via estimating functions was derived from Euler's Relation for homogeneous functions. The reaction to Fisher's work is reviewed, and some lessons drawn.