Results 1–10 of 73
Model selection and accounting for model uncertainty in graphical models using Occam's window
, 1993
Abstract

Cited by 324 (48 self)
We consider the problem of model selection and accounting for model uncertainty in high-dimensional contingency tables, motivated by expert system applications. The approach most used currently is a stepwise strategy guided by tests based on approximate asymptotic P-values leading to the selection of a single model; inference is then conditional on the selected model. The sampling properties of such a strategy are complex, and the failure to take account of model uncertainty leads to underestimation of uncertainty about quantities of interest. In principle, a panacea is provided by the standard Bayesian formalism, which averages the posterior distributions of the quantity of interest under each of the models, weighted by their posterior model probabilities. Furthermore, this approach is optimal in the sense of maximising predictive ability. However, it has not been used in practice because computing the posterior model probabilities is hard and the number of models is very large (often greater than 10^11). We argue that the standard Bayesian formalism is unsatisfactory and we propose an alternative Bayesian approach that, we contend, takes full account of the true model uncertainty by averaging over a much smaller set of models. An efficient search algorithm is developed for finding these models. We consider two classes of graphical models that arise in expert systems: the recursive causal models and the decomposable
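The averaging scheme the abstract describes can be sketched numerically: compute posterior model probabilities, discard models that fall outside Occam's window (far less probable than the best model), and average the quantity of interest over the survivors. The marginal likelihoods, the cutoff C, and the per-model estimates below are all invented for illustration, not values from the paper.

```python
import math

# Hypothetical marginal likelihoods p(D | M_k) for three candidate models,
# with uniform model priors; all numbers are illustrative.
log_marginal_lik = {"M1": -10.0, "M2": -10.5, "M3": -16.0}
prior = {m: 1.0 / len(log_marginal_lik) for m in log_marginal_lik}

# Posterior model probabilities p(M_k | D) via Bayes' rule.
log_post = {m: ll + math.log(prior[m]) for m, ll in log_marginal_lik.items()}
norm = math.log(sum(math.exp(v) for v in log_post.values()))
post = {m: math.exp(v - norm) for m, v in log_post.items()}

# Occam's window: keep only models whose posterior probability is within a
# factor C of the best model's, then renormalize over the retained set.
C = 20.0
best = max(post.values())
window = {m: p for m, p in post.items() if p >= best / C}
total = sum(window.values())
window = {m: p / total for m, p in window.items()}

# Model-averaged estimate of a quantity of interest, using hypothetical
# per-model posterior means.
delta = {"M1": 0.30, "M2": 0.25, "M3": 0.90}
averaged = sum(window[m] * delta[m] for m in window)
```

Here the implausible model M3 is excluded by the window, so the averaged estimate reflects only the two models that the data support.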
Multidimensional Scaling
 Handbook of Statistics
, 2001
Abstract

Cited by 37 (2 self)
... reflecting the importance or precision of dissimilarity δ_ij. 1. SOURCES OF DISTANCE DATA. Dissimilarity information about a set of objects can arise in many different ways. We review some of the more important ones, organized by scientific discipline. 1.1. Geodesy. The most obvious application, perhaps, is in sciences in which distance is measured directly, although generally with error. This happens, for instance, in triangulation in geodesy. We have measurements which are approximately equal to distances, either Euclidean or spherical, depending on the scale of the experiment. In other examples, measured distances are less directly related to physical distances. For example, we could measure airplane or road or train travel distances between different cities. Physical distance is usually not the only factor determining these types of dissimilarities.
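When the dissimilarities are (approximately) Euclidean distances, as in the geodesy example, classical multidimensional scaling recovers a point configuration from them directly. A minimal sketch, using an invented 2-D configuration rather than any data from the chapter:

```python
import numpy as np

# Classical MDS: recover 2-D coordinates from a matrix of pairwise
# Euclidean distances (noise-free here for simplicity).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))                                 # true configuration
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # dissimilarities

# Double-center the squared distances: B = -1/2 * J D^2 J, with
# J = I - 11'/n the centering matrix.
n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J

# Coordinates from the top two eigenpairs of B; the solution is determined
# only up to rotation, reflection, and translation.
vals, vecs = np.linalg.eigh(B)
idx = np.argsort(vals)[::-1][:2]
Y = vecs[:, idx] * np.sqrt(vals[idx])

# The recovered configuration reproduces the original distances.
D_hat = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
```

With measurement error in D, as in real triangulation data, the same construction gives a least-squares-style approximation rather than an exact recovery.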
Marginal models for categorical data
, 1997
Abstract

Cited by 35 (8 self)
Statistical models defined by imposing restrictions on marginal distributions of contingency tables have received considerable attention recently. This paper introduces a general definition of marginal log-linear parameters and describes conditions for a marginal log-linear parameter to be a smooth parameterization of the distribution, and to be variation independent. Statistical models defined by imposing affine restrictions on the marginal log-linear parameters are investigated. These models generalize ordinary log-linear and multivariate logistic models. Sufficient conditions for a log-affine marginal model to be nonempty, and to be a curved exponential family, are given. Standard large-sample theory is shown to apply to maximum likelihood estimation of log-affine marginal models for a variety of sampling procedures.
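The simplest instance of a marginal log-linear parameter is the log odds ratio computed in a two-way margin of a higher-way table, which can differ from the conditional log odds ratios within layers. A small sketch with invented counts:

```python
import math

# table[(i, j, k)] = cell count for binary variables A, B, C;
# the counts are invented for illustration.
table = {
    (0, 0, 0): 20, (0, 1, 0): 10, (1, 0, 0): 10, (1, 1, 0): 20,
    (0, 0, 1): 5,  (0, 1, 1): 15, (1, 0, 1): 15, (1, 1, 1): 5,
}

def log_odds_ratio(n00, n01, n10, n11):
    """Log odds ratio of a 2x2 table of counts."""
    return math.log(n00 * n11 / (n01 * n10))

# Marginal AB parameter: collapse the table over C, then take the log OR.
m = {(i, j): sum(table[(i, j, k)] for k in (0, 1))
     for i in (0, 1) for j in (0, 1)}
marginal = log_odds_ratio(m[0, 0], m[0, 1], m[1, 0], m[1, 1])

# Conditional log odds ratios within the layers C = 0 and C = 1.
cond0 = log_odds_ratio(table[0, 0, 0], table[0, 1, 0],
                       table[1, 0, 0], table[1, 1, 0])
cond1 = log_odds_ratio(table[0, 0, 1], table[0, 1, 1],
                       table[1, 0, 1], table[1, 1, 1])
```

In this example the marginal AB association is zero while the conditional associations are strong and of opposite sign, which is why restrictions on marginal parameters define genuinely different models from ordinary log-linear restrictions.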
Soft Evidential Update for Probabilistic Multiagent Systems
 INTERNATIONAL JOURNAL OF APPROXIMATE REASONING
, 2000
Abstract

Cited by 27 (5 self)
We address the problem of updating a probability distribution represented by a Bayesian network upon presentation of soft evidence. Our motivation
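The truncated abstract does not spell out the paper's own update method, but one standard formulation of soft evidential update is Jeffrey's rule: the evidence specifies a new marginal for a variable, and the rest of the joint is updated by preserving the conditionals. A minimal sketch with invented probabilities:

```python
# Joint P(X, Y) over two binary variables; numbers are illustrative.
joint = {(0, 0): 0.30, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.40}

# Soft evidence: an unreliable observation fixes the new marginal of X
# directly (e.g. "X = 1 with probability 0.8") rather than asserting X = 1.
q_x = {0: 0.2, 1: 0.8}

# Jeffrey's rule: Q(x, y) = q(x) * P(y | x), i.e. rescale each X-slice of
# the joint to match the specified marginal.
p_x = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in (0, 1)}
updated = {(x, y): q_x[x] * joint[(x, y)] / p_x[x] for (x, y) in joint}
```

The updated table is a proper distribution whose X-marginal equals the soft evidence, while P(Y | X) is unchanged.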
Network Routing
 Phil. Trans. R. Soc. Lond. A, 337
, 1991
Abstract

Cited by 27 (2 self)
How should flows through a network be organized, so that the network responds sensibly to failures and overloads? The question is currently of considerable technological importance in connection with the development of computer and telecommunication networks, while in various other forms it has a long history in the fields of physics and economics. In all of these areas there is interest in how simple, local rules, often involving random actions, can produce coherent and purposeful behaviour at the macroscopic level. This paper describes some examples from these various fields, and indicates how analogies with fundamental concepts such as energy and price can provide powerful insights into the design of routing schemes for communication networks.
Computing Maximum Likelihood Estimates in Log-Linear Models
, 2006
Abstract

Cited by 19 (3 self)
We develop computational strategies for extended maximum likelihood estimation, as defined in Rinaldo (2006), for general classes of log-linear models of widespread use, under Poisson and product-multinomial sampling schemes. We derive numerically efficient procedures for generating and manipulating design matrices, and we propose various algorithms for computing the extended maximum likelihood estimates of the expectations of the cell counts. These algorithms allow one to identify the set of estimable cell means for any given observable table and can be used to modify traditional goodness-of-fit tests to accommodate a nonexistent MLE. We describe and take advantage of the connections between extended maximum likelihood
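For intuition about what "computing the MLE of the cell means" involves in the simplest case, iterative proportional fitting (IPF) for the independence model on a 2x2 table alternately rescales a trial table to match the observed row and column margins. This is a toy sketch with invented counts, far simpler than the extended-MLE machinery the paper develops:

```python
# Observed 2x2 table of counts (illustrative numbers).
table = [[10.0, 20.0], [30.0, 40.0]]
row_tot = [sum(r) for r in table]
col_tot = [sum(c) for c in zip(*table)]
n = sum(row_tot)

# Start from a constant table and alternately rescale to match the row
# and column margins. For the independence model this converges to the
# closed-form fit m_ij = (row_i * col_j) / n.
m = [[n / 4.0, n / 4.0], [n / 4.0, n / 4.0]]
for _ in range(50):
    for i in range(2):                        # match row margins
        s = sum(m[i])
        m[i] = [x * row_tot[i] / s for x in m[i]]
    for j in range(2):                        # match column margins
        s = m[0][j] + m[1][j]
        m[0][j] *= col_tot[j] / s
        m[1][j] *= col_tot[j] / s
```

The fitted means here are 12, 18, 28, 42, matching row_i * col_j / n exactly; the difficulties the paper addresses (nonexistent MLEs, sampling zeros, general design matrices) only appear in larger, sparser tables.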
Approximate string comparator search strategies for very large administrative lists
, 2005
Abstract

Cited by 16 (3 self)
Rather than collect data from a variety of surveys, it is often more efficient to merge information from administrative lists. Matching of person files might be done using name and date-of-birth as the primary identifying information. There are obvious difficulties with entities having a commonly occurring name such as John Smith, which may occur 30,000+ times (1.5 for each date-of-birth). If there is a 5% typographical error rate in each field, then fast character-by-character searches can miss 20% of true matches among non-commonly occurring records where name plus date-of-birth might be unique. This paper describes some existing solutions and current research directions.
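The kind of typographical error an exact character-by-character search misses can be caught by an approximate string comparator. The abstract does not specify which comparator the paper uses; as a generic illustration, a Levenshtein edit-distance check on invented example names:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn a into b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def similar(a: str, b: str, max_edits: int = 2) -> bool:
    """Treat two field values as a candidate match if they are within a
    small edit distance, tolerating typographical errors."""
    return levenshtein(a.lower(), b.lower()) <= max_edits

# "Jon Smith" is one typo away from "John Smith" and is still flagged,
# whereas an exact comparison would reject it outright.
```

In practice such comparators are combined with blocking strategies so that the quadratic pairwise comparison is only run within small candidate sets, which is the scaling problem the paper's search strategies address.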