Results 1-10 of 14
Discrete Multivariate Analysis: Theory and Practice, 1975
Abstract
Cited by 423 (34 self)
the collaboration of Richard J. Light and Frederick Mosteller.
Computing Maximum Likelihood Estimates in loglinear models, 2006
Abstract
Cited by 11 (3 self)
We develop computational strategies for extended maximum likelihood estimation, as defined in Rinaldo (2006), for general classes of loglinear models in widespread use, under Poisson and product-multinomial sampling schemes. We derive numerically efficient procedures for generating and manipulating design matrices, and we propose various algorithms for computing the extended maximum likelihood estimates of the expectations of the cell counts. These algorithms allow us to identify the set of estimable cell means for any given observable table and can be used to modify traditional goodness-of-fit tests to accommodate a nonexistent MLE. We describe and take advantage of the connections between extended maximum likelihood ...
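The extended-MLE machinery described in this abstract is beyond a short example, but the ordinary MLE for the simplest loglinear model can be sketched with iterative proportional fitting. Everything below (the table, the function name) is invented for illustration and is not taken from the paper:

```python
import numpy as np

# Hypothetical 2x2 table of observed cell counts (illustrative only).
n = np.array([[10.0, 5.0],
              [3.0, 2.0]])

def ipf_independence(table, iters=50):
    """Fit the independence loglinear model by iterative proportional
    fitting: rescale the fitted table to match the observed row
    margins, then the observed column margins, and repeat."""
    m = np.ones_like(table)
    for _ in range(iters):
        m *= table.sum(axis=1, keepdims=True) / m.sum(axis=1, keepdims=True)
        m *= table.sum(axis=0, keepdims=True) / m.sum(axis=0, keepdims=True)
    return m

m_hat = ipf_independence(n)

# For this model the MLE is available in closed form, which makes the
# iterative answer easy to check: m_ij = (row_i total)(col_j total) / N.
closed_form = np.outer(n.sum(axis=1), n.sum(axis=0)) / n.sum()
```

When a row or column margin of the observed table is zero, the ordinary MLE for this model does not exist, which is the kind of situation the paper's extended estimates are designed to handle.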
Three Centuries of Categorical Data Analysis: Loglinear Models and Maximum Likelihood Estimation
Abstract
Cited by 6 (3 self)
The common view of the history of contingency tables is that it begins in 1900 with the work of Pearson and Yule, but it extends back at least into the 19th century. Moreover, it remains an active area of research today. In this paper we give an overview of this history, focusing on the development of loglinear models and their estimation via the method of maximum likelihood. S. N. Roy played a crucial role in this development with two papers coauthored with his students S. K. Mitra and Marvin Kastenbaum, at roughly the temporal midpoint of this development. We then describe a problem that eluded Roy and his students: the implications of sampling zeros for the existence of maximum likelihood estimates for loglinear models. Understanding the problem of nonexistence is crucial to the analysis of large sparse contingency tables. We introduce some relevant results from the application of algebraic geometry to the study of this statistical problem.
Univariate and Bivariate Loglinear Models for Discrete Test Score Distributions, 2000
Abstract
Cited by 5 (0 self)
The well-developed theory of exponential families of distributions is applied to the problem of fitting the univariate histograms and discrete bivariate frequency distributions that often arise in the analysis of test scores. These models are powerful tools for many forms of parametric data smoothing and are particularly well-suited to problems in which there is little or no theory to guide a choice of probability models, e.g., smoothing a distribution to eliminate roughness and zero frequencies in order to equate scores from different tests. Attention is given to efficient computation of the maximum likelihood estimates of the parameters using Newton's method and to computationally efficient methods for obtaining the asymptotic standard errors of the fitted frequencies and proportions. We discuss tools that can be used to diagnose the quality of the fitted frequencies in both the univariate and the bivariate cases. Five examples, using real data, illustrate the methods of this paper.
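As a rough illustration of the Newton computation this abstract mentions, here is a minimal Poisson loglinear fit of a log-quadratic model to a small univariate histogram. The data and design are invented for the sketch; the paper's own models and standard-error formulas are not reproduced:

```python
import numpy as np

# Invented score frequencies for scores 0..5 (illustrative only).
y = np.array([2.0, 5.0, 9.0, 8.0, 4.0, 2.0])
scores = np.arange(len(y), dtype=float)

# Loglinear smoothing design: intercept plus polynomial terms in the score.
X = np.column_stack([np.ones_like(scores), scores, scores**2])

def newton_loglinear(X, y, iters=30):
    """Newton's method for the Poisson loglinear model log(mu) = X @ beta:
    gradient X'(y - mu), Hessian X' diag(mu) X."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start from a flat fit
    for _ in range(iters):
        mu = np.exp(X @ beta)
        beta += np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (y - mu))
    return beta

beta_hat = newton_loglinear(X, y)
mu_hat = np.exp(X @ beta_hat)
```

At the MLE the fitted frequencies reproduce the observed moments carried by the design, i.e. `X.T @ mu_hat` equals `X.T @ y`, which serves as a convenient convergence check.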
Distribution-Free Multivariate Process Control Based on Log-Linear Modeling
Abstract
Cited by 5 (5 self)
This paper considers statistical process control (SPC) when the process measurement is multivariate. In the literature, most existing multivariate SPC procedures assume that the in-control distribution of the multivariate process measurement is known and is Gaussian. In applications, however, the measurement distribution is usually unknown and needs to be estimated from data. Furthermore, multivariate measurements often do not follow a Gaussian distribution (e.g., cases in which some measurement components are discrete). We demonstrate that results from conventional multivariate SPC procedures are usually unreliable when the data are non-Gaussian. Existing statistical tools for describing multivariate non-Gaussian data, or for transforming such data to multivariate Gaussian data, are limited, making appropriate multivariate SPC difficult in these cases. In this paper, we suggest a methodology for estimating the in-control multivariate measurement distribution when a set of in-control data is available; it is based on loglinear modeling and takes into account the association structure among the measurement components. Based on this estimated in-control distribution, a multivariate CUSUM procedure for detecting shifts in the location parameter vector of the measurement distribution is also suggested for Phase II SPC. This procedure does not depend on the Gaussian distribution assumption and is thus appropriate for most multivariate SPC problems.
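For orientation, the classical multivariate CUSUM that a distribution-free procedure of this kind builds on can be sketched as follows: a Crosier-style shrinking of the cumulative deviation vector. All inputs here are illustrative, and the paper's loglinear estimation step is not shown:

```python
import numpy as np

def mcusum(obs, mu0, sigma_inv, k=0.5):
    """Multivariate CUSUM: accumulate deviations from the in-control
    mean mu0, shrink the accumulated vector toward zero by the
    reference value k, and monitor its quadratic-form norm."""
    s = np.zeros_like(mu0)
    stats = []
    for x in obs:
        d = s + (x - mu0)
        c = np.sqrt(d @ sigma_inv @ d)
        s = np.zeros_like(mu0) if c <= k else d * (1.0 - k / c)
        stats.append(np.sqrt(s @ sigma_inv @ s))
    return np.array(stats)

# Illustrative run: five in-control points, then a sustained mean shift.
mu0 = np.zeros(2)
sigma_inv = np.eye(2)
in_control = [np.zeros(2)] * 5
shifted = [np.array([2.0, 0.0])] * 5
stats = mcusum(in_control + shifted, mu0, sigma_inv)
```

In-control points keep the statistic at zero here, while the sustained shift makes it grow; a Phase II signal rule thresholds this statistic.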
The organizers' ecology: An empirical study of foreign banks in Shanghai. Org. Sci., 2006
"... doi 10.1287/orsc.1060.0182 ..."
Suggested Citation
Abstract
Kasprzyk for their helpful comments on earlier drafts of this paper. Clerical assistance was
Methodology Working Paper M06/14: Assessing Identification Risk in Survey Microdata Using Loglinear Models
Abstract
This article considers the assessment of the risk of identification of respondents in survey microdata, in the context of applications at the United Kingdom (UK) Office for National Statistics (ONS). The threat comes from the matching of categorical 'key' variables between microdata records and external data sources, and from the use of loglinear models to facilitate matching. While the potential use of such statistical models is well-established in the literature, little consideration has been given either to model specification or to the sensitivity of risk assessment to this specification. In this article we develop new criteria for assessing the specification of a loglinear model in relation to the accuracy of risk estimates. We find that, within a class of 'reasonable' models, risk estimates tend to decrease as the complexity of the model increases. We develop criteria to detect 'underfitting' (associated with overestimation of the risk). The criteria may also reveal 'overfitting' (associated with underestimation), although not so clearly, so we suggest employing a forward model selection approach. We show how our approach may be used for both file-level and record-level measures of risk. We evaluate the proposed procedures using samples drawn from the 2001 UK Census, where the true risks can be determined. We also apply our approach to a large survey dataset.