Discrete Multivariate Analysis: Theory and Practice, 1975
Cited by 332 (28 self)
the collaboration of Richard J. Light and Frederick Mosteller.
Three Centuries of Categorical Data Analysis: Log-linear Models and Maximum Likelihood Estimation
Cited by 6 (3 self)
The common view of the history of contingency tables is that it begins in 1900 with the work of Pearson and Yule, but it extends back at least into the 19th century. Moreover, it remains an active area of research today. In this paper we give an overview of this history, focussing on the development of log-linear models and their estimation via the method of maximum likelihood. S. N. Roy played a crucial role in this development with two papers co-authored with his students S. K. Mitra and Marvin Kastenbaum, at roughly the temporal mid-point of this development. We then describe a problem that eluded Roy and his students: the implications of sampling zeros for the existence of maximum likelihood estimates for log-linear models. Understanding the problem of non-existence is crucial to the analysis of large sparse contingency tables. We introduce some relevant results from the application of algebraic geometry to the study of this statistical problem.
Some applications of categorical data analysis to epidemiological studies. Environ. Health Perspect. 32: 000, 1979
Cited by 2 (0 self)
Several examples of categorized data from epidemiological studies are analyzed to illustrate that more informative analysis than tests of independence can be performed by fitting models. All of the analyses fit into a unified conceptual framework that can be performed by weighted least squares. The methods presented show how to calculate point estimates of parameters, asymptotic variances, and asymptotically valid χ² tests. The examples presented are analysis of relative risks estimated from several 2 × 2 tables, analysis of selected features of life tables, construction of synthetic life tables from cross-sectional studies, and analysis of dose-response curves.
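As a rough illustration of the weighted least squares idea in this abstract, the sketch below pools log relative risks from several 2 × 2 tables using inverse-variance weights, which is the intercept-only special case of weighted least squares. The function name, the table layout `(a, n1, c, n2)`, and the homogeneity statistic are assumptions made for this example; they are not taken from the paper and may differ from its actual analyses.

```python
import math


def pooled_log_relative_risk(tables):
    """Pool log relative risks across 2 x 2 tables by inverse-variance weighting.

    Each table is a tuple (a, n1, c, n2) (a hypothetical layout):
      a  = cases among the n1 exposed subjects
      c  = cases among the n2 unexposed subjects
    All counts are assumed to be positive, with a < n1 and c < n2.

    Returns (estimate, variance, q, df), where q is a chi-square statistic
    for homogeneity of the log relative risks on df degrees of freedom.
    """
    log_rrs = []
    weights = []
    for a, n1, c, n2 in tables:
        log_rr = math.log((a / n1) / (c / n2))
        # Large-sample (delta-method) variance of the log relative risk.
        var = 1.0 / a - 1.0 / n1 + 1.0 / c - 1.0 / n2
        log_rrs.append(log_rr)
        weights.append(1.0 / var)

    total_weight = sum(weights)
    estimate = sum(w * l for w, l in zip(weights, log_rrs)) / total_weight
    variance = 1.0 / total_weight
    # Weighted residual sum of squares: tests whether one common value fits.
    q = sum(w * (l - estimate) ** 2 for w, l in zip(weights, log_rrs))
    df = len(log_rrs) - 1
    return estimate, variance, q, df
```

Exponentiating the estimate gives a pooled relative risk, and a large `q` relative to a χ² distribution with `df` degrees of freedom suggests the tables do not share a common relative risk.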
Algebraic Descriptions of Nominal Multivariate Discrete Data - J. Multivariate Anal, 1995
Cited by 1 (0 self)
Traditionally, multivariate discrete data are analyzed by means of log-linear models. In this paper we show how an algebraic approach leads naturally to alternative models, parametrized in terms of the moments of the distribution. Moreover we derive a complete characterization of all meaningful transformations of the components and show how transformations affect the moments of a distribution. It turns out that our models provide the necessary formal description of longitudinal data; moreover in the classical case, they can be considered as an analysis tool, complementary to log-linear models.
1 Introduction
We start with a given multivariate discrete nominal variable X. Questions of interest about X can be roughly divided into two groups. One group is related to conditional characteristics such as conditional independencies or questions concerning the sign and/or magnitude of log-odds ratios. The other group focuses on marginal characteristics such as marginal independencies or multiv...
Local Estimators in Multivariate Generalized Linear Models With Varying-Coefficients, 1997
Cited by 1 (0 self)
Introduction
In varying-coefficient models as considered by Hastie & Tibshirani (1993) coefficients are allowed to change smoothly across the value of other variables, the so-called effect modifiers. That means one has two types of regressors, the usual covariate x and the effect modifier u. For the wide class of multivariate generalized linear models the varying-coefficient type has the form $E(y \mid x, u) = h(Z(x)\beta(u))$ (1) where $h : \mathbb{R}^q \to \mathbb{R}^q$ is the response function, $Z(x)$ is a design matrix composed from covariates x and $\beta(u)$ is the parameter vector varying across values of u. If $\beta(u)$ is non varying but a fixed c
PROPORTIONAL ODDS AND PARTIAL PROPORTIONAL ODDS MODELS FOR ORDINAL RESPONSE VARIABLES, 1986
Cochran-Mantel-Haenszel Techniques: Applications Involving Epidemiologic Survey Data
Cited by 1 (0 self)
In epidemiologic research, data are often collected that can be summarized in three-way contingency tables. Typically, the presence or absence of disease or other health outcome is cross-classified with a second categorical variable such as the presence or absence of exposure and a third categorical variable which may represent a single factor (e.g., race), or, more generally, may represent the complete set of combinations of the levels of several variables (race, gender, age, etc.). The purpose of analysis of such three-way tables is often to quantify the association between the first two variables while controlling for the third. Several methods based on the work of Cochran (1954) and Mantel and Haenszel (1959) are commonly used for this purpose. These Cochran-Mantel-Haenszel (CMH) techniques include Mantel-Haenszel odds ratio, rate difference, and rate ratio estimators (see, for example, Sato, 1990), and various special cases of the generalized Cochran-Mantel-Haenszel tests (Landis, He...
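To make the stratified setting in this abstract concrete, here is a minimal sketch of the Mantel-Haenszel common odds ratio estimator for a three-way table stored as a list of 2 × 2 strata. The function name and the `(a, b, c, d)` cell layout are assumptions made for illustration, and the sketch ignores survey weights and design effects, which the paper's title suggests are part of its subject.

```python
def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel estimate of a common odds ratio across 2 x 2 strata.

    Each stratum is a tuple (a, b, c, d) (a hypothetical layout):
      a = exposed with disease,    b = exposed without disease,
      c = unexposed with disease,  d = unexposed without disease.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("estimator undefined: sum of b*c/n is zero")
    return numerator / denominator
```

With a single stratum the estimator reduces to the ordinary cross-product ratio `a*d / (b*c)`; with several strata it is a weighted average of the stratum-specific odds ratios, which is how it controls for the third variable.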
Bridging the Gap between the Theory and Practice of Analysis of Data from Complex Surveys
So as not to publish misleading results, subject matter analysts at Statistics Canada are urged to take account of the complexities of the survey design when doing analysis using data from Statistics Canada’s surveys. While commercial software packages that incorporate methods for controlling for features of the sample design are becoming more readily available and more efficient to use, analysts without some background in survey theory still have difficulty in knowing how to proceed. Statistics Canada has a small unit called the Data Analysis Resource Centre (DARC) whose purpose is to provide specialized services in analysis of statistical data. One of the major activities of DARC is the support of subject matter analysts who are using data from surveys with complex designs. This paper will present some of our experiences in DARC with assisting analysts in doing their research and some “tips and traps” that we have identified. Because of the practice in many publications, due to space restrictions, of presenting descriptive estimates without their corresponding variance estimates, new analysts of survey data are frequently surprised that they require more data about the survey design than just the final weights in order to produce acceptable variance estimates. These analysts, who are generally secondary users of survey data rather than having been involved in the implementation of the survey, welcome assistance with possible approaches to accounting for the actual survey design and tips on different software packages that can implement the approach that best suits their needs. In order to communicate our advice on these topics, DARC
Model Checking for Incomplete High Dimensional Categorical Data, 1999
OF THE DISSERTATION Model Checking for Incomplete High Dimensional Categorical Data by Ming-Yi Hu Doctor of Philosophy in Statistics University of California, Los Angeles, 1999 Professor Thomas R. Belin, Co-chair Professor Robert I. Jennrich, Co-chair
Categorical data are often arranged in a contingency table and summarized by a loglinear model. A standard approach for comparing two competing models is to calculate twice the discrepancy between maximized loglikelihoods, which follows a χ² distribution asymptotically. But when data are sparse, the χ² approximation may be questionable. As an alternative to a large-sample approximation to the reference distribution, we implement the framework introduced by Rubin (1984) for finding the posterior predictive check (PPC) distribution. The PPC distribution represents the conditional probability of a future value of a test statistic based on the information given by observed data along with model specifications, which can se...
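The "twice the discrepancy between maximized loglikelihoods" statistic mentioned in this abstract can be sketched for the simplest case: a two-way table, comparing the independence loglinear model against the saturated model. That choice of models, the function name, and the table representation are assumptions for illustration only; the dissertation concerns incomplete high-dimensional tables, which this sketch does not address.

```python
import math


def g2_independence(table):
    """Likelihood-ratio statistic G^2 for independence in a two-way table.

    `table` is a list of rows of non-negative counts with a positive total.
    G^2 = 2 * sum(observed * log(observed / expected)), which equals twice
    the difference in maximized multinomial log-likelihoods between the
    saturated model and the independence model.
    Returns (g2, df), with df = (rows - 1) * (columns - 1).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # zero cells contribute nothing to the sum
                expected = row_totals[i] * col_totals[j] / n
                g2 += 2.0 * observed * math.log(observed / expected)

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return g2, df
```

In large samples `g2` is referred to a χ² distribution with `df` degrees of freedom. When many cells are small or zero, that reference distribution can be a poor approximation, which is the situation the abstract describes.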

