Algebraic Algorithms for Sampling from Conditional Distributions
 Annals of Statistics
, 1995
Cited by 209 (19 self)
We construct Markov chain algorithms for sampling from discrete exponential families conditional on a sufficient statistic. Examples include generating tables with fixed row and column sums and higher dimensional analogs. The algorithms involve finding bases for associated polynomial ideals and so an excursion into computational algebraic geometry.
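For the simplest example named in the abstract above, two-way tables with fixed row and column sums, a random walk can be sketched as follows. This is a minimal illustration, not the paper's general algorithm: it assumes the standard "basic moves" (adding +1/-1 on a 2x2 sub-table) and targets the uniform distribution over tables with the given margins. The function names `basic_move_step` and `run_walk` are hypothetical, and the Gröbner-basis computation that the paper uses for higher dimensional cases is not attempted here.

```python
import random


def basic_move_step(table, rng):
    """Try one basic move in place; return True if the table changed.

    A basic move picks two distinct rows and two distinct columns and adds
    +s to two opposite corners and -s to the other two (s = +1 or -1).
    Row and column sums are unchanged by construction.
    """
    n_rows, n_cols = len(table), len(table[0])
    i1, i2 = rng.sample(range(n_rows), 2)
    j1, j2 = rng.sample(range(n_cols), 2)
    s = rng.choice((1, -1))
    proposed = {
        (i1, j1): table[i1][j1] + s,
        (i2, j2): table[i2][j2] + s,
        (i1, j2): table[i1][j2] - s,
        (i2, j1): table[i2][j1] - s,
    }
    # Reject moves that would create a negative cell; the walk stays put.
    if min(proposed.values()) < 0:
        return False
    for (i, j), value in proposed.items():
        table[i][j] = value
    return True


def run_walk(table, steps, seed=0):
    """Run the walk for a number of steps on a copy of the table."""
    rng = random.Random(seed)
    current = [row[:] for row in table]
    for _ in range(steps):
        basic_move_step(current, rng)
    return current
```

Because each proposal is symmetric and rejected moves leave the table unchanged, the walk's stationary distribution is uniform on the tables it can reach. Sampling from a non-uniform conditional distribution would need an additional Metropolis acceptance step, which this sketch omits. The input table must have at least two rows and two columns.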
Computing Maximum Likelihood Estimates in loglinear models
, 2006
Cited by 13 (3 self)
We develop computational strategies for extended maximum likelihood estimation, as defined in Rinaldo (2006), for general classes of loglinear models of widespread use, under Poisson and product-multinomial sampling schemes. We derive numerically efficient procedures for generating and manipulating design matrices and we propose various algorithms for computing the extended maximum likelihood estimates of the expectations of the cell counts. These algorithms make it possible to identify the set of estimable cell means for any given observable table and can be used for modifying traditional goodness-of-fit tests to accommodate a nonexistent MLE. We describe and take advantage of the connections between extended maximum likelihood
Triangular systems for symmetric binary variables
 Electr. J. Statist
, 2009
Cited by 8 (6 self)
We introduce and study distributions of sets of binary variables that are symmetric, that is, each has equally probable levels. The joint distribution of these special types of binary variables, if generated by a recursive process of linear main effects, is essentially parametrized in terms of marginal correlations. This contrasts with the loglinear formulation of joint probabilities in which parameters measure conditional associations given all remaining variables. The new formulation permits useful comparisons of different types of graphical Markov models and leads to a close approximation of Gaussian orthant probabilities.
Three Centuries of Categorical Data Analysis: Loglinear Models and Maximum Likelihood Estimation
Cited by 8 (2 self)
The common view of the history of contingency tables is that it begins in 1900 with the work of Pearson and Yule, but it extends back at least into the 19th century. Moreover it remains an active area of research today. In this paper we give an overview of this history focussing on the development of loglinear models and their estimation via the method of maximum likelihood. S. N. Roy played a crucial role in this development with two papers coauthored with his students S. K. Mitra and Marvin Kastenbaum, at roughly the midpoint temporally in this development. Then we describe a problem that eluded Roy and his students, that of the implications of sampling zeros for the existence of maximum likelihood estimates for loglinear models. Understanding the problem of nonexistence is crucial to the analysis of large sparse contingency tables. We introduce some relevant results from the application of algebraic geometry to the study of this statistical problem.
Sequences of regressions and their independences
, 2012
Cited by 6 (1 self)
Ordered sequences of univariate or multivariate regressions provide statistical models for analysing data from randomized, possibly sequential interventions, from cohort or multiwave panel studies, but also from cross-sectional or retrospective studies. Conditional independences are captured by what we name regression graphs, provided the generated distribution shares some properties with a joint Gaussian distribution. Regression graphs extend purely directed, acyclic graphs by two types of undirected graph, one type for components of joint responses and the other for components of the context vector variable. We review the special features and the history of regression graphs, prove criteria for Markov equivalence and discuss the notion of simpler statistical covering models. Knowledge of Markov equivalence provides alternative interpretations of a given sequence of regressions, is essential for machine learning strategies and permits the use of the simple graphical criteria of regression graphs on graphs for which the corresponding criteria are in general more complex. Under the known conditions that a Markov equivalent directed acyclic graph exists for any given regression graph, we give a polynomial time algorithm to find one such graph.
FCFS infinite bipartite matching of servers and customers
 Adv. Appl. Probab
Cited by 6 (5 self)
We consider an infinite sequence of customers of types C = {1, 2,..., I} and an infinite sequence of servers of types S = {1,..., J}, where a server of type j can serve a subset of customer types C(j) and where a customer of type i can be served by a subset of server types S(i). We assume the types of customers and servers in the infinite sequences are random, independent and identically distributed, and customers and servers are matched according to their order in the sequence, on a first come first served (FCFS) basis. We investigate this process of infinite bipartite matching. In particular we are interested in the limiting rate r_{i,j} of customers of type i assigned to servers of type j. We present a countable state Markov chain to describe this process, we prove ergodicity and existence of limiting rates, and calculate r_{i,j} for some previously unsolved instances.
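The matching rule described in the abstract above can be illustrated on finite prefixes of the two sequences. The sketch below is an assumption-laden simulation, not the paper's Markov chain analysis: it matches each server, in order, to the earliest still-unmatched customer it can serve, and it estimates the rates r_{i,j} empirically from the matched pairs. The names `fcfs_match`, `estimate_rates`, and the `compatible` mapping (server type to the set C(j) of customer types it can serve) are hypothetical.

```python
from collections import Counter


def fcfs_match(customers, servers, compatible):
    """Match finite prefixes of the customer and server sequences FCFS.

    customers, servers: lists of type labels in arrival order.
    compatible: dict mapping a server type j to the set C(j) of customer
    types it can serve.
    Returns a list of (customer_type, server_type) pairs, one per server
    that found a compatible unmatched customer inside the prefix.
    """
    matched = [False] * len(customers)
    pairs = []
    for server_type in servers:
        for position, customer_type in enumerate(customers):
            if not matched[position] and customer_type in compatible[server_type]:
                matched[position] = True
                pairs.append((customer_type, server_type))
                break
    return pairs


def estimate_rates(pairs):
    """Empirical fraction of matches of each (customer, server) type pair."""
    counts = Counter(pairs)
    total = len(pairs)
    return {pair: count / total for pair, count in counts.items()}
```

For example, with `compatible = {"a": {1}, "b": {1, 2}}`, customers `[2, 1, 1]` and servers `["a", "b", "b"]`, the first server skips the type-2 customer and takes the first type-1 customer, after which the two `"b"` servers take the remaining customers in order. Servers with no compatible unmatched customer inside the finite prefix are simply left unmatched, so the estimated rates are only an approximation of the limiting rates; the inner scan is quadratic and meant for small experiments.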
Sparse Contingency Tables and High-Dimensional Log-Linear Models for Alternative Splicing in Full-Length cDNA Libraries
 Research Report 132, Swiss Federal Institute of Technology
, 2006
Cited by 3 (0 self)
Corinne Dahinden is a PhD student at the Seminar für Statistik, ETH Zürich, CH-8092 Zürich,
Data Engineering
Cited by 2 (0 self)
A growing number of applications need access to video data stored in digital form on secondary storage devices (e.g., video-on-demand, multimedia messaging). As a result, video servers that are responsible for the storage and retrieval, at fixed rates, of hundreds of videos from disks are becoming increasingly important. Since video data tends to be voluminous, several disks are usually used in order to store the videos. A challenge is to devise schemes for the storage and retrieval of videos that distribute the workload evenly across disks, reduce the cost of the server and at the same time, provide good response times to client requests for video data. In this paper, we present schemes that are based on striping videos (fine-grained as well as coarse-grained) across disks in order to effectively utilize disk bandwidth. For the schemes, we show how an optimal-cost server architecture can be determined if data for a certain prespecified number of videos is to be concurrently retrieved...