Results 1-10 of 43
Algebraic Algorithms for Sampling from Conditional Distributions
Annals of Statistics, 1995
Cited by 192 (16 self)
We construct Markov chain algorithms for sampling from discrete exponential families conditional on a sufficient statistic. Examples include generating tables with fixed row and column sums and higher dimensional analogs. The algorithms involve finding bases for associated polynomial ideals and so an excursion into computational algebraic geometry.
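As a rough illustration of the kind of chain this abstract describes, the sketch below walks over two-way tables with fixed row and column sums using +1/-1 moves on a random 2 × 2 minor. It is a minimal sketch under stated assumptions, not the authors' implementation: it targets the uniform distribution only, and the names `basic_move_step` and `run_chain` are hypothetical.

```python
import random

def basic_move_step(table, rng):
    """Propose one +/-1 move on a random 2x2 minor; reject if it goes negative."""
    n_rows, n_cols = len(table), len(table[0])
    i1, i2 = rng.sample(range(n_rows), 2)   # two distinct rows
    j1, j2 = rng.sample(range(n_cols), 2)   # two distinct columns
    sign = rng.choice((1, -1))
    # The move adds `sign` on one diagonal of the minor and subtracts it on
    # the other, so every row sum and column sum is left unchanged.
    if sign == 1:
        blocked = table[i1][j2] == 0 or table[i2][j1] == 0
    else:
        blocked = table[i1][j1] == 0 or table[i2][j2] == 0
    if blocked:
        return  # stay in place: the proposal would create a negative entry
    table[i1][j1] += sign
    table[i2][j2] += sign
    table[i1][j2] -= sign
    table[i2][j1] -= sign

def run_chain(table, steps, seed=0):
    """Run the chain for `steps` steps from a copy of `table` (assumed >= 2x2)."""
    rng = random.Random(seed)
    current = [row[:] for row in table]
    for _ in range(steps):
        basic_move_step(current, rng)
    return current
```

For example, `run_chain([[3, 1, 2], [0, 4, 1]], 1000)` returns a table with the same margins as the starting table. How many steps are needed before the output is close to uniform is a separate question that this sketch does not address.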
A random polynomial-time algorithm for approximating the volume of convex bodies
Journal of the ACM, 1991
Cited by 115 (9 self)
We consider the problem of counting the number of contingency tables with given row and column sums. This problem is known to be #P-complete, even when there are only two rows [7]. In this paper we present the first fully polynomial randomized approximation scheme for counting contingency tables when the number of rows is constant. A novel feature of our algorithm is that it is a hybrid of an exact counting technique with an approximation algorithm, giving two distinct phases. In the first, the columns are partitioned into “small” and “large”. We show that the number of contingency tables can be expressed as the weighted sum of a polynomial number of new instances of the problem, where each instance consists of some new row sums and the original large column sums. In the second phase, we show how to approximately count contingency tables when all the column sums are large. In this case, we show that the solution lies in approximating the volume of a single convex body, a problem which is known to be solvable in polynomial time [5].
Sequential Monte Carlo methods for statistical analysis of tables
J. Amer. Statist. Assoc.
Cited by 51 (10 self)
We describe a sequential importance sampling (SIS) procedure for analyzing two-way zero–one or contingency tables with fixed marginal sums. An essential feature of the new method is that it samples the columns of the table progressively according to certain special distributions. Our method produces Monte Carlo samples that are remarkably close to the uniform distribution, enabling one to approximate closely the null distributions of various test statistics about these tables. Our method compares favorably with other existing Monte Carlo-based algorithms, and sometimes is a few orders of magnitude more efficient. In particular, compared with Markov chain Monte Carlo (MCMC)-based approaches, our importance sampling method not only is more efficient in terms of absolute running time and frees one from pondering over the mixing issue, but also provides an easy and accurate estimate of the total number of tables with fixed marginal sums, which is far more difficult for an MCMC method to achieve.
Mathematical foundations of the Markov chain Monte Carlo method
in Probabilistic Methods for Algorithmic Discrete Mathematics, 1998
Cited by 30 (1 self)
7.2 was jointly undertaken with Vivek Gore, and is published here for the first time. I also thank an anonymous referee for carefully reading and providing helpful comments on a draft of this chapter. 1. Introduction. The classical Monte Carlo method is an approach to estimating quantities that are hard to compute exactly. The quantity z of interest is expressed as the expectation z = Exp Z of a random variable (r.v.) Z for which some efficient sampling procedure is available. By taking the mean of some sufficiently large set of independent samples of Z, one may obtain an approximation to z. For example, suppose $S = \{(x, y) \in [0, 1]^2 : p_i(x, y) \le 0 \text{ for all } i\}$ ...
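The classical Monte Carlo idea in this excerpt can be sketched as follows: draw independent uniform points in the unit square and average the indicator of membership in S. This is a minimal sketch with assumptions made explicit: the region is taken to be defined by inequalities of the form p_i(x, y) ≤ 0 (the extracted text lost the inequality sign), and `estimate_area` and the quarter-disc example are hypothetical illustrations, not taken from the chapter.

```python
import random

def estimate_area(constraints, n_samples, seed=0):
    """Estimate the area of {(x, y) in [0,1]^2 : p(x, y) <= 0 for all p}.

    Each sample contributes Z = 1 if the point lies in the region and
    Z = 0 otherwise; the sample mean of Z approximates the area.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if all(p(x, y) <= 0 for p in constraints):
            hits += 1
    return hits / n_samples

# Assumed example region: the quarter disc x^2 + y^2 - 1 <= 0, with area pi/4.
quarter_disc = [lambda x, y: x * x + y * y - 1.0]
estimate = estimate_area(quarter_disc, 100_000)
```

With 100,000 samples the standard error of the estimate is roughly 0.0013, so `estimate` should land close to π/4 ≈ 0.785.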
The complexity of three-way statistical tables
SIAM J. Comput., 2004
Cited by 26 (7 self)
Multiway tables with specified marginals arise in a variety of applications in statistics and operations research. We provide a comprehensive complexity classification of three fundamental computational problems on tables: existence, counting, and entry-security. One outcome of our work is that each of the following problems is intractable already for “slim” 3-tables, with constant number 3 of rows: (1) deciding existence of 3-tables with specified 2-marginals; (2) counting all 3-tables with specified 2-marginals; (3) deciding whether a specified value is attained in a specified entry by at least one of the 3-tables having the same 2-marginals as a given table. This implies that a characterization of feasible marginals for such slim tables, sought by much recent research, is unlikely to exist. Another consequence of our study is a systematic efficient way of embedding the set of 3-tables satisfying any given 1-marginals and entry upper bounds in a set of slim 3-tables satisfying suitable 2-marginals with no entry bounds. This provides a valuable tool for studying multi-index transportation problems and multi-index transportation polytopes. Remarkably, it enables us to automatically recover a famous example due to Vlach of a “real-feasible integer-infeasible” collection of 2-marginals for 3-tables of smallest possible size (3, 4, 6).
A Divide-and-Conquer Algorithm for Generating Markov Bases of Multiway Tables
2003
Cited by 23 (8 self)
We describe a divide-and-conquer technique for generating a Markov basis that connects all tables of counts having a fixed set of marginal totals.
Polynomial-Time Counting and Sampling of Two-Rowed Contingency Tables
Theoretical Computer Science, 1998
Cited by 22 (6 self)
In this paper a Markov chain for contingency tables with two rows is defined. The chain is shown to be rapidly mixing using the path coupling method. The mixing time of the chain is quadratic in the number of columns and linear in the logarithm of the table sum. We prove a lower bound for the mixing time, which is quadratic in the number of columns and linear in the logarithm of the number of columns. Two extensions of the new chain are discussed: one for three-rowed contingency tables and one for m-rowed contingency tables. We show that, unfortunately, it is not possible to prove rapid mixing for these chains by simply extending the path coupling approach used in the two-rowed case. 1 Introduction. A contingency table is a matrix of nonnegative integers with prescribed positive row and column sums. Contingency tables are used in statistics to store data from sample surveys (see for example [3, Chapter 8]). For a survey of contingency tables and related problems, see [8]. The data is o...
Rapidly mixing Markov chains for sampling contingency tables with constant number of rows
Proceedings of the 43rd Annual Symposium on Foundations of Computer Science (FOCS), 2002
Cited by 21 (4 self)
We consider the problem of sampling almost uniformly from the set of contingency tables with given row and column sums, when the number of rows is a constant. Cryan and Dyer [J. Comput. System Sci., 67 (2003), pp. 291–310] have recently given a fully polynomial randomized approximation scheme (fpras) for the related counting problem, which employs Markov chain methods indirectly. They leave open the question as to whether a natural Markov chain on such tables mixes rapidly. Here we show that the “2 × 2 heat-bath” Markov chain is rapidly mixing. We prove this by considering first a heat-bath chain operating on a larger window. Using techniques developed ...
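One common reading of a 2 × 2 heat-bath step is: pick two rows and two columns, then replace the 2 × 2 window they define with a uniformly random window having the same window row and column sums. The sketch below follows that reading; it is an assumption about the chain's definition rather than a transcription from the paper, and `heat_bath_step` and `run_heat_bath` are hypothetical names.

```python
import random

def heat_bath_step(table, rng):
    """Resample a random 2x2 window uniformly, keeping its margins fixed."""
    n_rows, n_cols = len(table), len(table[0])
    i1, i2 = rng.sample(range(n_rows), 2)   # two distinct rows
    j1, j2 = rng.sample(range(n_cols), 2)   # two distinct columns
    r1 = table[i1][j1] + table[i1][j2]      # window row sums
    r2 = table[i2][j1] + table[i2][j2]
    c1 = table[i1][j1] + table[i2][j1]      # first window column sum
    # The top-left entry a determines the whole window. All four entries
    # are nonnegative exactly when max(0, c1 - r2) <= a <= min(r1, c1).
    a = rng.randint(max(0, c1 - r2), min(r1, c1))
    table[i1][j1] = a
    table[i1][j2] = r1 - a
    table[i2][j1] = c1 - a
    table[i2][j2] = r2 - (c1 - a)

def run_heat_bath(table, steps, seed=0):
    """Run the chain for `steps` steps from a copy of `table` (assumed >= 2x2)."""
    rng = random.Random(seed)
    current = [row[:] for row in table]
    for _ in range(steps):
        heat_bath_step(current, rng)
    return current
```

Because each step only redistributes mass inside one window while preserving that window's margins, the full table's row and column sums are preserved at every step.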
Sequential importance sampling for multiway tables
Annals of Statistics, 2005
Cited by 21 (3 self)
We describe an algorithm for the sequential sampling of entries in multiway contingency tables with given constraints. The algorithm can be used for computations in exact conditional inference. To justify the algorithm, a theory relates sampling values at each step to properties of the associated toric ideal using computational commutative algebra. In particular, the property of interval cell counts at each step is related to exponents on lead indeterminates of a lexicographic Gröbner basis. Also, the approximation of integer programming by linear programming for sampling is related to initial terms of a toric ideal. We apply the algorithm to examples of contingency tables which appear in the social and medical sciences. The numerical results demonstrate that the theory is applicable and that the algorithm performs well.
Sampling binary contingency tables with a greedy start (Preprint)
2005
Cited by 21 (6 self)
We study the problem of counting and randomly sampling binary contingency tables. For given row and column sums, we are interested in approximately counting (or sampling) 0/1 n × m matrices with the specified row/column sums. We present a simulated annealing algorithm with running time $O((nm)^2 D^3 d_{\max} \log^5(n + m))$ for any row/column sums, where $D$ is the number of nonzero entries and $d_{\max}$ is the maximum row/column sum. In the worst case, the running time of the algorithm is $O(n^{11} \log^5 n)$ for an n × n matrix. This is the first algorithm to directly solve binary contingency tables for all row/column sums. Previous work reduced the problem to the permanent, or restricted attention to row/column sums that are close to regular. The interesting aspect of our simulated annealing algorithm is that it starts at a nontrivial instance, whose solution relies on the existence of short alternating paths in the graph constructed by a particular Greedy algorithm.