Probability Approximations via the Poisson Clumping Heuristic: An Update
1992
Cited by 137 (5 self)
s (which includes the geometric tails of some of our queueing examples) is given by Goldie [26]. C34 Rice's formula. Formalizations of the "local" version of the heuristic from Rice's formula, $P(M_t > b) \sim t\rho_b$ as $b \to \infty$ with $t$ fixed, are given by Albin [2]. C41 Stein's method for extremes of stationary sequences. Barbour et al. [13] study the Poisson approximation for the sojourn time of a stationary discrete-time process above a high level in several examples: moving averages with non-negative weights (C5); Gaussian sequences with non-negative correlations. The Gaussian case is considered further in Holst and Janson [28] and in Arratia et al. [8]. C42 The Compound Poisson Point Process Limit. As remarked in (A12.2), one can formalize a crude version of the heuristic by looking at compound Poisson point process limits. In the context o
Probabilistic and Statistical Properties of Words: An Overview
Journal of Computational Biology, 2000
Cited by 68 (1 self)
In the following, an overview is given on statistical and probabilistic properties of words, as occurring in the analysis of biological sequences. Counts of occurrence, counts of clumps, and renewal counts are distinguished, and exact distributions as well as normal approximations, Poisson process approximations, and compound Poisson approximations are derived. Here, a sequence is modelled as a stationary ergodic Markov chain; a test for determining the appropriate order of the Markov chain is described. The convergence results take the error made by estimating the Markovian transition probabilities into account. The main tools involved are moment generating functions, martingales, Stein’s method, and the Chen-Stein method. Similar results are given for occurrences of multiple patterns, and, as an example, the problem of unique recoverability of a sequence from SBH chip data is discussed. Special emphasis lies on disentangling the complicated dependence structure between word occurrences, due to self-overlap as well as due to overlap between words. The results can be used to derive approximate, and conservative, confidence intervals for tests. Key words: word counts, renewal counts, Markov model, exact distribution, normal approximation, Poisson process approximation, compound Poisson approximation, occurrences of multiple words, sequencing by hybridization, martingales, moment generating functions, Stein’s method, Chen-Stein method.
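The three kinds of counts distinguished in this abstract can be illustrated with a minimal Python sketch. The function names and the exact conventions are assumptions made for illustration, not taken from the paper: a clump is treated as a maximal run of occurrences in which each one overlaps the previous one, and renewal counts are obtained by scanning left to right and skipping any occurrence that overlaps the last counted one.

```python
def occurrence_starts(seq, word):
    """Start positions of all occurrences of word in seq, overlaps included."""
    k = len(word)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] == word]


def count_clumps(starts, k):
    """Count maximal runs of occurrences in which each overlaps the previous one."""
    clumps = 0
    prev = None
    for s in starts:
        # A new clump begins when this occurrence shares no letter with the previous one.
        if prev is None or s >= prev + k:
            clumps += 1
        prev = s
    return clumps


def count_renewals(starts, k):
    """Count occurrences greedily from the left, skipping those that overlap a counted one."""
    renewals = 0
    next_free = 0
    for s in starts:
        if s >= next_free:
            renewals += 1
            next_free = s + k
    return renewals


if __name__ == "__main__":
    seq, word = "ACACACACA", "ACA"
    starts = occurrence_starts(seq, word)
    print(len(starts), count_clumps(starts, len(word)), count_renewals(starts, len(word)))
```

For a self-overlapping word such as ACA the three counts can all differ, which is the dependence structure the abstract refers to.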
Sequence Comparison Significance and Poisson Approximation
Stat. Sci, 1994
Cited by 31 (4 self)
The Chen-Stein method of Poisson approximation has been used to establish theorems about comparison of two DNA or protein sequences. The most useful result for sequence alignment applies to alignment scoring for aligned letters and no gaps. However there has not been a valid method to assign statistical significance to alignment scores with gaps. In this paper we extend Poisson approximation techniques using the Aldous clumping heuristic to a practical method of estimating statistical significance.
Poisson process approximation for sequence repeats, and sequencing by hybridization
J. of Computational Biology, 1996
Cited by 18 (0 self)
Sequencing by hybridization is a tool to determine a DNA sequence from the unordered list of all l-tuples contained in this sequence; typical numbers for l are l = 8, 10, 12. For theoretical purposes we assume that the multiset of all l-tuples is known. This multiset determines the DNA sequence uniquely if none of the so-called Ukkonen transformations are possible. These transformations require repeats of (l-1)-tuples in the sequence, with these repeats occurring in certain spatial patterns. We model DNA as an i.i.d. sequence. We first prove Poisson process approximations for the process of indicators of all leftmost long repeats allowing self-overlap and for the process of indicators of all leftmost long repeats without self-overlap. Using the Chen-Stein method, we get bounds on the error of these approximations. As a corollary, we approximate the distribution of longest repeats. In the second step we analyze the spatial patterns of the repeats. Finally we combine these two steps to prove an approximation for the probability that a random sequence is uniquely recoverable from its list of l-tuples. For all our results we give some numerical examples including error bounds. Key words: sequencing by hybridization, sequence repeats, DNA sequences, Chen-Stein method, Poisson process approximation, Ukkonen transformations.
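As a rough illustration of the objects in this abstract, the following Python sketch builds the multiset of l-tuples of a sequence and lists the (l-1)-tuples that occur more than once. The function names are hypothetical. Repeated (l-1)-tuples are only a necessary condition for non-unique recovery; the sketch does not check the spatial patterns required by the Ukkonen transformations.

```python
from collections import Counter


def tuple_multiset(seq, l):
    """Multiset (as a Counter) of all length-l substrings of seq."""
    return Counter(seq[i:i + l] for i in range(len(seq) - l + 1))


def repeated_subtuples(seq, l):
    """The (l-1)-tuples that occur at least twice in seq, with their counts."""
    counts = tuple_multiset(seq, l - 1)
    return {w: c for w, c in counts.items() if c > 1}


if __name__ == "__main__":
    seq = "ACGTACGA"
    print(tuple_multiset(seq, 3))
    print(repeated_subtuples(seq, 3))
```

A sequence with no repeated (l-1)-tuple admits no such transformation, so an empty result from `repeated_subtuples` is consistent with unique recoverability under the abstract's criterion.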
Stein’s method and Plancherel measure of the symmetric group
Cited by 15 (7 self)
We initiate a Stein’s method approach to the study of the Plancherel measure of the symmetric group. A new proof of Kerov’s central limit theorem for character ratios of random representations of the symmetric group on transpositions is obtained; the proof gives an error term. The construction of an exchangeable pair needed for applying Stein’s method arises from the theory of harmonic functions on Bratteli diagrams. We also find the spectrum of the Markov chain on partitions underlying the construction of the exchangeable pair. This yields an intriguing method for studying the asymptotic decomposition of tensor powers of some representations of the symmetric group.
Poisson approximation for functionals of random trees
and Alg, 1996
Cited by 13 (2 self)
We use Poisson approximation techniques for sums of indicator random variables to derive explicit error bounds and central limit theorems for several functionals of random trees. In particular, we consider (i) the number of comparisons for successful and unsuccessful search in a binary search tree and (ii) internode distances in increasing trees. The Poisson approximation setting is shown to be a natural and fairly simple framework for deriving asymptotic results.
Inequalities for Rare Events in Time-Reversible Markov Chains I
Stochastic Inequalities, IMS, 1992
Cited by 12 (1 self)
The distribution of waiting time until a rare event is often approximated by the exponential distribution. In the context of first hitting times for stationary reversible chains, the error has a simple explicit bound involving only the mean waiting time $ET$ and the relaxation time $\tau$ of the chain. We recall general upper and lower bounds on $ET$ and then discuss improvements available in the case $ET \gg \tau$ where the exponential approximation holds. In a sequel, Stein's method will be used to get explicit bounds on the Poisson approximation for the number of non-adjacent visits to a rare subset. 1 Introduction. The Poisson approximation for numbers of rare events which actually occur, and the exponential approximation for the waiting time until first occurrence of a rare event, are useful throughout many areas of probability; one view of this big picture is presented in Aldous (1989). Here we study explicit bounds in these approximations, in the special setting of hitting times of station...
Asymptotics of Poisson approximation to random discrete distributions: an analytic approach
Advances in Applied Probability, 1998
Cited by 9 (9 self)
In this paper, we shall describe the asymptotic behaviors of several distances of Poisson approximation to a wide class of discrete distributions covering many examples from number theory, combinatorics and arithmetic semigroups. Our aim is to show that whenever (analytic) generating functions of the random variables in question are available, complex-analytic methods can be used to derive precise asymptotic results for the five distances above. Actually, we shall consider the following generalized distances: let $\alpha > 0$ be a fixed positive number, [the definitions of the five generalized distances did not survive extraction; the recoverable fragments are the subscripts FM, K and M, two suprema, and the term $|P(X = j) - P(Y = j)|$]. Note that $d_{TV} = d_M$. Besides the case $\alpha = 1$ (and $\alpha = 1/2$ for $d_M$), only the case $d_{TV}$ was previously studied by Franken [39] for Poisson approximation to the sum of independent but not identically distributed Bernoulli random variables. We take these quantities as our measures of degree of nearness of Poisson approximation, some of which may be interpreted as certain norms in suitable space as many authors did (cf. [12, 22, 23, 74, 96]). For a large class of discrete distributions, we shall derive an asymptotic main term together with an error estimate for each of these distances. Our results are thus "approximation theorems" rather than "limit theorems". The common form of the underlying structure of these distributions suggests the study of an analytic scheme as we did previously for normal approximation and large deviations (cf. [53, 54]). Many concrete examples from probabilistic number theory and combinatorial structures will justify the study of this scheme. Our treatment being completely general, many extensions can be further pursued with essentially the same line of methods. We shall di...
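To make the idea of a distance of Poisson approximation concrete, here is a minimal numerical sketch in Python for the simplest case, a Binomial(n, p) variable compared with a Poisson(np) variable. It uses the standard total variation, Kolmogorov and pointwise distances with the usual conventions (total variation taken as half the sum of absolute differences). These conventions, the function names and the truncation point `cutoff` are assumptions for illustration and are not the paper's generalized distances.

```python
import math


def binomial_pmf(n, p, j):
    """P(X = j) for X ~ Binomial(n, p); zero outside 0..n."""
    if j < 0 or j > n:
        return 0.0
    return math.comb(n, j) * p**j * (1 - p)**(n - j)


def poisson_pmf(lam, j):
    """P(Y = j) for Y ~ Poisson(lam), lam > 0, computed in log space for stability."""
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))


def poisson_distances(n, p, cutoff=None):
    """Total variation, Kolmogorov and pointwise distances between
    Binomial(n, p) and Poisson(n * p), summing over j = 0..cutoff."""
    lam = n * p
    if cutoff is None:
        cutoff = n + 50  # assumed large enough that the Poisson tail is negligible
    tv = kolmogorov = point = 0.0
    cdf_diff = 0.0
    for j in range(cutoff + 1):
        diff = binomial_pmf(n, p, j) - poisson_pmf(lam, j)
        tv += abs(diff) / 2
        point = max(point, abs(diff))
        cdf_diff += diff  # running difference of the two distribution functions
        kolmogorov = max(kolmogorov, abs(cdf_diff))
    return tv, kolmogorov, point


if __name__ == "__main__":
    tv, kol, point = poisson_distances(100, 0.02)
    print(tv, kol, point)
```

For this binomial case, Le Cam's inequality bounds the total variation distance by $np^2$, which gives a quick sanity check on the computed value.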
Stein’s method for concentration inequalities
Prob. Th. Rel. Fields, 2007
Cited by 9 (4 self)
We introduce a version of Stein’s method for proving concentration and moment inequalities in problems with dependence. Simple illustrative examples from combinatorics, physics, and mathematical statistics are provided.
Stein’s method, Jack measure, and the Metropolis algorithm
J. Combin. Theory Ser. A
Cited by 8 (6 self)
The one-parameter family of Jack$_\alpha$ measures on partitions is an important discrete analog of Dyson’s β ensembles of random matrix theory. Except for the special values α = 1/2, 1, 2, which have group theoretic interpretations, the Jack$_\alpha$ measure has been difficult if not intractable to analyze. This paper proves a central limit theorem (with an error term) for Jack$_\alpha$ measure which works for arbitrary values of α. For α = 1 we recover a known central limit theorem on the distribution of character ratios of random representations of the symmetric group on transpositions. The case α = 2 gives a new central limit theorem for random spherical functions of a Gelfand pair. The proof uses Stein’s method and has interesting ingredients: an intriguing construction of an exchangeable pair, properties of Jack polynomials, and work of Hanlon relating Jack polynomials to the

