The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator.
, 1995
Cited by 162 (36 self)
The two-parameter Poisson-Dirichlet distribution, denoted PD(α, θ), is a distribution on the set of decreasing positive sequences with sum 1. The usual Poisson-Dirichlet distribution with a single parameter θ, introduced by Kingman, is PD(0, θ). Known properties of PD(0, θ), including the Markov chain description due to Vershik-Shmidt-Ignatov, are generalized to the two-parameter case. The size-biased random permutation of PD(α, θ) is a simple residual allocation model proposed by Engen in the context of species diversity, and rediscovered by Perman and the authors in the study of excursions of Brownian motion and Bessel processes. For 0 < α < 1, PD(α, 0) is the asymptotic distribution of ranked lengths of excursions of a Markov chain away from a state whose recurrence time distribution is in the domain of attraction of a stable law of index α. Formulae in this case trace back to work of Darling, Lamperti and Wendel in the 1950s and 60s. The distribution of ranked lengths of e...
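The residual allocation model named in this abstract can be illustrated with a short simulation. This is a minimal sketch, assuming the standard stick-breaking form in which the i-th residual fraction is Beta(1 − α, θ + iα) with 0 ≤ α < 1 and θ > −α; that parametrization is recalled from the general literature rather than quoted from the abstract, and the function names are hypothetical.

```python
import random

def gem_weights(alpha, theta, n_terms, rng=None):
    """First n_terms weights of a residual allocation (stick-breaking) model.

    Assumption: the i-th residual fraction is Beta(1 - alpha, theta + i*alpha),
    the usual size-biased description of the two-parameter family.
    """
    if not (0 <= alpha < 1) or theta <= -alpha:
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    rng = rng or random.Random()
    remaining = 1.0          # mass of the stick not yet allocated
    weights = []
    for i in range(1, n_terms + 1):
        w = rng.betavariate(1 - alpha, theta + i * alpha)
        weights.append(remaining * w)   # break off a fraction w of what is left
        remaining *= 1 - w
    return weights

def ranked(weights):
    """Sort weights in decreasing order (only approximates the ranked limit,
    because the sequence has been truncated to finitely many terms)."""
    return sorted(weights, reverse=True)

if __name__ == "__main__":
    w = gem_weights(alpha=0.5, theta=1.0, n_terms=1000, rng=random.Random(1))
    print(ranked(w)[:5], sum(w))
```

Ranking a truncated sequence only approximates a draw from the ranked distribution; the unallocated mass `1 - sum(w)` indicates how much has been cut off.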
Random Discrete Distributions Derived From Self-Similar Random Sets
- Electronic J. Probability
, 1996
Cited by 13 (10 self)
A model is proposed for a decreasing sequence of random variables (V_1, V_2, ...) with ∑_n V_n = 1, which generalizes the Poisson-Dirichlet distribution and the distribution of ranked lengths of excursions of a Brownian motion or recurrent Bessel process. Let V_n be the length of the nth longest component interval of [0, 1] \ Z, where Z is an a.s. non-empty random closed subset of (0, ∞) of Lebesgue measure 0, and Z is self-similar, i.e. cZ has the same distribution as Z for every c > 0. Then for 0 ≤ a < b ≤ 1 the expected number of n's such that V_n ∈ (a, b) equals ∫_a^b v^{−1} F(dv), where the structural distribution F is identical to the distribution of 1 − sup(Z ∩ [0, 1]). Then F(dv) = f(v) dv where (1 − v) f(v) is a decreasing function of v, and every such probability distribution F on [0, 1] can arise from this construction. Keywords: interval partition, zero set, excursion lengths, regenerative set, structural distribution. AMS subject classificat...
An accurate model for genetic hitch-hiking
- Genetics
, 2007
Cited by 3 (0 self)
We suggest a simple deterministic approximation for the growth of the favoured-allele frequency during a selective sweep. Using this approximation we introduce an accurate model for genetic hitch-hiking. Only when Ns < 10 (N is the population size and s denotes the selection coefficient) are discrepancies between our approximation and direct numerical simulations of a Moran model noticeable. Our model describes the gene genealogies of a contiguous segment of neutral loci close to the selected one, and it does not assume that the selective sweep happens instantaneously. This enables us to compute SNP distributions on the neutral segment without bias.
OLDEST ALLELE- A BAYESIAN APPROACH
, 1989
Consider an age-ordered population Z_1, Z_2, ..., where Z_i is the frequency of the ith oldest allele and ∑ Z_i = 1. From this population consider an age-ordered sample of size n with l alleles and the frequencies in age-order denoted by M = (l; m_1, m_2, ..., m_l). We calculate the posterior distribution and posterior moments of the population frequency of the oldest allele, Z_1, given the sample M, assuming that the population is at stationarity and follows the neutral infinite alleles model. We also calculate the posterior distribution of Z_1 given a partition of n genes with no age information in the sample. These results are used to determine Bayes estimators for the population frequency of the oldest type, and the analysis is extended to include the posterior distribution of Z_1, Z_2, ..., Z_k given M for any k. G.E.M. distribution, infinite NSF BSR-8619760
Mathematical Population Genetics: Lecture Notes
, 2006
These notes should, ideally, be read before the Cornell meeting starts. They are intended to give background material in mathematical population genetics and also, in part, to form the background for some of the material given by other lecturers. At the very least, the first 27 pages should be read before the meeting. Some standard genetical terms will be used and it is assumed that the reader is familiar with the meanings of these. These terms include gene, genotype, allele, (gene) locus, haploid, diploid, homozygote, heterozygote, heterozygosity, monoecious, dioecious, polymorphism,
Model ∗
Importance sampling or Markov Chain Monte Carlo sampling is required for state-of-the-art statistical analysis of population genetics data. The applicability of these sampling-based inference techniques depends crucially on the proposal distribution. In this paper, we discuss importance sampling for the infinite sites model. The infinite sites assumption is attractive because it constrains the number of possible genealogies, thereby allowing for the analysis of larger data sets. We recall the Griffiths-Tavaré and Stephens-Donnelly proposals and emphasize the relation between the latter proposal and exact sampling from the infinite alleles model. We also introduce a new proposal that takes knowledge of the ancestral state into account. The new proposal is derived from a new result on exact sampling from a single site. The methods are illustrated on simulated data sets and the data considered in Griffiths and Tavaré (1994).
Convergence Time to the Ewens Sampling Formula
In this paper, we establish the cutoff phenomenon for the discrete time infinite alleles Moran model. If M is the population size and µ is the mutation rate, we find a cutoff time of log(Mµ)/µ generations. The stationary distribution for this process in the case of sampling without replacement is the Ewens sampling formula. We show that the bound for the total variation distance from the generation t distribution to the Ewens sampling formula is well approximated by one of the extreme value distributions, namely, a standard Gumbel distribution. Beginning with the card shuffling examples of Aldous and Diaconis and extending the ideas of Donnelly and Rodrigues for the two allele model, this model adds to the list of Markov chains that display the cutoff phenomenon. Because of the broad use of infinite alleles models, this cutoff sets the time scale of applicability for statistical tests based on the Ewens sampling formula and other tests of neutrality in a number of population genetic studies.
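The two quantities this abstract names, the Ewens sampling formula and the cutoff time log(Mµ)/µ, are easy to evaluate numerically. The sketch below assumes the standard form of the Ewens sampling formula with parameter θ, recalled from the general literature, and assumes the logarithm in the cutoff time is natural. The function names are hypothetical, and the abstract does not state how θ relates to M and µ, so no such relation is assumed here.

```python
from fractions import Fraction
from math import factorial, log

def ewens_probability(counts, theta):
    """Ewens sampling formula in its standard form (an assumption here).

    counts[j-1] is the number of allelic types seen exactly j times in the
    sample, so the sample size is n = sum_j j * counts[j-1].
    """
    theta = Fraction(theta)
    n = sum(j * a for j, a in enumerate(counts, start=1))
    rising = Fraction(1)                 # theta (theta+1) ... (theta+n-1)
    for i in range(n):
        rising *= theta + i
    prob = Fraction(factorial(n)) / rising
    for j, a in enumerate(counts, start=1):
        prob *= theta ** a / (j ** a * factorial(a))
    return prob

def cutoff_generations(population_size, mutation_rate):
    """Cutoff time log(M*mu)/mu quoted in the abstract (natural log assumed)."""
    return log(population_size * mutation_rate) / mutation_rate

if __name__ == "__main__":
    # All allelic partitions of a sample of size 3, with theta = 2.
    for counts in ([3, 0, 0], [1, 1, 0], [0, 0, 1]):
        print(counts, ewens_probability(counts, 2))
    print(cutoff_generations(1000, 0.01))
```

Exact rational arithmetic is used so that the probabilities over all partitions of a small sample can be checked to sum to exactly 1.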
Manuscript submitted to: Electronic Journal of Probability Record indices and age-ordered frequencies in Exchangeable
, 2008
The frequencies X_1, X_2, ... of an exchangeable Gibbs random partition Π of N = {1, 2, ...} (Gnedin and Pitman (2006)) are considered in their age-order, i.e. their size-biased order. We study their dependence on the sequence i_1, i_2, ... of least elements of the blocks of Π. In particular, conditioning on 1 = i_1 < i_2 < ..., a representation is shown to be X_j = ξ_{j−1} ∏_{i=j}^{∞} (1 − ξ_i), j = 1, 2, ..., where {ξ_j : j = 1, 2, ...} is a sequence of independent Beta random variables. Sequences with such a product form are called neutral to the left. We show that the property of conditional left-neutrality in fact characterizes the Gibbs family among all exchangeable partitions, and leads to further interesting results on: (i) the conditional Mellin transform of X_k, given i_k, and (ii) the conditional distribution of the first k normalized frequencies, given ∑_{j=1}^{k} X_j and i_k; the latter turns out to be a mixture of Dirichlet distributions. Many of the mentioned representations are extensions of Griffiths and Lessard (2005) results on Ewens' partitions.
Record indices and age-ordered frequencies in Exchangeable Gibbs Partitions
, 2008
We consider a random partition Π of N = {1, 2, ...} such that, for each n, its restriction Π_n to [n] = {1, ..., n} is given by an exchangeable Gibbs partition with parameters α, V for α ∈ (−∞, 1] and V = (V_{n,k}) defined recursively by setting V_{1,1} = 1 and V_{n,k} = (n − αk) V_{n+1,k} + V_{n+1,k+1}, k ≤ n = 1, 2, ... (Gnedin and Pitman 2006). By ranking the blocks Π_{n1}, ..., Π_{nk} of Π_n by their age-order, i.e. by the order of their least elements i_1, ..., i_k, we study how the distribution of the frequencies of the blocks depends on i_1, ..., i_k. Several interesting representations for the limit age-ordered relative frequencies X_1, X_2, ... of Π arise, depending on which i_j's one conditions on. In particular, conditioning on the entire vector i, with 1 = i_1 < i_2 < ..., a representation is X_j = ξ_{j−1} ∏_{i=j}^{∞} (1 − ξ_i), j = 1, 2, ..., where the ξ_j's are independent Beta random variables with parameters, respectively, (1 − α, i_{j+1} − αj − 1). We show the connection of such a representation with the so-called Beta-Stacy class of random discrete distributions (Walker and Muliere 1997). The vector i is found to form a Markov chain depending on both α and V. When V is chosen from Pitman's subfamily, the two-parameter GEM distribution is reobtained by averaging the ξ over i. Conditioning on i_k alone, we give two alternative representations for the Laplace transform of both −log X_k and −log(∑_{i=1}^{k} X_i), and we characterize Ewens' partitions as the only exchangeable Gibbs partitions for which −log X_k | i_k can be represented as an infinite sum of independent random variables. We finally show that, for every k, conditional on ∑_{i=1}^{k} X_i, the distribution of the normalized age-ordered frequencies X_1 / ∑_{i=1}^{k} X_i, ..., X_k / ∑_{i=1}^{k} X_i is a mixture of Dirichlet distributions on the (k − 1)-dimensional simplex, whose mixing measure is indexed by i_k. We provide a non-trivial explicit formula for the marginal distribution of i_k. Many of the mentioned representations are extensions of Griffiths and Lessard (2005) results on Ewens' partitions.
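The backward recursion for the weights V_{n,k} quoted in this abstract can be checked numerically on a concrete family. The sketch below assumes the commonly cited closed form for Pitman's two-parameter subfamily, V_{n,k} = ∏_{i=1}^{k−1} (θ + iα) / ∏_{i=1}^{n−1} (θ + i); this formula is recalled from the general literature and does not appear in the abstract, and the function names are hypothetical.

```python
from fractions import Fraction

def pitman_v(n, k, alpha, theta):
    """Assumed closed form of V_{n,k} for the two-parameter (alpha, theta) family."""
    numerator = Fraction(1)
    for i in range(1, k):            # prod_{i=1}^{k-1} (theta + i*alpha)
        numerator *= theta + i * alpha
    denominator = Fraction(1)
    for i in range(1, n):            # prod_{i=1}^{n-1} (theta + i)
        denominator *= theta + i
    return numerator / denominator

def recursion_gap(n, k, alpha, theta):
    """V_{n,k} - [(n - alpha*k) V_{n+1,k} + V_{n+1,k+1}]; zero if the recursion holds."""
    lhs = pitman_v(n, k, alpha, theta)
    rhs = ((n - alpha * k) * pitman_v(n + 1, k, alpha, theta)
           + pitman_v(n + 1, k + 1, alpha, theta))
    return lhs - rhs

if __name__ == "__main__":
    alpha, theta = Fraction(1, 2), Fraction(3, 2)
    print(pitman_v(1, 1, alpha, theta))   # initial condition V_{1,1}
    print(all(recursion_gap(n, k, alpha, theta) == 0
              for n in range(1, 8) for k in range(1, n + 1)))
```

Rational arithmetic makes the check exact; with floating-point parameters the gap would be compared against a small tolerance instead.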

