SPINAL PARTITIONS AND INVARIANCE UNDER REROOTING OF CONTINUUM RANDOM TREES
Cited by 20 (12 self)
We develop some theory of spinal decompositions of discrete and continuous fragmentation trees. Specifically, we consider a coarse and a fine spinal integer partition derived from spinal tree decompositions. We prove that for a two-parameter Poisson–Dirichlet family of continuous fragmentation trees, including the stable trees of Duquesne and Le Gall, the fine partition is obtained from the coarse one by shattering each of its parts independently, according to the same law. As a second application of spinal decompositions, we prove that among the continuous fragmentation trees, stable trees are the only ones whose distribution is invariant under uniform rerooting. 1. Introduction. Starting from a rooted combinatorial tree T[n] with n leaves labeled by [n] = {1,...,n}, we call the path from the root to the leaf labeled 1 the spine of T[n]. Deleting each edge along the spine of T[n] defines a graph whose connected components we call bushes. If, as well as cutting each edge on the spine, we cut each edge connected to a spinal vertex, each bush is further decomposed
Invariance principles for random bipartite planar maps
Ann. Probab., 2007
Cited by 18 (6 self)
Random planar maps are considered in the physics literature as the discrete counterpart of random surfaces. It is conjectured that properly rescaled random planar maps, when conditioned to have a large number of faces, should converge to a limiting surface whose law does not depend, up to scaling factors, on details of the class of maps that are sampled. Previous works on the topic, starting with Chassaing and Schaeffer, have shown that the radius of a random quadrangulation with n faces, that is, the maximal graph distance on such a quadrangulation to a fixed reference point, converges in distribution once rescaled by $n^{1/4}$ to the diameter of the Brownian snake, up to a scaling constant. Using a bijection due to Bouttier, Di Francesco and Guitter between bipartite planar maps and a family of labeled trees, we show the corresponding invariance principle for a class of random maps that follow a Boltzmann distribution putting weight $q_k$ on faces of degree 2k: the radius of such maps, conditioned to have n faces (or n vertices) and under a criticality assumption, converges in distribution once rescaled by $n^{1/4}$ to a scaled version of the diameter of the Brownian snake. Convergence results for the so-called profile of maps are also provided. The convergence of rescaled bipartite maps to the Brownian map, in the sense introduced by Marckert and Mokkadem, is also shown. The proofs of these results rely on a new invariance principle for two-type spatial Galton–Watson trees.
A Bayesian interpretation of interpolated Kneser-Ney
2006
Cited by 16 (2 self)
Interpolated Kneser-Ney is one of the best smoothing methods for n-gram language models. Previous explanations for its superiority have been based on intuitive and empirical justifications of specific properties of the method. We propose a novel interpretation of interpolated Kneser-Ney as approximate inference in a hierarchical Bayesian model consisting of Pitman-Yor processes. As opposed to past explanations, our interpretation can recover exactly the formulation of interpolated Kneser-Ney, and performs better than interpolated Kneser-Ney when a better inference procedure is used.
Clustering Using Objective Functions and Stochastic Search
2007
Cited by 16 (3 self)
Summary. A new approach to clustering multivariate data, based on a multilevel linear mixed model, is proposed. A key feature of the model is that observations from the same cluster are correlated, because they share cluster-specific random effects. The inclusion of cluster-specific random effects allows parsimonious departure from an assumed base model for cluster mean profiles. This departure is captured statistically via the posterior expectation, or best linear unbiased predictor. One of the parameters in the model is the true underlying partition of the data, and the posterior distribution of this parameter, which is known up to a normalizing constant, is used to cluster the data. The problem of finding partitions with high posterior probability is not amenable to deterministic methods such as the EM algorithm. Thus, we propose a stochastic search algorithm that is driven by a Markov chain that is a mixture of two Metropolis–Hastings algorithms: one that makes small-scale changes to individual objects and another that performs large-scale moves involving entire clusters. The methodology proposed is fundamentally different from the well-known finite mixture model approach to clustering, which does not explicitly include the partition as a parameter, and involves an independent and identically distributed structure.
The structure of the allelic partition of the total population for Galton–Watson processes with neutral mutations
Cited by 15 (3 self)
We consider a (sub)critical Galton–Watson process with neutral mutations (infinite alleles model), and decompose the entire population into clusters of individuals carrying the same allele. We specify the law of this allelic partition in terms of the distribution of the number of clone-children and the number of mutant-children of a typical individual. The approach combines an extension of the Harris representation of Galton–Watson processes and a version of the ballot theorem. Some limit theorems related to the distribution of the allelic partition are also given. 1. Introduction. We consider a Galton–Watson process, that is, a population model with asexual reproduction such that at every generation, each individual gives birth to a random number of children according to a fixed distribution and independently of the other individuals in the population. We are interested in the situation where a child can be either a clone, that
Brownian Bridge Asymptotics for Random p-Mappings
Electron. J. Probab., 2002
Cited by 14 (8 self)
The Joyal bijection between doubly-rooted trees and mappings can be lifted to a transformation on function space which takes tree-walks to mapping-walks. Applying known results on weak convergence of random tree walks to Brownian excursion, we give a conceptually simpler rederivation of the 1994 Aldous-Pitman result on convergence of uniform random mapping walks to reflecting Brownian bridge, and extend this result to random p-mappings.
Regenerative partition structures
Electron. J. Combin. 11 Research Paper
Cited by 14 (7 self)
We consider Kingman’s partition structures which are regenerative with respect to a general operation of random deletion of some part. Prototypes of this class are the Ewens partition structures which Kingman characterised by regeneration after deletion of a part chosen by size-biased sampling. We associate each regenerative partition structure with a corresponding regenerative composition structure, which (as we showed in a previous paper) can be associated in turn with a regenerative random subset of the positive half-line, that is the closed range of a subordinator. A general regenerative partition structure is thus represented in terms of the Laplace exponent of an associated subordinator. We also analyse deletion properties characteristic of the two-parameter family of partition structures.
Regenerative tree growth: binary self-similar continuum random trees and Poisson–Dirichlet compositions
2008
Cited by 14 (7 self)
We use a natural ordered extension of the Chinese Restaurant Process to grow a two-parameter family of binary self-similar continuum fragmentation trees. We provide an explicit embedding of Ford’s sequence of alpha model trees in the continuum tree which we identified in a previous article as a distributional scaling limit of Ford’s trees. In general, the Markov branching trees induced by the two-parameter growth rule are not sampling consistent, so the existence of compact limiting trees cannot be deduced from previous work on the sampling consistent case. We develop here a new approach to establish such limits, based on regenerative interval partitions and the urn-model description of sampling from Dirichlet random distributions.
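As background for the seating scheme mentioned in this abstract, the sketch below simulates the standard (unordered) two-parameter Chinese Restaurant Process. It is not the ordered extension or the tree-growth rule developed in the paper; the function name `two_parameter_crp` and the parameter choices in the usage lines are illustrative assumptions, with 0 <= alpha < 1 and theta > -alpha assumed throughout.

```python
import random


def two_parameter_crp(n, alpha, theta, rng=None):
    """Seat n customers by the standard two-parameter CRP; return the table sizes.

    Hypothetical helper for illustration. Assumes 0 <= alpha < 1 and theta > -alpha.
    With m customers seated at k tables, the next customer joins table i with
    probability (size_i - alpha) / (m + theta) and opens a new table with
    probability (theta + k * alpha) / (m + theta).
    """
    rng = rng or random.Random()
    if n <= 0:
        return []
    tables = [1]  # the first customer always opens the first table
    for _ in range(1, n):
        k = len(tables)
        # Unnormalised weights; they sum to (customers seated so far) + theta.
        weights = [size - alpha for size in tables] + [theta + k * alpha]
        choice = rng.choices(range(k + 1), weights=weights)[0]
        if choice == k:
            tables.append(1)      # open a new table
        else:
            tables[choice] += 1   # join an existing table
    return tables


sizes = two_parameter_crp(50, alpha=0.5, theta=1.0, rng=random.Random(0))
```

The returned list records table sizes in order of table creation, so it always sums to the number of customers seated.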
Asymptotic laws for regenerative compositions: Gamma subordinators and the like
Probab. Theory Related Fields 135, 2008
Cited by 14 (7 self)
For $\tilde{R} = 1 - \exp(-R)$, a random closed set obtained by exponential transformation of the closed range $R$ of a subordinator, a regenerative composition of generic positive integer $n$ is defined by recording the sizes of clusters of $n$ uniform random points as they are separated by the points of $\tilde{R}$. We focus on the number of parts $K_n$ of the composition when $\tilde{R}$ is derived from a gamma subordinator. We prove logarithmic asymptotics of the moments and central limit theorems for $K_n$ and other functionals of the composition such as the number of singletons, doubletons, etc. This study complements our previous work on asymptotics of these functionals when the tail of the Lévy measure is regularly varying at 0+.
Inducing Tree-Substitution Grammars
Cited by 12 (0 self)
Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring that it can be readily learned from data. The majority of existing work on grammar induction has favoured model simplicity (and thus learnability) over representational capacity by using context-free grammars and first-order dependency grammars, which are not sufficiently expressive to model many common linguistic constructions. We propose a novel compromise by inferring a probabilistic tree substitution grammar, a formalism which allows for arbitrarily large tree fragments and thereby better represents complex linguistic structures. To limit the model’s complexity we employ a Bayesian nonparametric prior which biases the model towards a sparse grammar with shallow productions. We demonstrate the model’s efficacy on supervised phrase-structure parsing, where we induce a latent segmentation of the training treebank, and on unsupervised dependency grammar induction. In both cases the model uncovers interesting latent linguistic structures while producing competitive results.