Results 11–20 of 169
Asymptotic laws for compositions derived from transformed subordinators
Ann. Probab., 2006
"... A random composition of n appears when the points of a random closed set ˜ R ⊂ [0, 1] are used to separate into blocks n points sampled from the uniform distribution. We study the number of parts Kn of this composition and other related functionals under the assumption that ˜ R = φ(S•) where (St, t ..."
Abstract

Cited by 24 (10 self)
A random composition of n appears when the points of a random closed set R̃ ⊂ [0, 1] are used to separate into blocks n points sampled from the uniform distribution. We study the number of parts K_n of this composition and other related functionals under the assumption that R̃ = φ(S_•), where (S_t, t ≥ 0) is a subordinator and φ: [0, ∞] → [0, 1] is a diffeomorphism. We derive the asymptotics of K_n when the Lévy measure of the subordinator is regularly varying at 0 with positive index. Specialising to the case of the exponential function φ(x) = 1 − e^{−x}, we establish a connection between the asymptotics of K_n and the exponential functional of the subordinator.
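For context on the last notion (this is the standard definition, with notation assumed rather than taken from the abstract): the exponential functional of the subordinator (S_t, t ≥ 0) is

```latex
I \;=\; \int_0^{\infty} e^{-S_t}\,\mathrm{d}t .
```

With the choice φ(x) = 1 − e^{−x}, the abstract relates the growth of the number of parts K_n to the law of this random variable I.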
Invariance principles for random bipartite planar maps
Ann. Probab., 2007
"... Random planar maps are considered in the physics literature as the discrete counterpart of random surfaces. It is conjectured that properly rescaled random planar maps, when conditioned to have a large number of faces, should converge to a limiting surface whose law does not depend, up to scaling fa ..."
Abstract

Cited by 23 (7 self)
Random planar maps are considered in the physics literature as the discrete counterpart of random surfaces. It is conjectured that properly rescaled random planar maps, when conditioned to have a large number of faces, should converge to a limiting surface whose law does not depend, up to scaling factors, on details of the class of maps that are sampled. Previous works on the topic, starting with Chassaing and Schaeffer, have shown that the radius of a random quadrangulation with n faces, that is, the maximal graph distance on such a quadrangulation to a fixed reference point, converges in distribution once rescaled by n^{1/4} to the diameter of the Brownian snake, up to a scaling constant. Using a bijection due to Bouttier, Di Francesco and Guitter between bipartite planar maps and a family of labeled trees, we show the corresponding invariance principle for a class of random maps that follow a Boltzmann distribution putting weight q_k on faces of degree 2k: the radius of such maps, conditioned to have n faces (or n vertices) and under a criticality assumption, converges in distribution once rescaled by n^{1/4} to a scaled version of the diameter of the Brownian snake. Convergence results for the so-called profile of maps are also provided. The convergence of rescaled bipartite maps to the Brownian map, in the sense introduced by Marckert and Mokkadem, is also shown. The proofs of these results rely on a new invariance principle for two-type spatial Galton–Watson trees.
The structure of the allelic partition of the total population for Galton–Watson processes with neutral mutations
"... We consider a (sub)critical Galton–Watson process with neutral mutations (infinite alleles model), and decompose the entire population into clusters of individuals carrying the same allele. We specify the law of this allelic partition in terms of the distribution of the number of clonechildren and ..."
Abstract

Cited by 22 (4 self)
We consider a (sub)critical Galton–Watson process with neutral mutations (infinite alleles model), and decompose the entire population into clusters of individuals carrying the same allele. We specify the law of this allelic partition in terms of the distribution of the number of clone-children and the number of mutant-children of a typical individual. The approach combines an extension of the Harris representation of Galton–Watson processes and a version of the ballot theorem. Some limit theorems related to the distribution of the allelic partition are also given.
Limits of normalized quadrangulations. The Brownian map
Ann. Probab., 2004
"... Consider qn a random pointed quadrangulation chosen equally likely among the pointed quadrangulations with n faces. In this paper, we show that, when n goes to +∞, qn suitably normalized converges weakly in a certain sense to a random limit object, which is continuous and compact, and that we name t ..."
Abstract

Cited by 21 (0 self)
Consider q_n, a random pointed quadrangulation chosen uniformly at random among the pointed quadrangulations with n faces. In this paper, we show that, when n goes to +∞, q_n suitably normalized converges weakly in a certain sense to a random limit object, which is continuous and compact, and which we name the Brownian map. The same result is shown for a model of rooted quadrangulations and for some models of rooted quadrangulations with random edge lengths. A metric space of rooted (resp. pointed) abstract maps that contains the model of discrete rooted (resp. pointed) quadrangulations and the model of the Brownian map is defined. The weak convergences hold in these metric spaces.
Spinal partitions and invariance under re-rooting of continuum random trees
2009
"... We develop some theory of spinal decompositions of discrete and continuous fragmentation trees. Specifically, we consider a coarse and a fine spinal integer partition derived from spinal tree decompositions. We prove that for a twoparameter Poisson–Dirichlet family of continuous fragmentation trees ..."
Abstract

Cited by 20 (11 self)
We develop some theory of spinal decompositions of discrete and continuous fragmentation trees. Specifically, we consider a coarse and a fine spinal integer partition derived from spinal tree decompositions. We prove that for a two-parameter Poisson–Dirichlet family of continuous fragmentation trees, including the stable trees of Duquesne and Le Gall, the fine partition is obtained from the coarse one by shattering each of its parts independently, according to the same law. As a second application of spinal decompositions, we prove that among the continuous fragmentation trees, stable trees are the only ones whose distribution is invariant under uniform re-rooting.
Clustering Using Objective Functions and Stochastic Search
2007
"... Summary. A new approach to clustering multivariate data, based on a multilevel linear mixed model, is proposed. A key feature of the model is that observations from the same cluster are correlated, because they share clusterspecific random effects. The inclusion of clusterspecific random effects a ..."
Abstract

Cited by 20 (3 self)
Summary. A new approach to clustering multivariate data, based on a multilevel linear mixed model, is proposed. A key feature of the model is that observations from the same cluster are correlated, because they share cluster-specific random effects. The inclusion of cluster-specific random effects allows parsimonious departure from an assumed base model for cluster mean profiles. This departure is captured statistically via the posterior expectation, or best linear unbiased predictor. One of the parameters in the model is the true underlying partition of the data, and the posterior distribution of this parameter, which is known up to a normalizing constant, is used to cluster the data. The problem of finding partitions with high posterior probability is not amenable to deterministic methods such as the EM algorithm. Thus, we propose a stochastic search algorithm that is driven by a Markov chain that is a mixture of two Metropolis–Hastings algorithms—one that makes small-scale changes to individual objects and another that performs large-scale moves involving entire clusters. The methodology proposed is fundamentally different from the well-known finite mixture model approach to clustering, which does not explicitly include the partition as a parameter, and involves an independent and identically distributed structure.
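The mixture-of-kernels search described above can be sketched as follows. This is a minimal illustration, not the paper's method: the scoring function below is a hypothetical stand-in for the mixed-model posterior (which is likewise known only up to a normalizing constant), and the acceptance step treats the proposals as symmetric, which is only approximately true here.

```python
import math
import random

def log_score(partition, data, tau=1.0, penalty=2.0):
    """Toy log-posterior over partitions, defined up to a constant:
    rewards tight clusters (small within-cluster squared deviation
    from the cluster mean) and charges a complexity penalty per
    cluster. A hypothetical simplification of the paper's model."""
    total = 0.0
    for block in partition:
        xs = [data[i] for i in block]
        m = sum(xs) / len(xs)
        total -= tau * sum((x - m) ** 2 for x in xs)
        total -= penalty
    return total

def small_move(partition, n):
    """Small-scale proposal: reassign one random object to a random
    existing cluster or to a new singleton cluster."""
    new = [set(b) for b in partition]
    i = random.randrange(n)
    for b in new:
        b.discard(i)
    new = [b for b in new if b]
    k = random.randrange(len(new) + 1)
    if k == len(new):
        new.append({i})
    else:
        new[k].add(i)
    return new

def large_move(partition):
    """Large-scale proposal: merge two random clusters, or split a
    random cluster into two nonempty parts."""
    new = [set(b) for b in partition]
    if len(new) > 1 and random.random() < 0.5:
        merged = new.pop(random.randrange(len(new)))
        new[random.randrange(len(new))] |= merged
    else:
        j = random.randrange(len(new))
        items = list(new[j])
        if len(items) > 1:
            random.shuffle(items)
            cut = random.randrange(1, len(items))
            new[j] = set(items[:cut])
            new.append(set(items[cut:]))
    return new

def stochastic_search(data, iters=2000, p_small=0.8, seed=0):
    """Metropolis-Hastings search driven by a mixture of the two
    kernels; tracks and returns the best-scoring partition visited."""
    random.seed(seed)
    n = len(data)
    part = [set(range(n))]
    lp = log_score(part, data)
    best, best_lp = part, lp
    for _ in range(iters):
        prop = small_move(part, n) if random.random() < p_small else large_move(part)
        lp_prop = log_score(prop, data)
        if random.random() < math.exp(min(0.0, lp_prop - lp)):
            part, lp = prop, lp_prop
        if lp > best_lp:
            best, best_lp = part, lp
    return best
```

On clearly separated data such as `[0.0, 0.1, 0.2, 5.0, 5.1, 5.2]`, the search typically recovers the two groups; mixing small- and large-scale moves lets the chain both refine cluster memberships and escape poor local configurations.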
An Infinite Latent Attribute Model for Network Data
In Proceedings of the International Conference on Machine Learning (ICML), 2012
"... Latent variable models for network data extract a summary of the relational structure underlying an observed network. The simplest possible models subdivide nodes of the network into clusters; the probability of a link between any two nodes then depends only on their cluster assignment. Currently av ..."
Abstract

Cited by 18 (6 self)
Latent variable models for network data extract a summary of the relational structure underlying an observed network. The simplest possible models subdivide nodes of the network into clusters; the probability of a link between any two nodes then depends only on their cluster assignment. Currently available models can be classified by whether clusters are disjoint or are allowed to overlap. These models can explain a “flat” clustering structure. Hierarchical Bayesian models provide a natural approach to capture more complex dependencies. We propose a model in which objects are characterised by a latent feature vector. Each feature is itself partitioned into disjoint groups (subclusters), corresponding to a second layer of hierarchy. In experimental comparisons, the model achieves significantly improved predictive performance on social and biological link prediction tasks. The results indicate that models with a single-layer hierarchy oversimplify real networks.
Inducing Tree-Substitution Grammars
"... Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring ..."
Abstract

Cited by 18 (1 self)
Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring that it can be readily learned from data. The majority of existing work on grammar induction has favoured model simplicity (and thus learnability) over representational capacity by using context-free grammars and first-order dependency grammars, which are not sufficiently expressive to model many common linguistic constructions. We propose a novel compromise by inferring a probabilistic tree-substitution grammar, a formalism which allows for arbitrarily large tree fragments and thereby better represents complex linguistic structures. To limit the model’s complexity we employ a Bayesian nonparametric prior which biases the model towards a sparse grammar with shallow productions. We demonstrate the model’s efficacy on supervised phrase-structure parsing, where we induce a latent segmentation of the training treebank, and on unsupervised dependency grammar induction. In both cases the model uncovers interesting latent linguistic structures while producing competitive results.
Bayesian nonparametric estimator derived from conditional Gibbs structures
Ann. Appl. Probab., 2008
"... We consider discrete nonparametric priors which induce Gibbstype exchangeable random partitions and investigate their posterior behavior in detail. In particular, we deduce conditional distributions and the corresponding Bayesian nonparametric estimators, which can be readily exploited for predictin ..."
Abstract

Cited by 17 (4 self)
We consider discrete nonparametric priors which induce Gibbs-type exchangeable random partitions and investigate their posterior behavior in detail. In particular, we deduce conditional distributions and the corresponding Bayesian nonparametric estimators, which can be readily exploited for predicting various features of additional samples. The results provide useful tools for genomic applications where prediction of future outcomes is required.
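As background for “Gibbs-type” (standard definitions, not taken from this abstract): an exchangeable random partition of {1, …, n} into k blocks of sizes n_1, …, n_k is of Gibbs type when its exchangeable partition probability function has the product form

```latex
p(n_1,\dots,n_k) \;=\; V_{n,k}\,\prod_{j=1}^{k}(1-\sigma)_{n_j-1},
\qquad (a)_m := a(a+1)\cdots(a+m-1),
```

for some σ < 1 and weights V_{n,k} satisfying V_{n,k} = (n − σk) V_{n+1,k} + V_{n+1,k+1}. The predictive rule underlying the Bayesian estimators then follows: a new observation joins the j-th existing block with probability (n_j − σ) V_{n+1,k}/V_{n,k}, and forms a new block with probability V_{n+1,k+1}/V_{n,k}.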