Bayesian Analysis of Polyphonic Western Tonal Music
- Journal of the Acoustical Society of America
, 2006
Cited by 7 (1 self)
This paper deals with the computational analysis of musical audio from recorded audio waveforms. This general problem includes, as sub-tasks, music transcription, extraction of musical pitch, dynamics, timbre, instrument identity, and source separation. Analysis of real musical signals is a highly ill-posed task, complicated by the presence of transient sounds, background interference, and the complex structure of musical pitches in the time-frequency domain. This paper focuses on models and algorithms for computer transcription of multiple musical pitches in audio, elaborated from previous work by two of the authors. The audio data are assumed to be pre-segmented into fixed pitch regimes such as individual chords. The models presented apply to pitched (tonal) music and are formulated via a Gabor representation of non-stationary signals. A Bayesian probabilistic structure is employed for representation of prior information about the parameters of the notes. This paper introduces a numerical Bayesian inference strategy for estimation of the pitches and other parameters of the waveform. The improved algorithm is much quicker and makes the approach feasible in realistic situations.
Compressive Sensing on Manifolds Using a Nonparametric Mixture of Factor Analyzers: Algorithm and Performance Bounds
Cited by 7 (3 self)
Nonparametric Bayesian methods are employed to constitute a mixture of low-rank Gaussians, for data x ∈ R^N that are of high dimension N but are constrained to reside in a low-dimensional subregion of R^N. The number of mixture components and their rank are inferred automatically from the data. The resulting algorithm can be used for learning manifolds and for reconstructing signals from manifolds, based on compressive sensing (CS) projection measurements. The statistical CS inversion is performed analytically. We derive the required number of CS random measurements needed for successful reconstruction, based on easily computed quantities, drawing on block-sparsity properties. The proposed methodology is validated on several synthetic and real datasets.
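The phrase "performed analytically" can be illustrated with the standard Gaussian-mixture conditioning identities. The following is a sketch under assumptions that are not stated in the abstract: a finite mixture with weights \pi_k, means \mu_k and low-rank-plus-noise covariances \Sigma_k, a known measurement matrix \Phi, and additive Gaussian measurement noise with covariance R. All of these symbols are illustrative, not the paper's notation.

```latex
% Assumed prior and measurement model (illustrative notation)
\begin{align}
  p(x) &= \sum_{k} \pi_k \, \mathcal{N}(x;\, \mu_k, \Sigma_k),
  \qquad \Sigma_k = A_k A_k^{\top} + \sigma^2 I, \\
  y &= \Phi x + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, R).
\end{align}
% The posterior is again a Gaussian mixture, available in closed form
\begin{align}
  p(x \mid y) &= \sum_{k} \tilde{\pi}_k \, \mathcal{N}(x;\, \tilde{\mu}_k, \tilde{\Sigma}_k), \\
  \tilde{\pi}_k &\propto \pi_k \, \mathcal{N}\!\left(y;\, \Phi \mu_k,\; \Phi \Sigma_k \Phi^{\top} + R\right), \\
  \tilde{\mu}_k &= \mu_k + \Sigma_k \Phi^{\top} \left(\Phi \Sigma_k \Phi^{\top} + R\right)^{-1} (y - \Phi \mu_k), \\
  \tilde{\Sigma}_k &= \Sigma_k - \Sigma_k \Phi^{\top} \left(\Phi \Sigma_k \Phi^{\top} + R\right)^{-1} \Phi \Sigma_k .
\end{align}
```

Under these assumptions, a point reconstruction could be taken as the posterior mean \sum_k \tilde{\pi}_k \tilde{\mu}_k; whether the paper uses exactly this estimator is not stated in the abstract.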
Cluster-based network model for time-course gene expression data
- Biostatistics
, 2007
Cited by 6 (1 self)
We propose a model-based approach to unify clustering and network modeling using time-course gene expression data. Specifically, our approach uses a mixture model to cluster genes. Genes within the same cluster share a similar expression profile. The network is built over cluster-specific expression profiles using state-space models. We discuss the application of our model to simulated data as well as to time-course gene expression data arising from animal models of prostate cancer progression. The latter application shows that with a combined statistical/bioinformatics analysis we are able to extract gene-to-gene relationships supported by the literature as well as new plausible relationships. Keywords: Model-based clustering, Bayesian network, dynamic linear model, mixture model, time course gene expression, prostate cancer, bioinformatics.
An MCMC Sampling Approach to Estimation of Nonstationary Hidden Markov Models
- IEEE Trans. Signal Processing
, 2002
Cited by 5 (0 self)
Hidden Markov models (HMMs) represent a very important tool for analysis of signals and systems. In the past two decades, HMMs have attracted the attention of various research communities, including the ones in statistics, engineering, and mathematics. Their extensive use in signal processing and, in particular, speech processing is well documented. A major weakness of conventional HMMs is their inflexibility in modeling state durations. This weakness can be avoided by adopting a more complicated class of HMMs known as nonstationary HMMs. In this paper, we analyze nonstationary HMMs whose state transition probabilities are functions of time that indirectly model state durations by a given probability mass function and whose observation spaces are discrete. The objective of our work is to estimate all the unknowns of a nonstationary HMM, which include its parameters and the state sequence. To that end, we construct a Markov chain Monte Carlo (MCMC) sampling scheme, where sampling from all the posterior probability distributions is very easy. The proposed MCMC sampling scheme has been tested in extensive computer simulations on finite discrete-valued observed data, and some of the simulation results are presented in the paper. Index Terms: Gibbs sampling, hidden Markov models, Markov chain Monte Carlo, nonstationary.
Bayesian finite mixtures with an unknown number of components: the allocation sampler
- University of Glasgow
, 2005
Cited by 4 (1 self)
A new Markov chain Monte Carlo method for the Bayesian analysis of finite mixture distributions with an unknown number of components is presented. The sampler is characterized by a state space consisting only of the number of components and the latent allocation variables. Its main advantage is that it can be used, with minimal changes, for mixtures of components from any parametric family, under the assumption that the component parameters can be integrated out of the model analytically. Artificial and real data sets are used to illustrate the method and mixtures of univariate and of multivariate normals are explicitly considered. The problem of label switching, when parameter inference is of interest, is addressed in a post-processing stage.
Nonparametric Bayesian Density Estimation on Manifolds with Applications to Planar Shapes
Cited by 4 (3 self)
Statistical analysis on landmark-based shape spaces has diverse applications in morphometrics, medical diagnostics, machine vision, robotics and other areas. These shape spaces are non-Euclidean quotient manifolds, often the quotient of the unit sphere under a group of transformations. To conduct nonparametric inferences, one may define notions of center and spread of a probability distribution on an arbitrary manifold and work with their estimates. There has been a significant amount of work done in this direction. However, it is useful to consider full likelihood-based methods, which allow nonparametric estimation of the probability density. This article proposes a class of mixture models constructed using suitable kernels on a general compact non-Euclidean manifold and then on the planar shape space in particular. Following a Bayesian approach with a nonparametric prior on the mixing distribution, conditions are obtained under which the Kullback-Leibler property holds, implying large support and weak posterior consistency. Gibbs sampling methods are developed for posterior computation, and the methods are applied to problems in density estimation on shape space and classification with shape-based predictors.
Quality control and robust estimation of cDNA microarray with replicates
- J. AM. STAT. ASSOC
, 2006
Cited by 4 (2 self)
We consider robust estimation of gene intensities from cDNA microarray data with replicates. Several statistical methods for estimating gene intensities from microarrays have been proposed, but little work has been done on robust estimation. This is particularly relevant for experiments with replicates, because even one outlying replicate can have a disastrous effect on the estimated intensity for the gene concerned. Because of the many steps involved in the experimental process from hybridization to image analysis, cDNA microarray data often contain outliers. For example, an outlying data value could occur because of scratches or dust on the surface, imperfections in the glass, or imperfections in the array production. We develop a Bayesian hierarchical model for robust estimation of cDNA microarray intensities. Outliers are modeled explicitly using a t-distribution, and our model also addresses such classical issues as design effects, normalization, transformation, and nonconstant variance. Parameter estimation is carried out using Markov chain Monte Carlo. By identifying potential outliers, the method provides automatic quality control of replicate, array, and gene measurements. The method is applied to three publicly available gene expression datasets and compared with three other methods: ANOVA-normalized log ratios, the median log ratio, and estimation after the removal of outliers based on Dixon’s test. We find that the between-replicate variability of the intensity estimates is lower for our method than for any of the others. We also address the issue of whether the background should be subtracted when estimating intensities. It has been argued that this should not be done because it increases variability, whereas the arguments for doing so are that there is a physical basis for the image background, and that not doing so will bias downward the estimated log ratios of differentially expressed genes. 
We show that the arguments on both sides of this debate are correct for our data, but that by using our model one can have the best ...
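The effect of modeling outliers with a t-distribution can be seen in a much smaller setting than the hierarchical model described above. The sketch below is not the paper's MCMC procedure: it is the standard EM-style reweighting for the location of a Student-t distribution with fixed scale and degrees of freedom, applied to one hypothetical set of replicates. The function name `t_location`, the values of `df` and `scale`, and the data are all assumptions chosen for illustration.

```python
from statistics import mean, median


def t_location(values, df=4.0, scale=0.2, iters=50):
    """Estimate the location of a Student-t model with fixed df and scale.

    Each iteration computes the usual EM weights, which shrink toward zero
    for observations far from the current location, and then updates the
    location as the weighted average.
    """
    mu = median(values)  # robust starting point
    for _ in range(iters):
        weights = [(df + 1.0) / (df + ((v - mu) / scale) ** 2) for v in values]
        mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return mu


# Hypothetical log-ratio replicates for one gene; the last one is an outlier.
replicates = [1.0, 1.1, 0.9, 8.0]
robust = t_location(replicates)
plain = mean(replicates)
print(f"plain mean = {plain:.3f}, t-based location = {robust:.3f}")
```

The plain mean is pulled far toward the outlying replicate, while the t-based estimate stays close to the three consistent replicates because the outlier receives a very small weight.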
Fast Bayesian Inference in Dirichlet Process Mixture Models
Cited by 4 (0 self)
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and apply it to data sets from the literature, as well as to a large epidemiologic study.
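A sequential greedy search of this kind can be sketched for the simplest conjugate case. The code below is an illustration under stated assumptions, not the article's algorithm: observations are univariate normal with known variance `sigma2`, cluster means have a N(0, `tau2`) prior, and the partition prior is the Chinese restaurant process with concentration `alpha`. Each subject is processed once, in order, and placed in whichever existing or new cluster maximizes prior weight times marginal predictive density. The names `greedy_partition` and `log_normal_pdf` are hypothetical.

```python
import math


def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - mean) ** 2 / var


def greedy_partition(data, alpha=1.0, sigma2=0.25, tau2=100.0):
    """Assign each point to the best-scoring cluster in a single pass."""
    labels, counts, sums = [], [], []
    for i, x in enumerate(data):
        best_score, best_c = -math.inf, None
        # Candidates: every existing cluster, plus one new empty cluster.
        for c in range(len(counts) + 1):
            n = counts[c] if c < len(counts) else 0
            s = sums[c] if c < len(counts) else 0.0
            # Conjugate posterior of the cluster mean given its members.
            post_var = 1.0 / (1.0 / tau2 + n / sigma2)
            post_mean = post_var * s / sigma2
            # Chinese restaurant process weight for this choice.
            prior_w = (n if n > 0 else alpha) / (i + alpha)
            score = math.log(prior_w) + log_normal_pdf(
                x, post_mean, post_var + sigma2
            )
            if score > best_score:
                best_score, best_c = score, c
        if best_c == len(counts):  # open a new cluster
            counts.append(0)
            sums.append(0.0)
        counts[best_c] += 1
        sums[best_c] += x
        labels.append(best_c)
    return labels


labels = greedy_partition([-5.1, -4.9, -5.0, 5.0, 5.2, 4.8])
print(labels)
```

Because the search is greedy and order dependent, a different ordering of the subjects can yield a different partition; this sketch makes no attempt to address that.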
Representing Degree Distributions, Clustering, and Homophily in Social Networks With Latent Cluster Random Effects Models
, 2007
Cited by 4 (1 self)
Social network data often involve transitivity, homophily on observed attributes, clustering, and heterogeneity of actors. We propose the latent cluster random effects model to take account of all of these features, and we describe a Bayesian estimation method. The model fits two real datasets well. We show by simulation that networks with the same degree distribution can have very different clustering behaviors. This suggests that scale-free and small-world network models may not be adequate for all types of network, while our model recovers both the clustering and the degree distribution.
Easy Computation of Bayes Factors and Normalizing Constants for Mixture Models via Mixture Importance Sampling
, 2001
Cited by 3 (0 self)
We propose a method for approximating integrated likelihoods, or posterior normalizing constants, in finite mixture models, for which analytic approximations such as the Laplace method are invalid. Integrated likelihoods are key components of Bayes factors and of the posterior model probabilities used in Bayesian model averaging. The method starts by formulating the model in terms of the unobserved group memberships, Z, and making these, rather than the model parameters, the variables of integration. The integral is then evaluated using importance sampling over the Z. The tricky part is choosing the importance sampling function, and we study the use of mixtures as importance sampling functions. We propose two forms of this: defensive mixture importance sampling (DMIS), and Z-distance importance sampling. We choose the parameters of the mixture adaptively, and we show how this can be done so as to approximately minimize the variance of the approximation to the integral.
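The change of integration variable described above can be written out in generic form. This is a sketch in assumed notation, not the paper's: g denotes an importance sampling function over the memberships Z, T the number of draws, and g_j, \omega_j the components and weights of a mixture proposal. The specific DMIS and Z-distance constructions are not reproduced here.

```latex
% Integrated likelihood as a sum over group memberships Z,
% with the model parameters integrated out inside p(y | Z)
\begin{align}
  p(y) &= \sum_{Z} p(y \mid Z)\, p(Z), \\
% Importance sampling estimate with draws Z^{(t)} from g
  \hat{p}(y) &= \frac{1}{T} \sum_{t=1}^{T}
      \frac{p\big(y \mid Z^{(t)}\big)\, p\big(Z^{(t)}\big)}{g\big(Z^{(t)}\big)},
  \qquad Z^{(t)} \sim g, \\
% A mixture used as the importance sampling function
  g(Z) &= \sum_{j} \omega_j \, g_j(Z), \qquad \omega_j \ge 0, \quad \sum_j \omega_j = 1, \\
% Variance of the estimate for independent draws
  \operatorname{Var}\big[\hat{p}(y)\big] &= \frac{1}{T}\,
      \operatorname{Var}_{g}\!\left[ \frac{p(y \mid Z)\, p(Z)}{g(Z)} \right].
\end{align}
```

The estimate is unbiased whenever g(Z) > 0 for every Z with p(y | Z) p(Z) > 0, and the last line shows why the choice of g matters: the variance is small when g is close to proportional to p(y | Z) p(Z).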

