Results 1–10 of 12
Model-Based Clustering, Discriminant Analysis, and Density Estimation
 Journal of the American Statistical Association
, 2000
Abstract

Cited by 260 (24 self)
Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures, and most clustering methods available in commercial software are also of this type. However, there is little systematic guidance associated with these methods for solving important practical questions that arise in cluster analysis, such as "How many clusters are there?", "Which clustering method should be used?" and "How should outliers be handled?". We outline a general methodology for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster recovery from noisy data, and spatial density estimation. Finally, we mention limitations of the methodology, a...
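The paper's core recipe for "How many clusters are there?" is to fit a finite mixture model for each candidate number of components and compare the fits via BIC. The following is a minimal pure-Python sketch of that idea in one dimension; the EM routine, the initialisation, and the 3k - 1 parameter count are my own illustrative choices, not the authors' implementation (which handles general multivariate Gaussian covariance structures):

```python
import math
import random

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def gmm_loglik_1d(data, k, iters=100):
    """Fit a k-component 1-D Gaussian mixture by EM; return the log-likelihood."""
    n = len(data)
    lo, hi = min(data), max(data)
    mus = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]   # spread initial means
    mean = sum(data) / n
    vars_ = [sum((x - mean) ** 2 for x in data) / n] * k
    ws = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            p = [ws[j] * normal_pdf(x, mus[j], vars_[j]) for j in range(k)]
            s = sum(p)
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means, variances
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            vars_[j] = max(sum(r[j] * (x - mus[j]) ** 2
                               for r, x in zip(resp, data)) / nj, 1e-6)
            ws[j] = nj / n
    return sum(math.log(sum(ws[j] * normal_pdf(x, mus[j], vars_[j])
                            for j in range(k)))
               for x in data)

def bic(data, k):
    # a 1-D k-component mixture has 3k - 1 free parameters (means, variances, weights)
    return (3 * k - 1) * math.log(len(data)) - 2 * gmm_loglik_1d(data, k)

random.seed(0)
data = ([random.gauss(0, 0.5) for _ in range(50)]
        + [random.gauss(10, 0.5) for _ in range(50)])
best_k = min((1, 2), key=lambda k: bic(data, k))   # lower BIC preferred
```

On two well-separated clusters the BIC gap is large, so the selection is stable; in practice the paper's framework also uses BIC to choose among covariance structures, not just the number of components.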
A Tutorial on a Practical Bayesian Alternative to Null-Hypothesis Significance Testing
 DOI 10.3758/s13428-010-0049-5
, 2011
Abstract

Cited by 5 (2 self)
Null-hypothesis significance testing remains the standard inferential tool in cognitive science despite its serious disadvantages. Primary among these is the fact that the resulting probability value does not tell the researcher what he or she usually wants to know: how probable is a hypothesis, given the obtained data? Inspired by developments presented by Wagenmakers (2007), I provide a tutorial on a Bayesian model-selection approach that requires only a simple transformation of sum-of-squares values generated by the standard analysis of variance. This approach generates a graded level of evidence regarding which model (e.g., effect absent [null hypothesis] vs. effect present [alternative hypothesis]) is more strongly supported by the data. This method also obviates admonitions never to speak of accepting the null hypothesis. An Excel worksheet for computing the Bayesian analysis is provided as supplemental material. The widespread use of null-hypothesis significance testing (NHST) in psychological research has withstood numerous rounds of debate (e.g., Chow, 1998; Cohen, ...
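The "simple transformation of sum-of-squares values" works because, for linear models, BIC can be written in terms of residual sums of squares, as described by Wagenmakers (2007). A hedged sketch of that computation (the function name and the equal-prior-odds assumption are mine, and the numbers in the example are made up):

```python
import math

def posterior_prob_null(n, sse_null, sse_alt, df_effect):
    """Posterior probability of the null model from ANOVA sums of squares.

    For linear models, BIC_i is proportional to n*ln(SSE_i/n) + k_i*ln(n),
    so the BIC difference (alternative minus null) needs only the ratio of
    residual sums of squares and the number of extra parameters df_effect
    in the alternative model.
    """
    delta_bic = n * math.log(sse_alt / sse_null) + df_effect * math.log(n)
    bf01 = math.exp(delta_bic / 2)   # approximate Bayes factor in favour of H0
    return bf01 / (1 + bf01)         # assumes equal prior odds on the two models

# Illustrative numbers: n = 30 observations, the effect absorbs
# 20 of the null model's 100 units of error sum of squares.
p_h0 = posterior_prob_null(n=30, sse_null=100.0, sse_alt=80.0, df_effect=1)
```

The output is the graded evidence the abstract describes: values near 0 favour the effect-present model, values near 1 favour the null, and intermediate values are explicitly inconclusive rather than forced into a reject/fail-to-reject dichotomy.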
Determining the Number of Colors or Gray Levels in an Image Using Approximate Bayes Factors: The Pseudolikelihood Information Criterion (PLIC)
 IEEE Transactions on Pattern Analysis and Machine Intelligence 24
, 2001
Abstract

Cited by 1 (0 self)
We propose a method for choosing the number of colors, or true gray levels, in an image. This is motivated by medical and satellite image segmentation, and may also be useful for color and gray scale image quantization, the display and storage of computer-generated holograms, and the use of co-occurrence matrices for assessing texture in images. Our underlying probability model is a hidden Markov random field. Each number of colors considered is viewed as corresponding to a statistical model for the image, and the resulting models are compared via approximate Bayes factors. The Bayes factors are approximated using BIC, where the required maximized likelihood is approximated by the Qian-Titterington pseudolikelihood. We call the resulting criterion PLIC (Pseudolikelihood Information Criterion). We also discuss a simpler approximation, MMIC (Marginal Mixture Information Criterion), which is based only on the marginal distribution of pixel values. This turns out to be useful for initialization, and also to have moderately good, albeit suboptimal, performance in its own right. We apply PLIC to three examples: a simulated two-band image, a medical segmentation problem, and a satellite image, and in each case it gives good results in practice. Keywords: BIC; Color image quantization; Co-occurrence matrix; Hologram; ICM algorithm; Image segmentation; Markov Random Field; Medical image; Mixture model; Posterior model probability; Pseudolikelihood; Satellite image.
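The key approximation is replacing the intractable Markov random field likelihood inside BIC with a product of per-pixel conditional probabilities. A sketch of that pseudolikelihood for the label (Potts) layer only; the interaction parameter beta, the 4-neighbourhood, and the parameter count handed to plic are illustrative assumptions rather than the paper's exact hidden-MRF setup:

```python
import math

def potts_log_pseudolikelihood(labels, beta, G):
    """Log-pseudolikelihood of a label image under a Potts model:
    the product over pixels of p(x_i | neighbours of x_i), where a
    colour is more likely the more 4-neighbours share it (beta > 0)."""
    rows, cols = len(labels), len(labels[0])
    logpl = 0.0
    for i in range(rows):
        for j in range(cols):
            counts = [0] * G   # neighbours of each colour
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    counts[labels[ni][nj]] += 1
            weights = [math.exp(beta * c) for c in counts]
            logpl += math.log(weights[labels[i][j]] / sum(weights))
    return logpl

def plic(labels, beta, G, n_free_params):
    """PLIC = 2 * log pseudolikelihood - (free parameters) * ln(n pixels);
    the candidate G with the largest PLIC is selected."""
    n = len(labels) * len(labels[0])
    return 2 * potts_log_pseudolikelihood(labels, beta, G) - n_free_params * math.log(n)

# spatially smooth labels score much higher than a checkerboard
smooth = [[0] * 4 for _ in range(4)]
checker = [[(i + j) % 2 for j in range(4)] for i in range(4)]
```

In the paper's setting the pseudolikelihood is maximized over the model parameters for each candidate number of colors before the BIC-style penalty is applied; the fixed beta here just illustrates the per-pixel conditional structure.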
Bayesian Multidimensional Scaling and Choice of Dimension
 Journal of the American Statistical Association
, 2001
Abstract
Multidimensional scaling is widely used to handle data which consist of dissimilarity measures between pairs of objects or people. We deal with two major problems in metric multidimensional scaling, configuration of objects and determination of the dimension of object configuration, within a Bayesian framework. A Markov chain Monte Carlo algorithm is proposed for object configuration, along with a simple Bayesian criterion for choosing their effective dimension, called MDSIC. Simulation results are presented, as well as examples on real data. Our method provides better results than classical multidimensional scaling for object configuration, and MDSIC seems to work well for dimension choice in the examples considered. Key Words: Clustering, Dimensionality, Dissimilarity, Markov chain Monte Carlo, Metric scaling, Model selection.
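Classical metric scaling, the baseline the Bayesian method is compared against, recovers coordinates by double-centring the squared dissimilarities and taking the leading eigenvectors of the resulting matrix. A pure-Python one-dimensional sketch (the power-iteration scheme and function name are my own):

```python
import math

def classical_mds_1d(D, iters=200):
    """Recover 1-D coordinates from a symmetric dissimilarity matrix via
    classical (Torgerson) scaling: double-centre the squared dissimilarities
    and take the leading eigenpair, found here by power iteration."""
    n = len(D)
    D2 = [[D[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(D2[i]) / n for i in range(n)]
    grand = sum(row) / n
    # B = -1/2 * J D^2 J with J the centring matrix; B is the Gram
    # matrix of the centred coordinates when D is Euclidean.
    B = [[-0.5 * (D2[i][j] - row[i] - row[j] + grand) for j in range(n)]
         for i in range(n)]
    v = [float(i + 1) for i in range(n)]   # generic start vector
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(B[i][j] * v[j] for j in range(n)) for i in range(n))
    # coordinates are sqrt(eigenvalue) times the unit eigenvector
    return [math.sqrt(max(lam, 0.0)) * vi for vi in v]

# three collinear objects at positions 0, 1 and 3
coords = classical_mds_1d([[0, 1, 3], [1, 0, 2], [3, 2, 0]])
```

The recovered coordinates are unique only up to sign and translation, which is precisely the identifiability issue the Bayesian formulation must also handle, and MDSIC plays the role that the eigenvalue scree plays here when more than one dimension is considered.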
Confidence Intervals for Markovian Models
, 2004
Abstract
This paper introduces a new multinomial approach unifying the computation of confidence intervals for Markovian models. Starting from a method used for homogeneous Markov chains, we show that it can be applied to models incorporating a hidden component. We consider three models derived from the basic homogeneous Markov chain: the Mixture Transition Distribution (MTD) model, the Hidden Markov Model (HMM), and the Double Chain Markov Model (DCMM). Compared to existing methods, our proposal can be used on data sets of any size without requiring extensive computing.
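For the basic homogeneous chain that the paper starts from, interval estimates for transition probabilities can be illustrated with simple row-wise Wald intervals on the empirical transition matrix. The function below is my own sketch of that baseline, not the paper's unified multinomial method:

```python
import math

def transition_cis(seq, states, z=1.96):
    """Estimate transition probabilities of a homogeneous Markov chain from
    one observed sequence and attach normal-approximation (Wald) confidence
    intervals, clipped to [0, 1]."""
    idx = {s: k for k, s in enumerate(states)}
    m = len(states)
    counts = [[0] * m for _ in range(m)]
    for a, b in zip(seq, seq[1:]):        # count observed transitions a -> b
        counts[idx[a]][idx[b]] += 1
    cis = {}
    for i, s in enumerate(states):
        total = sum(counts[i])
        if total == 0:
            continue   # state never left: no transitions to estimate
        for j, t in enumerate(states):
            p = counts[i][j] / total
            half = z * math.sqrt(p * (1 - p) / total)
            cis[(s, t)] = (max(0.0, p - half), min(1.0, p + half))
    return cis

cis = transition_cis("AABB", "AB")
```

With so few observed transitions the Wald intervals collapse or cover the whole unit interval, which is exactly the kind of small-sample behaviour that motivates a properly multinomial treatment of each row of the transition matrix.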