Model-Based Clustering, Discriminant Analysis, and Density Estimation
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2000
Cited by 562 (29 self)
Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures and most clustering methods available in commercial software are also of this type. However, there is little systematic guidance associated with these methods for solving important practical questions that arise in cluster analysis, such as "How many clusters are there?", "Which clustering method should be used?" and "How should outliers be handled?". We outline a general methodology for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster recovery from noisy data, and spatial density estimation. Finally, we mention limitations of the methodology, a...
Model Selection and Model Averaging in Phylogenetics: Advantages of Akaike Information Criterion and Bayesian Approaches Over Likelihood Ratio Tests
, 2004
Cited by 379 (8 self)
Model selection is a topic of special relevance in molecular phylogenetics that affects many, if not all, stages of phylogenetic inference. Here we discuss some fundamental concepts and techniques of model selection in the context of phylogenetics. We start by reviewing different aspects of the selection of substitution models in phylogenetics from a theoretical, philosophical and practical point of view, and summarize this comparison in table format. We argue that the most commonly implemented model selection approach, the hierarchical likelihood ratio test, is not the optimal strategy for model selection in phylogenetics, and that approaches like the Akaike Information Criterion (AIC) and Bayesian methods offer important advantages. In particular, the latter two methods are able to simultaneously compare multiple nested or nonnested models, assess model selection uncertainty, and allow for the estimation of phylogenies and model parameters using all available models (model-averaged inference or multimodel inference). We also describe how the relative importance of the different parameters included in substitution models can be depicted. To illustrate some of these points, we have applied AIC-based model averaging to 37 mitochondrial DNA sequences from the subgenus Ohomopterus (genus Carabus) ground beetles described by Sota and Vogler (2001).
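As a rough illustration of the AIC-based model averaging mentioned in this abstract, the sketch below computes standard Akaike weights from a list of AIC scores and uses them to average a parameter estimate across models. The function names `akaike_weights` and `model_average` and the example numbers are hypothetical and are not taken from the paper; this is a minimal sketch of the general technique, not the authors' implementation.

```python
import math


def akaike_weights(aic_scores):
    """Return Akaike weights for a list of AIC scores (one per candidate model)."""
    best = min(aic_scores)
    # Relative likelihood of each model given its AIC difference from the best model.
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_scores]
    total = sum(relative)
    return [r / total for r in relative]


def model_average(estimates, weights):
    """Weighted average of one parameter's estimates across all candidate models."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical AIC scores and per-model estimates of a single parameter.
aic = [100.0, 102.0, 110.0]
weights = akaike_weights(aic)
averaged = model_average([0.30, 0.35, 0.50], weights)
```

The weights sum to one, so they can be read as the relative support for each model within the candidate set; a model two AIC units worse than the best receives about 1/e of the best model's weight.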
A Tutorial on a Practical Bayesian Alternative to Null-Hypothesis Significance Testing
Cited by 27 (4 self)
Null-hypothesis significance testing remains the standard inferential tool in cognitive science despite its serious disadvantages. Primary among these is the fact that the resulting probability value does not tell the researcher what he or she usually wants to know: how probable is a hypothesis, given the obtained data? Inspired by developments presented by Wagenmakers (2007), I provide a tutorial on a Bayesian model-selection approach that requires only a simple transformation of sum of squares values generated by the standard analysis of variance. This approach generates a graded level of evidence regarding which model (e.g., effect absent [null hypothesis] vs. effect present [alternative hypothesis]) is more strongly supported by the data. This method also obviates admonitions never to speak of accepting the null hypothesis. An Excel worksheet for computing the Bayesian analysis is provided as supplemental material. The widespread use of null-hypothesis significance testing (NHST) in psychological research has withstood numerous rounds of debate (e.g., Chow, 1998; Cohen,
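The abstract does not spell out the transformation, so the following is a sketch under the assumption that it is the BIC approximation to the Bayes factor associated with Wagenmakers (2007): the BIC difference is computed from the error sums of squares of the null and alternative models, then converted to a Bayes factor and a posterior probability under equal prior odds. The function name `bic_bayes_factor`, its parameters, and the example values are hypothetical.

```python
import math


def bic_bayes_factor(sse_null, sse_alt, n, extra_params=1):
    """BIC-approximated evidence for the null model over the alternative.

    sse_null     -- unexplained sum of squares under the null model
    sse_alt      -- unexplained sum of squares under the alternative model
    n            -- number of independent observations
    extra_params -- free parameters the alternative has beyond the null
    """
    # Delta BIC = BIC(alternative) - BIC(null); positive values favor the null.
    delta_bic = n * math.log(sse_alt / sse_null) + extra_params * math.log(n)
    bf01 = math.exp(delta_bic / 2.0)   # Bayes factor, null over alternative
    p_null = bf01 / (1.0 + bf01)       # posterior probability of the null (equal priors assumed)
    return bf01, p_null


# Hypothetical ANOVA output: the effect reduces the error sum of squares from 500 to 400.
bf01, p_null = bic_bayes_factor(sse_null=500.0, sse_alt=400.0, n=40)
p_alt = 1.0 - p_null
```

Because the result is a graded posterior probability rather than a reject or retain decision, evidence in favor of the null model can be reported directly.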
Determining the number of colors or gray levels in an image using approximate Bayes factors: the pseudolikelihood information criterion (PLIC)
 IEEE Transactions on Pattern Analysis and Machine Intelligence 24
, 2002
Confidence Intervals for Markovian Models
, 2004
This paper introduces a new multinomial approach unifying the computation of confidence intervals for Markovian models. Starting from a method used for homogeneous Markov chains, we show that it can be applied on models incorporating a hidden component. We consider three models derived from the basic homogeneous Markov chain: the Mixture Transition Distribution (MTD) model, the Hidden Markov Model (HMM), and the Double Chain Markov Model (DCMM). Compared to existing methods, our proposal can be used on data sets of any size without requiring extensive computing.
Bayesian Multidimensional Scaling and Choice of Dimension
 Journal of the American Statistical Association
, 2001
Multidimensional scaling is widely used to handle data which consist of dissimilarity measures between pairs of objects or people. We deal with two major problems in metric multidimensional scaling (configuration of objects and determination of the dimension of object configuration) within a Bayesian framework. A Markov chain Monte Carlo algorithm is proposed for object configuration, along with a simple Bayesian criterion for choosing their effective dimension, called MDSIC. Simulation results are presented, as well as examples on real data. Our method provides better results than classical multidimensional scaling for object configuration, and MDSIC seems to work well for dimension choice in the examples considered. Key Words: Clustering, Dimensionality, Dissimilarity, Markov chain Monte Carlo, Metric scaling, Model selection.
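For context on the baseline this abstract compares against, the sketch below implements classical (Torgerson) multidimensional scaling with NumPy: squared dissimilarities are double-centered and the leading eigenvectors give the object configuration. It does not implement the Bayesian method or MDSIC, whose details are not given here; the function name `classical_mds` and the example data are hypothetical.

```python
import numpy as np


def classical_mds(dist, dim):
    """Embed objects in `dim` dimensions from a symmetric dissimilarity matrix."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    # Double centering turns squared distances into an inner-product (Gram) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the `dim` largest eigenvalues; clip small negatives caused by noise.
    order = np.argsort(eigvals)[::-1][:dim]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Hypothetical example: recover a 2-D configuration from exact Euclidean distances.
rng = np.random.default_rng(0)
points = rng.normal(size=(6, 2))
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords = classical_mds(dist, dim=2)
```

The recovered coordinates match the original points only up to rotation, reflection, and translation, so comparisons should be made on pairwise distances rather than on raw coordinates.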