Results 1–10 of 87
Bayesian measures of model complexity and fit
Journal of the Royal Statistical Society, Series B, 2002
Abstract: [Read before The Royal Statistical Society at a meeting organized by the Research ...]
Cited by 327 (4 self)
Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models
Syst. Biol., 2004
Abstract: What does the posterior probability of a phylogenetic tree mean? This simulation study shows that Bayesian posterior probabilities have the meaning that is typically ascribed to them; the posterior probability of a tree is the probability that the tree is correct, assuming that the model is correct. At the same time, the Bayesian method can be sensitive to model misspecification, and the sensitivity of the Bayesian method appears to be greater than the sensitivity of the nonparametric bootstrap method (using maximum likelihood to estimate trees). Although the estimates of phylogeny obtained by use of the method of maximum likelihood or the Bayesian method are likely to be similar, the assessment of the uncertainty of inferred trees via either bootstrapping (for maximum likelihood estimates) or posterior probabilities (for Bayesian estimates) is not likely to be the same. We suggest that the Bayesian method be implemented with the most complex models of those currently available, as this should reduce the chance that the method will concentrate too much probability on too few trees. [Bayesian estimation; Markov chain Monte Carlo; posterior probability; prior probability.] Quantifying the uncertainty of a phylogenetic estimate is at least as important a goal as obtaining the phylogenetic estimate itself. Measures of phylogenetic reliability not only point out what parts of a tree can be trusted when interpreting the evolution of a group, but can guide ...
Cited by 74 (4 self)
How Many Genes Are Needed for a Discriminant Microarray Data Analysis
Proc. Critical Assessment of Techniques for Microarray Data Mining Workshop, 2000
Abstract: The analysis of the leukemia data from the Whitehead/MIT group is a discriminant analysis (also called supervised learning). Among the thousands of genes whose expression levels are measured, not all are needed for discriminant analysis: a gene may either not contribute to the separation of two types of tissues/cancers, or it may be redundant because it is highly correlated with other genes. There are two theoretical frameworks in which variable selection (or gene selection in our case) can be addressed. The first is model selection, and the second is model averaging. We have carried out model selection using the Akaike information criterion and the Bayesian information criterion with logistic regression (discrimination, prediction, or classification) to determine the number of genes that provide the best model. These model selection criteria set upper limits of 2225 and 1213 genes for this data set with 38 samples, and the best model consists of only one (no. 4847, zyxin) or two genes. We have also carried out model averaging over the best single-gene logistic predictors using three different weights: maximized likelihood, prediction rate on the training set, and equal weight. We have observed that the performance of most of these weighted predictors on the testing set is gradually reduced as more genes are included, but a clear cutoff that separates good and bad prediction performance is not found.
Cited by 47 (2 self)
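The model-selection step this abstract describes can be sketched as scoring single-gene logistic regressions by AIC and BIC. The data, gene count, and fitting routine below are illustrative stand-ins, not the paper's leukemia data or implementation; only the sample size n = 38 and the two criteria come from the abstract.

```python
# Sketch: rank single-"gene" logistic models by AIC and BIC (synthetic data).
import numpy as np
from math import log

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loglik_1gene(x, y, iters=500, lr=0.5):
    """Fit y ~ intercept + slope*x by gradient ascent; return max log-likelihood."""
    b0 = b1 = 0.0
    for _ in range(iters):
        p = sigmoid(b0 + b1 * x)
        b0 += lr * np.mean(y - p)
        b1 += lr * np.mean((y - p) * x)
    p = np.clip(sigmoid(b0 + b1 * x), 1e-12, 1 - 1e-12)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

rng = np.random.default_rng(0)
n = 38                               # same sample size as the leukemia training set
y = rng.integers(0, 2, size=n).astype(float)
genes = rng.normal(size=(5, n))      # five toy "genes"; gene 0 is made informative
genes[0] += 1.5 * y

k = 2  # parameters per single-gene model: intercept and slope
scores = []
for g in range(len(genes)):
    x = (genes[g] - genes[g].mean()) / genes[g].std()
    ll = loglik_1gene(x, y)
    scores.append((g, 2 * k - 2 * ll, k * log(n) - 2 * ll))  # (gene, AIC, BIC)

best = min(scores, key=lambda t: t[2])  # lowest BIC wins
print("best gene by BIC:", best[0])
```

The same loop extends directly to the paper's second framework, model averaging, by weighting each single-gene predictor instead of keeping only the best one.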
Algorithm for finding optimal gene sets in microarray prediction
http://stravinsky.ucsc.edu/josh/gesses, 2006
Abstract: Motivation: Microarray data have recently been shown to be efficacious in distinguishing closely related cell types that often appear in the diagnosis of cancer. It is useful to determine the minimum number of genes needed for such a diagnosis, both for clinical use and to determine the importance of specific genes for cancer. Here a replication algorithm is used for this purpose. It evolves an ensemble of predictors, all using different combinations of genes, to generate a set of optimal predictors. Results: We apply this method to the leukemia data of the Whitehead/MIT group, which attempts to differentially diagnose two kinds of leukemia, and also to the data of Khan et al. to distinguish four different kinds of childhood cancers. In the latter case we were able to reduce the number of genes needed from 96 down to 15, while at the same time being able to perfectly classify all of their test data.
Cited by 28 (0 self)
Evolutionary Theory and the Reality of Macro Probabilities
Abstract: Evolutionary theory is awash with probabilities. For example, natural selection is said to occur when there is variation in fitness, and fitness is standardly decomposed into two components, viability and fertility, each of which is understood probabilistically. With respect to viability, a fertilized egg is said to have a certain chance of surviving to reproductive age; with respect to fertility, an adult is said to have an expected number of offspring. There is more to evolutionary theory than the theory of natural selection, and here too one finds probabilistic concepts aplenty. When there is no selection, the theory of neutral evolution says that a gene's chance of eventually reaching fixation is 1/(2N), where N is the number of organisms in the generation of the diploid population to which the gene belongs. The evolutionary consequences of mutation are likewise conceptualized in terms of the probability per unit time a gene has of changing from one state to another. The examples just mentioned are all "forward-directed" probabilities; they describe the probability of later events, conditional on earlier events. However, evolutionary theory also uses "backwards probabilities" that describe the probability of a cause conditional on its effects; for example, coalescence theory allows one to calculate the expected number of generations in the past at which the genes in the present generation find their most recent common ...
Cited by 16 (2 self)
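The quoted fixation probability 1/(2N) can be checked with a small neutral Wright-Fisher simulation; the population size and trial count below are arbitrary illustrative choices, not from the text.

```python
# Sketch: a single new neutral mutant among 2N gene copies fixes with
# probability about 1/(2N) under Wright-Fisher drift.
import random

def fixed_after_drift(N, rng):
    """Simulate neutral drift; return True if the mutant allele fixes."""
    copies, total = 1, 2 * N          # one new copy in a diploid population of N
    while 0 < copies < total:
        p = copies / total
        # next generation: each of the 2N copies is drawn from the current pool
        copies = sum(1 for _ in range(total) if rng.random() < p)
    return copies == total

rng = random.Random(1)
N, trials = 20, 20000
fixed = sum(fixed_after_drift(N, rng) for _ in range(trials))
print(fixed / trials, "expected about", 1 / (2 * N))
```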
Error distribution for gene expression data
Statistical Applications in Genetics and Molecular Biology, 2005
Abstract: Copyright © 2005 by the authors. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher, bepress, which has been given certain exclusive rights by the author. Statistical Applications in Genetics and Molecular Biology is produced by Berkeley Electronic Press (bepress).
Cited by 15 (0 self)
Assessing the fit of site-occupancy models
J. Agric. Biol. Environ. Stat., 2004
Abstract: Few species are likely to be so evident that they will always be detected at a site when present. Recently a model has been developed that enables estimation of the proportion of area occupied when the target species is not detected with certainty. Here we apply this modeling approach to data collected on terrestrial salamanders in the Plethodon glutinosus complex in the Great Smoky Mountains National Park, USA, and wish to address the question "how accurately does the fitted model represent the data?" The goodness-of-fit of the model needs to be assessed in order to make accurate inferences. This article presents a method in which a simple Pearson chi-square statistic is calculated and a parametric bootstrap procedure is used to determine whether the observed statistic is unusually large. We found evidence that the most global model considered provides a poor fit to the data, and hence estimated an overdispersion factor to adjust model selection procedures and inflate standard errors. Two hypothetical datasets with known assumption violations are also analyzed, illustrating that the method may be used to guide researchers in making appropriate inferences. The results of a simulation study are presented to provide a broader view of the method's properties.
Cited by 14 (1 self)
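The test this abstract outlines, a Pearson chi-square statistic referred to its parametric-bootstrap distribution, can be sketched generically. The binomial detection model below is a deliberately simplified stand-in for a real site-occupancy model, and every name and parameter is illustrative.

```python
# Sketch: parametric-bootstrap goodness-of-fit test with a Pearson statistic.
import numpy as np

rng = np.random.default_rng(42)

def pearson_chisq(obs, exp):
    return float(np.sum((obs - exp) ** 2 / exp))

def bootstrap_gof_pvalue(counts, n_visits, n_boot=500):
    """Fit a constant detection probability, then compare the observed
    Pearson statistic against statistics from data simulated under the fit."""
    def statistic(c):
        p_hat = c.sum() / (c.size * n_visits)         # plug-in MLE
        exp = np.full(c.size, n_visits * p_hat)
        return pearson_chisq(c, exp), p_hat
    t_obs, p_hat = statistic(counts)
    hits = 0
    for _ in range(n_boot):
        sim = rng.binomial(n_visits, p_hat, size=counts.size)
        t_sim, _ = statistic(sim)
        if t_sim >= t_obs:                            # "unusually large?"
            hits += 1
    return hits / n_boot

# detection counts at 10 sites over 5 visits, simulated from the model itself
counts = rng.binomial(5, 0.4, size=10)
pval = bootstrap_gof_pvalue(counts, n_visits=5)
print(pval)
```

A small p-value would signal lack of fit; the abstract's next step, an overdispersion factor, would then be estimated as the observed statistic divided by the mean bootstrap statistic.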
Cultural evolution in laboratory microsocieties including traditions of rule giving and rule following
Evolution and Human Behavior, 2004
Abstract: Experiments may contribute to understanding the basic processes of cultural evolution. We drew features from previous laboratory research with small groups in which traditions arose during several generations. Groups of four participants chose by consensus between solving anagrams printed on red cards and on blue cards. Payoffs for the choices differed. After 12 min, the participant who had been in the experiment the longest was removed and replaced with a naïve person. These replacements, each of which marked the end of a generation, continued for 10–15 generations, at which time the day's session ended. Timeout duration, which determined whether the group earned more by choosing red or blue, and which was fixed for a day's session, was varied across three conditions to equal 1, 2, or 3 min. The groups developed choice traditions that tended toward maximizing earnings. The stronger the dependence between choice and earnings, the stronger was the tradition. Once a choice tradition evolved, groups passed it on by instructing newcomers, using some combination of accurate information, mythology, and coercion. Among verbal traditions, frequency of mythology varied directly ...
Cited by 12 (2 self)
DNA segmentation as a model selection process
In International Conference on Research in Computational Molecular Biology (RECOMB)
Abstract: Previous divide-and-conquer segmentation analyses of DNA sequences do not provide a satisfactory stopping criterion for the recursion. This paper proposes that segmentation be considered as a model selection process. Using the tools of model selection, a limit for the stopping criterion on the relaxed end can be determined. The Bayesian information criterion, in particular, provides a much more stringent stopping criterion than what is currently used. Such a stringent criterion can be used to delineate larger DNA domains. A relationship between the stopping criterion and the average domain size is empirically determined, which may aid in the determination of isochore borders.
Cited by 12 (1 self)
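The proposed BIC stopping rule can be sketched as recursive binary segmentation: keep splitting while the best split's likelihood gain exceeds the BIC penalty. The Bernoulli (0/1) score and toy sequence below are illustrative simplifications, not the paper's implementation, which works on four-letter DNA.

```python
# Sketch: divide-and-conquer segmentation with a BIC stopping criterion.
import math

def bernoulli_loglik(seq):
    """Max log-likelihood of an i.i.d. Bernoulli model for a 0/1 sequence."""
    n, k = len(seq), sum(seq)
    ll = 0.0
    for count in (k, n - k):
        if count:
            ll += count * math.log(count / n)
    return ll

def segment(seq, start=0, out=None):
    """Recursively split seq; return sorted segment boundaries."""
    if out is None:
        out = [start]
    n = len(seq)
    best = None
    for i in range(1, n):
        gain = (bernoulli_loglik(seq[:i]) + bernoulli_loglik(seq[i:])
                - bernoulli_loglik(seq))
        # BIC: each split adds 2 parameters (a Bernoulli rate and a
        # breakpoint), so accept only if 2*gain > 2*log(n)
        if 2 * gain > 2 * math.log(n) and (best is None or gain > best[1]):
            best = (i, gain)
    if best:
        i = best[0]
        segment(seq[:i], start, out)
        segment(seq[i:], start + i, out)
    else:
        out.append(start + n)
    return out

seq = [0] * 50 + [1] * 50
print(segment(seq))  # -> [0, 50, 100]
```

Raising the penalty above 2·log(n) makes the criterion more stringent and the recovered domains larger, which is the trade-off the abstract relates to average domain size.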