Results 1–10 of 13
Bayes Factors
, 1995
Abstract

Cited by 1012 (70 self)
In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this paper we review and discuss the uses of Bayes factors in the context of five scientific applications in genetics, sports, ecology, sociology and psychology.
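The relationship the abstract describes — posterior odds of the null equal the Bayes factor when the prior on the null is one-half — can be illustrated with a toy two-simple-hypotheses sketch. The normal means and the observed value below are illustrative assumptions, not from the paper:

```python
from math import exp, sqrt, pi

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def bayes_factor(x, mu0=0.0, mu1=1.0, sigma=1.0):
    """BF_01 = P(x | H0) / P(x | H1) for two simple hypotheses
    (hypothetical choice: H0: mu = 0 vs. H1: mu = 1)."""
    return normal_pdf(x, mu0, sigma) / normal_pdf(x, mu1, sigma)

def posterior_prob_null(bf, prior_null=0.5):
    """Posterior P(H0 | data) from the Bayes factor: posterior odds =
    BF * prior odds; with prior 1/2 the posterior odds equal the BF."""
    prior_odds = prior_null / (1 - prior_null)
    post_odds = bf * prior_odds
    return post_odds / (1 + post_odds)

bf = bayes_factor(0.2)        # observation near the null mean
p0 = posterior_prob_null(bf)  # posterior probability of H0
```

With `prior_null=0.5` the prior odds are 1, so the posterior odds are exactly the Bayes factor, which is the special case the abstract singles out.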
Could Fisher, Jeffreys, and Neyman Have Agreed on Testing?
, 2002
Abstract

Cited by 30 (2 self)
Ronald Fisher advocated testing using p-values; Harold Jeffreys proposed use of objective posterior probabilities of hypotheses; and Jerzy Neyman recommended testing with fixed error probabilities. Each was quite critical of the other approaches.
Statistical Techniques for Language Recognition: An Introduction and Guide for Cryptanalysts
 Cryptologia
, 1993
Abstract

Cited by 12 (2 self)
We explain how to apply statistical techniques to solve several language-recognition problems that arise in cryptanalysis and other domains. Language recognition is important in cryptanalysis because, among other applications, an exhaustive key search of any cryptosystem from ciphertext alone requires a test that recognizes valid plaintext. Written for cryptanalysts, this guide should also be helpful to others as an introduction to statistical inference on Markov chains. Modeling language as a finite stationary Markov process, we adapt a statistical model of pattern recognition to language recognition. Within this framework we consider four well-defined language-recognition problems: 1) recognizing a known language, 2) distinguishing a known language from uniform noise, 3) distinguishing unknown 0th-order noise from unknown 1st-order language, and 4) detecting non-uniform unknown language. For the second problem we give a most powerful test based on the Neyman-Pearson Lemma. For the oth...
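For problem 2 (distinguishing a known language from uniform noise), the Neyman-Pearson most powerful test is a likelihood-ratio threshold. A minimal sketch with a hypothetical two-letter bigram "language" — the transition probabilities and threshold below are made-up illustrations, not values from the paper:

```python
from math import log

def bigram_loglik(text, logp_first, logp_trans):
    """Log-likelihood of text under a first-order Markov language model."""
    ll = logp_first[text[0]]
    for a, b in zip(text, text[1:]):
        ll += logp_trans[(a, b)]
    return ll

def uniform_loglik(text, alphabet_size):
    """Log-likelihood under i.i.d. uniform noise."""
    return len(text) * log(1.0 / alphabet_size)

def np_test(text, logp_first, logp_trans, alphabet_size, threshold=0.0):
    """Neyman-Pearson test: declare 'language' when the log-likelihood
    ratio (language vs. uniform noise) exceeds the threshold."""
    llr = (bigram_loglik(text, logp_first, logp_trans)
           - uniform_loglik(text, alphabet_size))
    return llr > threshold, llr

# Toy two-letter 'language' dominated by 'ab'/'ba' alternations.
logp_first = {'a': log(0.5), 'b': log(0.5)}
logp_trans = {('a', 'b'): log(0.9), ('a', 'a'): log(0.1),
              ('b', 'a'): log(0.9), ('b', 'b'): log(0.1)}

decision, llr = np_test("abababab", logp_first, logp_trans, alphabet_size=2)
```

The threshold trades false alarms against misses; by the Neyman-Pearson Lemma no other test of the same size has higher power for this pair of simple hypotheses.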
Bayesian maximum a posteriori multiple testing procedure
 SANKHYA
, 2006
Abstract

Cited by 11 (4 self)
We consider a Bayesian approach to multiple hypothesis testing. A hierarchical prior model is based on imposing a prior distribution π(k) on the number of hypotheses arising from alternatives (false nulls). We then apply the maximum a posteriori (MAP) rule to find the most likely configuration of null and alternative hypotheses. The resulting MAP procedure and its closely related step-up and step-down versions compare ordered Bayes factors of individual hypotheses with a sequence of critical values depending on the prior. We discuss the relations between the proposed MAP procedure and the existing frequentist and Bayesian counterparts. A more detailed analysis is given for normal data, where we show, in particular, that by choosing a specific π(k), the MAP procedure can mimic several known familywise error (FWE) and false discovery rate (FDR) controlling procedures. The performance of MAP procedures is illustrated on a simulated example.
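The comparison of ordered Bayes factors against a sequence of critical values might be sketched as a step-down procedure like the one below. The critical values here are hypothetical placeholders; in the paper they are derived from the prior π(k):

```python
def stepdown_map(bayes_factors, critical):
    """Step-down selection sketch: order hypotheses by decreasing Bayes
    factor (evidence for the alternative) and reject the first k nulls,
    where k is the largest rank such that every ordered BF up to k
    exceeds its critical value. `critical` is a decreasing sequence of
    hypothetical thresholds standing in for the prior-dependent ones."""
    order = sorted(range(len(bayes_factors)),
                   key=lambda i: bayes_factors[i], reverse=True)
    k = 0
    for rank, i in enumerate(order):
        if bayes_factors[i] > critical[rank]:
            k = rank + 1
        else:
            break  # step-down: stop at the first failure
    return sorted(order[:k])  # indices of rejected null hypotheses

rejected = stepdown_map([8.0, 0.4, 12.5, 1.1],
                        critical=[5.0, 4.0, 3.0, 2.0])
```

A step-up variant would instead scan from the least significant hypothesis upward and reject everything above the first success.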
Bayesian-Motivated Tests of Function Fit and their Asymptotic Frequentist Properties
 The Annals of Statistics
, 2004
Imputation-Based Analysis of Association Studies: Candidate Regions and Quantitative Traits
Abstract
We introduce a new framework for the analysis of association studies, designed to allow untyped variants to be more effectively and directly tested for association with a phenotype. The idea is to combine knowledge on patterns of correlation among SNPs (e.g., from the International HapMap project or resequencing data in a candidate region of interest) with genotype data at tag SNPs collected on a phenotyped study sample, to estimate ("impute") unmeasured genotypes, and then assess association between the phenotype and these estimated genotypes. Compared with standard single-SNP tests, this approach results in increased power to detect association, even in cases in which the causal variant is typed, with the greatest gain occurring when multiple causal variants are present. It also provides more interpretable explanations for observed associations, including assessing, for each SNP, the strength of the evidence that it (rather than another correlated SNP) is causal. Although we focus on association studies with quantitative phenotype and a relatively restricted region (e.g., a candidate gene), the framework is applicable and computationally practical for whole genome association studies. Methods described here are implemented in a
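As a rough illustration of the impute-then-test idea, the sketch below learns a linear predictor of an untyped SNP from tag SNPs in a reference panel, imputes dosages in the study sample, and regresses the phenotype on the imputed dosage. This is a toy linear stand-in, not the haplotype-based imputation method of the paper:

```python
import numpy as np

def impute_and_test(ref_tags, ref_untyped, study_tags, phenotype):
    """Impute-then-test sketch for a quantitative trait.
    ref_tags:   (n_ref, n_tags) tag-SNP genotypes in the reference panel
    ref_untyped: (n_ref,) genotypes at the untyped SNP in the panel
    study_tags: (n_study, n_tags) tag-SNP genotypes in the study sample
    phenotype:  (n_study,) quantitative phenotype"""
    # Fit untyped ~ tags (with intercept) on the reference panel.
    X_ref = np.column_stack([np.ones(len(ref_tags)), ref_tags])
    coef, *_ = np.linalg.lstsq(X_ref, ref_untyped, rcond=None)
    # Impute dosages at the untyped SNP in the study sample.
    X_study = np.column_stack([np.ones(len(study_tags)), study_tags])
    dosage = X_study @ coef
    # Least-squares association test: phenotype ~ imputed dosage.
    A = np.column_stack([np.ones(len(dosage)), dosage])
    beta, *_ = np.linalg.lstsq(A, phenotype, rcond=None)
    return dosage, beta[1]  # imputed dosages and effect-size estimate
```

In practice one would also account for the uncertainty in the imputed dosages when judging significance; this sketch only shows the point-estimate pipeline.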
GC-Content Normalization for RNA-Seq Data
Abstract
Background: Transcriptome sequencing (RNA-Seq) has become the assay of choice for high-throughput studies of gene expression. However, as is the case with microarrays, major technology-related artifacts and biases affect the resulting expression measures. Normalization is therefore essential to ensure accurate inference of expression levels and subsequent analyses thereof. Results: We focus on biases related to GC-content and demonstrate the existence of strong sample-specific GC-content effects on RNA-Seq read counts, which can substantially bias differential expression analysis. We propose three simple within-lane gene-level GC-content normalization approaches and assess their performance on two different RNA-Seq datasets, involving different species and experimental designs. Our methods are compared to state-of-the-art normalization procedures in terms of bias and mean squared error for expression fold-change estimation and in terms of Type I error and p-value distributions for tests of differential expression. The exploratory data analysis and normalization methods proposed in this article are implemented in the open-source Bioconductor R package EDASeq. Conclusions: Our within-lane normalization procedures, followed by between-lane normalization, reduce GC-content bias and lead to more accurate estimates of expression fold-changes and tests of differential
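One crude form of within-lane gene-level GC-content normalization is to bin genes by GC-content and rescale each bin to a common median count. This is an illustrative stand-in for the idea, not necessarily the procedure implemented in EDASeq:

```python
import statistics

def gc_bin_normalize(counts, gc, n_bins=3):
    """Toy within-lane GC-content normalization: group genes into
    equal-size GC bins and rescale each bin's counts so its median
    matches the lane-wide median. `counts` and `gc` are per-gene read
    counts and GC fractions for one lane; values below are synthetic."""
    order = sorted(range(len(gc)), key=lambda i: gc[i])
    overall = statistics.median(counts)
    normalized = [0.0] * len(counts)
    size = len(order) // n_bins
    for b in range(n_bins):
        idx = order[b * size:(b + 1) * size] if b < n_bins - 1 else order[b * size:]
        bin_median = statistics.median(counts[i] for i in idx)
        factor = overall / bin_median  # shrink GC-inflated bins, boost deflated ones
        for i in idx:
            normalized[i] = counts[i] * factor
    return normalized

# Synthetic lane in which high-GC genes have inflated counts.
counts = [10, 12, 11, 20, 22, 21, 40, 44, 42]
gc = [0.30, 0.31, 0.32, 0.45, 0.46, 0.47, 0.60, 0.61, 0.62]
norm = gc_bin_normalize(counts, gc)
```

After rescaling, each GC bin has the same median, so the systematic count-vs-GC trend is removed; a between-lane step would then align lanes with each other.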
Compound-Related Peaks and Chromatograms from High Frequency Noise, Spikes and Solvent-Based Noise in LC–MS Data Sets
Abstract
Liquid Chromatography Mass Spectrometry (LC-MS) is a powerful method for sensitive detection and quantification of proteins and peptides in complex biological fluids like serum. LC-MS produces complex data sets, consisting of some hundreds of millions of data points per sample at a resolution of 0.1 amu in the m/z domain and 7000 data points in the time domain. However, the detection of the lower-abundance proteins from this data is hampered by the presence of artefacts, such as high-frequency noise and spikes. Moreover, not all of the tens of thousands of chromatograms produced per sample are relevant for the pursuit of the biomarkers. Thus, in analysing the LC-MS data, two critical preprocessing issues arise: 1) which of the thousands of chromatograms per sample are relevant for the detection of the biomarkers, and 2) which signals per chromatogram are truly compound-related? Each of these issues involves assessing the significance (deviation from noise) of multiple observations, and the issue of multiple comparisons arises. Current methods disregard the multiplicity and provide no concrete threshold for significance.
BAYESIAN FREQUENTIST HYBRID INFERENCE
, 2009
Abstract
Bayesian and frequentist methods differ in many aspects, but share some basic optimality properties. In practice, there are situations in which one of the methods is preferred by some criteria. We consider inference about a set of multiple parameters that can be divided into two disjoint subsets. On one subset a frequentist method may be favored, and on the other the Bayesian. This motivates a joint estimation procedure in which some of the parameters are estimated by the Bayesian method and the rest by the maximum-likelihood estimator in the same parametric model, thus keeping the strengths of both methods while avoiding their weaknesses. Such a hybrid procedure gives us more flexibility in achieving overall inference advantages. We study the consistency and high-order asymptotic behavior of the proposed estimator and illustrate its application. The results also imply a new method for constructing objective priors.
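A minimal sketch of the hybrid idea in a simple normal model, assuming (purely for illustration) that the location parameter is estimated by its conjugate Bayesian posterior mean while the variance is estimated by maximum likelihood:

```python
from statistics import fmean

def hybrid_estimate(data, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Hybrid sketch for N(mu, noise_var) data: estimate mu by the
    conjugate-normal posterior mean (Bayesian component) and the
    variance by maximum likelihood (frequentist component). Which
    parameters go to which method, and this concrete model, are
    illustrative assumptions rather than the paper's worked example."""
    n = len(data)
    xbar = fmean(data)
    # Conjugate normal posterior mean: precision-weighted average of
    # the sample mean and the prior mean.
    w = (n / noise_var) / (n / noise_var + 1.0 / prior_var)
    post_mean = w * xbar + (1 - w) * prior_mean
    # MLE of the variance (mean squared deviation from the sample mean).
    mle_var = sum((x - xbar) ** 2 for x in data) / n
    return post_mean, mle_var
```

With a vague prior (`prior_var` large) the Bayesian component approaches the sample mean, so the hybrid estimator smoothly interpolates toward the all-frequentist answer.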