Results 21–30 of 62
False Discovery Rates and Copy Number Variation
Abstract

Cited by 2 (2 self)
Copy number changes, the gains and losses of chromosome segments, are a common type of genetic variation among healthy individuals as well as an important feature in tumor genomes. Microarray technology enables us to simultaneously measure, with moderate accuracy, copy number variation at more than a million chromosome locations and for hundreds of subjects. This leads to massive data sets and complicated inference problems concerning which locations for which subjects are genuinely variable. In this paper we consider a relatively simple false discovery rate approach to CNV analysis. More careful parametric changepoint methods can then be focused on promising regions of the genome. Key words and phrases: copy number
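For context on the screening step this abstract describes, here is a minimal sketch of a standard false discovery rate procedure, the Benjamini–Hochberg step-up rule for independent p-values. It is a generic illustration, not the authors' method, and all names and the toy data are invented.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up rule: flag the hypotheses whose
    p-values fall at or below the largest threshold p_(k) <= (k/m)*q."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= (np.arange(1, m + 1) / m) * q
    mask = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()   # largest k passing the test
        mask[order[: k + 1]] = True      # reject all smaller p-values too
    return mask

# Toy screen: 95 null p-values (uniform) plus 5 strong signals.
rng = np.random.default_rng(0)
pvals = np.concatenate([rng.uniform(size=95), np.full(5, 1e-6)])
n_sig = int(benjamini_hochberg(pvals).sum())
```

In the CNV setting described above, the flagged locations would then be handed to the more careful parametric changepoint analysis.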
Nonparametric empirical Bayes for the Dirichlet process mixture model
Statistics and Computing, 2004
Abstract

Cited by 2 (0 self)
The Dirichlet process prior allows flexible nonparametric mixture modeling. The number of mixture components is not specified in advance and can grow as new data come in. However, the behavior of the model is sensitive to the choice of the parameters, including an infinite-dimensional distributional parameter G0. Most previous applications have either fixed G0 as a member of a parametric family or treated G0 in a Bayesian fashion, using parametric prior specifications. In contrast, we have developed an adaptive nonparametric method for constructing smooth estimates of G0. We combine this method with a technique for estimating α, the other Dirichlet process parameter, that is inspired by an existing characterization of its maximum-likelihood estimator. Together, these estimation procedures yield a flexible empirical Bayes treatment of Dirichlet process mixtures. Such a treatment is useful in situations where smooth point estimates of G0 are of intrinsic interest, or where the structure of G0 cannot be conveniently modeled with the usual parametric prior families. Analysis of simulated and real-world datasets illustrates the robustness of this approach.
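The two parameters the abstract refers to, the base measure G0 and the concentration α, define a Dirichlet process DP(α, G0). A minimal stick-breaking sketch (a generic construction, not the authors' estimator; the function names are made up) shows how draws depend on both:

```python
import numpy as np

def dp_stick_breaking(alpha, base_sampler, n_atoms=1000, rng=None):
    """Truncated draw from DP(alpha, G0) via stick-breaking:
    v_k ~ Beta(1, alpha), w_k = v_k * prod_{j<k} (1 - v_j),
    with atoms drawn i.i.d. from the base measure G0."""
    rng = rng if rng is not None else np.random.default_rng()
    v = rng.beta(1.0, alpha, size=n_atoms)
    # Length of stick remaining before each break.
    remaining = np.cumprod(np.concatenate(([1.0], 1.0 - v[:-1])))
    w = v * remaining
    return base_sampler(n_atoms, rng), w

# Standard-normal base measure; a small alpha puts most mass on few atoms.
atoms, w = dp_stick_breaking(
    alpha=1.0,
    base_sampler=lambda n, r: r.normal(size=n),
    rng=np.random.default_rng(1),
)
```

Because G0 enters only through `base_sampler`, one can see why a nonparametric, smoothly estimated G0 (as proposed above) plugs in as easily as a parametric one.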
A new approach to fitting linear models in high dimensional spaces
2000
Abstract

Cited by 2 (0 self)
This thesis presents a new approach to fitting linear models, called “pace regression”, which also overcomes the dimensionality determination problem. Its optimality in minimizing the expected prediction loss is theoretically established when the number of free parameters is infinitely large. In this sense, pace regression outperforms existing procedures for fitting linear models. Dimensionality determination, a special case of fitting linear models, turns out to be a natural byproduct. A range of simulation studies is conducted; the results support the theoretical analysis. Throughout the thesis, a deeper understanding is gained of the problem of fitting linear models. Many key issues are discussed. Existing procedures, namely OLS, AIC, BIC, RIC, CIC, CV(d), BS(m), RIDGE, NNGAROTTE and LASSO, are reviewed and compared, both theoretically and empirically, with the new methods. Estimating a mixing distribution is an indispensable part of pace regression. A measure-based minimum distance approach, including probability measures and nonnegative measures, is proposed, and strongly consistent estimators are produced. Of all minimum distance methods for estimating a mixing distribution, only the …
Kernel Methods for Text-Independent Speaker Verification
2010
Abstract

Cited by 2 (0 self)
In recent years, systems based on support vector machines (SVMs) have become standard for speaker verification (SV) tasks. An important aspect of these systems is the dynamic kernel. These operate on sequence data and handle the dynamic nature of the speech. In this thesis a number of techniques are proposed for improving dynamic kernel-based SV systems.

The first contribution of this thesis is the development of alternative forms of dynamic kernel. Several popular dynamic kernels proposed for SV are based on the Kullback-Leibler divergence between Gaussian mixture models. Since this has no closed-form solution, typically a matched-pair upper bound is used instead. This places significant restrictions on the forms of model structure that may be used. In this thesis, dynamic kernels are proposed based on alternative, variational approximations to the divergence. Unlike standard approaches, these allow the use of a more flexible modelling framework. Also, using a more accurate approximation may lead to performance gains.

The second contribution of this thesis is to investigate the combination of multiple systems to improve SV performance. Typically, systems are combined by fusing the output scores. For SVM classifiers, an alternative strategy is to combine at the kernel level. Recently an efficient maximum-margin scheme for learning kernel weights has been developed. In this thesis several modifications are proposed to allow this scheme to be applied to SV tasks.

System combination will only lead to gains when the kernels are complementary. In this thesis it is shown that many commonly used dynamic kernels can be placed into one of two broad classes, derivative and parametric kernels. The attributes of these classes are contrasted, and the conditions under which the two forms of kernel are identical are described. By avoiding these conditions, gains may be obtained by combining derivative and parametric kernels.

The final contribution of this thesis is to investigate the combination of dynamic kernels with traditional static kernels for vector data. Here two general combination strategies are available: static kernel functions may be defined over the dynamic feature vectors, or, alternatively, a static kernel may be applied at the observation level. In general, it is not possible to explicitly train a model in the feature space associated with a static kernel. However, it is shown in this thesis that this form of kernel can be computed by using a suitable metric with approximate component posteriors. Generalised versions of standard parametric and derivative kernels, which include an observation-level static kernel, are proposed based on this approach.
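The kernel-level combination contrasted with score fusion above rests on a simple fact: a non-negative weighted sum of Gram matrices is itself a valid kernel. A generic sketch follows (this is not the thesis code; the maximum-margin weight learning it describes is replaced by fixed weights, and all names and the toy kernels are invented):

```python
import numpy as np

def rbf_gram(X, gamma=0.5):
    """RBF Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def combine_kernels(grams, betas):
    """Kernel-level combination: K = sum_i beta_i * K_i is positive
    semi-definite whenever every beta_i >= 0 and each K_i is PSD."""
    betas = np.asarray(betas, dtype=float)
    if np.any(betas < 0):
        raise ValueError("kernel weights must be non-negative")
    return sum(b * K for b, K in zip(betas, grams))

# Two generic base kernels on toy data, combined with fixed weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
K = combine_kernels([X @ X.T, rbf_gram(X)], betas=[0.7, 0.3])
```

In the SV setting above, the two Gram matrices would come from complementary derivative and parametric kernels, with the weights learned by the maximum-margin scheme rather than fixed.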
Learning least squares estimators without assumed priors or supervision
2009
Abstract

Cited by 2 (1 self)
The two standard methods of obtaining a least-squares optimal estimator are (1) Bayesian estimation, in which one assumes a prior distribution on the true values and combines this with a model of the measurement process to obtain an optimal estimator, and (2) supervised regression, in which one optimizes a parametric estimator over a training set containing pairs of corrupted measurements and their associated true values. But many real-world systems do not have access to either supervised training examples or a prior model. Here, we study the problem of obtaining an optimal estimator given a measurement process with known statistics, and a set of corrupted measurements of random values drawn from an unknown prior. We develop a general form of nonparametric empirical Bayesian estimator that is written as a direct function of the measurement density, with no explicit reference to the prior. We study the observation conditions under which such “prior-free” estimators may be obtained, and we derive specific forms for a variety of different corruption processes. Each of these prior-free estimators may also be used to express the mean squared estimation error as an expectation over the measurement density, thus generalizing Stein’s unbiased risk estimator (SURE), which provides such an expression for the additive Gaussian noise case. Minimizing this expression over measurement samples provides an “unsupervised …
Tweedie’s Formula and Selection Bias
Abstract

Cited by 2 (0 self)
We suppose that the statistician observes some large number of estimates zi, each with its own unobserved expectation parameter µi. The largest few of the zi’s are likely to substantially overestimate their corresponding µi’s, this being an example of selection bias, or regression to the mean. Tweedie’s formula, first reported by Robbins in 1956, offers a simple empirical Bayes approach for correcting selection bias. This paper investigates its merits and limitations. In addition to the methodology, Tweedie’s formula raises more general questions concerning empirical Bayes theory, discussed here as “relevance” and “empirical Bayes information.” There is a close connection between applications of the formula and James–Stein estimation. Keywords: Bayesian relevance, empirical Bayes information, James–Stein, false discovery rates, regret, winner’s curse
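In the Gaussian case z ~ N(µ, σ²), Tweedie's formula reads E[µ | z] = z + σ² (d/dz) log f(z), where f is the marginal density of the observed z's. A minimal empirical Bayes sketch of the bias correction (my own toy illustration, not the paper's implementation, with f estimated by a kernel density estimate) is:

```python
import numpy as np
from scipy.stats import gaussian_kde

def tweedie_correct(z, sigma=1.0):
    """Tweedie's formula for z ~ N(mu, sigma^2):
    E[mu | z] = z + sigma^2 * d/dz log f(z),
    with the marginal density f estimated by a Gaussian KDE
    and its log-derivative taken by central differences."""
    kde = gaussian_kde(z)
    eps = 1e-3
    score = (np.log(kde(z + eps)) - np.log(kde(z - eps))) / (2.0 * eps)
    return z + sigma**2 * score

# Toy selection-bias demo: most mu_i are 0, a few are large;
# the largest z's overestimate their mu's and get shrunk back.
rng = np.random.default_rng(0)
mu = np.concatenate([np.zeros(950), np.full(50, 3.0)])
z = mu + rng.normal(size=mu.size)
zhat = tweedie_correct(z)
```

Note that the correction uses only the observed z's, with no model of the prior on the µi's, which is the empirical Bayes appeal discussed in the abstract.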
Skellam shrinkage: Wavelet-based intensity estimation for inhomogeneous Poisson data
Abstract

Cited by 2 (0 self)
The ubiquity of integrating detectors in imaging and other applications implies that a variety of real-world data are well modeled as Poisson random variables whose means are in turn proportional to an underlying vector-valued signal of interest. In this article, we first show how the so-called Skellam distribution arises from the fact that Haar wavelet and filterbank transform coefficients corresponding to measurements of this type are distributed as sums and differences of Poisson counts. We then provide two main theorems on Skellam shrinkage, one showing the near-optimality of shrinkage in the Bayesian setting and the other providing for unbiased risk estimation in a frequentist context. These results serve to yield new estimators in the Haar transform domain, including an unbiased risk estimate for shrinkage of Haar–Fisz variance-stabilized data, along with accompanying low-complexity algorithms for inference. We conclude with a simulation study demonstrating the efficacy of our Skellam …
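The fact the abstract builds on, that a difference of independent Poisson counts is Skellam-distributed, is easy to check numerically. This is a generic sketch, not the paper's estimators, and the rate parameters are made up:

```python
import numpy as np
from scipy.stats import skellam

# An (unnormalized) Haar detail coefficient of Poisson data is a
# difference of two independent Poisson counts, hence Skellam:
# d = X1 - X2 with X1 ~ Poisson(mu1), X2 ~ Poisson(mu2).
rng = np.random.default_rng(0)
mu1, mu2 = 4.0, 2.5
x1 = rng.poisson(mu1, size=200_000)
x2 = rng.poisson(mu2, size=200_000)
d = x1 - x2

emp_p0 = np.mean(d == 0)            # empirical P(d = 0)
theo_p0 = skellam.pmf(0, mu1, mu2)  # Skellam pmf at 0
```

The Skellam mean and variance are µ1 − µ2 and µ1 + µ2 respectively, which is why shrinkage rules in the Haar domain can be tuned per coefficient.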
GENERAL THEORY OF INFERENTIAL MODELS II. MARGINAL INFERENCE
Abstract

Cited by 1 (0 self)
This paper is a continuation of the authors’ theoretical investigation of inferential models (IMs); see Martin, Hwang and Liu (2010). The fundamental idea is that prior-free posterior probability-like inference with desirable long-run frequency properties can be achieved through a system based on predicting unobserved auxiliary variables. In Part I, an intermediate conditioning step was proposed to reduce the dimension of the auxiliary variable to be predicted, making the construction of efficient IMs more manageable. Here we consider the problem of inference in the presence of nuisance parameters, and we show that such problems admit a further auxiliary variable reduction via marginalization. Unlike classical procedures that use optimization or integration, the proposed framework eliminates nuisance parameters via a set union operation. Sufficient conditions are given for when this marginalization operation can be performed without loss of information, and in such cases we prove that an appropriately constructed IM is calibrated, in a frequentist sense, for marginal inference. In problems where these sufficient conditions are not met, we propose a marginalization technique based on parameter expansion that leads to conservative marginal inference. The marginal IM approach is illustrated on a number of examples, including Stein’s problem and the Behrens–Fisher problem.
Statistical Decision Making for Authentication and Intrusion Detection
2009
Abstract

Cited by 1 (0 self)
User authentication and intrusion detection differ from standard classification problems in that while we have data generated from legitimate users, impostor or intrusion data is scarce or nonexistent. We review existing techniques for dealing with this problem and propose a novel alternative based on a principled statistical decision-making viewpoint. We examine the technique on a toy problem and validate it on complex real-world data from an RFID-based access control system. The results indicate that it can significantly outperform the classical world model approach. The method could be more generally useful in other decision-making scenarios where there is a lack of adversary data.