Results 1  10
of
274
Partial Correlation Estimation by Joint Sparse Regression Models
 JASA
, 2008
"... In this article, we propose a computationally efficient approach—space (Sparse PArtial Correlation Estimation)—for selecting nonzero partial correlations under the highdimensionlowsamplesize setting. This method assumes the overall sparsity of the partial correlation matrix and employs sparse re ..."
Abstract

Cited by 95 (8 self)
 Add to MetaCart
In this article, we propose a computationally efficient approach—space (Sparse PArtial Correlation Estimation)—for selecting nonzero partial correlations under the highdimensionlowsamplesize setting. This method assumes the overall sparsity of the partial correlation matrix and employs sparse regression techniques for model fitting. We illustrate the performance of space by extensive simulation studies. It is shown that space performs well in both nonzero partial correlation selection and the identification of hub variables, and also outperforms two existing methods. We then apply space to a microarray breast cancer dataset and identify a set of hub genes that may provide important insights on genetic regulatory networks. Finally, we prove that, under a set of suitable assumptions, the proposed procedure is asymptotically consistent in terms of model selection and parameter estimation.
Singletrial analysis and classification of ERP components  a tutorial
, 2010
"... Analyzing brain states that correspond to event related potentials (ERPs) on a single trial basis is a hard problem due to the high trialtotrial variability and the unfavorable ratio between signal (ERP) and noise (artifacts and neural background activity). In this tutorial, we provide a comprehen ..."
Abstract

Cited by 74 (13 self)
 Add to MetaCart
Analyzing brain states that correspond to event related potentials (ERPs) on a single trial basis is a hard problem due to the high trialtotrial variability and the unfavorable ratio between signal (ERP) and noise (artifacts and neural background activity). In this tutorial, we provide a comprehensive framework for decoding ERPs, elaborating on linear concepts, namely spatiotemporal patterns and filters as well as linear ERP classification. However, the bottleneck of these techniques is that they require an accurate covariance matrix estimation in high dimensional sensor spaces which is a highly intricate problem. As a remedy, we propose to use shrinkage estimators and show that appropriate regularization of linear discriminant analysis (LDA) by shrinkage yields excellent results for singletrial ERP classification that are far superior to classical LDA classification. Furthermore, we give practical hints on the interpretation of what classifiers learned from the data and demonstrate in particular that the tradeoff between goodnessoffit and model complexity in regularized LDA relates to a morphing between a difference pattern of ERPs and a spatial filter which cancels non taskrelated brain activity.
Convex optimization techniques for fitting sparse gaussian graphical models
 In Proceedings of the 23rd International Conference on Machine Learning
, 2006
"... We consider the problem of fitting a largescale covariance matrix to multivariate Gaussian data in such a way that the inverse is sparse, thus providing model selection. Beginning with a dense empirical covariance matrix, we solve a maximum likelihood problem with an l1norm penalty term added to e ..."
Abstract

Cited by 62 (0 self)
 Add to MetaCart
(Show Context)
We consider the problem of fitting a largescale covariance matrix to multivariate Gaussian data in such a way that the inverse is sparse, thus providing model selection. Beginning with a dense empirical covariance matrix, we solve a maximum likelihood problem with an l1norm penalty term added to encourage sparsity in the inverse. For models with tens of nodes, the resulting problem can be solved using standard interiorpoint algorithms for convex optimization, but these methods scale poorly with problem size. We present two new algorithms aimed at solving problems with a thousand nodes. The first, based on Nesterov’s firstorder algorithm, yields a rigorous complexity estimate for the problem, with a much better dependence on problem size than interiorpoint methods. Our second algorithm uses block coordinate descent, updating row/columns of the covariance matrix sequentially. Experiments with genomic data show that our method is able to uncover biologically interpretable connections among genes. 1.
Solving Inverse Problems with Piecewise Linear Estimators: From Gaussian Mixture Models to Structured Sparsity
, 2010
"... A general framework for solving image inverse problems is introduced in this paper. The approach is based on Gaussian mixture models, estimated via a computationally efficient MAPEM algorithm. A dual mathematical interpretation of the proposed framework with structured sparse estimation is describe ..."
Abstract

Cited by 55 (8 self)
 Add to MetaCart
(Show Context)
A general framework for solving image inverse problems is introduced in this paper. The approach is based on Gaussian mixture models, estimated via a computationally efficient MAPEM algorithm. A dual mathematical interpretation of the proposed framework with structured sparse estimation is described, which shows that the resulting piecewise linear estimate stabilizes the estimation when compared to traditional sparse inverse problem techniques. This interpretation also suggests an effective dictionary motivated initialization for the MAPEM algorithm. We demonstrate that in a number of image inverse problems, including inpainting, zooming, and deblurring, the same algorithm produces either equal, often significantly better, or very small margin worse results than the best published ones, at a lower computational cost. 1 I.
Modeling changing dependency structure in multivariate time series
 In International Conference in Machine Learning
, 2007
"... We show how to apply the efficient Bayesian changepoint detection techniques of Fearnhead in the multivariate setting. We model the joint density of vectorvalued observations using undirected Gaussian graphical models, whose structure we estimate. We show how we can exactly compute the MAP segmenta ..."
Abstract

Cited by 48 (0 self)
 Add to MetaCart
(Show Context)
We show how to apply the efficient Bayesian changepoint detection techniques of Fearnhead in the multivariate setting. We model the joint density of vectorvalued observations using undirected Gaussian graphical models, whose structure we estimate. We show how we can exactly compute the MAP segmentation, as well as how to draw perfect samples from the posterior over segmentations, simultaneously accounting for uncertainty about the number and location of changepoints, as well as uncertainty about the covariance structure. We illustrate the technique by applying it to financial data and to bee tracking data. 1.
A robust procedure for gaussian graphical model search from microarray data with p larger than n
 Journal of Machine Learning Research
, 2006
"... Learning of largescale networks of interactions from microarray data is an important and challenging problem in bioinformatics. A widely used approach is to assume that the available data constitute a random sample from a multivariate distribution belonging to a Gaussian graphical model. As a conse ..."
Abstract

Cited by 45 (6 self)
 Add to MetaCart
Learning of largescale networks of interactions from microarray data is an important and challenging problem in bioinformatics. A widely used approach is to assume that the available data constitute a random sample from a multivariate distribution belonging to a Gaussian graphical model. As a consequence, the prime objects of inference are fullorder partial correlations which are partial correlations between two variables given the remaining ones. In the context of microarray data the number of variables exceed the sample size and this precludes the application of traditional structure learning procedures because a sampling version of fullorder partial correlations does not exist. In this paper we consider limitedorder partial correlations, these are partial correlations computed on marginal distributions of manageable size, and provide a set of rules that allow one to assess the usefulness of these quantities to derive the independence structure of the underlying Gaussian graphical model. Furthermore, we introduce a novel structure learning procedure based on a quantity, obtained from limitedorder partial correlations, that we call the nonrejection rate. The applicability and usefulness of the procedure are demonstrated by both simulated and real data.
Entropy Inference and the JamesStein Estimator, with Application to Nonlinear Gene Association Networks
"... We present a procedure for effective estimation of entropy and mutual information from smallsample data, and apply it to the problem of inferring highdimensional gene association networks. Specifically, we develop a JamesSteintype shrinkage estimator, resulting in a procedure that is highly effic ..."
Abstract

Cited by 40 (1 self)
 Add to MetaCart
(Show Context)
We present a procedure for effective estimation of entropy and mutual information from smallsample data, and apply it to the problem of inferring highdimensional gene association networks. Specifically, we develop a JamesSteintype shrinkage estimator, resulting in a procedure that is highly efficient statistically as well as computationally. Despite its simplicity, we show that it outperforms eight other entropy estimation procedures across a diverse range of sampling scenarios and datagenerating models, even in cases of severe undersampling. We illustrate the approach by analyzing E. coli gene expression data and computing an entropybased geneassociation network from gene expression data. A computer program is available that implements the proposed shrinkage estimator. Keywords: entropy, shrinkage estimation, JamesStein estimator, “small n, large p ” setting, mutual information, gene association network
How close is the sample covariance matrix to the actual covariance matrix
 Journal of Theoretical Probability
, 2010
"... Abstract. Given a probability distribution in Rn with general (nonwhite) covariance, a classical estimator of the covariance matrix is the sample covariance matrix obtained from a sample of N independent points. What is the optimal sample size N = N(n) that guarantees estimation with a fixed accura ..."
Abstract

Cited by 33 (3 self)
 Add to MetaCart
(Show Context)
Abstract. Given a probability distribution in Rn with general (nonwhite) covariance, a classical estimator of the covariance matrix is the sample covariance matrix obtained from a sample of N independent points. What is the optimal sample size N = N(n) that guarantees estimation with a fixed accuracy in the operator norm? Suppose the distribution is supported in a centered Euclidean ball of radius O ( √ n). We conjecture that the optimal sample size is N = O(n) for all distributions with finite fourth moment, and we prove this up to an iterated logarithmic factor. This problem is motivated by the optimal theorem of M. Rudelson [23] which states that N = O(n log n) for distributions with finite second moment, and a recent result of R. Adamczak et al. [1] which guarantees that N = O(n) for subexponential distributions. 1.
Shrinkage algorithms for MMSE covariance estimation
 Signal Processing, IEEE Transactions on
, 2010
"... Abstract—We address covariance estimation in the sense of minimum meansquared error (MMSE) when the samples are Gaussian distributed. Specifically, we consider shrinkage methods which are suitable for high dimensional problems with a small number of samples (large p small n). First, we improve on t ..."
Abstract

Cited by 29 (2 self)
 Add to MetaCart
(Show Context)
Abstract—We address covariance estimation in the sense of minimum meansquared error (MMSE) when the samples are Gaussian distributed. Specifically, we consider shrinkage methods which are suitable for high dimensional problems with a small number of samples (large p small n). First, we improve on the LedoitWolf (LW) method by conditioning on a sufficient statistic. By the RaoBlackwell theorem, this yields a new estimator called RBLW, whose meansquared error dominates that of LW for Gaussian variables. Second, to further reduce the estimation error, we propose an iterative approach which approximates the clairvoyant shrinkage estimator. Convergence of this iterative method is established and a closed form expression for the limit is determined, which is referred to as the oracle approximating shrinkage (OAS) estimator. Both RBLW and OAS estimators have simple expressions and are easily implemented. Although the two methods are developed from different perspectives, their structure is identical up to specified constants. The RBLW estimator provably dominates the LW method for Gaussian samples. Numerical simulations demonstrate that the OAS approach can perform even better than RBLW, especially when n is much less than p. We also demonstrate the performance of these techniques in the context of adaptive beamforming. Index Terms—Beamforming, covariance estimation, minimum meansquared error (MMSE), shrinkage. I.
Preconditioned stochastic gradient descent optimisation for monomodal image registration
 Medical Image Computing and ComputerAssisted Intervention (MICCAI
, 2011
"... © The Author(s) 2008. This article is published with open access at Springerlink.com Abstract We present a stochastic gradient descent optimisation method for image registration with adaptive step size prediction. The method is based on the theoretical work by Plakhov and Cruz (J. Math. Sci. 120(1) ..."
Abstract

Cited by 29 (3 self)
 Add to MetaCart
© The Author(s) 2008. This article is published with open access at Springerlink.com Abstract We present a stochastic gradient descent optimisation method for image registration with adaptive step size prediction. The method is based on the theoretical work by Plakhov and Cruz (J. Math. Sci. 120(1):964–973, 2004). Our main methodological contribution is the derivation of an imagedriven mechanism to select proper values for the most important free parameters of the method. The selection mechanism employs general characteristics of the cost functions that commonly occur in intensitybased image registration. Also, the theoretical convergence conditions of the optimisation method are taken into account. The proposed adaptive stochastic gradient descent (ASGD) method is compared to a standard, nonadaptive RobbinsMonro (RM) algorithm. Both ASGD and RM employ a stochastic subsampling technique to accelerate the optimisation process. Registration experiments were performed on 3D CT and MR data of the head, lungs, and prostate, using various similarity measures and transformation models. The results indicate that ASGD is robust to these variations in the registration framework and is less sensitive to the settings of the userdefined parameters than RM. The main disadvantage of RM is the need for a predetermined step size function. The ASGD method provides a solution for that issue.