Non-Parametric Detection of Signals by Information Theoretic Criteria: Performance Analysis and an Improved Estimator
, 2009
Cited by 23 (3 self)
Determining the number of sources is a fundamental problem in many scientific fields. In this paper we consider the nonparametric setting, and focus on the detection performance of two popular estimators based on information theoretic criteria, the Akaike information criterion (AIC) and minimum description length (MDL). We present three contributions on this subject. First, we derive a new expression for the detection performance of the MDL estimator, which exhibits a much closer fit to simulations in comparison to previous formulas. Second, we present a random matrix theory viewpoint of the performance of the AIC estimator, including approximate analytical formulas for its overestimation probability. Finally, we show that a small increase in the penalty term of AIC leads to an estimator with a very good detection performance and a negligible overestimation probability.
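As a rough illustration of the two criteria named in this abstract, the sketch below applies the classical eigenvalue-based AIC and MDL rules to the eigenvalues of a sample covariance matrix. It is not taken from the paper: the penalty term `k * (2 * p - k)` is the free-parameter count usually quoted for complex-valued data, and the function name `itc_estimates`, the simulated mixing matrix, and the signal strength are all assumptions made for the example.

```python
import numpy as np


def itc_estimates(eigvals, n):
    """Return (k_aic, k_mdl): the number of sources minimizing each criterion.

    eigvals: eigenvalues of the p x p sample covariance matrix (all positive).
    n: number of samples used to form the covariance matrix.
    """
    l = np.sort(np.asarray(eigvals, dtype=float))[::-1]  # descending order
    p = l.size
    aic = np.empty(p)
    mdl = np.empty(p)
    for k in range(p):  # candidate number of sources: 0 .. p-1
        tail = l[k:]  # the p-k smallest eigenvalues, treated as noise
        # log of (geometric mean / arithmetic mean); always <= 0
        log_ratio = np.mean(np.log(tail)) - np.log(np.mean(tail))
        neg_loglik = -n * (p - k) * log_ratio
        free = k * (2 * p - k)  # assumed free-parameter count
        aic[k] = 2.0 * neg_loglik + 2.0 * free
        mdl[k] = neg_loglik + 0.5 * free * np.log(n)
    return int(np.argmin(aic)), int(np.argmin(mdl))


# Hypothetical test case: 2 strong sources observed on 10 sensors, 2000 samples.
rng = np.random.default_rng(0)
p, n, k_true = 10, 2000, 2
A = rng.standard_normal((p, k_true))  # assumed mixing matrix
X = A @ (3.0 * rng.standard_normal((k_true, n))) + rng.standard_normal((p, n))
eigvals = np.linalg.eigvalsh(X @ X.T / n)
k_aic, k_mdl = itc_estimates(eigvals, n)
print("AIC estimate:", k_aic, "MDL estimate:", k_mdl)
```

The only difference between the two rules here is the penalty: AIC charges a fixed cost per free parameter, while the MDL cost grows with log n, which is why AIC is the one prone to overestimation.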
Removing unwanted variation from high dimensional data with negative controls
Cited by 6 (5 self)
High dimensional data suffer from unwanted variation, such as the batch effects common in microarray data. Unwanted variation complicates the analysis of high dimensional data, leading to high rates of false discoveries, high rates of missed discoveries, or both. In many cases the factors causing the unwanted variation are unknown and must be inferred from the data. In such cases, negative controls may be used to identify the unwanted variation and separate it from the wanted variation. We present a new method, RUV4, to adjust for unwanted variation in high dimensional data with negative controls. RUV4 may be used when the goal of the analysis is to determine which of the features are truly associated with a given factor of interest. One nice property of RUV4 is that it is relatively insensitive to the number of unwanted factors included in the model; this makes estimating the number of factors less critical. We also present a novel method for estimating the features' variances that may be used even when a large number of unwanted factors are included in the model and the design matrix is full rank. We name this the “inverse method for estimating variances.” By combining RUV4 with the inverse method, it is no longer necessary to estimate the number of unwanted factors at all. Using both real and simulated data we compare the performance of RUV4 with that of other adjustment methods such as SVA, LEAPP, ICE, and RUV2. We find that RUV4 and its variants perform as well as or better than other methods.
Distribution of the largest eigenvalue for real Wishart and Gaussian random matrices and a simple approximation for the Tracy-Widom distribution. arXiv preprint arXiv:1209.3394
, 2012
Approximation of Rectangular Beta-Laguerre Ensembles and Large Deviations
Cited by 3 (1 self)
Let λ1, ..., λn be random eigenvalues coming from the beta-Laguerre ensemble with parameter p, which is a generalization of the real, complex and quaternion Wishart matrices of parameter (n, p). In the case that the sample size n is much smaller than the dimension of the population distribution p, a common situation in modern data, we approximate the beta-Laguerre ensemble by a beta-Hermite ensemble which is a generalization of the real, complex and quaternion Wigner matrices. As corollaries, when n is much smaller than p, we show that the largest and smallest eigenvalues of the complex Wishart matrix are asymptotically independent; we obtain the limiting distribution of the condition numbers as a sum of two i.i.d. random variables with a Tracy-Widom distribution, which is much different from the exact square case that n = p by Edelman (1988); we propose a test procedure for a spherical hypothesis test. By the same approximation tool, we obtain the asymptotic distribution of the smallest eigenvalue of the beta-Laguerre ensemble. In the second part of the paper, under the assumption that n is much smaller than p in a certain scale, we prove the large deviation principles for three basic statistics: the largest eigenvalue, the smallest eigenvalue and the empirical distribution of λ1, ..., λn, where the last large deviation is derived by using a non-standard method.
MAXIMUM LIKELIHOOD ESTIMATION FOR LINEAR GAUSSIAN COVARIANCE MODELS
We study parameter estimation in linear Gaussian covariance models, which are p-dimensional Gaussian models with linear constraints on the covariance matrix. Maximum likelihood estimation for this class of models leads to a non-convex optimization problem which typically has many local optima. We prove that the log-likelihood function is concave over a large region of the cone of positive definite matrices. Using recent results on the asymptotic distribution of extreme eigenvalues of the Wishart distribution, we provide sufficient conditions for any hill climbing method to converge to the global optimum. The proofs of these results utilize large-sample asymptotic theory under the scheme n/p → γ > 1. Remarkably, our numerical simulations indicate that our results remain valid for min{n, p} as small as 2. An important consequence of this analysis is that for sample sizes n ≃ 14p, maximum likelihood estimation for linear Gaussian covariance models behaves as if it were a convex optimization problem.
Spatial Sensing and Cognitive Radio Communication in the Presence of a K-User Interference Primary Network