Results 1 - 10 of 41
Distributed detection in sensor networks with packet losses and finite capacity links
IEEE Transactions on Signal Processing, 2006
Cited by 14 (0 self)
We consider the problem of classifying among a set of M hypotheses via distributed noisy sensors. The sensors can collaborate over a communication network and the task is to arrive at a consensus about the event after exchanging messages. We apply a variant of belief propagation as a strategy for collaboration to arrive at a solution to the distributed classification problem. We show that the message evolution can be re-formulated as the evolution of a linear dynamical system, which is primarily characterized by network connectivity. We show that a consensus to the centralized MAP estimate can almost always be reached by the sensors for any arbitrary network. We then extend these results in several directions. First, we demonstrate that these results continue to hold with quantization of the messages, which is appealing from the point of view of finite bit rates supportable between links. We then demonstrate robustness against packet losses, which implies that optimal decisions can be achieved with asynchronous transmissions as well. Next, we present an account of energy requirements for distributed detection and demonstrate significant improvement over conventional decentralized detection. Finally, extensions to distributed estimation are described.
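The idea that message passing behaves like a linear dynamical system driven by network connectivity can be illustrated with a minimal sketch. This is plain average consensus with Metropolis weights, not the belief propagation variant from the paper; the four-sensor line network, the log-likelihood values, and all function names are hypothetical. It assumes a uniform prior and conditionally independent sensor observations, so that the centralized MAP decision is the argmax of the summed (equivalently, averaged) log-likelihoods.

```python
def metropolis_weights(neighbors):
    """Symmetric, doubly stochastic mixing matrix for an undirected graph."""
    n = len(neighbors)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            W[i][j] = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
        # The diagonal entry is still 0 here, so sum(W[i]) covers only the neighbors.
        W[i][i] = 1.0 - sum(W[i])
    return W


def run_consensus(loglik, neighbors, iterations=200):
    """Iterate x <- W x; each row of x is one sensor's current message."""
    W = metropolis_weights(neighbors)
    n, m = len(loglik), len(loglik[0])
    x = [row[:] for row in loglik]
    for _ in range(iterations):
        x = [[sum(W[i][j] * x[j][h] for j in range(n)) for h in range(m)]
             for i in range(n)]
    return x


def decide(state):
    """Each sensor picks the hypothesis with the largest current score."""
    return [max(range(len(row)), key=lambda h: row[h]) for row in state]


# Hypothetical line network 0 - 1 - 2 - 3 with M = 3 hypotheses.
neighbors = [[1], [0, 2], [1, 3], [2]]
loglik = [
    [-1.0, -2.0, -3.0],
    [-2.5, -1.0, -2.0],  # this sensor alone would pick hypothesis 1
    [-1.2, -1.9, -3.0],
    [-0.8, -2.2, -2.9],
]

local_decisions = decide(loglik)
final_state = run_consensus(loglik, neighbors)
consensus_decisions = decide(final_state)
print(local_decisions, consensus_decisions)
```

Because the mixing matrix is doubly stochastic and the graph is connected, every sensor's state approaches the network-wide average of the log-likelihoods, so all sensors end up with the decision a central node would make.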
Sample size for FDR-control in microarray data analysis
Bioinformatics, 2005
Cited by 13 (0 self)
We consider identifying differentially expressing genes between two patient groups using a microarray experiment. We propose a sample size calculation method for a specified number of true rejections while controlling the false discovery rate at a desired level. Input parameters for the sample size calculation include the allocation proportion in each group, the number of genes in each array, the number of differentially expressing genes, and the effect sizes among the differentially expressing genes. We have a closed-form sample size formula if the projected effect sizes are equal among differentially expressing genes. Otherwise, our method requires a numerical method to solve an equation. Simulation studies are conducted to show that the calculated sample sizes are accurate in practical settings. The proposed method is demonstrated with a real study. Key words: Block compound symmetry, Family-wise error rate, Prognostic gene, True rejection, Two-sample t-test.
An evaluation of thresholding techniques in fMRI analysis
2004
Cited by 12 (4 self)
This paper reviews and compares individual voxel-wise thresholding methods for identifying active voxels in single-subject fMRI datasets. Different error rates are described which may be used to calibrate activation thresholds. We discuss methods which control each of the error rates at a prespecified level α, including simple procedures which ignore spatial correlation among the test statistics as well as more elaborate ones which incorporate this correlation information. The operating characteristics of the methods are shown through a simulation study, indicating that the error rate used has an important impact on the sensitivity of the thresholding method, but that accounting for correlation has little impact. Therefore, the simple procedures described work well for thresholding most single-subject fMRI experiments and are recommended. The methods are illustrated with a real bilateral finger tapping experiment.
Exceedance Control of the False Discovery Proportion
Cited by 11 (1 self)
Multiple testing methods to control the False Discovery Rate (FDR), the expected proportion of falsely rejected null hypotheses among all rejections, have received much attention. It can be valuable instead to control not the mean of this false discovery proportion (FDP) but the probability that the FDP exceeds a specified bound. In this paper, we construct a general class of methods for exceedance control of FDP based on inverting tests of uniformity. The method also produces a confidence envelope for the FDP as a function of rejection threshold. We discuss how
False discovery control with p-value weighting
2006
Cited by 10 (1 self)
We present a method for multiple hypothesis testing that maintains control of the false discovery rate while incorporating prior information about the hypotheses. The prior information takes the form of p-value weights. If the assignment of weights is positively associated with the null hypotheses being false, the procedure improves power, except in cases where power is already near one. Even if the assignment of weights is poor, power is only reduced slightly, as long as the weights are not too large. We also provide a similar method for controlling false discovery exceedance.
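One common way to combine p-value weights with FDR control is to run the Benjamini-Hochberg step-up procedure on the ratios p_i / w_i, with the weights rescaled to average one. The sketch below assumes that formulation; the abstract does not spell out the procedure, so this may differ from the paper's exact method. The function name, p-values, and weights are hypothetical.

```python
def weighted_bh(pvalues, weights, alpha=0.05):
    """Benjamini-Hochberg step-up applied to p_i / w_i (weights rescaled to mean 1).

    Returns the sorted indices of the rejected hypotheses.
    """
    m = len(pvalues)
    if len(weights) != m:
        raise ValueError("pvalues and weights must have the same length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")

    mean_w = sum(weights) / m
    scaled = [w / mean_w for w in weights]          # enforce an average weight of 1
    q = [p / w for p, w in zip(pvalues, scaled)]    # weighted p-values

    order = sorted(range(m), key=lambda i: q[i])
    k = 0  # largest rank whose weighted p-value is under the BH line
    for rank, i in enumerate(order, start=1):
        if q[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Hypothetical data: hypothesis 2 is up-weighted by prior information.
pvalues = [0.001, 0.015, 0.035, 0.2, 0.6]
unweighted = weighted_bh(pvalues, [1, 1, 1, 1, 1])
weighted = weighted_bh(pvalues, [1, 1, 2, 0.5, 0.5])
print(unweighted, weighted)
```

With equal weights the function reduces to ordinary Benjamini-Hochberg. In this toy example the up-weighted hypothesis is additionally rejected, which mirrors the power gain the abstract describes when weights are positively associated with false nulls.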
Estimation and confidence sets for sparse normal mixtures
2005
Cited by 9 (5 self)
For high dimensional statistical models, researchers have begun to focus on situations which can be described as having relatively few moderately large coefficients. Such situations lead to some very subtle statistical problems. In particular, Ingster and Donoho and Jin have considered a sparse normal means testing problem, in which they described the precise demarcation, or the detection boundary. Meinshausen and Rice have shown that it is even possible to estimate consistently the fraction of nonzero coordinates on a subset of the detectable region, but leave unanswered the question of exactly in which parts of the detectable region consistent estimation is possible. In the present paper we develop a new approach for estimating the fraction of nonzero means for problems where the nonzero means are moderately large. We show that the detection region described by Ingster and Donoho and Jin turns out to be the region where it is possible to consistently estimate the expected fraction of nonzero coordinates. This theory is developed further and minimax rates of convergence are derived. A procedure is constructed which attains the optimal rate of convergence in this setting. Furthermore, the procedure also provides an honest lower bound for confidence intervals while minimizing the expected length of such an interval. Simulations are used to enable comparison with the work of Meinshausen and Rice, where a procedure is given but where rates of convergence have not been discussed. Extensions to more general Gaussian mixture models are also given.
Asymptotic minimaxity of false discovery rate thresholding for sparse exponential data
Ann. Statist., 2006
Cited by 7 (3 self)
Control of the False Discovery Rate (FDR) is an important development in multiple hypothesis testing, allowing the user to limit the fraction of rejected null hypotheses which correspond to false rejections (i.e. false discoveries). The FDR principle also can be used in multiparameter estimation problems to set thresholds for separating signal from noise when the signal is sparse. Success has been proven when the noise is Gaussian; see [3]. In this paper, we consider the application of FDR thresholding to a non-Gaussian setting, in hopes of learning whether the good asymptotic properties of FDR thresholding as an estimation tool hold more broadly than just at the standard Gaussian model. We consider a vector X_i, i = 1,..., n, whose coordinates are independent exponential with individual means µ_i. The vector µ is thought to be sparse, with most coordinates 1 and a small fraction significantly larger than 1. This models a situation where most coordinates are simply ‘noise’, but a small fraction of the coordinates contain ‘signal’. We develop an estimation theory working with log(µ_i) as the estimand, and use the per-coordinate mean-squared error in recovering log(µ_i) to measure risk. We consider minimax
False discovery and false nondiscovery rates in single-step multiple testing procedures
Ann. Statist., 2006
Cited by 7 (0 self)
Results on the false discovery rate (FDR) and the false nondiscovery rate (FNR) are developed for single-step multiple testing procedures. In addition to verifying desirable properties of FDR and FNR as measures of error rates, these results extend previously known results, providing further insights, particularly under dependence, into the notions of FDR and FNR and related measures. First, considering fixed configurations of true and false null hypotheses, inequalities are obtained to explain how an FDR- or FNR-controlling single-step procedure, such as a Bonferroni or Šidák procedure, can potentially be improved. Two families of procedures are then constructed, one that modifies the FDR-controlling and the other that modifies the FNR-controlling Šidák procedure. These are proved to control FDR or FNR under independence less conservatively than the corresponding families that modify the FDR- or FNR-controlling Bonferroni procedure. Results of numerical investigations of the performance of the modified Šidák FDR procedure over its competitors are presented. Second, considering a mixture model where different configurations of true and false null hypotheses are assumed to have certain probabilities, results are also derived that extend some of Storey’s work to the dependence case.
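For orientation, the classical single-step Bonferroni and Šidák procedures compare every p-value with one fixed cutoff: α/m for Bonferroni and 1 - (1 - α)^(1/m) for Šidák. The sketch below shows only these textbook cutoffs, not the modified families constructed in the paper; the function names and p-values are hypothetical.

```python
def bonferroni_threshold(alpha, m):
    """Per-hypothesis cutoff alpha / m."""
    return alpha / m


def sidak_threshold(alpha, m):
    """Per-hypothesis cutoff 1 - (1 - alpha)^(1/m)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def single_step_reject(pvalues, threshold):
    """Reject every hypothesis whose p-value is at or below the common cutoff."""
    return [i for i, p in enumerate(pvalues) if p <= threshold]


# Hypothetical p-values for m = 10 tests at alpha = 0.05.
pvalues = [0.0004, 0.00505, 0.03, 0.2, 0.35, 0.5, 0.6, 0.7, 0.8, 0.9]
alpha, m = 0.05, len(pvalues)

bonf = single_step_reject(pvalues, bonferroni_threshold(alpha, m))
sidak = single_step_reject(pvalues, sidak_threshold(alpha, m))
print(bonferroni_threshold(alpha, m), sidak_threshold(alpha, m))
print(bonf, sidak)
```

The Šidák cutoff is slightly larger than the Bonferroni cutoff for any m > 1, so it rejects at least as many hypotheses. Here the second p-value falls between the two cutoffs and is rejected only by Šidák, a small instance of the less conservative behavior the abstract refers to.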
Estimation and control of multiple testing error rates for microarray studies
Briefings in Bioinformatics, vol. 7, no. 1, 25-36, 65
Nonparametric assessment of contamination in multivariate data using minimum-volume sets and FDR
2007
Cited by 4 (2 self)
Large, multivariate datasets from high-throughput instrumentation have become ubiquitous throughout the sciences. Frequently, it is of great interest to characterize the measurements in these datasets by the extent to which they represent ‘nominal’ versus ‘contaminated’ instances. However, often the nature of even the nominal patterns in the data is unknown and potentially quite complex, making their explicit parametric modeling a daunting task. In this paper, we introduce a nonparametric method for the simultaneous annotation of multivariate data (called MN-SCAnn), by which one may produce an annotated ranking of the observations, indicating the relative extent to which each may or may not be considered nominal, while making minimal assumptions on the nature of the nominal distribution. In our framework each observation is linked to a corresponding minimum volume set and, implicitly adopting a hypothesis testing perspective, each set is associated with a test, which in turn is accompanied by a certain false discovery rate. The combination of minimum volume set methods with false discovery rate principles, in the context of contaminated data, is new. Moreover, estimation of the key underlying quantities requires that a number of issues be addressed. We illustrate MN-SCAnn through examples in two contexts – the pre-processing of cell-based assays in bioinformatics, and the detection of anomalous traffic patterns in Internet measurement studies.

