Results 1–10 of 12
Signal reconstruction from noisy random projections
IEEE Trans. Inform. Theory, 2006
Cited by 168 (21 self)
Abstract:
Recent results show that a relatively small number of random projections of a signal can contain most of its salient information. It follows that if a signal is compressible in some orthonormal basis, then a very accurate reconstruction can be obtained from random projections. We extend this type of result to show that compressible signals can be accurately recovered from random projections contaminated with noise. We also propose a practical iterative algorithm for signal reconstruction, and briefly discuss potential applications to coding, A/D conversion, and remote wireless sensing.
Index Terms: sampling, signal reconstruction, random projections, denoising, wireless sensor networks
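As a rough illustration of the recovery setting described in this abstract (not the paper's proposed algorithm), a sparse signal can often be reconstructed from noisy random projections by iterative soft-thresholding; the dimensions, sparsity level, step size, and threshold below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, s = 256, 80, 5                     # signal length, projections, sparsity (assumed)
x = np.zeros(n)
x[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)

Phi = rng.standard_normal((k, n)) / np.sqrt(k)   # random projection matrix
y = Phi @ x + 0.01 * rng.standard_normal(k)      # noisy random projections

# Iterative soft-thresholding for min_z 0.5*||y - Phi z||^2 + lam*||z||_1
step, lam = 0.1, 0.02
xhat = np.zeros(n)
for _ in range(500):
    grad = Phi.T @ (Phi @ xhat - y)               # gradient of the quadratic term
    z = xhat - step * grad
    xhat = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # shrinkage

rel_err = np.linalg.norm(xhat - x) / np.linalg.norm(x)
```

With far fewer projections than signal samples (k < n), the sparsity-promoting shrinkage step is what makes recovery possible; the step size must stay below the reciprocal of the largest eigenvalue of Phi^T Phi for the iteration to converge.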
Efficient Agnostic Learning of Neural Networks with Bounded Fan-in
1996
Cited by 68 (18 self)
Abstract:
We show that the class of two-layer neural networks with bounded fan-in is efficiently learnable in a realistic extension to the Probably Approximately Correct (PAC) learning model. In this model, a joint probability distribution is assumed to exist on the observations and the learner is required to approximate the neural network which minimizes the expected quadratic error. As special cases, the model allows learning real-valued functions with bounded noise, learning probabilistic concepts, and learning the best approximation to a target function that cannot be well approximated by the neural network. The networks we consider have real-valued inputs and outputs, an unlimited number of threshold hidden units with bounded fan-in, and a bound on the sum of the absolute values of the output weights. The number of computation ... This work was supported by the Australian Research Council and the Australian Telecommunications and Electronics Research Board. The material in this paper was pres...
Faster rates in regression via active learning
In Proceedings of NIPS, 2005
Cited by 36 (9 self)
Abstract:
In this paper we address the theoretical capabilities of active sampling for estimating functions in noise. Specifically, the problem we consider is that of estimating a function from noisy pointwise samples, that is, measurements collected at various points over the domain of the function. In the classical (passive) setting the sampling locations are chosen a priori, meaning that the choice of the sample locations precedes the gathering of the function observations. In the active sampling setting, on the other hand, the sample locations are chosen in an online fashion: the decision of where to sample next depends on all the observations made up to that point, in the spirit of the twenty questions game (as opposed to passive sampling, where all the questions need to be asked before any answers are given). This extra degree of flexibility leads to improved signal reconstruction in comparison to the performance of classical (passive) methods. We present results characterizing the fundamental limits of active learning for various nonparametric function classes, as well as practical algorithms capable of exploiting the extra flexibility of the active setting and provably improving on classical techniques. In particular, significantly faster rates of convergence are achievable in cases involving functions whose complexity (in the Kolmogorov sense) is highly concentrated in small regions of space (e.g., piecewise constant functions). Our active learning theory and methods show promise in a number of applications, including field estimation using wireless sensor networks and fault line detection.
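A toy sketch of the two-stage idea this abstract describes (a uniform "passive" preview, then sampling concentrated near the detected jump) for a piecewise-constant function. The change-point location, noise level, budget split, and smoothing window are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)
JUMP = 0.62                                   # true change point (assumed)
f = lambda t: (t >= JUMP).astype(float)       # piecewise-constant target
noisy = lambda t: f(t) + 0.1 * rng.standard_normal(t.shape)

budget, w = 200, 5
# Stage 1: half the budget on a uniform grid, as in passive sampling
t1 = np.linspace(0.0, 1.0, budget // 2)
y1 = noisy(t1)
smooth1 = np.convolve(y1, np.ones(w) / w, mode="valid")
j = int(np.argmax(np.abs(np.diff(smooth1))))  # largest jump in smoothed estimate
lo = t1[max(j - 1, 0)]
hi = t1[min(j + w + 1, len(t1) - 1)]

# Stage 2: remaining budget concentrated on the suspect interval
t2 = np.linspace(lo, hi, budget - budget // 2)
y2 = noisy(t2)
smooth2 = np.convolve(y2, np.ones(w) / w, mode="valid")
j2 = int(np.argmax(np.abs(np.diff(smooth2))))
jump_hat = t2[j2 + w // 2]                    # refined change-point estimate
```

The second stage spends its samples on a small fraction of the domain, so the same total budget localizes the discontinuity far more precisely than the uniform grid alone could.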
Nonparametric time series prediction through adaptive model selection
Machine Learning, 2000
Cited by 28 (0 self)
Abstract:
We consider the problem of one-step-ahead prediction for time series generated by an underlying stationary stochastic process obeying the condition of absolute regularity, which describes the mixing nature of the process. We make use of recent results from the theory of empirical processes, and adapt the uniform convergence framework of Vapnik and Chervonenkis to the problem of time series prediction, obtaining finite sample bounds. Furthermore, by allowing both the model complexity and memory size to be adaptively determined by the data, we derive nonparametric rates of convergence through an extension of the method of structural risk minimization suggested by Vapnik. All our results are derived for general L_p error measures, and apply to both exponentially and algebraically mixing processes.
Minimum Complexity Regression Estimation with Weakly Dependent Observations
IEEE Trans. Inform. Theory, 1996
Cited by 20 (1 self)
Abstract:
Parameter Spaces and Abstract Complexities. For each integer n ≥ 1, let m_n denote a model dimension (for example, see (2)), and let S_n denote a compact subset of ℝ^{m_n}. The set S_n will serve as a collection of parameters associated with the model dimension m_n (for example, see (5)). For every v ∈ S_n, let f(n, v) denote a real-valued function on B_x parameterized by (n, v) (for example, see (3)). The following condition is required to invoke the exponential inequalities in Theorems 4.2 and 4.3.
Statistical imaging and complexity regularization
IEEE Trans. Inform. Theory, 2000
Cited by 16 (3 self)
Abstract:
We apply the complexity-regularization principle to statistical ill-posed inverse problems in imaging. We formulate a natural distortion measure in image space and develop non-asymptotic bounds on estimation performance in terms of an index of resolvability that characterizes the compressibility of the true image. These bounds extend previous results that were obtained in the literature under simpler observational models.
I. Statement of the Problem
A variety of imaging problems involve estimation of an image from noisy, degraded observations [1, 2]. Examples include tomography, astronomical imaging, ultrasound imaging, radar imaging, forensic science, and restoration of old movies. In some of these problems, a statistical model relating the observations
Adaptive Hausdorff Estimation of Density Level Sets
2007
Cited by 8 (3 self)
Abstract:
Consider the problem of estimating the γ-level set G*_γ = {x : f(x) ≥ γ} of an unknown d-dimensional density function f based on n independent observations X1, ..., Xn from the density. This problem has been addressed under global error criteria related to the symmetric set difference. However, in certain applications such as anomaly detection and clustering, a more uniform mode of convergence is desirable to ensure that the estimated set is close to the target set everywhere. The Hausdorff error criterion provides this degree of uniformity and hence is more appropriate in such situations. It is known that the minimax optimal rate of convergence for the Hausdorff error is (n/log n)^{−1/(d+2α)} for level sets with Lipschitz boundaries, where the parameter α characterizes the regularity of the density around the level of interest. However, the estimators proposed in previous work achieve this rate only for very restricted classes of sets (e.g., boundary fragments and star-shaped sets) that effectively reduce the set estimation problem to a function estimation problem. This characterization precludes the existence of multiple connected components, which is fundamental to many applications such as clustering. Also, all previous work assumes knowledge of the density regularity as characterized by the parameter α. In this paper, we present a procedure that is adaptive to unknown regularity conditions and achieves near minimax optimal rates of Hausdorff error convergence for a class of level sets with very general shapes and multiple connected components at arbitrary orientations.
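A simple plug-in estimate illustrates why multiple connected components arise naturally in level-set estimation. This is a generic histogram-thresholding sketch, not the paper's adaptive procedure; the bimodal mixture, the level γ = 0.2, and the bin count are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
# Bimodal 1-D density: its gamma-level set has two connected components
x = np.concatenate([rng.normal(-2.0, 0.4, 2000), rng.normal(2.0, 0.4, 2000)])

gamma, bins = 0.2, 60                       # level and resolution (assumed)
hist, edges = np.histogram(x, bins=bins, range=(-4, 4), density=True)
above = hist >= gamma                       # plug-in estimate of {f >= gamma}

# Connected components = rising edges in the indicator sequence
components = int(np.sum(np.diff(above.astype(int)) == 1) + above[0])
```

An estimator restricted to boundary fragments or star-shaped sets could not represent the two disjoint pieces recovered here, which is exactly the limitation the abstract points out.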
Agnostic Learning and Single Hidden Layer Neural Networks
1996
Cited by 5 (0 self)
Abstract:
This thesis is concerned with some theoretical aspects of supervised learning of real-valued functions. We study a formal model of learning called agnostic learning. The agnostic learning model assumes a joint probability distribution on the observations (inputs and outputs) and requires the learning algorithm to produce a hypothesis with performance close to that of the best function within a specified class of functions. It is a very general model of learning which includes function learning, learning with additive noise, and learning the best approximation in a class of functions as special cases. Within the agnostic learning model, we concentrate on learning functions which can be well approximated by single hidden layer neural networks. Artificial neural networks are often used as black box models for modelling phenomena for which very little prior knowledge is available. Agnostic learning is a natural model for such learning problems. The class of single hidden layer neural netwo...
Active Learning and Adaptive Sampling for Non-Parametric Inference
Transversals in Trees
Abstract:
1 Farley's problem and its solution
A transversal in a rooted tree is any set of nodes that meets every path from the root to a leaf. We let c(T, k) denote the number of transversals of size k in a rooted tree T. If T has n nodes and n ≥ 2, then c(T, n) = 1, c(T, n − 1) = n, and
C(n − 1, k − 1) ≤ c(T, k) ≤ C(n, k) for all k = 1, 2, ..., n − 2, (1)
where C(·, ·) denotes the binomial coefficient. The n − 2 upper bounds in (1) are attained simultaneously if and only if T is the path (the tree with precisely one leaf); the n − 2 lower bounds in (1) are attained simultaneously if and only if T is the star (the tree where all leaves are children of the root). Jonathan David Farley asked how high these lower bounds can be raised if each node of T has at most two children; he offered a creative interpretation of this question in [7, 8]. In this section, we give an answer.
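The extremal claims in this excerpt (the star minimizing and the path maximizing the number of size-k transversals) can be checked by brute force on small trees. This verification sketch is illustrative and not from the paper; trees are encoded as child-lists rooted at node 0:

```python
from itertools import combinations
from math import comb

def transversal_count(children, k, root=0):
    """Brute-force count of size-k node sets meeting every root-to-leaf path."""
    paths = []
    def walk(v, acc):
        acc = acc | {v}
        if not children[v]:          # v is a leaf: record the full path
            paths.append(acc)
        for c in children[v]:
            walk(c, acc)
    walk(root, set())
    nodes = list(children)
    return sum(1 for s in combinations(nodes, k)
               if all(set(s) & p for p in paths))

n = 6
path = {i: [i + 1] for i in range(n - 1)}          # single chain: one leaf
path[n - 1] = []
star = {0: list(range(1, n)), **{i: [] for i in range(1, n)}}  # all leaves under root

for k in range(1, n - 1):
    assert transversal_count(star, k) == comb(n - 1, k - 1)   # lower bound, attained by star
    assert transversal_count(path, k) == comb(n, k)           # upper bound, attained by path
```

For the path, every nonempty node set meets the unique root-to-leaf path, giving C(n, k); for the star, a set of size k ≤ n − 2 must contain the root, giving C(n − 1, k − 1).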