Signal reconstruction from noisy random projections
- IEEE Trans. Inform. Theory
, 2006
Cited by 104 (11 self)
Recent results show that a relatively small number of random projections of a signal can contain most of its salient information. It follows that if a signal is compressible in some orthonormal basis, then a very accurate reconstruction can be obtained from random projections. We extend this type of result to show that compressible signals can be accurately recovered from random projections contaminated with noise. We also propose a practical iterative algorithm for signal reconstruction, and briefly discuss potential applications to coding, A/D conversion, and remote wireless sensing. Index Terms: sampling, signal reconstruction, random projections, denoising, wireless sensor networks.
Efficient Agnostic Learning of Neural Networks with Bounded Fan-in
, 1996
Cited by 57 (18 self)
We show that the class of two-layer neural networks with bounded fan-in is efficiently learnable in a realistic extension to the Probably Approximately Correct (PAC) learning model. In this model, a joint probability distribution is assumed to exist on the observations and the learner is required to approximate the neural network which minimizes the expected quadratic error. As special cases, the model allows learning real-valued functions with bounded noise, learning probabilistic concepts, and learning the best approximation to a target function that cannot be well approximated by the neural network. The networks we consider have real-valued inputs and outputs, an unlimited number of threshold hidden units with bounded fan-in, and a bound on the sum of the absolute values of the output weights. The number of computation ...
This work was supported by the Australian Research Council and the Australian Telecommunications and Electronics Research Board. The material in this paper was pres...
Nonparametric time series prediction through adaptive model selection
- Machine Learning
, 2000
Cited by 28 (0 self)
We consider the problem of one-step-ahead prediction for time series generated by an underlying stationary stochastic process obeying the condition of absolute regularity, which describes the mixing nature of the process. We make use of recent results from the theory of empirical processes, and adapt the uniform convergence framework of Vapnik and Chervonenkis to the problem of time series prediction, obtaining finite sample bounds. Furthermore, by allowing both the model complexity and memory size to be adaptively determined by the data, we derive nonparametric rates of convergence through an extension of the method of structural risk minimization suggested by Vapnik. All our results are derived for general $L_p$ error measures, and apply to both exponentially and algebraically mixing processes.
Faster rates in regression via active learning
- in Proceedings of NIPS
, 2005
Cited by 25 (6 self)
In this paper we address the theoretical capabilities of active sampling for estimating functions in noise. Specifically, the problem we consider is that of estimating a function from noisy point-wise samples, that is, the measurements which are collected at various points over the domain of the function. In the classical (passive) setting the sampling locations are chosen a priori, meaning that the choice of the sample locations precedes the gathering of the function observations. In the active sampling setting, on the other hand, the sample locations are chosen in an online fashion: the decision of where to sample next depends on all the observations made up to that point, in the spirit of the twenty questions game (as opposed to passive sampling where all the questions need to be asked before any answers are given). This extra degree of flexibility leads to improved signal reconstruction in comparison to the performance of classical (passive) methods. We present results characterizing the fundamental limits of active learning for various nonparametric function classes, as well as practical algorithms capable of exploiting the extra flexibility of the active setting and provably improving on classical techniques. In particular, significantly faster rates of convergence are achievable in cases involving functions whose complexity (in the Kolmogorov sense) is highly concentrated in small regions of space (e.g., piecewise constant functions). Our active learning theory and methods show promise in a number of applications, including field estimation using wireless sensor networks and fault line detection.
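The contrast between passive and active sampling can be illustrated with a toy case. The sketch below is not the paper's algorithm: it assumes a noise-free, one-dimensional step function with a single jump at a hypothetical location `theta`, whereas the abstract concerns noisy samples. The names `step`, `passive_estimate`, and `active_estimate` are made up for illustration. With `n` samples fixed in advance the jump is located to within 1/(2n), while choosing each sample from the earlier answers (bisection, in the spirit of twenty questions) locates it to within 2^-(n+1).

```python
def step(x, theta):
    """Noise-free piecewise constant function with a single jump at theta."""
    return 1.0 if x >= theta else 0.0


def passive_estimate(theta, n):
    """Sample at n locations fixed in advance, then locate the jump."""
    xs = [(i + 0.5) / n for i in range(n)]
    zeros = sum(1 for x in xs if step(x, theta) == 0.0)
    # Midpoint between the last sample that returned 0 and the first that returned 1.
    return zeros / n


def active_estimate(theta, n):
    """Choose each sample location from the earlier answers (bisection)."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        mid = (lo + hi) / 2
        if step(mid, theta) == 1.0:
            hi = mid  # the jump is at or to the left of mid
        else:
            lo = mid  # the jump is to the right of mid
    return (lo + hi) / 2


theta, n = 0.3141, 20  # hypothetical jump location and sample budget
passive_error = abs(passive_estimate(theta, n) - theta)
active_error = abs(active_estimate(theta, n) - theta)
print(passive_error, active_error)
```

Handling noise would require repeated or probabilistic queries at each step, which this sketch deliberately leaves out.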
Minimum Complexity Regression Estimation with Weakly Dependent Observations
- IEEE Trans. Inform. Theory
, 1996
Cited by 14 (1 self)
Parameter Spaces and Abstract Complexities. For each integer $n \ge 1$, let % denote a model dimension (for example, see (2)), and let $S_n$ denote a compact subset of ]R. The set $S_n$ will serve as a collection of parameters associated with the model dimension % (for example, see (5)). For every $v \in S_n$, let f(,, v) denote a real-valued function on Bx parameterized by $(n, v)$ (for example, see (3)). The following condition is required to invoke the exponential inequalities in Theorems 4.2 and 4.3.
Statistical Imaging and Complexity Regularization
- IEEE Transactions on Information Theory
, 1999
Cited by 12 (3 self)
We apply the complexity-regularization principle to statistical ill-posed inverse problems in imaging. The class of problems studied includes restoration of images corrupted by Gaussian or Poisson noise and nonlinear transforms. We formulate a natural distortion measure in image space and develop nonasymptotic bounds on estimation performance in terms of an index of resolvability that characterizes the compressibility of the true image. These bounds extend previous results that were obtained in the literature under simpler observational models. We present a connection between complexity-regularized estimation and rate-distortion theory, which suggests a method for constructing optimal codebooks. However, the design of computationally tractable complexity-regularized image estimators is quite challenging; we present some of the issues involved and illustrate them with a Poisson-imaging application. Keywords: nonparametric estimation, compression, minimum description length principle, ...
Agnostic Learning and Single Hidden Layer Neural Networks
, 1996
Cited by 4 (0 self)
This thesis is concerned with some theoretical aspects of supervised learning of real-valued functions. We study a formal model of learning called agnostic learning. The agnostic learning model assumes a joint probability distribution on the observations (inputs and outputs) and requires the learning algorithm to produce a hypothesis with performance close to that of the best function within a specified class of functions. It is a very general model of learning which includes function learning, learning with additive noise, and learning the best approximation in a class of functions as special cases. Within the agnostic learning model, we concentrate on learning functions which can be well approximated by single hidden layer neural networks. Artificial neural networks are often used as black box models for modelling phenomena for which very little prior knowledge is available. Agnostic learning is a natural model for such learning problems. The class of single hidden layer neural netwo...
Adaptive Hausdorff Estimation of Density Level Sets
, 2007
Cited by 2 (0 self)
Consider the problem of estimating the $\gamma$-level set $G^*_\gamma = \{x : f(x) \ge \gamma\}$ of an unknown $d$-dimensional density function $f$ based on $n$ independent observations $X_1, \dots, X_n$ from the density. This problem has been addressed under global error criteria related to the symmetric set difference. However, in certain applications such as anomaly detection and clustering, a more uniform mode of convergence is desirable to ensure that the estimated set is close to the target set everywhere. The Hausdorff error criterion provides this degree of uniformity and hence is more appropriate in such situations. It is known that the minimax optimal rate of convergence for the Hausdorff error is $(n/\log n)^{-1/(d+2\alpha)}$ for level sets with Lipschitz boundaries, where the parameter $\alpha$ characterizes the regularity of the density around the level of interest. However, the estimators proposed in previous work achieve this rate for very restricted classes of sets (e.g. the boundary fragment and star-shaped sets) that effectively reduce the set estimation problem to a function estimation problem. This characterization precludes the existence of multiple connected components, which is fundamental to many applications such as clustering. Also, all previous work assumes knowledge of the density regularity as characterized by the parameter $\alpha$. In this paper, we present a procedure that is adaptive to unknown regularity conditions and achieves near minimax optimal rates of Hausdorff error convergence for a class of level sets with very general shapes and multiple connected components at arbitrary orientations.
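The difference between a symmetric-difference criterion and the Hausdorff criterion can be seen on finite point sets. The sketch below is only an illustration and not the paper's estimator: it assumes the sets are discretized to grid points, and the names `hausdorff`, `target`, and `estimate` are hypothetical. A single spurious point far from the target barely changes the symmetric difference but dominates the Hausdorff error, which is why the latter enforces closeness everywhere.

```python
from math import dist


def hausdorff(A, B):
    """Hausdorff distance between two non-empty finite point sets."""
    # Farthest any point of A lies from its nearest point of B, and vice versa.
    a_to_b = max(min(dist(a, b) for b in B) for a in A)
    b_to_a = max(min(dist(a, b) for a in A) for b in B)
    return max(a_to_b, b_to_a)


# Hypothetical discretized target set: an 11 x 11 block of grid points.
target = {(i, j) for i in range(11) for j in range(11)}
# Hypothetical estimate: the same block plus one far-away spurious point.
estimate = target | {(50, 50)}

sym_diff_size = len(target ^ estimate)  # counts mismatched points only
hausdorff_error = hausdorff(target, estimate)
print(sym_diff_size, hausdorff_error)
```

The brute-force computation is quadratic in the number of points, which is acceptable for a small illustration but not for large sets.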
Transversals in Trees
1 Farley’s problem and its solution
A transversal in a rooted tree is any set of nodes that meets every path from the root to a leaf. We let c(T, k) denote the number of transversals of size k in a rooted tree T. If T has n nodes and n ≥ 2, then
$$c(T, n) = 1, \quad c(T, n-1) = n, \quad \text{and} \quad \binom{n-1}{k-1} \le c(T, k) \le \binom{n}{k} \quad \text{for all } k = 1, 2, \dots, n-2. \tag{1}$$
The n − 2 upper bounds in (1) are attained simultaneously if and only if T is the path (the tree with precisely one leaf); the n − 2 lower bounds in (1) are attained simultaneously if and only if T is the star (the tree where all leaves are children of the root). Jonathan David Farley asked how high can these lower bounds be raised if each node of T has at most two children; he offered a creative interpretation of this question in [7, 8]. In this section, we give an answer.
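The bounds in (1) can be checked by brute force on small trees. The sketch below is an illustration only, not a method from the paper: it assumes a tree is given as a hypothetical dictionary `children` mapping each node to its list of children, with nodes labelled 0 to n − 1 and node 0 as the root. It enumerates all k-subsets and counts those that meet every root-to-leaf path.

```python
from itertools import combinations
from math import comb


def leaf_paths(children, root=0):
    """Return every root-to-leaf path as a set of nodes."""
    paths, stack = [], [(root, [root])]
    while stack:
        node, path = stack.pop()
        kids = children.get(node, [])
        if not kids:
            paths.append(set(path))
        for kid in kids:
            stack.append((kid, path + [kid]))
    return paths


def count_transversals(children, n, k, root=0):
    """Count k-subsets of nodes 0..n-1 that meet every root-to-leaf path."""
    paths = leaf_paths(children, root)
    return sum(
        all(path & set(subset) for path in paths)
        for subset in combinations(range(n), k)
    )


n = 6
path_tree = {i: [i + 1] for i in range(n - 1)}  # precisely one leaf
star_tree = {0: list(range(1, n))}              # all leaves are children of the root
binary_tree = {0: [1, 2], 1: [3, 4], 2: [5]}    # at most two children per node

for name, tree in [("path", path_tree), ("star", star_tree), ("binary", binary_tree)]:
    counts = [count_transversals(tree, n, k) for k in range(1, n + 1)]
    print(name, counts)
```

Enumeration is exponential in n, so this is only practical for very small trees; it is meant as a sanity check on the inequalities, not as a counting algorithm.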

