Results 1-10 of 241
Regularization Theory and Neural Networks Architectures
Neural Computation, 1995
Abstract

Cited by 317 (31 self)
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well-known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...
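As an illustration of the Radial Basis Functions subclass the abstract refers to, here is a minimal sketch (not the paper's exact formulation): a 1-D Gaussian RBF expansion fitted by regularized least squares. The centers, kernel width, and ridge strength `lam` are arbitrary demo choices.

```python
import numpy as np

def gaussian_rbf_design(x, centers, width):
    """Design matrix of Gaussian basis functions, one column per center."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

def gaussian_rbf_fit(x, y, centers, width, lam=1e-6):
    """Regularized least-squares weights for the RBF expansion."""
    Phi = gaussian_rbf_design(x, centers, width)
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(len(centers)), Phi.T @ y)

x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x)                 # toy target function
centers = np.linspace(0.0, 1.0, 10)       # illustrative choice of centers
w = gaussian_rbf_fit(x, y, centers, width=0.15)
y_hat = gaussian_rbf_design(x, centers, width=0.15) @ w
max_err = float(np.max(np.abs(y - y_hat)))   # small residual on the fit grid
```

The ridge term `lam` plays the role of the smoothness functional's regularization parameter: larger values give smoother, less exact fits.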
A Theory of Networks for Approximation and Learning
Laboratory, Massachusetts Institute of Technology, 1989
Abstract

Cited by 200 (24 self)
Learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multidimensional function, that is solving the problem of hypersurface reconstruction. From this point of view, this form of learning is closely related to classical approximation techniques, such as generalized splines and regularization theory. This paper considers the problems of an exact representation and, in more detail, of the approximation of linear and nonlinear mappings in terms of simpler functions of fewer variables. Kolmogorov's theorem concerning the representation of functions of several variables in terms of functions of one variable turns out to be almost irrelevant in the context of networks for learning. We develop a theoretical framework for approximation based on regularization techniques that leads to a class of three-layer networks that we call Generalized Radial Basis Functions (GRBF), since they are mathematically related to the well-known Radial Basis Functions, mainly used for strict interpolation tasks. GRBF networks are not only equivalent to generalized splines, but are also closely related to pattern recognition methods such as Parzen windows and potential functions and to several neural network algorithms, such as Kanerva's associative memory, backpropagation and Kohonen's topology preserving map. They also have an interesting interpretation in terms of prototypes that are synthesized and optimally combined during the learning stage. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.
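A hedged sketch of the GRBF idea of using fewer centers ("prototypes") than examples: here the prototypes are simply a subset of the training points rather than being synthesized during learning as in the paper, and the target function, kernel width, and regularization strength are all made-up demo values.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))              # training inputs
y = np.sin(np.pi * X[:, 0]) * np.cos(np.pi * X[:, 1])  # toy target mapping

centers = X[::5]          # 40 prototypes, far fewer than the 200 examples
sigma = 0.5               # illustrative kernel width

def design(A, C, sigma):
    """Gaussian kernel matrix between rows of A and prototype rows of C."""
    d2 = ((A[:, None, :] - C[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * sigma ** 2))

Phi = design(X, centers, sigma)
lam = 1e-4                # regularization strength (smoothness prior)
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(len(centers)), Phi.T @ y)

mean_abs_err = float(np.abs(Phi @ w - y).mean())
```

With strict interpolation there would be one center per example; the point of the generalized scheme is that a much smaller set of prototypes, optimally weighted, still approximates the mapping well.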
Multi-view Stereo via Volumetric Graph-cuts and Occlusion Robust Photo-Consistency
2007
Abstract

Cited by 140 (10 self)
This paper presents a volumetric formulation for the multi-view stereo problem which is amenable to a computationally tractable global optimisation using Graph-cuts. Our approach is to seek the optimal partitioning of 3D space into two regions labelled as ‘object’ and ‘empty’ under a cost functional consisting of the following two terms: (1) a term that forces the boundary between the two regions to pass through photo-consistent locations and (2) a ballooning term that inflates the ‘object’ region. To take account of the effect of occlusion on the first term we use an occlusion-robust photo-consistency metric based on Normalised Cross Correlation, which does not assume any geometric knowledge about the reconstructed object. The globally optimal 3D partitioning can be obtained as the minimum cut solution of a weighted graph.
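A hedged sketch of the min-cut step only: the paper builds a large graph from photo-consistency scores over a voxel grid, whereas here the four "voxel" nodes and all edge capacities are made up for illustration. The source side of the cut is the ‘object’ region, the sink side is ‘empty’.

```python
from collections import defaultdict, deque

def min_cut(cap, s, t):
    """Edmonds-Karp max-flow; returns (cut value, source side of the cut)."""
    flow = 0.0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:          # BFS for an augmenting path
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                       # no augmenting path: done
            return flow, set(parent)
        path, v = [], t
        while parent[v] is not None:              # recover the path s -> t
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)    # bottleneck capacity
        for u, v in path:                         # update residual capacities
            cap[u][v] -= push
            cap[v][u] = cap[v].get(u, 0.0) + push
        flow += push

cap = defaultdict(dict)
def add_edge(u, v, c):
    cap[u][v] = cap[u].get(v, 0.0) + c
    cap[v].setdefault(u, 0.0)                     # residual reverse edge

for v in ["v0", "v1", "v2", "v3"]:
    add_edge("object", v, 1.0)                    # ballooning term
add_edge("v0", "v1", 0.2)                         # photo-consistency costs on
add_edge("v1", "v2", 3.0)                         # boundaries between voxels
add_edge("v2", "v3", 0.3)
add_edge("v2", "empty", 0.1)                      # weak photo support near the
add_edge("v3", "empty", 0.1)                      # true surface

cut_value, object_side = min_cut(cap, "object", "empty")
```

The ballooning capacities pull voxels toward ‘object’, and the cut settles where the photo-consistency cost is cheapest, which is the global optimum of the two-term functional on this toy graph.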
Transformation Invariance in Pattern Recognition - Tangent Distance and Tangent Propagation
Lecture Notes in Computer Science, 1998
Abstract

Cited by 129 (2 self)
In pattern recognition, statistical modeling, or regression, the amount of data is a critical factor affecting the performance. If the amount of data and computational resources are unlimited, even trivial algorithms will converge to the optimal solution. However, in the practical case, given limited data and other resources, satisfactory performance requires sophisticated methods to regularize the problem by introducing a priori knowledge. Invariance of the output with respect to certain transformations of the input is a typical example of such a priori knowledge. In this chapter, we introduce the concept of tangent vectors, which compactly represent the essence of these transformation invariances, and two classes of algorithms, "tangent distance" and "tangent propagation", which make use of these invariances to improve performance.
1 Introduction
Pattern Recognition is one of the main tasks of biological information processing systems, and a major challenge of compute...
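A minimal sketch of the one-sided tangent distance, under simplifying assumptions: a single hand-made tangent vector stands in for the tangents that would really be derived from image transformations such as translation or rotation. The distance from a stored pattern to a query is minimized over small moves along the tangent, so variation the transformation can explain is not penalized.

```python
import numpy as np

def tangent_distance(p, e, T):
    """One-sided tangent distance: min over a of || e + T a - p ||."""
    a, *_ = np.linalg.lstsq(T, p - e, rcond=None)   # best transformation coeffs
    return float(np.linalg.norm(e + T @ a - p))

e = np.array([0.0, 0.0, 1.0, 0.0])        # stored "pattern"
t = np.array([1.0, 0.0, 0.0, 0.0])        # hypothetical tangent direction at e
T = t[:, None]                            # tangent vectors as matrix columns

# a transformed copy of e plus a little genuine difference
p = e + 0.3 * t + np.array([0.0, 0.05, 0.0, 0.0])

d_euclidean = float(np.linalg.norm(p - e))      # penalizes the transformation
d_tangent = tangent_distance(p, e, T)           # ignores motion along t
```

Here `d_tangent` keeps only the component of `p - e` orthogonal to the tangent, which is exactly the invariance the chapter exploits inside nearest-neighbor classifiers.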
Flexible Metric Nearest Neighbor Classification
1994
Abstract

Cited by 123 (2 self)
The K-nearest-neighbor decision rule assigns an object of unknown class to the plurality class among the K labeled "training" objects that are closest to it. Closeness is usually defined in terms of a metric distance on the Euclidean space with the input measurement variables as axes. The metric chosen to define this distance can strongly affect performance. An optimal choice depends on the problem at hand as characterized by the respective class distributions on the input measurement space, and within a given problem, on the location of the unknown object in that space. In this paper new types of K-nearest-neighbor procedures are described that estimate the local relevance of each input variable, or their linear combinations, for each individual point to be classified. This information is then used to separately customize the metric used to define distance from that object in finding its nearest neighbors. These procedures are a hybrid between regular K-nearest-neighbor methods and tree-structured recursive partitioning techniques popular in statistics and machine learning.
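A toy sketch of the ingredients involved: K-nearest-neighbor plurality voting under a diagonal (per-feature) metric. The relevance weights here are fixed by hand for illustration; the paper's contribution is estimating them locally from the class distributions around each query point.

```python
import numpy as np
from collections import Counter

def knn_predict(query, X, labels, k, weights):
    """Plurality vote among the k nearest points under a diagonal metric."""
    d2 = (((X - query) ** 2) * weights).sum(axis=1)   # weighted squared distance
    nearest = np.argsort(d2)[:k]
    return Counter(int(labels[i]) for i in nearest).most_common(1)[0][0]

X = np.array([[0.0, 0.0], [0.1, 5.0], [0.2, -4.0], [2.0, 0.1], [2.1, 0.0]])
labels = np.array([0, 0, 0, 1, 1])
query = np.array([0.15, 9.0])

# suppose the second feature is judged locally irrelevant near this query
weights = np.array([1.0, 0.01])
pred = knn_predict(query, X, labels, k=3, weights=weights)
```

Down-weighting the locally irrelevant axis lets the vote be dominated by the class that agrees with the query on the relevant feature.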
Improved fast Gauss transform and efficient kernel density estimation
In ICCV, 2003
Abstract

Cited by 109 (7 self)
Evaluating sums of multivariate Gaussians is a common computational task in computer vision and pattern recognition, including in the general and powerful kernel density estimation technique. The quadratic computational complexity of the summation is a significant barrier to the scalability of this algorithm to practical applications. The fast Gauss transform (FGT) has successfully accelerated the kernel density estimation to linear running time for low-dimensional problems. Unfortunately, the cost of a direct extension of the FGT to higher-dimensional problems grows exponentially with dimension, making it impractical for dimensions above 3. We develop an improved fast Gauss transform to efficiently estimate sums of Gaussians in higher dimensions, where a new multivariate expansion scheme and an adaptive space subdivision technique dramatically improve the performance. The improved FGT has been applied to the mean shift algorithm achieving linear computational complexity. Experimental results demonstrate the efficiency and effectiveness of our algorithm.
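This is the quadratic-cost baseline the paper accelerates: a direct sum of Gaussians over all source-target pairs (an unnormalized kernel density estimate). The improved FGT replaces this with a truncated multivariate expansion plus adaptive space subdivision, which is not reproduced here; the point counts and bandwidth below are arbitrary.

```python
import numpy as np

def direct_gauss_sum(sources, targets, h):
    """Unnormalized KDE: mean of exp(-||t - s||^2 / h^2) over all sources."""
    d2 = ((targets[:, None, :] - sources[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / h ** 2).mean(axis=1)   # touches every (target, source) pair

rng = np.random.default_rng(1)
sources = rng.normal(size=(500, 3))   # N source points in 3-D
targets = rng.normal(size=(50, 3))    # M evaluation points
density = direct_gauss_sum(sources, targets, h=1.0)   # O(N * M) work
```

With N sources and M targets the pairwise distance tensor makes both time and memory scale as O(N*M), which is exactly what becomes prohibitive in practical kernel density estimation.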
Smoothing by Local Regression: Principles and Methods
Statistical Theory and Computational Aspects of Smoothing, W. Haerdle, M. G. Schimek (eds), Physica, 1996
Shape statistics in kernel space for variational image segmentation
Pattern Recognition, 2003
Parametric and Nonparametric Unsupervised Cluster Analysis
Pattern Recognition, 1996
Abstract

Cited by 54 (6 self)
Much work has been published on methods for assessing the probable number of clusters or structures within unknown data sets. This paper aims to look in more detail at two methods: a broad parametric method, based around the assumption of Gaussian clusters, and a nonparametric method which utilises methods of scale-space filtering to extract robust structures within a data set. It is shown that, whilst both methods are capable of determining cluster validity for data sets in which clusters tend towards a multivariate Gaussian distribution, the parametric method inevitably fails for clusters which have a non-Gaussian structure whilst the scale-space method is more robust. Key words: cluster analysis, maximum likelihood methods, scale-space filtering, probability density estimation.
1 Introduction
Most scientific disciplines generate experimental data from an observed system about which we may have little understanding of the data generating function. The notion that com...
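A 1-D sketch of the scale-space idea, under hand-picked assumptions (Gaussian kernel, two fixed bandwidths): smooth the data into a density estimate and count its modes at each scale. Structures that persist across scales are taken as genuine clusters; the data and bandwidths here are purely illustrative.

```python
import numpy as np

def count_modes(data, h, grid_size=400):
    """Count local maxima of a Gaussian kernel density estimate at scale h."""
    grid = np.linspace(data.min() - 3 * h, data.max() + 3 * h, grid_size)
    dens = np.exp(-((grid[:, None] - data[None, :]) ** 2) / (2 * h ** 2)).sum(axis=1)
    mid = dens[1:-1]
    return int(((mid > dens[:-2]) & (mid > dens[2:])).sum())

rng = np.random.default_rng(2)
data = np.concatenate([rng.normal(-3, 0.5, 200), rng.normal(3, 0.5, 200)])

modes_fine = count_modes(data, h=0.3)    # small scale: both clusters show up
modes_coarse = count_modes(data, h=5.0)  # large scale: smoothed to one mode
```

Sweeping `h` from fine to coarse traces how modes merge, and no Gaussian assumption about cluster shape is needed, which is why the scale-space approach survives the non-Gaussian cases where the parametric method fails.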