Results 1  10
of
48
Deterministic Annealing for Clustering, Compression, Classification, Regression, and Related Optimization Problems
 Proceedings of the IEEE
, 1998
"... this paper. Let us place it within the neural network perspective, and particularly that of learning. The area of neural networks has greatly benefited from its unique position at the crossroads of several diverse scientific and engineering disciplines including statistics and probability theory, ph ..."
Abstract

Cited by 248 (11 self)
 Add to MetaCart
this paper. Let us place it within the neural network perspective, and particularly that of learning. The area of neural networks has greatly benefited from its unique position at the crossroads of several diverse scientific and engineering disciplines including statistics and probability theory, physics, biology, control and signal processing, information theory, complexity theory, and psychology (see [45]). Neural networks have provided a fertile soil for the infusion (and occasionally confusion) of ideas, as well as a meeting ground for comparing viewpoints, sharing tools, and renovating approaches. It is within the illdefined boundaries of the field of neural networks that researchers in traditionally distant fields have come to the realization that they have been attacking fundamentally similar optimization problems.
Constructive Algorithms for Structure Learning in Feedforward Neural Networks for Regression Problems
 IEEE Transactions on Neural Networks
, 1997
"... In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole ..."
Abstract

Cited by 66 (2 self)
 Add to MetaCart
In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole problem as a state space search, we first describe the general issues in constructive algorithms, with special emphasis on the search strategy. A taxonomy, based on the differences in the state transition mapping, the training algorithm and the network architecture, is then presented. Keywords Constructive algorithm, structure learning, state space search, dynamic node creation, projection pursuit regression, cascadecorrelation, resourceallocating network, group method of data handling. I. Introduction A. Problems with Fixed Size Networks I N recent years, many neural network models have been proposed for pattern classification, function approximation and regression problems. Among...
Global Optimization for Neural Network Training
 IEEE Computer
, 1996
"... In this paper, we study various supervised learning methods for training feedforward neural networks. In general, such learning can be considered as a nonlinear global optimization problem in which the goal is to minimize a nonlinear error function that spans the space of weights using heuristic st ..."
Abstract

Cited by 43 (12 self)
 Add to MetaCart
In this paper, we study various supervised learning methods for training feedforward neural networks. In general, such learning can be considered as a nonlinear global optimization problem in which the goal is to minimize a nonlinear error function that spans the space of weights using heuristic strategies that look for global optima (in contrast to local optima). We survey various global optimization methods suitable for neuralnetwork learning, and propose the NOVEL method, a novel global optimization method for nonlinear optimization and neural network learning. By combining global and local searches, we show how NOVEL can be used to find a good local minimum in the error space. Our key idea is to use a userdefined trace that pulls a search out of a local minimum without having to restart it from a new starting point. Using five benchmark problems, we compare NOVEL against some of the best global optimization algorithms and demonstrate its superior improvement in performance. 1 In...
A review of dimension reduction techniques
, 1997
"... The problem of dimension reduction is introduced as a way to overcome the curse of the dimensionality when dealing with vector data in highdimensional spaces and as a modelling tool for such data. It is defined as the search for a lowdimensional manifold that embeds the highdimensional data. A cl ..."
Abstract

Cited by 30 (4 self)
 Add to MetaCart
The problem of dimension reduction is introduced as a way to overcome the curse of the dimensionality when dealing with vector data in highdimensional spaces and as a modelling tool for such data. It is defined as the search for a lowdimensional manifold that embeds the highdimensional data. A classification of dimension reduction problems is proposed. A survey of several techniques for dimension reduction is given, including principal component analysis, projection pursuit and projection pursuit regression, principal curves and methods based on topologically continuous maps, such as Kohonen’s maps or the generalised topographic mapping. Neural network implementations for several of these techniques are also reviewed, such as the projection pursuit learning network and the BCM neuron with an objective function. Several appendices complement the mathematical treatment of the main text.
Learning and Approximation Capabilities of Adaptive Spline Activation Function Neural Networks
 NEURAL NETWORKS
, 1998
"... In this paper, we study the theoretical properties of a new kind of artificial neural network, which is able to adapt its activation functions by varying the control points of a Catmull Rom cubic spline. Most of all, we are interested in generalization capability, and we can show that our architec ..."
Abstract

Cited by 28 (19 self)
 Add to MetaCart
In this paper, we study the theoretical properties of a new kind of artificial neural network, which is able to adapt its activation functions by varying the control points of a Catmull Rom cubic spline. Most of all, we are interested in generalization capability, and we can show that our architecture presents several advantages. First of all, it can be seen as a suboptimal realization of the additive spline based model obtained by the reguralization theory. Besides, simulations confirm that the special learning mechanism allows to use in a very effective way the network's free parameters, keeping their total number at lower values than in networks with sigmoidal activation functions. Other notable properties are a shorter training time and a reduced hardware complexity, due to the surplus in the number of neurons. # 1998 Elsevier Science Ltd. All rights reserved. Keywords: Spline neural networks; Multilayer perceptron; Generalized sigmoidal functions; Adaptive activation functions...
Spatial modelling using a new class of nonstationary covariance functions
 Environmetrics
, 2006
"... We introduce a new class of nonstationary covariance functions for spatial modelling. Nonstationary covariance functions allow the model to adapt to spatial surfaces whose variability changes with location. The class includes a nonstationary version of the Matérn stationary covariance, in which the ..."
Abstract

Cited by 27 (0 self)
 Add to MetaCart
We introduce a new class of nonstationary covariance functions for spatial modelling. Nonstationary covariance functions allow the model to adapt to spatial surfaces whose variability changes with location. The class includes a nonstationary version of the Matérn stationary covariance, in which the differentiability of the spatial surface is controlled by a parameter, freeing one from fixing the differentiability in advance. The class allows one to knit together local covariance parameters into a valid global nonstationary covariance, regardless of how the local covariance structure is estimated. We employ this new nonstationary covariance in a fully Bayesian model in which the unknown spatial process has a Gaussian process (GP) distribution with a nonstationary covariance function from the class. We model the nonstationary structure in a computationally efficient way that creates nearly stationary local behavior and for which stationarity is a special case. We also suggest nonBayesian approaches to nonstationary kriging. To assess the method, we compare the Bayesian nonstationary GP model with a Bayesian stationary GP model, various standard spatial smoothing approaches, and nonstationary models that can adapt to function heterogeneity. In simulations, the nonstationary GP model adapts to function heterogeneity, unlike the stationary models, and also outperforms the other nonstationary models. On a real dataset, GP models outperform the competitors, but while the nonstationary GP gives qualitatively more sensible results, it fails to outperform the stationary GP on heldout data, illustrating the difficulty in fitting complex spatial functions with relatively few observations. The nonstationary covariance model could also be used for nonGaussian data and embedded in additive models as well as in more complicated, hierarchical spatial or spatiotemporal models. More complicated models may require simpler parameterizations for computational efficiency.
Constructive Feedforward Neural Networks for Regression Problems: A Survey
, 1995
"... In this paper, we review the procedures for constructing feedforward neural networks in regression problems. While standard backpropagation performs gradient descent only in the weight space of a network with fixed topology, constructive procedures start with a small network and then grow additiona ..."
Abstract

Cited by 21 (0 self)
 Add to MetaCart
In this paper, we review the procedures for constructing feedforward neural networks in regression problems. While standard backpropagation performs gradient descent only in the weight space of a network with fixed topology, constructive procedures start with a small network and then grow additional hidden units and weights until a satisfactory solution is found. The constructive procedures are categorized according to the resultant network architecture and the learning algorithm for the network weights. The Hong Kong University of Science & Technology Technical Report Series Department of Computer Science 1 Introduction In recent years, many neural network models have been proposed for pattern classification, function approximation and regression problems. Among them, the class of multilayer feedforward networks is perhaps the most popular. Standard backpropagation performs gradient descent only in the weight space of a network with fixed topology; this approach is analogous to ...
Multilayer Feedforward Networks with Adaptive Spline Activation Function
 IEEE Trans. on Neural Network
"... In this paper, a new adaptive spline activation function neural network (ASNN) is presented. Due to the ASNN's high representation capabilities, networks with a small number of interconnections can be trained to solve both pattern recognition and data processing realtime problems. The main idea is ..."
Abstract

Cited by 20 (17 self)
 Add to MetaCart
In this paper, a new adaptive spline activation function neural network (ASNN) is presented. Due to the ASNN's high representation capabilities, networks with a small number of interconnections can be trained to solve both pattern recognition and data processing realtime problems. The main idea is to use a CatmullRom cubic spline as the neuron's activation function, which ensures a simple structure suitable for both software and hardware implementation. Experimental results demonstrate improvements in terms of generalization capability and of learning speed in both pattern recognition and data processing tasks. Index Terms Adaptive activation functions, function shape autotuning, generalization, generalized sigmoidal functions, multilayer perceptron, neural networks, spline neural networks. I. INTRODUCTION I N both hardware and software neural network (NN) implementations, the complexity, both structural, in terms of interconnections, and computational, in terms of the number...
Efficient Algorithms for Function Approximation with Piecewise Linear Sigmoidal Networks
, 1998
"... This paper presents a computationally efficient algorithm for function approximation with piecewise linear sigmoidal nodes. A one hidden layer network is constructed one node at a time using the wellknown method of fitting the residual. The task of fitting an individual node is accomplished using ..."
Abstract

Cited by 19 (1 self)
 Add to MetaCart
This paper presents a computationally efficient algorithm for function approximation with piecewise linear sigmoidal nodes. A one hidden layer network is constructed one node at a time using the wellknown method of fitting the residual. The task of fitting an individual node is accomplished using a new algorithm that searches for the best fit by solving a sequence of Quadratic Programming problems. This approach offers significant advantages over derivativebased search algorithms (e.g. backpropagation and its extensions). Unique characteristics of this algorithm include: finite step convergence, a simple stopping criterion, solutions that are independent of initial conditions, good scaling properties and a robust numerical implementation. Empirical results are included to illustrate these characteristics.
Bayesian Wavelet Networks for Nonparametric Regression
, 1997
"... Radial wavelet networks have recently been proposed as a method for nonparametric regression. In this paper we analyse their performance within a Bayesian framework. We derive probability distributions over both the dimension of the networks and the network coefficients by placing a prior on the deg ..."
Abstract

Cited by 19 (6 self)
 Add to MetaCart
Radial wavelet networks have recently been proposed as a method for nonparametric regression. In this paper we analyse their performance within a Bayesian framework. We derive probability distributions over both the dimension of the networks and the network coefficients by placing a prior on the degrees of freedom of the model. This process bypasses the need to test or select a finite number of networks during the modelling process. Predictions are formed by mixing over many models of varying dimension and parameterization. We show that the complexity of the models adapts to the complexity of the data and produces good results on a number of benchmark test series. Keywords: Wavelets, radial basis functions, model choice, Bayesian neural networks, reversible jump Markov chain Monte Carlo, nonparametric regression, splines. 1 Introduction Wavelet networks have previously been studied in relation to nonparametric regression by Zhang (1997), Kugarajah and Zhang (1995), Zhang and Benveni...