Results 1–10 of 13
Issues in Bayesian Analysis of Neural Network Models
, 1998
"... This paper discusses these issues exploring the potentiality of Bayesian ideas in the analysis of NN models. Buntine and Weigend (1991) and MacKay (1992) have provided frameworks for their Bayesian analysis based on Gaussian approximations and Neal (1993) has applied hybrid Monte Carlo ideas. Ripley ..."
Abstract

Cited by 31 (0 self)
This paper discusses these issues, exploring the potential of Bayesian ideas in the analysis of NN models. Buntine and Weigend (1991) and MacKay (1992) have provided frameworks for their Bayesian analysis based on Gaussian approximations, and Neal (1993) has applied hybrid Monte Carlo ideas. Ripley (1993) and Cheng and Titterington (1994) have dwelt on the power of these ideas, especially as far as interpretation and architecture selection are concerned. See MacKay (1995) for a recent review. From a statistical modeling point of view, NNs are a special instance of mixture models. Many issues concerning posterior multimodality and computational strategies in NN modeling are relevant to the wider class of mixture models. Related recent references in the Bayesian literature on mixture models include Diebolt and Robert (1994), Escobar and West (1994), Robert and Mengersen (1995), Roeder and Wasserman (1995), West (1994), West and Cao (1993), West, Muller and Escobar (1994), and West and Turner (1994). We concentrate on approximation problems, though many of our suggestions can be translated to other areas. For those problems, NNs are viewed as highly nonlinear (semiparametric) approximators, whose parameters are typically estimated by least squares. Applications of interest to practitioners include nonlinear regression, stochastic optimisation and regression metamodels for simulation output. The main issue we address here is how to undertake a Bayesian analysis of a NN model, and the uses we may make of it. Our contributions include: an evaluation of computational approaches to Bayesian analysis of NN models, including a novel Markov chain Monte Carlo scheme; a suggestion of a scheme for handling a variable-architecture model; and a scheme for combining NN models with more ...
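As a concrete illustration of the MCMC direction the abstract mentions (not the authors' own novel scheme), a random-walk Metropolis sampler over the weights of a tiny one-hidden-layer network can be sketched as follows; the network size, Gaussian prior/noise precisions, and step size are all assumptions made for this example:

```python
import numpy as np

def nn_predict(w, x, hidden=3):
    # one-hidden-layer tanh network; w packs all 3*hidden + 1 parameters
    W1 = w[:hidden].reshape(hidden, 1)
    b1 = w[hidden:2 * hidden]
    W2 = w[2 * hidden:3 * hidden]
    b2 = w[3 * hidden]
    return np.tanh(x[:, None] @ W1.T + b1) @ W2 + b2

def log_posterior(w, x, y, alpha=0.1, beta=10.0):
    # Gaussian prior (precision alpha) + Gaussian likelihood (precision beta),
    # up to an additive constant
    err = y - nn_predict(w, x)
    return -0.5 * beta * err @ err - 0.5 * alpha * w @ w

def metropolis(x, y, n_steps=2000, step=0.05, seed=0):
    # random-walk Metropolis over the full weight vector
    rng = np.random.default_rng(seed)
    w = rng.normal(size=3 * 3 + 1) * 0.1
    lp = log_posterior(w, x, y)
    samples = []
    for _ in range(n_steps):
        w_new = w + step * rng.normal(size=w.size)
        lp_new = log_posterior(w_new, x, y)
        if np.log(rng.uniform()) < lp_new - lp:   # accept/reject
            w, lp = w_new, lp_new
        samples.append(w.copy())
    return np.array(samples)

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 40)
y = np.sin(2 * x) + 0.1 * rng.normal(size=40)
samples = metropolis(x, y)
# posterior-mean prediction averages over sampled networks (after burn-in)
y_mean = np.mean([nn_predict(w, x) for w in samples[1000:]], axis=0)
```

Averaging predictions over posterior samples, rather than using a single least-squares fit, is the basic payoff of the Bayesian treatment described in the abstract.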
Bayesian Neural Networks For Internet Traffic Classification
 IEEE Transactions on Neural Networks
, 2007
"... ..."
BioMed Central
, 2006
"... A novel approach to phylogenetic tree construction using stochastic optimization and clustering ..."
Abstract

Cited by 4 (2 self)
A novel approach to phylogenetic tree construction using stochastic optimization and clustering
Improving the Determination of the Hyperparameters in Bayesian Learning
 In Proceedings of the ACNN '98
, 1998
"... Bayesian learning provides a theoretical way to prevent neural networks from overfitting. It is possible to determine the weight decay parameter during the training process without using a validation set. This is done by maximizing the evidence p(Djff; fi) of the hyperparameters ff and fi. In this ..."
Abstract

Cited by 2 (2 self)
Bayesian learning provides a theoretical way to prevent neural networks from overfitting. It is possible to determine the weight-decay parameter during the training process without using a validation set. This is done by maximizing the evidence p(D|α, β) of the hyperparameters α and β. In this paper, two new methods are described that improve the determination of the hyperparameters. The first defines an iteration process for obtaining the optimal value of α; we prove that this iteration process always converges to the optimal solution. The second takes into account the fact that α and β are so-called scale parameters and therefore have a natural a priori probability that differs significantly from the a priori probability generally used. The new methods are applied to a very noisy data set, namely the prediction of the foreign exchange rate of the US Dollar against the German Mark, and demonstrate a substantial improvement with respect to the generalizati...
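The evidence framework the abstract builds on can be sketched for a linear-in-parameters model, where the MacKay-style re-estimation formulas for α and β are available in closed form. This is a simplification of the NN case (there, the Hessian of the data error replaces X.T @ X) and is not one of the paper's new methods:

```python
import numpy as np

def evidence_updates(X, y, alpha=1.0, beta=1.0, n_iter=50):
    """Re-estimate alpha (weight decay) and beta (noise precision) by
    maximizing the evidence p(D | alpha, beta) for a linear-in-parameters
    model; for an NN the Hessian of the data error replaces X.T @ X."""
    N, M = X.shape
    for _ in range(n_iter):
        A = alpha * np.eye(M) + beta * X.T @ X    # posterior precision
        w = beta * np.linalg.solve(A, X.T @ y)    # posterior mean of weights
        eig = np.linalg.eigvalsh(beta * X.T @ X)  # eigenvalues of beta * Hessian(E_D)
        gamma = np.sum(eig / (eig + alpha))       # effective number of parameters
        alpha = gamma / (w @ w)                   # MacKay's alpha re-estimate
        beta = (N - gamma) / np.sum((y - X @ w) ** 2)  # and beta re-estimate
    return alpha, beta, w
```

The quantity γ counts the parameters that are well determined by the data, which is what lets α and β be set without a validation set.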
Extended Kalman Filter Based Pruning Algorithms And Several Aspects Of Neural Network Learning
, 1998
"... In recent years, more and more researchers have been aware of the effectiveness of using the extended Kalman filter (EKF) in neural network learning since some information such as the Kalman gain and error covariance matrix can be obtained during the progress of training. It would be interesting to ..."
Abstract

Cited by 1 (1 self)
In recent years, more and more researchers have become aware of the effectiveness of using the extended Kalman filter (EKF) in neural network learning, since information such as the Kalman gain and the error covariance matrix can be obtained during the course of training. It is natural to ask whether an EKF method can be used together with pruning in order to speed up the learning process, as well as to determine the size of a trained network. In this dissertation, several extended-Kalman-filter-based pruning algorithms for feedforward neural networks (FNNs) and recurrent neural networks (RNNs) are proposed, and several aspects of neural network learning are presented. For FNNs, a weight importance measure linking prediction error sensitivity to the byproducts obtained from EKF training is derived. Comparison results demonstrate that the proposed measure can better approximate the prediction error sensitivity than using the forgetting recursive least squa...
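A minimal sketch of EKF-based training, showing the Kalman gain and error covariance matrix that become available as byproducts. The saliency w[i]**2 / P[i, i] mentioned in the docstring is one plausible pruning measure, an assumption here and not necessarily the measure derived in the dissertation:

```python
import numpy as np

def ekf_train(f, w0, X, y, P0=100.0, R=0.1, Q=1e-6, eps=1e-5):
    """Train a scalar-output model f(w, x) with an extended Kalman filter.
    The error covariance P is a training byproduct; a plausible pruning
    saliency (an assumption here) is w[i]**2 / P[i, i]."""
    w = w0.astype(float).copy()
    n = w.size
    P = P0 * np.eye(n)
    for x, d in zip(X, y):
        # numerical Jacobian H = df/dw, evaluated at the current weights
        H = np.array([(f(w + eps * e, x) - f(w - eps * e, x)) / (2 * eps)
                      for e in np.eye(n)])
        PH = P @ H
        K = PH / (R + H @ PH)              # Kalman gain
        w = w + K * (d - f(w, x))          # innovation update
        P = P - np.outer(K, PH) + Q * np.eye(n)
    return w, P

# illustrative usage: the EKF recovers the parameters of y = 2x + 1
f = lambda w, x: w[0] * x + w[1]
X = np.linspace(-1.0, 1.0, 100)
y = 2.0 * X + 1.0
w, P = ekf_train(f, np.zeros(2), X, y)
```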
Gated Experts for Classification of Financial Time Series
, 1997
"... this paper are neural networks whose forecasts are combined by another neural network, a gate. For regression problems such an architecture was shown to partly remedy the two main problems in forecasting real world time series: nonstationarity and overfitting. The goal of this paper is to compare th ..."
Abstract

Cited by 1 (1 self)
this paper are neural networks whose forecasts are combined by another neural network, a gate. For regression problems, such an architecture was shown to partly remedy the two main problems in forecasting real-world time series: nonstationarity and overfitting. The goal of this paper is to compare the forecasting ability of gated experts (GE) with that of a single neural network expert on a time series classification task, which corresponds to the decision to take a long position in a stock, a short position, or do nothing. A new error function and weight update rule were derived for this problem. The architecture was tested on actual stock market data, and the errors on both training and testing data were smaller than the errors of the best expert. This suggests that the performance of any single stock market forecasting system can be improved by making several copies of it and training them under the GE framework. In addition, an algorithm is presented for the GE architecture that makes it possible for the model to modify the data to fit the model better. Such a modification is made only if the decrease in the model cost associated with the output error is less than the increase in the input cost associated with moving the data away from its initial values. This idea corresponds to a bidirectional search for the true model, which was shown in AI to cut in half the exponent in the search time compared with the standard unidirectional search used by most connectionist architectures. The implementation of this algorithm was shown to further decrease overfitting on the testing data.
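The gated-experts forward pass described above can be sketched as follows, with linear experts and a linear gate purely for illustration (the paper uses neural networks for both, and its new error function and update rule are not reproduced here):

```python
import numpy as np

def softmax(z):
    # numerically stable softmax along the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def gated_output(x, expert_params, gate_params):
    """Combine K expert forecasts via a softmax gate.  Experts and gate
    are linear maps here purely for illustration; in the paper both are
    neural networks."""
    experts = np.stack([x @ W + b for W, b in expert_params], axis=-1)  # (N, K)
    scores = np.stack([x @ V + c for V, c in gate_params], axis=-1)    # (N, K)
    gates = softmax(scores)            # gate outputs sum to 1 per sample
    return (gates * experts).sum(axis=-1), gates
```

Because the gate's outputs form a convex combination, each expert can specialize on a regime of the series while the gate handles the nonstationarity between regimes.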
Stock Market Pattern Recognition with Neural Networks
, 1997
"... this paper we understand a real world structure or process which is characterized by a set of structural and behavioral patterns. These patterns can be viewed as reflecting the "style" of the object. The objects are assumed to have a relatively high level of stationarity and the patterns c ..."
Abstract

Cited by 1 (1 self)
this paper we understand a real-world structure or process that is characterized by a set of structural and behavioral patterns. These patterns can be viewed as reflecting the "style" of the object. The objects are assumed to have a relatively high level of stationarity, and the patterns characterizing an object are assumed to be probabilistically dependent on each other. For stock market modeling, these objects do not have to represent physical entities such as a company's assets or other objects used in fundamental analysis. The objects can be structures that were created as a result of complex interactions of physical entities. The field of behavioral finance deals with a class of objects such as fads and fashions present in the market. The stock market is extremely sensitive to its environment, and many objects related to the stock market contribute their patterns to the stock price. The goal is to extract the patterns related to each object and build a model of the object from these patterns. For the purpose of risk management, patterns not related to any object will be considered nonstationary and will thus be classified as noise. A similar idea was proposed by Weigend, Zimmermann and Neuneier (1996), who describe an architecture in which the data is accepted for analysis only if it confirms the model. In AI terms, their algorithm implements a bidirectional search, which was proven to give better results than a one-sided search. The objects in the stock market contribute patterns to the stock price at different time scales. This idea is gaining wide recognition, as reflected in the growing body of research in multiresolution analysis. See Bjorn and Weigend (1996) for a discussion. For example, investors and traders operate at different time horizons, and t...
unknown title
"... Abstract—Advancements in robotics have gained much momentum in recent years. Industrial robotic systems are increasingly being used outside the factory floor, evident by the growing presence of service robots in personal environments. In light of these trends, there is currently a pressing need of i ..."
Abstract
Abstract—Advancements in robotics have gained much momentum in recent years. Industrial robotic systems are increasingly being used outside the factory floor, as evidenced by the growing presence of service robots in personal environments. In light of these trends, there is a pressing need to identify new ways of programming robots safely, quickly and more intuitively. These methods should focus on service robots while simultaneously addressing long-standing Human-Robot Interaction issues in industrial robotics. In this paper, the potential of using an Augmented Reality (AR) environment to facilitate immersive robot programming in unknown environments is explored. The benefits of an AR environment over conventional robot programming approaches are discussed, followed by a description of the Robot Programming using AR (RPAR) system developed in this research. New methodologies for programming two classes of robotic tasks using RPAR are proposed. A number of case studies are presented and the results discussed. Keywords—Robot programming, augmented reality, methodology, collision-free path. I.
Neural Network Optimization through searching guided by stochastic methods
, 1998
"... : In this paper we present an integrated framework for neural network optimization. The problem of a finding a sensible topology and a good set of parameters is resolved by intertwining two processes: topology optimization and parameter adjustment, both embedded in an evolutionary search algorithm. ..."
Abstract
In this paper we present an integrated framework for neural network optimization. The problem of finding a sensible topology and a good set of parameters is resolved by intertwining two processes, topology optimization and parameter adjustment, both embedded in an evolutionary search algorithm. Efficient evolutionary operators can be implemented based on stochastic measures such as mutual information, used to optimize the input structure, or the correlation coefficient of two functions, i.e., the activations of two hidden neurons. On the other hand, the 'quality' of a topology cannot be evaluated without adjusting the parameters of the network, and therefore depends on the initialization and the learning process. The theory of Bayesian learning provides a framework for adjusting parameters without making use of additional data, e.g., a validation set to determine hyperparameters such as the weighting factors of regularization terms. Since these parameters are all set automatically during learning, it ...
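The mutual-information criterion mentioned for optimizing the input structure can be illustrated with a simple histogram estimator; the binning scheme and the estimator itself are assumptions made for this sketch:

```python
import numpy as np

def mutual_information(x, y, bins=10):
    """Histogram estimate of the mutual information I(X; Y) in nats,
    a simple stand-in for the criterion used to rank candidate inputs."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of X
    py = pxy.sum(axis=0, keepdims=True)   # marginal of Y
    nz = pxy > 0                          # skip empty bins (0 * log 0 -> 0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```

An input candidate with high mutual information with the target would be retained by the evolutionary operator; one that is nearly independent of the target would be dropped.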
Optimizing the Evidence
"... Since Bayesian learning for neural networks was introduced by MacKay it was applied to real world problems with varying success. Despite of the fact that Bayesian learning provides an elegant theory to prevent neural networks from overfitting, it is not as commonly used as it should be. In this pape ..."
Abstract
Since Bayesian learning for neural networks was introduced by MacKay, it has been applied to real-world problems with varying success. Despite the fact that Bayesian learning provides an elegant theory to prevent neural networks from overfitting, it is not as commonly used as it should be. In this paper we focus on two problems that arise in practice: (1) the evidence p(D|α) of the hyperparameter α does not increase monotonically during the learning process, and (2) the correlation coefficient between the evidence and the generalization performance is usually positive but significantly different from 1. The latter problem is solved in practice by forming a committee of networks with reasonably high evidence, thus reducing the influence of outliers. Based on a good choice of the prior of the hyperparameters, which was crucial for the convergence of the algorithm in our experiments, we exploit in the following the positive correlation between the evidence and the generalization performance b...
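The committee idea described here, averaging networks with reasonably high evidence to reduce the influence of outliers, can be sketched as a simple top-k selection; unweighted averaging is an assumption, and the paper's exact selection rule may differ:

```python
import numpy as np

def committee_predict(predictions, log_evidence, top_k=5):
    """Average the forecasts of the top_k networks ranked by evidence,
    reducing the influence of outlier networks.  Unweighted averaging
    over the selected members is an assumption made for this sketch."""
    order = np.argsort(log_evidence)[::-1][:top_k]  # highest evidence first
    return np.asarray(predictions)[order].mean(axis=0)
```

A network with low evidence, even if it happens to fit the training data well, is simply excluded from the committee, which is what dampens the effect of the imperfect evidence/generalization correlation.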