Results 11–20 of 160
NeuroRule: a connectionist approach to data mining
 In Proceedings of the International Conference on Very Large Data Bases (VLDB'95)
, 1995
Abstract

Cited by 39 (6 self)
Classification, which involves finding rules that partition a given data set into disjoint groups, is one class of data mining problems. Approaches proposed so far for mining classification rules for large databases are mainly decision-tree-based symbolic learning methods. The connectionist approach based on neural networks has been thought not well suited for data mining. One of the major reasons cited is that knowledge generated by neural networks is not explicitly represented in the form of rules suitable for verification or interpretation by humans. This paper examines this issue. With our newly developed algorithms, rules which are similar to, or more concise than, those generated by the symbolic methods can be extracted from the neural networks. The data mining process using neural networks, with the emphasis on rule extraction, is described. Experimental results and a comparison with previously published works are presented.
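The rule-extraction idea can be illustrated with a deliberately tiny sketch: enumerating the binary inputs a single trained threshold unit accepts and reporting them as conjunctive rules. The weights, bias, and attribute names below are invented toy values, and this is not NeuroRule's actual algorithm, which operates on trained multi-layer networks.

```python
from itertools import product

def extract_rule(weights, bias, names):
    # Enumerate binary inputs and report, as conjunctions, the
    # combinations the trained unit classifies as positive.
    rules = []
    for bits in product([0, 1], repeat=len(weights)):
        if sum(w * b for w, b in zip(weights, bits)) + bias > 0:
            rules.append(" AND ".join(
                n if b else "NOT " + n for n, b in zip(names, bits)))
    return rules

# invented toy unit: fires exactly when (A or B) holds and C does not
rules = extract_rule([2, 2, -3], -1, ["A", "B", "C"])
```

Enumeration only scales to a handful of inputs; the appeal of the paper's approach is precisely that it avoids this brute force on real networks.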
On-Line Learning Processes in Artificial Neural Networks
, 1993
Abstract

Cited by 35 (4 self)
We study online learning processes in artificial neural networks from a general point of view. Online learning means that a learning step takes place at each presentation of a randomly drawn training pattern. It can be viewed as a stochastic process governed by a continuous-time master equation. Online learning is necessary if not all training patterns are available all the time. This occurs in many applications when the training patterns are drawn from a time-dependent environmental distribution. Studying learning in a changing environment, we encounter a conflict between the adaptability and the confidence of the network's representation. Minimization of a criterion incorporating both effects yields an algorithm for online adaptation of the learning parameter. The inherent noise of online learning makes it possible to escape from undesired local minima of the error potential on which the learning rule performs (stochastic) gradient descent. We try to quantify these often made cl...
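A minimal sketch of the online setting the abstract describes, with one stochastic gradient step per randomly drawn pattern. The "network" here is a single weight estimating the mean of a data stream under squared error — an invented toy problem, not the paper's master-equation analysis.

```python
import random

def online_sgd(stream, eta=0.1):
    # Online learning: one stochastic gradient step per presented pattern.
    # The model is a single weight w trained on the error (w - x)^2.
    w = 0.0
    for x in stream:
        grad = 2.0 * (w - x)   # d/dw of (w - x)^2
        w -= eta * grad        # stochastic gradient step
    return w

random.seed(0)
patterns = [random.gauss(3.0, 0.5) for _ in range(2000)]
w = online_sgd(patterns)       # settles near the stream mean, 3.0
```

With a constant learning rate the estimate keeps fluctuating around the target, which is exactly the adaptability-versus-confidence tension the paper's adaptive learning parameter addresses.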
Training Neural Nets with the Reactive Tabu Search
Abstract

Cited by 35 (8 self)
In this paper the task of training subsymbolic systems is considered as a combinatorial optimization problem and solved with the heuristic scheme of the Reactive Tabu Search. An iterative optimization process based on a "modified greedy search" component is complemented with a meta-strategy to realize a discrete dynamical system that discourages limit cycles and the confinement of the search trajectory in a limited portion of the search space. The possible cycles are discouraged by prohibiting (i.e., making tabu) the execution of moves that reverse the ones applied in the most recent part of the search, for a prohibition period that is adapted in an automated way. The confinement is avoided and a proper exploration is obtained by activating a diversification strategy when too many configurations are repeated excessively often. The RTS method is applicable to non-differentiable functions, is robust with respect to the random initialization, and is effective in continuing the search after local minima. Three tests of the technique on feedforward and feedback systems are presented.
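The prohibition and reaction mechanisms can be sketched on a toy combinatorial problem. The code below is an illustrative simplification, not the paper's exact RTS scheme: a move flips one bit, reversing a recent move is tabu for an adaptive `tenure`, and the tenure grows when configurations repeat too often.

```python
import random

def reactive_tabu_search(cost, n_bits, iters=300, seed=0):
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n_bits)]
    best, best_cost = x[:], cost(x)
    tabu_until = [0] * n_bits            # iteration until which bit i is tabu
    seen, tenure = {}, 3
    for t in range(iters):
        candidates = []
        for i in range(n_bits):
            y = x[:]
            y[i] ^= 1
            c = cost(y)
            # non-tabu moves, plus tabu moves that beat the best (aspiration)
            if tabu_until[i] <= t or c < best_cost:
                candidates.append((c, i, y))
        if not candidates:
            break
        c, i, x = min(candidates)
        tabu_until[i] = t + tenure       # flipping bit i back is prohibited
        key = tuple(x)
        seen[key] = seen.get(key, 0) + 1
        if seen[key] > 3:                # reactive step: this configuration
            tenure += 1                  # repeats, so lengthen the prohibition
        if c < best_cost:
            best, best_cost = x[:], c
    return best, best_cost

# toy cost: number of bits disagreeing with a hidden target pattern
target = [1, 0, 1, 1, 0, 0, 1, 0]
sol, c = reactive_tabu_search(lambda v: sum(a != b for a, b in zip(v, target)),
                              len(target))
```

For network training the configuration would encode discretized weights and the cost would be the training error; the toy Hamming cost above stands in for that.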
Location-Aware Computing: A Neural Network Model for Determining Location in Wireless LANs
, 2002
Abstract

Cited by 34 (1 self)
The strengths of the RF signals arriving from multiple access points in a wireless LAN are related to the position of the mobile terminal and can be used to derive the location of the user. In a ...
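As a rough illustration of the signal-strength fingerprinting idea (the paper itself trains a neural network; the nearest-neighbour matcher and the toy signal values below are invented stand-ins):

```python
def locate(rss, fingerprints):
    # Return the surveyed position whose recorded access-point signal
    # strengths best match the observed vector (squared-error match).
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(fingerprints, key=lambda pos: dist(fingerprints[pos], rss))

# toy survey: position -> signal strengths (dBm) from three access points
fingerprints = {
    (0, 0): [-40, -70, -80],
    (0, 5): [-70, -40, -80],
    (5, 0): [-80, -70, -40],
}
pos = locate([-42, -69, -78], fingerprints)
```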
Accelerated Learning By Active Example Selection
 International Journal of Neural Systems
, 1994
Abstract

Cited by 33 (10 self)
Much previous work on training multilayer neural networks has attempted to speed up the backpropagation algorithm using more sophisticated weight modification rules, whereby all the given training examples are used in a random or predetermined sequence. In this paper we investigate an alternative approach in which the learning proceeds on an increasing number of selected training examples, starting with a small training set. We derive a measure of criticality of examples and present an incremental learning algorithm that uses this measure to select a critical subset of given examples for solving the particular task. Our experimental results suggest that the method can significantly improve training speed and generalization performance in many real applications of neural networks. This method can be used in conjunction with other variations of gradient descent algorithms.

1 Introduction

One of the most widely used methods for training multilayer feedforward neural networks is the erro...
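A hedged sketch of the selection idea: here "criticality" is crudely approximated by current misclassification and the model is a plain perceptron, neither of which matches the paper's actual measure or its backpropagation setting.

```python
def train_with_selection(examples, epochs=100):
    # Incremental learning on a growing set of selected examples.
    # Model: a perceptron on 2-D points with labels in {-1, +1}.
    w = [0.0, 0.0]
    b = 0.0
    selected = examples[:2]          # start from a small training set
    pool = examples[2:]
    score = lambda x: w[0] * x[0] + w[1] * x[1] + b
    for _ in range(epochs):
        for x, y in selected:        # train only on the selected subset
            if y * score(x) <= 0:    # perceptron mistake-driven update
                w[0] += y * x[0]
                w[1] += y * x[1]
                b += y
        # examples the current model gets wrong are deemed "critical"
        critical = [(x, y) for x, y in pool if y * score(x) <= 0]
        selected += critical
        pool = [e for e in pool if e not in critical]
    return w, b

# linearly separable toy data: label is the sign of x0 - x1
data = [((i, j), 1 if i > j else -1)
        for i in range(5) for j in range(5) if i != j]
w, b = train_with_selection(data)
errors = sum(1 for x, y in data
             if y * (w[0] * x[0] + w[1] * x[1] + b) <= 0)
```

Training touches only the selected subset each epoch, so easy examples that never become critical never cost gradient steps — the source of the claimed speed-up.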
Image Recognition and Neuronal Networks: Intelligent Systems for the Improvement of Imaging Information
, 2000
Abstract

Cited by 24 (14 self)
In this paper we have concentrated on describing issues related to the development and use of artificial neural network-based intelligent systems for medical image interpretation. Research in intelligent systems to date remains centred on technological issues and is mostly application driven. However, previous research and experience suggest that the successful implementation of computerised systems (e.g., [34] [35]), and decision support systems in particular (e.g., [36]), in the area of healthcare relies on the successful integration of the technology with the organisational and social context within which it is applied. Therefore, the successful implementation of intelligent medical image interpretation systems should not only rely on their technical feasibility and effectiveness but also on organisational and social aspects that may arise from their applications, as clinical information is acquired, processed, used and exchanged between professionals. All these issues are critical in healthcare applications because they ultimately reflect on the quality of care provided.
Discriminative Training of Hidden Markov Models
, 1998
Abstract

Cited by 24 (0 self)
Contents: Abbreviations; Notation; 1 Introduction; 2 Hidden Markov Models (2.1 Definition; 2.2 HMM Modelling Assumptions; 2.3 HMM Topology; 2.4 Finding the Best Transcription; 2.5 Setting the Parameters; 2.6 Summary); 3 Objective Functions (3.1 Properties of Maximum Likelihood Estimators; 3.2 Maximum Likelihood; 3.3 Maximum Mutual Information; 3.4 Frame Discrimination; ...)
Artificial neural networks in bankruptcy prediction: General framework and cross-validation analysis
, 1999
Abstract

Cited by 23 (1 self)
In this paper, we present a general framework for understanding the role of artificial neural networks (ANNs) in bankruptcy prediction. We give a comprehensive review of neural network applications in this area and illustrate the link between neural networks and traditional Bayesian classification theory. The method of cross-validation is used to examine the between-sample variation of neural networks for bankruptcy prediction. Based on a matched sample of 220 firms, our findings indicate that neural networks are significantly better than logistic regression models in prediction as well as classification rate estimation. In addition, neural networks are robust to sampling variations in overall classi...
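For readers unfamiliar with the validation method mentioned, a generic k-fold split sketch (not the paper's exact matched-sample protocol):

```python
def k_fold_splits(n, k):
    # Standard k-fold cross-validation index splits: each example
    # appears in exactly one test fold; the rest form the training set.
    idx = list(range(n))
    folds = [idx[i::k] for i in range(k)]
    return [(sorted(set(idx) - set(f)), f) for f in folds]

splits = k_fold_splits(10, 5)   # list of (train_indices, test_indices)
```

Averaging a model's test performance over the k folds is what exposes the between-sample variation the abstract refers to.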
Deep Learning Made Easier by Linear Transformations in Perceptrons
Abstract

Cited by 23 (9 self)
We transform the outputs of each hidden neuron in a multilayer perceptron network to be zero mean and zero slope, and use separate shortcut connections to model the linear dependencies instead. This transformation aims at separating the problems of learning the linear and nonlinear parts of the whole input-output mapping, which has many benefits. We study the theoretical properties of the transformations by noting that they make the Fisher information matrix closer to a diagonal matrix, and thus the standard gradient closer to the natural gradient. We experimentally confirm the usefulness of the transformations by noting that they make basic stochastic gradient learning competitive with state-of-the-art learning algorithms in speed, and that they also seem to help find solutions that generalize better. The experiments include both classification of handwritten digits with a 3-layer network and learning a low-dimensional representation for images by using a 6-layer autoencoder network. The transformations were beneficial in all cases, with and without regularization.
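The transformation can be illustrated numerically on one unit: fit and subtract the linear part of a tanh response so the remaining nonlinearity has zero mean and zero slope over the data. This is a one-unit toy version of the idea; in the full scheme a shortcut connection would carry the subtracted `alpha * x + beta`.

```python
import math
import random

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(20000)]
ys = [math.tanh(x) for x in xs]          # raw hidden-unit outputs

# least-squares fit of the linear part alpha*x + beta of the response
mx = sum(xs) / len(xs)
my = sum(ys) / len(ys)
var = sum((x - mx) ** 2 for x in xs) / len(xs)
alpha = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs) / var
beta = my - alpha * mx

# transformed output: the nonlinearity minus its linear part
ts = [y - alpha * x - beta for x, y in zip(xs, ys)]
mean_t = sum(ts) / len(ts)                              # zero mean
slope_t = (sum((x - mx) * t for x, t in zip(xs, ts))
           / len(xs) / var)                             # zero slope
```

By construction the least-squares residual is uncorrelated with the input and has zero mean, which is what decouples learning the linear and nonlinear parts.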
On Langevin Updating in Multilayer Perceptrons
 Neural Computation
, 1993
Abstract

Cited by 22 (1 self)
The Langevin updating rule, in which noise is added to the weights during learning, is presented and analyzed. It is well controlled and, being a natural extension to standard backpropagation learning, easily combined with other modifications of backpropagation. If the Hessian matrix is numerically ill-conditioned, Langevin updating converges faster than backpropagation and, probably, also higher-order algorithms. This is particularly important for multilayer perceptrons with many hidden layers, which tend to have ill-conditioned Hessians. In addition, Manhattan updating is shown to have a similar effect as Langevin updating.

Introduction

The performance of artificial neural networks (ANNs) is often improved when external noise is present during the training phase. For instance, in Hopfield-type networks the basins of attraction for the stored memory patterns are enlarged when noise-corrupted training patterns are used [1]. In linear perceptrons the generalization a...
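A one-dimensional sketch of why weight noise helps: on an invented double-well error surface, plain gradient descent started in the shallow basin stays there, while the Langevin rule's added noise lets the weight cross the barrier into the deeper basin. The surface and all parameters below are illustrative, not from the paper.

```python
import random

E = lambda w: (w * w - 1.0) ** 2 - 0.3 * w    # double well, deeper near w = +1
dE = lambda w: 4.0 * w * (w * w - 1.0) - 0.3  # its gradient

def descend(w, steps, eta, sigma, rng):
    # Langevin updating: a plain gradient step plus Gaussian weight noise.
    # sigma = 0 recovers ordinary gradient descent.
    best = E(w)
    for _ in range(steps):
        noise = sigma * rng.gauss(0.0, 1.0) if sigma else 0.0
        w = w - eta * dE(w) + noise
        best = min(best, E(w))
    return w, best

# plain gradient descent started in the shallow well stays there ...
w_gd, _ = descend(-1.0, 3000, 0.02, 0.0, random.Random(0))
# ... while the noisy trajectories also visit the deeper well, where E < 0
bests = [descend(-1.0, 3000, 0.02, 0.3, random.Random(s))[1]
         for s in range(10)]
```

In practice the noise amplitude would be annealed toward zero so that the weights finally settle; the constant-noise runs here only demonstrate the escape.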