Results 11-20 of 43
Multicolumn deep neural networks for image classification
 In Proceedings of the 25th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012)
, 2012
Abstract

Cited by 19 (5 self)
Traditional methods of computer vision and machine learning cannot match human performance on tasks such as the recognition of handwritten digits or traffic signs. Our biologically plausible deep artificial neural network architectures can. Small (often minimal) receptive fields of convolutional winner-take-all neurons yield large network depth, resulting in roughly as many sparsely connected neural layers as found in mammals between retina and visual cortex. Only winner neurons are trained. Several deep neural columns become experts on inputs preprocessed in different ways; their predictions are averaged. Graphics cards allow for fast training. On the very competitive MNIST handwriting benchmark, our method is the first to achieve near-human performance. On a traffic sign recognition benchmark it outperforms humans by a factor of two. We also improve the state-of-the-art on a plethora of common image classification benchmarks.
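The multi-column averaging the abstract describes reduces, at prediction time, to a mean over per-column class probabilities. A minimal sketch, assuming hypothetical `columns` (trained nets returning softmax vectors) and matching `preprocessors`; these stand-ins are not the paper's actual GPU-trained networks:

```python
import numpy as np

def mcdnn_predict(columns, preprocessors, image):
    """Average the softmax outputs of several independently trained
    'columns', each seeing a differently preprocessed input.
    `columns` and `preprocessors` are hypothetical stand-ins."""
    probs = [net(prep(image)) for net, prep in zip(columns, preprocessors)]
    return np.mean(probs, axis=0)  # averaged class probabilities

# Toy illustration with two fake "columns" over 3 classes:
col_a = lambda x: np.array([0.7, 0.2, 0.1])
col_b = lambda x: np.array([0.5, 0.4, 0.1])
identity = lambda x: x
avg = mcdnn_predict([col_a, col_b], [identity, identity], image=None)
print(avg)           # averaged probabilities [0.6 0.3 0.1]
print(avg.argmax())  # index 0 -> predicted class
```

Averaging the probability vectors (rather than majority-voting the argmaxes) lets a confident column outweigh uncertain ones.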
Discovering Predictable Classifications
 Neural Computation
, 1992
Abstract

Cited by 18 (9 self)
Prediction problems are among the most common learning problems for neural networks (e.g. in the context of time series prediction, control, etc.). With many such problems, however, perfect prediction is inherently impossible. For such cases we present novel unsupervised systems that learn to classify patterns such that the classifications are predictable while still being as specific as possible. The approach can be related to the IMAX method of Hinton, Becker and Zemel (1989, 1991). Experiments include Becker's and Hinton's stereo task, which can be solved more readily by our system. 1 MOTIVATION AND BASIC APPROACH Many neural net systems (e.g. for control, time series prediction, etc.) rely on adaptive submodules for learning to predict patterns from other patterns. Perfect prediction, however, is often inherently impossible. In this paper we study the problem of finding pattern classifications such that the classes are predictable, while still being as specific as possibl...
Combining Exploratory Projection Pursuit And Projection Pursuit Regression With Application To Neural Networks
 Neural Computation
, 1992
Abstract

Cited by 17 (9 self)
We present a novel classification and regression method that combines exploratory projection pursuit (unsupervised training) with projection pursuit regression (supervised training), to yield a new family of cost/complexity penalty terms. Some improved generalization properties are demonstrated on real-world problems. 1 Introduction Parameter estimation becomes difficult in high-dimensional spaces due to the increasing sparseness of the data. Therefore, when a low-dimensional representation is embedded in the data, dimensionality reduction methods become useful. One such method, projection pursuit regression (PPR) (Friedman and Stuetzle, 1981), is capable of performing dimensionality reduction by composition, namely, it constructs an approximation to the desired response function using a composition of lower-dimensional smooth functions. These functions depend on low-dimensional projections through the data. When the dimensionality of the problem is in the thousands, even projection...
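The composition PPR builds has the form f(x) = sum_k g_k(w_k . x): each term is a smooth one-dimensional "ridge" function of a learned projection. A minimal evaluation sketch with hand-picked (not fitted) directions and ridge functions, purely to illustrate the functional form:

```python
import numpy as np

def ppr_predict(x, directions, ridge_fns):
    """Projection pursuit regression form: f(x) = sum_k g_k(w_k . x),
    where each g_k is a smooth 1-D 'ridge' function of the data
    projected onto direction w_k (Friedman and Stuetzle, 1981).
    Directions and ridge functions here are illustrative, not fit."""
    return sum(g(np.dot(w, x)) for w, g in zip(directions, ridge_fns))

# Two hand-picked terms approximating a 2-D response surface:
w1, w2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
g1, g2 = np.tanh, lambda t: t ** 2
x = np.array([0.5, 2.0])
print(ppr_predict(x, [w1, w2], [g1, g2]))  # tanh(0.5) + 2.0**2
```

In the actual method, the directions w_k and smoothers g_k are fit iteratively to the residuals; only the final composition is evaluated this way.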
Discovering Problem Solutions with Low Kolmogorov Complexity and High Generalization Capability
 MACHINE LEARNING: PROCEEDINGS OF THE TWELFTH INTERNATIONAL CONFERENCE
, 1994
Abstract

Cited by 16 (8 self)
Many machine learning algorithms aim at finding "simple" rules to explain training data. The expectation is: the "simpler" the rules, the better the generalization on test data (Occam's razor). Most practical implementations, however, use measures for "simplicity" that lack the power, universality and elegance of those based on Kolmogorov complexity and Solomonoff's algorithmic probability. Likewise, most previous approaches (especially those of the "Bayesian" kind) suffer from the problem of choosing appropriate priors. This paper addresses both issues. It first reviews some basic concepts of algorithmic complexity theory relevant to machine learning, and how the Solomonoff-Levin distribution (or universal prior) deals with the prior problem. The universal prior leads to a probabilistic method for finding "algorithmically simple" problem solutions with high generalization capability. The method is based on Levin complexity (a time-bounded generalization of Kolmogorov complexity) and...
Neural Networks in Business: Techniques and Applications for the Operations Researcher
, 2000
Abstract

Cited by 14 (0 self)
This paper presents an overview of the different types of neural network models which are applicable when solving business problems. The history of neural networks in business is outlined, leading to a discussion of the current applications in business including data mining, as well as the current research directions. The role of neural networks as a modern operations research tool is discussed. Scope and purpose Neural networks are becoming increasingly popular in business. Many organisations are investing in neural network and data mining solutions to problems which have traditionally fallen under the responsibility of operations research. This article provides an overview for the operations research reader of the basic neural network techniques, as well as their historical and current use in business. The paper is intended as an introductory article for the remainder of this special issue on neural networks in business. © 2000 Elsevier Science Ltd. All rights reserved. Keywords: N...
Artificial neural networks as models of stimulus control
, 2006
Abstract

Cited by 11 (4 self)
We evaluate the ability of artificial neural network models (multilayer perceptrons) to predict stimulus–response relationships. A variety of empirical results are considered, such as generalization, peak-shift (supernormality) and stimulus intensity effects. The networks were trained on the same tasks as the animals in the considered experiments. The subsequent generalization tests on the networks showed that the model correctly replicates the empirical results. It is concluded that these models are valuable tools in the study of animal behaviour.
Learning Algorithms for Networks with Internal and External Feedback
 In D. S. Touretzky, J. L. Elman, T. J. Sejnowski, G. E. Hinton (eds.), Proc. of the Connectionist Models Summer School, pages 52-61. San Mateo, CA: Morgan Kaufmann, 1990.
, 1990
Abstract

Cited by 10 (8 self)
This paper gives an overview of some novel algorithms for reinforcement learning in nonstationary, possibly reactive environments. I have decided to describe many ideas briefly rather than going into great detail on any one idea. The paper is structured as follows: In the first section some terminology is introduced. Then there follow five sections, each headed by a short abstract. The second section describes the entirely local `neural bucket brigade algorithm'. The third section applies Sutton's TD methods to fully recurrent continually running probabilistic networks. The fourth section describes an algorithm based on system identification and on two interacting fully recurrent `self-supervised' learning networks. The fifth section describes an application of adaptive control techniques to adaptive attentive vision: It demonstrates how `selective attention' can be learned. Finally, the sixth section criticizes methods based on system identification and adaptive critics, and describes ...
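The TD methods mentioned in the abstract are, in their simplest tabular form, the TD(0) value update V(s) += alpha * (r + gamma * V(s') - V(s)). A toy sketch of that core update follows; the paper applies TD to fully recurrent probabilistic networks, which this tabular version does not attempt to show:

```python
def td0(episodes, alpha=0.1, gamma=0.9):
    """Tabular TD(0) value estimation. `episodes` is a list of
    trajectories, each a list of (state, reward) pairs where the
    reward is received on the transition out of that state."""
    V = {}
    for ep in episodes:
        for (s, r), (s_next, _) in zip(ep, ep[1:]):
            V.setdefault(s, 0.0)
            V.setdefault(s_next, 0.0)
            # Move V(s) toward the bootstrapped target r + gamma*V(s'):
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
    return V

# Chain A -> B -> end, with reward 1 on the final transition:
eps = [[("A", 0.0), ("B", 1.0), ("end", 0.0)]] * 100
V = td0(eps)
print(V["B"] > V["A"] > 0)  # True: B is closer to the reward
```

After repeated episodes V("B") approaches 1 and V("A") approaches gamma * V("B"), illustrating how credit propagates backward one step per update.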
Continuous History Compression
 Proc. of Intl. Workshop on Neural Networks, RWTH Aachen
, 1993
Abstract

Cited by 9 (6 self)
Neural networks have proven poor at learning the structure in complex and extended temporal sequences in which contingencies among elements can span long time lags. The principle of history compression [18] provides a means of transforming long sequences with redundant information into equivalent shorter sequences; the shorter sequences are more easily manipulated and learned by neural networks. The principle states that expected sequence elements can be removed from the sequence to form an equivalent, more compact sequence without loss of information. The principle was embodied in a neural net predictive architecture that attempted to anticipate the next element of a sequence given the previous elements. If the prediction was accurate, the next element was discarded; otherwise, it was passed on to a second network that processed the sequence in some fashion (e.g., recognition, classification, autoencoding, etc.). As originally proposed, a binary judgement was made as to the predictabi...
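The removal rule the principle describes can be sketched directly: keep only the elements a predictor fails to anticipate, together with their positions so the original sequence stays recoverable. The predictor below ("repeat the previous element") is a deliberately trivial stand-in for the paper's neural predictor:

```python
def compress(seq, predictor):
    """History compression sketch: drop elements the predictor already
    expects; pass only the unexpected ones (with positions) to the
    next level. `predictor` maps the prefix seen so far to a guess
    for the next element."""
    kept = []
    for i, x in enumerate(seq):
        if predictor(seq[:i]) != x:
            kept.append((i, x))  # unexpected -> deserves attention
    return kept

# "Repeat the previous element" as a trivial predictor:
prev = lambda prefix: prefix[-1] if prefix else None
print(compress("aaabbbac", prev))
# [(0, 'a'), (3, 'b'), (6, 'a'), (7, 'c')]
```

The binary keep/drop judgement here is exactly what the continuous variant in this paper relaxes, replacing the hard threshold with a graded measure of predictability.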
Neural Sequence Chunkers
, 1991
Abstract

Cited by 8 (5 self)
This paper addresses the problem of learning to `divide and conquer' by meaningful hierarchical adaptive decomposition of temporal sequences. This problem is relevant for time-series analysis as well as for goal-directed learning, particularly if event sequences tend to have hierarchical temporal structure. The first neural systems for recursively chunking sequences are described. These systems are based on a principle called the `principle of history compression'. This principle essentially says: As long as a predictor is able to predict future environmental inputs from previous ones, no additional knowledge can be obtained by observing these inputs in reality. Only unexpected inputs deserve attention. A focus is on a class of 2-network systems which try to collapse a self-organizing (possibly multilevel) hierarchy of temporal predictors into a single recurrent network. Only those input events that were not expected by the first recurrent net are transferred to the second recurrent ...
Adaptive State Representation and Estimation Using Recurrent Connectionist Networks
 In Miller, Sutton, Werbos (eds.), Neural Networks for Control
, 1990
Abstract

Cited by 8 (1 self)
Introduction The purpose of this chapter is to provide an introductory overview of some of the current research efforts directed toward adapting the weights in connectionist networks having feedback connections. While much of the recent emphasis in the field has been on multilayer networks having no such feedback connections, it is likely that the use of recurrently connected networks will be of particular importance for applications to the control of dynamical systems. Following the approach taken in the previous chapter by Andy Barto, this chapter will emphasize the relationship of connectionist research in this area to strategies used in more conventional engineering circles for modelling and controlling dynamical systems, while at the same time noting what there is in the connectionist approach that is novel. In particular, I will argue that while much of the connectionist approach to adapting the weights in recurrent networks having interesting dynamics rests on the same