Results 1 -
2 of
2
Non Linear Neurons in the Low Noise Limit: A Factorial Code Maximizes Information Transfer
, 1994
"... We investigate the consequences of maximizing information transfer in a simple neural network (one input layer, one output layer), focussing on the case of non linear transfer functions. We assume that both receptive fields (synaptic efficacies) and transfer functions can be adapted to the environm ..."
Abstract
-
Cited by 129 (17 self)
- Add to MetaCart
We investigate the consequences of maximizing information transfer in a simple neural network (one input layer, one output layer), focussing on the case of non linear transfer functions. We assume that both receptive fields (synaptic efficacies) and transfer functions can be adapted to the environment. The main result is that, for bounded and invertible transfer functions, in the case of a vanishing additive output noise, and no input noise, maximization of information (Linsker'sinfomax principle) leads to a factorial code - hence to the same solution as required by the redundancy reduction principle of Barlow. We show also that this result is valid for linear, more generally unbounded, transfer functions, provided optimization is performed under an additive constraint, that is which can be written as a sum of terms, each one being specific to one output neuron. Finally we study the effect of a non zero input noise. We find that, at first order in the input noise, assumed to be small ...
Information Processing by a Perceptron in an Unsupervised Learning Task
, 1993
"... We study the ability of a simple neural network (a perceptron architecture, no hidden units, binary outputs) to process information in the context of an unsupervised learning task. The network is asked to provide the best possible neural representation of a given input distribution, according to som ..."
Abstract
-
Cited by 14 (8 self)
- Add to MetaCart
We study the ability of a simple neural network (a perceptron architecture, no hidden units, binary outputs) to process information in the context of an unsupervised learning task. The network is asked to provide the best possible neural representation of a given input distribution, according to some criterion taken from Information Theory. We compare various optimization criteria that have been proposed : maximum information transmission, minimum redundancy and closeness to factorial code. We show that for the perceptron one can compute the maximal information that the code (the output neural representation) can convey about the input. We show that one can use Statistical Mechanics techniques, such as the replica techniques, to compute the typical mutual information between input and output distributions. More precisely, for a Gaussian input source with a given correlation matrix, we compute the typical mutual information when the couplings are chosen randomly. We determine the correl...

