Good Error-Correcting Codes Based on Very Sparse Matrices
, 1999
Cited by 513 (25 self)
We study two families of error-correcting codes defined in terms of very sparse matrices. "MN" (MacKay-Neal) codes are recently invented, and "Gallager codes" were first investigated in 1962, but appear to have been largely forgotten, in spite of their excellent properties. The decoding of both codes can be tackled with a practical sum-product algorithm. We prove that these codes are "very good," in that sequences of codes exist which, when optimally decoded, achieve information rates up to the Shannon limit. This result holds not only for the binary-symmetric channel but also for any channel with symmetric stationary ergodic noise. We give experimental results for binary-symmetric channels and Gaussian channels demonstrating that practical performance substantially better than that of standard convolutional and concatenated codes can be achieved; indeed, the performance of Gallager codes is almost as close to the Shannon limit as that of turbo codes.
Connectionist Learning Procedures
 ARTIFICIAL INTELLIGENCE
, 1989
Cited by 339 (6 self)
A major goal of research on networks of neuron-like processing units is to discover efficient learning procedures that allow these networks to construct complex internal representations of their environment. The learning procedures must be capable of modifying the connection strengths in such a way that internal units which are not part of the input or output come to represent important features of the task domain. Several interesting gradient-descent procedures have recently been discovered. Each connection computes the derivative, with respect to the connection strength, of a global measure of the error in the performance of the network. The strength is then adjusted in the direction that decreases the error. These relatively simple, gradient-descent learning procedures work well for small tasks and the new challenge is to find ways of improving their convergence rate and their generalization abilities so that they can be applied to larger, more realistic tasks.
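The update rule described in this abstract (compute the derivative of the error with respect to a connection strength, then move the strength in the direction that decreases the error) can be illustrated with a minimal sketch. It assumes a single linear unit with one weight and one bias and a squared-error measure, so it has no internal (hidden) units and is far simpler than the procedures the paper surveys. The names `train`, `lr`, and `samples` are illustrative and not taken from the paper.

```python
def train(samples, lr=0.1, epochs=500):
    """Fit y = w*x + b by per-sample gradient descent on E = 0.5*(y - t)**2."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, t in samples:
            y = w * x + b      # output of the single linear unit
            err = y - t        # dE/dy for the squared-error measure
            w -= lr * err * x  # dE/dw = err * x; step against the gradient
            b -= lr * err      # dE/db = err
    return w, b

# Toy data generated by t = 2*x + 1 (an assumption made for this sketch).
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train(samples)
print(w, b)
```

On this toy data the learned `w` and `b` approach 2 and 1. A network with hidden units would apply the same rule to every connection, obtaining the derivatives by the chain rule.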
On Evolution, Search, Optimization, Genetic Algorithms and Martial Arts: Towards Memetic Algorithms
, 1989
Cited by 186 (10 self)
Short abstract, isn't it? P.A.C.S. numbers 05.20, 02.50, 87.10 1 Introduction Large Numbers "...the optimal tour displayed (see Figure 6) is the possible unique tour having one arc fixed from among 10^655 tours that are possible among 318 points and have one arc fixed. Assuming that one could possibly enumerate 10^9 tours per second on a computer it would thus take roughly 10^639 years of computing to establish the optimality of this tour by exhaustive enumeration." This quote shows the real difficulty of a combinatorial optimization problem. The huge number of configurations is the primary difficulty when dealing with one of these problems. The quote is from M. W. Padberg and M. Grötschel, Chap. 9, "Polyhedral computations", in the book The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization [124]. It is interesting to compare the number of configurations of real-world problems in combinatorial optimization with those large numbers arising in Cosmol...
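The arithmetic in the quoted passage can be checked in a few lines. The sketch below reads the quoted figures as powers of ten (10^655 tours, 10^9 tours per second) and assumes a year of 365.25 days. It works with base-10 logarithms because the numbers are far too large for floating point. The variable names are illustrative.

```python
import math

LOG10_TOURS = 655                      # log10 of the number of candidate tours
LOG10_RATE = 9                         # log10 of tours enumerated per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed length of a year

# years = tours / rate / seconds_per_year, computed in log10 space
log10_years = LOG10_TOURS - LOG10_RATE - math.log10(SECONDS_PER_YEAR)
print(f"about 10^{log10_years:.1f} years")
```

The exponent comes out near 638.5, in line with the quote's "roughly 10^639 years".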
Bidirectional Associative Memories
 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS
, 1988
Cited by 155 (3 self)
Stability and encoding properties of two-layer nonlinear feedback neural networks are examined. Bidirectionality, forward and backward information flow, is introduced in neural nets to produce two-way associative search for stored associations (A_i, B_i). Passing information through M gives one direction; passing it through its transpose M^T gives the other. A bidirectional associative memory (BAM) behaves as a heteroassociative content addressable memory (CAM), storing and recalling the vector pairs (A_1, B_1), ..., (A_m, B_m), where A_i ∈ {0,1}^n and B_i ∈ {0,1}^p. We prove that every n-by-p matrix M is a bidirectionally stable heteroassociative CAM for both binary/bipolar and continuous neurons a_i and b_j. When the BAM neurons are activated, the network quickly evolves to a stable state of two-pattern reverberation, or resonance. The stable reverberation corresponds to a system energy local minimum. Heteroassociative information is encoded in a BAM by summing correlation matrices. The BAM storage capacity for reliable recall is roughly m < min(n, p). No more heteroassociative pairs can be reliably stored and recalled than the lesser of the dimensions of the pattern spaces {0,1}^n and {0,1}^p. The Appendix shows that it is better on average to use bipolar {-1,1} coding than binary {0,1} coding of heteroassociative pairs (A_i, B_i). BAM encoding and decoding are combined in the adaptive BAM, which extends global bidirectional stability to real-time unsupervised learning. Temporal patterns (A_1, ..., A_m) are represented as ordered lists of binary/bipolar vectors and stored in a temporal associative memory (TAM) n-by-n matrix M as a limit cycle of the dynamical system. Forward recall proceeds through M, backward recall through M^T. Temporal patterns are stored by summing contiguous bipolar...
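The encoding and recall scheme this abstract describes (summing correlation matrices of bipolar pairs, then passing a pattern forward through M and backward through its transpose until the state stops changing) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's implementation: the helper names `encode`, `threshold`, and `recall`, the rule that a unit with zero net input keeps its previous state, the all-ones starting state for the B layer, and the two stored pairs are all chosen for this example.

```python
def encode(pairs):
    """Sum the correlation (outer-product) matrices of bipolar pairs (A_i, B_i)."""
    n, p = len(pairs[0][0]), len(pairs[0][1])
    M = [[0] * p for _ in range(n)]
    for a, b in pairs:
        for i in range(n):
            for j in range(p):
                M[i][j] += a[i] * b[j]
    return M

def threshold(net, previous):
    """Bipolar threshold; a unit with zero net input keeps its previous state."""
    return [1 if x > 0 else -1 if x < 0 else s for x, s in zip(net, previous)]

def recall(a, M, max_iters=20):
    """Alternate forward (through M) and backward (through M^T) passes until stable."""
    n, p = len(M), len(M[0])
    b = [1] * p  # arbitrary starting state for the B layer (an assumption)
    for _ in range(max_iters):
        new_b = threshold([sum(a[i] * M[i][j] for i in range(n)) for j in range(p)], b)
        new_a = threshold([sum(M[i][j] * new_b[j] for j in range(p)) for i in range(n)], a)
        if new_a == a and new_b == b:
            break  # two-pattern state no longer changes
        a, b = new_a, new_b
    return a, b

# Two illustrative bipolar pairs with n = 6 and p = 4, so m = 2 < min(n, p).
A1, B1 = [1, -1, 1, -1, 1, -1], [1, 1, -1, -1]
A2, B2 = [1, 1, 1, -1, -1, -1], [1, -1, 1, -1]
M = encode([(A1, B1), (A2, B2)])

noisy_A1 = [-1, -1, 1, -1, 1, -1]  # A1 with its first component flipped
print(recall(noisy_A1, M))
```

In this small case the corrupted input settles on the stored pair (A1, B1) after one forward and one backward pass.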
Deep Dyslexia: A Case Study of Connectionist Neuropsychology
, 1993
Cited by 138 (27 self)
Deep dyslexia is an acquired reading disorder marked by the occurrence of semantic errors (e.g., reading RIVER as "ocean"). In addition, patients exhibit a number of other symptoms, including visual and morphological effects in their errors, a part-of-speech effect, and an advantage for concrete over abstract words. Deep dyslexia poses a distinct challenge for cognitive neuropsychology because there is little understanding of why such a variety of symptoms should co-occur in virtually all known patients. Hinton and Shallice (1991) replicated the co-occurrence of visual and semantic errors by lesioning a recurrent connectionist network trained to map from orthography to semantics. While the success of their simulations is encouraging, there is little understanding of what underlying principles are responsible for them. In this paper we evaluate and, where possible, improve on the most important design decisions made by Hinton and Shallice, relating to the task, the network architecture, the training procedure, and the testing procedure. We identify four properties of networks that underlie their ability to reproduce the deep dyslexic symptom-complex: distributed orthographic and semantic representations, gradient descent learning, attractors for word meanings, and greater richness of concrete vs. abstract semantics. The first three of these are general connectionist principles and the last is based on earlier theorizing. Taken together, the results demonstrate the usefulness of a connectionist approach to understanding deep dyslexia in particular, and the viability of connectionist neuropsychology in general.
Gradient calculation for dynamic recurrent neural networks: a survey
 IEEE Transactions on Neural Networks
, 1995
Cited by 135 (3 self)
We survey learning algorithms for recurrent neural networks with hidden units, and put the various techniques into a common framework. We discuss fixed-point learning algorithms, namely recurrent backpropagation and deterministic Boltzmann Machines, and non-fixed-point algorithms, namely backpropagation through time, Elman's history cutoff, and Jordan's output feedback architecture. Forward propagation, an on-line technique that uses adjoint equations, and variations thereof, are also discussed. In many cases, the unified presentation leads to generalizations of various sorts. We discuss advantages and disadvantages of temporally continuous neural networks in contrast to clocked ones, continue with some "tricks of the trade" for training, using, and simulating continuous time and recurrent neural networks. We present some simulations, and at the end, address issues of computational complexity and learning speed.
Large-Step Markov Chains for the Traveling Salesman Problem
 Complex Systems
, 1991
Cited by 92 (6 self)
We introduce a new class of Markov chain Monte Carlo search procedures, leading to more powerful optimization methods than simulated annealing. The main idea is to embed deterministic local search techniques into stochastic algorithms. The Monte Carlo explores only local optima, and it is able to make large, global changes, even at low temperatures, thus overcoming large barriers in configuration space. We test these procedures in the case of the Traveling Salesman Problem. The embedded local searches we use are 3-opt and Lin-Kernighan. The large change or step consists of a special kind of 4-change followed by local-opt minimization. We test this algorithm on a number of instances. The power of the method is illustrated by solving to optimality some large problems such as the LIN318, the AT&T532, and the RAT783 problems. For even larger instances with randomly distributed cities, the Markov chain procedure improves 3-opt by over 1.6%, and Lin-Kernighan by 1.3%, leading to a new best h...
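The structure this abstract describes (a large random change, followed by deterministic local search, followed by an accept/reject decision between local optima) can be sketched in a few dozen lines. Several choices here are assumptions and not details confirmed by the abstract: the large step is a double-bridge move, one common kind of 4-change; the embedded local search is plain 2-opt, swapped in for the paper's 3-opt and Lin-Kernighan to keep the sketch short; and acceptance uses a Metropolis rule with a hypothetical `temperature` parameter. All function names are illustrative.

```python
import math
import random

def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))

def two_opt(tour, pts):
    """Local search: apply improving 2-opt segment reversals until none remain."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city
                a, b = pts[tour[i]], pts[tour[i + 1]]
                c, d = pts[tour[j]], pts[tour[(j + 1) % n]]
                delta = math.dist(a, c) + math.dist(b, d) - math.dist(a, b) - math.dist(c, d)
                if delta < -1e-12:
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour

def double_bridge(tour, rng):
    """Large step: cut the tour at three points and swap the two middle segments."""
    i, j, k = sorted(rng.sample(range(1, len(tour)), 3))
    return tour[:i] + tour[j:k] + tour[i:j] + tour[k:]

def large_step_search(pts, steps=200, temperature=0.0, seed=0):
    """Markov chain over local optima: kick, re-optimize, then accept or reject."""
    rng = random.Random(seed)
    current = two_opt(list(range(len(pts))), pts)
    cur_len = tour_length(current, pts)
    best, best_len = current[:], cur_len
    for _ in range(steps):
        cand = two_opt(double_bridge(current, rng), pts)
        cand_len = tour_length(cand, pts)
        delta = cand_len - cur_len
        # Metropolis rule (assumed): always accept improvements, sometimes accept worse tours.
        if delta <= 0 or (temperature > 0 and rng.random() < math.exp(-delta / temperature)):
            current, cur_len = cand, cand_len
            if cur_len < best_len:
                best, best_len = current[:], cur_len
    return best, best_len

# Twelve random cities in the unit square (toy data for this sketch).
point_rng = random.Random(1)
pts = [(point_rng.random(), point_rng.random()) for _ in range(12)]
best, best_len = large_step_search(pts)
print(best, round(best_len, 4))
```

With `temperature=0.0` the chain only ever moves to a local optimum that is no longer than the current one, which matches the low-temperature regime the abstract mentions; a positive temperature lets it occasionally accept a longer tour.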
Cooperative Robust Estimation Using Layers of Support
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1991
Cited by 86 (5 self)
We present an approach to the problem of representing images that contain multiple objects or surfaces. Rather than use an edge-based approach to represent the segmentation of a scene, we propose a multi-layer estimation framework which uses support maps to represent the segmentation of the image into homogeneous chunks. This support-based approach can represent objects that are split into disjoint regions, or have surfaces that are transparently interleaved. Our framework is based on an extension of robust estimation methods which provide a theoretical basis for support-based estimation. The Minimum Description Length principle is used to decide how many support maps to use in describing a particular image. We show results applying this framework to heterogeneous interpolation and segmentation tasks on range and motion imagery. 1 Introduction Real-world perceptual systems must deal with complicated and cluttered environments. To succeed in such environments, a system must be able to r...