The Helmholtz Machine
1995
Cited by 194 (22 self)
Discovering the structure inherent in a set of patterns is a fundamental aim of statistical inference or learning. One fruitful approach is to build a parameterized stochastic generative model, independent draws from which are likely to produce the patterns. For all but the simplest generative models, each pattern can be generated in exponentially many ways. It is thus intractable to adjust the parameters to maximize the probability of the observed patterns. We describe a way of finessing this combinatorial explosion by maximizing an easily computed lower bound on the probability of the observations. Our method can be viewed as a form of hierarchical self-supervised learning that may relate to the function of bottom-up and top-down cortical processing pathways.
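The lower bound mentioned in this abstract can be illustrated on a toy model. The sketch below assumes the standard variational (Jensen) bound, log p(d) >= sum_h Q(h) log[p(h, d) / Q(h)], which the abstract does not spell out; the two-state model, its probabilities, and all function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy generative model (not from the paper):
# one binary hidden cause h and one binary observation d.
P_H = {0: 0.7, 1: 0.3}                      # prior p(h)
P_D_GIVEN_H = {0: {0: 0.9, 1: 0.1},         # p(d | h=0)
               1: {0: 0.2, 1: 0.8}}         # p(d | h=1)


def log_evidence(d):
    """Exact log p(d), obtained by summing over every hidden explanation."""
    return math.log(sum(P_H[h] * P_D_GIVEN_H[h][d] for h in P_H))


def lower_bound(d, q1):
    """Variational bound sum_h Q(h) log[p(h, d) / Q(h)] with Q(h=1) = q1."""
    q = {0: 1.0 - q1, 1: q1}
    total = 0.0
    for h, qh in q.items():
        if qh > 0.0:  # terms with Q(h) = 0 contribute nothing
            total += qh * math.log(P_H[h] * P_D_GIVEN_H[h][d] / qh)
    return total


def posterior_h1(d):
    """Exact posterior p(h=1 | d); the bound is tight when Q equals it."""
    return P_H[1] * P_D_GIVEN_H[1][d] / math.exp(log_evidence(d))


if __name__ == "__main__":
    d = 1
    for q1 in (0.1, 0.5, 0.9, posterior_h1(d)):
        print(f"Q(h=1)={q1:.3f}  bound={lower_bound(d, q1):.4f}  "
              f"log p(d)={log_evidence(d):.4f}")
```

With a single hidden unit the exact sum is trivial; the bound matters when the number of hidden explanations grows exponentially and only the bound remains cheap to evaluate.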
On the Unification of Line Processes, Outlier Rejection, and Robust Statistics with Applications in Early Vision
1996
Cited by 190 (8 self)
The modeling of spatial discontinuities for problems such as surface recovery, segmentation, image reconstruction, and optical flow has been intensely studied in computer vision. While "line-process" models of discontinuities have received a great deal of attention, there has been recent interest in the use of robust statistical techniques to account for discontinuities. This paper unifies the two approaches. To achieve this we generalize the notion of a "line process" to that of an analog "outlier process" and show how a problem formulated in terms of outlier processes can be viewed in terms of robust statistics. We also characterize a class of robust statistical problems for which an equivalent outlier-process formulation exists and give a straightforward method for converting a robust estimation problem into an outlier-process formulation. We show how prior assumptions about the spatial structure of outliers can be expressed as constraints on the recovered analog outlier processes and how traditional continuation methods can be extended to the explicit outlier-process formulation. These results indicate that the outlier-process approach provides a general framework which subsumes the traditional line-process approaches as well as a wide class of robust estimation problems. Examples in surface reconstruction, image segmentation, and optical flow are presented to illustrate the use of outlier processes and to show how the relationship between outlier processes and robust statistics can be exploited. An appendix provides a catalog of common robust error norms and their equivalent outlier-process formulations.
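The equivalence described in this abstract can be checked numerically on two well-known cases. The sketch below is illustrative only: the choice of the Geman-McClure norm with unit scale, the penalty (sqrt(z) - 1)^2, and every function name are assumptions made here, not details taken from the abstract. It shows that minimizing a weighted quadratic plus a penalty over an analog outlier variable z recovers a robust norm, and that minimizing over a binary line variable recovers the truncated quadratic.

```python
import math


def rho_geman_mcclure(x):
    """Robust error norm x^2 / (1 + x^2), scale fixed to 1 for simplicity."""
    return x * x / (1.0 + x * x)


def outlier_objective(x, z):
    """Quadratic error weighted by an analog outlier variable z in (0, 1],
    plus the penalty (sqrt(z) - 1)^2 charged for down-weighting the residual."""
    return x * x * z + (math.sqrt(z) - 1.0) ** 2


def best_outlier_weight(x):
    """Closed-form minimizer of outlier_objective over z for a fixed residual x."""
    return 1.0 / (1.0 + x * x) ** 2


def rho_truncated_quadratic(x, alpha):
    """Robust norm implied by a binary line process with break cost alpha."""
    return min(x * x, alpha)


def line_process_objective(x, line, alpha):
    """Classic line-process energy: line = 1 switches smoothing off at cost alpha."""
    return x * x * (1 - line) + alpha * line


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 3.0):
        z = best_outlier_weight(x)
        print(x, rho_geman_mcclure(x), outlier_objective(x, z),
              min(line_process_objective(x, l, 1.0) for l in (0, 1)))
```

Large residuals drive z toward 0, so the analog variable plays the role a binary line variable plays in the traditional formulation.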
Gradient calculation for dynamic recurrent neural networks: a survey
IEEE Transactions on Neural Networks, 1995
Cited by 135 (3 self)
We survey learning algorithms for recurrent neural networks with hidden units, and put the various techniques into a common framework. We discuss fixed-point learning algorithms, namely recurrent backpropagation and deterministic Boltzmann Machines, and non-fixed-point algorithms, namely backpropagation through time, Elman's history cutoff, and Jordan's output feedback architecture. Forward propagation, an online technique that uses adjoint equations, and variations thereof, are also discussed. In many cases, the unified presentation leads to generalizations of various sorts. We discuss advantages and disadvantages of temporally continuous neural networks in contrast to clocked ones, and continue with some "tricks of the trade" for training, using, and simulating continuous-time and recurrent neural networks. We present some simulations and, at the end, address issues of computational complexity and learning speed.
Optimal perceptual inference
In CVPR, Washington DC, 1983
Cited by 91 (14 self)
When a vision system creates an interpretation of some input data, it assigns truth values or probabilities to internal hypotheses about the world. We present a nondeterministic method for assigning truth values that avoids many of the problems encountered by existing relaxation methods. Instead of representing probabilities with real numbers, we use a more direct encoding in which the probability associated with a hypothesis is represented by the probability that it is in one of two states, true or false. We give a particular nondeterministic operator, based on statistical mechanics, for updating the truth values of hypotheses. The operator ensures that the probability of discovering a particular combination of hypotheses is a simple function of how good that combination is. We show that there is a simple relationship between this operator and Bayesian inference, and we describe a learning rule which allows a parallel system to converge on a set of weights that optimizes its perceptual inferences.
Introduction
One way of interpreting images is to formulate hypotheses about parts or aspects of the image and then decide which of these hypotheses are likely to be correct. The probability that each hypothesis is correct is determined partly by its fit to the image and partly by its fit to other hypotheses that are taken to be correct, so the truth value of an individual hypothesis cannot be decided in isolation. One method of searching for the most plausible combination of hypotheses is to use a relaxation process in which a probability is associated with each hypothesis, and the probabilities are then iteratively modified on the basis of the fit to the image and the known relationships between hypotheses.
An attractive property of relaxation methods is that they can be implemented in parallel hardware where one computational unit is used for each possible hypothesis, and the interactions between hypotheses are implemented by direct hardware connections between the units. Many variations of the basic relaxation idea have been [...] However, all the current methods suffer from one or more of the following problems:
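The nondeterministic update operator described in this abstract can be sketched as follows. The abstract does not give the formula, so the code assumes the standard statistical-mechanics rule in which a hypothesis is set to true with probability 1 / (1 + exp(-gap / T)), where gap is the energy difference between its false and true states. The three-hypothesis network, its weights, biases, and all names are hypothetical. The check at the end confirms that this rule agrees with the conditional probabilities of a Boltzmann distribution, under which the probability of a combination depends only on its energy.

```python
import itertools
import math

# Hypothetical symmetric weights and biases for three binary hypotheses.
W = [[0.0, 1.5, -2.0],
     [1.5, 0.0, 0.5],
     [-2.0, 0.5, 0.0]]
B = [0.2, -0.4, 0.1]
T = 1.0  # temperature


def energy(s):
    """Energy of a combination of truth values; lower means a better fit."""
    pair = sum(W[i][j] * s[i] * s[j]
               for i in range(len(s)) for j in range(i + 1, len(s)))
    bias = sum(B[i] * s[i] for i in range(len(s)))
    return -(pair + bias)


def p_true(s, k, temperature=T):
    """Probability of setting hypothesis k to true given the other hypotheses."""
    gap = sum(W[k][j] * s[j] for j in range(len(s)) if j != k) + B[k]
    return 1.0 / (1.0 + math.exp(-gap / temperature))


def boltzmann_distribution(temperature=T):
    """Exact equilibrium distribution: P(s) proportional to exp(-E(s) / T)."""
    states = list(itertools.product((0, 1), repeat=len(B)))
    weights = [math.exp(-energy(s) / temperature) for s in states]
    z = sum(weights)
    return {s: w / z for s, w in zip(states, weights)}


if __name__ == "__main__":
    dist = boltzmann_distribution()
    s = (1, 0, 1)
    k = 1
    on, off = (1, 1, 1), (1, 0, 1)
    print(p_true(s, k), dist[on] / (dist[on] + dist[off]))
```

Because each local update samples from the exact conditional, repeated updates leave the Boltzmann distribution unchanged, which is one way to read the abstract's link to Bayesian inference.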
On Discontinuity-Adaptive Smoothness Priors in Computer Vision
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995
Cited by 30 (5 self)
A variety of analytic and probabilistic models in connection to Markov random fields (MRFs) have been proposed in the last decade for solving low-level vision problems involving discontinuities. This paper presents a systematic study on these models and defines a general discontinuity adaptive (DA) MRF model. By analyzing the Euler equation associated with the energy minimization, it shows that the fundamental difference between different models lies in the behavior of interaction between neighboring points, which is determined by the a priori smoothness constraint encoded into the energy function. An important necessary condition is derived for the interaction to be adaptive to discontinuities to avoid oversmoothing. This forms the basis on which a class of adaptive interaction functions (AIFs) is defined. The DA model is defined in terms of the Euler equation constrained by this class of AIFs. Its solution is C^1 continuous and allows arbitrarily large but bounded slopes in dealing...
Energy Functions for Early Vision and Analog Networks.
Biological Cybernetics, 1987
Cited by 23 (2 self)
This paper describes attempts to model the modules of early vision in terms of minimizing energy functions, in particular energy functions allowing discontinuities in the solution. It examines the success of using Hopfield-style analog networks for solving such problems. Finally it discusses the limitations of the energy function approach.
Automatic Creation of Boundary-Representation Models from Single Line Drawings
2002
Cited by 17 (11 self)
This thesis presents methods for the automatic creation of boundary-representation models of polyhedral objects from single line drawings depicting the objects. This topic is important in that automated interpretation of freehand sketches would remove a bottleneck in current engineering design methods. The thesis does not consider conversion of freehand sketches to line drawings or methods which require manual intervention or multiple drawings. The thesis contains a number of...
The least-disturbance principle and weak constraints
1983
Cited by 15 (1 self)
Certain problems, notably in computer vision, involve adjusting a set of real-valued labels to satisfy certain constraints. They can be formulated as optimisation problems, using the 'least-disturbance' principle: the minimal alteration is made to the labels that will achieve a consistent labelling. Under certain linear constraints, the solution can be achieved iteratively and in parallel, by hill-climbing. However, where 'weak' constraints are imposed on the labels (constraints that may be broken at a cost), the optimisation problem becomes nonconvex; a continuous search for the solution is no longer satisfactory. A strategy is proposed for this case, by construction of convex envelopes and by the use of 'graduated' nonconvexity.