Results 1-10 of 310
A taxonomy and evaluation of dense two-frame stereo correspondence algorithms
 International Journal of Computer Vision
, 2002
Cited by 1412 (22 self)
Abstract. Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods. Our taxonomy is designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms. We have also produced several new multi-frame stereo data sets with ground truth and are making both the code and data sets available on the Web. Finally, we include a comparative evaluation of a large set of today’s best-performing stereo algorithms.
Parallel Networks that Learn to Pronounce English Text
 COMPLEX SYSTEMS
, 1987
Cited by 513 (5 self)
This paper describes NETtalk, a class of massively-parallel network systems that learn to convert English text to speech. The memory representations for pronunciations are learned by practice and are shared among many processing units. The performance of NETtalk has some similarities with observed human performance. (i) The learning follows a power law. (ii) The more words the network learns, the better it is at generalizing and correctly pronouncing new words. (iii) The performance of the network degrades very slowly as connections in the network are damaged: no single link or processing unit is essential. (iv) Relearning after damage is much faster than learning during the original training. (v) Distributed or spaced practice is more effective for long-term retention than massed practice. Network models can be constructed that have the same performance and learning characteristics on a particular task, but differ completely at the levels of synaptic strengths and single-unit responses. However, hierarchical clustering techniques applied to NETtalk reveal that these different networks have similar internal representations of letter-to-sound correspondences within groups of processing units. This suggests that invariant internal representations may be found in assemblies of neurons intermediate in size between highly localized and completely distributed representations.
Analogical Mapping by Constraint Satisfaction
 COGNITIVE SCIENCE 13, 295 (1989)
, 1989
Cited by 366 (28 self)
A theory of analogical mapping between source and target analogs based upon interacting structural, semantic, and pragmatic constraints is proposed here. The structural constraint of isomorphism encourages mappings that maximize the consistency of relational correspondences between the elements of the two analogs. The constraint of semantic similarity supports mapping hypotheses to the degree that mapped predicates have similar meanings. The constraint of pragmatic centrality favors mappings involving elements the analogist believes to be important in order to achieve the purpose for which the analogy is being used. The theory is implemented in a computer program called ACME (Analogical Constraint Mapping Engine), which represents constraints by means of a network of supporting and competing hypotheses regarding what elements to map. A cooperative algorithm for parallel constraint satisfaction identifies mapping hypotheses that collectively represent the overall mapping that best fits the interacting constraints. ACME has been applied to a wide range of examples that include problem analogies, analogical arguments, explanatory analogies, story analogies, formal analogies, and metaphors. ACME is sensitive to semantic and pragmatic information if it is available, and yet able to compute mappings between formally isomorphic analogs without any similar or identical elements. The theory is able to account for empirical findings regarding the impact of consistency and similarity on human processing of analogies.
Stereo matching using belief propagation
, 2003
Cited by 319 (3 self)
In this paper, we formulate the stereo matching problem as a Markov network and solve it using Bayesian belief propagation. The stereo Markov network consists of three coupled Markov random fields that model the following: a smooth field for depth/disparity, a line process for depth discontinuity, and a binary process for occlusion. After eliminating the line process and the binary process by introducing two robust functions, we apply the belief propagation algorithm to obtain the maximum a posteriori (MAP) estimation in the Markov network. Other low-level visual cues (e.g., image segmentation) can also be easily incorporated in our stereo model to obtain better stereo results. Experiments demonstrate that our methods are comparable to the state-of-the-art stereo algorithms for many test cases.
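As a rough illustration of the idea in the abstract above, the following sketch runs min-sum belief propagation along a single scanline. It is a simplification under stated assumptions, not the paper's model: it uses one chain-structured field instead of three coupled Markov random fields, the `data_cost` table is hypothetical input, and a truncated linear penalty (`lam`, `trunc`) stands in for the robust functions the abstract mentions.

```python
def scanline_bp(data_cost, lam=1.0, trunc=2.0):
    """Min-sum belief propagation on a 1D chain of pixels.

    data_cost[i][d] is the (hypothetical) cost of giving pixel i disparity d.
    Returns the disparity label with the lowest belief at each pixel.
    """
    n = len(data_cost)
    k = len(data_cost[0])

    def smooth(a, b):
        # Truncated linear penalty: an assumed stand-in for a robust function.
        return lam * min(abs(a - b), trunc)

    # fwd[i][d]: message arriving at pixel i from its left neighbour.
    fwd = [[0.0] * k for _ in range(n)]
    for i in range(1, n):
        for d in range(k):
            fwd[i][d] = min(data_cost[i - 1][e] + fwd[i - 1][e] + smooth(e, d)
                            for e in range(k))

    # bwd[i][d]: message arriving at pixel i from its right neighbour.
    bwd = [[0.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        for d in range(k):
            bwd[i][d] = min(data_cost[i + 1][e] + bwd[i + 1][e] + smooth(e, d)
                            for e in range(k))

    # Belief = local cost + both incoming messages; pick the cheapest label.
    labels = []
    for i in range(n):
        belief = [data_cost[i][d] + fwd[i][d] + bwd[i][d] for d in range(k)]
        labels.append(belief.index(min(belief)))
    return labels


# Five pixels, three candidate disparities; the middle pixel is noisy.
costs = [[2, 0, 2], [2, 0, 2], [2, 0.5, 0], [2, 0, 2], [2, 0, 2]]
smoothed = scanline_bp(costs, lam=1.0)
unsmoothed = scanline_bp(costs, lam=0.0)
```

With the smoothness term switched on, the noisy middle pixel is pulled to the disparity of its neighbours; with `lam=0.0` each pixel simply takes its own cheapest label. On a chain the messages are exact after one forward and one backward pass, which is not true of the loopy image grid a full stereo model would use.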
Numerical Shape from Shading and Occluding Boundaries
 Artificial Intelligence
, 1981
Cited by 225 (16 self)
An iterative method for computing shape from shading using occluding boundary information is proposed. Some applications of this method are shown. We employ the stereographic plane to express the orientations of surface patches, rather than the more commonly used gradient space. Use of the stereographic plane makes it possible to incorporate occluding boundary information, but forces us to employ a smoothness constraint different from the one previously proposed. The new constraint follows directly from a particular definition of surface smoothness. We solve the set of equations arising from the smoothness constraints and the image-irradiance equation iteratively, using occluding boundary information to supply boundary conditions. Good initial values are found at certain points to help reduce the number of iterations required to reach a reasonable solution. Numerical experiments show that the method is effective and robust. Finally, we analyze scanning electron microscope (SEM) pictures using this method. Other applications are also proposed.
A Theory of Networks for Approximation and Learning
 Laboratory, Massachusetts Institute of Technology
, 1989
Cited by 217 (23 self)
Learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multidimensional function, that is solving the problem of hypersurface reconstruction. From this point of view, this form of learning is closely related to classical approximation techniques, such as generalized splines and regularization theory. This paper considers the problems of an exact representation and, in more detail, of the approximation of linear and nonlinear mappings in terms of simpler functions of fewer variables. Kolmogorov's theorem concerning the representation of functions of several variables in terms of functions of one variable turns out to be almost irrelevant in the context of networks for learning. We develop a theoretical framework for approximation based on regularization techniques that leads to a class of three-layer networks that we call Generalized Radial Basis Functions (GRBF), since they are mathematically related to the well-known Radial Basis Functions, mainly used for strict interpolation tasks. GRBF networks are not only equivalent to generalized splines, but are also closely related to pattern recognition methods such as Parzen windows and potential functions and to several neural network algorithms, such as Kanerva's associative memory, backpropagation and Kohonen's topology preserving map. They also have an interesting interpretation in terms of prototypes that are synthesized and optimally combined during the learning stage. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.
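To make the link between radial basis functions and regularization in the abstract above more concrete, here is a minimal one-dimensional sketch. It makes several assumptions that are not taken from the paper: a Gaussian basis function with width `sigma`, one basis unit centred on every training example (the simplest case, not necessarily the configuration the paper studies), and coefficients obtained by solving (G + lam·I)c = y, where `lam` is a regularization weight. All function names are hypothetical.

```python
import math


def gaussian(r, sigma):
    """Gaussian radial basis function of distance r."""
    return math.exp(-(r * r) / (2.0 * sigma * sigma))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_rbf(xs, ys, sigma=1.0, lam=0.0):
    """Return coefficients c solving (G + lam * I) c = y."""
    n = len(xs)
    g = [[gaussian(xs[i] - xs[j], sigma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(g, ys)


def predict(x, xs, coeffs, sigma=1.0):
    """Evaluate the fitted network: a weighted sum of basis functions."""
    return sum(c * gaussian(x - xi, sigma) for c, xi in zip(coeffs, xs))


xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 4.0, 9.0]
exact = fit_rbf(xs, ys, lam=0.0)      # strict interpolation
smooth = fit_rbf(xs, ys, lam=1.0)     # regularized, no longer exact
```

With `lam=0.0` the fit passes through every example, which corresponds to the strict interpolation use of radial basis functions mentioned in the abstract. A positive `lam` trades exactness at the examples for a smoother fit.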
Gradient calculation for dynamic recurrent neural networks: a survey
 IEEE Transactions on Neural Networks
, 1995
Cited by 171 (3 self)
Abstract. We survey learning algorithms for recurrent neural networks with hidden units, and put the various techniques into a common framework. We discuss fixed-point learning algorithms, namely recurrent backpropagation and deterministic Boltzmann Machines, and non-fixed-point algorithms, namely backpropagation through time, Elman's history cutoff, and Jordan's output feedback architecture. Forward propagation, an online technique that uses adjoint equations, and variations thereof, are also discussed. In many cases, the unified presentation leads to generalizations of various sorts. We discuss advantages and disadvantages of temporally continuous neural networks in contrast to clocked ones, continue with some "tricks of the trade" for training, using, and simulating continuous time and recurrent neural networks. We present some simulations, and at the end, address issues of computational complexity and learning speed.
Psychophysical evidence for separate channels for the perception of form, color, movement and depth
 J. Neurosci
, 1987
Cited by 164 (2 self)
Physiological and anatomical findings in the primate visual system, as well as clinical evidence in humans, suggest that different components of visual information processing are segregated into largely independent parallel pathways. Such a segregation leads to certain predictions about human vision. In this paper we describe psychophysical experiments on the interactions of color, form, depth, and movement in human perception, and we attempt to correlate these aspects of visual perception with the different subdivisions of the visual system. It is function that breathes life into anatomy and perception that vivifies sensory electrophysiology. We must know the logic structure of psychophysical phenomena if meaning is to be read into electron records from the retina, living or dead. Perhaps we may more surely pick a path through the ever thickening forest of fact if we hold some chart of the pattern of our perceptions. (Rushton, 1962) Introspection suggests that visual perception can be subdivided into several subprocesses. If asked to list these, most people would include form, color, movement, depth, and perhaps texture. The intuitive impression that vision is multipartite, that it comprises several systems, has been supported by centuries of human psychophysics and by some recent anatomical and physiological studies in primates. In this paper we will try to correlate a large number of observations on human visual perception with these anatomical and physiological findings. To some extent the correlations must be conjectural, not only because our information on the anatomy and physiology of monkeys is far from complete, but also because of the extrapolations
Recognizing People by Their Gait: The Shape of Motion
, 1996
Cited by 162 (8 self)
... Scale-independent scalar features of each flow, based on moments of the moving point weighted by u, v, or (u, v), characterize the spatial distribution of the flow. We then analyze the periodic structure of these sequences of scalars. The scalar sequences for an image sequence have the same fundamental period but differ in phase, which is a phase feature for each signal. Some phase features are consistent for one person and show significant statistical variation among persons. We use the phase feature vectors to recognize individuals by the shape of their motion. As few as three features out of the full set of twelve lead to excellent discrimination. Keywords: action recognition, gait recognition, motion features, optic flow, motion energy, spatial frequency, analysis
Disparity analysis of images
 IEEE TPAMI
, 1980
Cited by 155 (2 self)
Abstract. An algorithm for matching images of real world scenes is presented. The matching is a specification of the geometrical disparity between the images and may be used to partially reconstruct the three-dimensional structure of the scene. Sets of candidate matching points are selected independently in each image. These points are the locations of small, distinct features which are likely to be detectable in both images. An initial network of possible matches between the two sets of candidates is constructed. Each possible match specifies a possible disparity of a candidate point in a selected reference image. An initial estimate of the probability of each possible disparity is made, based on the similarity of subimages surrounding the points. These estimates are iteratively improved by a relaxation labeling technique making use of the local continuity property of disparity that is a consequence of the continuity of real world surfaces. The algorithm is effective for binocular parallax, motion parallax, and object motion. It quickly converges to good estimates of disparity, which reflect the spatial organization of the scene. Index Terms: Disparity, matching, motion, relaxation labeling, scene analysis, stereo.
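The iterative improvement step described in the abstract above can be sketched roughly as follows. This is an illustrative relaxation-labeling update under assumptions of my own, not the paper's exact rule: disparities are reduced to single integers, the neighbour lists are given by hand, and the constants `a`, `b`, and `tol` are hypothetical tuning values. Each candidate disparity gains probability when nearby points favour a similar disparity, which is one way to encode local continuity.

```python
def relax(probs, neighbours, iterations=10, a=0.3, b=3.0, tol=1):
    """Iteratively reweight disparity probabilities by neighbour support.

    probs[i] maps each candidate disparity of point i to a probability.
    neighbours[i] lists the indices of points near point i.
    """
    for _ in range(iterations):
        updated = []
        for i, p in enumerate(probs):
            scores = {}
            for d, pd in p.items():
                # Support: how strongly neighbours favour a similar disparity.
                support = sum(pj
                              for j in neighbours[i]
                              for dj, pj in probs[j].items()
                              if abs(dj - d) <= tol)
                scores[d] = pd * (a + b * support)
            total = sum(scores.values())
            # Renormalize so each point's probabilities sum to one.
            updated.append({d: s / total for d, s in scores.items()})
        probs = updated
    return probs


# Point 1 is ambiguous; its two neighbours both favour disparity 2.
initial = [{2: 0.9, 7: 0.1}, {2: 0.5, 7: 0.5}, {2: 0.9, 7: 0.1}]
result = relax(initial, neighbours=[[1], [0, 2], [1]])
best = max(result[1], key=result[1].get)
```

In this toy case the ambiguous middle point settles on the disparity its neighbours agree on. A fuller version would work with two-dimensional disparity vectors and derive the initial probabilities from subimage similarity, as the abstract describes.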