Recognition-by-components: A theory of human image understanding
Psychological Review, 1987
Cited by 550 (8 self)
The perceptual recognition of objects is conceptualized to be a process in which the image of the input is segmented at regions of deep concavity into an arrangement of simple geometric components, such as blocks, cylinders, wedges, and cones. The fundamental assumption of the proposed theory, recognition-by-components (RBC), is that a modest set of generalized-cone components, called geons (N ≤ 36), can be derived from contrasts of five readily detectable properties of edges in a two-dimensional image: curvature, collinearity, symmetry, parallelism, and cotermination. The detection of these properties is generally invariant over viewing position and image quality and consequently allows robust object perception when the image is projected from a novel viewpoint or is degraded. RBC thus provides a principled account of the heretofore undecided relation between the classic principles of perceptual organization and pattern recognition: The constraints toward regularization (Prägnanz) characterize not the complete object but the object's components. Representational power derives from an allowance of free combinations of the geons. A Principle of Componential Recovery can account for the major phenomena of object recognition: If an arrangement of two or three geons can be recovered from the input, objects can be quickly recognized even when they are occluded, novel, rotated in depth, or extensively degraded. The results from experiments on the perception of briefly presented pictures by human observers provide empirical support for the theory.
Any single object can project an infinity of image configurations to the retina. The orientation of the object to the viewer can vary continuously, each giving rise to a different two-dimensional projection. The object can be occluded by other objects or texture fields, as when viewed behind foliage. The object need not be presented as a full-colored textured image but instead can be a simplified line drawing. Moreover, the object can even be missing some of its parts or be a novel exemplar of its
An Invitation to Discuss Computer Depiction
2002
Cited by 41 (4 self)
This paper draws from art history and perception to place computer depiction in the broader context of picture production. It highlights the often underestimated complexity of the interactions between features in the picture and features of the represented scene. Depiction is not always a unidirectional projection from a 3D scene to a 2D picture, but involves much feedback and influence from the picture space to the object space. Depiction can be seen as a pre-existing 3D reality projected onto 2D, but also as a 2D pictorial representation that is superficially compatible with a hypothetical 3D scene. We show that depiction is essentially an optimization problem, producing the best picture given goals and constraints. We introduce a classification of basic depiction techniques based on four kinds of issues. The spatial system deals with the mapping of spatial properties between 3D and 2D (including, but not restricted to, perspective projection). The primitive system deals with the dimensionality and mappings between picture primitives and scene primitives. Attributes deal with the assignment of visual properties such as colors, texture, or thickness. Finally, marks are the physical implementations of the picture (e.g. brush strokes, mosaic cells). A distinction is introduced between interaction and picture-generation methods, and techniques are then organized depending on the dimensionality of the inputs and outputs.
Parsing silhouettes: The short-cut rule
1999
Cited by 21 (4 self)
In this paper, we propose the short-cut rule, which states that, other things being equal, human vision prefers to use the shortest possible cuts to parse silhouettes. We motivate this rule, and the well-known Petter's rule for modal completion, by the principle of transversality. We present five psychophysical experiments that test the short-cut rule, show that it successfully predicts part cuts which connect boundary points given by the minima rule, and show that it can also create new boundary points
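The preference for short cuts can be illustrated with a minimal sketch. It assumes the silhouette is given as a polygon, that a cut may only end at another polygon vertex, and that the cut need not be checked for staying inside the shape; the function name `shortest_cut_from` and the `min_gap` parameter are hypothetical and do not come from the paper.

```python
import math

def shortest_cut_from(contour, start, min_gap=2):
    """Return (index, length) of the shortest straight cut from vertex `start`.

    contour : list of (x, y) vertices of a closed polygon.
    min_gap : vertices fewer than `min_gap` steps away along the contour
              are skipped, so a cut never runs along the outline itself.
    Simplification: the cut is not tested for lying inside the silhouette.
    """
    n = len(contour)
    best, best_len = None, math.inf
    for j in range(n):
        # Distance along the contour, measured in vertices, either way round.
        gap = min((j - start) % n, (start - j) % n)
        if gap < min_gap:
            continue
        d = math.dist(contour[start], contour[j])
        if d < best_len:
            best, best_len = j, d
    return best, best_len

# Hourglass silhouette with a narrow "waist" between vertices 2 and 5.
hourglass = [(0, 0), (4, 0), (2.5, 2), (4, 4), (0, 4), (1.5, 2)]
print(shortest_cut_from(hourglass, 2))
```

Starting from the concave vertex at index 2, the sketch selects the cut across the waist to vertex 5 and ignores the longer cuts to the corners.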
Learning to Recognize Objects
Trends in Cognitive Sciences, 2000
Cited by 18 (2 self)
In this report we review a large body of literature describing how experience affects recognition. Both neurophysiology and psychophysics provide clear evidence for the development of recognition over time. In particular, we show how perceptual learning in recognition tasks can be directly linked to learning in feature-tuned inferotemporal lobe neurons in the primate brain. The environment, as we experience it, is so structured that potentially very different images appearing in close temporal succession are likely to be views of the same object. We argue that this temporal structure forms the basis of a tendency (a prior in the sense of Bayesian statistics) of the human visual system to associate images of objects together over short periods of time.
Object recognition in the geometric era: A retrospective
Toward Category-Level Object Recognition, volume 4170 of Lecture Notes in Computer Science, 2006
Cited by 6 (0 self)
Recent advances in object recognition have emphasized the integration of intensity-derived features such as affine patches with associated geometric constraints leading to impressive performance in complex scenes. Over the four previous decades, the central paradigm of recognition was based on formal geometric object descriptions with a focus on the properties of such descriptions under perspective image formation. This paper will review the key advances of the geometric era and investigate the underlying causes of the movement away from formal geometry and prior models towards the use of statistical learning methods based on appearance features.
Neural Mechanisms Underlying Processing in the Visual Areas of the Occipital and Temporal Lobes
Oxford University, 1994
Cited by 3 (2 self)
There is evidence that over a series of cortical processing stages, the visual system of primates produces a representation of objects which shows invariance with respect to, for example, translation, size, and view, as shown by recordings from single neurons in the temporal lobe (Rolls, 1992; Desimone, 1991; Tanaka et al., 1991). To clarify how such a system might learn to recognise `naturally' transformed objects, I investigate a model of cortical visual processing which incorporates a number of features of the primate visual system. The model consists of a series of layers with convergence from a limited region of the preceding layer, and mutual inhibition over a short range within a layer. The feed-forward connections provide the inputs to competitive networks, each utilising a modified Hebb-like learning rule which incorporates a temporal trace of the preceding neuronal activity. The modified Hebb-rule, called simply the trace learning rule, is aimed at enabling neurons to learn t...
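A Hebb-like rule with a temporal trace can be sketched as follows. The abstract does not give the update equations, so the exponential-average form of the trace, the parameter names `eta` and `alpha`, and the function name `trace_update` are assumptions for illustration only; weight normalisation and the competitive interactions within a layer are left out.

```python
def trace_update(weights, x, y, prev_trace, eta=0.8, alpha=0.1):
    """One step of a Hebb-like update that uses a temporal trace (assumed form).

    weights    : list of synaptic weights of one neuron
    x          : list of presynaptic input activities at the current step
    y          : postsynaptic activity at the current step
    prev_trace : trace of the postsynaptic activity from the previous step
    eta        : how much of the previous trace is retained (hypothetical)
    alpha      : learning rate (hypothetical)
    """
    # The trace mixes the current activity with the neuron's recent history.
    trace = (1.0 - eta) * y + eta * prev_trace
    # Hebbian step, but driven by the trace instead of the instantaneous y.
    new_weights = [w + alpha * trace * xi for w, xi in zip(weights, x)]
    return new_weights, trace

# Two different views presented in succession; the neuron fires only to the first.
w, tr = trace_update([0.0, 0.0], x=[1.0, 0.0], y=1.0, prev_trace=0.0, eta=0.5, alpha=1.0)
w, tr = trace_update(w, x=[0.0, 1.0], y=0.0, prev_trace=tr, eta=0.5, alpha=1.0)
print(w, tr)
```

In this sketch the second input still gains weight although the neuron is silent when it arrives, because the trace carries over activity from the preceding view. This is one way inputs that follow each other closely in time could come to drive the same neuron.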
Image Structure Analysis for CBIR
Proc. Digital Image Computing: Techniques and Applications, DICTA'99, 1999
Cited by 2 (0 self)
Separating an image into its constituent parts, while desirable, is still a challenging problem. Imagine an image indexing system that can extract the main perceptual components in the image, their properties, and the relationships between them. This will allow an image to be indexed as containing parts which represent meaningful components, say, a car, people, some trees, etc., rather than the output of standard segmentation systems, in which an image of a car generates many spurious regions. For images containing general or arbitrary contents, object recognition techniques do not suffice.
then '(2) = I(2,2,1,1,2)]. (2,2,2,1,2)
, 61
structable from , and is constructable from {} J', then is constructable from .
Definition: The type i m-constructable space, denoted by F(), is the set of all m-ary relations which are constructable from by type i m-syntheses, i.e., F.() = { I = () is an m-synthesis and is type i}, where i = 0, 1, 2, 3. In case of type 0, it may be referred to simply as the m-constructable space Fm(). The following is a simple result of the definitions.
Theorem 4.1 For any set of relations , m m m r(a) p. r(a) r2(a) r3(a). 63 in Other simple facts about type i m-constructable spaces are: m m = F() using Lemma 4.1. general F.() and Fi(Fi()) by On the other hand, it was earlier defined that a relation is derivable from a relation 8 if is an equivalent of some projection of 8. It will be convenient to have the following operator.
Definition: The derivable space of a set of relations , denoted by A(), is the set of all relations which are derivable from . We may write A() instead of A({})
Cognition 63 (1997) 29--78
Cognition, 1997
Many objects have component parts, and these parts often differ in their visual salience. In this paper we present a theory of part salience. The theory builds on the minima rule for defining part boundaries. According to this rule, human vision defines part boundaries at negative minima of curvature on silhouettes, and along negative minima of the principal curvatures on surfaces. We propose that the salience of a part depends on (at least) three factors: its size relative to the whole object, the degree to which it protrudes, and the strength of its boundaries. We present evidence that these factors influence visual processes which determine the choice of figure and ground. We give quantitative definitions for the factors, visual demonstrations of their effects, and results of psychophysical experiments. © 1997 Elsevier Science B.V.
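The minima rule for silhouettes can be approximated on a polygonal outline. The following is a minimal sketch, assuming the silhouette is a closed polygon with vertices listed counter-clockwise and using the signed turning angle at each vertex as a stand-in for curvature; the function names are hypothetical, and the paper's quantitative salience factors are not modelled.

```python
import math

def turning_angles(contour):
    """Signed turning angle at each vertex of a closed, counter-clockwise polygon.

    Positive values are convex turns, negative values are concave turns.
    This is only a discrete stand-in for the curvature of a smooth outline.
    """
    n = len(contour)
    angles = []
    for i in range(n):
        (x0, y0), (x1, y1), (x2, y2) = contour[i - 1], contour[i], contour[(i + 1) % n]
        ax, ay = x1 - x0, y1 - y0          # incoming edge
        bx, by = x2 - x1, y2 - y1          # outgoing edge
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        angles.append(math.atan2(cross, dot))
    return angles

def negative_minima(contour):
    """Indices of vertices that are concave and local minima of the turning angle."""
    k = turning_angles(contour)
    n = len(k)
    return [i for i in range(n)
            if k[i] < 0 and k[i] <= k[i - 1] and k[i] <= k[(i + 1) % n]]

# Hourglass silhouette: the two waist vertices are the candidate part boundaries.
hourglass = [(0, 0), (4, 0), (2.5, 2), (4, 4), (0, 4), (1.5, 2)]
print(negative_minima(hourglass))  # → [2, 5]
```

On this outline the two concave waist vertices are returned, which is where the rule would place the boundary between the upper and lower parts.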
Resynthesis of Biological Images from Tree-Structured Decomposition Data
Reprinted from 'Graphic Languages', Editors: F. Nake and A. Rosenfeld
In this paper we are simultaneously concerned with methods for decomposing grey scale microscope images and with methods for verifying the correctness of these decompositions. One such method is resynthesis. Resynthesis is viewed as a procedure whereby an analyzed scene can be reconstituted and subjected to an analysis by human (informal) methods to determine the information preservation of the process. Several algorithms are presented for different ways of resynthesizing a decomposed image from its morphological decomposition analysis.
In attempting to do pattern recognition with computers on continuous tone image sources of complex structure, one encounters the problem of decomposing the images for scene analysis. When one's goal in such pattern recognition is more than to assign the image to one of a number (usually small) of distinct classes, it becomes

