ImageRover: A Content-Based Image Browser for the World Wide Web
- In Proc. IEEE Workshop on Content-based Access of Image and Video Libraries
, 1997
Cited by 117 (3 self)
ImageRover is a search-by-image-content navigation tool for the World Wide Web. To gather images expediently, the image collection subsystem utilizes a distributed fleet of WWW robots running on different computers. The image robots gather information about the images they find, compute the appropriate image decompositions and indices, and store this extracted information in vector form for searches based on image content. At search time, users can iteratively guide the search through the selection of relevant examples. Search is made efficient through the use of an approximate, optimized k-d tree algorithm. The system employs a novel relevance feedback algorithm that selects the distance metrics appropriate for a particular query.
Keywords: image databases, query by image content, content-based retrieval, World Wide Web search engines.
1 Introduction
For a while now there have been software "robots" roving the World Wide Web (WWW) collecting index information about th...
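The ImageRover abstract mentions k-d tree search over stored feature vectors. As a rough illustration of the underlying idea only, the sketch below builds a plain k-d tree and runs an exact nearest-neighbour query. It is not the approximate, optimized variant the abstract refers to, and the function names (`build_kdtree`, `nearest`) and the toy 2-D vectors are assumptions made for this example.

```python
from math import dist

def build_kdtree(points, depth=0):
    """Split the points on alternating axes; returns nested tuples or None."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return (pts[mid], axis,
            build_kdtree(pts[:mid], depth + 1),
            build_kdtree(pts[mid + 1:], depth + 1))

def nearest(node, query, best=None):
    """Return (distance, point) for the stored vector closest to query."""
    if node is None:
        return best
    point, axis, left, right = node
    d = dist(point, query)
    if best is None or d < best[0]:
        best = (d, point)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best

# Hypothetical 2-D feature vectors standing in for image indices.
vectors = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(vectors)
closest = nearest(tree, (9, 2))[1]
```

The pruning test on the splitting plane is what saves work compared with a linear scan. An approximate variant would typically loosen that test or cap the number of nodes visited, but the abstract does not say which strategy ImageRover uses.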
Orientation histograms for hand gesture recognition
- Mitsubishi Electric Research Labs., 201
, 213
Cited by 116 (2 self)
We present a method to recognize hand gestures, based on a pattern recognition technique developed by McConnell [16] employing histograms of local orientation. We use the orientation histogram as a feature vector for gesture classification and interpolation. For moving or "dynamic" gestures, the histogram of the spatio-temporal gradients of image intensity forms the analogous feature vector and may be useful for dynamic gesture recognition.
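A minimal sketch of an orientation histogram used as a feature vector, under several assumptions that do not come from the abstract: gradients are taken with central differences, orientations are folded into [0, pi), eight bins are used, and near-flat pixels are skipped with a small contrast threshold. The function name `orientation_histogram` is hypothetical.

```python
from math import atan2, hypot, pi

def orientation_histogram(image, bins=8, min_contrast=1e-6):
    """Normalized histogram of local gradient orientations of a 2-D image.

    `image` is a list of equal-length rows of numbers. Border pixels are
    skipped because central differences need both neighbours.
    """
    hist = [0.0] * bins
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            if hypot(gx, gy) < min_contrast:
                continue  # flat region: orientation is not meaningful
            angle = atan2(gy, gx) % pi  # fold direction into [0, pi)
            hist[int(angle / pi * bins) % bins] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Intensity ramp along the columns: every gradient points along x.
ramp_x = [[c for c in range(5)] for _ in range(5)]
# Intensity ramp along the rows: every gradient points along y.
ramp_y = [[r for _ in range(5)] for r in range(5)]
hist_x = orientation_histogram(ramp_x)
hist_y = orientation_histogram(ramp_y)
```

Two images can then be compared by a distance between their histograms, for example the Euclidean distance, with the nearest stored gesture taken as the classification. That comparison step is likewise an illustrative choice here.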
An Active Vision Architecture based on Iconic Representations
- Artificial Intelligence
, 1995
Cited by 116 (12 self)
Active vision systems have the capability of continuously interacting with the environment. The rapidly changing environment of such systems means that it is attractive to replace static representations with visual routines that compute information on demand. Such routines place a premium on image data structures that are easily computed and used. The purpose of this paper is to propose a general active vision architecture based on efficiently computable iconic representations. This architecture employs two primary visual routines, one for identifying the visual image near the fovea (object identification), and another for locating a stored prototype on the retina (object location). This design allows complex visual behaviors to be obtained by composing these two routines with different parameters. The iconic representations are comprised of high-dimensional feature vectors obtained from the responses of an ensemble of Gaussian derivative spatial filters at a number of orientations and...
Weak hypotheses and boosting for generic object detection and recognition
- In Proc. ECCV
, 2004
Cited by 107 (7 self)
In this paper we describe the first stage of a new learning system for object detection and recognition. For our system we propose Boosting [5] as the underlying learning technique. This allows the use of very diverse sets of visual features in the learning process within a common framework: Boosting, together with a weak hypotheses finder, may choose very inhomogeneous features as most relevant for combination into a final hypothesis. As another advantage, the weak hypotheses finder may search the weak hypotheses space without explicit calculation of all available hypotheses, reducing computation time. This contrasts with the related work of Agarwal and Roth [1], where Winnow was used as the learning algorithm and all weak hypotheses were calculated explicitly. In our first empirical evaluation we use four types of local descriptors: two basic ones consisting of a set of grayvalues and intensity moments, and two high-level descriptors: moment invariants [8] and SIFTs [12]. The descriptors are calculated from local patches detected by an interest point operator. The weak hypotheses finder selects one of the local patches and one type of local descriptor and efficiently searches for the most discriminative similarity threshold. This differs from other work on Boosting for object recognition, where simple rectangular hypotheses [22] or complex classifiers [20] have been used. In relatively simple images, where the objects are prominent, our approach yields results comparable to the state of the art [3]. But we also obtain very good results on more complex images, where the objects are located in arbitrary positions, poses, and scales in the images. These results indicate that our flexible approach, which also allows the inclusion of features from segmented regions and even spatial relationships, leads us a significant step towards generic object recognition.
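To make the idea of boosting over thresholded similarity values concrete, here is a small sketch of standard discrete AdaBoost with threshold weak hypotheses. It is not the authors' weak hypotheses finder: this version enumerates every feature and threshold exhaustively, which is exactly what the abstract says their finder avoids. The feature columns are assumed to hold precomputed similarities to some reference patch, and all names (`best_stump`, `adaboost`, `predict`) are hypothetical.

```python
from math import exp, log

def stump_output(row, f, thr, pol):
    """Weak hypothesis: +pol if the similarity reaches the threshold, else -pol."""
    return pol if row[f] >= thr else -pol

def best_stump(X, y, w):
    """Exhaustively pick (error, feature, threshold, polarity) with lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_output(row, f, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def adaboost(X, y, rounds=5):
    """Return a list of (alpha, feature, threshold, polarity) weak hypotheses."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, f, thr, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * log((1 - err) / err)
        ensemble.append((alpha, f, thr, pol))
        # Raise the weight of misclassified examples, lower the rest.
        w = [wi * exp(-alpha * yi * stump_output(row, f, thr, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(a * stump_output(row, f, thr, pol) for a, f, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Toy data: two similarity features per image, labels +1 (object) / -1 (background).
X = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.1, 0.2]]
y = [1, 1, -1, -1]
model = adaboost(X, y, rounds=3)
predictions = [predict(model, row) for row in X]
```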
Local scale control for edge detection and blur estimation
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1998
Cited by 90 (9 self)
The standard approach to edge detection is based on a model of edges as large step changes in intensity. This approach fails to reliably detect and localize edges in natural images where blur scale and contrast can vary over a broad range. The main problem is that the appropriate spatial scale for local estimation depends upon the local structure of the edge, and thus varies unpredictably over the image. Here we show that knowledge of sensor properties and operator norms can be exploited to define a unique, locally computable minimum reliable scale for local estimation at each point in the image. This method for local scale control is applied to the problem of detecting and localizing edges in images with shallow depth of field and shadows. We show that edges spanning a broad range of blur scales and contrasts can be recovered accurately by a single system with no input parameters other than the second moment of the sensor noise. A natural dividend of this approach is a measure of the thickness of contours which can be used to estimate focal and penumbral blur. Local scale control is shown to be important for the estimation of blur in complex images, where the potential for interference between nearby edges of very different blur scale requires that estimates be made at the minimum reliable scale.
On Photometric Issues in 3D Visual Recognition From A Single 2D Image
- International Journal of Computer Vision
, 1997
Cited by 89 (6 self)
We describe the problem of recognition under changing illumination conditions and changing viewing positions from a computational and human vision perspective. On the computational side we focus on the mathematical problems of creating an equivalence class for images of the same 3D object undergoing certain groups of transformations, mostly those due to changing illumination, and briefly discuss those due to changing viewing positions. The computational treatment culminates in proposing a simple scheme for recognizing, via alignment, an image of a familiar object taken from a novel viewing position and a novel illumination condition. On the human vision aspect, the paper is motivated by empirical evidence inspired by Mooney images of faces that suggests a relatively high level of visual processing is involved in compensating for photometric sources of variability, and furthermore, that certain limitations on the admissible representations of image information may exist. The psycho...
Control of Selective Perception Using Bayes Nets and Decision Theory
, 1993
Cited by 87 (1 self)
A selective vision system sequentially collects evidence to support a specified hypothesis about a scene, as long as the additional evidence is worth the effort of obtaining it. Efficiency comes from processing the scene only where necessary, to the level of detail necessary, and with only the necessary operators. Knowledge representation and sequential decision-making are central issues for selective vision, which takes advantage of prior knowledge of a domain's abstract and geometrical structure and models for the expected performance and cost of visual operators. The TEA-1 selective vision system uses Bayes nets for representation and benefit-cost analysis for control of visual and non-visual actions. It is the high-level control for an active vision system, enabling purposive behavior, the use of qualitative vision modules and a pointable multiresolution sensor. TEA-1 demonstrates that Bayes nets and decision theoretic techniques provide a general, re-usable framework for constructi...
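The phrase "as long as the additional evidence is worth the effort of obtaining it" can be illustrated with a myopic benefit-cost calculation. The sketch below is not TEA-1's actual formulation, which the abstract does not spell out. It assumes a single binary hypothesis, operators described by a cost and two detection likelihoods, and a unit reward for a correct final decision. All names and numbers are made up for the example.

```python
def posterior(prior, p_pos_h, p_pos_not_h, positive):
    """Bayes update of P(H) after one operator reports positive or negative."""
    like_h = p_pos_h if positive else 1 - p_pos_h
    like_not_h = p_pos_not_h if positive else 1 - p_pos_not_h
    evidence = like_h * prior + like_not_h * (1 - prior)
    return like_h * prior / evidence

def net_benefit(prior, op, reward=1.0):
    """Expected gain in decision quality from running op, minus its cost."""
    _name, cost, p_pos_h, p_pos_not_h = op
    p_pos = p_pos_h * prior + p_pos_not_h * (1 - prior)
    expected = 0.0
    for positive, p_outcome in ((True, p_pos), (False, 1 - p_pos)):
        if p_outcome > 0:
            post = posterior(prior, p_pos_h, p_pos_not_h, positive)
            # After seeing the outcome we would pick the likelier hypothesis.
            expected += p_outcome * max(post, 1 - post)
    return reward * (expected - max(prior, 1 - prior)) - cost

def choose_operator(prior, operators):
    """Return the operator with the best positive net benefit, or None to stop."""
    best = max(operators, key=lambda op: net_benefit(prior, op))
    return best if net_benefit(prior, best) > 0 else None

# Hypothetical operators: (name, cost, P(positive | H), P(positive | not H)).
operators = [("cheap_blob_detector", 0.05, 0.80, 0.30),
             ("costly_template_match", 0.40, 0.95, 0.05)]
first_choice = choose_operator(0.5, operators)
stop_choice = choose_operator(0.99, operators)
```

With an uncertain prior of 0.5 the cheap operator wins because its modest information gain comes at low cost. With a prior of 0.99 neither operator can change the decision, so the sketch returns `None` and evidence gathering stops.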
Efficient Re-rendering of Naturally Illuminated Environments
- In Fifth Eurographics Workshop on Rendering
, 1994
Cited by 82 (4 self)
We present a method for the efficient re-rendering of a scene under a directional illuminant at an arbitrary orientation. We take advantage of the linearity of the rendering operator with respect to illumination for a fixed scene and camera geometry. Re-rendering is accomplished via linear combination of a set of pre-rendered "basis" images. The theory of steerable functions provides the machinery to derive an appropriate set of basis images. We demonstrate the technique on both simple and complex scenes illuminated by an approximation to natural skylight. We show re-rendering simulations under conditions of varying sun position and cloudiness.
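The linearity argument can be checked on a toy case. The sketch below assumes a deliberately simplified renderer in which each pixel is albedo times the unclamped dot product of the surface normal and the light vector, which makes the image exactly linear in the light. It does not reproduce the paper's steerable-function basis or its skylight model, and the names `render` and `combine` are hypothetical.

```python
def render(normals, albedo, light):
    """Toy renderer: pixel = albedo * (normal . light), with no clamping or shadows."""
    return [a * sum(n_k * l_k for n_k, l_k in zip(n, light))
            for n, a in zip(normals, albedo)]

def combine(basis_images, weights):
    """Pixel-wise weighted sum of pre-rendered basis images."""
    n_pixels = len(basis_images[0])
    return [sum(w * img[i] for w, img in zip(weights, basis_images))
            for i in range(n_pixels)]

# A four-pixel "scene" with made-up normals and albedos.
normals = [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.6, 0.8), (0.36, 0.48, 0.8)]
albedo = [1.0, 0.5, 0.8, 0.3]

# Pre-render once under three basis lights along the coordinate axes.
basis = [render(normals, albedo, axis)
         for axis in ((1, 0, 0), (0, 1, 0), (0, 0, 1))]

# Re-render under a new light by combining the basis images.
new_light = (0.3, 0.5, 0.8)
relit = combine(basis, new_light)
direct = render(normals, albedo, new_light)
max_error = max(abs(a - b) for a, b in zip(relit, direct))
```

In this toy setting the weights are simply the components of the new light vector. Deriving a suitable basis and weights for a realistic illuminant is the part the abstract attributes to the theory of steerable functions.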
Generic Object Recognition with Boosting
- IEEE Trans. PAMI
, 2006
Cited by 76 (4 self)
This paper presents a powerful framework for generic object recognition. Boosting is used as an underlying learning technique. For the first time a combination of various weak classifiers of different types of descriptors is used, which slightly increases the classification result but dramatically improves the stability of a classifier. Besides applying well-known techniques to extract salient regions, we also present a new segmentation method, “Similarity-Measure-Segmentation”. This approach delivers segments that can consist of several disconnected parts. This turns out to be a powerful description of local similarity. With regard to the task of object categorization, Similarity-Measure-Segmentation performs as well as or better than current state-of-the-art segmentation techniques. In contrast to previous solutions, we aim at handling complex objects appearing in highly cluttered images. Therefore we have set up a database containing images with the required complexity. On these images we obtain very good classification results of up to 87% ROC-equal error rate. On common databases for object recognition, our approach outperforms all comparable solutions.
Television control by hand gestures
- International Workshop on Automatic Face and Gesture Recognition
, 1995
Cited by 73 (3 self)
We study how a viewer can control a television set remotely by hand gestures. We address two fundamental issues of gesture-based human-computer interaction: (1) How can one communicate a rich set of commands without extensive user training and memorization of gestures? (2) How can the computer recognize the commands in a complicated visual environment? We made a prototype of this system using a computer workstation and a television. The graphical overlays appear on the computer screen, although they could be mixed with the video to appear on the television. The computer controls the television set through serial port commands to an electronically controlled remote control. We describe what we learned from building the prototype.

