Computational Models of Perceptual Organization
Robotics Institute, Carnegie Mellon University, 2003
Cited by 3 (0 self)
Perceptual organization refers to the process of organizing sensory input into coherent and interpretable perceptual structures. This process is challenging because of the chicken-and-egg relationship among its sub-processes, such as image segmentation, figure-ground segregation and object recognition. Low-level processing requires the guidance of high-level knowledge to overcome noise, while high-level processing relies on low-level processes to reduce computational complexity. Neither process is sufficient on its own. Consequently, any system that carries out these processes in sequence is bound to be brittle. An alternative is a system in which all processes interact with each other simultaneously. In this thesis, we develop a set of simple yet realistic interactive processing models for perceptual organization. We model the processing in the framework of spectral graph theory, with a criterion encoding the overall goodness of perceptual organization. We derive fast solutions for near-global optima of the criterion, and demonstrate the efficacy of the models on segmenting a wide range of real images. Through these models, we are able to capture a variety of perceptual phenomena: a unified treatment of various grouping, figure-ground and depth cues to produce popout, region segmentation and depth segregation in one step; and a unified framework for integrating bottom-up and top-down information to produce an object segmentation from spatial and object attention. We achieve these goals by empowering current spectral graph methods with a principled solution for multiclass spectral graph partitioning; an expanded repertoire of grouping cues to include similarity, dissimilarity and ordering relationships; a theory for integrating sparse grouping cues; and a model ...
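The abstract above frames grouping as spectral graph partitioning. As a rough illustration only, and not the thesis's own multiclass criterion, the following sketch performs a standard two-way split using the second eigenvector of the normalized graph Laplacian. The function name `spectral_bipartition` and the toy affinity matrix are assumptions introduced here for illustration.

```python
import numpy as np

def spectral_bipartition(W):
    """Split graph nodes into two groups with the normalized Laplacian.

    W is a symmetric, non-negative affinity matrix whose rows all have
    a positive sum (hypothetical input format chosen for this sketch).
    """
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices
    _, vecs = np.linalg.eigh(L)
    # Second-smallest eigenvector, rescaled to the generalized eigenvector
    fiedler = vecs[:, 1] * d_inv_sqrt
    # The sign of the eigenvector splits the nodes into two groups
    return fiedler > 0

# Toy graph: two tightly connected triples joined by weak links
W = np.full((6, 6), 0.01)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)
labels = spectral_bipartition(W)
```

Which of the two groups receives `True` is arbitrary, because an eigenvector is only defined up to sign; only the grouping itself is meaningful.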
Object Recognition using Generalized Robust Invariant Feature and Gestalt Law of Proximity and Similarity
IEEE Workshop on Perceptual Organization in Computer Vision (in CVPR'06), 2006
Cited by 2 (0 self)
In this paper, we propose a new context-based method for object recognition. We first introduce a neurophysiologically motivated visual part detector. We found that the optimal form of the visual part detector is a combination of a radial symmetry detector and a corner-like structure detector. A general context descriptor, named G-RIF (Generalized-Robust Invariant Feature), is then proposed, which encodes edge orientation, edge density and hue information in a unified form. Finally, a context-based voting scheme is proposed. The proposed method is inspired by a function of the human visual system called figure-ground discrimination. We use the proximity and similarity between features so that they support each other. The contextual feature descriptor and contextual voting method enhance recognition performance enormously in severely cluttered environments.
Nearest Neighbour Searching in High Dimensional Metric Space
2006
Cited by 1 (0 self)
Given one item, finding its closest match within a database of other such items is a task performed in numerous domains. Image matching, data mining, and electroencephalogram data analysis are a few varied examples. The extension of the concept of Euclidean distance in 2D and 3D space to higher dimensional space provides an effective comparison of items in these sorts of domains. Particular regard will be given to the performance of nearest neighbour searching in a large database of SIFT (Scale Invariant Feature Transform) descriptors. SIFT descriptors are useful in support of many image matching tasks. There are a number of algorithms, each with their own issues of storage size and search performance. The literature review (COMP6720) will aim to describe the significant algorithms and their performance attributes. The review should also identify opportunities for the enhancement (or enhanced implementation) of existing algorithms for the purposes of computer vision. Such further work would be the subject of the follow-on project (COMP6702).
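As a point of reference for the search problem described above, here is a minimal sketch of the brute-force baseline: an exhaustive Euclidean nearest-neighbour scan. It is not one of the algorithms the review surveys. The function name `nearest_neighbour` and the random 128-dimensional vectors (the usual SIFT descriptor length) are assumptions made for illustration.

```python
import numpy as np

def nearest_neighbour(database, query):
    """Return (index, distance) of the database row closest to query.

    Exhaustive linear scan under Euclidean distance: O(n * d) per query.
    """
    database = np.asarray(database, dtype=float)
    query = np.asarray(query, dtype=float)
    diffs = database - query
    # Squared distance of every row to the query in one pass
    sq_dists = np.einsum("ij,ij->i", diffs, diffs)
    best = int(np.argmin(sq_dists))
    return best, float(np.sqrt(sq_dists[best]))

# Synthetic stand-in for a descriptor database (not real SIFT data)
rng = np.random.default_rng(0)
db = rng.random((1000, 128))
query = db[42] + 1e-3  # slightly perturbed copy of one stored vector
index, distance = nearest_neighbour(db, query)
```

The linear scan is exact but touches every stored vector, which is the cost that index structures and approximate methods try to avoid at the price of extra storage or reduced accuracy.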
3D Object Modeling and Recognition from Photographs and Image Sequences
Cited by 1 (0 self)
This chapter proposes a representation of rigid three-dimensional (3D) objects in terms of local affine-invariant descriptors of their images and the spatial relationships between the corresponding surface patches. Geometric constraints associated with different views of the same patches under affine projection are combined with a normalized representation of their appearance to guide the matching process involved in object modeling and recognition tasks. The proposed approach is applied in two domains: (1) Photographs: models of rigid objects are constructed from small sets of images and recognized in highly cluttered shots taken from arbitrary viewpoints. (2) Video: dynamic scenes containing multiple moving objects are segmented into rigid components, and the resulting 3D models are directly matched to each other, giving a novel approach to video indexing and retrieval.
Current Advances in Computer-based Object Detection and Target Acquisition
2004
Cited by 1 (0 self)
Object detection is a part of our everyday lives; however, automatic object detection by computer is still an open question. In 30 years of research in computer vision, little progress has been made. This report is a survey of the most recent techniques in object detection research. First, we introduce the definition, challenges, applications and general components of an object detection system. This is followed by a review of various appearance-based and feature-based approaches. Appearance-based approaches are classified by classifier type into linear representation, distribution-based, support vector machine and sparse Winnow network methods. Feature-based approaches are distinguished from each other by the features used: texture, shape, context and multiple features. Then a framework of an object detection system is
Annotating Historical Archives of Images
Cited by 1 (0 self)
Recent initiatives like the Million Book Project and Google Print Library Project have already archived several million books in digital format, and within a few years a significant fraction of the world's books will be online. While the majority of the data will naturally be text, there will also be tens of millions of pages of images. Many of these images will defy automatic annotation for the foreseeable future, but a considerable fraction of the images may be amenable to automatic annotation by algorithms that can link the historical image with a modern contemporary, with its attendant metatags. In order to perform this linking we must have a suitable distance measure which appropriately combines the relevant features of shape, color, texture and text. However, the best combination of these features will vary from application to application and even from one manuscript to another. In this work we propose a simple technique to learn the distance measure by perturbing the training set in a principled way. We show the utility of our ideas on archives of manuscripts containing images from natural history and cultural artifacts.
LOCAL, SEMI-LOCAL AND GLOBAL MODELS FOR TEXTURE, OBJECT AND SCENE RECOGNITION
2000
Cited by 1 (0 self)
This dissertation addresses the problems of recognizing textures, objects, and scenes in photographs. We present approaches to these recognition tasks that combine salient local image features with spatial relations and effective discriminative learning techniques. First, we introduce a bag-of-features image model for recognizing textured surfaces under a wide range of transformations, including viewpoint changes and non-rigid deformations. We present results of a large-scale comparative evaluation indicating that bags of features can be effective not only for texture, but also for object categorization, even in the presence of substantial clutter and intra-class variation. We also show how to augment the purely local image representation with statistical co-occurrence relations between pairs of nearby features, and develop a learning and classification framework for the task of classifying individual features in a multi-texture image. Next, we present a more structured alternative to bags of features for object recognition, namely, an image representation based on semi-local parts, or groups of features characterized by stable appearance and geometric layout. Semi-local parts are automatically learned from small sets of unsegmented, cluttered images. Finally, we present a global method for recognizing scene categories that works by partitioning the image into increasingly fine sub-regions and computing histograms of local features found inside each sub-region. The resulting spatial pyramid representation demonstrates significantly improved performance on challenging scene categorization tasks.
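The last step described above, partitioning the image into increasingly fine sub-regions and histogramming the features in each, can be sketched as follows. This is a simplified illustration under stated assumptions: feature positions are normalized to [0, 1], each feature has already been quantized to a visual-word index, and no per-level weighting is applied. The function name `spatial_pyramid` and its parameters are hypothetical.

```python
import numpy as np

def spatial_pyramid(xy, words, vocab_size, levels=2):
    """Concatenate per-cell visual-word histograms over a grid pyramid.

    xy:     (n, 2) feature positions with coordinates in [0, 1]
    words:  (n,) integer visual-word index of each feature
    Level l splits the image into 2**l by 2**l cells.
    """
    xy = np.asarray(xy, dtype=float)
    words = np.asarray(words, dtype=int)
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level
        # Cell index along each axis; clip so coordinate 1.0 lands in the last cell
        cx = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        cy = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        # One flat bin per (cell row, cell column, visual word)
        flat = (cy * cells + cx) * vocab_size + words
        parts.append(np.bincount(flat, minlength=cells * cells * vocab_size))
    return np.concatenate(parts)

# Three features, a two-word vocabulary, and a two-level pyramid (levels 0 and 1)
xy = [[0.1, 0.1], [0.9, 0.9], [0.9, 0.1]]
words = [0, 1, 0]
h = spatial_pyramid(xy, words, vocab_size=2, levels=1)
# h has 2 bins for the whole image followed by 2 bins for each of the 4 quadrants
```

Level 0 reduces to an ordinary bag-of-features histogram, so the finer levels only add coarse layout information on top of it.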
Vision Based Gesture Recognition System With
Robots of the future should communicate with humans in a natural way. We are especially interested in vision-based gesture interaction. This paper describes our hand gesture recognition system that will be used in a mobile robot. It is based on our version 1.0 hand gesture recognition system. This paper focuses on the different methods we used in the new system, and we also compare the results of the two systems.
A Unified Robust Algorithm for Detection of Human and Non-human Object in Intelligent Safety Application
This paper presents a general trainable framework for fast and robust upright human face and non-human object detection and verification in static images. To enhance the performance of the detection process, the technique we develop is based on the combination of a fast neural network (FNN) and a classical neural network (CNN). In the FNN, a correlation between the input image and the weights of the hidden neurons is exploited to sustain a high level of detection accuracy. This enables the use of the Fourier transform, which significantly speeds up detection. The CNN is responsible for verifying the face region. A bootstrap algorithm is used to collect non-human objects, which adds the false detections to the training process of the human and non-human objects. Experimental results on test images with both simple and complex backgrounds demonstrate that the proposed method obtains a high detection rate and a low false positive rate in detecting both human faces and non-human objects. Keywords: Algorithm, detection of human and non-human object, FNN, CNN, Image training.
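The speed-up mentioned above rests on a general identity: sliding a weight window over an image is a cross-correlation, which can be computed with a product in the Fourier domain. The sketch below only demonstrates that identity with NumPy; it is not the paper's FNN, and the function names `correlate_direct` and `correlate_fft` as well as the random test data are assumptions.

```python
import numpy as np

def correlate_direct(image, weights):
    """Slide the weight window over the image and sum the products."""
    H, W = image.shape
    h, w = weights.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * weights)
    return out

def correlate_fft(image, weights):
    """Same result through the correlation theorem."""
    H, W = image.shape
    h, w = weights.shape
    F_image = np.fft.rfft2(image)
    # s=image.shape zero-pads the weights to the image size
    F_weights = np.fft.rfft2(weights, s=image.shape)
    # Multiplying by the complex conjugate gives circular cross-correlation
    full = np.fft.irfft2(F_image * np.conj(F_weights), s=image.shape)
    # Keep only the positions where the window fits without wrapping around
    return full[:H - h + 1, :W - w + 1]

rng = np.random.default_rng(1)
image = rng.random((16, 20))
weights = rng.random((5, 4))
direct = correlate_direct(image, weights)
fast = correlate_fft(image, weights)
```

The two results agree up to floating-point error. The Fourier route pays off when the window is large relative to the image, since its cost is dominated by the transforms and does not depend on the window size.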
Hierarchical Object Indexing and Sequential Learning
This work is about scene interpretation in the sense of detecting and localizing instances from multiple object classes. We concentrate on object indexing: generating an over-complete interpretation, i.e., a list with extra detections but none missed. Pruning such an index to a final interpretation involves a global, often intensive, contextual analysis. We propose a tree-structured hierarchy as a framework for indexing; each node represents a subset of interpretations. This unifies object representation, scene parsing, and sequential learning (modifying the hierarchy as new samples, poses and classes are encountered). We then specialize to learning: designing and refining a binary classifier at each node of the hierarchy dedicated to the corresponding subset of interpretations. The whole procedure is illustrated by experiments in reading license plates.

