Auto-context and its Application to High-level Vision Tasks
- In Proc. CVPR
Cited by 40 (1 self)
The notion of using context information for solving high-level vision and medical image segmentation problems has been increasingly realized in the field. However, how to learn an effective and efficient context model, together with an image appearance model, remains mostly unknown. The current literature using Markov Random Fields (MRFs) and Conditional Random Fields (CRFs) often involves specific algorithm design, in which the modeling and computing stages are studied in isolation. In this paper, we propose the auto-context algorithm. Given a set of training images and their corresponding label maps, we first learn a classifier on local image patches. The discriminative probability (or classification confidence) maps created by the learned classifier are then used as context information, in addition to the original image patches, to train a new classifier. The algorithm then iterates until convergence. Auto-context integrates low-level and context information by fusing a large number of low-level appearance features with context and implicit shape information. The resulting discriminative algorithm is general and easy to implement. Under nearly the same parameter settings in training, we apply the algorithm to three challenging vision applications: foreground/background segregation, human body configuration estimation, and scene region labeling. Moreover, context also plays a very important role in medical/brain images where the anatomical structures are mostly constrained to relatively fixed positions. With only some slight changes resulting from using 3D instead of 2D features, the auto-context algorithm applied to brain MRI image segmentation is shown to outperform state-of-the-art algorithms specifically designed for this domain. Furthermore, the scope of the proposed algorithm goes beyond image analysis and it has the potential to be used for a wide variety of problems in multi-variate labeling.
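The training loop described in this abstract can be sketched as follows. This is a minimal illustration under loud assumptions, not the authors' implementation: the "images" are 1D signals, the classifier is a plain logistic regression (the abstract does not name a classifier here), the number of rounds is fixed instead of testing for convergence, and every name (`features`, `auto_context_train`, `make_example`, the window offsets) is hypothetical.

```python
import math
import random

def predict(w, x):
    """Probability of label 1 under logistic-regression weights w (bias last)."""
    z = w[-1] + sum(wj * v for wj, v in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, epochs=200, lr=0.5):
    """Stand-in classifier (an assumption): batch gradient descent on log loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, t in zip(X, y):
            err = predict(w, x) - t
            for j, v in enumerate(x):
                grad[j] += err * v
            grad[-1] += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def window(values, i, offsets, pad):
    """Values at position i plus each offset; out-of-range positions get pad."""
    return [values[i + o] if 0 <= i + o < len(values) else pad for o in offsets]

def features(signal, prob, i):
    """Local appearance patch plus context read from the probability map."""
    appearance = window(signal, i, (-1, 0, 1), 0.0)
    context = window(prob, i, (-3, -2, -1, 1, 2, 3), 0.5)
    return appearance + context

def auto_context_train(signals, labels, rounds=3):
    """Train one classifier per round; each round sees the previous round's maps."""
    probs = [[0.5] * len(s) for s in signals]  # uninformative initial maps
    y = [t for lab in labels for t in lab]
    classifiers = []
    for _ in range(rounds):
        X = [features(s, p, i) for s, p in zip(signals, probs) for i in range(len(s))]
        w = train_logistic(X, y)
        classifiers.append(w)
        probs = [[predict(w, features(s, p, i)) for i in range(len(s))]
                 for s, p in zip(signals, probs)]
    return classifiers

def auto_context_apply(classifiers, signal):
    """Run the learned sequence of classifiers on a new signal."""
    prob = [0.5] * len(signal)
    for w in classifiers:
        prob = [predict(w, features(signal, prob, i)) for i in range(len(signal))]
    return prob

def make_example(n, rng):
    """Hypothetical toy data: a noisy step signal with one foreground segment."""
    start = rng.randrange(n // 4, n // 2)
    labels = [1 if start <= i < start + n // 3 else 0 for i in range(n)]
    signal = [t + rng.gauss(0.0, 0.4) for t in labels]
    return signal, labels

if __name__ == "__main__":
    rng = random.Random(0)
    data = [make_example(40, rng) for _ in range(4)]
    clfs = auto_context_train([s for s, _ in data], [l for _, l in data])
    test_signal, _ = make_example(40, rng)
    print([round(p, 2) for p in auto_context_apply(clfs, test_signal)])
```

The point of the sketch is the data flow: each round's features combine the unchanged appearance window with the previous round's probability map sampled at neighbouring positions, so later classifiers can exploit label context.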
Robust and Fast Collaborative Tracking with Two Stage Sparse Optimization
Cited by 4 (0 self)
Sparse representation has been widely used in many areas, including visual tracking. Tracking with sparse representation is formulated as searching for the samples with minimal reconstruction error from a learned template subspace. However, the computational cost makes it unsuitable for high-dimensional advanced features, which are often important for robust tracking in dynamic environments. Based on the observations that a target can be reconstructed from several templates, and that only some of the features have enough discriminative power to separate the target from the background, we propose a novel online tracking algorithm with two-stage sparse optimization to jointly minimize the target reconstruction error and maximize the discriminative power. As the target templates and discriminative features usually have temporal and spatial relationships, dynamic group sparsity (DGS) is utilized in our algorithm. The proposed method is compared with three state-of-the-art trackers on five challenging public sequences, which exhibit appearance changes, heavy occlusions, and pose variations. Our algorithm is shown to outperform these methods.
A Stochastic Graph Evolution Framework for Robust Multi-Target Tracking
Cited by 3 (0 self)
Maintaining the stability of tracks on multiple targets in video over extended time periods remains a challenging problem. A few methods which have recently shown encouraging results in this direction rely on learning context models or the availability of training data. However, this may not be feasible in many application scenarios. Moreover, tracking methods should be able to work across different scenarios (e.g. multiple resolutions of the video), making such context models hard to obtain. In this paper, we consider the problem of long-term tracking in video in application domains where context information is not available a priori, nor can it be learned online. We build our solution on the hypothesis that most existing trackers can obtain reasonable short-term tracks (tracklets). By analyzing the statistical properties of these tracklets, we develop associations between them so as to come up with longer tracks. This is achieved through a stochastic graph evolution step that considers the statistical properties of individual tracklets, as well as the statistics of the targets along each proposed long-term track. On multiple real-life video sequences spanning low and high resolution data, we show the ability to accurately track over extended time periods (results are shown on many minutes of continuous video).
A Survey of Recent Advances in Face detection
- 2010
Cited by 3 (0 self)
Face detection has been one of the most studied topics in the computer vision literature. In this technical report, we survey the advances in face detection over the past decade. The seminal Viola-Jones face detector is first reviewed. We then survey the various techniques according to how they extract features and what learning algorithms they adopt. It is our hope that by reviewing the many existing algorithms, we will see even better algorithms developed to solve this fundamental computer vision problem.
Learning Occlusion with Likelihoods for Visual Tracking
Cited by 1 (0 self)
We propose a novel algorithm to detect occlusion for visual tracking through learning with observation likelihoods. In our technique, the target is divided into regular grid cells and the state of occlusion is determined for each cell using a classifier. Each cell in the target is associated with many small patches, and the patch likelihoods observed during tracking form a feature vector, which is used for classification. Since occlusion is learned from patch likelihoods instead of the patches themselves, the classifier is applicable to any video or object for occlusion reasoning. Our occlusion detection algorithm achieves decent accuracy, which is sufficient to improve tracking performance significantly. The proposed algorithm can be combined with many generic tracking methods, and we adopt an L1 minimization tracker to test the performance of our framework. The advantage of our algorithm is supported by quantitative and qualitative evaluation, and successful tracking and occlusion reasoning results are illustrated on many challenging video sequences.
Context-aware Tracking of Small Targets in Video
Video-based tracking of small targets in a dense environment of clutter is very difficult, because the image resolution of the target is too low to provide reliable information for matching, and in turn the clutter generates a large number of false positive matches and distractions. Most traditional methods attempt to oppose the target to the environment, and thus struggle to handle the enormous number of distractions. In fact, a target is rarely isolated and independent of the environment, e.g., when persistent disturbances are present in the vicinity of the target. Therefore, there may exist some objects that exhibit short-term or even longer-term motion correlation with the target. They constitute very useful spatial contexts of the target. Thus, taking advantage of the contextual information in an efficient way can improve the robustness of target tracking, as the spatial contexts provide extra constraints in target matching and additional verification in data association. This paper presents a new approach to context-aware tracking of small targets, in which a set of motion-correlated auxiliary objects is automatically discovered on the fly. The image region of one such auxiliary object generates a specific spatial context of the target, and leads to an individual contextual constraint on the motion of the target. Under the small motion assumption on two consecutive frames, these individual contextual constraints have linear forms. The collection of all such individual contextual constraints gives a contextual system, based on which the target motion can be accurately estimated so that the association of the target over consecutive image frames can be reliably constructed. This new approach is computationally efficient. Extensive experiments on real test video sequences show the effectiveness and efficiency of the proposed approach.
Multi-target Tracking in Time-lapse Video Forensics
To help an officer efficiently review many hours of surveillance recordings, we develop a system of automated video analysis. We introduce a multi-target tracking algorithm that operates on recorded video. Apart from being robust to visual challenges (like partial and full occlusion, variation in illumination and camera view), our algorithm is also robust to temporal challenges, i.e., unknown variation in frame rate. The complication with variation in frame rate is that it invalidates motion estimation. As such, tracking algorithms that are based on motion models will show decreased performance. On the other hand, appearance-based tracking suffers from a plethora of false detections. Our tracking algorithm, albeit relying on appearance-based detection, deals robustly with the caveats of both approaches. The solution rests on the fact that we can make fully informed choices, based not only on preceding frames but also on following frames. It works as follows. We assume an object detection algorithm that is able to detect all target objects that are present in each frame. From this we build a graph structure. The detections form the graph's nodes. The edges are formed by connecting each detection in one frame to all detections in the following frame. Thus, each path through the graph represents one particular selection of successive object detections. Object tracking is then reformulated as a heuristic search for optimal paths, where an optimal path contains all detections belonging to a single object and excludes any other detection. We show that this approach, without an explicit motion model, is robust to both the visual and temporal challenges.
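The graph construction in this abstract can be illustrated with a small sketch. It rests on assumptions that are not in the abstract: each detection is reduced to a single appearance value, the edge cost is the absolute appearance difference, and the paper's heuristic search is replaced by plain dynamic programming that recovers only one minimum-cost path. The function name `best_path` is hypothetical.

```python
def best_path(frames):
    """Find one minimum-cost path through a frame-by-frame detection graph.

    frames: list of lists; frames[t][k] is an (assumed) scalar appearance
    descriptor of detection k in frame t. Every detection in one frame is
    connected to every detection in the next frame.
    Returns the index of the chosen detection in each frame.
    """
    cost = [0.0] * len(frames[0])   # best cost of a path ending at each node
    back = []                       # back-pointers, one list per transition
    for prev, cur in zip(frames, frames[1:]):
        new_cost, pointers = [], []
        for a in cur:
            # Cost of reaching this detection from each detection in prev.
            candidates = [cost[k] + abs(a - b) for k, b in enumerate(prev)]
            k_best = min(range(len(prev)), key=candidates.__getitem__)
            new_cost.append(candidates[k_best])
            pointers.append(k_best)
        cost = new_cost
        back.append(pointers)
    # Trace back from the cheapest end node.
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path

# Two objects whose detection order is swapped in the middle frame.
frames = [[0.10, 0.90], [0.88, 0.12], [0.13, 0.80]]
print(best_path(frames))  # [0, 1, 0]
```

Because the path is chosen from appearance alone and uses all frames at once, the result does not depend on the time gap between frames, which is the property the abstract relies on for varying frame rates. Extracting several objects would require repeating the search or a joint formulation, which this sketch does not attempt.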
MULTI-TARGET TRACKING USING LONG-TERM STOCHASTIC ASSOCIATIONS
Maintaining the stability of tracks on multiple targets in video over extended time periods remains a challenging problem. A few methods which have recently shown encouraging results in this direction rely on learning context models or the availability of training data. However, this may not be feasible in many application scenarios. Moreover, tracking methods should be able to work across multiple resolutions of the video. In this paper, we consider the problem of long-term tracking in video in application domains where context information is not available a priori, nor can it be learned online. We build our solution on the hypothesis that most existing trackers can obtain reasonable short-term tracks (tracklets). By analyzing the statistical properties of these tracklets, we develop associations between them so as to come up with longer tracks. On multiple real-life video sequences spanning low and high resolution data, we show the ability to accurately track over extended time periods. Index Terms: multi-target, long-term tracking, stochastic association
Visual Tracking and Illumination . . .
- 2009
Compressive sensing, or sparse representation, has played a fundamental role in many fields of science. It shows that signals and images can be reconstructed from far fewer measurements than what is usually considered to be necessary. Sparsity leads to efficient estimation, efficient compression, dimensionality reduction, and efficient modeling. Recently, there has been a growing interest in compressive sensing in computer vision, and it has been successfully applied to face recognition, background subtraction, object tracking and other problems. Sparsity can be achieved by solving the compressive sensing problem using ℓ1 minimization. In this dissertation, we present the results of a study of applying sparse representation to illumination recovery, object tracking, and simultaneous tracking and recognition. Illumination recovery, also known as inverse lighting, is the problem of recovering an illumination distribution in a scene from the appearance of objects located in the scene. It is used for Augmented Reality, where the virtual objects match the existing image and cast convincing shadows on the real scene rendered with the recovered illumination. Shadows in a scene are caused by the occlusion of incoming light, and thus contain information about the lighting of the scene. Although shadows have been used
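As an illustration of how ℓ1 minimization produces sparse solutions, the sketch below solves the penalized form min_x 0.5·||Ax − y||² + λ·||x||₁ with the iterative shrinkage-thresholding algorithm (ISTA). Both the penalized formulation and the choice of ISTA are assumptions made for this example; the dissertation's own solver is not stated in the abstract. The names `ista` and `soft_threshold` are hypothetical.

```python
import math

def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become exactly zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def ista(A, y, lam=0.1, iters=500):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 by ISTA.

    A is a list of rows, y a list of measurements. The step size uses the
    squared Frobenius norm of A, an upper bound on the Lipschitz constant
    of the gradient, so each iteration is a safe (if conservative) step.
    """
    m, n = len(A), len(A[0])
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - y and gradient g = A^T r of the quadratic term.
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by the l1 proximal (shrinkage) step.
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x

# With A = identity the minimizer is soft_threshold(y, lam) per coordinate:
# the large coefficient shrinks to about 0.9 and the small one is zeroed.
x = ista([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.05], lam=0.1)
print(x)
```

The shrinkage step is what sets small coefficients exactly to zero, which is why the ℓ1 penalty yields sparse representations where a squared penalty would only make them small.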
Information Fusion Measures of Effectiveness (MOE) for Decision Support
For decades, there have been discussions on measures of merit (MOM), which include measures of effectiveness (MOE) and measures of performance (MOP), for system-level performance. As the amount of sensed and collected data becomes increasingly large, there is a need to look at the architectures, metrics, and processes that provide the best methods for decision support systems. In this paper, we overview some information fusion methods in decision support and address the capability to measure the effects of the fusion products on user functions. The current standard information fusion model is the Data Fusion Information Group (DFIG) model, which specifically addresses the needs of the user in an information fusion system. Decision support implies that information methods augment user decision making, as opposed to the machine making the decision and displaying it to the user. We develop a list of suggested measures of merit that facilitate decision support: the Measures of Effectiveness (MOE) metrics of quality, information gain, and robustness, derived from analysis based on the measures of performance (MOPs) of timeliness, accuracy, confidence, throughput, and cost. We demonstrate, in an example with motion imagery, support for the MOEs of quality (time/decision confidence plots), information gain (completeness of annotated imagery for situation awareness), and robustness through analysis of imagery over time and repeated looks for enhanced target identification confidence.

