Results 1 - 9 of 9
Shifting Weights: Adapting Object Detectors from Image to Video
"... Typical object detectors trained on images perform poorly on video, as there is a clear distinction in domain between the two types of data. In this paper, we tackle the problem of adapting object detectors learned from images to work well on videos. We treat the problem as one of unsupervised domai ..."
Abstract - Cited by 12 (2 self)
Typical object detectors trained on images perform poorly on video, as there is a clear distinction in domain between the two types of data. In this paper, we tackle the problem of adapting object detectors learned from images to work well on videos. We treat the problem as one of unsupervised domain adaptation, in which we are given labeled data from the source domain (image), but only unlabeled data from the target domain (video). Our approach, self-paced domain adaptation, seeks to iteratively adapt the detector by re-training it with automatically discovered target domain examples, starting with the easiest first. At each iteration, the algorithm adapts by considering an increased number of target domain examples, and a decreased number of source domain examples. To discover target domain examples from the vast amount of video data, we introduce a simple, robust approach that scores trajectory tracks instead of bounding boxes. We also show how rich and expressive features specific to the target domain can be incorporated under the same framework. We show promising results on the 2011 TRECVID Multimedia Event Detection [1] and LabelMe Video [2] datasets that illustrate the benefit of our approach to adapt object detectors to video.
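The self-paced re-training loop described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a generic linear classifier (scikit-learn's LinearSVC) stands in for the object detector, confidence-based pseudo-labels stand in for the trajectory-track scoring, and `self_paced_adapt`, `add_frac`, `drop_frac`, and `n_iters` are illustrative names and settings.

```python
import numpy as np
from sklearn.svm import LinearSVC

def self_paced_adapt(X_src, y_src, X_tgt, n_iters=5, add_frac=0.1, drop_frac=0.1):
    """Iteratively adapt a source-trained classifier with 'easy' target examples.

    X_src, y_src : labeled source-domain features/labels (images).
    X_tgt        : unlabeled target-domain features (video examples).
    Each iteration pseudo-labels more of the most confidently scored target
    examples and drops a growing fraction of the source examples.
    """
    clf = LinearSVC().fit(X_src, y_src)
    for it in range(n_iters):
        # Score unlabeled target examples; keep the "easiest" (most confident).
        scores = clf.decision_function(X_tgt)
        k = int(add_frac * (it + 1) * len(X_tgt))
        easiest = np.argsort(-np.abs(scores))[:k]
        tgt_X = X_tgt[easiest]
        tgt_y = (scores[easiest] > 0).astype(int)

        # Shrink the source set as the pseudo-labeled target set grows.
        n_keep = max(int(len(X_src) * (1.0 - drop_frac * (it + 1))), 1)

        X_train = np.vstack([X_src[:n_keep], tgt_X])
        y_train = np.concatenate([y_src[:n_keep], tgt_y])
        clf = LinearSVC().fit(X_train, y_train)
    return clf
```

The key design point mirrored here is the curriculum: the pool of pseudo-labeled target examples grows while the source pool shrinks, so later iterations are dominated by the target domain.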
Realtime multilevel crowd tracking using reciprocal velocity obstacles
- In: Proceedings of the International Conference on Pattern Recognition, Sweden
, 2014
"... We present a novel, realtime algorithm to compute the trajectory of each pedestrian in moderately dense crowd scenes. Our formulation is based on an adaptive particle filtering scheme that uses a multi-agent motion model based on velocity-obstacles, and takes into account local interactions as well ..."
Abstract - Cited by 2 (1 self)
We present a novel, realtime algorithm to compute the trajectory of each pedestrian in moderately dense crowd scenes. Our formulation is based on an adaptive particle filtering scheme that uses a multi-agent motion model based on velocity obstacles, and takes into account local interactions as well as physical and personal constraints of each pedestrian. Our method dynamically changes the number of particles allocated to each pedestrian based on different confidence metrics. Additionally, we present a new high-definition crowd video dataset, which is used to evaluate the performance of different pedestrian tracking algorithms. This dataset consists of videos of indoor and outdoor scenes, recorded at different locations with 30-80 pedestrians. We highlight the performance benefits of our algorithm over prior techniques using this dataset. In practice, our algorithm can compute trajectories of tens of pedestrians on a multi-core desktop CPU at interactive rates (27-30 frames per second). To the best of our knowledge, our approach is 4-5 times faster than prior methods that provide similar accuracy.
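A minimal sketch of the adaptive particle-allocation idea follows; it is not the paper's method. A constant-velocity step stands in for the reciprocal-velocity-obstacle motion model, and `allocate_particles`, `total_budget`, `p_min`, and the confidence values are hypothetical names and numbers.

```python
import numpy as np

def allocate_particles(confidences, total_budget=2000, p_min=20):
    """Distribute a particle budget across pedestrians: low-confidence
    (hard-to-track) pedestrians receive proportionally more particles."""
    difficulty = 1.0 - np.asarray(confidences, dtype=float)
    weights = difficulty / max(difficulty.sum(), 1e-9)
    # Totals may slightly exceed the budget because of the per-pedestrian floor.
    return np.maximum(p_min, (weights * total_budget).astype(int))

def propagate(particles, dt=1.0 / 30, noise=0.05):
    """Constant-velocity placeholder for the multi-agent motion model.
    Each particle row is [x, y, vx, vy]."""
    particles = particles.copy()
    particles[:, :2] += particles[:, 2:] * dt
    particles[:, 2:] += np.random.normal(0, noise, particles[:, 2:].shape)
    return particles

# Example: three pedestrians with different tracking confidences.
counts = allocate_particles([0.9, 0.5, 0.2])
tracks = [propagate(np.random.randn(c, 4)) for c in counts]
print(counts)  # the hardest pedestrian receives the largest share
```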
Instance Label Prediction by Dirichlet Process Multiple Instance Learning
"... We propose a generative Bayesian model that predicts instance labels from weak (bag-level) supervision. We solve this problem by simulta-neously modeling class distributions by Gaussian mixture models and inferring the class labels of positive bag instances that satisfy the multiple in-stance constr ..."
Abstract - Cited by 2 (0 self)
We propose a generative Bayesian model that predicts instance labels from weak (bag-level) supervision. We solve this problem by simultaneously modeling class distributions by Gaussian mixture models and inferring the class labels of positive bag instances that satisfy the multiple instance constraints. We employ Dirichlet process priors on mixture weights to automate model selection, and efficiently infer model parameters and positive bag instances by a constrained variational Bayes procedure. Our method improves on the state-of-the-art of instance classification from weak supervision on 20 benchmark text categorization data sets and one histopathology cancer diagnosis data set.
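The alternation between class-conditional density estimation and instance relabeling under the multiple instance constraints can be sketched as follows. This is a simplified stand-in, not the paper's model: fixed-size Gaussian mixtures fit by EM (scikit-learn's GaussianMixture) replace the Dirichlet process priors and constrained variational Bayes, and `mil_instance_labels`, `n_iters`, and `k` are hypothetical names and settings.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def mil_instance_labels(bags, bag_labels, n_iters=10, k=2):
    """Alternate between fitting per-class mixture models and relabeling
    instances in positive bags, enforcing the MIL constraints: every positive
    bag keeps at least one positive instance; negative bags stay all-negative."""
    X = np.vstack(bags)
    # Initialization: every instance inherits its bag's label.
    labels = np.concatenate([np.full(len(b), l) for b, l in zip(bags, bag_labels)])
    offsets = np.cumsum([0] + [len(b) for b in bags])

    for _ in range(n_iters):
        gmm_pos = GaussianMixture(k).fit(X[labels == 1])
        gmm_neg = GaussianMixture(k).fit(X[labels == 0])
        # Log-likelihood ratio of positive vs. negative class densities.
        score = gmm_pos.score_samples(X) - gmm_neg.score_samples(X)

        for i, l in enumerate(bag_labels):
            sl = slice(offsets[i], offsets[i + 1])
            if l == 0:
                labels[sl] = 0                    # negative bags stay negative
            else:
                new = (score[sl] > 0).astype(int)
                if new.sum() == 0:                # keep at least one positive
                    new[np.argmax(score[sl])] = 1
                labels[sl] = new
    return labels
```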
Self-learning camera: Autonomous adaptation of object detectors to unlabeled video streams
- In ECCV
, 2014
"... Learning object detectors requires massive amounts of labeled training samples from the specific data source of interest. This is impractical when dealing with many different sources (e.g., in camera networks), or constantly changing ones such as mobile cameras (e.g., in robotics or driving assistan ..."
Abstract - Cited by 1 (0 self)
Learning object detectors requires massive amounts of labeled training samples from the specific data source of interest. This is impractical when dealing with many different sources (e.g., in camera networks), or constantly changing ones such as mobile cameras (e.g., in robotics or driving assistant systems). In this paper, we address the problem of self-learning detectors in an autonomous manner, i.e. (i) detectors continuously updating themselves to efficiently adapt to streaming data sources (contrary to transductive algorithms), (ii) without any labeled data strongly related to the target data stream (contrary to self-paced learning), and (iii) without manual intervention to set and update hyper-parameters. To that end, we propose an unsupervised, on-line, and self-tuning learning algorithm to optimize a multi-task learning convex objective. Our method uses confident but laconic oracles (high-precision but low-recall off-the-shelf generic detectors), and exploits the structure of the problem to jointly learn on-line an ensemble of instance-level trackers, from which we derive an adapted category-level object detector. Our approach is validated on real-world publicly available video object datasets.
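The idea of harvesting training samples from a confident but laconic oracle can be illustrated with a rough sketch. It is not the paper's multi-task objective or tracker ensemble: an incrementally updated linear model (scikit-learn's SGDClassifier) stands in for the adapted detector, and `SelfLearningDetector`, `conf_thresh`, and the synthetic feature stream are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

class SelfLearningDetector:
    """Sketch of a detector that adapts online to an unlabeled stream: windows
    the oracle scores very confidently become positives, low-scoring windows
    become negatives, and a linear model is updated incrementally."""

    def __init__(self):
        self.clf = SGDClassifier(loss="log_loss")
        self.classes = np.array([0.0, 1.0])

    def update(self, window_feats, oracle_scores, conf_thresh=0.9):
        pos = window_feats[oracle_scores >= conf_thresh]   # high-precision hits
        neg = window_feats[oracle_scores < 0.1]            # confident background
        if len(pos) == 0 or len(neg) == 0:
            return  # nothing confident enough in this frame
        X = np.vstack([pos, neg])
        y = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))])
        self.clf.partial_fit(X, y, classes=self.classes)

    def detect(self, window_feats):
        return self.clf.predict_proba(window_feats)[:, 1]

# Usage on a synthetic stream of candidate-window features and oracle scores:
det = SelfLearningDetector()
for _ in range(100):
    feats = np.random.randn(50, 64)
    oracle = np.random.rand(50)        # placeholder oracle confidences
    det.update(feats, oracle)
```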
Real-time Crowd Tracking using Parameter Optimized Mixture of Motion Models
, 2014
"... We present a novel, real-time algorithm to track the trajectory of each pedestrian in moderately dense crowded scenes. Our formulation is based on an adaptive particle-filtering scheme that uses a comb-nation of various multi-agent heterogeneous pedestrian simulation models. We automatically comput ..."
Abstract
We present a novel, real-time algorithm to track the trajectory of each pedestrian in moderately dense crowded scenes. Our formulation is based on an adaptive particle-filtering scheme that uses a combination of various multi-agent heterogeneous pedestrian simulation models. We automatically compute the optimal parameters for each of these different models based on prior tracked data and use the best model as motion prior for our particle-filter based tracking algorithm. We also use our “mixture of motion models” for adaptive particle selection and accelerate the performance of the online tracking algorithm. The motion model parameter estimation is formulated as an optimization problem, and we use an approach that solves this combinatorial optimization problem in a model-independent manner and is hence scalable to any multi-agent pedestrian motion model. We evaluate the performance of our approach on different crowd video datasets and highlight the improvement in accuracy over homogeneous motion models and a baseline mean-shift based tracker. In practice, our formulation can compute trajectories of tens of pedestrians on a multi-core desktop CPU in real time and offers higher accuracy compared to prior real-time pedestrian tracking algorithms.
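The per-model parameter optimization and best-model selection can be sketched as a simple search over candidate motion models, scored by how well their one-step predictions match previously tracked states. This is a stand-in, not the paper's combinatorial optimization: the two candidate models, their parameter grids, and `select_motion_prior` are hypothetical.

```python
import numpy as np

def constant_velocity(state, dt, speed_scale):
    """Candidate motion model: scaled constant velocity on [x, y, vx, vy]."""
    x, y, vx, vy = state
    return np.array([x + speed_scale * vx * dt, y + speed_scale * vy * dt, vx, vy])

def random_walk(state, dt, _sigma):
    """Candidate motion model: random walk (prediction = current state)."""
    return state.copy()

MODELS = {
    "const_vel": (constant_velocity, [0.5, 1.0, 1.5]),
    "random_walk": (random_walk, [0.1, 0.5, 1.0]),
}

def select_motion_prior(track, dt=1.0 / 30):
    """Pick the (model, parameter) pair whose one-step predictions best match
    the previously tracked states; the winner becomes the motion prior."""
    best = (None, None, np.inf)
    for name, (fn, params) in MODELS.items():
        for p in params:
            err = sum(np.linalg.norm(fn(track[t], dt, p)[:2] - track[t + 1][:2])
                      for t in range(len(track) - 1))
            if err < best[2]:
                best = (name, p, err)
    return best

# Example with a short synthetic track of [x, y, vx, vy] states:
track = [np.array([t / 30.0, 0.0, 1.0, 0.0]) for t in range(10)]
print(select_motion_prior(track))   # expected: constant velocity, scale 1.0
```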
Multiple instance learning with response-optimized random forests
"... Abstract—We introduce a multiple instance learning algorithm based on randomized decision trees. Our model extends an existing algorithm by Blockeel et al. [2] in several ways: 1) We learn a random forest instead of a single tree. 2) We construct the trees by splits based on non-linear boundaries on ..."
Abstract
We introduce a multiple instance learning algorithm based on randomized decision trees. Our model extends an existing algorithm by Blockeel et al. [2] in several ways: 1) We learn a random forest instead of a single tree. 2) We construct the trees by splits based on non-linear boundaries on multiple features at a time. 3) We learn an optimal way of combining the decisions of multiple trees under the multiple instance constraints (i.e. positive bags have at least one positive instance, negative bags have only negative instances). Experiments on the typical benchmark data sets show that this model’s prediction performance is clearly better than earlier tree-based methods, and is comparable to the global state-of-the-art.
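The bag-level use of a forest under the multiple instance constraints can be sketched as below. A plain RandomForestClassifier trained on instances that inherit their bag label stands in for the response-optimized forest, and bags are scored by the maximum instance probability (a bag is positive iff at least one instance is); `train_mil_forest` and `predict_bag` are illustrative names, not the paper's API.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_mil_forest(bags, bag_labels, **rf_kwargs):
    """Train a forest on instances that inherit their bag's label --
    a common MIL baseline, not the response-optimized combination."""
    X = np.vstack(bags)
    y = np.concatenate([np.full(len(b), l) for b, l in zip(bags, bag_labels)])
    return RandomForestClassifier(**rf_kwargs).fit(X, y)

def predict_bag(forest, bag):
    """MIL aggregation: the bag score is the max instance probability,
    since a positive bag needs only one positive instance."""
    inst_prob = forest.predict_proba(bag)[:, 1]
    return inst_prob.max()

# Toy example: two bags of 5 instances each in a 10-D feature space.
rng = np.random.default_rng(0)
bags = [rng.normal(0, 1, (5, 10)), rng.normal(2, 1, (5, 10))]
labels = [0, 1]
forest = train_mil_forest(bags, labels, n_estimators=50)
print([round(predict_bag(forest, b), 2) for b in bags])
```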
Acknowledgments
, 2013
"... I would like to express the deepest appreciation to my supervisor, Dr. Mohan Sridharan, whose expertise, understanding, and patience, added considerably to my graduate experience. Mohan was a great source of advice and encouragement, and gave me a lot of freedom in choosing my research focus. Withou ..."
Abstract
I would like to express the deepest appreciation to my supervisor, Dr. Mohan Sridharan, whose expertise, understanding, and patience added considerably to my graduate experience. Mohan was a great source of advice and encouragement, and gave me a lot of freedom in choosing my research focus. Without his guidance and persistent help, this dissertation would not have been possible. I would also like to thank my friends in the Stochastic Estimation and Autonomous Robotics Lab, particularly Shiqi Zhang, for our debates, exchanges of knowledge and skills, and venting of frustration during my graduate program, which helped enrich the experience. I must also acknowledge my committee members, Dr. J. Nelson Rushton, Dr. Hamed Sari-Sarraf, and Dr. Peter Stone for their help and advice. I would also like to thank my family for the support they provided me through my entire life, and in particular, I must acknowledge my wife, Yue, without whose love, encouragement and editing assistance, I would not have finished this dissertation.
iCub Facility
"... We introduce an online action recognition system that can be combined with any set of frame-by-frame feature descriptors. Our system covers the frame feature space with classifiers whose distribution adapts to the hardness of locally approximating the Bayes optimal classifier. An efficient nearest n ..."
Abstract
We introduce an online action recognition system that can be combined with any set of frame-by-frame feature descriptors. Our system covers the frame feature space with classifiers whose distribution adapts to the hardness of locally approximating the Bayes optimal classifier. An efficient nearest neighbour search is used to find and combine the local classifiers that are closest to the frames of a new video to be classified. The advantages of our approach are: incremental training, frame-by-frame real-time prediction, nonparametric predictive modelling, video segmentation for continuous action recognition, no need to trim videos to equal lengths, and only one tuning parameter (which, for large datasets, can be safely set to the diameter of the feature space). Experiments on standard benchmarks show that our system is competitive with state-of-the-art non-incremental and incremental baselines.
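The frame-level nearest-neighbour prediction scheme can be illustrated with a simplified sketch: instead of an adaptive cover of local classifiers, labelled frames are stored incrementally and a new video is classified by distance-weighted votes of each frame's nearest neighbours. `OnlineFrameRecognizer`, `k`, and the synthetic descriptors are assumptions, not the paper's system.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

class OnlineFrameRecognizer:
    """Simplified stand-in for the local-classifier cover: frames of labelled
    videos are appended incrementally; a new video is labelled by combining
    distance-weighted nearest-neighbour votes over its frames."""

    def __init__(self, k=5):
        self.k = k
        self.feats, self.labels = [], []

    def add_video(self, frame_feats, action_label):
        # Incremental training: just store the frames of the labelled video.
        self.feats.append(frame_feats)
        self.labels.append(np.full(len(frame_feats), action_label))

    def predict(self, frame_feats):
        X = np.vstack(self.feats)
        y = np.concatenate(self.labels)
        nn = NearestNeighbors(n_neighbors=min(self.k, len(X))).fit(X)
        dist, idx = nn.kneighbors(frame_feats)
        votes = {}
        for d_row, i_row in zip(dist, idx):
            weights = 1.0 / (d_row + 1e-9)
            for w, i in zip(weights, i_row):
                votes[y[i]] = votes.get(y[i], 0.0) + w
        return max(votes, key=votes.get)

# Usage with synthetic per-frame descriptors for two actions:
rec = OnlineFrameRecognizer()
rec.add_video(np.random.randn(30, 16) + 0, action_label=0)
rec.add_video(np.random.randn(30, 16) + 3, action_label=1)
print(rec.predict(np.random.randn(10, 16) + 3))   # expected: 1
```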
TRACKING ARTICULATED HUMAN MOVEMENTS WITH A COMPONENT BASED APPROACH TO BOOSTED MULTIPLE INSTANCE LEARNING
"... ABSTRACT Our work is about a new class of object trackers that are based on a boosted Multiple Instance Learning (MIL) algorithm to track an object in a video sequence. We show how the scope of such trackers can be expanded to the tracking of articulated movements by humans that frequently result i ..."
Abstract
We present a new class of object trackers based on a boosted Multiple Instance Learning (MIL) algorithm for tracking an object in a video sequence. We show how the scope of such trackers can be expanded to the tracking of articulated movements by humans that frequently result in large frame-to-frame variations in the appearance of what needs to be tracked. To deal with the problems caused by such variations, our paper presents a component-based version of the boosted MIL algorithm. Components are the output of an image segmentation algorithm applied to the pixels in the bounding box encapsulating the object to be tracked. The components give the boosted MIL the additional degrees of freedom that it needs in order to deal with the large frame-to-frame variations associated with articulated movements.
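The component-based MIL update can be roughly sketched as follows. This is not the paper's tracker: a fixed grid split of the bounding box stands in for the image segmentation into components, scikit-learn's AdaBoostClassifier stands in for the boosted MIL learner, and `split_into_components`, `update_tracker`, and all sizes are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def split_into_components(patch, grid=2):
    """Placeholder for the segmentation step: split the bounding-box patch
    into a grid of components and describe each by simple intensity stats."""
    h, w = patch.shape
    comps = []
    for i in range(grid):
        for j in range(grid):
            cell = patch[i*h//grid:(i+1)*h//grid, j*w//grid:(j+1)*w//grid]
            comps.append([cell.mean(), cell.std()])
    return np.array(comps).ravel()

def update_tracker(clf, frame, center, box=24, radius=8, n_neg=20):
    """MIL-style update: patches near the current estimate form the positive
    bag, patches far away are negatives; AdaBoost stands in for boosted MIL."""
    H, W = frame.shape

    def crop(c):
        y, x = int(c[0]), int(c[1])
        return frame[max(y - box // 2, 0):y + box // 2,
                     max(x - box // 2, 0):x + box // 2]

    pos = [split_into_components(crop(center + np.random.uniform(-radius, radius, 2)))
           for _ in range(10)]
    neg = [split_into_components(crop(np.random.uniform(box, [H - box, W - box])))
           for _ in range(n_neg)]
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))])
    clf.fit(X, y)
    return clf

# One update step on a synthetic frame with a known object center:
frame = np.random.rand(240, 320)
clf = update_tracker(AdaBoostClassifier(n_estimators=20), frame,
                     center=np.array([120.0, 160.0]))
```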