A system for learning statistical motion patterns
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2006
Cited by 42 (0 self)
permission from the publisher. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. © 2006 IEEE. Copyright and all rights therein are retained by authors or by other copyright holders. All persons downloading this information are expected to adhere to the terms and constraints invoked by copyright. This document or any part thereof may not be reposted without the explicit permission of the copyright holder. Citation for this copy:
Semi-supervised adapted HMMs for unusual event detection
- IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR)
, 2005
Cited by 32 (1 self)
We address the problem of temporal unusual event detection. Unusual events are characterized by a number of features (rarity, unexpectedness, and relevance) that limit the application of traditional supervised model-based approaches. We propose a semi-supervised adapted Hidden Markov Model (HMM) framework, in which usual event models are first learned from a large amount of (commonly available) training data, while unusual event models are learned by Bayesian adaptation in an unsupervised manner. The proposed framework has an iterative structure, which adapts a new unusual event model at each iteration. We show that such a framework can address problems due to the scarcity of training data and the difficulty in pre-defining unusual events. Experiments on audio, visual, and audiovisual data streams illustrate its effectiveness, compared with both supervised and unsupervised baseline methods.
Machine recognition of human activities: A survey
, 2008
Cited by 31 (0 self)
The past decade has witnessed a rapid proliferation of video cameras in all walks of life and has resulted in a tremendous explosion of video content. Several applications such as content-based video annotation and retrieval, highlight extraction and video summarization require recognition of the activities occurring in the video. The analysis of human activities in videos is an area with increasingly important consequences from security and surveillance to entertainment and personal archiving. Several challenges at various levels of processing (robustness against errors in low-level processing, view and rate-invariant representations at midlevel processing, and semantic representation of human activities at higher level processing) make this problem hard to solve. In this review paper, we present a comprehensive survey of efforts in the past couple of decades to address the problems of representation, recognition, and learning of human activities from video and related applications. We discuss the problem at two major levels of complexity: 1) “actions” and 2) “activities.” “Actions” are characterized by simple motion patterns typically executed by a single human. “Activities” are more complex and involve coordinated actions among a small number of humans. We will discuss several approaches and classify them according to their ability to handle varying degrees of complexity as interpreted above. We begin with a discussion of approaches to model the simplest of action classes known as atomic or primitive actions that do not require sophisticated dynamical modeling. Then, methods to model actions with more complex dynamics are discussed. The discussion then leads naturally to methods for higher level representation of complex activities.
Aligned cluster analysis for temporal segmentation of human motion
- In AFGR
, 2008
Cited by 24 (4 self)
Temporal segmentation of human motion into actions is a crucial step for understanding and building computational models of human motion. Several issues contribute to the challenge of this task. These include the large variability in the temporal scale and periodicity of human actions, as well as the exponential nature of all possible movement combinations. We formulate the temporal segmentation problem as an extension of standard clustering algorithms. In particular, this paper proposes Aligned Cluster Analysis (ACA), a robust method to temporally segment streams of motion capture data into actions. ACA extends standard kernel k-means clustering in two ways: (1) the cluster means contain a variable number of features, and (2) a dynamic time warping (DTW) kernel is used to achieve temporal invariance. Experimental results, reported on synthetic data and the Carnegie Mellon Motion Capture database, demonstrate its effectiveness.
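The abstract above names dynamic time warping (DTW) as the source of temporal invariance. As a rough illustration only, the sketch below computes a plain DTW alignment cost between two 1-D sequences. It is not the DTW kernel or the ACA method from the paper; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Plain DTW alignment cost between two 1-D numeric sequences.

    Illustrative only: the local cost (absolute difference) is an
    assumption, and this is a distance, not the kernel used by ACA.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Two sequences that differ only in how long one value is held, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost, which is the sense in which DTW is insensitive to temporal scale.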
Detection and explanation of anomalous activities: representing activities as bags of event n-grams
- IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR)
, 2005
Cited by 21 (6 self)
We present a novel representation and method for detecting and explaining anomalous activities in a video stream. Drawing from natural language processing, we introduce a representation of activities as bags of event n-grams, where we analyze the global structural information of activities using their local event statistics. We demonstrate how maximal cliques in an undirected edge-weighted graph of activities can be used, in an unsupervised manner, to discover regular sub-classes of an activity class. Based on these discovered sub-classes, we formulate a definition of anomalous activities and present a way to detect them. Finally, we characterize each discovered sub-class in terms of its “most representative member,” and present an information-theoretic method to explain the detected anomalies in a human-interpretable form.
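To make the "bag of event n-grams" idea in the abstract above concrete, here is a minimal sketch that counts contiguous n-grams in a sequence of event labels. The function name `event_ngrams` and the string event labels are hypothetical. The clique-based sub-class discovery and the anomaly definition described in the abstract are not reproduced here.

```python
from collections import Counter


def event_ngrams(events, n):
    """Count contiguous length-n subsequences of an event sequence.

    Illustrative only: events are assumed to be hashable labels, and the
    result is an unordered bag (multiset) of n-grams.
    """
    # Each window of n consecutive events becomes one tuple key in the bag.
    return Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))
```

For example, `event_ngrams(["enter", "unload", "enter", "unload"], 2)` counts the bigram `("enter", "unload")` twice and `("unload", "enter")` once. A sequence shorter than `n` yields an empty bag.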
Unsupervised activity perception by hierarchical Bayesian models
- In CVPR. IEEE
, 2007
Cited by 18 (1 self)
We propose a novel unsupervised learning framework for activity perception. To understand activities in complicated scenes from visual data, we propose a hierarchical Bayesian model to connect three elements: low-level visual features, simple “atomic” activities, and multi-agent interactions. Atomic activities are modeled as distributions over low-level visual features, and interactions are modeled as distributions over atomic activities. Our models improve existing language models such as Latent Dirichlet Allocation (LDA) and Hierarchical Dirichlet Process (HDP) by modeling interactions without supervision. Our data sets are challenging video sequences from crowded traffic scenes with many kinds of activities co-occurring. Our approach provides a summary of typical atomic activities and interactions in the scene. Unusual activities and interactions are found, with natural probabilistic explanations. Our method supports flexible high-level queries on activities and interactions using atomic activities as components.
Weakly supervised discriminative localization and classification: a joint learning process
, 2009
Semi-latent Dirichlet allocation: A hierarchical model for human action recognition
- In 2nd Workshop on Human Motion Understanding, Modeling, Capture and Animation
, 2007
Cited by 15 (0 self)
We propose a new method for human action recognition from video sequences using latent topic models. Video sequences are represented by a novel “bag-of-words” representation, where each frame corresponds to a “word”. The major difference between our model and previous latent topic models for recognition problems in computer vision is that our model is trained in a “semi-supervised” way. Our model has several advantages over other similar models. First of all, the training is much easier due to the decoupling of the model parameters. Secondly, it naturally solves the problem of how to choose the appropriate number of latent topics. Thirdly, it achieves much better performance by utilizing the information provided by the class labels in the training set. We present action classification and irregularity detection results, and show improvement over previous methods.
Making a long video short: Dynamic video synopsis
- In CVPR’06
, 2006
Cited by 13 (5 self)
The power of video over still images is the ability to represent dynamic activities. But video browsing and retrieval are inconvenient due to inherent spatio-temporal redundancies, where some time intervals may have no activity, or have activities that occur in a small image region. Video synopsis aims to provide a compact video representation, while preserving the essential activities of the original video. We present dynamic video synopsis, where most of the activity in the video is condensed by simultaneously showing several actions, even when they originally occurred at different times. For example, we can create a “stroboscopic movie”, where multiple dynamic instances of a moving object are played simultaneously. This is an extension of the still stroboscopic picture. Previous approaches for video abstraction addressed mostly the temporal redundancy by selecting representative key-frames or time intervals. In dynamic video synopsis the activity is shifted into a significantly shorter period, in which the activity is much denser. Video examples can be found online in
Nonchronological video synopsis and indexing
- TPAMI
Cited by 13 (2 self)
The amount of captured video is growing with the increased numbers of video cameras, especially the increase of millions of surveillance cameras that operate 24 hours/day. Since video browsing and retrieval is time consuming, most captured video is never watched or examined. Video synopsis is an effective tool for browsing and indexing of such a video. It provides a short video representation, while preserving the essential activities of the original video. The activity in the video is condensed into a shorter period by simultaneously showing multiple activities, even when they originally occurred at different times. The synopsis video is also an index of the original video by pointing to the original time of each activity. Video synopsis can be applied to create a synopsis of endless video streams, as generated by webcams and by surveillance cameras. It can address queries like “Show in one minute the synopsis of this camera broadcast during the past day.” This process includes two major phases: 1) an online conversion of the endless video stream into a database of objects and activities (rather than frames) and 2) a response phase, generating the video synopsis as a response to the user’s query. Index Terms: Video summary, video indexing, video surveillance.

