W4: Real-time surveillance of people and their activities
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000
Cited by 341 (7 self)
W4 is a real-time visual surveillance system for detecting and tracking multiple people and monitoring their activities in an outdoor environment. It operates on monocular gray-scale video imagery, or on video imagery from an infrared camera. W4 employs a combination of shape analysis and tracking to locate people and their parts (head, hands, feet, torso) and to create models of people's appearance so that they can be tracked through interactions such as occlusions. It can determine whether a foreground region contains multiple people, segment the region into its constituent people, and track them. W4 can also determine whether people are carrying objects, segment those objects from their silhouettes, and construct appearance models for them so they can be identified in subsequent frames. W4 can recognize events between people and objects, such as depositing an object, exchanging bags, or removing an object. It runs at 25 Hz on 320x240 resolution images on a 400 MHz dual-Pentium II PC.
A System for Video Surveillance and Monitoring
2000
Cited by 132 (0 self)
Under the three-year Video Surveillance and Monitoring (VSAM) project (1997-1999), the Robotics Institute at Carnegie Mellon University (CMU) and the Sarnoff Corporation developed a system for autonomous Video Surveillance and Monitoring. The technical approach uses multiple, cooperative video sensors to provide continuous coverage of people and vehicles in a cluttered environment. This final report presents an overview of the system, and of the technical accomplishments that have been achieved.
© 2000 Carnegie Mellon University. This work was funded by the DARPA Image Understanding under contract DAAB07-97-C-J031, and by the Office of Naval Research under grant N00014-99-1-0646.
1 Introduction
The thrust of CMU research under the DARPA Video Surveillance and Monitoring (VSAM) project is cooperative multi-sensor surveillance to support battlefield awareness [17]. Under our VSAM Integrated Feasibility Demonstration (IFD) contract, we have developed automated video understandi...
Detecting Unusual Activity in Video
2004
Cited by 76 (0 self)
We present an unsupervised technique for detecting unusual activity in a large video set using many simple features. No complex activity models and no supervised feature selections are used. We divide the video into equal-length segments and classify the extracted features into prototypes, from which a prototype-segment co-occurrence matrix is computed. Motivated by a similar problem in document-keyword analysis, we seek a correspondence relationship between prototypes and video segments which satisfies the transitive closure constraint. We show that an important sub-family of correspondence functions can be reduced to co-embedding prototypes and segments to N-D Euclidean space. We prove that an efficient, globally optimal algorithm exists for the co-embedding problem. Experiments on various real-life videos have validated our approach.
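The co-occurrence matrix mentioned in this abstract can be illustrated with a minimal sketch. It assumes each frame's features have already been quantized to an integer prototype label; the function name, the rows-as-prototypes layout, and the choice to drop a trailing partial segment are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def cooccurrence_matrix(
    prototype_ids: Sequence[int],
    num_prototypes: int,
    segment_length: int,
) -> List[List[int]]:
    """Count how often each prototype occurs in each equal-length segment.

    prototype_ids holds one (assumed, pre-computed) prototype label per frame.
    The result has one row per prototype and one column per segment.
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    # Assumption: frames that do not fill a whole segment are ignored.
    num_segments = len(prototype_ids) // segment_length
    matrix = [[0] * num_segments for _ in range(num_prototypes)]
    for seg in range(num_segments):
        start = seg * segment_length
        for label in prototype_ids[start:start + segment_length]:
            matrix[label][seg] += 1
    return matrix


if __name__ == "__main__":
    labels = [0, 0, 1, 2, 2, 2, 1]
    for row in cooccurrence_matrix(labels, num_prototypes=3, segment_length=3):
        print(row)
```

Segments whose columns differ strongly from the rest are the natural candidates for "unusual" activity; the co-embedding step described in the abstract is not sketched here.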
Real-time Human Motion Analysis by Image Skeletonization
In Proceedings of IEEE WACV98, 1998
Cited by 75 (6 self)
In this paper, a process is described for analysing the motion of a human target in a video stream. Moving targets are detected and their boundaries extracted. From these, a "star" skeleton is produced. Two motion cues are determined from this skeletonization: body posture, and cyclic motion of skeleton segments. These cues are used to determine human activities such as walking or running, and even potentially, the target's gait. Unlike other methods, this does not require an a priori human model, or a large number of "pixels on target". Furthermore, it is computationally inexpensive, and thus ideal for real-world video applications such as outdoor video surveillance.
1. Introduction
Using video in machine understanding has recently become a significant research topic. One of the more active areas is activity understanding from video imagery [6]. Understanding activities involves being able to detect and classify targets of interest and analyze what they are doing. Human motion analy...
Extraction and Clustering of Motion Trajectories in Video
In International Conference on Pattern Recognition, 2004
Cited by 28 (0 self)
A system is described that tracks moving objects in a video dataset so as to extract a representation of the objects' 3D trajectories. The system then finds hierarchical clusters of similar trajectories in the video dataset. Objects' motion trajectories are extracted via an EKF formulation that provides each object's 3D trajectory up to a constant factor. To increase accuracy when occlusions occur, multiple tracking hypotheses are followed. For trajectory-based clustering and retrieval, a modified version of edit distance, called longest common subsequence, is employed. Similarities are computed between projections of trajectories on coordinate axes. Trajectories are grouped using an agglomerative clustering algorithm. To check the validity of the approach, experiments using real data were performed.
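As a rough illustration of the longest-common-subsequence similarity this abstract refers to, the sketch below compares trajectories axis by axis. The matching tolerance `eps`, the normalization by the shorter length, and the averaging of per-axis scores are assumptions made for the example; the abstract does not specify them.

```python
from typing import List, Sequence, Tuple


def lcss_length(a: Sequence[float], b: Sequence[float], eps: float) -> int:
    """Longest common subsequence length for two 1-D series.

    Two samples count as a match when they differ by at most eps.
    """
    rows, cols = len(a), len(b)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[rows][cols]


def lcss_similarity(a: Sequence[float], b: Sequence[float], eps: float) -> float:
    """LCSS length normalized to [0, 1] by the shorter series (an assumption)."""
    if not a or not b:
        return 0.0
    return lcss_length(a, b, eps) / min(len(a), len(b))


def trajectory_similarity(
    p: List[Tuple[float, ...]], q: List[Tuple[float, ...]], eps: float
) -> float:
    """Average the LCSS similarity of the projections onto each coordinate axis."""
    if not p or not q:
        return 0.0
    num_axes = len(p[0])
    scores = [
        lcss_similarity([pt[k] for pt in p], [pt[k] for pt in q], eps)
        for k in range(num_axes)
    ]
    return sum(scores) / num_axes


if __name__ == "__main__":
    walk_a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
    walk_b = [(0.1, 0.0), (2.1, 2.0), (3.0, 2.9)]
    print(trajectory_similarity(walk_a, walk_b, eps=0.25))
```

A distance such as `1 - trajectory_similarity(p, q, eps)` could then feed an agglomerative clustering routine.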
Video Arrays For Real-Time Tracking Of Person, Head, And Face In An Intelligent Room
2003
Cited by 27 (18 self)
Real-time three-dimensional tracking of people is an important requirement for a growing number of applications. In this paper we describe two trackers; both of them use a network of video cameras for person tracking. These trackers are called a rectilinear video array tracker (R-VAT) and an omnidirectional video array tracker (O-VAT), indicating the two different ways of video capture. The specific objectives of this paper are twofold: (i) to present a systematic comparison of these two trackers using an extensive series of experiments conducted in an 'intelligent' room; (ii) to develop a real-time system for tracking the head and face of a person, as an extension of the O-VAT approach. The comparative research indicates that O-VAT is more robust to the number of people, less complex and runs faster, needs manual camera calibration, and the integrated omnidirectional video network has better reconfigurability. The person head and face tracker study shows that such a system can serve as a most effective input stage for face recognition and facial expression analysis modules.
Collaborative Surveillance Using Both Fixed and Mobile Unattended Ground Sensor Platforms
1999
Cited by 17 (10 self)
We begin by considering current shortfalls with conventional surveillance systems and discuss the potential advantages of distributed, collaborative surveillance systems. Distributed surveillance systems offer the capability to monitor activity from multiple locations over time, thereby increasing the likelihood of obtaining the discriminating data necessary for interpreting the activity. Yet the multiplicity of sensors magnifies the volumes of data that must be processed. We present our vision of a system that generates timely interpretations of activities in the scene automatically, through mechanisms for collaboration among sensing systems and efficient perception methods that complement the sensing paradigm. Then we review our recent efforts toward achieving this goal and present initial results.
Background subtraction techniques
In Proc. of Image and Vision Computing, 2000
Cited by 14 (0 self)
Background subtraction is a commonly used class of techniques for segmenting out objects of interest in a scene for applications such as surveillance. This paper surveys a representative sample of the published techniques for background subtraction, and analyses them with respect to three important attributes: foreground detection; background maintenance; and postprocessing.
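The three attributes named in this abstract can be made concrete with a small sketch. It uses a thresholded difference for foreground detection, a selective running average for background maintenance, and removal of isolated pixels for postprocessing. These particular choices, along with the function names and the `threshold` and `alpha` parameters, are illustrative assumptions and are not claimed to be the techniques the survey covers.

```python
from typing import List

Image = List[List[float]]  # grayscale frame as rows of pixel intensities
Mask = List[List[bool]]    # True marks a foreground pixel


def detect_foreground(frame: Image, background: Image, threshold: float) -> Mask:
    """Foreground detection: flag pixels that differ strongly from the model."""
    return [
        [abs(p - b) > threshold for p, b in zip(f_row, b_row)]
        for f_row, b_row in zip(frame, background)
    ]


def update_background(frame: Image, background: Image, mask: Mask, alpha: float) -> Image:
    """Background maintenance: running average, skipping foreground pixels."""
    return [
        [b if m else (1.0 - alpha) * b + alpha * p
         for p, b, m in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(frame, background, mask)
    ]


def remove_isolated_pixels(mask: Mask) -> Mask:
    """Postprocessing: drop foreground pixels with no 4-connected foreground neighbour."""
    height = len(mask)
    width = len(mask[0]) if height else 0

    def has_neighbour(r: int, c: int) -> bool:
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < height and 0 <= cc < width and mask[rr][cc]:
                return True
        return False

    return [
        [mask[r][c] and has_neighbour(r, c) for c in range(width)]
        for r in range(height)
    ]


if __name__ == "__main__":
    background = [[10.0] * 3 for _ in range(3)]
    frame = [[200.0, 10.0, 10.0], [10.0, 200.0, 200.0], [10.0, 10.0, 20.0]]
    mask = remove_isolated_pixels(detect_foreground(frame, background, 30.0))
    background = update_background(frame, background, mask, alpha=0.5)
    print(mask)
    print(background)
```

Updating only the pixels classified as background keeps moving objects from being absorbed into the model too quickly, at the cost of slower recovery when an object stops for good.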
Local Application of Optic Flow to Analyse Rigid versus Non-Rigid Motion
In ICCV99 Workshop on Frame-Rate Applications, 1999
Cited by 11 (2 self)
Optic flow has been a research topic of interest for many years. It has, until recently, been largely inapplicable to real-time video applications due to its computationally expensive nature. This paper presents a new, reliable flow technique called dynamic region matching, based on the work of Anandan [1], Lucas and Kanade [10], and Okutomi and Kanade [11], which can be combined with a motion detection algorithm (from stationary or stabilised camera image streams) to allow flow-based analyses of moving entities in real-time. If flow vectors need only be calculated for "moving" pixels, then the computation time is greatly reduced, making it applicable to real-time implementation on modest computational platforms (such as standard Pentium II based PCs). Applying this flow technique to moving entities provides some straightforward primitives for analysing the motion of those objects. Specifically, in this paper, methods are presented for: analysing rigidity and cyclic motion using residual ...
Agent-based moving object correspondence using differential discriminative diagnosis
2000
Cited by 9 (4 self)
We propose a novel method for temporally and spatially corresponding moving objects by automatically learning the relevance of the objects' appearance features to the task of discrimination. Efficient correspondence is achieved by enforcing temporal consistency of the relevances for a particular object. Relevances are learned using a technique we have termed "differential discriminative diagnosis." An agent is assigned to each moving object in the scene. The agent possesses the basic capability to decide whether or not an object in the scene is the one it represents. Each agent customizes itself to the object by means of differential discriminative diagnosis as the object persists in the scene. We explain this correspondence scheme as applied to the task of corresponding moving people in a surveillance system.

