A Bayesian computer vision system for modeling human interactions
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2000
Cited by 260 (6 self)
We describe a real-time computer vision and machine learning system for modeling and recognizing human behaviors in a visual surveillance task [1]. The system is particularly concerned with detecting when interactions between people occur and classifying the type of interaction. Examples of interesting interaction behaviors include following another person, altering one's path to meet another, and so forth. Our system combines top-down with bottom-up information in a closed feedback loop, with both components employing a statistical Bayesian approach [2]. We propose and compare two different state-based learning architectures, namely, HMMs and CHMMs for modeling behaviors and interactions. The CHMM model is shown to work much more efficiently and accurately. Finally, to deal with the problem of limited training data, a synthetic "Alife-style" training system is used to develop flexible prior models for recognizing human interactions. We demonstrate the ability to use these a priori models to accurately classify real human behaviors and interactions with no additional tuning or training.
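As a rough illustration of state-based behavior classification of the kind the abstract above mentions, the sketch below scores a discrete observation sequence under several plain HMMs with the forward algorithm and picks the most likely model. It is not the paper's system: it uses ordinary HMMs rather than CHMMs, and the model names ("steady", "changing"), the two-symbol alphabet, and all probabilities are hypothetical values chosen only for the example.

```python
from typing import Dict, List, Sequence, Tuple

# (start probabilities, transition matrix, emission matrix); all values hypothetical
HMM = Tuple[List[float], List[List[float]], List[List[float]]]


def forward_likelihood(model: HMM, obs: Sequence[int]) -> float:
    """Return P(obs | model) for a discrete HMM using the forward algorithm."""
    start, trans, emit = model
    if not obs:
        return 1.0  # the empty sequence has probability 1 by convention
    n = len(start)
    # alpha[i] = P(observations so far, current state = i)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
    return sum(alpha)


def classify(models: Dict[str, HMM], obs: Sequence[int]) -> str:
    """Pick the name of the model under which the sequence is most likely."""
    return max(models, key=lambda name: forward_likelihood(models[name], obs))


# Two toy behavior models over a two-symbol alphabet {0, 1} (assumed for illustration).
models: Dict[str, HMM] = {
    "steady": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]]),
    "changing": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]]),
}

label = classify(models, [0, 0, 0, 0, 1])
```

For long sequences the raw products underflow, so a practical version would rescale alpha at each step or work in log space.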
Recognition of visual activities and interactions by stochastic parsing
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2000
Cited by 170 (5 self)
This paper describes a probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents. The fundamental idea is to divide the recognition problem into two levels. The lower level detections are performed using standard independent probabilistic event detectors to propose candidate detections of low-level features. The outputs of these detectors provide the input stream for a stochastic context-free grammar parsing mechanism. The grammar and parser provide longer range temporal constraints, disambiguate uncertain low-level detections, and allow the inclusion of a priori knowledge about the structure of temporal events in a given domain. To achieve such a system we: 1) provide techniques for generating a discrete symbol stream from continuous low-level detectors; 2) extend stochastic context-free parsing to handle uncertainty in the input symbol stream; 3) augment a run-time parsing algorithm to enforce intersymbol constraints such as requiring temporal consistency between primitives; and 4) extend the consistency filtering to maintain consistent multiobject interactions. We develop a real-time system and demonstrate the approach in several experiments on gesture recognition and in video surveillance. In the surveillance application, we show how the system correctly interprets activities of multiple, interacting objects.
A survey on visual surveillance of object motion and behaviors
- IEEE Transactions on Systems, Man and Cybernetics
, 2004
Cited by 123 (2 self)
Visual surveillance in dynamic scenes, especially for humans and vehicles, is currently one of the most active research topics in computer vision. It has a wide spectrum of promising applications, including access control in special areas, human identification at a distance, crowd flux statistics and congestion analysis, detection of anomalous behaviors, and interactive surveillance using multiple cameras, etc. In general, the processing framework of visual surveillance in dynamic scenes includes the following stages: modeling of environments, detection of motion, classification of moving objects, tracking, understanding and description of behaviors, human identification, and fusion of data from multiple cameras. We review recent developments and general strategies of all these stages. Finally, we analyze possible research directions, e.g., occlusion handling, a combination of two- and three-dimensional tracking, a combination of motion analysis and biometrics, anomaly detection and behavior prediction, content-based retrieval of surveillance videos, behavior understanding and natural language description, fusion of information from multiple sensors, and remote surveillance. Index Terms: Behavior understanding and description, fusion of data from multiple cameras, motion detection, personal identification, tracking, visual surveillance.
Recent Developments in Human Motion Analysis
Cited by 109 (1 self)
Visual analysis of human motion is currently one of the most active research topics in computer vision. This strong interest is driven by a wide spectrum of promising applications in many areas such as virtual reality, smart surveillance, perceptual interface, etc. Human motion analysis concerns the detection, tracking and recognition of people, and more generally, the understanding of human behaviors, from image sequences involving humans. This paper provides a comprehensive survey of research on computer vision based human motion analysis. The emphasis is on three major issues involved in a general human motion analysis system, namely human detection, tracking and activity understanding. Various methods for each issue are discussed in order to examine the state of the art. Finally, some research challenges and future directions are discussed.
Automatic Analysis of Multimodal Group Actions in Meetings
, 2003
Cited by 90 (26 self)
This paper investigates the recognition of group actions in meetings. A framework is employed in which group actions result from the interactions of the individual participants. The group actions are modelled using different HMM-based approaches, where the observations are provided by a set of audio-visual features monitoring the actions of individuals. Experiments demonstrate the importance of taking interactions into account in modelling the group actions. It is also shown that the visual modality contains useful information, even for predominantly audio-based events, motivating a multimodal approach to meeting analysis.
Detecting Unusual Activity in Video
, 2004
Cited by 76 (0 self)
We present an unsupervised technique for detecting unusual activity in a large video set using many simple features. No complex activity models and no supervised feature selections are used. We divide the video into equal length segments and classify the extracted features into prototypes, from which a prototype-segment co-occurrence matrix is computed. Motivated by a similar problem in document-keyword analysis, we seek a correspondence relationship between prototypes and video segments which satisfies the transitive closure constraint. We show that an important sub-family of correspondence functions can be reduced to co-embedding prototypes and segments to N-D Euclidean space. We prove that an efficient, globally optimal algorithm exists for the co-embedding problem. Experiments on various real-life videos have validated our approach.
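The co-occurrence matrix described in the abstract above can be sketched as follows. This is only a minimal illustration under stated assumptions: features are scalar values, each feature is assigned to its nearest prototype (the abstract does not say how the assignment is done), and the function name, prototype values, and sample data are hypothetical. The co-embedding step itself is not attempted here.

```python
from typing import List, Sequence


def cooccurrence_matrix(
    features: Sequence[float],
    prototypes: Sequence[float],
    segment_len: int,
) -> List[List[int]]:
    """Count how often each prototype occurs in each equal-length segment.

    Rows index prototypes, columns index segments. A trailing partial
    segment is dropped so that all segments have the same length.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    n_segments = len(features) // segment_len
    matrix = [[0] * n_segments for _ in prototypes]
    for s in range(n_segments):
        segment = features[s * segment_len:(s + 1) * segment_len]
        for x in segment:
            # Nearest-prototype assignment (an assumption of this sketch).
            nearest = min(range(len(prototypes)), key=lambda k: abs(x - prototypes[k]))
            matrix[nearest][s] += 1
    return matrix


# Hypothetical per-frame feature values and three prototypes.
features = [0.1, 0.2, 0.9, 1.1, 0.0, 5.2]
prototypes = [0.0, 1.0, 5.0]
matrix = cooccurrence_matrix(features, prototypes, segment_len=2)
```

In this toy data the third segment is the only one containing the third prototype, which is the kind of pattern an unusual-activity detector would look for in the matrix.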
Grounding the lexical semantics of verbs in visual perception using force dynamics and event logic
- Journal of Artificial Intelligence Research
, 2001
Cited by 75 (2 self)
This paper presents an implemented system for recognizing the occurrence of events described by simple spatial-motion verbs in short image sequences. The semantics of these verbs is specified with event-logic expressions that describe changes in the state of force-dynamic relations between the participants of the event. An efficient finite representation is introduced for the infinite sets of intervals that occur when describing liquid and semi-liquid events. Additionally, an efficient procedure using this representation is presented for inferring occurrences of compound events, described with event-logic expressions, from occurrences of primitive events. Using force dynamics and event logic to specify the lexical semantics of events allows the system to be more robust than prior systems based on motion profile.
Sensory-Motor Primitives as a Basis for Imitation: Linking Perception to Action and Biology to Robotics
- Imitation in Animals and Artifacts
, 2000
Cited by 72 (17 self)
Abstracting away from the specific coding of the spinal fields, the examples from neurobiology provide the framework for a motor control system based on a small number of additive primitives (or basis behaviors) sufficient for a rich output movement repertoire. Our previous work (Matarić 1995, Matarić 1997), inspired by the same biological results, has successfully applied the idea of basis behaviors to control of mobile robots by fitting it directly into the modular behavior-based control paradigm. Applications of schema theory (Arbib 1992) to behavior-based mobile robots (Arkin 1987) have employed a similar notion of composable behaviors, stemming from foundations in neuroscience (Arbib 1981, Arbib 1989). The idea of using such primitives for articulator control has been recently studied in robotics. Williamson (1996) and Marjanović, Scassellati & Williamson (1996) developed a 6 DOF (degrees of freedom) robot arm controller. While in the biological and mobile robotics work primitives c...
Automated derivation of primitives for movement classification
- In Proc. of First IEEE-RAS International Conference on Humanoid Robots
, 2000
Cited by 72 (8 self)
We present a new method for representing human movement compactly, in terms of a linear superimposition of simpler movements termed primitives. This method is a part of a larger research project aimed at modeling motor control and imitation using the notion of perceptuo-motor primitives, a basis set of coupled perceptual and motor routines. In our model, the perceptual system is biased by the set of motor behaviors the agent can execute, so it automatically classifies observed movements into its executable repertoire. In this paper, we describe a method for automatically deriving a set of primitives directly from human movement data. We used data from a psychophysical experiment on human imitation to derive a set of primitives, and then used those primitives as a basis for superposition and sequencing to reconstruct the original movements. We performed principal component analysis on segments from these data, resulting in a set of basis vectors. Next we clustered in the space of projections of segments onto the eigenvectors, to obtain a set of frequently used movements. To validate the approach experimentally, we used the movement obtained by expanding the cluster points in terms of the eigenvectors as a sequence of via points to control a humanoid dynamic simulation. We also developed an error metric to measure the effectiveness of the process.
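The principal component analysis step mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: it finds only the first principal component, uses power iteration on the sample covariance matrix as a stand-in for a full eigendecomposition, and omits the clustering step. The function names and the toy two-dimensional "segments" are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def first_principal_component(
    segments: Sequence[Sequence[float]], iterations: int = 200
) -> Tuple[List[float], List[float]]:
    """Return (mean, unit vector) for the first principal component.

    Uses power iteration on the sample covariance matrix. The uniform
    starting vector must not be orthogonal to the leading eigenvector.
    """
    n, d = len(segments), len(segments[0])
    if n < 2:
        raise ValueError("need at least two segments")
    mean = [sum(row[j] for row in segments) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in segments]
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero variance along the current direction
        v = [x / norm for x in w]
    return mean, v


def project(
    segment: Sequence[float], mean: Sequence[float], component: Sequence[float]
) -> float:
    """Coordinate of a segment along the component, after centering."""
    return sum((x - m) * c for x, m, c in zip(segment, mean, component))


# Toy movement segments that all lie along the direction (1, 2).
segments = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean, component = first_principal_component(segments)
coordinate = project(segments[-1], mean, component)
```

Projections such as `coordinate` are the values one would then cluster to find frequently used movements, as the abstract describes.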

