Recognizing human actions: A local SVM approach
In ICPR, 2004
Cited by 758 (20 self)
Local space-time features capture local events in video and can be adapted to the size, the frequency and the velocity of moving patterns. In this paper we demonstrate how such features can be used for recognizing complex motion patterns. We construct video representations in terms of local space-time features and integrate such representations with SVM classification schemes for recognition. For the purpose of evaluation we introduce a new video database containing 2391 sequences of six human actions performed by 25 people in four different scenarios. The presented results of action recognition justify the proposed method and demonstrate its advantage compared to other relative approaches for action recognition.
Object Tracking: A Survey
2006
Cited by 701 (7 self)
The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, nonrigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this survey, we categorize the tracking methods on the basis of the object and motion representations used, provide detailed descriptions of representative methods in each category, and examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.
Recent Developments in Human Motion Analysis
Cited by 264 (3 self)
Visual analysis of human motion is currently one of the most active research topics in computer vision. This strong interest is driven by a wide spectrum of promising applications in many areas such as virtual reality, smart surveillance, perceptual interface, etc. Human motion analysis concerns the detection, tracking and recognition of people, and more generally, the understanding of human behaviors, from image sequences involving humans. This paper provides a comprehensive survey of research on computer vision based human motion analysis. The emphasis is on three major issues involved in a general human motion analysis system, namely human detection, tracking and activity understanding. Various methods for each issue are discussed in order to examine the state of the art. Finally, some research challenges and future directions are discussed.
Head Pose Estimation in Computer Vision: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008
Cited by 195 (13 self)
The capacity to estimate the head pose of another person is a common human ability that presents a unique challenge for computer vision systems. Compared to face detection and recognition, which have been the primary foci of face-related vision research, identity-invariant head pose estimation has fewer rigorously evaluated systems or generic solutions. In this paper, we discuss the inherent difficulties in head pose estimation and present an organized survey describing the evolution of the field. Our discussion focuses on the advantages and disadvantages of each approach and spans 90 of the most innovative and characteristic papers that have been published on this topic. We compare these systems by focusing on their ability to estimate coarse and fine head pose, highlighting approaches that are well suited for unconstrained environments.
The humanID gait challenge problem: Data sets, performance, and analysis
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005
Cited by 179 (1 self)
Identification of people by analysis of gait patterns extracted from video has recently become a popular research problem. However, the conditions under which the problem is “solvable” are not understood or characterized. To provide a means for measuring progress and characterizing the properties of gait recognition, we introduce the HumanID Gait Challenge Problem. The challenge problem consists of a baseline algorithm, a set of 12 experiments, and a large data set. The baseline algorithm estimates silhouettes by background subtraction and performs recognition by temporal correlation of silhouettes. The 12 experiments are of increasing difficulty, as measured by the baseline algorithm, and examine the effects of five covariates on performance. The covariates are: change in viewing angle, change in shoe type, change in walking surface, carrying or not carrying a briefcase, and elapsed time between sequences being compared. Identification rates for the 12 experiments range from 78 percent on the easiest experiment to 3 percent on the hardest. All five covariates had statistically significant effects on performance, with walking surface and time difference having the greatest impact. The data set consists of 1,870 sequences from 122 subjects spanning five covariates (1.2 Gigabytes of data). The gait data, the source code of the baseline algorithm, and scripts to run, score, and analyze the challenge experiments are available at
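The two steps the abstract attributes to the baseline (silhouettes from background subtraction, then recognition by temporal correlation of silhouettes) can be illustrated with a toy sketch. This is not the distributed baseline code: the fixed difference threshold, the intersection-over-union frame similarity, and every function name below are assumptions chosen for illustration, and frames are plain 2D lists of grayscale values.

```python
# Toy sketch only; all names, the threshold, and the similarity measure are assumptions.

def silhouette(frame, background, threshold=30):
    """Binary mask: 1 where the frame differs from the background by more than threshold."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]

def frame_similarity(a, b):
    """Intersection over union of two binary masks (1.0 if both masks are empty)."""
    inter = union = 0
    for arow, brow in zip(a, b):
        for x, y in zip(arow, brow):
            inter += x & y
            union += x | y
    return inter / union if union else 1.0

def sequence_similarity(probe, gallery):
    """Best mean frame similarity over all temporal shifts of probe along gallery."""
    n = len(probe)
    best = 0.0
    for shift in range(len(gallery) - n + 1):
        score = sum(
            frame_similarity(p, gallery[shift + i]) for i, p in enumerate(probe)
        ) / n
        best = max(best, score)
    return best
```

In such a setup a probe sequence would be assigned the identity of the gallery sequence with the highest `sequence_similarity`; a real system would also need silhouette alignment and scale normalisation, which are omitted here.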
Data fusion for visual tracking with particles
Proc. IEEE, 2004
Cited by 166 (2 self)
The effectiveness of probabilistic tracking of objects in image sequences has been revolutionized by the development of particle filtering. Whereas Kalman filters are restricted to Gaussian distributions, particle filters can propagate more general distributions, albeit only approximately. This is of particular benefit in visual tracking because of the inherent ambiguity of the visual world that stems from its richness and complexity. One important advantage of the particle filtering framework is that it allows the information from different measurement sources to be fused in a principled manner. Although this fact has been acknowledged before, it has not been fully exploited within a visual tracking context. Here we introduce generic importance sampling mechanisms for data fusion and discuss them for fusing color with either stereo sound, for tele-conferencing, or with motion, for surveillance with a still camera. We show how each of the three cues can be modeled by an appropriate data likelihood function, and how the intermittent cues (sound or motion) are best handled by generating proposal distributions from their likelihood functions. Finally, the effective fusion of the cues by particle filtering is demonstrated on real tele-conference and surveillance data.
Index Terms: Visual tracking, data fusion, particle filters, sound, color, motion
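As a rough illustration of how a particle filter can fuse two measurement sources, the sketch below runs a plain bootstrap filter on a 1D position and weights each particle by the product of two cue likelihoods. It is not the paper's method: the paper builds proposal distributions from the intermittent cues, whereas this sketch proposes from a random-walk model and simply skips a missing cue. The Gaussian likelihoods, the noise parameters, and all names are assumptions.

```python
import math
import random

def gaussian(x, mean, std):
    """Unnormalised Gaussian likelihood of x around mean."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2)

def particle_filter_step(particles, colour_obs, motion_obs, rng,
                         process_std=1.0, colour_std=2.0, motion_std=1.0):
    """One predict / weight / resample step for a 1D position state."""
    # Predict: diffuse each particle with a random-walk motion model.
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]
    # Weight: fuse the cues by multiplying their likelihoods, which assumes the
    # cues are conditionally independent given the state. An intermittent cue
    # that is absent at this step (None) contributes nothing.
    weights = []
    for p in predicted:
        w = gaussian(p, colour_obs, colour_std)
        if motion_obs is not None:
            w *= gaussian(p, motion_obs, motion_std)
        weights.append(w)
    if sum(weights) == 0.0:
        return predicted  # degenerate weights: keep the predicted set as is
    # Resample in proportion to the fused weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))

def estimate(particles):
    """Posterior mean approximated by the particle average."""
    return sum(particles) / len(particles)
```

A typical use would initialise the particles broadly, for example uniformly over the image width, and call `particle_filter_step` once per frame with whichever observations are available.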
Social Signal Processing: Survey of an Emerging Domain
2008
Cited by 153 (32 self)
The ability to understand and manage social signals of a person we are communicating with is the core of social intelligence. Social intelligence is a facet of human intelligence that has been argued to be indispensable and perhaps the most important for success in life. This paper argues that next-generation computing needs to include the essence of social intelligence – the ability to recognize human social signals and social behaviours like turn taking, politeness, and disagreement – in order to become more effective and more efficient. Although each one of us understands the importance of social signals in everyday life situations, and in spite of recent advances in machine analysis of relevant behavioural cues like blinks, smiles, crossed arms, laughter, and similar, design and development of automated systems for Social Signal Processing (SSP) are rather difficult. This paper surveys the past efforts in solving these problems by a computer, it summarizes the relevant findings in social psychology, and it proposes a set of recommendations for enabling the development of the next generation of socially-aware computing.
Tracking multiple humans in complex situations
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004
Cited by 134 (3 self)
Tracking multiple humans in complex situations is challenging. The difficulties are tackled with appropriate knowledge in the form of various models in our approach. Human motion is decomposed into its global motion and limb motion. In the first part, we show how multiple human objects are segmented and their global motions are tracked in 3D using ellipsoid human shape models. Experiments show that it successfully applies to the cases where a small number of people move together, have occlusion, and cast shadow or reflection. In the second part, we estimate the modes (e.g., walking, running, standing) of the locomotion and 3D body postures by making inference in a prior locomotion model. Camera model and ground plane assumptions provide geometric constraints in both parts. Robust results are shown on some difficult sequences.
Index Terms: Multiple-human segmentation, multiple-human tracking, visual surveillance, human shape model, human locomotion model.
Multimodal human computer interaction: A survey
2005
Cited by 119 (3 self)
In this paper we review the major approaches to Multimodal Human Computer Interaction, giving an overview of the field from a computer vision perspective. In particular, we focus on body, gesture, gaze, and affective interaction (facial expression recognition and emotion in audio). We discuss user and task modeling, and multimodal fusion, highlighting challenges, open issues, and emerging applications for Multimodal Human Computer Interaction (MMHCI) research.