View-invariant modeling and recognition of human actions using grammars. ICCV’05 (2005)

by A S Ogale, A Karapurkar, Y Aloimonos

Results 1 - 10 of 13

A sensory grammar for inferring behaviors in sensor networks

by Dimitrios Lymberopoulos, Abhijit S. Ogale, Andreas Savvides, Yiannis Aloimonos - In Proceedings of Information Processing in Sensor Networks (IPSN), 2006
Cited by 30 (17 self)
The ability of a sensor network to parse out observable activities into a set of distinguishable actions is a powerful feature that can potentially enable many applications of sensor networks to everyday life situations. In this paper we introduce a framework that uses a hierarchy of Probabilistic Context Free Grammars (PCFGs) to perform such parsing. The power of the framework comes from the hierarchical organization of grammars that allows the use of simple local sensor measurements for reasoning about more macroscopic behaviors. Our presentation describes how to use a set of phonemes to construct grammars and how to achieve distributed operation using a messaging model. The proposed framework is flexible. It can be mapped to a network hierarchy or can be applied sequentially and across the network to infer behaviors as they unfold in space and time. We demonstrate this functionality by inferring simple motion patterns using a sequence of simple direction vectors obtained from our camera sensor network testbed.

Cross-View Action Recognition from Temporal Self-similarities

by Imran N. Junejo, Emilie Dexter, Ivan Laptev, Patrick Pérez (Inria Rennes, Bretagne Atlantique)
Cited by 24 (3 self)
This paper concerns recognition of human actions under view changes. We explore self-similarities of action sequences over time and observe the striking stability of such measures across views. Building upon this key observation, we develop an action descriptor that captures the structure of temporal similarities and dissimilarities within an action sequence. Despite this descriptor not being strictly view-invariant, we provide intuition and experimental validation demonstrating the high stability of self-similarities under view changes. Self-similarity descriptors are also shown to be stable under action variations within a class as well as discriminative for action recognition. Interestingly, self-similarities computed from different image features possess similar properties and can be used in a complementary fashion. Our method is simple and requires neither structure recovery nor multi-view correspondence estimation. Instead, it relies on weak geometric properties and combines them with machine learning for efficient cross-view action recognition. The method is validated on three public datasets. It has similar or superior performance compared to related methods, and it performs well even in extreme conditions such as recognizing actions from top views while using only side views for training.

A Lightweight Camera Sensor Network Operating on Symbolic Information

by Thiago Teixeira, Dimitrios Lymberopoulos, Eugenio Culurciello, Yiannis Aloimonos, Andreas Savvides
Cited by 14 (2 self)
This paper provides an overview of the research aspects of our DSC06 demonstration. We present a new camera sensor network for behavior recognition. Two new technologies are explored: biologically inspired address-event image sensors and sensory grammars. This paper explains how these two technologies are used together and reports on the current status of our prototyping effort. The application of the resulting system in assisted living is also described.

Macroscopic Human Behavior Interpretation Using Distributed Imager and Other Sensors

by Dimitrios Lymberopoulos, Thiago Teixeira, Andreas Savvides
Cited by 12 (10 self)
This paper presents BScope, a new system for interpreting human activity patterns using a sensor network. BScope provides a run-time, user-programmable framework that processes streams of timestamped sensor data along with prior context information to infer activities and generate appropriate notifications. The users of the system are able to describe human activities with high-level scripts that are directly mapped to hierarchical probabilistic grammars used to parse low-level sensor measurements into high-level distinguishable activities. Our approach is presented in, though not limited to, the context of an assisted living application in which a small, privacy-preserving camera sensor network of five nodes is used to monitor activity in the entire house over a period of 25 days. Privacy is preserved by the fact that camera sensors only provide discrete high-level features, such as motion information in the form of image locations, and not actual images. In this deployment, our primary sensing modality is a distributed array of image sensors with wide-angle lenses that observe people’s locations in the house during the course of the day. We demonstrate that our system can successfully generate summaries of everyday activities and trigger notifications at run-time by using more than 1.3 million location measurements acquired through our real home deployment.

View-Independent Action Recognition from Temporal Self-Similarities

by Imran N. Junejo, Emilie Dexter, Ivan Laptev, Patrick Pérez - SUBMITTED TO IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Cited by 11 (1 self)
This paper addresses recognition of human actions under view changes. We explore self-similarities of action sequences over time and observe the striking stability of such measures across views. Building upon this key observation, we develop an action descriptor that captures the structure of temporal similarities and dissimilarities within an action sequence. Despite this temporal self-similarity descriptor not being strictly view-invariant, we provide intuition and experimental validation demonstrating its high stability under view changes. Self-similarity descriptors are also shown to be stable under performance variations within a class of actions, when individual speed fluctuations are ignored. If required, such fluctuations between two different instances of the same action class can be explicitly recovered with dynamic time warping, as will be demonstrated, to achieve cross-view action synchronization. More central to the present work, temporal ordering of local self-similarity descriptors can simply be ignored within a bag-of-features type of approach. Sufficient action discrimination is still retained this way to build a view-independent action recognition system. Interestingly, self-similarities computed from different image features possess similar properties and can be used in a complementary fashion. Our method is simple and requires neither structure recovery nor multi-view correspondence estimation. Instead, it relies on weak geometric properties and combines them with machine learning for efficient cross-view action recognition. The method is validated on three public datasets. It has similar or superior performance compared to related methods, and it performs well even in extreme conditions such as recognizing actions from top views while using only side views for training.

Human Activity Recognition and Pattern Discovery

by Eunju Kim, Sumi Helal, Diane Cook
Cited by 6 (0 self)
Activity recognition is an important technology in pervasive computing because it can be applied to many real-life, human-centric problems such as eldercare and healthcare. Successful research has so far focused on recognizing simple human activities. Recognizing complex activities remains a challenging and active area of research. Specifically, the nature of human activities poses the following challenges:

Recognizing concurrent activities: People can do several activities at the same time [3]. For example, people can watch television while talking to their friends. These behaviors should be recognized using a different approach from that for sequential activity.

Recognizing interleaved activities: Certain real-life activities may be interleaved [3]. For instance, if a friend calls while a person is cooking, the person pauses cooking for a while, talks to the friend, and then comes back to the kitchen and continues to cook.

Ambiguity of interpretation: Depending on the situation, the interpretation of similar activities may be different. For example, an activity “open refrigerator” can belong to several activities, such as “cooking” or “cleaning”.

Multiple residents: In many environments more than one resident is present. The activities that are being performed by the residents in parallel need to be recognized, even if the activity is performed together by the residents in a group.

Human activity understanding encompasses activity recognition and activity pattern discovery. The first focuses on accurate detection of human activities based on a predefined activity model. Therefore, an activity recognition researcher builds a high-level conceptual model first, and then implements the model by building a suitable pervasive system. On the other hand, activity pattern discovery is more about finding unknown patterns directly from low-level sensor data without any predefined models or assumptions. Hence, the researcher of activity pattern discovery builds a pervasive system first and then analyzes the sensor data to discover activity patterns. Even though the two techniques are different, they both aim at improving human activity technology. Additionally, they are complementary to each other: the discovered activity pattern can be used to define the activities that will be recognized and tracked.

Extracting spatiotemporal human activity patterns in assisted living using a home sensor network

by Dimitrios Lymberopoulos, Athanasios Bamis, Andreas Savvides - In PETRA ’08: Proceedings of the 1st international conference on PErvasive Technologies Related to Assistive Environments, 2008
Cited by 5 (5 self)
This paper presents an automated methodology for extracting the spatiotemporal activity model of a person using a wireless sensor network deployed inside a home. The sensor network is modeled as a source of spatiotemporal symbols whose output is triggered by the monitored person’s motion over space and time. Using this stream of symbols, we formulate the problem of human activity modeling as a spatiotemporal pattern-matching problem on top of the sequence of symbolic information the sensor network produces and solve it using an exhaustive search algorithm. The effectiveness of the proposed methodology is demonstrated on a real 30-day dataset extracted from an ongoing deployment of a sensor network inside a home monitoring an elder. Our algorithm examines the person’s data over these 30 days and automatically extracts the person’s daily pattern.

Recovering the basic structure of human activities from noisy video-based symbol strings

by Kris M. Kitani, Yoichi Sato, Akihiro Sugimoto - International Journal of Pattern Recognition and Artificial Intelligence
Cited by 4 (1 self)
In recent years stochastic context-free grammars have been shown to be effective in modeling human activities because of the hierarchical structures they represent. However, most of the research in this area has yet to address the issue of learning the activity grammars from a noisy input source, namely, video. In this paper, we present a framework for identifying noise and recovering the basic activity grammar from a noisy symbol string produced by video. We identify the noise symbols by finding the set of non-noise symbols that optimally compresses the training data, where the optimality of compression is measured using an MDL criterion. We show the robustness of our system to noise and its effectiveness in learning the basic structure of human activity, through an experiment with real video from a local convenience store.

View and Style-Independent Action Manifolds for Human Activity Recognition

by Dimitrios Makris, Jean-Christophe Nebel
Cited by 2 (1 self)
We introduce a novel approach to automatically learn intuitive and compact descriptors of human body motions for activity recognition. Each action descriptor is produced, first, by applying Temporal Laplacian Eigenmaps to view-dependent videos in order to produce a style-invariant embedded manifold for each view separately. Then, all view-dependent manifolds are automatically combined to discover a unified representation that models an action in a single three-dimensional space independently of style and viewpoint. In addition, a bidirectional nonlinear mapping function is incorporated to allow projecting actions between original and embedded spaces. The proposed framework is evaluated on a real and challenging dataset (IXMAS), which is composed of a variety of actions seen from arbitrary viewpoints. Experimental results demonstrate robustness against style and view variation and match the most accurate action recognition method.

An Unsupervised Framework for Action Recognition Using Actemes

by Kaustubh Kulkarni, Edmond Boyer, Radu Horaud, Amit Kale (Rhone Alpes)
Cited by 1 (0 self)
In speech recognition, phonemes have demonstrated their efficacy in modeling the words of a language. While they are well defined for languages, their extension to human actions is not straightforward. In this paper, we study such an extension and propose an unsupervised framework to find phoneme-like units for actions, which we call actemes, using 3D data and without any prior assumptions. To this end, we build on a framework proposed earlier in the speech literature to automatically find actemes in the training data. We experimentally show that actions defined in terms of actemes and actions defined by whole units give similar recognition results. We define actions outside the training set in terms of these actemes to see whether the actemes generalize to unseen actions. The results show that although the acteme definitions of the actions are not always semantically meaningful, they yield optimal recognition accuracy and constitute a promising direction of research for action modeling.