Results 1 - 5 of 5
Thinking Inside the Box: A Comprehensive Spatial Representation for Video Analysis
Abstract - Cited by 8 (6 self)
Successful analysis of video data requires an integration of techniques from KR, Computer Vision, and Machine Learning. Being able to detect and to track objects as well as extracting their changing spatial relations with other objects is one approach to describing and detecting events. Different kinds of spatial relations are important, including topology, direction, size, and distance between objects as well as changes of those relations over time. Typically these kinds of relations are treated separately, which makes it difficult to integrate all the extracted spatial information. We present a uniform and comprehensive spatial representation of moving objects that includes all the above spatial/temporal aspects, analyse different properties of this representation and demonstrate that it is suitable for video analysis.
Efficient Extraction and Representation of Spatial Information from Video Data
Abstract - Cited by 6 (4 self)
Vast amounts of video data are available on the web and are being generated daily using surveillance cameras or other sources. Being able to efficiently analyse and process this data is essential for a number of different applications. We want to be able to efficiently detect activities in these videos or be able to extract and store essential information contained in these videos for future use and easy search and access. Cohn et al. (2012) proposed a comprehensive representation of spatial features that can be efficiently extracted from video and used for these purposes. In this paper, we present a modified version of this approach that is equally efficient and allows us to extract spatial information with much higher accuracy than previously possible. We present efficient algorithms both for extracting and storing spatial information from video, as well as for processing this information in order to obtain useful spatial features. We evaluate our approach and demonstrate that the extracted spatial information is considerably more accurate than that obtained from existing approaches.
Efficient Extraction and Representation of Spatial Information from Video Data (Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence)
Abstract
Vast amounts of video data are available on the web and are being generated daily using surveillance cameras or other sources. Being able to efficiently analyse and process this data is essential for a number of different applications. We want to be able to efficiently detect activities in these videos or be able to extract and store essential information contained in these videos for future use and easy search and access. Cohn et al. (2012) proposed a comprehensive representation of spatial features that can be efficiently extracted from video and used for these purposes. In this paper, we present a modified version of this approach that is equally efficient and allows us to extract spatial information with much higher accuracy than previously possible. We present efficient algorithms both for extracting and storing spatial information from video, as well as for processing this information in order to obtain useful spatial features. We evaluate our approach and demonstrate that the extracted spatial information is considerably more accurate than that obtained from existing approaches.
REDVINE Version 0.5 Beta Documentation
Abstract
This documentation details the user manual and technical background of the REDVINE system, a novel approach where interactions in a video are represented using an activity graph. The activity graph embodies all the interactions, i.e., all the qualitative spatiotemporal relations between all pairs of interacting co-
Determining Interacting Objects in Human-Centric Activities via Qualitative Spatio-Temporal Reasoning
Abstract
Understanding the activities taking place in a video is a challenging problem in Artificial Intelligence. Complex video sequences contain many activities and involve a multitude of interacting objects. Determining which objects are relevant to a particular activity is the first step in understanding the activity. Indeed, many objects in the scene are irrelevant to the main activity taking place. In this work, we consider human-centric activities and look to identify which objects in the scene are involved in the activity. We take an activity-agnostic approach and rank every moving object in the scene by how likely it is to be involved in the activity. We use a comprehensive spatio-temporal representation that captures the joint movement between humans and each object. We then use supervised machine learning techniques to recognize relevant objects based on these features. Our approach is tested on the challenging Mind’s Eye dataset.