Results 1 - 10 of 32
Understanding video events: A survey of methods for automatic interpretation of semantic occurrences in videos
- TSMC
"... Abstract: Understanding Video Events, the translation of low-level content in video sequences into highlevel semantic concepts, is a research topic that has received much interest in recent years. Important applications of this work include smart surveillance systems, semantic video database indexin ..."
Abstract
-
Cited by 51 (0 self)
- Add to MetaCart
(Show Context)
Abstract: Understanding Video Events, the translation of low-level content in video sequences into high-level semantic concepts, is a research topic that has received much interest in recent years. Important applications of this work include smart surveillance systems, semantic video database indexing, and interactive systems. This technology can be applied to several video domains, including airport terminal, parking lot, traffic, subway station, aerial surveillance, and sign language data. In this work we survey the two main components of the event understanding process: Abstraction and Event Modeling. Abstraction is the process of molding the data into informative units to be used as input to the event model. Event modeling is devoted to formally describing events of interest and enabling recognition of these events as they occur in the video sequence. Event modeling can be further decomposed into the categories of Pattern Recognition Methods, State Event Models, and Semantic Event Models. In this survey we discuss this proposed taxonomy of the literature, offer a unifying terminology, and discuss popular abstraction schemes (e.g. Motion History Images) and event modeling formalisms (e.g. the Hidden Markov Model) and their use in video event understanding, with extensive examples from the literature. Finally, we consider the application domain of video event understanding in light of the proposed taxonomy and propose future directions for research in this field.
Perception as Abduction: Turning Sensor Data into Meaningful Representation
- Cognitive Science, 2005
"... This article presents a formal theory of robot perception as a form of abduction. The theory pins down the process whereby low-level sensor data is transformed into a symbolic representation of the external world, drawing together aspects such as incompleteness, top-down information flow, active per ..."
Abstract
-
Cited by 49 (2 self)
- Add to MetaCart
(Show Context)
This article presents a formal theory of robot perception as a form of abduction. The theory pins down the process whereby low-level sensor data is transformed into a symbolic representation of the external world, drawing together aspects such as incompleteness, top-down information flow, active perception, attention, and sensor fusion in a unifying framework. In addition, a number of themes are identified that are common to both the engineer concerned with developing a rigorous theory of perception, such as the one on offer here, and the philosopher of mind who is exercised by questions relating to mental representation and intentionality.
VidMAP: Video Monitoring of Activity with Prolog
- IEEE International Conference on Advanced Video and Signal Based Surveillance, 2005
"... This paper describes the architecture of a visual surveillance system that combines real time computer vision algorithms with logic programming to represent and recognize activities involving interactions amongst people, packages and the environments through which they move. The low level computer v ..."
Abstract
-
Cited by 34 (6 self)
- Add to MetaCart
(Show Context)
This paper describes the architecture of a visual surveillance system that combines real-time computer vision algorithms with logic programming to represent and recognize activities involving interactions amongst people, packages, and the environments through which they move. The low-level computer vision algorithms log primitive events of interest as observed facts, while the higher-level Prolog-based reasoning engine uses these facts in conjunction with predefined rules to recognize various activities in the input video streams. The system is illustrated in action on a multi-camera surveillance scenario that includes both security and safety violations.
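The facts-plus-rules pattern this abstract describes can be sketched in a few lines. VidMAP itself uses Prolog for the rule layer; the Python below is only a minimal illustration of the same idea, and every predicate, actor, and event name in it is hypothetical, not taken from the paper.

```python
# Hypothetical sketch of the facts-plus-rules pattern: low-level vision
# detectors log primitive events as timestamped facts, and a rule layer
# (Prolog in VidMAP; plain Python here) matches combinations of facts
# to recognize higher-level activities. All names are illustrative.

facts = [
    ("enters_zone", "person1", "loading_dock", 10),
    ("carries", "person1", "package3", 12),
    ("drops", "person1", "package3", 15),
    ("exits_zone", "person1", "loading_dock", 18),
]

def unattended_package(facts):
    """Rule: a package is unattended if it was dropped and its carrier
    later left the zone. Returns (object, actor, drop_time) alerts."""
    alerts = []
    for pred, actor, obj, t in facts:
        if pred != "drops":
            continue
        left_after = any(
            p == "exits_zone" and a == actor and t2 > t
            for p, a, _, t2 in facts
        )
        if left_after:
            alerts.append((obj, actor, t))
    return alerts

print(unattended_package(facts))  # [('package3', 'person1', 15)]
```

A Prolog engine generalizes this trivially: each rule is a clause over the logged facts, and new rules can be added without touching the vision layer.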
Multivalued Default Logic for Identity Maintenance in Visual Surveillance
- In ECCV, pages IV: 119–132, 2006
"... Recognition of complex activities from surveillance video requires detection and temporal ordering of its constituent "atomic" events. It also requires the capacity to robustly track individuals and maintain their identities across single as well as multiple camera views. Identity maint ..."
Abstract
-
Cited by 23 (4 self)
- Add to MetaCart
(Show Context)
Recognition of complex activities from surveillance video requires detection and temporal ordering of their constituent "atomic" events. It also requires the capacity to robustly track individuals and maintain their identities across single as well as multiple camera views. Identity maintenance is a primary source of uncertainty for activity recognition and has traditionally been addressed via different appearance matching approaches. However, these approaches, by themselves, are inadequate. In this paper, we propose a prioritized, multivalued, default-logic-based framework that allows reasoning about the identities of individuals.
Reasoning about depth and motion from an observer’s viewpoint
- Spatial Cognition and Computation, 2007
"... The goal of this paper is to present a logic-based formalism for representing knowledge about objects in space and their movements, and show how this knowledge could be built up from the viewpoint of an observer immersed in a dynamic world. In this paper space is represented using functions that ext ..."
Abstract
-
Cited by 7 (7 self)
- Add to MetaCart
(Show Context)
The goal of this paper is to present a logic-based formalism for representing knowledge about objects in space and their movements, and show how this knowledge could be built up from the viewpoint of an observer immersed in a dynamic world. In this paper space is represented using functions that extract attributes of depth, size and distance from snapshots of the world. These attributes compose a novel spatial reasoning system named Depth Profile Calculus (DPC). Transitions between qualitative relations involving these attributes are represented by an extension of this calculus called Dynamic Depth Profile Calculus (DDPC). We argue that knowledge about objects in the world could be built up via a process of abduction on DDPC relations.
Benchmarking Qualitative Spatial Calculi for Video Activity Analysis
"... Abstract. This paper presents a general way of addressing problems in video activity understanding using graph based relational learning. Video activities are described using relational spatio-temporal graphs, that represent qualitative spatiotemporal relations between interacting objects. A wide ra ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
(Show Context)
Abstract: This paper presents a general way of addressing problems in video activity understanding using graph-based relational learning. Video activities are described using relational spatio-temporal graphs that represent qualitative spatio-temporal relations between interacting objects. A wide range of spatio-temporal relations well suited for describing video activities is introduced. A formulation is then proposed in which standard problems in video activity understanding, such as event detection, are naturally mapped to problems in graph-based relational learning. Experiments on video understanding tasks, on a video dataset consisting of common outdoor verbs, validate the significance of the proposed approach.
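As a toy illustration of the kind of per-frame qualitative relation such spatio-temporal graphs record as edge labels, here is a minimal Python sketch that classifies the relation between two axis-aligned bounding boxes. The coarse four-label set is a hypothetical simplification for illustration, not one of the calculi benchmarked in the paper.

```python
# Illustrative sketch (not the paper's calculi): derive a coarse
# RCC-like qualitative relation between two axis-aligned bounding
# boxes, the kind of label a relational spatio-temporal graph could
# attach to an edge between two tracked objects in a frame.

def qualitative_relation(a, b):
    """a, b: (xmin, ymin, xmax, ymax) bounding boxes."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # No overlap on some axis -> the boxes are apart.
    if ax1 < bx0 or bx1 < ax0 or ay1 < by0 or by1 < ay0:
        return "disconnected"
    # One box entirely within the other.
    if bx0 <= ax0 and ax1 <= bx1 and by0 <= ay0 and ay1 <= by1:
        return "inside"
    if ax0 <= bx0 and bx1 <= ax1 and ay0 <= by0 and by1 <= ay1:
        return "contains"
    return "overlaps"

print(qualitative_relation((0, 0, 2, 2), (5, 5, 7, 7)))  # disconnected
print(qualitative_relation((1, 1, 3, 3), (0, 0, 4, 4)))  # inside
```

Computing such a label for every object pair in every frame, and tracking how the label changes over time, yields exactly the relational graph structure the abstract describes.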
Going far, logically
- In Proceedings of IJCAI 2005, 2005
"... There are numerous applications where we need to ensure that multiple moving objects are sufficiently far apart. Furthermore, in many moving object domains, there is positional indeterminacy — we are not 100 % sure exactly when a given moving object will be at a given location. [Yaman et al., 2004] ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
There are numerous applications where we need to ensure that multiple moving objects are sufficiently far apart. Furthermore, in many moving object domains, there is positional indeterminacy — we are not 100% sure exactly when a given moving object will be at a given location. [Yaman et al., 2004] provided a logic of motion but did not provide algorithms to ensure that moving objects are kept sufficiently far apart. In this paper, we extend their logic to include a “far” predicate. We develop the CheckFar algorithm, which checks whether any two given objects will always be sufficiently far apart during a time interval. We have run a set of experiments showing that our CheckFar algorithm scales very well.
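The "far" check the abstract motivates can be sketched numerically. CheckFar itself operates within a logic of motion; the Python below is only an illustrative sampled version for objects on known linear trajectories, with all function names and parameters being assumptions of this sketch.

```python
# Minimal sketch of a "far" check: verify that two objects, each
# following a known linear 2D trajectory, stay at least distance d
# apart throughout [t0, t1]. This is a sampled approximation, not
# the paper's CheckFar algorithm.

import math

def position(start, velocity, t):
    """Linear motion: start + velocity * t, componentwise."""
    return tuple(s + v * t for s, v in zip(start, velocity))

def check_far(obj_a, obj_b, d, t0, t1, steps=100):
    """obj = (start_xy, velocity_xy). True if the objects are at
    least d apart at every sampled instant of [t0, t1]."""
    for i in range(steps + 1):
        t = t0 + (t1 - t0) * i / steps
        ax, ay = position(*obj_a, t)
        bx, by = position(*obj_b, t)
        if math.hypot(ax - bx, ay - by) < d:
            return False
    return True

# Parallel trajectories, always 10 units apart:
print(check_far(((0, 0), (1, 0)), ((0, 10), (1, 0)), 5, 0, 60))   # True
# Head-on trajectories that meet at t = 30:
print(check_far(((0, 0), (1, 0)), ((60, 0), (-1, 0)), 5, 0, 60))  # False
```

Note the design gap this exposes: sampling can miss a brief violation between samples, whereas an analytic treatment (the squared distance is quadratic in t for linear motion) gives an exact answer — which is precisely why a dedicated algorithm is worth developing.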
Qualitative Reasoning Feeding Back into Quantitative Model-Based Tracking
- Proceedings of the 16th European Conference on Artificial Intelligence (ECAI-04), 2004
"... Abstract. Tracking vehicles in image sequences of innercity road traffic scenes must be considered still to constitute a challenging task. Even if a-priori knowledge about the 3D shape of vehicles, of the background structure, and about vehicle motion is provided, (partial) occlusion and dense vehic ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
(Show Context)
Abstract: Tracking vehicles in image sequences of inner-city road traffic scenes must still be considered a challenging task. Even if a priori knowledge about the 3D shape of vehicles, the background structure, and vehicle motion is provided, (partial) occlusion and dense vehicle queues can easily cause initialization and tracking failures. A stepwise improvement of the tracking approach requires numerous and time-consuming experiments. These difficulties can be eased considerably by endowing the system with at least part of the qualitative knowledge which a human observer activates in order to judge the results. In the case reported here, a system for qualitative reasoning has been coupled with a quantitative model-based tracking system in order to explore the feedback from qualitative reasoning into the geometric tracking subsystem. The approach and encouraging experimental results obtained for real-world image sequences are described.
Selecting ghosts and queues from a car tracker's output using a spatio-temporal query language
- In: Proc. Conference on Computer Vision and Pattern Recognition, 2004
"... This paper presents a spatio-temporal query language useful for video interpretation and event recognition. The language is suited to describe configurations of objects moving on a plane. To demonstrate its applicability it has been tested on the output of a tracker working on a car traffic scene. T ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
(Show Context)
This paper presents a spatio-temporal query language useful for video interpretation and event recognition. The language is suited to describing configurations of objects moving on a plane. To demonstrate its applicability, it has been tested on the output of a tracker working on a car traffic scene. The results of two example sets of queries are shown in two videos generated from the tracker's data output. The first selects a ghost from the tracking data, and the second shows how to find queues of cars in the road traffic scene without prior knowledge of lanes.
Learning Scene Semantics
- In: ECOVISION 2004 Early Cognitive Vision Workshop, Isle of Skye, 2004
"... Automated visual surveillance systems are required to emulate the cognitive abilities of surveillance personnel, who are able to detect, recognise and assess the severity of suspicious, unusual and threatening behaviours. We describe the architecture of our surveillance system, emphasising some of i ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
(Show Context)
Automated visual surveillance systems are required to emulate the cognitive abilities of surveillance personnel, who are able to detect, recognise and assess the severity of suspicious, unusual and threatening behaviours. We describe the architecture of our surveillance system, emphasising some of its high-level cognitive capabilities. In particular, we present a methodology for automatically learning semantic labels of scene features. We also describe a framework that supports learning of a wider range of semantics, using a motion attention mechanism and exploiting long-term consistencies in video data.