Perceiving, remembering, and communicating structure in events
- Journal of Experimental Psychology: General
, 2001
Cited by 35 (5 self)
This article may not exactly replicate the final version published in the APA journal. It is not the copy of record. The archival text may be retrieved from:
Using Approximate Models as Source of Contextual Information for . . .
- In Proc. of the ICCV'95 Workshop on Context-Based Vision
, 1995
Cited by 17 (4 self)
Most computer vision algorithms are based on strong assumptions about the objects and the actions depicted in the image. To safely apply those algorithms in real-world image sequences, it is necessary to verify that their assumptions are satisfied in the context of the visual process. We propose the use of approximate world models (coarse descriptions of objects and actions in the world) as the appropriate representation for contextual information. The approximate world models are employed to verify the applicability of a vision routine in a given situation. Under these conditions, a task module can reliably use the outputs of the contextually safe vision routines, without having to refer to an accurate reconstruction of the world. We are using approximate world models in a project to control cameras in a TV studio. In our Intelligent Studio, automatic cameras respond to verbal requests for shots from the TV director. Contextual information is obtained from the script of the TV sho...
Causal reconstruction
- Massachusetts Institute of Technology, AI Lab, memo
, 1993
Cited by 13 (0 self)
Causal reconstruction is the task of reading a written causal description of a physical behavior, forming an internal model of the described activity, and demonstrating comprehension through question answering. This task is difficult because written descriptions often do not specify exactly how referenced events fit together. This article (1) characterizes the causal reconstruction problem, (2) presents a representation called transition space, which portrays events in terms of "transitions," or collections of changes expressible in everyday language, and (3) describes a program called PATHFINDER, which uses the transition space representation to perform causal reconstruction on simplified English descriptions of physical activity. PATHFINDER works by identifying partial matches between the representations of events and using these matches to form causal chains, fill causal gaps, and merge overlapping accounts of activity. By applying transformations to events prior to matching, PATHFINDER is also able to handle a range of discontinuities arising from a writer's use of analogy or abstraction.
Understanding and Developing Models for Detecting and Differentiating Breakpoints During Interactive Tasks
- Proc. CHI 2007
Cited by 11 (5 self)
The ability to detect and differentiate breakpoints during task execution is critical for enabling defer-to-breakpoint policies within interruption management. In this work, we examine the feasibility of building statistical models that can detect and differentiate three granularities (types) of perceptually meaningful breakpoints during task execution, without having to recognize the underlying tasks. We collected ecological samples of task execution data, and asked observers to review the interaction in the collected videos and identify any perceived breakpoints and their type. Statistical methods were applied to learn models that map features of the interaction to each type of breakpoint. Results showed that the models were able to detect and differentiate breakpoints with reasonably high accuracy across tasks. Among many uses, our resulting models can enable interruption management systems to better realize defer-to-breakpoint policies for interactive, free-form tasks.
Intelligent Studios: Using Computer Vision to Control TV Cameras
, 1995
Cited by 8 (1 self)
This paper demonstrates that automatic framing of TV shows is an interesting and tractable domain for both Computer Vision and Artificial Intelligence. Our basic goal is to build intelligent robotic cameras (SmartCams) able to frame subjects and objects in a TV studio upon verbal request from a TV director. To cope with the problem of relating visual imagery to symbolic knowledge about the scene, we propose the use of an architecture based on two levels of representation. High-level approximate world models roughly describe the objects and the occurring actions. Low-level view representations are obtained by vision routines selected according to the present state of the world, as described by the approximate models. The approximate world models are updated by contextual information extracted from the script of the TV show and by processing the imagery gathered by wide-angle, low-resolution cameras monitoring the studio. Our Intelligent Studio is composed of one or more SmartCams which ...
Intelligent Temporal Subsampling of American Sign Language Using Event Boundaries
- Journal of Experimental Psychology: Human Perception and Performance
, 1990
Cited by 7 (0 self)
How well can a sequence of frames be represented by a subset of the frames? Video sequences of American Sign Language (ASL) were investigated in two modes: dynamic (ordinary video) and static (frames printed side by side on the display). An activity index was used to choose critical frames at event boundaries, times when the difference between successive frames is at a local minimum. Sign intelligibility was measured for 32 experienced ASL signers who viewed individual signs. For full gray-scale dynamic signs, activity-index subsampling yielded sequences that were significantly more intelligible than when every nth frame was chosen. This result was even more pronounced for static images. For binary images, the relative advantage of activity subsampling was smaller. We conclude that event boundaries can be defined computationally and that subsampling from event boundaries is better than choosing at regular intervals.
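The activity-index selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the activity index is the sum of absolute pixel differences between successive frames, that frames are nested lists of gray values, and that the later frame of each minimal pair is the one kept. All function names are hypothetical.

```python
def activity_index(frames):
    """Return one activity value per pair of successive frames.

    Assumption: activity is the sum of absolute pixel differences.
    """
    return [
        sum(abs(a - b)
            for row_a, row_b in zip(f0, f1)
            for a, b in zip(row_a, row_b))
        for f0, f1 in zip(frames, frames[1:])
    ]


def local_minima(values):
    """Indices whose value is strictly below both neighbours."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i] < values[i - 1] and values[i] < values[i + 1]
    ]


def subsample_at_boundaries(frames):
    """Frame indices at local minima of the activity index."""
    activity = activity_index(frames)
    # activity[i] compares frames[i] with frames[i + 1];
    # keeping the later frame is an arbitrary choice made for this sketch.
    return [i + 1 for i in local_minima(activity)]


def subsample_every_nth(frames, n):
    """Baseline from the abstract: indices of every nth frame."""
    return list(range(0, len(frames), n))
```

With six hypothetical one-pixel frames whose gray values are 0, 5, 6, 10, 11, 20, the activity values are 5, 1, 4, 1, 9, so the sketch keeps frames 2 and 4, whereas the regular baseline with n = 2 keeps frames 0, 2, and 4.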
Event perception: A mind/brain perspective
- Psychological Bulletin
, 2007
Cited by 6 (2 self)
People perceive and conceive of activity in terms of discrete events. Here the authors propose a theory according to which the perception of boundaries between events arises from ongoing perceptual processing and regulates attention and memory. Perceptual systems continuously make predictions about what will happen next. When transient errors in predictions arise, an event boundary is perceived. According to the theory, the perception of events depends on both sensory cues and knowledge structures that represent previously learned information about event parts and inferences about actors’ goals and plans. Neurological and neurophysiological data suggest that representations of events may be implemented by structures in the lateral prefrontal cortex and that perceptual prediction error is calculated and evaluated by a processing pathway, including the anterior cingulate cortex and subcortical neuromodulatory systems.
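The link between prediction error and perceived boundaries can be made concrete with a toy sketch. It is only an illustration of the idea stated in the abstract, not the authors' model. The "no change" predictor, the scalar observations, the fixed threshold, and the function name are all assumptions introduced here.

```python
def detect_event_boundaries(observations, threshold):
    """Return time steps where prediction error rises above a threshold.

    Assumptions (not from the source): observations are scalars, the
    predictor guesses that the next value equals the current one, and a
    boundary is reported only when the error crosses the threshold from
    below, so that a sustained error counts once.
    """
    boundaries = []
    previous_error = 0.0
    for t in range(1, len(observations)):
        prediction = observations[t - 1]  # naive "no change" prediction
        error = abs(observations[t] - prediction)
        if error > threshold and previous_error <= threshold:
            boundaries.append(t)
        previous_error = error
    return boundaries
```

For a hypothetical signal that stays near 0, jumps to about 5 at step 3, and drops back at step 6, a threshold of 1.0 would flag steps 3 and 6.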
The meaning of action: A review on action recognition and mapping
- Advanced Robotics
Cited by 5 (0 self)
In this paper, we analyze the different approaches taken to date within the computer vision, robotics and artificial intelligence communities for the representation, recognition, synthesis and understanding of action. We deal with action at different levels of complexity and provide the reader with the necessary related literature references. We put the literature references further into context and outline a possible interpretation of action by taking into account the different aspects of action recognition, action synthesis and task-level planning.
Divide and Conquer: Using Approximate World Models to Control View-Based Algorithms
- IEEE. Media Lab. Perceptual Computing TR
, 1995
Cited by 1 (0 self)
Most view-based vision algorithms are based on strong assumptions about the disposition of the objects in the image. To safely apply those algorithms in real-world image sequences, we are proposing that a vision system should be divided into two components. The first component contains an approximate world model of the scene: a low-accuracy, coarse description of the objects and actions in the world. Approximate world models are constructed and updated by simple vision routines and by the use of contextual information. The second component employs view-based algorithms to perform required perceptual tasks; the selection and control of the view-based methods are determined by the information provided by the approximate world model. We demonstrate the approximate world model approach in a project to control cameras in a TV studio. In our Intelligent Studio, automatic cameras respond to verbal requests for shots from the TV director.
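The two-component split described in this abstract might be sketched as follows. This is a hedged illustration rather than the authors' system. The dictionary-based world model, the `ViewRoutine` class, the `assumptions_hold` predicate, and the routine names are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# First component (assumed form): a coarse, low-accuracy scene description.
WorldModel = Dict[str, Dict[str, object]]


@dataclass
class ViewRoutine:
    """Second component: a view-based routine and the assumptions it needs."""
    name: str
    assumptions_hold: Callable[[WorldModel], bool]


def select_routines(world: WorldModel, routines: List[ViewRoutine]) -> List[str]:
    """Names of the routines whose assumptions the world model supports."""
    return [r.name for r in routines if r.assumptions_hold(world)]


# Hypothetical scene: the subject is in view, the object is not.
world: WorldModel = {
    "subject": {"visible": True},
    "object": {"visible": False},
}
routines = [
    ViewRoutine("frame_subject_closeup", lambda w: bool(w["subject"]["visible"])),
    ViewRoutine("frame_object_closeup", lambda w: bool(w["object"]["visible"])),
]
selected = select_routines(world, routines)
```

Keeping the applicability check separate from the routine itself mirrors the abstract's point that the coarse model decides which view-based method is safe to run, without requiring an accurate reconstruction of the scene.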
Assessing Behavioral and Computational Approaches to Naturalistic Action Segmentation
Cited by 1 (0 self)
Recognizing where one action ends and another begins is an automatic and seemingly effortless process that supports understanding of goal-directed action. One characteristic of such action segmentation is that it is hierarchical; it reflects the goals and sub-goals of an actor, which correspond to coarse- and fine-grained action units respectively. We report on the success of one method of assessing hierarchical segmentation of naturalistic footage taken from an extensive corpus of unscripted human action (Speechome project, e.g., Roy et al., 2006). Results indicate that hierarchical segmentation occurs in an on-line fashion, with event boundaries marked by surges in attention that are modulated based on whether a boundary marks a fine, intermediate, or coarse unit. We also describe a method by which objective changes in an actor’s movement can be measured and analyzed as a predictor of participants’ segmentation behaviors.

