
Grounding the lexical semantics of verbs in visual perception using force dynamics and event logic (0)

by J M Siskind
Venue: Journal of Artificial Intelligence Research
Results 1 - 10 of 52

Grounded semantic composition for visual scenes

by Peter Gorniak, Deb Roy - Journal of Artificial Intelligence Research, 2004
Cited by 70 (21 self)
We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions. The model has been implemented, and it is able to understand a broad range of spatial referring expressions. We describe our implementation of word level visually-grounded semantics and their embedding in a compositional parsing framework. The implemented system selects the correct referents in response to natural language expressions for a large percentage of test cases. In an analysis of the system’s successes and failures we reveal how visual context influences the semantics of utterances and propose future extensions to the model that take such context into account.

Semiotic Schemas: A Framework for Grounding Language in Action and Perception

by Deb Roy, 2005
Cited by 58 (10 self)
A theoretical framework for grounding language is introduced that provides a computational path from sensing and motor action to words and speech acts. The approach combines concepts from semiotics and schema theory to develop a holistic approach to linguistic meaning. Schemas serve as structured beliefs that are grounded in an agent’s physical environment through a causal-predictive cycle of action and perception. Words and basic speech acts are interpreted in terms of grounded schemas. The framework reflects lessons learned from implementations of several language processing robots. It provides a basis for the analysis and design of situated, multimodal communication systems that straddle symbolic and non-symbolic realms.

Choosing words in computer-generated weather forecasts

by Ehud Reiter, Somayajulu Sripada, Jim Hunter, Ian Davy - Artificial Intelligence, 2005
Cited by 37 (15 self)
One of the main challenges in automatically generating textual weather forecasts is choosing appropriate English words to communicate numeric weather data. A corpus-based analysis of how humans write forecasts showed that there were major differences in how individual writers performed this task, that is, in how they translated data into words. These differences included both different preferences between potential near-synonyms that could be used to express information, and also differences in the meanings that individual writers associated with specific words. Because we thought these differences could confuse readers, we built our SumTime-Mousam weather-forecast generator to use consistent data-to-word rules, which avoided words which were only used by a few people, and words which were interpreted differently by different people. An evaluation by forecast users suggested that they preferred SumTime-Mousam’s texts to human-generated texts, in part because of better word choice; this may be the first time that an evaluation has shown that NLG texts are better than human-authored texts. Key words: natural language processing, natural language generation, language and the word, information presentation, weather forecasts, lexical choice, idiolect. (Preprint submitted to Elsevier Science, 2 June 2005.)

Learning Visually-Grounded Words and Syntax for a Scene Description Task

by Deb K. Roy
Cited by 30 (16 self)
A spoken language generation system has been developed that learns to describe objects in computer-generated visual scenes. The system is trained by a 'show-and-tell' procedure in which visual scenes are paired with natural language descriptions. Learning algorithms acquire probabilistic structures which encode the visual semantics of phrase structure, word classes, and individual words. Using these structures, a planning algorithm integrates syntactic, semantic, and contextual constraints to generate natural and unambiguous descriptions of objects in novel scenes.

The Challenges of Joint Attention

by Frederic Kaplan, Verena V. Hafner - Interaction Studies, 2004
Cited by 29 (6 self)
This paper discusses the concept of joint attention and the different skills underlying its development. We argue that joint attention is much more than gaze following or simultaneous looking because it implies a shared intentional relation to the world. The current state-of-the-art in robotic and computational models of the different prerequisites of joint attention is discussed in relation with a developmental timeline drawn from results in child studies.

Learning visually grounded words and syntax for a scene description task

by Deb K. Roy, 2002
Cited by 25 (0 self)
Abstract not found

Specific-to-General Learning for Temporal Events with Application to Learning . . .

by Alan Fern, Robert Givan, Jeffrey Mark Siskind - Journal of Artificial Intelligence Research, 2002
Cited by 23 (2 self)
We develop, analyze, and evaluate a novel, supervised, specific-to-general learner for a simple temporal logic and use the resulting algorithm to learn visual event definitions from video sequences. First, we introduce a simple, propositional, temporal, event-description language called AMA that is sufficiently expressive to represent many events yet sufficiently restrictive to support learning. We then give algorithms, along with lower and upper complexity bounds, for the subsumption and generalization problems for AMA formulas. We present a positive-examples-only specific-to-general learning method based on these algorithms. We also present a polynomial-time-computable "syntactic" subsumption test that implies semantic subsumption without being equivalent to it. A generalization algorithm based on syntactic subsumption can be used in place of semantic generalization to improve the asymptotic complexity of the resulting learning algorithm. Finally

Coupling Perception and Simulation: Steps Towards Conversational Robotics

by Kai-Yuh Hsiao, Nikolaos Mavridis, Deb Roy - In Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, 2003
Cited by 22 (15 self)
Human cognition makes extensive use of visualization and imagination. As a first step towards giving a robot similar abilities, we have built a robotic system that uses a perceptually-coupled physical simulator to produce an internal world model of the robot's environment. Real-time perceptual coupling ensures that the model is constantly kept in synchronization with the physical environment as the robot moves and obtains new sense data. This model allows the robot to be aware of objects no longer in its field of view (a form of "object permanence"), as well as to visualize its environment through the eyes of the user by enabling virtual shifts in point of view using synthetic vision operating within the simulator. This architecture provides a basis for our long term goals of developing conversational robots that can ground the meaning of spoken language in terms of sensorimotor representations.

Learning semantic combinatoriality from the interaction between linguistic and behavioral processes

by Yuuya Sugita, Jun Tani - Adaptive Behavior, 2005
Cited by 20 (9 self)
Abstract not found

Learning Visually Grounded Words and Syntax of Natural Spoken Language

by Deb Roy - Evolution of Communication, 2000
Cited by 16 (5 self)
Properties of the physical world have shaped human evolutionary design and given rise to physically grounded mental representations. These grounded representations provide the foundation for higher level cognitive processes including language. Most natural language processing machines to date lack grounding. This paper advocates the creation of physically grounded language learning machines as a path toward scalable systems which can conceptualize and communicate about the world in human-like ways. As steps in this direction, two experimental language acquisition systems are presented.