Results 1 -
8 of
8
Reading Between the Lines: Learning to Map High-level Instructions to Commands
"... In this paper, we address the task of mapping high-level instructions to sequences of commands in an external environment. Processing these instructions is challenging—they posit goals to be achieved without specifying the steps required to complete them. We describe a method that fills in missing i ..."
Abstract
-
Cited by 6 (2 self)
- Add to MetaCart
In this paper, we address the task of mapping high-level instructions to sequences of commands in an external environment. Processing these instructions is challenging—they posit goals to be achieved without specifying the steps required to complete them. We describe a method that fills in missing information using an automatically derived environment model that encodes states, transitions, and commands that cause these transitions to happen. We present an efficient approximate approach for learning this environment model as part of a policygradient reinforcement learning algorithm for text interpretation. This design enables learning for mapping high-level instructions, which previous statistical methods cannot handle. 1 1
Natural Language and Spatial Reasoning
, 2010
"... Making systems that understand language has long been a dream of artificial intelligence. This thesis develops a model for understanding language about space and movement in realistic situations. The system understands language from two realworld domains: finding video clips that match a spatial lan ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Making systems that understand language has long been a dream of artificial intelligence. This thesis develops a model for understanding language about space and movement in realistic situations. The system understands language from two realworld domains: finding video clips that match a spatial language description such as “People walking through the kitchen and then going to the dining room ” and following natural language commands such as “Go down the hall towards the fireplace in the living room. ” Understanding spatial language expressions is a challenging problem because linguistic expressions, themselves complex and ambiguous, must be connected to real-world objects and events. The system bridges the gap between language and the world by modeling the meaning of spatial language expressions hierarchically, first capturing the semantics of spatial prepositions, and then composing these meanings into higher level structures. Corpus-based evaluations of how well the
A Discriminative Model for Understanding Natural Language Route Directions
"... To be useful teammates to human partners, robots must be able to follow spoken instructions given in natural language. However, determining the correct sequence of actions in response to a set of spoken instructions is a complex decisionmaking problem. There is a “semantic gap ” between the high-lev ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
To be useful teammates to human partners, robots must be able to follow spoken instructions given in natural language. However, determining the correct sequence of actions in response to a set of spoken instructions is a complex decisionmaking problem. There is a “semantic gap ” between the high-level symbolic models of the world that people use, and the low-level models of geometry, state dynamics, and perceptions that robots use. In this paper, we show how this gap can be bridged by inferring the best sequence of actions from a linguistic description and environmental features. This work improves upon previous work in three ways. First, by using a conditional random field (CRF), we learn the relative weight of environmental and linguistic features, enabling the system to learn the meanings of words and reducing the modeling effort in learning how to follow commands. Second, a number of long-range features are added, which help the system to use additional structure in the problem. Finally, given a natural language command, we infer both the referred path and landmark directly, thereby requiring the algorithm to pick a landmark by which it should navigate. The CRF is demonstrated to have 15 % error on a held-out dataset, when compared with 39 % error for a Markov random field (MRF). Finally, by analyzing the additional annotations necessary for this work, we find that natural language route directions map sequentially onto the corresponding path and landmarks 99.6 % of the time. In addition, the size of the referred landmark varies from 0m 2 to 1964m 2 and the length of the referred path varies from 0m to 40.83m. 1
Learning to Interpret Natural Language Navigation Instructions from Observations
"... The ability to understand natural-language instructions is critical to building intelligent agents that interact with humans. We present a system that learns to transform natural-language navigation instructions into executable formal plans. Given no prior linguistic knowledge, the system learns by ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
The ability to understand natural-language instructions is critical to building intelligent agents that interact with humans. We present a system that learns to transform natural-language navigation instructions into executable formal plans. Given no prior linguistic knowledge, the system learns by simply observing how humans follow navigation instructions. The system is evaluated in three complex virtual indoor environments with numerous objects and landmarks. A previously collected realistic corpus of complex English navigation instructions for these environments is used for training and testing data. By using a learned lexicon to refine inferred plans and a supervised learner to induce a semantic parser, the system is able to automatically learn to correctly interpret a reasonable fraction of the complex instructions in this corpus. 1
Approaching the Symbol Grounding Problem with Probabilistic Graphical Models
"... In order for robots to engage in dialog with human teammates, they must have the ability to identify correspondences between elements of language and aspects of the external world. A solution to this symbol grounding problem (Harnad, 1990) would enable a robot to interpret commands such as “Drive ov ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
In order for robots to engage in dialog with human teammates, they must have the ability to identify correspondences between elements of language and aspects of the external world. A solution to this symbol grounding problem (Harnad, 1990) would enable a robot to interpret commands such as “Drive over to receiving and pick up the tire pallet. ” This article describes several of our results that use probabilistic inference to address the symbol grounding problem. Our approach is to develop models that factor according to the linguistic structure of a command. We first describe an early result, a generative model that factors according to the sequential structure of language, then discuss our new framework, Generalized Grounding Graphs (G 3). The G 3 framework dynamically instantiates a probabilistic graphical model for a natural language input, enabling a mapping between words in language and concrete objects, places, paths and events in the external world. We report on corpus-based experiments in which the robot is able to learn and use word meanings in three real-world tasks: indoor navigation, spatial language video retrieval, and mobile manipulation. 1
PANNING FOR GOLD: FINDING RELEVANT SEMANTIC CONTENT FOR GROUNDED LANGUAGE LEARNING
"... One of the key challenges in grounded language acquisition is resolving the intentions of the expressions. Typically the task involves identifying a subset of records from a list of candidates as the correct meaning of a sentence. While most current work assume complete or partial independence betwe ..."
Abstract
- Add to MetaCart
One of the key challenges in grounded language acquisition is resolving the intentions of the expressions. Typically the task involves identifying a subset of records from a list of candidates as the correct meaning of a sentence. While most current work assume complete or partial independence between the records, we examine a scenario in which they are strongly related. By representing the set of potential meanings as a graph, we explicitly encode the relationships between the candidate meanings. We introduce a refinement algorithm that first learns a lexicon which is then used to remove parts of the graphs that are irrelevant. Experiments in a navigation domain shows that the algorithm successfully recovered over three quarters of the correct semantic content. Index Terms — ambiguously supervised learning, grounded language acquisition 1.
Bootstrapping semantic parsers from conversations
- In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing
, 2011
"... Conversations provide rich opportunities for interactive, continuous learning. When something goes wrong, a system can ask for clarification, rewording, or otherwise redirect the interaction to achieve its goals. In this paper, we present an approach for using conversational interactions of this typ ..."
Abstract
- Add to MetaCart
Conversations provide rich opportunities for interactive, continuous learning. When something goes wrong, a system can ask for clarification, rewording, or otherwise redirect the interaction to achieve its goals. In this paper, we present an approach for using conversational interactions of this type to induce semantic parsers. We demonstrate learning without any explicit annotation of the meanings of user utterances. Instead, we model meaning with latent variables, and introduce a loss function to measure how well potential meanings match the conversation. This loss drives the overall learning approach, which induces a weighted CCG grammar that could be used to automatically bootstrap the semantic analysis component in a complete dialog system. Experiments on DARPA Communicator conversational logs demonstrate effective learning, despite requiring no explicit meaning annotations. 1
ACTION-CENTERED REASONING FOR PROBABILISTIC DYNAMIC SYSTEMS
, 2011
"... ... focuses on modeling stochastic dynamic domains, using representations and algorithms that combine logical AI ideas and probabilistic methods. We introduce new tractable and highly accurate algorithms for reasoning in those complex domains. Furthermore, we apply these algorithms to tasks of narra ..."
Abstract
- Add to MetaCart
... focuses on modeling stochastic dynamic domains, using representations and algorithms that combine logical AI ideas and probabilistic methods. We introduce new tractable and highly accurate algorithms for reasoning in those complex domains. Furthermore, we apply these algorithms to tasks of narrative understanding and web page monitoring. We model stochastic dynamic domains with a factored logical representation that uses a graphical model to represent a prior distribution over initial states. Our representation uses sequences of actions (represented in logical form) to represent transitions. We introduce an algorithm for reasoning in stochastic dynamic domains (in propositional and relational fashions) based on subroutines for reasoning in deterministic substructure of the domain. Our algorithm takes advantage of the factored logical representation and efficient subroutines for logical progression and regression. The tractability of the algorithm results from the tractability of these underlying subroutines. Our theoretical and empirical results show significant improvement of our algorithm over previous approaches for reasoning. Our novel algorithm for reasoning in probabilistic dynamic domains samples sequences of deterministic actions corresponding to an observed sequence of probabilistic actions. This algorithm is built upon a novel exact and tractable

