
Importance-Driven Turn-Bidding for Spoken Dialogue Systems

by Ethan O. Selfridge, Peter A. Heeman

Results 1 - 10 of 10

Stability and accuracy in incremental speech recognition.

by Ethan O. Selfridge, Iker Arizmendi, Peter A. Heeman, Jason D. Williams - In Proceedings of the SIGdial, 2011
Cited by 11 (2 self)
Conventional speech recognition approaches usually wait until the user has finished talking before returning a recognition hypothesis. This results in spoken dialogue systems that are unable to react while the user is still speaking. Incremental Speech Recognition (ISR), where partial phrase results are returned during user speech, has been used to create more reactive systems. However, ISR output is unstable and so prone to revision as more speech is decoded. This paper tackles the problem of stability in ISR. We first present a method that increases the stability and accuracy of ISR output, without adding delay. Given that some revisions are unavoidable, we next present a pair of methods for predicting the stability and accuracy of ISR results. Taken together, we believe these approaches give ISR more utility for real spoken dialogue systems.

Recognizing Authority in Dialogue with an Integer Linear Programming Constrained Model

by Elijah Mayfield, Carolyn Penstein Rosé
Cited by 10 (6 self)
We present a novel computational formulation of speaker authority in discourse. This notion, which focuses on how speakers position themselves relative to each other in discourse, is first developed into a reliable coding scheme (0.71 agreement between human annotators). We also provide a computational model for automatically annotating text using this coding scheme, using supervised learning enhanced by constraints implemented with Integer Linear Programming. We show that this constrained model’s analyses of speaker authority correlate very strongly with expert human judgments (r² coefficient of 0.947).

Citation Context

...92). This is a large and active field, with applications in tutorial dialogues (Core, 2003), human-robot interactions (Peltason and Wrede, 2010), and more general approaches to effective turn-taking (Selfridge and Heeman, 2010). However, that body of work focuses on influencing discourse structure through positioning. The question that we are asking instead focuses on how speakers view their authority as a source of inform...

Decisions about Turns in Multiparty Conversation: From Perception to Action

by Dan Bohus, Eric Horvitz
Cited by 9 (1 self)
We present a decision-theoretic approach for guiding turn taking in a spoken dialog system operating in multiparty settings. The proposed methodology couples inferences about multiparty conversational dynamics with assessed costs of different outcomes, to guide turn-taking decisions. Beyond considering uncertainties about outcomes arising from evidential reasoning about the state of a conversation, we endow the system with awareness and methods for handling uncertainties stemming from computational delays in its own perception and production. We illustrate via sample cases how the proposed approach makes decisions, and we investigate the behaviors of the proposed methods via a retrospective analysis on logs collected in a multiparty interaction study.

Citation Context

... managing turn taking, in both dyadic [10, 19, 20] and multiparty settings [2, 4, 21]. With respect to turn-taking decisions, a number of more principled approaches have been proposed. As an example, [16] proposes a bidding approach to turn taking and investigates in simulations the use of reinforcement learning techniques for choosing appropriate turn bids, based on utterance importance. [8] proposes...

Continuously predicting and processing barge-in during a live spoken dialogue task

by Ethan O. Selfridge, Iker Arizmendi, Peter A. Heeman, Jason D. Williams - In Proc. of SIGDIAL, 2013
Cited by 3 (0 self)
Barge-in enables the user to provide input during system speech, facilitating a more natural and efficient interaction. Standard methods generally focus on single-stage barge-in detection, applying the dialogue policy irrespective of the barge-in context. Unfortunately, this approach performs poorly when used in challenging environments. We propose and evaluate a barge-in processing method that uses a prediction strategy to continuously decide whether to pause, continue, or resume the prompt. This model has greater task success and efficiency than the standard approach when evaluated in a public spoken dialogue system.

Citation Context

... system to resume quickly as the NUBI likelihood is greater. Both T1 and T2 are dependent on the number of system resumptions, as we view the action of resuming the prompt as an indication that the threshold is not correct. With every resumption, the parameter R is incremented by 1 and, to account for changing environments, R is decremented by 0.2 for every full prompt that is not paused until it reaches 0. Using R, T1 is computed by T1 = 0.17 · R, and T2 by T2 = argmax(0.1, 1 − (0.1 · R)).

3.5 Method Discussion

The motivation behind the PBR model is both theoretical and practical. According to Selfridge and Heeman (2010), turn-taking is best viewed as a collaborative process where the turn assignment should be determined by the importance of the utterance. During barge-in, the system is speaking and so should only yield the turn if the user’s speech is more important than its own. For many domains, we view non-understood input as less important than the system’s prompt and so, in this case, the system should not release the turn by stopping the prompt and initiating a clarifying subdialogue. On the practical side, there is a high likelihood that non-advancing input is not system directed, to which the system ...
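The update rule for R, T1, and T2 quoted in the excerpt above can be illustrated with a minimal Python sketch. The function names are hypothetical and do not come from the cited paper. The excerpt writes "argmax" for T2; the sketch assumes this means an ordinary maximum, so that T2 never drops below 0.1. What the thresholds are compared against is not stated in the excerpt and is left out.

```python
def on_resumption(r: float) -> float:
    """Each prompt resumption increments R by 1 (per the excerpt)."""
    return r + 1.0


def on_unpaused_prompt(r: float) -> float:
    """Each full prompt that was not paused decrements R by 0.2, floored at 0."""
    return max(0.0, r - 0.2)


def thresholds(r: float) -> tuple[float, float]:
    """Compute (T1, T2) from R.

    T1 = 0.17 * R
    T2 = max(0.1, 1 - 0.1 * R)   # assumption: "argmax" in the excerpt read as max
    """
    t1 = 0.17 * r
    t2 = max(0.1, 1.0 - 0.1 * r)
    return t1, t2


if __name__ == "__main__":
    r = 0.0
    # Hypothetical event sequence: two resumptions, then three unpaused prompts.
    for event in ["resume", "resume", "full", "full", "full"]:
        r = on_resumption(r) if event == "resume" else on_unpaused_prompt(r)
        t1, t2 = thresholds(r)
        print(f"{event:>6}: R={r:.2f}  T1={t1:.3f}  T2={t2:.3f}")
```

Under this reading, T1 rises and T2 falls as resumptions accumulate, and both drift back toward their starting values as unpaused prompts reduce R.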

Inverse Reinforcement Learning for Micro-Turn Management

by Dongho Kim, Catherine Breslin, Pirros Tsiakoulis, Matthew Henderson, Steve Young
Cited by 3 (0 self)
Existing spoken dialogue systems are typically not designed to provide natural interaction since they impose a strict turn-taking regime in which a dialogue consists of interleaved system and user turns. To allow more responsive and natural interaction, this paper describes a system in which turn-taking decisions are taken at a more fine-grained micro-turn level. A decision-theoretic approach is then applied to optimise turn-taking control. Inverse reinforcement learning is used to capture the complex but natural behaviours from human-human dialogues and optimise interaction without specifying a reward function manually. Using a corpus of human-human interaction, experiments show that IRL is able to learn an effective reward function which outperforms a comparable handcrafted policy.
Index Terms: dialogue management, spoken dialogue systems, inverse reinforcement learning, Markov decision processes

Citation Context

...utterance. To allow more responsive and natural interaction, a variety of decision-theoretic approaches have been proposed, which provide a more principled way of optimising the control of turn-taking [1, 2, 3, 4, 5]. Since any turn-taking decision may have an effect on the future evolution of the dialogue, a general solution should view the problem as sequential decision making in which the system has to optimis...

Learning Turn, Attention, and Utterance Decisions in a Negotiative Slot-Filling Domain

by Ethan O. Selfridge, Peter A. Heeman, 2011
Cited by 1 (0 self)
Mixed-Initiative dialogue systems must be effective and natural, and both turn-taking and attention play an important role in meeting these goals. We present the Tau Architecture which separates Turn, Attention, and Utterance decisions, and uses Reinforcement Learning to jointly optimize them. The development of sophisticated dialogue managers using Reinforcement Learning requires a simulation domain before any human evaluation, and we describe the Negotiative Slot-Filling domain. This domain is a closer approximation to true mixed-initiative dialogue than any previously used to train dialogue managers. We then detail the Tau implementation in the domain, and demonstrate both.

Citation Context

...e system should begin speaking before some indication of user completion. In prior work, using a simple negotiation task, we proposed an SDS which could “bid for the turn” and actively try to take it [8]. This turn-bidding system had substantially shorter dialogues than an SDS which used conventional turn-taking techniques, demonstrating that effective SDS turn-taking is an integral aspect of efficie...

Is it really worth it? Cost-based selection of system responses

by Jens Edlund, Anna Hjalmarsson
Cited by 1 (1 self)
to speech-in-overlap

Citation Context

...f-speech, and [13] manipulates the detection of end-of-speech. Another model that takes the urgency of an intended speech segment into consideration is the importance driven turn-bidding presented in [15]. In this general turn-taking framework, the turn is allocated to either the user or the system depending who shows more eager to speak by taking the prominence of different turn-taking cues into acco...

Dialog Goes Pervasive

by Amanda J. Stent
Until recently, many dialog systems were information retrieval systems. For example, using a telephone-based interactive response system a US-based user can find flights from United (1-800-UNITED-1), get movie schedules (1-800-777-FILM), or get bus information (Black et al., 2011). These systems save companies money and help users access information 24/7. However, the interaction between user and system is tightly constrained. For the most part, each system only deals with one domain, so the task models are typically flat slot-filling models (Allen et al., 2001b). Also, the dialogs are very structured, with system initiative and short user responses, giving limited scope to study important phenomena such as coreference. Smart phones and other mobile devices make possible pervasive human-computer spoken dialog. For example, the Vlingo system lets users do web searches (information retrieval), but also connects calls, opens other apps, and permits voice dictation of emails or social media updates.¹ Siri can also help users make reservations and schedule meetings.² These new dialog systems are different from traditional ones in several ways; they are multi-task, asynchronous, can involve rich context modeling, and have side effects in the “real world”: Multi-task – The system interacts with the user to accomplish a series of (possibly related) tasks. For example, a user might use the system to order a book and then say “schedule it for book club”, a different task (e.g. requiring different backend DB lookups) but related to the previous one by the book informa...
¹ www.vlingo.com

Citation Context

...ilence, turn-final and turn-initial prosodic cues), and/or on user error rates and satisfaction scores. An initial dialog layer-focused challenge could be on turn-taking (Baumann and Schlangen, 2011; Selfridge and Heeman, 2010). Task modeling focused – This type of challenge will move from modeling individual tasks, to automatic acquisition and use of task models for interactive tasks in dialog systems. Future challenges o...

Open Dialogue Management for Relational Databases

by Ben Hixon, Rebecca J. Passonneau
We present open dialogue management and its application to relational databases. An open dialogue manager generates dialogue states, actions, and default strategies from the semantics of its application domain. We define three open dialogue management tasks. First, vocabulary selection finds the intelligible attributes in each database table. Second, focus discovery selects candidate dialogue foci, tables that have the most potential to address basic user goals. Third, a focus agent is instantiated for each dialogue focus with a default dialogue strategy governed by efficiency. We demonstrate the portability of open dialogue management on three very different databases. Evaluation of our system with simulated users shows that users with realistically limited domain knowledge have dialogues nearly as efficient as those of users with complete domain knowledge.

Turn-Taking Cues in a Human Tutoring Corpus

by Heather Friedberg
Most spoken dialogue systems are still lacking in their ability to accurately model the complex process that is human turn-taking. This research analyzes a human-human tutoring corpus in order to identify prosodic turn-taking cues, with the hopes that they can be used by intelligent tutoring systems to predict student turn boundaries. Results show that while there was variation between subjects, three features were significant turn-yielding cues overall. In addition, a positive relationship between the number of cues present and the probability of a turn yield was demonstrated.