On the Challenges and Opportunities of Physically Situated Dialog
Cited by 2 (0 self)
We outline several challenges and opportunities for building physically situated systems that can interact in open, dynamic, and relatively unconstrained environments. We review a platform and recent progress on developing computational methods for situated, multiparty, open-world dialog, and highlight the value of representations of the physical surroundings and of harnessing the broader situational context when managing communicative processes such as engagement, turn-taking, language understanding, and dialog management. Finally, we outline an open-world learning challenge that spans these different levels.
A Multimodal End-of-Turn Prediction Model: Learning from Parasocial Consensus Sampling
Virtual humans, with realistic behaviors and increasingly human-like social skills, evoke in users a range of social behaviors normally seen only in human face-to-face interactions. One of the key challenges in creating such virtual humans is giving them human-like conversational skills. Traditional conversational virtual humans usually make turn-taking decisions based on explicit cues from the human users, such as "press-to-talk buttons". In contrast, people decide when to take turns by observing their conversational partner's behavior. In this paper, we present a multimodal end-of-turn prediction model. Instead of recording face-to-face conversations, we collect the turn-taking data using the Parasocial Consensus Sampling (PCS) framework, in which participants are guided to interact parasocially with media representations of people. We then analyze the relationship between verbal and nonverbal features and turn-taking behavior using the consensus data, and show how these features influence the time people take before taking a turn. Finally, we present a probabilistic multimodal end-of-turn prediction model learned from the consensus data, which enables virtual humans to make real-time turn-taking predictions. The evaluation results show that our model achieves high accuracy and pauses for a human-like length of time before taking its turn. Our work demonstrates the validity of Parasocial Consensus Sampling and generalizes this framework to model turn-taking behavior.