A Computational Theory of Executive Cognitive Processes and Multiple-Task Performance: Part 2. . .
- PSYCHOLOGICAL REVIEW
, 1997
3-D Sound for Virtual Reality and Multimedia
, 2000
Cited by 178 (1 self)
This paper gives HRTF magnitude data in numerical form for 43 frequencies between 0.2 and 12 kHz, the average of 12 studies representing 100 different subjects. However, no phase data is included in the tables; group delay simulation would need to be included in order to account for ITD. In 3-D sound applications intended for many users, we might want to use HRTFs that represent the common features of a number of individuals. But another approach might be to use the features of a person who has desirable HRTFs, based on some criteria. (One can sense a future 3-D sound system where the pinnae of various famous musicians are simulated.) A set of HRTFs from a good localizer (discussed in Chapter 2) could be used if the criterion were localization performance. If the localization ability of the person is relatively accurate or more accurate than average, it might be reasonable to use these HRTF measurements for other individuals. The Convolvotron 3-D audio system (Wenzel, Wightman, and Foster, 1988) has used such sets, particularly because elevation accuracy is affected negatively when listening through a bad localizer's ears (see Wenzel et al., 1988). It is best when any single nonindividualized HRTF set is psychoacoustically validated using a statistical sample of the intended user population, as shown in Chapter 2. Otherwise, the use of one HRTF set over another is a purely subjective judgment based on criteria other than localization performance. The technique used by Wightman and Kistler (1989a) exemplifies a laboratory-based HRTF measurement procedure where accuracy and replicability of results were deemed crucial. A comparison of their techniques with those described in Blauert (1983), Shaw (1974), Mehrgardt and Mellert (1977), Middlebrooks, Makous, and Gree...
Interruption of People in Human-Computer Interaction: A General Unifying Definition of Human Interruption and Taxonomy
, 1997
Cited by 101 (3 self)
User-interruption in human-computer interaction (HCI) is an increasingly important problem. Many of the useful advances in intelligent and multitasking computer systems have the significant side effect of greatly increasing user-interruption. This previously innocuous HCI problem has become critical to the successful function of many kinds of modern computer systems. Unfortunately, no HCI design guidelines exist for solving this problem. In fact, theoretical tools do not yet exist for investigating the HCI problem of user-interruption in a comprehensive and generalizable way. This report asserts that a single unifying definition of user-interruption and the accompanying practical taxonomy would be useful theoretical tools for driving effective investigation of this crucial HCI problem. These theoretical tools are constructed here. A comprehensive analysis is conducted through the existing literature. Theoretical constructs from several relevant but diverse fields are identified and discussed. A unifying definition of user-interruption is synthesized. This new definition is supported with an array of postulates, assertions, and a taxonomy of human interruption to facilitate its practical application.
A Review of The Cocktail Party Effect
- JOURNAL OF THE AMERICAN VOICE I/O SOCIETY
, 1992
Cited by 74 (3 self)
The "cocktail party effect" (the ability to focus one's listening attention on a single talker among a cacophony of conversations and background noise) has been recognized for some time. This specialized listening ability may be due to characteristics of the human speech production system, the auditory system, or high-level perceptual and language processing. This paper investigates the literature on what is known about the effect, from the original technical descriptions through current research in the areas of auditory streams and spatial display systems. The underlying goal of the paper is to analyze the components of this effect to uncover relevant attributes of the speech production and perception chain that could be exploited in future speech communication systems. The motivation is to build a system that can simultaneously present multiple streams of speech information such that a user can focus on one stream, yet easily shift attention to the others. A set of speech appli...
The Scope and Importance of Human Interruption In Human-Computer . . .
- HUMAN-COMPUTER INTERACTION
, 2002
Cited by 61 (0 self)
At first glance it seems absurd that busy people doing important jobs should want their computers to interrupt them. Interruptions are disruptive and people need to concentrate to make good decisions. However, successful job performance also frequently depends on people's abilities to (a) constantly monitor their dynamically changing information environments, (b) collaborate and communicate with other people in the system, and (c) supervise background autonomous services. These critical abilities can require people to simultaneously query a large set of information sources, continuously monitor for important events, and respond to and communicate with other human operators. Automated monitoring
The Audio Notebook - Paper and Pen Interaction with Structured Speech
, 2001
Cited by 59 (2 self)
This paper addresses the problem that a listener experiences when attempting to capture information presented during a lecture, meeting, or interview. Listeners must divide their attention between the talker and their notetaking activity. We propose a new device, the Audio Notebook, for taking notes and interacting with a speech recording. The Audio Notebook is a combination of a digital audio recorder and paper notebook, all in one device. Audio recordings are structured using two techniques: user structuring based on notetaking activity, and acoustic structuring based on a talker's changes in pitch, pausing, and energy. A field study showed that the interaction techniques enabled a range of usage styles, from detailed review to high-speed skimming. The study motivated the addition of phrase detection and topic suggestions to improve access to the audio recordings. Through these audio interaction techniques, the Audio Notebook defines a new approach for navigation in the audio domain.
Hyperspeech: Navigating in Speech-Only Hypermedia
- In Hypertext '91
, 1991
Cited by 51 (11 self)
Most hypermedia systems emphasize the integration of graphics, images, video, and audio into a traditional hypertext framework. The hyperspeech system described in this paper, a speech-only hypermedia application, explores issues of navigation and system architecture in an audio environment without a visual display. The system under development uses speech recognition to maneuver in a database of digitally recorded speech segments; synthetic speech is used for control information and user feedback. In this research prototype, recorded audio interviews were segmented by topic, and hypertext-style links were added to connect logically related comments and ideas. The software architecture is data driven, with all knowledge embedded in the links and nodes, allowing the software that traverses through the network to be straightforward and concise. Several user interfaces were prototyped, emphasizing different styles of speech interaction and feedback between the user and machine. In additio...
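The data-driven idea in this abstract, where the knowledge lives in the nodes and links and the traversal code stays small, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the node names, the audio file names, the link labels, and the functions `follow` and `traverse` are hypothetical and are not taken from the Hyperspeech system itself.

```python
# Hypothetical speech-only hypermedia network. Each node names a recorded
# segment and maps spoken command words to destination nodes.
NODES = {
    "intro": {"audio": "intro.wav", "links": {"more": "detail", "opposing": "counter"}},
    "detail": {"audio": "detail.wav", "links": {"return": "intro"}},
    "counter": {"audio": "counter.wav", "links": {"return": "intro"}},
}


def follow(nodes, current, command):
    """Return the node reached by a command, or stay put if no such link exists."""
    return nodes[current]["links"].get(command, current)


def traverse(nodes, start, commands):
    """Apply a sequence of recognized commands and return the nodes visited."""
    path = [start]
    for command in commands:
        path.append(follow(nodes, path[-1], command))
    return path


# The last command has no matching link, so the listener stays on "counter".
path = traverse(NODES, "intro", ["more", "return", "opposing", "unknown"])
print(path)
```

Because the traversal only looks up links, adding a new topic or connection in this sketch means editing the data, not the code.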
Mapping GUIs to Auditory Interfaces
, 1992
Cited by 40 (10 self)
This paper describes work to provide mappings between X-based graphical interfaces and auditory interfaces. In our system, dubbed Mercator, this mapping is transparent to applications. The primary motivation for this work is to provide accessibility to graphical applications for users who are blind or visually impaired. We describe the design of an auditory interface which simulates many of the features of graphical interfaces. We then describe the architecture we have built to model and transform graphical interfaces. Finally, we conclude with some indications of future research for improving our translation mechanisms and for creating an auditory "desktop" environment.
On ideal binary mask as the computational goal of auditory scene analysis
- in Speech Separation by Humans and Machines
, 2005
Cited by 40 (20 self)
What is the computational goal of auditory scene analysis? This is a key issue to address in the Marrian information-processing framework. It is also an important question for researchers in computational auditory scene analysis (CASA) because it bears directly on how a CASA system should be evaluated. In this chapter I discuss different objectives used in CASA. I suggest as a main CASA goal the use of the ideal time-frequency (T-F) binary mask whose value is one for a T-F unit where the target energy is greater than the interference energy and is zero otherwise. The notion of the ideal binary mask is motivated by the auditory masking phenomenon. Properties of the ideal binary mask are discussed, including their relationship to automatic speech recognition and human speech intelligibility. This CASA goal has led to algorithms that directly estimate the ideal binary mask in monaural and binaural conditions, and these algorithms have substantially advanced the state-of-the-art performance in speech separation.
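The mask definition in this abstract (one where target energy exceeds interference energy, zero otherwise) can be written out directly. The following is a minimal sketch, assuming the target and interference energies are already available as two equally sized grids of frames by frequency channels. The function name `ideal_binary_mask` and the sample values are illustrative and do not come from the chapter.

```python
def ideal_binary_mask(target_energy, interference_energy):
    """Return a T-F mask: 1 where target energy exceeds interference energy, else 0.

    Both arguments are lists of frames, each frame a list of per-channel energies.
    """
    if len(target_energy) != len(interference_energy):
        raise ValueError("energy grids must have the same number of frames")
    mask = []
    for target_frame, interference_frame in zip(target_energy, interference_energy):
        if len(target_frame) != len(interference_frame):
            raise ValueError("frames must have the same number of channels")
        # Strict comparison: a tie between target and interference yields 0.
        mask.append([1 if t > i else 0 for t, i in zip(target_frame, interference_frame)])
    return mask


# Made-up energies for two frames and three frequency channels.
target = [[4.0, 0.5, 2.0], [1.0, 3.0, 0.1]]
interference = [[1.0, 2.0, 2.0], [0.5, 0.5, 0.4]]

mask = ideal_binary_mask(target, interference)
print(mask)  # [[1, 0, 0], [1, 1, 0]]
```

The mask is "ideal" in the sense that it requires the premixed target and interference, so in this sketch it serves as a reference against which an estimated mask could be compared, not as a separation algorithm.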

