Analysing meeting records: an ethnographic study and technological implications
- in Proc. Workshop on Machine Learning for Multimodal Interaction (MLMI
, 2005
Cited by 5 (0 self)
Whilst there has been substantial research into the support of meetings, there has been relatively little study of how meeting participants currently make records and how these records are used to direct collective and individual actions outside the meeting. This paper empirically investigates current meeting recording practices in order to both understand fundamental collaboration processes and to determine how these might be better supported by technology. Our main findings were that participants create two types of meeting record. Public records are a collectively negotiated contract of decisions and commitments. Personal records, in contrast, are a highly personalised reminding tool, recording both actions and the context surrounding these actions. These observations are then used to informally evaluate current meeting support technology and to suggest new directions for research.
Design and evaluation of systems to support interaction capture and retrieval
- PERSONAL AND UBIQUITOUS COMPUTING
, 2006
Cited by 5 (0 self)
Although many recent systems have been built to support Information Capture and Retrieval (ICR), these have not generally been successful. This paper presents studies that evaluate two different hypotheses for this failure, firstly that systems fail to address user needs and secondly that they provide only rudimentary support for ICR. Having first presented a taxonomy of different systems built to support ICR, we then describe a study that attempts to identify user needs for ICR. On the basis of that study we carried out two user-oriented evaluations. In the first we carried out a task-based evaluation of a state-of-the-art ICR system, finding that it failed to provide users with abstract ways to view meetings data, and did not present users with information categories that they considered to be important. In a second study we introduce a new method for comparative evaluation of different techniques for accessing meetings data. The second study showed that simple interface techniques that extracted key information from meetings were effective in allowing users to extract gist from meetings data. We conclude with a discussion of outstanding issues and future directions for ICR research.
Searching in audio: the utility of transcripts, dichotic presentation, and time-compression
- ACM CHI Conference on Human Factors in Computing Systems
, 2006
Cited by 5 (1 self)
Searching audio data can potentially be facilitated by the use of automatic speech recognition (ASR) technology to generate text transcripts which can then be easily queried. However, since current ASR technology cannot reliably generate 100% accurate transcripts, additional techniques for fluid browsing and searching of the audio itself are required. We explore the impact of transcripts of various qualities, dichotic presentation, and time-compression on an audio search task. Results show that dichotic presentation and reasonably accurate transcripts can assist in the search process, but suggest that time-compression and low accuracy transcripts should be used carefully. Author Keywords: Dichotic listening, transcripts, audio time-compression.
Two New Experimental Protocols for Measuring STT Readability. Report for DARPA/EARS/Rich Transcription 2004 Workshop
, 2004
Cited by 3 (0 self)
This paper reports results from two recent psycholinguistic experiments that measure the readability of four types of speech transcripts for the DARPA EARS Program. The two key questions in these experiments are (1) how much speech transcript cleanup aids readability and (2) how much the type of cleanup matters. We employ two variants of the four-part figure of merit to measure readability defined at the RT02 workshop and described in our Eurospeech 2003 paper [4], namely: accuracy of answers to comprehension questions, reaction time for passage reading, reaction time for question answering, and a subjective rating of passage difficulty. The first protocol employs a question-answering task under time pressure. The second employs a self-paced line-by-line paradigm. Both protocols yield similar results: all three types of cleanup in the experiment improve readability 5-10%, but the self-paced reading protocol needs far fewer subjects for statistical significance.
A Study of Out-of-turn Interaction in Menu-based, IVR, Voicemail Systems
Cited by 3 (1 self)
We present the first user study of out-of-turn interaction in menu-based, interactive voice-response systems. Out-of-turn interaction is a technique which empowers the user (unable to respond to the current prompt) to take the conversational initiative by supplying information that is currently unsolicited, but expected later in the dialog. The technique permits the user to circumvent any flows of navigation hardwired into the design and navigate the menus in a manner which reflects their model of the task. We conducted a laboratory experiment to measure the effect of the use of out-of-turn interaction on user performance and preference in a menu-based, voice interface to voicemail. Specifically, we compared two interfaces with the exact same hierarchical menu design: one with the capability of accepting out-of-turn utterances and one without this feature. The results indicate that out-of-turn interaction significantly reduces task completion time, improves usability, and is preferred to the baseline. This research studies an unexplored dimension of the design space for automated telephone services, namely the nature of user-addressable input (utterance) supplied (in-turn vs. out-of-turn), in contrast to more traditional dimensions such as input modality (touch-tone vs. text vs. voice) and style of interaction (menu-based vs. natural language).
Retrieval and browsing of spoken content
- IEEE Signal Processing Mag
, 2008
Cited by 3 (1 self)
[A discussion of the technical issues involved in developing information retrieval systems for the spoken word] Ever-increasing computing power and connectivity bandwidth, together with falling storage costs, are resulting in an overwhelming amount of data of various types being produced, exchanged, and stored. Consequently, information search and retrieval has emerged as a key application area. Text-based search is the most active area, with applications that range from Web and local network search to searching for personal information residing on one’s own hard drive. Speech search has received less attention, perhaps because large collections of spoken material have previously not been available. However, with cheaper storage and increased broadband access, there has been a subsequent increase in the availability of online spoken audio content such as news broadcasts, podcasts, and academic lectures.
A Cascaded Broadcast News Highlighter
Cited by 2 (0 self)
This paper presents a fully automatic news skimming system which takes a broadcast news audio stream and provides the user with a segmented, structured and highlighted transcript. The system has three cascading stages: converting the audio stream to text using an automatic speech recogniser, segmenting the text into utterances and stories, and finally determining which utterances should be highlighted using a saliency score. Each stage must operate on the erroneous output from the previous stage in the system, an effect which is naturally amplified as the data progresses through the processing stages. We present a large corpus of transcribed broadcast news data enabling us to investigate to what degree information worth highlighting survives this cascading of processes. Both extrinsic and intrinsic experimental results indicate that mistakes in the story boundary detection have a strong impact on the quality of highlights, whereas erroneous utterance boundaries cause only minor problems. Further, the difference in transcription quality does not affect the overall performance greatly. Index Terms: statistical modelling, spoken language processing, speech understanding, information extraction.
The boundaries of virtual communities
Cited by 1 (1 self)
This position paper is comprised of extracts from a research proposal we are currently putting together on the workshop topic. The grant is primarily about real and virtual places and community in the more traditional sense. However, we believe that ‘virtual community’ (computer supported social networks) and ‘community’ are essentially equivalent in modern societies where nearly all ties are mediated by technology. Any comments on this document would be greatly appreciated, as we will be submitting the fuller document with detailed plans of our research shortly after the workshop. 1. Overview and Objectives: This proposal addresses fundamental social and technical issues in interactive location-based systems. The emergence of handheld, pervasively connected, location-aware devices raises the potential for fundamentally new information and communication services. Work to date has emphasized understanding the significance of mobility, providing users “anytime, anywhere” access, and delivering information to users based on their location (Dix et al 2000). Several proof-of-concept systems have explored techniques that enable individuals and groups to associate their own information with locations (Burrell & Gay 2000;
Measuring the Acceptable Word Error Rate of Machine-Generated Webcast Transcripts
, 2006
Cited by 1 (0 self)
The increased availability of broadband connections has recently led to an increase in the use of Internet broadcasting (webcasting). Most webcasts are archived and accessed numerous times retrospectively. One of the hurdles users face when browsing and skimming through archives is the lack of text transcripts of the audio channel of the webcast archive. In this paper, we propose a procedure for prototyping an Automatic Speech Recognition (ASR) system that generates realistic transcripts of any desired Word Error Rate (WER), thus overcoming the drawbacks of both prototype-based and Wizard of Oz simulations. We used such a system in a study where human subjects performed question-answering tasks using archives of webcast lectures, and showed that their performance and perception of transcript quality are linearly affected by WER, and that transcripts with a WER equal to or less than 25% would be acceptable for use in webcast archives.
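For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the ASR output, divided by the number of reference words. The sketch below illustrates that standard definition only; it is not the transcript-simulation procedure from the paper, and the names `word_error_rate`, `is_acceptable` and `ACCEPTABLE_WER` are hypothetical. The 25% threshold is taken from the abstract above, and whitespace tokenisation is an assumption.

```python
# Illustrative sketch of the conventional WER definition; not the paper's method.

ACCEPTABLE_WER = 0.25  # threshold reported in the abstract above


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()   # assumption: simple whitespace tokenisation
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (row-by-row Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def is_acceptable(reference: str, hypothesis: str) -> bool:
    """True when the transcript's WER is at or below the 25% threshold."""
    return word_error_rate(reference, hypothesis) <= ACCEPTABLE_WER
```

For example, comparing the reference "a b c d" with the hypothesis "a x c d" gives one substitution over four reference words, a WER of 0.25, which sits exactly at the reported acceptability threshold.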

