Results 1 - 10 of 14
Sentence Boundary Detection in Broadcast Speech Transcripts
- in Proc. of ISCA Workshop: Automatic Speech Recognition: Challenges for the new Millennium ASR-2000
, 2000
Cited by 34 (3 self)
This paper presents an approach to identifying sentence boundaries in broadcast speech transcripts. We describe finite state models that extract sentence boundary information statistically from text and audio sources. An n-gram language model is constructed from a collection of British English news broadcasts and scripts. An alternative model is estimated from pause duration information in speech recogniser outputs aligned with their programme script counterparts. Experimental results show that the pause duration model alone outperforms the language modelling approach, and that combining the two models improves performance further; precision and recall scores of over 70% were attained for the task. 1. INTRODUCTION Spoken audio data is a rich information source. Extensive research efforts during past decades have resulted in automatic speech transcription systems that can perform certain tasks (e.g., large vocabulary dictation from a cooperative speaker) with a high degree of a...
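As a rough illustration of the combination described in the abstract above, the sketch below interpolates a language-model boundary probability with a pause-duration boundary probability and scores the detected boundaries with precision and recall. The function names, the log-linear interpolation, the weight, the threshold, and the toy probabilities are all assumptions made for illustration; the paper's finite state models are not reproduced here.

```python
import math

def combine_scores(p_lm, p_pause, weight=0.5):
    """Log-linear interpolation of two boundary probabilities (hypothetical scheme)."""
    eps = 1e-12  # guard against log(0)
    log_p = (weight * math.log(max(p_lm, eps))
             + (1.0 - weight) * math.log(max(p_pause, eps)))
    return math.exp(log_p)

def detect_boundaries(lm_probs, pause_probs, threshold=0.5, weight=0.5):
    """Return indices of word gaps whose combined score reaches the threshold."""
    return [i for i, (p_lm, p_pause) in enumerate(zip(lm_probs, pause_probs))
            if combine_scores(p_lm, p_pause, weight) >= threshold]

def precision_recall(predicted, reference):
    """Precision and recall of predicted boundary positions against a reference."""
    predicted, reference = set(predicted), set(reference)
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall

# Toy per-gap boundary probabilities (invented values, one per word gap).
lm_probs = [0.9, 0.2, 0.6, 0.1, 0.8]
pause_probs = [0.8, 0.1, 0.3, 0.7, 0.9]
reference = [0, 2, 4]  # hypothetical true sentence boundaries

predicted = detect_boundaries(lm_probs, pause_probs)
precision, recall = precision_recall(predicted, reference)
```

With equal weights the combined score is the geometric mean of the two probabilities, so a gap is marked as a boundary only when both sources give it reasonable support.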
From Text Summarisation to Style-Specific Summarisation for Broadcast News
, 2004
Cited by 18 (4 self)
In this paper we report on a series of experiments investigating the path from text summarisation to style-specific summarisation of spoken news stories. We show that the portability of traditional text summarisation features to broadcast news is dependent on the diffusiveness of the information in the broadcast news story. An analysis of two categories of news stories (containing only read speech or including some spontaneous speech) demonstrates the importance of the style and the quality of the transcript when extracting the summary-worthy information content. Further experiments indicate the advantages of doing style-specific summarisation of broadcast news.
Web-Assisted Annotation, Semantic Indexing and Search of Television and Radio News
- In Proceedings of the 14th International World Wide Web Conference
, 2005
Cited by 11 (2 self)
The Rich News system, which can automatically annotate radio and television news with the aid of resources retrieved from the World Wide Web, is described. Automatic speech recognition gives a temporally precise but conceptually inaccurate annotation model. Information extraction from related web news sites gives the opposite: conceptual accuracy but no temporal data. Our approach combines the two for temporally accurate conceptual semantic annotation of broadcast news. First, low-quality transcripts of the broadcasts are produced using speech recognition, and these are then automatically divided into sections corresponding to individual news stories. A key-phrase extraction component finds key phrases for each story and uses these to search for web pages reporting the same event. The text and meta-data of the web pages are then used to create index documents for the stories in the original broadcasts, which are semantically annotated using the KIM knowledge management platform. A web interface then allows conceptual search and browsing of news stories, and playing of the parts of the media files corresponding to each news story. The use of material from the World Wide Web allows much higher quality textual descriptions and semantic annotations to be produced than would have been possible using the ASR transcript directly. The semantic annotations can form a part of the Semantic Web, and an evaluation shows that the system operates with high precision and with a moderate level of recall.
Content-based Access to Spoken Audio
- IEEE Signal Processing Magazine
, 2005
Cited by 9 (1 self)
This article describes approaches to content-based access to spoken audio with a qualitative and tutorial emphasis. We describe how the analysis, retrieval and delivery phases contribute to making spoken audio content more accessible, and we outline a number of outstanding research issues. We also discuss the main application domains and try to identify important issues for future developments. The structure of the article is based on a general system architecture for content-based access, which is depicted in Figure 1. Although the tasks within each processing stage may appear unconnected, the interdependencies and the sequence with which they take place vary
A discriminative HMM/n-gram-based retrieval approach for Mandarin spoken documents
- ACM Transactions on Asian Language Information Processing
, 2004
Cited by 4 (2 self)
Statistical modeling approaches have been steadily gaining popularity in the field of information retrieval in recent years. This paper presents an HMM/N-gram-based retrieval approach for Mandarin spoken documents. The underlying characteristics and different structures of this approach were extensively investigated and analyzed. The retrieval capabilities were verified by tests with word-level and syllable-level indexing features and by comparison with the conventional vector space model approach. To further improve the discrimination capabilities of the HMMs, both the expectation-maximization (EM) and minimum classification error (MCE) training algorithms were introduced in training. The information fusion of word-level and syllable-level indexing features was also investigated. The spoken document retrieval experiments were performed on the Topic Detection and Tracking Corpora (TDT-2 and TDT-3). Very encouraging retrieval performance was obtained.
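To make the idea of statistical retrieval in the abstract above more concrete, here is a minimal sketch of a unigram query-likelihood ranker in which each query term is scored by a weighted mixture of a document model and a corpus model. This is a generic simplification chosen for illustration, not the paper's model: the function names, the fixed mixture weight, and the toy documents are assumptions, and no EM or MCE training of the weights is attempted.

```python
import math
from collections import Counter

def unigram_model(tokens):
    """Maximum-likelihood unigram probabilities for a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}

def query_log_likelihood(query, doc_model, corpus_model, doc_weight=0.7):
    """Log P(query | document) under a two-component mixture (weight is assumed)."""
    score = 0.0
    for term in query:
        p = (doc_weight * doc_model.get(term, 0.0)
             + (1.0 - doc_weight) * corpus_model.get(term, 0.0))
        score += math.log(p) if p > 0 else float("-inf")
    return score

def rank(query, docs, doc_weight=0.7):
    """Return document names ordered from most to least likely to generate the query."""
    corpus_model = unigram_model([t for toks in docs.values() for t in toks])
    scores = {name: query_log_likelihood(query, unigram_model(toks),
                                         corpus_model, doc_weight)
              for name, toks in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy "transcripts" (invented); real indexing terms could be words or syllables.
docs = {
    "d1": ["election", "results", "vote", "count"],
    "d2": ["weather", "storm", "rain", "warning"],
    "d3": ["election", "storm", "delay", "damage"],
}
ranking = rank(["election", "vote"], docs)
```

The corpus component keeps a document from scoring zero when it lacks a query term, which matters for noisy speech transcripts where terms are often misrecognised.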
Effect of Recognition Errors on Text Clustering
, 2004
Cited by 3 (3 self)
This paper presents clustering experiments performed over noisy texts (i.e. texts that have been extracted through an automatic process like character or speech recognition). The effect of recognition errors is investigated by comparing clustering results performed over both clean (manually typed data) and noisy (automatic speech transcriptions) versions of the same speech recording corpus. IDIAP–RR 04-82
Abberley: The THISL SDR system at TREC-9
- Proceedings of TREC-9
, 2000
Cited by 3 (0 self)
This paper describes our participation in the TREC-9 Spoken Document Retrieval (SDR) track. The THISL SDR system consists of a real-time version of a hybrid connectionist/HMM large vocabulary speech recognition system and a probabilistic text retrieval system. This paper describes the configuration of the speech recognition and text retrieval systems, including segmentation and query expansion. We report our results for development tests using the TREC-8 queries, and for the TREC-9 evaluation.
An evaluation of a spoken document retrieval baseline system in Finnish
- in Proceedings of the International Conference on Spoken Language Processing ICSLP 2004, Jeju Island, Korea
, 2004
Cited by 2 (0 self)
This paper presents a baseline spoken document retrieval system in Finnish. Due to its agglutinative structure, Finnish speech cannot be adequately transcribed using the standard large vocabulary continuous speech recognition approaches. The definition of a sufficient lexicon and the training of the statistical language models are difficult, because the words appear transformed by many inflections and compounds. In this work we apply a recently developed unlimited vocabulary speech recognition system that allows the use of n-gram language models based on morpheme-like subword units discovered in an unsupervised manner. In addition to word-based indexing, we also propose an indexing based on the subword units provided directly by our speech recognizer. In an initial evaluation of newsreading in Finnish, we obtained a fairly low recognition error rate and average document retrieval precisions close to those obtained from human reference transcripts.
Content Augmentation for Mixed-Mode News Broadcasts
Cited by 2 (1 self)
Rich News, a system that augments news broadcasts with textual content, is described. The system identifies individual stories in news broadcasts, and annotates them with related content from the World Wide Web. The web content is subsequently semantically analysed, and used to produce summary information for each news story. This content can then be delivered to users as part of an interactive television broadcast, or used to create semantically enhanced electronic programme guides. It also enables sophisticated search and browsing of news stories via a web interface. Rich News could be deployed either by broadcasters, or on digital video recorders in viewers' homes, and allows the creation of new media experiences that integrate television and web content into one unified viewing experience. Key Words: Semantic television, interactive television, electronic programme guides, multi-media, natural language processing, automatic speech recognition.
B.: Semantically enhanced television news through web and video integration
- In Proceedings of the Workshop on Multimedia and the Semantic Web at the European Semantic Web Conference (ESWC
, 2005
Cited by 2 (0 self)
The Rich News system for semantically annotating television news broadcasts and augmenting them with additional web content is described. On-line news sources were mined for material reporting the same stories as those found in television broadcasts, and the text of these pages was semantically annotated using the KIM knowledge management platform. This resulted in more effective indexing than would have been possible if the programme transcript had been indexed directly, owing to the poor quality of transcripts produced using automatic speech recognition. In addition, the associations produced between web pages and television broadcasts enable the automatic creation of augmented interactive television broadcasts and multi-media websites.

