Connectionist speech recognition of Broadcast News
, 2002
Cited by 28 (10 self)
This paper describes connectionist techniques for recognition of Broadcast News. The fundamental difference between connectionist systems and more conventional mixture-of-Gaussian systems is that connectionist models directly estimate posterior probabilities as opposed to likelihoods. Access to posterior probabilities has enabled us to develop a number of novel approaches to confidence estimation, pronunciation modelling and search. In addition we have investigated a new feature extraction technique based on the modulation-filtered spectrogram (MSG), and methods for combining multiple information sources. We have incorporated all of these techniques into a system for the transcription
Indexing and Retrieval of Broadcast News
- Speech Communication
, 2000
Cited by 22 (6 self)
This paper describes a spoken document retrieval (SDR) system for British and North American Broadcast News. The system is based on a connectionist large vocabulary speech recognizer and a probabilistic information retrieval system. We discuss the development of a realtime Broadcast News speech recognizer, and its integration into an SDR system. Two advances were made for this task: automatic segmentation and statistical query expansion using a secondary corpus. Precision and recall results using the Text Retrieval Conference (TREC) SDR evaluation infrastructure are reported throughout the paper, and we discuss the application of these developments to a large scale SDR task based on an archive of British English broadcast news. Keywords: Spoken Document Retrieval; Information Retrieval; Broadcast Speech; Large Vocabulary Speech Recognition. 1 Introduction Retrieval of audio segments according to their content is a challenging and significant problem. It has been estimated th...
The THISL Broadcast News Retrieval System
, 1999
Cited by 9 (1 self)
This paper describes the THISL spoken document retrieval system for British and North American Broadcast News. The system is based on the ABBOT large vocabulary speech recognizer, using a recurrent network acoustic model, and a probabilistic text retrieval system. We discuss the development of a realtime British English Broadcast News system, and its integration into a spoken document retrieval system. Detailed evaluation is performed using a similar North American Broadcast News system, to take advantage of the TREC SDR evaluation methodology. We report results on this evaluation, with particular reference to the effect of query expansion and of automatic segmentation algorithms. 1. INTRODUCTION THISL is an ESPRIT Long Term Research project in the area of speech retrieval. It is concerned with the construction of a system which performs good recognition of broadcast speech from television and radio news programmes, from which it can produce multimedia indexing data. The principal obj...
Stream combination before and/or after the acoustic model
- PROC. INT. CONF. ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING
, 2000
Cited by 9 (1 self)
Combining a number of diverse feature streams has proven to be a very flexible and beneficial technique in speech recognition. In the context of hybrid connectionist-HMM recognition, feature streams can be combined at several points. In this work, we compare two forms of combination: at the input to the acoustic model, by concatenating the feature streams into a single vector (feature combination or FC), and at the output of the acoustic model, by averaging the logs of the estimated posterior probabilities of each subword unit (posterior combination or PC). Based on four feature streams with varying degrees of mutual dependence, we find that the best combination strategy is a combination of feature and posterior combination, with streams that are more independent, as measured by an approximation to conditional mutual information, showing more benefit from posterior combination.
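The two combination points named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the probability floor, and the final renormalization step are assumptions made for illustration, and the acoustic model itself is left out. Feature combination concatenates the per-frame feature vectors of all streams into one vector; posterior combination averages the log posteriors of each subword unit across streams.

```python
import math


def feature_combination(streams):
    """FC: concatenate the per-frame feature vectors of all streams."""
    combined = []
    for vec in streams:
        combined.extend(vec)
    return combined


def posterior_combination(posteriors, floor=1e-10):
    """PC: average the log posteriors of each subword unit across streams.

    `posteriors` holds one posterior vector per stream for the same frame.
    The floor (avoids log(0)) and the renormalization to sum to one are
    assumptions of this sketch, not details taken from the abstract.
    """
    n_streams = len(posteriors)
    n_units = len(posteriors[0])
    avg_log = [
        sum(math.log(max(p[k], floor)) for p in posteriors) / n_streams
        for k in range(n_units)
    ]
    unnormalized = [math.exp(v) for v in avg_log]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


if __name__ == "__main__":
    # Two hypothetical streams for one frame.
    print(feature_combination([[0.1, 0.2], [0.3]]))
    print(posterior_combination([[0.7, 0.3], [0.6, 0.4]]))
```

Averaging in the log domain amounts to a geometric mean of the stream posteriors, so a unit that any one stream considers very unlikely is pulled down strongly in the combined estimate.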
Recognition, Indexing And Retrieval Of British Broadcast News With The Thisl System
, 1999
Cited by 8 (1 self)
This paper describes the THISL spoken document retrieval system for British and North American Broadcast News. The system is based on the ABBOT large vocabulary speech recognizer and a probabilistic text retrieval system. We discuss the development of a realtime British English Broadcast News system, and its integration into a spoken document retrieval system. Detailed evaluation is performed using a similar North American Broadcast News system, to take advantage of the TREC SDR evaluation methodology. We report results on this evaluation, with particular reference to the effect of query expansion and of automatic segmentation algorithms. 1. INTRODUCTION THISL is an ESPRIT Long Term Research project in the area of speech retrieval. It is concerned with the construction of a system which performs good recognition of broadcast speech from television and radio news programmes, from which it can produce multimedia indexing data. The principal objective of the project is to construct a spo...
The THISL Spoken Document Retrieval System
- In TREC-6
, 1998
Cited by 7 (1 self)
THISL is an ESPRIT Long Term Research Project focused on the development and construction of a system to retrieve items from an archive of television and radio news broadcasts. In this paper we outline our spoken document retrieval system based on the ABBOT speech recognizer and a text retrieval system based on Okapi term-weighting. The system has been evaluated as part of the TREC-6 and TREC-7 spoken document retrieval evaluations and we report on the results of the TREC-7 evaluation based on a document collection of 100 hours of North American broadcast news. Keywords: Multimedia Information Retrieval; Spoken Document Retrieval; Speech Recognition; Broadcast Data. 1 INTRODUCTION THISL is an ESPRIT Long Term Research project in the area of speech retrieval. It is concerned with the construction of a system which performs good recognition of broadcast speech from television and radio news programmes, from which it can produce multimedia indexing data. The project is concentrating on British an...
Abberley The THISL SDR system at TREC-9
- Proceedings of TREC-9
, 2000
Cited by 3 (0 self)
This paper describes our participation in the TREC-9 Spoken Document Retrieval (SDR) track. The THISL SDR system consists of a realtime version of a hybrid connectionist/HMM large vocabulary speech recognition system and a probabilistic text retrieval system. This paper describes the configuration of the speech recognition and text retrieval systems, including segmentation and query expansion. We report our results for development tests using the TREC-8 queries, and for the TREC-9 evaluation.
Integrated Transcription And Identification Of Named Entities In Broadcast Speech
- In Proc. Eurospeech
, 1999
Cited by 2 (0 self)
This paper presents an approach to integrating functions for both transcription and named entity (NE) identification into a large vocabulary continuous speech recognition system. It builds on the NE tagged language modelling approach, which was recently applied to the development of a statistical NE annotation system. We also present results for a proper name identification experiment using the Hub-4 evaluation data. 1. INTRODUCTION The accurate identification of proper names and other named entities (NEs) has a useful role to play in spoken language processing, as a component in speech understanding systems, and as a way of structuring recogniser output (e.g., as a cue to punctuation and capitalisation). Recently trainable hidden Markov model systems for NE identification have been reported with a precision/recall performance similar to that of the best grammar based systems and only a small amount of degradation when applied to speech recogniser output [1, 2]. We have previously presented...
The Thisl Sdr System At Trec-8
- Proc. of the 8th Text Retrieval Conference TREC-8, Nov 1999. Martine Adda-Decker, Gilles Adda
This paper describes the participation of the THISL group at the TREC-8 Spoken Document Retrieval (SDR) track. The THISL SDR system consists of the realtime version of the ABBOT large vocabulary speech recognition system and the THISLIR text retrieval system. The TREC-8 evaluation assessed SDR performance on a corpus of 500 hours of broadcast news material collected over a five month period. The main test condition involved retrieval of stories defined by manual segmentation of the corpus in which non-news material, such as commercials, was excluded. An optional test condition required retrieval of the same stories from the unsegmented audio stream. The THISL SDR system participated in both test conditions. The results show that a system such as THISL can produce respectable information retrieval performance on a realistically-sized corpus of unsegmented audio material. 1. INTRODUCTION The TREC-8 test collection was obtained from the TDT2 corpus and consisted of 902 shows (...
Language Model Adaptation In Speech Recognition Using Document Maps
- In IEEE Workshop on Neural Networks for Signal Processing (NNSP’02)
, 2002
In this paper we present speech experiments that were carried out to evaluate a topically focusing language model in large vocabulary speech recognition. An ordered topical clustering is first computed as a self-organized mapping of a large document collection. Language models are then trained for each text cluster or for several neighboring clusters. The obtained organized collection of language models is efficiently utilized in continuous speech recognition to concentrate on the model that corresponds closest to the current topic of discussion. The speech recognition experiments are carried out on a novel Finnish speech database. A property of Finnish that is particularly challenging for speech recognition is the extremely fast vocabulary growth that makes many of the standard word-based language modeling methods impractical for large vocabulary tasks.

