Lattice-based search for spoken utterance retrieval
- In Proceedings of HLT-NAACL 2004, 2004
Cited by 32 (8 self)
Recent work on spoken document retrieval has suggested that it is adequate to take the single-best output of ASR and perform text retrieval on this output. This is reasonable enough for the task of retrieving broadcast news stories, where word error rates are relatively low and the stories are long enough to contain much redundancy. But it is patently not reasonable if one's task is to retrieve a short snippet of speech in a domain where WERs can be as high as 50%; such would be the situation with teleconference speech, where one's task is to find if and when a participant uttered a certain phrase. In this paper we propose an indexing procedure for spoken utterance retrieval that works on lattices rather than just single-best text. We demonstrate that this procedure can improve F scores by over five points compared to single-best retrieval on tasks with poor WER and low redundancy. The representation is flexible so that we can represent both word lattices and phone lattices, the latter being important for improving performance when searching for phrases containing OOV words.
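As a rough illustration of why indexing lattices can help, the toy sketch below stores every word hypothesis of an utterance together with its posterior, so that a word missing from the single-best path can still be retrieved. The data layout, the names `build_index` and `search`, the posteriors, and the threshold are all assumptions made for illustration; this is not the indexing procedure of the paper, which also covers phrases and phone lattices.

```python
from collections import defaultdict

# Hypothetical toy lattices: each arc is (word, posterior probability).
# A real lattice also carries start/end nodes and times; omitted here.
lattices = {
    "utt1": [("budget", 0.6), ("bucket", 0.4), ("meeting", 0.9)],
    "utt2": [("bucket", 0.7), ("budget", 0.3), ("list", 0.8)],
}


def build_index(lattices):
    """Map each word to {utterance_id: summed arc posterior}."""
    index = defaultdict(dict)
    for utt_id, arcs in lattices.items():
        for word, posterior in arcs:
            index[word][utt_id] = index[word].get(utt_id, 0.0) + posterior
    return index


def search(index, word, threshold=0.1):
    """Return (utterance_id, score) pairs at or above the threshold, best first."""
    hits = [(u, s) for u, s in index.get(word, {}).items() if s >= threshold]
    return sorted(hits, key=lambda hit: -hit[1])


index = build_index(lattices)
lattice_hits = search(index, "budget", threshold=0.2)
# A high threshold keeps only utterances where the word dominates,
# which roughly mimics single-best retrieval in this toy setup.
single_best_like_hits = search(index, "budget", threshold=0.5)
```

In this toy data, "budget" is not the top hypothesis in `utt2`, so only the lower threshold returns it. Lowering the threshold trades precision for recall, which is the kind of trade-off an F score summarizes.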
Retrieval Of Broadcast News Speech In Mandarin Chinese Collected In Taiwan Using Syllable-Level Statistical Characteristics
- Proceedings of the 2000 International Conference on Acoustics, Speech and Signal Processing, 2000
Cited by 13 (1 self)
Spoken document retrieval has been extensively studied in recent years because of its high potential in various applications in the near future. Considering the monosyllabic structure of the Chinese language, a whole class of indexing features for retrieval of spoken documents in Mandarin Chinese using syllable-level statistical characteristics has been studied, and very encouraging experimental results on retrieval of broadcast news speech collected in Taiwan were obtained. This paper reports some interesting initial results and findings obtained in this research.
1. INTRODUCTION
Network technologies and Internet activities have created a completely new information era. Intelligent and efficient information retrieval techniques that give Internet users easy access to spoken documents, such as broadcast radio and television programs, have become highly desirable and have been extensively studied in recent years [1-6]. At the same time, the DARPA Hub-4 contest that began in...
Retrieval of Mandarin Broadcast News Using Spoken Queries
- In Proc. International Conference on Spoken Language Processing (ICSLP2000), 2000
Cited by 2 (0 self)
Considering the monosyllabic structure of the Chinese language, a whole class of indexing features for retrieval of Mandarin broadcast news using syllable-level statistical characteristics has been previously investigated. This paper presents the improvements achieved over the previous results. The major differences are: (1) Multi-scale character- and word-level indexing terms have been integrated with the syllable-level information. (2) Information cues from the contemporary newswire text corpus have been used to create more accurate syllable indexing terms. (3) Automatic document expansion, blind relevance feedback, and query expansion via the term association matrix have been applied in retrieval. With all these schemes, the average precision can be improved from 55.46% to 71.29%.
Content-based Language Models for Spoken Document Retrieval
- In Proceedings of the 5th International Workshop on Information Retrieval with Asian Languages (IRAL 2000), 2000
Cited by 1 (0 self)
Spoken document retrieval (SDR) has been extensively studied in recent years because of its potential use in navigating large multimedia collections in the near future. This paper presents a novel concept of applying content-based language models to spoken document retrieval. In an example task for retrieval of Mandarin Chinese broadcast news data, the content-based language models either trained on automatic transcriptions of spoken documents or adapted from baseline language models using automatic transcriptions of spoken documents were used to create more accurate recognition results and indexing terms from both spoken documents and speech queries. We report on some interesting findings obtained in this research.
Phone-Based Spoken Document Retrieval in Conformance with the MPEG-7 Standard
- 2004
Cited by 1 (0 self)
This paper presents a phone-based approach to spoken document retrieval, developed in the framework of the emerging MPEG-7 standard. The audio part of MPEG-7 includes a SpokenContent tool that provides a standardized description of the content of spoken documents. In the context of MPEG-7, we propose an indexing and retrieval method that uses phonetic information only and a vector space IR model. Experiments are conducted on a database of German spoken documents, with 10 city name queries. Two phone-based retrieval approaches are presented and combined. The first one is based on the combination of phone N-grams of different lengths used as indexing terms. The other consists of expanding the document representation by means of phone confusion probabilities.
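The first approach mentioned above, phone N-grams of several lengths used as indexing terms in a vector space model, can be sketched as follows. This is a minimal illustration under stated assumptions: the phone symbols, the choice of lengths 2 and 3, raw counts as term weights, and the helper names `phone_ngrams` and `cosine` are all hypothetical and not taken from the paper or from the MPEG-7 SpokenContent description.

```python
import math
from collections import Counter


def phone_ngrams(phones, lengths=(2, 3)):
    """Count phone N-grams of each requested length (assumed: raw counts as weights)."""
    grams = Counter()
    for n in lengths:
        for i in range(len(phones) - n + 1):
            grams[tuple(phones[i:i + n])] += 1
    return grams


def cosine(a, b):
    """Cosine similarity between two sparse N-gram count vectors."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical phone strings; the symbols are illustrative only.
query = phone_ngrams(["m", "Y", "n", "C", "@", "n"])
doc_with_query = phone_ngrams(["I", "n", "m", "Y", "n", "C", "@", "n", "a", "U", "x"])
doc_without_query = phone_ngrams(["b", "E", "6", "l", "i:", "n"])

score_match = cosine(query, doc_with_query)
score_other = cosine(query, doc_without_query)
```

A document whose phone string contains the query sequence shares all of the query's N-grams and scores higher than one that shares none. Mixing lengths lets short N-grams tolerate recognition errors while longer ones add specificity.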
Automatic Topic Detection Strategy for Information Retrieval in Spoken Document
Cited by 1 (1 self)
This paper suggests an alternative solution for the task of spoken document retrieval (SDR). The proposed system runs retrieval on multi-level transcriptions (word and phone) produced by word and phone recognizers respectively, and their outputs are combined. We propose to use a latent Dirichlet allocation (LDA) model for capturing the semantic information in the word transcription. The LDA model is employed for estimating the topic distribution in queries and word-transcribed spoken documents, and the matching is performed at the topic level. Acoustic matching between query words and phonetically transcribed spoken documents is performed using a phone-based matching algorithm. The results of the acoustic and topic-level matching methods are compared and shown to be complementary.
The SoVideo broadcast news retrieval system for Mandarin Chinese
This paper describes the SoVideo broadcast news retrieval system for Mandarin Chinese. The system is based on technologies such as large-vocabulary continuous speech recognition for Mandarin Chinese, automatic story segmentation, and information retrieval. Until now, the database consisted of 177 hours of broadcast news, which yielded 3264 stories by automatic story segmentation. We discuss the development of the retrieval system and the evaluation of each component and of the system as a whole.
Combining Confusion Networks with Probabilistic Phone Matching for Open-Vocabulary Keyword Spotting in Spontaneous Speech Signal
www.nue.tu-berlin.de
In this paper, we study several methods for keyword spotting in spontaneous speech signals. A novel method combining the probabilistic phone matching (PSM) approach with word confusion networks (WCN) is proposed for open-vocabulary

