Results 1 - 10 of 5,549
Implementation of Vocal Tract Length Normalization for Phoneme Recognition on TIMIT Speech Corpus
"... Abstract. Inter-speaker variability, one of the problems faced in speech recognition system, has caused the performance degradation in recognizing varied speech spoken by different speakers. Vocal Tract Length Normalization (VTLN) method is known to improve the recognition performances by compensat ..."
by compensating the speech signal using specific warping factor. Experiments are conducted using TIMIT speech corpus and Hidden Markov Model Toolkit (HTK) together with the implementation of VTLN method in order to show improvement in speaker independent phoneme recognition. The results show better recognition
WTIMIT: The TIMIT Speech Corpus Transmitted Over the 3G AMR Wideband Mobile Network
"... In anticipation of upcoming mobile telephony services with higher speech quality, a wideband (50 Hz to 7 kHz) mobile telephony derivative of TIMIT has been recorded called WTIMIT. It opens up various scientific investigations; e.g., on speech quality and intelligibility, as well as on wideband upgr ..."
In anticipation of upcoming mobile telephony services with higher speech quality, a wideband (50 Hz to 7 kHz) mobile telephony derivative of TIMIT has been recorded called WTIMIT. It opens up various scientific investigations; e.g., on speech quality and intelligibility, as well as on wideband
Building a Large Annotated Corpus of English: The Penn Treebank
- COMPUTATIONAL LINGUISTICS, 1993
"... There is a growing consensus that significant, rapid progress can be made in both text understanding and spoken language understanding by investigating those phenomena that occur most centrally in naturally occurring unconstrained materials and by attempting to automatically extract information abou ..."
Cited by 2740 (10 self)
-1992), this corpus has been annotated for part-of-speech (POS) information. In addition, over half of it has been annotated for skeletal syntactic structure. These materials are available to members of the Linguistic Data Consortium; for details, see Section 5.1.
A Maximum Entropy Model for Part-Of-Speech Tagging
, 1996
"... This paper presents a statistical model which trains from a corpus annotated with Part-Of-Speech tags and assigns them to previously unseen text with state-of-the-art accuracy (96.6%). The model can be classified as a Maximum Entropy model and simultaneously uses many contextual "features" t ..."
Cited by 580 (1 self)
This paper presents a statistical model which trains from a corpus annotated with Part-Of-Speech tags and assigns them to previously unseen text with state-of-the-art accuracy (96.6%). The model can be classified as a Maximum Entropy model and simultaneously uses many contextual "
Proceedings 24 (2001), 117–123. VOWEL NORMALIZATIONS WITH THE TIMIT ACOUSTIC PHONETIC SPEECH CORPUS
"... In this paper we present preliminary results of speaker normalization procedures that were tested with all 35,385 stressed vowels of 438 male speakers in the TIMIT speech corpus. First we investigate a procedure to reduce the variance in vowel space. This procedure knows about the identity of the sp ..."
In this paper we present preliminary results of speaker normalization procedures that were tested with all 35,385 stressed vowels of 438 male speakers in the TIMIT speech corpus. First we investigate a procedure to reduce the variance in vowel space. This procedure knows about the identity
Self-organized language modeling for speech recognition
- Readings in Speech Recognition, 1990
"... In the case of a trigram language model, the probability of the next word conditioned on the previous two words is estimated from a large corpus of text. The resulting static trigram language model (STLM) has fixed probabilities that are independent of the document being dictated. To improve the l ..."
Cited by 394 (6 self)
In the case of a trigram language model, the probability of the next word conditioned on the previous two words is estimated from a large corpus of text. The resulting static trigram language model (STLM) has fixed probabilities that are independent of the document being dictated. To improve
The DARPA TIMIT Acoustic-Phonetic Continuous Speech
"... In anticipation of upcoming mobile telephony services with higher speech quality, a wideband (50 Hz to 7 kHz) mobile telephony derivative of TIMIT has been recorded called WTIMIT. It opens up various scientific investigations; e.g., on speech quality and intelligibility, as well as on wideband upgr ..."
In anticipation of upcoming mobile telephony services with higher speech quality, a wideband (50 Hz to 7 kHz) mobile telephony derivative of TIMIT has been recorded called WTIMIT. It opens up various scientific investigations; e.g., on speech quality and intelligibility, as well as on wideband
Comparative Evaluation of Speech Enhancement Methods for Robust Automatic Speech Recognition
"... A comparative evaluation of speech enhancement algorithms for robust automatic speech recognition is presented. The evaluation is performed on a core test set of the TIMIT speech corpus. Mean objective speech quality scores as well as ASR correctness scores under two noise conditions are given. Inde ..."
Cited by 1 (0 self)
A comparative evaluation of speech enhancement algorithms for robust automatic speech recognition is presented. The evaluation is performed on a core test set of the TIMIT speech corpus. Mean objective speech quality scores as well as ASR correctness scores under two noise conditions are given
The QUT-NOISE-TIMIT Corpus for the Evaluation of Voice Activity Detection Algorithms
"... The QUT-NOISE-TIMIT corpus consists of 600 hours of noisy speech sequences designed to enable a thorough evaluation of voice activity detection (VAD) algorithms across a wide variety of common background noise scenarios. In order to construct the final mixed-speech database, a collection of over 10 ..."
The QUT-NOISE-TIMIT corpus consists of 600 hours of noisy speech sequences designed to enable a thorough evaluation of voice activity detection (VAD) algorithms across a wide variety of common background noise scenarios. In order to construct the final mixed-speech database, a collection of over 10
STC-TIMIT: Generation of a single-channel telephone corpus
- In LREC, 2008
"... This paper describes a new speech corpus, STC-TIMIT, and discusses the process of design, development and its distribution through LDC. The STC-TIMIT corpus is derived from the widely used TIMIT corpus by sending it through a real and single telephone channel. TIMIT is phonetically balanced, covers ..."
Cited by 5 (0 self)
This paper describes a new speech corpus, STC-TIMIT, and discusses the process of design, development and its distribution through LDC. The STC-TIMIT corpus is derived from the widely used TIMIT corpus by sending it through a real and single telephone channel. TIMIT is phonetically balanced, covers