Automatic text-independent pronunciation scoring of foreign language student speech
Proc. Int’l Conf. on Spoken Language Processing, 1996
Cited by 36 (13 self)
SRI International is currently involved in the development of a new generation of software systems for automatic scoring of pronunciation as part of the Voice Interactive Language Training System (VILTS) project. This paper describes the goals of the VILTS system, the speech corpus, and the algorithm development. The automatic grading system uses SRI’s Decipher™ continuous speech recognition system [1] to generate phonetic segmentations that are used to produce pronunciation scores at the end of each lesson. The scores produced by the system are similar to those of expert human listeners. Unlike previous approaches in which models were built for specific sentences or phrases, we present a new family of algorithms designed to perform well even when knowledge of the exact text to be used is not available.
Automatic pronunciation scoring for language instruction
1997
Cited by 34 (10 self)
This work is part of an effort aimed at developing computer-based systems for language instruction; we address the task of grading the pronunciation quality of the speech of a student of a foreign language. The automatic grading system uses SRI’s Decipher™ continuous speech recognition system to generate phonetic segmentations. Based on these segmentations and probabilistic models, we produce pronunciation scores for individual sentences or groups of sentences. Scores obtained from expert human listeners are used as the reference to evaluate the different machine scores and to provide targets when training some of the algorithms. In previous work [1] we found that duration-based scores outperformed HMM log-likelihood-based scores. In this paper we show that we can significantly improve HMM-based scores by using average phone segment posterior probabilities. Correlation between machine and human scores went up from r = 0.50 with likelihood-based scores to r = 0.88 with posterior-based scores. The new measures also outperformed duration-based scores in their ability to produce reliable scores from only a few sentences.
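The idea of an average phone segment posterior score can be illustrated with a minimal sketch. It assumes that per-frame log-likelihoods for every phone class and a phonetic segmentation are already available from a recognizer; the function names, the data layout, and the use of phone priors are illustrative assumptions, not details taken from the paper.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def segment_posterior_score(frame_loglik, log_priors, start, end, phone):
    """Average frame-level log-posterior of `phone` over frames [start, end).

    frame_loglik[t][q] is an assumed log p(y_t | q) for frame t and phone q;
    log_priors[q] is an assumed log prior for phone q.
    """
    total = 0.0
    for t in range(start, end):
        # Bayes' rule in the log domain: log-likelihood plus log-prior,
        # normalised over all phone classes.
        joint = [ll + lp for ll, lp in zip(frame_loglik[t], log_priors)]
        total += joint[phone] - log_sum_exp(joint)
    return total / (end - start)


def utterance_posterior_score(frame_loglik, log_priors, segments):
    """Mean of the segment scores; segments is a list of (start, end, phone)."""
    scores = [
        segment_posterior_score(frame_loglik, log_priors, s, e, q)
        for s, e, q in segments
    ]
    return sum(scores) / len(scores)
```

Dividing by the segment length keeps long and short phones on the same scale. A score is at most 0, and values closer to 0 mean the aligned phone was the most probable class in more of its frames.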
Combination of Machine Scores for Automatic Grading of Pronunciation Quality
1998
Cited by 21 (5 self)
This work is part of an effort aimed at developing computer-based systems for language instruction; we address the task of grading the pronunciation quality of the speech of a student of a foreign language. The automatic grading system uses SRI's Decipher™ continuous speech recognition system to generate phonetic segmentations. Based on these segmentations and probabilistic models, we produce different pronunciation scores for individual sentences or groups of sentences that can be used as predictors of the pronunciation quality. Different types of these machine scores can be combined to obtain a better prediction of the overall pronunciation quality. In this paper we review some of the best-performing machine scores, and discuss the application of several methods based on linear and nonlinear mapping and combination of individual machine scores to predict the pronunciation quality grade that a human expert would have given. We evaluate these methods on a database that consists of pronunciation-quality-graded speech from American students speaking French. With predictors based on spectral match and on durational characteristics, we find that the combination of scores improved the prediction of the human grades and that nonlinear mapping and combination methods performed better than linear ones. Characteristics of the different nonlinear methods studied are discussed.
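As a rough illustration of the linear case only, the sketch below fits least-squares weights that map a vector of machine scores (for example one spectral-match score and one duration score) to a human grade. The function names and the plain normal-equations solver are assumptions made for this example; the abstract does not specify how the linear combination was estimated, and the nonlinear methods it mentions are not shown.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.

    Assumes the matrix is non-singular; a singular system raises
    ZeroDivisionError.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear_combination(machine_scores, human_grades):
    """Least-squares weights (bias first) from score vectors to grades."""
    rows = [[1.0] + list(scores) for scores in machine_scores]
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * g for r, g in zip(rows, human_grades)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_grade(weights, scores):
    """Apply the fitted bias and weights to one vector of machine scores."""
    return weights[0] + sum(w * s for w, s in zip(weights[1:], scores))
```

A call such as `fit_linear_combination(scores, grades)` followed by `predict_grade(weights, new_scores)` gives the predicted grade for an unseen speaker; in practice the weights would be fitted on one set of speakers and evaluated on another.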
Detection of Foreign Speakers' Pronunciation Errors for Second Language Training - Preliminary Results
In ICSLP '96, 1996
Cited by 18 (4 self)
With the present generation of speech recognizers, which handle speaker-independent continuous speech and medium-sized vocabularies, the range of possible applications is growing. Yet some applications have not yet been tried, or have been tried only with heavy constraints on the user, because of expected poor recognition performance. In addition, the lack of results to date in the domain of prosody has severely limited use of that information. Researchers may be overly pessimistic. Herein we explore the possibility of using CMU's SPHINX II recognizer and of obtaining correct prosody information in order to implement it in a system to aid in foreign language learning.
Automatic Pronunciation Scoring of Specific Phone Segments for Language Instruction
In Proc. Eurospeech97, Vol.2, 1997
Cited by 17 (6 self)
The aim of the work described in this paper is to develop methods for automatically assessing the pronunciation quality of specific phone segments uttered by students learning a foreign language. From the phonetic time alignments generated by SRI’s Decipher™ HMM-based speech recognition system, we use various probabilistic models to produce pronunciation scores for the phone utterance. We evaluate the performance of the proposed algorithms by measuring how well the machine-produced scores correlate with human judgments on a large database. Of the various algorithms considered, the one based on phone log-posterior-probability produced the highest correlation (r_xy = 0.72) with the human ratings, which was comparable with correlations between human raters.
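The evaluation described above relies on the correlation between machine scores and human ratings. A minimal sketch of the sample Pearson correlation coefficient is given below; the function name and the toy inputs in the usage note are hypothetical and are not data from the paper.

```python
import math


def pearson_correlation(xs, ys):
    """Sample Pearson correlation r_xy between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances without the 1/(n-1) factor, which cancels.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Calling `pearson_correlation(machine_scores, human_ratings)` on paired per-segment values gives a number between -1 and 1; the same function applied to the ratings of two human raters gives the inter-rater figure that the machine correlation is compared against.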
Speech technology in computer-aided language learning: Strengths and limitations of a new CALL paradigm
Language Learning & Technology, 1998
Cited by 15 (0 self)
We investigate the suitability of deploying speech technology in computer-based systems that can be used to teach foreign language skills. In reviewing the current state of speech recognition and speech processing technology and by examining a number of voice-interactive CALL applications, we suggest how to create robust interactive learning environments that exploit the strengths of speech technology while working around its limitations. In the conclusion, we draw on our review of these applications to identify directions of future research that might improve both the design and the overall performance of voice-interactive CALL systems.
Automatic Scoring of Pronunciation Quality
Speech Communication, 1999
Cited by 15 (4 self)
We present a paradigm for the automatic assessment of pronunciation quality by machine.
Automatic Detection of Mispronunciation for Language Instruction
Proc. of Eurospeech 97
Cited by 14 (6 self)
This work is part of a project aimed at developing a speech recognition system for language instruction that can assess the quality of pronunciation, identify pronunciation problems, and provide the student with accurate feedback about specific mistakes. Previous work was mainly concerned with scoring the quality of pronunciation. In this work we focus on automatic detection of mispronunciation. While scoring quantifies the mispronunciation, detection identifies the occurrence of a specific problem. Detecting pronunciation problems is necessary for providing feedback to the student. We use pronunciation scoring techniques to evaluate the performance of our mispronunciation model.
Modeling the language assessment process and result: Proposed architecture for an automatic oral proficiency assessment
Cited by 1 (0 self)
We outline challenges for modeling human language assessment in automatic systems, both in terms of the process and the reliability of the result. We propose an architecture for a system to evaluate examinees via the Computerized Oral Proficiency Instrument, to determine whether they have 'reached' or 'not reached' the Intermediate Low level of proficiency, according to the American Council on the Teaching of Foreign Languages (ACTFL) Speaking Proficiency Guidelines. Our system divides the acoustic and non-acoustic features, incorporating human process modeling where permitted by the technology and required by the domain. We suggest machine learning techniques applied to this type of system permit insight into yet unarticulated aspects of the human rating process.
1 Introduction
Computer-mediated language assessment appeals to educators and language evaluators because it has the potential for making language assessment widely available with minimal human effort and limited expense. F...

