Techniques for modelling Phonological Processes in Automatic Speech Recognition
2001. Cited by 6 (0 self).
Systems which automatically transcribe carefully dictated speech are now commercially available, but their performance degrades dramatically when the speaking style of users becomes more relaxed or conversational. This dissertation focuses on techniques that aim to improve the robustness of statistical speech transcription systems to conversational speaking styles. The dissertation shows first that the performance degradation occurring as speech becomes more conversational is severe and is partially attributable to differences in the acoustic realizations of sentences. Hypothesizing that the quantifiably wider range of
Performance Prediction for Exponential Language Models
Cited by 5 (3 self).
We investigate the task of performance prediction for language models belonging to the exponential family. First, we attempt to empirically discover a formula for predicting test set cross-entropy for n-gram language models. We build models over varying domains, data set sizes, and n-gram orders, and perform linear regression to see whether we can model test set performance as a simple function of training set performance and various model statistics. Remarkably, we find a simple relationship that predicts test set performance with a correlation of 0.9997. We analyze why this relationship holds and show that it holds for other exponential language models as well, including class-based models and minimum discrimination information models. Finally, we discuss how this relationship can be applied to improve language model performance.
The 1998 HTK Broadcast News Transcription System: Development and Results
1999. Cited by 3 (0 self).
This paper presents the development of the HTK broadcast news transcription system for the November 1998 Hub4 evaluation. Relative to the previous year's system, a number of features were added, including vocal tract length normalisation; cluster-based variance normalisation; double the quantity of acoustic training data; interpolated word level language models to combine text sources; increased broadcast news language model training data; and an extra adaptation stage using a full-variance transform. Overall these changes to the system reduced the error rate by 13% on the 1997 evaluation data and the final system had an overall word error rate of 13.8% for the 1998 evaluation data sets.
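As a rough illustration of the interpolated word-level language models mentioned in the abstract above, the sketch below mixes the probabilities that several component models assign to the same word. The function name `interpolate`, the toy probabilities, and the weights are assumptions made for illustration; they are not values or code from the paper.

```python
from typing import Sequence


def interpolate(probs: Sequence[float], weights: Sequence[float]) -> float:
    """Linear interpolation: P(w|h) = sum_k lambda_k * P_k(w|h)."""
    if len(probs) != len(weights):
        raise ValueError("one weight per component model is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")
    return sum(lam * p for lam, p in zip(weights, probs))


# Hypothetical probabilities of one word under two component models,
# each trained on a different text source, mixed with assumed weights.
p_mixed = interpolate([0.02, 0.005], [0.7, 0.3])
```

In practice the weights would be tuned on held-out data; fixed weights are used here only to keep the sketch short.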
Efficient Class-Based Language Modelling For Very Large Vocabularies
2001. Cited by 3 (0 self).
This paper investigates the perplexity and word error rate performance of two different forms of class model and the respective data-driven algorithms for obtaining automatic word classifications. The computational complexity of the algorithm for the 'conventional' two-sided class model is found to be unsuitable for very large vocabularies (>100k) or large numbers of classes (>2000). A one-sided class model is therefore investigated and the complexity of its algorithm is found to be substantially less in such situations. Perplexity results are reported on both English and Russian data. For the latter both 65k and 430k vocabularies are used. Lattice rescoring experiments are also performed on an English language broadcast news task. These experimental results show that both models, when interpolated with a word model, perform similarly well. Moreover, classifications are obtained for the one-sided model in a fraction of the time required by the two-sided model, especially for very large vocabularies.
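The abstract above does not give the model equations, so the following is only a sketch under a commonly used formulation, assumed here: the two-sided bigram class model scores a word as P(w | c(w)) * P(c(w) | c(v)), while the one-sided model conditions the word directly on the class of its predecessor, P(w | c(v)). The toy corpus, the hand-written class map, and all function names are hypothetical, and the clustering algorithms themselves are not shown.

```python
from collections import Counter


def train(tokens, word_class):
    """Collect the maximum-likelihood counts needed by both class models."""
    pairs = list(zip(tokens, tokens[1:]))  # (previous word v, current word w)
    return {
        "word": Counter(tokens),
        "cls": Counter(word_class[w] for w in tokens),
        "hist": Counter(word_class[v] for v, _ in pairs),
        "cls_cls": Counter((word_class[v], word_class[w]) for v, w in pairs),
        "cls_word": Counter((word_class[v], w) for v, w in pairs),
    }


def two_sided(w, v, c, word_class):
    """Assumed form: P(w | c(w)) * P(c(w) | c(v))."""
    cw, cv = word_class[w], word_class[v]
    return (c["word"][w] / c["cls"][cw]) * (c["cls_cls"][(cv, cw)] / c["hist"][cv])


def one_sided(w, v, c, word_class):
    """Assumed form: P(w | c(v)); only the history is mapped to a class."""
    cv = word_class[v]
    return c["cls_word"][(cv, w)] / c["hist"][cv]


# Toy data with a hand-assigned (hypothetical) classification.
tokens = "the cat sat the dog sat".split()
word_class = {"the": "DET", "cat": "N", "dog": "N", "sat": "V"}
counts = train(tokens, word_class)
p_two = two_sided("dog", "the", counts, word_class)
p_one = one_sided("dog", "the", counts, word_class)
```

The sketch uses unsmoothed relative frequencies, so it would divide by zero for a history class that never occurs; a real model would add smoothing.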
Automatic Capitalisation Generation for Speech Input
Cited by 3 (0 self).
Two different systems are proposed for the task of capitalisation generation. The first system is a slightly modified speech recogniser. In this system, every word in the vocabulary is duplicated: once in a decapitalised form and again in capitalised forms. In addition, the language model is re-trained on mixed case texts. The other system
Improvements In Accuracy And Speed In The HTK Broadcast News Transcription System
1999. Cited by 1 (0 self).
This paper describes a number of recent improvements to the HTK Broadcast News Transcription System. Changes to the system include the use of more acoustic training data; use of cluster-based variance normalisation and vocal tract length normalisation; the use of interpolated language models and enhanced adaptation using a full variance transform. These changes produce a reduction in word error rate of 13%. A simplified version of the system has also been constructed that runs in less than 10 times real-time and gives an error rate 2.3% absolute higher than the 300xRT full system. 1. INTRODUCTION There is currently much interest in the automatic transcription of found speech and general audio sources. This paper takes as its starting point the HTK Broadcast News Transcription System used in the 1997 DARPA/NIST Hub4 evaluation. Changes to the system are described which both improve transcription accuracy and allow a simplified version to operate in less than 10 times real-time on commod...
Named entity recognition from speech and its use in the generation of enhanced speech recognition output
2001. Cited by 1 (0 self).
The work in this thesis concerns Named Entity (NE) recognition from speech and its use in the generation of enhanced speech recognition output with automatic punctuation and automatic capitalisation. A method for the automatic generation of rules is proposed for NE recognition. Punctuation marks are generated using context and prosody information. Capitalisation is produced based on the results of NE recognition and punctuation generation. Previous work regarding the NE task is mainly categorised by hand crafted rule-based systems and stochastic systems. By contrast, in this thesis, an automatic rule generating method, which uses the Brill rule inference approach, is proposed. The performance of the rule-based NE recogniser is compared with that of BBN’s commercial implementation called IdentiFinder. When only the sequences of words are available, both systems show almost equal performance, as is also the case with additional information such as punctuation, capitalisation and name lists. In cases where input texts are corrupted by speech recognition errors, the performances of both systems are degraded by almost the same level. Although the rule-based approach is different
The Hidden Vector State Language Model
The Hidden Vector State (HVS) model extends the basic Hidden Markov Model (HMM) by encoding each state as a vector of stack states but with restricted stack operations. The model uses a right branching stack automaton to assign valid stochastic parses to a word sequence from which the language model probability can be estimated. The model is completely data driven and is able to model classes from the data that reflect the hierarchical structures found in natural language. This paper describes the design and the implementation of the HVS language model [1], focusing on the practical issues of initialisation and training using Baum-Welch re-estimation whilst accommodating a large and dynamic state space. Results of experiments conducted using the ATIS corpus [2] show that the HVS language model reduces test set perplexity compared to standard class based language models.
CSE 256 (Spring 2004)
1.4 The norms accepted: ASTM, Asm, ANSI, Cei, Isa, IEEE, Nfpa, Cee, Oms, En, afnor, bs, Asme. 1.2 Technical conditions: capacity 2 m3/h for 8 h non-stop (to be confirmed). The system must be connected to the national water network (Ade) at a pressure of 3 bar and at a temperature following the ambient conditions. The characteristics of the water to be treated are noted in the chapter 2.2.1 General demand. Reference in this field. Experience in this type of installation. Experience in the treatment of water. Reference in Algeria (all types of project). 3.1 The technique of water treatment. Ref: drawing .1. Done by Didine Abdoune following 3.1. 3.3 Demands specified: 1. The treatment of water must follow these points. The treatment of water must be done following the data in the table of the characteristics of the water (table 1). Treatment of water containing floating particles. Treatment of water where the turbidity is high. Correction of the smell and the taste, and elimination of the chloral. The water must not contain um1. Bacteriological treatment, disinfection and sterilization of the water. For the sterilization the UV method must be used; it must be within the norms of the OMS or the European Union. Demands on the equipment: there must be a storage tank with a capacity of 15 m3 for the untreated water, with venting valve, manhole, overflow line, and bady switched to indicate the level for pump number 2, so that it starts and stops following the information given by the BDY SWICHE. Storage tank for produced water with a capacity of 4 m3 x 2 in parallel (valve for bypass), sight glass, safety valve, overflow and manhole.
Bd swich for the control of the start and stop of pump no. 2. Pipeline: the material must be STM a 120 SCH 80 or equivalent. The connection from the man must be EPOER; the pipe must withstand the following situations: thermal dilation and contraction; vibration; effects of the temperature; support flexibility; an over-thickness of 1.6 mm for corrosion. The sludge and drain must be connected to the existing system. The thread must be of NPT type, drilled and threaded in conformance with ANSI B 1.20.1. The flanges must be ANSI class 150/PN20. A manometer must be installed in the piping. Non-return valve, drainer, venting and sample. PVC and the aplomb are excluded from the installation. Civil engineering and steel structure: all units must be mounted on skids and chassis; the chassis will be fixed on a foundation platform of reinforced concrete and welded inside the structure; the chassis must be of galvanized type and will have a low ding point. The offers must also include the installation housing, steel structure, etc.; all propositions are welcome. The cover of the housing must be made of TN40. The construction must be earthquake-proof. Instrumentation and control: the installation must have measurement and control of the following: the hardness; the pH; pressure indicator; pressure in the filters and alarms. Power available: 380 V; 3 phases, neutral; frequency 50 Hz. Spare parts: a list of spare parts for a period of 2 years; this list must include consumable parts and replacement parts for the whole unit. There must be a list of spare parts for start-up, including the tools and special tools for the operation and maintenance of the installation. If a chemical product is included in the process, information about it must be provided. The noise restriction is 85 dB. The provider must guarantee 20 years of providing ( ) 20 years of spare parts. The time of delivery is 6 months starting from the day of the signature of the contract?
Test: the unit must be tested in the presence of 2 eyewitnesses from Sonatrach. The submission must include: drawings of the implantation of the unit; foundation drawing and the steel structure; process drawings; electrical drawings; specifications of the equipment. List of data sheets of the instrumentation. Isometric drawings. Operating manuals. Planning for the global phases.
CSE 254 (Spring 2003)
We implement variable n-grams using a word-tree data structure, where nodes represent sequences of words ("contexts") and store how often that context appeared in a training corpus. We build the tree by growing it from the root up. Unlike other methods, there is no pruning step. Instead, we use the simple heuristic of maintaining a priority queue of candidate leaves, sorted by how often those contexts occurred in the training text. The most popular leaves are then added to the tree, and this process repeats until a specified memory limit is reached. In this way, the tree is able to make branches for longer sentence fragments like "across the street from the" while saving the space that storing uncommon ones would take.
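The growth procedure described in the abstract above can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the name `grow_tree`, the use of a node budget (`max_nodes`) as a stand-in for the memory limit, the `max_order` cap, and the choice to extend a context by appending the next word are all assumptions.

```python
import heapq
from collections import Counter


def grow_tree(tokens, max_nodes, max_order=5):
    """Grow a context tree by repeatedly adding the most frequent candidate leaf."""
    # Count every word sequence of up to max_order words (fine for a toy corpus).
    counts = Counter(
        tuple(tokens[i:i + n])
        for n in range(1, max_order + 1)
        for i in range(len(tokens) - n + 1)
    )
    vocab = sorted(set(tokens))
    tree = {(): len(tokens)}  # the root is the empty context
    # heapq is a min-heap, so counts are negated to pop the most frequent first.
    heap = [(-counts[(w,)], (w,)) for w in vocab]
    heapq.heapify(heap)
    while heap and len(tree) < max_nodes:
        neg_count, context = heapq.heappop(heap)
        tree[context] = -neg_count
        if len(context) < max_order:
            # Children of the new node become candidate leaves.
            for w in vocab:
                child = context + (w,)
                if counts[child] > 0:
                    heapq.heappush(heap, (-counts[child], child))
    return tree


# Toy usage: with a budget of four nodes, the rare word "c" never gets a node.
tree = grow_tree("a b a b a c".split(), max_nodes=4)
```

Because a child can never be more frequent than its parent, frequent long contexts are reached through their frequent prefixes, which is how a tree grown this way can keep long common fragments without a separate pruning pass.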

