Results 11-20 of 30
Stable Encoding of Finite-State Machines in Discrete-Time Recurrent Neural Nets with Sigmoid Units
, 1998
Cited by 13 (3 self)
Abstract:
In recent years, there has been a lot of interest in the use of discrete-time recurrent neural nets (DTRNN) to learn finite-state tasks, with interesting results regarding the induction of simple finite-state machines from input-output strings. Parallel work has studied the computational power of DTRNN in connection with finite-state computation. This paper describes a simple strategy to devise stable encodings of finite-state machines in computationally capable discrete-time recurrent neural architectures with sigmoid units, and gives a detailed presentation of how this strategy may be applied to encode a general class of finite-state machines in a variety of commonly-used first- and second-order recurrent neural networks. Unlike previous work that either imposed some restrictions on state values, or used a detailed analysis based on fixed-point attractors, the present approach applies to any positive, bounded, strictly growing, continuous activation function, and uses simple bounding criteri...
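The encoding idea the abstract describes can be sketched in a few lines. This is a minimal illustration under assumptions, not the paper's exact construction: a second-order recurrent net whose weights W[i][j][k] equal 1 iff the DFA maps state q_j under symbol k to state q_i, driven through a high-gain sigmoid so activations stay saturated near one-hot state codes. The toy DFA (even number of 'a's) and the gain value H are hypothetical choices.

```python
import math

# Hypothetical toy DFA: accepts strings over {a, b} with an even number of 'a's.
STATES = [0, 1]                     # state 1 = odd number of 'a's seen so far
SYMBOLS = {"a": 0, "b": 1}
DELTA = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}

H = 16.0                            # sigmoid gain; large => near-binary units

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Second-order transition weights derived from the DFA table:
# W[i][j][k] = 1 iff delta(q_j, symbol_k) = q_i.
W = [[[0.0] * len(SYMBOLS) for _ in STATES] for _ in STATES]
for (j, sym), i in DELTA.items():
    W[i][j][SYMBOLS[sym]] = 1.0

def run(string):
    s = [1.0, 0.0]                  # start at the one-hot code of state 0
    for ch in string:
        u = [0.0] * len(SYMBOLS)
        u[SYMBOLS[ch]] = 1.0        # one-hot input symbol
        s = [sigmoid(H * (sum(W[i][j][k] * s[j] * u[k]
                              for j in STATES
                              for k in range(len(SYMBOLS))) - 0.5))
             for i in STATES]
    return s                        # stays close to a one-hot state vector
```

With a one-hot state and input, the weighted sum is near 1 for the unique next state and near 0 elsewhere, so the sigmoid pushes the state back toward a corner of the unit hypercube at every step; this saturation is what makes the encoding stable over arbitrarily long strings.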
Parametric Subspace Modeling Of Speech Transitions
 Speech Communication
, 1998
Cited by 11 (2 self)
Abstract:
This report describes an attempt at capturing segmental transition information for speech recognition tasks. The slowly varying dynamics of spectral trajectories carries much discriminant information that is very crudely modelled by traditional approaches such as HMMs. In approaches such as recurrent neural networks there is the hope, but not the convincing demonstration, that such transitional information could be captured. The method presented here starts from the very different position of explicitly capturing the trajectory of short-time spectral parameter vectors on a subspace in which the temporal sequence information is preserved. We approach this by introducing a temporal constraint into the well-known technique of Principal Component Analysis. On this subspace, we attempt a parametric modelling of the trajectory, and compute a distance metric to perform classification of diphones. We use the principal curves method of Hastie and Stuetzle and the Generative Topographic Map (GTM...
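The subspace-projection step can be illustrated with plain PCA on a synthetic spectral trajectory. This is a simplified stand-in under assumptions: the report's method adds a temporal constraint to PCA, which is omitted here; the synthetic "frames", their dimensions, and the noise level are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "spectral trajectory": T frames of D-dim cepstral-like vectors
# tracing a slow arc plus noise (a stand-in for a real diphone transition).
T, D = 40, 12
t = np.linspace(0.0, 1.0, T)
basis = rng.standard_normal((2, D))
frames = (np.outer(np.sin(np.pi * t), basis[0])
          + np.outer(t, basis[1])
          + 0.05 * rng.standard_normal((T, D)))

# Ordinary PCA: project every frame onto the top-2 principal subspace.
# Because each frame keeps its time index, the projected points form an
# ordered low-dimensional trajectory that a parametric curve can then model.
mean = frames.mean(axis=0)
centered = frames - mean
_, _, vt = np.linalg.svd(centered, full_matrices=False)
subspace = vt[:2]                    # 2 x D principal directions
trajectory = centered @ subspace.T   # T x 2 trajectory, ordered in time

print(trajectory.shape)  # (40, 2)
```

Since the synthetic data is essentially two-dimensional plus noise, the two-component projection reconstructs the frames almost exactly; a principal-curve or GTM fit would then operate on `trajectory`.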
Multilayer perceptrons and probabilistic neural networks for phoneme recognition
 In: Proc Eurospeech '93, 3rd European Conference on Speech Communication and Technology
, 1993
Cited by 9 (4 self)
Abstract:
Two artificial neural networks have been trained to recognise phonemes in continuous speech: multilayer perceptron (MLP) nets and probabilistic neural networks (PNN). The speech material was recorded by one male Swedish speaker and the sentences were phonetically labelled. Fifty sentences were used for training and another fifty were used for testing. Both networks had a single hidden layer and 38 output nodes corresponding to Swedish phonemes. The MLP was trained by the supervised backpropagation algorithm. The PNN was trained by a self-organising clustering algorithm, a stochastic approximation to the expectation-maximisation algorithm. The classification results for a feedforward MLP and the PNN were rather similar, but an MLP with simple recurrency using context nodes gave the best performance. Several other differences of practical value were noted.
Continuous Speech Recognition in the WAXHOLM Dialogue System
, 1996
Cited by 9 (0 self)
Abstract:
This paper presents the status of the continuous speech recognition engine of the WAXHOLM project. The engine is a software-only system written in portable C code. The design is flexible and different modes for phonetic pattern matching are available. In particular, artificial neural networks and standard multiple Gaussian mixtures are implemented for phone probability estimation, and for research purposes, a general mode where the input consists of a phone graph also exists. A lexicon with multiple pronunciations for many words and a class bigram grammar is used. The lexicon and grammar constraints are represented by a lexical graph, optimised for efficient lexical decoding. The decoding is performed in a two-pass search. The first pass is a Viterbi beam search and the second is an A* stack-decoding search. Pruning strategies and memory management in the two passes are discussed in the report. Several different output formats are available. Results can be reported either on the word or phoneme level, with or without time alignment information. Multiple hypotheses can be output either as standard N-best lists or in a more compact word-graph format. Continuous speech recognition can be performed on a standard UNIX workstation in real time with a lexicon of about 1000 words.
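The first-pass idea, Viterbi search with beam pruning, can be sketched on a miniature lattice. This is a minimal illustration under assumptions: the WAXHOLM engine decodes over a full lexical graph in C, while the toy states, transition log-probabilities, and beam width below are hypothetical.

```python
# Miniature Viterbi beam search over a lattice of states with per-frame
# log-probability scores. Hypotheses falling more than `beam` below the
# best current score are pruned, which is the essence of the first pass.

def viterbi_beam(frame_scores, transitions, beam=5.0):
    """frame_scores: list of {state: log P(frame | state)} per frame;
    transitions: {state: [(next_state, log_transition_prob), ...]}."""
    active = {"<s>": (0.0, ["<s>"])}          # state -> (score, best path)
    for scores in frame_scores:
        new = {}
        for state, (score, path) in active.items():
            for nxt, ltp in transitions.get(state, []):
                if nxt not in scores:
                    continue
                cand = score + ltp + scores[nxt]
                if nxt not in new or cand > new[nxt][0]:
                    new[nxt] = (cand, path + [nxt])
        if not new:
            return None                        # all hypotheses died
        best = max(s for s, _ in new.values())
        # Beam pruning: keep only hypotheses within `beam` of the best.
        active = {st: sp for st, sp in new.items() if sp[0] >= best - beam}
    best_state = max(active, key=lambda st: active[st][0])
    return active[best_state]

# Hypothetical 3-frame example favouring the phone sequence a, a, b.
frames = [{"a": -0.1, "b": -2.0},
          {"a": -0.1, "b": -2.0},
          {"a": -2.0, "b": -0.1}]
transitions = {"<s>": [("a", -0.5), ("b", -0.5)],
               "a": [("a", -0.3), ("b", -0.7)],
               "b": [("b", -0.3)]}
score, path = viterbi_beam(frames, transitions)
print(path)  # ['<s>', 'a', 'a', 'b']
```

A second A* stack-decoding pass, as in the paper, would then rescore the surviving hypotheses; only the pruning mechanism is shown here.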
Automatic Continuous Speech Recognition with Rapid Speaker Adaptation for Human/Machine Interaction
, 1997
Cited by 8 (0 self)
Abstract:
This thesis presents work in three main directions of the automatic speech recognition field. The work within two of these, dynamic decoding and hybrid HMM/ANN speech recognition, has resulted in a real-time speech recognition system, currently in use in the human/machine dialogue demonstration system WAXHOLM, developed at the department. The third direction is fast unsupervised speaker adaptation, where "fast" refers to adaptation with a small amount of adaptation speech. The work in
Neurocomputing on the RAP
 Digital Parallel Implementations of Neural Networks
, 1992
Cited by 8 (5 self)
Abstract:
In 1989 we designed and implemented a Ring Array Processor (RAP) for fast execution of our continuous speech recognition training algorithms, which have been dominated by connectionist calculations. The RAP is a multi-DSP system with a low-latency ring interconnection scheme using programmable gate array technology and a significant amount of local memory per node (16 MBytes of dynamic memory and 256 KBytes of fast static RAM). Theoretical peak performance is 128 MFlops/board, with sustained performance of 30-90% for backpropagation problems of interest to us. Systems with up to 40 nodes have been tested, for which throughputs of up to 574 Million Connections Per Second (MCPS) have been measured, as well as learning rates of up to 106 Million Connection Updates Per Second (MCUPS) for training. While the system is tuned to these algorithms, it is also a fully programmable computer, and users code in C++, C, and assembly language. Practical considerations such as workstation address spac...
Combining ANNs To Improve Phone Recognition
 ICASSP
, 1997
Cited by 4 (0 self)
Abstract:
In applying neural networks to speech recognition, one often finds that slightly different training configurations lead to significantly different networks. Thus different training sessions using different setups will likely end up in "mixed" network configurations representing different solutions in different regions of the data space. This sensitivity to the initial weights, the training parameters and the training data can be used to enhance performance, using a committee of neural networks. In this paper, we study various ways to combine context-dependent (CD) and context-independent (CI) neural network phone estimators to improve phone recognition. As a result, we obtain 6.3% and 2.2% increases in accuracy in phone recognition using monophones and biphones respectively.
1. INTRODUCTION
In the past decade, a number of connectionist approaches have enabled a new computing paradigm for speech recognition with some success [1, 6, 12, 15]. In these ANN-based speech recognizer...
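The committee idea can be shown with the two simplest combination rules, linear and log-domain averaging of per-frame phone posteriors. This is a generic sketch under assumptions, not the paper's specific CD/CI combination schemes; the phone labels and posterior values are invented for illustration.

```python
import math

def average_posteriors(estimates, log_domain=False):
    """Combine posterior dicts {phone: prob} from several networks.
    log_domain=True uses the geometric mean instead of the arithmetic mean."""
    phones = estimates[0].keys()
    if log_domain:
        combined = {p: math.exp(sum(math.log(e[p]) for e in estimates)
                                / len(estimates)) for p in phones}
    else:
        combined = {p: sum(e[p] for e in estimates) / len(estimates)
                    for p in phones}
    total = sum(combined.values())        # renormalise to a distribution
    return {p: v / total for p, v in combined.items()}

# Hypothetical per-frame posteriors from two differently trained nets,
# e.g. a context-independent and a context-dependent estimator.
net_ci = {"ah": 0.6, "eh": 0.3, "ih": 0.1}
net_cd = {"ah": 0.5, "eh": 0.1, "ih": 0.4}
combined = average_posteriors([net_ci, net_cd])
print(max(combined, key=combined.get))  # ah
```

Because the member networks make different errors in different regions of the data space, the averaged posterior tends to suppress the idiosyncratic mistakes of any single net, which is the effect the paper exploits.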
Finite-State Computation in Analog Neural Networks: Steps Towards Biologically Plausible Models?
, 2001
Cited by 3 (1 self)
Abstract:
Finite-state machines are the most pervasive models of computation, not only in theoretical computer science, but also in all of its applications to real-life problems, and constitute the best characterized computational model. On the other hand, neural networks, proposed almost sixty years ago by McCulloch and Pitts as a simplified model of nervous activity in living beings, have evolved into a great variety of so-called artificial neural networks. Artificial neural networks have become a very successful tool for modelling and problem solving because of their built-in learning capability, but most of the progress in this field has occurred with models that are far removed from the behaviour of real, i.e., biological, neural networks. This paper surveys the work that has established a connection between finite-state machines and (mainly discrete-time recurrent) neural networks, and suggests possible ways to construct finite-state models in biologically plausible neural networks.
Comparing phoneme and feature based speech recognition using artificial neural networks
 Proc. ICSLP
, 1992
Cited by 2 (0 self)
Abstract:
An artificial neural network has been trained by the error backpropagation technique to recognise phonemes and words. The speech material was recorded by a male Swedish talker and was labelled by a phonetician. There were 38 output nodes corresponding to Swedish phonemes. The training algorithm was somewhat modified to increase the training speed. Introducing coarticulation information by adding simple recurrency to the net is shown to be more effective than expanding the size of the input spectral window. The phoneme recognition network was used with dynamic programming for time alignment to recognise connected digits. It was compared to a similar recogniser based on nine quasi-phonetic features instead of 38 phonemes. The phoneme-based system performed better than the feature-based one.
ForeNet: Fourier Recurrent Networks for Time Series Prediction
, 2000
Cited by 1 (0 self)
Abstract:
Recurrent neural networks have been established as a general tool for fitting sequential input/output data. On the other hand, Fourier analysis is a useful tool for time series analysis. In this paper, these two fields are linked together to form a new interpretation of recurrent networks for time series prediction. Fourier analysis of a time series is applied to construct a complex-valued recurrent neural network. The proposed network is called the Fourier Recurrent Network (ForeNet). We show the proper parameter initialization and the learning algorithm for the complex weights in ForeNet. Experimental results show that ForeNet speeds up learning, and its generalization performance is superior to that of a traditional recurrent network.
1 Introduction
Time-delayed recurrent networks are essentially feedforward networks with the addition of feedback connections. Due to the existence of these recurrent links, the recurrent neural network models preserve information through time and are more pow...
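The Fourier-as-recurrence view behind such networks can be sketched directly. This is a minimal illustration of the underlying mathematics, not the trained ForeNet itself: the DFT coefficients of an n-periodic series evolve under the diagonal complex recurrence a_k <- exp(2*pi*i*k/n) * a_k, and a linear output layer (the inverse DFT at the current step) recovers the series value. A complex-valued recurrent net initialized with these weights therefore starts as an exact predictor of the periodic part of the signal; the example series below is hypothetical.

```python
import cmath

def dft(x):
    """Discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def rollout(x, steps):
    """Continue an n-periodic series past its first period x[0..n-1]."""
    n = len(x)
    omega = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]  # recurrent weights
    h = dft(x)                         # hidden state: one complex unit per coefficient
    preds = []
    for _ in range(steps):
        preds.append(sum(h).real / n)            # linear output layer
        h = [h[k] * omega[k] for k in range(n)]  # one recurrent step per unit
    return preds

# A quarter-sampled sine wave: the continuation repeats the period exactly.
print(rollout([0.0, 1.0, 0.0, -1.0], 3))  # [0.0, 1.0, 0.0] (up to rounding)
```

In the actual ForeNet, these Fourier-derived values serve only as the initialization of the complex weights, which are then refined by gradient learning on real, non-periodic data.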