Results 1–10 of 24
An SVM Based Voting Algorithm with Application to Parse Reranking
In Proc. of CoNLL, 2003
Cited by 42 (4 self)
This paper introduces a novel Support Vector Machines (SVMs) based voting algorithm for reranking, which provides a way to solve the sequential models indirectly. We have presented a risk formulation under the PAC framework for this voting algorithm. We have applied this algorithm to the parse reranking problem, and achieved labeled recall and precision of 89.4%/89.8% on WSJ section 23 of Penn Treebank.
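For intuition only (the paper's actual SVM-based voter is not reproduced here), the sketch below shows the generic shape of pairwise voting for reranking: a binary preference function, standing in for a trained SVM, votes over candidate pairs, and the candidate with the most wins is selected. All names and the length-based preference are hypothetical.

```python
from collections import Counter

def vote_rerank(candidates, prefer):
    # prefer(a, b) returns the preferred member of the pair; it stands in
    # for a binary SVM trained to compare two candidate parses. The
    # candidate winning the most pairwise votes is returned.
    wins = Counter()
    for i, a in enumerate(candidates):
        for b in candidates[i + 1:]:
            wins[prefer(a, b)] += 1
    return max(candidates, key=lambda c: wins[c])

# Toy preference: favor the longer string (a stand-in for a real SVM score).
best = vote_rerank(["a", "abc", "ab"], lambda a, b: a if len(a) > len(b) else b)
```

With the toy preference, "abc" beats both shorter candidates and collects the most votes.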
Training Recurrent Networks by Evolino
2007
Cited by 35 (5 self)
In recent years, gradient-based LSTM recurrent neural networks (RNNs) have solved many previously RNN-unlearnable tasks. Sometimes, however, gradient information is of little use for training RNNs, due to numerous local minima. For such cases, we present a novel method: EVOlution of systems with LINear Outputs (Evolino). Evolino evolves weights to the nonlinear, hidden nodes of RNNs while computing optimal linear mappings from hidden state to output, using methods such as pseudoinverse-based linear regression. If we instead use quadratic programming to maximize the margin, we obtain the first evolutionary recurrent support vector machines. We show that Evolino-based LSTM can solve tasks that Echo State nets (Jaeger, 2004a) cannot, and achieves higher accuracy in certain continuous function generation tasks than conventional gradient-descent RNNs, including gradient-based LSTM.
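The Evolino recipe (evolve the recurrent hidden weights, then solve for the linear output weights in closed form) can be sketched minimally in NumPy. The random search below is a crude stand-in for the actual evolutionary algorithm, and the toy sine task is an assumption, not a benchmark from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def run_rnn(Wxh, Whh, xs):
    # Roll a simple tanh RNN over the input sequence, collecting hidden states.
    h = np.zeros(Whh.shape[0])
    states = []
    for x in xs:
        h = np.tanh(Wxh @ x + Whh @ h)
        states.append(h)
    return np.array(states)

# Toy task: predict the next sample of a sine wave from the current one.
t = np.linspace(0, 6 * np.pi, 200)
xs = np.sin(t)[:, None]
ys = np.sin(t + t[1])[:, None]

best_err = np.inf
for _ in range(20):  # crude random search standing in for the evolutionary step
    Wxh = rng.normal(scale=0.5, size=(10, 1))
    Whh = rng.normal(scale=0.5, size=(10, 10))
    H = run_rnn(Wxh, Whh, xs)
    Wout = np.linalg.pinv(H) @ ys   # optimal linear readout, as in Evolino
    err = float(np.mean((H @ Wout - ys) ** 2))
    best_err = min(best_err, err)
```

Because the pseudoinverse gives the least-squares-optimal readout for each candidate network, every candidate's error is at least as good as the zero predictor's.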
Support vector machines for segmental minimum Bayes risk decoding of continuous speech
In IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2003
Cited by 34 (6 self)
Segmental Minimum Bayes Risk (SMBR) decoding involves the refinement of the search space into sequences of small sets of confusable words. We describe the application of Support Vector Machines (SVMs) as discriminative models for the refined search spaces. We show that SVMs, which in their basic formulation are binary classifiers of fixed-dimensional observations, can be used for continuous speech recognition. We also study the use of GiniSVMs, a variant of the basic SVM. On a small-vocabulary task, we show that this two-pass scheme outperforms MMI-trained HMMs. Using system combination we obtain further improvements over discriminatively trained HMMs.
Large margin hidden markov models for speech recognition
2005
Cited by 33 (4 self)
In this work, motivated by large margin classifiers in machine learning, we propose a novel method for estimating continuous density hidden Markov models (CDHMMs) for speech recognition according to the principle of maximizing the minimum multiclass separation margin. The approach is named the large-margin HMM. First, we show that this type of large-margin HMM estimation problem can be formulated as a constrained minimax optimization problem. Second, by imposing different constraints on the minimax problem, we propose three solutions to the large-margin HMM estimation problem, namely the iterative localized optimization method, the constrained joint optimization method, and the semidefinite programming (SDP) method. These new training methods are evaluated on the isolated E-set recognition task using the ISOLET database and on the TIDIGITS connected digit string recognition task. Experimental results clearly show that the large-margin HMMs consistently outperform conventional HMM training methods, and that large-margin training yields significant recognition error rate reductions even on top of popular discriminative training methods.
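As a minimal illustration of the "minimum multiclass separation margin" being maximized, the sketch below computes that margin from plain per-class discriminant scores; it is a generic construction, not the paper's CDHMM formulation, and the score matrix is made up for the example.

```python
import numpy as np

def min_multiclass_margin(scores, labels):
    # scores: (n_samples, n_classes) discriminant values.
    # A sample's margin is its correct-class score minus the best
    # competing class's score; the objective maximizes the minimum
    # such margin over all samples.
    n = scores.shape[0]
    correct = scores[np.arange(n), labels]
    competitors = scores.copy()
    competitors[np.arange(n), labels] = -np.inf  # mask out the true class
    margins = correct - competitors.max(axis=1)
    return margins.min()

# Hypothetical scores for 3 samples over 3 classes.
scores = np.array([[2.0, 0.5, 1.0],
                   [0.2, 1.5, 1.0],
                   [1.0, 0.9, 2.2]])
labels = np.array([0, 1, 2])
m = min_multiclass_margin(scores, labels)
```

Here the tightest sample is the second one (1.5 vs. 1.0), so the minimum margin is 0.5.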
Discriminative keyword spotting
In Proc. of Workshop on Non-Linear Speech Processing, 2007
Cited by 26 (10 self)
This paper proposes a new approach for keyword spotting, which is not based on HMMs. The proposed method employs a new discriminative learning procedure, in which the learning phase aims at maximizing the area under the ROC curve ...
Phoneme alignment based on discriminative learning
In Proceedings of Interspeech, 2005
Cited by 15 (5 self)
Lattice segmentation and support vector machines for large vocabulary continuous speech recognition
In Proc. of ICASSP, 2005
Cited by 11 (4 self)
Lattice segmentation procedures are used to spot possible recognition errors in first-pass recognition hypotheses produced by a large vocabulary continuous speech recognition system. This approach is analyzed in terms of its ability to reliably identify, and provide good alternatives for, incorrectly hypothesized words. A procedure is described to train and apply Support Vector Machines to strengthen the first-pass system where it was found to be weak, resulting in small but statistically significant recognition improvements on a large test set of conversational speech.
Kernel-Based Feature Extraction with a Speech Technology Application
2004
Cited by 10 (3 self)
Kernel-based nonlinear feature extraction and classification algorithms are a popular new research direction in machine learning. This paper examines their applicability to the classification of phonemes in a phonological awareness drilling software package. We first give a concise overview of the nonlinear feature extraction methods such as kernel principal component analysis (KPCA), kernel independent component analysis (KICA), kernel linear discriminant analysis (KLDA) and kernel springy discriminant analysis (KSDA). The overview deals with all the methods in a unified framework, regardless of whether they are unsupervised or supervised. The effect of the transformations on a subsequent classification is tested in combination with learning algorithms such as Gaussian mixture modeling (GMM), artificial neural nets (ANN), projection pursuit learning (PPL), decision tree-based classification (C4.5) and support vector machines (SVM). We found in most cases that the transformations have a beneficial effect on the classification performance. Furthermore, the nonlinear supervised algorithms yielded the best results.
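To make the KPCA step concrete, here is a minimal NumPy sketch of kernel PCA with an RBF kernel: build the Gram matrix, double-center it in feature space, and project onto the leading eigenvectors. This is the generic textbook construction, not the authors' implementation, and the data is random placeholder input.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Pairwise squared distances turned into an RBF Gram matrix.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

def kernel_pca(X, n_components=2, gamma=1.0):
    K = rbf_kernel(X, gamma)
    n = K.shape[0]
    one = np.ones((n, n)) / n
    # Double-centering: subtract row/column means of K in feature space.
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)            # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    # Scale eigenvectors so feature-space components have unit norm.
    alphas = vecs / np.sqrt(np.maximum(vals, 1e-12))
    return Kc @ alphas                          # projections of training points

X = np.random.RandomState(0).randn(40, 3)
Z = kernel_pca(X, n_components=2, gamma=0.5)
```

The supervised variants (KLDA, KSDA) replace the eigenproblem's objective but reuse the same kernel-centering machinery.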
Discriminative kernel-based phoneme sequence recognition
In Proc. of ICSLP, 2006
Cited by 9 (2 self)
We describe a new method for phoneme sequence recognition given a speech utterance. In contrast to HMM-based approaches, our method uses a kernel-based discriminative training procedure in which the learning process is tailored to the goal of minimizing the Levenshtein distance between the predicted phoneme sequence and the correct sequence. The phoneme sequence predictor is devised by mapping the speech utterance, along with a proposed phoneme sequence, to a vector space endowed with an inner product that is realized by a Mercer kernel. Building on large margin techniques for predicting whole sequences, we are able to devise a learning algorithm which reduces to separating the correct phoneme sequence from all other sequences. We describe an iterative algorithm for learning the phoneme sequence recognizer and further describe an efficient implementation of it. We present initial encouraging experimental results on the TIMIT corpus and compare the proposed method to an HMM-based approach.
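Since the training objective above is the Levenshtein distance between the predicted and correct phoneme sequences, the standard dynamic-programming implementation of that distance is shown below for reference; it is the generic algorithm, not the paper's code.

```python
def levenshtein(a, b):
    # Classic DP edit distance: minimum number of insertions, deletions,
    # and substitutions (each of cost 1) turning sequence a into sequence b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

d = levenshtein("kitten", "sitting")  # the classic example: distance 3
```

It works unchanged on lists of phoneme symbols, e.g. `levenshtein(["p", "a", "t"], ["p", "a", "d"])`.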
Evolino for recurrent support vector machines
In ESANN, 2006
Cited by 7 (2 self)
Traditional Support Vector Machines (SVMs) need pre-wired finite time windows to predict and classify time series. They do not have an internal state necessary to deal with sequences involving arbitrary long-term dependencies. Here we introduce a new class of recurrent, truly sequential SVM-like devices with internal adaptive states, trained by a novel method called EVOlution of systems with KErnel-based outputs (Evoke), an instance of the recent Evolino class of methods [1, 2]. Evoke evolves recurrent neural networks to detect and represent temporal dependencies while using quadratic programming/support vector regression to produce precise outputs, in contrast to our recent work [1, 2], which instead uses pseudoinverse regression. Evoke is the first SVM-based mechanism that learns to classify a context-sensitive language. It also outperforms recent state-of-the-art gradient-based recurrent neural networks (RNNs) on various time series prediction tasks.