Results 1–6 of 6
Discriminative Training of Hidden Markov Models
, 1998
Abstract

Cited by 27 (0 self)
Table of contents (excerpt):
Abbreviations, vii
Notation, viii
1 Introduction, 1
2 Hidden Markov Models, 4
2.1 Definition, 4
2.2 HMM Modelling Assumptions, 6
2.3 HMM Topology, 7
2.4 Finding the Best Transcription, 7
2.5 Setting the Parameters, 10
2.6 Summary, 18
3 Objective Functions, 19
3.1 Properties of Maximum Likelihood Estimators, 19
3.2 Maximum Likelihood, 24
3.3 Maximum Mutual Information, 25
3.4 Frame Discrimination ...
Using a financial training criterion rather than a prediction criterion
 International Journal of Neural Systems
, 1997
Abstract

Cited by 10 (3 self)
... noisy time series. The application of this work is to decision taking with financial time series, using learning algorithms. The traditional approach is to train a model using a prediction criterion, such as minimizing the squared error between predictions and actual values of a dependent variable, or maximizing the likelihood of a conditional model of the dependent variable. We find here, with noisy time series, that better results can be obtained when the model is trained directly to maximize the financial criterion of interest, here gains and losses (including those due to transactions) incurred during trading. Experiments were performed on portfolio selection with 35 Canadian stocks.
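The abstract above contrasts prediction-criterion training with direct optimization of trading profit. A minimal sketch of that idea follows; every specific here is invented for illustration (the synthetic data, the single-parameter tanh position rule, the transaction-cost level, and the finite-difference gradient), and none of it is the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative only): a signal observed each day and a
# next-day return that is weakly correlated with it.
T = 200
x = rng.normal(size=T)                    # signal observed at day t
r = 0.1 * x + 0.05 * rng.normal(size=T)   # return realized after day t

def profit(w, cost=0.002):
    """Financial criterion: trading gains minus transaction costs.

    The position held at day t is a bounded function of the signal,
    tanh(w * x_t); we earn position * return and pay a cost proportional
    to each change in position.
    """
    pos = np.tanh(w * x)
    gains = (pos * r).sum()
    costs = cost * np.abs(np.diff(pos, prepend=0.0)).sum()
    return gains - costs

# Train by gradient ascent on the financial criterion itself, using a
# simple finite-difference gradient on the single parameter w.
w, lr, eps = 0.0, 0.02, 1e-4
for _ in range(300):
    g = (profit(w + eps) - profit(w - eps)) / (2 * eps)
    w += lr * g

print(profit(w), profit(0.0))  # trained profit vs. the do-nothing baseline
```

A prediction-criterion version of this loop would first fit r from x by least squares and only afterwards derive positions; optimizing profit directly folds transaction costs into the objective, which is the abstract's point.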
Why Error Measures are Sub-Optimal for Training Neural Network Pattern Classifiers
, 1992
Abstract

Cited by 4 (0 self)
Pattern classifiers that are trained in a supervised fashion (e.g., multilayer perceptrons, radial basis functions, etc.) are typically trained with an error-measure objective function such as mean-squared error (MSE) or cross-entropy (CE). These classifiers can in theory yield (optimal) Bayesian discrimination, but in practice they often fail to do so. We explain why this happens. In so doing, we identify a number of characteristics that the optimal objective function for training classifiers must have. We show that the classification figure of merit (CFM_mono) possesses these optimal characteristics, whereas error measures such as MSE and CE do not. We illustrate our arguments with a simple example in which a CFM_mono-trained low-order polynomial neural network approximates Bayesian discrimination on a random scalar with the fewest training samples and the minimum functional complexity necessary for the task. A comparable MSE-trained net yields significantly worse discrimination on the same task.
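The abstract's distinction between error measures and a classification figure of merit can be illustrated numerically. This is a hypothetical sketch: the sigmoid-of-margin form and the constant alpha are my assumptions, not the authors' exact CFM_mono definition.

```python
import numpy as np

def mse(outputs, target_onehot):
    # Mean-squared error: penalizes every output's distance from its target.
    return np.mean((outputs - target_onehot) ** 2)

def cfm_mono(outputs, target_idx, alpha=5.0):
    """A margin-based figure of merit (sketch of the idea, not the authors'
    exact form): a sigmoid of the margin between the correct output and the
    strongest competitor. It rewards correct ranking, not exact values.
    """
    delta = outputs[target_idx] - np.max(np.delete(outputs, target_idx))
    return 1.0 / (1.0 + np.exp(-alpha * delta))

# Two hypothetical 3-class output vectors; the true class is index 0.
a = np.array([0.6, 0.5, 0.5])   # correct, but far from the one-hot target
b = np.array([0.9, 0.8, 0.8])   # same winning margin, larger output values

target = np.array([1.0, 0.0, 0.0])
# MSE judges these very differently; the margin criterion scores the
# identical margin equally.
print(mse(a, target), mse(b, target))
print(cfm_mono(a, 0), cfm_mono(b, 0))
```

Both output vectors separate the correct class from its strongest competitor by the same margin, so the margin-based score is identical, while MSE penalizes the second vector for being far from the one-hot target even though its classification decision is unchanged.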
Differential Theory of Learning for Efficient Neural Network Pattern Recognition
 in Applications of Artificial Neural Networks IV (Proc. SPIE Vol. 1965)
, 1993
Abstract
We describe a new theory of differential learning by which a broad family of pattern classifiers (including many well-known neural network paradigms) can learn stochastic concepts efficiently. We describe the relationship between a classifier's ability to generalize well to unseen test examples and the efficiency of the strategy by which it learns. We list a series of proofs that differential learning is efficient in its information and computational resource requirements, whereas traditional probabilistic learning strategies are not. The proofs are illustrated by a simple example that lends itself to closed-form analysis. We conclude with an optical character recognition task for which three different types of differentially generated classifiers generalize significantly better than their probabilistically generated counterparts.
1 DIFFERENTIAL LEARNING
A differentiable supervised classifier is one that learns an input-to-output mapping by adjusting a set of internal parameters θ via ...
Les organisations-partenaires / The Partner Organizations • École des Hautes Études Commerciales
Abstract
 Add to MetaCart
CIRANO is a private nonprofit organization incorporated under the Québec Companies Act. Its infrastructure and research activities are funded through fees paid by member organizations, an infrastructure grant from the Ministère de la Recherche, de la Science et de la Technologie, and grants and research mandates obtained by its research teams.
Differentially Generated Neural Network Classifiers Are Efficient
Abstract
Differential learning for statistical pattern classification is described in [5]; it is based on the classification figure-of-merit (CFM) objective function described in [9, 5]. We prove that differential learning is asymptotically efficient, guaranteeing the best generalization allowed by the choice of hypothesis class (see below) as the training sample size grows large, while requiring the least classifier complexity necessary for Bayesian (i.e., minimum probability-of-error) discrimination. Moreover, differential learning almost always guarantees the best generalization allowed by the choice of hypothesis class for small training sample sizes.