Results 1–9 of 9
Fast Reinforcement Learning with Large Action Sets using Error-Correcting Output Codes for MDP Factorization
, 2012
"... The use of Reinforcement Learning in realworld scenarios is strongly limited by issues of scale. Most RL learning algorithms are unable to deal with problems composed of hundreds or sometimes even dozens of possible actions, and therefore cannot be applied to many realworld problems. We consider t ..."
Abstract

Cited by 4 (2 self)
The use of Reinforcement Learning in real-world scenarios is strongly limited by issues of scale. Most RL learning algorithms are unable to deal with problems composed of hundreds or sometimes even dozens of possible actions, and therefore cannot be applied to many real-world problems. We consider the RL problem in the supervised classification framework, where the optimal policy is obtained through a multiclass classifier, the set of classes being the set of actions of the problem. We introduce error-correcting output codes (ECOCs) in this setting and propose two new methods for reducing complexity when using rollouts-based approaches. The first method consists in using an ECOC-based classifier as the multiclass classifier, reducing the learning complexity from O(A²) to O(A log(A)). We then propose a novel method that profits from the ECOC’s coding dictionary to split the initial MDP into O(log(A)) separate two-action MDPs. This second method reduces learning complexity even further, from O(A²) to O(log(A)), thus rendering problems with large action sets tractable. We finish by experimentally demonstrating the advantages of our approach on a set of benchmark problems, both in speed and performance.
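The ECOC decomposition the abstract describes can be sketched as follows. Everything here is illustrative and not taken from the paper (the coding dictionary, names, and decoding rule are assumptions): each of the A actions receives a binary codeword of length O(log A), one binary problem is solved per codeword bit, and an action is recovered by picking the codeword nearest in Hamming distance to the bit predictions.

```python
import numpy as np

# Hypothetical sketch of ECOC-based action selection. Each of the A actions
# gets a binary codeword of length L = O(log A); one binary classifier is
# trained per codeword bit, and an action is decoded by finding the nearest
# codeword in Hamming distance.

A = 8                                  # number of actions
L = int(np.ceil(np.log2(A)))           # codeword length, O(log A)

# Coding dictionary: row i is the codeword of action i (here: plain binary
# encoding of the action index; the paper's dictionary may differ).
code_matrix = np.array([[(i >> b) & 1 for b in range(L)] for i in range(A)])

def decode_action(bit_predictions):
    """Map the L binary classifiers' outputs to the nearest action codeword."""
    dists = np.abs(code_matrix - np.asarray(bit_predictions)).sum(axis=1)
    return int(np.argmin(dists))       # action with smallest Hamming distance

# Example: classifiers predict bits [1, 0, 1], the codeword of action 5 (0b101).
print(decode_action([1, 0, 1]))
```

Decoding tolerates some classifier errors: with a longer, well-separated coding dictionary, a few flipped bits still land nearest the correct codeword, which is what makes the per-bit decomposition robust.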
Performance Evaluation of the Machine Learning Algorithms Used in Inference Mechanism of a Medical Decision Support System
"... ..."
(Show Context)
Reinforcement and Imitation Learning via Interactive No-Regret Learning
, 2014
"... ar ..."
(Show Context)
Cost-sensitive learning for large-scale hierarchical classification of commercial products
 In CIKM
, 2013
"... We study hierarchical classification of products in electronic commerce, classifying a text description of a product into one of the leaf classes of a treestructure taxonomy. In particular, we investigate two essential problems, performance evaluation and learning, in a synergistic way. Unless we ..."
Abstract

Cited by 1 (0 self)
We study hierarchical classification of products in electronic commerce, classifying a text description of a product into one of the leaf classes of a tree-structured taxonomy. In particular, we investigate two essential problems, performance evaluation and learning, in a synergistic way. Unless we know the appropriate performance evaluation metric for a task, we cannot learn a classifier of maximum utility for that task. Given the characteristics of hierarchical product classification, we shed light on how and why common evaluation metrics such as error rate can be misleading, an analysis that carries over to other real-world applications. The analysis leads to a new performance evaluation metric, tailored to this task, that reflects a vendor’s business goal of maximizing revenue. The proposed metric has an intuitive meaning as the average revenue loss, which depends on both the value of individual products and the hierarchical distance between the true class and the predicted class. Correspondingly, our learning algorithm uses multiclass SVM with margin rescaling to optimize the proposed metric instead of error rate or other common metrics. Margin rescaling is sensitive to the scaling of loss functions; we propose a loss normalization approach to calibrate this scaling appropriately, which applies to general classification and structured prediction tasks whenever structured SVM with margin rescaling is used. Experiments on a large dataset show that our approach outperforms standard multiclass SVM in terms of the proposed metric, effectively reducing the average revenue loss.
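A minimal sketch of the kind of metric the abstract describes, under assumptions of mine rather than the paper's definitions: a toy taxonomy with hypothetical class names, where the loss of a prediction is the product's value times the tree distance between the true and predicted leaves, averaged over examples.

```python
# Illustrative sketch of an average revenue-loss metric: the penalty for a
# misclassified product depends on its value and on the tree distance between
# the true and predicted leaf classes. Taxonomy and names are made up.

# Toy taxonomy: child -> parent (root has parent None).
parent = {"root": None, "electronics": "root", "books": "root",
          "phones": "electronics", "laptops": "electronics", "novels": "books"}

def path_to_root(node):
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

def tree_distance(a, b):
    """Number of edges between two nodes in the taxonomy tree."""
    pa, pb = path_to_root(a), path_to_root(b)
    ancestors = set(pa)
    for hops_b, n in enumerate(pb):
        if n in ancestors:
            return pa.index(n) + hops_b   # hops up from a plus hops up from b
    return len(pa) + len(pb)

def average_revenue_loss(examples):
    """examples: list of (true_leaf, predicted_leaf, product_value)."""
    losses = [value * tree_distance(true, pred)
              for true, pred, value in examples]
    return sum(losses) / len(losses)

# A correct prediction costs 0; a distant one costs more for pricier items.
print(average_revenue_loss([("phones", "phones", 100.0),
                            ("phones", "novels", 100.0)]))
```

Note how error rate would score both examples identically per mistake, while this metric charges more for expensive products sent far from their true leaf, which is the misleading-metric point the abstract makes.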
JMLR: Workshop and Conference Proceedings 1–12. Fast Reinforcement Learning with Large Action Sets using Error-Correcting Output Codes for MDP Factorization
"... The use of Reinforcement Learning in realworld scenarios is strongly limited by issues of scale. Most RL learning algorithms are unable to deal with problems composed of hundreds or sometimes even dozens of possible actions, and therefore cannot be applied to many realworld problems. We consider t ..."
Abstract
The use of Reinforcement Learning in real-world scenarios is strongly limited by issues of scale. Most RL learning algorithms are unable to deal with problems composed of hundreds or sometimes even dozens of possible actions, and therefore cannot be applied to many real-world problems. We consider the RL problem in the supervised classification framework, where the optimal policy is obtained through a multiclass classifier, the set of classes being the set of actions of the problem. We introduce error-correcting output codes (ECOCs) in this setting and propose two new methods for reducing complexity when using rollouts-based approaches. The first method consists in using an ECOC-based classifier as the multiclass classifier, reducing the learning complexity from O(A²) to O(A log(A)). We then propose a novel method that profits from the ECOC’s coding dictionary to split the initial MDP into O(log(A)) separate two-action MDPs. This second method reduces learning complexity even further, from O(A²) to O(log(A)), thus rendering problems with large action sets tractable. We finish by experimentally demonstrating the advantages of our approach on a set of benchmark problems, both in speed and performance.
unknown title
"... Apprentissage par renforcement rapide pour des grands ensembles d’actions en utilisant des codes correcteurs d’erreur ..."
Abstract
Fast reinforcement learning for large action sets using error-correcting codes (original French title: « Apprentissage par renforcement rapide pour des grands ensembles d’actions en utilisant des codes correcteurs d’erreur »)
Research Article Performance Evaluation of the Machine Learning Algorithms Used in Inference Mechanism of a Medical Decision Support System
"... Copyright © 2014 Mert Bal et al.This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The importance of the decision support systems is inc ..."
Abstract
Copyright © 2014 Mert Bal et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Decision support systems increasingly support decision making under uncertainty and incomplete information, and are widely used in various fields such as engineering, finance, and medicine. Medical decision support systems help healthcare personnel select the optimal method during the treatment of patients. Decision support systems are intelligent software systems that support decision makers in their decisions. The design of decision support systems consists of four main components: the inference mechanism, the knowledge base, the explanation module, and the active memory. The inference mechanism constitutes the basis of decision support systems. Various methods can be used in these mechanisms, among them decision trees, artificial neural networks, statistical methods, and rule-based methods. In decision support systems, those methods can be used separately or combined into a hybrid system. In this study, synthetic data with 10, 100, 1000, and 2000 records have been produced to reflect the probabilities on the ...
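The study's synthetic datasets of 10 to 2000 records could be produced along these lines; the class names, probabilities, and function below are all hypothetical stand-ins, since the abstract does not specify how the data were generated.

```python
import random

# Hedged sketch: generate a synthetic dataset of a given size whose class
# labels follow specified probabilities, as in the study's datasets of
# 10, 100, 1000, and 2000 records. Labels and probabilities are made up.

def make_synthetic(n_records, class_probs, seed=0):
    rng = random.Random(seed)                     # fixed seed: reproducible runs
    classes = list(class_probs)
    weights = [class_probs[c] for c in classes]
    return [rng.choices(classes, weights=weights)[0] for _ in range(n_records)]

data = make_synthetic(1000, {"healthy": 0.7, "sick": 0.3})
print(len(data))
```

With small samples (e.g. 10 records) the empirical label frequencies can drift far from the target probabilities, which is presumably part of what varying the record count lets the study measure.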
MARK D. REID, THE AUSTRALIAN NATIONAL UNIVERSITY. Synonyms: Sample Complexity, Inequalities. Definition
"... In the theory of statistical machine learning, a generalization bound—or, more precisely, a generalization error bound—is a statement about the predictive performance of a learning algorithm or class of algorithms. Here, a learning algorithm is viewed as a procedure that takes some finite training s ..."
Abstract
In the theory of statistical machine learning, a generalization bound—or, more precisely, a generalization error bound—is a statement about the predictive performance of a learning algorithm or class of algorithms. Here, a learning algorithm is viewed as a procedure that takes some finite training sample of labelled instances as input and returns a hypothesis regarding the labels of all instances, including those which may not have appeared in the training sample. Assuming labelled instances are drawn from some fixed distribution, the quality of a hypothesis can be measured in terms of its risk—that is, its incompatibility with the distribution. The performance of a learning algorithm can then be expressed in terms of the expected risk of its hypotheses given randomly generated training samples. Under these assumptions a generalization bound is a theorem which holds for any distribution and states that, with high probability, applying the learning algorithm to a randomly drawn sample will result in a hypothesis with risk no greater than some value. This bounding value typically depends on the size of the training sample, an empirical assessment of the risk of the hypothesis on the training sample
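As a concrete instance of the kind of statement defined above, a textbook bound for a finite hypothesis class H (obtained from Hoeffding's inequality plus a union bound) can be written as follows; this particular form is a standard example, not necessarily the bound this entry goes on to develop:

```latex
% With probability at least 1 - \delta over an i.i.d. training sample of
% size n, simultaneously for every hypothesis h in a finite class H:
R(h) \;\le\; \hat{R}_n(h) \;+\; \sqrt{\frac{\ln|H| + \ln(1/\delta)}{2n}}
```

Here R(h) is the risk of h under the data distribution and R̂_n(h) its empirical risk on the sample; the bounding value depends on the sample size n and the empirical risk, exactly the two quantities the entry's closing sentence names.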