Results 1-10 of 527,988
Polynomial Cost Approximations in Markov Decision Theory Based Least Cost Routing
IEEE/ACM Transactions on Networking, 2001
Cited by 1 (0 self)
"... The problem of call admission control and routing in a multiservice circuit-switched loss network can be solved optimally under certain assumptions by the tools of Markov decision theory. However, in networks of practical size a number of simplifying approximations are needed to make the solution feasible ..."
Call Admission Control and Routing of Self-Similar Call Traffic Using Markov Decision Theory
"... An effective statedependent Call Admission Control (CAC) and Routing algorithm is presented in this paper. The algorithm is based on Markov Decision Theory whose objective is to maximize the total revenue from the calls from different categories. CAC is applied for a single link and we try to prove ..."
Abstract
 Add to MetaCart
An effective statedependent Call Admission Control (CAC) and Routing algorithm is presented in this paper. The algorithm is based on Markov Decision Theory whose objective is to maximize the total revenue from the calls from different categories. CAC is applied for a single link and we try
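Both of the preceding entries rest on the same construction: model the link as a continuous-time MDP over call occupancies, uniformize it to discrete time, and compute a state-dependent admission policy by dynamic programming. As a rough illustration (not the papers' actual models), here is a value-iteration sketch in Python for a hypothetical single-link, two-class admission problem; every rate, capacity, and revenue figure is an arbitrary placeholder.

    import numpy as np

    # Illustrative single-link admission-control MDP, uniformized to
    # discrete time and solved by value iteration. Numbers are made up.
    C = 10                     # link capacity in circuits
    lam = [0.6, 0.3]           # class arrival rates
    mu = [1.0, 0.5]            # per-call departure rates
    b = [1, 2]                 # circuits occupied per call of each class
    r = [1.0, 5.0]             # revenue earned per accepted call
    gamma = 0.999              # discount factor

    states = [(n1, n2) for n1 in range(C + 1) for n2 in range(C + 1)
              if n1 * b[0] + n2 * b[1] <= C]
    V = {s: 0.0 for s in states}
    Lam = sum(lam) + C * max(mu)   # uniformization constant

    for _ in range(2000):
        newV = {}
        for n in states:
            used = n[0] * b[0] + n[1] * b[1]
            v, rate = 0.0, 0.0
            for k in range(2):
                up = (n[0] + (k == 0), n[1] + (k == 1))
                # arrival of class k: accept (collect r[k]) or reject
                acc = r[k] + gamma * V[up] if used + b[k] <= C else -np.inf
                v += lam[k] / Lam * max(acc, gamma * V[n])
                down = (n[0] - (k == 0), n[1] - (k == 1))
                dep = n[k] * mu[k] / Lam       # departure of class k
                v += dep * gamma * V[down if n[k] > 0 else n]
                rate += lam[k] / Lam + dep
            v += (1.0 - rate) * gamma * V[n]   # uniformization self-loop
            newV[n] = v
        V = newV
    # Greedy policy: accept a class-k call in state n exactly when
    # r[k] + gamma * V[n + e_k] >= gamma * V[n], i.e. the state-dependent
    # admission rule both abstracts describe.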
Reinforcement learning: a survey
Journal of Artificial Intelligence Research, 1996
Cited by 1714 (25 self)
"... This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem ..."
"... of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state ..."
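Two of the survey's central issues, trading off exploration against exploitation and learning from delayed reinforcement, are visible in even the smallest concrete instance. The following is a minimal tabular Q-learning loop; the gym-style environment interface (reset, step, actions) is an assumption of this sketch, not anything defined in the paper.

    import random
    from collections import defaultdict

    def q_learning(env, episodes=500, alpha=0.1, gamma=0.95, eps=0.1):
        # Tabular Q-learning with epsilon-greedy exploration.
        Q = defaultdict(float)
        for _ in range(episodes):
            s, done = env.reset(), False
            while not done:
                # exploration vs. exploitation trade-off
                if random.random() < eps:
                    a = random.choice(env.actions)
                else:
                    a = max(env.actions, key=lambda a: Q[(s, a)])
                s2, r, done = env.step(a)
                # temporal-difference update from delayed reinforcement
                target = 0.0 if done else max(Q[(s2, a2)] for a2 in env.actions)
                Q[(s, a)] += alpha * (r + gamma * target - Q[(s, a)])
                s = s2
        return Q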
An introduction to hidden Markov models
IEEE ASSP Magazine, 1986
Cited by 1132 (2 self)
"... The basic theory of Markov chains has been known to ..."
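The tutorial's subject can be compressed into a few lines: an HMM is a Markov chain observed through a noisy emission model, and the forward algorithm computes the likelihood of an observation sequence by dynamic programming. A standard sketch follows; the two-state model and all its numbers are illustrative, not from the article.

    import numpy as np

    def forward(obs, pi, A, B):
        # Likelihood of an observation sequence under an HMM.
        # pi[i]: initial probabilities; A[i, j]: transition i -> j;
        # B[i, k]: probability that state i emits symbol k.
        alpha = pi * B[:, obs[0]]
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]
        return alpha.sum()

    # Toy two-state example
    pi = np.array([0.6, 0.4])
    A = np.array([[0.7, 0.3], [0.4, 0.6]])
    B = np.array([[0.9, 0.1], [0.2, 0.8]])
    print(forward([0, 1, 0], pi, A, B))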
The Infinite Hidden Markov Model
Machine Learning, 2002
Cited by 637 (41 self)
"... We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. Th ..."
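The device the abstract mentions, a Dirichlet process over transition targets, can be sketched via stick-breaking: the state space is countably infinite, yet any finite sample instantiates only finitely many states. The sketch below shows that ingredient in isolation; the full iHMM ties the per-state transition distributions together hierarchically, which is omitted here, and alpha is an illustrative concentration parameter.

    import numpy as np

    def gem_sampler(alpha=2.0, seed=0):
        # Lazily sample from GEM(alpha) stick-breaking weights: a
        # distribution over countably many states that only instantiates
        # a state's weight the first time a draw needs it.
        rng = np.random.default_rng(seed)
        weights = []          # pieces broken off so far
        remaining = [1.0]     # unbroken remainder of the stick

        def draw():
            u, cum = rng.random(), 0.0
            for k, w in enumerate(weights):
                cum += w
                if u < cum:
                    return k
            while True:       # u fell in the remainder: break new pieces
                piece = rng.beta(1.0, alpha) * remaining[0]
                weights.append(piece)
                remaining[0] -= piece
                cum += piece
                if u < cum:
                    return len(weights) - 1
        return draw

    draw = gem_sampler()
    print(sorted(set(draw() for _ in range(100))))   # only a few states appear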
Decision-Theoretic Planning: Structural Assumptions and Computational Leverage
Journal of Artificial Intelligence Research, 1999
Cited by 515 (4 self)
"... Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision analysis, operations research, control theory and economics. While the assumptions and perspectives adopted in these areas often differ in substantial ways, many planning problems of interest to researchers in these fields can be modeled as Markov decision processes (MDPs) and analyzed using the techniques of decision theory. This paper presents an overview and synthesis of MDP ..."
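For a sense of the "techniques of decision theory" the survey synthesizes, here is exact policy iteration on a finite MDP: a minimal sketch over hypothetical transition tensors P and rewards R, not anything taken from the paper.

    import numpy as np

    def policy_iteration(P, R, gamma=0.95):
        # Exact policy iteration for a finite MDP.
        # P[a][s, s']: transition probabilities; R[a][s]: expected reward.
        nA, nS = len(P), P[0].shape[0]
        policy = np.zeros(nS, dtype=int)
        while True:
            # evaluation: solve (I - gamma * P_pi) V = R_pi exactly
            P_pi = np.array([P[policy[s]][s] for s in range(nS)])
            R_pi = np.array([R[policy[s]][s] for s in range(nS)])
            V = np.linalg.solve(np.eye(nS) - gamma * P_pi, R_pi)
            # improvement: greedy one-step lookahead
            Q = np.array([R[a] + gamma * P[a] @ V for a in range(nA)])
            new_policy = Q.argmax(axis=0)
            if np.array_equal(new_policy, policy):
                return policy, V
            policy = new_policy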
Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms
2002
Cited by 660 (13 self)
"... We describe new algorithms for training tagging models, as an alternative to maximum-entropy models or conditional random fields (CRFs). The algorithms rely on Viterbi decoding of training examples, combined with simple additive updates. We describe theory justifying the algorithms through a modific ..."
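The training scheme the abstract outlines, Viterbi decoding plus additive updates, is the structured perceptron. A compact sketch follows; the emission/transition feature set here is an illustrative choice for the sketch, not the paper's exact features.

    from collections import defaultdict

    def features(words, tags):
        # emission and first-order transition counts
        f, prev = defaultdict(int), "<s>"
        for word, tag in zip(words, tags):
            f[("emit", word, tag)] += 1
            f[("trans", prev, tag)] += 1
            prev = tag
        return f

    def viterbi(words, tagset, w):
        # exact argmax over tag sequences under the current weights
        pi, bps = {"<s>": 0.0}, []
        for word in words:
            new_pi, back = {}, {}
            for t in tagset:
                prev, score = max(
                    ((p, s + w[("trans", p, t)] + w[("emit", word, t)])
                     for p, s in pi.items()), key=lambda ps: ps[1])
                new_pi[t], back[t] = score, prev
            pi = new_pi
            bps.append(back)
        t = max(pi, key=pi.get)
        tags = [t]
        for back in reversed(bps[1:]):
            t = back[t]
            tags.append(t)
        return tags[::-1]

    def train(data, tagset, epochs=5):
        # decode each sentence; on a mistake, add gold features and
        # subtract predicted features (the simple additive update)
        w = defaultdict(float)
        for _ in range(epochs):
            for words, gold in data:
                pred = viterbi(words, tagset, w)
                if pred != gold:
                    for k, c in features(words, gold).items():
                        w[k] += c
                    for k, c in features(words, pred).items():
                        w[k] -= c
        return w

The paper also analyzes an averaged-weights variant; the plain update above is the core of the method.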
Prospect theory: An analysis of decision under risk
Econometrica, 1979
Cited by 6312 (30 self)
Markov Random Field Models in Computer Vision
1994
Cited by 516 (18 self)
"... A variety of computer vision problems can be optimally posed as Bayesian labeling in which the solution of a problem is defined as the maximum a posteriori (MAP) probability estimate of the true labeling. The posterior probability is usually derived from a prior model and a likelihood model. The latter relates to how data is observed and is problem domain dependent. The former depends on how various prior constraints are expressed. Markov random field (MRF) theory is a tool to encode contextual constraints into the prior probability. This paper presents a unified approach for MRF modeling ..."
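A standard concrete instance of MAP labeling with an MRF prior is binary image denoising under an Ising prior, optimized here with iterated conditional modes, a simple local search rather than the unified framework the paper develops; beta (prior coupling) and eta (data term weight) are illustrative values.

    import numpy as np

    def icm_denoise(noisy, beta=1.5, eta=1.0, iters=10):
        # Iterated conditional modes for MAP labeling of +-1 pixels under
        # an Ising (pairwise MRF) prior with a simple flip-noise likelihood.
        x = noisy.copy()
        H, W = x.shape
        for _ in range(iters):
            for i in range(H):
                for j in range(W):
                    # sum over the 4-neighbourhood (pairwise clique terms)
                    nb = sum(x[a, b]
                             for a, b in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                             if 0 <= a < H and 0 <= b < W)
                    # pick the label with the lower local energy
                    x[i, j] = 1 if beta * nb + eta * noisy[i, j] > 0 else -1
        return x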
Markov games as a framework for multiagent reinforcement learning
In Proceedings of the Eleventh International Conference on Machine Learning, 1994
Cited by 601 (13 self)
"... In the Markov decision process (MDP) formalization of reinforcement learning, a single adaptive agent interacts with an environment defined by a probabilistic transition function. In this solipsistic view, secondary agents can only be part of the environment and are therefore fixed in their behavior ..."
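The Markov-game framework replaces the max in the Q-learning backup with the value of a zero-sum matrix game between the agent and an opponent. That value is a small linear program, sketched below under the assumption that Q is a payoff matrix indexed by (agent action, opponent action); the array shapes are illustrative.

    import numpy as np
    from scipy.optimize import linprog

    def minimax_value(Q):
        # Q[a, o]: the learner's payoff for action a against opponent
        # action o. Solve max_pi min_o sum_a pi[a] * Q[a, o] as an LP.
        n, m = Q.shape
        c = np.zeros(n + 1)
        c[-1] = -1.0                               # maximize v
        A_ub = np.hstack([-Q.T, np.ones((m, 1))])  # v <= pi @ Q[:, o], all o
        b_ub = np.zeros(m)
        A_eq = np.ones((1, n + 1))
        A_eq[0, -1] = 0.0                          # probabilities sum to 1
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, 1)] * n + [(None, None)])
        return res.x[:n], res.x[-1]                # mixed strategy, game value

    # In minimax-Q this value replaces the max of ordinary Q-learning:
    #   Q[s, a, o] += alpha * (r + gamma * v_next - Q[s, a, o])
    # with (_, v_next) = minimax_value(Q_at_successor_state).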