## Minimal Sufficient Explanations for Factored Markov Decision Processes (2009)

Venue: Proceedings of the 19th International Conference on Automated Planning and Scheduling (ICAPS’09)

Citations: 4 (1 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Khan09minimalsufficient,
  author    = {Omar Zia Khan and Pascal Poupart and James P. Black},
  title     = {Minimal Sufficient Explanations for Factored Markov Decision Processes},
  booktitle = {Proceedings of the 19th International Conference on Automated Planning and Scheduling (ICAPS'09)},
  year      = {2009}
}
```

### Abstract

Explaining policies of Markov Decision Processes (MDPs) is complicated due to their probabilistic and sequential nature. We present a technique to explain policies for factored MDPs by populating a set of domain-independent templates. We also present a mechanism to determine a minimal set of templates that, viewed together, completely justify the policy. Our explanations can be generated automatically at run-time with no additional effort required from the MDP designer. We demonstrate our technique using the problems of advising undergraduate students in their course selection and assisting people with dementia in completing the task of handwashing. We also evaluate our explanations for course advising through a user study involving students.
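The abstract's idea of a minimal set of templates that jointly justify a policy can be illustrated with a toy sketch. Everything below (the `Term` structure, the greedy selection rule, and the course-advising numbers) is a hypothetical illustration of the concept, not the paper's actual algorithm:

```python
from dataclasses import dataclass

# Hypothetical template term: "action takes you to {scenario} with frequency {freq}".
@dataclass
class Term:
    scenario: str   # description of a set of states
    freq: float     # expected discounted frequency of reaching it
    utility: float  # utility associated with the scenario

def minimal_sufficient_explanation(terms, q_second):
    """Greedy sketch: keep the largest-contribution terms until their summed
    contribution alone exceeds the value of the second-best action, so the
    selected templates by themselves justify the recommended action."""
    picked, total = [], 0.0
    for t in sorted(terms, key=lambda t: t.freq * t.utility, reverse=True):
        picked.append(t)
        total += t.freq * t.utility
        if total >= q_second:  # enough terms to dominate every alternative
            return picked
    return picked  # worst case: every term is needed

terms = [
    Term("graduate on time", 0.9, 10.0),
    Term("high GPA", 0.5, 4.0),
    Term("course overload", 0.2, -1.0),
]
mse = minimal_sufficient_explanation(terms, q_second=8.0)
print([t.scenario for t in mse])
```

Here a single template (contribution 0.9 × 10.0 = 9.0) already exceeds the second-best action's value of 8.0, so the explanation needs only one term.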

### Citations

3776 | Reinforcement Learning: An Introduction
- Sutton, Barto
- 1998

Citation Context: ...= π*(s) otherwise. Note that π_{s0,a} and π* are equivalent if a = π*(s0). We can express the expected utility of executing this policy as V^{π_{s0,a}}. This is equivalent to the action-value function (Sutton and Barto 1998), also known as the Q-function, Q^{π*}(s0, a). We can compute the value of V^{π_{s0,a}} or Q^{π*}(s0, a) using Eq. 4. Since a template is populated by a frequency and a state/scenario, let us define a term t...
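The one-step lookahead relating the Q-function to the value function in this excerpt can be sketched numerically. The states, actions, rewards, and probabilities below are made up for illustration; V plays the role of the already-computed optimal value function:

```python
# Q(s0, a) = rho(s0, a) + gamma * sum_s' Pr(s'|s0, a) * V(s')
gamma = 0.95
rho = {"advance": 1.0, "stay": 0.0}          # immediate reward of each action in s0
trans = {"advance": {"good": 0.8, "bad": 0.2},   # Pr(s' | s0, a)
         "stay":    {"good": 0.1, "bad": 0.9}}
V = {"good": 10.0, "bad": 2.0}               # value of each successor state

def q_value(a):
    # expected immediate reward plus discounted expected future value
    return rho[a] + gamma * sum(p * V[s2] for s2, p in trans[a].items())

print({a: round(q_value(a), 3) for a in trans})  # {'advance': 8.98, 'stay': 2.66}
```

Comparing Q across actions identifies the recommended action; the gap to the runner-up is what a minimal sufficient explanation must account for.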

581 | Markov Decision Processes
- Puterman
- 1994

Citation Context: ...l or system. However, deciding on a course of action is notoriously difficult when there is uncertainty in the effects of the actions and the objectives are complex. Markov decision processes (MDPs) (Puterman 1994) provide a principled approach for automated planning under uncertainty. While such an automated approach harnesses the computational power of machines to optimize difficult sequential decision makin...

179 | SPUDD: Stochastic Planning using Decision Diagrams
- Hoey, St-Aubin, et al.
- 1999

Citation Context: ...that automatically aggregates states with identical values/frequencies, needed for scenarios, thereby significantly reducing the running time. More details on the use of ADDs in MDPs can be found in (Hoey et al. 1999). Discussion For any given state, an MSE is guaranteed to exist; in the worst case it will need to include all the terms in the MSE from Eq. 6 or Eq. 8. Thus, the upper bound on the number of templat...
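The aggregation idea behind ADDs mentioned in this excerpt (grouping states that share the same value so each group is handled once) can be shown with a plain dictionary. This is only a sketch of the grouping, with made-up state names and values, not a decision-diagram implementation:

```python
from collections import defaultdict

# Hypothetical per-state values; states with equal values form one "scenario".
values = {"s0": 4.2, "s1": 4.2, "s2": 7.0, "s3": 4.2, "s4": 7.0}

groups = defaultdict(list)
for s, v in values.items():
    groups[v].append(s)  # all states sharing value v are aggregated

print({v: sorted(ss) for v, ss in groups.items()})
```

An ADD achieves the same aggregation symbolically over state variables, which is what makes it tractable for factored state spaces far too large to enumerate.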

147 | Stochastic Dynamic Programming with Factored Representations - Boutilier, Dearden, et al. - 2000 |

111 | The Epistemology of a Rule-Based Expert System: A Framework for Explanation
- Clancey
- 1981

Citation Context: ...at provided justifications of its decisions. In addition to the rules used by the expert system, it also needed additional domain knowledge to generate explanations. Another example is MYCIN (Clancey 1983), which provided execution traces as explanations. This approach is infeasible for MDPs as the computation is too complex to be explained directly. Herlocker et al. (1999) presented the idea of highli...

95 | Xplain: A System for Creating and Explaining Expert Consulting Programs
- Swartout
- 1983

Citation Context: ...max_a [ρ(s, a) + γ Σ_{s'} Pr(s'|s, a) V*(s')]. Related Work Explanations for intelligent systems such as expert and recommender systems have been studied widely (Tintarev and Masthoff 2007). Xplain (Swartout 1983) was an early example of an intelligent tutoring system that provided justifications of its decisions. In addition to the rules used by the expert system, it also needed additional domain knowledge t...
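The Bellman optimality equation fragment quoted at the start of this excerpt, V*(s) = max_a [ρ(s, a) + γ Σ_{s'} Pr(s'|s, a) V*(s')], can be solved by value iteration on a tiny hand-built MDP. All states, actions, and numbers here are hypothetical:

```python
gamma = 0.9
states = ["s0", "s1"]
actions = ["a", "b"]
# immediate rewards rho(s, a) and deterministic transitions Pr(s'|s, a)
rho = {("s0", "a"): 0.0, ("s0", "b"): 1.0, ("s1", "a"): 2.0, ("s1", "b"): 0.0}
P = {("s0", "a"): {"s0": 1.0}, ("s0", "b"): {"s1": 1.0},
     ("s1", "a"): {"s0": 1.0}, ("s1", "b"): {"s1": 1.0}}

V = {s: 0.0 for s in states}
for _ in range(1000):
    # one Bellman backup: V(s) <- max_a [rho(s,a) + gamma * sum_s' Pr(s'|s,a) V(s')]
    V = {s: max(rho[s, a] + gamma * sum(p * V[s2] for s2, p in P[s, a].items())
                for a in actions)
         for s in states}

print({s: round(v, 2) for s, v in V.items()})  # converges to {'s0': 14.74, 's1': 15.26}
```

The optimal policy alternates between the two states (b in s0, a in s1), and the fixed point matches the closed-form solution of the two coupled Bellman equations.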

63 | Exploiting Structure to Efficiently Solve Large Scale Partially Observable Markov Decision Processes - Poupart - 2005 |

36 | Assisting persons with dementia during handwashing using a partially observable Markov decision process - Hoey, Bertoldi, et al. - 2007 |

24 | A review of explanation methods for Bayesian networks - Lacave, Diez - 2002 |

23 | Defining explanation in probabilistic systems - Chajewska, Halpern - 1997 |

9 | Explaining task processing in cognitive assistants that learn - McGuinness, Glass, et al. - 2007 |

5 | An MDP Approach for Explanation Generation - Flores, Sucar, et al. - 2007 |

4 | Explanation of Bayesian networks and influence diagrams in Elvira - Lacave, Luque, et al. |

3 | Explanations in recommender systems - Herlocker - 1999 |

3 | Poet: The online preference elicitation tool - Royalty, Holland, et al. - 2002 |

1 | Explaining recommendations generated by MDPs - Khan, Poupart, et al. - 2008 |