Provides the infrastructure to define and analyze the solutions of Partially Observable Markov Decision Processes (POMDP) models. The package includes pomdp-solve (Cassandra, 2015) to solve POMDPs using a variety of algorithms.
The package provides the following algorithms:
-
Exact value iteration
- Enumeration algorithm (Sondik 1971, Mohan 1982).
- Two pass algorithm (Sondik 1971).
- Witness algorithm (Littman, Cassandra, Kaelbling 1996).
- Incremental pruning algorithm (Zhang and Liu 1996, Cassandra et al 1997).
-
Approximate value iteration
- Finite grid algorithm (Cassandra 2015), a variation of point-based value iteration to solve larger POMDPs (PBVI; see Pineau 2003) without dynamic belief set expansion.
Stable CRAN version: install from within R with
install.packages("pomdp")
Current development version: install from GitHub (needs devtools).
library("devtools")
install_github("farzad/pomdp")
Solving the simple infinite-horizon Tiger problem.
R> library("pomdp")
R> data("Tiger")
R> Tiger
Unsolved POMDP model: Tiger Problem
horizon: Inf
> sol <- solve_POMDP(model = Tiger)
> sol
Solved POMDP model: Tiger Problem
solution method: grid
horizon: Inf
converged: TRUE
total expected reward (for start probabilities): 1.933439
> policy(sol)
[[1]]
tiger-left tiger-right action tiger-left tiger-right
1 -98.549921 11.450079 open-left 3 3
2 -10.854299 6.516937 listen 3 1
3 1.933439 1.933439 listen 4 2
4 6.516937 -10.854299 listen 5 3
5 11.450079 -98.549921 open-right 3 3
- Cassandra, A. (2015). pomdp-solve: POMDP Solver Software, http://www.pomdp.org.
- Sondik, E. (1971). The Optimal Control of Partially Observable Markov Processes. Ph.D. Dissertation, Stanford University.
- Cassandra, A., Littman M.L., Zhang L. (1997). Incremental Pruning: A Simple, Fast, Exact Algorithm for Partially Observable Markov Decision Processes. UAI'97: Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence, August 1997, pp. 54-61.
- Monahan, G. E. (1982). A survey of partially observable Markov decision processes: Theory, models, and algorithms. Management Science 28(1):1-16.
- Littman, M. L.; Cassandra, A. R.; and Kaelbling, L. P. (1996). Efficient dynamic-programming updates in partially observable Markov decision processes. Technical Report CS-95-19, Brown University, Providence, RI.
- Zhang, N. L., and Liu, W. (1996). Planning in stochastic domains: Problem characteristics and approximation. Technical Report HKUST-CS96-31, Department of Computer Science, Hong Kong University of Science and Technology.
- Pineau, J., Gordon, G.J., Thrun, S.B. (2003). Point-based value iteration: an anytime algorithm for POMDPs. IJCAI'03: Proceedings of the 18th international joint conference on Artificial Intelligence. Pages 1025-1030.
Development of this package was supported in part by National Institute of Standards and Technology (NIST) under grant number 60NANB17D180.