R package pomdp: Partially Observable Markov Decision Processes

Provides the infrastructure to define and analyze the solutions of Partially Observable Markov Decision Processes (POMDP) models. The package includes pomdp-solve (Cassandra, 2015) to solve POMDPs using a variety of algorithms.

The package provides the following algorithms:

Exact value iteration
- Enumeration algorithm (Sondik 1971, Mohan 1982).
- Two pass algorithm (Sondik 1971).
- Witness algorithm (Littman, Cassandra, Kaelbling 1996).
- Incremental pruning algorithm (Zhang and Liu 1996, Cassandra et al 1997).
Approximate value iteration
- Finite grid algorithm (Cassandra 2015), a variation of point-based value iteration to solve larger POMDPs (PBVI; see Pineau 2003) without dynamic belief set expansion.

Installation

Stable CRAN version: install from within R with

install.packages("pomdp")

Current development version: install from GitHub (needs devtools).

library("devtools")
install_github("farzad/pomdp")

Usage

Solving the simple infinite-horizon Tiger problem.

R> library("pomdp")
R> data("Tiger")
R> Tiger

Unsolved POMDP model: Tiger Problem 
 	horizon: Inf

> sol <- solve_POMDP(model = Tiger)
> sol

Solved POMDP model: Tiger Problem 
 	solution method: grid 
 	horizon: Inf 
  	converged: TRUE 
 	total expected reward (for start probabilities): 1.933439

> policy(sol)

[[1]]
  tiger-left tiger-right     action tiger-left tiger-right
1 -98.549921   11.450079  open-left          3           3
2 -10.854299    6.516937     listen          3           1
3   1.933439    1.933439     listen          4           2
4   6.516937  -10.854299     listen          5           3
5  11.450079  -98.549921 open-right          3           3

References

Cassandra, A. (2015). pomdp-solve: POMDP Solver Software, http://www.pomdp.org.
Sondik, E. (1971). The Optimal Control of Partially Observable Markov Processes. Ph.D. Dissertation, Stanford University.
Cassandra, A., Littman M.L., Zhang L. (1997). Incremental Pruning: A Simple, Fast, Exact Algorithm for Partially Observable Markov Decision Processes. UAI'97: Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence, August 1997, pp. 54-61.
Monahan, G. E. (1982). A survey of partially observable Markov decision processes: Theory, models, and algorithms. Management Science 28(1):1-16.
Littman, M. L.; Cassandra, A. R.; and Kaelbling, L. P. (1996). Efficient dynamic-programming updates in partially observable Markov decision processes. Technical Report CS-95-19, Brown University, Providence, RI.
Zhang, N. L., and Liu, W. (1996). Planning in stochastic domains: Problem characteristics and approximation. Technical Report HKUST-CS96-31, Department of Computer Science, Hong Kong University of Science and Technology.
Pineau, J., Gordon, G.J., Thrun, S.B. (2003). Point-based value iteration: an anytime algorithm for POMDPs. IJCAI'03: Proceedings of the 18th international joint conference on Artificial Intelligence. Pages 1025-1030.

Acknowledgments

Development of this package was supported in part by National Institute of Standards and Technology (NIST) under grant number 60NANB17D180.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

R package pomdp: Partially Observable Markov Decision Processes

Installation

Usage

References

Acknowledgments

Files

README.md

Latest commit

History

README.md

File metadata and controls

R package pomdp: Partially Observable Markov Decision Processes

Installation

Usage

References

Acknowledgments