
R package pomdp: Partially Observable Markov Decision Processes


Provides the infrastructure to define and analyze the solutions of Partially Observable Markov Decision Process (POMDP) models. The package includes pomdp-solve (Cassandra, 2015) to solve POMDPs using a variety of algorithms.

The package provides the following algorithms, chosen via the method argument of solve_POMDP() (see the example after this list):

  • Exact value iteration

    • Enumeration algorithm (Sondik 1971, Monahan 1982).
    • Two pass algorithm (Sondik 1971).
    • Witness algorithm (Littman, Cassandra, Kaelbling 1996).
    • Incremental pruning algorithm (Zhang and Liu 1996, Cassandra et al. 1997).
  • Approximate value iteration

    • Finite grid algorithm (Cassandra 2015), a variation of point-based value iteration to solve larger POMDPs (PBVI; see Pineau 2003) without dynamic belief set expansion.
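For example, the solver can be switched from the default finite grid algorithm to exact incremental pruning. A minimal sketch, assuming the pomdp-solve method names accepted by solve_POMDP() ("grid", "enum", "twopass", "witness", "incprune"):

library("pomdp")
data("Tiger")

# solve the Tiger problem with exact incremental pruning instead of
# the default finite grid approximation
sol <- solve_POMDP(model = Tiger, method = "incprune")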

Installation

Stable CRAN version: install from within R with

install.packages("pomdp")

Current development version: install from GitHub (requires the devtools package).

library("devtools")
install_github("farzad/pomdp")

Usage

Solving the simple infinite-horizon Tiger problem.

R> library("pomdp")
R> data("Tiger")
R> Tiger
Unsolved POMDP model: Tiger Problem
  horizon: Inf
R> sol <- solve_POMDP(model = Tiger)
R> sol
Solved POMDP model: Tiger Problem
  solution method: grid
  horizon: Inf
  converged: TRUE
  total expected reward (for start probabilities): 1.933439
R> policy(sol)
[[1]]
  tiger-left tiger-right     action tiger-left tiger-right
1 -98.549921   11.450079  open-left          3           3
2 -10.854299    6.516937     listen          3           1
3   1.933439    1.933439     listen          4           2
4   6.516937  -10.854299     listen          5           3
5  11.450079  -98.549921 open-right          3           3
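Each row of the policy is a node of the policy graph: the first two columns are the coefficients of the node's alpha-vector (its value as a linear function of the belief over tiger-left and tiger-right), the action column is the prescribed action, and the last two columns give the node to move to after observing tiger-left or tiger-right, respectively.

The policy graph of a converged solution can also be plotted. A minimal sketch, assuming the plot_policy_graph() helper exported by pomdp:

# plot the policy graph of the converged infinite-horizon solution
plot_policy_graph(sol)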

References

  • Cassandra, A. (2015). pomdp-solve: POMDP Solver Software. http://www.pomdp.org
  • Sondik, E. (1971). The Optimal Control of Partially Observable Markov Processes. Ph.D. Dissertation, Stanford University.
  • Cassandra, A., Littman, M. L., and Zhang, N. L. (1997). Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes. UAI'97: Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, August 1997, pp. 54-61.
  • Monahan, G. E. (1982). A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms. Management Science 28(1):1-16.
  • Littman, M. L., Cassandra, A. R., and Kaelbling, L. P. (1996). Efficient Dynamic-Programming Updates in Partially Observable Markov Decision Processes. Technical Report CS-95-19, Brown University, Providence, RI.
  • Zhang, N. L., and Liu, W. (1996). Planning in Stochastic Domains: Problem Characteristics and Approximation. Technical Report HKUST-CS96-31, Department of Computer Science, Hong Kong University of Science and Technology.
  • Pineau, J., Gordon, G. J., and Thrun, S. B. (2003). Point-Based Value Iteration: An Anytime Algorithm for POMDPs. IJCAI'03: Proceedings of the 18th International Joint Conference on Artificial Intelligence, pp. 1025-1030.

Acknowledgments

Development of this package was supported in part by the National Institute of Standards and Technology (NIST) under grant number 60NANB17D180.