MarkovDecisionProcess

Type Parameters:

S - the state type.

A - the action type.

All Known Implementing Classes:

MDP
```
public interface MarkovDecisionProcess<S,A extends Action>
```
Artificial Intelligence A Modern Approach (3rd Edition): page 647.

A sequential decision problem for a fully observable, stochastic environment with a Markovian transition model and additive rewards is called a Markov decision process, or MDP. It consists of a set of states (with an initial state s₀); a set ACTIONS(s) of actions in each state; a transition model P(s' | s, a); and a reward function R(s).

Note: Some definitions of MDPs allow the reward to depend on the action and outcome too, so the reward function is R(s, a, s'). This simplifies the description of some environments but does not change the problem in any fundamental way.
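To make the four components concrete, here is a minimal sketch of a two-state MDP. It is an illustration under stated assumptions, not the library's `MDP` class: the nested `Action` and `MarkovDecisionProcess` types are local stand-ins that mirror the method signatures on this page so the example compiles on its own, and `Move`, `TwoStateMdp`, the state names, and all probabilities and rewards are hypothetical.

```java
import java.util.EnumSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class TwoStateMdpSketch {
    // Stand-in for the library's Action type (assumption: declared here
    // only so the sketch is self-contained).
    interface Action {}

    // Local copy of the five methods documented on this page.
    interface MarkovDecisionProcess<S, A extends Action> {
        Set<S> states();
        S getInitialState();
        Set<A> actions(S s);
        double transitionProbability(S sDelta, S s, A a);
        double reward(S s);
    }

    // Hypothetical actions: stay put, or try to move to the other state.
    enum Move implements Action { STAY, SWITCH }

    // Hypothetical MDP over the two states "cold" and "warm".
    static class TwoStateMdp implements MarkovDecisionProcess<String, Move> {
        public Set<String> states() {
            Set<String> all = new LinkedHashSet<>();
            all.add("cold");
            all.add("warm");
            return all;
        }

        public String getInitialState() {
            return "cold"; // s0
        }

        public Set<Move> actions(String s) {
            return EnumSet.allOf(Move.class); // ACTIONS(s) is the same in every state
        }

        public double transitionProbability(String sDelta, String s, Move a) {
            boolean same = sDelta.equals(s);
            if (a == Move.STAY) {
                return same ? 1.0 : 0.0; // STAY is deterministic
            }
            return same ? 0.2 : 0.8;     // SWITCH succeeds with probability 0.8
        }

        public double reward(String s) {
            return s.equals("warm") ? 1.0 : -0.04; // R(s) depends on the state only
        }
    }

    public static void main(String[] args) {
        TwoStateMdp mdp = new TwoStateMdp();
        // P("warm" | "cold", SWITCH)
        System.out.println(mdp.transitionProbability("warm", "cold", Move.SWITCH));
    }
}
```

Note the argument order of `transitionProbability`: the destination state s' comes first, followed by the source state s and the action a, matching the conditional notation P(s' | s, a).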

Author:

Ciaran O'Reilly, Ravi Mohan

Method Summary

| Modifier and Type | Method and Description |
| --- | --- |
| `java.util.Set<A>` | `actions(S s)` Get the set of actions for state s. |
| `S` | `getInitialState()` Get the initial state s₀ for this instance of a Markov decision process. |
| `double` | `reward(S s)` Get the reward associated with being in state s. |
| `java.util.Set<S>` | `states()` Get the set of states associated with the Markov decision process. |
| `double` | `transitionProbability(S sDelta, S s, A a)` Return the probability of going from state s using action a to s' based on the underlying transition model P(s' \| s, a). |

- Method Detail
  - states
```
java.util.Set<S> states()
```
    Get the set of states associated with the Markov decision process.
    
    Returns:
    
    the set of states associated with the Markov decision process.
  - getInitialState
```
S getInitialState()
```
    Get the initial state s₀ for this instance of a Markov decision process.
    
    Returns:
    
    the initial state s₀.
  - actions
```
java.util.Set<A> actions(S s)
```
    Get the set of actions for state s.
    
    Parameters:
    
    s - the state.
    
    Returns:
    
    the set of actions for state s.
  - transitionProbability
```
double transitionProbability(S sDelta,
                             S s,
                             A a)
```
    Return the probability of going from state s using action a to s' based on the underlying transition model P(s' | s, a).
    
    Parameters:
    
    sDelta - the state s' being transitioned to.
    
    s - the state s being transitioned from.
    
    a - the action used to move from state s to s'.
    
    Returns:
    
    the probability of going from state s using action a to s'.
  - reward
```
double reward(S s)
```
    Get the reward associated with being in state s.
    
    Parameters:
    
    s - the state whose reward is sought.
    
    Returns:
    
    the reward associated with being in state s.
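The methods above can be combined by client code. The sketch below shows two possible uses: summing `transitionProbability` over `states()` to check that P(· | s, a) is a proper distribution, and weighting `reward` by the transition model to get the expected reward of the successor state. Everything here is an assumption made for illustration: the nested `Action` and `MarkovDecisionProcess` types are local stand-ins that copy the signatures on this page, and `Flip`, `coinMdp`, `totalProbability`, and `expectedNextReward` are hypothetical names, not part of the documented API.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;

public class MdpUsageSketch {
    // Stand-ins mirroring the documented signatures (assumption: declared
    // locally so the sketch compiles without the library).
    interface Action {}

    interface MarkovDecisionProcess<S, A extends Action> {
        Set<S> states();
        S getInitialState();
        Set<A> actions(S s);
        double transitionProbability(S sDelta, S s, A a);
        double reward(S s);
    }

    // Sum of P(s' | s, a) over every s'; should be 1 for a valid model.
    static <S, A extends Action> double totalProbability(
            MarkovDecisionProcess<S, A> mdp, S s, A a) {
        double total = 0.0;
        for (S sDelta : mdp.states()) {
            total += mdp.transitionProbability(sDelta, s, a);
        }
        return total;
    }

    // Expected reward of the successor state: sum over s' of P(s' | s, a) * R(s').
    static <S, A extends Action> double expectedNextReward(
            MarkovDecisionProcess<S, A> mdp, S s, A a) {
        double expected = 0.0;
        for (S sDelta : mdp.states()) {
            expected += mdp.transitionProbability(sDelta, s, a) * mdp.reward(sDelta);
        }
        return expected;
    }

    // Hypothetical single action: toss a fair coin.
    enum Flip implements Action { TOSS }

    // Hypothetical MDP with states 0 and 1; TOSS lands in either with probability 0.5.
    static MarkovDecisionProcess<Integer, Flip> coinMdp() {
        return new MarkovDecisionProcess<Integer, Flip>() {
            public Set<Integer> states() {
                return new HashSet<>(Arrays.asList(0, 1));
            }
            public Integer getInitialState() {
                return 0;
            }
            public Set<Flip> actions(Integer s) {
                return EnumSet.allOf(Flip.class);
            }
            public double transitionProbability(Integer sDelta, Integer s, Flip a) {
                return 0.5; // independent of the current state
            }
            public double reward(Integer s) {
                return s.doubleValue(); // R(0) = 0, R(1) = 1
            }
        };
    }

    public static void main(String[] args) {
        MarkovDecisionProcess<Integer, Flip> mdp = coinMdp();
        Integer s0 = mdp.getInitialState();
        System.out.println(totalProbability(mdp, s0, Flip.TOSS));
        System.out.println(expectedNextReward(mdp, s0, Flip.TOSS));
    }
}
```

Both helpers iterate over the full state set, which is reasonable for small, enumerable state spaces like this one. For real models whose probabilities are not exactly representable in binary, compare the total against 1 with a small tolerance rather than with `==`.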
