S - the state type.
A - the action type.

public interface PolicyEvaluation<S,A extends Action>
Modifier and Type | Method and Description |
---|---|
java.util.Map<S,java.lang.Double> | evaluate(java.util.Map<S,A> pi_i, java.util.Map<S,java.lang.Double> U, MarkovDecisionProcess<S,A> mdp) Policy evaluation: given a policy π_i, calculate U_i = U^{π_i}, the utility of each state if π_i were to be executed. |
java.util.Map<S,java.lang.Double> evaluate(java.util.Map<S,A> pi_i, java.util.Map<S,java.lang.Double> U, MarkovDecisionProcess<S,A> mdp)
pi_i - a policy vector indexed by state
U - a vector of utilities for states in S
mdp - an MDP with states S, actions A(s), transition model P(s'|s,a)
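
The interface fixes only the signature of evaluate, not how the utilities are computed. One common approach, sketched below, is to apply the simplified Bellman update U(s) <- R(s) + gamma * sum over s' of P(s'|s, pi(s)) * U(s') for a fixed number of sweeps. This is an illustrative sketch, not the library's implementation: SimpleMdp is a hypothetical stand-in for MarkovDecisionProcess, and the sweep count k, the discount gamma, and the demo method are assumptions introduced here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PolicyEvaluationSketch {

    /** Hypothetical minimal MDP view; NOT the library's MarkovDecisionProcess. */
    interface SimpleMdp<S, A> {
        Iterable<S> states();

        double reward(S s);

        /** Returns P(sPrime | s, a). */
        double transitionProbability(S sPrime, S s, A a);
    }

    /**
     * Runs k sweeps of the simplified Bellman update for a fixed policy pi,
     * starting from the utilities in U. The input map is not modified.
     */
    static <S, A> Map<S, Double> evaluate(Map<S, A> pi, Map<S, Double> U,
                                          SimpleMdp<S, A> mdp, int k, double gamma) {
        Map<S, Double> current = new HashMap<>(U);
        for (int i = 0; i < k; i++) {
            Map<S, Double> next = new HashMap<>();
            for (S s : mdp.states()) {
                A a = pi.get(s);
                double expected = 0.0;
                if (a != null) { // states without a policy entry are treated as terminal
                    for (S sPrime : mdp.states()) {
                        expected += mdp.transitionProbability(sPrime, s, a)
                                * current.getOrDefault(sPrime, 0.0);
                    }
                }
                next.put(s, mdp.reward(s) + gamma * expected);
            }
            current = next;
        }
        return current;
    }

    /** Two-state toy problem: "a" moves to terminal "b" (reward 1) with certainty. */
    static Map<String, Double> demo() {
        SimpleMdp<String, String> mdp = new SimpleMdp<String, String>() {
            @Override
            public Iterable<String> states() {
                return List.of("a", "b");
            }

            @Override
            public double reward(String s) {
                return s.equals("b") ? 1.0 : 0.0;
            }

            @Override
            public double transitionProbability(String sPrime, String s, String a) {
                return (s.equals("a") && a.equals("go") && sPrime.equals("b")) ? 1.0 : 0.0;
            }
        };
        Map<String, String> pi = Map.of("a", "go");           // policy: state -> action
        Map<String, Double> u0 = Map.of("a", 0.0, "b", 0.0);  // initial utilities
        return evaluate(pi, u0, mdp, 2, 0.5);
    }

    public static void main(String[] args) {
        Map<String, Double> u = demo();
        System.out.println("U(a) = " + u.get("a"));
        System.out.println("U(b) = " + u.get("b"));
    }
}
```

With gamma = 0.5 and two sweeps, the toy problem yields U(b) = 1.0 (its own reward) and U(a) = 0.5 (the discounted utility of the state it moves to). A fixed sweep count gives only an approximation of U^{π_i}; solving the linear system of Bellman equations exactly is the alternative when the state space is small.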