Consider a modified version of the vacuum environment in Exercise vacuum-start-exercise, in which the agent is penalized one point for each movement.
Can a simple reflex agent be perfectly rational for this environment? Explain.
What about a reflex agent with state? Design such an agent.
How do your answers to the two preceding questions change if the agent’s percepts give it the clean/dirty status of every square in the environment?
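To experiment with candidate agents, the modified performance measure can be sketched as a small simulator. This is only an illustrative harness, not part of the exercise: it assumes the two-square world (squares A and B), a percept of the form (location, is_dirty), one point awarded per clean square at each time step, and one point deducted for every Left or Right action, including one that bumps into a wall. The names `simulate` and `simple_reflex_agent` are hypothetical.

```python
from typing import Callable, Dict, Tuple

Percept = Tuple[str, bool]          # (location, is_dirty) -- assumed percept format
Agent = Callable[[Percept], str]    # returns "Suck", "Left", "Right", or "NoOp"


def simulate(agent: Agent, dirt: Dict[str, bool],
             start: str = "A", steps: int = 1000) -> int:
    """Run `agent` in an assumed two-square vacuum world and return its score."""
    dirt = dict(dirt)               # copy so the caller's dict is not mutated
    loc = start
    score = 0
    for _ in range(steps):
        action = agent((loc, dirt[loc]))
        if action == "Suck":
            dirt[loc] = False
        elif action == "Left":
            loc = "A"
            score -= 1              # movement penalty (assumed to apply per move action)
        elif action == "Right":
            loc = "B"
            score -= 1
        # Any other action (e.g. "NoOp") leaves the world unchanged.
        # Assumed reward: one point per clean square at each time step.
        score += sum(1 for is_dirty in dirt.values() if not is_dirty)
    return score


def simple_reflex_agent(percept: Percept) -> str:
    """Baseline agent: suck if dirty, otherwise move to the other square."""
    loc, is_dirty = percept
    if is_dirty:
        return "Suck"
    return "Right" if loc == "A" else "Left"


if __name__ == "__main__":
    print(simulate(simple_reflex_agent, {"A": True, "B": True}, steps=10))    # 10
    print(simulate(simple_reflex_agent, {"A": False, "B": False}, steps=10))  # 10
    print(simulate(lambda p: "NoOp", {"A": False, "B": False}, steps=10))     # 20
```

Under these assumptions, comparing the last two runs shows what the movement penalty costs an agent that keeps shuttling between squares that are already clean. Any agent designed for the exercise can be passed to `simulate` in the same way.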