On the Expressivity of Markov Reward

Our main results prove that while reward can express many tasks, there exist instances of each task type that no Markov reward function can capture. We then provide a set of polynomial-time algorithms that construct a reward function which allows an agent to optimize tasks of each of these three types, and correctly determine when no such reward function exists.

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *