Do deep reinforcement learning agents model intentions?

Our latest paper shows that deep reinforcement learning agents seem to model the intentions of other agents in a cooperative task. It also shows that trained agents tend to overfit to each other and do not generalize to unseen partners. Short read:

Generalization gap video (featuring Sheldon agents!):

Intention reading video:

