Our latest paper shows that deep reinforcement learning agents seem to model the intentions of other agents in a cooperative task. It also shows that trained agents tend to overfit to each other and do not generalize to unseen partners. Short read: https://arxiv.org/abs/1805.06020.
Generalization gap video (featuring Sheldon agents!):
Intention reading video: