Dota 2 with Large Scale Deep Reinforcement Learning
"When successfully scaled up, modern reinforcement learning techniques can achieve superhuman performance in competitive esports games. The key ingredients are to expand the scale of compute used, by increasing the batch size and total training time. In order to extend the training time of a single run to ten months, we developed surgery techniques for continuing training across changes to the model and environment. While we focused on Dota 2, we hypothesize these results will apply more generally and these methods can solve any zero-sum two-team continuous environment which can be simulated in parallel across hundreds of thousands of instances. In the future, environments and tasks will continue to grow in complexity. Scaling will become even more important (for current methods) as the tasks become more challenging." This excerpt is from this document:  https://cdn.openai.com/dota-2.pdf