Dota 2 with Large Scale Deep Reinforcement Learning
"When successfully scaled up, modern reinforcement learning techniques can achieve superhuman
performance in competitive esports games. The key ingredients are to expand the scale of compute
used, by increasing the batch size and total training time. In order to extend the training time of a
single run to ten months, we developed surgery techniques for continuing training across changes to
the model and environment. While we focused on Dota 2, we hypothesize these results will apply
more generally and these methods can solve any zero-sum two-team continuous environment which
can be simulated in parallel across hundreds of thousands of instances. In the future, environments
and tasks will continue to grow in complexity. Scaling will become even more important (for current
methods) as the tasks become more challenging."
This excerpt is from the paper of the same title: https://cdn.openai.com/dota-2.pdf