Reinforcement learning with prediction-based rewards
OpenAI Blog
October 31, 2018
We’ve developed Random Network Distillation (RND), a prediction-based method that encourages reinforcement learning agents to explore their environments through curiosity. With RND, an agent for the first time exceeds average human performance on Montezuma’s Revenge.
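The summary above does not spell out how a prediction-based reward works, so here is a minimal sketch of the idea behind RND: a predictor is trained to match the output of a fixed, randomly initialized target network, and the prediction error serves as an exploration bonus. Everything named here (`RNDBonus`, `feat_dim`, `lr`, the linear maps, the toy observations) is an assumption made for illustration; the actual method uses neural networks on game observations, not the linear maps shown.

```python
import numpy as np


class RNDBonus:
    """Toy prediction-based exploration bonus (illustrative, not the authors' code)."""

    def __init__(self, obs_dim, feat_dim=8, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random "target network": never trained.
        self.target = rng.normal(size=(obs_dim, feat_dim))
        # Predictor: trained to imitate the target on observations the agent has seen.
        self.predictor = np.zeros((obs_dim, feat_dim))
        self.lr = lr

    def bonus(self, obs):
        # Intrinsic reward = mean squared prediction error on this observation.
        err = obs @ self.predictor - obs @ self.target
        return float(np.mean(err ** 2))

    def update(self, obs):
        # One gradient step on the mean squared error with respect to the predictor.
        err = obs @ self.predictor - obs @ self.target
        grad = 2.0 * np.outer(obs, err) / err.size
        self.predictor -= self.lr * grad


# Hypothetical usage: one observation is visited repeatedly, another never.
model = RNDBonus(obs_dim=4)
familiar = np.array([1.0, 0.0, 0.0, 0.0])
novel = np.array([0.0, 1.0, 0.0, 0.0])
for _ in range(200):
    model.update(familiar)
print("familiar bonus:", model.bonus(familiar))
print("novel bonus:", model.bonus(novel))
```

In this sketch the bonus for the familiar observation shrinks as the predictor learns it, while the unseen observation keeps a large error. That gap is what pushes an agent toward states it has not visited yet.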