SURREAL - An Open-Source Distributed Reinforcement Learning Framework from the Stanford Vision and Learning Lab

Reader submission · 584 views · 2022-11-05

SURREAL

Open-Source Distributed Reinforcement Learning Framework

Stanford Vision and Learning Lab

SURREAL is a fully integrated framework that runs state-of-the-art distributed reinforcement learning (RL) algorithms.

Scalability. RL algorithms are data hungry by nature. Even the simplest Atari games, like Breakout, typically require up to a billion frames to learn a good solution. To accelerate training significantly, SURREAL parallelizes the environment simulation and learning. The system can easily scale to thousands of CPUs and hundreds of GPUs.

Flexibility. SURREAL unifies distributed on-policy and off-policy learning into a single algorithmic formulation. The key is to separate experience generation from learning. Parallel actors generate massive amounts of experience data, while a single, centralized learner performs model updates. Each actor interacts with the environment independently, which lets the actors diversify exploration for hard long-horizon robotic tasks. The actors send their experiences to a centralized buffer, which can be instantiated as a FIFO queue for on-policy mode or as a replay memory for off-policy mode.
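The split between experience generation and learning can be illustrated with a small sketch. The code below is not SURREAL's API; the names `FIFOBuffer`, `ReplayBuffer`, `make_buffer`, and `run_actors` are hypothetical, and the actors run sequentially in one process as a stand-in for the parallel actors described above. It only shows how one buffer interface might serve both modes: a FIFO queue whose experiences are consumed once, and a bounded replay memory whose experiences can be sampled repeatedly.

```python
import random
from collections import deque


class FIFOBuffer:
    """On-policy mode (hypothetical): experiences are consumed once, oldest first."""

    def __init__(self):
        self._queue = deque()

    def add(self, experience):
        self._queue.append(experience)

    def sample(self, batch_size):
        # Pop the oldest experiences; they are removed and never reused.
        n = min(batch_size, len(self._queue))
        return [self._queue.popleft() for _ in range(n)]

    def __len__(self):
        return len(self._queue)


class ReplayBuffer:
    """Off-policy mode (hypothetical): experiences stay in memory for reuse."""

    def __init__(self, capacity, seed=None):
        # A bounded deque evicts the oldest entry once capacity is reached.
        self._memory = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, experience):
        self._memory.append(experience)

    def sample(self, batch_size):
        # Uniform sampling without replacement; the memory is left unchanged.
        n = min(batch_size, len(self._memory))
        return self._rng.sample(list(self._memory), n)

    def __len__(self):
        return len(self._memory)


def make_buffer(mode, capacity=10000, seed=None):
    """Pick the buffer type from the learning mode (assumed mode names)."""
    if mode == "on_policy":
        return FIFOBuffer()
    if mode == "off_policy":
        return ReplayBuffer(capacity, seed)
    raise ValueError(f"unknown mode: {mode}")


def run_actors(buffer, num_actors, steps_per_actor):
    """Sequential stand-in for parallel actors; each experience is (actor_id, step)."""
    for actor_id in range(num_actors):
        for step in range(steps_per_actor):
            buffer.add((actor_id, step))


if __name__ == "__main__":
    fifo = make_buffer("on_policy")
    run_actors(fifo, num_actors=2, steps_per_actor=3)
    print(fifo.sample(4), len(fifo))

    replay = make_buffer("off_policy", capacity=5, seed=0)
    run_actors(replay, num_actors=2, steps_per_actor=3)
    print(len(replay.sample(3)), len(replay))
```

In this sketch the learner would call `sample` in a loop; only the buffer type changes between the two modes, which is the point of keeping experience generation separate from learning.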

Reproducibility. RL algorithms are notoriously hard to reproduce [Henderson et al., 2017] because of multiple sources of variation, such as algorithm implementation details, library dependencies, and hardware types. We address this by providing an end-to-end integrated pipeline that replicates our full cluster hardware and software runtime setup.

Installation

Surreal algorithms can be deployed at various scales. They can run on a single laptop to solve easier locomotion tasks, or on hundreds of machines to solve complex manipulation tasks.

Surreal on your Laptop
Surreal on Google Cloud Kubernetes Engine
Customizing Surreal
Documentation Index

Benchmarking

Scalability of Surreal-PPO with up to 1024 actors on Surreal Robotics Suite.

Training curves of 16 actors on OpenAI Gym tasks for 3 hours, compared to other baselines.

Citation

Please cite our CoRL paper if you use this repository in your publications:

@inproceedings{corl2018surreal,
  title={SURREAL: Open-Source Reinforcement Learning Framework and Robot Manipulation Benchmark},
  author={Fan, Linxi and Zhu, Yuke and Zhu, Jiren and Liu, Zihua and Zeng, Orien and Gupta, Anchit and Creus-Costa, Joan and Savarese, Silvio and Fei-Fei, Li},
  booktitle={Conference on Robot Learning},
  year={2018}
}
