03.12.2024
Let’s focus on the publication "Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control," awarded a spotlight at NeurIPS 2024.

The paper "Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control" is co-authored by Michał Nauman (ex-IDEAS NCBR), Mateusz Ostaszewski (Warsaw University of Technology), Krzysztof Jankowski (University of Warsaw), Piotr Miłoś (IM PAN, UW, IDEAS NCBR), and Marek Cygan (UW, Nomagic).

See the paper here: https://arxiv.org/abs/2405.16158

Robot dog in virtual environment running using a previous algorithm

Robot dog in virtual environment running using BRO

The BRO algorithm is designed for training robots in simulations, like the well-known DeepMind Control Suite. In these virtual environments, the algorithm learns to control simulated robots with different morphologies (for example, a humanoid robot or a robot dog). Its task is to learn how to move, without any prior knowledge about the world. If an algorithm like BRO performs well in a complex simulation, we can reasonably assume that it will also learn quickly in the real world – as simulations can closely reflect real-world scenarios.

In one test, BRO was tasked with learning how to move as quickly as possible. In just three hours, it progressed from crawling to running, without any prior understanding of how running should look. In a sense, you could say the algorithm “discovered” how to run on its own.

What sets BRO apart from traditional methods is how efficiently it uses data. Most reinforcement learning systems need massive amounts of data and trial-and-error practice to learn effectively. BRO improves on this in three ways, reflected in its name: it expands the size of the model and makes it more flexible across different tasks (Bigger), it applies strong rules (regularization) to guide the learning process (Regularized), and it uses an exploration strategy that encourages the agent to try new things (Optimistic). As a result, it performs better with less effort and computing time, making it a major step forward in the field of robotics and AI.

In the GIFs, compare a virtual robot dog controlled by a previous algorithm (GIF 1) with one controlled by BRO (GIF 2). The BRO dog runs distinctly better.

See all publications co-authored by IDEAS NCBR researchers at NeurIPS 2024.
