{"id":24498,"date":"2024-12-03T10:57:37","date_gmt":"2024-12-03T10:57:37","guid":{"rendered":"https:\/\/ideas-ncbr.pl\/?p=24498"},"modified":"2024-12-06T11:14:52","modified_gmt":"2024-12-06T11:14:52","slug":"bro-algorithm-neurips-2024-spotlight","status":"publish","type":"post","link":"https:\/\/ideas-ncbr.pl\/en\/bro-algorithm-neurips-2024-spotlight\/","title":{"rendered":"BRO algorithm: NeurIPS 2024 spotlight"},"content":{"rendered":"\n<p>The paper <strong>Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control<\/strong> is co-authored by Micha\u0142 Nauman (ex-IDEAS NCBR), Mateusz Ostaszewski (Warsaw University of Technology), Krzysztof Jankowski (University of Warsaw), Piotr Mi\u0142o\u015b (IM PAN, UW, IDEAS NCBR), Marek Cygan (UW, Nomagic).<\/p>\n\n\n\n<p>See the paper here: <a href=\"https:\/\/arxiv.org\/abs\/2405.16158\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/arxiv.org\/abs\/2405.16158<\/a>  <\/p>\n\n\n\n<div style=\"height:10px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-media-text is-stacked-on-mobile\"><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"268\" height=\"200\" src=\"https:\/\/ideas-ncbr.pl\/wp-content\/uploads\/2024\/12\/dog_BAC_gif_c.gif\" alt=\"\" class=\"wp-image-24499 size-full\"\/><\/figure><div class=\"wp-block-media-text__content\">\n<p>Robot dog in a virtual environment, running with a previous algorithm<\/p>\n<\/div><\/div>\n\n\n\n<div style=\"height:10px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-media-text is-stacked-on-mobile\"><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"200\" height=\"200\" src=\"https:\/\/ideas-ncbr.pl\/wp-content\/uploads\/2024\/12\/dog_BRO_c_10s.gif\" alt=\"\" class=\"wp-image-24501 size-full\"\/><\/figure><div class=\"wp-block-media-text__content\">\n<p>Robot dog in a virtual 
environment, running with BRO<\/p>\n<\/div><\/div>\n\n\n\n<div style=\"height:10px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>The BRO algorithm is designed for training robots in simulations, like the well-known DeepMind Control Suite. In these virtual environments, the algorithm learns to control simulated robots with different morphologies (for example, a humanoid robot or a robot dog). Its task is to learn how to move, without any prior knowledge about the world. If an algorithm like BRO performs well in a complex simulation, we can reasonably assume that it will also learn quickly in the real world \u2013 as simulations can closely reflect real-world scenarios.<\/p>\n\n\n\n<p>In one test, BRO was tasked with learning how to move as quickly as possible. In just three hours, it progressed from crawling to running, without any prior understanding of how running should look. In a sense, you could say the algorithm &#8220;discovered&#8221; how to run on its own.<\/p>\n\n\n\n<p>What sets BRO apart is its data efficiency: most reinforcement learning systems need massive amounts of data and trial-and-error practice to learn effectively. BRO improves on this by scaling up the model while keeping learning stable across different tasks. It uses strong regularization to guide the learning process and an optimistic exploration strategy that encourages trying new things, so it makes better use of the data it collects. As a result, BRO performs better with less data and computing time, making it a major step forward in the field of robotics and AI.<\/p>\n\n\n\n<p>The gifs show the difference between a virtual robot dog controlled by another algorithm (GIF 1) and one controlled by BRO (GIF 2). 
The BRO dog runs distinctly better.<\/p>\n\n\n\n<div style=\"height:100px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>See <strong><span style=\"text-decoration: underline;\"><a href=\"https:\/\/ideas-ncbr.pl\/en\/ideas-ncbr-at-neurips-2024\/\" data-type=\"link\" data-id=\"https:\/\/ideas-ncbr.pl\/en\/ideas-ncbr-at-neurips-2024\/\">all publications<\/a><\/span><\/strong> co-authored by IDEAS NCBR researchers at NeurIPS 2024.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Let\u2019s focus on the publication: &#8220;Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control,&#8221; awarded spotlight at NeurIPS 2024.<\/p>\n","protected":false},"author":27,"featured_media":24504,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":"","_links_to":"","_links_to_target":""},"categories":[79,86],"tags":[],"class_list":["post-24498","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news-en-2","category-news-en-en"],"acf":[],"_links":{"self":[{"href":"https:\/\/ideas-ncbr.pl\/en\/wp-json\/wp\/v2\/posts\/24498","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ideas-ncbr.pl\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ideas-ncbr.pl\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ideas-ncbr.pl\/en\/wp-json\/wp\/v2\/users\/27"}],"replies":[{"embeddable":true,"href":"https:\/\/ideas-ncbr.pl\/en\/wp-json\/wp\/v2\/comments?post=24498"}],"version-history":[{"count":3,"href":"https:\/\/ideas-ncbr.pl\/en\/wp-json\/wp\/v2\/posts\/24498\/revisions"}],"predecessor-version":[{"id":24565,"href":"https:\/\/ideas-ncbr.pl\/en\/wp-json\/wp\/v2\/posts\/24498\/revisions\/24565"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ideas-ncbr.pl\/en\/wp-json\/wp\/v2\/media\/24504"}],"wp:attachment":[{"href":"htt
ps:\/\/ideas-ncbr.pl\/en\/wp-json\/wp\/v2\/media?parent=24498"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ideas-ncbr.pl\/en\/wp-json\/wp\/v2\/categories?post=24498"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ideas-ncbr.pl\/en\/wp-json\/wp\/v2\/tags?post=24498"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}