
Continual Learning – what is it?

Artificial neural networks are powerful models used in many fields because they can learn to represent useful features. When we train a neural network, we usually start from a random initialization and adjust the weights based on the data available at that moment. This, however, is very different from how humans learn: humans continuously build on their knowledge over time; they are lifelong learners.

The problem appears when we try to train neural networks in a similarly incremental way, on non-stationary data that changes over time: the network forgets previously learned information when presented with new data, a phenomenon known as “catastrophic forgetting.”
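The effect is easy to reproduce. Below is a minimal PyTorch sketch – a toy setup on synthetic data, with illustrative helper names and hyperparameters assumed for this example only – that trains a small network on one task and then naively fine-tunes it on a second, shifted task; accuracy on the first task typically collapses, which is exactly the catastrophic forgetting described above.

import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(shift):
    # Two synthetic 2-D classification tasks; `shift` moves the data so the tasks differ.
    x = torch.randn(512, 2) + shift
    y = (x[:, 0] > shift[0]).long()  # label by the first coordinate
    return x, y

task_a = make_task(torch.tensor([0.0, 0.0]))
task_b = make_task(torch.tensor([5.0, 5.0]))

model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def train(x, y, epochs=200):
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def accuracy(x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

train(*task_a)
print("Task A accuracy after training on A:", accuracy(*task_a))  # close to 1.0 in this toy setup
train(*task_b)  # naive fine-tuning on the new task only
print("Task A accuracy after training on B:", accuracy(*task_a))  # typically drops towards chance level

Nothing here is exotic: any model fine-tuned on data from a new distribution, without access to the old one, behaves the same way.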

Continual Learning (CL) aims to solve this issue by finding ways for neural networks to learn incrementally without forgetting what they previously learned.

Those who cannot remember the past are condemned to repeat it. – George Santayana

Change is hard… also for artificial neural networks

All deep learning models suffer from catastrophic forgetting. Neural networks are now deployed in many applications, such as visual search engines, localization services, and medical diagnostics, and because of catastrophic forgetting many of these models become outdated within days or months of deployment: when faced with new data, they cannot take it in without losing what they had already learned.

One of the simplest solutions is to retrain the neural network from scratch on an updated dataset. However, this means starting from day zero every time – a tabula rasa for the neural network. Continual Learning addresses the problem differently: instead of merely fine-tuning the model to new data and forgetting the old (which is largely the domain of transfer learning), it trains the model continuously on a stream of ever-changing data, as illustrated in the sketch below.
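To make this concrete, here is a minimal sketch of experience replay, one widely used family of continual-learning methods, shown only as an illustration under the same toy setup as above and not as this group’s specific approach: the model keeps a small buffer of examples from earlier data and rehearses them while it learns from new data, so it adapts without forgetting and without retraining from scratch.

import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(shift):
    x = torch.randn(512, 2) + shift
    y = (x[:, 0] > shift[0]).long()
    return x, y

task_a = make_task(torch.tensor([0.0, 0.0]))
task_b = make_task(torch.tensor([5.0, 5.0]))

model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def accuracy(x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

# Phase 1: learn the first task, then keep a tiny memory of its examples.
xa, ya = task_a
for _ in range(200):
    opt.zero_grad()
    loss_fn(model(xa), ya).backward()
    opt.step()
buffer_x, buffer_y = xa[:64], ya[:64]  # small replay buffer of old data

# Phase 2: learn the second task while rehearsing the buffer in every update,
# instead of retraining from scratch on the union of both datasets.
xb, yb = task_b
for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(model(xb), yb) + loss_fn(model(buffer_x), buffer_y)
    loss.backward()
    opt.step()

print("Task A accuracy with replay:", accuracy(*task_a))  # usually stays far above the naive result
print("Task B accuracy with replay:", accuracy(*task_b))

Replay is only one option; regularization-based and architecture-based methods pursue the same goal of reusing what was already learned while the model keeps absorbing new data.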

Eco-friendly or up-to-date – that is the question

Continual learning is an energy-efficient approach; the dominant way of doing deep learning today is neither sustainable nor green. According to neural scaling laws, making models larger and acquiring more data does bring progress, but only a few organizations can afford this path – OpenAI with the release of ChatGPT being one example – and the energy required to train such models is prohibitive. Continuing this trend without incremental model adaptation is wasteful and unsustainable.

The way most models are built today has little in common with energy efficiency, and the growing demand for AI/ML solutions will only make the energy expenditure worse. Continual learning methods are designed not to waste what has already been learned and to adapt quickly to new conditions, which makes them more energy efficient. This is also one reason continual learning is among the main pillars of the Zero-waste Machine Learning in Computer Vision Group.

Real-world settings require Continual Learning

In real-world settings, rapid adaptation and continual learning of neural network models are of significant practical importance for several reasons. Firstly, in many real-life applications the energy and computational resources available for retraining are limited. This makes it impractical to retrain a model on small devices with limited battery power – drones, small autonomous robots, or edge devices – and waiting for a full retraining to finish is simply not feasible in such scenarios.

Imagine a drone delivering a life-support system in a forest when the weather abruptly changes due to an unexpected storm – conditions that were never present in the training data. In such a situation there is no way to go back and retrain the model on new data; it needs to adapt quickly and complete its mission despite the challenging conditions.

Secondly, the open-world environment is subject to constant change. These changes range from simple shifts in weather conditions, which alter the data domain, to complex changes in driving legislation, which cause concept drift or introduce new categories that a self-driving car must handle. In this context, detecting novel elements and generalizing to new categories becomes vital to avoid costly errors.

Fast adaptation and continual learning of neural network models are imperative in an open-world context where computational resources are limited, and environmental changes are inevitable. Neglecting these aspects may lead to expensive errors with potentially severe repercussions.

Progress in Artificial Intelligence cannot be made without understanding Continual Learning

Continual Learning is considered a fundamental aspect of artificial intelligence. Understanding how a machine learning system can learn and adapt to new information without forgetting previously learned samples is something of a holy grail for intelligent autonomous agents – agents that mimic humans’ lifelong learning skills and can learn and adapt in a flexible, dynamic way.

Current research directions and projects

  • Continual Representation Learning
  • Generative models in Continual Learning
  • Architecture-based methods and ensemble models for CL
  • Open-world problems and Generalized Category Discovery
  • Test-time adaptation and domain generalization
  • Data perspective on forgetting and stability
  • Bayesian approaches for Continual Learning

Research team leader


Bartłomiej Twardowski

Bartłomiej Twardowski is a research team leader at the IDEAS NCBR research institute and a researcher at the Computer Vision Center, Universitat Autònoma de Barcelona. He earned his Ph.D. in 2018, specializing in recommender systems and neural networks. Following his doctoral studies, he spent 1.5 years as an assistant professor in the AI group at Warsaw University of Technology before embarking on a post-doctoral program at the Computer Vision Center, UAB. With over 12 years of industry experience, including international giants like Zalando, Adform, Huawei, and Naspers Group (Allegro), Bartłomiej has also collaborated with startups on research projects, such as Sotrender and Scattered. His research portfolio encompasses a range of projects, from €40k to €1.4M, and he’s a Ramón y Cajal fellow. He boasts a prolific publication history in esteemed conferences like CVPR (2020, 2 papers), NeurIPS (2020), ICCV (2021, 2023), ICLR (2023), and ECIR (2021, 2023). Bartłomiej is also an active reviewer for renowned AI/ML conferences, including AAAI, CVPR, ECCV, ICCV, ICML, and NeurIPS. Currently, his research is centered on lifelong machine learning in computer vision, efficient neural network training, transferability and domain adaptation, alongside information retrieval and recommender systems.


Other research groups and teams

  • AI for Security – Amongst other things, we develop multi-level management systems to protect critical infrastructure as well as systems for securing key state services against both kinetic and cyber threats.
    Tomasz Michalak
  • Psychiatry and Computational Phenomenology – Most mental disorders are highly complex and have high phenotypic variability, partially vague diagnostic criteria, and a significant overlap ratio.
    Marcin Moskalewicz
  • Learning in control, graphs and networks – The team develops neural networks that generate graphs. These solutions are oriented towards the automatic design of structures.
    Paweł Wawrzyński