
Continual Learning – what is it?

Artificial neural networks are powerful models used in many fields because they can learn to represent useful features. When we train a neural network, we usually start from random weights and adjust them based on the data available at that moment. This is very different from how humans learn: humans continuously build on their knowledge over time; they are lifelong learners.

This mismatch becomes a problem when we train neural networks on non-stationary data that changes over time. It leads to a phenomenon called “catastrophic forgetting,” where the network overwrites previously learned information when presented with new data.
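The forgetting effect is easy to reproduce even in a toy setting. The sketch below (pure Python; the two toy tasks and all names are illustrative, not a benchmark from the literature) trains a single logistic-regression unit on one labeling rule, then on the opposite rule, and measures how accuracy on the first task collapses:

```python
import random
import math

random.seed(0)

def make_task(flip):
    # Toy task: x in [-1, 1], label 1 when x > 0; flip reverses the rule.
    xs = [random.uniform(-1, 1) for _ in range(200)]
    ys = [1 if (x > 0) != flip else 0 for x in xs]
    return xs, ys

def train(w, b, xs, ys, lr=0.5, epochs=50):
    # Plain SGD on the logistic log-loss.
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1 / (1 + math.exp(-(w * x + b)))
            g = p - y                     # gradient of the log-loss w.r.t. the logit
            w -= lr * g * x
            b -= lr * g
    return w, b

def accuracy(w, b, xs, ys):
    preds = [1 if w * x + b > 0 else 0 for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)

xa, ya = make_task(flip=False)   # task A: positive x -> class 1
xb, yb = make_task(flip=True)    # task B: the opposite rule

w, b = 0.0, 0.0
w, b = train(w, b, xa, ya)
acc_A_before = accuracy(w, b, xa, ya)

w, b = train(w, b, xb, yb)       # sequential training, no access to task-A data
acc_A_after = accuracy(w, b, xa, ya)

print(acc_A_before, acc_A_after)  # task-A accuracy collapses after task B
```

Because nothing in the second training phase reminds the model of task A, the single weight simply flips sign to fit task B, which is exactly the catastrophic part: old knowledge is not compressed or protected, it is overwritten.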

Continual Learning (CL) aims to solve this issue by finding ways for neural networks to learn incrementally without forgetting what they previously learned.

Those who cannot remember the past are condemned to repeat it. – George Santayana

Changes are hard… also for artificial neural networks

All deep learning models suffer from catastrophic forgetting. Neural networks are now deployed widely, in applications such as visual search engines, localization services, and medical diagnostics, yet many of these models become outdated within days or months of deployment, because the data they face drifts away from what they were trained on and naive updates erase what they had learned.

One of the simplest solutions to this issue is retraining the neural network from scratch on an updated dataset. However, this approach is like starting from day zero every time, a tabula rasa for the neural network. Continual Learning addresses the problem differently: instead of merely fine-tuning to new data while forgetting the old (which is largely the domain of transfer learning), it trains the model continuously on a stream of ever-changing data.
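One common ingredient in such stream training is rehearsal: keeping a small memory of past examples and replaying them alongside new data. Below is a minimal sketch of such a memory using reservoir sampling; the class, capacity, and stream are illustrative assumptions, not a specific method from the literature:

```python
import random

random.seed(0)

class ReplayBuffer:
    """Fixed-size memory filled by reservoir sampling, so every example
    seen so far has an equal chance of being kept."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.seen = 0

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Replace a stored item with probability capacity / seen.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

    def sample(self, k):
        # Mini-batch of stored examples to mix into training on new data.
        return random.sample(self.items, min(k, len(self.items)))

buf = ReplayBuffer(capacity=10)
for example in range(1000):   # stand-in for a stream of training examples
    buf.add(example)

print(len(buf.items))         # memory never exceeds its capacity
```

The appeal of reservoir sampling here is that the buffer remains an unbiased snapshot of the whole stream without ever storing more than `capacity` examples, so replaying from it keeps reminding the model of earlier data at a fixed memory cost.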

Eco-friendly or up-to-date – that is the question

Continual learning is an energy-efficient approach. The current approach to deep learning is neither sustainable nor green. According to neural scaling laws, making models larger and training them on more data reliably brings progress, but it is not sustainable: only a few organizations can afford it, as OpenAI's release of ChatGPT illustrates, and the energy required to train such models is prohibitive. Continuing this trend without incremental model adaptation is wasteful.

The way most models are built today is far from energy efficient, and the growing demand for AI/ML solutions will only worsen the energy expenditure. Continual learning methods are designed not to discard what was already learned and to adapt quickly to new conditions, which makes them more energy efficient. This is also why continual learning is one of the main pillars of the Zero-waste Machine Learning in Computer Vision group.

Real-world settings require Continual Learning

In real-world settings, the rapid adaptation and continual learning of neural network models hold significant practical importance for several reasons. Firstly, many real-life applications place constraints on the energy or computational resources available for model retraining. This makes retraining impractical on small devices with limited battery power, such as drones, small autonomous robots, or edge devices, where waiting for retraining results is infeasible.

Imagine a drone delivering a life-support system in a forest, where the weather conditions abruptly change due to an unexpected storm, which was never present in the training data. In such situations, there is no possibility to go back and re-train the model using new data. The model needs to adapt quickly and complete its mission despite the challenging conditions.

Secondly, the open-world environment is subject to constant changes. These changes can range from simple shifts in weather conditions leading to domain data changes to complex legislative alterations in driving laws, resulting in concept drift or the emergence of new categories for self-driving cars. In this context, detecting novel elements and ensuring generalization to new categories becomes vital to avoid costly errors.

Fast adaptation and continual learning of neural network models are imperative in an open-world context where computational resources are limited, and environmental changes are inevitable. Neglecting these aspects may lead to expensive errors with potentially severe repercussions.

Progress of Artificial Intelligence cannot be made without understanding Continual Learning

Continual Learning is considered a fundamental aspect of artificial intelligence. Understanding how a machine learning model can absorb new information without forgetting previously learned samples is something of a holy grail for intelligent autonomous agents: agents that mimic humans' lifelong learning skills and can learn and adapt in a flexible, dynamic way.

Current research directions and projects

  • Continual Representation Learning
  • Generative models in Continual Learning
  • Architecture-based methods and ensemble models for CL
  • Open-world problems and Generalized Category Discovery
  • Test-time adaptation and domain generalization
  • Data perspective on forgetting and stability
  • Bayesian approaches for Continual Learning

Research Team Leader

Bartłomiej Twardowski

Bartłomiej Twardowski is a research team leader at the IDEAS NCBR research institute and a researcher at the Computer Vision Center, Universitat Autònoma de Barcelona. He earned his Ph.D. in 2018, focusing on recommender systems and neural networks. Following his doctoral studies, he served as an assistant professor at Warsaw University of Technology in the AI group for 1.5 years before joining the Computer Vision Center, UAB, for a post-doctoral program. He has been actively involved in various research projects related to DL/NLP/ML (ranging from €40k to €1.4M). He is a Ramón y Cajal fellow. He has wide industry experience (more than 15 years), including international companies, e.g., Zalando, Adform, Huawei, Naspers Group (Allegro), as well as helping startups with research projects (Sotrender, Scattered). Throughout his career, he has published papers in prestigious conferences such as CVPR (2020, 2 papers), NeurIPS (2020, 2023), ICCV (2021, 2023), ICLR (2023, 2024), and ECIR (2021, 2023). He is a member of ELLIS Society and serves as a reviewer for multiple AI/ML conferences, e.g., AAAI, CVPR, ECCV, ICCV, ICML, and NeurIPS. Currently, his research primarily focuses on lifelong machine learning in computer vision, efficient neural network training, transferability and domain adaptation, as well as information retrieval and recommender systems.

