Polish computer scientist holding a habilitation in machine vision, professor at the Warsaw University of Technology and the Jagiellonian University. Member of the ELLIS Society.
Today, both science and industry rely heavily on machine learning models, predominantly artificial neural networks, that are becoming increasingly complex and demand significant computational resources. Although their proliferation confirms the effectiveness of those models, the rising expectations of everyday users, combined with the growth of available digital data, motivate computer scientists to build neural network models of ever-growing complexity, with parameter counts reaching billions (as in GPT-3) or even trillions. These new complex models are resource-hungry, and their computations require heavy infrastructure, which is expensive and consumes a significant amount of energy. This trend can be observed across a multitude of applications, especially those revolving around computer vision, such as medical image processing or robotics. As these applications become fundamental building blocks of our digital economy, tackling the problem of machine learning efficiency is critical, especially given that contemporary models are about to hit the inevitable resource limits of the existing computational infrastructure.
Existing approaches to reducing this burden either constrain the optimization to a limited budget of computational resources or attempt to compress models. However, they rarely look at computational resources from the perspective of a green, sustainable economy. In our group we plan to build on this inspiration and, instead of limiting the training of machine learning models, we ask a different question: how can we make the best of the information, resources, and computations that we already have access to? Instead of constraining the number of computations or memory used by the models, we focus on reusing what is available to them: computations done in the previous processing steps, partial information accessible at run-time, or knowledge gained by the model during previous training sessions in continually learned models. We look at the research problem of efficient machine learning from the computation recycling perspective and propose methods that build upon our previous works and preliminary results. Our project is therefore focused on creating models that learn to be efficient, rather than just to solve a given task. To that end, we hypothesize that recycling the resources used by machine learning models can significantly increase their efficiency. Driven by this hypothesis, we aim to initiate a new research path of zero-waste machine learning focused on saving the computations of machine learning models and reducing their resource usage.
Real-world market applications require efficient machine learning models. For example, to deploy a reinforcement learning system in production, policy inference must run in real time, while long inference latency in autonomous cars can lead to fatal accidents. Furthermore, the performance of contemporary robots suffers from the delay between measuring a system state and acting upon it. Our research will have a direct impact on the efficiency and robustness of latency-sensitive real-life applications of machine learning, hence opening new commercialization opportunities for R&D work. In our research, we will follow three main directions: computation recycling, leveraging partial information, and accumulating knowledge in continual learning models.
Conditional computation methods speed up decision-making by adjusting the internal processing path of a network based on the input signal. Most recent approaches that reduce the inference time of machine learning models focus on shortening decision paths in latency-critical applications. They discard, however, information processed at earlier stages, while information recycling reduces waste of computation resources as well as offers a better trade-off between the model’s accuracy and its computational cost.
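The idea of an early exit that recycles earlier predictions, instead of discarding them, can be illustrated with a minimal sketch. It assumes a model split into sequential stages, each returning an updated hidden state and class logits from an attached internal classifier. The function name run_with_recycling, the stage interface, the running-average combination rule, and the confidence threshold are all illustrative assumptions, not the group's actual method.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def run_with_recycling(stages, x, threshold=0.9):
    """Run stages in order and stop once the recycled prediction is confident.

    stages: list of callables; each maps hidden -> (new_hidden, logits).
            This interface is a hypothetical stand-in for network blocks
            with internal classifiers attached.
    Returns (predicted_class, number_of_stages_executed).
    """
    if not stages:
        raise ValueError("at least one stage is required")

    hidden = x
    running = None  # running average of the probabilities from all exits so far
    used = 0
    for depth, stage in enumerate(stages):
        hidden, logits = stage(hidden)
        probs = softmax(logits)
        if running is None:
            running = probs
        else:
            # Recycle earlier exits: fold the new prediction into the average
            # instead of throwing the previous predictions away.
            running = [(r * depth + p) / (depth + 1)
                       for r, p in zip(running, probs)]
        used = depth + 1
        if max(running) >= threshold:
            break  # confident enough: later stages are never executed

    predicted = max(range(len(running)), key=running.__getitem__)
    return predicted, used
```

Because the remaining stages are never called once the threshold is reached, the saved computation grows with how early the combined prediction becomes confident. Averaging is only one simple way to combine exits; a learned combination could be substituted.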
Partial evidence methods leverage additional information available for the network at run-time to increase the quality of prediction without retraining the entire model. They can save computations when partial information about the input data is available. For instance, when classifying images of objects while being certain that the photos were taken at the beach, we can drastically reduce the amount of resources used for computations. We plan to integrate novel network architectures to further exploit available information.
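The beach example can be made concrete with a small sketch. It assumes the model has already produced class probabilities and that the partial evidence is expressed as a set of classes compatible with the known context; the function name condition_on_evidence and the class labels are hypothetical. This only shows how evidence narrows the prediction after the fact; the architectures mentioned above would use such evidence inside the network.

```python
def condition_on_evidence(probs, allowed):
    """Restrict a class distribution to classes compatible with known context.

    probs:   dict mapping class label -> probability predicted by the model.
    allowed: set of labels consistent with the partial evidence
             (for example, objects that can plausibly appear on a beach).
    Returns a renormalized dict over the allowed classes only.
    """
    masked = {label: p for label, p in probs.items() if label in allowed}
    total = sum(masked.values())
    if total == 0:
        raise ValueError("evidence rules out every class the model considers")
    return {label: p / total for label, p in masked.items()}
```

For instance, if a model spreads its probability over "crab", "spider" and "umbrella", and the context rules out "spider", the remaining mass is redistributed over the two compatible classes.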
Continual learning methods accumulate knowledge from data arriving as a stream, building on previously gained skills without catastrophically forgetting them. They efficiently accumulate knowledge about previously seen data and reuse it when new data arrives during training. This knowledge accumulation mechanism inspires us to look at the computational complexity of models from the perspective of continual learning and to investigate the efficiency gains it offers. The zero-waste machine learning paradigm sets a framework for efficient model training and inference, and it also encompasses existing mechanisms, such as those in continually learned models, which must accumulate knowledge efficiently to avoid forgetting past abilities. This behavior can be modeled in the latent space of generative models, and we aim to further investigate the efficiency of knowledge accumulation within and beyond continually learned models.
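One common way to retain knowledge from a data stream is to keep a small memory of past examples and replay them alongside new data. The text above does not name a specific mechanism, so the following is only a sketch of that general idea using reservoir sampling; the class name ReservoirReplayBuffer and its parameters are assumptions made for illustration.

```python
import random


class ReservoirReplayBuffer:
    """Fixed-size memory holding a uniform random sample of a data stream."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of examples observed so far
        self.rng = random.Random(seed)

    def add(self, example):
        """Observe one example from the stream (reservoir sampling, Algorithm R)."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep the new example with probability capacity / seen,
            # overwriting a uniformly chosen stored example.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        """Draw up to k stored examples to mix into the next training batch."""
        return self.rng.sample(self.items, min(k, len(self.items)))
```

During training on a new task, a batch would typically combine fresh examples with a few drawn from sample, so that earlier data keeps influencing the model at a fixed memory cost.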
The main novelty of our research lies in its focus on eco-friendly machine learning models. More efficient models will need fewer resources, consume less energy, and therefore have a smaller impact on the environment. This focus is not only in line with the paradigms of a sustainable zero-waste economy, but also leads to models that learn to be efficient and not just to solve a given task. We believe that this approach will allow us to make a real difference in the creation of a sustainable economy and will have high deployment potential.
Tomasz Trzciński, Professor at the Warsaw University of Technology, leads the CVLab machine vision team. He is also a member of the GMUM machine learning team at the Jagiellonian University and the leader of the machine vision research group at IDEAS NCBR. He obtained his habilitation at the Warsaw University of Technology in 2020, a PhD in machine vision at the École Polytechnique Fédérale de Lausanne in 2014, and a double master’s degree at the Universitat Politècnica de Catalunya and Politecnico di Torino. He completed research internships at Stanford University in 2017 and at Nanyang Technological University in 2019. He is an Associate Editor at IEEE Access and MDPI Electronics, a reviewer for the TPAMI, IJCV, CVIU, TIP and TMM journals, and a member of the organizing committees of conferences including CVPR, ICCV and ICML. He worked at Google in 2013, Qualcomm in 2012 and Telefónica in 2010. He is a Senior Member of IEEE, a member of the ELLIS Society and the ALICE Collaboration at CERN, and an expert of the National Science Centre and the Foundation for Polish Science. He is a co-owner of Tooploox, where as Chief Scientist he leads a machine learning team, and a co-founder of the technology startup Comixify, which uses artificial intelligence methods for video editing.
His research interests include machine vision (simultaneous localization and mapping, visual search), machine learning (deep neural networks, generative models, continual learning), and representation learning (binary descriptors).