04

Research

Intelligent Algorithms and Learned Data Structures

Piotr Sankowski

Leader of Research Group

Polish computer scientist, habilitated doctor of mathematical sciences. He specializes in algorithms, in particular in the application of algebraic methods in graph algorithms. Associate professor at the Institute of Computer Science, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw

Explainable Algorithmic Tools

Algorithms, especially the ones used in machine learning, promise to aid people in making decisions. In many scenarios even conceptually simple algorithms can deliver better quality answers than humans, e.g., make less mistakes. Still this promise is far from being realized in its full potential. The main problem is that algorithms are not able to give much insight nor explanation why the returned answer is the correct one. Even when we try to back-engineer the reasons why this decision was made, the answer we deduce cannot be understood by humans. This problem could be clearly exemplified by how Deep Blue plays chess.

In case of games, we might continue the philosophical dispute forever whether Deep Blue understands chess or not. However, this becomes a cornerstone issue when deploying algorithmic solutions in real-world. We need to guarantee that algorithms can be understood by people using them, as otherwise they refuse to follow them. Surprisingly, there exist no good tools to deliver explanations even for the most basic algorithms. For example, some of us often doubt whether GPS found the fastest route, and might be trying alternative ones. This despite the fact that finding the shortest path is a well-studied problem and we can even prove that the algorithm works correctly. The problem is rather about explaining the solution to the user, and not the algorithm itself. Moreover, the algorithm might be used by someone that does not have the right algorithmic background, so explaining its decision becomes even more important. The good news is that one does not need to study algorithms for years to see that they work in the right way. For example, one of the main reasons why Google Maps shows alternative routes, is to explain to the user why the returned solution is the best one. This is because the possible alternatives are worse!

Algorithms are becoming more prevalent in our lives due to ongoing digitalization which is further sped-up by the current pandemic situation. Hence, the need to explain their results becomes more immanent. This research aims to deliver a proof of concept of the tools that would provide explanations for the most basic optimization problems, e.g., assignment problem. Assignments are the cornerstone problem in operational research and are used to describe matching of jobs to workers, jobs to servers, page views to ads, residents to hospitals, students to universities or houses to tenants. Furthermore, all decisions taken by algorithms should be explainable as required by “Ethics guidelines for trustworthy AI’’ that apply to complex ML solutions, but to this case as well.

Learned Data-Structures

Artificial intelligence and algorithms are two areas of computer science that have been developed in separation for many years. Only very recently we have been witnessing more and more interactions between these two fields. These interactions lead to mutual influence and development of new tools on both sides. On one hand, we are witnessing the emergence of machine learning tools and algorithms that have been fully analyzed from theoretical point of view, e.g., approximate nearest neighbor search.

On the other hand, ML tools enter as interior components into basic data structures or state-of-the-art approximation algorithms resulting in solutions that have better practical properties, e.g., indices. These new hybrid constructions are called learned data-structures. As the work on these ideas has just started we miss the right framework and tools for implementing state-of-art solutions and thus the research on new tools and models is hampered. This problem could be solved by creating new algorithms and data structure together with a software library that would allow to compose easily different building blocks. This could prove tools to bridge the gap between theory and practice in algorithms and show that new theoretical advances can have practical implications.

WE’RE INCREASING THE R&D&I POTENTIAL IN THE FIELD OF ARTIFICIAL INTELLIGENCE.

dr hab. PIOTR SANKOWSKI

CEO IDEAS NCBR

Algorithmic Tools for Data Science and ML

Data is constantly but surely becoming “the oil of a new age”. However, in order to be able to draw meaningful conclusions from this ever growing information we need to design efficient algorithms for this. This project aims at solving this fundamental problems arising in the context of processing huge data sets. The long term goal is to develop a techniques for designing very fast and possibly scalable algorithms. Graphs a the most common way to capture structures and relations in data and thus are often used in data science.

The most well-known examples include analysis of social networks, e.g., community detection or cascading processes. Similarly, graphs are often used to represent knowledge, e.g., using knowledge graphs. This approach has been visible in the shift from relational to graph perspective in databases. These and other examples show that graphs and graph algorithms are basic tools for representing and analyzing the data. However, sizes of current data sets requires adoption of parallel computation, as even storing them on single machine is often impossible. For example, the World Wide Web consist of more than 5 billion pages, Google Knowledge Graph contains 500 billion facts on 5 billion entities whereas Facebook has 1.88 billion active users daily. In these scenarios, data is usually distributed across many machines, and computation needs to be performed in parallel. Real-world systems are modeled in Massively Parallel Computation (MPC) frameworks, such as MapReduce, Hadoop, Spark, or Flume.

Although, different parallel computation models have be studied for years already. MPC comes with a completely new possibilities as well as requirements. MPC computations are executed in synchronous rounds, but implementing these rounds on real-world systems takes considerable time. One round takes orders of magnitude longer than on classical Cray type system. Thus we would like to solve problems, in particular graph problems, in as few rounds as possible. With this challenge in mind, this project aims to design methods to break barriers that were impossible to overcome using classical techniques and models. More specifically, we are going to work on new algorithmic tools that would improve efficiency of both parallel and non-parallel algorithms used in data science.

One of the main reasons for establishing a research center at the NCBR, dedicated to artificial intelligence and digital economy, was to train researchers to bridge the gap between basic and applied research

dr hab. PIOTR SANKOWSKI

CEO IDEAS NCBR

Piotr Sankowski

Leader of Research Group

Professional experience

  • 2021-today CEO of IDEAS NCBR – research institute dedicated to artificial intelligence and digital economy
  • 2021-today Chief Scientific Officer of MIM Solutions – spin-off created with University of Warsaw
  • 2015-2021 CEO of MIM Solutions – spin-off created with University of Warsaw
  • 2010-today Associate professor at Institute of Informatics at University of Warsaw
  • 2006-2010 Assistant professor at Institute of Informatics at University of Warsaw
  • 2009-2010 Post-doc at “Sapienza” University of Rome
  • 2008 Post-doc at Swiss Federal Institute of Technology Zurich
  • 2006-2008 Post-doc at “Sapienza” University of Rome
  • 2005-2006 Assistant lecturer at Institute of Informatics at University of Warsaw

Awards and achievements

  • 2018 Winner of the individual Crystal Brussels Sprout Award
  • 2018 Winner of the National Science Centre Award
  • 2017 Visiting scientist in the Simons Institute for the Theory of Computing
    in the semester “Bridging Continuous and Discrete Optimization”
  • 2016 Visiting scientist in the Simons Institute for the Theory of Computing in the semester “Algorithms and Uncertainty”
  • 2011 Best Poster Award – 4th ACM International Conference on Web Search
    and Data Mining (WSDM ’11)
  • 2010-2015 ‘Ideas for Poland’ Scholarship awarded by The Foundation for Polish
    Science (FNP)
  • 2010 Second class scientific award from Minister of Higher Education
  • 2005 Winner of Witold Lipski prize for Young Researchers in Computer Science
  • 2005 ‘Stay with Us’ Scholarship awarded by Polityka, Polish weekly magazine
  • 2005 Best Student Paper Award – 13th Annual European Symposium on Algorithms

(ESA’ 05)

  • 2004 Best Student Paper Award – 45th Annual IEEE Symposium on Foundations
    of Computer Science (FOCS’ 04)
  • 2004 Best Student Paper Award – 12th Annual European Symposium on Algorithms (ESA ’04)

Grants

  • 2018-2023 Principal Investigator in the ERC CoG Grant 772346 – Towards Unification of Algorithmic Tools
  • 2015-2018 Site Leader in National Centre for Research and Development in Poland grant – Crime Prediction and Police Force Management System
  • 2015-2018 Principal Investigator in the Polish NCN grant no. 2014/13/B/ST6/01811 – Efficient Planar Graph Algorithms
  • 2015-2017 Principal Investigator in the ERC PoC Grant 680912 – Practical Approximation Algorithms – Proof of Concept
  • 2012-2016 Site Leader in EU-FET 317532 – Foundational Research on MULTIlevel comPLEX networks and systems
  • 2010-2015 Principal Investigator in the ERC StG Grant 259515 – Practical Approximation Algorithms
  • 2009-2012 Principal Investigator in the Polish NCN grant no. N N206 355636 – Approximation Algorithms with Bounded Resources
  • 2006-2007 Principal Investigator in the Polish KBN grant no. 1 PO3A 018 30 –
    Algebraic Graph Algorithms

Research

Algorithms, especially the ones used in machine learning, promise to aid people in making decisions.

Blockchain technology was introduced in 2008.

Skip to content