## 29 March 2022

Data Science and counterintelligence. A case for a new model of terrorist networks detection.

Our postdoc researcher, Mathieu Mari, explaines his recent paper on maximum independent set problem using the metaphore of organizing a party. Let’s get it started then!

Imagine that you are organizing a party for your birthday, and you are thinking about which friends to invite. Ideally, you would like to invite everyone. The problem is that some of your friends are in conflict, and you would naturally prefer to avoid any dispute during your birthday. For instance, you know that Janusz would not come if either Andrzej, Antoni, Jarek, or Mateusz are also invited; that Grażyna cannot stand either Andrzej, Antoni, or Zbigniew; and probably that having Antoni and either Mateusz or Zbigniew in the same place would potentially ruin the party. To decide who to invite to avoid any potential conflict, you can represent this problem by a ‘conflict’ graph by putting an *edge* between any two of your friends that are in conflict.

For a collection of geometric objects in the Euclidean plane (for example disks, squares, rectangles, etc.), we can create an intersection graph that has one node for each object, and two objects are connected by an edge in the graph if they overlap in the plane. Finding an independent set in this intersection graph is equivalent to finding a family of objects in the Euclidean plane that does not intersect.

**The perfect guest list**

To avoid any conflict, you should invite some friends that form an *independent set* in this graph, that is nodes (here “your friends”), such that no two of them are connected by an edge (here “are in conflict”). For instance, you could invite Janusz and Grażyna, but this is not a very satisfying solution because then, you cannot invite anyone else. It seems that here, the optimal choice to have a peaceful party is to invite Zbigniew, Jarek, Mateusz, and Andrzej. In the language of graph theory, we say that they form a *maximum independent set*.

On small graphs, it is easy to find a maximum independent set by hand. However, on larger graphs, the task becomes much harder. Scholars of complexity theory would say that the Maximum Independent set problem is *NP-hard*, which means that unless the famous conjecture P≠NP is disproved, there is no fast algorithm to solve exactly this problem. Håstad showed that the situation is even worse: unless P=NP, there is no fast algorithm that computes a good *approximate* solution.

Håstad’s observation pushed computer scientists to look for good and fast algorithms for the Maximum Independent set problem in more restricted families of graphs. One particular type of graph that has attracted a lot of attention is called *geometric intersection*. For a collection of geometric objects in the Euclidean plane (for example disks, squares, rectangles, etc.), we can create an intersection graph that has one node for each object, and two objects are connected by an edge in the graph if they overlap in the plane. Finding an independent set in this intersection graph is equivalent to finding a family of objects in the Euclidean plane that does not intersect.

Here, the largest set of non-overlapping objects, or equivalently the maximum independent set in its intersection graph, is the set {A,C,E}.

Researchers have tried to understand the approximability of the maximum independent set problem for these geometric intersection graphs. And today, it is now well-understood for most of the most natural geometric objects: Intervals, disks, squares, etc. One important case which remains unsolved is the case of *axis-parallel rectangles*, for which we are still not able to describe the best approximation algorithm. Apart from its importance for the computational geometry community, this question is also motivated by numerous applications in practice including for instance channel admission control, map labeling, and data mining. Now a bit of history on the *maximum independent set problem of Rectangles* (in short MISR). There are multiple O(log n)-approximation algorithms. It had been a long-standing open problem to find a constant-approximation algorithm for MISR. One possible approach for this is to compute an optimal solution to the canonical *linear programming relaxation* of MISR and round it. This approach was used by __Chalermsook__ and __Chuzhoy__ to obtain an O(log log n)-approximation. On the other hand, it seems likely that MISR admits even a (1+ε)-approximation, given that it admits a QPTAS due to Adamaszek and Wiese.

**A new approach to MISR.**

Recently, in a breakthrough result presented in FOCS 2021, Mitchell proposed the first constant-approximation algorithm for the maximum independent set problem of rectangles and consequently solved the aforementioned long-standing open problem. Instead of rounding the linear program, he employs a recursive partitioning of the plane into a special type of rectilinear polygons called corner-clipped rectangles (CCRs). Given a CCR, he recursively subdivides it into at most five smaller CCRs until he obtains CCRs which essentially contain at most one rectangle from the optimal solution OPT each. In the end, he outputs the rectangles contained in these final CCRs plus some carefully chosen rectangles from OPT that are intersected by these recursive cuts. With a *dynamic program*, he computes the recursive partition that yields the largest number of output rectangles, which in particular “remembers” in each step a constant number of rectangles that were intersected by some previous cuts. In a structural theorem, he shows that there exists a set of at least |OPT|/10 rectangles that can be output by such a recursive partitioning, leading to the approximation ratio of 10. The structural theorem is proved using an exhaustive case analysis for defining the subdivision of a given CCR, with sixty cases in total.

Mitchell’s result yields several interesting open questions, most notably whether one can improve the approximation ratio and whether one can give a simpler analysis that does not rely on a large case distinction.

In a paper presented at SODA 22, Mathieu Mari and his co-authors answered both questions in the affirmative. In particular, they obtained a (2+ ε)-approximation algorithm for MISR which is also based on a recursive partitioning scheme. First, they use a partition into a class of axis-parallel polygons with constant complexity each more general than CCRs. This allows them to provide an arguably simpler analysis and at the same time already improves the approximation ratio to 4. Then, using a more elaborate charging scheme and a recursive partitioning into general axis-parallel polygons with constant complexity, they improve the approximation ratio to 2__+__ ε. In particular, they construct a recursive partitioning based on more general fences which can be sequences of up to 1/ε line segments each. This partitioning routine and their other new ideas may be useful for future work towards a (1+ε)-approximation for MISR. For more details, see the paper published on arxiv: https://arxiv.org/abs/2106.00623

To prove that the approximation ratio is bounded by 2+ ε, the authors had to develop an intricate charging scheme*. *This work has contributed to simplifying and exploiting the potential of Michell’s idea for the maximum independent set problem of rectangles. In particular, the new best approximation ratio for this problem can be arbitrarily close to 2. It seems that to improve further this approximation, and eventually exhibit the hidden PTAS, substantially new ideas will be needed. In the meantime, don’t worry even if you haven’t found the optimal independent set of friends for your party. Since the laws of friendship are usually less strict and formal than those of mathematics perhaps you could also just give Janusz, Grażyna and Jarek chance to clear things up between them?

**Author: Mathieu Mari**

Mathieu Mari is a postdoctoral fellow in IDEAS-NCBR and the University of Warsaw. Before coming to Poland, he achieved his Ph.D. at École Normale Supérieure de Paris, under the supervision of Claire Mathieu and Chien-Chung Huang. Mari’s work focuses on the design and the analysis of algorithms, including approximation algorithms, graph algorithms, and computational geometry. Recently, his interest is turning towards artificial intelligence and machine learning topics.

Cover photo: Freepik

Data Science and counterintelligence. A case for a new model of terrorist networks detection.

This time we talk about challenges related to the formulation of a research problem and about ideas for interesting research with Dr. Tomasz Michalak, leader of IDEAS NCBR’s work team.

**Two researchers from IDEAS NCR have been accepted into the ELLIS Society (European Laboratory for Learning and Intelligent Systems). This prestigious European association brings together experts in the field of artificial intelligence.**

Dear User

we use cookies on this website to facilitate its use. This happens by saving information on the device you use when visiting our website. Taking care of your right to privacy, you can set your preferences regarding the use of cookies below. These settings can be changed at any time. However, please note that blocking certain types of cookies may have a negative impact on the use of our website and the services we offer.

Some of the cookies we use are essential for the website to function properly. You cannot change the use of these cookies in your preferences settings.

For more information on the types of cookies used, see Cookies Policy

we use cookies on this website to facilitate its use. This happens by saving information on the device you use when visiting our website. Taking care of your right to privacy, you can set your preferences regarding the use of cookies below. These settings can be changed at any time. However, please note that blocking certain types of cookies may have a negative impact on the use of our website and the services we offer.

Some of the cookies we use are essential for the website to function properly. You cannot change the use of these cookies in your preferences settings.

For more information on the types of cookies used, see Cookies Policy

Manage consent

Other uncategorized cookies are those that are analyzed and have not yet been categorized.