06/03/2025

Seminario IC - Marzo 2025

Exact approaches for the hyper-rectangular clustering problem

with axis-parallel clusters

Diego Delle Donne

Information systems, Decisions Sciences and Statistics Department

ESSEC Business School of Paris, France

Miércoles 12 de marzo a las 14hs

Cero+Infinito, 1er piso, Salón 2119 (Esquina)

Resumen:

Given a set of points in the d'dimensional space, a hyper-rectangle
clustering is a partition of the set of points into groups called
clusters, where each cluster is determined by the smallest
hyper-rectangle containing the points in the cluster. We address the
problem of finding a hyper-rectangle clustering of minimum size.

Machine Learning techniques are being widely used to understand large
amounts of data. The main goal of these techniques is usually to
provide a set of rules that can interpret the data and serve as
predictive models. These rules are however not always easily
explainable to humans. Hyper-rectangular clustering has been proposed
as a model for "explainable clustering", i.e., a procedure not only
providing a partition of the points into clusters but also providing
the rationale for the inclusion of each point in its corresponding
cluster.

On a first basis, we present mixed-integer programming (MIP)
formulations to solve the problem. On a second basis, as the proposed
MIP formulations are proved to be quite limited for real-size
instances, we develop an incremental exact strategy which takes
advantage of the capacity of the proposed models to solve small
instances to optimality. The core idea of this algorithm is to start
by solving an instance with a small subset of points and iteratively
adding more points if the obtained solution does not cover all the
required points. We prove that as soon as a solution covers the whole
set of points from the instance, then the solution is an optimal
solution for the original problem.

Finally, we compare the efficiency of the developed methods with an
exhaustive computational experimentation in which we show that the new
approach is able to solve to optimality instances of higher orders of
magnitude.