Fixed-Parameter Tractable Certified Algorithms for Covering and Dominating in Planar Graphs and Beyond
ABSTRACT. For a positive real $\gamma \geq 1$, a $\gamma$-certified algorithm for a vertex-weighted graph optimization problem is an algorithm that, given a weighted graph $(G,w)$, outputs a re-weighting of the graph obtained by scaling each weight individually with a factor between $1$ and $\gamma$, along with a solution which is optimal for the perturbed weight function. Here we provide $(1+\varepsilon)$-certified algorithms for Dominating Set and H-Subgraph-Free-Deletion which, for any $\varepsilon > 0$, run in time $f(1/\varepsilon) \cdot n^{O(1)}$ on minor-closed classes of graphs of bounded local tree-width and polynomially-bounded weights. We obtain our algorithms as corollaries of a more general result establishing FPT-time certified algorithms for problems admitting, at an intuitive level, certain `local solution-improvement properties'. These results improve -- in terms of generality, running time and parameter dependence -- on Angelidakis, Awasthi, Blum, Chatziafratis and Dan's XP-time $(1+\varepsilon)$-certified algorithm for Independent Set on planar graphs (ESA2019). Furthermore, our methods are also conceptually simpler: our algorithm is based on elementary local re-optimizations inspired by Baker's technique, as opposed to the heavy machinery of the Sherali-Adams hierarchy required in previous work.
Finding Induced Subgraphs from Graphs with Small Mim-Width
ABSTRACT. In the last decade, algorithmic frameworks based on a structural graph parameter called mim-width have been developed to solve generally NP-hard problems. However, it is known that the frameworks cannot be applied to the Clique problem, and the complexity status of many problems of finding dense induced subgraphs remains open when parameterized by mim-width. In this paper, we investigate the complexity of the problem of finding a maximum induced subgraph that satisfies prescribed properties from a given graph with small mim-width. We first give a meta-theorem implying that various induced subgraph problems are NP-hard for bounded mim-width graphs. Moreover, we show that some problems, including Clique and Induced Cluster Subgraph, remain NP-hard even for graphs with (linear) mim-width at most 2. In contrast to the intractability, we provide an algorithm that, given a graph and its branch decomposition with mim-width at most 1, solves Induced Cluster Subgraph in polynomial time. We emphasize that our algorithmic technique is applicable to other problems such as Induced Polar Subgraph and Induced Split Subgraph. Since a branch decomposition with mim-width at most 1 can be constructed in polynomial time for block graphs, interval graphs, permutation graphs, cographs, distance-hereditary graphs, convex graphs, and their complement graphs, our positive results reveal the polynomial-time solvability of various problems for these graph classes.
ABSTRACT. We study the following Independent Stable Set problem. Let G be an undirected graph and M = (V(G), I) be a matroid whose elements are the vertices of G. For an integer k\geq 1, the task is to decide whether G contains a set S\subseteq V(G) of size at least k which is independent (stable) in G and independent in M. This problem generalizes several well-studied algorithmic problems, including Rainbow Independent Set, Rainbow Matching, and Bipartite Matching with Separation. We show that
- When the matroid M is represented by the independence oracle, then for any computable function f, no algorithm can solve Independent Stable Set using f(k) \cdot n^{o(k)} calls to the oracle.
- On the other hand, when the graph G is of degeneracy d, then the problem is solvable in time O((d+1)^k n), and hence is FPT parameterized by d+k. Moreover, when the degeneracy d is not a part of the input, the problem admits a kernel polynomial in k. More precisely, we prove that for every integer d\geq 0, the problem admits a kernelization algorithm that in time n^{Oh(d)} outputs an equivalent framework with a graph on 2^{O(d)}k^{O(d)} vertices. A lower bound complements this when d is part of the input: Independent Stable Set does not admit a polynomial kernel when parameterized by k+d unless NP \subseteq coNP/poly. This lower bound holds even when M is a partition matroid.
- Another set of results concerns the scenario when the graph G is chordal. In this case, our computational lower bound excludes an FPT algorithm when the input matroid is given by its independence oracle. However, we demonstrate that Independent Stable Set can be solved in 2^{O(k)}||M||^{O(1)} time when M is a linear matroid given by its representation. In the same setting, Independent Stable Set does not have a polynomial kernel when parameterized by k, unless NP\subseteq coNP/poly.
The Simultaneous Interval Number: a New Width Parameter That Measures the Similarity to Interval Graphs
ABSTRACT. We propose a novel way of generalizing the class of interval graphs, via a graph width parameter called the simultaneous interval number. This parameter is related to the simultaneous representation problem for interval graphs and defined as the smallest number d of labels such that the graph admits a d-simultaneous interval representation, that is, an assignment of intervals and label sets to the vertices such that two vertices are adjacent if and only if the corresponding intervals, as well as their label sets, intersect. We show that this parameter is NP-hard to compute and give several bounds for the parameter, showing in particular that it is sandwiched between pathwidth and linear mim-width. For classes of graphs with bounded parameter values, assuming that the graph is equipped with a simultaneous interval representation with a constant number of labels, we give FPT algorithms for the clique, independent set, and dominating set problems, and hardness results for the independent dominating set and coloring problems.
Approximation Algorithms for the Airport and Railway Problem
ABSTRACT. In this paper, we present approximation algorithms for the airport and railway problem (\AR) on several classes of graphs.
The $\AR$ problem, introduced by \cite{ar16}, is a combination of the Capacitated Facility Location problem (\CFL) and the network design problem.
An $\AR$ instance consists of a set of points (cities) $V$ in a metric $d(.,.)$, each of which is associated with a non-negative cost $f_v$ and a number $k$, which represent respectively the cost of establishing an airport (facility) in the corresponding point, and the universal airport capacity.
A feasible solution is a network of airports and railways providing services to all cities without violating any capacity, where railways are edges connecting pairs of points, with their costs equivalent to the distance between the respective points. The objective is to find such a network with the least cost. In other words, find a forest, each component having at most $k$ points and one open facility, minimizing the total cost of edges and airport opening costs.
Adamaszek et al.~\cite{ar16} presented a PTAS for $\AR$ in the two-dimensional Euclidean metric $\mathbb{R}^2$ with a uniform opening cost. In subsequent work
\cite{ar18} presented a bicriteria $\frac{4}{3}\z(2+\frac{1}{\alpha}\y)$-approximation algorithm for $\AR$ with non-uniform opening costs but violating the airport capacity by a factor of $1+\alpha$, i.e.~$(1+\alpha)k$ capacity where $0<\alpha\le1$, a $\z(2+\frac{k}{k-1}+\ve\y)$-approximation algorithm and a bicriteria Quasi-Polynomial Time Approximation Scheme (QPTAS) for the same problem in the Euclidean plane $\mathbb{R}^2$.
In this work, we give a $2$-approximation for $\AR$ with a uniform opening cost for general metrics and an $O(\log n)$-approximation for non-uniform opening costs.
We also give a QPTAS for $\AR$ with a uniform opening cost in graphs of bounded treewidth and a QPTAS for a slightly relaxed version in the non-uniform setting.
The latter implies $O(1)$-approximation on graphs of bounded doubling dimensions, graphs of bounded highway dimensions and planar graphs in quasi-polynomial time.
A Logarithmic Integrality Gap for Generalizations of Quasi-Bipartite Instances of Directed Steiner Tree
ABSTRACT. In the classic Directed Steiner Tree problem (DST), we are given an edge-weighted directed graph G = (V,E) with n nodes, a specified root node r, and k terminals X. The goal is to find the cheapest subset of edges F such that r can reach any terminal using only edges in F.
Designing approximation algorithms for DSTis quite challenging, to date the best approximation guarantee of a polynomial-time algorithm for DST is k^epsfor any constant eps > 0 [Charikar et. al., 1999]. For network design problems like DST, one often
relies on natural cut-based LP relaxations to design approximation algorithms. In general, the integrality gap of such an LP for DSTis known to have a polynomial integrality gap lower bound [Zosin and Khuller, 2002; Li and Laekhanukit, 2021]. So particular interest has been invested in special cases or in strengthenings of this LP.
In this work, we show the integrality gap is only O(log k) for instances of \dst where no Steiner node has both an edge from another Steiner node and an edge to another Steiner node, i.e. the longest path using only Steiner nodes has length at most 1. This generalizes the well-studied case of quasi-bipartite DST where no edge has both endpoints being Steiner nodes. Our result is also optimal in the sense that the integrality gap can be as bad as poly(n) even if the longest path with only Steiner nodes has length 2.
Range Reporting for Time Series via Rectangle Stabbing
ABSTRACT. We study the Fréchet queries problem. It is a data structure problem for range reporting, where we are given a set S of n polygonal curves and a distance threshold ρ. The data structure should support queries with a polygonal curve q for the elements of S, for which the continuous Fréchet distance to q is at most ρ. Afshani and Driemel in 2018 studied this problem for two-dimensional polygonal curves of constant complexity and gave upper and lower bounds on the space-query time tradeoff. We study the case that the ambient space of the curves is one-dimensional and show an intimate connection to the well-studied rectangle stabbing problem. Here, we are given a set of hyperrectangles as input and a query with a point q should return all input rectangles that contain this point. Using known data structures for rectangle stabbing or orthogonal range searching this directly leads to a data structure with size in O(n (log n)^(t−1)) and query time in O((log n)^(t−1) + k), where k denotes the output size and t can be chosen as the maximum number of vertices of either (a) the stored curves or (b) the query curves. Note that we omit factors depending on the complexity of the curves that do not depend on n. The resulting bounds improve upon the bounds by Afshani and Driemel in both the storage and query time. In addition, we show that known lower bounds for rectangle stabbing and orthogonal range reporting with dimension parameter d = ⌊t/2⌋ can be applied to our problem via reduction.
ABSTRACT. The sliding cubes model is a well-established theoretical framework that supports the analysis of reconfiguration algorithms for modular robots consisting of face-connected cubes. As is common in the literature, we focus on reconfiguration via an intermediate canonical shape. Specifically, we present an in-place algorithm that reconfigures any n-cube configuration into a compact canonical shape using a number of moves proportional to the sum of coordinates of the input cubes. This result is asymptotically optimal and strictly improves on all prior work. Furthermore, our algorithm directly extends to dimensions higher than three.
ABSTRACT. A key goal of clustering is data reduction. In center-based clustering of complex objects therefore not only the number of clusters but also the complexity of the centers plays a crucial role. We propose $L$-Budget Clustering as unifying perspective on this task, optimizing the clustering under the constraint that the summed complexity of all centers is at most $L$. We present algorithms for clustering curves under the Fr\'echet distance, but note that our algorithms more generally apply to objects in metric spaces if a notion of simplification of objects is applicable.
A scenario in which data reduction is of particular importance is when the space is limited.
Our main result is an efficient approximation algorithm that maintains an $L$-budget clustering under insertion of curves using only $O(L)$ space.
ABSTRACT. The Hilbert metric is a distance function defined for points lying within the interior of a convex body. It arises in the analysis and processing of convex bodies, machine learning, and quantum information theory. In this paper, we show how to adapt the Euclidean Delaunay triangulation to the Hilbert geometry defined by a convex polygon in the plane. We analyze the geometric properties of the Hilbert Delaunay triangulation, which has some notable differences with respect to the Euclidean case, including the fact that the triangulation does not necessarily cover the convex hull of the point set. We also introduce the notion of a Hilbert ball at infinity, which is a Hilbert metric ball centered on the boundary of the convex polygon. We present a simple randomized incremental algorithm that computes the Hilbert Delaunay triangulation for a set of $n$ points in the Hilbert geometry defined by a convex m-gon. The algorithm runs in O(n (log n + log^3 m)) expected time. In addition we introduce the notion of the Hilbert hull of a set of points, which we define to be the region covered by their Hilbert Delaunay triangulation. We present an algorithm for computing the Hilbert hull in time O(n h log^2 m), where h is the number of points on the hull's boundary.
ABSTRACT. In the multiple-selection problem one is given an unsorted array~$S$ of $N$ elements and an array of $q$ query ranks $r_1<\cdots<r_q$, and the task is to return, in sorted order, the $q$ elements of $S$ of rank $r_1, \ldots, r_q$, respectively.
The asymptotic deterministic comparison complexity of the problem was settled by Dobkin and Munro [JACM~1981].
In the I/O model an optimal I/O complexity was achieved by Hu \textit{et al}.~[SPAA~2014].
Recently [ESA~2023], we presented a \emph{cache-oblivious} algorithm with matching I/O complexity, named \emph{funnelselect}, since it heavily borrows ideas from the cache-oblivious sorting algorithm \emph{funnelsort} from the seminal paper by Frigo, Leiserson, Prokop and Ramachandran [FOCS~1999].
\algname{Funnelselect} is inherently randomized as it relies on sampling for cheaply finding many good pivots.
In this paper we present \emph{deterministic funnelselect}, achieving the same optional I/O complexity cache-obliviously without randomization.
Our new algorithm essentially replaces a single (in expectation) reversed-funnel computation using random pivots by a recursive algorithm using multiple reversed-funnel computations.
To meet the I/O bound, this requires a carefully chosen subproblem size based on the entropy of the sequence of query ranks;
\algname{deterministic funnelselect} thus raises distinct technical challenges not met by randomized \algname{funnelselect}.
The resulting worst-case I/O bound is $O\bigl(\sum_{i=1}^{q+1} \frac{\Delta_i}{B} \cdot \log_{M/B} \frac{N}{\Delta_i} + \frac{N}{B}\bigr)$, where $B$ is the external memory block size, $M\geq B^{1+\epsilon}$ is the internal memory size, for some constant $\epsilon>0$, and $\Delta_i = r_{i} - r_{i-1}$ (assuming $r_0=0$ and $r_{q+1}=N + 1$).
ABSTRACT. A filter is a widely used data structure for storing an approximation of a given set $S$ of elements from some universe $\UU$ (a countable set). It represents a superset $S'\supseteq S$ that is ``close to~$S$'' in the sense that for $x\not\in S$, the probability that $x\in S'$ is bounded by some $\eps > 0$. The advantage of using a Bloom filter, when some false positives are acceptable, is that the space usage becomes smaller than what is required to store $S$ exactly.
Though filters are well-understood from a worst-case perspective, it is clear that state-of-the-art constructions may not be close to optimal for particular distributions of data and queries. Suppose, for instance, that some elements are in $S$ with probability close to~1. Then it would make sense to always include them in $S'$, saving space by not having to represent these elements in the filter. Questions like this have been raised in the context of Weighted Bloom filters (Bruck, Gao and Jiang, ISIT 2006) and Bloom filter implementations that make use of access to learned components (Vaidya, Knorr, Mitzenmacher, and Krask, ICLR 2021).
In this paper, we present a lower bound for the expected space that such a filter requires. We also show that the lower bound is asymptotically tight by exhibiting a filter construction that executes queries and insertions in worst-case constant time, and has a false positive rate at most $\eps$ with high probability over input sets drawn from a product distribution. We also present a Bloom filter alternative, which we call the \emph{Daisy Bloom filter}, that executes operations faster and uses significantly less space than the standard Bloom filter.
ABSTRACT. We explore Cluster Editing and its generalization Correlation Clustering with a new operation called permissive vertex splitting which addresses finding overlapping clusters in the face of uncertain information. We determine that both problems are NP-hard, yet they exhibit significant differences in parameterized complexity and approximability.
For Cluster Editing with Permissive Vertex Splitting, we show a polynomial kernel when parameterized by the solution size and develop a polynomial-time algorithm with approximation factor 7.
In the case of Correlation Clustering, we establish para-NP-hardness when parameterized by solution size and demonstrate that computing an n^{1-ε}-approximation is NP-hard for any constant ε > 0.
Additionally, we extend the established link between Correlation Clustering and Multicut to the setting with permissive vertex splitting.