ABSTRACT. The $b$-matching problem is an allocation problem where the vertices on the
left-hand side of a bipartite graph, referred to as servers, may be matched
multiple times. In the setting with stochastic rewards, an assignment between
an incoming request and a server turns into a match with a given success
probability. Mehta and Panigrahi (FOCS 2012) introduced online bipartite
matching with stochastic rewards, where each vertex may be matched once.
The framework is equally interesting in graphs with vertex capacities.
In Internet advertising, for instance, the advertisers seek successful
matches with a large number of users.
We develop (tight) upper and lower bounds on the competitive ratio of
deterministic and randomized online algorithms, for $b$-matching with
stochastic rewards. Our bounds hold for both offline benchmarks considered
in the literature. As in prior work, we first consider vanishing
probabilities. We show that no randomized online algorithm can achieve
a competitive ratio greater than $1-1/e\approx 0.632$, even for
identical vanishing probabilities and arbitrary uniform server capacities.
Furthermore, we conduct a primal-dual analysis of the deterministic \textsc{StochasticBalance} algorithm. We prove that it achieves a competitive
ratio of $1-1/e$, as server capacities increase, for arbitrary heterogeneous
non-vanishing edge probabilities. This performance guarantee
holds in a general setting where servers have individual capacities and for
the vertex-weighted problem extension. To the best of our knowledge, this
is the first result for \textsc{StochasticBalance} with arbitrary non-vanishing
probabilities.
We remark that our impossibility result implies in particular that, for the
AdWords problem, no online algorithm can be better than $(1-1/e)$-competitive
in the setting with stochastic rewards.
ABSTRACT. A string is said to be closed if its length is one, or if it has a non-empty factor that occurs both as a prefix and as a suffix of the string, but does not occur elsewhere. The notion of closed strings was introduced by [Fici, WORDS 2011]. Recently, the maximum number of distinct closed factors occurring in a string was investigated by [Parshina and Puzynina, Theor. Comput. Sci. 2024], and an asymptotically tight bound was proved. In this paper, we propose two algorithms to count the distinct closed factors in a string T of length n over an alphabet of size \sigma. The first algorithm runs in O(n log \sigma) time using O(n) space for string T given in an online manner. The second algorithm runs in O(n) time using O(n) space for string T given in an offline manner. Both algorithms utilize suffix trees for sliding windows.
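As a quick illustration of the definition above, here is a brute-force Python check of whether a single string is closed; the paper's algorithms instead count distinct closed factors far more efficiently via suffix trees for sliding windows. Restricting the candidate factor to proper borders is an assumption of this sketch.

```python
def is_closed(s: str) -> bool:
    """Brute-force check of the closed-string definition from the abstract:
    |s| == 1, or s has a non-empty proper factor that occurs as a prefix and
    as a suffix but nowhere else.  Cubic time in the worst case."""
    n = len(s)
    if n == 1:
        return True
    for k in range(n - 1, 0, -1):                  # proper border lengths
        if s[:k] == s[n - k:]:                     # prefix equals suffix
            occ = [i for i in range(n - k + 1) if s[i:i + k] == s[:k]]
            if occ == [0, n - k]:                  # only prefix + suffix occurrence
                return True
    return False

print(is_closed("abab"), is_closed("abc"))   # True False
```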
Online Busy Time Scheduling with Untrusted Prediction
ABSTRACT. In this work, we study the online busy time scheduling problem with infinite processors, where each job j has a release time r_j, a processing time p_j, and a deadline d_j. Busy time scheduling uses multiple processors to schedule jobs concurrently, with the goal of minimizing the total time during which the machine is busy processing jobs.
We consider the case proposed by Koehler and Khuller, where a single machine has unlimited processors.
Moreover, we consider the setting where the online algorithm has access to a prediction, which may be imperfect, of when the machine should be active.
We present an algorithm, Multiplier, that dynamically adjusts its strategy for reserving time windows for potential future jobs according to the prediction it receives and how much it trusts the prediction. We show that Multiplier is $(1 + 4/(1+\lambda))$-consistent and $(1 + 4/(1-\lambda))$-robust with a trust parameter $\lambda \in [0, 1)$.
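As a small worked example of the stated guarantees, the sketch below evaluates the two bounds as functions of the trust parameter; interpreting consistency as the guarantee under a perfect prediction and robustness as the guarantee under an arbitrary prediction follows the usual learning-augmented convention and is an assumption here, not taken from the abstract.

```python
def multiplier_guarantees(lam: float) -> tuple[float, float]:
    """Consistency and robustness bounds stated for Multiplier, as functions
    of the trust parameter lambda in [0, 1)."""
    assert 0.0 <= lam < 1.0
    consistency = 1 + 4 / (1 + lam)   # guarantee when the prediction is perfect (assumed reading)
    robustness = 1 + 4 / (1 - lam)    # guarantee for arbitrary predictions (assumed reading)
    return consistency, robustness

# lambda = 0 gives 5 for both; full trust (lambda -> 1) drives consistency
# toward 3 while robustness blows up.
print(multiplier_guarantees(0.0))   # (5.0, 5.0)
print(multiplier_guarantees(0.5))   # (approx. 3.67, 9.0)
```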
The Complexity of Counting Turns in the Line-Based Dial-a-Ride Problem
ABSTRACT. Dial-a-Ride problems have been proposed to model the challenge of consolidating passenger transportation requests with a fleet of shared vehicles.
The line-based Dial-a-Ride problem (LiDARP) is a variant where the passengers are transported along a fixed sequence of stops, with the option of taking shortcuts.
In this paper we consider the LiDARP with the objective of maximizing the number of transported requests. We investigate the complexity of two optimization problems: the LiDARP itself, and the problem of determining the minimum number of turns needed in an optimal LiDARP solution, called the MinTurn problem.
Based on a number of instance parameters and characteristics, we are able to state the boundary between polynomially solvable and NP-hard instances for both problems.
Furthermore, we provide parameterized algorithms that are able to solve both the LiDARP and the MinTurn problem.
ABSTRACT. An acyclic deterministic finite automaton (ADFA) is a data structure that represents a set of strings (i.e., a dictionary) and facilitates a pattern searching problem of determining whether a given pattern string is present in the dictionary.
We introduce the packed ADFA (PADFA), a compact variant of ADFA, which is designed to achieve more efficient pattern searching by encoding specific paths as packed strings stored in contiguous memory.
We theoretically demonstrate that pattern searching in PADFA is near time-optimal with a small additional overhead and becomes fully time-optimal for sufficiently long patterns.
Moreover, we prove that a PADFA requires fewer bits than a trie when the dictionary size is relatively small compared to the number of states in the PADFA.
Lastly, we empirically show that PADFAs improve both the space and time efficiency of pattern searching on real-world datasets.
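For readers unfamiliar with dictionary automata, here is a minimal Python sketch of pattern searching in a (trie-shaped, non-minimal) acyclic automaton. It only illustrates the per-character lookup that the packed paths of a PADFA are designed to speed up; it is not the paper's data structure.

```python
class Node:
    """Node of a toy acyclic automaton over a dictionary.  In the packed
    variant (PADFA), long paths would be stored as contiguous packed strings;
    here we only sketch the plain lookup."""
    def __init__(self, final: bool = False):
        self.final = final
        self.trans: dict[str, "Node"] = {}

def add_word(root: Node, word: str) -> None:
    # Builds a trie-shaped automaton; a real ADFA would also merge
    # equivalent suffixes to stay minimal.
    node = root
    for c in word:
        node = node.trans.setdefault(c, Node())
    node.final = True

def contains(root: Node, pattern: str) -> bool:
    # Pattern searching: follow one transition per pattern character.
    node = root
    for c in pattern:
        node = node.trans.get(c)
        if node is None:
            return False
    return node.final

root = Node()
for w in ("packed", "pack", "pattern"):
    add_word(root, w)
print(contains(root, "pack"), contains(root, "packer"))   # True False
```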
Fast Practical Compression of Deterministic Finite Automata
ABSTRACT. We revisit the popular delayed deterministic finite automaton (D^2FA) compression algorithm introduced by Kumar~et~al.~[SIGCOMM 2006] for compressing deterministic finite automata (DFAs) used in intrusion detection systems. This compression scheme exploits similarities in the outgoing sets of transitions among states to achieve strong compression while maintaining high throughput for matching.
The D^2FA algorithm and its later variants, unfortunately, require at least quadratic compression time, since they compare all pairs of states to compute an optimal compression. This is too slow and, in some cases, even infeasible for the collections of regular expressions in modern intrusion detection systems, which produce DFAs with millions of states.
Our main result is a simple, general framework for constructing D^2FA based on locality-sensitive hashing that constructs an approximation of the optimal D^2FA in near-linear time. We apply our approach to the original D^2FA compression algorithm and two important variants, and we experimentally evaluate our algorithms on DFAs from widely used modern intrusion detection systems. Overall, our new algorithms compress up to an order of magnitude faster than existing solutions with either no or little loss of compression size. Consequently, our algorithms are significantly more scalable and can handle larger collections of regular expressions than previous solutions.
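To illustrate the locality-sensitive hashing ingredient, the sketch below computes minhash signatures of states' transition sets, so that states with similar outgoing transitions can be bucketed by signature instead of compared pairwise. The hash construction and signature length are illustrative assumptions, not the paper's exact framework.

```python
import random

def transition_set(state_transitions: dict[int, int]) -> set[tuple[int, int]]:
    """A state's outgoing transitions as a set of (symbol, target) pairs.
    Two states sharing many pairs compress well with a default transition."""
    return set(state_transitions.items())

def minhash_signature(items: set, hash_seeds: list[int]) -> tuple[int, ...]:
    """Generic minhash signature: similar transition sets are likely to agree
    on most signature positions, so candidate pairs can be found by bucketing
    on signature bands rather than by an all-pairs comparison."""
    sig = []
    for seed in hash_seeds:
        rng = random.Random(seed)
        a, b = rng.randrange(1, 2**31), rng.randrange(0, 2**31)
        sig.append(min((a * hash(x) + b) % (2**31 - 1) for x in items))
    return tuple(sig)

# Two states whose transition tables differ in a single symbol produce
# mostly identical signatures, flagging them as a candidate pair.
s1 = transition_set({0: 5, 1: 7, 2: 9, 3: 9})
s2 = transition_set({0: 5, 1: 7, 2: 9, 3: 4})
seeds = list(range(8))
print(minhash_signature(s1, seeds))
print(minhash_signature(s2, seeds))
```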
ABSTRACT. Recently, the descriptional and computational complexity of various pumping lemmata for general regular languages has been investigated in the literature. There it turned out that in almost all cases tight bounds on the operational complexity of minimal pumping constants for regular languages have been obtained. From the computational perspective it was shown that in most cases the question whether a certain value can serve as a pumping constant w.r.t. a fixed pumping lemma is computationally intractable. Whether similar results can be obtained for restricted regular languages, such as unary regular languages, was left open---a language is unary if the underlying alphabet is a singleton set. Here we fill this gap by considering in detail questions on various pumping lemmata for unary regular languages. While some of the results obtained are similar to those in the general case, we also find significant differences. The results presented here fit well with the previous results and give a mostly complete picture of the problems in question.
On Computational Completeness of Semi-Conditional Matrix Grammars
ABSTRACT. Matrix grammars are one of the first approaches ever proposed in regulated rewriting, prescribing that rules have to be applied in a certain order. In regulated rewriting, the most interesting case shows up when all rules are context-free.
Typical descriptional complexity measures incorporate the number of nonterminals or the matrix length, i.e., the number of rules per matrix. When viewing matrices as program fragments, it becomes natural to consider additional applicability conditions for such matrices.
Here, we focus on attaching a permitting and a forbidden string to every matrix in a matrix grammar. The matrix is applicable to a sentential form~$w$ only if the permitting string is a subword in~$w$ and the forbidden string is not a subword in~$w$. We call such a grammar, where the application of a matrix is conditioned as described, a semi-conditional matrix grammar. We consider (1) the maximal lengths of permitting and forbidden strings, (2) the number of nonterminals, (3) the number of conditional matrices, (4) the maximal length $\ell$ of any matrix and (5) the number of conditional matrices with nonempty permitting and forbidden strings, as the resources (descriptional complexity measures) of a semi-conditional matrix grammar.
We show that certain semi-conditional matrix grammar families defined by restricting resources can generate all recursively enumerable languages.
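The applicability condition described above is easy to state in code. The following sketch checks it for a sentential form; treating an empty permitting or forbidden string as an absent condition is an assumed convention, and the example matrix is hypothetical.

```python
def matrix_applicable(w: str, permitting: str, forbidden: str) -> bool:
    """Applicability test of a semi-conditional matrix as described above:
    the matrix may be applied to the sentential form w only if the permitting
    string is a subword of w and the forbidden string is not.  Empty condition
    strings are treated as 'no condition' (an assumed convention)."""
    if permitting and permitting not in w:
        return False
    if forbidden and forbidden in w:
        return False
    return True

# Hypothetical matrix with permitting string "S" and forbidden string "cc":
print(matrix_applicable("aSbA", "S", "cc"))    # True
print(matrix_applicable("aaccbb", "S", "cc"))  # False: no S, and cc occurs
```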
ABSTRACT. We study the maximum set coverage problem in massively parallel and distributed models. In this setting, $m$ sets that are subsets of a universe of $n$ elements are distributed among $m$ machines. In each round, these machines can communicate with each other, subject to the memory constraint that no machine may use more than $\tilde{O}(n)$ memory. The objective is to find the $k$ sets whose coverage is maximized. We consider the regime where $k = \Omega(m)$ (e.g., $k = m/100$), $m = O(n)$, and each machine has $\tilde{O}(n)$ memory\footnote{The input size is $O(mn)$, and each machine has enough memory to store a constant number of sets.}.
Maximum coverage is a special case of submodular maximization subject to a cardinality constraint, which can be approximated to within a $1-1/e$ factor using the greedy algorithm; the greedy algorithm, however, is not directly applicable to parallel and distributed models.
When $k = \Omega(m)$, to obtain a $1-1/e-\epsilon$ approximation, previous work either requires $\tilde{O}(mn)$ memory per machine, which is not interesting compared to the trivial algorithm that sends the entire input to a single machine, or requires $2^{O(1/\epsilon)} n$ memory per machine, which is prohibitively expensive even for moderately small values of $\epsilon$.
Our result is a randomized $1-1/e-\epsilon$ approximation in $O(1/\epsilon^3 \cdot \min\{\log n, \log (1/\epsilon) + \log m\})$ rounds. Our algorithm involves solving a slightly transformed linear program of the maximum coverage problem using the multiplicative weights update method, classic techniques in parallel computing such as parallel prefix, and various combinatorial arguments.
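For reference, the sequential greedy algorithm mentioned above, which attains the $1-1/e$ guarantee but does not parallelize directly, looks as follows in Python.

```python
def greedy_max_coverage(sets: dict[str, set], k: int) -> tuple[list[str], set]:
    """Classic sequential greedy for maximum coverage: repeatedly pick the set
    with the largest marginal coverage.  It achieves a (1 - 1/e) approximation
    but is inherently sequential, which is why the paper develops a different,
    round-efficient approach for the distributed setting."""
    chosen, covered = [], set()
    for _ in range(min(k, len(sets))):
        best = max(sets, key=lambda s: len(sets[s] - covered))
        if not sets[best] - covered:
            break                      # no remaining set adds new elements
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6}, "D": {1, 6}}
print(greedy_max_coverage(sets, 2))    # picks A then C, covering {1,...,6}
```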
ABSTRACT. We study the problem of scheduling an arbitrary computational DAG on a fixed number of processors while minimizing the makespan. While previous works have mostly studied this problem in fairly restricted models, we define and analyze DAG scheduling in the Bulk Synchronous Parallel (BSP) model, which is a well-established parallel computing model that captures the communication cost between processors much more accurately. We provide a taxonomy of simpler scheduling models that can be understood as variants or special cases of BSP, and discuss how the properties and optimum cost of these models relate to BSP. This essentially allows us to dissect the different building blocks of the BSP model, and gain insight into how these influence the scheduling problem.
We then analyze the hardness of DAG scheduling in BSP in detail. We show that the problem is solvable in polynomial time for some very simple classes of DAGs, but it is already NP-hard for in-trees or DAGs of height $2$. We also prove that in general DAGs, the problem is APX-hard: it cannot be approximated to a $(1+\epsilon)$-factor in polynomial time for some specific $\epsilon>0$. We then separately study the subproblem of scheduling communication steps, and we show that the NP-hardness of this problem depends on the problem parameters and the communication rules within the BSP model. Finally, we present and analyze a natural formulation of our scheduling task as an Integer Linear Program.
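As background, the sketch below computes the cost of a schedule under the standard BSP cost model: for each superstep, the slowest processor's work, plus g times the largest h-relation, plus the synchronization latency L. The paper's exact cost definition and communication rules may differ in details; this only illustrates the objective a BSP schedule is measured by.

```python
def bsp_cost(supersteps, g: float, L: float) -> float:
    """Standard BSP cost of a schedule.  supersteps is a list of
    (work, sent, received) triples, each a dict mapping processor -> amount:
    local work units, words sent, and words received in that superstep."""
    total = 0.0
    for work, sent, received in supersteps:
        w = max(work.values(), default=0)                        # slowest processor
        h = max(list(sent.values()) + list(received.values()) or [0])  # h-relation
        total += w + g * h + L                                   # barrier latency L
    return total

# Two supersteps on processors 0 and 1 with unit g and latency 5.
steps = [
    ({0: 10, 1: 7}, {0: 3, 1: 0}, {0: 0, 1: 3}),   # p0 sends 3 words to p1
    ({0: 4, 1: 9}, {0: 0, 1: 0}, {0: 0, 1: 0}),    # purely local superstep
]
print(bsp_cost(steps, g=1.0, L=5.0))   # (10 + 3 + 5) + (9 + 0 + 5) = 32
```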
Parallel Peeling of Invertible Bloom Lookup Tables in a Constant Number of Rounds
ABSTRACT. Invertible Bloom lookup tables (IBLTs) are a compact way of probabilistically representing a set of n key-value pairs so as to support insertions, deletions, and lookups. If an IBLT is not overloaded (as a function of its size and the number of key-value pairs that have been inserted), then reporting all the stored key-value pairs can also be done via a “parallel peeling” process. For the case when the IBLT is represented in a very compact form, this can be implemented to run in O(log log n) parallel rounds, with all but inversely polynomial probability, as shown in prior work by Jiang, Mitzenmacher, and Thaler, as well as in Gao’s work on parallel peeling algorithms for random hypergraphs. Although O(log log n) is practically constant for reasonable values of n, there are nevertheless scenarios (such as in the parallel GPU or MapReduce frameworks) where parallel peeling is desired to run in a constant number of rounds, with failure probabilities that are negligible rather than simply being polynomially small. In this paper, we study simple constant-round parallel peeling algorithms for IBLTs, focusing on negligible failure probabilities based on table size, number of elements stored, and number of hash functions. We prove the surprising result that with O(n log n) space, a one-round parallel peeling process succeeds with high probability, while a two-round parallel peeling process succeeds with overwhelming probability. We then provide a time-space trade-off theorem for parallel peeling in a constant number k of rounds while still maintaining an overwhelming success probability. We also give several new algorithmic applications of parallel peeling of IBLTs, and we experimentally study the effectiveness of our approach in practice.
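A toy Python version of an IBLT and of one peeling round is given below to illustrate the mechanism the paper analyzes; the cell layout, hashing, and table size are illustrative assumptions only.

```python
class IBLT:
    """Toy invertible Bloom lookup table: each of k hash functions maps a key
    into its own subtable of cells holding (count, keyXor, valueXor).  A cell
    with count 1 is 'pure' and yields one key-value pair; removing that pair
    may create new pure cells, which is what repeated peeling rounds exploit."""

    def __init__(self, m: int, k: int = 3):
        self.m, self.k = m, k
        self.cells = [[0, 0, 0] for _ in range(m)]      # count, keyXor, valXor

    def _cells_for(self, key: int):
        sub = self.m // self.k                          # partitioned hashing:
        return [i * sub + hash((i, key)) % sub          # one cell per subtable
                for i in range(self.k)]

    def insert(self, key: int, value: int) -> None:
        for i in self._cells_for(key):
            c = self.cells[i]
            c[0] += 1; c[1] ^= key; c[2] ^= value

    def peel_round(self) -> dict:
        """One parallel-style peeling round: snapshot all currently pure
        cells, then remove each recovered pair exactly once."""
        pure = [tuple(c) for c in self.cells if c[0] == 1]
        recovered = {}
        for _, key, value in pure:
            if key not in recovered:
                recovered[key] = value
                for i in self._cells_for(key):
                    c = self.cells[i]
                    c[0] -= 1; c[1] ^= key; c[2] ^= value
        return recovered

t = IBLT(m=42)
for x in range(10):
    t.insert(x + 1, 100 + x)
print(len(t.peel_round()))   # number of pairs recovered after a single round
```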
Dynamic Range Minimum Queries on the Ultra-Wide Word RAM
ABSTRACT. We consider the dynamic range minimum problem on the ultra-wide word RAM model of computation. This model extends the classic $w$-bit word RAM model with special ultrawords of length $w^2$ bits that support standard arithmetic and boolean operations, as well as scattered memory access operations that can access $w$ (non-contiguous) locations in memory. The ultra-wide word RAM model captures (and idealizes) modern vector processor architectures.
Our main result is a linear-space data structure that supports range minimum queries and updates in $O(\log \log \log n)$ time. This exponentially improves the time of existing techniques. Our result is based on a simple reduction to prefix minimum computations on sequences of $O(\log n)$ words, combined with a new parallel, recursive implementation of these.
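As a rough illustration of how range minima can be recovered from prefix (and suffix) minima, here is a folklore block-based reduction in Python; the paper's reduction and its parallel ultraword implementation are different and far more efficient, and nothing below is specific to the ultra-wide word RAM.

```python
def prefix_minima(block: list[int]) -> list[int]:
    """Prefix minima of a block: out[i] = min(block[0..i])."""
    out, cur = [], float("inf")
    for x in block:
        cur = min(cur, x)
        out.append(cur)
    return out

def suffix_minima(block: list[int]) -> list[int]:
    return prefix_minima(block[::-1])[::-1]

def rmq(blocks: list[list[int]], i: int, j: int, B: int) -> int:
    """Range minimum over global positions i <= j, using only prefix/suffix
    minima of size-B blocks (a folklore flavor of the reduction mentioned
    above, not the paper's construction)."""
    bi, bj = i // B, j // B
    if bi == bj:
        return min(blocks[bi][i % B: j % B + 1])
    best = min(suffix_minima(blocks[bi])[i % B],
               prefix_minima(blocks[bj])[j % B])
    for b in range(bi + 1, bj):
        best = min(best, prefix_minima(blocks[b])[-1])   # block minimum
    return best

seq = [5, 3, 8, 6, 2, 7, 4, 9, 1, 0, 6, 5]
B = 4
blocks = [seq[k:k + B] for k in range(0, len(seq), B)]
print(rmq(blocks, 2, 9, B), min(seq[2:10]))   # 0 0
```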