SOFSEM 2025: 50TH INTERNATIONAL CONFERENCE ON CURRENT TRENDS IN THEORY AND PRACTICE OF COMPUTER SCIENCE
PROGRAM FOR MONDAY, JANUARY 20TH
Days:
next day
all days

View: session overviewtalk overview

09:00-10:00 Session 1: Invited talk
09:00
The challenge of stability, accuracy and robustness in data-driven AI
10:00-10:30Coffee Break
10:30-12:10 Session 2: S1
10:30
Query Learning of Context-Deterministic and Congruential Context-Free Languages over Infinite Alphabets

ABSTRACT. This paper is concerned with learning some context-free languages (CFLs) over infinite alphabets from membership and equivalence queries. Argyros and D'Antoni (2018) proposed a query learning algorithm for regular languages over infinite alphabets using deterministic symbolic automata. Our algorithms learn context-deterministic CFLs and congruential CFLs over infinite alphabets. We enhance the existing algorithms for learning those CFL classes over finite alphabets by adapting Argyros and D'Antoni's technique. Our result shows that their technique can be extended to learn richer classes beyond regular languages.

10:55
Knowledge Neurons in the Knowledge Graph-based Link Prediction Models

ABSTRACT. Recent studies have shown that using Graph Transformers and attention weights are often claimed to confer interpretability, purportedly helpful in providing insights and explaining why a model makes its decisions. While there is already quite an extensive range of techniques explaining Graph Neural Networks (GNNs), their interpretability and model transparency could be improved. A significant challenge in Explainable AI has recently been correctly interpreting neuron behavior to identify what a deep learning system has internally detected as relevant to the input. To tackle these challenges, we present a knowledge attribution method for the link prediction task to identify the neurons that express the input Knowledge Graph (KG) triples. This method not only enhances the interpretability of the model but also improves its transparency, providing a clearer understanding of how specific factual knowledge is stored. It enables human-centric and knowledge attribution explanations by extracting factual knowledge from identified decision drivers and enriching them with available background knowledge. Empirical results on two standard KG-based link prediction datasets shed light on understanding the storage of knowledge within GNNs.

11:20
Beyond Image-Text Matching: Verb Understanding in Multimodal Transformers Using Guided Masking

ABSTRACT. Probing methods are widely used to evaluate the multimodal representations of vision-language models (VLMs), with dominant approaches relying on zero-shot performance in image-text matching task. This method typically assesses models on curated datasets that focus on linguistic aspects such as counting, relations, or attributes. This work uses a complementary probing strategy called guided masking. This approach selectively masks different modalities and evaluates the model’s ability to predict the masked word with high accuracy. We specifically focus on probing verbs, as their comprehension is crucial for understanding actions and relationships in images, and it presents a greater challenge than subjects, objects, or attributes. Our analysis targets VLMs that use region-of-interest (ROI) features obtained from object detectors as input tokens. Our experiments demonstrate that selected models can predict the correct verb with high accuracy, challenging previous conclusions based on image-text matching methods, which suggested VLMs fail in situations requiring verb understanding. The code for all experiments will be publicly available.

11:45
The Computational Complexity of Equilibria with Strategic Constraints

ABSTRACT. Computational aspects of equilibrium notions for games have been extensively studied, including settings where the goal is to find an equilibrium that possesses some additional properties. Our work extends this direction by considering games in which players are subject to some form of constraint on their strategic choices. We also consider the relationship between Nash equilibria and so-called generalized or social equilibria in this context. Our results demonstrate that the complexity of finding an equilibrium varies significantly between games with slightly different strategic constraints. We also demonstrate that these constraints are useful for modeling problems involving strategic resource allocation and also are of interest from the perspective of behavioral game theory.

12:30-14:00Lunch
14:00-15:40 Session 3: S2
14:00
Quantum Algorithm for the Multiple String Matching Problem

ABSTRACT. Let us consider the Multiple String Matching Problem. In this problem, we consider a long string, denoted by $t$, of length $n$. This string is referred to as a text. We also consider a sequence of $m$ strings, denoted by $S$, which we refer to as a dictionary. The total length of all strings from the dictionary is represented by the variable L. The objective is to identify all instances of strings from the dictionary within the text. The standard classical solution to this problem is Aho–Corasick Algorithm that has $O(n+L)$ query and time complexity. At the same time, the classical lower bound for the problem is the same $\Omega(n+L)$. We propose a quantum algorithm with $O(n+\sqrt{mL\log n}+m\log n)$ query complexity and $O(n+\sqrt{mL\log n}\log b+m\log n)=O^*(n+\sqrt{mL})$ time complexity, where $b$ is the maximal length of strings from the dictionary. This improvement is particularly significant in the case of dictionaries comprising long words. Our algorithm's complexity is equal to the quantum lower bound $O(n + \sqrt{mL})$, up to a log factor. In some sense, our algorithm can be viewed as a quantum analogue of the Aho–Corasick algorithm.

14:25
Warm-Started QAOA with Aligned Mixers Converges Slowly Near the Poles of the Bloch Sphere

ABSTRACT. In order to boost the performance of the Quantum Approximate Optimization Algorithm (QAOA) to solve problems in combinatorial optimization, researchers have leveraged the solutions returned from classical algorithms in order to create a warm-started quantum initial state for QAOA that is biased towards "good" solutions. Cain et al. showed that if the classically-obtained solutions are mapped to the poles of the Bloch sphere, then vanilla QAOA with the standard mixer "gets stuck". If the classically-obtained solution is instead mapped to within some angle θ from the poles of the Bloch sphere, creating an initial product state, then QAOA with optimal variational parameters is known to converge to the optimal solution with increased circuit depth if the mixer is modified to be "aligned" with the warm-start initial state. Leveraging recent work of Benchasattabuse et al., we provide theoretical lower bounds on the circuit depth necessary for this form of warm-started QAOA to achieve a desired change Δλ in approximation ratio; in particular, we show that for small θ, the lower bound on the circuit depth roughly scales proportionally with Δλ/θ.

14:50
Expected Density of Random Minimizers

ABSTRACT. Minimizer schemes, or just minimizers, are a very important computational primitive in sampling and indexing biological strings. Assuming a fixed alphabet of size $\sigma$, a minimizer is defined by two integers $k,w\ge2$ and a total order $\O$ on strings of length $k$ (also called $k$-mers). A string is processed by a sliding window algorithm that chooses, in each window of length $w+k-1$, its minimal $k$-mer with respect to $\O$. A key characteristic of the minimizer is the expected density of chosen $k$-mers among all $k$-mers in a random infinite $\sigma$-ary string. Random minimizers, in which the order $\O$ is chosen uniformly at random, are often used in applications. However, little is known about their expected density $\R_\sigma(k,w)$ besides the fact that it is close to $\frac{2}{w+1}$ unless $w\gg k$.

We first show that $\R_\sigma(k,w)$ can be computed in $O(k\sigma^{k+w})$ time. Then we attend to the case $w\le k$ and present a formula that allows one to compute $\R_\sigma(k,w)$ in just $O(w^2)$ time. Further, we describe the behaviour of $\R_\sigma(k,w)$ in this case, establishing the connection between $\R_\sigma(k,w)$, $\R_\sigma(k+1,w)$, and $\R_\sigma(k,w+1)$. In particular, we show that $\R_\sigma(k,w)<\frac{2}{w+1}$ (by a tiny margin) unless $w$ is small. We conclude with some partial results and conjectures for the case $w>k$.

15:15
Forest Covers and Bounded Forest Covers

ABSTRACT. We study approximation algorithms for the forest cover and bounded forest cover problems. A probabilistic $2+\epsilon$ approximation algorithm for the forest cover problem is given using the method of dual fitting. A deterministic algorithm with a 2-approximation ratio that rounds the optimal solution to a linear program is given next. The 2-approximation for the forest cover is then used to give a 6-approximation for the bounded forest cover problem. The use of the probabilistic method to develop the $2+\epsilon$ approximation algorithm may be of independent interest.

15:40-16:10Coffee Break
16:10-17:50 Session 4: S3
16:10
On the periodic decompositions of multidimensional configurations

ABSTRACT. We consider $d$-dimensional configurations, that is, colorings of the $d$-dimensional integer grid $\Z^d$ with finitely many colors. Moreover, we interpret the colors as integers so that configurations are functions $\Z^d \rightarrow \Z$ of finite range. We say that such function is $k$-periodic if it is invariant under translations in $k$ linearly independent directions. It is known that if a configuration has a non-trivial annihilator, that is, if some non-trivial linear combination of its translations is the zero function, then it is a sum of finitely many periodic functions. This result is known as the periodic decomposition theorem. We prove two different improvements of it. The first improvement gives a characterization on annihilators of a configuration to guarantee the $k$-periodicity of the functions in its periodic decomposition --- for any $k$. The periodic decomposition theorem is then a special case of this result with $k=1$. The second improvement concerns so called sparse configurations for which the number of non-zero values in patterns grows at most linearly with respect to the diameter of the pattern. We prove that a sparse configuration with a non-trivial annihilator is a sum of finitely many periodic fibers where a fiber means a function whose non-zero values lie on a unique line.

16:35
Maximal $\alpha$-gapped Repeats in a Fibonacci String

ABSTRACT. A gapped repeat is a substring of the form $uvu$ where $u$ is any nonempty string called arm, and $v$ is any string called gap. A gapped repeat is maximal if the characters on the left to both arms differ and those on the right to both arms differ. For any real number $\alpha\geq 1$, an $\alpha$-gapped repeat is a gapped repeat such that $|u| + |v| \leq \alpha|u|$. One of the fundamental problems for repetitive structures on strings is analyzing the number of substrings that have such structures in a string. Kolpakov et al.~[CPM 2014] showed that $O(\alpha^2 n)$ upper bound and $\Omega(\alpha n)$ lower bound on the number of maximal $\alpha$-gapped repeats in a string of length $n$, and the current best upper bound is $3(\pi^2/6 +5/2)\alpha n$ by I and K{\"o}ppl~[TCS 2019]. Another interesting problem in this line of research is revealing the structures in well-known repetitive strings. In this paper, we investigate the maximal $\alpha$-gapped repeats in a Fibonacci string. We show an interesting characterization of the form of an arm. More precisely, there are two possible cases in the form of an arm: Fibonacci arms and palindromic arms. By using these structures, we can obtain an upper bound of the maximum number of maximal $\alpha$-gapped repeats in the $k$-th Fibonacci string. We can also show that the Fibonacci strings are a new example that contains $\Omega(\alpha n)$ maximal $\alpha$-gapped repeats.

17:00
Incremental computation of the set of period sets

ABSTRACT. Overlaps between words are crucial in many areas of computer science, such as code design, stringology, and bioinformatics. A self overlapping word is characterized by its periods and borders. A period of a word $u$ is the starting position of a suffix of $u$ that is also a prefix $u$, and such a suffix is called a border. Each word of length, say $n>0$, has a set of periods, but not all combinations of integers are sets of periods. Computing the period set of a word $u$ takes linear time in the length of $u$. We address the question of computing, the set, denoted $\Gamma_n$, of all period sets of words of length $n$. Although period sets have been characterized, there is no formula to compute the cardinality of $\Gamma_n$ (which is exponential in $n$), and the known dynamic programming algorithm to enumerate $\Gamma_n$ suffers from its space complexity. We present an incremental approach to compute $\Gamma_n$ from $\Gamma_{n-1}$, which reduces the space complexity, and then a constructive certification algorithm useful for verification purposes. The incremental approach defines a parental relation between sets in $\Gamma_{n-1}$ and $\Gamma_n$, enabling one to investigate the dynamics of period sets, and their intriguing statistical properties. Moreover, the period set of a word $u$ is key for computing the absence probability of $u$ in random texts. Thus, knowing $\Gamma_n$ is useful to assess the significance of word statistics, such as the number of missing words in a random text.

17:25
Outer-(ap)RAC Graphs

ABSTRACT. An outer-RAC drawing of a graph is a straight-line drawing where all vertices are incident to the outer cell and all edge crossings occur at a right angle. If additionally, all crossing edges are either horizontal or vertical, we call the drawing outer-apRAC (ap standing for axis parallel). We investigate the class of outer-(ap)RAC graphs admitting outer-(ap)RAC drawings. We show that the outer-RAC graphs are a proper subset of the planar graphs with at most 2.5n-4 edges where n is the number of vertices. This density bound is tight, even for outer-apRAC graphs. Moreover, we provide an SPQR-tree based linear-time algorithm which computes an outer-RAC drawing for a given series-parallel graph of maximum degree four. As a complement to this result, we present planar graphs of maximum degree four and series-parallel graphs of maximum degree five that are not outer-RAC. Finally, for series-parallel graphs of maximum degree three we show how to compute an outer-apRAC drawing in linear time.