ABSTRACT. We study the PLU decomposition of rectangular matrices
with polynomial coefficients, depending on parameters, possibly
subject to algebraic constraints.
Computing such a parametric decomposition typically requires splitting the computation into cases.
This raises various computational challenges, in particular the fact that more cases than necessary may be generated.
In this paper, we present heuristic methods that attempt to minimize the number of cases.
In particular, these methods try to avoid splitting the computation whenever possible.
We implemented these methods in Maple and evaluated
their benefits on various test problems coming from the literature.
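As a simple illustration (not taken from the paper), consider the parametric matrix below; the usual LU elimination is only valid when the (1,1) entry is nonzero, so a computation over the parameter a naturally splits into two cases, one of which needs a row permutation:
\[
A(a)=\begin{pmatrix} a & 1 \\ 1 & 2 \end{pmatrix}, \qquad
a \neq 0:\; A=\begin{pmatrix} 1 & 0 \\ a^{-1} & 1 \end{pmatrix}\begin{pmatrix} a & 1 \\ 0 & 2-a^{-1} \end{pmatrix}, \qquad
a = 0:\; A=\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}.
\]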
An Abstraction-Preserving Block Matrix Implementation in Maple
ABSTRACT. An implementation supporting recursive block, or partitioned, matrices in Maple is described. A data structure is proposed and support functions are defined. These include constructor functions, basic operations such as addition and product, and inversion. The package is demonstrated by calculating the LU factors of a block-defined matrix.
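As an illustration of the block LU factorization such a package computes, here is a minimal sketch in Python/NumPy (not the package's Maple code): a 2x2 block matrix factors through the Schur complement of its leading block, assuming that block is invertible.

import numpy as np

def block_lu(A, B, C, D):
    """Block LU of M = [[A, B], [C, D]], assuming the leading block A is invertible.
    Returns (L, U) with M = L @ U, where L is block unit lower triangular and
    U is block upper triangular with the Schur complement D - C A^{-1} B."""
    Ainv = np.linalg.inv(A)
    S = D - C @ Ainv @ B                      # Schur complement of A in M
    n, m = A.shape[0], D.shape[0]
    L = np.block([[np.eye(n), np.zeros((n, m))],
                  [C @ Ainv,  np.eye(m)]])
    U = np.block([[A,                B],
                  [np.zeros((m, n)), S]])
    return L, U

# usage sketch with hypothetical blocks
A = np.array([[4., 1.], [2., 3.]]); B = np.array([[1.], [0.]])
C = np.array([[0., 1.]]);           D = np.array([[5.]])
L, U = block_lu(A, B, C, D)
assert np.allclose(L @ U, np.block([[A, B], [C, D]]))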
Some Family Relations for Rational Orthogonal Matrices
ABSTRACT. We continue developing the project on working with rational orthogonal matrices. Previous work presented at SYNASC is improved by initiating a database of matrices. This requires two developments: a systematic method for generating matrices, and a method for organising the matrices and presenting the database in a form accessible to users.
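One standard way to generate rational orthogonal matrices (shown here only as an illustration; the paper's own generation method may differ) is the Cayley transform Q = (I - S)(I + S)^{-1} of a rational skew-symmetric matrix S, which is orthogonal whenever I + S is invertible. A sketch using SymPy, which keeps exact rational arithmetic:

from sympy import Matrix, Rational, eye

def cayley_orthogonal(S):
    """Cayley transform of a skew-symmetric rational matrix S.
    Q = (I - S) * (I + S)^{-1} is orthogonal and has rational entries."""
    I = eye(S.shape[0])
    return (I - S) * (I + S).inv()

S = Matrix([[0, Rational(1, 2)], [Rational(-1, 2), 0]])   # skew-symmetric, rational
Q = cayley_orthogonal(S)
assert Q.T * Q == eye(2)   # exact check: Q is orthogonal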
A Detailed Survey of Textural Features in Breast Cancer Detection Using Deep Learning
ABSTRACT. Early cancer diagnosis is crucial for effective health management. This paper explores various approaches for the early prognostication of breast cancer. We focus on applying textural features and Convolutional Neural Networks (CNNs) to develop a model distinguishing between normal and abnormal classes. Furthermore, the model identifies and differentiates benign tumours from malignant ones within the abnormal cases. Textural features present a viable option for early diagnosis due to their fast computation and their usefulness in cancer staging.
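As an illustration of the kind of textural features involved, here is a generic GLCM/Haralick sketch (not the paper's pipeline; the scikit-image function names below are assumed to match your installed version, where older releases spell them greycomatrix/greycoprops):

import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(image_u8):
    """Compute a few classic Haralick-style textural features of an 8-bit grayscale patch."""
    glcm = graycomatrix(image_u8, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    return {prop: graycoprops(glcm, prop).mean()
            for prop in ("contrast", "homogeneity", "energy", "correlation")}

patch = (np.random.rand(64, 64) * 255).astype(np.uint8)   # placeholder for a mammogram patch
print(glcm_features(patch))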
An Analysis of Multitask Deep Learning Models for Histopathology
ABSTRACT. Artificial Intelligence has the potential to streamline and facilitate numerous processes in the medical field, increasing the quality of life for millions and potentially saving lives. One area which requires a lot of effort when it comes to diagnosing severe diseases is Histopathology. Recently, a modern learning strategy, Multitask learning, was applied in Digital Histopathology in order to obtain relevant medical information from histological images. This paradigm is able to increase performance by learning multiple objectives simultaneously, resulting in more general features. The resulting methods reduce overfitting and computational complexity while increasing data efficiency, making them a suitable choice for the high-dimensionality, low-sample-size datasets encountered in the medical field. The aim of this work is to present novel multitask approaches with applications in Histopathology, analyse them, showcase their advantages and drawbacks, and identify possible future research directions.
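A minimal sketch of the multitask pattern described above (hypothetical tasks and tensor sizes, written in PyTorch purely for illustration, not any model from the paper): a shared encoder learns common features while separate heads optimize several objectives at once.

import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Shared convolutional encoder with two task-specific heads
    (e.g., tissue classification and tumour grading)."""
    def __init__(self, n_classes_task_a=2, n_classes_task_b=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head_a = nn.Linear(32, n_classes_task_a)
        self.head_b = nn.Linear(32, n_classes_task_b)

    def forward(self, x):
        z = self.encoder(x)                      # shared features
        return self.head_a(z), self.head_b(z)    # one output per task

model = MultiTaskNet()
out_a, out_b = model(torch.randn(4, 3, 64, 64))
# the multitask loss is typically a weighted sum, e.g. loss = w_a * ce(out_a, y_a) + w_b * ce(out_b, y_b)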
Prepared for Lift-off: Hybrid CNN-LSTM Architecture for Aircraft Engine Remaining Useful Life Estimation
ABSTRACT. Equipment failure is the leading cause of production halts, which can be avoided through Predictive Maintenance indicators, such as the Remaining Useful Life (RUL). This task is particularly noteworthy in domains like aircraft maintenance, where undetected faults can result in loss of human life. To address this, we propose a solution that integrates preprocessing techniques with a hybrid architecture consisting of a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network. We rely on a piece-wise linear regression function to pinpoint the start of degradation for turbofan engines used in contemporary aviation. Using this method alongside classical preprocessing strategies, the versatile CNN-LSTM hybrid exhibits state-of-the-art predictive capabilities on the chosen dataset. Performance evaluations, in terms of RMSE, PHME, MAE and R-squared, are computed across all subsets of the NASA C-MAPSS dataset. Our methodology showcases an RMSE decrease of at least 50% compared to other studied approaches in the existing literature. The full implementation code is available at https://github.com/ariandra34/Prepared-for-Lift-off.
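A minimal sketch of the piece-wise linear RUL target commonly used for C-MAPSS (the cap of 125 cycles is an assumption for illustration; the paper may choose a different knee point):

import numpy as np

def piecewise_linear_rul(total_cycles, max_rul=125):
    """Target RUL for every cycle of one engine: constant during the healthy phase,
    then decreasing linearly to 0 at failure."""
    remaining = np.arange(total_cycles - 1, -1, -1)   # true cycles left at each step
    return np.minimum(remaining, max_rul)             # cap the healthy phase

rul = piecewise_linear_rul(200)
print(rul[:3], rul[-3:])   # [125 125 125] ... [2 1 0]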
ABSTRACT. The Theorema system (https://www3.risc.jku.at/research/theorema/software) is a complex framework built upon Mathematica (www.wolfram.com/mathematica) that supports the processes of defining mathematical theories, including the definition of algorithms through logical formulae, experimentation by running the algorithms, and the development and use of mechanical provers. The system enhances the synthesis and the certification of algorithms as they are implemented directly in predicate logic along with their specifications, bypassing the need for a programming language. A distinctive feature of the Theorema system is its use of a natural style, similar to human expression, for presenting logical formulae and algorithms, formulating the inference rules of the provers, and displaying proofs.
The tutorial will be based on a live demo of the system showing how to implement and execute definitions, and how to automatically generate proofs by using the existing provers in the system.
ABSTRACT. Many theorems in mathematics have the form of an implication, an equivalence, or an
equality, and in the standard prover in the Theorema system such formulas are handled by rewriting. Definitions of
new function or predicate symbols are yet another example of formulas that require rewriting in their treatment in
the Theorema system. Both theorems and definitions in practice often carry conditions under which they are valid.
Rewriting is, thus, only valid in cases where all side-conditions are met. On the other hand, many of
these side-conditions are trivial and when presenting a proof we do not want to distract the reader with lengthy
derivations that justify
the side-conditions. The goal of this paper is to present the design and implementation of a mechanism that efficiently checks side-conditions
in rewriting while preserving the readability and the explanatory power of a mathematical proof, which has always been of central
interest in the development of the Theorema system.
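As a simple illustration (not from the paper) of a rewrite rule with a side-condition, consider the knowledge-base formula
\[
\forall x\; \bigl( x \neq 0 \;\Rightarrow\; \tfrac{x}{x} = 1 \bigr).
\]
Using it to rewrite an occurrence of (a+b)/(a+b) to 1 inside a proof is only sound if the prover can also discharge the side-condition a + b \neq 0; the mechanism described here checks such conditions without cluttering the presented proof.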
ABSTRACT. Technical debt prediction involves anticipating problems and challenges in software projects. Because of poor coding choices, these projects are often far from an optimal design. These issues can accumulate over time and affect long-term maintainability and stability.
Numerous studies are available for predicting technical debt. For instance, some studies utilize static analysis tools like Pylint, and others use dynamic code analysis tools like SZZ. This study addresses AI predictions using data obtained from static analysis tools.
This paper presents technical debt prediction using static code analysis data generated with SonarQube and Radon. It covers both short-term and long-term predictions by analyzing code quality metrics such as complexity, duplication, and adherence to standards. The study also integrates git commit classification to analyze commit history, providing valuable insights into the evolution and management of technical debt within the project.
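A minimal sketch of this kind of pipeline (hypothetical feature names; Radon's cc_visit is assumed to be available, and the regressor merely stands in for whatever model the study actually uses):

from radon.complexity import cc_visit            # cyclomatic complexity per function
from sklearn.ensemble import RandomForestRegressor

def complexity_features(source_code: str):
    """Aggregate Radon's per-function cyclomatic complexity into simple features."""
    scores = [block.complexity for block in cc_visit(source_code)] or [0]
    return [sum(scores), max(scores), len(scores)]   # total, worst, number of functions

# X: one feature row per commit/snapshot, y: measured technical debt
# (e.g. SonarQube remediation effort); duplication_pct and violations are hypothetical columns.
# X = [complexity_features(src) + [duplication_pct, violations] for src, ... in snapshots]
# model = RandomForestRegressor().fit(X, y)        # short- or long-term debt prediction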
Preemptive Phased Execution Models with Scratchpad Memory Requirements
ABSTRACT. There is currently a widespread interest in using
phased execution models in the design of real-time systems,
stimulated by their ability to reduce contention in accessing
shared resources. The main beneficiary of such approaches is the
wide category of Commercial Off-The-Shelf (COTS) multi-core
systems, where complexity is an underlying issue. However, the
advantage of predictability comes at a cost, which often resides in
the additional decisions that are usually made when using phased
execution models. One such decision is employing non-preemptive
scheduling; as is well known, this leads to NP-hardness for most
instances of the scheduling problem, while preemption can avoid
this level of complexity in many cases. It is no surprise, then, that
recent efforts have attempted to explore the possibilities offered
by preemptive scheduling in combination with phased execution
models.
Unlike existing work, the current paper focuses on the
scratchpad memory consumption in preemptive phased execution
models. The starting point is the model introduced in [25],
which was meant for the general analysis of the efficiency of
preemptive scheduling in phased execution models. Notably,
two methods for handling intermediate data are introduced,
namely, the Waiting Time Minimizing Preemption (WMPM) and
the Overhead Minimizing Preemption (OMPM). The current
paper analyses the requirements of each approach in terms of
scratchpad memory. It also introduces a hybrid method, in order
to achieve a finer balance between memory consumption and
schedulability constraints, and comes with a new heuristic to
improve the efficiency of scheduling.
Concepts involved in creating a viewer for disassembly
ABSTRACT. Disassembly is the process that translates machine code into a higher-level and more human-readable form. From the security analysis and malware detection perspective, it is an essential technique. However, traditional disassembly tools often display the information in a textual format, thus making it difficult to navigate and understand.
In this paper, we explore the concepts involved in creating a viewer for disassembly that focuses on static analysis and boosts the user experience by clearly displaying the code, showcasing relevant parts, guiding the user, offering an interactive experience, mapping data structures, and integrating seamlessly with other tools. Furthermore, we emphasize the importance of having an interface that is both user-friendly and customizable without conceding overall speed and performance. In this manner, even a novice in security analysis can have a positive experience. Considering all these features, the final result leads to a more productive and efficient analysis.
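For context, the flat textual listing such a viewer improves upon is typically produced along the lines of this short Capstone sketch (hypothetical byte string; this is not part of the viewer described in the paper):

from capstone import Cs, CS_ARCH_X86, CS_MODE_64

code = b"\x55\x48\x89\xe5\x5d\xc3"          # push rbp; mov rbp, rsp; pop rbp; ret
md = Cs(CS_ARCH_X86, CS_MODE_64)
for insn in md.disasm(code, 0x1000):
    print(f"0x{insn.address:x}:\t{insn.mnemonic}\t{insn.op_str}")
# a viewer layers navigation, highlighting and cross-references on top of this raw listing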
ABSTRACT. Sum-Product Networks (SPNs) are a significant and novel development within the probabilistic graphical models domain, uniting the representational capabilities of deep learning with effective probabilistic inference. Our focus is on presenting the unique structural properties of SPNs along with their theoretical foundations and applying them to diverse real-life datasets. Despite their promising prospects, only a few existing applications use this technology.
To illustrate how this method can be used in practice, we have applied the model to three different datasets: the popular MNIST dataset, the Fashion MNIST dataset, and the Fast, Furious and Insured dataset, respectively, through the libspn_keras library.
In the first two datasets, we used SPNs to classify hand-written digits and fashion products, respectively, demonstrating their effectiveness within a standard image classification task. The models' architecture incorporates SPN-specific layers together with conventional convolutional layers, thereby achieving remarkable performance measures.
Results show that SPNs form a strong basis for theoretical exploration and practical application to probabilistic modeling along with deep learning. This paper adds to the common understanding of SPN applications and, at the same time, highlights possible further areas of study within the same field.
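For readers unfamiliar with SPNs, the standard evaluation rules (textbook definitions, not specific to this paper) are: a sum node outputs a weighted mixture of its children and a product node outputs the product of its children over disjoint scopes, so the whole network computes an (unnormalized) probability in time linear in its size:
\[
S_{\oplus}(x) = \sum_{c \in \mathrm{ch}(\oplus)} w_c\, S_c(x), \qquad
S_{\otimes}(x) = \prod_{c \in \mathrm{ch}(\otimes)} S_c(x), \qquad
w_c \ge 0, \ \text{typically } \textstyle\sum_c w_c = 1 .
\]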
Learning Dispatching Rules for Flexible Assembly Scheduling Problems Using Deep Reinforcement Learning
ABSTRACT. The Job Scheduling Problem and its extensions have been extensively investigated in the past decade, utilizing various solving methods such as heuristics, metaheuristics, and even deep reinforcement learning (DeepRL) and Graph Neural Networks (GNN). Unfortunately, real-world scheduling problems are difficult to solve efficiently, leading to significant time consumption when attempting to solve large industrial instances. While heuristic-based dispatching methods offer a way to compute schedules within a reasonable time frame, creating effective heuristics that produce satisfactory solutions is challenging and laborious.
In this work, we propose solving the Assembly Scheduling Problems (ASP) by using DeepRL, according to the needs of a manufacturer involved in an academia-industry collaborative project. The starting point of our work is the Schlably framework (https://github.com/tmdt-buw/schlably). We aimed to extend its capabilities to accommodate in-tree precedence relations characteristic of ASP. Additionally, we implemented several other dispatching rules, further enriching the framework's versatility. Finally, we performed an in-depth evaluation of these enhancements, shedding light on their effectiveness.
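For concreteness, a dispatching rule of the kind such frameworks implement might look like the following shortest-processing-time sketch (purely illustrative; the Schlably API and the rules actually added in the paper are not reproduced here):

def spt_dispatch(ready_operations):
    """Shortest Processing Time rule: among operations whose predecessors
    (including in-tree assembly predecessors) are finished, pick the quickest."""
    return min(ready_operations, key=lambda op: op["processing_time"])

ready = [{"job": 1, "processing_time": 7},
         {"job": 2, "processing_time": 3},
         {"job": 3, "processing_time": 5}]
print(spt_dispatch(ready))   # {'job': 2, 'processing_time': 3}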
The Nurse Rostering Problem solved using Machine Learning solutions – state of the art and promising future research directions
ABSTRACT. This paper provides an overview of state-of-the-art Machine Learning methods for solving the Nurse Rostering Problem and highlights promising future directions. This survey differs from existing ones in being the first to capture the emergence of Machine Learning techniques in this field. This rise is marked, for example, by the exploration of less commonly utilized learning paradigms (imitation, contrastive) and by the increasing usage of Deep Learning architectures such as Graph Neural Networks for encoding the problems' constraints and other relevant information. This survey represents work in progress that has been submitted to a journal and must not be published in the conference volume or elsewhere.
Entropy-Driven Visualization in GView: Unveiling the Unknown in Binary File Formats
ABSTRACT. The number of known malicious samples has increased exponentially in the last decade, making it more complicated for a security researcher to identify and cluster them quickly. While in most scenarios data viewers rely on the file type (either binary or textual) to extract meaningful data, the protective measures of different malicious payloads may make this task difficult.
Our research introduces an approach that develops an enhanced visualization mode within the open-source framework GView to address this challenge. It harnesses established entropy-based analytical principles to facilitate the identification of anomalies and intrinsic properties in binary data, irrespective of specific file formats.
This methodology streamlines threat evaluation and contributes to the broader field of cybersecurity by offering a building block for a scalable solution for analyzing the ever-expanding volumes of digital information.
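The core computation behind such an entropy view can be sketched as follows (a generic Shannon-entropy sliding window, not GView's actual implementation):

import math
from collections import Counter

def window_entropy(data: bytes, window: int = 256, step: int = 256):
    """Shannon entropy (bits per byte, 0..8) of consecutive windows of a file.
    High-entropy regions often indicate packed or encrypted payloads."""
    for off in range(0, max(len(data) - window + 1, 1), step):
        chunk = data[off:off + window]
        counts = Counter(chunk)
        h = -sum((c / len(chunk)) * math.log2(c / len(chunk)) for c in counts.values())
        yield off, h

# usage: for offset, h in window_entropy(open("sample.bin", "rb").read()): ...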
A software engineering approach into analyzing Microsoft Office-based email attacks
ABSTRACT. Phishing emails are among today’s most common attack vectors since most enterprises and consumer users still rely on email for day-to-day operations. At the same time, some functionalities (such as macro support and Visual Basic programming) available in the Microsoft Office Suite make it easier for an attacker to inject malicious code into a document that can be sent via email. From a research perspective, analyzing such an attack implies understanding not only the artifacts that compose the kill-chain steps but also being able to extract the embedded macros and scripts from a document.
This paper focuses on building such a system using the forensics platform GView by adding various plugins and functionalities designed to augment the existing support and allow a security researcher to analyze the content of an email and attached documents quickly.
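The macro-extraction step referred to above is commonly performed with oletools; a minimal sketch (the olevba API is assumed to behave as below, and this is not the GView plugin itself):

from oletools.olevba import VBA_Parser

def extract_macros(path):
    """Yield the VBA source embedded in an Office document, if any."""
    parser = VBA_Parser(path)
    if parser.detect_vba_macros():
        for _, _, vba_filename, vba_code in parser.extract_macros():
            yield vba_filename, vba_code
    parser.close()

# usage: for name, code in extract_macros("attachment.docm"): print(name); print(code)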
ABSTRACT. Testing and validating embedded software is paramount when it comes to safety-critical systems. BTC EmbeddedPlatform® supports these activities using various approaches including test case definition means. In this system description we present ScriptView, a domain-specific language designed to support test case implementation in a simple programmatic way. The language has been integrated in our platform and can be used in production by our customers.
Software Model Checking for Memory Consistency Verification
ABSTRACT. The memory model describes memory consistency requirements in a multithreading system. Compiler optimizations may violate the consistency requirements due to bugs, and the program behavior will differ from the required one.
Bugs in compiler optimizations, like incorrect instruction reordering, are very difficult to detect because they may occur with a very low chance in real execution on hardware.
There are different approaches to formal verification for memory consistency requirements, but the challenge is that the approaches are not scalable for industrial software. In the paper, we present the MCC tool that was evaluated on the industrial virtual machine ARK VM and was able to find a real bug in a compiler optimization.
MCC is a static tool that allows checking all possible executions of a particular test without relying on hardware execution. The completeness of the approach depends on the test suite and the specified memory consistency properties.
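As an example of the kind of property such a checker verifies, consider the classic store-buffering litmus test, written here as a Python sketch purely for illustration (a plain Python run will not exhibit the forbidden outcome; a tool of this kind explores all executions allowed by the memory model):

# Shared variables, all initially 0.
x = y = r1 = r2 = 0

def thread_1():
    global x, r1
    x = 1          # store
    r1 = y         # load

def thread_2():
    global y, r2
    y = 1          # store
    r2 = x         # load

# Under sequential consistency the outcome r1 == 0 and r2 == 0 after both threads
# finish is forbidden; a buggy compiler/VM optimization that reorders the store
# after the load in either thread makes it reachable, which is exactly the kind of
# memory-consistency violation a static checker aims to detect.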
Efficiently Optimizing Crane Scheduling: A Rule-Based Approach for Dynamic and Uncertain Environments
ABSTRACT. In logistics and industrial operations, optimizing crane scheduling is crucial for efficiency and resource allocation. Given the dynamic nature of crane scheduling environments, influenced by factors such as fluctuating order volumes and varying crane availability, robust scheduling solutions are imperative.
This paper introduces a rule-based approach designed to optimize crane scheduling within a simulated dynamic and uncertain environment, focusing on real-time responsiveness, efficient task allocation, and continuous operational efficiency. Our approach was developed while participating in the `Crane Scheduling' track of the `Dynamic Stacking Optimization in Uncertain Environments' (DynStack) competition, held with the Genetic and Evolutionary Computation Conference (GECCO) 2024. Our experimental results underscore the efficacy of our approach in achieving zero delivery errors, maximizing the handling of blocks, and minimizing parking durations, thereby promoting cost-effectiveness and environmental sustainability in crane operations. As of June 30, 2024, our approach is ranked first on the competition leaderboard across both testing simulation environments.
Dynamic Stacking Optimization in Uncertain Environments: a Rolling Mill Complex System
ABSTRACT. Dynamic stacking optimization is of critical importance in industries where timely production and delivery are essential. This process involves the strategic scheduling of crane operations to relocate products while adhering to stringent time constraints. This paper presents a study on dynamic stacking optimization for the Rolling Mill track of the DynStack competition hosted at GECCO 2024. The primary objective is to develop a heuristic-based approach to minimize rolling program mess-ups, blocked mill time, and crane manipulations. The challenges posed by the uncertainty in block arrivals and the need for real-time decision-making are addressed. Various heuristic and optimization techniques were explored, with an emphasis on enhancing crane efficiency and block sorting accuracy.
Synthetic dataset generation for edge drone inference and training
ABSTRACT. This research investigates the generation of synthetic aerial images for training deep neural networks, particularly targeting the task of object detection in mostly top-down views. The experiment leveraged a simulation environment to create a complex dataset, addressing the challenges associated with collecting real-world aerial images. The work encompassed several critical components, including algorithm parameter variation for ground truth annotation, the exploration of bounding box generation techniques, and an investigation into camera orientations and altitude to simulate real-world scenarios. Additionally, the research delved into the training of neural network models specifically tailored for low-power devices, particularly the Coral Edge TPU. The findings of this research indicate the feasibility of utilizing synthetic environments and datasets to train neural networks for aerial image analysis. The trained model showcased promising results, with the potential for broader applications in object detection. The study also addressed safety considerations, emphasizing the need for high-resolution cameras in drone operations to detect objects, such as human bodies, from safe distances. Furthermore, the paper outlines a roadmap for future work, highlighting areas such as multi-class object detection, integration with autopilot systems, and real-world testing. This research serves as a foundational step toward harnessing synthetic data and deep learning for enhanced drone capabilities in diverse scenarios, with a commitment to safety and compliance with aviation regulations.
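One common bounding-box generation technique in synthetic pipelines can be sketched simply: derive axis-aligned boxes from the per-object segmentation masks the simulator already provides (generic NumPy code, not the project's pipeline):

import numpy as np

def mask_to_bbox(mask: np.ndarray):
    """Axis-aligned bounding box (x_min, y_min, x_max, y_max) of a binary object mask,
    or None if the object is not visible in the rendered frame."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

mask = np.zeros((100, 100), dtype=bool)
mask[20:40, 50:70] = True                 # simulated ground-truth mask of one object
print(mask_to_bbox(mask))                 # (50, 20, 69, 39)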
ABSTRACT. The paper presents an approach to the design and implementation of a multi-agent system capable of effectively performing Multi-Agent Pathfinding (MAPF) within complex abstract and realistic simulated environments. It offers a solution for MAPF in an abstract environment in which the agents find the path to a specified goal using two variants of the Rapidly exploring Random Tree (RRT) algorithm, namely RRT with motion primitives and RRT*. Then it shows how we designed MAPF for several agents with different features in a Gazebo simulated environment using ROS2 and reports on our experimental results. We also highlight the challenges of extending the Gazebo environment to cope with more than one agent. The results obtained from the simulated environment have shown that the proposed solution is capable of generating efficient, collision-free paths for multiple agents.
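A compressed sketch of the basic RRT loop referred to above (2-D, obstacle-free, hypothetical parameters; the paper's variants add motion primitives and RRT*'s rewiring on top of this idea):

import math, random

def rrt(start, goal, step=0.5, goal_tol=0.5, iters=2000, xmax=10.0, ymax=10.0):
    """Grow a rapidly-exploring random tree from start towards goal.
    Returns the tree nodes and a parent map from which a path can be read off."""
    nodes, parent = [start], {0: None}
    for _ in range(iters):
        sample = (random.uniform(0, xmax), random.uniform(0, ymax))
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], sample))  # nearest node
        nx, ny = nodes[i]
        d = math.dist((nx, ny), sample)
        if d == 0:
            continue
        # steer from the nearest node towards the sample by at most one step
        new = sample if d <= step else (nx + step * (sample[0] - nx) / d,
                                        ny + step * (sample[1] - ny) / d)
        parent[len(nodes)] = i
        nodes.append(new)
        if math.dist(new, goal) < goal_tol:      # close enough to the goal
            break
    return nodes, parent

nodes, parent = rrt((0.0, 0.0), (9.0, 9.0))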
Efficient Performance Analysis of Modular Rewritable Petri Nets
ABSTRACT. Petri Nets (PN) are widely used as a strong formalism for modeling concurrent and distributed systems, but they face challenges in effectively modeling adaptive systems. To address this, we have created a formalization for 'rewritable' PT nets (RwPT) using Maude, a declarative language that maintains consistent rewriting logic semantics. Recently, we introduced a modular approach that utilizes algebraic operators to construct large RwPT models. This technique employs composite node labeling to preserve hierarchical organization through net rewrites and has proven effective.
Once stochastic parameters are added to the formalism, we present an automated process to derive a lumped CTMC from the quotient graph generated by a modular RwPT model.
To illustrate the effectiveness of our approach, we use a fault-tolerant manufacturing system as a case study.
A type system for data flow and alias analysis in ReScript
ABSTRACT. ReScript is a strongly typed language that targets JavaScript, as an alternative to gradually typed languages such as TypeScript. In this paper, we present a sound type system for data-flow analysis for a subset of the ReScript language, more specifically for a λ-calculus with mutability and pattern matching. The type system is a local analysis that collects information about how variables are used at each program point, as well as alias information.
Leveraging Slither and Interval Analysis to build a static analysis tool
ABSTRACT. Even though much progress has been made in identifying and mitigating smart contract vulnerabilities, we still often hear about coding or design issues leading to great financial losses. This paper presents the progress we have made towards finding defects that are sometimes only partially detected, or not detected at all, by state-of-the-art analysis tools. Although it is still in its incipient phase, we developed a working solution built on top of Slither\cite{slither} that uses interval analysis to evaluate the contract state by simulating the execution of each instruction. We present the current solution architecture in great detail, how it could be extended to other static analysis techniques, and how it can be integrated with other third-party tools. Our current benchmark contains smart contracts that highlight the potential of this approach for detecting certain code defects.
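The interval analysis at the core of such a tool can be illustrated with a minimal abstract domain (a generic sketch, not the Slither-based implementation): each variable is mapped to a [lo, hi] range, instructions update the ranges, and joins at control-flow merges widen them.

class Interval:
    """A closed integer interval [lo, hi] used as an abstract value."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def add(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def join(self, other):
        """Least upper bound at a control-flow merge."""
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

balance = Interval(0, 100)
deposit = Interval(1, 10)
print(balance.add(deposit))                 # [1, 110]
print(balance.join(balance.add(deposit)))   # [0, 110]  -- state after an if/else merge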
Static Analysis framework for detecting use-after-free bugs in C++
ABSTRACT. Pointers are a powerful but dangerous feature provided by the C and C++ programming languages, and incorrect use of pointers is a common source of bugs and security vulnerabilities. Making software secure is crucial, as vulnerabilities exploited by malicious actors lead not only to monetary losses but possibly to the loss of human lives. Fixing these vulnerabilities is costly if they are found at the end of development, and the cost will be even higher if they are found after deployment. That is why it is desirable for bugs to be found as early in the development process as possible. We propose a framework that can statically find use-after-free bugs at compile time and report the errors to the users. It works by tracking the lifetimes of objects and the memory locations pointers might point to; using this information, a possibly invalid dereference of a pointer can be detected. The framework was tested on over 100 handwritten small tests, as well as 5 real-world projects, and has shown good results in detecting errors, while at the same time highlighting some scenarios where false positive reports may occur. Based on the results, we conclude that our framework achieves its goals, as it is able to detect multiple patterns of use-after-free bugs and correctly report the errors to the programmer.
ABSTRACT. The talk presents several research projects and tools from Microsoft Research and their impact on programming secure and reliable systems. As a common basis, they take a formal-methods angle, where systems are viewed as mathematical objects and computation is considered through the lenses of calculi and measurements. We then describe how these research threads interleave with major developments from academic research and phase shifts in industry. With Microsoft rapidly pivoting to deploying and delivering AI products, the talk connects foundations with current projects, including the development of provably secure systems, securing smart contracts, network verification, efficient and correct compilation for ML systems, and programming systems and runtimes for interacting with AI.