Automatic Classification of Recurring Tasks in Software Development Projects
ABSTRACT. Background: Information about project tasks stored in issue tracking systems (ITS) can be used for project analytics or process simulation. Such tasks can be categorized as stateful or recurrent. Although automatic categorization of stateful tasks is relatively simple, doing the same for recurrent tasks constitutes a challenge. Aims: The goal of this study is to investigate the possibility of employing machine-learning algorithms to automatically categorize recurrent tasks in software projects based on information stored in ITS. Method: We perform a study on a dataset from six industrial projects containing 9,589 tasks and augment it with an additional dataset of 91,145 task descriptions from other industrial projects to up-sample minority classes during training. We perform ten runs of 10-fold cross-validation for each project and evaluate classifiers using a set of state-of-the-art prediction quality metrics, i.e., Accuracy, Precision, Recall, F1-score, and MCC. Our machine-learning pipeline includes a Transformer-based sentence embedder ('mxbai-embed-large-v1') and an XGBoost classifier. Results: The model automatically classifies software process tasks into 14 classes with MCC ranging between 0.69 and 0.88 (mean: 0.77). We observed higher prediction quality for the largest projects in the dataset and for those managed according to “traditional” project management methodologies. Conclusions: We conclude that machine-learning algorithms can effectively categorize recurrent tasks. However, this requires collecting a large, balanced dataset of ITS tasks or using a pre-trained model like the one provided in this study.
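Since the abstract names its concrete building blocks, a minimal sketch of such a pipeline may help: task descriptions are embedded with the mxbai-embed-large-v1 sentence embedder and classified with XGBoost under 10-fold cross-validation scored by MCC. The CSV file, column names, and hyperparameters below are illustrative assumptions, not the authors' exact setup.

```python
# Sketch of an embeddings + XGBoost pipeline as described in the abstract.
# "its_tasks.csv" and its columns are hypothetical; hyperparameters illustrative.
import pandas as pd
from sentence_transformers import SentenceTransformer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.preprocessing import LabelEncoder
from xgboost import XGBClassifier

tasks = pd.read_csv("its_tasks.csv")  # hypothetical ITS export
embedder = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
X = embedder.encode(tasks["description"].tolist())   # dense sentence vectors
y = LabelEncoder().fit_transform(tasks["category"])  # 14 recurring-task classes

clf = XGBClassifier(n_estimators=300, max_depth=6, eval_metric="mlogloss")
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(clf, X, y, cv=cv, scoring="matthews_corrcoef")
print(f"MCC per fold: {scores.round(2)}, mean: {scores.mean():.2f}")
```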
Leveraging GANs to Generate Synthetic Log Files for Smart-Troubleshooting in Industry 4.0
ABSTRACT. In this paper, we tackle the challenge of generating synthetic log files using generative adversarial networks (GANs) to support smart-troubleshooting experimentation. Log files are critical for implementing monitoring systems for smart-troubleshooting, as they capture valuable information about the activities and events occurring within the monitored system. Analyzing these logs is crucial for effective smart-troubleshooting, enhancing the overall efficiency, reliability, and security of smart manufacturing processes. However, accessing public log data is difficult due to privacy concerns and the need to protect sensitive information. Moreover, for the purpose of effective troubleshooting, it is essential to have datasets categorized as fault, error, or failure logs versus standard logs. In recent years, synthetic log files have emerged as a promising solution to augment limited real-world datasets and facilitate the development and evaluation of anomaly detection techniques. We leverage this state-of-the-art method to develop a specific log generation technique and dataset tailored for testing smart-troubleshooting techniques in heterogeneous connected systems environments, such as industrial cyber-physical systems and the Internet of Things. First, we propose a methodology that generates synthetic log files based on generative adversarial networks. Then, we instantiate this methodology using different GAN implementations and present a validation and a comprehensive comparative analysis of their performance. Finally, we provide a robust dataset for anomaly detection and threat analysis in cyberspace security. Based on the results of our comparison, CTGAN has shown superior performance in generating high-quality synthetic log files.
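For readers unfamiliar with CTGAN, the comparison winner above, the sketch below shows how tabular GAN-based log synthesis typically looks, using the open-source ctgan package. The parsed-log file and its columns are illustrative assumptions, not the authors' dataset; raw log lines must first be parsed into structured rows.

```python
# Minimal sketch of tabular GAN-based log synthesis with CTGAN.
import pandas as pd
from ctgan import CTGAN

logs = pd.read_csv("parsed_logs.csv")  # hypothetical: template_id, component, severity, label
discrete_columns = ["template_id", "component", "severity", "label"]

model = CTGAN(epochs=300)              # epochs chosen for illustration
model.fit(logs, discrete_columns)

synthetic_logs = model.sample(10_000)  # synthetic rows, incl. fault/error/failure labels
synthetic_logs.to_csv("synthetic_logs.csv", index=False)
```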
How to Measure Performance in Agile Software Development? A Mixed-Method Study
ABSTRACT. Context: Software process improvement (SPI) is known as a key to success in software development. Measuring quality and performance is of high importance in agile software development, as agile approaches focus strongly on short-term success in dynamic markets. Even though software engineering research emphasizes the importance of performance metrics when using agile methods, the literature lacks detail on how to apply such metrics in practice and what challenges may occur while using them. Objective: The core objective of our study is to identify the challenges that arise when using agile software development performance metrics in practice and how their successful application can be improved. Method: We designed a mixed-method study. First, we performed a rapid literature review to provide an up-to-date overview of the performance metrics in use. Second, we conducted a single case study using a focus group approach and qualitative data collection and analysis in a real-world setting. Results: Our results show that while performance metrics such as story points and burn-down charts are widely used in practice, agile software development teams face challenges due to a lack of transparency and standardization as well as insufficient accuracy. Contributions: Based on our findings, we present a repository of widely used performance metrics for agile software development. Furthermore, we present implications for practitioners and researchers, especially on how to deal with the challenges agile software development teams face when applying such metrics in practice.
DevOps Value Flows In Software-Intensive System of Systems
ABSTRACT. DevOps has become a widely adopted approach in the software industry, especially among companies developing web-based applications. The main focus of DevOps is to address social and technical bottlenecks along the software flow, from the developers' code changes to delivering these changes to the production environments used by customers. However, DevOps does not consider the software flow's content, e.g., new features, bug fixes, or security patches, and the customer value of each content. In addition, DevOps assumes that a streamlined software flow leads to a continuous value flow, as customers use the new software and extract value-adding content intuitively. However, in a Software-intensive System of Systems (SiSoS), customers need to understand the content of the software flow to validate, test, and adopt their operation procedures before using the new software. Thus, while DevOps has been extensively studied in the context of web-based applications, its adoption in SiSoS is a relatively unexplored area. Therefore, we conducted a case study at a multinational telecommunications provider focusing on 5G systems. Our findings reveal that DevOps has three sub-flows: legacy, feature, and solution. Each sub-flow has distinct content and customer value, requiring a unique approach to extracting it. Our findings highlight the importance of understanding the software flow's content and how each content's value can be extracted when adopting DevOps in SiSoS.
Evaluating the Role of Security Assurance Cases in Agile Medical Device Development
ABSTRACT. Cybersecurity issues in medical devices threaten patient safety and can cause harm if exploited. Standards and regulations therefore require vendors of such devices to provide an assessment of the cybersecurity risks as well as a description of their mitigation. Security assurance cases (SACs) capture these elements as a structured argument. Compiling an SAC requires taking domain-specific regulations and requirements as well as the way of working into account. In this case study, we evaluate CASCADE, an approach for building SACs, in the context of a large medical device manufacturer with an established agile development workflow. We investigate the regulatory context as well as the adaptations needed in the development process. Our results show the suitability of SACs in the medical device industry. We identified 17 use cases in which an SAC supports internal and external needs. The connection to safety assurance can be achieved by incorporating information from the risk assessment matrix into the SAC. Integration into the development process can be achieved by introducing a new role and rules for the design review and the release to production, as well as additional criteria for the definition of done. We also show that SACs built with CASCADE fulfill the requirements of relevant standards in the medical domain, such as ISO 14971.
How Industry Tackles Anomalies during Runtime: Approaches and Key Monitoring Parameters
ABSTRACT. Deviations from expected behavior during runtime, known as anomalies, have become more common due to systems' growing complexity, especially in microservices. Consequently, analyzing runtime monitoring data, including logs, traces for microservices, and metrics, poses challenges due to the sheer volume of monitoring data collected and the deep understanding of that data required to define adequate rules or AI algorithms that reliably detect unforeseen anomalies.
This paper seeks to comprehend anomalies and current anomaly detection approaches across diverse industrial sectors. Additionally, it aims to pinpoint the parameters necessary for identifying anomalies via runtime monitoring data.
Therefore, we conducted semi-structured interviews with fifteen industry participants who rely on anomaly detection during runtime. Additionally, to supplement information from the interviews, we performed a literature review focusing on anomaly detection approaches applied to industrial real-life datasets.
Our paper (1) demonstrates the diversity of interpretations and examples of software anomalies during runtime and (2) explores the reasons behind choosing rule-based methods in the industry over self-developed AI methods, which are highly prominent in published industry-related papers. Furthermore, we (3) identify key monitoring parameters collected during runtime (logs, traces, and metrics) that assist practitioners in detecting anomalies during runtime without introducing bias in their anomaly detection approach due to inconclusive parameters.
ABSTRACT. Data is key for rapid and continuous delivery of customer value. By collecting data from products in the field, companies in the embedded systems domain can measure and monitor product performance, and they gain the opportunity to provide customers with insights and data-driven services. However, while the notion of data-driven development is not new, embedded systems companies are facing a situation in which data volumes are growing exponentially, and this is not without its challenges. Suddenly, the cost of collecting, storing, and processing data becomes a concern, and while there is prominent research on different aspects of data-driven development, there is little guidance on how to reason about the business value versus the cost of data. In this paper, we present findings from case study research conducted in close collaboration with four companies in the embedded systems domain. The case companies share the challenge of rapidly increasing data volumes, and they experience difficulties in reasoning about the value versus the cost of the data they collect. The contribution of this paper is a framework that provides a holistic understanding of the multiple dimensions that need to be considered when reasoning about the business value versus the cost of collecting, storing, and processing data.
ABSTRACT. As society continuously adapts to technological change and progress, fast-moving digital transformations are the driving force for setting the necessary skillsets for the workforce. Furthermore, the advent of Industry 5.0 as a defining concept for the future, which advocates a human-centric coalescence of humans and technology or software, renders the skilled workforce the most important asset in any organization or business. The endgame of the digital transformation is to evoke the reshaping, evolution, or replacement of traditional and possibly obsolete processes at intra- or inter-organizational levels in multiple aspects, introducing innovative ways of re-defining the workforce. In this context, SKILLAB will act as a smart tool for handling, honing, and widening the competencies of company personnel, forecasting future skill gaps, and providing European citizens with a tool for upskilling and reskilling.
ABSTRACT. Developing and managing 6G software demands cutting-edge software engineering tailored for the complexity and vast numbers of connected edge devices. Our project aims to lead in developing sustainable methods and energy-efficient orchestration models specifically for edge environments, enhancing architectural support for edge computing and edge AI. This initiative seeks to position Finland at the forefront of the 6G landscape, focusing on sophisticated edge orchestration and robust software architectures to optimize the performance and scalability of edge networks. Collaborating with leading Finnish universities and companies, the project emphasizes deep industry-academia collaboration and international expertise to address critical challenges in edge orchestration and software architecture, aiming to drive significant advancements in software productivity and market impact.
ABSTRACT. Littering is a major problem that threatens the environment, society, and the economy. Keeping track of, monitoring, and regularly cleaning littering sites is a crucial problem that involves public authorities, municipalities, companies, and citizens. Various approaches have been proposed over the past few years, but they rely on expensive and advanced technologies that do not leverage the knowledge and capabilities deriving from the federation of multiple communities, such as cities, public bodies, and organizations. In this paper, we describe the COBOL project, a National PRIN PNRR project funded by the Italian MUR in 2023. The project aims to define a flexible framework for managing the waste disposal process through a federated learning architecture that collects and integrates the reports (e.g., annotated pictures and user feedback) shared by the multiple communities involved in the waste disposal process. To deliver an advanced waste disposal service based on the direct participation of citizens, COBOL also integrates Model-Driven Engineering principles, Computer Vision techniques, and Self-Adaptation mechanisms. Early results show that reports can be effectively collected and processed with COBOL.
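As an illustration of the federated-learning idea at the heart of COBOL (not the project's actual implementation), the sketch below shows the classic FedAvg aggregation step, in which per-community model weights are averaged without sharing the raw annotated pictures; the models and sizes are made up.

```python
# Illustrative FedAvg aggregation: communities train locally, a coordinator
# averages their weights in proportion to local dataset size.
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Aggregate per-community model weights, weighted by local dataset size."""
    total = sum(client_sizes)
    return [
        sum(w[layer] * (n / total) for w, n in zip(client_weights, client_sizes))
        for layer in range(len(client_weights[0]))
    ]

# Example: three communities, each with a two-layer model.
clients = [[np.ones((4, 4)), np.zeros(4)] for _ in range(3)]
global_weights = fed_avg(clients, client_sizes=[120, 80, 200])
```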
BOTQUAS: Blockchain-based Solutions for Trustworthy Data Sharing in Sustainable and Circular Economy
ABSTRACT. Monitoring business processes within complex supply chains demands efficient data collection and analytics tailored to diverse phenomena. Traditional centralized solutions face limitations in adapting to the dynamic nature of supply chains. This calls for distributed solutions that break the usual architectural assumption of having a central entity in charge of collecting and integrating data and offering analysis tools. This project, embedded in a larger initiative called MICS, proposes an innovative distributed monitoring solution integrating blockchain for a trustworthy and efficient data analytics strategy that preserves data sovereignty in complex collaborative environments. Leveraging the cloud-edge continuum, the solution aims to ensure secure data exchange, adherence to agreements, and real-time analytics. Expected outcomes include an innovative federated architecture, 5G slice management solutions, an adversarial analysis of supply chain security, and a proof-of-concept implementation of the blockchain-based data flow tracking system. These developments aim to enhance the reliability, security, and efficiency of supply chain monitoring in dynamic industrial environments.
The Trade-off Between Data Volume and Quality in Predicting User Satisfaction in Software Projects
ABSTRACT. Most predictive studies involving the ISBSG dataset used only high-quality cases according to the Data Quality Rating and UFP Rating and a few predictors with no or very few missing values.
This study investigated the trade-off between data volume and quality when predicting user satisfaction in software projects. Specifically, it explored whether machine learning models would perform better when trained using a larger dataset containing some portion of low-quality data, a smaller dataset with only high-quality data, or an intermediate setting.
A standardised accuracy, a “win-tie-loss” approach, and a matched-pairs rank biserial correlation coefficient were used to evaluate predictive performance. The rankings of data selection strategies for particular models were created using the Scott-Knott Effect Size Difference test. The robustness of the results was assessed using Kendall's W.
For most models, higher predictive accuracy was achieved when they were trained on a larger subset, even though it contained some low-quality data. For most models, the data selection strategies were robust to data splits. The ranks of the data selection strategies were stable across models.
Hence, a practical recommendation for predicting user satisfaction, especially when a dataset is small, is to train predictive models on a relatively high-volume subset despite some low-quality data. Provided rankings may be helpful when setting up future experiments on user satisfaction with the ISBSG dataset.
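For reference, a minimal sketch of the standardised accuracy (SA) measure named above, which relates a model's MAE to that of random guessing (in the sense of Shepperd and MacDonell's definition); the satisfaction scores and predictions below are made up for illustration.

```python
# Standardised Accuracy: SA = (1 - MAE_model / MAE_random_guess) * 100.
import numpy as np

rng = np.random.default_rng(0)

def standardised_accuracy(y_true, y_pred, n_guesses=1000):
    mae_model = np.mean(np.abs(y_true - y_pred))
    # MAE of random guessing: predict randomly drawn observed values.
    mae_guess = np.mean([
        np.mean(np.abs(y_true - rng.choice(y_true, size=len(y_true))))
        for _ in range(n_guesses)
    ])
    return (1 - mae_model / mae_guess) * 100

y_true = np.array([3.2, 4.1, 2.8, 5.0, 3.9])   # illustrative satisfaction scores
y_pred = np.array([3.0, 4.4, 3.1, 4.6, 3.7])
print(f"SA = {standardised_accuracy(y_true, y_pred):.1f}%")
```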
Best Practices for Resource Provisioning Declaration Within the Cognitive Cloud Continuum
ABSTRACT. The evolution of cloud computing, driven by advances in mobile and edge technologies and AI, has led to the development of the Cognitive Cloud Continuum (COCLCON). However, this paradigm introduces new challenges in managing and optimizing computing resources across a heterogeneous environment. This paper explores best practices for declaring resources within COCLCON, with a focus on efficient resource allocation and the transparent declaration of the resources available on devices.
In this study, we undertook a non-holistic literature review to identify the technologies currently used to specify requirements and to determine gaps in current best practices. The main outcome of our work is a proposed schema for Resource Provisioning Declaration, which increases the knowledge available about the devices and resources within the COCLCON.
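To make the idea concrete, here is a purely hypothetical sketch of the kind of information a Resource Provisioning Declaration could carry; the field names are our assumptions, not the schema proposed in the paper.

```python
# Hypothetical resource declaration for a COCLCON device (illustrative only).
from dataclasses import dataclass, field

@dataclass
class ResourceDeclaration:
    device_id: str
    tier: str                      # "cloud", "fog", or "edge"
    cpu_cores: int
    memory_mb: int
    accelerators: list[str] = field(default_factory=list)  # e.g. ["gpu", "npu"]
    network_bandwidth_mbps: float = 0.0
    energy_budget_w: float | None = None  # None if unconstrained

edge_cam = ResourceDeclaration("cam-017", "edge", cpu_cores=4, memory_mb=2048,
                               accelerators=["npu"], network_bandwidth_mbps=50.0)
```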
Continuous Training vs. Transfer Learning on Edge and Fog Environments: A Steam Detection Use Case
ABSTRACT. The implementation of smart manufacturing, which utilises advanced digital technologies to enhance the agility and productivity of the traditional manufacturing sector, has the potential to reduce resource consumption, optimise processes, and enhance safety. One challenge in process automation (PA) is its strict real-time requirements. One solution to this challenge is the use of Edge and Fog computing platforms with finite computational power, which bring processing and data storage closer to the data sources. This proximity of computing devices reduces latency and bandwidth requirements, relaxes the need for a reliable Internet connection, and provides more security by design than Cloud solutions.
This paper compares the performance of Edge and Fog computing for soft real-time, machine-learning-based visual process monitoring that supports the human operator. The objective is to better understand how this ML task can be relocated within the Edge and Fog layers. Moreover, the article discusses the emerging difficulties of practically implementing a Continuous Training pipeline and soft real-time steam detection.
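To clarify the two strategies being compared, a minimal sketch follows, using Keras purely for illustration (the paper's actual stack and model are not specified here): with transfer learning, a pretrained backbone is frozen and only a new head is trained, whereas continuous training keeps refitting the whole model on fresh data.

```python
# Transfer learning vs. continuous training for a binary steam detector (sketch).
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(include_top=False, weights="imagenet",
                                         input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # transfer learning: reuse features, train only the head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(1, activation="sigmoid"),  # steam / no-steam
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Continuous training instead keeps the whole model trainable and periodically
# re-fits on freshly labelled frames streamed from the production line:
#   base.trainable = True; model.fit(new_batch, epochs=1)
```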
Graph-based Anti-Pattern Detection in Microservice Applications
ABSTRACT. Features of microservice architectures, such as scalability, separation of concerns, and their ability to facilitate the rapid evolution of polyglot systems, have made them popular with large organizations employing many software developers. The features that make them attractive also create complexity and require maintenance over the evolution of an application, especially concerning application decomposition. Microservice architecture decomposition evolves together with the application and is prone to errors referred to as architectural anti-patterns over its lifetime. These can be difficult to detect and manage because of their informal natural language definitions and a lack of automated tooling.
In this paper, we introduce a graph-based methodology for the detection of architectural anti-patterns. To achieve this, we have created a new Granular Hardware Utilization-Based Service Dependency Graph (GHUBS) model, formal mathematical anti-pattern definitions of three example anti-patterns, language-agnostic detection algorithms, and finally a visualization of detected anti-patterns. The proposed methodology is tested and validated in a case study of a popular microservice benchmarking suite.
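As a simplified illustration of graph-based detection (the GHUBS model additionally carries hardware utilisation data), the sketch below flags one well-known anti-pattern, the cyclic dependency, on a toy service dependency graph using networkx; the services are made up.

```python
# Detect the cyclic-dependency anti-pattern on a toy service dependency graph.
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([
    ("frontend", "orders"), ("orders", "payments"),
    ("payments", "orders"),                 # cycle: orders <-> payments
    ("orders", "users"), ("frontend", "users"),
])

cycles = list(nx.simple_cycles(g))
if cycles:
    print("Cyclic-dependency anti-pattern detected:", cycles)

# Other anti-patterns can be expressed as graph predicates similarly, e.g.
# flagging pass-through services via in/out-degree checks on g.degree().
```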
The Metamorphic Lighthouse: Understanding the Input Data Space of Metamorphic Relations
ABSTRACT. Metamorphic Testing (MT) addresses the test oracle problem by defining how program outputs should change in response to specific input changes. The relations between input changes and their corresponding output changes are called Metamorphic Relations (MRs). Generating suitable MRs is complex and often requires deep domain knowledge. Our previous work introduced MetaTrimmer, a test-data-driven approach for selecting and constraining MRs, involving three steps: Test Data (TD) Generation, MT Process, and MR Analysis. MR Analysis is done to decide whether the violation of an MR for a specific input data pair (original and changed) indicates a failure or simply means that the MR does not apply for the chosen inputs.
In this paper, we present an association-rule-based approach that semi-automatically extracts constraints dividing the input space into valid/invalid data during the MR Analysis step of MetaTrimmer. We validate our approach using 44 methods to which six predefined MRs are applied. Our results indicate that the proposed method efficiently identifies correct input data space constraints. More studies are needed to provide additional evidence that MetaTrimmer with the enhanced MR Analysis step is scalable and generalisable.
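A toy example of the problem the MR Analysis step addresses may help: the MR "doubling the input doubles the output" holds for a tariff function only in its linear region, so MR violations outside that region are not failures. The function and constraint below are illustrative, not taken from the paper's 44 methods.

```python
# An MR that only applies on part of the input space (illustrative).
def shipping_cost(weight_kg: float) -> float:
    # flat bulk price above 100 kg breaks the linear relation
    return weight_kg * 2.0 if weight_kg <= 100 else 250.0

def mr_holds(x: float) -> bool:
    """MR: shipping_cost(2*x) == 2 * shipping_cost(x)."""
    return abs(shipping_cost(2 * x) - 2 * shipping_cost(x)) < 1e-9

print(mr_holds(30))  # True: both inputs in the linear region
print(mr_holds(80))  # False: 160 kg hits the bulk price, so the MR does not apply
# An extracted constraint such as "x <= 50" separates valid from invalid
# test-data pairs for this MR.
```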
Fault-Proneness of Python Programs Tested By Smelled Test Code
ABSTRACT. Software testing is one of the most crucial quality assurance activities, and test results are of great concern to software developers. However, quality assurance of the test code (test cases) itself is also critical, because a poor-quality test case may fail to detect latent faults and give developers false comfort regarding the test result. Code smells that threaten test code quality have been studied as "test smells." This paper investigates test smells in 775 Python open-source programs and reports the results of a quantitative analysis of whether test smells impact the fault-proneness of the production code under test. The analysis yields two findings. (1) Production code tested by test code having one of specific kinds of test smells (10 out of the 18 detectable ones) is more fault-prone than the rest. (2) The fault-proneness of production code tends to increase when the corresponding test code has two or more different kinds of test smells: over 75% of test smell combinations showed such a trend of increasing the risk of faulty production code.
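For readers unfamiliar with test smells, the sketch below shows one widely catalogued kind, Assertion Roulette, in which several unexplained assertions in a single test obscure the cause of a failure; the example is ours, not from the studied projects.

```python
# "Assertion Roulette" test smell: when test_order_total_smelly fails, the
# report does not say which property was wrong or why it matters.
def test_order_total_smelly(order):
    assert order.subtotal == 100
    assert order.tax == 8
    assert order.total == 108
    assert order.currency == "EUR"

# Refactored: one focused check with an explanatory message.
def test_order_total_refactored(order):
    assert order.total == 108, "total must equal subtotal (100) + tax (8)"
```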
Many a Little Makes a Mickle: On Micro-Optimisation of Containerised Microservices
ABSTRACT. Performance optimisation is widely recognised as a key to the success of microservices architecture. Correspondingly, a large number of studies have been conducted on optimising the orchestration or composition of multiple microservices within different application contexts. Unlike the existing efforts on such global optimisation, we are concerned with the internal optimisation of individual microservices. Considering the loosely coupled nature of individual microservices, their performance improvements can be independent of each other and thus will always benefit their composite applications. Driven by this intuition, together with the de facto tech stack, we have been working on the micro-optimisation of containerised microservices at the infrastructure-as-code level. Based on both theoretical discussions and empirical investigations, our most recent work delivered three micro-optimisation principles, namely just-enough containerisation, just-for-me configuration, and just-in-time compilation (during containerisation). Although these principles need to be further strengthened through enriched case studies, our current research outcomes have not only offered new ideas and practical strategies for optimising microservices but have also expanded the conceptual scope and the research field of software micro-optimisation.
Tapping in a Remote Vehicle's onboard LLM to Complement the Ego Vehicle's Field-of-View
ABSTRACT. Today's advanced automotive systems are turning into intelligent Cyber-Physical Systems (CPS), bringing computational intelligence to their cyber-physical context. Such systems power advanced driver assistance systems (ADAS) that observe a vehicle's surroundings for their functionality. However, such ADAS have clear limitations in scenarios where the direct line-of-sight to surrounding objects is occluded, as in urban areas. Imagine now automated driving (AD) systems that could ideally benefit from other vehicles' field-of-view in such occluded situations to increase traffic safety if, for example, the locations of pedestrians can be shared across vehicles. Current literature suggests vehicle-to-infrastructure (V2I) communication via roadside units (RSUs) or vehicle-to-vehicle (V2V) communication to address such issues by streaming sensor or object data between vehicles. Considering the ongoing revolution in vehicle system architectures towards powerful, centralized processing units with hardware accelerators, the onboard presence of large language models (LLMs), introduced to improve passengers' comfort when using voice assistants, becomes a foreseeable reality. We suggest and evaluate a concept to complement the ego vehicle's field-of-view (FOV) with another vehicle's FOV by tapping into its onboard LLM to let the machines have a dialogue about what the other vehicle "sees". Our results show that very recent versions of LLMs, such as GPT-4V and GPT-4o, understand a traffic situation to an impressive level of detail and, hence, can even be used to spot traffic participants. However, better prompts are needed to improve detection quality, and future work should target a standardised message interchange format between vehicles.
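A minimal sketch of the dialogue mechanism the abstract describes follows, shown with the OpenAI Python client purely for illustration; an onboard deployment would use a local multimodal LLM and a V2V transport rather than a cloud API, and the camera frame file is hypothetical.

```python
# Ego vehicle asks a remote vehicle's multimodal LLM about occluded road users.
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY in the environment

with open("remote_vehicle_frontcam.jpg", "rb") as f:  # hypothetical frame
    frame_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "List any pedestrians or cyclists you see and their "
                     "approximate position relative to your vehicle."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{frame_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)  # answer shared with the ego vehicle
```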
Towards Real-time Object Detection for Safety Analysis in an ML-Enabled System Simulation
ABSTRACT. Machine learning (ML)-equipped critical systems such as collaborative artificial intelligence systems (CAISs), where humans and intelligent robots work together in a shared space, are increasingly being studied and implemented in different domains. The complexity of these systems raises major safety concerns, because decisions for controlling the dynamics of the robot during interaction with humans must be made quickly, driven by the detection of potential risks, such as a collision between the robot and a human operator, using information obtained from sensors such as cameras or LIDAR. In this work, we explore and compare the performance of two You Only Look Once (YOLO) models, YOLOv3 and YOLOv8, which rely on convolutional neural networks (CNNs), for real-time object detection in a case study of a collaborative robot system simulation. The preliminary results show that both models achieve high accuracy (≥ 98%) and real-time performance, albeit requiring a GPU to run at speeds of around 40 FPS. The results indicate the feasibility of real-time object detection in a CAIS simulation implemented with the CoppeliaSim software.
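For the YOLOv8 side of the comparison, a minimal detection sketch with the ultralytics package is shown below; the frame file is a hypothetical stand-in for the CoppeliaSim vision-sensor stream.

```python
# YOLOv8 person detection on a single simulated frame (sketch).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                  # pretrained nano model
results = model("sim_frame.jpg", conf=0.5)  # GPU recommended for ~40 FPS

for box in results[0].boxes:
    cls_name = model.names[int(box.cls)]
    if cls_name == "person":                # human operator near the robot
        print("person at", box.xyxy.tolist(), "conf", float(box.conf))
```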
Log Frequency Analysis for Anomaly Detection in Cloud Environments at Ericsson
ABSTRACT. Log analysis monitors system behavior, detects errors and anomalies, and predicts future trends in systems and applications. However, with the continuous evolution and growth of systems and applications, the amount of log data generated is increasing rapidly. This causes an increase in the manual effort invested in log analysis for error detection and root cause analysis. Current automated log analysis techniques mainly concentrate on the messages displayed by the logs as one of the main features. However, the timestamps of the logs are often ignored, even though they can be used to identify temporal patterns between logs, which can form a key aspect of log analysis in itself. In this paper, we share our experiences of combining log-frequency-based analysis with log-message-based analysis, which helped reduce the volume of logs sent for manual analysis for anomaly detection and root cause analysis.
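A minimal sketch of the log-frequency idea: bucket log timestamps into fixed windows and flag windows whose volume deviates strongly from the norm, so that only those logs go on to manual analysis. The file format, window size, and threshold below are illustrative, not Ericsson's pipeline.

```python
# Flag time windows with anomalous log volume via a simple z-score rule.
import pandas as pd

logs = pd.read_csv("app_logs.csv", parse_dates=["timestamp"])  # hypothetical
counts = logs.set_index("timestamp").resample("1min").size()

zscores = (counts - counts.mean()) / counts.std()
anomalous_windows = counts[zscores.abs() > 3]  # only these go to manual review
print(anomalous_windows)
```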
Comparing Approaches for Prioritizing and Selecting Scenarios in Simulation-based Safety Testing of Automated Driving Systems
ABSTRACT. Recently, the Simulation-based Safety Testing Scenario Selection (SSTSS) process was proposed with the aim of assisting software engineers in selecting the most effective scenarios for testing Automated Driving Systems (ADS) in a simulator, thereby enhancing the safety and functionality of these systems. In our study, we conduct a literature review to identify other documented approaches for selecting or prioritizing scenarios for ADS testing. We identified five other approaches and compared them with the SSTSS process, and we show via an illustrative example how the SSTSS process could be combined with other approaches for improved testing effectiveness and efficiency.
Development of a Firearms and Target Weapons Recognition and Alerting System Applying Artificial Intelligence
ABSTRACT. Nowadays, surveillance systems that make use of security cameras are indispensable to ensure the protection and security of companies and organizational entities. These systems operate through monitoring by trained individuals. The system developed here uses artificial intelligence to identify and recognize firearms and knives, based on the implementation of machine learning techniques and real-time image and video analysis. The main goal is to increase public safety by accurately and quickly detecting the presence of weapons in various environments. The purpose of this research is to improve public safety through the early detection of threats and more effective responses by security forces. To achieve this, convolutional neural networks have been used. During the development of the system, a database of images and videos containing firearms or knives has been created, and detection is based on the "You Only Look Once" (YOLO) algorithm, particularly YOLOv5s.
Optimizing End-to-End Test Execution: Unleashing the Resource Dispatcher
ABSTRACT. Continuous integration practices have transformed software development, but executing the test suites of modern software systems poses new challenges due to their complexity and huge number of test cases. Certain test levels, like end-to-end (E2E) testing, are even more challenging due to long execution times and resource-intensive requirements, especially when there are many E2E test suites. These E2E test suites are executed sequentially and in parallel over the same infrastructure and can be executed several times (e.g., due to consecutive contributions from a tester or version changes performed by automation engines). In previous work, we presented a framework that optimizes E2E test execution by characterizing Resources and grouping/scheduling test cases based on their compatible usage. However, that approach only optimizes a single test suite execution and neglects other executions or test suites that could share Resources and lead to savings in time and in the number of Resource redeployments. In this work, we present a new Resource allocation strategy, materialized through a Resource Dispatcher entity. The Resource Dispatcher centralizes Resource management and allocates test Resources to the different test suites executed in the continuous integration system, according to their compatible usage. Our approach seeks efficient Resource sharing among test cases, test suites, and suite executions, reducing the need for Resource redeployments and improving execution time. We have conducted a proof of concept, based on real-world continuous integration data, that shows savings in both Resource redeployments and execution time.
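An illustrative sketch (not the authors' implementation) of the Dispatcher's core decision follows: hand an already-deployed Resource to a compatible waiting request instead of redeploying it, and keep released Resources warm for the next suite.

```python
# Toy Resource Dispatcher: pool deployed resources per type and reuse them
# across test suites before paying for a fresh deployment.
from collections import deque

class ResourceDispatcher:
    def __init__(self):
        self.idle = {}  # resource_type -> deque of deployed, idle instances

    def acquire(self, resource_type, deploy):
        pool = self.idle.setdefault(resource_type, deque())
        return pool.popleft() if pool else deploy()  # reuse before redeploy

    def release(self, resource_type, instance):
        self.idle[resource_type].append(instance)    # keep warm for next suite

dispatcher = ResourceDispatcher()
browser = dispatcher.acquire("browser", deploy=lambda: "chrome-1")
dispatcher.release("browser", browser)
# Second suite reuses the instance instead of triggering a redeployment:
assert dispatcher.acquire("browser", deploy=lambda: "chrome-2") == "chrome-1"
```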