PST2024: 21ST ANNUAL INTERNATIONAL CONFERENCE ON PRIVACY, SECURITY, AND TRUST (PST2024)
PROGRAM FOR THURSDAY, AUGUST 29TH

10:30-12:10 Session 2A: Security Track
10:30
DevilDiffusion: Embedding Hidden Noise Backdoors into Diffusion Models

ABSTRACT. Diffusion models represent the state-of-the-art deep learning architectures behind many popular and powerful image-synthesizing generative Artificial Intelligence (AI) systems. Their underlying approach, which relies on scheduled noise addition and optimized noise removal, has enabled the synthesis of sophisticated and nearly photorealistic images across various domains; however, the potential impact of compromised diffusion models remains an underexplored area in the literature. While prior studies have investigated the insertion of explicit triggers into the noise or prompt spaces of diffusion models, it is unrealistic to assume that benign users would intentionally input such triggers into their own diffusion process. In our DevilDiffusion approach, we demonstrate the capability to surreptitiously embed triggers that leverage uncommon but naturally occurring characteristics of Gaussian noise directly into the noise space of conditional diffusion models. When these characteristics arise in natural noise, the model constructs our target backdoor image instead. By adjusting the trigger size and the ratio of poisoned images, we can control the trigger rate of the specified target image, ranging from less than 0.01% to 25% of all generated images, while still maintaining performance on the target task.
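
For intuition, a minimal sketch of the kind of "naturally occurring trigger" condition the abstract describes: a toy check on a patch of the initial Gaussian noise. The patch size and threshold here are illustrative assumptions, not the paper's actual trigger.

```python
import numpy as np

def has_natural_trigger(noise: np.ndarray, patch: int = 8, thresh: float = 0.3) -> bool:
    """True if the top-left patch of the noise map is unusually bright.

    For i.i.d. N(0, 1) noise, the mean of an 8x8 patch is N(0, 1/64)
    (std 0.125), so a mean above 0.3 is rare (~0.8%) but does occur in
    perfectly natural noise -- the property such an attack exploits.
    """
    return noise[:patch, :patch].mean() > thresh

rng = np.random.default_rng(0)
hits = sum(has_natural_trigger(rng.standard_normal((64, 64))) for _ in range(10_000))
print(f"natural trigger rate: {hits / 10_000:.2%}")  # tunable via patch/thresh
```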

10:50
PrivBench: A Benchmark Capturing Privilege Escalation Vulnerabilities in Android (ONLINE TALK)

ABSTRACT. Security code smells are receiving increasing attention in Android app development. They serve as coding guidelines aimed at identifying vulnerabilities originating in an application's source code, and numerous tools have been proposed to align with DevSecOps guidelines for integration into the Android development process. However, there is a lack of comparable effort in creating benchmarks of open-source apps, which makes the thorough evaluation of these tools challenging and often requires manual, time-consuming processes. In this paper, we propose PrivBench, an evolving benchmark that captures vulnerabilities in the Android ecosystem, focusing initially on Privilege Escalation (PE) vulnerabilities. It incorporates multiple code patterns for each vulnerability, demonstrating how each may appear in app source code. To showcase the significance of PrivBench, we used it to evaluate two well-known tools for identifying Android security code smells, presenting their performance in detecting the vulnerabilities within their scopes. We believe that our benchmark can be useful for advancing the capabilities of state-of-the-art tools, enhancing their effectiveness in vulnerability detection, and raising developers' awareness so they can avoid privilege escalation vulnerabilities.

11:10
Efficient Data Security Using Predictions of File Availability on the Web

ABSTRACT. As we approach the physical limits of storage density, digital storage prices are no longer plummeting, despite the lingering belief that they still are. Meanwhile, data production continues to grow, making it harder to securely manage the data we produce. Typical digital storage media are often consumed by a small number of large files that are widely available on the web. If the availability of files on the web could be predicted, the choice between consuming local storage resources or simply redownloading the file in the future could be automated, thus increasing the efficiency of backup and encryption workflows. Through a large-scale analysis of hundreds of billions of crawl URLs spanning 8 years, as well as over 60 million responses to HTTP header requests from web servers, we explore the requirements and design of a framework for such predictions. It includes a data structure for efficiently representing the lateral/longitudinal availability of files and an extensible mathematical model for fast and adaptable prediction calculations. Additionally, we contribute novel observations about file availability on the web, including the identification of a period of initial volatility in file lifespans. Our analysis indicates that a pool of 2,500 TB of distributed, popular files is freely and predictably available to users, offering opportunities to reduce the storage and computational costs of both backup and encryption.
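
For intuition, a hedged sketch of the automated choice the abstract motivates, under a toy expected-cost model; the cost model and parameter names are illustrative assumptions, not the paper's framework.

```python
def should_store_locally(p_available: float, storage_cost: float,
                         redownload_cost: float, refetch_penalty: float) -> bool:
    """Keep a local copy iff it is cheaper in expectation than relying on the web.

    p_available     : predicted probability the file is still on the web
                      when it is next needed
    refetch_penalty : cost of recovering the file another way if it has vanished
    """
    expected_web_cost = (p_available * redownload_cost
                         + (1 - p_available) * refetch_penalty)
    return storage_cost < expected_web_cost

# A popular, predictably available file: cheaper to redownload than to store.
print(should_store_locally(p_available=0.99, storage_cost=1.0,
                           redownload_cost=0.1, refetch_penalty=20.0))  # False
```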

11:30
CompFreeze: Combining Compacters and Layer Freezing for Enhanced Pre-Trained Language Model (ONLINE TALK)

ABSTRACT. Leveraging vast datasets from various security tools and sources to train AI models for cyber security offers significant potential to learn better representations of real-world behavior. However, data drift and the scarcity of labeled data are major challenges, leading to frequent and costly model updates and the risk of overfitting. To address these issues, we introduce CompFreeze, a parameter-efficient fine-tuning technique combining compacters with layer-freezing strategies. We evaluate the effectiveness of CompFreeze on three pre-trained models in the cyber security domain across various downstream tasks and demonstrate that, with significantly fewer trainable parameters, CompFreeze performs on par with full model fine-tuning while requiring less training time. We also perform a comprehensive experimental analysis investigating the impact of different learning rates and of varying the number of compacter modules integrated into the models, offering an in-depth analysis of the trade-off between accuracy and inference time.
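
A minimal PyTorch sketch of the two ingredients named in the title: freezing pre-trained layers and training only small adapter modules. A real compacter parameterizes the adapter weights with Kronecker products; a plain bottleneck adapter is shown here as a stand-in, and the backbone is a toy network rather than a pre-trained language model.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    def __init__(self, dim: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))  # residual adapter

backbone = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 768))
for p in backbone.parameters():   # layer freezing: no gradients flow into
    p.requires_grad = False       # the pre-trained weights

adapter = BottleneckAdapter(768)
head = nn.Linear(768, 2)          # downstream task head
trainable = list(adapter.parameters()) + list(head.parameters())
opt = torch.optim.AdamW(trainable, lr=1e-4)

x = torch.randn(4, 768)
loss = nn.functional.cross_entropy(head(adapter(backbone(x))),
                                   torch.tensor([0, 1, 0, 1]))
loss.backward()  # only the adapter and head receive updates
opt.step()
```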

11:50
Extending ISO 15118-20 EV Charging: Preventing Downgrade Attacks and Enabling New Security Capabilities

ABSTRACT. Previous works have identified that EV charging can be weaponised to attack the power grid. As a case study, we consider the newest charging protocol, ISO 15118-20, which provides a high-level communication protocol for EV charging. We first highlight fundamental issues in ISO 15118-20 that prevent the development of security features within the existing standard: we show that an attacker can perform a downgrade attack on ISO 15118-20, and we propose modifications to the standard to prevent this. We then show how the modified protocol enables the development of additional security features. A proof of concept is developed to demonstrate functionality, determine interoperability between the various parties, and verify that the modifications meet the original standard's timing requirements without impacting charging speed or the length of a charging session.
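
As background, one standard downgrade defence is transcript authentication (as in TLS): both sides MAC the exact list of protocol versions offered and the version chosen, so an attacker who strips the newest version from the offer is detected. The sketch below is this generic technique, not the authors' concrete modification to ISO 15118-20, and the key derivation is a placeholder.

```python
import hmac, hashlib

def transcript_tag(key: bytes, offered: list[str], chosen: str) -> bytes:
    """MAC over the full version negotiation, binding offer list and choice."""
    transcript = ("|".join(offered) + "->" + chosen).encode()
    return hmac.new(key, transcript, hashlib.sha256).digest()

key = b"session key from the EV/charger handshake"  # placeholder
offered = ["ISO15118-20", "ISO15118-2", "DIN70121"]
tag_ev = transcript_tag(key, offered, "ISO15118-2")

# The charger recomputes the tag over what it actually saw; tampering with
# the offer list (e.g., deleting "ISO15118-20") changes the tag.
assert hmac.compare_digest(tag_ev, transcript_tag(key, offered, "ISO15118-2"))
```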

10:30-11:50 Session 2B: Privacy Track
10:30
Malicious Unlearning in Ensemble Models

ABSTRACT. Knowledge removal is a crucial task in AI safety and for aligning with the Right To Be Forgotten (RTBF) principle. Machine Unlearning (MU) is an important means of achieving knowledge removal by removing the influence of a specified subset of training data from a trained model. However, existing MU frameworks may be misused to facilitate novel poisoning attacks, in which adversaries introduce both poisoned data and corresponding mitigation data that temporarily neutralize the effects of the poison. The adversaries then submit malicious unlearning requests for the mitigation data, thereby restoring the malicious effects of the poison. Such attacks have been shown to be effective in single-model scenarios; however, their impact on ensemble models, which are widely adopted for their robustness, remains underexplored. Recognizing this gap, we extend these emerging poisoning attacks to ensemble settings to better understand and address the potential risks of malicious unlearning. Our extensive experimental results show that the extended poisoning attacks are also effective in ensemble settings, achieving a high attack success rate and highlighting the importance of continued research into safeguards against the misuse of MU, an important requirement for AI safety.

10:50
Quantifying Privacy in Cooperative Awareness Services Through Trajectory Reconstruction
PRESENTER: Atthapan Daramas

ABSTRACT. Cooperatively creating awareness of a vehicle and its surroundings can improve the safety of the transportation system. Creating such awareness involves frequently sharing the vehicle's location and kinematics with its surroundings, achieved by broadcasting Cooperative Awareness Messages (CAMs) or Basic Safety Messages (BSMs). The receivers of these messages know the current location and kinematics of the sender and can estimate the possibility of a collision. However, continuously receiving CAMs/BSMs also allows a receiver to reconstruct the sender's trajectory, and the full trajectory may reveal information about the user, such as home and workplace locations, thus violating the user's privacy. Prior works focused only on location-based trajectory reconstruction and ignored the other kinematics, such as heading and speed; ignoring such information could lead to underestimating an adversary who seeks to misuse the communication. This work analyses the privacy loss arising from the additional information in BSMs/CAMs. We propose a trajectory reconstruction model that leverages all kinematics (AKs), including location, heading, and speed. The model is composed of two sub-models: an inference model, which estimates the probability from the estimated value of a vehicle's kinematics, and a data association model, which links pseudonyms. We quantify the privacy loss in terms of the precision, recall, and F1-score of identifying the correct link between pseudonyms with AKs-based trajectory reconstruction, and we compare the proposed model with the location-based approach. We also quantify users' privacy through the uncertainty in the trajectory reconstruction process. We show that, in some scenarios, AKs-based trajectory reconstruction achieves higher precision, recall, F1-score, and certainty than the location-based approach.
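
A toy sketch of kinematics-aided pseudonym linking: predict each tracked vehicle's next position from its last reported speed and heading, then link a new (re-pseudonymized) message to the closest prediction. The simple constant-velocity model and nearest-neighbour association are illustrative stand-ins for the paper's probabilistic inference and data-association sub-models.

```python
import math

def predict(x, y, speed, heading, dt=0.1):
    """Constant-velocity step; heading in radians, dt in seconds."""
    return x + speed * math.cos(heading) * dt, y + speed * math.sin(heading) * dt

def link(tracks, observation):
    """Return the track whose predicted position best explains `observation`."""
    return min(tracks, key=lambda t: math.dist(predict(*t["state"]), observation))

tracks = [
    {"pseudonym": "A", "state": (0.0, 0.0, 15.0, 0.0)},          # heading east
    {"pseudonym": "B", "state": (0.0, 5.0, 15.0, math.pi / 2)},  # heading north
]
print(link(tracks, (1.5, 0.05))["pseudonym"])  # -> "A"
```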

11:10
$Pk$-Anonymization Meets Differential Privacy
PRESENTER: Masaya Kobayashi

ABSTRACT. This paper explores the relationships between two privacy protection measures: $Pk$-anonymity and $\varepsilon$-differential privacy, proposed by Ikarashi et al. and Dwork et al., respectively, as independent privacy measures. Previous research has indicated relationships between $k$-anonymity and $(\beta, \varepsilon, \delta)$-differential privacy under sampling; precisely, it has shown that a $k$-anonymization algorithm can satisfy $(\beta, \varepsilon, \delta)$-differential privacy under sampling within a range of parameters. However, although $k$-anonymity is a stronger notion than $Pk$-anonymity, $(\beta, \varepsilon, \delta)$-differential privacy under sampling is weaker than $\varepsilon$-differential privacy. We introduce a property of anonymization named record-independence, in which the processing of one record is not affected by the values of other records, and show that a record-independent $Pk$-anonymization algorithm can satisfy $\varepsilon$-differential privacy within a range of parameters. Since $k$-anonymity implies $Pk$-anonymity, $k$-anonymity thus meets $\varepsilon$-differential privacy; that is, an algorithm satisfying a strong notion in one privacy measure can satisfy a strong notion in the other. Numerical experiments are then performed to illustrate the relations among the parameters of $Pk$-anonymity and $\varepsilon$-differential privacy.
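
For reference, the standard $\varepsilon$-DP guarantee the abstract refers to, together with one natural formalization of the record-independence property (the formalization is our reading of the informal description, not the paper's definition):

```latex
% epsilon-differential privacy (Dwork et al.):
\[
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S]
\quad \text{for all outputs } S \text{ and all neighbouring datasets } D, D'.
\]
% One way to formalize record-independence: the output is a per-record map.
\[
\text{record-independence:}\quad A(D) = \bigl(f(r)\bigr)_{r \in D}
\ \text{for some (randomized) per-record map } f.
\]
```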

11:30
Visualizing Differential Privacy: Assessing Infographics’ Impact on Layperson Data-sharing Decisions and Comprehension

ABSTRACT. Differential privacy (DP) has emerged as a promising approach for protecting users’ data in the era of big data and machine learning. Despite its deployment by governments and organizations, the concept of DP remains difficult for non-technical users to comprehend. Visual aids, such as infographics, have the potential to bridge this knowledge gap and enable users to make informed data-sharing decisions. In this paper, we propose carefully designed infographics to explain differential privacy and compare their effectiveness with traditional text descriptions. We conducted a vignette survey study with 367 participants on Prolific and found that our static and dynamic infographic designs improved participants’ understanding of DP, including its mechanism and implications, compared with text descriptions. Our infographics also assist users in deciding whether to share highly sensitive information, whether or not the privacy budget ϵ is exposed. This research contributes to the growing body of literature on designing effective differential privacy descriptions that communicate DP to laypeople and facilitate their data-sharing decisions.

14:50-16:30 Session 4A: Security Track
14:50
Enhancing Network Intrusion Detection Systems: A Review of Feature Selection Algorithms for Streaming Data Processing

ABSTRACT. The proliferation of Internet of Things (IoT) devices has precipitated a substantial expansion in network size and data volume, concurrently giving rise to heightened security concerns. Consequently, there has been a concerted effort in research to augment the execution and prediction performance of Network Intrusion Detection Systems (NIDS). The efficacy of machine learning (ML) models crucially hinges on judicious choices in algorithms and feature sets. While much of the previous research on feature selection algorithms concentrated on static datasets, real-world network intrusion detection systems grapple with the challenges posed by streaming data. This study comprehensively reviews feature selection algorithms, shedding light on their merits, drawbacks, and practical applications, with a specific emphasis on prerequisites for effective processing of streaming data. Noteworthy contributions include bridging gaps in prior research by delineating requirements tailored to the challenges of streaming data and conducting experimental analyses using contemporary datasets.

15:10
A Comprehensive Study on Multi-Task Learning for Domain Generation Algorithm (DGA) Detection
PRESENTER: Arthur Drichel

ABSTRACT. In this work, we perform a comparative evaluation of 21 approaches to multi-task learning (MTL) for the detection of domain generation algorithms (DGAs). To this end, we train and evaluate 2300 classifiers using a combination of 14 different optimization strategies and 6 MTL architectures and compare them statistically with the state of the art. In this context, we propose a novel ResNet backbone, which already surpasses the state of the art on its own, but shines especially in combination with MTL. We evaluate the novel DGA classifiers in a real-world study that avoids temporal and spatial experimental biases to assess whether they generalize well between different networks and are robust over time. Moreover, we analyze the classifiers' capability to detect yet unknown DGAs and discuss their practical application. Our best-performing classifier surpasses the state of the art by over 5.7% in area under the curve (AUC) for the practically relevant false-positive rates (FPRs) of [0,0.01] and exceeds the state of the art by over 7.3% in true-positive rate (TPR) at a fixed FPR of 0.001 in a real-world setting.
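
A sketch of the MTL setup common in DGA work: one shared encoder feeding a binary detection head and a multi-class family-attribution head, trained jointly. The task pairing, layer sizes, and GRU encoder are illustrative assumptions; the paper's backbone is a ResNet over character sequences.

```python
import torch
import torch.nn as nn

class DGAMultiTask(nn.Module):
    def __init__(self, vocab=128, dim=64, n_families=50):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)  # stand-in backbone
        self.detect = nn.Linear(dim, 2)                    # benign vs. DGA
        self.attribute = nn.Linear(dim, n_families)        # which DGA family

    def forward(self, domains):            # domains: (batch, seq) of char ids
        _, h = self.encoder(self.embed(domains))
        h = h.squeeze(0)
        return self.detect(h), self.attribute(h)

model = DGAMultiTask()
logits_det, logits_fam = model(torch.randint(0, 128, (8, 30)))
loss = (nn.functional.cross_entropy(logits_det, torch.randint(0, 2, (8,)))
        + nn.functional.cross_entropy(logits_fam, torch.randint(0, 50, (8,))))
loss.backward()  # one backward pass optimizes both tasks through the shared encoder
```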

15:30
Post-Quantum Authentication and Integrity in 3-Layer IoT Architectures
PRESENTER: Juliet Samandari

ABSTRACT. The Internet of Things (IoT) is a growing area of technology and has been identified as a key tool for enhancing industries’ operation and performance. As IoT deployment rises worldwide, so do the threats; hence, security, especially authentication and integrity, is a critical consideration. One significant future threat is quantum attacks, which can only be defeated using Post-Quantum (PQ) cryptosystems. New Digital Signature (DS) standards for PQ security have been selected by the US National Institute of Standards and Technology (NIST). However, IoT comes with its own technical challenges stemming from the constrained resources of sensors and similar devices. As a consequence, the use and suitability of these PQ schemes for IoT remain an open research area. In this paper, we identify an IoT architecture built from three distinct layers represented by a server, a gateway and an IoT device, respectively. We first test PQ DS standards and compare them with current standards in order to assess their practicality for use in this architecture to provide authentication and integrity. Then, we select the most suitable PQ scheme at each layer according to the features of the corresponding device (server, gateway, IoT device) and the security property (authentication, integrity). We finally carry out experiments on our selection and provide an architectural model for making IoT communication and interaction PQ secure.

15:50
Mobile Login Bridge: Subverting 2FA and Passwordless Authentication via Android Debug Bridge (ONLINE TALK)

ABSTRACT. Smartphones have become ubiquitous for a range of social, financial, and personal endeavors, as well as for accessing sensitive resources like confidential files from organizations. Nevertheless, this extensive usage has also made smartphones vulnerable to multiple security risks posed by malicious adversaries who intend to breach user accounts or steal personal information. High-profile individuals and organizations are particularly susceptible to targeted attacks. Previous research has identified various vulnerabilities that can compromise smartphones and access users’ confidential information. A prominent example, the “Android Debug Bridge (ADB) vulnerability,” is widely recognized as it enables an attacker to remotely access and manipulate an Android smartphone and perform malicious activities. However, the existing body of literature lacks a comprehensive examination of the implications of this vulnerability on modern authentication systems, web-based password managers, and financial and e-commerce applications.

In this paper, we shed light on this area and evaluated the security of multi-factor authentication systems, browser-based password managers, and popular financial and e-commerce applications. For this purpose, we introduce the BadAuth attack that exploits a set of ADB utilities. Our results reveal the susceptibility of secure authentication systems and browser-based password managers to a sophisticated one-time attack on a non-rooted device even with the latest Android version (Android 14.0). Furthermore, our research exposes the alarming ability of adversaries to access all passwords stored by browser-based password managers, thus paving the way for more severe attacks, including large-scale breaches within organizational settings. Additionally, our assessment underscores potential privacy and security risks for financial and e-commerce apps under BadAuth attacks, along with possible risk mitigation strategies.
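
To illustrate why ADB access is so powerful: with debugging enabled, a connected host can inject arbitrary UI input. The commands below are standard adb primitives (run here via Python's subprocess and requiring adb in PATH plus an authorized device); the BadAuth attack chains primitives of this kind, and the coordinates and text are placeholders, not the paper's exact steps.

```python
import subprocess

def adb(*args: str) -> subprocess.CompletedProcess:
    """Run an adb command and capture its output."""
    return subprocess.run(["adb", *args], check=True, capture_output=True, text=True)

print(adb("devices").stdout)                     # list connected/authorized devices
adb("shell", "input", "tap", "540", "1200")      # tap a button (coordinates vary)
adb("shell", "input", "text", "attacker_input")  # type into the focused field
```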

16:10
An Efficient Method for Accelerating Kyber and Dilithium Post-Quantum Cryptography
PRESENTER: Duc-Thuan Dam

ABSTRACT. Post-quantum cryptography (PQC) algorithms were introduced in response to the threat of attacks using quantum computers. CRYSTALS-Kyber and CRYSTALS-Dilithium are two lattice-based algorithms chosen by NIST for PQC standardization. The number theoretic transform (NTT) helps lattice-based algorithms reduce latency but remains their bottleneck. In addition, the RISC-V instruction set architecture opens up flexible methods for solving different problems. This paper proposes a RISC-V system-on-a-chip (SoC) architecture with a computational accelerator for NTT-based calculations in Kyber and Dilithium. Implementation results show that software running on the proposed SoC with the accelerator improves NTT/INTT performance by up to 36.75×/42.69× compared to software on embedded devices, by up to 4.07×/4.38× compared to software running on RISC-V SoCs, and by up to 8.11× for NTT compared to previous software/hardware architectures.
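
For reference, the forward NTT that such accelerators target (Kyber uses q = 3329 and Dilithium q = 8380417, both with n = 256), computable with O(n log n) butterfly operations:

```latex
\[
\hat{a}_j \;=\; \sum_{i=0}^{n-1} a_i\,\omega^{ij} \bmod q,
\qquad j = 0,\dots,n-1,
\]
% where omega is a primitive n-th root of unity modulo q.
```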

14:50-15:50 Session 4B: Privacy Track
14:50
Effectiveness of Privacy-Preserving Algorithms for Large Language Models: A Benchmark Analysis

ABSTRACT. Recently, several privacy-preserving algorithms for NLP have emerged. These algorithms can be suitable for LLMs, as they can protect both training and query data. However, no benchmark exists to guide the evaluation of these algorithms when applied to LLMs. This paper presents a benchmark framework for evaluating the effectiveness of privacy-preserving algorithms applied to training and query data when fine-tuning LLMs under various scenarios. The proposed benchmark is designed to be transferable, enabling researchers to assess other privacy-preserving algorithms and LLMs. We evaluated the SANTEXT+ algorithm on the open-source Llama2-7b LLM using a sensitive medical transcription dataset. The results demonstrate the algorithm’s effectiveness while highlighting the importance of considering the specific situation when choosing algorithm parameters. This work aims to facilitate the development and evaluation of effective privacy-preserving algorithms for LLMs, contributing to the creation of trusted LLMs that mitigate concerns regarding the misuse of sensitive information.

15:10
“I was diagnosed with...”: sensitivity detection and rephrasing of Amazon reviews with ChatGPT (ONLINE TALK)
PRESENTER: Costanza Alfieri

ABSTRACT. The proliferation of platforms such as e-commerce and social networks has led to an increasing amount of personal health information being disclosed in user-generated content. This study investigates the use of Large Language Models (LLMs) to detect and sanitize sensitive health data disclosures in reviews posted on Amazon. Specifically, we present an approach that uses ChatGPT to evaluate both the sensitivity and informativeness of Amazon reviews. The approach uses prompt engineering to identify sensitive content and rephrase reviews to reduce sensitive disclosures while maintaining informativeness. Empirical results indicate that ChatGPT is capable of reliably assigning sensitivity scores and informativeness scores to user-generated reviews and can be used to generate sanitized reviews that remain informative.
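
A hedged sketch of the prompt-engineering pattern the abstract describes: ask the model to score a review's sensitivity and informativeness, then rephrase it. The wording and scale are illustrative, not the paper's exact prompts, and llm() stands in for whatever ChatGPT client is used.

```python
def build_prompt(review: str) -> str:
    return (
        "Rate the following product review on two 1-5 scales:\n"
        "  sensitivity: how much personal health information it discloses\n"
        "  informativeness: how useful it is to other shoppers\n"
        "Then rewrite it to remove health disclosures while keeping the "
        "useful product information.\n\n"
        f"Review: {review}"
    )

def sanitize(review: str, llm) -> str:
    """llm: any callable that sends a prompt to the model and returns text."""
    return llm(build_prompt(review))

print(build_prompt("I was diagnosed with diabetes and this glucose monitor..."))
```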

15:30
Group Signatures with Designated Traceability over Openers' Attributes from Symmetric-Key Primitives
PRESENTER: Hiroaki Anada

ABSTRACT. A group signature scheme in which signers can designate openers by specifying access structures over the openers' attributes, called GSdT, was introduced at CANDAR 2021. In this paper, we present a construction of GSdT from only symmetric-key primitives: pseudorandom functions, hash functions and commitments. Owing to these features, our GSdT is expected to be secure against the computational power of quantum computers. We first introduce syntax and security definitions in the static group model. The key ingredient of our construction is a non-interactive zero-knowledge proof-of-knowledge system built from these primitives in the "MPC-in-the-head" paradigm, using the technique developed by Katz, Kolesnikov and Wang (ACM CCS 2018). Our approach starts with their group signature scheme but non-trivially extends the Merkle tree so that signers can treat (all-AND) boolean formulas as the access structures. According to our estimation, the signing time is less than 3.0 sec and the signature size is less than 0.5 MB in a scenario where the numbers of group members and attributes are 2^7 and 2^3, respectively, at a 128-bit quantum security level.
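
The Merkle tree is the data structure the construction extends; a minimal root computation over member keys is sketched below (hashlib only). How the paper folds attribute-based access structures into the tree is its non-trivial contribution and is not reproduced here.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:   # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

members = [f"member-{i}".encode() for i in range(2**7)]  # 2^7 group members
print(merkle_root(members).hex())
```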

15:50-16:20 Session 5: Trust Track
15:50
Stake-Driven Rewards and Log-Based Free Rider Detection in Federated Learning (ONLINE TALK)
PRESENTER: Huong Nguyen

ABSTRACT. Federated learning has become increasingly popular due to its ability to bring together multiple learners, enhance model generalizability, and promote knowledge exchange. Such systems inherently rely on the bedrock of security, trust, and fairness among training workers to ensure a conducive learning environment. However, this collaborative landscape has encountered the challenge of free riders: individuals who join the system to gain benefits without making any substantial contribution. This can negatively impact the learning outcomes, fairness, and sustainability of a collaborative system. In this paper, we first present a novel stake-based incentive mechanism to maximize the reward for clients, thereby encouraging active participation from contributors. Second, we propose an efficient method for identifying free riders in federated learning based on submission log analysis. Our method delegates the detection of free riders to the training workers and their identification to the aggregator, rather than relying solely on the aggregator. We explore potential deceptive strategies employed by free riders and assess the extent of our method's coverage across these scenarios. Additionally, experiments conducted with different free-rider ratios demonstrate the versatility and applicability of our approach in detecting such clients within the federated learning paradigm.
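
An illustrative sketch of log-based screening for free riders: flag clients whose logged submissions are numerically unchanged from the global model they received, or duplicates of earlier submissions. The thresholds and log schema are assumptions, not the paper's procedure, which additionally involves the training workers themselves.

```python
import numpy as np

def flag_free_riders(submission_log: dict, eps: float = 1e-6) -> list:
    """submission_log: {client_id: update vector}, update = local - global model."""
    seen, flagged = [], []
    for cid, update in submission_log.items():
        update = np.asarray(update, dtype=float)
        if np.linalg.norm(update) < eps:                 # returned the model unchanged
            flagged.append(cid)
        elif any(np.allclose(update, s) for s in seen):  # replayed another client's update
            flagged.append(cid)
        seen.append(update)
    return flagged

log = {"w1": [0.2, -0.1], "w2": [0.0, 0.0], "w3": [0.2, -0.1]}
print(flag_free_riders(log))  # -> ['w2', 'w3']
```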