MOBISEC 2025: THE 9TH INTERNATIONAL CONFERENCE ON MOBILE INTERNET SECURITY
PROGRAM FOR THURSDAY, DECEMBER 18TH

09:00-10:45 Session 13A: Cryptographic Applications and Analysis
09:00
ET-BERT with Adapter Fusion: Time-Efficient Continual Learning Framework for Encrypted Traffic Classification

ABSTRACT. Encrypted traffic classification is essential in modern network defense but remains a challenging task due to the opacity of payloads and the continuously evolving attack surface. Recently, ET-BERT, a Transformer-based model, achieved state-of-the-art performance by treating traffic sequences as contextualized tokens. However, the approach of fine-tuning the entire model for each new class is prohibitively costly and entails the risk of catastrophic forgetting. In this study, we introduce an adapter-based parameter-efficient continual learning framework for encrypted traffic classification. The framework consists of (i) the baseline full fine-tuning (FT), (ii) adapter tuning for base classes (0–9), (iii) incremental adapter training with only a small number of parameters for the novel class (10), and (iv) AdapterFusion, which non-destructively combines base and incremental knowledge. Since this design freezes existing parameters and learns only new modules, it structurally suppresses the risk of catastrophic forgetting while enabling real-time updates without full retraining. On the USTC-TFC2016 dataset, the framework achieved an accuracy of 0.9970 with FT on all 11 classes, 0.9941 with base adapter tuning, and 0.9862 when applying Fusion. The incremental adapter immediately adapted to the new class, and the Fusion step reliably recovered overall classification performance. In terms of training time, while full FT required about 50 minutes, incremental adapter training took 3.5 minutes, and even with Fusion it was completed within about 13 minutes, demonstrating more than 4x faster training. This study empirically demonstrates that it is possible to simultaneously achieve performance retention and rapid deployment updates while minimizing the retraining burden.
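The freeze-and-fuse idea in the abstract can be sketched in a few lines. This is a conceptual illustration only (all names and the fixed mixing weights are assumptions, not the paper's code): the backbone is frozen, each small adapter is trained per task, and fusion combines base and incremental adapters without modifying either.

```python
# Conceptual sketch of adapter-based continual learning (illustrative names).

def backbone(x):
    """Frozen feature extractor: never updated after pretraining."""
    return [2.0 * xi for xi in x]

class Adapter:
    """Tiny per-task module; only these parameters would be trained."""
    def __init__(self, scale, shift=0.0):
        self.scale, self.shift = scale, shift
    def __call__(self, h):
        return [self.scale * hi + self.shift for hi in h]

def fusion(adapters, weights, h):
    """Non-destructive combination: weighted sum of adapter outputs."""
    return [sum(w * v for w, v in zip(weights, col))
            for col in zip(*(a(h) for a in adapters))]

base = Adapter(scale=1.0)   # stands in for the classes-0-9 adapter
incr = Adapter(scale=0.5)   # stands in for the new class-10 adapter
features = backbone([1.0, 2.0])
out = fusion([base, incr], [0.7, 0.3], features)
```

Because `base` is never touched when `incr` is trained, knowledge of the old classes cannot be overwritten, which is the structural argument against catastrophic forgetting.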

09:18
Split Credential Authentication: A Privacy-Preserving Protocol with Decoupled Authority and Issuer Roles

ABSTRACT. Attribute-based authentication (ABA) is a cryptographic protocol which enables access control based on user-specific attributes such as age, affiliation, or location. While this approach offers fine-grained authorization, conventional ABA schemes require users to disclose all attributes in their credentials, posing significant privacy risks. Anonymous credentials (AC) address this issue by allowing users to hide their attributes during issuance and selectively disclose them during authentication. However, existing AC models assume that users interact directly with issuers, which creates practical challenges: issuers are expected to issue credentials without knowing whether the underlying attribute should be authorized. This design raises both security and accountability concerns and often necessitates centralized attribute management, which is undesirable in real-world settings.

In this paper, we propose Split Credential Authentication (SCA), a new cryptographic framework that separates attribute management and credential issuance. This separation better reflects real-world institutional settings, where authorities managing user attributes (e.g., municipalities or hospitals) are typically distinct from certificate issuers. At the core of SCA, we introduce a novel cryptographic primitive called Oblivious Certificate Generation (OCG), which enables certificate issuance without revealing attribute contents to the issuer, nor linking certificates to specific users from the authority’s perspective. We provide a formal definition of OCG and its construction based on standard digital signatures and blind signature schemes satisfying a novel property called splittability. Then, we formalize SCA and its construction based on OCG and non-interactive zero-knowledge proofs to enable selective attribute disclosure. Finally, we demonstrate the applicability of SCA in sensitive domains such as disability services, where it reduces the privacy burden on users while preserving verifiability and policy compliance.

09:36
Ghost Recon: An Orchestration-based, Non-Intrusive, Persistent Multimodal Authentication System

ABSTRACT. Existing authentication systems rely on one-time authentication methods like passwords, patterns, and one-time facial recognition. These methods only verify the user at the moment of login, leaving them vulnerable to hijacking, token reuse, and abuse of privileges during the session. To address this, zero-trust systems have been introduced that verify all elements. However, frequent re-verification increases delays and computational overhead, leading to request timeouts, reduced throughput, and proxy/policy engine bottlenecks, resulting in user inconvenience and increased operational costs. This paper proposes a method for silent continuous authentication that minimizes user inconvenience. This method orchestrates face, skeleton, and voice authentication methods to selectively re-verify based on a trust score. This is expected to maintain the zero-trust principle of constant verification while suppressing unnecessary computation and delays and minimizing user burden.

09:54
Closing Early Attack Surfaces in Bluetooth Secure Simple Pairing using Identity-Based Signatures
PRESENTER: Bae Woori

ABSTRACT. Bluetooth BR/EDR Secure Simple Pairing (SSP) can be exposed in its early phase to active attacks such as confusion, downgrade, and key substitution. The TOFU-or-DOFU model strengthens past/future key security via deferrable authentication, but it does not structurally preclude attacks before deferrable authentication. This paper proposes IBS-based SSP, which inserts an ID-Based Signature (IBS) check immediately after the SSP key exchange. Using BD_ADDR as the ID, it binds g^a, g^b, Role, the IOcap/AuthReq summary FlagSet, and the selected model MethodTag into a single signature input, blocking attacks such as Role Confusion, Method Confusion, Pairing Confusion, Ghost Keystroke, and Downgrade to Just Works. The overhead is limited to a σ ≈ 96 B signature (two compressed points at 128-bit security), carried within the standard LMP Encapsulated PDU, making it more efficient than certificate-based alternatives.

10:12
Non-Recoverable Signature: Digital Signatures Tolerant to Partial Disclosures
PRESENTER: Daiki Sasame

ABSTRACT. Digital signatures are fundamental primitives for secure communication and online transactions. As standard security, existential unforgeability under chosen-message attacks (EUF-CMA) is required for digital signatures. However, some real-world scenarios expose signatures to threats that go beyond EUF-CMA. In particular, when partial information of a valid signature leaks, for example through side-channel attacks on communication channels or storage, an adversary may attempt to reconstruct or forge the target signature. Existing notions such as incompressible signatures partially capture this threat, but they impose unrealistic restrictions on information leakage and fail to model the realistic scenario where only the target signature may be exposed. In this work, we propose non-recoverable signatures, a new paradigm that ensures unforgeability even when the adversary can freely obtain arbitrary signatures and additionally acquires partial information about the target signature. We present a generic construction of non-recoverable signatures based on (standard) digital signature schemes. Furthermore, we show that non-recoverable signatures are strictly stronger than both EUF-CMA secure signatures and incompressible signatures.

10:30
Education Framework of SOME/IP Security Enhancement through Game-Based Learning and Testing Tool Development

ABSTRACT. As the automotive industry shifts toward software-centric development, cyber security threats are increasingly diverse. In response, there is growing interest in training experts equipped with practical problem-solving skills for security issues. This study evaluates the effectiveness of Capture The Flag (CTF)-based learning, a pedagogical method specialized for the security field, applied to the automotive cyber security domain with a focus on the SOME/IP protocol. Through surveys, performance analysis, and interviews with practitioners and students, the findings reveal that the CTF approach enhances participants' understanding of automotive protocols and vulnerabilities through hands-on experience, significantly boosting learning engagement and motivation. Furthermore, providing appropriate tools effectively lowers entry barriers, a common challenge in CTF-based learning. This research proposes a practical education method tailored to the demands of automotive cyber security, with promising implications for strengthening vehicle safety and reliability.

09:00-10:45 Session 13B: Adaptive Intelligence and System Security 1
09:00
Pareto-Optimized Dynamic Tuner for Anomaly Detection of Battery-Constrained Internet of Things

ABSTRACT. The rapid proliferation of Internet of Things (IoT) devices has made both security and energy sustainability critical design objectives. This paper proposes a Pareto-front–optimized dynamic tuner that adaptively selects an appropriate hyperparameter configuration according to an IoT device’s battery state. We categorize battery condition into three discrete states—Normal, Limit, and Crisis—and associate each state with a minimum F1 threshold, a complexity weight, and a latency weight. Rather than using a static configuration that holds hyperparameters fixed during operation, the proposed tuner selects a solution from the Pareto front according to a state-specific policy, thereby balancing detection performance, computational complexity, and latency in real time. In simulated battery experiments, our method maintains or improves detection performance while substantially extending device lifetime. Quantitatively, the dynamic tuner yields approximately a 10% relative improvement in F1-score (accuracy) and extends battery lifetime by 1.7× compared to conventional static approaches. These results demonstrate that dynamically adapting model operating points via Pareto-optimized trade-offs is an effective strategy for reconciling the conflicting goals of security (detection accuracy) and sustainability (battery longevity) in resource-constrained IoT environments.
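The state-dependent selection policy described above can be sketched as follows. The thresholds, weights, and Pareto points here are illustrative assumptions, not the paper's values: each battery state maps to a minimum F1 and penalty weights, and the tuner picks the front point that maximizes F1 minus the weighted complexity and latency costs.

```python
# Toy sketch of a battery-state policy over a precomputed Pareto front.
# All numeric values are hypothetical placeholders.

POLICY = {
    "Normal": {"min_f1": 0.90, "w_complexity": 0.05, "w_latency": 0.05},
    "Limit":  {"min_f1": 0.85, "w_complexity": 0.50, "w_latency": 0.50},
    "Crisis": {"min_f1": 0.80, "w_complexity": 0.90, "w_latency": 0.90},
}

# (f1, complexity, latency) points on an illustrative Pareto front.
PARETO_FRONT = [
    (0.95, 1.00, 1.00),
    (0.90, 0.60, 0.55),
    (0.82, 0.25, 0.20),
]

def select_config(state):
    """Pick the front point that best trades F1 against cost for this state."""
    p = POLICY[state]
    feasible = [c for c in PARETO_FRONT if c[0] >= p["min_f1"]]
    return max(feasible,
               key=lambda c: c[0] - p["w_complexity"] * c[1]
                                  - p["w_latency"] * c[2])
```

As the battery drains from Normal to Crisis, the rising cost weights push the selection toward cheaper operating points while the F1 floor bounds the loss in detection quality.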

09:18
Real-time Channel Adaptive Guard Interval Adjustment and Secondary Data Transmission Framework
PRESENTER: Jung-Min Moon

ABSTRACT. This paper proposes an adaptive Guard Interval (GI) management and secondary data transmission framework that reflects real-time channel characteristics to overcome the inefficiency of the fixed GI used to cope with channel delay spread in wireless communication systems. The proposed technique measures the channel delay spread in real time to calculate the minimum GI required for primary data protection. It enhances transmission efficiency by utilizing the remaining space within the fixed GI for secondary data transmission. Furthermore, it prioritizes primary data reliability by suspending secondary data transmission when channel conditions deteriorate. Simulation results on Wi-Fi 6 networks demonstrate that this technique achieves a 22% transmission efficiency improvement indoors and a 121.5% improvement outdoors while maintaining a 0% Bit Error Rate, completely suppressing Inter-Symbol Interference. Finally, the paper highlights potential security and privacy risks associated with the secondary data transmission segment and discusses future scalability through hardware implementation and integrated security measures.
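The budget arithmetic behind the scheme can be sketched in a few lines. The 800 ns fixed GI matches a Wi-Fi 6 long guard interval, but the safety margin is a hypothetical value of ours, not the paper's:

```python
# Arithmetic sketch: leftover time inside a fixed GI for secondary data.
# SAFETY_MARGIN_NS is an assumed protection margin, not from the paper.

FIXED_GI_NS = 800          # e.g. a Wi-Fi 6 long guard interval (0.8 us)
SAFETY_MARGIN_NS = 50      # hypothetical margin above the delay spread

def secondary_budget(delay_spread_ns):
    """Time inside the fixed GI usable for secondary data, in ns."""
    min_gi = delay_spread_ns + SAFETY_MARGIN_NS
    if min_gi >= FIXED_GI_NS:
        return 0  # channel too dispersive: suspend secondary transmission
    return FIXED_GI_NS - min_gi
```

A benign indoor channel with a 100 ns delay spread leaves a 650 ns secondary budget, while a harsh channel whose required GI meets or exceeds the fixed GI yields zero, which is exactly the suspend-to-protect-primary behavior the abstract describes.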

09:36
Protecting the Preamble: An Adaptive Scheduling Approach for Reliable Multi-Link Operation
PRESENTER: Ye-Sin Kim

ABSTRACT. The recent proliferation of Internet of Things and real-time streaming devices has spurred the widespread adoption of Multi-Link Operation (MLO) technology, which allows multiple wireless links to operate simultaneously in a shared space. Although MLO improves throughput and delivery reliability through parallel transmissions, interference from adjacent channels can corrupt the preamble, a component crucial for reception synchronization. This corruption forces the entire frame to be discarded and retransmitted, and in cases of intentional interference, it can lead to denial-of-service attacks. To address this vulnerability, the present study proposes an adaptive scheduling technique that protects the preamble by deferring transmissions when a collision with another preamble is anticipated. Before transmitting, a link determines if a neighboring link is sending its preamble. If a collision is anticipated, the transmission is briefly delayed to prevent interference. Experimental results under various traffic loads, frame lengths, and maximum retransmission counts demonstrate that the proposed technique reduces unnecessary retransmissions and decreases the average delay by up to 38.8% compared to existing methods.

09:54
Adversarial Attacks on Plausibility Checks in V2X Security
PRESENTER: Jungwoo Park

ABSTRACT. Vehicle-to-Everything (V2X) communication is a core technology that enhances road safety and efficiency through the periodic transmission of Basic Safety Messages (BSMs). To ensure message reliability, the system adopts a multi-layered security architecture composed of digital signature–based authentication (ECDSA or PQC), L1/L2 Plausibility Checks, and a machine learning–based Misbehavior Detection System (ML-MDS). However, this architecture remains vulnerable to subtle adversarial perturbations within permissible physical limits. This study proposes a novel conditional adversarial perturbation model that can bypass both cryptographic verification and rule-based plausibility checks in existing V2X systems while evading ML-MDS detection. Experiments using the VeReMi dataset demonstrate that an insider attacker possessing a legitimate certificate can conditionally manipulate speed, acceleration, and yaw rate, causing all transmitted messages to pass the L1/L2 Plausibility Checks while significantly degrading the detection performance of the ML-MDS. In particular, perturbations activated when the Time-to-Collision (TTC) falls below a predefined threshold distort the reported TTC to appear longer than the actual value, thereby concealing collision risks. These findings indicate that “cryptographic integrity combined with rule-based verification” alone cannot ensure the semantic integrity of V2X communications, underscoring the necessity of enhancing the adversarial robustness of ML-MDS and developing MLSecOps-based defense strategies.
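The conditional activation logic described in the abstract can be illustrated with a minimal sketch. The threshold and perturbation bound below are assumed placeholder values, not the paper's parameters: the perturbation fires only when the true TTC drops below a threshold, and inflates the reported TTC while staying within an assumed plausibility limit.

```python
# Illustrative sketch of a conditional TTC perturbation (assumed values).

TTC_THRESHOLD_S = 4.0      # hypothetical activation threshold
MAX_PERTURBATION_S = 1.5   # stays inside an assumed plausibility bound

def reported_ttc(true_ttc):
    """TTC an insider would report: honest when safe, inflated when not."""
    if true_ttc >= TTC_THRESHOLD_S:
        return true_ttc            # benign regime: report honestly
    # Inflate TTC toward the threshold, capped by the plausibility bound,
    # so rule-based L1/L2 checks on per-field limits still pass.
    return min(true_ttc + MAX_PERTURBATION_S, TTC_THRESHOLD_S)
```

The key point, matching the paper's finding, is that every individual field stays within permissible physical limits; only the semantic relationship between reported values and the real collision risk is broken.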

10:12
Design and Evaluation of a Management Target Control Mechanism in a Function for Tracing Diffusion of Classified Information on KVM

ABSTRACT. The leakage of classified information from computer systems can cause serious damage to organizations and individuals. To monitor and manage such incidents, a function for tracing the diffusion of classified information has been developed on a Kernel-based Virtual Machine (KVM). This function hooks system calls via the virtual machine monitor (VMM) to identify managed processes and files that may contain classified information and records them as logs, enabling the tracking of information diffusion without modifying guest operating systems (OSs) or applications. However, this function lacks management capabilities such as inspecting, adding, and removing objects under management at runtime, which limits its operational manageability. To address this limitation, this study proposes and implements a management control mechanism using the proc file system (procfs). This mechanism allows users to inspect the current objects under management and dynamically add or remove managed processes and files. Evaluations showed that it introduces minimal overhead, demonstrating a lightweight and non-intrusive approach to controlling objects under management in the tracing function.

09:00-10:45 Session 13C: Special Session
Location: Crown Room (3F)
09:00
One Passport to Govern Them All: Bringing Order to IoT Security and Compliance

ABSTRACT. The increasing complexity of the Internet of Things ecosystems has exposed critical gaps in visibility, traceability, and governance of device security throughout the supply chain. Current practices rely on fragmented descriptors that, although valuable individually, lack a unified framework for integration, traceability, and lifecycle management. These limitations are particularly pressing in light of emerging regulatory requirements, most notably the EU Cyber Resilience Act (CRA), which demands structured, transparent, and auditable security documentation. To bridge this gap, we introduce the Device Security Passport (DSP), a structured, extensible, and lifecycle-aware model to consolidate and exchange security-related information about IoT devices. Built upon the Open Security Controls Assessment Language, the DSP integrates multiple security descriptors, such as software and hardware bills of materials, vulnerability disclosures, and behavioral specifications, into a cohesive, hierarchical framework that evolves with the device from manufacturing to decommissioning. By allowing collaborative contributions from manufacturers, integrators, and operators, the DSP facilitates continuous security assurance, automated policy enforcement, and compliance with regulatory frameworks such as the CRA, thereby fostering greater transparency and accountability throughout the IoT supply chain.

09:18
Replicating Network Topologies through the use of LLMs

ABSTRACT. The upcoming sixth generation of mobile communications (6G) proposes unprecedented challenges in terms of network management, optimization, and cybersecurity. Network Digital Twins (NDTs) emerge as a key solution because they enable the creation of dynamic virtual replicas of physical infrastructure with real-time telemetry and advanced simulation capabilities, e.g., replicating attacks, honeypot implementation, etc. This work addresses the use of Large Language Models (LLMs) to automatically generate accurate topologies for NDTs, utilizing their ability to process different data formats. Concretely, we evaluate several well-known LLMs on their ability to generate virtualized network infrastructure models from multimodal inputs, namely, natural language descriptions, image files, and JSON topology models. As output, we adopt the novel CONTAINERlab framework that enables the deployment of interconnected network containers. Our research represents the first multimodal performance evaluation of LLMs for NDT topology generation, focusing specifically on creating configuration files for network deployment. Results demonstrate the current potential and limitations of LLMs to facilitate the automated creation of realistic NDTs, indicating that, although useful, the generated outputs still require human inspection given the variability of errors they may contain.

09:36
Prioritized Multi-Criteria Optimization for Efficient Cloud-Native Application Resource Allocation

ABSTRACT. The increasing complexity and performance demands of cloud-native applications have prompted users to specify their requirements through high-level intents, i.e., abstract descriptions of desired behaviors and performance objectives. This shift necessitates advanced resource allocation mechanisms capable of efficiently optimizing multiple criteria with differing importance. Such mechanisms must simultaneously accommodate applications’ high-level intents and the operational constraints of the underlying infrastructure. Traditional resource allocation methods often struggle to competently address these varied and conflicting goals, and are generally limited to optimizing a restricted set of global objectives. This work introduces a lexicographic optimization approach that ensures adherence to the diverse requirements of each application and its constituent microservices by organizing optimization objectives in a prioritized order. Hence, higher-priority objectives are optimized lexicographically, permitting improvements in lower-priority metrics only when they do not compromise the former, enabling microservice-specific allocations. The resource-allocation problem is therefore cast as an ILP-based lexicographic formulation, implemented as a sequence of ε-constraint MILPs, one per metric in the priority hierarchy. Additionally, a greedy heuristic mechanism is proposed to reduce execution time. Experimental evaluations demonstrate that the proposed mechanisms deliver flexible, multi-objective optimized resource allocations, leveraging edge-cloud infrastructures to enhance both application performance and infrastructure utilization.

09:54
Enhancing Privacy in Multi-Domain Network Intent Negotiation

ABSTRACT. In this study, we analyzed the quantity of sensitive knowledge present in data exchanges between agents of multi-domain network intent negotiations. Using information theory, we constructed a model that maps the set of characteristics that define negotiation operations to a value that represents the quantity of sensitive knowledge shared in relation to the number of negotiation goals achieved. We used the model to find a set of characteristic boundaries for which a negotiation process will produce the optimum relation of goals attained per each sensitive knowledge item shared. We validated our model by constructing a negotiation process with such characteristics, and demonstrating that it attains the goals while sharing 20% less knowledge than previous state-of-the-art systems.

10:12
Defending Minds, Not Just Machines: An Agentic Approach to Cognitive Security
PRESENTER: Jaime Fúster

ABSTRACT. We define cognitive security as the protection of human perception, judgment, and decision-making from manipulation, overload, and engineered uncertainty across digital channels and organizational contexts. Rather than replacing traditional technical controls, cognitive security complements them by focusing on human outcomes—helping people notice what matters, reason under pressure, and act in ways that are explainable and reversible.

Delivering this protection is a complex socio-technical challenge that requires multidisciplinary work spanning security engineering, human-computer interaction (HCI), cyber threat intelligence (CTI), psychology, organizational behavior, governance, and law. Within this broader effort, we argue that generative-AI (GenAI) agents are powerful enablers: they can continuously watch information flows, synthesize and contextualize evidence, and support decision-making with timely, human-aligned assistance when their roles are clearly scoped.

Cognitive-security failures often stem from a tempo mismatch: machines demand continuous vigilance while humans excel at careful, contextual judgment—leading to overload, missed cues, and brittle responses. To address this, we propose a role-based taxonomy that separates speed from care and makes cognitive delegation explicit. Sentinels are always-on monitors that surface calibrated cues with minimal friction. Advisors are on-demand sensemakers that assemble and explain the most relevant evidence. Executors apply narrowly scoped, reversible actions under explicit bounds. Keepers maintain posture over time by tracking configuration, drift, and hygiene. Stewards provide overall governance—ensuring policy compliance, privacy handling, provenance, auditability, and human override. This composition aligns technical capability with human limitations and institutional constraints, reducing overload while preserving accountability, and provides a practical blueprint for deploying GenAI to protect societies, organizations, and citizens.

10:30
Lightweight Secure Federated Learning for Energy-Constrained IoT: A Case Study in Smart Irrigation

ABSTRACT. Federated learning (FL) is increasingly adopted in Internet of Things (IoT) environments, yet real deployments must address both energy constraints and security threats. This paper presents a lightweight and energy-aware FL framework designed for heterogeneous IoT devices, with smart irrigation used as a representative case study. The framework adapts client participation according to available energy budgets while securing model updates with AES-128 encryption and SHA-256 integrity verification. A real-world dataset from six agricultural patches and a Docker-based testbed with heterogeneous clients are employed to evaluate the approach. Results over 30 training rounds show that the framework sustains predictive accuracy, mitigates risks of tampering and eavesdropping, and introduces less than 1% computational overhead. The study demonstrates that lightweight cryptographic protection combined with energy-aware scheduling provides a practical balance between efficiency and trustworthiness in federated IoT systems.
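The integrity half of the pipeline described above can be sketched with the standard library alone. This is a minimal illustration under our own assumptions (function names are ours): a client hashes its serialized model update with SHA-256 before upload, and the server recomputes the digest on receipt. The AES-128 encryption step would wrap these bytes using a third-party library such as `cryptography` and is omitted here.

```python
# Sketch of SHA-256 integrity verification for a federated model update.
# The AES-128 confidentiality layer is intentionally left out.

import hashlib
import json

def pack_update(weights):
    """Serialize weights deterministically and attach a SHA-256 digest."""
    payload = json.dumps(weights, sort_keys=True).encode()
    digest = hashlib.sha256(payload).hexdigest()
    return payload, digest

def verify_update(payload, digest):
    """Server-side check: recompute the digest and compare."""
    return hashlib.sha256(payload).hexdigest() == digest

payload, digest = pack_update({"layer1": [0.1, -0.2]})
assert verify_update(payload, digest)          # intact update passes
tampered = payload.replace(b"0.1", b"9.9")
assert not verify_update(tampered, digest)     # tampered update is rejected
```

A flipped byte anywhere in the payload changes the digest, which is how the framework detects tampering of updates in transit.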

11:00-12:00 Session 14A: Digital Asset Protection
11:00
Invisible Watermarking with DWT-SVD for Safeguarding Copyrighted Images against Unauthorized Generative AI Training

ABSTRACT. With advancements in artificial intelligence (AI) technologies, copyright infringement of digital images has become increasingly severe, leading to growing interest in digital watermarking as a key solution. Digital watermarking is a technique that embeds and detects a unique watermark to protect digital content’s copyright and identify and trace tampering and forgery. In this paper, we propose a method that applies a three-level discrete wavelet transform (DWT) on an image to separate its frequency components into multiple levels, followed by singular value decomposition (SVD) across multiple regions to repeatedly embed the watermark into the singular values. The goal is to achieve robust and imperceptible watermarking capable of withstanding signal distortion attacks.

11:18
Hidden in the Noise: Noise-Embedded Watermarking for Black-Box Image Classifiers

ABSTRACT. As a representative technique for protecting model ownership against model extraction attacks, watermarking has been proposed. In particular, black-box watermarking offers the advantage of verifying ownership through specific input–output pairs without accessing model parameters. However, conventional approaches suffer from a severe degradation in performance when the input deviates from the primary data distribution. In this study, we propose a noise-embedded watermarking method applicable to image classification models to alleviate this limitation. The proposed method constructs a watermark dataset by embedding imperceptible noise patterns into original images and jointly training it with the original training data, thereby naturally embedding the watermark into the model. Experimental results demonstrate that the proposed method achieves a watermark detection rate of up to 99.98% while minimizing performance degradation compared with conventional backdoor-based watermarking. Furthermore, by adjusting the noise magnitude (ε), we confirm that the trade-off among model performance, detection rate, and visual perceptibility can be effectively optimized. Additional analyses show that no detection occurs in unwatermarked models and that valid detection is only observed in watermarked models, supporting the reliability of ownership verification. This study highlights the potential of watermarking as a practical and unobtrusive means of protecting ownership of image classification models in black-box environments.
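The embedding step can be illustrated with a small sketch. The pattern generator and ε value below are our own assumptions, not the paper's construction: a fixed pseudo-random pattern, reproducible from a seed, is added to the pixels at a small magnitude and clipped to the valid range.

```python
# Conceptual sketch of eps-scaled additive noise watermark embedding.
# Pattern, seed, and eps are illustrative assumptions.

import random

def make_pattern(size, seed=42):
    """Fixed pseudo-random pattern in [-1, 1], reproducible from a seed."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(size)]

def embed(pixels, pattern, eps=2.0):
    """Add eps-scaled noise, clipped to the 0-255 pixel range."""
    return [min(255.0, max(0.0, p + eps * n))
            for p, n in zip(pixels, pattern)]

image = [100.0] * 8               # flat toy image
wm = embed(image, make_pattern(8), eps=2.0)
```

Since each pixel moves by at most ε, the watermarked image is visually indistinguishable from the original, yet a model trained on such images can learn to associate the hidden pattern with the watermark label.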

11:36
Cross-Artifact Comparative Analysis of Legal and Illegal Korean Streaming Sites

ABSTRACT. Illegal streaming platforms have proliferated alongside the rapid expansion of online video streaming services. These sites provide unauthorized access to copyrighted content while mimicking the appearance of legitimate Over-the-Top (OTT) platforms, thereby undermining legal services and exposing users to various risks. In this study, we focus on Korean streaming platforms, conducting a cross-artifact comparative analysis of real-world legitimate and illegal sites across three categories of Web artifacts: HTML structure, cache entries, and cookies. Our analysis shows that legitimate platforms maintain consistent domain-based resource calls and employ standardized cookie management practices. In contrast, illegal sites exhibit distinctive and immature operational patterns, including extensive use of external domains, repeated reliance on default session management cookies, opaque affiliations with advertising networks, and high redundancy in cache data. Moreover, some pairs of illegal sites displayed strong similarities across all three artifact types, suggesting the possibility of shared infrastructure or common operators. By revealing these consistent cross-artifact patterns, this study provides empirical evidence that can inform identification of illegal streaming platforms.

11:00-12:00 Session 14B: Adaptive Intelligence and System Security 2
11:00
Spatio-Temporal Adaptive Reinforcement Learning for Task Offloading in Mobile Edge Computing

ABSTRACT. Mobile Edge Computing (MEC) reduces latency by offloading intensive tasks to edge servers, yet dynamic multi-user environments make offloading highly challenging. Deep reinforcement learning (DRL) has been widely adopted, but prior approaches often fail to jointly capture network topology and temporal dynamics. To overcome this limitation, we propose spatio-temporal adaptive reinforcement learning (STAR), which formulates the offloading problem as a Markov decision process (MDP) with the MEC network represented as a heterogeneous graph. STAR combines an attention-based spatial perception layer to extract graph features with a temporal memory layer to model state evolution, enabling proactive and context-aware offloading. Comprehensive simulations demonstrate that STAR reduces mean processing delay (MPD) by over 25% and offloading failure rates (OFR) by approximately 26%. Beyond efficiency improvements, the learned policy generalizes well, consistently achieving low latency and high reliability in novel and large-scale network environments.

11:18
Drain-Like Log Parsing for Distributed Systems: Efficient Template Mining on HDFS Datasets

ABSTRACT. A crucial preprocessing step in natural language processing (NLP) applications for system logs is log parsing, which makes structured analysis possible for tasks like performance monitoring, failure diagnosis, and anomaly identification. With the help of adaptive similarity thresholds and improved preprocessing methods, we present an enhanced Drain-like parser designed for distributed system logs that can handle dynamic fields in high-volume logs. Our method builds on heuristic-based techniques such as Drain, integrating post-parsing template merging to reduce redundancy and edit distance for clustering. With rigorous edit distance requirements, our parser produces 44 distinct templates with good grouping accuracy (100% on sampled logs) when tested on a subset of the HDFS dataset (100,000 logs).
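The core Drain-style idea can be sketched compactly. This is a simplified illustration with an assumed fixed threshold (the paper uses adaptive thresholds and adds edit-distance clustering and template merging on top): logs are grouped by token count, matched against existing templates by token-level similarity, and differing positions are generalized to a wildcard.

```python
# Simplified sketch of Drain-style template mining (threshold assumed).

SIM_THRESHOLD = 0.5  # fixed here; the paper adapts this per group

def similarity(template, tokens):
    """Fraction of positions that match, counting wildcards as matches."""
    same = sum(1 for a, b in zip(template, tokens) if a == b or a == "<*>")
    return same / len(template)

def parse(lines):
    templates = {}  # token count -> list of token templates
    for line in lines:
        tokens = line.split()
        bucket = templates.setdefault(len(tokens), [])
        for i, tpl in enumerate(bucket):
            if similarity(tpl, tokens) >= SIM_THRESHOLD:
                # Merge: keep matching tokens, wildcard the dynamic fields.
                bucket[i] = [a if a == b else "<*>"
                             for a, b in zip(tpl, tokens)]
                break
        else:
            bucket.append(tokens)
    return [" ".join(t) for b in templates.values() for t in b]

logs = [
    "Received block blk_1 of size 6710",
    "Received block blk_2 of size 9122",
    "Deleting block blk_3 file /tmp/a",
]
result = parse(logs)
```

On these toy HDFS-like lines, the two "Received block" events collapse into one template with the block ID and size wildcarded, while the structurally different "Deleting" event stays separate.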

11:36
Friendly-fire: Measurement and Analysis of the Interaction Dynamics Between Anti-Malware and PETs

ABSTRACT. We present a measurement study of how behavioural ransomware-detection techniques interact with storage privacy-enhancing technologies (PETs). Using a 10,966-file corpus exercised under six controlled workloads—per-file encrypt-then-decrypt, re-runs, fresh re-installs, plain PC-to-USB transfer, three-pass zero-overwrite deletion, and staggered encryption with delay—we evaluate eight widely deployed anti-malware products that use behavioural techniques around encryption functions (Windows Defender, ESET Security, Kaspersky, Trend Micro, Avast, Bitdefender, Malwarebytes, Acronis) and six common PET/tools (VeraCrypt, Picocrypt, Cryptomator, 7-Zip, AttacheCase4, Encrypto).

Across benign PET workloads, vendors exhibit highly variable behaviour with systematic false positives that disrupt legitimate encryption. Path dependence is pronounced: once a PET is quarantined or removed, subsequent runs on the same host are often impossible without a fresh re-install. In our encryption family, 7/8 products intervened on per-file encrypt-then-decrypt and 6/8 on delayed runs; staggered encryption evades some heuristics while triggering others. By contrast, on-access extraction-time detection for known ransomware families is strong for most products. We release PART to support replication and future comparative testing.
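Two of the benign workloads above can be sketched as a harness; the toy XOR transform stands in for the real PETs the study exercised (VeraCrypt, Picocrypt, etc.), and the function names are illustrative, not from the paper:

```python
import os
import tempfile

def xor_transform(data: bytes, key: int = 0x5A) -> bytes:
    # Stand-in for a real PET's cipher, for demonstration only.
    return bytes(b ^ key for b in data)

def encrypt_then_decrypt(path: str) -> bool:
    """Per-file encrypt-then-decrypt workload: overwrite the file with
    ciphertext, then restore the plaintext in place, the sequence a
    behavioural ransomware detector would observe."""
    with open(path, "rb") as f:
        plain = f.read()
    with open(path, "wb") as f:
        f.write(xor_transform(plain))
    with open(path, "rb") as f:
        cipher = f.read()
    with open(path, "wb") as f:
        f.write(xor_transform(cipher))
    with open(path, "rb") as f:
        return f.read() == plain  # True when the round trip is lossless

def three_pass_zero_delete(path: str) -> None:
    """Three-pass zero-overwrite deletion workload."""
    size = os.path.getsize(path)
    for _ in range(3):
        with open(path, "wb") as f:
            f.write(b"\x00" * size)
            f.flush()
            os.fsync(f.fileno())
    os.remove(path)

# Demo: run both workloads on a scratch file.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"benign user data")
print(encrypt_then_decrypt(path))   # True: plaintext restored
three_pass_zero_delete(path)
print(os.path.exists(path))         # False: file overwritten and removed
```

Both operations are benign from the user's perspective, which is exactly why false positives on them (quarantining the PET mid-run) are disruptive.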

11:00-12:00 Session 14C: Covert Threat Analysis
Location: Crown Room (3F)
11:00
Binary Lifting into LLVM IR with Large Language Models

ABSTRACT. The process of converting binaries into intermediate representation (IR), known as lifting, plays a pivotal role in binary analysis. Existing lifting tools mainly rely on rule-based techniques that generate low-level labels in the resulting IR and are sensitive to environmental constraints such as version compatibility of the Operating System (OS), LLVM, and auxiliary tools. By contrast, recent Large Language Model (LLM) research focuses on text generation for high-level languages, with limited attention to low-level analysis. Also, support for Windows binaries is limited, and recompilation of the generated IR is often not guaranteed. To bridge the gap, we present an LLM-based lifting framework that produces highly readable IR and is robust to environmental variation. By incorporating observed error patterns during IR generation and execution into the system prompt, the success rate is significantly improved compared with a question-based prompt. The generated IR is validated and compared with McSema, a representative lifting tool. This study is the first attempt to apply LLMs to binary lifting for IR and suggests that LLMs could serve as a viable alternative to existing lifting tools.
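The error-feedback idea in this abstract can be sketched as a retry loop; the model call and validator below are stand-ins (`fake_llm`, `fake_validate` are hypothetical, not the paper's components), and the prompt wording is an assumption:

```python
def lift_with_feedback(asm, generate_ir, validate_ir, max_rounds=3):
    """Iterative lifting loop: error patterns observed in earlier
    rounds are folded back into the system prompt, the mechanism the
    abstract credits with improving the success rate over a plain
    question-based prompt."""
    base_prompt = "Translate the assembly below into LLVM IR.\n"
    errors = []
    for _ in range(max_rounds):
        prompt = base_prompt
        if errors:
            prompt += "Known pitfalls from earlier attempts:\n" + "\n".join(errors)
        ir = generate_ir(prompt, asm)
        ok, err = validate_ir(ir)
        if ok:
            return ir
        errors.append("- " + err)
    return None  # give up after max_rounds failed validations

# Stand-in model and validator for demonstration only.
def fake_llm(prompt, asm):
    # Succeeds only once the prompt carries feedback from a failed round.
    if "pitfalls" in prompt:
        return "define i32 @f() {\n  ret i32 0\n}"
    return "ret i32 0"

def fake_validate(ir):
    ok = ir.lstrip().startswith("define")
    return ok, "IR must start with a function definition"

print(lift_with_feedback("mov eax, 0\nret", fake_llm, fake_validate))
```

In a real pipeline the validator would be an actual LLVM parse/recompile step (e.g. invoking `llvm-as` or `clang` on the emitted IR), so that only recompilable IR exits the loop.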

11:18
Classification of Tor Metadata Combinations for Investigation
PRESENTER: Yiseul Choi

ABSTRACT. Tor is a widely deployed anonymity network that relies on layered encryption and multi-hop circuits to conceal user identity and communication patterns. While extensive research has focused on traffic analysis and deanonymization, little work has been devoted to a systematic examination of the metadata inherently generated during Tor’s internal operations. This study conducts a source-code-driven analysis of Tor’s circuit creation, stream multiplexing, and cell transmission processes, identifying critical metadata such as circuit purpose, stream identifiers, and cell exchange patterns. Based on this analysis, we evaluated the potential for utilizing metadata from an investigative perspective. Unlike conventional deanonymization techniques, we demonstrate how metadata can assist investigations by identifying the scope of information that can be discerned from it based on its visibility. Finally, we examine changes in metadata based on visibility scope and discuss its potential for use as evidence. This demonstrates the feasibility of metadata-based classification in real-world investigative contexts, laying the foundation for future integration of this methodology into monitoring and forensic systems.

11:36
ObfSwin: Transformer-based Multi-class Obfuscation Classification for Windows Portable Executable

ABSTRACT. Windows remains the primary target of malware attacks, and attackers routinely deploy obfuscation—often in overlapping combinations—to evade detection and frustrate reverse engineering. When protections are layered, their effects compound: transformations interleave and mask one another, standard heuristics break, and efficient deobfuscation becomes difficult without first knowing the protection set. Yet efficient, execution-free methods that clearly identify the protections present in a binary remain scarce. We present a purely static learning approach that infers applied protections directly from raw bytes. We model three-byte transitions as a 3D Markov tensor, convert it into a single RGB image representation, and feed that image to a Swin Transformer. We train combination-size-specific heads: one covering all pairwise two-option settings and another covering all triple three-option settings, using Windows PE binaries protected by a commercial tool. Evaluated on 100k obfuscated samples, consisting of six two-option combinations and four three-option combinations, our framework achieves 94% accuracy on the two-option subset and 96% on the three-option subset. The approach avoids execution and disassembly, substantially reduces memory and training cost compared with naïve 3D slice representations, and provides a practical basis for large scale static identification of overlapping protection options.
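The byte-level featurization can be sketched in a few lines; note the abstract does not spell out the exact tensor-to-RGB mapping, so bucketing the third byte into the three colour channels is an illustrative assumption:

```python
def markov_rgb(data: bytes, size: int = 256):
    """Count 3-byte transitions over the raw bytes and fold them into a
    size x size RGB image: pixel (b0, b1) accumulates trigram counts,
    with the third byte bucketed into the three channels (assumed
    mapping, for illustration)."""
    img = [[[0, 0, 0] for _ in range(size)] for _ in range(size)]
    for b0, b1, b2 in zip(data, data[1:], data[2:]):
        img[b0][b1][b2 * 3 // size] += 1  # channel = third-byte bucket
    # Scale counts into the 0-255 pixel range.
    peak = max(c for row in img for px in row for c in px) or 1
    for row in img:
        for px in row:
            for k in range(3):
                px[k] = px[k] * 255 // peak
    return img

# A periodic byte stream concentrates mass in a few pixels.
img = markov_rgb(bytes([0, 1, 2] * 100))
print(img[0][1])  # → [255, 0, 0]: trigram (0,1,2) saturates channel 0
```

Collapsing the 256³ transition tensor into a single 256×256×3 image is what keeps the Swin Transformer input small, which is the memory and training-cost saving the abstract claims over naïve 3D slice representations.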