CAMLIS 2019: CONFERENCE ON APPLIED MACHINE LEARNING FOR INFORMATION SECURITY
PROGRAM FOR FRIDAY, OCTOBER 25TH

10:15-11:45 Session 1
10:15
Felipe Ducau (Sophos, UK)
Konstantin Berlin (Sophos, United States)
Ethan Rudd (Sophos, United States)
Tad Heppner (Sophos, UK)
Alex Long (Sophos, United States)
Describing Malware via Tagging
PRESENTER: Felipe Ducau

ABSTRACT. Although powerful for conviction of malicious artifacts, machine learning-based detection does not generally produce further information about the type of malware that has been detected. In this work, we address the information gap between ML and signature-based detection methods by introducing an ML-based tagging model that is trained to generate human-interpretable semantic descriptions of malicious software (e.g. file-infector, downloader, etc.).

Even though much has changed over the last 30 years of malware detection, most anti-malware solutions still rely on the concept of malware families for describing the capabilities of malicious software. The increased number of malware specimens, along with the introduction of techniques such as polymorphism, packing, and obfuscation, has turned the task of malware description via family classification into a difficult and oftentimes intractable one. This has led to a (very) large number of mutually exclusive malware families, typically highly vendor-specific (and oftentimes inconsistent across vendors) and not necessarily designed for human consumption.

We propose an alternative approach to malware description based on semantic tags. In contradistinction to (family) detection names, semantic tags aim to convey high-level descriptions of the capabilities and properties of a given malware sample. They can refer to the sample's purpose (e.g. 'dropper', 'downloader'), malware family (e.g. 'ransomware'), file characteristics (e.g. 'packed'), etc. Semantic tags are non-exclusive, meaning that a malware campaign can be associated with multiple tags, and a given tag can be associated with multiple malware families. By moving the focus of malware description from a large set of mutually exclusive malware families to an intelligible set of malware tags, we also enable the possibility of learning the relationship between files and semantic tags with machine learning techniques.

With this in mind, we first introduce a simple annotation method for deriving high-level descriptions of malware files based on (but not necessarily constrained to) an ensemble of vendor family names. We then formulate the problem of malware description as a tagging problem and formalize it under the framework of multi-label learning. We further propose a joint-embedding deep neural network architecture that maps both semantic tags and Windows portable executable files (represented by a set of statically derived features) to the same low-dimensional embedding space. We can then use the similarity between files and tags in this embedding space to automatically annotate previously unseen samples.
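As a rough illustration of the joint-embedding idea described above (a sketch under assumed layer sizes and training loss, not the authors' architecture), the following PyTorch snippet maps static file features and per-tag embeddings into a shared space and scores file-tag pairs by dot product, trained as a multi-label problem with one sigmoid per tag.

    # Minimal sketch (assumed, not the authors' code) of a joint file/tag embedding model.
    import torch
    import torch.nn as nn

    class JointEmbeddingTagger(nn.Module):
        def __init__(self, n_features, n_tags, embed_dim=32):
            super().__init__()
            # File branch: static PE features -> shared embedding space.
            self.file_encoder = nn.Sequential(
                nn.Linear(n_features, 256), nn.ReLU(),
                nn.Linear(256, embed_dim),
            )
            # Tag branch: one learned embedding vector per semantic tag.
            self.tag_embedding = nn.Embedding(n_tags, embed_dim)

        def forward(self, x):
            f = self.file_encoder(x)          # (batch, embed_dim)
            t = self.tag_embedding.weight     # (n_tags, embed_dim)
            return f @ t.T                    # file-tag similarity logits, (batch, n_tags)

    model = JointEmbeddingTagger(n_features=1024, n_tags=11)
    criterion = nn.BCEWithLogitsLoss()        # multi-label: one sigmoid per tag

    x = torch.randn(8, 1024)                  # toy batch of static feature vectors
    y = torch.randint(0, 2, (8, 11)).float()  # noisy multi-hot tag labels
    loss = criterion(model(x), y)
    loss.backward()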

We empirically demonstrate that when evaluated against (noisy) tags extracted from an ensemble of anti-virus detection names, the proposed tagging model correctly identifies about 94% of eleven possible tag descriptions for a given sample, at a deployable false positive rate (FPR) of 1% per tag. Finally, we evaluate the performance of our model on (high quality) tags extracted from execution traces of files, showing that it is feasible to learn behavioral characteristics of malicious software from a static representation of the files.

10:45
Lara Dedic (Novetta, United States)
Matthew Teschke (Novetta, United States)
CNN-Based Malware Visualization and Explainability

ABSTRACT. Manually determining the malware-like characteristics of an executable using signature- and behavior-based identifiers has become difficult and laborious for domain experts as malware grows more complex. Using machine learning models to automatically detect important features in malware by taking advantage of advances in deep learning, such as image classification, has developed into a research topic that interests both malware reverse engineers and data scientists.

This work is an expansion of recent attempts to better interpret convolutional neural networks (CNNs) that have been trained on image representations of malware through the network’s activations. We present a reproducible approach to visually explain a CNN’s predictions by overlaying heatmaps on top of disassembled malware that has been transformed into images, and then show how it can be used as an automated malware analysis tool that helps reverse engineers navigate a complex piece of malware for the first time. We use fastai, a deep learning library that simplifies training state-of-the-art neural networks for tasks including malware binary classification, and Gradient-weighted Class Activation Mapping (Grad-CAM) to generate heatmaps over regions in the image that might indicate malicious behavior.
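For readers unfamiliar with Grad-CAM, the following generic PyTorch sketch shows the mechanics (it is not the presenters' fastai pipeline; the backbone, target layer, and image size are assumptions): capture the last convolutional block's activations and gradients, weight channels by their average gradient, and upsample the result into a heatmap that can be overlaid on the malware image.

    # Generic Grad-CAM sketch (assumed, not the presenters' fastai code).
    import torch
    import torch.nn.functional as F
    from torchvision import models

    model = models.resnet18(num_classes=2).eval()   # stand-in binary malware/benign CNN
    activations, gradients = {}, {}

    def fwd_hook(_, __, output):
        activations["feat"] = output

    def bwd_hook(_, grad_in, grad_out):
        gradients["feat"] = grad_out[0]

    layer = model.layer4                            # last convolutional block
    layer.register_forward_hook(fwd_hook)
    layer.register_full_backward_hook(bwd_hook)

    img = torch.randn(1, 3, 224, 224)               # malware binary rendered as an image
    score = model(img)[0, 1]                        # logit of the "malicious" class
    score.backward()

    # Grad-CAM: weight each channel by its average gradient, then ReLU the weighted sum.
    weights = gradients["feat"].mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * activations["feat"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=img.shape[2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalized heatmap to overlay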

11:15
Bobby Filar (Endgame Inc, United States)
David French (Endgame Inc, United States)
ProblemChild: Discovering Anomalous Patterns based on Parent-Child Process Relationships

ABSTRACT. It is increasingly common for malware attacks to involve more than a standalone executable or script. These attacks often have conspicuous process heritage that is ignored by machine learning models that rely solely on static features (e.g. PE header metadata) to make a decision. Advanced attacker techniques, like “living off the land,” that appear normal in isolation become more suspicious when observed in a parent-child context. The context derived from parent-child process chains can help identify and group malware families, as well as discover novel attacker techniques. These techniques can be chained to perform persistence, defense-bypass, and execution actions. In response, security vendors commonly write heuristics, often referred to as analytics, to identify these events.

We present ProblemChild, a graph-based framework designed to discover malicious software based on process relationships. ProblemChild applies machine learning to derive a weighted graph used to group seemingly disparate events into communities that form larger attack sequences. Additionally, ProblemChild uses the conditional probability P(child | parent) to automatically uncover rare or common process-level events, which can be used to elevate or suppress the scores of anomalous communities. We evaluate ProblemChild against a replay of the 2018 MITRE ATT&CK evaluation (APT3) to show how it performs compared to other security vendors who participated in the test.
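As a toy illustration of the conditional-probability component (the event format and threshold here are assumptions, not ProblemChild's implementation), rare parent-child transitions can be surfaced with a few lines of Python:

    # Sketch (assumed) of scoring process-creation events by P(child | parent).
    from collections import Counter

    events = [                                   # (parent_process, child_process) pairs
        ("winword.exe", "cmd.exe"),
        ("winword.exe", "splwow64.exe"),
        ("explorer.exe", "cmd.exe"),
        ("cmd.exe", "powershell.exe"),
        # ... many more events from endpoint telemetry
    ]

    parent_counts = Counter(p for p, _ in events)
    pair_counts = Counter(events)

    def cond_prob(parent, child):
        """Empirical P(child | parent); rare pairs get low probability."""
        return pair_counts[(parent, child)] / parent_counts[parent]

    # Rare parent-child transitions (low conditional probability) can elevate a
    # community's anomaly score, while common ones suppress it.
    for (parent, child) in pair_counts:
        p = cond_prob(parent, child)
        if p < 0.3:                              # illustrative threshold
            print(f"rare: {parent} -> {child} (P={p:.2f})")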

Additionally, a paper will be made available on arXiv in the days leading up to CAMLIS.

13:00-14:30 Session 2
13:00
C. Bayan Bruss (Capital One, United States)
Applying Deep Graph Representation Learning to the Malware Graph

ABSTRACT. Malware is widespread, increasing both in ubiquity and in diversity. This poses significant challenges for detecting, classifying and understanding new malware as it is observed. Static and dynamic attributes of a specific malware sample only tell you so much, and the sheer scale of the problem makes in-depth investigation impossible. However, when malware samples are viewed as nodes in a heterogeneous graph, where edges can connect them to IP addresses, URLs, domains, etc., topological information can provide added context beyond the individual node attributes. While this can be a powerful visualization and investigation tool, graph structures are very sparse and high-dimensional, making ML tasks such as node classification, similarity search and clustering challenging. In recent years, a variety of graph embedding techniques have gained popularity for using machine learning to learn lower-dimensional vector representations of graphs in a way that encodes topology, node and edge attributes, and neighborhood statistics. In this presentation we apply common graph embedding techniques (e.g. DeepWalk, GraphSAGE) to the malware graph. We investigate the outputs of these models to gauge their ability to learn meaningful representations in this domain and share lessons learned for applying graph ML techniques to large malware networks.
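As a minimal sketch of the random-walk family of techniques mentioned above (DeepWalk-style; the toy graph and hyperparameters are assumptions, not the presenter's setup), node embeddings can be learned by feeding random walks over the malware graph to a skip-gram model:

    # Sketch (assumed) of a DeepWalk-style embedding over a malware relationship graph.
    import random
    import networkx as nx
    from gensim.models import Word2Vec

    G = nx.Graph()
    G.add_edges_from([
        ("sample_a", "1.2.3.4"), ("sample_b", "1.2.3.4"),   # malware <-> IP address
        ("sample_b", "evil.example.com"),                    # malware <-> domain
        ("sample_c", "evil.example.com"),
    ])

    def random_walks(graph, num_walks=10, walk_len=20):
        walks = []
        for _ in range(num_walks):
            for node in graph.nodes():
                walk = [node]
                for _ in range(walk_len - 1):
                    nbrs = list(graph.neighbors(walk[-1]))
                    if not nbrs:
                        break
                    walk.append(random.choice(nbrs))
                walks.append(walk)
        return walks

    # Treat walks as "sentences" and learn node vectors with skip-gram.
    model = Word2Vec(random_walks(G), vector_size=64, window=5, min_count=1, sg=1)
    vec = model.wv["sample_a"]   # embedding usable for clustering / similarity search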

13:30
Michael Slawinski (Cylance Inc., United States)
Applications of Graph Integration to Function Comparison and Malware Classification

ABSTRACT. We classify .NET files as either benign or malicious by examining directed graphs derived from the set of functions comprising the given file. Each graph is viewed probabilistically as a Markov chain where each node represents a code block of the corresponding function, and by computing the PageRank vector (Perron vector with transport), a probability measure can be defined over the nodes of the given graph. Each graph is vectorized by computing Lebesgue antiderivatives of hand-engineered functions defined on the vertex set of the given graph against the PageRank measure. Files are subsequently vectorized by aggregating the set of vectors corresponding to the set of graphs resulting from decompiling the given file. The result is a fast, intuitive, and easy-to-compute glass-box vectorization scheme, which can be leveraged for training a standalone classifier or to augment an existing feature space. We refer to this vectorization technique as PageRank Measure Integration Vectorization (PMIV). We demonstrate the efficacy of PMIV by training a vanilla random forest on 2.5 million samples of decompiled .NET, evenly split between benign and malicious, from our in-house corpus and compare this model to a baseline model which leverages a text-only feature space. The median time needed for decompilation and scoring was 24ms.
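One plausible reading of the PMIV construction, sketched below, is to compute PageRank over a function's block-level graph and then accumulate a hand-engineered vertex function against that measure at fixed thresholds to obtain a fixed-length vector; the toy graph, vertex function, and thresholds are illustrative assumptions rather than the author's exact scheme.

    # Sketch (assumed interpretation) of PageRank Measure Integration Vectorization (PMIV).
    import networkx as nx
    import numpy as np

    # Toy directed graph: nodes are code blocks of one decompiled .NET function.
    G = nx.DiGraph([(0, 1), (1, 2), (2, 0), (1, 3), (3, 4)])

    # PageRank with teleportation ("Perron vector with transport") defines a
    # probability measure over the vertices.
    pr = nx.pagerank(G, alpha=0.85)

    def f(node):
        """Hand-engineered function on vertices, e.g. out-degree of the code block."""
        return G.out_degree(node)

    # "Lebesgue antiderivative" of f against the PageRank measure:
    # F(t) = PageRank mass on vertices where f(v) <= t, sampled at fixed thresholds t.
    thresholds = np.linspace(0, 3, num=8)
    vector = np.array([sum(p for v, p in pr.items() if f(v) <= t) for t in thresholds])

    print(vector)   # fixed-length per-graph vector; a file vector aggregates over its graphs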

14:00
Erick Galinkin (Netskope, United States)
What is the Shape of an Executable?

ABSTRACT. The empirical success of neural networks in fields such as natural language processing and computer vision has led researchers in many other fields, including information security, to try their hand at deep learning. However, the landmark results seen in some applications have not been repeated in information security, where deep learning approaches have rarely been successful without significant feature engineering. Most convolutional neural networks are written to use rectangular filters, but the convolution operator is flexible, and its efficacy in signal processing is often contingent on the shape of the signal being processed and the filter it is convolved with. We survey vectorization methods that have been used on binary executable files and consider their efficacy on the EMBER dataset. Additionally, we explore the topology of the binary executable files associated with the three major desktop operating systems (Windows, Mac OS X, and *nix) and compare the accuracy of neural networks using non-rectangular filters against benchmark accuracy results.
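One way to realize a non-rectangular filter in practice (a sketch under the assumption of a masked kernel; the cross shape is purely illustrative and not necessarily the filters evaluated in the talk) is to zero out the weights of a standard convolution outside the desired shape:

    # Sketch (assumed) of a non-rectangular filter: a standard Conv2d whose weights are
    # masked to a cross shape, so only the masked taps contribute to the convolution.
    import torch
    import torch.nn as nn

    class CrossConv2d(nn.Conv2d):
        def __init__(self, in_ch, out_ch, size=5):
            super().__init__(in_ch, out_ch, kernel_size=size, padding=size // 2)
            mask = torch.zeros(size, size)
            mask[size // 2, :] = 1.0      # horizontal arm of the cross
            mask[:, size // 2] = 1.0      # vertical arm of the cross
            self.register_buffer("mask", mask)

        def forward(self, x):
            return nn.functional.conv2d(x, self.weight * self.mask,
                                        self.bias, padding=self.padding)

    bytes_as_image = torch.rand(1, 1, 256, 256)   # toy executable rendered as a byte image
    layer = CrossConv2d(1, 16)
    print(layer(bytes_as_image).shape)            # torch.Size([1, 16, 256, 256])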

15:00-17:00 Session 3
15:00
David Elkind (CrowdStrike, Inc., United States)
Mitigating Adversarial Attacks against Machine Learning for Static Analysis

ABSTRACT. Computer security increasingly leverages machine learning to detect malware. This is not without risks. Machine learning methods have weaknesses that can be exploited by a savvy attacker. In the case of malware, adversaries have an enormous amount of control over how to accomplish their malicious goals in code; this flexibility allows malware authors to engineer PE files that can evade detection by machine learning.

In this presentation, we outline the high-level pipeline for identifying malware using machine learning and demonstrate an elementary strategy to evade detection using machine learning. Even simple modifications to a PE file can be leveraged to make the file evade naive machine learning models. As a notional example, we append ASCII bytes to the overlay of a PE file; because appending bytes to the overlay is unlikely to change the operation of the executable, the malicious functionality is likely left intact by this modification. Moreover, we show that such evasion can be mitigated using a novel regularization technique.
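The overlay-append modification itself is trivial to reproduce; a minimal sketch (file names are hypothetical) is simply:

    # Sketch (assumed) of the notional evasion: append ASCII bytes to a PE file's overlay.
    # Appending past the end of the last section rarely changes how the file executes.
    payload = b"A" * 4096                        # arbitrary printable bytes

    with open("sample.exe", "rb") as f:          # hypothetical input file
        data = f.read()

    with open("sample_modified.exe", "wb") as f:
        f.write(data + payload)                  # original bytes intact, overlay grown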

Our novel strategy for mitigating evasions leverages the internal structure of a deep neural network for malware classification. Specifically, we penalize the deep network’s objective function in proportion to the magnitude of the discrepancy between the hidden representations of a PE file and its corresponding modified version. This penalty encourages pairs of files (each original file paired with the same file with ASCII bytes appended) to be given similar learned representations within the hidden layers of the network. We know that the “twins” must have the same functionality, so the network should give them a similar representation. We show that this regularization strategy results in a model which is much more robust to targeted file perturbations such as the ASCII-byte evasion strategy. Furthermore, we analyze the trade-offs researchers need to make between adversarial hardening and detection efficacy.
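A minimal sketch of this kind of representation-consistency penalty (network sizes, the distance metric, and the regularization weight are assumptions, not the presenter's exact model) pairs each file with its byte-appended twin and adds the squared distance between their hidden representations to the classification loss:

    # Sketch (assumed) of a representation-consistency regularizer for a malware classifier.
    import torch
    import torch.nn as nn

    class MalwareNet(nn.Module):
        def __init__(self, n_features=2048, hidden=256):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
            self.head = nn.Linear(hidden, 1)

        def forward(self, x):
            h = self.body(x)          # hidden representation used for the penalty
            return self.head(h), h

    model = MalwareNet()
    bce = nn.BCEWithLogitsLoss()
    lam = 0.1                          # regularization strength (assumed)

    x_orig = torch.randn(16, 2048)     # features of original PE files
    x_pert = torch.randn(16, 2048)     # features of the same files with appended bytes
    y = torch.randint(0, 2, (16, 1)).float()

    logits, h_orig = model(x_orig)
    _, h_pert = model(x_pert)

    # Classification loss plus penalty on the distance between "twin" representations.
    loss = bce(logits, y) + lam * (h_orig - h_pert).pow(2).mean()
    loss.backward()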

15:30
Andy Applebaum (The MITRE Corporation, United States)
Trying to Make Meterpreter into an Adversarial Example

ABSTRACT. While machine learning has put previously hard-to-solve problems within reach, recent research has shown that many of the associated methods are susceptible to misclassification via the explicit construction of adversarial examples. These cleverly crafted inputs are designed to toe the line of classifier decision boundaries, and are typically constructed by slightly perturbing correctly classified instances until the classifier misclassifies them, even though the instances remain largely unchanged. Researchers have published ways to construct these examples with full, some, or no knowledge of the target classifier, and have furthermore shown their applicability to a variety of domains, including security.

In this talk, we’ll discuss several experiments where we attempted to make Meterpreter – a well-known and well-signatured RAT – into an adversarial example. To do this, we leveraged the open-source gym-malware package, which treats the target classifier as a black box and uses reinforcement learning to train an agent to apply perturbations to input PE files in a way that results in evasive malware. Deviating from existing work, our approach trained and tested only on different versions of Meterpreter, which were generated using msfvenom with different compilation options such as templates, encoders, and added code. Our goal was in part to test whether the reinforcement learning approach is more effective when focused on one malware family, as well as to see if we can make something well-known (and widely used) evasive.
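The overall shape of such an experiment follows the standard gym agent-environment loop, roughly as below; the environment id, episode structure, and reward handling are assumptions for illustration and not the exact gym-malware API.

    # Rough sketch of the gym-style perturbation loop (environment id and details are
    # assumptions, not the exact gym-malware API).
    import gym

    env = gym.make("malware-v0")            # hypothetical id for a PE-perturbation env

    for episode in range(100):              # one msfvenom-built Meterpreter variant per episode
        obs = env.reset()
        for turn in range(10):              # game length: 10 (or 20) perturbations
            action = env.action_space.sample()        # random-agent baseline
            obs, reward, done, info = env.step(action)
            if done:                        # classifier score dropped below threshold: evasive
                break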

Unfortunately, our results were underwhelming: we found little difference between using a fully black-box, gray-box, or random agent to apply perturbations, and we also did not see significant changes when varying the game length between 10 and 20 perturbations per instance. However, on analyzing the samples generated by msfvenom, we saw that many of the instances we created were naturally evasive due to their compilation parameters and did not benefit from applied perturbations; applying an encoder, for example, increased the confidence of the classifier, whereas using a template – even of a malicious executable – decreased it. Taken as a whole, our results lay out interesting areas for future work in both pre- and post-compilation adversarial example construction.

16:00
Evan C Yang (Intel, United States)
Towards a Trustworthy and Resilient Machine Learning Classifier - a Case Study of Ransomware Behavior Detector

ABSTRACT. Crypto-ransomware is a type of malware that hijacks a user’s resources and demands a ransom. It was expected to cost businesses more than $75 billion in 2019 and continues to be a problem for enterprises*. Because of the encryption, the damage caused by crypto-ransomware is difficult to reverse, and even with endpoint protection software in place, infections may still occur*. To block unseen ransomware, behavior-based detection combined with a proper backup mechanism is one mitigation. In this presentation, machine learning (ML) and deep learning (DL) classifiers are proposed to detect ransomware behaviors. We executed ransomware samples in Windows sandboxes and collected their input/output (I/O) activity. The time-series behavior data was analyzed with a long short-term memory (LSTM) network or an N-gram-featured support vector machine (SVM). We found that a naively trained classifier, even one with good accuracy (>98%) and a low false positive rate (<1.4%), did not perform well as an online detector in the wild. To boost the early detection rate and to overcome potential overfitting, data augmentation techniques were needed. To reduce sensitivity to the sliding-window size, an over-sampling mechanism was deployed to synthesize samples similar to those drawn from the I/O event stream.

An ML/DL model without adversarial mitigation may be vulnerable to adversarial attacks. A simulated ransomware, acting as the red team, was developed to probe the blind spots of our classifiers. This simulated program performs the core ransomware behavior, malicious encryption, along with configurable benign I/O activities (e.g. file creation or modification). With minor changes to the I/O pattern of the encryption, the red team had no difficulty bypassing detection. We conclude that adversarial mitigation is a necessary procedure to fortify an ML/DL classifier, especially when the dataset is limited. For security applications, it is also important to ensure that the classifier makes decisions based on meaningful features. The Integrated Gradients method was selected in our experiments to show the attribution of each time step in the LSTM model. We observed that the attribution pattern matched the known malicious activity sequences, confirming the fidelity of the classifier. The same method can also be applied to understand how an adversarial sample bypasses detection.

By building a ransomware detector, this presentation demonstrates a full ML/DL development process. We found the simulated adversarial program very helpful: it discloses weaknesses of the model and also serves as an adversarial sample generator. In addition to the regular ML/DL training-testing iteration for model optimization, we propose synthesizing adversarial samples with a polymorphic red-team program for an adversarial training iteration. Combined with data augmentation and model explanation techniques, the resiliency and fidelity of the model can be enhanced and verified. Tips and lessons learned for each step of this two-iteration pipeline will be shared in our presentation. We believe this in-depth analysis can serve as a general recommendation for cybersecurity ML/DL development.

*https://phoenixnap.com/blog/ransomware-statistics-facts
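A minimal sketch of the kind of sequence classifier described above (the event vocabulary, window length, and layer sizes are assumptions, not the presenter's model) scores a sliding window of encoded I/O events with an LSTM:

    # Sketch (assumed) of an LSTM over encoded I/O events (open/read/write/rename, etc.).
    import torch
    import torch.nn as nn

    class IOBehaviorLSTM(nn.Module):
        def __init__(self, n_event_types=32, embed_dim=16, hidden=64):
            super().__init__()
            self.embed = nn.Embedding(n_event_types, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, events):                  # events: (batch, seq_len) int event codes
            h, _ = self.lstm(self.embed(events))
            return self.head(h[:, -1])              # ransomware-likeness logit, last time step

    model = IOBehaviorLSTM()
    events = torch.randint(0, 32, (4, 200))         # toy sliding windows of I/O events
    labels = torch.randint(0, 2, (4, 1)).float()
    loss = nn.BCEWithLogitsLoss()(model(events), labels)
    loss.backward()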

16:30
Giorgio Severi (FireEye, United States)
Jim Meyer (FireEye, United States)
Scott Coull (FireEye, United States)
Exploring Backdoor Poisoning Attacks Against Malware Classifiers

ABSTRACT. Antivirus vendors often rely on crowdsourced threat feeds, such as VirusTotal and ReversingLabs, to provide them with a large, diverse stream of data to train their malware classifiers. Since these threat feeds are largely built around user-submitted binaries, they provide an ideal vector for poisoning attacks, where an attacker injects manipulated samples into the classifier’s training data in an effort to cause misclassifications after deployment. In a backdoor poisoning attack, the attacker places a carefully chosen watermark into the feature space such that the classifier learns to associate its presence with a class of the attacker’s choosing. These backdoor attacks have proven extremely effective against image classifiers without requiring a large number of poisoned examples, but their applicability to the malware classification domain remains uncertain.

In this talk, we explore the application of backdoor poisoning to malware classification through the development of novel, model-agnostic attacks in the white box setting that leverage tools from the area of model interpretability, namely SHapley Additive exPlanations (SHAP). Intuitively, our attack uses the SHAP values for the features as a proxy for how close certain values are to the decision boundary of the classifier, and consequently how easily we can manipulate them to embed our watermark. At the same time, we balance the ease of manipulation against our desire to blend in with surrounding (non-poisoned) samples, ensuring that we use watermarks that are consistent with the remainder of the dataset. Unlike previous work on backdoor attacks against image classifiers, which focuses solely on deep neural networks, our techniques can operate on any model where SHAP values can be approximated for the underlying feature space. Moreover, we adapt the threat model developed in the image classification space to more accurately reflect the realities of malware classification so that we can evaluate the efficacy of our attack as a function of the attacker’s knowledge and capabilities in manipulating the feature space.
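As a rough sketch of the general idea (not the authors' exact selection strategy; the model, feature count, and watermark heuristic below are assumptions), SHAP values from a tree-based classifier can guide which features to stamp with a watermark:

    # Minimal sketch (assumed, not the authors' algorithm) of SHAP-guided watermark selection
    # for a backdoor poisoning attack on a tree-based, EMBER-style classifier.
    import numpy as np
    import shap
    import lightgbm as lgb

    # Toy stand-in for a static-feature malware classifier.
    X = np.random.rand(1000, 50)
    y = np.random.randint(0, 2, 1000)
    model = lgb.LGBMClassifier(n_estimators=50).fit(X, y)

    explainer = shap.TreeExplainer(model)
    sv = explainer.shap_values(X)
    sv = sv[1] if isinstance(sv, list) else sv   # handle SHAP versions returning per-class lists

    # Heuristic: features whose average contribution is weak are easier to manipulate
    # without standing out; fix them to values common among benign samples as the watermark.
    mean_abs = np.abs(sv).mean(axis=0)
    watermark_features = np.argsort(mean_abs)[:4]
    watermark_values = np.median(X[y == 0][:, watermark_features], axis=0)

    X_poisoned = X.copy()
    X_poisoned[:, watermark_features] = watermark_values   # stamp the watermark into samples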

The results of our experiments on the EMBER dataset highlight the effectiveness of our backdoor attack, demonstrating high evasion rates with a training set containing a small proportion of poisoned examples. Even in the more extreme attack settings, these poisoned examples did not significantly impact the baseline performance of the classifier. In addition, we explored several common anomaly detection and dataset cleansing techniques to better understand useful mitigation strategies that antivirus vendors might use against our attack. Taken together, the results of our experiments validate the effectiveness of our model-agnostic backdoor poisoning attacks and bring to light a potential threat that antivirus vendors face when using crowdsourced threat feeds for training machine learning models.