MLPA 2020: Machine Learning for Program Analysis
Website: https://sites.google.com/view/mlpa2020
Submission deadline: September 4, 2020
We are excited to invite submissions to the Machine Learning for Program Analysis (MLPA) workshop, which will take place virtually in January 2021 and is co-located with the IJCAI-PRICAI 2020/2021 conference.
The main objective of the workshop is to foster new discussions across the machine learning and program analysis communities, highlight open problems, and suggest research directions that address the technical and fundamental challenges and limitations of today’s program analysis models and techniques. Submissions discussing how existing work is relevant in this context are also welcome.
The workshop will not have formal proceedings considered as archival publications.
Workshop Scope
Program analysis is an essential research area in software security. In addition to formal methods and compiler theory, a wide range of post-development techniques has been developed over time to solve software security problems, including vulnerability discovery, reverse engineering, code clone detection, and obfuscation/deobfuscation, among many other applications. Some approaches require source code and operate at the language or bytecode level, whereas others focus on binary code in order to cope with situations where source code and/or build environments are not accessible.
In both cases, methods for post-development program analysis have traditionally relied on manually defined heuristics, requiring human effort and limiting the scalability of the resulting models.
In recent years, as software size, complexity, and attack surface have kept growing, there has been increasing interest in applying machine learning to further automate program analysis techniques and improve their scalability. Examples include the use of Conditional Random Fields to recover debug information for binaries, and of deep neural networks to identify function boundaries and function types, discover new vulnerabilities, and decompile binaries. Graph-based methods have also been used to assess the similarity of two binary inputs and for code-duplicate detection, code classification, and vulnerability detection, among others.
The main objective of this workshop is to bring together researchers in the machine learning and program analysis communities and to serve as a platform for identifying cross-disciplinary problems of mutual interest. A partial list of the topics covered at the workshop includes representation learning, natural language processing, and graph-based methods for source-level, binary-level, and bytecode-level program analysis.
We encourage the following submissions to MLPA:
- Submissions that apply machine learning to solve program analysis problems at the source-code level and that could benefit binary-level program analysis. Such submissions should include a paragraph discussing why the proposed model is relevant to binary program analysis, how it would apply, and which sets of problems it would solve.
- Submissions focusing on binary program analysis in all its forms, i.e., the analysis of compiled executable programs or bytecode, including smart contracts.
- Submissions that do not currently involve machine learning but discuss possible future plans involving machine learning, in particular in order to a) cope with current limitations of applied binary program analysis models, or b) solve fundamental problems of binary program analysis that cannot be solved deterministically using programmatic approaches or that require extensive use of specialized heuristics. Such submissions should include a section discussing the technical aspects and impact of these limitations, highlight open problems, and suggest possible research directions to address them with machine learning.
A partial list of topics of interest covered in this workshop includes:
- Representation learning for program analysis
- Natural language processing for program analysis
- Graph neural networks for intermediate representations
- Supervised vs unsupervised problems in program analysis
- Relevant applications, e.g.:
  - code similarity detection
  - vulnerability detection
  - function boundary identification
- Standardized datasets and benchmarks
- Automated analysis approaches for Go and Rust binaries
- Automated analysis of smart contracts
Paper submissions for MLPA can be of two types:
- Full-length papers (max 6 pages + 1 page for references) describing original research findings
- Short papers (max 4 pages + 1 page for references) describing challenge problems
*Please note that MLPA submissions will not appear in proceedings*.
Papers should be formatted using the IJCAI-PRICAI formatting guidelines; see https://www.ijcai.org/authors_kit
Important Dates
Submission deadline: Friday, September 4, 2020
Notification of acceptance: Friday, September 22, 2020
Camera-ready version: Friday, September 29, 2020
*All submission deadlines are at 23:59 Anywhere on Earth on the date indicated.*
Program Committee
- Shushan Arakelyan, Information Sciences Institute/University of Southern California
- Aram Galstyan, Information Sciences Institute/University of Southern California
- Christophe Hauser, Information Sciences Institute/University of Southern California
- Dawn Song, University of California at Berkeley
- Heng Yin, University of California at Riverside
COVID-19: in spite of the evolving COVID-19 situation, the IJCAI-PRICAI conference will take place in some form (physical or virtual). In any case, the MLPA workshop will provide a virtual option to ensure participants can attend regardless of the situation.