CFP

MLPA 2020: Machine Learning for Program Analysis

Website: https://sites.google.com/view/mlpa2020
Submission deadline: September 4, 2020

Topics: machine learning, program analysis

We are excited to invite submissions to the Machine Learning for Program Analysis (MLPA) workshop, which will take place virtually in January 2021 and is co-located with the IJCAI-PRICAI 2020/2021 conference.

The main objective of the workshop is to foster new discussions across the machine learning and program analysis communities, highlight open problems, and suggest possible research directions for addressing the technical and fundamental challenges and limitations of today’s program analysis models and techniques. Submissions discussing how existing work is relevant in this context are also welcome.

The workshop will not have formal proceedings, and accepted papers will not be considered archival publications.

Workshop Scope

Program analysis is an essential research area in software security. In addition to formal methods and compiler theory, a wide range of post-development techniques has been developed over time to address software security problems such as vulnerability discovery, reverse engineering, code clone detection, and obfuscation/deobfuscation, among many other applications. Some approaches require source code to operate at the language or bytecode level, whereas others focus on binary code to cope with situations where source code and/or build environments are not accessible.

In both cases, methods for post-development program analysis have traditionally relied on manually defined heuristics, requiring human effort and limiting the scalability of the resulting models.

In recent years, as software size, complexity, and attack surface have continued to grow, there has been increasing interest in applying machine learning to further automate program analysis techniques and improve their scalability. Examples include the use of Conditional Random Fields to recover debug information for binaries, deep neural networks to identify function boundaries and function types, and learned approaches to vulnerability discovery and decompilation. Graph-based methods have also been used for assessing the similarity of two binary inputs, code-duplicate detection, code classification, and vulnerability detection, among others.

The main objective of this workshop is to bring together researchers in the machine learning and program analysis communities and to serve as a platform for identifying cross-disciplinary problems of mutual interest. A partial list of the topics covered at the workshop includes representation learning, natural language processing, and graph-based methods for source-level, binary-level, and bytecode-level program analysis.

We encourage the following submissions to MLPA:

Submissions that apply machine learning to solve program analysis problems at the source-code level and that could benefit binary-level program analysis. Such submissions should include a paragraph discussing why the proposed model is relevant to binary program analysis, how it would apply, and which sets of problems it would solve.

Submissions focusing on binary program analysis in all its forms, i.e., the analysis of compiled executable programs or bytecode including smart contracts.

Submissions that do not currently involve machine learning but discuss possible future plans involving it, in particular in order to a) cope with current limitations of applied binary program analysis models, or b) solve fundamental problems of binary program analysis that cannot be solved deterministically using programmatic approaches or that require extensive use of specialized heuristics. Such submissions should include a section discussing the technical aspects and impact of these limitations, highlighting open problems, and suggesting possible research directions to address them with machine learning.

A partial list of topics of interest covered in this workshop includes:

Representation learning for program analysis
Natural language processing for program analysis
Graph neural networks for intermediate representations
Supervised vs unsupervised problems in program analysis
Relevant applications, e.g.:
- code similarity detection
- vulnerability detection
- function boundary identification
Standardized datasets and benchmarks
Automated analysis approaches for Go and Rust binaries
Automated analysis of smart contracts

Paper submissions for MLPA can be of two types:

Full-length papers (max 6 pages + 1 page for references) describing original research findings.
Short papers (max 4 pages + 1 page for references) describing challenge problems.

*Please note that MLPA submissions will not appear in proceedings*.

Papers should be formatted according to the IJCAI-PRICAI formatting guidelines: https://www.ijcai.org/authors_kit

Important Dates

Submission deadline: Friday, September 4, 2020

Notification of acceptance: Friday, September 22, 2020

Camera-ready version: Friday, September 29, 2020

*All submission deadlines are at 23:59 Anywhere on Earth on the date indicated.*

Program Committee

Shushan Arakelyan, Information Sciences Institute/University of Southern California
Aram Galstyan, Information Sciences Institute/University of Southern California
Christophe Hauser, Information Sciences Institute/University of Southern California
Dawn Song, University of California at Berkeley
Heng Yin, University of California at Riverside

COVID-19: in spite of the evolving COVID-19 situation, the IJCAI-PRICAI conference will take place in some form (physical or virtual). In any case, the MLPA workshop will provide a virtual option to ensure participants can attend regardless of the situation.