Tags:abstract syntax tree, ast augmented ast, augmented ast, control flow, deep learning, fixed vocabulary, graph neural network, graph neural networks, learning representation, machine learning, natural language processing, neural network, source code, unbounded vocabulary and variable naming task
Abstract:
A major challenge when using techniques from Natural Language Processing for supervised learning on computer program source code is that many words in code are neologisms. Reasoning over such an unbounded vocabulary is not something NLP methods are typically suited for. We introduce a deep model that contends with an unbounded vocabulary (at training or test time) by embedding new words as nodes in a graph as they are encountered and processing the graph with a Graph Neural Network.
Deep Learning On Code with an Unbounded Vocabulary