Learning to Identify High Betweenness Centrality Nodes from Scratch: A Novel Graph Neural Network Approach

Abstract:

Betweenness centrality (BC) is one of the most used centrality measures for network analysis. It is key to many valuable applications, including community detection and network dismantling. Computing BC scores on large networks is computational challenging due to its high time complexity. Many approximation algorithms have been proposed to speed up the running time, which are mainly sampling-based. However, these methods are still time consuming in handling large-scale networks, and are not adaptive to even small changes in the networks.

In this paper, we focus on identifying nodes with high betweenness centrality in a graph, since many appplication scenarios are built upon retrieving nodes with top-k betweenness centrality. Different from previous heuristic methods, we turn this task into a learning problem and design an encoder-decoder based framework to resolve the problem. More specifcally, the encoder leverages the network structure to encode each node into an embedding vector, which captures the important structural information of the node. The decoder transforms the embedding vector for each node into a scalar, which captures the relative rank position of this node in terms of BC. We use the pairwise ranking loss to train the model to identify the orders of nodes regarding their BC. By training on small-scale networks, the learned model is capable of assigning relative BC scores to nodes for any unseen networks, and thus identify the highly-ranked nodes. Extensive experiments on both synthetic and real-world networks demonstrate that, compared to representative baselines, our model drastically speeds up the prediction without noticeable sacrifce in accuracy. And on some large real data, we even outperform the state-of-the-art by accuracy also along with runtime.