GPU Vs CPU Implementation of the Kernighan-Lin Partitioning Algorithm for Improved Partitioning Results

Abstract:

Kernighan-Lin Partitioning Algorithm divides a graph into two roughly equal-sized sets, minimizing the number of edges that connect the two. The algorithm's core idea is to iteratively exchange nodes between the two sets in order to enhance partitioning. The initial cut size, or the number of edges that cross the partition, is determined by first splitting the nodes into two sets. The process then repeatedly chooses a pair of nodes from various sets whose exchange causes the greatest reduction in cut size. Up until there is no more improvement possible, this process is repeated. This algorithm is implemented on two hardware platforms – CPU and a GPGPU. The former was implemented using C++ and latter was with CUDAC using Nvidia’s nvGRAPH Library. The complexity of the CPU implementation of the KL algorithm for VLSI partitioning is O(n3). To overcome the high degree of complexity, the different stages of KL Partitioning Algorithm can be completed in parallel on the GPU that features a tightly parallel design with thousands of smaller, more efficient cores that are intended to do many jobs at once. Hence, GPU implementation reduces the complexity to O(n).