Accelerating Neural Policy Repair with Preservation via Stability-Plasticity Interpolation
ABSTRACT. Neural network (NN)-based control policies have been widely adopted in cyber-physical systems (CPS). When an NN-based policy fails to fulfill a formally specified task, researchers use NN repair algorithms to fix it. Recent work raises the problem of Repair with Preservation (RwP), which asks that existing correct behaviors be preserved while the incorrect ones are repaired, and gives a corresponding solution known as Incremental Simulated Annealing Repair (ISAR). In this paper, we tackle the computational inefficiency of ISAR, which relies on expensive log-barrier objective functions and wastes computation rolling back whenever a repaired NN breaks correct behaviors. Through our analysis, we reduce the RwP problem to a stability-plasticity (S-P) trade-off interpolation problem, which has been studied in continual learning (CL). We then propose ISAR with Interpolation (ISAR-I), which substantially improves on ISAR. ISAR-I abandons the expensive log barriers and the rollback, allowing intermediate policies to compromise correct behaviors for the sake of repair. An interpolation of the S-P trade-off between the original NN and the intermediate NN is then performed in the Bayesian space, searching for a final NN that both repairs and preserves. Case studies on the OpenAI Gym mountain car and an unmanned underwater vehicle show that ISAR-I preserves all verified trajectories while repairing 81.7\% and 21.3\% of the broken ones, respectively, achieving the same performance as ISAR at an average runtime cost of only 6.5\% and 19.6\%.
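To make the interpolation idea concrete, the following is a minimal sketch and not the ISAR-I procedure itself. It swaps in a plain linear interpolation of weight vectors where the abstract describes an interpolation in the Bayesian space. The names `interpolate`, `search_interpolation`, `preserves_all`, and `count_repaired` are hypothetical, as is the one-parameter toy policy in the usage lines.

```python
from typing import Callable, List, Optional, Tuple

Weights = List[float]


def interpolate(theta_orig: Weights, theta_mid: Weights, alpha: float) -> Weights:
    """Convex combination of two weight vectors.

    alpha = 0 returns the original (stable) weights and alpha = 1 returns the
    intermediate (plastic) weights. Assumption: plain linear interpolation.
    """
    return [(1.0 - alpha) * o + alpha * m for o, m in zip(theta_orig, theta_mid)]


def search_interpolation(
    theta_orig: Weights,
    theta_mid: Weights,
    preserves_all: Callable[[Weights], bool],
    count_repaired: Callable[[Weights], int],
    steps: int = 11,
) -> Optional[Tuple[float, Weights, int]]:
    """Scan alpha over an even grid on [0, 1].

    Among candidates that keep every correct behavior, return the one that
    repairs the most; ties go to the smallest alpha (closest to the original).
    Returns None if no candidate preserves all correct behaviors.
    """
    best: Optional[Tuple[float, Weights, int]] = None
    for i in range(steps):
        alpha = i / (steps - 1)
        theta = interpolate(theta_orig, theta_mid, alpha)
        if not preserves_all(theta):
            continue  # this candidate breaks a previously correct behavior
        repaired = count_repaired(theta)
        if best is None or repaired > best[2]:
            best = (alpha, theta, repaired)
    return best


# Hypothetical toy policy with a single weight: behaviors stay correct while
# the weight is below 0.65, and each threshold crossed repairs one trajectory.
result = search_interpolation(
    [0.0],
    [1.0],
    preserves_all=lambda t: t[0] < 0.65,
    count_repaired=lambda t: sum(t[0] > c for c in (0.25, 0.45, 0.85)),
)
```

In this toy setting the intermediate weights (alpha = 1) repair the most trajectories but break preservation, so the search settles on an interior point that trades some plasticity for stability.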