Fast, Flexible, Timing-accurate and Open Source Performance Modeling Method for Compute Accelerators

Title: Fast, Flexible, Timing-accurate and Open Source Performance Modeling Method for Compute Accelerators

Authors: Vishal Chovatiya, Andrew Stevens and Snehith Shenoy

Conference: DVCon Europe 2025

Tags: Compute Accelerators, CorePerfDSL, Design Space Exploration, Functional Model, Performance Modeling, Timing Model and Virtual Prototyping

Abstract:

The proliferation of Artificial Intelligence and Machine Learning (AI/ML) applications has fueled demand for specialized compute accelerators, creating complex design challenges across the hardware and software domains. Traditional Register-Transfer Level (RTL) simulation approaches are limited by their inherent complexity, slow simulation speeds, and extensive development resource requirements. This paper presents a fast, flexible, and timing-accurate performance modeling method for compute accelerators using CorePerfDSL, a domain-specific language originally developed for CPU pipeline modeling. We demonstrate the methodology’s effectiveness by implementing and validating a comprehensive performance model of a mini-NPU (Neural Processing Unit). The approach separates functional and timing concerns, enabling rapid architectural exploration while maintaining predictive accuracy. Experimental validation using MLPerf Tiny inference benchmarks shows that the performance model predicts mini-NPU accelerator performance with a mean absolute error of less than 10% relative to RTL simulation for 84% of the evaluated layers. The method achieves significantly higher simulation speeds than RTL simulation while providing timing-accurate results suitable for design space exploration and early software validation of compute accelerators.
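To make the accuracy claim concrete, the following minimal Python sketch shows one way a per-layer comparison against RTL could be computed. It is an illustration only: the abstract does not give the authors' exact error formula, so the sketch assumes error is measured as the absolute relative difference in cycle counts. The function names, layer names, and cycle counts are all hypothetical and are not results from the paper.

```python
def layer_errors(model_cycles, rtl_cycles):
    """Return the absolute relative error per layer, in percent.

    Assumption: error is |model - rtl| / rtl on cycle counts; the paper's
    exact metric may differ.
    """
    return {
        name: abs(model_cycles[name] - rtl_cycles[name]) / rtl_cycles[name] * 100.0
        for name in rtl_cycles
    }


def fraction_within(errors, threshold_pct=10.0):
    """Return the fraction of layers whose error is below threshold_pct."""
    within = sum(1 for err in errors.values() if err < threshold_pct)
    return within / len(errors)


# Hypothetical cycle counts for four layers (illustrative values only).
rtl_cycles = {"conv1": 1000, "dwconv2": 2000, "fc3": 500, "pool4": 400}
model_cycles = {"conv1": 1050, "dwconv2": 1900, "fc3": 600, "pool4": 404}

errors = layer_errors(model_cycles, rtl_cycles)
coverage = fraction_within(errors, threshold_pct=10.0)

for name, err in errors.items():
    print(f"{name}: {err:.1f}% error")
print(f"Layers within 10% of RTL: {coverage:.0%}")
```

With these made-up inputs, three of the four layers fall under the 10% threshold; a figure such as "84% of evaluated layers" would be the same ratio computed over the benchmark's actual layers.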