Tags: benchmarking, cloud computing, emulation, trace and workload
Abstract:
Although major cloud providers have captured and published workload executions in the form of traces, it is not clear how to use them for workload generation on a wide range of existing platforms. A remaining methodological challenge is to generate and execute realistic datacenter workloads on any infrastructure, using information from available traces. In this paper, we propose Tracie, a methodology addressing this challenge, and introduce a pair of tools supporting its implementation. We present the methodology and all steps of workload generation: analysis of datacenter traces, extraction of parameters, application selection, and scaling of a workload to match the capabilities of the underlying infrastructure. Our evaluation validates that Tracie can generate executable workloads that closely resemble their trace-based counterparts. For validation, we correlate the system metrics recorded in a trace against those measured during the actual execution. We find that the average system metrics of synthetic workloads differ by at most 5% from those in the trace and that the two are highly correlated, with correlation above 80% in all cases.
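
The validation criterion described above can be illustrated with a minimal sketch. It assumes that the correlation is a Pearson correlation and that the difference is the relative difference between the averages of the two series; the abstract does not state the exact definitions. The function names and the sample values are hypothetical and do not come from Tracie's tools or from any published trace.

```python
import math


def mean_relative_difference(trace, measured):
    """Relative difference between the averages of two metric series."""
    trace_mean = sum(trace) / len(trace)
    measured_mean = sum(measured) / len(measured)
    return abs(measured_mean - trace_mean) / abs(trace_mean)


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def resembles_trace(trace, measured, max_diff=0.05, min_corr=0.80):
    """Check both thresholds quoted in the abstract (5% and 80%)."""
    return (mean_relative_difference(trace, measured) <= max_diff
            and pearson_correlation(trace, measured) >= min_corr)


# Hypothetical CPU utilization samples (fractions), illustration only.
trace_cpu = [0.20, 0.35, 0.50, 0.65, 0.40, 0.30]
measured_cpu = [0.22, 0.33, 0.52, 0.63, 0.43, 0.29]

print(resembles_trace(trace_cpu, measured_cpu))
```

Under these assumptions, a synthetic workload would count as resembling its trace when both checks pass for each recorded system metric.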