Tags: Deep Learning, Machine Learning and Databases, Predictive Query, Spatio-temporal Databases and Time-series
Abstract:
Consider a set of black-box models -- each of them independently trained on a different dataset -- answering the same predictive spatio-temporal query. Being built in isolation, each model traverses its own life cycle until it is deployed to production. As such, these competing models learn data patterns from different datasets and undergo independent hyper-parameter tuning. In order to answer the query, the set of black-box predictors has to be ensembled and allocated to the spatio-temporal query region. However, computing an optimal ensemble is a complex task that involves selecting the appropriate models and defining an effective allocation function that maps the models to the query region.
In this paper, we present DJEnsemble, a cost-based approach for the automatic selection and allocation of a disjoint ensemble of black-box predictors to answer predictive spatio-temporal queries. We conduct a set of extensive experiments that evaluate the DJEnsemble approach and highlight its efficiency. We show that our cost model produces plans that are close to the best plan. When compared against the traditional ensemble approach, DJEnsemble achieves up to $4\times$ improvement in execution time and almost $9\times$ improvement in prediction accuracy.
DJEnsemble: a Cost-Based Selection and Allocation of a Disjoint Ensemble of Spatio-Temporal Models