Tags: Adaptive Oversampling, Class Imbalance, Classification Performance, Density-Adaptive and Synthetic Sampling
Abstract:
Class imbalance is a significant issue in machine learning, in which the number of majority class instances is much greater than the number of minority class instances. Many real-world datasets are imbalanced, which makes them difficult for standard classifiers to handle well. Conventional oversampling techniques, including SMOTE and ADASYN, do not adequately solve this problem because they are sensitive to outliers and class overlap. To address this, we propose the density-adaptive synthetic sampling (DASS) technique. DASS selects neighboring instances based on local density estimates to obtain a more representative sampling of the minority class. This reduces overfitting and improves model generalization. In our experiments, DASS outperforms baseline approaches across different classifiers in terms of ROC AUC, F1 score, recall, precision, and overall accuracy. These results indicate that DASS is an effective alternative for handling class imbalance and that it improves predictive performance. The proposed technique offers a new perspective on addressing class imbalance more effectively than the baseline resampling methods.
DASS: Density-Adaptive Synthetic Sampling for Improved Imbalanced Classification
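
The abstract describes selecting neighboring instances from local density estimates but gives no algorithmic detail. The following is only a minimal sketch of that general idea, not the authors' DASS procedure. It assumes that density is estimated as the inverse mean distance to the k nearest minority neighbors, and that denser points are chosen more often as seeds so that isolated outliers contribute few synthetic samples. The function name `density_weighted_oversample` and its parameters are hypothetical.

```python
import numpy as np

def density_weighted_oversample(X_min, n_new, k=5, seed=0):
    """Illustrative sketch only; NOT the DASS algorithm from the paper.

    X_min : array of minority-class samples, shape (n, d), with n >= 2
    n_new : number of synthetic samples to generate
    k     : number of minority neighbours used for the density estimate
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)

    # Pairwise Euclidean distances between minority points.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour

    # Indices and distances of the k nearest minority neighbours.
    nn_idx = np.argsort(dists, axis=1)[:, :k]
    nn_dist = np.take_along_axis(dists, nn_idx, axis=1)

    # Assumed density estimate: inverse of the mean k-NN distance.
    density = 1.0 / (nn_dist.mean(axis=1) + 1e-12)
    probs = density / density.sum()

    # Assumption: sample seed points in proportion to local density,
    # so sparse (possibly outlying) points are rarely used as seeds.
    seeds = rng.choice(n, size=n_new, p=probs)
    neighbours = nn_idx[seeds, rng.integers(0, k, size=n_new)]

    # SMOTE-style interpolation between each seed and one neighbour.
    gaps = rng.random((n_new, 1))
    return X_min[seeds] + gaps * (X_min[neighbours] - X_min[seeds])
```

The synthetic samples would then be appended to the training set with the minority label before fitting a classifier. Weighting toward dense regions is one possible design choice; ADASYN, by contrast, weights seeds by the share of majority-class neighbors.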