Deep learning techniques have recently made remarkable progress in many areas. One of these tasks, instance segmentation (also called object segmentation), segments and classifies the individual objects in an image. Several neural network models, such as Mask R-CNN, perform this task with good results. However, the results of Mask R-CNN can be improved by combining several models and merging their predictions through ensemble learning. We propose an algorithm, which we call Simple Instance Segmentation Ensemble (SISE), that combines the predictions of two previously trained Mask R-CNN networks. In our experiments, we train several Mask R-CNN networks on synthetic images of machinery objects. The networks differ in their backbones and in the kernel size of the Gaussian blur filter applied to the synthetic training images. We evaluate the networks on real images of machinery, and the SISE ensembles obtain better results than the individual Mask R-CNN networks. Our best result is an ensemble of one Mask R-CNN trained on synthetic images smoothed with a 7x7 Gaussian blur kernel and another trained with a 3x3 kernel. Both networks use a ResNeXt-101 backbone with FPN (Feature Pyramid Network). This ensemble achieves a bounding box mAP of 89.42% and a segmentation mAP of 88.34% on the real machinery test images.
Ensemble Learning to Perform Instance Segmentation over Synthetic Data
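The abstract does not state how SISE merges the predictions of the two networks, so the following is only a generic sketch of one common way to combine two sets of instance predictions, not the authors' algorithm. It assumes each prediction carries a class label, a confidence score, and a mask given as a set of pixel coordinates. Instances from the two models that share a label and overlap above a mask-IoU threshold are treated as the same object, and the higher-scoring one is kept. Unmatched instances from either model are retained. The names `Instance`, `mask_iou`, `merge_predictions`, and the threshold value are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One predicted object (hypothetical container, not from the paper)."""
    label: str
    score: float
    mask: frozenset  # set of (row, col) pixels belonging to the object


def mask_iou(a: frozenset, b: frozenset) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def merge_predictions(preds_a, preds_b, iou_thresh=0.5):
    """Greedily merge two prediction lists (assumed rule, see lead-in).

    For each instance of model A, find the best unmatched instance of
    model B with the same label and mask IoU >= iou_thresh. If one is
    found, keep whichever of the pair has the higher score. Instances
    without a match are kept unchanged.
    """
    merged = []
    used_b = set()
    for pa in preds_a:
        best_j, best_iou = -1, iou_thresh
        for j, pb in enumerate(preds_b):
            if j in used_b or pb.label != pa.label:
                continue
            iou = mask_iou(pa.mask, pb.mask)
            if iou >= best_iou:
                best_j, best_iou = j, iou
        if best_j >= 0:
            used_b.add(best_j)
            pb = preds_b[best_j]
            merged.append(pa if pa.score >= pb.score else pb)
        else:
            merged.append(pa)
    # Keep detections that only model B produced.
    merged.extend(pb for j, pb in enumerate(preds_b) if j not in used_b)
    return merged


# Toy example: two models agree on one object, and model B finds an extra one.
square = frozenset((r, c) for r in range(2) for c in range(2))
other = frozenset((r, c) for r in range(2, 4) for c in range(2, 4))
model_a = [Instance("gear", 0.9, square)]
model_b = [Instance("gear", 0.8, square), Instance("bolt", 0.7, other)]
result = merge_predictions(model_a, model_b)
```

In this toy case the two `gear` detections are matched and the one scoring 0.9 is kept, while the `bolt` detection seen only by the second model is carried over. Keeping unmatched detections favors recall, and a stricter variant could drop them or lower their scores.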