Tags: Bounding Box Regression, Deep Learning, Document Processing, Industry 4.0, Object Detection, Page Object Detection, Supply Chain Optimization
Abstract:
Documents are constantly being processed within supply chains in various industries throughout the globe. Within those documents, the most important content is often stored in tabular format. Therefore, an automated technique for supply chain document processing is highly desired. Deep learning approaches show promise to deliver an end-to-end extraction model. However, it has been shown that tabular detection accuracy is not always correlated with tabular localization accuracy. Portions of the desired tabular information can easily be cropped out due to a lack of localization accuracy. In this paper, we propose a two-stage convolutional neural network-based deep learning framework to improve tabular localization accuracy. We use a pre-trained ResNet-50 backbone network and then apply transfer learning to fit our application. One of our main contributions is the introduction of the KL loss function into Faster-RCNN. Once the bounding box variances are output, we introduce a voting procedure with soft non-maximum suppression (Soft-NMS) to improve localization performance. The proposed framework is trained and evaluated on public and private datasets that range from scientific documents to various electronic components. Our test results show that the precision of tabular detection can be improved by 1.2% while achieving the same recall as other state-of-the-art models on the public ICDAR2013 dataset. Furthermore, a large improvement in precision has been achieved at extremely high intersection over union (IoU) thresholds (i.e., 95%). Specifically, 10% higher precision is achieved at 95% IoU on ICDAR2013. On another public dataset, namely ICDAR2017, 8.4% higher precision is achieved at 95% IoU.
High Precision Deep Learning-Based Tabular Detection
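The abstract reports precision at a 95% IoU threshold and notes that weak localization can crop out part of a table. A minimal sketch of the IoU measure illustrates how strict that threshold is. The function name `iou`, the `(x1, y1, x2, y2)` box format, and the example coordinates are assumptions chosen for illustration and are not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical table region: 400 px wide, 200 px tall.
ground_truth = (100.0, 100.0, 500.0, 300.0)

# Hypothetical predictions that crop 10 px and 30 px off the right edge.
tight = (100.0, 100.0, 490.0, 300.0)
cropped = (100.0, 100.0, 470.0, 300.0)

for name, box in [("tight", tight), ("cropped", cropped)]:
    score = iou(ground_truth, box)
    print(f"{name}: IoU={score:.3f}  counts at 0.50: {score >= 0.50}  counts at 0.95: {score >= 0.95}")
```

In this hypothetical case, losing 30 px of a 400 px wide table (roughly one narrow column) still counts as a correct detection at a 50% IoU threshold but not at 95%, which is the gap between detection accuracy and localization accuracy that the abstract describes.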