GLESDO2022: Graph Representation Learning for Scanned Document Analysis
Montréal, Québec, Canada, August 21-25, 2022
Conference website: https://sites.google.com/view/glesdo-workshop-icpr2022/home
Submission link: https://easychair.org/conferences/?conf=glesdo2022
Introduction:
Robust reading, also known as automatic document image processing, is an essential task in various application areas such as invoice data extraction, subject review, and medical prescription analysis, and holds significant commercial potential. Several approaches have been proposed in the literature, but limited dataset availability and data privacy constraints remain challenges.
Information extraction from documents involves several aspects: (1) document classification, (2) text localization, (3) OCR (Optical Character Recognition), (4) table extraction, and (5) key information detection. In this context, graph-based approaches are attractive for document processing. Graphs are a natural way to represent the connections among objects (text, blocks, images, etc.) and can help discover novel and hidden knowledge from data. The text extracted from scanned documents can be represented as a graph to make the best use of its characteristics. In addition, understanding spatial relationships is critical for extraction quality in applications such as invoice analysis, where the aim is to capture the structural connections between keywords (invoice number, date, amounts) and the main value (the desired information). An effective approach therefore requires a combination of spatial and textual information.
After the success of GLESDO 2021, the second edition aims to provide a forum for experts from industry, science, and academia to exchange ideas and discuss ongoing research in graph representation learning for scanned document analysis.
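As a rough illustration of combining spatial and textual information, the sketch below links OCR tokens into a small layout graph and looks up the value attached to a keyword. It is a minimal example under stated assumptions, not a method endorsed by the workshop: the `Token` class, the helper functions, the "nearest token to the right, else nearest token below" rule, and the sample invoice tokens are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """A hypothetical OCR token: text plus bounding box (top-left origin)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def same_row(a, b):
    """True if the vertical centers differ by less than half the smaller height."""
    return abs((a.y + a.h / 2) - (b.y + b.h / 2)) < min(a.h, b.h) / 2


def same_column(a, b):
    """True if the horizontal extents of the two boxes overlap."""
    return a.x < b.x + b.w and b.x < a.x + a.w


def build_graph(tokens):
    """Link each token to its nearest neighbor to the right and below (or None)."""
    graph = {}
    for i, a in enumerate(tokens):
        right = [j for j, b in enumerate(tokens)
                 if j != i and same_row(a, b) and b.x >= a.x + a.w]
        below = [j for j, b in enumerate(tokens)
                 if j != i and same_column(a, b) and b.y >= a.y + a.h]
        graph[i] = {
            "right": min(right, key=lambda j: tokens[j].x, default=None),
            "below": min(below, key=lambda j: tokens[j].y, default=None),
        }
    return graph


def value_for(keyword, tokens, graph):
    """Return the text linked to a keyword: right neighbor first, then below."""
    for i, token in enumerate(tokens):
        if token.text.lower() == keyword.lower():
            j = graph[i]["right"]
            if j is None:
                j = graph[i]["below"]
            return tokens[j].text if j is not None else None
    return None


# Made-up tokens standing in for the OCR output of a scanned invoice.
tokens = [
    Token("Invoice number", 10, 10, 120, 12),
    Token("INV-0042", 150, 10, 70, 12),
    Token("Date", 10, 30, 40, 12),
    Token("2022-06-06", 150, 30, 80, 12),
    Token("Total", 10, 60, 40, 12),
    Token("118.50", 10, 80, 50, 12),
]
graph = build_graph(tokens)

for keyword in ("Invoice number", "Date", "Total"):
    print(keyword, "->", value_for(keyword, tokens, graph))
```

Here the textual side is only an exact keyword match and the spatial side is only two edge types; learned graph representations would replace both hand-written rules.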
We encourage the description of novel problems or applications for document image analysis in the area of information retrieval that have emerged in recent years. Furthermore, we also encourage works that develop new scanned document datasets for novel applications.
Topics of interest:
This workshop invites high-quality submissions related, but not limited, to the topics below:
- Graph embeddings
- Deep learning for graphs
- Probabilistic graphical models for graphs
- Graph-based approaches for text mining
- Graph-based approaches for the spatial component of scanned documents
- Graph representation learning for NLP
- Graph-based approaches using kernels
- Spectral graph clustering
- Semi-supervised graph-based methods
- Dynamic graph analysis
- Information retrieval and extraction using graph-based methods
- Knowledge graphs for semantic document analysis
- Semantic understanding of document content
- Entity and link prediction in graphs
- Merging ontologies with graph-based methods using NLP techniques
- Cleansing and image enhancement techniques for scanned documents
- Document structure and layout learning
- OCR-based graph methods
- Font and text recognition in scanned documents
- Table identification and extraction from scanned documents
- Handwriting detection and recognition in documents
- Signature detection and verification in documents
- Visual document structure understanding
- Visual Question Answering
- Invoice analysis
- Scanned document classification
- Scanned document summarization
- Scanned document translation
- and so on.
Submission:
All submissions will be handled electronically via the EasyChair website. All authors must agree to the policies stipulated below.
We welcome the following types of contributions:
- Full research papers (8-10 pages): Finished or consolidated R&D work falling within one of the workshop topics.
- Short papers (4-6 pages): Ongoing work with relevant preliminary results, open to discussion.
Important dates:
Submission Deadline: June 6, 2022, at 11:59pm Pacific Time
Decisions Announced: June 27, 2022, at 12:00pm Pacific Time
Camera Ready Deadline: July 11, 2022, at 12:00pm Pacific Time
Workshop day: August 21, 2022
Workshop Chairs:
Rim Hantach, Engie, France
Rafika Boutalbi, Stuttgart University, Germany
Philippe Calvez, Engie, France