MioGatto: A Math Identifier-Oriented Grounding Annotation Tool

EasyChair Preprint no. 6209

7 pages
Date: August 1, 2021


We present MioGatto, a new annotation tool for efficiently building large corpora for grounding math formulae. In documents in science, technology, engineering, and mathematics, a math identifier can carry multiple meanings within a single document, so corpora annotated with coreference relations between identifiers are crucial for the grounding task. Using MioGatto, annotators can produce a list of math concepts for each document, associate one of those concepts with each occurrence of a math identifier, and annotate the text span that serves as the source for the grounding. Manual annotation of coreference relations is generally laborious, but because this tool is specialized for building grounding corpora, annotators can work with it more efficiently than with existing general-purpose annotation tools. The tool can be obtained from
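
The annotation workflow described in the abstract (a per-document list of math concepts, one concept per identifier occurrence, and a source text span) can be pictured as a small data model. The following is only an illustrative sketch, not MioGatto's actual schema or API: the class names, fields, and the `ground` and `coreference_chains` methods are all hypothetical and were chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class MathConcept:
    """One candidate meaning of an identifier (hypothetical schema)."""
    identifier: str   # surface symbol, e.g. "x"
    description: str  # human-readable meaning


@dataclass
class Occurrence:
    """One occurrence of an identifier in the document (hypothetical schema)."""
    identifier: str
    position: int                                  # assumed token offset
    concept: Optional[MathConcept] = None          # filled in by the annotator
    source_span: Optional[Tuple[int, int]] = None  # assumed (start, end) offsets


@dataclass
class DocumentAnnotation:
    """Concept list plus grounded occurrences for a single document."""
    concepts: List[MathConcept] = field(default_factory=list)
    occurrences: List[Occurrence] = field(default_factory=list)

    def ground(self, index: int, concept: MathConcept,
               source_span: Optional[Tuple[int, int]] = None) -> None:
        """Associate one listed concept with the occurrence at `index`."""
        occurrence = self.occurrences[index]
        if concept not in self.concepts:
            raise ValueError("concept is not in this document's concept list")
        if concept.identifier != occurrence.identifier:
            raise ValueError("concept belongs to a different identifier")
        occurrence.concept = concept
        occurrence.source_span = source_span

    def coreference_chains(self) -> Dict[MathConcept, List[int]]:
        """Group occurrence indices by concept; ungrounded ones are skipped."""
        chains: Dict[MathConcept, List[int]] = {}
        for index, occurrence in enumerate(self.occurrences):
            if occurrence.concept is not None:
                chains.setdefault(occurrence.concept, []).append(index)
        return chains
```

In this sketch, occurrences that share a concept form one coreference chain, so the same symbol can end up in several chains when it is used with more than one meaning in a document.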

Keyphrases: annotation tool, coreference resolution, Grounding of formulae, Mathematical Language Processing, Natural Language Processing

