
09:30-12:00 Session 12: Doctoral Programme
Undefinedness and Soft Typing in Formal Mathematics

ABSTRACT. I want to extend the MMT ecosystem with support for undefinedness and soft type systems, and to provide the automated reasoning infrastructure needed for this to succeed.

Formally Correct Deduction Methods for Computational Logic

ABSTRACT. Project title: Formally Correct Deduction Methods for Computational Logic. Supervisor: Jørgen Villadsen. Co-supervisor: Nina Gierasimczuk. Dates: February 2020 – January 2023.

Formal Verification for Safe and Secure Hardware Systems

ABSTRACT. Please find attached.

Studying the learning process of student teachers regarding the subject of mathematical terminology in graph theory with the usage of an electronic dictionary

ABSTRACT. The project mainly aims to examine how the use of an electronic dictionary affects the learning process with respect to terminology. In particular, we work with student teachers of mathematics in the field of graph theory. Several further questions accompany this main research question: How many and which single-word as well as multi-word terms do students use without the dictionary, compared to with it? How is the dictionary then used?

As we also develop the dictionary itself, there are further questions regarding information retrieval from mathematical texts: Which term extraction results can be obtained by a pattern-based approach? How well do standard term extraction tools perform on mathematical language?

We aim to achieve a highly automated creation process for the dictionary. The plan is to develop a method specialized for mathematical language that can be applied to future dictionaries in related fields. This will be done by analyzing the patterns and relations in the texts and making them a direct part of the final dictionary entries.

Preparation of the project began in 2018, and the project is planned to be finished in 2022. It is divided into two phases: first, the development of the dictionary, and second, the evaluation of its usage.

The corpus-based dictionary will be bilingual, German and English. To this end, we built two comparable corpora consisting of (parts of) textbooks and scientific papers in the domain of graph theory. In total, we have about 700,000 German tokens and one million English tokens, with 30,000 types each. This step is complete.

We are currently working on term extraction. For this, we use a pattern-based approach that identifies typical definitional patterns in mathematical texts. This approach will be combined with data produced by other term extraction tools. The merged results are assessed by expert raters (inter-rater reliability to be computed), which will lead to the final lemma list. The selected lemmas will be classified into nine categories: parts (of graphs), types (of graphs), properties (of graphs), algorithms, mappings, theorems, problems, activities and persons. The category system can also be applied to other fields of mathematics if similar dictionaries are to be built.
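As a rough sketch of what such a pattern-based extractor could look like (the regular expressions and the sample sentences below are our own illustrative assumptions, not the project's actual patterns or corpus):

```python
import re

# Illustrative sketch only: these definitional patterns and the sample
# sentence are assumptions, not the project's actual patterns or data.
DEFINITIONAL_PATTERNS = [
    # "A/An <term> is a/an ..."
    re.compile(r"\bAn?\s+(?P<term>[a-z][\w\s-]*?)\s+is\s+(?:a|an)\b", re.IGNORECASE),
    # "... is called (a/an) <term>"
    re.compile(r"\bis\s+called\s+(?:an?\s+)?(?P<term>[a-z][\w-]+)", re.IGNORECASE),
]

def extract_term_candidates(sentence: str) -> list[str]:
    """Return candidate terms matched by any definitional pattern."""
    candidates = []
    for pattern in DEFINITIONAL_PATTERNS:
        for match in pattern.finditer(sentence):
            candidates.append(match.group("term").strip())
    return candidates

print(extract_term_candidates(
    "A bipartite graph is a graph whose vertices split into two sets."
))  # -> ['bipartite graph']
```

In practice such candidates would then be merged with the output of other term extraction tools and passed to the expert raters, as described above.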

Besides the lemmas and their German or English equivalents, the dictionary will also provide terms that stand in a semantic relation to each lemma, as an ontology forms the backbone of the dictionary. The related terms given depend on the particular category of the lemma. In the dictionary, we paraphrase the relations in general language, as the user group may not be acquainted with linguistic terminology: synonyms (is also called), hypernyms (is always a) / hyponyms (examples), meronyms (is part of) / holonyms (is composed of), eponyms (is named after), pertainyms (linguistically related), mapped to (is usually mapped to / is canonically mapped to), alternatives, attributes (possible properties), and analogies (is analogous to). The mapping of which relations match which categories is implemented as an OWL structure. At the moment, we are investigating appropriate software for the user interface of the dictionary that meets both onomasiological and lexicographic needs.
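To make the idea concrete, the category-relation mapping could be represented roughly as follows. This is a plain-Python sketch for illustration only; the actual system is an OWL structure, and which relations attach to which categories below is our guess, not the project's data:

```python
# Illustrative sketch only: the real mapping lives in an OWL structure;
# the relation assignments per category here are assumptions.
CATEGORY_RELATIONS = {
    "types of graphs": ["synonyms", "hypernyms", "hyponyms", "attributes", "eponyms"],
    "properties of graphs": ["synonyms", "alternatives", "analogies"],
    "algorithms": ["synonyms", "eponyms", "mapped to"],
    # ... the remaining categories (parts, mappings, theorems, problems,
    # activities, persons) would be filled in analogously.
}

# Paraphrases shown to users instead of linguistic terminology,
# following the list given in the text above.
RELATION_PARAPHRASES = {
    "synonyms": "is also called",
    "hypernyms": "is always a",
    "hyponyms": "examples",
    "eponyms": "is named after",
    "mapped to": "is usually mapped to",
    "alternatives": "alternatives",
    "attributes": "possible properties",
    "analogies": "is analogous to",
}

def relations_for(category: str) -> list[str]:
    """Return the user-facing paraphrases of a category's relations."""
    return [RELATION_PARAPHRASES[r] for r in CATEGORY_RELATIONS.get(category, [])]

print(relations_for("algorithms"))  # -> ['is also called', 'is named after', 'is usually mapped to']
```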

In addition, we carried out quantitative and qualitative research with mathematics students regarding their use of terminology, by examining theses and conducting two surveys. The first one was carried out in spring 2019 and investigated which sources the students use during their studies. The result: besides textbooks and scientific papers, they mainly use Wikipedia. These results also informed the creation of the corpus.

The second survey also serves as a pre-study for the final evaluation of the dictionary, during which the students have to complete different (terminological) exercises. We work with undergraduate mathematics students who have a basic understanding of school mathematics, because they have only attended introductory lectures without a focus on graph theory. This way, we aim to obtain robust data, as any knowledge in the field of graph theory can be traced back to their current use of Wikipedia or the dictionary.

The final evaluation of the dictionary will be quantitative: two groups of students will receive the same exercises and have to solve them either with the developed dictionary or with other electronic tools such as Wikipedia. Additionally, a prototype of the dictionary will be tested using user-centered design methods.

High-Precision Semantics Extraction in STEM

ABSTRACT. I am finishing my master's studies and want to start my PhD this summer. The plan for my PhD is to combine the symbolic tools I have been working on with statistical tools to achieve high-precision semantics extraction for STEM documents.

13:00-14:30 Session 13: NFM - Natural Formal Mathematics Workshop
How Natural is Formal Mathematics?
PRESENTER: Michael Junk

ABSTRACT. We present results from a lecture on mathematical modeling where both lectures and homework assignments are based on an editor which checks a formal syntax and the correct use of names. Since the lecture addresses students without prior training or special inclination to formal mathematics, our observations shed some light on the ambivalent relation between formality and naturalness.

(Invited talk) Formal Proofs for the Future


15:00-16:00 Session 14: NFM - Natural Formal Mathematics Workshop
(Invited talk) The evolution of the language of mathematics
16:30-18:30 Session 15: NFM - Natural Formal Mathematics Workshop
Formalizing Foundational Notions in Naproche-SAD (4 short talks)
PRESENTER: Peter Koepke

ABSTRACT. We present three current formalization projects with the proof assistant Naproche-SAD. Naproche-SAD formalizations use the natural mathematical input language ForTheL and in favorable cases read like textbook material. In this paper we emphasize the encoding of basic notions and axioms on the basis of built-in or slightly modified mechanisms of Naproche-SAD. The formalizations concern an introduction to set theory up to substantial results in infinitary combinatorics, a general theory of structures which are all made pairwise disjoint to avoid ambiguities, and an introduction to abstract linear algebra where for efficiency reasons the type system is restricted to a single type "object", which is hard-coded into Naproche-SAD.

Integrating Formal Methods into LaTeX Workflows
PRESENTER: Dennis Müller

ABSTRACT. While some mathematicians have begun actively adapting formal methods for their work, there is a prohibitively large discrepancy between the way new mathematical results are developed, presented and published in mathematical practice, and the way they are formalized and implemented in formal systems. This discrepancy is primarily due to the respective surface languages and the logical foundations used in theorem prover systems. The former tend to be more reminiscent of programming languages with strict formal grammars than (informal) mathematical language, whereas the latter are most prominently variants of type theories – both of which require considerable effort to learn and understand for users unfamiliar with them.

Consequently, many mathematicians complain that

• formal systems are difficult to learn and use, even if one is well acquainted with the (informal) mathematics involved,

• they require a level of detail in proofs that is prohibitive even for “obvious” conclusions,

• their libraries are difficult to grasp without already being familiar with the system’s language, conventions and functionalities.

As a result, the utility of formalizing mathematical results is too easily (and too often) dismissed in light of the additional time and work required even for experts. This is despite the fact that many services available for formal mathematics are already enabled by semi-formal (or flexiformal) representations, such as semantic annotations in natural language texts, or formal representations containing opaque informal expressions. Therefore, we need to invest in methods for bridging the gap between informal mathematical practice and (semi-)formal mathematics.

However, the vast majority of mathematicians can safely be assumed to be comfortable with using LaTeX, and to do so on a regular basis. The second author developed the STeX package for LaTeX specifically for annotating mathematical documents with structural and formal semantics. In particular, STeX is based on an OMDoc ontology, which is foundation-agnostic in the sense that it does not favor a specific foundation (such as type or set theories) over any other. This approach is consequently best suited for semantifying informal documents, where foundations are often unspecified, left implicit or switched fluently. Furthermore, STeX allows markup both on the level of mathematical expressions and on a structural level, such as declarations, definienda/definientia and theorems. The SMGloM library [Koh14] (Semantic Multilingual Glossary of Mathematics) contains hundreds of STeX-annotated concepts and definitions, providing LaTeX macros for their symbolic notations (i.e. presentation as pure LaTeX) in signature files. These are documented in various natural-language dictionary entries, defining and describing the concepts in a flexiformal manner. Besides natural languages, the SMGloM could conceivably be extended to cover controlled languages such as Naproche, or fully formal languages such as those associated with interactive theorem provers.

In addition to being valid LaTeX compilable via pdflatex, STeX documents can be transformed to OMDoc using the LaTeXML software [Gin+11], yielding a formally disambiguated representation of the document and in particular the symbolic expressions therein. STeX itself is integrated, and shares an underlying OMDoc ontology, with the MMT system – a foundation-independent meta-framework and API for knowledge management services. This integration makes the generic services provided by MMT – e.g. type checking, library management/browsing, translation – available to informal mathematical texts.
Using alignments, OMDoc expressions can be translated between different libraries, languages and foundations. This allows, for example, translating (originally) STeX content into a typed setting in order to check expressions and run type inference. Additionally, several theorem prover libraries have been translated to OMDoc and integrated into the MMT system. Extending these integrations to enable exporting from MMT as well (and in conjunction with natural language processing) could enable verifying informal mathematics imported via STeX using external state-of-the-art theorem prover systems.
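As a rough illustration of the kind of annotation STeX enables, a definition might be marked up along the following lines. Note that the macro names here are sketched from older versions of the package and may not match the current STeX syntax exactly; the snippet is illustrative, not authoritative:

```latex
% Illustrative only: macro names approximate older sTeX conventions
% and may differ from the current package release.
\begin{module}[id=bipartite-graph]
  \importmodule{graph}
  \symdef{bipartite}{\text{bipartite}}
  \begin{definition}
    A \trefi[graph]{graph} $G$ is called \definiendum{bipartite}{bipartite}
    if its vertex set can be partitioned into two independent sets.
  \end{definition}
\end{module}
```

The point is that the document still compiles to ordinary PDF via pdflatex, while LaTeXML can recover the declared symbols and definienda as OMDoc.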

We propose to

1. extend the STeX package to be compatible with arbitrary LaTeX document classes and packages, and add dedicated macros covering a multitude of common styles of presentation and linguistic phenomena in scientific documents,

2. align the (ever-growing) SMGloM concepts with their counterparts in formal languages, so that the SMGloM can serve as a true Math-in-the-Middle library for translation between languages,

3. develop an IDE that specifically supports and suggests STeX macros, to significantly lower the entry barrier and allow accessing MMT's formal services directly, and

4. extend MMT with new services helpful for STeX content.

Why We Need Structured Proofs in Mathematics

ABSTRACT. When mathematicians write a standard proof of some theorem, they have to decide on the level of detail at which they will present their argument. If they provide too little justification, readers may spend a lot of time filling in the gaps or, even worse, may not understand why a specific step of the proof is (or is not) correct. However, more argumentation is not necessarily better, since it may obscure the "big picture", and some readers may be more interested in seeing the "big picture" than in checking every tiny detail of the proof. Therefore, the ideal level of detail in a mathematical proof varies from reader to reader, as it depends on the reader's previous knowledge and intention.

The standard way of writing mathematical proofs, which we will henceforth call prose proofs, cannot satisfy everyone. In contrast with prose proofs, we have the concept of structured proofs. In a structured proof, the steps and substeps are presented hierarchically. The proof is decomposed into a tree of steps, and each step can be further decomposed into smaller subtrees of substeps, and so on. By correctly numbering each step and using the right indentation, the hierarchy of the proof is clear at first sight. Examples of structured proofs can be found in the extended version of this paper, available at
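To give a flavor of the format, here is a small illustration of our own (not taken from the extended paper) of the hierarchical numbering and indentation described above:

```latex
% Our own minimal illustration, not from the extended paper.
% Requires amsthm for the proof environment.
\begin{proof}
  \noindent
  1. $n$ is even, i.e.\ $n = 2k$ for some integer $k$.\\
  \hspace*{1.5em}1.1. Then $n^2 = (2k)^2 = 4k^2$.\\
  \hspace*{1.5em}1.2. Hence $n^2 = 2(2k^2)$, so $n^2$ is even.\\
  2. Therefore, if $n$ is even, then $n^2$ is even, by steps 1.1--1.2.
\end{proof}
```

Each substep (1.1, 1.2) can itself be expanded into further numbered substeps without disturbing the surrounding levels of the hierarchy.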

If writing structured proofs is accompanied by the discipline of explaining every step in meticulous detail, the probability of obtaining an incorrect proof or a wrong result diminishes. This is because the hierarchy of the proof allows the writer to add justifications to a given step without "clouding" other steps, and also makes it easier to check whether all corner cases were handled.

Since structured proofs make it harder to obtain an incorrect result, a comparison naturally arises between structured proofs and interactive theorem provers (ITPs). By using an ITP, the chances of obtaining an incorrect result are significantly lower than with a structured proof, as the computer checks every step. However, structured proofs also have advantages over ITPs: they are faster to write and more readable. As members of a research group bridging interests between graduate programs in Mathematics and Computer Science, we also have a strong interest in motivating mathematicians to use ITPs. To this end, we have developed mathematical theories in PVS as well as short courses to attract the interest of this audience.