09:10-10:00 Session 1: Plenary Session I
Location: New York 1
Old Frisian Terms in the "Deutsches Rechtswörterbuch". The Multilingual Approach of the Dictionary of Historical German Legal Terms

ABSTRACT. The Deutsches Rechtswörterbuch (German Legal Dictionary) is the most comprehensive dictionary of German technical terminology. Despite its name, it focuses exclusively on older legal language. Moreover, the dictionary contains not only German historical legal terms but also legal terms from all West Germanic languages. It covers the legally relevant terms from the beginning of the Germanic written tradition (about 400 AD) up to 1815 (word occurrences in the online version until 1835) and comprises, in addition to Modern and Middle High German, e.g. Old English (500-1100), Lombardic (650-1000), Old Dutch (700-1200), Old Saxon (800-1200), Old Frisian (800-1500), Middle Dutch (1200-1600) and Middle Low German (1200-1650). Each entry analyzes a word from its first historical record until the beginning of the 19th century, which in some cases spans more than 1400 years. From the perspective of legal historical research, Old Frisian is of special interest. This language has been handed down mainly in numerous legal sources, which not only represent a specific vocabulary but also provide an exciting insight into past living conditions – demonstrated by words like “sinekerf” (Sehnekerb, punishable dissection of a tendon), “sawenbethe” (Siebenbuße, sevenfold fine) or “sponste” (Sponst, seduction). When the project is completed, the Deutsches Rechtswörterbuch is expected to consist of 16 volumes with about 120,000 main entries. About 95,000 main entries in more than 20,000 columns have been printed so far – in alphabetical order from Aachenfahrt (pilgrimage to the coronation church of the Holy Roman Empire at Aachen) to Stadtkanzlei (city chancellery). More than a thousand new entries are added every year.

10:30-12:30 Session 2A
Location: New York 1
Preparing the Dictionnaire Universel for Automatic Enrichment

ABSTRACT. The Dictionnaire Universel (DU) is an encyclopaedic dictionary originally written by Antoine Furetière around 1676-78 and later revised and improved by the Protestant jurist Henri Basnage de Beauval, who expanded and corrected the Dictionnaire and added terms from the arts, crafts and sciences.

The aim of the BASNUM project is to digitize the DU in its second edition rewritten by Basnage de Beauval, to analyse it with computational methods in order to better assess the importance of this work for the evolution of sciences and mentalities in the 18th century, and to contribute to the contemporary movement for creating innovative and data-driven computational methods for text digitization, encoding and analysis.

Based on the experience acquired within the research group, an enrichment workflow based upon a series of Natural Language Processing processes is being set up to be applied to Basnage's work. This includes, among others, automatic identification of the dictionary structure (macro-, meso- and microstructure), named-entity recognition (in particular persons and locations), classification of dictionary entries, detection and study of polysemy markers, tracking and classification of quotation use (bibliographic references), and scoring of semantic similarity between the DU and other dictionaries. The main challenges are the lack of available annotated data for training machine learning models, decreased accuracy when using modern re-trained models due to the differences between present-day and 18th century French, and unreliable or low-quality OCR output.

The paper describes methods that are useful for tackling these issues in order to prepare the DU for automatic enrichment that goes beyond what currently available tools like Grobid-dictionaries can do, thanks to the advent of deep learning NLP models. The paper also describes how these methods could be applied to other dictionaries or even other types of ancient texts.



Furetière, Antoine. (1701) Dictionnaire Universel, contenant généralement tous les mots françois tant vieux que modernes, & les termes des sciences et des arts. La Haye, Rotterdam: Arnoud et Reinier Leers.

Khemakhem, Mohamed et al. (2017): Automatic Extraction of TEI Structures in Digitized Lexical Resources using Conditional Random Fields. Electronic lexicography, eLex 2017. Leiden, Netherlands.

Khemakhem, Mohamed et al. (2018): Enhancing Usability for Automatically Structuring Digitised Dictionaries. GLOBALEX workshop at LREC 2018. Miyazaki, Japan.



The history of German paronym dictionaries: From prescriptive print editions to electronic corpus-based resources

ABSTRACT. The objective of this talk is to sketch the development of paronym dictionaries in German. These document and describe commonly confused words which cause uncertainties because they are similar in sound, spelling or meaning (e.g. effektiv/effizient, sportlich/sportiv). Firstly, an overview of existing reference guides is provided, covering different linguistic/lexicographic traditions. Numerous lemma lists have been collected for pedagogical purposes and there has always been a steady interest in the lexicological treatment of paronyms. However, neither a large number of dictionaries covering the use of commonly confused pairs nor many genuine paronym dictionaries have been compiled in the past (cf. Hausmann 1990). I will focus on larger lexicographic endeavours including Wustmann (1891), Müller (1973) and Pollmann/Wolk (2010). Secondly, I will shed light on the differences in description styles and lexicographic presentations. On the one hand, I will demonstrate how traditional prescriptive approaches have been replaced by empirical descriptive accounts. On the other hand, dictionaries in general have moved away from restricted, static print editions towards dynamic e-dictionaries. Finally, a new e-dictionary, “Paronyme – Dynamisch im Kontrast”, is presented with contrastive and flexible two-level consultation views. Its three key elements are its corpus-based foundation, the implementation of meta-lexicographic requirements and a consideration of users’ needs/interests. It is descriptive in nature, documenting conventionalised patterns and preferences as observed in authentic communication. This up-to-date dictionary also offers a user-friendly and dynamic interface.



Hausmann, Franz Josef (1990): Das Wörterbuch der Homonyme, Homophone und Paronyme. In Hausmann, F.J./Reichmann, O./Wiegand, H.E. (eds.), Wörterbücher. Vol. 2. Berlin/New York: de Gruyter, pp. 1120-1125.

Müller, Wolfgang (1973): Leicht verwechselbare Wörter. Duden Taschenwörterbücher Vol. 17. Mannheim: Bibliographisches Institut.

Paronyme – Dynamisch im Kontrast: http://www.owid.de/parowb/.

Pollmann, Christoph/Wolk, Ulrike (2010): Wörterbuch der verwechselten Wörter. 1000 Zweifelsfälle verständlich erklärt. Stuttgart: Pons.

Wustmann, Gustav (1891): Allerhand Sprachdummheiten: kleine deutsche Grammatik des Zweifelhaften, des Falschen und des Häßlichen. Leipzig: Grunow.

A tsunami of English? Counting loanwords in Dutch newspapers 1950-2002

ABSTRACT. Loanwords form part of the lexicon of most languages, and Dutch is no exception. Loanwords in Dutch have been documented by both academic publications (e.g. Haspelmath & Tadmor 2009) and prescriptivist publications (e.g. Koops et al 2009). However, these publications focus on signalling loanword types and do not address the extent to which these words are actually used. In fact, very little research seems to have been done into the token presence of loanwords in Dutch. Only Van der Sijs (2012) took a quantitative approach, but used very small samples. The lack of research into this question is unfortunate, if only because of the widespread folk linguistic perception that there is a ‘tsunami of English’ in Dutch.

In this paper we study the presence of loanwords in Dutch, using the recently presented Loanword-o-meter (Beelen et al 2019), based on a Dutch digital loanword lexicon compiled especially for this purpose. We base our analysis on the VU-DNC, a diachronic corpus of newspaper language. Our research shows that the overall percentage of loanwords remains fairly constant for the period 1950-2000. However, several trends can be distinguished: some words go in and out of fashion, and there does seem to be an increase in the use of English loanwords. Also, we confirm Van der Sijs (2012) in concluding that, at least in newspaper language, most of the loanword types are rarely if ever used.
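The distinction between loanword types and loanword tokens drawn above can be illustrated with a minimal sketch. It is not the Loanword-o-meter itself, whose implementation the abstract does not describe; the function names, the tiny lexicon and the sample sentence are all hypothetical and serve only to show how a token percentage and a list of unused lexicon types could be computed.

```python
def loanword_token_share(tokens, loanword_lexicon):
    """Percentage of running words (tokens) that appear in the loanword lexicon."""
    lowered = [t.lower() for t in tokens]
    if not lowered:
        return 0.0
    hits = sum(1 for t in lowered if t in loanword_lexicon)
    return 100.0 * hits / len(lowered)


def unused_loanword_types(tokens, loanword_lexicon):
    """Lexicon entries (types) that never occur in the given tokens."""
    seen = {t.lower() for t in tokens}
    return sorted(loanword_lexicon - seen)


# Hypothetical toy data, not taken from the VU-DNC corpus.
lexicon = {"shoppen", "city", "computer"}
sample = "we gaan shoppen in de city".split()

share = loanword_token_share(sample, lexicon)      # 2 of 6 tokens are loanwords
unused = unused_loanword_types(sample, lexicon)    # types listed but never used
```

A real study would additionally need tokenisation, lemmatisation and a way to handle homographs, none of which is attempted here.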



Beelen, K. et al (2019), ‘Loanword-o-meter: Studying Dutch Loanwords across Genre and over Time’, CLIN2019.

Haspelmath, M. & Tadmor, U. (eds.) (2009), Loanwords in the World’s Languages: A Comparative Handbook, Berlin and New York.

Koops, B.-J. et al (2009), Funshoppen in het Nederlands, Amsterdam.

Sijs, N. van der (2012), ‘Engelse leenwoorden revisited. Hoeveel wordt het Nederlands gemixt met Engels?’, in: Onze Taal 5, 132-134; 6, 157.


10:30-12:30 Session 2B
Location: Paris
Old English ‘be sorry for and grieve at’ verbs: class membership and syntactic behaviour

ABSTRACT. This paper explores the grammatical behavior and class membership of Old English verbs expressing ‘to be sorry for and grieve at’. The semantically oriented classification of the Old English lexicon in Roberts and Kay’s Thesaurus of Old English groups the following verbs into the same category: besārgian, dǣdbētan, gnornian, hēofan/hēofian, ofhreowan, sīcan (æfter), sorgian (ymbe/on/for), and (ge)wēpan. These verbs constitute the starting point of an analysis that will be framed within Levin’s model of verbal classes and alternations, whereby verb behavior is the result of the “interaction of its meaning and general principles of grammar” (1993: 11). The Dictionary of Old English Corpus has provided the data for analysis. Additional sources have been consulted for meaning information, including standard dictionaries (Sweet, Clark Hall-Meritt, Bosworth-Toller, Dictionary of Old English) and the lexical database of Old English Nerthus; others have allowed for syntactic disambiguation, namely The York-Toronto-Helsinki Parsed Corpus of Old English Prose and The York-Helsinki Parsed Corpus of Old English Poetry. In order to examine these verbs’ behavior and determine their consistency as a verbal class, a number of parameters have been taken into account, namely argument realization, clause structure and participation in syntactic alternations within the verb phrase. Considerable differences have been identified as regards morpho-syntactic coding in complementation patterns (case of arguments, prepositional government, etc.) and participation in simple and complex configurations, which calls for a redefinition of the verb class boundaries.
By way of illustration, should the (in)transitivity of a verb be the criterion for class definition, only dǣdbētan, sīcan and sorgian are found in intransitive realizations; if the prevailing criterion is participation in complex syntactic configurations, only besārgian, hēofian, ofhreowan and sorgian admit finite and/or non-finite clauses. All things considered, when both semantic and syntactic components are taken into account, the verbs at stake prove to constitute a heterogeneous group, notwithstanding their being semantically related.



Bosworth, J. and T. N. Toller. 1973 (1898). An Anglo-Saxon Dictionary. Oxford: Oxford University Press.

Clark Hall, J. R. 1996 (1896). A Concise Anglo-Saxon Dictionary. Toronto: University of Toronto Press.

Healey, A. diPaolo (ed.) with J. Price Wilkin and X. Xiang. 2004. The Dictionary of Old English Web Corpus. Toronto: Dictionary of Old English Project, Centre for Medieval Studies, University of Toronto.

Healey, A. diPaolo (ed.). 2016. The Dictionary of Old English in Electronic Form A-H. Toronto: Dictionary of Old English Project, Centre for Medieval Studies, University of Toronto.

Levin, B. 1993. English Verb Classes and Alternations. Chicago: University of Chicago Press. 

Nerthus: Lexical Database of Old English [www.nerthusproject.com] 

Roberts, Jane and C. Kay with L. Grundy. 2017. A Thesaurus of Old English. Glasgow: University of Glasgow. http://oldenglishthesaurus.arts.gla.ac.uk/

Sweet, H. 1976 (1896). The Student’s Dictionary of Anglo-Saxon. Cambridge: Cambridge University Press.

Taylor, A., A. Warner, S. Pintzuk and F. Beths. 2003. The York-Toronto-Helsinki Parsed Corpus of Old English Prose. Department of Language and Linguistic Science, University of York [http://www.helsinki.fi/varieng/CoRD/corpora/YCOE/]

Taylor, A., A. Warner, S. Pintzuk, F. Beths and L. Plug. 2001. The York-Helsinki Parsed Corpus of Old English Poetry. Department of Language and Linguistic Science, University of York [http://www-users.york.ac.uk/~lang18/pcorpus.html]


The Importance of the Bible Translation for the Old Czech Lexicography

ABSTRACT. This paper deals with the Old Czech Bible translation in the Late Middle Ages and its importance for historical lexicography. The vernacular translation of the whole Bible is priceless evidence for the study and description of the Old Czech language. The Old Czech Bible is the most extensive and lexically rich text; it is also one of the first texts in Czech. The translations of the Book of Psalms and the Gospel lectionary date back to the beginning of the 14th century; the complete Bible translation comes from the middle of the 14th century. The Hussite reform movement in the 15th century stimulated an extraordinary interest in the Holy Scriptures in the vernacular, and the biblical text was therefore revised and newly translated into Old Czech several times. By the end of the 15th century, there were four distinct versions of the translation, the so-called redactions, surviving in more than one hundred manuscripts and incunabula and in nearly another hundred fragments.

The Old Czech Bible has a very significant place in historical lexicology and lexicography. The numerous biblical manuscripts and prints contain lexemes from both the centre and the periphery of the Old Czech vocabulary. There are two main reasons why the vocabulary of the Old Czech Bibles is immensely important for the lexicography of the Old Czech language. First, the text of the Old Czech Bible was translated from the Latin Vulgate, so a Latin equivalent usually helps to uncover the meaning of an Old Czech word. Secondly, across the four different translations of the Old Czech Bible we can follow and compare equivalents and their phonology, morphology, and valency over the two hundred years of the Old Czech period.

In the paper, some desirable improvements in research on the Old Czech biblical vocabulary will be discussed: especially the need for more critical editions of the particular Old Czech Bible translations, to complete the existing excerption from several selected biblical sources prepared for the Old Czech Dictionary in the 1960s, and also further research into the formation of biblical terminology and style.



Kyas, V. (1997) Česká bible v dějinách národního písemnictví. Praha: Vyšehrad.



The history of "nitchevo" in the light of diachronic evidence

ABSTRACT. Word histories are a fascinating object of inquiry within the domain of historical lexicography. To a historian of the English language, the Oxford English Dictionary (third edition, henceforth, OED3) is no doubt a rich source of data on the etymologies, meanings and usage of hundreds of thousands of words. And yet, the information included in the dictionary entry is usually, out of necessity, highly condensed, which means that one rarely gains insight into the complex web of historical facts on which it has been based.

This paper suggests that a range of digital resources available today may be used to provide additional and more nuanced information in order to verify and, whenever possible, update the current state of knowledge. Such a procedure was followed for a handful of Russian borrowings, of which nitchevo is a case in point. Crucially, by bringing to light evidence hitherto unknown, it was possible not only to come up with substantial antedatings, new spelling variants and suggestions on further semantic division, but also to identify the “dissemination routes” of the word in question. The conference talk will describe the results of the research in some detail.





Old Onomatopoeia: What Etymological Dictionaries Tell us about Sound Imitation in Extinct Languages

ABSTRACT. The growing body of evidence from modern languages of different families (see e.g. Hinton et al 1994; Voeltz et al 2001; Voronin 2006; Iconicity Atlas 2018) suggests that onomatopoeic (more broadly, iconic) words might be a language universal. This gives reason to suspect that they can be found in ancient and reconstructed languages as well. Indeed, there are several works devoted to diachronic studies of phono-symbolism (Malkiel 1990; Liberman 2010) that confirm this suggestion. But do these words differ from modern sound imitations? If so, in what way? For how long do they stay in a language? Do they change? Become obsolete? The purpose of this paper is to suggest some possible answers to these questions as well as to outline the general tendencies of expressivity loss in the onomatopoeic lexicon.

There are two main challenges in describing old onomatopoeic lexicons. First, one should identify iconic words despite often scarce evidence from etymological dictionaries and ambiguous textual examples. To facilitate their identification we use the method of phonosemantic analysis (Voronin 2006), which implies both etymological and typological investigations. The second challenge is to describe the dynamics of iconic words’ expressivity loss due to regular sound changes and sense development. We have devised a four-step classification of onomatopoeic words according to the degrees of their de-iconization, grading all iconic words from the most recent, expressive onomatopes (e.g. English ding-dong, bang!) to obsolete, no longer imitative ones (e.g. English gargoyle, lunch, abeyance).

The study of ancient sound imitations gives an insight into the history of language evolution and development. This paper provides examples from the Germanic family of languages (Old English, Old Norse, Gothic and Proto-Germanic itself).





13:30-15:30 Session 3A
Location: New York 1
Opportunities and challenges in historical lexicography of smaller languages. The example of the Dictionnaire de l'ancien francoprovençal (Old Francoprovençal dictionary)

ABSTRACT. In this paper, we would like to share some of our experience in elaborating an Old Francoprovençal dictionary (DAfp), for which the first 400 articles have now been written. We will focus particularly on aspects related to the fact that Francoprovençal before 1600 may be considered a smaller language regarding the size of the corpus as well as its sociolinguistic status.

Working on smaller languages implies certain difficulties and limitations, such as the fact that a small corpus makes it more difficult to assess the degree of fixedness of a syntagma. One or a few major sources can also more easily pull a small corpus off balance. There are, however, also some advantages: for instance, one can acquire a thorough knowledge of the entire corpus. The lack of a prestigious literature also encourages one to consider other precious sources that are often neglected in the lexicography of literary languages.

We will retrace the short history of Old Francoprovençal lexicography and illustrate some of the opportunities and challenges met while elaborating the DAfp, whether they apply only to Old Francoprovençal or also to other small languages. To put our results in perspective, we will compare them to the lexicography of other contemporary Gallo-Romance varieties, such as Old French or Old Gascon.


Old Saxon Lexicography: A Critical Overview

ABSTRACT. Although it has traditionally been something of a Cinderella figure among the early Germanic languages, Old Saxon has enjoyed increased scholarly popularity over the past two decades or so.  New handbooks have appeared, a number of more junior scholars have engaged with Old Saxon topics, and more papers on Old Saxon are being presented at international conferences.  These developments indicate that the time is ripe for critical overviews of the field.  This paper will therefore present such an overview of the lexicographic resources available for Old Saxon.  To keep the paper to a manageable length, we will focus on what we see as the five most important such resources: Sehrt (1925), Holthausen (1954), Berr (1971), Tiefenbach (2011), and Köbler (2014).  This paper is therefore a down payment on a full treatment of the topic, one which will (1) contextualize each of these resources within the history of the field and (2) offer comprehensive evaluations of them.

We argue that each of these works is valuable in its own way, but that some of them are clearly better than others.  Sehrt (1925) is of course a classic, but is now somewhat outdated.  Holthausen (1954) is a handy work, but is quite short (under 100 pages), meaning that it is nowhere near comprehensive.  Berr (1971) is the fullest English-language treatment of the topic, but is often something of a grab bag of ideas, rather than a consistent scholarly analysis.  Tiefenbach (2011) and Köbler (2014) are both generally excellent (and moreover have the advantage of being in both English and German, which is crucially important at a time when English has become the main language of Germanic linguistics), but could use some additional etymological information and a few improved English glosses.

The Electronic Dictionary of Old Czech, newly with examples
PRESENTER: Irena Fuková

ABSTRACT. Elektronický slovník staré češtiny [Electronic Dictionary of Old Czech] is an online historical dictionary describing the documented vocabulary of the Old Czech period (from the mid-12th century to 1500) in its entirety. It is a collaborative work in progress, initiated in 2005 by scholars of the Czech Language Institute of the Czech Academy of Sciences and following up on the extensive Staročeský slovník [Old Czech Dictionary] (1968–2008), which covered in detail lexical units from N to při.

In this paper, we will introduce the Elektronický slovník staré češtiny [Electronic Dictionary of Old Czech], including its historical background, and describe three stages of its development over its 14 years of existence. We will focus on the current change of its concept and especially on the decision that the described lexemes will now be documented by short examples from Old Czech texts. We will introduce the basic points of the chosen solution and present the principles that have been set for the selection of examples.

We will also discuss the solution for that part of the Elektronický slovník staré češtiny which has already been published online following the older concept, i.e. without the Old Czech examples, especially for the part from při- to Ž.

The process of formation of the new concept will be presented as a process of finding balance between the requirement for the quickest results and the requirement for the most accurate processing of the vast preserved Old Czech language material.



Elektronický slovník staré češtiny [online] (2006–). Praha: Ústav pro jazyk český AV ČR. <http://vokabular.ujc.cas.cz>.

Staročeský slovník (1968–2008). Praha: Academia.


A minor period of a major language: lexicographical approaches to late Old English

ABSTRACT. Although English certainly cannot be considered a “smaller” language, some parts of its history have nevertheless received comparatively little lexicographical attention. Late Old English, particularly as attested in texts produced after the Norman Conquest of 1066, is a case in point. Recent scholarship (e.g. Treharne 2007, 2012) has argued that the tendency to consider eleventh- and twelfth-century texts primarily in terms of their relationship to earlier Old English has caused their historical, cultural and linguistic interest to be underestimated. With a few exceptions (e.g. Stanley 2000), there has been little detailed discussion of how the vocabulary of such texts has been documented by lexicographers. It is certainly true that lexicographical coverage of the language of these texts is generally divided between dictionaries of Old and Middle English, reflecting and reinforcing the impression that late Old English is transitional and marginal.

This paper examines the treatment of late Old English in period dictionaries, from the appearance of the first published dictionary of Old English in the seventeenth century (Somner 1659) to the present-day Dictionary of Old English (Cameron et al. 2018). I ask how this relatively under-studied period has been defined by lexicographers and how it is presented in relation to the earlier (and more traditionally canonical) varieties of Old English with which these dictionaries are primarily concerned. These investigations will offer an insight into current and historical resources for the study of late Old English lexis. They will also serve as a case study into how lexicographers through history have established the scope of their projects, and what happens to the material that is relegated to the margins. Finally, I hope to address the question of whether the changes in format made possible by electronic dictionaries might offer new approaches to lexicographers handling this marginal period.



Cameron, A., A. Crandell Amos, A. diPaolo Healey et al. (2018) Dictionary of Old English: A to I online. Toronto: Dictionary of Old English Project.

Somner, W. (1659) Dictionarium Saxonico-Latino-Anglicum. Oxford: excudebat Guliel. Hall.

Stanley, E. (2000) OED and the Earlier History of English. In Mugglestone, L. (ed.): Lexicography and the OED: Pioneers in the Untrodden Forest: 126-155. Oxford: OUP.

Treharne, E. (2007) Periodization and Categorization: The Silence of (the) English in the Twelfth Century. In Copeland, R., W. Scase and D. Wallace (eds.): New Medieval Literatures 8: 248-75.

Treharne, E. (2012) Living through conquest: the politics of early English, 1020-1220. Oxford: OUP.



13:30-15:30 Session 3B
Location: Paris
A lexicographic description of Russian dialects in the Leningrad Region: research methods and principles

ABSTRACT. The paper outlines the basic theoretical and methodological concepts underlying the compilation of the “Dictionary of the Leningrad Region dialects of 1936–1947”. This project aims at a lexicographic description of a portion of the manuscript archive stored at the Institute for Linguistic Studies of the Russian Academy of Sciences. The goal is to compile a “Dictionary of the Leningrad Region Dialects” based on the unique hand-written records dating back to 1936–1947. Although the lexical systems of most Northern Russian dialects have been described in various lexicographical publications, no separate lexicographic work has focused exclusively on Leningrad Region dialects. This situation is partly due to the many administrative territorial divisions the region has undergone. The work on the systematization and detailed consideration of the vocabulary of the archival materials shows that many dialect words, phrases and their meanings are absent from the printed sources on the Russian dialects. The importance of these materials lies not only in the availability of more precise geographical and temporal data, previously undescribed meanings, and their clarification, but also in the new word forms and word combinations. In addition to survey answers, field records include explanations of words, sayings and a large number of oral speech records of dialect speakers. The latter contain names of agricultural tools, dialectal forms of nouns and adverbs, unique forms of verbs, etc.

New materials will help to establish and clarify the distribution boundaries of the dialect words, their isoglosses, to explore the correlation of dialect and standard meanings in a word, to follow the penetration of dialect words into the standard language.




[1] The work was supported by the Russian Foundation for Basic Research in the framework of the research project “Russian dialect lexicography Summary: Problems and Prospects”, № 19-012-00415.

Scientific eponyms throughout the history of English scholarly journal articles

ABSTRACT. Scientific eponyms used in modern languages as well as modern eponymic processes are covered in various specialized lexicographical resources and word-formation studies, but historical eponyms and eponymization processes throughout the history of scientific writing are still relatively under-researched, even for well-documented languages such as English. One of the reasons for this research gap is the lack of suitable, well-annotated and sufficiently large diachronic corpora that allow a systematic retrieval of historical eponyms which have played a role in scientific discourse.

The information contained in eponymous lexemes, derivatives and multi-word constructions that were used in the past is highly relevant for historical lexicographers and for the reconstruction of various aspects of our linguistic, cultural and scientific heritage. This study will help to fill a research gap in understanding English eponymic processes in a diachronic dataset covering 330 years – the Royal Society Corpus (RSC, Kermes et al. 2016). The analysis is focused on certain types of eponyms that can be queried and retrieved due to common morphological characteristics (one-word derivatives with suffixes, parasynthetic formations, multi-word eponymisms such as possessive constructions, binomials and polynomials, cf. typology of eponyms in Popescu 2019: 133ff). The RSC contains all digitized texts of the Philosophical Transactions and Proceedings of the Royal Society of London between the 1660s and 1990s (ca. 126 million tokens; 19,000 academic journal texts) and is enriched with fine-grained linguistic and metadata annotations. Corpus-based lists of historical terms involving eponyms have been created and sorted according to common structural features and textual metadata categories (e.g. chronologically or thematically according to scientific disciplines or text topics). Additionally, surprisal values, used as an information-theoretic operationalization of information density, have been analysed to provide insights into the probability of linguistic units occurring in textual contexts.
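Surprisal is conventionally defined as the negative base-2 logarithm of a unit's probability given its context, so that less predictable units carry more bits. The sketch below illustrates the idea only; the abstract does not specify which language model the study uses, so the unsmoothed bigram estimate, the function name and the toy token sequence are assumptions made for illustration.

```python
import math
from collections import Counter


def bigram_surprisal(tokens):
    """Return (word, surprisal in bits) for every token after the first.

    Probabilities are unsmoothed relative frequencies P(word | previous word),
    estimated from the same token sequence (an assumption for illustration).
    """
    context_counts = Counter(tokens[:-1])           # how often each word serves as context
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    result = []
    for prev, word in zip(tokens, tokens[1:]):
        p = bigram_counts[(prev, word)] / context_counts[prev]
        result.append((word, math.log2(1 / p)))     # surprisal = -log2 p
    return result


# Hypothetical toy sequence, not drawn from the Royal Society Corpus.
toy = ["the", "malpighian", "bodies", "the", "malpighian", "layer"]
values = bigram_surprisal(toy)
# "malpighian" always follows "the" here, so it carries 0 bits;
# "bodies" and "layer" each follow "malpighian" half the time, so 1 bit each.
```

On a real corpus the estimates would need smoothing and a longer context window, since an unsmoothed model assigns zero probability to any unseen sequence.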

It can be confirmed on the basis of the RSC data that eponymization has always been an important word-formation process in academic English, especially in mathematics and the natural sciences. Eponymous concepts have often been formed on the basis of non-English proper names, which suggests long-term cross-cultural contacts and exchange of ideas. In certain periods such as between 1850 and 1900 as well as during the last 50 years of the corpus data, we observe a particularly sharp increase with regard to the frequency and productivity of this type of word formation phenomenon. Some eponyms have become common terms that were or are still widely used while others were only used in certain time periods, occurred as occasionalisms or are rather specific terms in very specialized domains (e.g. ‘Malpighian bodies’ occurs particularly often in texts on physiology from the 1840s).  

The results of the quantitative and qualitative analysis contribute to reconstructing historical and cross-cultural aspects of eponyms in scientific discourse from the early stages of the first scholarly journals published in English to contemporary scientific publications. Certain types of historical eponyms can be associated with specific historical strata and reflect cultural and linguistic contacts in specific time periods (e.g. with German, Italian or French). As historical eponyms often served as internationalisms with direct equivalents in other languages, the results may also contribute to contrastive lexicological research.



Kermes, H., S. Degaetano, A. Khamis, J. Knappen & E. Teich (2016). The Royal Society Corpus: From Uncharted Data to Corpus. Proceedings of LREC 2016. Portoroz, Slovenia: 1928–1931. (http://hdl.handle.net/11858/00-246C-0000-0023-8D1C-0)

Popescu, F. (2019). A Paradigm of Comparative Lexicology. Newcastle: Cambridge Scholars Publishing.

The sources of the text of the article DEFINITION in Ephraim Chambers’s Cyclopaedia

ABSTRACT. Ephraim Chambers scoured European literatures to compile his Cyclopaedia. In his Preface, he acknowledged: “What the French Academists, the Jesuits de Trevoux, Daviler, Chomel, Savary, Chauvin, Harris, Wolfius, and many more have done, has been subservient to my Purposes.” However, this is the first detailed study of Chambers’s use of sources to compile a specific article in his Cyclopaedia: we analyze the article DEFINITION in the first edition (1728) of the Cyclopaedia. While this is valuable in its own right, the article DÉFINITION in Diderot’s Encyclopédie is largely Mills’s translation of Chambers’s DEFINITION: thus, Chambers’s sources become Diderot’s sources, and here they discredit Diderot’s attribution of DÉFINITION to Johann Formey.

This study identifies sources for each paragraph in DEFINITION; with one exception, Chambers’s words are traced to specific lines in the writings of Charles Gildon, Dominique de Colonia, Christian Wolff, Étienne Chauvin, the Dictionnaire de Trévoux, Jean Le Clerc, and, ultimately, the Port-Royal Logique. Some of these were expected, but others are surprising.

DEFINITION comprises a long section on definition in logic and a short section on definition in rhetoric. This study shows that the text of definition in logic aligns closely with the text of the Port-Royal Logique as interpreted by Le Clerc and extended by him to incorporate Locke’s ideas on definition. The text of definition in rhetoric is taken wholly from de Colonia.

Further, the study shows that DEFINITION consists almost entirely of texts translated from French, Latin, and German; this recalls Diderot’s complaint that Chambers wantonly appropriated French writers. Even Locke is transmuted by translation into French by Le Clerc before being returned to English by Gildon.

The presentation concludes with a detailed chart that maps source texts to paragraphs in the article DEFINITION.

Towards a Diachronic Semantic Lexicon of Dutch

ABSTRACT. Since 2005, the Instituut voor de Nederlandse Taal (INT) has been working on a lexicographical infrastructure for historical Dutch, consisting of lexical data and corpus material.

The core of the lexical part of the infrastructure is formed by the four scholarly historical dictionaries of Dutch: the Woordenboek der Nederlandsche Taal[1], the Middelnederlandsch Woordenboek[2], the Vroegmiddelnederlands Woordenboek[2] and the Oudnederlands Woordenboek[3], together covering the Dutch language from ca. 500 to 1976. The dictionaries have been put online in a dictionary portal (gtb.ivdnt.org). This component supports semasiological search. The second component, currently in development, is the computational lexicon module GiGaNT[4], which provides information on words and their inflectional and spelling variation. It is based on the dated attestations of each entry in the dictionary quotations. GiGaNT is used for query expansion and linguistic annotation of corpora, and functions as a central database for the description of Dutch at INT.

The third component and subject of this paper is DiaMaNT[5], a diachronic semantic computational lexicon of Dutch. The main purpose of this lexicon is to enhance text accessibility and foster research in the development of concepts, by interrelating attested word forms and semantic units (concepts), and tracing semantic developments through time. The lexicon is built by adding a semantic layer to the word form lexicon GiGaNT, using the definitions coming from the dictionary articles from which the word form lexicon is built. 

Apart from developing several tools and strategies for building this lexicon, including the use of distributional semantics, several strategies for lexicon deployment will also be developed. So far, a project-internal release of the lexicon has been made, containing synonym information extracted from the dictionary definitions. A Linked Open Data publishing format has been designed. Exploratory research has been done into the potential that distributional semantics offers for lexicon development and deployment. A first version of a user-friendly interface to the DiaMaNT data, which will be demonstrated at the conference, is to be released by mid-2019.




[1] WNT, Dictionary of the Dutch Language

[2] MNW, Dictionary of Middle Dutch; VMNW, Early Middle Dutch Dictionary

[3] ONW, Dictionary of Old Dutch.  

[4] GiGaNT, Groot Geïntegreerd Lexicon van de Nederlandse Taal (large integrated lexicon of the Dutch language)

[5] DiaMaNT, Diachroon seMantisch lexicon van de Nederlandse Taal (diachronic semantic lexicon of the Dutch language)

16:00-17:30 Session 4A
Location: New York 1
Jens Christian Svabo's Glossary: the oral tradition at the beginning of Faroese lexicography

ABSTRACT. The rediscovery of the Faroese language in the second half of the 18th century is strongly connected with the interest in the heroic ballads which had been transmitted orally over the centuries in the North Atlantic islands. It was only thanks to the desire to collect this ancient cultural heritage and make it accessible for comparative purposes that the Faroese language appeared in writing for the first time after it had been prohibited in schools, churches and official documents in 1536. The first collector of Faroese ballads was the Faroe-born scholar Jens Christian Svabo (1746-1824), who, not surprisingly, also compiled the first Faroese dictionary, the Dictionarium Faeroense, which was published some two hundred years later by Christian Matras in 1966-70, and developed the first systematic and consistent orthography of his mother tongue.

In this paper, I will focus on the Collectio Vocum et Phrasium ex Carminibus Færoënsibus antiquis (1780s), a work that by definition epitomizes the close connection between Svabo’s literary and lexicographic activities. Particular attention will be paid not only to the selection of lemmata and the structure of their bilingual (Latin-Danish) interpretamenta, but also to the interaction between this glossary and the Dictionarium Faeroense and, through this, later Faroese lexicography.



Matras, C. (1943) Svabos glossar til Færøske Visehaandskrifter. København: Bianco Lunos bogtrykkeri.

Svabo, J. C. (1939) Svabos færøske Visehaandskrifter. København: Bianco Lunos bogtrykkeri.

Svabo, J. C. (1966-1970) Dictionarium Faeroense. København: Munksgaard.

Thráinsson, H. et al. (2004) Faroese. An overview and reference grammar. Tórshavn: Føroya Fróðskaparfelag.






Vernacular Terminologies in Ancient Japan: A Reconstruction of the Yōshi kangoshō (720 ca.)

ABSTRACT. In 8th- and 9th-century Japan, the cultural élite of the centralized bureaucratic ritsuryō State was characterized by diglossia/digraphia: Written Sinitic was the cosmopolitan written language of prestige, learning, and wide circulation in East Asia, in contrast with Vernacular Japanese, a smaller language in the East Asian context.

The early 8th-century prototypes of bilingual lexicography are some dictionaries, now lost and only surviving in indirect transmission thanks to numerous excerpts quoted by the scholar/official Minamoto no Shitagō (911-983) in his Wamyōruijushō (Categorized Notes on Japanese Words, 934 ca.), the oldest extant Sino-Japanese dictionary.

We know three of these dictionaries, namely Benshiki rissei (Compendium of Classifications, first half of the 8th century), Yōshi kangoshō (Collection of Chinese Words by Master Yang, 720 ca.), and Kangoshō (Collection of Chinese Words, 8th century). As also stated by Kuranaka Susumu (2002), the lexical domains covered by these dictionaries (bovines and horses, vehicles, textiles, agriculture, aquaculture, hunting and falconry, metallurgy) are technical and closely linked to lower-ranking State officials. In other words, these dictionaries provide insights into several technical terminologies that are not attested in canonical written production (such as poetry and historiography).

In this paper I will propose a reconstruction of the Yōshi kangoshō and evaluate its usage in the writing of official texts, through an analysis of the Vernacular equivalents it furnishes and their attestation in practical documents on wooden tablets and paper. A main point of discussion will be the role played by Japanese officials in the development of an indigenous lexicography centred on Vernacular Japanese.



Kuranaka, S. (2002): Wamyōruijushō shoin Yōshi kangoshō kō. Tōyō kenkyū 145: 1–37.

Kuranaka, S. (2003): Wamyōruijushō shoin Kangoshō kō. Tōyō kenkyū 150: 1–37.

Mabuchi, K. (ed.) (2008) Koshahon Wamyōruijushō shūsei, 3 voll. Tōkyō: Bensei shuppansha.

Teramura, M. (2009). Zōhochū Narachō shoki no hakuwa kango jisho. Yōshi kangoshō, Benshiki rissei, Kangoshō ni tsuite. In Teramura, M. (ed.): Minato. Kotoba to rekishi, Vol. 21: 12–17. Tōkyō: Bensei shuppansha.

The Lexical Database of the Medieval Polish Language – on the investigation of the Old Polish inflection

ABSTRACT. Research on the inflection of the medieval Polish language is completely different from research on the inflection of contemporary Polish. Very often we have to deal with problems concerning incomplete paradigms, interrelations between paradigms or a high number of parallel forms. Everything is made even more complicated by the unstable orthography. These problems are specific to the Old Polish language. It was in the Middle Ages that Polish inflection started to take shape. This is clearly visible in the parallel existence of archaic and newer forms. The material, which is scanty compared to that for modern Polish, does not always allow us to pinpoint the exact moment and geographical area where the changes started to occur.

These problems translate into difficulties in presenting the material in a printed dictionary. Based on the attestation of the lemma forms noted in the Old Polish Dictionary, we would like to study how the inflection of the medieval Polish language was shaped. The results of our studies will be presented in an internet database. During the poster session we would like to present the main principles of the description of Old Polish inflection as well as the project of the Lexical Database of the Medieval Polish Language.

Besides presenting the concept behind the description of the source material and the Lexical Database of the Medieval Polish Language, we would also like to show the problems encountered while preparing the Database in combining lexicographic and computational methods.



Słownik staropolski (Old Polish Dictionary), 1953–2002, t. I–XI, ed. S. Urbańczyk, Wrocław–Kraków.

Korpus tekstów staropolskich (do 1500 r.) (Corpus of the Old Polish texts until 1500), https://ijp.pan.pl/publikacje-elektroniczne/korpus-tekstow-staropolskich.

16:00-17:30 Session 4B
Location: Paris
From the history of modern Russian-Tajik special lexicography

ABSTRACT. In Tajik bilingual lexicography, there are more than 80 Russian-Tajik special dictionaries. The aim of the paper is to study and analyze the history of modern Russian-Tajik special lexicographical works published over the past eighty years, since the appearance of the first Russian-Tajik dictionary of biological terms in 1941.

In Tajik bilingual lexicography, Russian-Tajik terminological dictionaries can be divided into three types:

1) Translation dictionaries – dictionaries in which Russian terms and their Tajik equivalents or translations are given without interpretation of the concept of the term. The majority of Russian-Tajik dictionaries are made by this method;

2) Dictionaries in which Russian terms are given without interpretation, and their Tajik equivalents and translations are given with interpretation. The number of such dictionaries is not great;

3) Dictionaries in which both Russian terms and their Tajik equivalents and translations are given with interpretation.

One of the most important and difficult aspects of compiling bilingual Russian-Tajik terminological dictionaries is the choice of entry words. Analysis of existing dictionaries shows that the choice of entry words in most cases does not meet the modern requirements of lexicography, with the result that the majority of dictionaries have shortcomings in content and structure. Of these, the most significant are:

  • the dictionaries, along with the terminology of a particular area, also include words and expressions of common vocabulary that have no relation to the terms of this area;
  • terminological dictionaries do not always reflect all the special terms of a particular area;
  • the special terminology of a particular field in the dictionaries of different authors is translated into Tajik in different ways, or their interpretation differs from each other.

The formation and development of Tajik terminological lexicography contributes to standardization of terms in the Tajik language and provides great assistance in the creation of larger bilingual special dictionaries in the future.




Mark Ridley’s Dictionary: Plants and Names

ABSTRACT. Sources containing Old Russian plant names in early Russian literature are not numerous. There are some in Naziraciel, Domostroy and a number of herbal books, but sometimes it is not even possible to identify them. In such a situation the Dictionarie of the Vulgar Russe Tonge by Mark Ridley (published by G. Stone, 1996) is of extreme value. It was created late in the 16th century while its compiler lived in Russia and was the personal physician to Tsar Feodor Ioannovich. The dictionary combines alphabetical and conceptual parts, both of which include Old Russian plant names with Latin (and sometimes English) translations.

As the Russian part of the dictionary is highly heterogeneous and includes transliterated or calqued Latin terms, standard Russian names and local names, we used various reference books to identify the plants: works containing medical terms (Hooper R. A new medical dictionary... Philadelphia, 1817), botanical dictionaries (N. Annenkov, 1878) and Russian dialect dictionaries. As a result, the meanings of almost all Russian plant names were identified. It was also determined that Ridley made wide use of specific names used in pharmacopoeia.

All plant names from Ridley’s Dictionary are included in the Russian plant names database PhytoLex (http://phytonyms.iling.spb.ru), which makes it possible to place them in a wider context and compare them with other names of the same plant from other sources.

The research is supported by the RFBR (the Russian Foundation for Basic Research), project 17-06-00376 “Russian Phytonyms in the Diachronic Aspect (11–17 cc.)”.


A missing link: J. Redding Ware’s "Passing English of the Victorian Era" and the history of English slang lexicography

ABSTRACT. J. Redding Ware’s Passing English of the Victorian Era (1909) cuts an inconspicuous figure in the history of English slang lexicography. Jonathon Green ignores him in The Vulgar Tongue: Green’s History of Slang (2015) and his earlier history of dictionaries, Chasing the Sun (1996). In A History of Cant and Slang Dictionaries: Volume III: 1859–1936 (2009), Julie Coleman addresses Ware’s dictionary briefly and concludes that it “does not match the standards set by Farmer and Henley, but [Ware] was a careful observer of language, and used a variety of written and spoken sources.” As Coleman observes, only 13% of Ware’s dictionary overlaps with Farmer and Henley, which makes it a rich independent source of words and phrases of historical interest. However, Coleman notes that “Ninety-six per cent of entries include usage labels,” and that fact and the nature of those labels are critical to this paper’s argument: they help us to place Ware’s dictionary in the history of English lexicography and in a central disagreement about slang represented by divergent, competitive traditions within that history.

Ware’s emphasis on the subcultures in which slang originates ties him, not to Farmer and Henley and the multivolume, historical Slang and Its Analogues (1890–1904), which set the lexicographical standard for English slang, but to John Camden Hotten’s Dictionary of Modern Slang, Cant, and Other Vulgar Words (1859), which leaned heavily in conception on Henry Mayhew’s London Labour and the London Poor (1851). That is to say, Hotten’s motivation was as much sociological as linguistic, and so, I argue, was Ware’s, given his labeling and entry-level commentary. In contrast, Eric Partridge based his Dictionary of Slang and Unconventional English (1936) on Farmer and Henley (see The Gentle Art of Lexicography [1963], p. 62), thus establishing a tradition of slang lexicography modeled partly on the New English Dictionary and its historical principles, though Partridge departed noticeably from those principles. When one consults the indices of books about English dictionaries (Mitford Mathews’ A Survey of English Dictionaries (1933) and J. R. Hulbert’s Dictionaries: British and American (1955)), one finds Farmer but no Ware. The historical tradition is thus the historiographically favored tradition.

But the sociological turn constitutes another tradition, from Hotten to Ware and then into the sort of underworld sociolinguistics favored by David W. Maurer, which was finely focused on slang’s subcultural origins and salience (see Language of the Underworld [1981], which collects several of Maurer’s studies). As Coleman demonstrates in “Historical and Sociological Methods in Slang Lexicography: Partridge, Maurer, and Cant” (2010), conflict between the traditions eventually erupted, with sociolinguistics and slang lexicography subsequently going their separate ways, at least for a while. The sociological tradition persists, however, to this day, synthesized with the historical tradition in Green’s Dictionary of Slang (2010). Ware’s dictionary appears in the bibliography to GDoS, and one wonders why Green has otherwise spared it so little attention; as material and as methodological example, it remains relevant today.