Damegender: Towards an International and Free Dataset about Name, Gender and Frequency
EasyChair Preprint 5763, version 2
5 pages • Date: August 1, 2021
Abstract
Gender equality is the fifth of the United Nations Sustainable
Development Goals.
This equality can be pursued by measuring and analyzing data and by
applying policies based on the results. Many gender studies need to
count males and females by inferring gender from names, for
instance in research papers, job positions, or streets. The
traditional approach is to use commercial APIs backed by proprietary
data, with no insight into how the data was built. Another approach
is to take data from Wikipedia or scientific sites.
With the rise of Open Data, many statistical institutions now provide
open datasets about name, gender, and frequency. We therefore need a
scientific discussion about unifying formats, making these data easy
to process, and moving towards standards.
The dataset covers more than 20 countries in the Western world, with
more names than any open source software at this time. By allowing
students and academics interested in the phenomenon to measure the
gender gap at no cost and in a reproducible way, more people will be
able to contribute to closing the gender gap.
Quality in reproducible research is backed by two things: Free
Software, and the citation of official sources for names, gender,
and frequency provided by statistical institutions. Together they
ease peer review and open doors to the semantic web and to
attention to diversity.
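
As a rough illustration of the name-based counting described above, the following Python sketch infers gender from a small name, gender, frequency table and tallies a list of names. It is not Damegender's own code or API: the column layout, the function names, and the sample rows are all assumptions made up for this example, and the decision rule (choose the gender with the highest frequency for a name) is only one simple possibility.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample in a name,gender,frequency layout. The values are
# invented for illustration and do not come from any statistical institution.
SAMPLE_CSV = """name,gender,frequency
MARIA,female,900
MARIA,male,10
DAVID,male,800
ANDREA,female,300
ANDREA,male,200
"""


def load_frequencies(text):
    """Return {name: Counter({gender: frequency})} parsed from CSV text."""
    table = defaultdict(Counter)
    for row in csv.DictReader(io.StringIO(text)):
        name = row["name"].strip().upper()
        table[name][row["gender"]] += int(row["frequency"])
    return table


def guess_gender(name, table):
    """Pick the gender with the highest frequency, or 'unknown' if absent."""
    counts = table.get(name.strip().upper())
    if not counts:
        return "unknown"
    return counts.most_common(1)[0][0]


def count_genders(names, table):
    """Tally guessed genders over a list of given names."""
    return Counter(guess_gender(n, table) for n in names)


table = load_frequencies(SAMPLE_CSV)
result = count_genders(["Maria", "David", "Andrea", "Xyz"], table)
print(dict(result))
```

Names that are missing from the table are counted as unknown rather than guessed, which keeps the male and female totals tied to the cited source data.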
Keyphrases: Gender Detection Tool from the Name, gender gap, open datasets