Damegender: Towards an International and Free Dataset about Name, Gender and Frequency
EasyChair Preprint 5763, version 1
5 pages • Date: June 10, 2021
Abstract
Gender equality is the fifth of the United Nations Sustainable
Development Goals.
This equality can be reached by measuring and analyzing data and
applying policies based on the results. Many gender studies need
to count males and females by inferring gender from names, for
instance in research papers, job positions, or street names. The
traditional way is to use commercial APIs with proprietary data, with
no information about how the data was built. Another way is to take
data from Wikipedia or scientific sites.
With the Open Data movement, many statistics institutions are providing
open datasets about name, gender and frequency. So we need a scientific
discussion about unifying formats, making these data easy to process,
and moving towards standards. In this discussion, we take into account
minorities, for example LGBT claims such as acknowledging non-binary
realities.
The dataset covers more than 20 countries in the Western world and
contains more names than any open source software at this moment. It
allows students and academics interested in the phenomenon to measure
the gender gap.
The citation of official sources provided by statistics institutions
is a guarantee of quality for reproducible research. It makes peer
review easier and opens doors to the semantic web and to attention to
minorities, such as transgender people or cultures with their own
languages in states with a different main language, making it cheaper
to measure the gender gap.
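As a minimal illustration of the name, gender and frequency format discussed above, the following Python sketch guesses gender from a first name using per-gender frequency counts and then counts the genders found in a list of names. The column layout, the sample rows, the 0.8 threshold and the function names are all assumptions made for illustration. They are not Damegender's actual file format or API.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample in a "name,gender,frequency" layout; the rows are
# made-up illustration values, not figures from any statistics institution.
SAMPLE_CSV = """name,gender,frequency
MARIA,female,1000
MARIA,male,10
JOHN,male,800
ALEX,male,300
ALEX,female,200
"""


def load_frequencies(text):
    """Return {NAME: Counter({gender: frequency})} parsed from CSV text."""
    table = defaultdict(Counter)
    for row in csv.DictReader(io.StringIO(text)):
        name = row["name"].strip().upper()
        table[name][row["gender"]] += int(row["frequency"])
    return table


def guess_gender(name, table, threshold=0.8):
    """Guess a gender when one gender holds at least `threshold` of the uses.

    The threshold is an assumed, tunable value, not a recommended standard.
    """
    counts = table.get(name.strip().upper())
    if not counts:
        return "unknown"
    gender, freq = counts.most_common(1)[0]
    if freq / sum(counts.values()) >= threshold:
        return gender
    return "ambiguous"


def count_genders(names, table):
    """Count guessed genders over a list of first names."""
    return Counter(guess_gender(n, table) for n in names)


table = load_frequencies(SAMPLE_CSV)
result = count_genders(["Maria", "John", "Alex", "Zed"], table)
print(dict(result))
```

In this sketch, names whose frequencies are split between genders are reported as ambiguous, and names absent from the table as unknown, instead of being forced into a binary category.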
Keyphrases: Gender Detection Tool from the Name, gender gap, open datasets