Damegender: Towards an International and Free Dataset about Name, Gender and Frequency

Gender equality is the fifth of the United Nations Sustainable Development Goals.

This equality can be reached by measuring and analyzing data
and applying the results politically. Many gender studies
count males and females based on their names, for
instance in research papers, job positions, or street names. The
traditional research method is to use commercial APIs with
proprietary data, with no information about how the data were
collected. Data may also be gathered from Wikipedia, linguistic
studies, scientific sites, or statistical offices.

Our approach is based on collecting Open Datasets regarding name,
gender and frequency from many statistical institutions. This
calls for a scientific discussion about unifying formats and
processing the data easily.

Therefore, we present Damegender, a Free and Open Source Software
tool to retrieve these data and perform calculations with them.
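
To illustrate the kind of calculation involved, the following is a
minimal sketch in Python. It assumes a hypothetical unified format with
one row per name, gender and frequency. The column names, the sample
figures and the functions load_counts and guess_gender are illustrative
assumptions, not Damegender's actual file format or API.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in an assumed unified format (name, gender, frequency).
# The figures are invented for illustration, not taken from a real office.
SAMPLE_CSV = """name,gender,frequency
MARIA,female,1200
MARIA,male,30
ANDREA,female,400
ANDREA,male,350
JOHN,male,900
"""


def load_counts(text):
    """Sum frequencies per (name, gender) from CSV text in the assumed format."""
    counts = defaultdict(lambda: {"male": 0, "female": 0})
    for row in csv.DictReader(io.StringIO(text)):
        counts[row["name"].upper()][row["gender"]] += int(row["frequency"])
    return counts


def guess_gender(name, counts):
    """Return (gender, share): the majority gender and its share of the total.

    Names that are missing, or that are split exactly evenly, give "unknown".
    """
    entry = counts.get(name.upper())
    if entry is None:
        return ("unknown", 0.0)
    total = entry["male"] + entry["female"]
    if total == 0 or entry["male"] == entry["female"]:
        return ("unknown", 0.5 if total else 0.0)
    gender = "male" if entry["male"] > entry["female"] else "female"
    return (gender, entry[gender] / total)


if __name__ == "__main__":
    counts = load_counts(SAMPLE_CSV)
    for name in ("Maria", "Andrea", "John", "Zed"):
        print(name, guess_gender(name, counts))
```

The share returned with each guess makes ambiguous names visible
instead of hiding them behind a single label.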

The dataset we used covers more than 20 countries in the Western
world and encompasses many names; with it, gender is identified with
an accuracy of approximately 90%. This allows students and academics
interested in the phenomenon to measure the gender gap at no cost and
in a reproducible way, so that more people can contribute to closing
the gender gap.

Free software and the data provided by statistical institutions make
it possible to produce reproducible research for peer review. Thus,
semantics and diversity can be more easily addressed.

