Tags: annotation, data accuracy, label quality, requirements framework
Abstract:
The novel design and application of algorithms have long been the focus of machine learning research aimed at creating the highest performing models. Since applications of deep neural networks re-emerged in 2012, algorithmic methods have converged to a more constrained set for applied usage as performance gains from algorithm improvements have plateaued. The search for gains has shifted toward methods that increase data quality and availability as the alternate path toward further model performance. This early-stage research sets forth a journey toward constructing a data-informed annotation systems requirements framework that accounts for human factors, beginning with an analysis of the implications of annotation tool interface design for NLP use case label quality (i.e., label accuracy). The design and analysis of a set of controlled annotation experiments will be the foundation of this work. This contribution toward a consolidated annotation requirements framework for applied AI practitioners is intended to support improvements in production model performance through the cost-efficient creation of higher-quality, scaled datasets for model training and tuning.
Impacts of Annotator Interface Design on NLP Data Annotation Quality