Tags: authorship identification, confidence intervals, nonparametric methods, p-statistics
Abstract:
This paper compares two nonparametric methods of authorship identification for Ukrainian literary texts. It describes implementations of both methods, one based on the Klyushin–Petunin test and the other on a simplified version of it. N-gram selection is applied. The methods were tested on a collection of texts of up to 200,000 characters by 10 authors. The tests showed that the simplified test appears to be more sensitive and specific, and that monograms and bigrams, in contrast to trigrams, provide clear detection of authorship.
Nonparametric Methods of Authorship Attribution in Ukrainian Literature