Determining author or reader

Because literary texts are composed of words rather than numbers, they are not an obvious source of data for statistical analysis. However, with the help of computer programs, words can be converted to numbers and specific parts of a text can be examined on a large scale. Textual elements such as sentence length, word length and lexical diversity have been associated with various concepts by scholars in different fields. Stylometry is one of these fields; it focusses on the writing style of an individual author and, more specifically, on tracing markers of that style to attribute authorship to anonymous texts. In children’s literature studies, on the other hand, these markers or textual elements are most often associated with the complexity of a text and the intended age of its readers. In this paper, data from the entire CAFYR corpus (just under 700 English and Dutch books written for different ages) is subjected to statistical evaluation to investigate whether the textual elements studied are better suited to detecting the age of a text's intended reader or the identity or age of its author.
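To make the idea of converting words to numbers concrete, the following is a minimal sketch of how the three textual elements named above (sentence length, word length and lexical diversity) could be computed for a single text. It is not the paper's pipeline: the function name `text_features`, the naive punctuation-based sentence splitter, and the use of a type-token ratio as the measure of lexical diversity are all assumptions made for illustration.

```python
import re


def text_features(text: str) -> dict:
    """Compute three simple textual features (illustrative assumptions only)."""
    # Naive sentence split on runs of ., ! or ? (assumption: no abbreviation handling).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased runs of Unicode letters, so accented Dutch characters are kept.
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not sentences or not words:
        return {
            "mean_sentence_length": 0.0,
            "mean_word_length": 0.0,
            "type_token_ratio": 0.0,
        }
    return {
        # Average number of words per sentence.
        "mean_sentence_length": len(words) / len(sentences),
        # Average number of letters per word.
        "mean_word_length": sum(len(w) for w in words) / len(words),
        # Distinct words divided by total words, one common proxy for lexical diversity.
        "type_token_ratio": len(set(words)) / len(words),
    }


if __name__ == "__main__":
    print(text_features("The cat sat. The cat ran away!"))
```

Because the type-token ratio tends to fall as a text gets longer, a comparison across books of different lengths would normally need a length-normalised variant; the plain ratio is used here only to keep the sketch short.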

Geybels, Lindsey. ‘Determining Author or Reader: A Statistical Analysis of Textual Features in Children’s and Adult Literature’. Proceedings of the Computational Humanities Research Conference, 2022, pp. 355–365.
