27  Annotation for Canonicity

Lisanne van Rossum (Amsterdam)

27.1 Annotating the canon

Annotation indicating canonicity or prestige is closely intertwined with the development of selection criteria for a corpus, as shown, for example, by Algee-Hewitt and McGurl (2015). An annotation scheme for both corpus selection and canonicity requires informed, consistent, and schematic decision-making to maintain internal logic. With this in mind, annotating increasingly complex socio-literary concepts such as canonicity seems a daunting task. The following examples from applied practice present first steps, and are often based on a taxonomy, or classification scheme, of literary value.

27.2 Current practices

In studying the canon, several approaches to annotation for canonicity or prestige have been explored. Using categorical principal component analysis (Princals), Marc Verboord developed an annotation scheme for the prestige status of authors. Princals is a statistical method that reduces a set of variables to a smaller number of components by dividing them into relative categories, such as “small”, “medium”, and “large” publishing houses based on the number of books they contributed to the literary market over a given time period (2003, 272). With a database built from six source types (literary encyclopedias, popular encyclopedias, publishing house status, academic publications, popular prizes, and literary prizes), Verboord created the Institutional Literary Prestige (ILP) system: a classification scheme covering 502 authors with varying degrees of institutional prestige that juxtaposes popular and literary appeal (2003).
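The general shape of such a scheme can be sketched in a few lines. The snippet below is only an illustration, not Verboord’s method: it approximates Princals with one-hot encoding followed by ordinary PCA, and the authors, indicator variables, and category values are all invented.

```python
# Illustrative sketch: reducing categorical institutional indicators to a
# single "prestige" component. Princals is approximated here by one-hot
# encoding plus ordinary PCA; all data below is hypothetical.
import pandas as pd
from sklearn.decomposition import PCA

# Invented institutional indicators for a handful of authors.
authors = pd.DataFrame({
    "publisher_size": ["large", "large", "medium", "small", "small"],
    "literary_prizes": ["many", "some", "some", "none", "none"],
    "encyclopedia_entries": ["yes", "yes", "no", "no", "yes"],
}, index=["A", "B", "C", "D", "E"])

# One-hot encode the categories and take the first principal component
# as a rough, relative prestige score per author.
encoded = pd.get_dummies(authors).astype(float)
scores = PCA(n_components=1).fit_transform(encoded)
for author, score in zip(authors.index, scores[:, 0]):
    print(author, round(float(score), 2))
```

The resulting component only orders authors relative to one another; its sign and scale are arbitrary, which is one reason real schemes such as the ILP system validate components against multiple institutional sources.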

In turn, building on Heydebrand and Winko’s 2008 model of literary valuation (see von Heydebrand and Winko 2008; Herrmann, Jacobs, and Piper 2021), Thomas Messerli and Berenike Herrmann have combined lemmatization, metaphor detection software, and manual annotation to investigate how everyday German readers conceptualize literary quality and the reading experience in terms of the thematic categories ‘food’ (Herrmann and Messerli 2020) and ‘motion’ (Herrmann and Messerli 2019) across 1.3 million German-language online book reviews.
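The lexicon-matching step of such a pipeline can be sketched minimally. The example below is a toy: the lemma lexicons and the “review” are invented, and it stands in for what the actual study did with full lemmatization, metaphor detection software, and manual annotation.

```python
# Minimal sketch of lexicon-based theme detection over a lemmatized review.
# Both lexicons and the review are invented for illustration.
FOOD_LEMMAS = {"verschlingen", "kosten", "süß", "lecker"}    # e.g. "to devour"
MOTION_LEMMAS = {"reisen", "eintauchen", "mitreißen"}        # e.g. "to dive in"

# A toy lemma sequence standing in for a lemmatized German book review.
review_lemmas = ["ich", "verschlingen", "dieser", "buch", "und",
                 "eintauchen", "in", "seine", "welt"]

# Count how often each thematic category is evoked.
themes = {
    "food": sum(lemma in FOOD_LEMMAS for lemma in review_lemmas),
    "motion": sum(lemma in MOTION_LEMMAS for lemma in review_lemmas),
}
print(themes)
```

At corpus scale, such counts per review become the quantitative backbone on which manual annotation and metaphor detection then distinguish literal from figurative uses (devouring a meal vs. devouring a book).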

Alternatively, the Riddle of Literary Quality project approached prestige annotation by circumventing a predefined definition of literariness entirely, foregrounding instead bottom-up perceptions of literary quality from the reading public, who in a National Reader Survey rated works for literariness on a Likert scale from 1 to 7 (Koolen et al. 2020). A preliminary outcome of the project was that genre and perceptions of literariness appeared to be closely related (van Dalen-Oskam 2021; van Dalen-Oskam 2023). A genre annotation scheme was therefore developed based on the Dutch Uniform Classification Coding system, which at the time of the project was widely used by publishing houses to help booksellers arrange books in-store for optimal consumer orientation (Koolen 2018).
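The basic aggregation behind such survey data can be sketched as follows. This is not the project’s actual pipeline; the titles, genre labels, and ratings are invented to show how per-reader Likert scores (1–7) are pooled per title and can then be compared across genres.

```python
# Sketch: pooling per-reader Likert literariness ratings (1-7) per title.
# All responses below are invented for illustration.
from statistics import mean

# Hypothetical survey responses: (title, genre, literariness rating).
responses = [
    ("Novel A", "literary fiction", 6), ("Novel A", "literary fiction", 7),
    ("Novel B", "thriller", 3), ("Novel B", "thriller", 4),
    ("Novel C", "thriller", 2),
]

# Group ratings by (title, genre) and report the mean per title.
by_title = {}
for title, genre, rating in responses:
    by_title.setdefault((title, genre), []).append(rating)

for (title, genre), ratings in by_title.items():
    print(title, genre, round(mean(ratings), 2))
```

Even this toy example shows the pattern the project reported: once titles carry both a genre label and an averaged literariness score, genre-level differences in perceived literariness become directly measurable.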

Another approach to reader annotation is the Ben-Gurion University of the Negev (BGU) Literary Lab’s distant public reading initiative, a large-scale study of the Hebrew novel from its emergence in 1853 to the present day. The collective reading project garnered a total of 525 online questionnaires, filled out by 229 readers about 386 novels. The survey was designed to include narratological properties, such as ‘theme,’ ‘plotting,’ and ‘structure,’ and bibliographical properties, such as ‘author gender’ and ‘reception.’ The survey also invited readers to reflect on the perceived readability of the book and on the reading experience itself (Dekel and Marienberg-Milikowsky 2021). In practice, no two completed questionnaires on the same novel were identical, a finding that lays bare the complexities of inter-annotator agreement (2021, 246).
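Disagreement of this kind is commonly quantified with chance-corrected measures such as Cohen’s kappa. The sketch below uses invented labels for a single categorical survey question answered by two readers; it illustrates the measure, not the BGU study’s own evaluation.

```python
# Sketch: chance-corrected agreement between two readers' answers to the
# same categorical question, using Cohen's kappa. Labels are invented.
from sklearn.metrics import cohen_kappa_score

reader_1 = ["linear", "linear", "fragmented", "linear", "fragmented"]
reader_2 = ["linear", "fragmented", "fragmented", "linear", "linear"]

# Kappa of 1 means perfect agreement; 0 means agreement at chance level.
kappa = cohen_kappa_score(reader_1, reader_2)
print(round(kappa, 2))  # → 0.17
```

Here the readers agree on three of five novels, yet kappa is only about 0.17, because much of that raw agreement is expected by chance; this is the kind of gap that makes inter-annotator agreement on interpretive literary categories so hard to establish.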

27.3 Limitations

It is to be expected that many more approaches to annotation for canonicity or prestige will be explored, each shaped by slightly different research questions or by the diverging social and historical contexts of the corpora studied. We are still far from a unified and validated annotation standard for canonicity – if such a standard is possible at all.


See works cited and further reading for this chapter on Zotero.

Citation suggestion

Lisanne van Rossum (2023): “Annotation for Canonicity”. In: Survey of Methods in Computational Literary Studies (= D 3.2: Series of Five Short Survey Papers on Methodological Issues). Edited by Christof Schöch, Julia Dudar, Evgeniia Fileva. Trier: CLS INFRA. URL: https://methods.clsinfra.io/annotation-canon.html, DOI: 10.5281/zenodo.7892112.

License: Creative Commons Attribution 4.0 International (CC BY).