Output details
29 - English Language and Literature
Cardiff University
Choosing the best tools for comparative analyses of texts
This 19,000-word paper was the core output of an 18-month AHRC grant (AH/E001874/1). The work it summarises entailed detailed assessments of 381 different measures of language patterns, all applied to the same written-language dataset. The measures were evaluated for their similarity, reliability and usefulness, and cross-referenced with cognitive measures of the writers of the texts. The subset of tools judged most useful for profiling written texts could not have been selected without this extensive evaluation, which included both computational and manual analyses, along with an examination of the relationship between mathematical assumptions and the linguistic patterns they were a proxy for.