A growing number of studies in the humanities now use the tools of authorship attribution to answer traditionally “subjective” questions of literary style. However, scientists still develop these tools mostly with conventional classification tasks in mind, and most scholars of literature still believe that quantified data cannot tell the whole story. We aim to hone the tools of textual analysis to literary goals, to make the expression of digital analysis more flexible, and to strengthen the tenuous connection between feature set and literature upon which stylistics depends. In this paper, we introduce a new feature for stylistics called the “functional n-gram,” which captures the repetitive stylistic nature of sound-oriented texts. Using functional n-grams and Support Vector Machines, we present a variety of authorship attribution experiments on English-language novels as well as Romantic, Renaissance, and Classical poetry. Extending the analysis, we then use functional n-grams as the feature basis for a series of Principal Components Analysis experiments examining stylistic consistency in Homer.



