Published 2010 | Version v1
Journal article Open

Features From Frequency: Authorship and Stylistic Analysis Using Repetitive Sound

  • 1. University of New York at Buffalo
  • 2. University of Colorado at Colorado Springs

Description

A growing number of studies in the humanities now use the tools of authorship attribution to answer traditionally "subjective" questions of literary style. However, scientists still develop these tools, for the most part, with more traditional classification tasks in mind, and most scholars of literature still believe that quantified data cannot tell the whole story. We aim to adapt the tools of textual analysis to literary goals, to make the expression of digital analysis more flexible, and to strengthen the tenuous connection between feature set and literature upon which stylistics depends. In this paper, we introduce a new feature for stylistics called the "functional n-gram," which captures the repetitive stylistic nature of sound-oriented texts. Using functional n-grams and Support Vector Machines, we present a variety of authorship attribution experiments on English-language novels, as well as Romantic, Renaissance, and Classical poetry. Extending the analysis further, we use functional n-grams as a feature basis for a series of Principal Components Analysis experiments examining stylistic consistency in Homer.

Files

56-270-1-PB.pdf

Files (2.4 MB)

md5:520bb75df2c2c64e9c6d82e90a236895 (2.4 MB)
md5:81a7c3520a5e3d420435a96e35a8e47d (1.9 kB)

Additional details

Identifiers

Other
oai:knowledge.uchicago.edu:132

UChicago Information

Department(s)
2010 Journal of the Chicago Colloquium on Digital Humanities and Computer Science Vol. 1, No. 2