Abstract
News media permeates daily online interactions through a multitude of platforms and sources. This reach affords news media significant social influence. Analyzing news articles at scale can reveal latent trends in news media, trends that have the potential to be norm-setting. In this study, we implement computational tools to reveal large-scale trends in news reporting. Specifically, we integrate Named Entity Recognition (NER), record linkage (in the form of gender prediction), topic modeling, and word embeddings to reveal trends both in the corpus overall and in gendered contexts specifically. NER and record linkage are used to isolate the contexts in which an individual is reported on and to predict that individual's gender, which allows us to make larger claims about gender representation in news media. These contexts are then used to train the word embeddings, illuminating differences in the semantic contexts and roles of women and men in news media. This study contributes to an emerging field at the intersection of machine learning and quantitative social science by implementing advanced model architectures to answer questions rooted in cultural and media studies.