Abstract
We mention what is remarkable while letting the unremarkable go unsaid. Thus, while language can tell us a lot about the world, it does not veridically reflect the world: people are more likely to talk about atypical features (e.g., "purple carrot") than typical features (e.g., "[orange] carrot"). In this dissertation, I characterize how people selectively describe the features of things and examine the implications of this selective description for how children and adults learn from language. In Chapter 1, I show that adults speaking to other adults, caregivers speaking to children, and children themselves tend to mention the atypical more than the typical features of concrete things. Language is structured to emphasize what is atypical, so how can one learn what things are typically like from language? In this chapter I also show that distributional semantics models that use word co-occurrence to derive word meaning (word2vec) do not capture the typicality of adjective–noun pairs well. I also examine the performance of two more sophisticated language models (BERT and GPT-3); these models are trained on input unlike what children have access to, but they provide useful bounds on the typicality information learnable from applying simple training objectives to language alone. However, people can learn about typicality in other ways: in Chapter 2, I show that people infer that mentioned features are atypical. That is, when a novel object is called a "purple toma," adults infer that tomas are less commonly purple in general. This inference is captured by a model in the Rational Speech Act framework that posits that listeners reason about speakers' communicative goals. In Chapter 3, I ask: do children themselves infer that mentioned features are atypical?
I find preliminary evidence that 5- to 6-year-old children who reliably respond on our typicality measure tend toward making contrastive rather than associative inferences; further work is necessary to confirm this finding and test younger children's contrastive inferences. Overall, this dissertation examines how language does not directly reflect the world, but selectively picks out remarkable facets of it, and what this implies for how adults, children, and language models learn.
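The Rational Speech Act inference described above can be sketched computationally. In the sketch below, the prevalence grid, rationality parameter, and utterance cost are illustrative assumptions, not the dissertation's actual model or parameters: a listener who hears "purple toma" reasons that a rational speaker mentions color mainly when it is informative, and so shifts belief toward purple being rare among tomas.

```python
import math

# Hypothetical setup (illustrative values, not the dissertation's actual model):
# a listener hears "purple toma" and infers how prevalent purple is among tomas.
PREVALENCES = [0.1, 0.5, 0.9]  # candidate rates of purple among tomas
ALPHA = 1.0                    # speaker rationality (assumed)
COST = 0.1                     # extra cost of producing "purple" (assumed)

def speaker_mentions_color(prevalence):
    """P(speaker says "purple toma" | object is purple, given prevalence).

    Literal listener semantics: "purple toma" pins down the color exactly
    (probability 1), while bare "toma" leaves the color at its base rate
    (probability = prevalence). The speaker soft-maximizes informativeness
    minus utterance cost, so mentioning color wins when purple is rare.
    """
    util_mention = math.exp(ALPHA * (math.log(1.0) - COST))
    util_bare = math.exp(ALPHA * math.log(prevalence))
    return util_mention / (util_mention + util_bare)

def listener_posterior():
    """P(prevalence | heard "purple toma"), with a uniform prior over PREVALENCES."""
    scores = [speaker_mentions_color(t) for t in PREVALENCES]
    total = sum(scores)
    return {t: s / total for t, s in zip(PREVALENCES, scores)}

posterior = listener_posterior()
# Hearing "purple" mentioned shifts belief toward purple being atypical:
assert posterior[0.1] > posterior[0.9]
```

Under these assumptions, the posterior over prevalence is monotonically decreasing: the more typical purple would be, the less reason a rational speaker would have to mention it, so mentioning it signals atypicality.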