There is a longstanding debate about the specificity and encapsulation of the mechanisms that underlie humans’ ability to understand speech. Some scholars argue that this ability arises from evolutionarily specialized language modules, while others hold that it results from the interaction between general cognitive mechanisms and extensive experience. In the present work, we embed meaningful non-speech sounds (e.g., dog barks and train whistles) in spoken sentences to probe the extent to which language understanding mechanisms can adapt to non-speech input. Behaviorally, we find that listeners recognize and understand non-speech sounds in sentence context quickly and easily, with processing costs much smaller than would be expected from a “translation” or covert naming process. Neurally, the N400 and P600 potentials are remarkably similar for non-speech sounds and matched spoken words, suggesting similar processing that is, in both cases, equally sensitive to sentential constraint and to congruency with the preceding sentence context. These results closely mirror those of a final study in which we examined neural responses to spoken words produced by a different talker. Analyses of the relationship between working memory and scalp topography, together with source analysis, indicate that comprehending a different talker and comprehending an environmental sound both involve early recruitment of working memory in the N1 and P2 time windows. These processes are mediated by parietal and temporal speech areas that have previously been identified in studies of talker normalization. Although these resources may be recruited to a greater degree and/or for longer for non-speech sounds than for a changing talker, the underlying mechanisms appear to be qualitatively the same.
These results are discussed in light of a new model of auditory understanding, in which working memory, attention, and experience interact to allow flexible and rapid understanding of many different types of auditory stimuli, including (but not limited to) speech.