Abstract
Both language and genes evolve by transmission over generations with opportunity for differential replication of forms. The understanding that gene frequencies change at random by genetic drift, even in the absence of natural selection, was a seminal advance in evolutionary biology. Stochastic drift must also occur in language as a result of randomness in how linguistic forms are copied between speakers. Here we quantify the strength of selection relative to stochastic drift in language evolution. We use time series derived from large corpora of annotated texts dating from the 12th to 21st centuries to analyse three well-known grammatical changes in English: the regularization of past-tense verbs, the introduction of the periphrastic ‘do’, and variation in verbal negation. We reject stochastic drift in favour of selection in some cases but not in others. In particular, we infer selection towards the irregular forms of some past-tense verbs, which is likely driven by changing frequencies of rhyming patterns over time. We show that stochastic drift is stronger for rare words, which may explain why rare forms are more prone to replacement than common ones. This work provides a method for testing selective theories of language change against a null model and reveals an underappreciated role for stochasticity in language evolution.
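
The claim that drift is stronger for rare words can be illustrated with a small simulation. The sketch below is not the method used in this work: it assumes a neutral Wright-Fisher-style copying process, in which each generation of tokens is resampled at random from the previous one with no selection. The function names, the token counts (50 and 500), the number of generations, and the number of replicates are all hypothetical choices made for illustration.

```python
import random
import statistics


def neutral_copying(n_tokens, p0, generations, rng):
    """Simulate one neutral trajectory of a variant's frequency.

    Assumption (not from the source): each of the n_tokens uses of a word
    in a generation copies a form drawn at random from the previous
    generation, so the variant count is a binomial resample.
    """
    p = p0
    trajectory = [p]
    for _ in range(generations):
        count = sum(1 for _ in range(n_tokens) if rng.random() < p)
        p = count / n_tokens
        trajectory.append(p)
    return trajectory


def final_frequency_spread(n_tokens, p0=0.5, generations=20,
                           replicates=200, seed=0):
    """Standard deviation of the final frequency across replicate runs.

    A larger spread means random copying alone moves the frequency further,
    i.e. drift is stronger.
    """
    rng = random.Random(seed)
    finals = [neutral_copying(n_tokens, p0, generations, rng)[-1]
              for _ in range(replicates)]
    return statistics.pstdev(finals)


# Hypothetical token counts: a "rare" word with 50 uses per generation
# and a "common" word with 500 uses per generation.
rare_spread = final_frequency_spread(n_tokens=50)
common_spread = final_frequency_spread(n_tokens=500)

print(f"spread for rare word (50 tokens):    {rare_spread:.3f}")
print(f"spread for common word (500 tokens): {common_spread:.3f}")
```

Under these assumptions the variant frequency of the rarer word wanders much further from its starting value over the same number of generations, even though no form is favoured. This is the sense in which a rare form can be replaced by chance alone, and why a null model of this kind is a useful baseline before inferring selection.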