Abstract
How do cognitive pressures shape the lexicons of natural languages? Here, we reframe George Kingsley Zipf's proposed “law of abbreviation” within a more general framework that relates it to cognitive pressures that affect speakers and listeners. In this new framework, speakers' drive to reduce effort (Zipf's proposal) is counteracted by the need for low‐frequency words to have word forms that are sufficiently distinctive to allow for accurate recognition by listeners. To support this framework, we replicate and extend recent work using the prevalence of subword phonemic sequences (phonotactic probability) to measure speakers' production effort in place of Zipf's measure of length. Across languages and corpora, phonotactic probability is more strongly correlated with word frequency than word length is. We also show that this measure of ease of speech production (phonotactic probability) is strongly correlated with a measure of perceptual difficulty that indexes the degree of competition from alternative interpretations in word recognition. This result is consistent with the claim that there must be trade‐offs between these two factors, and is inconsistent with a recent proposal that phonotactic probability facilitates both perception and production. To our knowledge, this is the first work to offer an explanation of why long, phonotactically improbable word forms remain in the lexicons of natural languages.