Fodor argues that speech perception is accomplished by a module. Typically, modular processing is taken to be bottom-up processing. Yet there is ubiquitous empirical evidence that speech perception is influenced by top-down processing. Fodor attempts to resolve this conflict by denying that modular processing must be exclusively bottom-up. It is argued, however, that Fodor's attempt to reconcile top-down and modular processing fails, because: (i) it undermines Fodor's own conception of modular processing; and (ii) it cannot account for the contextually varying top-down influences that characterize speech perception.
In this paper I provide a metatheoretical analysis of speech perception research. I argue that the central turning point in the history of speech perception research has not been well understood. While it is widely thought to mark a decisive break with what I call "the alphabetic conception of speech," I argue that it instead marks the entrenchment of this conception of speech. In addition, I argue that the alphabetic conception of speech continues to underwrite speech perception research today and, moreover, that it functions as a dogma which ought to be rejected.
The overall goal of speech perception research is to explain how spoken language is recognized and understood. In the current research framework it is assumed that the key to achieving this overall goal is to solve the lack of invariance problem. But nearly half a century of sustained effort in a variety of theoretical perspectives has failed to solve this problem. Indeed, not only has the problem not been solved, virtually no empirical candidates for solving the problem have been produced. One explanation for this lack of progress is simply that no theory has yet hit upon the correct set of invariant properties. Another explanation is that the goal of solving the lack of invariance problem is itself misguided. The most basic claim of this dissertation is that the latter explanation is correct.

The lack of progress in explaining speech perception exhibited by the current research framework, I argue, is not, in the first instance, due to the failure of individual theories to solve the lack of invariance problem, but rather to the common background assumption that doing so is in fact the key to explaining speech perception.

My overall argument in support of this basic claim has three main components: criticism of the empirical results of research in the current framework; criticism of the formulation of theories generated in the current research framework; and the availability of an alternative account of phonetic structure.

The heart of my argument is an analysis of the character of the theoretical weaknesses of three theories of speech perception: the motor theory of speech perception, the ecological theory of speech perception, and the theory of acoustic invariance. I show that, in each case, the particular theoretical problems result from trying to satisfy the common goal of identifying invariant properties of phonetic percepts.

I analyze the methodological error embodied in the current approach to explaining speech perception as, in general, one of abstracting the wrong kinds of properties from the detail of the speech event, and, in particular, of abstracting away from dynamic and context-dependent properties.
Imagine walking into Starbucks, ordering a double latte, meeting a friend, drinking up, and leaving. In the course of this simple event, you would engage in a wide variety of cognitive activities, among them problem solving, face recognition, speech production and perception, memory, and motor control. How does the mind – an apparently unitary entity – accomplish such a diversity of tasks? Is the mind partitioned into diverse mechanisms, each responsible for a different job? Or are more uniform, general‐purpose mechanisms deployed for different cognitive purposes? Which tasks even count as the same, and which as different? Is visual recognition a single task, or are the mechanisms that recognize objects fundamentally distinct from those that recognize faces? Is speech produced and perceived by similar processes or by different ones? More generally, how, and how much, do such different processes interact?
Björn Lindblom's account of the emergence of phonemic structure is a central reference point in contemporary discussions of the emergence of language. I argue that there are two distinct, and largely orthogonal, conceptions of emergence implicit in Lindblom's account. According to one conception (causal emergence), the process by which minimal pairs are generated is crucial to the claim that phonemic structure is emergent; according to the other conception (analytic emergence), the fact that segments are an abstraction from the physical signal is what is crucial to the description of phonemic structure as emergent. The purpose of distinguishing rather than conflating these two conceptions of emergence is not in the first instance to criticize Lindblom's account or to force us to choose between the two conceptions for consistency, but rather to give us a more detailed purchase on the notoriously thorny concept of emergent explanation.
The suggestion that analytic isomorphism should be rejected applies especially to the domain of speech perception because (1) the guiding assumption that solving the lack of invariance problem is the key to explaining speech perception is a form of analytic isomorphism, and (2) after nearly half a century of research there is virtually no empirical evidence of isomorphism between perceptual experience and lower-level processing units.
Norris, McQueen & Cutler claim that all known speech recognition data can be accounted for with their autonomous model, "Merge." But this claim is doubly misleading. (1) Although speech recognition is autonomous in their view, the Merge model is not. (2) The body of data which the Merge model accounts for is not, in their view, speech recognition data.

Footnote 1: The author is also affiliated with the Center for the Study of Language and Information, Stanford University, Stanford, CA 94305, [email protected].