ENS, salle Ribot, 29 rue d'Ulm, 75005 Paris
The auditory system must integrate across many different temporal scales to derive meaning from complex natural sounds such as speech and music. A key challenge is that sound structures (such as phonemes, syllables, and words in speech) have highly variable durations. As a consequence, there is a fundamental difference between integrating across absolute time (e.g., a 100-millisecond window) and integrating across sound structure (e.g., a phoneme or word). Auditory models have typically assumed time-yoked integration, in which the integration window is fixed in absolute time, while cognitive models have often assumed structure-yoked integration, which implies that the integration time should scale with structure duration. Little empirical work has directly tested these important and divergent assumptions, in part due to the difficulty of measuring integration windows from nonlinear systems like the brain and the poor spatiotemporal resolution of noninvasive neuroimaging methods. To address this question, we measured neural integration windows for time-stretched and time-compressed speech (with pitch preserved) using a novel method for estimating integration windows from nonlinear systems (the temporal context invariance paradigm) applied to spatiotemporally precise intracranial recordings from human neurosurgical patients. Stretching and compression rescale the duration of all sound structures and should thus rescale the integration window if integration is yoked to structure, but leave it unchanged if integration is yoked to time. Across the auditory cortex, we observed significantly longer integration windows for stretched vs. compressed speech, demonstrating the existence of structure-yoked integration in the human auditory cortex. However, this effect was small relative to the difference in structure durations, even in non-primary regions of the superior temporal gyrus with long integration windows (>200 milliseconds) that have been implicated in speech-specific processing.
These findings suggest that the human auditory cortex encodes sound structure using integration windows that are mainly yoked to absolute time and weakly yoked to structure duration, presenting a challenge for existing models that assume purely time-yoked or structure-yoked integration.
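To make the contrast between the two hypotheses concrete, the following minimal sketch assumes a simple power-law interpolation between them: the integration window is taken to scale as the stretch factor raised to a "yoking" exponent, where 0 corresponds to purely time-yoked and 1 to purely structure-yoked integration. This parameterization, the function names, and the example numbers are assumptions introduced here for illustration only; they are not the analysis used in the study.

```python
import math


def predicted_window_ms(base_window_ms: float, stretch_factor: float, yoking: float) -> float:
    """Predicted integration window under a hypothetical power-law model.

    yoking = 0.0 -> purely time-yoked (window ignores the stretch factor)
    yoking = 1.0 -> purely structure-yoked (window scales with the stretch factor)
    """
    if base_window_ms <= 0 or stretch_factor <= 0:
        raise ValueError("base_window_ms and stretch_factor must be positive")
    return base_window_ms * stretch_factor ** yoking


def estimate_yoking(window_stretched_ms: float, window_compressed_ms: float,
                    stretch_factor: float, compress_factor: float) -> float:
    """Recover the yoking exponent from two measured windows.

    Under the power-law assumption, the log ratio of the windows divided by
    the log ratio of the duration factors equals the exponent.
    """
    if min(window_stretched_ms, window_compressed_ms, stretch_factor, compress_factor) <= 0:
        raise ValueError("all arguments must be positive")
    if stretch_factor == compress_factor:
        raise ValueError("stretch_factor and compress_factor must differ")
    return (math.log(window_stretched_ms / window_compressed_ms)
            / math.log(stretch_factor / compress_factor))


if __name__ == "__main__":
    # Hypothetical values: a 200 ms window at the natural rate,
    # speech stretched by 2x or compressed to 0.5x.
    for yoking in (0.0, 0.1, 1.0):
        stretched = predicted_window_ms(200.0, 2.0, yoking)
        compressed = predicted_window_ms(200.0, 0.5, yoking)
        print(f"yoking={yoking}: stretched={stretched:.1f} ms, compressed={compressed:.1f} ms")
```

Under these assumptions, an exponent near 0 with a small but reliable positive value would correspond to the pattern described above: windows mainly yoked to absolute time and weakly yoked to structure duration.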