The perception of speaking styles under cochlear implant simulation
Real-life speech communication is complicated not only by background noise and competition from other talkers, but also by the natural variability encoded in the speech signal. To deal with this variability, listeners must identify the speech form and extract information about the environment, context, and talker. Further, they must make rapid perceptual adjustments and learn from systematic variation to facilitate speech recognition. These tasks may be very challenging for hearing-impaired users of cochlear implants (CIs), since limitations of the CI may prevent users from reliably perceiving and using subtle variations in speech. While CI users achieve largely robust perception of ideal speech, i.e., carefully controlled speech with clear pronunciations, our knowledge of how CI users detect and adapt to real-life speech remains limited. Moreover, CI users’ perception of speech produced under well-controlled laboratory conditions may not reflect their actual real-life performance.
In order to begin characterizing CI perception of real-life speech forms, and to provide guidelines for best clinical practice, the perception of speaking styles common in real-life environments was investigated under normal and CI-simulated listening conditions. In particular, normal-hearing (NH) listeners completed a perceptual discrimination task and a sentence recognition task using casual and careful speech in three conditions: unprocessed speech and 12-channel and 4-channel noise-vocoder CI simulations. The results indicate that CI simulation had a significant impact on the perception of real-life speaking styles. In the discrimination task, NH listeners were unable to reliably categorize speaking style under CI simulation. In the sentence recognition task, listeners’ ability to recognize casual speech was disproportionately reduced as spectral resolution decreased, with performance on casual speech falling well below performance on careful speech under 4-channel simulation. Finally, performance on the two tasks was compared to explore the relation between perceptual discrimination and word recognition across speaking styles.
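For readers unfamiliar with noise-vocoder CI simulation, the sketch below (in Python) illustrates the general technique: the signal is split into a small number of frequency bands, the temporal envelope of each band is extracted, and the envelopes modulate band-limited noise carriers that are then summed. This is a minimal illustration only; the filter-bank spacing, filter orders, and envelope cutoff shown here are assumptions for the sketch, not the parameters used in the present study.

import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_channels=4, f_lo=100.0, f_hi=8000.0):
    """Noise-vocode a speech signal: keep per-band temporal envelopes,
    discard fine structure by using noise carriers (illustrative parameters)."""
    # Band edges spaced logarithmically between f_lo and f_hi (assumed spacing).
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    noise = np.random.randn(len(signal))
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Bandpass filter for this analysis/synthesis channel.
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        # Temporal envelope via Hilbert transform, smoothed with a low-pass
        # filter (160 Hz cutoff assumed here).
        env = np.abs(hilbert(band))
        env = sosfiltfilt(butter(2, 160.0, btype="low", fs=fs, output="sos"), env)
        # Modulate band-limited noise with the envelope and accumulate.
        carrier = sosfiltfilt(sos, noise)
        out += env * carrier
    # Roughly match the overall RMS level of the input.
    out *= np.sqrt(np.mean(signal**2) / (np.mean(out**2) + 1e-12))
    return out

Setting n_channels to 12 or 4 corresponds to the higher- and lower-resolution simulation conditions described above: fewer channels means coarser spectral resolution, while the temporal envelope cues within each band are preserved.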
Taken together, the findings from the CI simulations suggest that perceptual adjustment to real-life speaking styles may be difficult for CI users, given that some important cues to speaking style, such as fine acoustic-phonetic detail, may not be available to them. Despite this, the results suggest that some CI listeners may still be able to use additional cues, such as durational cues related to speaking rate, and draw on linguistic knowledge and experience in their perception of real-life speaking styles. By characterizing how CI users perceive and encode speech information related to speaking style, we may be able to develop new clinical tools for the assessment and training of real-life speech perception in these listeners.