Using sung speech to evaluate the bimodal benefit to speech and music perception
Pitch cues provide important indexical and prosodic information for speech perception and are the basis for musical melody. Due to limited spectral resolution, cochlear implant (CI) users have great difficulty perceiving pitch. As a result, pitch-mediated speech perception and melody perception are poorer in CI users than in normal-hearing (NH) listeners. Combining a hearing aid (HA) with a CI (“bimodal” listening) has been shown to improve performance over the CI alone for many speech and music measures, presumably due to the better pitch perception afforded by the HA. However, bimodal benefit has been inconsistent across studies and patients, and seems to reflect interactions between the stimuli, test method, and hearing device, with the better ear for a given task and stimulus often contributing most strongly.
We recently developed the Sung Speech Corpus (SSC) to evaluate contributions of pitch and timbre to speech and music perception. The SSC consists of 50 sung monosyllabic words, 10 for each of 5 categories (name, verb, number, color, and object). The words were sung at 13 fundamental frequencies (F0s) from A2 (110 Hz) to A3 (220 Hz) in semitone steps and normalized to have the same duration and amplitude. The words can be used with a matrix test paradigm to test sentence recognition with fixed or mixed pitch cues across words. The words can also be used to measure melodic contour identification (MCI) with fixed or mixed timbre cues across pitch cues. As such, the SSC allows the contributions of acoustic and electric hearing to be evaluated for speech and music perception, using the same stimuli, which can be manipulated to provide different degrees of complexity for different listening tasks.
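As a rough illustration of the stimulus space described above, the sketch below computes the 13 semitone-spaced F0s between A2 and A3 and counts the possible five-word matrix sentences. It assumes equal-tempered tuning (each semitone is a factor of 2^(1/12)) and one word drawn from each of the 5 categories; the abstract gives only the endpoints, the step size, and the category counts, so the function and constant names here are hypothetical and not part of the SSC.

```python
# Minimal sketch under stated assumptions; not the authors' stimulus-generation code.

A2_HZ = 110.0            # lowest F0 stated in the abstract
N_F0S = 13               # A2 to A3 inclusive, in semitone steps
N_CATEGORIES = 5         # name, verb, number, color, object
WORDS_PER_CATEGORY = 10


def semitone_f0s(base_hz=A2_HZ, n=N_F0S):
    """Return n F0s in Hz, one equal-tempered semitone apart (assumed tuning)."""
    return [base_hz * 2 ** (k / 12) for k in range(n)]


def n_matrix_sentences(n_categories=N_CATEGORIES, words=WORDS_PER_CATEGORY):
    """Count sentences formed by taking one word from each category (assumed design)."""
    return words ** n_categories


f0s = semitone_f0s()
for k, f0 in enumerate(f0s):
    print(f"step {k:2d}: {f0:7.2f} Hz")
print("possible sentences:", n_matrix_sentences())
```

Under these assumptions the last F0 lands one octave above the first, matching the stated A3 endpoint, and the matrix design yields 10^5 distinct word combinations before any pitch manipulation is applied.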
Sentence recognition and MCI with sung speech were measured in bimodal listeners; performance was measured with the CI alone, the HA alone, or the CI+HA. Sentence recognition was measured with fixed or mixed pitch cues, as well as with spoken words. MCI was measured with fixed or mixed timbres (i.e., words), as well as with a piano sample. CI performance generally worsened as the stimuli became more complex. Thus, bimodal and CI sentence recognition declined from spoken word to fixed pitch to mixed pitch, and bimodal and HA MCI performance declined from piano to fixed word to mixed word. These preliminary data suggest that bimodal listeners still lack critical pitch processing abilities despite the low-frequency pitch cues provided by the HA.