Data for: Prosodic alignment toward emotionally expressive speech: Comparing human and Alexa model talkers
Description
This study tests whether individuals vocally align toward emotionally expressive prosody produced by two types of interlocutors: a human and a voice-activated artificially intelligent (voice-AI) assistant. Participants completed a word shadowing experiment of interjections (e.g., “Awesome”) produced in emotionally neutral and expressive prosodies by both a human voice and a voice generated by a voice-AI system (Amazon’s Alexa). Results show increases in participants’ word duration, mean f0, and f0 variation in response to emotional expressiveness, consistent with increased alignment toward a general ‘positive-emotional’ speech style. Small differences in emotional alignment by talker category (human vs. voice-AI) parallel the acoustic differences in the model talkers’ productions, suggesting that participants are mirroring the acoustics they hear. The similar responses to emotion in both a human and voice-AI talker support accounts of unmediated emotional alignment, as well as computer personification: people apply emotionally-mediated behaviors to both types of interlocutors. There were small differences in magnitude by participant gender, the overall patterns were similar for women and men, supporting a nuanced picture of emotional vocal alignment.