Data for: Prosodic alignment toward neutral and emotionally expressive speech interjections: Differences for human and device voices

Published: 4 May 2020 | Version 1 | DOI: 10.17632/w54rh87pjx.1
Contributors:
Michelle Cohn, Melina Sarian, Georgia Zellou, Kristin Predeck

Description

This study investigates vocal alignment of emotional prosody. We ask three questions: 1) How does emotional expressiveness mediate vocal alignment? 2) Does vocal alignment of emotional expressiveness vary depending on whether the interlocutor is a human or an artificially intelligent device? 3) Does vocal alignment of emotional expressiveness vary based on participants’ cognitive characteristics? Participants (n=66) completed a word-shadowing experiment with 24 interjections (e.g., “Awesome”) produced in emotionally neutral and expressive prosodies, either by a real human or generated by a voice-activated artificially intelligent (voice-AI) system (Amazon’s Alexa). We assessed participants’ alignment toward the human/Alexa for three prosodic dimensions (duration, mean f0, and f0 variation). Our results show greater alignment toward emotionally expressive interjections than toward neutral productions, for both interlocutors, in duration and mean f0. The direction of the emotional expressiveness effect was similar for the Alexa and human model talkers, but its magnitude differed slightly by interlocutor (Alexa/human) and acoustic feature. Additionally, we conducted two individual-differences analyses to test whether the degree of emotional vocal alignment varied with a speaker’s extent of autistic-like traits (assessed by the Autism Quotient, AQ) or depressive characteristics (assessed by the PHQ-9), two types of variation shown to affect emotion perception and interpersonal communication. For both AQ and PHQ-9, we observed distinct patterns: a higher AQ score (i.e., greater autistic-like traits) was linked to less emotional alignment toward the human voice, while a higher PHQ-9 score (i.e., greater depressive characteristics) was linked to greater emotional vocal alignment overall. Taken together, these findings are relevant to the role of emotion in theories of speech communication, to models of human vocal alignment, and to technology personification frameworks.
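
For orientation, the sketch below shows one way the three prosodic dimensions named above (duration, mean f0, and f0 variation) could be measured from a recording, and how a difference-in-distance (DID) score, a common operationalization of vocal alignment, could be computed from them. This is an illustrative assumption, not the authors’ analysis pipeline; the file names are hypothetical, and it uses the parselmouth interface to Praat.

```python
# Minimal sketch (not the authors' pipeline): measure duration, mean f0, and
# f0 variation (SD) from a WAV file with Praat via parselmouth, and compute a
# simple difference-in-distance (DID) alignment score. File names are
# illustrative placeholders.
import numpy as np
import parselmouth

def prosodic_measures(wav_path):
    """Return duration (s), mean f0 (Hz), and f0 variation (SD, Hz)."""
    sound = parselmouth.Sound(wav_path)
    pitch = sound.to_pitch()                 # default Praat pitch settings
    f0 = pitch.selected_array['frequency']
    f0 = f0[f0 > 0]                          # drop unvoiced frames (f0 == 0)
    return sound.get_total_duration(), float(np.mean(f0)), float(np.std(f0))

def did_alignment(baseline, shadowed, model):
    """Difference-in-distance: positive values indicate convergence toward the model."""
    return abs(baseline - model) - abs(shadowed - model)

# Hypothetical usage: compare a participant's pre-exposure and shadowed tokens
# of "Awesome" against the Alexa model talker's token.
base_dur, base_f0, _ = prosodic_measures("participant_baseline_awesome.wav")
shad_dur, shad_f0, _ = prosodic_measures("participant_shadowed_awesome.wav")
model_dur, model_f0, _ = prosodic_measures("alexa_expressive_awesome.wav")

print("f0 alignment (Hz):", did_alignment(base_f0, shad_f0, model_f0))
print("duration alignment (s):", did_alignment(base_dur, shad_dur, model_dur))
```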

Files

Institutions

University of California Davis

Categories

Emotion, Human-Computer Interaction, Laboratory Phonetics

Licence