The Effects of Context and Prosody on the Perception of Verbal Sarcasm

Published: 30 November 2022 | Version 1 | DOI: 10.17632/c8ftxyyh36.1
Zhan Wang


Two experiments were conducted with Chinese materials in the current study. In the first part of Experiment 1, a 24-year-old native Chinese speaker recorded 42 spoken sentences (21 with sincere prosody and 21 with sarcastic prosody). Using voice processing software, we extracted or calculated seven acoustic parameters of these 42 sentences: mean intensity, mean pitch, intensity SD, pitch SD, intensity range, pitch range, and tempo (see the file named Experiment 1 Part 1). The results demonstrated that sarcastic prosody and sincere prosody were distinguishable by these seven acoustic features.

In the second part of Experiment 1, twenty-eight participants rated the recorded experimental stimuli. After listening to each target sentence, participants rated on a 7-point Likert scale the extent to which they thought the intention of the sound excerpt was sincere or sarcastic (1 = very sincere, 2 = sincere, 3 = a little sincere, 4 = neutral, 5 = a little sarcastic, 6 = sarcastic, 7 = very sarcastic). The forty-two target sentences were played in audio form in random order. The data (see the file named Experiment 1 Part 2) indicated that listeners could distinguish a literal sincere meaning from an implied sarcastic meaning on the basis of prosody alone.

Experiment 2 further explored the interplay of context and prosody. The materials were the 42 target sentences and 42 contexts written, recorded, and validated in Experiment 1. A single trial consisted of a situational context followed by a relevant target sentence. There were 21 contextual scenarios and 21 target sentences. Each contextual scenario had two versions: a negative context (leading to a “sarcastic” perception) and a positive context (leading to a “sincere” perception). Likewise, each target sentence had two prosodies: a negative prosody and a positive prosody. Thus, there were 84 combinations of context and target sentence (21 scenarios × 2 context versions × 2 prosodic versions of the target sentence). All materials were divided into four item pairs according to a Latin square design, and each participant completed one item pair. Each test consisted of 21 trials with four combined pairs. The four item pairs were formed by combining contexts (negative and positive) with prosodies (negative and positive): NN, PP, NP, and PN. We obtained a total of 1680 responses (21 trials × 80 participants), including choices and RTs. After invalid responses were rejected, 1520 responses (90.48% of the total) remained (see the file named Experiment 2).
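The counts reported for Experiment 2 can be checked with a short Python sketch. This is only an illustrative check, not part of the dataset or the authors' analysis. All variable names are assumptions, and the scenario numbering from 1 to 21 is hypothetical.

```python
from itertools import product

# Figures taken from the description; names are illustrative only.
N_SCENARIOS = 21                      # contextual scenarios / target sentences
CONTEXTS = ("negative", "positive")   # two versions of each scenario
PROSODIES = ("negative", "positive")  # two prosodies of each target sentence
N_PARTICIPANTS = 80
N_VALID = 1520                        # responses kept after rejecting invalid ones

# Enumerate every scenario x context x prosody combination.
combinations = list(product(range(1, N_SCENARIOS + 1), CONTEXTS, PROSODIES))
n_combinations = len(combinations)

# Each participant completed 21 trials.
total_responses = N_SCENARIOS * N_PARTICIPANTS

# Share of responses retained, as a percentage rounded to two decimals.
retained_pct = round(100 * N_VALID / total_responses, 2)

print(n_combinations, total_responses, retained_pct)
```

Run as written, the sketch reproduces the 84 combinations, the 1680 total responses, and the 90.48% retention rate stated in the description.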



Lanzhou University