Impact of perturbation and value of fundamental frequency on the sound quality of electrolaryngeal speech.

Published: 8 November 2021 | Version 1 | DOI: 10.17632/rnrpz5g8m5.1
Giovanna Castilho Davatz


The electrolarynx is a device used to replace the glottal source in cases of irreversible loss of the natural voice. However, its sound is perceived as robotic, which the literature attributes to the absence of cycle-to-cycle perturbation. Furthermore, low fundamental frequency (f0) values represent an additional challenge, especially for women. This research therefore aimed to verify the impact of perturbation and f0 value on the sound quality of electrolaryngeal speech. Three studies were carried out. Study 1, cross-sectional, obtained the human perturbation and the acoustic measurements of source (f0) and filter (first to fourth formants, F1-F4, and their bandwidths, B1-B4) required for the synthesis experiment. For this purpose, recordings of the sustained vowel /a/ produced by 162 laryngeal speakers were analyzed: 78 women and 84 men, young (18-44 years old), middle-aged (45-59 years old) and elderly (60-80 years old). Study 2, quasi-experimental, synthesized 24 sustained /a/ vowels to verify, through blind auditory-perceptual judgment, the degree of naturalness provided by two mathematical patterns of f0 perturbation, (I) random with uniform distribution and (II) second order plus randomness, using the natural perturbation extracted from recorded human voices as the control and the absence of perturbation as a placebo. Study 3, quasi-experimental, was a blind comparative analysis of the electrolaryngeal speech of 10 total laryngectomy patients (1 woman and 9 men) produced with Conventional equipment (available on the national market) and Modified equipment (with random perturbation and f0 set to the sample mean for the patient's sex and age group); the comparison was performed by the patients themselves, in addition to an auditory-perceptual judgment performed by Speech-Language Pathologists.
The comparative analysis of the vowels synthesized from the mean f0, F1, F2, F3, F4, B1, B2, B3 and B4 of women and men of different age groups, with different perturbation patterns, indicated that the second-order model provided a degree of naturalness similar to that of human perturbation. Although the random pattern did not perform as well, it came close to the natural perturbation in the voice synthesized with the parameters of elderly speakers. Because of the characteristics of the electrolarynx circuits, it was not possible to implement the second-order pattern, so the equipment was adjusted with random perturbation. Regarding the resulting sounds, the laryngectomized woman preferred the modified equipment; among the men, 4 preferred the modified, 1 preferred the conventional and 4 reported no difference. In the auditory-perceptual judgment performed by the Speech-Language Pathologists, the modified equipment was rated better for 4 patients (including the woman), the conventional was rated better for 4, and no difference was found for 2. Given these findings, it was concluded that there were no relevant improvements in electrolaryngeal sound quality, possibly due to other aberrant sound characteristics of the equipment.
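
For readers who want an intuition for the two f0 perturbation patterns named in Study 2, the following is a minimal Python sketch, not the author's implementation. The abstract does not give the equations, so both forms are assumptions: pattern (I) is taken to be an independent uniform deviation of each glottal period, and pattern (II) is taken to be a second-order autoregressive recursion driven by uniform noise. The function names, the coefficients `a1` and `a2`, the 100 Hz f0 and the 1% deviation are hypothetical placeholders, not values from the dataset.

```python
import random


def uniform_jitter_periods(f0_hz, n_cycles, jitter_frac, rng):
    """Pattern I (assumed form): each period deviates independently from the
    nominal period by a uniform fraction in [-jitter_frac, +jitter_frac]."""
    t0 = 1.0 / f0_hz
    return [t0 * (1.0 + rng.uniform(-jitter_frac, jitter_frac))
            for _ in range(n_cycles)]


def second_order_jitter_periods(f0_hz, n_cycles, jitter_frac, rng,
                                a1=1.2, a2=-0.5):
    """Pattern II (assumed form): the relative deviation follows a stable
    second-order recursion d[n] = a1*d[n-1] + a2*d[n-2] + noise, so that
    successive cycles are correlated. a1 and a2 are hypothetical."""
    t0 = 1.0 / f0_hz
    d1 = d2 = 0.0  # the two previous relative deviations
    periods = []
    for _ in range(n_cycles):
        d = a1 * d1 + a2 * d2 + rng.uniform(-jitter_frac, jitter_frac)
        periods.append(t0 * (1.0 + d))
        d2, d1 = d1, d
    return periods


def local_jitter_percent(periods):
    """Mean absolute difference between consecutive periods, divided by the
    mean period, expressed in percent."""
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_period = sum(periods) / len(periods)
    return 100.0 * (sum(diffs) / len(diffs)) / mean_period


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    flat = [1.0 / 100.0] * 200  # no perturbation (the placebo condition)
    p1 = uniform_jitter_periods(100.0, 200, 0.01, rng)
    p2 = second_order_jitter_periods(100.0, 200, 0.01, rng)
    for name, p in [("none", flat), ("uniform", p1), ("second-order", p2)]:
        print(name, round(local_jitter_percent(p), 3))
```

A perturbation-free period sequence yields a local jitter of zero, which corresponds to the placebo condition; the two perturbed sequences differ mainly in whether consecutive cycles are correlated.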


Oncology, Speech Processing, Voice Input, Voice Output, Human Voice