Speech Production Dataset for Canavesano Piedmontese

Published: 26 June 2024 | Version 1 | DOI: 10.17632/87djnsc8n5.1
Alec Gallo


This study presents the first comprehensive acoustic analysis of the vowel system in Canavesano Piedmontese, an endangered variety of Piedmontese spoken in the Canavese subregion of northwest Italy. Nineteen native speakers aged 57-83 participated in sociolinguistic interviews, generating a dataset of 2426 tokens analyzed for formant frequencies (F1 and F2). Results confirm the presence of nine vowels /a, ɛ, e, ø, ə, y, o, u, i/, including the mid-central vowel /ə/, which exhibits distinct acoustic properties compared to surrounding vowels. Despite individual variability among speakers, the community-level analysis reveals a stable vowel system similar to nearby Turinese Piedmontese. These findings contribute to the documentation and preservation of Canavesano Piedmontese, highlighting the importance of acoustic phonetic studies in understanding and conserving endangered languages.


Steps to reproduce

Data were collected in informal sociolinguistic interviews built around a picture-naming task (Gallo, 2024). Participants were asked to name 93 pictures shown on a computer screen in front of them. A total of 2565 tokens (15 tokens × 9 vowels × 19 participants) were recorded for the current research (see Supplementary Table 2). Speech was recorded with a smartphone's internal microphone and audio recorder at a 44.1 kHz sampling rate and 16-bit resolution. The device was placed on the table in front of the participant, a few centimeters from the center of their chest and approximately 15-20 cm below their mouth. The entire interview was conducted in Canavesano Piedmontese. Recordings lasted 7.39 minutes on average, ranging from 4.36 min to 11.11 min (SD = 2.05).
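
The token arithmetic and recording format described above can be checked with a short script. The following is a minimal Python sketch, not part of the published dataset: the function names are hypothetical, and the format check assumes the audio is available as uncompressed WAV files, which the description does not confirm. The figure of 2426 analyzed tokens comes from the abstract; the description does not say why it differs from the recorded total.

```python
import wave

# Design figures taken from the dataset description.
TOKENS_PER_VOWEL = 15
VOWELS = ["a", "ɛ", "e", "ø", "ə", "y", "o", "u", "i"]
PARTICIPANTS = 19
ANALYZED_TOKENS = 2426  # reported in the abstract


def expected_token_count(tokens_per_vowel, n_vowels, n_participants):
    """Return the number of tokens implied by a fully crossed design."""
    return tokens_per_vowel * n_vowels * n_participants


def matches_recording_format(path, rate=44100, sample_width_bytes=2):
    """Check a WAV file against the stated 44.1 kHz, 16-bit format.

    Assumption: the file at `path` is an uncompressed WAV file.
    A sample width of 2 bytes corresponds to 16-bit resolution.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == rate
            and wav.getsampwidth() == sample_width_bytes
        )


recorded = expected_token_count(TOKENS_PER_VOWEL, len(VOWELS), PARTICIPANTS)
not_analyzed = recorded - ANALYZED_TOKENS
print(recorded)      # 2565
print(not_analyzed)  # 139
```

Calling `matches_recording_format("some_recording.wav")` on a local file (the file name here is a placeholder) returns `True` only when both the sampling rate and the bit depth match the values given in the description.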


Universitat Konstanz, University of Texas at Austin, Universidad del Pais Vasco - Campus de Alava


Language Production