Deep learning-based smart speaker to confirm surgical sites for cataract surgeries: A pilot study
Version 1.01
In the experiments, we sought to add further short words generated with various text-to-voice tools. The researchers recorded the target words, such as time-out, cataract, phacoemulsification, and intraocular lens, using the varying accents, speeds, and voice tones provided by the text-to-voice tools. Because the voice interface in most devices relies on keyword spotting to initiate interactions, “time-out” was assigned as the keyword that initiates the automated detection. The resulting dataset consists of different speakers saying the same words, and was used for training and validation. The Speech Commands dataset provides several basic noise recordings, including white noise, pink noise, exercise, and doing the dishes. Additional operating room noise, including vital-sign monitoring sounds and background sounds of surgery, was added to the noise database.
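The text does not state how the noise database was combined with the keyword recordings. One common approach in keyword-spotting pipelines is to overlay a random segment of background noise on each spoken clip at a chosen signal-to-noise ratio (SNR). The following is a minimal sketch under that assumption; the function names `mean_power` and `mix_at_snr` and the SNR parameter are hypothetical and are not taken from the study.

```python
import math
import random


def mean_power(samples):
    """Average squared amplitude of a sequence of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db, rng=None):
    """Overlay a random crop of `noise` on `speech` at `snr_db` decibels.

    Hypothetical helper: the source does not describe its mixing procedure.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    if rng is None:
        rng = random.Random()

    # Repeat the noise so it is at least as long as the speech clip.
    repeats = math.ceil(len(speech) / len(noise))
    noise = list(noise) * repeats

    # Pick a random window of noise with the same length as the speech.
    start = rng.randrange(len(noise) - len(speech) + 1)
    crop = noise[start:start + len(speech)]

    speech_power = mean_power(speech)
    noise_power = mean_power(crop)
    if speech_power == 0 or noise_power == 0:
        # Nothing to scale against; return the clip unchanged.
        return list(speech)

    # Scale the noise so that speech_power / scaled_noise_power = 10^(snr_db/10).
    scale = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, crop)]
```

For example, a recording of “time-out” could be mixed with a vital-sign monitoring recording at several SNR values to produce noisier training variants of the same keyword, if the pipeline follows this pattern.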