Application of Omnidirectional Recording Devices in Computer Music Recording: Deep Learning Algorithms for Multi-channel Audio Processing

Published: 25 April 2024 | Version 1 | DOI: 10.17632/4jhc594ckt.1
Hengyang Xu


This study uses omnidirectional recording equipment for audio collection and applies a deep learning (DL) algorithm to improve sound processing in computer music post-production. In the audio acquisition stage, multichannel audio signals are captured with omnidirectional recording equipment. A novel multichannel deep spectral clustering (M-DeepSC) algorithm is then proposed, which integrates DeepSC with multichannel processing techniques to separate distinct sound sources efficiently in the time-frequency domain. An adaptive learning rate and an iterative weight adjustment mechanism are introduced to optimize the algorithm's performance, enabling highly accurate localization and separation of the sound elements in a recording. In the experiments, M-DeepSC is compared with traditional DL algorithms and the spectral clustering (SC) algorithm. For sound separation, the Convolutional Neural Network (CNN) achieves an accuracy of 79.2%, the Recurrent Neural Network (RNN) 78.4%, and the SC algorithm 75.2%. With M-DeepSC, the accuracy is close to 100%, more than 20 percentage points higher than the other algorithms. The spectral fidelity of the CNN, RNN, and SC algorithms is 84.9%, 82.1%, and 80.3%, respectively, whereas M-DeepSC reaches nearly 100%, an increase of over 15 percentage points. With M-DeepSC, the signal-to-noise ratio (SNR) improves by 30.1%, reaching 26 dB. The study concludes that the DL algorithm and M-DeepSC, combined with omnidirectional recording equipment, represent a higher level of technical complexity and processing efficacy in the post-production of computer music, thereby introducing novel possibilities to the domain of music production.
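The figures quoted in the description can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the dataset. It rests on two assumptions that the description does not confirm: that "close to 100%" can be treated as an upper bound of exactly 100, and that the 30.1% SNR improvement is relative to a baseline SNR expressed in dB. All variable and function names are hypothetical.

```python
# Figures quoted in the description (percent).
separation_accuracy = {"CNN": 79.2, "RNN": 78.4, "SC": 75.2}
spectral_fidelity = {"CNN": 84.9, "RNN": 82.1, "SC": 80.3}

# Assumption: "close to 100%" is treated as an upper bound of exactly 100.
M_DEEPSC_UPPER_BOUND = 100.0


def gaps(baselines, reference=M_DEEPSC_UPPER_BOUND):
    """Return the percentage-point gap between the reference and each baseline."""
    return {name: round(reference - value, 1) for name, value in baselines.items()}


accuracy_gaps = gaps(separation_accuracy)
fidelity_gaps = gaps(spectral_fidelity)

# Assumption: the 30.1% improvement is relative to a baseline SNR in dB,
# so baseline * (1 + 0.301) = 26 dB.
final_snr_db = 26.0
relative_gain = 0.301
implied_baseline_snr_db = final_snr_db / (1.0 + relative_gain)

print("Accuracy gaps (percentage points):", accuracy_gaps)
print("Fidelity gaps (percentage points):", fidelity_gaps)
print(f"Implied baseline SNR: {implied_baseline_snr_db:.2f} dB")
```

Under these assumptions, every accuracy gap exceeds 20 percentage points and every fidelity gap exceeds 15, which matches the stated margins, and the implied baseline SNR comes out at roughly 20 dB.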



Audio Recording