|
With the rapid development of information technology, digital music has become an essential carrier of music communication. Speech enhancement has been studied extensively and has recently achieved good performance, whereas music denoising has received comparatively little attention. In music information retrieval (MIR) applications such as automatic music transcription and music alignment, various kinds of noise can degrade the performance of retrieval algorithms. In this paper, we present end-to-end approaches based on high-resolution representations that require no manual feature extraction. The U-Net model cascades neural networks operating at different resolutions, while the HRNet model fuses high-to-low resolutions to extract high-resolution features. To address the lack of a dedicated music denoising database, we mix the music database MusicNet with the noise database NoiseX-92 at five different signal-to-noise ratios. Measured by the source-to-distortion ratio (SDR), the HRNet model outperforms the U-Net model. The results suggest that the resolution fusion method of HRNet is better suited to extracting high-resolution features for noise reduction. |
|
Keywords: Signal and information processing, music denoising, high-resolution representations, convolutional neural networks. |
|
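The data construction step described in the abstract, mixing clean music with noise at a chosen signal-to-noise ratio, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `mix_at_snr` and `measured_snr_db`, the looping of short noise clips, and the example SNR value are all assumptions made for illustration. The abstract does not state the five SNR values, so none are assumed here.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean signal at a target SNR in dB (hypothetical helper)."""
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Assumption: loop the noise clip if it is shorter than the music,
    # then trim it to the same length.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]

    # Average power of each signal.
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)

    # Choose the gain so that 10*log10(p_clean / (gain**2 * p_noise)) == snr_db.
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise


def measured_snr_db(clean, mixture):
    """SNR of a mixture relative to its known clean signal, in dB."""
    clean = np.asarray(clean, dtype=np.float64)
    residual = np.asarray(mixture, dtype=np.float64) - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    music = np.sin(2 * np.pi * 440.0 * t)   # stand-in for a MusicNet excerpt
    noise = rng.standard_normal(4000)       # stand-in for a noise recording
    noisy = mix_at_snr(music, noise, snr_db=5.0)  # 5 dB is an arbitrary example
    print(round(measured_snr_db(music, noisy), 3))
```

Note that `measured_snr_db` only checks the mixing gain. It is not the source-to-distortion ratio used for evaluation, which is normally computed with a dedicated source-separation evaluation toolkit.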