Recent studies have shown that convolutional neural networks (CNNs) perform well at modeling the long-range dependencies of speech sequences in the time domain. Stacked dilated convolutions are commonly used to enlarge the receptive field of the network efficiently. However, as the dilation rate grows in higher layers, the distance between the feature points mapped back to the previous layer also grows, which easily causes the short-range information between adjacent feature points to be neglected. This paper proposes a plug-and-play inverted residual and linear bottleneck module, called the combined convolution (CB-Conv) module, which aims to recover this short-range information between feature points. The main body of the CB-Conv module consists of two parallel convolution blocks: a standard dilated convolution block and an aggregation convolution block. The latter aggregates the details lost between adjacent points through a pooling layer and fuses its output with that of the dilated convolution block to complete the feature extraction. Experimental results on the TIMIT dataset show that, within the TasNet framework and with the same number of stacked main modules, the proposed module achieves a 1.04 dB SI-SNR gain over the Conv-TasNet baseline.
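The receptive-field arithmetic motivating the module can be sketched in plain Python. With stride 1, each dilated layer of kernel size k and dilation d widens the receptive field by (k - 1) * d, while adjacent kernel taps skip d - 1 intermediate samples, which is the short-range information the aggregation branch is meant to recover. The kernel size and dilation schedule below are illustrative assumptions, not values taken from the paper:

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked dilated 1-D convolutions, stride 1.

    Each layer with kernel size k and dilation d adds (k - 1) * d samples.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def tap_gap(dilation):
    """Samples skipped between adjacent kernel taps at a given dilation."""
    return dilation - 1


# Illustrative Conv-TasNet-style exponential dilation schedule: 1, 2, 4, ..., 128.
dilations = [2 ** i for i in range(8)]
print(receptive_field(3, dilations))  # 1 + 2 * 255 = 511
print(tap_gap(dilations[-1]))         # 127 samples skipped in the top layer
```

The growing gap between taps in the top layers is what makes a parallel short-range (e.g. pooling-based) branch attractive.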
Keywords: speech enhancement; combined convolution; monaural