Sponsored by the Center for Science and Technology Development of the Ministry of Education
Supervised by the Ministry of Education of the People's Republic of China
Sound event detection (SED) is the task of detecting the onsets and offsets of sound events in an audio recording. SED with weakly labelled data requires only the event category labels present in a recording, not their onset and offset times, which significantly reduces the cost of labelling data. Because a weakly labelled SED system cannot know the frame-level activity of each event during training, it must capture more temporal information so that predictions for frames containing the same event are as similar as possible. Drawing inspiration from unsupervised deep clustering, which lets a network learn more similar feature representations for elements of the same class, this study proposes jointly training neural networks with an auxiliary clustering loss. The proposed joint training method improves the performance of multiple systems on the DCASE 2017 Task 4 dataset and achieves a state-of-the-art F1 score for sound event detection.
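The joint training idea described above can be sketched as a weighted sum of a weak-label classification loss and an auxiliary clustering loss on frame embeddings. The following is a minimal numpy illustration, not the paper's actual method: the function names (`weak_bce_loss`, `clustering_loss`, `joint_loss`), the mean-pooling aggregation, the k-means-style clustering objective, and the weight `lam` are all illustrative assumptions, since the abstract does not specify the network or the exact clustering formulation.

```python
import numpy as np

def weak_bce_loss(frame_probs, clip_labels):
    # Aggregate frame-level probabilities (T x C) to one clip-level
    # prediction by mean pooling, then apply binary cross-entropy
    # against the weak (clip-level) labels, as in typical weakly
    # labelled SED training.
    clip_probs = frame_probs.mean(axis=0)
    eps = 1e-7
    return -np.mean(clip_labels * np.log(clip_probs + eps)
                    + (1 - clip_labels) * np.log(1 - clip_probs + eps))

def clustering_loss(embeddings, n_clusters=2, n_iters=10, seed=0):
    # Illustrative auxiliary loss: run a few k-means steps on the frame
    # embeddings (T x D) and penalise within-cluster scatter, pulling
    # frames of the same (latent) event towards similar representations.
    rng = np.random.default_rng(seed)
    centroids = embeddings[rng.choice(len(embeddings), n_clusters,
                                      replace=False)]
    for _ in range(n_iters):
        dists = np.linalg.norm(embeddings[:, None] - centroids[None],
                               axis=-1)
        assign = dists.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(assign == k):
                centroids[k] = embeddings[assign == k].mean(axis=0)
    residual = np.linalg.norm(embeddings - centroids[assign], axis=-1)
    return np.mean(residual ** 2)

def joint_loss(frame_probs, clip_labels, embeddings, lam=0.1):
    # Joint objective: classification loss plus lam-weighted
    # clustering loss (lam is an assumed hyper-parameter).
    return weak_bce_loss(frame_probs, clip_labels) \
        + lam * clustering_loss(embeddings)
```

In an actual system the clustering term would be computed on learnable embeddings inside the network and back-propagated jointly with the classification loss; the sketch only shows how the two terms combine into one objective.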
Keywords: Signal and Information Processing; sound event detection; deep clustering; clustering loss