Using Graph Sampling and Aggregation to Refine Speaker Embeddings in Speaker Diarization

HE Shuyi; WANG Lei


Sponsored by the Center for Science and Technology Development of the Ministry of Education
Supervised by Ministry of Education of the People's Republic of China

HE Shuyi, WANG Lei *

Department of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876

*Corresponding author

Funding: none

Opened online: 25 March 2022

Accepted by: none

Citation: HE Shuyi, WANG Lei. Using Graph Sampling and Aggregation to Refine Speaker Embeddings in Speaker Diarization [OL]. [25 March 2022]. http://en.paper.edu.cn/en_releasepaper/content/4756829

Deep neural networks are now commonly used to extract speaker embeddings, such as x-vectors and d-vectors, which are then combined with clustering to build a speaker segmentation system. The robustness of the speaker embeddings determines the performance of such a system. Recently, ECAPA-TDNN embeddings have outperformed x-vectors in speaker classification systems. In this paper, the embeddings extracted from each session are converted into a graph: each embedding is a node, and two nodes are connected when their similarity exceeds a set threshold. Features are then sampled and aggregated from the local neighborhood of each node, so that the structural information in the graph is used to reconstruct new speaker embeddings for each session through supervised learning. The refined embeddings are then clustered with spectral clustering to perform speaker segmentation. The proposed system achieves state-of-the-art results on the AMI dataset.
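The graph construction and neighborhood aggregation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: cosine similarity, the threshold value, the number of sampled neighbors, and the function names `build_similarity_graph` and `sample_and_aggregate` are all assumptions chosen for illustration, and the learned, supervised aggregation weights of the paper are replaced here by a plain untrained mean.

```python
import numpy as np


def build_similarity_graph(embeddings, threshold=0.5):
    """Return a boolean adjacency matrix linking embeddings whose cosine
    similarity exceeds `threshold` (the similarity measure is an assumption)."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    similarity = unit @ unit.T
    adjacency = similarity > threshold
    np.fill_diagonal(adjacency, False)  # no self-loops
    return adjacency


def sample_and_aggregate(embeddings, adjacency, num_samples=5, seed=0):
    """One untrained aggregation step: sample up to `num_samples` neighbors
    per node, average them, and mix the result with the node's own embedding.
    The paper learns this step with supervision; here it is a fixed mean."""
    rng = np.random.default_rng(seed)
    embeddings = np.asarray(embeddings, dtype=float)
    refined = np.empty_like(embeddings)
    for i in range(len(embeddings)):
        neighbors = np.flatnonzero(adjacency[i])
        if neighbors.size > num_samples:
            neighbors = rng.choice(neighbors, size=num_samples, replace=False)
        if neighbors.size == 0:
            combined = embeddings[i]  # isolated node keeps its embedding
        else:
            combined = 0.5 * (embeddings[i] + embeddings[neighbors].mean(axis=0))
        # L2-normalize so the refined embeddings are comparable by cosine
        refined[i] = combined / max(np.linalg.norm(combined), 1e-12)
    return refined


if __name__ == "__main__":
    # Toy session: two segments per speaker, two speakers (made-up values).
    session = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
    graph = build_similarity_graph(session, threshold=0.5)
    print(sample_and_aggregate(session, graph))
```

In this sketch each refined embedding is pulled toward the mean of its graph neighbors, which is the intuition behind using local structure before clustering; a trained model would replace the fixed mean with learned weights.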

Keywords: Signal and Information Processing; Speaker Diarization; Graph Neural Network; Clustering
