Video instance-level human parsing enables applications such as background replacement, adding decorations, and scaling human body parts. In this paper, we unify the spatial features of the current frame and the temporal features of the previous k frames into a single network, and propose a Multi Frame Propagation Net (MFPNet) for this task. The main contributions are as follows. First, we propose two blocks: Position-Squeeze-and-Excitation (P-SE) and the Global Attention Module (GAM). P-SE applies the idea of Squeeze-and-Excitation (SE) to spatial locations, learning a spatial attention map that represents how strongly body parts are correlated. GAM combines SE and P-SE to extract globally structured features. Second, we propose a propagation module to capture temporal features across video frames. The module consists of a 3D convolution and a Convolutional Gated Recurrent Unit (ConvGRU): the 3D convolution extracts spatiotemporal features from consecutive frames, and the ConvGRU further aggregates temporal information. Third, MFPNet achieves state-of-the-art results on the Video Instance-level Parsing (VIP) dataset.
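To make the attention blocks concrete, here is a minimal PyTorch sketch of one way P-SE and GAM could be realized. The abstract only states that P-SE applies SE over spatial locations and that GAM combines SE and P-SE, so the layer sizes, the 1x1-convolution squeeze, and the additive fusion below are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SE(nn.Module):
    """Channel Squeeze-and-Excitation: pool spatially, re-weight channels."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # channel-wise re-weighting

class PSE(nn.Module):
    """Position-SE (assumed form): squeeze channels into one spatial attention map."""
    def __init__(self, channels):
        super().__init__()
        self.squeeze = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):
        attn = torch.sigmoid(self.squeeze(x))  # (B, 1, H, W) spatial map
        return x * attn  # broadcast over all channels

class GAM(nn.Module):
    """Global Attention Module: SE and P-SE combined (additive fusion assumed)."""
    def __init__(self, channels):
        super().__init__()
        self.se = SE(channels)
        self.pse = PSE(channels)

    def forward(self, x):
        return self.se(x) + self.pse(x)
```

For example, `GAM(64)(torch.randn(2, 64, 32, 32))` returns a tensor of the same shape, with channels re-weighted by SE and spatial positions re-weighted by P-SE.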
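In the same spirit, the following is a sketch of the propagation module, assuming the 3D convolution runs over the stacked features of the previous k frames and its per-frame outputs are then stepped through a single-layer ConvGRU; the kernel sizes, channel widths, and this exact wiring are hypothetical.

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Minimal convolutional GRU cell over 2D feature maps."""
    def __init__(self, in_ch, hid_ch, kernel_size=3):
        super().__init__()
        p = kernel_size // 2
        self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, kernel_size, padding=p)
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, kernel_size, padding=p)
        self.hid_ch = hid_ch

    def forward(self, x, h=None):
        if h is None:
            h = x.new_zeros(x.size(0), self.hid_ch, x.size(2), x.size(3))
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        n = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * n  # gated update of the hidden state

class PropagationModule(nn.Module):
    """Assumed wiring: 3D conv over k frames, then a ConvGRU stepped through time."""
    def __init__(self, in_ch, hid_ch):
        super().__init__()
        self.conv3d = nn.Conv3d(in_ch, hid_ch, kernel_size=3, padding=1)
        self.gru = ConvGRUCell(hid_ch, hid_ch)

    def forward(self, frames):
        # frames: (B, C, k, H, W) features of the previous k frames
        x = torch.relu(self.conv3d(frames))  # spatiotemporal features, same k
        h = None
        for t in range(x.size(2)):
            h = self.gru(x[:, :, t], h)  # aggregate temporal information
        return h  # (B, hid_ch, H, W) propagated temporal feature
```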
Keywords: Artificial Intelligence; Video Instance-level Parsing; Global Attention Module