|
Recognizing the emotion embedded in a video provides another way to classify media and to supply the videos that users actually want; effective techniques for video emotion recognition are therefore in high demand. This paper proposes a novel framework for video emotion recognition that integrates textual features extracted from video subtitles with the audio and visual features embedded in the video content. First, high-level dialogic semantic features are extracted from video subtitles via Natural Language Processing (NLP) techniques. These semantic features can represent emotion information by analyzing the concepts in video dialogs rather than simply analyzing individual words. It is also more practical to extract such high-level features from a large number of videos than to collect physiological signals from participants for implicit tagging. Second, a multimodal Deep Boltzmann Machine (DBM) is adopted to learn a joint representation from the audio, visual, and textual semantic features. Since dialogs or subtitles may be absent from some videos, the model can also infer the joint representation without the textual modality. Finally, the joint representations are fed into a Support Vector Machine (SVM) for video emotion classification and regression. Experimental results on an open database demonstrate the effectiveness of the framework.
|
Keywords: Affective computing; video emotion recognition; dialogic semantics; multimodal DBM
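
As a rough illustration of the fusion-and-classification stage described above, the sketch below is a minimal approximation, not the paper's implementation: the true model is a multimodal DBM, whereas here each modality is encoded with a single BernoulliRBM layer and the hidden activations are concatenated as a crude stand-in for the joint representation before SVM classification. All feature dimensions, the number of emotion classes, and the data itself are synthetic placeholders.

```python
# Sketch of multimodal fusion + SVM classification. A single RBM per
# modality approximates the multimodal DBM (assumption, not the paper's
# architecture); features and labels are synthetic.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_videos = 200

# Placeholder per-modality features; real ones would come from audio,
# visual, and subtitle-NLP pipelines.
audio = rng.random((n_videos, 64))
visual = rng.random((n_videos, 128))
text = rng.random((n_videos, 32))
labels = rng.integers(0, 4, size=n_videos)  # e.g., 4 emotion classes

def modality_codes(X, n_hidden):
    """Scale features to [0, 1] and encode them with one RBM layer."""
    X01 = MinMaxScaler().fit_transform(X)
    rbm = BernoulliRBM(n_components=n_hidden, learning_rate=0.05,
                       n_iter=20, random_state=0)
    return rbm.fit_transform(X01)

# Crude "joint representation": concatenated per-modality hidden codes.
joint = np.hstack([modality_codes(audio, 32),
                   modality_codes(visual, 32),
                   modality_codes(text, 16)])

clf = SVC(kernel="rbf").fit(joint, labels)
print("train accuracy:", clf.score(joint, labels))
```

A genuine multimodal DBM would additionally learn a shared top layer over all modality pathways, which is what lets the paper's model infer the joint representation when the textual pathway is missing; the concatenation shortcut above has no such inference mechanism.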
|