
 
 
Research on Fine-grained Text Similarity Detection for Research Papers via Rhetorical Structure Theory
XU Fan 1 * #, ZHU Qiaoming 2, LI Peifeng 2
1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006
2.School of Computer Science and Technology, Soochow University
*Correspondence author
#Submitted by
Subject:
Funding: Special Fund for Fast Sharing of Science Paper in Net Era by CSTD (No. 20103201110021)
Opened online: 6 June 2012
Accepted by: none
Citation: XU Fan, ZHU Qiaoming, LI Peifeng. Research on Fine-grained Text Similarity Detection for Research Papers via Rhetorical Structure Theory [OL]. [6 June 2012]. http://en.paper.edu.cn/en_releasepaper/content/4480267
 
 
Text similarity detection is an important task in natural language processing (NLP), yet so far it has been investigated only from a coarse-grained perspective. To the best of our knowledge, this is the first paper to propose using fine-grained analysis and discourse tree technology to detect similarity between research papers. Specifically, we present a two-stage text similarity detection framework. In the first stage, we automatically classify the type of each sentence in a text using machine learning. In our 10-fold cross-validation experiment, accuracy and F1-measure are significantly improved. In the second stage, we build a discourse tree for each research paper using Rhetorical Structure Theory (RST). We employ a "Failure tree" data structure to represent the final similarity coefficient of the tree, and verify its effectiveness experimentally.
Keywords: Fine-grained; Similarity; Research papers; Rhetorical Structure Theory
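
As a rough illustration of the two-stage idea in the abstract, the Python sketch below labels each sentence with a type and then compares two labeled trees recursively. Everything in it is an assumption made for illustration and is not taken from the paper: the cue-word lookup stands in for the trained machine learning classifier, and the `Node` type, the label set, the flat one-level tree, and the averaging rule for the similarity coefficient are all hypothetical. It does not implement RST parsing or the "Failure tree" structure.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical cue words per sentence type (assumption); the paper
# trains a classifier rather than matching fixed phrases.
CUES = {
    "background": ("important", "so far", "previous"),
    "method": ("we present", "we employ", "we create", "we propose"),
    "result": ("experiment", "accuracy", "improved"),
}


def classify_sentence(sentence: str) -> str:
    """Stage 1 stand-in: return the first type whose cue occurs in the sentence."""
    lowered = sentence.lower()
    for label, cues in CUES.items():
        if any(cue in lowered for cue in cues):
            return label
    return "other"


@dataclass
class Node:
    """A tree node carrying a type label, optional text, and children."""
    label: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def build_tree(sentences: List[str]) -> Node:
    """Build a flat, one-level tree: a root with one labeled leaf per sentence."""
    leaves = [Node(classify_sentence(s), s) for s in sentences]
    return Node("paper", children=leaves)


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of the lowercase word sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def tree_similarity(a: Node, b: Node) -> float:
    """Stage 2 stand-in: recursive similarity coefficient in [0, 1].

    Nodes with different labels score 0. Two leaves score their word
    overlap. Otherwise children are paired in order and the scores are
    averaged over the larger child count, so unmatched children lower
    the result.
    """
    if a.label != b.label:
        return 0.0
    if not a.children and not b.children:
        return token_overlap(a.text, b.text)
    total = sum(tree_similarity(x, y) for x, y in zip(a.children, b.children))
    return total / max(len(a.children), len(b.children))
```

Calling `tree_similarity(build_tree(sentences_a), build_tree(sentences_b))` on two lists of sentences yields a single coefficient. Pairing children by position is a deliberate simplification; an alignment step would be needed for papers whose sections differ in order or length.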
 
 
 
