|
Keyword extraction has become an active research problem in Natural Language Processing (NLP) and Information Retrieval. Many language tasks depend on it, including long-text classification, automatic summarization, machine translation, and dialogue systems. In this paper, we design a keyword extraction algorithm that combines the benefits of memorization and generalization. Our model consists of a linear model and a deep neural network. The linear model learns the relationship between statistical features and keywords, making full use of the memorization capability of a shallow model. In the deep component, we feed the projection vectors of the words in the text to the deep neural network, which enhances the generalization ability of the model. By jointly training the linear model and the deep neural network, our model achieves higher accuracy and scalability. We compare our method with three baselines: Frequency, Term Frequency-Inverse Document Frequency (TF-IDF), and TextRank. On the same test set, our model outperforms these baselines in Precision, Recall, and F-score. |
|
Keywords: keyword extraction; deep learning; joint training |
|
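To make the joint design described in the abstract concrete, the following is a minimal sketch of how a linear ("wide") component over statistical features and a small neural ("deep") component over a word vector can share a single output and be trained together. It is not the authors' implementation: the class name `WideAndDeepScorer`, the single hidden layer, the layer sizes, the learning rate, and the toy data are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class WideAndDeepScorer:
    """Hypothetical joint keyword scorer (illustrative names and sizes).

    wide part: linear weights over statistical features (e.g. TF, TF-IDF)
    deep part: one ReLU hidden layer over a word vector
    Both parts feed one logit, so a single loss updates them jointly.
    """

    def __init__(self, n_stat, n_embed, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w_wide = [0.0] * n_stat
        self.W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_embed)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w_deep = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.bias = 0.0

    def _hidden(self, embed):
        # ReLU(W1 @ embed + b1)
        return [max(0.0, sum(w * e for w, e in zip(row, embed)) + b)
                for row, b in zip(self.W1, self.b1)]

    def _logit(self, stat, hidden):
        wide = sum(w * x for w, x in zip(self.w_wide, stat))
        deep = sum(w * a for w, a in zip(self.w_deep, hidden))
        return wide + deep + self.bias

    def predict(self, stat, embed):
        """Probability that the word is a keyword."""
        return sigmoid(self._logit(stat, self._hidden(embed)))

    def train_step(self, stat, embed, label, lr=0.5):
        """One SGD step on the log loss; updates wide and deep parts together."""
        hidden = self._hidden(embed)
        p = sigmoid(self._logit(stat, hidden))
        err = p - label  # gradient of the log loss w.r.t. the logit
        for j, a in enumerate(hidden):
            # Backpropagate through the ReLU before touching w_deep[j].
            grad_h = err * self.w_deep[j] if a > 0 else 0.0
            self.w_deep[j] -= lr * err * a
            for k, e in enumerate(embed):
                self.W1[j][k] -= lr * grad_h * e
            self.b1[j] -= lr * grad_h
        for i, x in enumerate(stat):
            self.w_wide[i] -= lr * err * x
        self.bias -= lr * err
        return -math.log(p if label == 1 else 1.0 - p)


# Toy data (made up): (statistical features, word vector, is_keyword)
data = [
    ([0.9, 0.8], [1.0, 0.0, 0.0], 1),
    ([0.1, 0.2], [0.0, 1.0, 0.0], 0),
]

model = WideAndDeepScorer(n_stat=2, n_embed=3, n_hidden=4)
losses = []
for epoch in range(200):
    losses.append(sum(model.train_step(s, e, y) for s, e, y in data))

first_loss, last_loss = losses[0], losses[-1]
p_keyword = model.predict(*data[0][:2])
p_other = model.predict(*data[1][:2])
print(first_loss, last_loss, p_keyword, p_other)
```

Because both components contribute to the same logit, the error signal from one loss reaches the linear weights and the hidden layer in the same step, which is what distinguishes joint training from training two models separately and averaging their outputs.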