|
A recently proposed model named Lattice LSTM integrates word segmentation information into the long short-term memory (LSTM) network. However, its word-level information can only influence the characters that follow each character in the sequence, which results in insufficient extraction of word segmentation information. In addition, the character features extracted by the LSTM are all given the same weight when passed to the conditional random field (CRF) layer, so key semantic information does not receive enough consideration. To solve these problems, this paper proposes a novel neural network model, Att-Lattice BiLSTM, which improves the original lattice model with bidirectional long short-term memory and an attention mechanism. An information path is added from the end character of a word to its start character in the backward direction of the LSTM, so that word boundary information is integrated into both the start and end characters of the word during the bidirectional transfer of the LSTM network, introducing word information more comprehensively. Moreover, the new model seamlessly incorporates an attention mechanism to capture the relatively important semantic features automatically. Two strategies are also provided for aggregating the outputs of the bidirectional LSTM layers so that semantic features are integrated effectively. Experimental results on four data sets show that the proposed model outperforms other state-of-the-art models. |
|
Keywords: named entity recognition; deep learning; bidirectional long short-term memory; attention mechanism; lattice network |
|
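To make the two ideas in the abstract more concrete (aggregating the forward and backward LSTM outputs, then weighting character features with attention before the CRF layer), the following is a minimal sketch in plain Python. It is not the paper's implementation: the abstract does not name the two aggregation strategies, so `concat` and `sum` are assumed here purely for illustration, and the dot-product scoring against a `query` vector is likewise a hypothetical choice of attention function. The names `aggregate` and `attend` are introduced only for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(forward, backward, strategy="concat"):
    """Combine per-character forward and backward hidden states.

    ASSUMPTION: 'concat' and 'sum' are illustrative strategies only;
    the abstract does not say which two strategies the paper uses.
    """
    if strategy == "concat":
        # Join the two vectors end to end: dimension doubles.
        return [f + b for f, b in zip(forward, backward)]
    if strategy == "sum":
        # Add the two vectors element-wise: dimension is unchanged.
        return [[x + y for x, y in zip(f, b)] for f, b in zip(forward, backward)]
    raise ValueError(f"unknown strategy: {strategy}")


def attend(features, query):
    """Re-weight each character's feature vector by an attention weight.

    ASSUMPTION: scores are dot products with a hypothetical query vector;
    the paper may use a different scoring function.
    """
    scores = [sum(x * q for x, q in zip(h, query)) for h in features]
    weights = softmax(scores)
    weighted = [[w * x for x in h] for w, h in zip(weights, features)]
    return weights, weighted


# Toy hidden states for a two-character sequence (made-up values).
forward = [[1.0, 0.0], [0.0, 1.0]]
backward = [[0.5, 0.5], [1.0, 0.0]]

summed = aggregate(forward, backward, strategy="sum")
weights, weighted = attend(summed, query=[1.0, 0.0])
```

In this sketch the weighted features, rather than uniformly weighted ones, would be what is handed to the CRF layer, so characters with higher attention scores contribute more to the tag decision.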