Much attention has been drawn to leveraging sub-word information to improve word representations, especially in morphologically rich languages such as Chinese. Previous studies on Chinese word embeddings have explored diverse fine-grained sub-word information, such as characters, radicals, components, and stroke n-grams. However, none of them distinguishes the semantic contribution of a word to its context, and they are also weak at handling the ambiguity of characters and sub-character components. In this paper, we propose AJWE, a joint model for learning Chinese word embeddings with heterogeneous attention. We employ an external self-attention mechanism to learn each word's semantic contribution to its context, and further propose a bias-attention approach for internal sub-word morphemes to address the ambiguity issue. Evaluations on word similarity, word analogy, text classification, and named entity recognition demonstrate that our model outperforms existing state-of-the-art methods.
Keywords: natural language processing; text representation; Chinese word embeddings; morpheme disambiguation