DOI: 10.18178/wcse.2019.06.075
Integrate Words Internal Information to Improve Word Embeddings
Abstract— We propose a method of improving word embeddings by fusing hidden information within words, in contrast to the traditional approach of directly training word embeddings on surface morphological information. Based on an averaging principle and two attention mechanisms, we exploit the hidden information inside words, which we call the implied meanings of a word's morphemes, and propose six implied-meaning embedding models. Comparative experiments on two basic Natural Language Processing tasks show that our models outperform classical models, represented by CBOW, Skip-Gram, and GloVe, in mining semantic information. In addition, we explore the relationship between the importance of the synthesized implied meanings and the word itself.
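The abstract does not spell out the models' equations, so the following is only a minimal sketch of the two composition ideas it names: averaging morpheme ("implied meaning") embeddings, and weighting them with an attention mechanism conditioned on the word embedding. All function names, and the specific dot-product softmax form of the attention, are assumptions for illustration rather than the paper's actual formulation.

```python
import numpy as np

def average_implied_meaning(morpheme_vecs):
    """Averaging principle (assumed form): the implied-meaning vector
    of a word is the mean of its morphemes' embeddings."""
    return np.mean(morpheme_vecs, axis=0)

def attention_implied_meaning(word_vec, morpheme_vecs):
    """Attention variant (assumed form): weight each morpheme embedding
    by its softmax-normalized dot-product similarity to the word embedding."""
    scores = morpheme_vecs @ word_vec          # similarity of each morpheme to the word
    weights = np.exp(scores - scores.max())    # numerically stable softmax
    weights /= weights.sum()
    return weights @ morpheme_vecs             # attention-weighted sum

# Example: a word with three morphemes in a 5-dimensional embedding space.
rng = np.random.default_rng(0)
word = rng.normal(size=5)
morphemes = rng.normal(size=(3, 5))
avg = average_implied_meaning(morphemes)
att = attention_implied_meaning(word, morphemes)
```

The resulting implied-meaning vector would then be fused with the word's own embedding (for example, by summation or concatenation); the paper's six models presumably differ in which composition and fusion choices they make.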
Index Terms— average principle, attention mechanism, word embedding, fusion.
Chuanxiang Tang, Yun Tang
School of Software, University of Science and Technology of China, CHINA
Cite: Chuanxiang Tang, Yun Tang, "Integrate Words Internal Information to Improve Word Embeddings," Proceedings of 2019 the 9th International Workshop on Computer Science and Engineering, pp. 508-514, Hong Kong, 15-17 June 2019.