DOI: 10.18178/wcse.2019.06.051
Semi-supervised Chinese Named Entity Recognition with ELMo
Abstract— Named entity recognition is a subtask of information extraction. In general, the task is to identify three main categories of expressions: entities, times, and numeric expressions. In Chinese named entity recognition, ambiguity and out-of-vocabulary words are frequent and difficult problems that traditional character-based and word-based models do not resolve. In this paper, we propose a semi-supervised approach that uses pre-trained Embeddings from Language Models (ELMo) as an additional embedding alongside the word embedding. Our method captures deep contextualized word representations, which can represent both lexical ambiguity across different contexts and complex characteristics of word usage, such as grammar and semantics. This allows the model to identify content more precisely and to label words with limited labeled data. Experiments on the MSRA dataset show that our model outperforms both word-based and character-based LSTM baselines, achieving the best results among the compared models.
Index Terms— named entity recognition, neural networks, semi-supervised, language model, character-based, embedding of language model.
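The idea in the abstract of using ELMo vectors as an additional embedding alongside word embeddings can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two embeddings are combined, so concatenation is an assumption here, and the softmax-weighted layer mix follows the general ELMo formulation rather than anything stated in this paper. All function names and the toy vectors are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def mix_layers(layers, scalars, gamma=1.0):
    """Collapse language-model layers into one vector per token.

    layers[j][t] is the vector for token t from layer j.
    scalars are per-layer weights (normalized with softmax),
    gamma is a global scale. Both would be learned in practice.
    """
    weights = softmax(scalars)
    n_tokens = len(layers[0])
    dim = len(layers[0][0])
    mixed = []
    for t in range(n_tokens):
        vec = [
            gamma * sum(w * layers[j][t][d] for j, w in enumerate(weights))
            for d in range(dim)
        ]
        mixed.append(vec)
    return mixed

def concat_embeddings(word_vecs, context_vecs):
    """Assumed combination step: append the contextual vector to the word vector."""
    return [w + c for w, c in zip(word_vecs, context_vecs)]

# Toy input: 2 tokens, 2-dim static word vectors, 2 LM layers of 2-dim vectors.
word_vecs = [[0.1, 0.2], [0.3, 0.4]]
layers = [
    [[1.0, 0.0], [0.0, 1.0]],  # layer 0
    [[3.0, 2.0], [2.0, 3.0]],  # layer 1
]
scalars = [0.0, 0.0]  # equal weights, so the mix is the layer average

context_vecs = mix_layers(layers, scalars)
tagger_input = concat_embeddings(word_vecs, context_vecs)
print(tagger_input)
```

In a full tagger, each row of `tagger_input` would be fed to a sequence model such as an LSTM. Because the contextual part depends on the whole sentence, the same word can receive different vectors in different contexts, which is the property the abstract relies on for handling ambiguity.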
Su Zhang, Wenxin Hu, Jun Zheng
East China Normal University, CHINA
Cite: Su Zhang, Wenxin Hu, Jun Zheng, "Semi-supervised Chinese Named Entity Recognition with ELMo," Proceedings of 2019 the 9th International Workshop on Computer Science and Engineering, Hong Kong, 15-17 June, 2019.