Improving Performance of NMT Using Semantic Concept of WordNet Synset

EasyChair Preprint 518

13 pages • Date: September 20, 2018

Fangxu Liu, Jinan Xu, Guoyi Miao, Yufeng Chen and Yujie Zhang

Abstract

Neural machine translation (NMT) has shown promising progress in recent years. However, to reduce computational complexity, NMT typically limits its vocabulary to a fixed or otherwise manageable size, which leads to the problem of rare and out-of-vocabulary (OOV) words. In this paper, we show that the semantic concept information of a word can help NMT learn better semantic representations of words and improve translation accuracy. The key idea is to use the external semantic knowledge base WordNet to replace rare words and OOVs with the semantic concepts of their WordNet synsets. More specifically, we propose two semantic similarity models to obtain the concepts most similar to rare words and OOVs. Experimental results on 4 translation tasks show that our method outperforms the baseline RNNSearch by 2.38~2.88 BLEU points. Furthermore, a hybrid method that combines BPE with our proposed method gains 0.39~0.97 BLEU points over BPE alone. The experiments and analysis presented in this study also demonstrate that the proposed method can significantly improve the translation quality of OOVs in NMT.
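The replacement idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the two semantic similarity models are not specified on this page, so the sketch substitutes a much simpler rule (take the first in-vocabulary synset member, then fall back to a hypernym, then to an unknown token). The dictionaries TOY_SYNSETS and TOY_HYPERNYMS, the UNK token, and the function name replace_rare_words are all hypothetical stand-ins; a real system would query WordNet itself.

```python
from typing import Dict, List, Set

# Hypothetical toy stand-ins for WordNet lookups (assumption: a real
# system would query WordNet instead of these hand-written tables).
TOY_SYNSETS: Dict[str, List[str]] = {
    "automobile": ["car", "auto", "motorcar"],
    "physician": ["doctor", "doc", "medico"],
}
TOY_HYPERNYMS: Dict[str, str] = {
    "dachshund": "dog",
}

UNK = "<unk>"  # assumed placeholder for words with no usable concept


def replace_rare_words(tokens: List[str], vocab: Set[str]) -> List[str]:
    """Replace each out-of-vocabulary token with an in-vocabulary concept.

    Simplified rule (not the paper's similarity models):
    1. keep the token if it is already in the vocabulary;
    2. otherwise use the first synset member that is in the vocabulary;
    3. otherwise use the hypernym if it is in the vocabulary;
    4. otherwise emit UNK.
    """
    result: List[str] = []
    for token in tokens:
        if token in vocab:
            result.append(token)
            continue
        in_vocab = [w for w in TOY_SYNSETS.get(token, []) if w in vocab]
        if in_vocab:
            result.append(in_vocab[0])
        elif TOY_HYPERNYMS.get(token) in vocab:
            result.append(TOY_HYPERNYMS[token])
        else:
            result.append(UNK)
    return result


vocab = {"the", "doctor", "drove", "a", "car", "dog"}
sentence = ["the", "physician", "drove", "a", "automobile"]
print(replace_rare_words(sentence, vocab))
```

In this sketch the fixed fallback order stands in for the similarity-based selection that the paper proposes; swapping in a learned similarity score would only change how the candidate is chosen, not the overall replace-before-translation flow.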

Keyphrases: NMT, Rare words, Semantic concept of synset, unknown words

Links:	https://easychair.org/publications/preprint/4T2F
	https://doi.org/10.29007/3wwz

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:518,
  author    = {Fangxu Liu and Jinan Xu and Guoyi Miao and Yufeng Chen and Yujie Zhang},
  title     = {Improving Performance of NMT Using Semantic Concept of WordNet Synset},
  doi       = {10.29007/3wwz},
  howpublished = {EasyChair Preprint 518},
  year      = {EasyChair, 2018}}
