
Improving Performance of NMT Using Semantic Concept of WordNet Synset

EasyChair Preprint no. 518

13 pages. Date: September 20, 2018


Neural machine translation (NMT) has shown promising progress in recent years. However, to reduce computational complexity, NMT typically needs to limit its vocabulary to a fixed or relatively acceptable size, which leads to the problem of rare and out-of-vocabulary (OOV) words. In this paper, we show that the semantic concept information of words can help NMT learn better semantic representations of words and improve translation accuracy. The key idea is to use the external semantic knowledge base WordNet to replace rare words and OOVs with the semantic concepts of their WordNet synsets. More specifically, we propose two semantic similarity models to obtain the concepts most similar to rare words and OOVs. Experimental results on 4 translation tasks show that our method outperforms the baseline RNNSearch by 2.38 to 2.88 BLEU points. Furthermore, the proposed hybrid method, which combines BPE with our method, gains a further 0.39 to 0.97 BLEU points over BPE alone. Experiments and analysis presented in this study also demonstrate that the proposed method can significantly improve the translation quality of OOVs in NMT.
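
The replacement idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `TOY_CONCEPTS` table is a hypothetical stand-in for WordNet lookups, the function names `build_vocab` and `replace_rare` are invented for illustration, and the two semantic similarity models mentioned above are replaced here by a simple "first in-vocabulary candidate" rule.

```python
from collections import Counter

# Hypothetical stand-in for WordNet: each word maps to candidate concept
# words (for example synset members or hypernyms), most specific first.
# A real system would query WordNet and rank candidates by similarity.
TOY_CONCEPTS = {
    "poodle": ["dog", "animal"],
    "sedan": ["car", "vehicle"],
    "sprinted": ["ran", "moved"],
}


def build_vocab(sentences, size):
    """Keep only the `size` most frequent tokens, as a fixed NMT vocabulary would."""
    counts = Counter(tok for s in sentences for tok in s.split())
    return {word for word, _ in counts.most_common(size)}


def replace_rare(sentence, vocab, concepts=TOY_CONCEPTS, unk="<unk>"):
    """Replace each out-of-vocabulary token with an in-vocabulary concept word.

    Assumption: the first candidate concept found in the vocabulary is used;
    tokens with no usable candidate fall back to the `unk` symbol.
    """
    out = []
    for tok in sentence.split():
        if tok in vocab:
            out.append(tok)
            continue
        candidates = concepts.get(tok, [])
        out.append(next((c for c in candidates if c in vocab), unk))
    return " ".join(out)


corpus = ["the dog ran", "the car ran", "the dog saw the car"]
vocab = build_vocab(corpus, size=4)  # {"the", "dog", "ran", "car"}
print(replace_rare("the poodle sprinted", vocab))
print(replace_rare("the sedan saw the zebra", vocab))
```

With this toy corpus, "poodle" and "sprinted" are mapped to in-vocabulary concept words, while "saw" and "zebra", which have no entry in the toy table, still fall back to `<unk>`.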

Keyphrases: NMT, Rare words, Semantic concept of synset, unknown words

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:518,
  author = {Fangxu Liu and Jinan Xu and Guoyi Miao and Yufeng Chen and Yujie Zhang},
  title = {Improving Performance of NMT Using Semantic Concept of WordNet Synset},
  howpublished = {EasyChair Preprint no. 518},
  doi = {10.29007/3wwz},
  year = {EasyChair, 2018}}