The End-to-End Speech Synthesis System for the VLSP Campaign 2019

EasyChair Preprint no. 1742

3 pages. Date: October 22, 2019


Traditional speech synthesis systems are typically built from multiple components, including a text analysis front-end, an acoustic model, and an audio synthesis module. Building these components often requires extensive domain expertise, and the resulting pipeline may contain brittle design choices. In this paper, we describe how we built a Vietnamese text-to-speech (TTS) system based on deep learning techniques. We built two speech synthesis systems for the text-to-speech shared tasks of VLSP 2019: one for BigCorpus (Mean Opinion Score of 3.47) and one for SmallCorpus (Mean Opinion Score of 4.13). In addition, we applied transfer learning and fine-tuning to address noisy training data in BigCorpus and the shortage of data in SmallCorpus.
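
For readers unfamiliar with the metric, a Mean Opinion Score is conventionally the arithmetic mean of listener ratings on a 1 to 5 scale. The following is a minimal sketch of that calculation, not the VLSP 2019 evaluation procedure; the function name `mean_opinion_score`, the normal-approximation confidence interval, and the example ratings are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for listener ratings on a 1-5 scale."""
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie in [1, 5]")
    mos = mean(ratings)
    # Normal-approximation interval; 0.0 when only one rating is available.
    if len(ratings) > 1:
        half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    else:
        half_width = 0.0
    return mos, half_width


# Hypothetical ratings, for illustration only (not the VLSP 2019 data).
example_ratings = [4, 5, 4, 3, 5, 4, 4, 5]
mos, ci = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting an interval alongside the mean helps show whether a gap between two systems is larger than the listener-to-listener variation.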

Keyphrases: deep learning, speech synthesis, speech synthesis system, Tacotron2, text-to-speech, Vietnamese speech synthesis

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:1742,
  author = {Quang Pham Huu},
  title = {The End-to-End Speech Synthesis System for the VLSP Campaign 2019},
  howpublished = {EasyChair Preprint no. 1742},
  year = {EasyChair, 2019}}