Building of Indian Accent Telugu and English Language TTS Voice Model Using Festival Framework

Text-to-speech (TTS) synthesis is the art of designing talking machines. In the modern age of computers and technology, it plays a vital role in human-machine communication: it produces output speech for input text in a particular language. Speech carries both language and speaking style, and it is an efficient means of communication among people. It is the primary way people express their feelings and emotions to society, and it has come to occupy a leading position in human life. Building English voices for HTS with Festvox has involved many hardwired aspects, particularly in the feature names and the questions used for the clustering procedures in HTS. The goal of this project was to improve the robustness of the connection between HTS and Festvox across various databases and languages. In this article we present Clustergen, which was developed within the widely used Festival/Festvox voice suite. It has the advantage of smoothing the data. Available Indian-accent TTS voice models built with the Festival framework cannot synthesize text that includes non-standard words. The main objective of this paper is to build TTS voice models for Indian-accent English and the Telugu language that can synthesize text containing non-standard words using Clustergen. We use Festival, Festvox, Speech Tools, SPTK and an Ubuntu Linux 16.04 LTS environment to build synthetic voice models of natural speech. Festival is the run-time synthesis engine that synthesizes text using the built voice models. The voice models are generated with the statistical parametric (Clustergen) speech synthesis technique using Festvox, Speech Tools and SPTK within the Festival framework.

Keyphrases: Clustergen, Festival, Festvox, Speech Tools, SPTK, synthesis, TTS
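
To make the notion of non-standard words concrete, the following is a minimal Python sketch of the kind of text normalization a TTS front end needs before synthesis: digits and abbreviations are expanded into ordinary words that a voice can pronounce. It is an illustration only and is not the method used in this work; Festival performs this step in its own front end. The names `number_to_words`, `normalize`, and the small `ABBREVIATIONS` table are hypothetical and chosen for this example.

```python
import re

# Hypothetical lookup tables, for illustration only. A real voice would need
# much larger, language-specific tables (and separate ones for Telugu).
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
ABBREVIATIONS = {"Dr.": "Doctor", "kg": "kilograms", "km": "kilometres"}


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999999 as English words."""
    if not 0 <= n <= 999_999:
        raise ValueError("this sketch only handles 0..999999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = "" if rest == 0 else " " + number_to_words(rest)
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = "" if rest == 0 else " " + number_to_words(rest)
    return number_to_words(thousands) + " thousand" + tail


def expand_digits(match: re.Match) -> str:
    """Expand one run of digits; long runs are read digit by digit."""
    digits = match.group()
    if len(digits) > 6:
        return " ".join(ONES[int(d)] for d in digits)
    return number_to_words(int(digits))


def normalize(text: str) -> str:
    """Replace known abbreviations and digit runs with pronounceable words."""
    words = []
    for token in text.split():
        if token in ABBREVIATIONS:
            words.append(ABBREVIATIONS[token])
        else:
            words.append(re.sub(r"\d+", expand_digits, token))
    return " ".join(words)


if __name__ == "__main__":
    print(normalize("Dr. Rao bought 25 kg in 2016"))
```

The sketch treats every digit run as a cardinal number, so a year such as 2016 is read as "two thousand sixteen"; deciding between cardinal, year, and digit-by-digit readings is exactly the kind of ambiguity that makes non-standard words hard for a voice model.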

