Building a natural sounding Text-To-Speech system for the Nepali language: research and development challenges and solutions

Authors

  • Roop Shree Ratna Bajracharya Department of Computer Science and Engineering, Kathmandu University, Nepal
  • Santosh Regmi KEIV Technologies Pvt. Ltd.
  • Bal Krishna Bal Department of Computer Science and Engineering, Kathmandu University, Nepal
  • Balaram Prasain Central Department of Linguistics, Tribhuvan University, Nepal

DOI:

https://doi.org/10.3126/gipan.v4i0.35461

Keywords:

Nepali Text-to-Speech, Festival Speech Synthesis, Unit Selection Speech Synthesis

Abstract

Text-to-Speech (TTS) synthesis has come a long way from primitive, monotone synthetic voices to more natural and intelligible ones. One direct application of natural-sounding TTS systems is screen readers for the blind and visually impaired community. The Festival Speech Synthesis System uses concatenative speech synthesis together with unit selection to generate a natural-sounding voice. This work primarily gives an account of the efforts made towards developing a natural-sounding TTS system for Nepali using the Festival system. We also shed light on the issues faced and the solutions derived, which may largely carry over to similar under-resourced languages in the region.

Author Biographies

Roop Shree Ratna Bajracharya, Department of Computer Science and Engineering, Kathmandu University, Nepal

Roop Shree Ratna Bajracharya (bajracharya.roop@gmail.com) is a faculty member at the Department of Computer Science and Engineering, Kathmandu University.

Santosh Regmi, KEIV Technologies Pvt. Ltd.

Santosh Regmi (regmi.santosh32@gmail.com) is the Managing Director of KEIV Technologies Pvt. Ltd.

Bal Krishna Bal, Department of Computer Science and Engineering, Kathmandu University, Nepal

Dr. Bal Krishna Bal (bal@ku.edu.np) is an Associate Professor at the Department of Computer Science and Engineering, Kathmandu University.

Balaram Prasain, Central Department of Linguistics, Tribhuvan University, Nepal

Dr. Balaram Prasain (prasain2003@yahoo.com) is an Associate Professor at the Central Department of Linguistics, Tribhuvan University.

Published

2019-12-31

How to Cite

Bajracharya, R. S. R., Regmi, S., Bal, B. K., & Prasain, B. (2019). Building a natural sounding Text-To-Speech system for the Nepali language: research and development challenges and solutions. Gipan, 4, 106–116. https://doi.org/10.3126/gipan.v4i0.35461

Section

Articles