Author: Louw, Johannes A; Moodley, Avashlin
Date: Dec 2017
In this paper an automatic method to implicitly model intonation for statistical parametric speech synthesis (SPSS) is presented. The approach is ideally suited to single speaker speech databases as used in text-to-speech (TTS), due to the ...
Author: De Wet, Febe; Dlamini, Nkosikhona; Van der Walt, Willem J; Govender, Avashna
Date: Dec 2017
Creating synthetic voices that are both natural and intelligible is a daunting challenge for well-resourced languages. The challenge is much bigger for languages in which the speech and text resources required for voice development are not ...
Author: Mogale, MM; Sefara, Tshephisho J; Mokgonyane, TB
Date: Sep 2020
Natural Language Processing (NLP) forms one of the important and fundamental components of speech synthesis while a language grammar forms one of the important requirements for NLP tasks. One of the major requirements in processing speech ...
Author: Van Niekerk, DR; Barnard, E
Date: Sep 2010
The authors present an initial investigation into the acoustic realisation of tone in continuous utterances in Sepedi (a language in the Southern Bantu family). An analytic model for the generation of appropriate pitch contours given an ...
Author: Van Niekerk, DR; Barnard, E
Date: 2013
Pitch is a fundamental acoustic feature of speech and as such needs to be determined during the process of speech synthesis. While a range of communicative functions are attributed to pitch variation in speech of all languages, it plays a ...
Author: Louw, Johannes A; Schlunz, Georg I; Van der Walt, W; De Wet, Febe; Pretorius, L
Date: Sep 2013
This paper describes the Speect text-to-speech system entry for the Blizzard Challenge 2013. The techniques applied for the tasks of the challenge are described as well as the implementation details for the alignment of the audio books and ...
Author: Louw, Johannes A
Date: Dec 2020
Sequence-to-sequence end-to-end models for text-to-speech have shown significant gains in naturalness of the produced synthetic speech. These models have an encoder-decoder architecture, without an explicit duration model, but rather a learned ...