van den Oord, Aaron; Dieleman, Sander; Zen, Heiga; Simonyan, Karen; Vinyals, Oriol; Graves, Alex; Kalchbrenner, Nal; Senior, Andrew et al. (2016-09-12). “WaveNet: A Generative Model for Raw Audio” (English). arXiv. arXiv:1609.03499.
Andrew J., Hunt; Black, Alan W. (1996). “Unit selection in a concatenative speech synthesis system using a large speech database” (English). 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings (IEEE): 373–376. doi:10.1109/ICASSP.1996.541110. ISBN0-7803-3192-3. ISSN1520-6149.
Masuko, Takashi; Keiichi, Tokuda; Takao, Kobayashi; Satoshi, Imai (1999-05-09). “Speech synthesis using HMMs with dynamic features” (English). 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings (IEEE): 389–392. doi:10.1109/ICASSP.1996.541114. ISBN0-7803-3192-3. ISSN1520-6149.
Gopala K. Anumanchipalli, et al.. (2019) Speech synthesis from neural decoding of spoken sentences [paper]
"A formant synthesizer is a source-filter model in which the source models the glottal pulse train and the filter models the formant resonances of the vocal tract." Smith. (2010). Formant Synthesis Models. Physical Audio Signal Processing. ISBN 978-0-9745607-2-4
"Constrained linear prediction can be used to estimate the parameters ... more generally ... directly from the short-time spectrum" Smith. (2010). Formant Synthesis Models. Physical Audio Signal Processing. ISBN 978-0-9745607-2-4
Andrew J., Hunt; Black, Alan W. (1996). “Unit selection in a concatenative speech synthesis system using a large speech database” (English). 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings (IEEE): 373–376. doi:10.1109/ICASSP.1996.541110. ISBN0-7803-3192-3. ISSN1520-6149.
Masuko, Takashi; Keiichi, Tokuda; Takao, Kobayashi; Satoshi, Imai (1999-05-09). “Speech synthesis using HMMs with dynamic features” (English). 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings (IEEE): 389–392. doi:10.1109/ICASSP.1996.541114. ISBN0-7803-3192-3. ISSN1520-6149.
Zen, Heiga; Senior, Andrew; Schuster, Mike (2013-05-26). “Statistical parametric speech synthesis using deep neural networks” (English). 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE): 7962–7966. ISBN978-1-4799-0356-6. ISSN1520-6149.