Audio deepfake (English Wikipedia)

Analysis of the information sources cited in the references of the English-language Wikipedia article "Audio deepfake".

refs  Website                      Global rank    English rank
22    doi.org                      2nd place
21    arxiv.org                    69th place     59th place
18    semanticscholar.org          11th place     8th place
10    worldcat.org                 5th place
8     ieee.org                     652nd place    515th place
7     web.archive.org              1st place
3     cnn.com                      28th place     26th place
3     github.com                   383rd place    320th place
3     darpa.mil                    low place      7,006th place
2     springer.com                 274th place    309th place
2     isca-speech.org              low place
2     sky.com                      431st place    274th place
2     handle.net                   102nd place    76th place
2     sites.google.com             626th place    690th place
2     bbc.com                      20th place     30th place
1     sagepub.com                  731st place    638th place
1     bloomberg.com                99th place     77th place
1     washingtonpost.com           34th place     27th place
1     people.com                   31st place     25th place
1     sciencedirect.com            149th place    178th place
1     wsj.com                      79th place     65th place
1     forbes.com                   54th place     48th place
1     axios.com                    1,716th place  973rd place
1     mcafee.com                   low place
1     vice.com                     175th place    137th place
1     theguardian.com              12th place     11th place
1     ftc.gov                      3,796th place  2,268th place
1     wired.com                    193rd place    152nd place
1     lawandcrime.com              9,323rd place  4,970th place
1     fcc.gov                      1,271st place  703rd place
1     cbsnews.com                  108th place    80th place
1     npr.org                      92nd place     72nd place
1     nips.cc                      low place      7,050th place
1     asvspoof.org                 low place
1     deeplearning.ai              low place
1     towardsdatascience.com       8,920th place  6,292nd place
1     google.github.io             low place
1     guardian.ng                  2,612th place  1,418th place
1     automaton-media.com          7,311th place  low place
1     openai.com                   1,559th place  1,155th place
1     babbel.com                   low place
1     elsevier.com                 610th place    704th place
1     koreascience.or.kr           low place
1     harvard.edu                  18th place     17th place
1     sam.gov                      low place
1     govtribe.com                 low place
1     signalprocessingsociety.org  low place
1     synsig.org                   low place
1     yahoo.com                    38th place     40th place
1     lbc.co.uk                    7,372nd place  4,076th place
1     msn.com                      117th place    145th place
1     thetimes.com                 low place

arxiv.org

Lyu, Siwei (2020). "Deepfake Detection: Current Challenges and Next Steps". 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW). pp. 1–6. arXiv:2003.09234. doi:10.1109/icmew46912.2020.9105991. ISBN 978-1-7281-1485-9. S2CID 214605906. Retrieved 2022-06-29.
Khanjani, Zahra; Watson, Gabrielle; Janeja, Vandana P. (2021-11-28). "How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey". arXiv:2111.14203 [cs.SD].
Tan, Xu; Qin, Tao; Soong, Frank; Liu, Tie-Yan (2021-07-23). "A Survey on Neural Speech Synthesis". arXiv:2106.15561 [eess.AS].
Oord, Aaron van den; Dieleman, Sander; Zen, Heiga; Simonyan, Karen; Vinyals, Oriol; Graves, Alex; Kalchbrenner, Nal; Senior, Andrew; Kavukcuoglu, Koray (2016-09-19). "WaveNet: A Generative Model for Raw Audio". arXiv:1609.03499 [cs.SD].
Kuchaiev, Oleksii; Li, Jason; Nguyen, Huyen; Hrinchuk, Oleksii; Leary, Ryan; Ginsburg, Boris; Kriman, Samuel; Beliaev, Stanislav; Lavrukhin, Vitaly; Cook, Jack; Castonguay, Patrice (2019-09-13). "NeMo: a toolkit for building AI applications using Neural Modules". arXiv:1909.09577 [cs.LG].
Wang, Yuxuan; Skerry-Ryan, R. J.; Stanton, Daisy; Wu, Yonghui; Weiss, Ron J.; Jaitly, Navdeep; Yang, Zongheng; Xiao, Ying; Chen, Zhifeng; Bengio, Samy; Le, Quoc (2017-04-06). "Tacotron: Towards End-to-End Speech Synthesis". arXiv:1703.10135 [cs.CL].
Prenger, Ryan; Valle, Rafael; Catanzaro, Bryan (2018-10-30). "WaveGlow: A Flow-based Generative Network for Speech Synthesis". arXiv:1811.00002 [cs.SD].
Vasquez, Sean; Lewis, Mike (2019-06-04). "MelNet: A Generative Model for Audio in the Frequency Domain". arXiv:1906.01083 [eess.AS].
Ping, Wei; Peng, Kainan; Gibiansky, Andrew; Arik, Sercan O.; Kannan, Ajay; Narang, Sharan; Raiman, Jonathan; Miller, John (2018-02-22). "Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning". arXiv:1710.07654 [cs.SD].
Ren, Yi; Ruan, Yangjun; Tan, Xu; Qin, Tao; Zhao, Sheng; Zhao, Zhou; Liu, Tie-Yan (2019-11-20). "FastSpeech: Fast, Robust and Controllable Text to Speech". arXiv:1905.09263 [cs.CL].
Zhang, Mingyang; Wang, Xin; Fang, Fuming; Li, Haizhou; Yamagishi, Junichi (2019-04-07). "Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet". arXiv:1903.12389 [eess.AS].
Arık, Sercan Ö.; Chen, Jitong; Peng, Kainan; Ping, Wei; Zhou, Yanqi (2018). "Neural Voice Cloning with a Few Samples". Advances in Neural Information Processing Systems (NeurIPS 2018). 31 (published 12 October 2018): 10040–10050. arXiv:1802.06006.
Kong, Jungil; Kim, Jaehyeon; Bae, Jaekyoung (2020-10-23). "HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis". arXiv:2010.05646 [cs.SD].
Kumar, Kundan; Kumar, Rithesh; de Boissiere, Thibault; Gestin, Lucas; Teoh, Wei Zhen; Sotelo, Jose; de Brebisson, Alexandre; Bengio, Yoshua; Courville, Aaron (2019-12-08). "MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis". arXiv:1910.06711 [eess.AS].
Liu, Xiao; Zhang, Fanjin; Hou, Zhenyu; Mian, Li; Wang, Zhaoyu; Zhang, Jing; Tang, Jie (2021). "Self-supervised Learning: Generative or Contrastive". IEEE Transactions on Knowledge and Data Engineering. 35 (1): 857–876. arXiv:2006.08218. doi:10.1109/TKDE.2021.3090866. ISSN 1558-2191. S2CID 219687051.
Fraga-Lamas, Paula; Fernández-Caramés, Tiago M. (2019-10-20). "Fake News, Disinformation, and Deepfakes: Leveraging Distributed Ledger Technologies and Blockchain to Combat Digital Deception and Counterfeit Reality". IT Professional. 22 (2): 53–59. arXiv:1904.05386. doi:10.1109/MITP.2020.2977589.
Müller, Nicolas M.; Czempin, Pavel; Dieckmann, Franziska; Froghyar, Adam; Böttinger, Konstantin (2022-04-21). "Does Audio Deepfake Detection Generalize?". arXiv:2203.16263 [cs.SD].
Zhang, You; Jiang, Fei; Duan, Zhiyao (2021). "One-Class Learning Towards Synthetic Voice Spoofing Detection". IEEE Signal Processing Letters. 28: 937–941. arXiv:2010.13995. Bibcode:2021ISPL...28..937Z. doi:10.1109/LSP.2021.3076358. ISSN 1558-2361. S2CID 235077416.
Bird, Jordan J.; Lotfi, Ahmad (2023). "Real-time Detection of AI-Generated Speech for DeepFake Voice Conversion". arXiv:2308.12734 [cs.SD].
Yamagishi, Junichi; Wang, Xin; Todisco, Massimiliano; Sahidullah, Md; Patino, Jose; Nautsch, Andreas; Liu, Xuechen; Lee, Kong Aik; Kinnunen, Tomi; Evans, Nicholas; Delgado, Héctor (2021-09-01). "ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection". arXiv:2109.00537 [eess.AS].
Yi, Jiangyan; Fu, Ruibo; Tao, Jianhua; Nie, Shuai; Ma, Haoxin; Wang, Chenglong; Wang, Tao; Tian, Zhengkun; Bai, Ye; Fan, Cunhang; Liang, Shan (2022-02-26). "ADD 2022: the First Audio Deep Synthesis Detection Challenge". arXiv:2202.08433 [cs.SD].

asvspoof.org

"ASVspoof". www.asvspoof.org. Retrieved 2022-07-01.

automaton-media.com

Kurosawa, Yuki (January 19, 2021). "ゲームキャラ音声読み上げソフト「15.ai」公開中。『Undertale』や『Portal』のキャラに好きなセリフを言ってもらえる" [Game Character Voice Reading Software "15.ai" Now Available. Get Characters from Undertale and Portal to Say Your Desired Lines]. AUTOMATON (in Japanese). Archived from the original on January 19, 2021. Retrieved December 18, 2024.

axios.com

"Generative AI is making voice scams easier to believe". Axios. 13 June 2023. Retrieved 16 June 2023.

babbel.com

Babbel.com; GmbH, Lesson Nine. "The 10 Most Spoken Languages In The World". Babbel Magazine. Retrieved 2022-06-30.

bbc.com

"'Stop using my voice' - ScotRail's new announcer is my AI clone". BBC News. 2025-05-27. Retrieved 2025-05-28.
"'Give it time' - ScotRail defends AI announcer Iona". BBC News. 2025-05-22. Retrieved 2025-05-28.

bloomberg.com

Murphy, Margi (20 February 2024). "Deepfake Audio Boom Exploits One Billion-Dollar Startup's AI". Bloomberg.

cbsnews.com

Kramer, Marcia (2024-02-26). "Steve Kramer explains why he used AI to impersonate President Biden in New Hampshire - CBS New York". www.cbsnews.com. Retrieved 2024-05-23.

cnn.com

"Political consultant behind fake Biden AI robocall faces charges in New Hampshire".
David Wright; Brian Fung (February 6, 2024). "Fake Biden robocall linked to Texas-based companies, New Hampshire attorney general announces". CNN.
Brian Fung (February 8, 2024). "FCC votes to ban scam robocalls that use AI-generated voices". CNN.

darpa.mil

"The SemaFor Program". www.darpa.mil. Retrieved 2022-07-01.
"The MediFor Program". www.darpa.mil. Retrieved 2022-07-01.
"DARPA Announces Research Teams Selected to Semantic Forensics Program". www.darpa.mil. Retrieved 2022-07-01.

deeplearning.ai

Ng, Andrew (April 1, 2020). "Voice Cloning for the Masses". DeepLearning.AI. Archived from the original on December 28, 2024. Retrieved December 22, 2024.

doi.org

Lyu, Siwei (2020). "Deepfake Detection: Current Challenges and Next Steps". 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW). pp. 1–6. arXiv:2003.09234. doi:10.1109/icmew46912.2020.9105991. ISBN 978-1-7281-1485-9. S2CID 214605906. Retrieved 2022-06-29.
Diakopoulos, Nicholas; Johnson, Deborah (June 2020). "Anticipating and addressing the ethical implications of deepfakes in the context of elections". New Media & Society. 23 (7) (published 2020-06-05): 2072–2098. doi:10.1177/1461444820925811. ISSN 1461-4448. S2CID 226196422.
Chadha, Anupama; Kumar, Vaibhav; Kashyap, Sonu; Gupta, Mayank (2021), Singh, Pradeep Kumar; Wierzchoń, Sławomir T.; Tanwar, Sudeep; Ganzha, Maria (eds.), "Deepfake: An Overview", Proceedings of Second International Conference on Computing, Communications, and Cyber-Security, Lecture Notes in Networks and Systems, vol. 203, Singapore: Springer Singapore, pp. 557–566, doi:10.1007/978-981-16-0733-2_39, ISBN 978-981-16-0732-5, S2CID 236666289, retrieved 2022-06-29
Almutairi, Zaynab; Elgibreen, Hebah (2022-05-04). "A Review of Modern Audio Deepfake Detection Methods: Challenges and Future Directions". Algorithms. 15 (5): 155. doi:10.3390/a15050155. ISSN 1999-4893.
Caramancion, Kevin Matthe (June 2022). "An Exploration of Mis/Disinformation in Audio Format Disseminated in Podcasts: Case Study of Spotify". 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS). pp. 1–6. doi:10.1109/IEMTRONICS55184.2022.9795760. ISBN 978-1-6654-8684-2. S2CID 249903722.
Chen, Tianxiang; Kumar, Avrosh; Nagarsheth, Parav; Sivaraman, Ganesh; Khoury, Elie (2020-11-01). "Generalization of Audio Deepfake Detection". The Speaker and Language Recognition Workshop (Odyssey 2020). ISCA: 132–137. doi:10.21437/Odyssey.2020-19. S2CID 219492826.
Ballesteros, Dora M.; Rodriguez-Ortega, Yohanna; Renza, Diego; Arce, Gonzalo (2021-12-01). "Deep4SNet: deep learning for fake speech classification". Expert Systems with Applications. 184: 115465. doi:10.1016/j.eswa.2021.115465. ISSN 0957-4174. S2CID 237659479.
Suwajanakorn, Supasorn; Seitz, Steven M.; Kemelmacher-Shlizerman, Ira (2017-07-20). "Synthesizing Obama: learning lip sync from audio". ACM Transactions on Graphics. 36 (4): 95:1–95:13. doi:10.1145/3072959.3073640. ISSN 0730-0301. S2CID 207586187.
Pradhan, Swadhin; Sun, Wei; Baig, Ghufran; Qiu, Lili (2019-09-09). "Combating Replay Attacks Against Voice Assistants". Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies. 3 (3): 100:1–100:26. doi:10.1145/3351258. S2CID 202159551.
Villalba, Jesus; Lleida, Eduardo (2011). "Preventing replay attacks on speaker verification systems". 2011 Carnahan Conference on Security Technology. pp. 1–8. doi:10.1109/CCST.2011.6095943. ISBN 978-1-4577-0903-6. S2CID 17048213. Retrieved 2022-06-29.
Tom, Francis; Jain, Mohit; Dey, Prasenjit (2018-09-02). "End-To-End Audio Replay Attack Detection Using Deep Convolutional Networks with Attention". Interspeech 2018. ISCA: 681–685. doi:10.21437/Interspeech.2018-2279. S2CID 52187155.
Ning, Yishuang; He, Sheng; Wu, Zhiyong; Xing, Chunxiao; Zhang, Liang-Jie (January 2019). "A Review of Deep Learning Based Speech Synthesis". Applied Sciences. 9 (19): 4050. doi:10.3390/app9194050. ISSN 2076-3417.
Rodríguez-Ortega, Yohanna; Ballesteros, Dora María; Renza, Diego (2020). "A Machine Learning Model to Detect Fake Voice". In Florez, Hector; Misra, Sanjay (eds.). Applied Informatics. Communications in Computer and Information Science. Vol. 1277. Cham: Springer International Publishing. pp. 3–13. doi:10.1007/978-3-030-61702-8_1. ISBN 978-3-030-61702-8. S2CID 226283369.
Najafian, Maryam; Russell, Martin (September 2020). "Automatic accent identification as an analytical tool for accent robust automatic speech recognition". Speech Communication. 122: 44–55. doi:10.1016/j.specom.2020.05.003. S2CID 225778214.
Liu, Xiao; Zhang, Fanjin; Hou, Zhenyu; Mian, Li; Wang, Zhaoyu; Zhang, Jing; Tang, Jie (2021). "Self-supervised Learning: Generative or Contrastive". IEEE Transactions on Knowledge and Data Engineering. 35 (1): 857–876. arXiv:2006.08218. doi:10.1109/TKDE.2021.3090866. ISSN 1558-2191. S2CID 219687051.
Rashid, Md Mamunur; Lee, Suk-Hwan; Kwon, Ki-Ryong (2021). "Blockchain Technology for Combating Deepfake and Protect Video/Image Integrity". Journal of Korea Multimedia Society. 24 (8): 1044–1058. doi:10.9717/kmms.2021.24.8.1044. ISSN 1229-7771.
Fraga-Lamas, Paula; Fernández-Caramés, Tiago M. (2019-10-20). "Fake News, Disinformation, and Deepfakes: Leveraging Distributed Ledger Technologies and Blockchain to Combat Digital Deception and Counterfeit Reality". IT Professional. 22 (2): 53–59. arXiv:1904.05386. doi:10.1109/MITP.2020.2977589.
Ki Chan, Christopher Chun; Kumar, Vimal; Delaney, Steven; Gochoo, Munkhjargal (September 2020). "Combating Deepfakes: Multi-LSTM and Blockchain as Proof of Authenticity for Digital Media". 2020 IEEE / ITU International Conference on Artificial Intelligence for Good (AI4G). pp. 55–62. doi:10.1109/AI4G50087.2020.9311067. ISBN 978-1-7281-7031-2. S2CID 231618774.
Mittal, Trisha; Bhattacharya, Uttaran; Chandra, Rohan; Bera, Aniket; Manocha, Dinesh (2020-10-12), "Emotions Don't Lie: An Audio-Visual Deepfake Detection Method using Affective Cues", Proceedings of the 28th ACM International Conference on Multimedia, New York, NY, USA: Association for Computing Machinery, pp. 2823–2832, doi:10.1145/3394171.3413570, ISBN 978-1-4503-7988-5, S2CID 220935571, retrieved 2022-06-29
Conti, Emanuele; Salvi, Davide; Borrelli, Clara; Hosler, Brian; Bestagini, Paolo; Antonacci, Fabio; Sarti, Augusto; Stamm, Matthew C.; Tubaro, Stefano (2022-05-23). "Deepfake Speech Detection Through Emotion Recognition: A Semantic Approach". ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore, Singapore: IEEE. pp. 8962–8966. doi:10.1109/ICASSP43922.2022.9747186. hdl:11311/1220518. ISBN 978-1-6654-0540-9. S2CID 249436701.
Hosler, Brian; Salvi, Davide; Murray, Anthony; Antonacci, Fabio; Bestagini, Paolo; Tubaro, Stefano; Stamm, Matthew C. (June 2021). "Do Deepfakes Feel Emotions? A Semantic Approach to Detecting Deepfakes Via Emotional Inconsistencies". 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Nashville, TN, USA: IEEE. pp. 1013–1022. doi:10.1109/CVPRW53098.2021.00112. hdl:11311/1183572. ISBN 978-1-6654-4899-4. S2CID 235679849.
Zhang, You; Jiang, Fei; Duan, Zhiyao (2021). "One-Class Learning Towards Synthetic Voice Spoofing Detection". IEEE Signal Processing Letters. 28: 937–941. arXiv:2010.13995. Bibcode:2021ISPL...28..937Z. doi:10.1109/LSP.2021.3076358. ISSN 1558-2361. S2CID 235077416.

elsevier.com

linkinghub.elsevier.com

Najafian, Maryam; Russell, Martin (September 2020). "Automatic accent identification as an analytical tool for accent robust automatic speech recognition". Speech Communication. 122: 44–55. doi:10.1016/j.specom.2020.05.003. S2CID 225778214.

fcc.gov

"FCC Makes AI-Generated Voices in Robocalls Illegal | Federal Communications Commission". www.fcc.gov. 2024-02-08. Retrieved 2024-05-26.

forbes.com

Brewster, Thomas. "Fraudsters Cloned Company Director's Voice In $35 Million Bank Heist, Police Find". Forbes. Retrieved 2022-06-29.

ftc.gov

consumer.ftc.gov

"Scammers use AI to enhance their family emergency schemes". Consumer Advice. 2023-03-17. Retrieved 2024-05-26.

github.com

resemble-ai/Resemblyzer, Resemble AI, 2022-06-30, retrieved 2022-07-01
mendaxfz (2022-06-28), Synthetic-Voice-Detection, retrieved 2022-07-01
HUA, Guang (2022-06-29), End-to-End Synthetic Speech Detection, retrieved 2022-07-01

google.github.io

"Audio samples from "Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis"". 2018-08-30. Archived from the original on 2020-11-11. Retrieved 2022-06-05.

govtribe.com

"The DARPA MediFor Program". govtribe.com. Retrieved 2022-06-29.

guardian.ng

Temitope, Yusuf (December 10, 2024). "15.ai Creator reveals journey from MIT Project to internet phenomenon". The Guardian. Archived from the original on December 28, 2024. Retrieved December 25, 2024.

handle.net

hdl.handle.net

Conti, Emanuele; Salvi, Davide; Borrelli, Clara; Hosler, Brian; Bestagini, Paolo; Antonacci, Fabio; Sarti, Augusto; Stamm, Matthew C.; Tubaro, Stefano (2022-05-23). "Deepfake Speech Detection Through Emotion Recognition: A Semantic Approach". ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore, Singapore: IEEE. pp. 8962–8966. doi:10.1109/ICASSP43922.2022.9747186. hdl:11311/1220518. ISBN 978-1-6654-0540-9. S2CID 249436701.
Hosler, Brian; Salvi, Davide; Murray, Anthony; Antonacci, Fabio; Bestagini, Paolo; Tubaro, Stefano; Stamm, Matthew C. (June 2021). "Do Deepfakes Feel Emotions? A Semantic Approach to Detecting Deepfakes Via Emotional Inconsistencies". 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Nashville, TN, USA: IEEE. pp. 1013–1022. doi:10.1109/CVPRW53098.2021.00112. hdl:11311/1183572. ISBN 978-1-6654-4899-4. S2CID 235679849.

harvard.edu

ui.adsabs.harvard.edu

Zhang, You; Jiang, Fei; Duan, Zhiyao (2021). "One-Class Learning Towards Synthetic Voice Spoofing Detection". IEEE Signal Processing Letters. 28: 937–941. arXiv:2010.13995. Bibcode:2021ISPL...28..937Z. doi:10.1109/LSP.2021.3076358. ISSN 1558-2361. S2CID 235077416.

ieee.org

ieeexplore.ieee.org

Lyu, Siwei (2020). "Deepfake Detection: Current Challenges and Next Steps". 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW). pp. 1–6. arXiv:2003.09234. doi:10.1109/icmew46912.2020.9105991. ISBN 978-1-7281-1485-9. S2CID 214605906. Retrieved 2022-06-29.
Caramancion, Kevin Matthe (June 2022). "An Exploration of Mis/Disinformation in Audio Format Disseminated in Podcasts: Case Study of Spotify". 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS). pp. 1–6. doi:10.1109/IEMTRONICS55184.2022.9795760. ISBN 978-1-6654-8684-2. S2CID 249903722.
Villalba, Jesus; Lleida, Eduardo (2011). "Preventing replay attacks on speaker verification systems". 2011 Carnahan Conference on Security Technology. pp. 1–8. doi:10.1109/CCST.2011.6095943. ISBN 978-1-4577-0903-6. S2CID 17048213. Retrieved 2022-06-29.
Liu, Xiao; Zhang, Fanjin; Hou, Zhenyu; Mian, Li; Wang, Zhaoyu; Zhang, Jing; Tang, Jie (2021). "Self-supervised Learning: Generative or Contrastive". IEEE Transactions on Knowledge and Data Engineering. 35 (1): 857–876. arXiv:2006.08218. doi:10.1109/TKDE.2021.3090866. ISSN 1558-2191. S2CID 219687051.
Ki Chan, Christopher Chun; Kumar, Vimal; Delaney, Steven; Gochoo, Munkhjargal (September 2020). "Combating Deepfakes: Multi-LSTM and Blockchain as Proof of Authenticity for Digital Media". 2020 IEEE / ITU International Conference on Artificial Intelligence for Good (AI4G). pp. 55–62. doi:10.1109/AI4G50087.2020.9311067. ISBN 978-1-7281-7031-2. S2CID 231618774.
Conti, Emanuele; Salvi, Davide; Borrelli, Clara; Hosler, Brian; Bestagini, Paolo; Antonacci, Fabio; Sarti, Augusto; Stamm, Matthew C.; Tubaro, Stefano (2022-05-23). "Deepfake Speech Detection Through Emotion Recognition: A Semantic Approach". ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore, Singapore: IEEE. pp. 8962–8966. doi:10.1109/ICASSP43922.2022.9747186. hdl:11311/1220518. ISBN 978-1-6654-0540-9. S2CID 249436701.
Hosler, Brian; Salvi, Davide; Murray, Anthony; Antonacci, Fabio; Bestagini, Paolo; Tubaro, Stefano; Stamm, Matthew C. (June 2021). "Do Deepfakes Feel Emotions? A Semantic Approach to Detecting Deepfakes Via Emotional Inconsistencies". 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Nashville, TN, USA: IEEE. pp. 1013–1022. doi:10.1109/CVPRW53098.2021.00112. hdl:11311/1183572. ISBN 978-1-6654-4899-4. S2CID 235679849.
Zhang, You; Jiang, Fei; Duan, Zhiyao (2021). "One-Class Learning Towards Synthetic Voice Spoofing Detection". IEEE Signal Processing Letters. 28: 937–941. arXiv:2010.13995. Bibcode:2021ISPL...28..937Z. doi:10.1109/LSP.2021.3076358. ISSN 1558-2361. S2CID 235077416.

isca-speech.org

Chen, Tianxiang; Kumar, Avrosh; Nagarsheth, Parav; Sivaraman, Ganesh; Khoury, Elie (2020-11-01). "Generalization of Audio Deepfake Detection". The Speaker and Language Recognition Workshop (Odyssey 2020). ISCA: 132–137. doi:10.21437/Odyssey.2020-19. S2CID 219492826.
Tom, Francis; Jain, Mohit; Dey, Prasenjit (2018-09-02). "End-To-End Audio Replay Attack Detection Using Deep Convolutional Networks with Attention". Interspeech 2018. ISCA: 681–685. doi:10.21437/Interspeech.2018-2279. S2CID 52187155.

koreascience.or.kr

Rashid, Md Mamunur; Lee, Suk-Hwan; Kwon, Ki-Ryong (2021). "Blockchain Technology for Combating Deepfake and Protect Video/Image Integrity". Journal of Korea Multimedia Society. 24 (8): 1044–1058. doi:10.9717/kmms.2021.24.8.1044. ISSN 1229-7771.

lawandcrime.com

"Political consultant accused of hiring magician to spam voters with Biden deepfake calls". Law & Crime. 2024-03-15. Retrieved 2024-05-23.

lbc.co.uk

"Leading voiceover artist 'violated' by ScotRail AI announcements using her voice without 'permission'". LBC. Retrieved 2025-05-28.

mcafee.com

Bunn, Amy (15 May 2023). "Artificial Imposters—Cybercriminals Turn to AI Voice Cloning for a New Breed of Scam". McAfee Blog. Retrieved 16 June 2023.

msn.com

"MSN". www.msn.com. Retrieved 2025-05-28.

nips.cc

papers.nips.cc

Arık, Sercan Ö.; Chen, Jitong; Peng, Kainan; Ping, Wei; Zhou, Yanqi (2018). "Neural Voice Cloning with a Few Samples". Advances in Neural Information Processing Systems (NeurIPS 2018). 31 (published 12 October 2018): 10040–10050. arXiv:1802.06006.

npr.org

"A political consultant faces charges and fines for Biden deepfake robocalls".

openai.com

"Navigating the Challenges and Opportunities of Synthetic Voices". OpenAI. March 9, 2024. Archived from the original on November 25, 2024. Retrieved December 18, 2024.

people.com

Etienne, Vanessa (August 19, 2021). "Val Kilmer Gets His Voice Back After Throat Cancer Battle Using AI Technology: Hear the Results". PEOPLE.com. Retrieved 2022-07-01.

sagepub.com

journals.sagepub.com

Diakopoulos, Nicholas; Johnson, Deborah (June 2020). "Anticipating and addressing the ethical implications of deepfakes in the context of elections". New Media & Society. 23 (7) (published 2020-06-05): 2072–2098. doi:10.1177/1461444820925811. ISSN 1461-4448. S2CID 226196422.

sam.gov

"SAM.gov". sam.gov. Retrieved 2022-06-29.

sciencedirect.com

Ballesteros, Dora M.; Rodriguez-Ortega, Yohanna; Renza, Diego; Arce, Gonzalo (2021-12-01). "Deep4SNet: deep learning for fake speech classification". Expert Systems with Applications. 184: 115465. doi:10.1016/j.eswa.2021.115465. ISSN 0957-4174. S2CID 237659479.

semanticscholar.org

api.semanticscholar.org

Lyu, Siwei (2020). "Deepfake Detection: Current Challenges and Next Steps". 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW). pp. 1–6. arXiv:2003.09234. doi:10.1109/icmew46912.2020.9105991. ISBN 978-1-7281-1485-9. S2CID 214605906. Retrieved 2022-06-29.
Diakopoulos, Nicholas; Johnson, Deborah (June 2020). "Anticipating and addressing the ethical implications of deepfakes in the context of elections". New Media & Society. 23 (7) (published 2020-06-05): 2072–2098. doi:10.1177/1461444820925811. ISSN 1461-4448. S2CID 226196422.
Chadha, Anupama; Kumar, Vaibhav; Kashyap, Sonu; Gupta, Mayank (2021), Singh, Pradeep Kumar; Wierzchoń, Sławomir T.; Tanwar, Sudeep; Ganzha, Maria (eds.), "Deepfake: An Overview", Proceedings of Second International Conference on Computing, Communications, and Cyber-Security, Lecture Notes in Networks and Systems, vol. 203, Singapore: Springer Singapore, pp. 557–566, doi:10.1007/978-981-16-0733-2_39, ISBN 978-981-16-0732-5, S2CID 236666289, retrieved 2022-06-29
Caramancion, Kevin Matthe (June 2022). "An Exploration of Mis/Disinformation in Audio Format Disseminated in Podcasts: Case Study of Spotify". 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS). pp. 1–6. doi:10.1109/IEMTRONICS55184.2022.9795760. ISBN 978-1-6654-8684-2. S2CID 249903722.
Chen, Tianxiang; Kumar, Avrosh; Nagarsheth, Parav; Sivaraman, Ganesh; Khoury, Elie (2020-11-01). "Generalization of Audio Deepfake Detection". The Speaker and Language Recognition Workshop (Odyssey 2020). ISCA: 132–137. doi:10.21437/Odyssey.2020-19. S2CID 219492826.
Ballesteros, Dora M.; Rodriguez-Ortega, Yohanna; Renza, Diego; Arce, Gonzalo (2021-12-01). "Deep4SNet: deep learning for fake speech classification". Expert Systems with Applications. 184: 115465. doi:10.1016/j.eswa.2021.115465. ISSN 0957-4174. S2CID 237659479.
Suwajanakorn, Supasorn; Seitz, Steven M.; Kemelmacher-Shlizerman, Ira (2017-07-20). "Synthesizing Obama: learning lip sync from audio". ACM Transactions on Graphics. 36 (4): 95:1–95:13. doi:10.1145/3072959.3073640. ISSN 0730-0301. S2CID 207586187.
Pradhan, Swadhin; Sun, Wei; Baig, Ghufran; Qiu, Lili (2019-09-09). "Combating Replay Attacks Against Voice Assistants". Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies. 3 (3): 100:1–100:26. doi:10.1145/3351258. S2CID 202159551.
Villalba, Jesus; Lleida, Eduardo (2011). "Preventing replay attacks on speaker verification systems". 2011 Carnahan Conference on Security Technology. pp. 1–8. doi:10.1109/CCST.2011.6095943. ISBN 978-1-4577-0903-6. S2CID 17048213. Retrieved 2022-06-29.
Tom, Francis; Jain, Mohit; Dey, Prasenjit (2018-09-02). "End-To-End Audio Replay Attack Detection Using Deep Convolutional Networks with Attention". Interspeech 2018. ISCA: 681–685. doi:10.21437/Interspeech.2018-2279. S2CID 52187155.
Rodríguez-Ortega, Yohanna; Ballesteros, Dora María; Renza, Diego (2020). "A Machine Learning Model to Detect Fake Voice". In Florez, Hector; Misra, Sanjay (eds.). Applied Informatics. Communications in Computer and Information Science. Vol. 1277. Cham: Springer International Publishing. pp. 3–13. doi:10.1007/978-3-030-61702-8_1. ISBN 978-3-030-61702-8. S2CID 226283369.
Najafian, Maryam; Russell, Martin (September 2020). "Automatic accent identification as an analytical tool for accent robust automatic speech recognition". Speech Communication. 122: 44–55. doi:10.1016/j.specom.2020.05.003. S2CID 225778214.
Liu, Xiao; Zhang, Fanjin; Hou, Zhenyu; Mian, Li; Wang, Zhaoyu; Zhang, Jing; Tang, Jie (2021). "Self-supervised Learning: Generative or Contrastive". IEEE Transactions on Knowledge and Data Engineering. 35 (1): 857–876. arXiv:2006.08218. doi:10.1109/TKDE.2021.3090866. ISSN 1558-2191. S2CID 219687051.
Ki Chan, Christopher Chun; Kumar, Vimal; Delaney, Steven; Gochoo, Munkhjargal (September 2020). "Combating Deepfakes: Multi-LSTM and Blockchain as Proof of Authenticity for Digital Media". 2020 IEEE / ITU International Conference on Artificial Intelligence for Good (AI4G). pp. 55–62. doi:10.1109/AI4G50087.2020.9311067. ISBN 978-1-7281-7031-2. S2CID 231618774.
Mittal, Trisha; Bhattacharya, Uttaran; Chandra, Rohan; Bera, Aniket; Manocha, Dinesh (2020-10-12), "Emotions Don't Lie: An Audio-Visual Deepfake Detection Method using Affective Cues", Proceedings of the 28th ACM International Conference on Multimedia, New York, NY, USA: Association for Computing Machinery, pp. 2823–2832, doi:10.1145/3394171.3413570, ISBN 978-1-4503-7988-5, S2CID 220935571, retrieved 2022-06-29
Conti, Emanuele; Salvi, Davide; Borrelli, Clara; Hosler, Brian; Bestagini, Paolo; Antonacci, Fabio; Sarti, Augusto; Stamm, Matthew C.; Tubaro, Stefano (2022-05-23). "Deepfake Speech Detection Through Emotion Recognition: A Semantic Approach". ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore, Singapore: IEEE. pp. 8962–8966. doi:10.1109/ICASSP43922.2022.9747186. hdl:11311/1220518. ISBN 978-1-6654-0540-9. S2CID 249436701.
Hosler, Brian; Salvi, Davide; Murray, Anthony; Antonacci, Fabio; Bestagini, Paolo; Tubaro, Stefano; Stamm, Matthew C. (June 2021). "Do Deepfakes Feel Emotions? A Semantic Approach to Detecting Deepfakes Via Emotional Inconsistencies". 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Nashville, TN, USA: IEEE. pp. 1013–1022. doi:10.1109/CVPRW53098.2021.00112. hdl:11311/1183572. ISBN 978-1-6654-4899-4. S2CID 235679849.
Zhang, You; Jiang, Fei; Duan, Zhiyao (2021). "One-Class Learning Towards Synthetic Voice Spoofing Detection". IEEE Signal Processing Letters. 28: 937–941. arXiv:2010.13995. Bibcode:2021ISPL...28..937Z. doi:10.1109/LSP.2021.3076358. ISSN 1558-2361. S2CID 235077416.

signalprocessingsociety.org

"Audio Deepfake Detection: ICASSP 2022". IEEE Signal Processing Society. 2021-12-17. Retrieved 2022-07-01.

sites.google.com

"PREMIER". sites.google.com. Retrieved 2022-07-01.
"PREMIER - Project". sites.google.com. Retrieved 2022-06-29.

sky.com

news.sky.com

"Deepfake audio of Sir Keir Starmer released on first day of Labour conference".
"Voiceover artist Gayanne Potter urging ScotRail to remove her voice from new AI announcements". Sky News. Retrieved 2025-05-28.

springer.com

link.springer.com

Chadha, Anupama; Kumar, Vaibhav; Kashyap, Sonu; Gupta, Mayank (2021), Singh, Pradeep Kumar; Wierzchoń, Sławomir T.; Tanwar, Sudeep; Ganzha, Maria (eds.), "Deepfake: An Overview", Proceedings of Second International Conference on Computing, Communications, and Cyber-Security, Lecture Notes in Networks and Systems, vol. 203, Singapore: Springer Singapore, pp. 557–566, doi:10.1007/978-981-16-0733-2_39, ISBN 978-981-16-0732-5, S2CID 236666289, retrieved 2022-06-29
Rodríguez-Ortega, Yohanna; Ballesteros, Dora María; Renza, Diego (2020). "A Machine Learning Model to Detect Fake Voice". In Florez, Hector; Misra, Sanjay (eds.). Applied Informatics. Communications in Computer and Information Science. Vol. 1277. Cham: Springer International Publishing. pp. 3–13. doi:10.1007/978-3-030-61702-8_1. ISBN 978-3-030-61702-8. S2CID 226283369.

synsig.org

"Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 - SynSIG". www.synsig.org. Archived from the original on 2022-07-02. Retrieved 2022-07-01.

theguardian.com

Evershed, Nick; Taylor, Josh (16 March 2023). "AI can fool voice recognition used to verify identity by Centrelink and Australian tax office". The Guardian. Retrieved 16 June 2023.

thetimes.com

Leask, David; English, Paul (2025-05-27). "Actress feels 'cheated' by ScotRail's new AI voice announcer". www.thetimes.com. Retrieved 2025-05-28.

towardsdatascience.com

Chandraseta, Rionaldi (January 21, 2021). "Generate Your Favourite Characters' Voice Lines using Machine Learning". Towards Data Science. Archived from the original on January 21, 2021. Retrieved December 18, 2024.

vice.com

Cox, Joseph (23 February 2023). "How I Broke Into a Bank Account With an AI-Generated Voice". Vice. Retrieved 16 June 2023.

washingtonpost.com

"AI gave Val Kilmer his voice back. But critics worry the technology could be misused". Washington Post. ISSN 0190-8286. Retrieved 2022-06-29.

web.archive.org

Ng, Andrew (April 1, 2020). "Voice Cloning for the Masses". DeepLearning.AI. Archived from the original on December 28, 2024. Retrieved December 22, 2024.
Chandraseta, Rionaldi (January 21, 2021). "Generate Your Favourite Characters' Voice Lines using Machine Learning". Towards Data Science. Archived from the original on January 21, 2021. Retrieved December 18, 2024.
"Audio samples from "Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis"". 2018-08-30. Archived from the original on 2020-11-11. Retrieved 2022-06-05.
Temitope, Yusuf (December 10, 2024). "15.ai Creator reveals journey from MIT Project to internet phenomenon". The Guardian. Archived from the original on December 28, 2024. Retrieved December 25, 2024.
Kurosawa, Yuki (January 19, 2021). "ゲームキャラ音声読み上げソフト「15.ai」公開中。『Undertale』や『Portal』のキャラに好きなセリフを言ってもらえる" [Game Character Voice Reading Software "15.ai" Now Available. Get Characters from Undertale and Portal to Say Your Desired Lines]. AUTOMATON (in Japanese). Archived from the original on January 19, 2021. Retrieved December 18, 2024.
"Navigating the Challenges and Opportunities of Synthetic Voices". OpenAI. March 9, 2024. Archived from the original on November 25, 2024. Retrieved December 18, 2024.
"Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020 - SynSIG". www.synsig.org. Archived from the original on 2022-07-02. Retrieved 2022-07-01.

wired.com

Meaker, Morgan. "Slovakia's Election Deepfakes Show AI is a Danger to Democracy". Wired.

worldcat.org

search.worldcat.org

Smith, Hannah; Mansted, Katherine (April 1, 2020). Weaponised deep fakes: National security and democracy. Vol. 28. Australian Strategic Policy Institute. pp. 11–13. ISSN 2209-9689.
Diakopoulos, Nicholas; Johnson, Deborah (June 2020). "Anticipating and addressing the ethical implications of deepfakes in the context of elections". New Media & Society. 23 (7) (published 2020-06-05): 2072–2098. doi:10.1177/1461444820925811. ISSN 1461-4448. S2CID 226196422.
"AI gave Val Kilmer his voice back. But critics worry the technology could be misused". Washington Post. ISSN 0190-8286. Retrieved 2022-06-29.
Almutairi, Zaynab; Elgibreen, Hebah (2022-05-04). "A Review of Modern Audio Deepfake Detection Methods: Challenges and Future Directions". Algorithms. 15 (5): 155. doi:10.3390/a15050155. ISSN 1999-4893.
Ballesteros, Dora M.; Rodriguez-Ortega, Yohanna; Renza, Diego; Arce, Gonzalo (2021-12-01). "Deep4SNet: deep learning for fake speech classification". Expert Systems with Applications. 184: 115465. doi:10.1016/j.eswa.2021.115465. ISSN 0957-4174. S2CID 237659479.
Suwajanakorn, Supasorn; Seitz, Steven M.; Kemelmacher-Shlizerman, Ira (2017-07-20). "Synthesizing Obama: learning lip sync from audio". ACM Transactions on Graphics. 36 (4): 95:1–95:13. doi:10.1145/3072959.3073640. ISSN 0730-0301. S2CID 207586187.
Ning, Yishuang; He, Sheng; Wu, Zhiyong; Xing, Chunxiao; Zhang, Liang-Jie (January 2019). "A Review of Deep Learning Based Speech Synthesis". Applied Sciences. 9 (19): 4050. doi:10.3390/app9194050. ISSN 2076-3417.
Liu, Xiao; Zhang, Fanjin; Hou, Zhenyu; Mian, Li; Wang, Zhaoyu; Zhang, Jing; Tang, Jie (2021). "Self-supervised Learning: Generative or Contrastive". IEEE Transactions on Knowledge and Data Engineering. 35 (1): 857–876. arXiv:2006.08218. doi:10.1109/TKDE.2021.3090866. ISSN 1558-2191. S2CID 219687051.
Rashid, Md Mamunur; Lee, Suk-Hwan; Kwon, Ki-Ryong (2021). "Blockchain Technology for Combating Deepfake and Protect Video/Image Integrity". Journal of Korea Multimedia Society. 24 (8): 1044–1058. doi:10.9717/kmms.2021.24.8.1044. ISSN 1229-7771.
Zhang, You; Jiang, Fei; Duan, Zhiyao (2021). "One-Class Learning Towards Synthetic Voice Spoofing Detection". IEEE Signal Processing Letters. 28: 937–941. arXiv:2010.13995. Bibcode:2021ISPL...28..937Z. doi:10.1109/LSP.2021.3076358. ISSN 1558-2361. S2CID 235077416.

wsj.com

Stupp, Catherine. "Fraudsters Used AI to Mimic CEO's Voice in Unusual Cybercrime Case". WSJ. Retrieved 2024-05-26.

yahoo.com

uk.news.yahoo.com

"Actress claims ScotRail AI use her voice 'like something out of Black Mirror'". Yahoo News. 2025-05-28. Retrieved 2025-05-28.