Paaß, Gerhard; Giesselbach, Sven (2023-02-16). "Foundation Models for Speech, Images, Videos, and Control". Foundation Models for Natural Language Processing. Artificial Intelligence: Foundations, Theory, and Algorithms. pp. 313–382. arXiv:2302.08575. doi:10.1007/978-3-031-23190-2_7. ISBN 978-3-031-23189-6. S2CID 257019816.
Koenecke, Allison; Choi, Anna Seo Gyeong; Mei, Katelyn X.; Schellmann, Hilke; Sloane, Mona (2024-06-03). "Careless Whisper: Speech-to-Text Hallucination Harms". The 2024 ACM Conference on Fairness, Accountability, and Transparency. New York, NY, USA: ACM. pp. 1672–1681. arXiv:2402.08021. doi:10.1145/3630106.3658996. ISBN 979-8-4007-0450-5.
Gong, Yuan; Khurana, Sameer; Karlinsky, Leonid; Glass, James (2023). "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers". Interspeech 2023. pp. 2798–2802. arXiv:2307.03183. doi:10.21437/Interspeech.2023-2193.