DEVLIN, Jacob; CHANG, Ming-Wei; LEE, Kenton; TOUTANOVA, Kristina. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs]. 2019-05-24. ArXiv: 1810.04805, version 2. Available online [cited 2022-10-27].
ROGERS, Anna; KOVALEVA, Olga; RUMSHISKY, Anna. A Primer in BERTology: What we know about how BERT works. arXiv:2002.12327 [cs]. 2020-11-09. ArXiv: 2002.12327. Available online [cited 2022-10-27].
ZHU, Yukun; KIROS, Ryan; ZEMEL, Richard, et al. Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books. arXiv:1506.06724 [cs]. 2015-06-22. ArXiv: 1506.06724. Available online [cited 2022-10-27].
MOTTESI, Celeste. GPT-3 vs. BERT: Comparing the Two Most Popular Language Models. blog.invgate.com [online]. [cited 2023-08-19]. Available online. (in English)