Kovaleva, Olga; Romanov, Alexey; Rogers, Anna; Rumshisky, Anna (November 2019). Revealing the Dark Secrets of BERT. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). pp. 4364–4373. doi:10.18653/v1/D19-1445. S2CID 201645145. Archived from the original on 20 October 2020. Retrieved 28 October 2020.
Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (11 October 2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805v2 [cs.CL].
Zhu, Yukun; Kiros, Ryan; Zemel, Rich; Salakhutdinov, Ruslan; Urtasun, Raquel; Torralba, Antonio; Fidler, Sanja (2015). Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books. pp. 19–27. arXiv:1506.06724 [cs.CV].
Khandelwal, Urvashi; He, He; Qi, Peng; Jurafsky, Dan (2018). Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA, USA: Association for Computational Linguistics: 284–294. arXiv:1805.04623. Bibcode:2018arXiv180504623K. doi:10.18653/v1/p18-1027. S2CID 21700944.
Gulordava, Kristina; Bojanowski, Piotr; Grave, Edouard; Linzen, Tal; Baroni, Marco (2018). Colorless Green Recurrent Networks Dream Hierarchically. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Stroudsburg, PA, USA: Association for Computational Linguistics: 1195–1205. arXiv:1803.11138. Bibcode:2018arXiv180311138G. doi:10.18653/v1/n18-1108. S2CID 4460159.
Giulianelli, Mario; Harding, Jack; Mohnert, Florian; Hupkes, Dieuwke; Zuidema, Willem (2018). Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information. Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Stroudsburg, PA, USA: Association for Computational Linguistics: 240–248. arXiv:1808.08079. Bibcode:2018arXiv180808079G. doi:10.18653/v1/w18-5426. S2CID 52090220.
Dai, Andrew; Le, Quoc (4 November 2015). Semi-supervised Sequence Learning. arXiv:1511.01432 [cs.LG].
Peters, Matthew; Neumann, Mark; Iyyer, Mohit; Gardner, Matt; Clark, Christopher; Lee, Kenton; Zettlemoyer, Luke (15 February 2018). Deep contextualized word representations. arXiv:1802.05365v2 [cs.CL].
Howard, Jeremy; Ruder, Sebastian (18 January 2018). Universal Language Model Fine-tuning for Text Classification. arXiv:1801.06146v5 [cs.CL].
Clark, Kevin; Khandelwal, Urvashi; Levy, Omer; Manning, Christopher D. (2019). What Does BERT Look at? An Analysis of BERT's Attention. Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Stroudsburg, PA, USA: Association for Computational Linguistics: 276–286. doi:10.18653/v1/w19-4828.
Best Paper Awards. NAACL. 2019. Archived from the original on 19 October 2020. Retrieved 28 March 2020.
Montti, Roger (10 December 2019). Google's BERT Rolls Out Worldwide. Search Engine Journal. Archived from the original on 29 November 2020. Retrieved 10 December 2019.