Трансформер (архітектура глибокого навчання) (Ukrainian Wikipedia)

Analysis of the information sources cited in the references of the Ukrainian-language version of the Wikipedia article "Трансформер (архітектура глибокого навчання)".

[Table of referenced websites with their popularity rankings (columns: refs, Website, Global rank, Ukrainian rank)]

aclweb.org

acm.org

dl.acm.org

archive.org

arxiv.org

  • Bahdanau, Dzmitry; Cho, Kyunghyun; Bengio, Yoshua (1 September 2014). Neural Machine Translation by Jointly Learning to Align and Translate (in English). arXiv:1409.0473 [cs.CL].
  • Luong, Minh-Thang; Pham, Hieu; Manning, Christopher D. (17 August 2015). Effective Approaches to Attention-based Neural Machine Translation (in English). arXiv:1508.04025 [cs.CL].
  • Radford, Alec; Kim, Jong Wook; Xu, Tao; Brockman, Greg; McLeavey, Christine; Sutskever, Ilya (2022). Robust Speech Recognition via Large-Scale Weak Supervision (in English). arXiv:2212.04356 [eess.AS].
  • Choromanski, Krzysztof; Likhosherstov, Valerii; Dohan, David; Song, Xingyou; Gane, Andreea; Sarlos, Tamas; Hawkins, Peter; Davis, Jared; Mohiuddin, Afroz; Kaiser, Lukasz; Belanger, David; Colwell, Lucy; Weller, Adrian (2020). Rethinking Attention with Performers (in English). arXiv:2009.14794 [cs.CL].
  • Sutskever, Ilya; Vinyals, Oriol; Le, Quoc V (2014). Sequence to Sequence Learning with Neural Networks. Advances in Neural Information Processing Systems. Curran Associates, Inc. 27. arXiv:1409.3215.
  • Cho, Kyunghyun; van Merrienboer, Bart; Bahdanau, Dzmitry; Bengio, Yoshua (2014). On the Properties of Neural Machine Translation: Encoder–Decoder Approaches. Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (in English). Stroudsburg, PA, USA: Association for Computational Linguistics: 103–111. arXiv:1409.1259. doi:10.3115/v1/w14-4012. S2CID 11336213.
  • Chung, Junyoung; Gulcehre, Caglar; Cho, KyungHyun; Bengio, Yoshua (2014). Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling (in English). arXiv:1412.3555 [cs.NE].
  • Bahdanau, Dzmitry; Cho, Kyunghyun; Bengio, Yoshua (1 September 2014). Neural Machine Translation by Jointly Learning to Align and Translate (in English). arXiv:1409.0473 [cs.CL].
  • Wu, Yonghui et al. (1 September 2016). Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation (in English). arXiv:1609.08144 [cs.CL].
  • Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (11 October 2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (in English). arXiv:1810.04805v2 [cs.CL].
  • Dosovitskiy, Alexey; Beyer, Lucas; Kolesnikov, Alexander; Weissenborn, Dirk; Zhai, Xiaohua; Unterthiner, Thomas; Dehghani, Mostafa; Minderer, Matthias; Heigold, Georg; Gelly, Sylvain; Uszkoreit, Jakob (3 June 2021). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (in English). arXiv:2010.11929 [cs.CV].
  • Gulati, Anmol; Qin, James; Chiu, Chung-Cheng; Parmar, Niki; Zhang, Yu; Yu, Jiahui; Han, Wei; Wang, Shibo; Zhang, Zhengdong; Wu, Yonghui; Pang, Ruoming (2020). Conformer: Convolution-augmented Transformer for Speech Recognition (in English). arXiv:2005.08100 [eess.AS].
  • Xiong, Ruibin; Yang, Yunchang; He, Di; Zheng, Kai; Zheng, Shuxin; Xing, Chen; Zhang, Huishuai; Lan, Yanyan; Wang, Liwei; Liu, Tie-Yan (29 June 2020). On Layer Normalization in the Transformer Architecture (in English). arXiv:2002.04745 [cs.LG].
  • Raffel, Colin; Shazeer, Noam; Roberts, Adam; Lee, Katherine; Narang, Sharan; Matena, Michael; Zhou, Yanqi; Li, Wei; Liu, Peter J. (1 January 2020). Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research (in English). 21 (1): 140:5485–140:5551. arXiv:1910.10683. ISSN 1532-4435.
  • Clark, Kevin; Khandelwal, Urvashi; Levy, Omer; Manning, Christopher D. (August 2019). What Does BERT Look at? An Analysis of BERT's Attention. Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP (in English). Florence, Italy: Association for Computational Linguistics: 276–286. arXiv:1906.04341. doi:10.18653/v1/W19-4828. Archived from the original on 21 October 2020. Retrieved 20 May 2020.
  • Shazeer, Noam (1 February 2020). GLU Variants Improve Transformer (in English). arXiv:2002.05202 [cs.LG].
  • Dufter, Philipp; Schmitt, Martin; Schütze, Hinrich (6 June 2022). Position Information in Transformers: An Overview. Computational Linguistics (in English). 48 (3): 733–763. arXiv:2102.11090. doi:10.1162/coli_a_00445. ISSN 0891-2017. S2CID 231986066.
  • Su, Jianlin; Lu, Yu; Pan, Shengfeng; Murtadha, Ahmed; Wen, Bo; Liu, Yunfeng (1 April 2021). RoFormer: Enhanced Transformer with Rotary Position Embedding (in English). arXiv:2104.09864 [cs.CL].
  • Press, Ofir; Smith, Noah A.; Lewis, Mike (1 August 2021). Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation (in English). arXiv:2108.12409 [cs.CL].
  • Shaw, Peter; Uszkoreit, Jakob; Vaswani, Ashish (2018). Self-Attention with Relative Position Representations (in English). arXiv:1803.02155 [cs.CL].
  • Dao, Tri; Fu, Dan; Ermon, Stefano; Rudra, Atri; Ré, Christopher (6 December 2022). FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. Advances in Neural Information Processing Systems (in English). 35: 16344–16359. arXiv:2205.14135.
  • Chowdhery, Aakanksha; Narang, Sharan; Devlin, Jacob; Bosma, Maarten; Mishra, Gaurav; Roberts, Adam; Barham, Paul; Chung, Hyung Won; Sutton, Charles; Gehrmann, Sebastian; Schuh, Parker; Shi, Kensen; Tsvyashchenko, Sasha; Maynez, Joshua; Rao, Abhishek (1 April 2022). PaLM: Scaling Language Modeling with Pathways (in English). arXiv:2204.02311 [cs.CL].
  • Leviathan, Yaniv; Kalman, Matan; Matias, Yossi (18 May 2023). Fast Inference from Transformers via Speculative Decoding (in English). arXiv:2211.17192.
  • Chen, Charlie; Borgeaud, Sebastian; Irving, Geoffrey; Lespiau, Jean-Baptiste; Sifre, Laurent; Jumper, John (2 February 2023). Accelerating Large Language Model Decoding with Speculative Sampling (in English). arXiv:2302.01318.
  • Kitaev, Nikita; Kaiser, Łukasz; Levskaya, Anselm (2020). Reformer: The Efficient Transformer (in English). arXiv:2001.04451 [cs.LG].
  • Zhai, Shuangfei; Talbott, Walter; Srivastava, Nitish; Huang, Chen; Goh, Hanlin; Zhang, Ruixiang; Susskind, Josh (21 September 2021). An Attention Free Transformer (in English). arXiv:2105.14103 [cs.LG].
  • Tay, Yi; Dehghani, Mostafa; Abnar, Samira; Shen, Yikang; Bahri, Dara; Pham, Philip; Rao, Jinfeng; Yang, Liu; Ruder, Sebastian; Metzler, Donald (8 November 2020). Long Range Arena: A Benchmark for Efficient Transformers (in English). arXiv:2011.04006 [cs.LG].
  • Peng, Hao; Pappas, Nikolaos; Yogatama, Dani; Schwartz, Roy; Smith, Noah A.; Kong, Lingpeng (19 March 2021). Random Feature Attention (in English). arXiv:2103.02143 [cs.CL].
  • Choromanski, Krzysztof; Likhosherstov, Valerii; Dohan, David; Song, Xingyou; Gane, Andreea; Sarlos, Tamas; Hawkins, Peter; Davis, Jared; Belanger, David; Colwell, Lucy; Weller, Adrian (30 September 2020). Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers (in English). arXiv:2006.03555 [cs.LG].
  • Radford, Alec; Kim, Jong Wook; Xu, Tao; Brockman, Greg; McLeavey, Christine; Sutskever, Ilya (2022). Robust Speech Recognition via Large-Scale Weak Supervision (in English). arXiv:2212.04356 [eess.AS].
  • Jaegle, Andrew; Gimeno, Felix; Brock, Andrew; Zisserman, Andrew; Vinyals, Oriol; Carreira, Joao (22 June 2021). Perceiver: General Perception with Iterative Attention (in English). arXiv:2103.03206 [cs.CV].
  • Jaegle, Andrew; Borgeaud, Sebastian; Alayrac, Jean-Baptiste; Doersch, Carl; Ionescu, Catalin; Ding, David; Koppula, Skanda; Zoran, Daniel; Brock, Andrew; Shelhamer, Evan; Hénaff, Olivier (2 August 2021). Perceiver IO: A General Architecture for Structured Inputs & Outputs (in English). arXiv:2107.14795 [cs.LG].
  • Peebles, William; Xie, Saining (2 March 2023). Scalable Diffusion Models with Transformers (in English). arXiv:2212.09748 [cs.CV].

coursera.org

doi.org

doi.org

dx.doi.org

github.com

googleblog.com

ai.googleblog.com

harvard.edu

ui.adsabs.harvard.edu

huggingface.co

idsia.ch

people.idsia.ch

indico.io

infoq.com

jalammar.github.io

marktechpost.com

metaphysic.ai

blog.metaphysic.ai

neurips.cc

proceedings.neurips.cc

  • Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N; Kaiser, Łukasz; Polosukhin, Illia (2017). Attention is All you Need (PDF). Advances in Neural Information Processing Systems (in English). Curran Associates, Inc. 30.
  • Sutskever, Ilya; Vinyals, Oriol; Le, Quoc V (2014). Sequence to Sequence Learning with Neural Networks. Advances in Neural Information Processing Systems. Curran Associates, Inc. 27. arXiv:1409.3215.
  • Dao, Tri; Fu, Dan; Ermon, Stefano; Rudra, Atri; Ré, Christopher (6 December 2022). FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. Advances in Neural Information Processing Systems (in English). 35: 16344–16359. arXiv:2205.14135.

nih.gov

pubmed.ncbi.nlm.nih.gov

ncbi.nlm.nih.gov

notion.site

yaofu.notion.site

nytimes.com

openai.com

paperswithcode.com

princeton-nlp.github.io

scholar.google.com

  • Google Scholar. scholar.google.com (in English). Retrieved 13 August 2023.

semanticscholar.org

api.semanticscholar.org

stanford.edu

crfm.stanford.edu

  • Stanford CRFM. crfm.stanford.edu (in English). Retrieved 18 July 2023.

together.ai

towardsdatascience.com

  • He, Cheng (31 December 2021). Transformer in CV. Towards Data Science (in English). Archived from the original on 16 April 2023. Retrieved 19 June 2021.

twitter.com

web.archive.org

wiley.com

doi.wiley.com

worldcat.org

search.worldcat.org