Large language models (Hindi Wikipedia)

Pilehvar, Mohammad Taher; Camacho-Collados, Jose (June 2019). "WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations". Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics: 1267–1273. S2CID 102353817. doi:10.18653/v1/N19-1128.
Black, Sidney; Biderman, Stella; Hallahan, Eric (2022-05-01). "GPT-NeoX-20B: An Open-Source Autoregressive Language Model". Proceedings of BigScience Episode #5 -- Workshop on Challenges & Perspectives in Creating Large Language Models. pp. 95–136. https://aclanthology.org/2022.bigscience-1.9/. Retrieved 2022-12-19.

acm.org

dl.acm.org

Bender, Emily M.; Gebru, Timnit; McMillan-Major, Angelina; Mitchell, Margaret (2021-03-01). "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜". FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. Virtual Event, Canada: ACM. pp. 610–623. doi:10.1145/3442188.3445922.
Ji, Ziwei; Lee, Nayeon; Frieske, Rita; Yu, Tiezheng; Su, Dan; Xu, Yan; Ishii, Etsuko; Bang, Yejin; Dai, Wenliang; Madotto, Andrea; Fung, Pascale (November 2022). "Survey of Hallucination in Natural Language Generation" (pdf). ACM Computing Surveys. Association for Computing Machinery. 55 (12): 1–38. arXiv:2202.03629. doi:10.1145/3571730. S2CID 246652372. Retrieved 15 January 2023.

amacad.org

Manning, Christopher D. (2022). "Human Language Understanding & Reasoning". Daedalus. 151 (2): 127–138. doi:10.1162/daed_a_01905. S2CID 248377870.

amazon.com

aws.amazon.com

"AlexaTM 20B is now available in Amazon SageMaker JumpStart | AWS Machine Learning Blog". aws.amazon.com. 17 November 2022. Retrieved 13 March 2023.

amazon.science

"20B-parameter Alexa model sets new marks in few-shot learning". Amazon Science. 2 August 2022.

analyticsindiamag.com

Goled, Shraddha (May 7, 2021). "Self-Supervised Learning Vs Semi-Supervised Learning: How They Differ". Analytics India Magazine.
Naik, Amit Raja (September 23, 2021). "Google Introduces New Architecture To Reduce Cost Of Transformers". Analytics India Magazine.

anthropic.com

"Product". Anthropic. Retrieved 14 March 2023.

archive.today

"OpenAI API". platform.openai.com. Archived from the original on 20 June 2023. Retrieved 2023-06-20.

arxiv.org

Bowman, Samuel R. (2023). "Eight Things to Know about Large Language Models". arXiv:2304.00612 [cs.CL].
Sutskever, Ilya; Vinyals, Oriol; Le, Quoc V (2014). "Sequence to Sequence Learning with Neural Networks". Advances in Neural Information Processing Systems. Curran Associates, Inc. 27. arXiv:1409.3215.
Bahdanau, Dzmitry; Cho, Kyunghyun; Bengio, Yoshua (2014-09-01). "Neural Machine Translation by Jointly Learning to Align and Translate". arXiv:1409.0473 [cs.CL].
Wu, Yonghui; Schuster, Mike; et al. (2016-09-01). "Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation". arXiv:1609.08144 [cs.CL].
Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (11 October 2018). "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". arXiv:1810.04805v2 [cs.CL].
Sanh, Victor; Debut, Lysandre; Chaumond, Julien; Wolf, Thomas (2019-10-02). "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter?". arXiv:1910.01108 [cs.CL].
Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018). "Deep contextualized word representations". arXiv:1802.05365 [cs.CL].
Kaplan, Jared; McCandlish, Sam; Henighan, Tom; et al. (2020). "Scaling Laws for Neural Language Models". arXiv:2001.08361 [cs.LG].
Hoffmann, Jordan; Borgeaud, Sebastian; Mensch, Arthur; et al. (2022). "Training Compute-Optimal Large Language Models". arXiv:2203.15556 [cs.CL].
Chowdhery, Aakanksha; Narang, Sharan; Devlin, Jacob; et al. (2022). "PaLM: Scaling Language Modeling with Pathways". arXiv:2204.02311 [cs.CL].
Schaeffer, Rylan; Miranda, Brando; Koyejo, Sanmi (2023-04-01). "Are Emergent Abilities of Large Language Models a Mirage?". arXiv:2304.15004 [cs.AI].
Shazeer, Noam; Mirhoseini, Azalia; Maziarz, Krzysztof; Davis, Andy; Le, Quoc; Hinton, Geoffrey; Dean, Jeff (2017-01-01). "Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer". arXiv:1701.06538 [cs.LG].
Lepikhin, Dmitry; Lee, HyoukJoong; Xu, Yuanzhong; Chen, Dehao; Firat, Orhan; Huang, Yanping; Krikun, Maxim; Shazeer, Noam; Chen, Zhifeng (2021-01-12). "GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding". arXiv:2006.16668 [cs.CL].
Zaib, Munazza; Sheng, Quan Z.; Emma Zhang, Wei (4 February 2020). "A Short Survey of Pre-trained Language Models for Conversational AI-A New Age in NLP". Proceedings of the Australasian Computer Science Week Multiconference: 1–4. arXiv:2104.10810. doi:10.1145/3373017.3373028. ISBN 9781450376976. S2CID 211040895.
Zhu, Yukun; Kiros, Ryan; Zemel, Rich; Salakhutdinov, Ruslan; Urtasun, Raquel; Torralba, Antonio; Fidler, Sanja (December 2015). "Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books" (PDF). 2015 IEEE International Conference on Computer Vision (ICCV). pp. 19–27. arXiv:1506.06724. doi:10.1109/ICCV.2015.11. ISBN 978-1-4673-8391-2. S2CID 6866988. Retrieved 11 April 2023.
Polino, Antonio; Pascanu, Razvan; Alistarh, Dan (2018-02-01). "Model compression via distillation and quantization". arXiv:1802.05668 [cs.NE].
Frantar, Elias; Ashkboos, Saleh; Hoefler, Torsten; Alistarh, Dan (2022-10-01). "GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers". arXiv:2210.17323 [cs.LG].
Dettmers, Tim; Svirschevski, Ruslan; Egiazarian, Vage; Kuznedelev, Denis; Frantar, Elias; Ashkboos, Saleh; Borzunov, Alexander; Hoefler, Torsten; Alistarh, Dan (2023-06-01). "SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression". arXiv:2306.03078 [cs.CL].
Dettmers, Tim; Pagnoni, Artidoro; Holtzman, Ari; Zettlemoyer, Luke (2023-05-01). "QLoRA: Efficient Finetuning of Quantized LLMs". arXiv:2305.14314 [cs.LG].
Sharir, Or; Peleg, Barak; Shoham, Yoav (2020). "The Cost of Training NLP Models: A Concise Overview". arXiv:2004.08900 [cs.CL].
Biderman, Stella; Schoelkopf, Hailey; Anthony, Quentin; Bradley, Herbie; Khan, Mohammad Aflah; Purohit, Shivanshu; Prashanth, USVSN Sai (April 2023). "Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling". arXiv:2304.01373 [cs.CL].
Kaplan, Jared; McCandlish, Sam; Henighan, Tom; Brown, Tom B.; Chess, Benjamin; Child, Rewon; Gray, Scott; Radford, Alec; Wu, Jeffrey; Amodei, Dario (2020). "Scaling Laws for Neural Language Models". arXiv:2001.08361 [cs.LG].
Wang, Yizhong; Kordi, Yeganeh; Mishra, Swaroop; Liu, Alisa; Smith, Noah A.; Khashabi, Daniel; Hajishirzi, Hannaneh (2022). "Self-Instruct: Aligning Language Model with Self Generated Instructions". arXiv:2212.10560 [cs.CL].
Ouyang, Long; Wu, Jeff; Jiang, Xu; Almeida, Diogo; Wainwright, Carroll L.; Mishkin, Pamela; Zhang, Chong; Agarwal, Sandhini; Slama, Katarina; Ray, Alex; Schulman, John; Hilton, Jacob; Kelton, Fraser; Miller, Luke; Simens, Maddie; Askell, Amanda; Welinder, Peter; Christiano, Paul; Leike, Jan; Lowe, Ryan (2022). "Training language models to follow instructions with human feedback". arXiv:2203.02155 [cs.CL].
Gao, Luyu; Madaan, Aman; Zhou, Shuyan; Alon, Uri; Liu, Pengfei; Yang, Yiming; Callan, Jamie; Neubig, Graham (2022-11-01). "PAL: Program-aided Language Models". arXiv:2211.10435 [cs.CL].
Paranjape, Bhargavi; Lundberg, Scott; Singh, Sameer; Hajishirzi, Hannaneh; Zettlemoyer, Luke; Tulio Ribeiro, Marco (2023-03-01). "ART: Automatic multi-step reasoning and tool-use for large language models". arXiv:2303.09014 [cs.CL].
Liang, Yaobo; Wu, Chenfei; Song, Ting; Wu, Wenshan; Xia, Yan; Liu, Yu; Ou, Yang; Lu, Shuai; Ji, Lei; Mao, Shaoguang; Wang, Yun; Shou, Linjun; Gong, Ming; Duan, Nan (2023-03-01). "TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs". arXiv:2303.16434 [cs.AI].
Patil, Shishir G.; Zhang, Tianjun; Wang, Xin; Gonzalez, Joseph E. (2023-05-01). "Gorilla: Large Language Model Connected with Massive APIs". arXiv:2305.15334 [cs.CL].
Lewis, Patrick; Perez, Ethan; Piktus, Aleksandra; Petroni, Fabio; Karpukhin, Vladimir; Goyal, Naman; Küttler, Heinrich; Lewis, Mike; Yih, Wen-tau; Rocktäschel, Tim; Riedel, Sebastian; Kiela, Douwe (2020). "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks". Advances in Neural Information Processing Systems. Curran Associates, Inc. 33: 9459–9474. arXiv:2005.11401.
Huang, Wenlong; Abbeel, Pieter; Pathak, Deepak; Mordatch, Igor (2022-06-28). "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents". Proceedings of the 39th International Conference on Machine Learning. PMLR: 9118–9147. arXiv:2201.07207.
Yao, Shunyu; Zhao, Jeffrey; Yu, Dian; Du, Nan; Shafran, Izhak; Narasimhan, Karthik; Cao, Yuan (2022-10-01). "ReAct: Synergizing Reasoning and Acting in Language Models". arXiv:2210.03629 [cs.CL].
Wu, Yue; Prabhumoye, Shrimai; Min, So Yeon (24 May 2023). "SPRING: GPT-4 Out-performs RL Algorithms by Studying Papers and Reasoning". arXiv:2305.15486 [cs.AI].
Shinn, Noah; Cassano, Federico; Labash, Beck; Gopinath, Ashwin; Narasimhan, Karthik; Yao, Shunyu (2023-03-01). "Reflexion: Language Agents with Verbal Reinforcement Learning". arXiv:2303.11366 [cs.AI].
Hao, Shibo; Gu, Yi; Ma, Haodi; Jiahua Hong, Joshua; Wang, Zhen; Zhe Wang, Daisy; Hu, Zhiting (2023-05-01). "Reasoning with Language Model is Planning with World Model". arXiv:2305.14992 [cs.CL].
Zhang, Jenny; Lehman, Joel; Stanley, Kenneth; Clune, Jeff (2 June 2023). "OMNI: Open-endedness via Models of human Notions of Interestingness". arXiv:2306.01711 [cs.AI].
Park, Joon Sung; O'Brien, Joseph C.; Cai, Carrie J.; Ringel Morris, Meredith; Liang, Percy; Bernstein, Michael S. (2023-04-01). "Generative Agents: Interactive Simulacra of Human Behavior". arXiv:2304.03442 [cs.HC].
Yin, Shukang; Fu, Chaoyou; Zhao, Sirui; Li, Ke; Sun, Xing; Xu, Tong; Chen, Enhong (2023-06-01). "A Survey on Multimodal Large Language Models". arXiv:2306.13549 [cs.CV].
Li, Junnan; Li, Dongxu; Savarese, Silvio; Hoi, Steven (2023-01-01). "BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models". arXiv:2301.12597 [cs.CV].
Alayrac, Jean-Baptiste; Donahue, Jeff; Luc, Pauline; Miech, Antoine; Barr, Iain; Hasson, Yana; Lenc, Karel; Mensch, Arthur; Millican, Katherine; Reynolds, Malcolm; Ring, Roman; Rutherford, Eliza; Cabi, Serkan; Han, Tengda; Gong, Zhitao (2022-12-06). "Flamingo: a Visual Language Model for Few-Shot Learning". Advances in Neural Information Processing Systems. 35: 23716–23736. arXiv:2204.14198.
Driess, Danny; Xia, Fei; Sajjadi, Mehdi S. M.; Lynch, Corey; Chowdhery, Aakanksha; Ichter, Brian; Wahid, Ayzaan; Tompson, Jonathan; Vuong, Quan; Yu, Tianhe; Huang, Wenlong; Chebotar, Yevgen; Sermanet, Pierre; Duckworth, Daniel; Levine, Sergey (2023-03-01). "PaLM-E: An Embodied Multimodal Language Model". arXiv:2303.03378 [cs.LG].
Liu, Haotian; Li, Chunyuan; Wu, Qingyang; Lee, Yong Jae (2023-04-01). "Visual Instruction Tuning". arXiv:2304.08485 [cs.CV].
Zhang, Hang; Li, Xin; Bing, Lidong (2023-06-01). "Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding". arXiv:2306.02858 [cs.CL].
OpenAI (2023-03-27). "GPT-4 Technical Report". arXiv:2303.08774 [cs.CL].
Anil, Rohan; Dai, Andrew M.; et al. (2023). "PaLM 2 Technical Report". arXiv:2305.10403 [cs.CL].
Wu, Shijie; Irsoy, Ozan; Lu, Steven; Dabravolski, Vadim; Dredze, Mark; Gehrmann, Sebastian; Kambadur, Prabhanjan; Rosenberg, David; Mann, Gideon (2023). "BloombergGPT: A Large Language Model for Finance". arXiv:2303.17564 [cs.LG].
Dodge, Jesse; Sap, Maarten; Marasović, Ana; Agnew, William; Ilharco, Gabriel; Groeneveld, Dirk; Mitchell, Margaret; Gardner, Matt (2021). "Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus". arXiv:2104.08758 [cs.CL].
Villalobos, Pablo; Sevilla, Jaime; Heim, Lennart; Besiroglu, Tamay; Hobbhahn, Marius; Ho, Anson (2022-10-25). "Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning". arXiv:2211.04325 [cs.LG].
Brown, Tom B.; Mann, Benjamin; et al. (2020). "Language Models are Few-Shot Learners". arXiv:2005.14165 [cs.CL].
Hoffmann, Jordan; Borgeaud, Sebastian; Mensch, Arthur; Buchatskaya, Elena; Cai, Trevor; Rutherford, Eliza; Casas, Diego de Las; Hendricks, Lisa Anne; Welbl, Johannes; Clark, Aidan; Hennigan, Tom; Noland, Eric; Millican, Katie; Driessche, George van den; Damoc, Bogdan (2022-03-29). "Training Compute-Optimal Large Language Models". arXiv:2203.15556 [cs.CL].
Caballero, Ethan; Gupta, Kshitij; Rish, Irina; Krueger, David (2022). "Broken Neural Scaling Laws". arXiv:2210.14891 [cs.LG].
Li, Kenneth; Hopkins, Aspen K.; Bau, David; Viégas, Fernanda; Pfister, Hanspeter; Wattenberg, Martin (2022-10-01). "Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task". arXiv:2210.13382 [cs.LG].
Jin, Charles; Rinard, Martin (2023-05-01). "Evidence of Meaning in Language Models Trained on Programs". arXiv:2305.11169 [cs.LG].
Nanda, Neel; Chan, Lawrence; Lieberum, Tom; Smith, Jess; Steinhardt, Jacob (2023-01-01). "Progress measures for grokking via mechanistic interpretability". arXiv:2301.05217 [cs.LG].
Mitchell, Melanie; Krakauer, David C. (28 March 2023). "The debate over understanding in AI's large language models". Proceedings of the National Academy of Sciences. 120 (13): e2215907120. arXiv:2210.13966. Bibcode:2023PNAS..12015907M. doi:10.1073/pnas.2215907120. PMC 10068812. PMID 36943882.
Bubeck, Sébastien; Chandrasekaran, Varun; Eldan, Ronen; Gehrke, Johannes; Horvitz, Eric; Kamar, Ece; Lee, Peter; Lee, Yin Tat; Li, Yuanzhi; Lundberg, Scott; Nori, Harsha; Palangi, Hamid; Ribeiro, Marco Tulio; Zhang, Yi (2023). "Sparks of Artificial General Intelligence: Early experiments with GPT-4". arXiv:2303.12712 [cs.CL].
Ji, Ziwei; Lee, Nayeon; Frieske, Rita; Yu, Tiezheng; Su, Dan; Xu, Yan; Ishii, Etsuko; Bang, Yejin; Dai, Wenliang; Madotto, Andrea; Fung, Pascale (November 2022). "Survey of Hallucination in Natural Language Generation" (pdf). ACM Computing Surveys. Association for Computing Machinery. 55 (12): 1–38. arXiv:2202.03629. doi:10.1145/3571730. S2CID 246652372. Retrieved 15 January 2023.
Clark, Christopher; Lee, Kenton; Chang, Ming-Wei; Kwiatkowski, Tom; Collins, Michael; Toutanova, Kristina (2019). "BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions". arXiv:1905.10044 [cs.CL].
Wayne Xin Zhao; Zhou, Kun; Li, Junyi; Tang, Tianyi; Wang, Xiaolei; Hou, Yupeng; Min, Yingqian; Zhang, Beichen; Zhang, Junjie; Dong, Zican; Du, Yifan; Yang, Chen; Chen, Yushuo; Chen, Zhipeng; Jiang, Jinhao; Ren, Ruiyang; Li, Yifan; Tang, Xinyu; Liu, Zikang; Liu, Peiyu; Nie, Jian-Yun; Wen, Ji-Rong (2023). "A Survey of Large Language Models". arXiv:2303.18223 [cs.CL].
Srivastava, Aarohi; Rastogi, Abhinav; et al. (2022). "Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models". arXiv:2206.04615 [cs.CL].
Lin, Stephanie; Hilton, Jacob; Evans, Owain (2021). "TruthfulQA: Measuring How Models Mimic Human Falsehoods". arXiv:2109.07958 [cs.CL].
Zellers, Rowan; Holtzman, Ari; Bisk, Yonatan; Farhadi, Ali; Choi, Yejin (2019). "HellaSwag: Can a Machine Really Finish Your Sentence?". arXiv:1905.07830 [cs.CL].
Patel, Ajay; Li, Bryan; Rasooli, Mohammad Sadegh; Constant, Noah; Raffel, Colin; Callison-Burch, Chris (2022). "Bidirectional Language Models Are Also Few-shot Learners". arXiv:2209.14500 [cs.LG].
Yang, Zhilin; Dai, Zihang; Yang, Yiming; Carbonell, Jaime; Salakhutdinov, Ruslan; Le, Quoc V. (2 January 2020). "XLNet: Generalized Autoregressive Pretraining for Language Understanding". arXiv:1906.08237 [cs.CL].
Table D.1 in Brown, Tom B.; Mann, Benjamin; Ryder, Nick; Subbiah, Melanie; Kaplan, Jared; Dhariwal, Prafulla; Neelakantan, Arvind; Shyam, Pranav; Sastry, Girish; Askell, Amanda; Agarwal, Sandhini; Herbert-Voss, Ariel; Krueger, Gretchen; Henighan, Tom; Child, Rewon; Ramesh, Aditya; Ziegler, Daniel M.; Wu, Jeffrey; Winter, Clemens; Hesse, Christopher; Chen, Mark; Sigler, Eric; Litwin, Mateusz; Gray, Scott; Chess, Benjamin; Clark, Jack; Berner, Christopher; McCandlish, Sam; Radford, Alec; Sutskever, Ilya; Amodei, Dario (May 28, 2020). "Language Models are Few-Shot Learners". arXiv:2005.14165v4 [cs.CL].
Gao, Leo; Biderman, Stella; Black, Sid; Golding, Laurence; Hoppe, Travis; Foster, Charles; Phang, Jason; He, Horace; Thite, Anish; Nabeshima, Noa; Presser, Shawn; Leahy, Connor (31 December 2020). "The Pile: An 800GB Dataset of Diverse Text for Language Modeling". arXiv:2101.00027 [cs.CL].
Dey, Nolan; Gosal, Gurpreet; Chen, Zhiming; Khachane, Hemant; Marshall, William; Pathria, Ribhu; Tom, Marvin; Hestness, Joel (2023-04-01). "Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster". arXiv:2304.03208 [cs.LG].
Wang, Shuohuan; Sun, Yu; Xiang, Yang; Wu, Zhihua; Ding, Siyu; Gong, Weibao; Feng, Shikun; Shang, Junyuan; Zhao, Yanbin; Pang, Chao; Liu, Jiaxiang; Chen, Xuyi; Lu, Yuxiang; Liu, Weixin; Wang, Xi; Bai, Yangfan; Chen, Qiuliang; Zhao, Li; Li, Shiyong; Sun, Peng; Yu, Dianhai; Ma, Yanjun; Tian, Hao; Wu, Hua; Wu, Tian; Zeng, Wei; Li, Ge; Gao, Wen; Wang, Haifeng (December 23, 2021). "ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation". arXiv:2112.12731 [cs.CL].
Askell, Amanda; Bai, Yuntao; Chen, Anna; et al. (9 December 2021). "A General Language Assistant as a Laboratory for Alignment". arXiv:2112.00861 [cs.CL].
Bai, Yuntao; Kadavath, Saurav; Kundu, Sandipan; et al. (15 December 2022). "Constitutional AI: Harmlessness from AI Feedback". arXiv:2212.08073 [cs.CL].
Hoffmann, Jordan; Borgeaud, Sebastian; Mensch, Arthur; et al. (29 March 2022). "Training Compute-Optimal Large Language Models". arXiv:2203.15556 [cs.CL].
Thoppilan, Romal; De Freitas, Daniel; Hall, Jamie; Shazeer, Noam; Kulshreshtha, Apoorv; Cheng, Heng-Tze; Jin, Alicia; Bos, Taylor; Baker, Leslie; Du, Yu; Li, YaGuang; Lee, Hongrae; Zheng, Huaixiu Steven; Ghafouri, Amin; Menegali, Marcelo (2022-01-01). "LaMDA: Language Models for Dialog Applications". arXiv:2201.08239 [cs.CL].
Zhang, Susan; Roller, Stephen; Goyal, Naman; Artetxe, Mikel; Chen, Moya; Chen, Shuohui; Dewan, Christopher; Diab, Mona; Li, Xian; Lin, Xi Victoria; Mihaylov, Todor; Ott, Myle; Shleifer, Sam; Shuster, Kurt; Simig, Daniel; Koura, Punit Singh; Sridhar, Anjali; Wang, Tianlu; Zettlemoyer, Luke (21 June 2022). "OPT: Open Pre-trained Transformer Language Models". arXiv:2205.01068 [cs.CL].
Lewkowycz, Aitor; Andreassen, Anders; Dohan, David; Dyer, Ethan; Michalewski, Henryk; Ramasesh, Vinay; Slone, Ambrose; Anil, Cem; Schlag, Imanol; Gutman-Solo, Theo; Wu, Yuhuai; Neyshabur, Behnam; Gur-Ari, Guy; Misra, Vedant (30 June 2022). "Solving Quantitative Reasoning Problems with Language Models". arXiv:2206.14858 [cs.CL].
Taylor, Ross; Kardas, Marcin; Cucurull, Guillem; Scialom, Thomas; Hartshorn, Anthony; Saravia, Elvis; Poulton, Andrew; Kerkez, Viktor; Stojnic, Robert (16 November 2022). "Galactica: A Large Language Model for Science". arXiv:2211.09085 [cs.CL].
Soltan, Saleh; Ananthakrishnan, Shankar; FitzGerald, Jack; et al. (3 August 2022). "AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model". arXiv:2208.01448 [cs.CL].
Penedo, Guilherme; Malartic, Quentin; Hesslow, Daniel; Cojocaru, Ruxandra; Cappelli, Alessandro; Alobeidli, Hamza; Pannier, Baptiste; Almazrouei, Ebtesam; Launay, Julien (2023-06-01). "The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only". arXiv:2306.01116 [cs.CL].
Wu, Shijie; Irsoy, Ozan; Lu, Steven; Dabravolski, Vadim; Dredze, Mark; Gehrmann, Sebastian; Kambadur, Prabhanjan; Rosenberg, David; Mann, Gideon (March 30, 2023). "BloombergGPT: A Large Language Model for Finance". arXiv:2303.17564 [cs.LG].
Ren, Xiaozhe; Zhou, Pingyi; Meng, Xinfan; Huang, Xinjing; Wang, Yadao; Wang, Weichao; Li, Pengfei; Zhang, Xiaoda; Podolskiy, Alexander; Arshinov, Grigory; Bout, Andrey; Piontkovskaya, Irina; Wei, Jiansheng; Jiang, Xin; Su, Teng; Liu, Qun; Yao, Jun (March 19, 2023). "PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing". arXiv:2303.10845 [cs.CL].
Köpf, Andreas; Kilcher, Yannic; von Rütte, Dimitri; Anagnostidis, Sotiris; Tam, Zhi-Rui; Stevens, Keith; Barhoum, Abdullah; Duc, Nguyen Minh; Stanley, Oliver; Nagyfi, Richárd; ES, Shahul; Suri, Sameer; Glushkov, David; Dantuluri, Arnav; Maguire, Andrew (2023-04-14). "OpenAssistant Conversations -- Democratizing Large Language Model Alignment". arXiv:2304.07327 [cs.CL].

blog.google

"Introducing PaLM 2". Google. May 10, 2023.

businesswire.com

UAE’s Falcon 40B, World’s Top-Ranked AI Model from Technology Innovation Institute, is Now Royalty-Free, 31 May 2023

cam.ac.uk

mlg.eng.cam.ac.uk

Gal, Yarin; Blunsom, Phil (12 June 2013). "A Systematic Bayesian Treatment of the IBM Alignment Models" (PDF). University of Cambridge. Archived from the original (PDF) on 4 March 2016. Retrieved 26 October 2015.

cerebras.net

Dey, Nolan (March 28, 2023). "Cerebras-GPT: A Family of Open, Compute-efficient, Large Language Models". Cerebras.

cnbc.com

Elias, Jennifer (16 May 2023). "Google's newest A.I. model uses nearly five times more text data for training than its predecessor". CNBC. Retrieved 18 May 2023.

cv-foundation.org

Zhu, Yukun; Kiros, Ryan; Zemel, Rich; Salakhutdinov, Ruslan; Urtasun, Raquel; Torralba, Antonio; Fidler, Sanja (December 2015). "Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books" (PDF). 2015 IEEE International Conference on Computer Vision (ICCV). pp. 19–27. arXiv:1506.06724. doi:10.1109/ICCV.2015.11. ISBN 978-1-4673-8391-2. S2CID 6866988. Retrieved 11 April 2023.

deepmind.com

"Language modelling at scale: Gopher, ethical considerations, and retrieval". www.deepmind.com. Retrieved 20 March 2023.

docs.google.com

"Parameter, Compute and Data Trends in Machine Learning". Google Docs. Retrieved 2023-06-20.

doi.org

Manning, Christopher D. (2022). "Human Language Understanding & Reasoning". Daedalus. 151 (2): 127–138. doi:10.1162/daed_a_01905. S2CID 248377870.
Chomsky, N. (September 1956). "Three models for the description of language". IRE Transactions on Information Theory. 2 (3): 113–124. doi:10.1109/TIT.1956.1056813. ISSN 2168-2712. S2CID 19519474.
Winograd, Terry (1972-01-01). "Understanding natural language". Cognitive Psychology. 3 (1): 1–191. doi:10.1016/0010-0285(72)90002-3. ISSN 0010-0285.
Elman, Jeffrey L. (March 1990). "Finding Structure in Time". Cognitive Science. 14 (2): 179–211. doi:10.1207/s15516709cog1402_1. S2CID 2763403.
Shannon, C. E. (January 1951). "Prediction and Entropy of Printed English". Bell System Technical Journal. 30 (1): 50–64. doi:10.1002/j.1538-7305.1951.tb01366.x. ISSN 0005-8580.
Banko, Michele; Brill, Eric (2001). "Scaling to very very large corpora for natural language disambiguation". Proceedings of the 39th Annual Meeting on Association for Computational Linguistics - ACL '01. Morristown, NJ, USA: Association for Computational Linguistics: 26–33. doi:10.3115/1073012.1073017. S2CID 6645623.
Cho, Kyunghyun; van Merrienboer, Bart; Bahdanau, Dzmitry; Bengio, Yoshua (2014). "On the Properties of Neural Machine Translation: Encoder–Decoder Approaches". Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. Stroudsburg, PA, USA: Association for Computational Linguistics: 103–111. doi:10.3115/v1/w14-4012. S2CID 11336213.
Bender, Emily M.; Gebru, Timnit; McMillan-Major, Angelina; Mitchell, Margaret (2021-03-01). "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜". FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. Virtual Event, Canada: ACM. pp. 610–623. doi:10.1145/3442188.3445922.
Ganguli, Deep; Hernandez, Danny; Lovitt, Liane; et al. (2022-06-20). "Predictability and Surprise in Large Generative Models". 2022 ACM Conference on Fairness, Accountability, and Transparency. New York, NY, USA: ACM. pp. 1747–1764. doi:10.1145/3531146.3533229. ISBN 9781450393522.
Kjell (2019). "Semantic measures: Using natural language processing to measure, differentiate, and describe psychological constructs". Psychological Methods. 24 (1): 92–115. doi:10.1037/met0000191. PMID 29963879. S2CID 49642731.
Zaib, Munazza; Sheng, Quan Z.; Emma Zhang, Wei (4 February 2020). "A Short Survey of Pre-trained Language Models for Conversational AI-A New Age in NLP". Proceedings of the Australasian Computer Science Week Multiconference: 1–4. arXiv:2104.10810. doi:10.1145/3373017.3373028. ISBN 9781450376976. S2CID 211040895.
Zhu, Yukun; Kiros, Ryan; Zemel, Rich; Salakhutdinov, Ruslan; Urtasun, Raquel; Torralba, Antonio; Fidler, Sanja (December 2015). "Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books" (PDF). 2015 IEEE International Conference on Computer Vision (ICCV). pp. 19–27. arXiv:1506.06724. doi:10.1109/ICCV.2015.11. ISBN 978-1-4673-8391-2. S2CID 6866988. Retrieved 11 April 2023.
Pilehvar, Mohammad Taher; Camacho-Collados, Jose (June 2019). "WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations". Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics: 1267–1273. doi:10.18653/v1/N19-1128. S2CID 102353817.
Mitchell, Melanie; Krakauer, David C. (28 March 2023). "The debate over understanding in AI's large language models". Proceedings of the National Academy of Sciences. 120 (13): e2215907120. arXiv:2210.13966. Bibcode:2023PNAS..12015907M. doi:10.1073/pnas.2215907120. PMC 10068812. PMID 36943882.
Ji, Ziwei; Lee, Nayeon; Frieske, Rita; Yu, Tiezheng; Su, Dan; Xu, Yan; Ishii, Etsuko; Bang, Yejin; Dai, Wenliang; Madotto, Andrea; Fung, Pascale (November 2022). "Survey of Hallucination in Natural Language Generation" (pdf). ACM Computing Surveys. Association for Computing Machinery. 55 (12): 1–38. arXiv:2202.03629. doi:10.1145/3571730. S2CID 246652372. Retrieved 15 January 2023.
"Prepare for truly useful large language models". Nature Biomedical Engineering. 7 (2): 85–86. 7 March 2023. doi:10.1038/s41551-023-01012-6. PMID 36882584. S2CID 257403466.
"Could chatbots help devise the next pandemic virus?". Science. 14 June 2023. doi:10.1126/science.adj2463.
Ananthaswamy, Anil (8 March 2023). "In AI, is bigger always better?". Nature. 615 (7951): 202–205. Bibcode:2023Natur.615..202A. doi:10.1038/d41586-023-00641-w. PMID 36890378. S2CID 257380916.

dx.doi.org

Winograd, Terry (1972-01-01). "Understanding natural language". Cognitive Psychology. 3 (1): 1–191. doi:10.1016/0010-0285(72)90002-3. ISSN 0010-0285.
Shannon, C. E. (January 1951). "Prediction and Entropy of Printed English". Bell System Technical Journal. 30 (1): 50–64. doi:10.1002/j.1538-7305.1951.tb01366.x. ISSN 0005-8580.
Banko, Michele; Brill, Eric (2001). "Scaling to very very large corpora for natural language disambiguation". Proceedings of the 39th Annual Meeting on Association for Computational Linguistics - ACL '01. Morristown, NJ, USA: Association for Computational Linguistics: 26–33. doi:10.3115/1073012.1073017. S2CID 6645623.
Cho, Kyunghyun; van Merrienboer, Bart; Bahdanau, Dzmitry; Bengio, Yoshua (2014). "On the Properties of Neural Machine Translation: Encoder–Decoder Approaches". Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. Stroudsburg, PA, USA: Association for Computational Linguistics: 103–111. doi:10.3115/v1/w14-4012. S2CID 11336213.
Ganguli, Deep; Hernandez, Danny; Lovitt, Liane; et al. (2022-06-20). "Predictability and Surprise in Large Generative Models". 2022 ACM Conference on Fairness, Accountability, and Transparency. New York, NY, USA: ACM. pp. 1747–1764. doi:10.1145/3531146.3533229. ISBN 9781450393522.

economist.com

"Your job is (probably) safe from artificial intelligence". The Economist. 7 May 2023. Retrieved 18 June 2023.

facebook.com

ai.facebook.com

"Democratizing access to large-scale language models with OPT-175B". ai.facebook.com.

fastcompanyme.com

"Abu Dhabi-based TII launches its own version of ChatGPT". tii.ae.

forefront.ai

"GPT-J-6B: An Introduction to the Largest Open Source GPT Model | Forefront". www.forefront.ai. Archived from the original on 9 March 2023. Retrieved 2023-02-28.

github.com

finetune-transformer-lm, OpenAI, June 11, 2018, retrieved 2023-05-01
"BERT". March 13, 2023 – via GitHub.
"gpt-2". GitHub. Retrieved 13 March 2023.
"GPT Neo". March 15, 2023 – via GitHub.
Khrushchev, Mikhail; Vasilev, Ruslan; Petrov, Alexey; Zinov, Nikolay (2022-06-22), YaLM 100B, retrieved 2023-03-18

goldmansachs.com

"Generative AI Could Raise Global GDP by 7%". Goldman Sachs. Retrieved 18 June 2023.

googleblog.com

ai.googleblog.com

"Minerva: Solving Quantitative Reasoning Problems with Language Models". ai.googleblog.com. 30 June 2022. Retrieved 20 March 2023.

harvard.edu

ui.adsabs.harvard.edu

Mitchell, Melanie; Krakauer, David C. (28 March 2023). "The debate over understanding in AI's large language models". Proceedings of the National Academy of Sciences. 120 (13): e2215907120. arXiv:2210.13966. Bibcode:2023PNAS..12015907M. doi:10.1073/pnas.2215907120. PMC 10068812. PMID 36943882.
Ananthaswamy, Anil (8 March 2023). "In AI, is bigger always better?". Nature. 615 (7951): 202–205. Bibcode:2023Natur.615..202A. doi:10.1038/d41586-023-00641-w. PMID 36890378. S2CID 257380916.

huggingface.co

"bigscience/bloom · Hugging Face". huggingface.co.
"The Falcon has landed in the Hugging Face ecosystem". huggingface.co. Retrieved 2023-06-20.
"tiiuae/falcon-40b · Hugging Face". huggingface.co. 2023-06-09. Retrieved 2023-06-20.

ieee.org

ieeexplore.ieee.org

Chomsky, N. (September 1956). "Three models for the description of language". IRE Transactions on Information Theory. 2 (3): 113–124. doi:10.1109/TIT.1956.1056813. ISSN 2168-2712. S2CID 19519474.

japantimes.co.jp

Alba, Davey (1 May 2023). "AI chatbots have been used to create dozens of news content farms". The Japan Times. Retrieved 18 June 2023.

jasonwei.net

"137 emergent abilities of large language models". Jason Wei. Retrieved 2023-06-24.

kdnuggets.com

"BERT, RoBERTa, DistilBERT, XLNet: Which one to use?". [dead link]

lambdalabs.com

"OpenAI's GPT-3 Language Model: A Technical Overview". lambdalabs.com. 3 June 2020.

meta.com

ai.meta.com

"Introducing Llama 2: The Next Generation of Our Open Source Large Language Model". Meta AI. 2023. Retrieved 2023-07-19.

microsoft.com

Alvi, Ali; Kharya, Paresh (11 October 2021). "Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World's Largest and Most Powerful Generative Language Model". Microsoft Research.

minedojo.org

voyager.minedojo.org

"Voyager | An Open-Ended Embodied Agent with Large Language Models". voyager.minedojo.org. Retrieved 2023-06-09.

mlr.press

proceedings.mlr.press

Nagel, Markus; Amjad, Rana Ali; Baalen, Mart Van; Louizos, Christos; Blankevoort, Tijmen (2020-11-21). "Up or Down? Adaptive Rounding for Post-Training Quantization". Proceedings of the 37th International Conference on Machine Learning. PMLR: 7197–7206.
Huang, Wenlong; Abbeel, Pieter; Pathak, Deepak; Mordatch, Igor (2022-06-28). "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents". Proceedings of the 39th International Conference on Machine Learning. PMLR: 9118–9147. arXiv:2201.07207.
Kiros, Ryan; Salakhutdinov, Ruslan; Zemel, Rich (2014-06-18). "Multimodal Neural Language Models". Proceedings of the 31st International Conference on Machine Learning. PMLR: 595–603.

nature.com

Ananthaswamy, Anil (8 March 2023). "In AI, is bigger always better?". Nature. 615 (7951): 202–205. Bibcode:2023Natur.615..202A. doi:10.1038/d41586-023-00641-w. PMID 36890378. S2CID 257380916.

neurips.cc

proceedings.neurips.cc

Sutskever, Ilya; Vinyals, Oriol; Le, Quoc V (2014). "Sequence to Sequence Learning with Neural Networks". Advances in Neural Information Processing Systems. Curran Associates, Inc. 27. arXiv:1409.3215.
Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N; Kaiser, Łukasz; Polosukhin, Illia (2017). "Attention is All you Need". Advances in Neural Information Processing Systems. Curran Associates, Inc. 30.
Brown, Tom B.; Mann, Benjamin; Ryder, Nick; Subbiah, Melanie; Kaplan, Jared; Dhariwal, Prafulla; Neelakantan, Arvind; Shyam, Pranav; Sastry, Girish; Askell, Amanda; Agarwal, Sandhini; Herbert-Voss, Ariel; Krueger, Gretchen; Henighan, Tom; Child, Rewon; Ramesh, Aditya; Ziegler, Daniel M.; Wu, Jeffrey; Winter, Clemens; Hesse, Christopher; Chen, Mark; Sigler, Eric; Litwin, Mateusz; Gray, Scott; Chess, Benjamin; Clark, Jack; Berner, Christopher; McCandlish, Sam; Radford, Alec; Sutskever, Ilya; Amodei, Dario (Dec 2020). Larochelle, H.; Ranzato, M.; Hadsell, R.; Balcan, M.F.; Lin, H. (eds.). "Language Models are Few-Shot Learners" (PDF). Advances in Neural Information Processing Systems. Curran Associates, Inc. 33: 1877–1901.
Lewis, Patrick; Perez, Ethan; Piktus, Aleksandra; Petroni, Fabio; Karpukhin, Vladimir; Goyal, Naman; Küttler, Heinrich; Lewis, Mike; Yih, Wen-tau; Rocktäschel, Tim; Riedel, Sebastian; Kiela, Douwe (2020). "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks". Advances in Neural Information Processing Systems. Curran Associates, Inc. 33: 9459–9474. arXiv:2005.11401.
Krizhevsky, Alex; Sutskever, Ilya; Hinton, Geoffrey E (2012). "ImageNet Classification with Deep Convolutional Neural Networks". Advances in Neural Information Processing Systems. Curran Associates, Inc. 25.
Alayrac, Jean-Baptiste; Donahue, Jeff; Luc, Pauline; Miech, Antoine; Barr, Iain; Hasson, Yana; Lenc, Karel; Mensch, Arthur; Millican, Katherine; Reynolds, Malcolm; Ring, Roman; Rutherford, Eliza; Cabi, Serkan; Han, Tengda; Gong, Zhitao (2022-12-06). "Flamingo: a Visual Language Model for Few-Shot Learning". Advances in Neural Information Processing Systems. 35: 23716–23736. arXiv:2204.14198.

nextplatform.com

Prickett, Nicole Hemsoth (2021-08-24). "Cerebras Shifts Architecture To Meet Massive AI/ML Models". The Next Platform. Retrieved 2023-06-20.

nih.gov

pubmed.ncbi.nlm.nih.gov

Kjell (2019). "Semantic measures: Using natural language processing to measure, differentiate, and describe psychological constructs". Psychological Methods. 24 (1): 92–115. doi:10.1037/met0000191. PMID 29963879. S2CID 49642731.
Mitchell, Melanie; Krakauer, David C. (28 March 2023). "The debate over understanding in AI's large language models". Proceedings of the National Academy of Sciences. 120 (13): e2215907120. arXiv:2210.13966. Bibcode:2023PNAS..12015907M. doi:10.1073/pnas.2215907120. PMC 10068812. PMID 36943882.
"Prepare for truly useful large language models". Nature Biomedical Engineering. 7 (2): 85–86. 7 March 2023. doi:10.1038/s41551-023-01012-6. PMID 36882584. S2CID 257403466.
Ananthaswamy, Anil (8 March 2023). "In AI, is bigger always better?". Nature. 615 (7951): 202–205. Bibcode:2023Natur.615..202A. doi:10.1038/d41586-023-00641-w. PMID 36890378. S2CID 257380916.

ncbi.nlm.nih.gov

Mitchell, Melanie; Krakauer, David C. (28 March 2023). "The debate over understanding in AI's large language models". Proceedings of the National Academy of Sciences. 120 (13): e2215907120. arXiv:2210.13966. Bibcode:2023PNAS..12015907M. doi:10.1073/pnas.2215907120. PMC 10068812. PMID 36943882.

notion.so

A Closer Look at Large Language Models Emergent Abilities (Yao Fu, Nov 20, 2022)

nytimes.com

Lewis-Kraus, Gideon (2016-12-14). "The Great A.I. Awakening". The New York Times. ISSN 0362-4331. Archived from the original on 24 May 2023. Retrieved 2023-06-22.
Metz, Cade (16 May 2023). "Microsoft Says New A.I. Shows Signs of Human Reasoning". The New York Times.
Roose, Kevin (30 May 2023). "Why an Octopus-like Creature Has Come to Symbolize the State of A.I." The New York Times. Retrieved 12 June 2023.

openai.com

"Better Language Models and Their Implications". OpenAI. 2019-02-14. Archived from the original on 2020-12-19. Retrieved 2019-08-25.
"Improving language understanding with unsupervised learning". openai.com. June 11, 2018. Archived from the original on 2023-03-18. Retrieved 2023-03-18.
"Better language models and their implications". openai.com.

platform.openai.com

"OpenAI API". platform.openai.com. Archived from the original on April 23, 2023. Retrieved 2023-04-30.
"OpenAI API". platform.openai.com. Archived from the original on 16 Jun 2023. Retrieved 2023-06-20.

cdn.openai.com

"GPT-4 Technical Report" (PDF). OpenAI. 2023. Archived (PDF) from the original on March 14, 2023. Retrieved March 14, 2023.

openreview.net

Wei, Jason; Tay, Yi; Bommasani, Rishi; Raffel, Colin; Zoph, Barret; Borgeaud, Sebastian; Yogatama, Dani; Bosma, Maarten; Zhou, Denny; Metzler, Donald; Chi, Ed H.; Hashimoto, Tatsunori; Vinyals, Oriol; Liang, Percy; Dean, Jeff; Fedus, William (31 August 2022). "Emergent Abilities of Large Language Models". Transactions on Machine Learning Research. ISSN 2835-8856.
Patel, Roma; Pavlick, Ellie (2021-10-06). "Mapping Language Models to Grounded Conceptual Spaces".

paperswithcode.com

"Papers with Code - MassiveText Dataset". paperswithcode.com. Retrieved 2023-04-26.

philpapers.org

Miller, George A.; Chomsky, Noam (1963), Luce, D. (ed.), "Finitary Models of Language Users", Handbook of Mathematical Psychology, John Wiley & Sons, pp. 2–419, retrieved 2023-06-27

pilehvar.github.io

"WiC: The Word-in-Context Dataset". pilehvar.github.io. Retrieved 2023-06-27.

quantamagazine.org

Ornes, Stephen (March 16, 2023). "The Unpredictable Abilities Emerging From Large AI Models". Quanta Magazine.

reasonwithpal.com

"PAL: Program-aided Language Models". reasonwithpal.com. Retrieved 2023-06-12.

researchgate.net

Kjell (2019). "Semantic measures: Using natural language processing to measure, differentiate, and describe psychological constructs". Psychological Methods. 24 (1): 92–115. doi:10.1037/met0000191. PMID 29963879. S2CID 49642731.
Zaib, Munazza; Sheng, Quan Z.; Emma Zhang, Wei (4 February 2020). "A Short Survey of Pre-trained Language Models for Conversational AI-A New Age in NLP". Proceedings of the Australasian Computer Science Week Multiconference: 1–4. arXiv:2104.10810. doi:10.1145/3373017.3373028. ISBN 9781450376976. S2CID 211040895.

science.org

"Could chatbots help devise the next pandemic virus?". Science (अंग्रेज़ी में). 14 June 2023. डीओआइ:10.1126/science.adj2463."Could chatbots help devise the next pandemic virus?". Science. 14 June 2023. doi:10.1126/science.adj2463.
"Could chatbots help devise the next pandemic virus?". Science (अंग्रेज़ी में). 14 June 2023. डीओआइ:10.1126/science.adj2463."Could chatbots help devise the next pandemic virus?". Science. 14 June 2023. doi:10.1126/science.adj2463.

semanticscholar.org

api.semanticscholar.org

Manning, Christopher D. (2022). "Human Language Understanding & Reasoning". Daedalus. 151 (2): 127–138. doi:10.1162/daed_a_01905. S2CID 248377870.
Chomsky, N. (September 1956). "Three models for the description of language". IRE Transactions on Information Theory. 2 (3): 113–124. doi:10.1109/TIT.1956.1056813. ISSN 2168-2712. S2CID 19519474.
Elman, Jeffrey L. (March 1990). "Finding Structure in Time". Cognitive Science. 14 (2): 179–211. doi:10.1207/s15516709cog1402_1. S2CID 2763403.
Banko, Michele; Brill, Eric (2001). "Scaling to very very large corpora for natural language disambiguation". Proceedings of the 39th Annual Meeting on Association for Computational Linguistics - ACL '01. Morristown, NJ, USA: Association for Computational Linguistics: 26–33. doi:10.3115/1073012.1073017. S2CID 6645623.
Cho, Kyunghyun; van Merrienboer, Bart; Bahdanau, Dzmitry; Bengio, Yoshua (2014). "On the Properties of Neural Machine Translation: Encoder–Decoder Approaches". Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. Stroudsburg, PA, USA: Association for Computational Linguistics: 103–111. doi:10.3115/v1/w14-4012. S2CID 11336213.
Kjell (2019). "Semantic measures: Using natural language processing to measure, differentiate, and describe psychological constructs". Psychological Methods. 24 (1): 92–115. doi:10.1037/met0000191. PMID 29963879. S2CID 49642731.
Zaib, Munazza; Sheng, Quan Z.; Emma Zhang, Wei (4 February 2020). "A Short Survey of Pre-trained Language Models for Conversational AI-A New Age in NLP". Proceedings of the Australasian Computer Science Week Multiconference: 1–4. arXiv:2104.10810. doi:10.1145/3373017.3373028. ISBN 9781450376976. S2CID 211040895.
Zhu, Yukun; Kiros, Ryan; Zemel, Rich; Salakhutdinov, Ruslan; Urtasun, Raquel; Torralba, Antonio; Fidler, Sanja (December 2015). "Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books" (PDF). 2015 IEEE International Conference on Computer Vision (ICCV). pp. 19–27. arXiv:1506.06724. doi:10.1109/ICCV.2015.11. ISBN 978-1-4673-8391-2. S2CID 6866988. Retrieved 11 April 2023.
Pilehvar, Mohammad Taher; Camacho-Collados, Jose (June 2019). "WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations". Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics: 1267–1273. doi:10.18653/v1/N19-1128. S2CID 102353817.
Ji, Ziwei; Lee, Nayeon; Frieske, Rita; Yu, Tiezheng; Su, Dan; Xu, Yan; Ishii, Etsuko; Bang, Yejin; Dai, Wenliang; Madotto, Andrea; Fung, Pascale (November 2022). "Survey of Hallucination in Natural Language Generation" (PDF). ACM Computing Surveys. Association for Computing Machinery. 55 (12): 1–38. arXiv:2202.03629. doi:10.1145/3571730. S2CID 246652372. Retrieved 15 January 2023.
"Prepare for truly useful large language models". Nature Biomedical Engineering. 7 (2): 85–86. 7 March 2023. doi:10.1038/s41551-023-01012-6. PMID 36882584. S2CID 257403466.
Ananthaswamy, Anil (8 March 2023). "In AI, is bigger always better?". Nature. 615 (7951): 202–205. Bibcode:2023Natur.615..202A. doi:10.1038/d41586-023-00641-w. PMID 36890378. S2CID 257380916.

stanford.edu

web.stanford.edu

Jurafsky, Dan; Martin, James H. (7 January 2023). Speech and Language Processing (PDF) (3rd edition draft ed.). Retrieved 24 May 2022.

crfm.stanford.edu

"Stanford CRFM". crfm.stanford.edu.

storage.googleapis.com

Table 20 of PaLM: Scaling Language Modeling with Pathways

techcrunch.com

Wiggers, Kyle (28 April 2022). "The emerging types of language models and why they matter". TechCrunch.

thecvf.com

openaccess.thecvf.com

Antol, Stanislaw; Agrawal, Aishwarya; Lu, Jiasen; Mitchell, Margaret; Batra, Dhruv; Zitnick, C. Lawrence; Parikh, Devi (2015). "VQA: Visual Question Answering": 2425–2433.

thegradient.pub

"Large Language Model: world models or surface statistics?". The Gradient (अंग्रेज़ी में). 2023-01-21. अभिगमन तिथि 2023-06-12."Large Language Model: world models or surface statistics?". The Gradient. 2023-01-21. Retrieved 2023-06-12.
Huyen, Chip (18 October 2019). "Evaluation Metrics for Language Modeling". The Gradient.Huyen, Chip (18 October 2019). "Evaluation Metrics for Language Modeling". The Gradient.

theverge.com

Vincent, James (3 April 2023). "AI is entering an era of corporate control". The Verge. Retrieved 19 June 2023.

time.com

"The A to Z of Artificial Intelligence". Time Magazine (अंग्रेज़ी में). 13 April 2023. अभिगमन तिथि 12 June 2023."The A to Z of Artificial Intelligence". Time Magazine. 13 April 2023. Retrieved 12 June 2023.

tools.wmflabs.org

Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018). "Deep contextualized word representations". arXiv:1802.05365 [cs.CL].

web.archive.org

"Better Language Models and Their Implications". OpenAI. 2019-02-14. मूल से 2020-12-19 को पुरालेखित. अभिगमन तिथि 2019-08-25."Better Language Models and Their Implications". OpenAI. 2019-02-14. Archived from the original on 2020-12-19. Retrieved 2019-08-25.
Gal, Yarin; Blunsom, Phil (12 June 2013). "A Systematic Bayesian Treatment of the IBM Alignment Models" (PDF). University of Cambridge. मूल से पुरालेखित 4 मार्च 2016. अभिगमन तिथि 26 October 2015.सीएस1 रखरखाव: BOT: original-url status unknown (link)Gal, Yarin; Blunsom, Phil (12 June 2013). (PDF). University of Cambridge. Archived from the original Archived 2016-03-04 at the वेबैक मशीन (PDF) on 4 Mar 2016. Retrieved 26 October 2015.
Lewis-Kraus, Gideon (2016-12-14). "The Great A.I. Awakening". The New York Times (अंग्रेज़ी में). आइ॰एस॰एस॰एन॰ 0362-4331. मूल से पुरालेखित 24 मई 2023. अभिगमन तिथि 2023-06-22.सीएस1 रखरखाव: BOT: original-url status unknown (link)Lewis-Kraus, Gideon (2016-12-14). . The New York Times. ISSN 0362-4331. Archived from the original on 24 May 2023. Retrieved 2023-06-22.
"Improving language understanding with unsupervised learning". openai.com (अंग्रेज़ी में). June 11, 2018. मूल से 2023-03-18 को पुरालेखित. अभिगमन तिथि 2023-03-18."Improving language understanding with unsupervised learning". openai.com. June 11, 2018. Archived from the original on 2023-03-18. Retrieved 2023-03-18.
"OpenAI API". platform.openai.com (अंग्रेज़ी में). मूल से पुरालेखित 23 अप्रैल 2023. अभिगमन तिथि 2023-04-30.सीएस1 रखरखाव: BOT: original-url status unknown (link). platform.openai.com. Archived from the original on April 23, 2023. Retrieved 2023-04-30.
"GPT-J-6B: An Introduction to the Largest Open Source GPT Model | Forefront". www.forefront.ai (अंग्रेज़ी में). मूल से 9 मार्च 2023 को पुरालेखित. अभिगमन तिथि 2023-02-28.
"GPT-4 Technical Report" (PDF). OpenAI. 2023. मूल से March 14, 2023 को पुरालेखित (PDF). अभिगमन तिथि March 14, 2023.

wikipedia.org

en.wikipedia.org

"BERT, RoBERTa, DistilBERT, XLNet: Which one to use?".^{[मृत कड़ियाँ]}

wiley.com

doi.wiley.com

Elman, Jeffrey L. (March 1990). "Finding Structure in Time". Cognitive Science. 14 (2): 179–211. doi:10.1207/s15516709cog1402_1. S2CID 2763403.

worldcat.org

Chomsky, N. (September 1956). "Three models for the description of language". IRE Transactions on Information Theory. 2 (3): 113–124. doi:10.1109/TIT.1956.1056813. ISSN 2168-2712. S2CID 19519474.
Winograd, Terry (1972-01-01). "Understanding natural language". Cognitive Psychology. 3 (1): 1–191. doi:10.1016/0010-0285(72)90002-3. ISSN 0010-0285.
Shannon, C. E. (January 1951). "Prediction and Entropy of Printed English". Bell System Technical Journal. 30 (1): 50–64. doi:10.1002/j.1538-7305.1951.tb01366.x. ISSN 0005-8580.
Lewis-Kraus, Gideon (2016-12-14). "The Great A.I. Awakening". The New York Times. ISSN 0362-4331. Archived from the original on 24 May 2023. Retrieved 2023-06-22.
Wei, Jason; Tay, Yi; Bommasani, Rishi; Raffel, Colin; Zoph, Barret; Borgeaud, Sebastian; Yogatama, Dani; Bosma, Maarten; Zhou, Denny; Metzler, Donald; Chi, Ed H.; Hashimoto, Tatsunori; Vinyals, Oriol; Liang, Percy; Dean, Jeff; Fedus, William (31 August 2022). "Emergent Abilities of Large Language Models". Transactions on Machine Learning Research. ISSN 2835-8856.

youtube.com

Pichai, Sundar, Google Keynote (Google I/O '23), timestamp 15:31, retrieved 2023-07-02

zdnet.com

"ChatGPT is more like an 'alien intelligence' than a human brain, says futurist". ZDNET (अंग्रेज़ी में). 2023. अभिगमन तिथि 12 June 2023."ChatGPT is more like an 'alien intelligence' than a human brain, says futurist". ZDNET. 2023. Retrieved 12 June 2023.