大型语言模型 (Chinese Wikipedia)

Analysis of the information sources cited in the references of the Chinese-language Wikipedia article "大型语言模型" (large language model).

Website                   Global rank      Chinese rank
aclanthology.org          low place        low place
acm.org                   1,185th place    809th place
amacad.org                3,464th place    5,489th place
anthropic.com             low place        low place
archive.today             14th place       18th place
arxiv.org                 69th place       254th place
blog.google               2,218th place    5,303rd place
doi.org                   2nd place        23rd place
eswc-conferences.org      low place        low place
euronews.com              612th place      2,396th place
harvard.edu               18th place       57th place
huggingface.co            low place        low place
ibm.com                   1,131st place    1,050th place
ieee.org                  652nd place      712th place
jalammar.github.io        low place        low place
llmbook-zh.github.io      low place        low place
meta.com                  low place        low place
mit.edu                   415th place      500th place
mittrchina.com            low place        low place
mlr.press                 low place        low place
nature.com                234th place      227th place
neurips.cc                low place        low place
nvidia.com                2,503rd place    1,088th place
nytimes.com               7th place        31st place
openai.com                1,559th place    848th place
openreview.net            low place        low place
ourworldindata.org        2,263rd place    5,757th place
researchgate.net          120th place      337th place
rgdoi.net                 low place        low place
semanticscholar.org       11th place       332nd place
springer.com              274th place      320th place
stanford.edu              179th place      275th place
techcrunch.com            187th place      481st place
technologyreview.com      1,943rd place    2,036th place
thecvf.com                low place        low place
theguardian.com           12th place       60th place
thepaper.cn               1,497th place    54th place
theregister.com           3,700th place    4,616th place
towardsdatascience.com    8,920th place    7,729th place
unite.ai                  low place        low place
venturebeat.com           616th place      838th place
web.archive.org           1st place        1st place
worldcat.org              5th place        12th place
yenniejun.com             low place        low place
youtube.com               9th place        2nd place

aclanthology.org (Global: low place; Chinese: low place)

acm.org (Global: 1,185th place; Chinese: 809th place)

dl.acm.org

amacad.org (Global: 3,464th place; Chinese: 5,489th place)

anthropic.com (Global: low place; Chinese: low place)

archive.today (Global: 14th place; Chinese: 18th place)

arxiv.org (Global: 69th place; Chinese: 254th place)

  • Bommasani, Rishi; Hudson, Drew A.; Adeli, Ehsan; Altman, Russ; Arora, Simran; von Arx, Matthew; Bernstein, Michael S.; Bohg, Jeannette; Bosselut, Antoine; Brunskill, Emma. On the Opportunities and Risks of Foundation Models. 2021. arXiv:2108.07258.
  • Brown, Tom B.; Mann, Benjamin; Ryder, Nick; Subbiah, Melanie; Kaplan, Jared; Dhariwal, Prafulla; Neelakantan, Arvind; Shyam, Pranav; Sastry, Girish; Askell, Amanda. Language Models are Few-Shot Learners. 2020. arXiv:2005.14165 [cs.CL].
  • Kaplan, Jared; McCandlish, Sam; Henighan, Tom; Brown, Tom B.; Chess, Benjamin; Child, Rewon; Gray, Scott; Radford, Alec; Wu, Jeffrey; Amodei, Dario. Scaling Laws for Neural Language Models. 2020. arXiv:2001.08361 [cs.LG].
  • Brown, Tom B.; Mann, Benjamin; Ryder, Nick; Subbiah, Melanie; Kaplan, Jared; Dhariwal, Prafulla; Neelakantan, Arvind; Shyam, Pranav; Sastry, Girish; Askell, Amanda; Agarwal, Sandhini; Herbert-Voss, Ariel; Krueger, Gretchen; Henighan, Tom; Child, Rewon; Ramesh, Aditya; Ziegler, Daniel M.; Wu, Jeffrey; Winter, Clemens; Hesse, Christopher; Chen, Mark; Sigler, Eric; Litwin, Mateusz; Gray, Scott; Chess, Benjamin; Clark, Jack; Berner, Christopher; McCandlish, Sam; Radford, Alec; Sutskever, Ilya; Amodei, Dario. Larochelle, H.; Ranzato, M.; Hadsell, R.; Balcan, M.F.; Lin, H. (eds.). Language Models are Few-Shot Learners (PDF). Advances in Neural Information Processing Systems (Curran Associates, Inc.). Dec 2020, 33: 1877–1901 [2023-03-14]. arXiv:2005.14165. doi:10.1145/3582269.3615599. (Archived (PDF) from the original on 2023-11-17).
  • Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 2018. arXiv:1810.04805 [cs.CL].
  • Goodman, Joshua, A Bit of Progress in Language Modeling, 2001-08-09, Bibcode:2001cs........8005G, arXiv:cs/0108005.
  • Bahdanau, Dzmitry; Cho, Kyunghyun; Bengio, Yoshua. Neural Machine Translation by Jointly Learning to Align and Translate. 2014. arXiv:1409.0473 [cs.CL].
  • Rogers, Anna; Kovaleva, Olga; Rumshisky, Anna. A Primer in BERTology: What We Know About How BERT Works. Transactions of the Association for Computational Linguistics. 2020, 8: 842–866 [2024-01-21]. S2CID 211532403. arXiv:2002.12327. doi:10.1162/tacl_a_00349. (Archived from the original on 2022-04-03).
  • Movva, Rajiv; Balachandar, Sidhika; Peng, Kenny; Agostini, Gabriel; Garg, Nikhil; Pierson, Emma. Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv Papers. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). 2024: 1223–1243 [2024-12-08]. arXiv:2307.10700. doi:10.18653/v1/2024.naacl-long.67. (Archived from the original on 2025-04-12).
  • Peng, Bo; et al. RWKV: Reinventing RNNs for the Transformer Era. 2023. arXiv:2305.13048 [cs.CL].
  • Gu, Albert; Dao, Tri, Mamba: Linear-Time Sequence Modeling with Selective State Spaces, 2023-12-01, arXiv:2312.00752.
  • Petrov, Aleksandar; Malfa, Emanuele La; Torr, Philip; Bibi, Adel. Language Model Tokenizers Introduce Unfairness Between Languages. NeurIPS. June 23, 2023 [September 16, 2023]. arXiv:2305.15425. (Archived from the original on December 15, 2023) – via openreview.net.
  • Kaushal, Ayush; Mahowald, Kyle, What do tokens know about their characters and how do they know it?, 2022-06-06, arXiv:2206.02608.
  • Petrov, Aleksandar; Emanuele La Malfa; Torr, Philip H. S.; Bibi, Adel. Language Model Tokenizers Introduce Unfairness Between Languages. 2023. arXiv:2305.15425 [cs.CL].
  • Dodge, Jesse; Sap, Maarten; Marasović, Ana; Agnew, William; Ilharco, Gabriel; Groeneveld, Dirk; Mitchell, Margaret; Gardner, Matt. Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus. 2021. arXiv:2104.08758 [cs.CL].
  • Li, Yuanzhi; Bubeck, Sébastien; Eldan, Ronen; Del Giorno, Allie; Gunasekar, Suriya; Lee, Yin Tat, Textbooks Are All You Need II: phi-1.5 technical report, 2023-09-11, arXiv:2309.05463.
  • Lin, Zhenghao; Gou, Zhibin; Gong, Yeyun; Liu, Xiao; Shen, Yelong; Xu, Ruochen; Lin, Chen; Yang, Yujiu; Jiao, Jian. Rho-1: Not All Tokens Are What You Need. 2024-04-11. arXiv:2404.07965 [cs.CL].
  • Brown, Tom B.; et al. Language Models are Few-Shot Learners. 2020. arXiv:2005.14165 [cs.CL].
  • Abdin, Marah; Jacobs, Sam Ade; Awan, Ammar Ahmad; Aneja, Jyoti; Awadallah, Ahmed; Awadalla, Hany; Bach, Nguyen; Bahree, Amit; Bakhtiari, Arash. Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone. 2024-04-23. arXiv:2404.14219 [cs.CL].
  • Zaib, Munazza; Sheng, Quan Z.; Emma Zhang, Wei. A Short Survey of Pre-trained Language Models for Conversational AI-A New Age in NLP. Proceedings of the Australasian Computer Science Week Multiconference. 4 February 2020: 1–4. ISBN 9781450376976. S2CID 211040895. arXiv:2104.10810. doi:10.1145/3373017.3373028.
  • Shazeer, Noam; Mirhoseini, Azalia; Maziarz, Krzysztof; Davis, Andy; Le, Quoc; Hinton, Geoffrey; Dean, Jeff. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. 2017-01-01. arXiv:1701.06538 [cs.LG].
  • Lepikhin, Dmitry; Lee, HyoukJoong; Xu, Yuanzhong; Chen, Dehao; Firat, Orhan; Huang, Yanping; Krikun, Maxim; Shazeer, Noam; Chen, Zhifeng. GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding. 2021-01-12. arXiv:2006.16668 [cs.CL].
  • Lewis, Patrick; Perez, Ethan; Piktus, Aleksandra; Petroni, Fabio; Karpukhin, Vladimir; Goyal, Naman; Küttler, Heinrich; Lewis, Mike; Yih, Wen-tau; Rocktäschel, Tim; Riedel, Sebastian; Kiela, Douwe. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. Advances in Neural Information Processing Systems (Curran Associates, Inc.). 2020, 33: 9459–9474 [2023-06-12]. arXiv:2005.11401. (Archived from the original on 2023-06-12).
  • Ouyang, Long; Wu, Jeff; Jiang, Xu; Almeida, Diogo; Wainwright, Carroll L.; Mishkin, Pamela; Zhang, Chong; Agarwal, Sandhini; Slama, Katarina; Ray, Alex; Schulman, John; Hilton, Jacob; Kelton, Fraser; Miller, Luke; Simens, Maddie; Askell, Amanda; Welinder, Peter; Christiano, Paul; Leike, Jan; Lowe, Ryan. Training language models to follow instructions with human feedback. 2022. arXiv:2203.02155 [cs.CL].
  • Sharir, Or; Peleg, Barak; Shoham, Yoav. The Cost of Training NLP Models: A Concise Overview. 2020. arXiv:2004.08900 [cs.CL].
  • Biderman, Stella; Schoelkopf, Hailey; Anthony, Quentin; Bradley, Herbie; Khan, Mohammad Aflah; Purohit, Shivanshu; Prashanth, USVSN Sai. Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling. April 2023. arXiv:2304.01373 [cs.CL].
  • Maslej, Nestor; Fattorini, Loredana; Brynjolfsson, Erik; Etchemendy, John; Ligett, Katrina; Lyons, Terah; Manyika, James; Ngo, Helen; Niebles, Juan Carlos, Artificial Intelligence Index Report 2023, 2023-10-05, arXiv:2310.03715.
  • Section 2.1 and Table 1, Kaplan, Jared; McCandlish, Sam; Henighan, Tom; Brown, Tom B.; Chess, Benjamin; Child, Rewon; Gray, Scott; Radford, Alec; Wu, Jeffrey; Amodei, Dario. Scaling Laws for Neural Language Models. 2020. arXiv:2001.08361 [cs.LG].
  • Li, Junnan; Li, Dongxu; Savarese, Silvio; Hoi, Steven. BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. 2023-01-01. arXiv:2301.12597 [cs.CV].
  • Alayrac, Jean-Baptiste; Donahue, Jeff; Luc, Pauline; Miech, Antoine; Barr, Iain; Hasson, Yana; Lenc, Karel; Mensch, Arthur; Millican, Katherine; Reynolds, Malcolm; Ring, Roman; Rutherford, Eliza; Cabi, Serkan; Han, Tengda; Gong, Zhitao. Flamingo: a Visual Language Model for Few-Shot Learning. Advances in Neural Information Processing Systems. 2022-12-06, 35: 23716–23736 [2023-07-02]. arXiv:2204.14198. (Archived from the original on 2023-07-02).
  • Liu, Haotian; Li, Chunyuan; Wu, Qingyang; Lee, Yong Jae. Visual Instruction Tuning. 2023-04-01. arXiv:2304.08485 [cs.CV].
  • Zhang, Hang; Li, Xin; Bing, Lidong. Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding. 2023-06-01. arXiv:2306.02858 [cs.CL].
  • OpenAI. GPT-4 Technical Report. 2023-03-27. arXiv:2303.08774 [cs.CL].
  • Queenie Luo; Michael J. Puett; Michael D. Smith. A Perspectival Mirror of the Elephant: Investigating Language Bias on Google, ChatGPT, Wikipedia, and YouTube. arXiv. (Archived from the original on 2024-04-16).
  • Lei Huang; Weijiang Yu; Weitao Ma. A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions. arXiv. (Archived from the original on 2024-11-28).

blog.google (Global: 2,218th place; Chinese: 5,303rd place)

doi.org (Global: 2nd place; Chinese: 23rd place)

doi.org

dx.doi.org

eswc-conferences.org (Global: low place; Chinese: low place)

2024.eswc-conferences.org

euronews.com (Global: 612th place; Chinese: 2,396th place)

harvard.edu (Global: 18th place; Chinese: 57th place)

ui.adsabs.harvard.edu

huggingface.co (Global: low place; Chinese: low place)

ibm.com (Global: 1,131st place; Chinese: 1,050th place)

ieee.org (Global: 652nd place; Chinese: 712th place)

ieeexplore.ieee.org

jalammar.github.io (Global: low place; Chinese: low place)

llmbook-zh.github.io (Global: low place; Chinese: low place)

  • 赵鑫; 李军毅; 周昆; 唐天一; 文继荣. 大语言模型 [Large Language Models]. Beijing: Higher Education Press. 2024.

meta.com (Global: low place; Chinese: low place)

ai.meta.com

mit.edu (Global: 415th place; Chinese: 500th place)

direct.mit.edu

mittrchina.com (Global: low place; Chinese: low place)

mlr.press (Global: low place; Chinese: low place)

proceedings.mlr.press

  • Nagel, Markus; Amjad, Rana Ali; Baalen, Mart Van; Louizos, Christos; Blankevoort, Tijmen. Up or Down? Adaptive Rounding for Post-Training Quantization. Proceedings of the 37th International Conference on Machine Learning (PMLR). 2020-11-21: 7197–7206 [2023-06-14]. (Archived from the original on 2023-06-14).
  • Kiros, Ryan; Salakhutdinov, Ruslan; Zemel, Rich. Multimodal Neural Language Models. Proceedings of the 31st International Conference on Machine Learning (PMLR). 2014-06-18: 595–603 [2023-07-02]. (Archived from the original on 2023-07-02).

nature.com (Global: 234th place; Chinese: 227th place)

neurips.cc (Global: low place; Chinese: low place)

proceedings.neurips.cc

  • Brown, Tom B.; Mann, Benjamin; Ryder, Nick; Subbiah, Melanie; Kaplan, Jared; Dhariwal, Prafulla; Neelakantan, Arvind; Shyam, Pranav; Sastry, Girish; Askell, Amanda; Agarwal, Sandhini; Herbert-Voss, Ariel; Krueger, Gretchen; Henighan, Tom; Child, Rewon; Ramesh, Aditya; Ziegler, Daniel M.; Wu, Jeffrey; Winter, Clemens; Hesse, Christopher; Chen, Mark; Sigler, Eric; Litwin, Mateusz; Gray, Scott; Chess, Benjamin; Clark, Jack; Berner, Christopher; McCandlish, Sam; Radford, Alec; Sutskever, Ilya; Amodei, Dario. Larochelle, H.; Ranzato, M.; Hadsell, R.; Balcan, M.F.; Lin, H. (eds.). Language Models are Few-Shot Learners (PDF). Advances in Neural Information Processing Systems (Curran Associates, Inc.). Dec 2020, 33: 1877–1901 [2023-03-14]. arXiv:2005.14165. doi:10.1145/3582269.3615599. (Archived (PDF) from the original on 2023-11-17).
  • Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N; Kaiser, Łukasz; Polosukhin, Illia. Attention is All you Need (PDF). Advances in Neural Information Processing Systems (Curran Associates, Inc.). 2017, 30 [2024-01-21]. (Archived (PDF) from the original on 2024-02-21).
  • Lewis, Patrick; Perez, Ethan; Piktus, Aleksandra; Petroni, Fabio; Karpukhin, Vladimir; Goyal, Naman; Küttler, Heinrich; Lewis, Mike; Yih, Wen-tau; Rocktäschel, Tim; Riedel, Sebastian; Kiela, Douwe. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. Advances in Neural Information Processing Systems (Curran Associates, Inc.). 2020, 33: 9459–9474 [2023-06-12]. arXiv:2005.11401. (Archived from the original on 2023-06-12).
  • Krizhevsky, Alex; Sutskever, Ilya; Hinton, Geoffrey E. ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems (Curran Associates, Inc.). 2012, 25 [2023-07-02]. (Archived from the original on 2023-07-02).
  • Alayrac, Jean-Baptiste; Donahue, Jeff; Luc, Pauline; Miech, Antoine; Barr, Iain; Hasson, Yana; Lenc, Karel; Mensch, Arthur; Millican, Katherine; Reynolds, Malcolm; Ring, Roman; Rutherford, Eliza; Cabi, Serkan; Han, Tengda; Gong, Zhitao. Flamingo: a Visual Language Model for Few-Shot Learning. Advances in Neural Information Processing Systems. 2022-12-06, 35: 23716–23736 [2023-07-02]. arXiv:2204.14198. (Archived from the original on 2023-07-02).

nvidia.com (Global: 2,503rd place; Chinese: 1,088th place)

blogs.nvidia.com

nytimes.com (Global: 7th place; Chinese: 31st place)

openai.com (Global: 1,559th place; Chinese: 848th place)

platform.openai.com

openai.com

cdn.openai.com

openreview.net (Global: low place; Chinese: low place)

ourworldindata.org (Global: 2,263rd place; Chinese: 5,757th place)

researchgate.net (Global: 120th place; Chinese: 337th place)

rgdoi.net (Global: low place; Chinese: low place)

semanticscholar.org (Global: 11th place; Chinese: 332nd place)

api.semanticscholar.org

springer.com (Global: 274th place; Chinese: 320th place)

link.springer.com

stanford.edu (Global: 179th place; Chinese: 275th place)

web.stanford.edu

techcrunch.com (Global: 187th place; Chinese: 481st place)

technologyreview.com (Global: 1,943rd place; Chinese: 2,036th place)

thecvf.com (Global: low place; Chinese: low place)

openaccess.thecvf.com

  • Antol, Stanislaw; Agrawal, Aishwarya; Lu, Jiasen; Mitchell, Margaret; Batra, Dhruv; Zitnick, C. Lawrence; Parikh, Devi. VQA: Visual Question Answering. ICCV. 2015: 2425–2433 [2023-07-02]. (Archived from the original on 2023-07-02).

theguardian.com (Global: 12th place; Chinese: 60th place)

thepaper.cn (Global: 1,497th place; Chinese: 54th place)

theregister.com (Global: 3,700th place; Chinese: 4,616th place)

towardsdatascience.com (Global: 8,920th place; Chinese: 7,729th place)

unite.ai (Global: low place; Chinese: low place)

venturebeat.com (Global: 616th place; Chinese: 838th place)

web.archive.org (Global: 1st place; Chinese: 1st place)

worldcat.org (Global: 5th place; Chinese: 12th place)

yenniejun.com (Global: low place; Chinese: low place)

blog.yenniejun.com

youtube.com (Global: 9th place; Chinese: 2nd place)