Article citations

Smith, S., Patwary, M., Norick, B., LeGresley, P., Rajbhandari, S., Casper, J., et al. (2022) Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, a Large-Scale Generative Language Model. arXiv:2201.11990.
https://doi.org/10.48550/arXiv.2201.11990

has been cited by the following article:
