Megatron-LM

Year: 2020
Languages: All Languages
Programming languages: Python
Input data: sentences

In this work, we present our techniques for training very large transformer models and implement a simple, efficient intra-layer model-parallel approach that enables training transformer models with billions of parameters.
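The intra-layer (tensor) model parallelism described above splits each transformer layer's weight matrices across workers. A minimal single-process sketch of the core idea for an MLP block Y = GeLU(X·A)·B: split A column-wise and B row-wise, so the nonlinearity applies independently per shard and one summation (the all-reduce in a real distributed setup) recovers the full output. All shapes and names here are illustrative assumptions, not taken from the Megatron-LM codebase.

```python
import numpy as np

rng = np.random.default_rng(0)
n_parts = 2                       # number of simulated model-parallel workers
X = rng.standard_normal((4, 8))   # input activations
A = rng.standard_normal((8, 16))  # first MLP weight matrix
B = rng.standard_normal((16, 8))  # second MLP weight matrix

def gelu(x):
    # tanh approximation of GeLU
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

# Split A column-wise and B row-wise; GeLU then acts shard-locally.
A_shards = np.split(A, n_parts, axis=1)
B_shards = np.split(B, n_parts, axis=0)

# Each worker computes a partial result; summing plays the role of all-reduce.
Y_parallel = sum(gelu(X @ A_i) @ B_i for A_i, B_i in zip(A_shards, B_shards))
Y_serial = gelu(X @ A) @ B

assert np.allclose(Y_parallel, Y_serial)
```

Splitting A column-wise (rather than row-wise) is what lets each worker apply GeLU without communicating, since each shard produces a disjoint slice of the hidden activations.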
