Conference: International Conference on Learning Representations
Languages: All Languages
Programming languages: Python
Project website: https://github.com/google/trax/tree/master/trax/models/reformer
Large Transformer models routinely achieve state-of-the-art results on a number of tasks, but training these models can be prohibitively costly, especially on long sequences. We introduce two techniques to improve the efficiency of Transformers.
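As a minimal usage sketch for the linked repository: the `trax.models.ReformerLM` constructor and the parameter values below are assumptions based on the trax codebase and may differ across versions; they are illustrative, not the paper's experimental settings.

```python
# Minimal sketch, assuming the trax library from the linked repository
# (pip install trax). Hyperparameter values are illustrative only.
import trax

model = trax.models.ReformerLM(
    vocab_size=32000,  # illustrative vocabulary size
    d_model=512,       # width of the residual stream
    d_ff=2048,         # feed-forward hidden width
    n_layers=6,        # number of decoder blocks
    n_heads=8,         # attention heads per layer
    max_len=16384,     # a long-sequence setting, the regime the paper targets
    mode='train',
)
```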