Funnel Transformer

Year: 2020
Journal: Conference on Neural Information Processing Systems
Languages: All Languages
Programming languages: Python
Input data:


Funnel-Transformer is a new self-attention model that gradually compresses the sequence of hidden states into a shorter one, reducing the computation cost. More importantly, by reinvesting the FLOPs saved through length reduction in a deeper or wider model, Funnel-Transformer usually has higher capacity for the same FLOPs budget. In addition, a decoder lets Funnel-Transformer recover a deep representation for each token from the reduced hidden sequence, which enables standard token-level pretraining.
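The compress-then-recover idea can be illustrated with a minimal pure-Python sketch. It assumes mean pooling with stride 2 as the compression step and simple repetition as the upsampling step; the function names (mean_pool, upsample_repeat, attention_cost) are hypothetical and chosen for illustration, and the cost formula is only a rough count for the attention score matrix, not the model's full FLOPs.

```python
from typing import List

Vector = List[float]


def mean_pool(hidden: List[Vector], stride: int = 2) -> List[Vector]:
    """Compress a sequence by averaging each window of `stride` vectors.

    Assumption: mean pooling stands in for the length-reduction step.
    """
    pooled = []
    for start in range(0, len(hidden), stride):
        window = hidden[start:start + stride]
        dim = len(window[0])
        pooled.append([sum(v[d] for v in window) / len(window) for d in range(dim)])
    return pooled


def upsample_repeat(hidden: List[Vector], factor: int, target_len: int) -> List[Vector]:
    """Expand a short sequence to `target_len` by repeating each vector.

    Assumption: repetition stands in for the decoder's upsampling step.
    """
    expanded = [list(v) for v in hidden for _ in range(factor)]
    return expanded[:target_len]


def attention_cost(seq_len: int, dim: int) -> int:
    """Rough multiply-add count for the seq_len x seq_len attention scores."""
    return seq_len * seq_len * dim


# Four token vectors of width 2 (toy values).
tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]

short = mean_pool(tokens)  # length 4 -> 2
restored = upsample_repeat(short, factor=2, target_len=len(tokens))  # length 2 -> 4

print(short)
print(restored)
print(attention_cost(len(tokens), 2), attention_cost(len(short), 2))
```

In this toy setting, halving the sequence length cuts the attention-score cost to a quarter, which is the kind of saving that can be reinvested in extra layers or a wider hidden size. The restored sequence has one vector per token again, though neighboring tokens share the same vector until further layers refine them.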
