I-BERT
Year: 2021
Programming languages: Python
Input data:
Text (standard token sequences); quantization of the inputs is performed internally by the model, so no pre-quantized data is required.
Project website: https://github.com/kssteven418/I-BERT
In this work, we propose I-BERT, a novel quantization scheme for Transformer-based models that quantizes the entire inference with integer-only arithmetic. Based on lightweight integer-only approximation methods for nonlinear operations, e.g., GELU, Softmax, and Layer Normalization, I-BERT performs end-to-end integer-only BERT inference without any floating-point calculation.
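To illustrate the integer-only approximation idea, below is a minimal NumPy sketch of the polynomial GELU approximation along the lines of the paper's i-GELU: erf is replaced by a clipped second-order polynomial, and everything after the initial quantization uses only integer-valued arithmetic plus a tracked scale factor. The polynomial coefficients follow the paper; the function names, the example scale, and the surrounding quantize/dequantize plumbing are illustrative assumptions, not the repository's actual API.

```python
import numpy as np

def int_poly(q, scale, a, b, c):
    # Evaluate a(x + b)^2 + c on quantized values, where x = q * scale.
    q_b = np.floor(b / scale)             # integer representation of b
    q_c = np.floor(c / (a * scale ** 2))  # integer representation of c
    q_out = (q + q_b) ** 2 + q_c          # integer-only arithmetic
    return q_out, a * scale ** 2          # output scale absorbs a

def int_erf(q, scale):
    # Second-order polynomial approximation of erf
    # (coefficients as reported in the paper).
    a, b, c = -0.2888, -1.769, 1.0
    q_sgn = np.sign(q)
    q_abs = np.minimum(np.abs(q), np.floor(-b / scale))  # clip |x| to [0, -b]
    q_poly, s_poly = int_poly(q_abs, scale, a, b, c)
    return q_sgn * q_poly, s_poly

def int_gelu(q, scale):
    # GELU(x) = x * 0.5 * (1 + erf(x / sqrt(2))), computed on integers.
    q_erf, s_erf = int_erf(q, scale / np.sqrt(2))
    q_one = np.floor(1.0 / s_erf)         # integer representation of 1
    return q * (q_erf + q_one), scale * s_erf / 2

# Example: quantize, apply integer-only GELU, dequantize.
# (Values are kept as integer-valued floats for simplicity.)
x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
scale = 4.0 / 127                         # assumed symmetric 8-bit scale
q = np.round(x / scale)
q_out, s_out = int_gelu(q, scale)
print(q_out * s_out)                      # close to the exact GELU of x
```

The key design point this sketch tries to capture is that nonlinear functions are reduced to additions, multiplications, and squarings of integers, with the floating-point part folded into a single per-tensor scale that is only applied at dequantization.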