mBART
Year: 2020
Journal: Transactions of the Association for Computational Linguistics (TACL)
Languages: Arabic, Burmese, Chinese (simplified), Czech, Dutch, English, Estonian, Finnish, French, German, Gujarati, Hindi, Italian, Japanese, Kazakh, Korean, Latvian, Lithuanian, Nepali, Romanian, Russian, Sinhala, Spanish, Turkish, Vietnamese
Programming languages: Python
Input data:
text/sentence
Output data:
text (translation)
Project website: https://github.com/pytorch/fairseq/tree/master/examples/mbart
mBART is a sequence-to-sequence denoising autoencoder pre-trained on large-scale monolingual corpora in many languages using the BART objective. It is one of the first methods to pre-train a complete sequence-to-sequence model by denoising full texts in multiple languages, whereas previous approaches focused only on the encoder, the decoder, or reconstructing parts of the text.
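A minimal usage sketch is shown below. It assumes the mBART checkpoints ported to the Hugging Face Transformers library (here the English-to-Romanian fine-tuned model facebook/mbart-large-en-ro) rather than the fairseq command-line tools linked above; the target-language token passed as decoder_start_token_id tells the decoder which language to generate.

from transformers import MBartForConditionalGeneration, MBartTokenizer

# Load an mBART checkpoint fine-tuned for English-to-Romanian translation
tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-en-ro", src_lang="en_XX")
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-en-ro")

# Encode an English sentence and generate its Romanian translation
inputs = tokenizer("UN Chief Says There Is No Military Solution in Syria", return_tensors="pt")
generated = model.generate(**inputs, decoder_start_token_id=tokenizer.lang_code_to_id["ro_RO"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])

Fine-tuning on a different language pair follows the same pattern: the input and output are plain sentences, with language identity carried by the special language-code tokens (e.g. en_XX, ro_RO).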