M2M-100 (Many-to-Many-100)
Year: 2020
Languages: Afrikaans, Albanian, Amharic, Arabic, Armenian, Asturian, Azerbaijani, Bashkir, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Burmese, Catalan, Cebuano, Central Khmer, Chinese (Mandarin), Croatian, Czech, Danish, Dutch, English, Estonian, Farsi, Finnish, French, Fulah, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Igbo, Iloko, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Korean, Lao, Latvian, Lingala, Lithuanian, Luganda, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Marathi, Mongolian, Nepali, Northern Sotho, Norwegian, Occitan, Oriya, Panjabi, Pashto, Polish, Portuguese, Romanian, Russian, Scottish Gaelic, Serbian, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swati, Swedish, Tagalog, Tamil, Thai, Tswana, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Western Frisian, Wolof, Xhosa, Yiddish, Yoruba, Zulu
Programming languages: Python
Input data:
sequences of tokens
Output data:
sequences of tokens
Project website: https://github.com/pytorch/fairseq/tree/master/examples/m2m_100
In this work, we create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages.
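The original checkpoints and evaluation scripts live in the fairseq repository linked above. As a minimal sketch of direct translation between an arbitrary language pair, assuming the Hugging Face `transformers` port of the released checkpoints (the `facebook/m2m100_418M` model name is an assumption of this example, not part of this card):

```python
def translate(text, src_lang, tgt_lang, model_name="facebook/m2m100_418M"):
    """Translate `text` directly between any two of the 100 languages."""
    # Assumes the Hugging Face `transformers` port of the M2M-100
    # checkpoints; the original release is in fairseq (URL above).
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_name)
    model = M2M100ForConditionalGeneration.from_pretrained(model_name)
    tokenizer.src_lang = src_lang  # e.g. "fr"
    encoded = tokenizer(text, return_tensors="pt")
    # Forcing the decoder to start with the target-language token selects
    # one of 100 * 99 = 9,900 direct translation directions.
    generated = model.generate(
        **encoded, forced_bos_token_id=tokenizer.get_lang_id(tgt_lang)
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

# Usage (downloads the checkpoint on first call):
# translate("La vie est belle.", "fr", "en")
```

Because the model is many-to-many rather than English-centric, no pivoting through English is needed: the same call handles, say, French-to-Swahili directly.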