BARThez

Year: 2020
Languages: French
Programming languages: Python
Input data: sentences

A French sequence-to-sequence pretrained model based on BART.
BARThez is pretrained by learning to reconstruct corrupted input sentences. A corpus of 66GB of raw French text is used to carry out the pretraining. Unlike existing BERT-based French language models such as CamemBERT and FlauBERT, BARThez is particularly well suited to generative tasks, since not only its encoder but also its decoder is pretrained.
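Because BARThez is trained to reconstruct corrupted input, it can fill a masked span in a French sentence out of the box. Below is a minimal sketch using the Hugging Face Transformers library; the model identifier "moussaKam/barthez" and the "<mask>" token are assumptions here, so check the model hub for the exact values.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed model id; verify against the model hub before use.
tokenizer = AutoTokenizer.from_pretrained("moussaKam/barthez")
model = AutoModelForSeq2SeqLM.from_pretrained("moussaKam/barthez")

# BARThez's denoising pretraining objective lets it reconstruct a
# corrupted (here, masked) French input sentence.
text = "Paris est la <mask> de la France."
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

The same sequence-to-sequence interface extends to generative tasks such as French abstractive summarization after fine-tuning.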
