DiscoBERT

Year: 2020
Journal: Association for Computational Linguistics
Programming languages: Python
Input data:

text/sentences

Output data:

text

Recently, BERT has been adopted for document encoding in state-of-the-art text summarization models. However, sentence-based extractive models often result in redundant or uninformative phrases in the extracted summaries. Also, long-range dependencies throughout a document are not well captured by BERT, which is pre-trained on sentence pairs instead of documents. To address these issues, we present DiscoBERT, a discourse-aware neural summarization model. DiscoBERT extracts sub-sentential discourse units (instead of sentences) as candidates for extractive selection at a finer granularity. To capture the long-range dependencies among discourse units, structural discourse graphs are constructed based on RST trees and coreference mentions, and encoded with Graph Convolutional Networks.
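
The abstract describes encoding a graph over discourse units with Graph Convolutional Networks. The following is a minimal sketch of that idea in PyTorch; the class name GraphConvLayer, the mean-normalized aggregation, and the toy adjacency matrix are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class GraphConvLayer(nn.Module):
    """One GCN layer over discourse units: H' = ReLU(norm(A) H W)."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h, adj):
        # h:   (num_units, in_dim) discourse-unit representations
        #      (e.g., pooled from BERT token embeddings)
        # adj: (num_units, num_units) adjacency over discourse units,
        #      with edges derived from an RST tree or shared
        #      coreference mentions (hypothetical input here)
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        h = adj @ h / deg  # mean-aggregate each unit's neighbors
        return torch.relu(self.linear(h))

# Toy usage: 4 discourse units with 8-dim embeddings, linked in a chain.
h = torch.randn(4, 8)
adj = torch.tensor([[1., 1., 0., 0.],
                    [1., 1., 1., 0.],
                    [0., 1., 1., 1.],
                    [0., 0., 1., 1.]])
layer = GraphConvLayer(8, 8)
print(layer(h, adj).shape)  # torch.Size([4, 8])
```

In the paper's setup, such layers would be stacked and applied over the RST-based and coreference-based graphs so that each discourse unit's representation reflects long-range context before extractive selection; the specific normalization and stacking used by DiscoBERT may differ from this sketch.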
