ROUGE Metric Evaluation for Text Summarization Techniques
31 Pages. Posted: 26 May 2022
Abstract
Approaches to Automatic Text Summarization aim to extract the key information from one or more input texts and to generate summaries that preserve the meaning of the content. These approaches fall into two groups, Extractive and Abstractive, which differ in how they operate: the former selects sentences directly from the source document, whereas the latter interprets the text and rewrites its sentences, often with new words. It is therefore important to assess how similar a summary is to the original text, which raises the question: how can the quality of the summaries produced by these methodologies be evaluated? Various metrics and scores have been proposed in the literature for evaluating text summarization results, but the most widely used is ROUGE. The primary goal of this study is thus to accurately characterize the behavior of the ROUGE metric. We conducted a first experiment to compare the metric's effectiveness in evaluating Abstractive versus Extractive Text Summarization algorithms, and a second experiment to compare the scores obtained by two different summarization strategies: a single execution of one summarization algorithm versus multiple executions of different algorithms on the same text. Our results show that ROUGE does not yield impressive results, since it behaves similarly on both Abstractive and Extractive algorithms; furthermore, in most cases, multiple executions outperform a single one.
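As a concrete illustration of the evaluation setting described above, the following is a minimal sketch of how ROUGE scores can be computed between a candidate summary and a reference text. It uses the open-source `rouge-score` Python package; the package choice and the example strings are our own assumptions for illustration, not the study's actual pipeline or data.

```python
# Minimal sketch: scoring a candidate summary against a reference with ROUGE.
# Assumes the open-source `rouge-score` package (pip install rouge-score);
# the example texts are illustrative only, not data from the study.
from rouge_score import rouge_scorer

reference = (
    "Automatic text summarization extracts key information from input texts "
    "and generates summaries while preserving the meaning of the content."
)
candidate = (
    "Text summarization methods produce short summaries that keep the key "
    "information and meaning of the original documents."
)

# ROUGE-1 (unigram overlap), ROUGE-2 (bigram overlap), and ROUGE-L (longest
# common subsequence) are the variants most commonly reported in the literature.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)

for variant, result in scores.items():
    # Each result carries precision, recall, and F-measure for that variant.
    print(f"{variant}: P={result.precision:.3f} "
          f"R={result.recall:.3f} F={result.fmeasure:.3f}")
```

Comparing a summary produced by a single algorithm against one assembled from multiple algorithms, as in the second experiment, would amount to scoring each candidate this way and comparing the resulting F-measures.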
Keywords: Automatic Text Summarization Algorithms, Extractive, Abstractive, ROUGE metric, BERT