Text Summarization Evaluation – BLEU vs ROUGE
In general: BLEU measures precision: how many of the words (and/or n-grams) in the machine-generated summary appear in the human reference summaries. ROUGE measures recall: how many of the words (and/or n-grams) in the human reference summaries appear in the machine-generated summary. Naturally, these results are complementary, as is often the case with precision and recall.
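To make the distinction concrete, here is a minimal sketch of unigram-level precision (BLEU-style, with clipped counts) and recall (ROUGE-style). This is an illustrative simplification, not the full BLEU metric (no brevity penalty, no higher-order n-grams, single reference only); the function names and example sentences are my own.

```python
from collections import Counter

def clipped_overlap(candidate: str, reference: str) -> int:
    # Each candidate word counts at most as many times
    # as it appears in the reference (BLEU-style clipping).
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    return sum(min(count, ref[word]) for word, count in cand.items())

def unigram_precision(candidate: str, reference: str) -> float:
    # BLEU-style: fraction of candidate words found in the reference.
    return clipped_overlap(candidate, reference) / max(len(candidate.split()), 1)

def unigram_recall(candidate: str, reference: str) -> float:
    # ROUGE-style: fraction of reference words found in the candidate.
    return clipped_overlap(candidate, reference) / max(len(reference.split()), 1)

reference = "the cat sat on the mat"
candidate = "the cat the cat on the mat"

# Clipped overlap is 5 ("the" x2, "cat" x1, "on", "mat"):
# precision = 5/7, recall = 5/6.
print(unigram_precision(candidate, reference))
print(unigram_recall(candidate, reference))
```

A candidate that copies only safe reference words scores high precision but low recall; a long candidate that includes everything scores high recall but low precision, which is why the two metrics are usually reported together.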