
Output details

11 - Computer Science and Informatics

University of Aberdeen

Output 13 of 74 in the submission
Article title

An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems

Type
D - Journal article
Title of journal
Computational Linguistics
Article number
-
Volume number
35
Issue number
4
First page of article
529
ISSN of journal
0891-2017
Year of publication
2009
URL
-
Number of additional authors
1
Additional information

This paper presents an empirical investigation into the validity of corpus-based evaluation metrics such as BLEU for evaluating Natural Language Generation (NLG) systems. It is the most careful and detailed such study yet performed, and is helping to shape the NLG community's perspective on using corpus-based evaluation metrics, especially in the context of the Generation Challenges series of shared NLG tasks. In addition, the experimental design presented in the paper for human ratings-based evaluations of NLG systems has been adapted and used by other NLG researchers who are looking for a rigorous design for such evaluations.

Interdisciplinary
-
Cross-referral requested
-
Research group
None
Citation count
16
Proposed double-weighted
No
Double-weighted statement
-
Reserve for a double-weighted output
No
Non-English
No
English abstract
-