
Output details

11 - Computer Science and Informatics

University of Aberdeen

Output 13 of 74 in the submission
Article title

An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems

Type
D - Journal article
Title of journal
Computational Linguistics
Article number
-
Volume number
35
Issue number
4
First page of article
529
ISSN of journal
0891-2017
Year of publication
2009
URL
-
Number of additional authors
1
Additional information

This paper presents an empirical investigation into the validity of corpus-based evaluation metrics such as BLEU for evaluating Natural Language Generation (NLG) systems. It is the most careful and detailed such study yet performed, and is helping to shape the NLG community's perspective on using corpus-based evaluation metrics, especially in the context of the Generation Challenges series of shared NLG tasks. In addition, the experimental design presented in the paper for human ratings-based evaluations of NLG systems has been adapted and used by other NLG researchers who are looking for a rigorous design for such evaluations.

Interdisciplinary
-
Cross-referral requested
-
Research group
None
Citation count
16
Proposed double-weighted
No
Double-weighted statement
-
Reserve for a double-weighted output
No
Non-English
No
English abstract
-