Output details
11 - Computer Science and Informatics
University of Leeds
Collective Classification of Fine-Grained Information Status
<22>Automatic recognition of information status (and its subproblems, e.g. anaphoricity determination) is crucial for wide-ranging applications in information extraction, summarization, etc. This paper (part of Markert's Humboldt fellowship) breaks entirely new ground by classifying all mentions in a document collectively for information status, yielding strong improvements over the state of the art. It also presents the first written English corpus that is annotated reliably for both information status and fine-grained anaphoric distinctions. We expect frequent use of the publicly available corpus due to the wide variety of phenomena annotated (two international groups have started using it; we used it for bridging work, NAACL 2013).