Output details
11 - Computer Science and Informatics
University of Sheffield
Extracting bilingual terminologies from comparable corpora
<22> This paper proposes a novel classifier-based approach to building bilingual terminologies from comparable corpora, in which the classifier is trained using cross-language interwiki links from Wikipedia. The work was carried out in the EU-funded Terminology as a Service (TaaS) project. It builds on a novel approach to general bilingual phrase extraction from comparable corpora, developed in the earlier EU-funded Accurat project, that was shown to significantly improve machine translation (Aker, Feng and Gaizauskas, COLING 2012). The TaaS co-ordinator, Tilde SIA, is currently embedding the approach proposed in this paper within a commercial terminology platform.