Output details
11 - Computer Science and Informatics
University of Sheffield
Adapting SVM for data sparseness and imbalance: a case study in information extraction
<22> Supervised learning approaches are seriously hampered by imbalanced training data. This paper is the first to show how to apply the uneven margins SVM model to address this problem within NLP, where it is pervasive. The algorithm achieved the best reported results on two benchmark datasets for evaluating ML algorithms for information extraction. The paper appears in a leading NLP journal and, together with a preliminary conference version (CoNLL), has 75 citations in Google Scholar. An open source implementation is being used by South London and Maudsley NHS Trust (Robert Stewart <robert.stewart@kcl.ac.uk>) to extract information from clinical records.