Output details
11 - Computer Science and Informatics
University of Leeds
A standard tag set expounding traditional morphological features for Arabic language part-of-speech tagging
<22>This is the first formal analysis of traditional Arabic grammarians’ theoretical research applied to NLP PoS-tagging, giving a detailed and comprehensive ontology of established Arabic word structure theory. Several Arabic PoS-tagsets have been developed for specific tasks, but they are generally adapted from English models and/or cover only a limited subset of Arabic morphology. We provide the first and only tagset unifying research in Arabic formal linguistics and NLP, serving as a benchmark for comparison and evaluation of task-specific PoS-tagsets. Arabic NLP research is booming but fragmented; this work will enable NLP research to be grounded in established traditional Arabic linguistic theory.