Output details
11 - Computer Science and Informatics
University of Oxford
OXPath: A language for scalable data extraction, automation, and crawling on the deep web
<16>
This paper introduces OXPath, the first wrapper extraction language with guaranteed constant memory for bounded-depth extractions. This is the first hard memory guarantee for a wrapper language that holds independently of the number of pages wrapped. OXPath also outperforms existing wrapper systems, including commercial ones, often by several orders of magnitude. This has been acknowledged by selection for the best paper issue of VLDB 2011, the top-level database conference. OXPath has also received the silver prize in the Open Source Software World Challenge 2011, and there have been several tutorials at industry events. It has seen considerable uptake in academia and open-source projects.