

Talks and Poster Presentations (with Proceedings Entry):

J. Carme, M. Ceresna, O. Froelich, G. Gottlob, T. Hassan, M. Herzog, W. Holzinger, B. Krüpl:
"The Lixto Project: Exploring New Frontiers of Web Data Extraction";
Keynote Lecture: BNCOD 2006, Belfast, Northern Ireland, UK; 2006-07-18 - 2006-07-20; in: "Flexible and Efficient Information Handling, 23rd British National Conference on Databases, BNCOD 23", D. Bell, J. Hong (eds.); Springer, LNCS 4042 (2006), ISBN: 3-540-35969-9; 1 - 15.



English abstract:
The Lixto project is an ongoing research effort in the area of Web data extraction. The project originally set out to develop a logic-based extraction language and a tool for visually defining extraction programs from sample Web pages, but its scope has been extended over time. Today, new issues are being investigated, such as employing learning algorithms to define extraction programs, automatically extracting data from Web pages with a table-centric visual appearance, and extracting from alternative document formats such as PDF.

Created from the Publication Database of the Vienna University of Technology.