[Back]


Diploma and Master Theses (authored and supervised):

M. Dorner:
"Optimizations of structural join algorithms: XML - Structural Join Evolution";
Supervisor: R. Pichler; Institut für Informationssysteme, Arbeitsbereich Datenbanken & Artificial Intelligence, 2008; final examination: 2008-06-26.



English abstract:
Joins are a core operation in database systems. Nowadays every larger database system comes with XML support which allows XPath or XQuery statements to retrieve the stored data. Structural joins have to perform as good as possible to support efficient data retrieval. Many different algorithms have been proposed which work with different strategies to process a structural join. Every algorithm has its strength and weaknesses. The core of every algorithm is the numbering schema which allows determining a structural relationship between two nodes efficiently.

Earlier proposed algorithms like the TreeMerge and the StackTree only handle queries with two nodes. A longer query has to be split and its intermediate results merged together afterwards. Later algorithms like the TwigStack or the PathStack process a query at once and use a special stack-encoding to store intermediate results efficiently. The usage of stacks has proven as very efficient. The StackTree was the first algorithm which used one stack to cache ancestor nodes. State of the art is the Twig2Stack which can also handle optional query nodes. It uses a much more complex stack-encoding combined with the numbering schema.

Even specific index structures have been proposed to speed up the search for ancestors or descendants of a node. Some structures like the B+ tree have been adapted from the relational databases to work with the numbering schema. Other structures like the XR-tree were especially designed to find ancestors and descendants of a node very efficient. New concepts like stab lists have been introduced in this structure.

This work will give an overview how the algorithms get better and how the new strategies affect the join performance.

Created from the Publication Database of the Vienna University of Technology.