Diploma and Master Theses (authored and supervised):
"Algorithmic Approaches to the String Barcoding Problem";
Supervisor: G. Raidl;
Institut für Computergraphik und Algorithmen,
final examination: 2007-10.
This thesis deals with a heuristic approach based on Lagrangian relaxation to the string
barcoding (SB) problem, a close cousin to the well-known combinatorial set cover (SC)
problem. It has recently been proven to be NP-hard and has many real-world applications,
particularly in the fields of medicine and biology.
Given a set of sequences over some alphabet, DNA for instance, we aim at finding a set
of short sequences, so-called probes, such that we are able to identify an unknown sample
sequence as one of the input sequences by determining which probes are subsequences of
the sample, and which are not. The problem is twofold: the determination of all possible
probes and the selection of a suitable subset of minimum cardinality.
The problem has been dealt with under various other names and has in this form been
introduced by Rash and Gusfield in 2002. They proposed an exact approach based
on integer linear programming and the use of suffix trees to generate a complete, nonredundant
set of candidate probes.
We evaluated several approaches for the SB as well as the SC problem. One of the leading
heuristics for the SC problem, based on Lagrangian relaxation, has been proposed by
Caprara et al. in 1999. We adapted the algorithm to see if it works equally well when
applied to the structurally very similar SB problem. Though the results we obtained
are somewhat mixed, the heuristic shows its strength with very complex instances and
delivers much better results compared to simpler heuristics.
Created from the Publication Database of the Vienna University of Technology.