[Zurück]


Zeitschriftenartikel:

B. Nikolic, A. Kartelj, M. Djukanovic, M. Grbic, C. Blum, G. Raidl:
"Solving the Longest Common Subsequence Problem Concerning Non-Uniform Distributions of Letters in Input Strings";
Mathematics, 9 (2021), S. 1 - 25.



Kurzfassung englisch:
The longest common subsequence (LCS) problem is a prominent NP-hard optimization
problem where, given an arbitrary set of input strings, the aim is to find a longest subsequence,
which is common to all input strings. This problem has a variety of applications in bioinformatics,
molecular biology and file plagiarism checking, among others. All previous approaches from the
literature are dedicated to solving LCS instances sampled from uniform or near-to-uniform probability
distributions of letters in the input strings. In this paper, we introduce an approach that is able to
effectively deal with more general cases, where the occurrence of letters in the input strings follows
a non-uniform distribution such as a multinomial distribution. The proposed approach makes
use of a time-restricted beam search, guided by a novel heuristic named GMPSUM. This heuristic
combines two complementary scoring functions in the form of a convex combination. Furthermore,
apart from the close-to-uniform benchmark sets from the related literature, we introduce three new
benchmark sets that differ in terms of their statistical properties. One of these sets concerns a case
study in the context of text analysis. We provide a comprehensive empirical evaluation in two
distinctive settings: (1) short-time execution with fixed beam size in order to evaluate the guidance
abilities of the compared search heuristics; and (2) long-time executions with fixed target duration
times in order to obtain high-quality solutions. In both settings, the newly proposed approach
performs comparably to state-of-the-art techniques in the context of close-to-uniform instances and
outperforms state-of-the-art approaches for non-uniform instances.


"Offizielle" elektronische Version der Publikation (entsprechend ihrem Digital Object Identifier - DOI)
http://dx.doi.org/10.3390/math9131515

Elektronische Version der Publikation:
https://publik.tuwien.ac.at/files/publik_301813.pdf


Erstellt aus der Publikationsdatenbank der Technischen Universität Wien.