Talks and Poster Presentations (with Proceedings Entry):

D. Damljanovic, U. Kruschwitz, M. Albakour, J. Petrak, M. Lupu:
"Applying Random Indexing to Structured Data to Find Contextually Similar Words";
Talk: LREC 2012 8th ELRA Conference on Language Resources and Evaluation, Istanbul, Turkey; 21 May 2012 - 27 May 2012; in: "LREC 2012 8th ELRA Conference on Language Resources and Evaluation", European Language Resources Association (ELRA), (2012), ISBN: 978-2-9517408-7-7; pp. 2023 - 2030.



Abstract (English):
Language resources extracted from structured data (e.g. Linked Open Data) have already been used in various scenarios to improve
conventional Natural Language Processing techniques. The meanings of words and the relations between them are made more explicit
in RDF graphs than in human-readable text, and hence have great potential to improve legacy applications. In this paper,
we describe an approach that can be used to extend or clarify the semantic meaning of a word by constructing a list of contextually
related terms. Our approach exploits the structure inherent in an RDF graph and then applies methods from statistical
semantics, in particular Random Indexing, to discover contextually related terms. We evaluate our approach in the life science
domain using a dataset generated with the help of domain experts from a large pharmaceutical company (AstraZeneca). They
were involved in two phases: first, to generate a set of keywords of interest to them, and second, to judge the set of generated
contextually similar words for each keyword of interest. We compare our proposed approach, which exploits the semantic graph, with the
same method applied to the human-readable text extracted from the graph.
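
The general idea of Random Indexing over graph data can be illustrated with a minimal sketch. This is not the authors' implementation: treating each RDF triple as a single context, the vector dimension, the number of non-zero entries, and all function names are assumptions made for illustration, and the triples are invented toy data. Each term receives a sparse random index vector, each term's context vector accumulates the index vectors of the terms it co-occurs with, and cosine similarity between context vectors ranks contextually similar terms.

```python
import math
import random
from collections import defaultdict


def index_vector(dim, nonzeros, rng):
    """Sparse ternary index vector: a few +1/-1 entries, all others 0."""
    vec = [0] * dim
    for i, pos in enumerate(rng.sample(range(dim), nonzeros)):
        vec[pos] = 1 if i % 2 == 0 else -1
    return vec


def random_indexing(triples, dim=300, nonzeros=10, seed=0):
    """Build context vectors, treating each triple as one context (assumption)."""
    rng = random.Random(seed)
    index = {}
    context = defaultdict(lambda: [0] * dim)

    def idx(term):
        if term not in index:
            index[term] = index_vector(dim, nonzeros, rng)
        return index[term]

    for triple in triples:
        for a in triple:
            for b in triple:
                if a != b:
                    # Add the index vector of b to the context vector of a.
                    iv, cv = idx(b), context[a]
                    for k in range(dim):
                        cv[k] += iv[k]
    return dict(context)


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def most_similar(term, vectors, top_n=5):
    """Rank all other terms by cosine similarity to the given term."""
    target = vectors[term]
    scored = [(t, cosine(target, v)) for t, v in vectors.items() if t != term]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Invented toy triples (subject, predicate, object), for illustration only.
triples = [
    ("drug_a", "treats", "condition_x"),
    ("drug_b", "treats", "condition_x"),
    ("drug_a", "type", "drug"),
    ("drug_b", "type", "drug"),
    ("city_c", "capitalOf", "country_d"),
    ("city_c", "type", "city"),
]
vectors = random_indexing(triples)
print(most_similar("drug_a", vectors, top_n=3))
```

In this toy graph, drug_a and drug_b occur in identical contexts, so drug_b ranks first for drug_a. Applying the same procedure to sentences extracted from the graph instead of triples would correspond to the text-based baseline mentioned in the abstract, again only as a rough analogy.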

Keywords:
latent semantic, text mining, random indexing


Electronic version of the publication:
http://publik.tuwien.ac.at/files/PubDat_213935.pdf