Talks and poster presentations (with proceedings entry):

N. Rekabsaz, M. Lupu, A. Hanbury, G. Zuccon:
"Exploration of a Threshold for Similarity based on Uncertainty in Word Embedding";
Talk: European Conference on IR Research (ECIR), Aberdeen, UK; 08.04.2017 - 13.04.2017; in: "Advances in Information Retrieval", Springer, Cham, 10193 (2017), ISBN: 978-3-319-56607-8; pp. 396 - 409.



English abstract:
Word embedding promises a quantification of the similarity between terms. However, it is not clear to what extent this similarity value can be of practical use for subsequent information access tasks. In particular, which range of similarity values is indicative of the actual term relatedness? We first observe and quantify the uncertainty of word embedding models with respect to the similarity values they generate. Based on this, we introduce a general threshold which effectively filters related terms. We explore the effect of dimensionality on this general threshold by conducting the experiments in different vector dimensions. Our evaluation on four test collections with four relevance scoring models supports the effectiveness of our approach, as the results of the proposed threshold are significantly better than the baseline while being equal to, or statistically indistinguishable from, the optimal results.


"Official" electronic version of the publication (according to its Digital Object Identifier - DOI)
http://dx.doi.org/10.1007/978-3-319-56608-5_31


Generated from the publication database of the Technische Universität Wien.