Talks and Poster Presentations (with Proceedings-Entry):
P. Knees, M. Hübler:
"Towards Uncovering Dataset Biases: Investigating Record Label Diversity in Music Playlists";
Talk: 1st Workshop on Designing Human-Centric Music Information Research Systems,
Delft, The Netherlands;
2019-11-02; in: "Proceedings of the 1st Workshop on Designing Human-Centric Music Information Research Systems",
M. Miron (ed.);
Music recommender models are predominantly built upon the assumption that historical listening data is the result of the interaction between a user and an item, and that overall interaction patterns can be extrapolated to make future recommendations (collaborative filtering). In this paper, we argue that listening logs are the result not only of users interacting with items, but of users interacting with items through a listening service. As such, the service has an impact on the recommendations made and the data created, and consequently introduces biases into the datasets used for model training and evaluation. We investigate the case of a large dataset of Spotify playlists. In order to uncover patterns in the data, we augment the dataset with record label information crawled from the web. Initial analyses of record label diversity within the playlists reveal unequal distributions and a higher consistency of the most popular label, especially in short playlists with few albums. We discuss possible reasons for these patterns as well as potential algorithmic biases of the approach taken.
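As an illustration of the kind of per-playlist analysis the abstract describes, the following is a minimal sketch of how record label diversity might be summarised. The abstract does not state which measures the paper uses, so the measures here (number of distinct labels, share of the most frequent label, Shannon entropy) are assumptions, as are the function name `label_diversity`, the input format, and the toy playlists.

```python
from collections import Counter
from math import log2


def label_diversity(playlist_labels):
    """Summarise record label diversity for one playlist.

    playlist_labels: list of record label names, one per track.
    This input format is a hypothetical choice for illustration only.
    """
    counts = Counter(playlist_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("playlist has no tracks")

    # Most frequent label and the fraction of tracks it accounts for.
    top_label, top_count = counts.most_common(1)[0]

    # Shannon entropy of the label distribution, in bits.
    entropy = sum((c / total) * log2(total / c) for c in counts.values())

    return {
        "distinct_labels": len(counts),
        "top_label": top_label,
        "top_label_share": top_count / total,
        "entropy_bits": entropy,
    }


# Toy data (invented, not taken from the dataset described above).
playlists = {
    "short": ["Label A", "Label A", "Label A", "Label B"],
    "long": ["Label A", "Label B", "Label C", "Label D",
             "Label A", "Label E", "Label F", "Label B"],
}

summary = {name: label_diversity(labels) for name, labels in playlists.items()}
for name, stats in summary.items():
    print(name, stats)
```

In this toy example the short playlist is dominated by a single label, while the longer one spreads its tracks across more labels. The example only shows how such a comparison could be computed and says nothing about the actual findings.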
Created from the Publication Database of the Vienna University of Technology.