

Diploma and Master Theses (authored and supervised):

P. Blasko:
"Identification of Credit Default Drivers via Lasso Estimation in the Logistic Regression Model";
Supervisor: U. Schneider; E105 - Stochastik und Wirtschaftsmathematik, 2020; final examination: 2020-05-25.



English abstract:
In this work, a binary logistic regression model for two-year default probabilities is estimated on a data set containing information on 150,000 clients, available from the Kaggle competition "GiveMeSomeCredit". The model is selected with the Lasso estimator, which chooses a subset of continuous, categorical and ordinal variables reflecting sociodemographic and behavioral properties of the clients as well as characteristics of their loans. The non-linear dependence of default probabilities on the regressors is addressed by discretizing the regressors with a version of the fused Lasso in a multivariate setting. We find that the model fits the data very well, reaching an average out-of-sample AUC of over 86%, independent of the model selection criterion (AIC, BIC or CV). This value lies in the upper range of the industry standard and is comparable to more complicated modeling approaches such as in Wang et al. (2015). We see that the estimator gives the strongest weights to behavioral variables such as past due status and limit utilization, while sociodemographic variables and loan properties are much less significant.
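
The variable-selection effect described in the abstract can be illustrated with a minimal sketch. It is not the thesis code: it assumes synthetic data instead of the "GiveMeSomeCredit" set, a plain proximal gradient solver, and a hand-picked penalty `lam` instead of selection by AIC, BIC or CV, and it omits the fused-Lasso discretization. All names (`fit_lasso_logistic`, `soft_threshold`, `auc`, `true_w`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|; small values become exactly zero.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_lasso_logistic(X, y, lam, step=0.5, iters=300):
    # Minimise mean log-loss + lam * sum(|w_j|) by proximal gradient descent.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            r = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += r / n
            for j in range(d):
                grad_w[j] += r * xi[j] / n
        b -= step * grad_b  # the intercept is not penalised
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    # Share of (positive, negative) pairs ranked correctly; ties count 1/2.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic data (assumption): only the first regressor drives default.
rng = random.Random(0)
true_w = [2.0, 0.0, 0.0, 0.0]
X = [[rng.gauss(0, 1) for _ in true_w] for _ in range(800)]
y = [1 if rng.random() < sigmoid(sum(t * x for t, x in zip(true_w, xi))) else 0
     for xi in X]
X_train, y_train, X_test, y_test = X[:500], y[:500], X[500:], y[500:]

w, b = fit_lasso_logistic(X_train, y_train, lam=0.12)
test_scores = [b + sum(wj * xj for wj, xj in zip(w, xi)) for xi in X_test]
test_auc = auc(test_scores, y_test)
print("coefficients:", [round(v, 3) for v in w])
print("out-of-sample AUC:", round(test_auc, 3))
```

In this toy setting the L1 penalty is expected to shrink the coefficients of the three noise regressors to exactly zero while keeping the informative one, which mirrors how the abstract describes the Lasso concentrating weight on a few behavioral variables.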

Created from the Publication Database of the Vienna University of Technology.