Kemper, Lorenz
Predicting Student Dropout: A Machine Learning Approach (unpublished)
2018.
Keywords: decision trees, dropout, higher education, logistic regression, machine learning, massive open online courses (MOOCs), O, prediction, students, academic success
@unpublished{Kemper2018,
title = {Predicting Student Dropout: A Machine Learning Approach},
author = {Lorenz Kemper},
url = {https://www.researchgate.net/publication/322919234_Predicting_Student_Dropout_a_Machine_Learning_Approach},
year = {2018},
date = {2018-02-01},
urldate = {2018-08-22},
institution = {Karlsruhe Institute of Technology (KIT)},
abstract = {We perform two approaches of machine learning, logistic regression and decision trees, to predict student dropout at the Karlsruhe Institute of Technology (KIT). The models are computed on the basis of examination data, i.e. data available at all universities without need of collection. Therefore, we propose a methodical approach that may be put in practice with relative ease at other institutions. Using a Hellinger-Distance splitting approach we find decision trees to produce slightly better results. However, both methods yield high prediction accuracies of up to 95\% after three semesters. A classification with more than 83\% accuracy is already possible after the first semester. Within our analysis we show that resampling techniques can improve the detection of at-risk students.},
keywords = {decision trees, dropout, higher education, logistic regression, machine learning, massive open online courses (MOOCs), O, prediction, students, Studienerfolg},
pubstate = {published},
tppubtype = {unpublished}
}