Data Augmentation for Enlarging Student Feature Space and Improving Random Forest Success Prediction
Abstract
One of the main problems encountered when predicting student success, as a tool to aid students, is the lack of data available to model each student. This scarcity stems partly from the small number of students in each university course and partly from the limited number of features describing each student's educational background. In this article, we augment the student feature space with new features to obtain an improved model. These features fall into three groups: externally added data, metric and counter data, and evolutive data. We then assess how well the augmented data classifies at-risk students in their first year of university. The classifiers are built using Random Forests. Because this learning method measures variable importance, we can examine the relevance of the augmented data, as well as which data groups contribute the most significant features.
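As a rough illustration of comparing data groups through variable importance, the sketch below sums per-feature importance scores within each feature group and ranks the groups. It is a minimal sketch under stated assumptions, not the article's procedure: the function `group_importance`, the feature names, the "original" group label, and the numeric scores are all hypothetical placeholders. In practice the scores could come from a fitted Random Forest, for example the `feature_importances_` attribute of scikit-learn's `RandomForestClassifier`.

```python
from collections import defaultdict


def group_importance(importances, feature_groups):
    """Sum per-feature importance scores within each feature group.

    importances: dict mapping feature name -> importance score
    feature_groups: dict mapping feature name -> group label
    Returns a dict of group totals ordered from most to least important.
    """
    totals = defaultdict(float)
    for feature, score in importances.items():
        totals[feature_groups[feature]] += score
    # Rank the groups by their total importance, largest first.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Placeholder scores for illustration only; these are NOT results from the article.
importances = {
    "entry_grade": 0.25,           # hypothetical original feature
    "regional_income": 0.125,      # hypothetical externally added feature
    "failed_course_count": 0.25,   # hypothetical counter feature
    "grade_trend": 0.375,          # hypothetical evolutive feature
}

# Hypothetical assignment of each feature to a data group.
feature_groups = {
    "entry_grade": "original",
    "regional_income": "external",
    "failed_course_count": "metric_counter",
    "grade_trend": "evolutive",
}

ranked = group_importance(importances, feature_groups)
for group, total in ranked.items():
    print(f"{group}: {total:.3f}")
```

Summing is only one possible aggregation. Averaging within each group would avoid favoring groups that simply contain more features.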
Origin: Files produced by the author(s)