Prescriptive Learning Analytics for Student Dropout: Integrating Temporal Velocity and Counterfactual Explanations in Longitudinal Data
DOI:
https://doi.org/10.62411/jcta.15920
Keywords:
Class Imbalance, Counterfactual Explanations, Early Warning Systems, Educational Data Mining, Explainable Artificial Intelligence, Learning Analytics, Student Dropout Prediction, Temporal Data Leakage
Abstract
Student dropout in higher education remains a persistent socioeconomic challenge, yet many predictive models reported in the literature are methodologically compromised by randomized cross-validation schemes that introduce temporal data leakage and artificially inflate predictive performance. This study proposes a longitudinal prescriptive learning analytics framework integrating three complementary methodological components: a Leave-One-Cohort-Out (LOCO) temporal validation protocol, a hybrid class balancing strategy combining the Synthetic Minority Oversampling Technique with Edited Nearest Neighbors (SMOTE-ENN), and temporal velocity feature engineering derived from Learning Management System (LMS) behavioral trajectories. The framework was evaluated on a longitudinal dataset comprising 464,739 enrollment records and 77 features. Five predictive algorithms (XGBoost, LightGBM, CatBoost, Random Forest, and Logistic Regression) were comparatively assessed on a strictly isolated blind holdout cohort (2022). CatBoost emerged as the champion estimator, achieving a PR-AUC of 0.8859, a Macro F1-Score of 0.9143, and the lowest Brier Score (0.0221), thereby demonstrating superior calibration and discriminative capability under severe class imbalance (93:7 ratio). Comprehensive ablation analysis revealed that temporal velocity features function not merely as additive predictors but as a structural prerequisite enabling SMOTE-ENN to generate high-quality synthetic boundary instances; removing these features reduced minority-class precision from 0.8302 to 0.6721. To translate predictive outputs into actionable intervention pathways, Diverse Counterfactual Explanations (DiCE) were implemented under a three-tier causal constraint architecture on 96 borderline high-risk students, generating 384 feasible intervention scenarios that exclusively target forward-looking behavioral velocity metrics, with no constraint violations.
Collectively, these findings advance the paradigm of prescriptive learning analytics by providing educational institutions with interpretable risk diagnostics and operationally feasible intervention guidance grounded in empirically validated behavioral and temporal dynamics.
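The cohort-level validation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy cohort years other than 2022, and the toy probabilities are assumptions made for illustration. The sketch holds out each development cohort in turn and never lets the blind holdout cohort enter a training or validation fold. Whether the original protocol additionally restricts training to cohorts earlier than the validation cohort is not stated in the abstract, so the plain leave-one-cohort-out variant is shown. A Brier score helper (the mean squared difference between predicted probability and the 0/1 outcome) is included because the abstract uses that metric to compare calibration.

```python
from collections import defaultdict


def loco_splits(cohorts, holdout):
    """Yield (validation_cohort, train_idx, val_idx) for Leave-One-Cohort-Out.

    cohorts: one cohort label (e.g. enrollment year) per record.
    holdout: cohort reserved as the blind test set; it is excluded
             from every training and validation fold.
    """
    by_cohort = defaultdict(list)
    for idx, cohort in enumerate(cohorts):
        by_cohort[cohort].append(idx)

    # Development cohorts are everything except the blind holdout.
    dev_cohorts = sorted(c for c in by_cohort if c != holdout)
    for val_cohort in dev_cohorts:
        # Whole cohorts go to one side only, so no cohort is split
        # between training and validation.
        train_idx = [i for c in dev_cohorts if c != val_cohort
                     for i in by_cohort[c]]
        val_idx = list(by_cohort[val_cohort])
        yield val_cohort, train_idx, val_idx


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy data: cohort years are hypothetical, apart from the 2022 holdout
# named in the abstract.
cohorts = [2018, 2018, 2019, 2019, 2020, 2021, 2021, 2022, 2022]
folds = list(loco_splits(cohorts, holdout=2022))
holdout_idx = [i for i, c in enumerate(cohorts) if c == 2022]

for val_cohort, train_idx, val_idx in folds:
    print(val_cohort, "train:", train_idx, "validate:", val_idx)
print("blind holdout:", holdout_idx)
```

In a full pipeline, any resampling step such as SMOTE-ENN would be fitted on the training indices of each fold only, so that synthetic minority instances never draw on validation or holdout records.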
License
Copyright (c) 2026 Nurul Hidayat, Lasmedi Afuan, Helmi Roichatul Jannah

This work is licensed under a Creative Commons Attribution 4.0 International License.