An Integrated Framework for Optimizing Customer Retention Budget using Clustering, Classification, and Mathematical Optimization
DOI:
https://doi.org/10.62411/jcta.13194

Keywords:
Budget optimization, Churn prediction, Classification, Clustering, Customer segmentation, Machine learning, Mathematical optimization, Mixed-Integer Linear Programming

Abstract
This study presents a comprehensive framework for optimizing the customer retention budget by integrating clustering, classification, and mathematical optimization techniques. It begins with the IBM Telco dataset, which is prepared through data cleansing, encoding, and scaling. In the first phase, customer segmentation is performed using K-Means clustering, with k = 3 and k = 4 identified as optimal based on the elbow method and Silhouette score. These configurations produced three (Premium, Standard, Low) and four (Premium, Standard Plus, Standard, Low) customer segments based on purchase preferences, which served as input features for churn prediction. In the second phase, the dataset was divided into training and test sets in an 80:20 ratio, followed by data balancing using the Synthetic Minority Over-sampling Technique (SMOTE) combined with Edited Nearest Neighbors (ENN). Multiple classification algorithms were evaluated, including Naive Bayes (NB), Random Forest (RF), Categorical Boosting (CatBoost), Light Gradient Boosting Machine (LightGBM), Extreme Gradient Boosting (XGBoost), Gradient Boosting (GB), Support Vector Machine (SVM), Logistic Regression (LR), K-Nearest Neighbors (KNN), and Multi-Layer Perceptron (MLP), using the F1-score as the performance metric. CatBoost (k = 3) and LightGBM (k = 4) were the highest-performing classification models, with only minimal differences in performance. Finally, customer segmentation established customer prioritization, while churn prediction assessed each customer's churn likelihood. Four distinct configurations were evaluated using mixed-integer linear programming (MILP) to optimize retention budget allocation under uniform budget constraints, discount amounts, and churn thresholds.
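The two preliminary phases can be sketched as follows. This is a minimal illustration, not the paper's exact pipeline: the synthetic data, the single gradient-boosting classifier standing in for the ten evaluated models, and the naive interpolation-based over-sampler used in place of SMOTE-ENN are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))                                     # stand-in for scaled Telco features
y = (X[:, 0] + rng.normal(scale=0.5, size=600) > 1).astype(int)   # imbalanced churn label

# Phase 1: choose k by Silhouette score (the paper also uses the elbow method)
scores = {}
for k in (2, 3, 4, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
best_k = max(scores, key=scores.get)
segments = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)

# Segment labels become an extra input feature for churn prediction
X_seg = np.column_stack([X, segments])

# Phase 2: 80:20 split, then balance the training set only by interpolating
# between random minority pairs (a simplified stand-in for SMOTE-ENN)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_seg, y, test_size=0.2, stratify=y, random_state=0)
minority = X_tr[y_tr == 1]
need = (y_tr == 0).sum() - (y_tr == 1).sum()
idx = rng.integers(0, len(minority), size=(need, 2))
synth = minority[idx[:, 0]] + rng.random((need, 1)) * (minority[idx[:, 1]] - minority[idx[:, 0]])
X_bal = np.vstack([X_tr, synth])
y_bal = np.concatenate([y_tr, np.ones(need, dtype=int)])

# Fit one classifier and report the test-set F1-score
clf = GradientBoostingClassifier(random_state=0).fit(X_bal, y_bal)
print(round(f1_score(y_te, clf.predict(X_te)), 3))
```

In the paper, this comparison is repeated across all ten classifiers for both k = 3 and k = 4; the sketch above shows only the shape of one such run.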
In both the k = 3 and k = 4 scenarios, CatBoost surpassed LightGBM; at k = 3, CatBoost effectively extended discounts to 66% of at-risk customers across all three segments, thereby improving the intervention's efficacy and budget allocation and making it the best choice for maximizing customer retention. The results demonstrate the importance of segmentation in enhancing retention budgeting and budget optimization, particularly with respect to parameter sensitivity.
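The allocation step can be illustrated with a small sketch. The budget, discount, threshold, and toy customer list below are assumptions; the paper formulates this as a MILP, but with a uniform discount per customer the problem reduces to an equal-weight knapsack, for which selecting the highest-value at-risk customers within the budget is exactly optimal (a general solver such as PuLP or Gurobi would be needed for non-uniform costs).

```python
BUDGET = 200.0          # total retention budget (assumed)
DISCOUNT = 50.0         # uniform discount per targeted customer (assumed)
CHURN_THRESHOLD = 0.5   # only customers above this churn probability are eligible

# (customer_id, predicted churn probability, monthly revenue) -- toy data
customers = [
    ("C1", 0.91, 80.0), ("C2", 0.35, 120.0), ("C3", 0.77, 60.0),
    ("C4", 0.66, 95.0), ("C5", 0.52, 40.0),  ("C6", 0.48, 150.0),
    ("C7", 0.88, 30.0), ("C8", 0.59, 110.0),
]

# MILP view: x_i in {0,1}; maximize sum(p_i * r_i * x_i)
# subject to sum(DISCOUNT * x_i) <= BUDGET and x_i = 0 when p_i <= threshold.
at_risk = [(cid, p, r) for cid, p, r in customers if p > CHURN_THRESHOLD]
at_risk.sort(key=lambda t: t[1] * t[2], reverse=True)   # expected revenue at stake
max_offers = int(BUDGET // DISCOUNT)
targeted = [cid for cid, _, _ in at_risk[:max_offers]]
print(targeted)   # -> ['C1', 'C8', 'C4', 'C3']
```

Varying BUDGET, DISCOUNT, and CHURN_THRESHOLD reproduces, in miniature, the parameter-sensitivity comparison the paper runs across its four configurations.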
License
Copyright (c) 2025 Prashanthan Amirthanathan

This work is licensed under a Creative Commons Attribution 4.0 International License.