Cracking Overtime: Unleashing Machine Learning at PT XYZ with Linear Regression, Neural Networks, and Random Forests
DOI:
https://doi.org/10.47709/cnahpc.v7i2.5678
Keywords:
linear regression, neural networks, overtime, prediction, random forests
Abstract
Excessive overtime is a significant problem at PT XYZ. Beyond its financial cost, it may also harm employee health and productivity. This research aims to forecast future overtime hours in order to support strategic planning, reduce excessive overtime, and inform more effective overtime policies. The study uses an overtime realization dataset containing multiple features that influence overtime decisions. The dataset is split into training and testing sets, which serve as inputs to three predictive models: linear regression, an artificial neural network, and a random forest. The random forest model performs best. Its mean squared error (MSE) is 158.78, and its root mean squared error (RMSE) of 12.601 is lower than that of the other two models, indicating a smaller typical prediction error. Its mean absolute error (MAE) of 8.931 is the average absolute deviation from the actual value, and its mean absolute percentage error (MAPE) of 0.336 corresponds to an average relative error of about 34%. Its coefficient of determination (R²) of 0.914 means that approximately 91.4% of the variation in overtime hours is explained by the model, compared with 78.8% and 79.6% for the other two models. These results indicate that the random forest predicts more accurately than the other two algorithms, which the study attributes to its capacity to handle non-linear data and outliers. The random forest model is therefore recommended as the most effective method for forecasting future overtime hours.
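The five evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions, which the abstract does not spell out. The function name `regression_metrics` and the sample overtime hours are hypothetical and are not taken from the study's dataset or code.

```python
import math


def regression_metrics(actual, predicted):
    """Return MSE, RMSE, MAE, MAPE and R^2 for paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    mse = sum(e * e for e in errors) / n          # mean squared error
    rmse = math.sqrt(mse)                         # same unit as the target
    mae = sum(abs(e) for e in errors) / n         # mean absolute error
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n  # as a fraction

    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are identical")
    r2 = 1 - ss_res / ss_tot

    return {"mse": mse, "rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Hypothetical overtime hours, for illustration only.
actual_hours = [10, 20, 30, 40]
predicted_hours = [12, 18, 33, 39]
metrics = regression_metrics(actual_hours, predicted_hours)
print(metrics)

# Consistency check on the figures reported in the abstract:
# RMSE is the square root of MSE.
reported_rmse = round(math.sqrt(158.78), 3)
print(reported_rmse)
```

The final check shows that the reported RMSE of 12.601 agrees with the square root of the reported MSE of 158.78. A MAPE of 0.336 is a fraction, which is why the abstract reads it as roughly 34%.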
License
Copyright (c) 2025 Tifani Dewi Christia, Imam Yuadi

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.