Cracking Overtime: Unleashing Machine Learning at PT XYZ with Linear Regression, Neural Networks, and Random Forests

Authors

  • Tifani Dewi Christia, Human Resources Development, Faculty of Human Resources, Airlangga University, 60286, Surabaya, Indonesia
  • Imam Yuadi, Human Resources Development, Faculty of Human Resources, Airlangga University, 60286, Surabaya, Indonesia

DOI:

https://doi.org/10.47709/cnahpc.v7i2.5678

Keywords:

linear regression, neural networks, overtime, prediction, random forests

Abstract

Excessive overtime at PT XYZ is a significant issue for the organization. Beyond its financial repercussions, it can also harm employee health and productivity. This research aims to forecast future overtime hours, facilitating strategic planning, mitigating excessive overtime, and supporting the development of more effective overtime policies. The study employs an overtime realization dataset encompassing the characteristics that influence overtime decisions. The dataset is partitioned into training and testing sets that serve as inputs for three predictive models: linear regression, an artificial neural network, and a random forest. The random forest model demonstrates superior performance: its mean squared error (MSE) of 158.78 and root mean squared error (RMSE) of 12.601 are lower than those of the other two models, indicating a smaller average prediction error. Its mean absolute error (MAE) of 8.931 reflects the average deviation from the actual values, while its mean absolute percentage error (MAPE) of 0.336 corresponds to a prediction error of 33.6%. Furthermore, its coefficient of determination (R²) of 0.914 signifies that approximately 91.4% of the variation in overtime hours is explained, compared with 78.8% and 79.6% for the other two models. These results indicate that the random forest model achieves superior predictive accuracy, owing to its capacity to handle non-linear data and outliers. Consequently, the random forest model is recommended as the most effective method for forecasting future overtime hours.
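The workflow the abstract describes (train/test split, fitting linear regression, a neural network, and a random forest, then comparing MSE, RMSE, MAE, MAPE, and R²) can be sketched with scikit-learn. This is a minimal illustration using a synthetic stand-in dataset, not the actual PT XYZ data; the feature construction, model hyperparameters, and split ratio below are assumptions for demonstration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import (mean_squared_error, mean_absolute_error,
                             mean_absolute_percentage_error, r2_score)

# Synthetic stand-in for the overtime realization dataset:
# five features assumed to influence overtime, with a non-linear effect.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 5))
y = np.abs(10 + 3 * X[:, 0] - 2 * X[:, 1] ** 2
           + rng.normal(scale=2, size=500))  # overtime hours, non-negative

# Partition into training and testing data, as in the study.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# The three predictive models compared in the paper.
models = {
    "linear_regression": LinearRegression(),
    "neural_network": MLPRegressor(hidden_layer_sizes=(32,),
                                   max_iter=2000, random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
}

# Fit each model and collect the five evaluation metrics.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    mse = mean_squared_error(y_test, pred)
    results[name] = {
        "MSE": mse,
        "RMSE": mse ** 0.5,
        "MAE": mean_absolute_error(y_test, pred),
        "MAPE": mean_absolute_percentage_error(y_test, pred),
        "R2": r2_score(y_test, pred),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

On non-linear data such as this, the random forest typically posts the lowest RMSE and highest R², consistent with the paper's finding, because its tree ensemble captures interactions and outliers that a linear fit cannot.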




Published

2025-04-03

How to Cite

Christia, T. D., & Yuadi, I. (2025). Cracking Overtime: Unleashing Machine Learning at PT XYZ with Linear Regression, Neural Networks, and Random Forests. Journal of Computer Networks, Architecture and High Performance Computing, 7(2), 423–432. https://doi.org/10.47709/cnahpc.v7i2.5678
