Improving Software Defect Prediction Performance Using C4.5 Based Ensemble Learning with AdaBoost and Bagging Techniques
DOI:
https://doi.org/10.32664/k7fyc413
Keywords:
AdaBoost, Bagging, C4.5, Ensemble, SDP
Abstract
Software defect prediction (SDP) improves software quality by detecting faulty modules early in the development phase. However, class imbalance in software defect datasets remains a significant challenge that degrades prediction accuracy. This study addresses the issue by combining two ensemble learning methods, Bagging and AdaBoost, with the C4.5 decision tree algorithm to improve classification performance. The research used five well-known datasets from the NASA MDP Repository (CM1, JM1, KC1, KC2, and PC1), each containing software metrics and defect labels. The methodology involved several stages: data preprocessing (normalization and discretization), model training with 10-fold cross-validation, and performance evaluation using metrics such as accuracy and Area Under the Curve (AUC). The results indicate that both ensemble methods outperformed the standalone C4.5 algorithm across all datasets. The AdaBoost + C4.5 model yielded the highest accuracy in most scenarios, reaching 97.20% on the PC1 dataset. C4.5 alone and C4.5 with Bagging recorded lower values, which points to the benefit of adaptive weighting in AdaBoost. These findings indicate that ensemble learning, particularly AdaBoost, mitigates the impact of class imbalance and improves prediction performance in SDP tasks.
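The adaptive weighting that the abstract credits to AdaBoost can be illustrated with a minimal sketch. It is not the study's implementation: one-level decision stumps stand in for C4.5 trees, the ten-module dataset is invented for illustration (2 defective modules out of 10), and the function names `fit_stump`, `adaboost`, `predict`, and `accuracy` are hypothetical. The sketch shows how misclassified minority-class samples gain weight after each round, so that later base learners concentrate on them.

```python
import math

def stump_predict(x, feature, threshold, polarity):
    """One-level tree (assumed stand-in for a C4.5 tree): returns +1 or -1."""
    return polarity if x[feature] <= threshold else -polarity

def fit_stump(X, y, w):
    """Search all stumps and return the one with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, feature, threshold, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best

def adaboost(X, y, rounds):
    """Discrete AdaBoost; returns the weighted stumps and the final sample weights."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    model = []
    for _ in range(rounds):
        err, feature, threshold, polarity = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)      # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)    # vote strength of this stump
        model.append((alpha, feature, threshold, polarity))
        # Adaptive weighting: misclassified samples are scaled up,
        # correctly classified samples are scaled down, then renormalized.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model, w

def predict(model, x):
    """Sign of the alpha-weighted vote over all stumps."""
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in model)
    return 1 if score >= 0 else -1

def accuracy(model, X, y):
    return sum(predict(model, row) == yi for row, yi in zip(X, y)) / len(y)

# Invented toy data: one metric per module; +1 = defective, -1 = clean.
X = [[v] for v in range(1, 11)]
y = [-1, -1, -1, 1, 1, -1, -1, -1, -1, -1]

single, weights_after_one = adaboost(X, y, rounds=1)
boosted, _ = adaboost(X, y, rounds=3)

print("single stump accuracy:", accuracy(single, X, y))
print("3-round AdaBoost accuracy:", accuracy(boosted, X, y))
print("weight of a defective module after round 1:", round(weights_after_one[3], 4))
```

On this toy data the first stump simply predicts the majority class, so both defective modules are misclassified and their weights rise from 0.1 to 0.25 each. The next two stumps are then fitted around those modules, and the three-round ensemble separates the classes. Training accuracy on ten invented points says nothing about the NASA MDP results; the sketch only shows the reweighting mechanism.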
License
Copyright (c) 2025 J-INTECH (Journal of Information and Technology)

This article is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

