BERT Algorithm Implementation for Hadith Question-Answering System through a Virtual YouTuber Platform
DOI: https://doi.org/10.32664/smatika.v15i02.1704
Keywords: Virtual YouTuber, Hadith QnA, IndoBERT-SQuAD, Islamic Education, Natural Language Processing
Abstract
In the digital era, integrating Islamic teachings with advanced technology has become essential. This research develops an Islamic question-answering (QnA) system that uses Artificial Intelligence and is presented through a Virtual YouTuber (VTuber). The system uses the IndoBERT-SQuAD model for Natural Language Processing, specifically to answer questions about hadiths. Following a prototype methodology, development proceeded through stages of analysis, design, implementation, and evaluation. System performance was assessed with confidence scores and the F1-score. After contextual grouping, the model improved markedly, achieving an F1-score of 0.96875. Despite this improvement, the system still struggles to provide accurate long-form answers. This research contributes to the application of technology in Islamic education by offering a practical way to make hadith knowledge more accessible and appealing to the younger generation.
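The abstract reports an F1-score but does not state how it was computed. As an illustration only, the sketch below shows the token-overlap F1 commonly used to evaluate SQuAD-style extractive question answering, which is one plausible way such a score could be obtained. The function names `token_f1` and `mean_f1` and the whitespace tokenization are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted answer and a reference answer.

    Assumption: lowercase + whitespace tokenization; the paper's actual
    preprocessing is not specified.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mean_f1(pairs: list[tuple[str, str]]) -> float:
    """Average token-level F1 over (prediction, reference) pairs."""
    if not pairs:
        return 0.0
    return sum(token_f1(p, r) for p, r in pairs) / len(pairs)
```

Under this definition, a prediction that contains every reference token plus extra words keeps full recall but loses precision, so answers that are longer than the reference are penalized.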
License
Copyright (c) 2025 SMATIKA JURNAL

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
The author agrees that copyright of the article is held by Smatika Jurnal, and the author retains the right to disseminate the published paper without prior permission.
