FAQ Chatbot for Small Businesses on the Web Using Semantic Search and Response Ranking

Authors

  • Abi Armansyah, Universitas Bina Darma
  • Ari Muzakir, Universitas Bina Darma
  • Evi Yulianingsih, Universitas Bina Darma

DOI:

https://doi.org/10.32664/j-intech.v14i01.2252

Keywords:

Cosine Similarity, FAQ Chatbot, Semantic Search, Small Business, TF-IDF, Web Application

Abstract

Small businesses often answer customer questions manually through chat applications or phone calls, which leads to repetitive work, delayed responses, and inconsistent information. This study proposes a web-based FAQ chatbot that answers user questions by performing semantic search over an Indonesian FAQ knowledge base and ranking the candidate responses. The chatbot uses a lightweight information retrieval approach: TF-IDF vectorization and cosine similarity produce a relevance score between the user query and each FAQ entry (its question and tags). Matching is therefore performed at the question-to-question level, not directly between questions and answers. The system ranks the entries and returns the answer associated with the top-ranked entry, together with a confidence score and the top three candidate questions to increase transparency. If the best score falls below a predefined threshold, the system gives a fallback response and suggests related topics rather than forcing an incorrect answer. The system is implemented as a PHP–MySQL web application with an administrator dashboard that supports secure login, FAQ CRUD management, chat logging, and usage analytics. Functional verification uses black-box testing of the main modules: authentication, FAQ management, chatbot interaction, logging, and the analytics dashboards. The expected contribution is a practical, low-cost chatbot that small businesses can deploy to reduce repetitive customer service work, shorten response times, and obtain measurable service insights from log-based analytics. Future improvements include expanding the knowledge base, improving Indonesian text normalization, and adopting embedding-based retrieval for better semantic matching.
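The retrieval flow described in the abstract can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' PHP implementation: the sample FAQ entries, the threshold value of 0.2, the tokenizer, the smoothed IDF weighting, and all function names are assumptions made for the example, since the abstract does not specify them.

```python
import math
import re
from collections import Counter

# Hypothetical Indonesian FAQ entries (illustrative only, not from the paper).
# The question and tags are indexed; the answer is only looked up afterwards.
FAQS = [
    {"question": "Jam berapa toko buka?",            # "What time does the shop open?"
     "tags": "jam buka operasional",
     "answer": "Toko buka setiap hari pukul 08.00-17.00."},
    {"question": "Bagaimana cara melakukan pembayaran?",  # "How do I pay?"
     "tags": "bayar transfer pembayaran",
     "answer": "Pembayaran dapat dilakukan melalui transfer bank."},
    {"question": "Apakah bisa kirim ke luar kota?",   # "Do you ship out of town?"
     "tags": "kirim ongkir pengiriman",
     "answer": "Ya, pengiriman ke luar kota tersedia."},
]

THRESHOLD = 0.2  # assumed value; the paper only says "a predefined threshold"


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplistic normalizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(tokens, idf):
    """Build a sparse TF-IDF vector, ignoring terms unseen in the knowledge base."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: count * idf[t] for t, count in tf.items()}


def build_index(faqs):
    """Compute IDF weights and one TF-IDF vector per FAQ entry (question + tags)."""
    docs = [tokenize(f["question"] + " " + f["tags"]) for f in faqs]
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; an assumption, the source does not state its weighting scheme.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return idf, [vectorize(doc, idf) for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    return dot / (norm_a * norm_b)


def answer(query, faqs, idf, vectors, threshold=THRESHOLD, top_k=3):
    """Rank FAQ entries against the query and apply the threshold fallback."""
    q = vectorize(tokenize(query), idf)
    scores = [cosine(q, v) for v in vectors]
    ranked = sorted(range(len(faqs)), key=lambda i: scores[i], reverse=True)[:top_k]
    candidates = [(faqs[i]["question"], scores[i]) for i in ranked]
    best = ranked[0]
    if scores[best] < threshold:
        # Below threshold: no answer is forced; candidates act as suggested topics.
        return {"answer": None, "confidence": scores[best],
                "candidates": candidates, "fallback": True}
    return {"answer": faqs[best]["answer"], "confidence": scores[best],
            "candidates": candidates, "fallback": False}
```

Calling `build_index(FAQS)` once and then `answer("toko buka jam berapa", FAQS, idf, vectors)` returns the opening-hours answer with its score and three candidate questions, while a query with no known terms takes the fallback path. Because the query is compared only with the indexed question and tags, this mirrors the question-to-question matching the abstract describes.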

Published

2026-03-16