Segmentation and Prediction of Store Performance on the Shopee Marketplace Using a Hybrid Clustering Approach, Spatial Analysis, and Feature Importance
DOI:
https://doi.org/10.32664/j-intech.v14i01.2256
Keywords:
data mining, e-commerce, K-Means clustering, Random Forest, marketplace analytics.
Abstract
Marketplace platforms have become a central component of digital commerce, particularly in Southeast Asia where Shopee has emerged as one of the dominant e-commerce ecosystems. The increasing number of sellers on the platform intensifies competition and requires data-driven approaches to understand store performance patterns. This study aims to analyze and predict the performance of Shopee stores using a hybrid data mining approach that integrates clustering, spatial analysis, and feature importance evaluation. The dataset consists of 655 Shopee stores collected on February 18, 2026, including attributes such as number of products, chat response rate, follower count, store rating, store tenure, promotional activity, and seller address. K-Means clustering is applied to segment store performance, while spatial analysis examines the geographic distribution of clusters across Indonesian provinces. Furthermore, a Random Forest classifier is used to predict performance categories and identify influential features affecting store competitiveness. The clustering results reveal three distinct store performance groups representing low, medium, and high activity levels. Spatial analysis indicates that provinces with stronger digital ecosystems, particularly West Java and Jakarta, contain a higher concentration of active stores. Feature importance analysis shows that promotional activity, chat responsiveness, and follower count significantly influence store performance classification. The findings contribute to the development of hybrid data mining frameworks for marketplace analysis and provide practical insights for improving seller competitiveness in digital commerce ecosystems.
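The segmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the column names, the synthetic store values, the farthest-first seeding, and the rule that ranks clusters as low, medium, or high by centroid sum are all assumptions made for illustration, and the data below is fabricated placeholder data rather than the 655-store dataset. The sketch uses only the Python standard library and covers K-Means only; the Random Forest step is not reproduced here.

```python
import math
import random

# Hypothetical column names; the abstract names the attributes but not the schema.
FEATURES = ["n_products", "chat_response_rate", "followers",
            "rating", "tenure_months", "promos_per_month"]

def make_synthetic_stores(n_per_group=20, seed=0):
    """Fabricated placeholder data: three well-separated groups with +/-1% jitter."""
    rng = random.Random(seed)
    bases = {
        "low":    [20, 0.50, 100, 3.5, 6, 0],
        "medium": [150, 0.75, 2000, 4.2, 24, 3],
        "high":   [600, 0.95, 20000, 4.9, 60, 8],
    }
    rows, truth = [], []
    for name, base in bases.items():
        for _ in range(n_per_group):
            rows.append([v * rng.uniform(0.99, 1.01) for v in base])
            truth.append(name)
    return rows, truth

def standardize(rows):
    """Z-score each column so large-scale features (e.g. followers) do not dominate."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def farthest_first_init(points, k):
    """Deterministic seeding (an assumption): start at the first point, then
    repeatedly add the point farthest from every centroid chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm: assign each point to its nearest centroid, then
    recompute each centroid as the mean of its members."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def rank_clusters(labels, centroids):
    """Map raw cluster ids to low/medium/high by ascending centroid sum
    (an assumed ordering rule, not taken from the paper)."""
    order = sorted(range(len(centroids)), key=lambda j: sum(centroids[j]))
    names = ["low", "medium", "high"]
    mapping = {cluster_id: names[rank] for rank, cluster_id in enumerate(order)}
    return [mapping[lab] for lab in labels]

rows, truth = make_synthetic_stores()
labels, centroids = kmeans(standardize(rows), k=3)
segments = rank_clusters(labels, centroids)
print({name: segments.count(name) for name in ["low", "medium", "high"]})
```

In a fuller pipeline, the resulting segment labels could serve as the target for a classifier. With scikit-learn, for example, `sklearn.cluster.KMeans` would replace the hand-written loop, and a fitted `sklearn.ensemble.RandomForestClassifier` exposes `feature_importances_` for the kind of feature ranking the abstract reports.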
License
Copyright (c) 2026 J-INTECH (Journal of Information and Technology)

This article is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

