Comparative Evaluation of Automatic Labeling and Modeling Strategies for Indonesian Sentiment Analysis: Methodology and Performance Evaluation

Khoiriya Latifa; Agung Handayanto; Nur Latifah Dwi M.S; Rahul Bhandari; Ton Nguyen Trong Hien; Doston Pirnazarov

doi:10.26877/asset.v8i3.2862

Authors

Khoiriya Latifa Universitas PGRI Semarang Indonesia
Agung Handayanto Universitas PGRI Semarang Indonesia
Nur Latifah Dwi M.S Universitas PGRI Semarang Indonesia
Rahul Bhandari Jindal Global University India
Ton Nguyen Trong Hien Van Lang University Viet Nam
Doston Pirnazarov Samarkand State Foreign Languages Institute Uzbekistan

Keywords:

Low-resource NLP, sentiment analysis, automatic labeling, vectorization, POS tagging

Abstract

Sentiment analysis is vital for understanding consumer perception, yet Indonesian sentiment classification is hampered by scarce labeled data and computational constraints. This study advances automatic labeling techniques and establishes performance benchmarks for Indonesian text. It compares two labeling approaches, the InSet Lexicon and an IndoBERT-based Hugging Face pipeline, on 8,447 Tapera-related opinions. The InSet Lexicon produced a highly skewed label distribution (89.66% neutral), while the IndoBERT pipeline produced a more balanced one (47.66% neutral, 38.43% positive, 13.91% negative). Evaluation of several modeling strategies showed that combining InSet Lexicon labels and TF-IDF features with Naïve Bayes or Random Forest achieved scores above 85%. RNN-LSTM reached >90% accuracy but required significant computational resources. Notably, fine-tuning IndoBERT with optimal hyperparameters yielded the most robust performance, achieving 80–90% accuracy with a low validation loss of 0.1. The study concludes that for small datasets (<12,000 samples), the most effective strategies for Indonesian sentiment analysis are either the InSet Lexicon paired with traditional machine learning or automatic labeling with pre-trained models followed by rigorous fine-tuning.
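As a rough illustration of the lexicon-based labeling idea summarized above, the following minimal sketch sums per-word polarity weights and reports the resulting label distribution. The toy lexicon, the zero threshold, and the function names `label_text` and `label_distribution` are assumptions made for illustration only; they are not the InSet Lexicon entries or the authors' actual labeling rules.

```python
from collections import Counter

# Hypothetical toy lexicon (word -> polarity weight).
# These are NOT real InSet Lexicon entries; they only illustrate the idea.
TOY_LEXICON = {"bagus": 3, "setuju": 2, "buruk": -3, "rugi": -4}


def label_text(text, lexicon=TOY_LEXICON):
    """Sum the weights of known words and map the total to a label.

    Assumed rule: score > 0 is positive, score < 0 is negative,
    and a score of exactly 0 (including no lexicon hits) is neutral.
    """
    score = sum(lexicon.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def label_distribution(texts, lexicon=TOY_LEXICON):
    """Return the percentage of each label, rounded to two decimals."""
    counts = Counter(label_text(t, lexicon) for t in texts)
    total = len(texts)
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Made-up example sentences, not taken from the study's dataset.
sample = ["program ini bagus", "tapera bikin rugi", "apa itu tapera", "belum tahu"]
print(label_distribution(sample))
```

In this sketch, any text without a lexicon hit falls back to neutral, so the share of neutral labels depends directly on how well the lexicon covers the vocabulary of the corpus.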

Author Biographies

Khoiriya Latifa, Universitas PGRI Semarang

Faculty of Engineering and Informatics, Universitas PGRI Semarang, Jl. Sidodadi Timur No 24, Semarang, Central Java 50232, Indonesia

Naveen Jindal Young Global Research Fellowship, O.P Jindal Global University, Haryana, India

Agung Handayanto, Universitas PGRI Semarang

Faculty of Engineering and Informatics, Universitas PGRI Semarang, Jl. Sidodadi Timur No 24, Semarang, Central Java 50232, Indonesia

Nur Latifah Dwi M.S, Universitas PGRI Semarang

Faculty of Engineering and Informatics, Universitas PGRI Semarang, Jl. Sidodadi Timur No 24, Semarang, Central Java 50232, Indonesia

Rahul Bhandari, Jindal Global University

Department School of Business and International Office, Jindal Global University, Sonipat Narela Road, Near Jagdishpur Village, Sonipat, Haryana 131001, India

Ton Nguyen Trong Hien, Van Lang University

Faculty of Business Administration, Van Lang University, 69/68 Dang Thuy Tram, Binh Loi Trung 72329, Ho Chi Minh City, Vietnam

Naveen Jindal Young Global Research Fellowship, O.P Jindal Global University, Haryana, India

Doston Pirnazarov, Samarkand State Foreign Languages Institute

Narpay Faculty of General Sciences, Samarkand State Foreign Languages Institute, Kamolot street, Narpay district, 141200, Uzbekistan

Naveen Jindal Young Global Research Fellowship, O.P Jindal Global University, Haryana, India

