Classification of Multi-Label of Hate Speech on Twitter Indonesia using LSTM and BiLSTM Method

Elita Aurora Az Zahra; Yuliant Sibaroni; Sri Suryani Prasetyowati

doi:10.35877/454RI.jinav1864

Authors

Elita Aurora Az Zahra, Universitas Telkom
Yuliant Sibaroni, Telkom University, Indonesia
Sri Suryani Prasetyowati, Fakultas Informatika, Universitas Telkom

Keywords:

Twitter, hate speech, social media, LSTM, BiLSTM

Abstract

Social media is a communication tool that lets users interact socially through technology. Twitter is one of the most popular social media platforms. However, the virtual police consider Twitter one of the main sources of hate speech on social media. In this final project, the authors studied the detection of hate speech in tweets from Twitter Indonesia. The approach is multi-label classification using the LSTM and BiLSTM methods. The dataset consisted of 13,169 tweets, labeled across 12 classes. Over 10 trials, both methods classified the text data well, with an accuracy of 78.67% for LSTM and 80.25% for BiLSTM. Because BiLSTM achieved the higher accuracy, the authors conclude that it outperforms LSTM on this task.
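The abstract does not say how accuracy was computed in the multi-label setting, so the following is only a minimal sketch of two common definitions, not the authors' evaluation code. It assumes the model emits one independent probability per label (12 labels, as in the abstract) and that a label is assigned when its probability reaches a threshold of 0.5. The function names, the threshold, and the toy data are all hypothetical.

```python
from typing import List, Sequence

NUM_LABELS = 12  # from the abstract; everything else below is illustrative


def binarize(probabilities: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn per-label probabilities into independent 0/1 decisions."""
    return [1 if p >= threshold else 0 for p in probabilities]


def exact_match_accuracy(y_true, y_pred) -> float:
    """Fraction of tweets whose entire label vector is predicted correctly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return hits / len(y_true)


def per_label_accuracy(y_true, y_pred) -> float:
    """Fraction of individual label decisions that are correct (1 - Hamming loss)."""
    correct = sum(a == b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    total = sum(len(t) for t in y_true)
    return correct / total


# Toy data (made up): 4 tweets, 12 binary labels each.
y_true = [
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1],
]
# Hypothetical per-label probabilities, e.g. from a sigmoid output layer.
scores = [
    [0.9, 0.1, 0.2, 0.0, 0.1, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0],
    [0.8, 0.4, 0.1, 0.0, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.7, 0.0, 0.6, 0.0, 0.0, 0.6, 0.0, 0.9, 0.0, 0.0, 0.0, 0.3],
]
y_pred = [binarize(row) for row in scores]

print(exact_match_accuracy(y_true, y_pred))  # 0.5
print(per_label_accuracy(y_true, y_pred))    # 0.9375
```

The two numbers can differ widely on the same predictions: exact match penalizes a tweet for a single wrong label, while per-label accuracy counts each of the 12 decisions separately. Comparing reported multi-label accuracies across studies therefore requires knowing which definition each one used.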
