Tags: BERT, CNN, Cyberbullying, Deep-learning, DistilBERT, GRU, Image data, LSTM-2, Multi-class classification, ResNet-50, RoBERTa, Social Media, Text data, ViT
Abstract:
Social media sites such as Facebook, Instagram, Twitter, and LinkedIn have become crucial for content creation and distribution, influencing business, politics, and personal relationships. Users often share their daily activities through pictures, posts, and videos, and short videos are particularly popular because of their engaging format. However, social media posts frequently attract both positive and negative comments, and the negative comments can in some cases take the form of cyberbullying. To identify cyberbullying, a deep-learning approach was employed using two datasets: a self-collected (private) dataset and a public dataset. Nine deep-learning models were trained: ResNet-50, CNN, and Vision Transformer (ViT) for image data, and LSTM-2, GRU, RoBERTa, BERT, DistilBERT, and a hybrid (CNN+LSTM) model for textual data. The experimental results showed that the ViT model performed best in multi-class classification on the public image data, achieving 99.5% accuracy and an F1-score of 0.995, while the RoBERTa model outperformed the other models on the public textual data, with 99.2% accuracy and an F1-score of 0.992. For the private dataset, a RoBERTa model for text and a ViT model for images were developed, with RoBERTa achieving 98.6% accuracy and an F1-score of 0.986, and ViT obtaining 93.20% accuracy and an F1-score of 0.9319. These results demonstrate the effectiveness of RoBERTa for text and ViT for images in classifying cyberbullying.
A Deep-Learning Based Approach for Multi-Class Cyberbullying Classification Using Social Media Text and Image Data