
Princeton Journal of Interdisciplinary Research, Volume 1, Issue 3 — Bridging Horizons (March 2026), ISSN 3069-8200
A Vision Transformer Model for Detecting Obfuscated Malicious Narratives in Multiple Languages
Author: Ivan Turan
Affiliation: West High School, 241 N 300 W, Salt Lake City, UT 84103
Abstract:
Large Language Models have advanced machine intelligence, but obfuscated text can still cause misinterpretation and reduce output accuracy. Detecting such encoded malicious text, especially in multilingual settings, remains a key challenge in online abuse prevention. In this experiment, a multilingual BERT (mBERT) model and a vision transformer (ViT) were fine-tuned to detect multilingual obfuscated abusive text, and their effectiveness was compared on the same set of inputs. During preprocessing, the test data were obfuscated to simulate real-world noise, while the training data remained in their original form. Both models were evaluated using standard metrics, including accuracy, validation loss, and training loss. The mBERT model achieved an accuracy of only 62% when tested on obfuscated data, whereas the ViT attained an accuracy of 99%. These results indicate that vision transformers are a more effective tool for detecting obfuscated hate speech than standard text-based models. Future work could extend this research by incorporating multimodal augmentation and experimenting with additional obfuscation techniques. This research can contribute meaningfully to the refinement and development of advanced content moderation tools for social media and online gaming platforms.
Keywords: Obfuscated Text, Vision Transformers, Multilingual BERT, Cross-Entropy Loss, Multimodal Learning