Team develops AI-powered tool to detect hate speech in Southeast Asian languages

2024-08-27 00:16:40

The internet, and particularly social media, has grown exponentially over the last few decades. The nature of social media allows anyone to go online and create content they find interesting, whether appropriate or not. One form of inappropriate content is hate speech: offensive or threatening speech targeting certain people based on their ethnicity, religion, sexual orientation, and the like.

Hate speech detection models are computational systems that can identify and classify online comments as hate speech.

"These models are crucial in moderating online content and mitigating the spread of harmful speech, particularly on social media," said Assistant Professor Roy Lee from the Singapore University of Technology and Design (SUTD). Evaluating the performance of hate speech detection models is important, but traditional evaluation using held-out test sets often fails to properly assess a model's performance due to inherent bias within the datasets.

To overcome this limitation, HateCheck and Multilingual HateCheck (MHC) were introduced as functional tests that capture the complexity and diversity of hate speech by simulating real-world scenarios. In their research paper titled "SGHateCheck: Functional tests for detecting hate speech in low-resource languages of Singapore," Asst. Prof. Lee and his team build on the frameworks of HateCheck and MHC to develop SGHateCheck, an artificial intelligence (AI)-powered tool that can distinguish between hateful and non-hateful comments in the specific context of Singapore and Southeast Asia.

The work appears in the Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024).

Creating an evaluation tool specifically for the region's linguistic and cultural context was necessary. This is because current hate speech detection models and datasets are mostly based on Western contexts, which do not accurately represent specific social dynamics and issues in Southeast Asia.

"SGHateCheck aims to address these gaps by providing functional tests tailored to the region's specific needs, ensuring more accurate and culturally sensitive detection of hate speech," said Asst. Prof. Lee.

Unlike HateCheck and MHC, SGHateCheck uses large language models (LLMs) to translate and paraphrase test cases into Singapore's four main languages: English, Mandarin, Tamil and Malay. Native annotators then refine these test cases to ensure cultural relevance and accuracy. The end result is over 11,000 test cases meticulously annotated as hateful or non-hateful, providing a more nuanced platform for evaluating hate speech detection models.
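To make the idea of a functional test concrete, the following Python sketch groups labeled test cases by the behavior they probe and reports a classifier's accuracy for each group. It is only an illustration under stated assumptions: the `TestCase` fields, the functionality names, the placeholder texts, and the keyword-based `toy_classifier` are all hypothetical and are not taken from SGHateCheck itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TestCase:
    functionality: str  # hypothetical name for the behavior being tested
    text: str           # placeholder text, not a real SGHateCheck case
    gold_label: str     # "hateful" or "non-hateful"


def accuracy_by_functionality(cases, classify):
    """Return {functionality: accuracy} for a classify(text) -> label function."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for case in cases:
        total[case.functionality] += 1
        if classify(case.text) == case.gold_label:
            correct[case.functionality] += 1
    return {name: correct[name] / total[name] for name in total}


def toy_classifier(text):
    # Stand-in model (an assumption for this sketch): flags any text
    # that contains the substring "hate", regardless of context.
    return "hateful" if "hate" in text else "non-hateful"


cases = [
    TestCase("explicit_derogation", "<hateful statement about GROUP>", "hateful"),
    TestCase("explicit_derogation", "<another hateful statement about GROUP>", "hateful"),
    TestCase("negated_hate", "<statement rejecting hate toward GROUP>", "non-hateful"),
    TestCase("negated_hate", "<another statement rejecting hate>", "non-hateful"),
]

scores = accuracy_by_functionality(cases, toy_classifier)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

In this toy setup the keyword matcher passes every explicit case but fails every negated one, which is the kind of weakness a single overall accuracy figure on a held-out test set can hide.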

Moreover, while MHC includes many languages, it does not have the same level of regional specificity as SGHateCheck. A comprehensive list of functional tests tailored to the region's distinct linguistic features (for example, Singlish) paired with expert guidance ensures that SGHateCheck tests are useful and relevant.

"This regional focus allows SGHateCheck to more accurately capture and evaluate the manifestations of hate speech that may not be adequately addressed by broader, more general frameworks," emphasized Asst. Prof. Lee.

The team also found that LLMs trained on monolingual datasets are often biased towards non-hateful classifications. On the other hand, LLMs trained on multilingual datasets have a more balanced performance and can more accurately detect hate speech across various languages due to their exposure to a broader range of language expressions and cultural contexts. This underscores the importance of including culturally diverse and multilingual training data for applications in multilingual regions.
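One way such a bias towards non-hateful classifications could be surfaced is by comparing how often a model gets each gold label right, rather than looking only at overall accuracy. The Python sketch below does this with invented labels and predictions; the numbers are purely illustrative and are not results from the paper, and `recall_by_label` is a hypothetical helper name.

```python
from collections import Counter


def recall_by_label(gold, predicted):
    """Return {label: share of gold items with that label predicted correctly}."""
    total = Counter(gold)
    hits = Counter(g for g, p in zip(gold, predicted) if g == p)
    return {label: hits[label] / total[label] for label in total}


# Invented data for illustration only: four hateful and four non-hateful cases.
gold = ["hateful"] * 4 + ["non-hateful"] * 4
# A model that leans towards "non-hateful" regardless of the input.
predicted = ["hateful"] + ["non-hateful"] * 7

recalls = recall_by_label(gold, predicted)
overall = sum(g == p for g, p in zip(gold, predicted)) / len(gold)

print(recalls)
print(f"overall accuracy: {overall:.3f}")
```

Here the overall accuracy looks moderate, yet the per-label view shows the model catches only a quarter of the hateful cases while getting every non-hateful case right, which is the pattern described above for a model biased towards non-hateful predictions.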

SGHateCheck was specifically developed to solve a real-world issue in Southeast Asia. It is poised to play a significant role by enhancing detection and moderation of hate speech in online environments in the region, helping to foster a more respectful and inclusive online space. Social media, online forums and community platforms, and news and media websites are just some of the many areas where the implementation of SGHateCheck will be valuable.

Fortunately, a new content moderation application that uses SGHateCheck is already on Asst. Prof. Lee's list of future plans. He also aims to expand SGHateCheck to include other Southeast Asian languages such as Thai and Vietnamese.

SGHateCheck demonstrates how SUTD's ethos of integrating cutting-edge technological advancements with thoughtful design principles can lead to impactful real-world solutions. Through the use of design, AI, and technology, SGHateCheck was developed to analyze local languages and social dynamics in order to meet a specific societal need.

"By focusing on creating a hate speech detection tool that is not only technologically sophisticated but also culturally sensitive, the study underscores the importance of a human-centered approach in technological research and development," said Asst. Prof. Lee.

More information: Ri Chi Ng et al, SGHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Singapore, Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024) (2024). DOI: 10.18653/v1/2024.woah-1.24

Citation: Team develops AI-powered tool to detect hate speech in Southeast Asian languages (2024, August 26) retrieved 26 August 2024 from https://techxplore.com/news/2024-08-team-ai-powered-tool-speech.html
