Evaluating GPT-OSS-20B Model for Hate Speech Detection: Advances in Parameter-Efficient Adaptation

Quang Hong Le; Tan Khoa Huynh Ly; Thanh Le

Authors

Quang Hong Le, quanglh@huflit.edu.vn
Tan Khoa Huynh Ly
Thanh Le

Abstract

Hate speech detection continues to pose methodological challenges due to annotation ambiguity, class imbalance, and the fine-grained distinction between offensive and hateful expressions. This work examines a parameter-efficient adaptation of a 20-billion-parameter large language model (LLM) for three-class hate speech classification. The approach consolidates annotator decisions into a single label per instance, applies balanced sampling to reduce minority-class sparsity, and incorporates instruction templates with agreement-based metadata to stabilise predictions in borderline cases. The adapted model is evaluated against transformer encoder baselines and prompted LLM configurations. The proposed system attains a macro F1-score of 80.66% and an accuracy of 83.37%, outperforming all comparative baselines, with particularly strong gains in the Hate Speech category. An additional analysis of computational usage indicates that the adaptation procedure operates within moderate resource constraints. These findings indicate that lightweight parameter-efficient adaptation offers a viable solution for fine-grained hate speech classification when full fine-tuning of LLMs is impractical.
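As a rough illustration of two ideas named in the abstract, the sketch below consolidates annotator decisions into a single label with an agreement ratio and computes a macro F1-score. It is not the authors' implementation: the label names, the majority-vote rule, the tie-breaking order, and the function names `consolidate` and `macro_f1` are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical label names; the abstract only states that there are three classes.
LABELS = ("hate_speech", "offensive", "neither")


def consolidate(votes):
    """Majority vote over annotator decisions (an assumed consolidation rule).

    Returns the winning label and the fraction of annotators who chose it,
    which could serve as agreement metadata in an instruction template.
    """
    counts = Counter(votes)
    # Ties are broken by the order of LABELS, an arbitrary choice for this sketch.
    label = max(LABELS, key=lambda name: counts[name])
    return label, counts[label] / len(votes)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = []
    for name in LABELS:
        tp = sum(g == name and p == name for g, p in zip(gold, pred))
        fp = sum(g != name and p == name for g, p in zip(gold, pred))
        fn = sum(g == name and p != name for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # A class with no gold and no predicted instances scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because macro F1 averages per-class scores without weighting by class frequency, a model that misses the minority class is penalised more heavily than accuracy alone would suggest, which is why the two metrics are usually reported together on imbalanced data.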

Author Biographies

Quang Hong Le, quanglh@huflit.edu.vn

Quang Hong Le graduated in Information Systems, within the Information Technology program, from Ho Chi Minh City University of Foreign Languages - Information Technology (HUFLIT), Vietnam, in 2021. He is currently a master's student in Information Technology at HUFLIT and works in HUFLIT's Office of Political Affairs and Student Affairs. His research interests include applications of artificial intelligence (AI) in education management and natural language processing (NLP).

Tan Khoa Huynh Ly

Tan Khoa Huynh Ly is a final-year student majoring in Data Science, within the Information Technology program, at Ho Chi Minh City University of Foreign Languages - Information Technology (HUFLIT), Vietnam. Research interests include deep learning, data science, and natural language processing (NLP).

Thanh Le

Thanh Le graduated in Information Technology from Ho Chi Minh City University of Technology (HUTECH), Vietnam, in 2011. He received a master's degree in Information Technology from HUTECH in 2018. He then continued in the doctoral program specializing in natural language processing (NLP) at HUTECH. His main research directions include deep learning, data science, and natural language processing. He currently works as a lecturer at Ho Chi Minh City University of Economics and Finance (UEF) …
