| dc.description.abstract |
The proliferation of social media has significantly amplified user interactions, but it also poses serious threats to
communities through the spread of harmful content such as hate speech. The emotionally charged and nuanced language
found in user-generated content presents unique challenges for effective detection and analysis. This study investigates
YouTube comments related to child abuse and introduces a comprehensive machine learning framework for the automatic
identification of hate speech. A dataset of 2,500 comments was collected via web scraping with Selenium, balanced
equally between hate and non-hate speech to ensure fair evaluation. To extract textual features, various natural language
processing (NLP) techniques were employed, including CountVectorizer, TF-IDF, Word2Vec, and FastText. Several
machine learning models were evaluated on this dataset. The Gradient Boosting model combined with CountVectorizer
achieved the highest accuracy, at 78%. Ensemble approaches, such as soft voting and stacking classifiers, also performed
strongly, reaching up to 75% accuracy. Performance was further assessed using metrics such as precision and recall. The results
demonstrate the effectiveness of the Gradient Boosting model in enhancing hate speech detection systems, particularly in
sensitive contexts such as child abuse discussions. By advancing methods for identifying harmful content, this research
supports the creation of safer and more respectful digital environments. |
en_US |