Detecting Hate Speech on YouTube Comments: A Comparative Study of Machine Learning and Deep Learning Techniques

dc.contributor.author Sweshthika, S.
dc.contributor.author Yasotha, R.
dc.date.accessioned 2026-03-26T03:20:17Z
dc.date.available 2026-03-26T03:20:17Z
dc.date.issued 2026
dc.identifier.uri http://drr.vau.ac.lk/handle/123456789/2028
dc.description.abstract The proliferation of social media has significantly amplified user interactions, but it also poses serious threats to communities through the spread of harmful content such as hate speech. The emotionally charged and nuanced language found in user-generated content presents unique challenges for effective detection and analysis. This study investigates YouTube comments related to child abuse and introduces a comprehensive machine learning framework for the automatic identification of hate speech. A dataset of 2,500 comments was collected using Selenium for web scraping and split evenly between hate and non-hate comments to ensure fair evaluation. To extract textual features, various natural language processing (NLP) techniques were employed, including CountVectorizer, TF-IDF, Word2Vec, and FastText. Several machine learning models were evaluated on this dataset. The Gradient Boosting model combined with CountVectorizer achieved the highest accuracy, at 78%. Ensemble approaches, such as soft voting and stacking classifiers, also performed strongly, reaching up to 75% accuracy. Performance was also assessed using metrics such as precision and recall. The results demonstrate the effectiveness of the Gradient Boosting model in enhancing hate speech detection systems, particularly in sensitive contexts such as child abuse discussions. By advancing methods for identifying harmful content, this research supports the creation of safer and more respectful digital environments. en_US
dc.language.iso en en_US
dc.publisher Korea Database Strategy Society (KDSS) en_US
dc.subject Child abuse en_US
dc.subject CountVectorizer en_US
dc.subject Gradient boosting en_US
dc.subject Hate speech en_US
dc.subject Machine learning en_US
dc.subject Selenium en_US
dc.subject YouTube en_US
dc.title Detecting Hate Speech on YouTube Comments: A Comparative Study of Machine Learning and Deep Learning Techniques en_US
dc.type Conference abstract en_US
dc.identifier.proceedings 32nd International Conference on IT Applications and Management en_US
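
The abstract names count-based text features, soft voting, and precision and recall without showing what they compute. The following is a minimal standard-library sketch of those three ideas, not the authors' implementation: the study's CountVectorizer presumably refers to the scikit-learn class, while the function names, the whitespace tokenizer, the 0.5 threshold, and the toy comments below are all assumptions made for illustration.

```python
from collections import Counter


def build_vocabulary(comments):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for c in comments for tok in c.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def count_vectorize(comments, vocabulary):
    """Turn each comment into a bag-of-words row of per-token counts.

    Tokens that are not in the vocabulary are ignored.
    """
    rows = []
    for c in comments:
        counts = Counter(c.lower().split())
        rows.append([counts.get(tok, 0) for tok in vocabulary])
    return rows


def soft_vote(probabilities_per_model, threshold=0.5):
    """Average each model's P(hate) per comment; label 1 if the mean >= threshold."""
    n_models = len(probabilities_per_model)
    labels = []
    for per_comment in zip(*probabilities_per_model):
        mean_p = sum(per_comment) / n_models
        labels.append(1 if mean_p >= threshold else 0)
    return labels


def precision_recall_accuracy(y_true, y_pred, positive=1):
    """Compute precision and recall for the positive class, plus accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


if __name__ == "__main__":
    # Toy placeholder comments, not drawn from the study's dataset.
    comments = ["you are awful awful", "thanks for sharing"]
    vocab = build_vocabulary(comments)
    print(count_vectorize(comments, vocab))
    # Hypothetical P(hate) outputs from three models for two comments.
    print(soft_vote([[0.9, 0.2], [0.6, 0.4], [0.3, 0.3]]))
    print(precision_recall_accuracy([1, 1, 0, 0], [1, 0, 0, 1]))
```

In a pipeline like the one described, the count rows would be the input to a classifier such as gradient boosting, and the metrics would be computed on held-out comments.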


This item appears in the following Collection(s)

  • IITAMS - 2026 [39]
    International Conference on IT Applications and Management
