Omic Data in Identification of Covid-19 Vaccinated In dividuals

Kaluarachchi, N.C.N.; Pratheeba, J.; Yasotha, R.

dc.contributor.author	Kaluarachchi, N.C.N.
dc.contributor.author	Pratheeba, J.
dc.contributor.author	Yasotha, R.
dc.date.accessioned	2024-11-22T07:30:01Z
dc.date.available	2024-11-22T07:30:01Z
dc.date.issued	2024-10-30
dc.identifier.uri	http://drr.vau.ac.lk/handle/123456789/1073
dc.description.abstract	The COVID-19 pandemic has led to significant challenges in public health, highlighting the need for effective monitoring and identification of vaccinated individuals. This research explores the use of machine learning models in analyzing omic data to distinguish between vaccinated and unvaccinated individuals. Using three datasets from Gene Expression Omnibus (GEO)- GSE199668 (proteomic), GSE220682 (transcriptomic), and GSE243348 (transcriptomic)- three key feature selection techniques– Mutual Information, Correlation Coefficient, and Feature Importance were applied to identify the top 50 features. Computationally intensive methods like forward selection and backward elimination were avoided to ensure efficient processing. Four machine learning models– Random Forest, Support Vector Machine (SVM), Decision Tree, and Logistic Regression were selected for their effectiveness in classification tasks. The result demonstrated that the Random Forest model consistently outperformed the other models, achieving 98% mean accuracy for GSE199668 with 0.016 Standard Deviation, 66% mean accuracy for GSE243348 with 0.092 Standard Deviation and 99% mean accuracy for GSE220682 with 0.003 Standard Deviation after cross-validation when features were selected using Mutual Information. While SVM and Decision Tree also showed strong performance in certain cases, The Random Forest model provided the most accurate and stable predictions, especially after cross-validation with low standard deviation. Unlike prior studies, which primarily focused on predicting COVID-19 severity or outcome using clinical data and single omic, the findings suggest that integrating machine learning with omic data can effectively support identifying of COVID-19 vaccination status and increase vaccine literacy, offering valuable insight for public health initiatives. Future work could explore the application of deep learning models and the use of larger, more diverse datasets to further enhance the accuracy and robustness of these predictions. This study contributes to the growing body of research on the role of artificial intelligence in pandemic responses and personalized medicine	en_US
dc.language.iso	en	en_US
dc.publisher	Faculty of Applied Science, University of Vavuniya	en_US
dc.subject	Correlation coefficient	en_US
dc.subject	Feature importance	en_US
dc.subject	Mutual Information	en_US
dc.subject	Omic	en_US
dc.subject	Random forest	en_US
dc.title	Omic Data in Identification of Covid-19 Vaccinated In dividuals	en_US
dc.type	Conference abstract	en_US
dc.identifier.proceedings	The 5th Faculty Annual Research Session - "Exploring Science for Humanity"	en_US