Machine Learning Approaches to Discover Novel Biomarkers in Lung Cancer

dc.contributor.author Mithushika, M.
dc.date.accessioned 2026-03-07T08:57:44Z
dc.date.available 2026-03-07T08:57:44Z
dc.date.issued 2025
dc.identifier.uri http://drr.vau.ac.lk/handle/123456789/1968
dc.description.abstract Lung cancer is one of the leading causes of cancer-related deaths worldwide. Reliable biomarkers are essential for early diagnosis and better treatment. In lung cancer patients, certain genes identified as biomarkers can be expressed at much higher or lower levels than normal. Previous studies have used many methods to find biomarkers. Most researchers rely on machine learning models, but there is still a gap in applying these approaches effectively for biomarker discovery. In this study, we developed a combined feature selection approach for transcriptomic data covering 20,000 protein-coding genes from different cell lines, taken from the publicly available Human Protein Atlas (HPA) dataset. Based on recent research, we applied eight feature selection methods (Variance, ANOVA, Fisher’s Score, Chi-squared, LASSO, Kruskal-Wallis, Mutual Information, and Standard Deviation) individually to the preprocessed data, selected the top 1000 genes from each, and compared the overlap between the resulting gene lists using visual analytics. Mutual Information, Chi-squared, and Fisher’s Score showed the greatest overlap, which led to the identification of six key genes: ANO10, CD63, FAS, PARVA, PHF11, and TMEM115. All six are associated with lung-related diseases and immune functions, which strengthens their potential relevance in lung cancer. Five of them (ANO10, CD63, FAS, PARVA, and TMEM115) have already been recognized in previous studies as lung cancer biomarkers, being either over- or under-expressed, and our analysis further confirmed their roles through Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Reactome enrichment results. The exception is PHF11, which has not yet been reported as a lung cancer biomarker.
Previous studies have linked PHF11 mainly to immune regulation and asthma, both of which influence lung-associated inflammatory pathways. This observation aligns with recent insights that epigenetic alterations, such as DNA methylation, play crucial roles not only in tumorigenesis but also in immune modulation. Our study thus not only reconfirmed several established biomarkers described in prior research but also highlighted PHF11 as a promising candidate that could bridge immune and epigenetic mechanisms in lung cancer. Our results demonstrate that integrating multiple feature selection methods with biological validation can identify biomarkers that may have been overlooked, offering new possibilities for early detection and treatment of lung cancer. en_US
dc.language.iso en en_US
dc.publisher Faculty of Applied Science, University of Vavuniya, Sri Lanka en_US
dc.subject Biomarkers en_US
dc.subject Feature selection en_US
dc.subject Gene expression en_US
dc.subject Lung cancer en_US
dc.subject Machine learning en_US
dc.title Machine Learning Approaches to Discover Novel Biomarkers in Lung Cancer en_US
dc.type Conference abstract en_US
dc.identifier.proceedings 1st International Conference on Applied Sciences - 2025 en_US
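
The combined feature selection procedure described in the abstract (score every gene with several methods, keep the top-ranked genes from each, then look at the overlap) can be illustrated with a minimal sketch. This is not the authors' implementation: it re-implements only two of the eight methods (Variance and Fisher's Score) in plain Python, uses a tiny made-up expression table with hypothetical gene names (GENE_A to GENE_D) and hypothetical class labels, keeps the top 2 genes rather than the top 1000, and assumes that "overlap" means a plain set intersection across methods.

```python
from statistics import mean, pvariance


def variance_scores(expr):
    """Score each gene by its population variance across all samples."""
    return {gene: pvariance(values) for gene, values in expr.items()}


def fisher_scores(expr, labels):
    """Fisher score per gene: between-class scatter / within-class scatter."""
    classes = sorted(set(labels))
    scores = {}
    for gene, values in expr.items():
        overall = mean(values)
        between = within = 0.0
        for c in classes:
            group = [v for v, y in zip(values, labels) if y == c]
            between += len(group) * (mean(group) - overall) ** 2
            within += len(group) * pvariance(group)
        scores[gene] = between / within if within > 0 else 0.0
    return scores


def top_k(scores, k):
    """Return the set of the k highest-scoring genes."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def consensus(selections):
    """Genes selected by every method (assumed overlap rule: intersection)."""
    return set.intersection(*selections)


# Hypothetical toy data: 4 genes x 6 samples, two sample classes (0 and 1).
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "GENE_A": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],  # variable, separates classes
    "GENE_B": [2.0, 2.1, 1.9, 2.0, 2.1, 1.9],  # flat, no separation
    "GENE_C": [0.5, 6.0, 3.0, 0.6, 5.9, 3.1],  # variable, no separation
    "GENE_D": [3.0, 3.1, 2.9, 3.6, 3.7, 3.5],  # low variance, separates classes
}

k = 2  # the abstract uses 1000; 2 keeps the toy example readable
selected = consensus([
    top_k(variance_scores(expr), k),
    top_k(fisher_scores(expr, labels), k),
])
print(sorted(selected))  # ['GENE_A']
```

In this toy table the two methods disagree on their second pick (Variance favours GENE_C, Fisher's Score favours GENE_D), so only GENE_A survives the intersection. The same pattern extends to more methods by appending further `top_k(...)` sets to the list passed to `consensus`.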


This item appears in the following Collection(s)

  • ICAS - 2025 [59]
    International Conference on Applied Sciences - 2025
