| dc.description.abstract |
Lung cancer is one of the leading causes of cancer-related deaths worldwide. For early diagnosis
and better treatment, finding reliable biomarkers is essential. In lung cancer patients, certain genes identified
as biomarkers can be expressed at much higher or lower levels than normal. In previous studies,
many methods have been used to find biomarkers. Most researchers rely on machine learning models,
but there is still a gap in applying these approaches effectively for biomarker discovery. In this study, we
developed a combined feature selection approach for transcriptomic data covering 20,000 protein-coding
genes across different cell lines, from the publicly available Human Protein Atlas (HPA) dataset. Based on
recent research, we applied eight feature selection methods (Variance, ANOVA, Fisher’s
Score, Chi-squared, LASSO, Kruskal-Wallis, Mutual Information, and Standard Deviation) individually to
the preprocessed data and selected the top 1000 genes from each. The overlap among the
resulting gene sets was then compared using visual analytics. Among these methods, Mutual
Information, Chi-squared, and Fisher’s Score showed the most overlap, which led to the identification
of six key genes: ANO10, CD63, FAS, PARVA, PHF11, and TMEM115. Interestingly, all six
are associated with lung-related diseases and immune functions, which strengthens their potential relevance
in lung cancer. Among them, ANO10, CD63, FAS, PARVA, and TMEM115 have already been recognized
in previous studies as lung cancer biomarkers, being either overexpressed or underexpressed, and our analysis
further confirmed their roles through Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and
Reactome enrichment results. The exception is PHF11, which has not yet been reported as a lung cancer
biomarker. Previous studies have linked PHF11 mainly to immune regulation and asthma,
both of which influence lung-associated inflammatory pathways. This observation aligns with recent insights
that epigenetic alterations, such as DNA methylation, play crucial roles not only in tumorigenesis but
also in immune modulation. In this way, our study not only reconfirmed several established biomarkers
described in prior research but also highlighted PHF11 as a promising candidate that could bridge immune
and epigenetic mechanisms in lung cancer. Our results demonstrate that by integrating multiple feature
selection methods with biological validation, we can identify biomarkers that may have been overlooked,
offering new possibilities for early detection and treatment in lung cancer. |
en_US |