Assessing GAN and VAE Augmentation Methods in Malignant Pleural Mesothelioma Prediction

dc.contributor.author Fathima Azka, M.A.
dc.date.accessioned 2026-03-07T09:02:32Z
dc.date.available 2026-03-07T09:02:32Z
dc.date.issued 2025
dc.identifier.uri http://drr.vau.ac.lk/handle/123456789/1970
dc.description.abstract Malignant Pleural Mesothelioma (MPM) is a rare and aggressive cancer that is strongly associated with asbestos exposure. Its severity has led to growing research interest in finding effective solutions. In recent years, computational methods and machine learning approaches have been increasingly applied in oncology to classify tumor and normal samples using transcriptomic data. However, such models typically require large and balanced datasets to achieve robust performance, and these are not available for rare cancers like MPM because of the very limited number of patients and the under-representation of normal samples. This data scarcity poses a significant challenge in building predictive models that are reliable and generalizable. To address this limitation, we employ computational analysis with data augmentation as a strategy to increase the effective sample size. Specifically, we evaluate two deep generative models, Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), to generate synthetic tumor and normal samples. Importantly, synthetic samples were used strictly in the training process, while test sets contained only real data, ensuring no data leakage during evaluation. To validate the augmentation strategy, a comparative evaluation framework was introduced using both the naturally imbalanced MPM dataset and an originally balanced breast cancer dataset, which was further manipulated to simulate imbalance, resulting in four experimental conditions: original balanced data, artificially imbalanced data, GAN-augmented data, and VAE-augmented data. Classification was performed using Support Vector Machines (SVM) and Random Forests (RF), and model performance was assessed through accuracy, F1 score, precision, recall, and ROC AUC. In addition, Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE) were applied to visually examine the quality and separability of the synthetic data.
The results show that GAN-based augmentation consistently improves classification performance more than VAE-based augmentation, particularly under imbalanced conditions. For instance, in the imbalanced breast cancer setting, GAN improved SVM accuracy by 5.6% and recall by 7.1% compared to the baseline without augmentation. In MPM, performance gains were smaller due to high baseline separability, indicating a ceiling effect. Overall, GAN achieved a mean performance score of 0.9247, compared to 0.9081 for VAE. This study presents a reproducible computational pipeline for benchmarking generative models in transcriptomics and demonstrates that augmentation can effectively mitigate class imbalance in cancer prediction, while highlighting the importance of dataset-specific characteristics. The findings also motivate further research into hybrid generative architectures and biologically grounded validation strategies in precision oncology. en_US
dc.language.iso en en_US
dc.publisher Faculty of Applied Science University of Vavuniya Sri Lanka en_US
dc.subject Breast Cancer en_US
dc.subject Generative adversarial networks en_US
dc.subject Malignant Pleural Mesothelioma en_US
dc.subject Random forests en_US
dc.subject Support vector machines en_US
dc.subject Variational autoencoders en_US
dc.title Assessing GAN and VAE Augmentation Methods in Malignant Pleural Mesothelioma Prediction en_US
dc.type Conference abstract en_US
dc.identifier.proceedings 1st International Conference on Applied Sciences - 2025 en_US
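
The abstract states that synthetic samples were used only for training and that test sets contained only real data. The following is a minimal sketch of that leakage-free protocol, not the author's pipeline: the function names, the toy data, and the split fraction are all hypothetical, and `jitter_generator` is a plain noise-based placeholder standing in for a trained GAN or VAE sampler. It also shows how accuracy, precision, recall, and F1 can be derived from predictions (ROC AUC is omitted for brevity).

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split sample indices into train/test, preserving class proportions."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return train_idx, test_idx


def jitter_generator(real_rows, n_new, rng, scale=0.05):
    """Placeholder generator (assumption): copies a real row and adds noise.

    A trained GAN or VAE sampler would take its place in a real pipeline.
    """
    return [[v + rng.gauss(0.0, scale) for v in rng.choice(real_rows)]
            for _ in range(n_new)]


def augment_training_set(X, y, train_idx, generator, seed=0):
    """Balance the training split with synthetic rows; never touches test data."""
    rng = random.Random(seed)
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    is_synthetic = [False] * len(train_idx)
    counts = {cls: y_train.count(cls) for cls in set(y_train)}
    target = max(counts.values())
    for cls, n in sorted(counts.items()):
        if n < target:
            # Only real training rows of this class seed the generator.
            real_rows = [X[i] for i in train_idx if y[i] == cls]
            new_rows = generator(real_rows, target - n, rng)
            X_train += new_rows
            y_train += [cls] * len(new_rows)
            is_synthetic += [True] * len(new_rows)
    return X_train, y_train, is_synthetic


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy imbalanced data (hypothetical): 20 "tumor" rows (1), 6 "normal" rows (0).
data_rng = random.Random(42)
X = ([[data_rng.gauss(1.0, 0.3) for _ in range(3)] for _ in range(20)]
     + [[data_rng.gauss(0.0, 0.3) for _ in range(3)] for _ in range(6)])
y = [1] * 20 + [0] * 6

train_idx, test_idx = stratified_split(y, test_fraction=0.3, seed=0)
X_train, y_train, is_synthetic = augment_training_set(
    X, y, train_idx, jitter_generator, seed=0)

# The test split is taken from the real data only and is left unaugmented.
X_test = [X[i] for i in test_idx]
y_test = [y[i] for i in test_idx]

print("train size:", len(X_train), "synthetic rows:", sum(is_synthetic))
print("test size:", len(X_test))
```

Splitting before augmenting is the key design choice: the generator only ever sees real training rows, so no synthetic sample can be derived from, or end up in, the held-out test set. Any classifier (for example an SVM or a random forest) would then be fitted on `X_train`, `y_train` and scored on `X_test`, `y_test` with `binary_metrics`.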


This item appears in the following Collection(s)

  • ICAS - 2025 [59]
    International Conference on Applied Sciences - 2025
