Research Article

TB-SMGAN: A GAN Based Hybrid Data Augmentation Framework on Chest X-ray Images and Reports

Year 2024, Volume: 11 Issue: 3, 497 - 506, 30.09.2024
https://doi.org/10.54287/gujsa.1501098

Abstract

Data augmentation is a common practice in image classification, employing methods such as reflection, random cropping, re-scaling, and other transformations to enhance training data. These techniques are prevalent when working with extended real-world datasets, where the goal is to improve classification accuracy through increased diversity. Generative Adversarial Networks (GANs), known for their high representational power, can learn the distribution of real data and generate samples with previously unseen discriminative features. However, intra-class imbalances in the augmented data are problematic for conventional GAN augmentation. Hence, we propose the Text-Based Style-Manipulated GAN augmentation framework (TB-SMGAN), which aims to leverage the generative capabilities of StyleGAN2-ADA. In this framework, we utilize StyleCLIP to control disentangled feature manipulations and intra-class imbalances. We enhance the efficiency of StyleCLIP by fine-tuning CLIP with X-ray images and information extracted from the corresponding medical reports. Compared to conventional GAN augmentation, the proposed text-based manipulated GAN augmentation technique improves the mean PR-AUC score.
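
The abstract reports results as a mean PR-AUC score but does not say how the score is computed. As a minimal sketch, assuming PR-AUC is estimated per label with average precision (one common step-wise estimator of the area under the precision-recall curve) and then averaged over labels, the metric could be computed as follows. The function names `average_precision` and `mean_pr_auc` are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve for one binary label.

    Assumption: scores contain no ties; tied scores are ranked in input order.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive example is required")
    # Rank examples from the highest to the lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            # Precision at the rank where recall increases by 1 / n_pos.
            total += hits / rank
    return total / n_pos


def mean_pr_auc(
    label_columns: Sequence[Sequence[int]],
    score_columns: Sequence[Sequence[float]],
) -> float:
    """Unweighted mean of the per-label average precision values."""
    per_label = [
        average_precision(y, s) for y, s in zip(label_columns, score_columns)
    ]
    return sum(per_label) / len(per_label)


if __name__ == "__main__":
    # Toy data for two hypothetical findings on four images.
    y_true = [[1, 0, 1, 0], [0, 1, 1, 0]]
    y_score = [[0.9, 0.8, 0.7, 0.1], [0.2, 0.9, 0.8, 0.1]]
    print(mean_pr_auc(y_true, y_score))
```

Averaging per label without weighting treats rare and frequent findings equally, which is one reason a mean PR-AUC is often preferred over accuracy on imbalanced data; whether the paper uses this exact averaging is an assumption here.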

References

  • Alqahtani, H., Kavakli-Thorne, M., & Kumar, G. (2021). Applications of generative adversarial networks (gans): An updated review. Archives of Computational Methods in Engineering, 28, 525-552. https://doi.org/10.1007/s11831-019-09388-y
  • Altwaijry, N. (2023). Probability-based synthetic minority oversampling technique. IEEE Access, 11, 28831-28839. https://doi.org/10.1109/ACCESS.2023.3260723
  • Benčević, M., Habijan, M., Galić, I., & Pizurica, A. (2022, August 29 - September 02). Self-supervised Learning as a Means to Reduce the Need for Labeled Data in Medical Image Analysis. In: Proceedings of the 30th European Signal Processing Conference (EUSIPCO 2022) (pp. 1328-1332). Belgrade, Serbia. https://doi.org/10.23919/EUSIPCO55093.2022.9909542
  • Bowles, C., Chen, L., Guerrero, R., Bentley, P., Gunn, R., Hammers, A., Dickie, D. A., Hernández, M. V., Wardlaw, J., & Rueckert, D. (2018). Gan augmentation: Augmenting training data using generative adversarial networks. https://doi.org/10.48550/arXiv.1810.10863
  • Dao, H. N., Quang, T. N., & Paik, I. (2022, October 26-28). Transfer Learning for Medical Image Classification on Multiple Datasets using PubMedCLIP. In: Proceedings of the 2022 IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia) (pp. 1-4). Yeosu, Korea, Republic of. https://doi.org/10.1109/ICCE-Asia57006.2022.9954669
  • Dao, H. N., Nguyen, T., Mugisha, C., & Paik, I. (2023). A Multimodal Transfer Learning Approach Using PubMedCLIP for Medical Image Classification. IEEE Access, 12, 75496-75507. https://doi.org/10.1109/ACCESS.2024.3401777
  • Deepshikha, K., & Naman, A. (2020). Removing Class Imbalance using Polarity-GAN: An Uncertainty Sampling Approach. https://doi.org/10.48550/arXiv.2012.04937
  • Fetty, L., Bylund, M., Kuess, P., Heilemann, G., Nyholm, T., Georg, D., & Löfstedt, T. (2020). Latent space manipulation for high-resolution medical image synthesis via the StyleGAN. Zeitschrift für Medizinische Physik, 30(4), 305-314. https://doi.org/10.1016/j.zemedi.2020.05.001
  • Frid-Adar, M., Diamant, I., Klang, E., Amitai, M., Goldberger, J., & Greenspan, H. (2018). GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification. Neurocomputing, 321, 321-331. https://doi.org/10.1016/j.neucom.2018.09.013
  • Hochberg, D. C., Greenspan, H., & Giryes, R. (2022). A self supervised StyleGAN for image annotation and classification with extremely limited labels. IEEE Transactions on Medical Imaging, 41(12), 3509-3519. https://doi.org/10.1109/TMI.2022.3187170
  • Honnibal, M., Montani, I., Van Landeghem, S., & Boyd, A. (2020). spaCy: Industrial-strength natural language processing in python.
  • Islam, S. M., & Mondal, H. S. (2019, July 06-08). Image Enhancement Based Medical Image Analysis. In: Proceedings of the 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT) (pp. 1-5). Kanpur, India. https://doi.org/10.1109/ICCCNT45670.2019.8944910
  • Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., Seekins, J., Mong, D. A., Halabi, S. S., Sandberg, J. K., Jones, R., Larson, D. B., Langlotz, C. P., Patel, B. N., Lungren, M. P., & Ng, A. Y. (2019, January 27 - February 1). Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison. In: Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, No. 01, pp. 590-597). Honolulu, Hawaii, USA. https://doi.org/10.1609/aaai.v33i01.3301590
  • Jablonski, J. A., Angadi, S. S., Sharma, S., & Brown, D. E. (2022, March 10-11). Enabling Clinically Relevant and Interpretable Deep Learning Models for Cardiopulmonary Exercise Testing. In: Proceedings of the 2022 IEEE Healthcare Innovations and Point of Care Technologies (HI-POCT) (pp. 50-53). Houston, TX, USA. https://doi.org/10.1109/HI-POCT54491.2022.9744068
  • Johnson, A. E. W., Pollard, T. J., Berkowitz, S. J., Greenbaum, N. R., Lungren, M. P., Deng, C., Mark, R. G., & Horng, S. (2019). MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Scientific Data, 6(1), 317. https://doi.org/10.1038/s41597-019-0322-0
  • Kariuki, P. W., Gikunda, P. K., & Wandeto, J. M. (2023, April 14-16). Deep Transfer Learning Optimization Techniques for Medical Image Classification: A Review. In: Proceedings of the 2022 International Conference on Intelligent Computing and Machine Learning (2ICML) (pp. 7-15). Qingdao, China. https://doi.org/10.1109/2ICML58251.2022.00013
  • Karras, T., Aittala, M., Hellsten, J., Laine, S., Lehtinen, J., & Aila, T. (2020, December 6-12). Training generative adversarial networks with limited data. In: H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, & H. Lin (Eds.) Proceedings of the 34th International Conference on Neural Information Processing Systems (NIPS'20) (pp. 12104-12114). Vancouver BC Canada.
  • Ke, B., Lu, H., Huo, W., & Wang, Y. (2022, July 22-24). Semi-supervised Medical Image Classification Combining Metric Pseudo-Label and Classification Pseudo-Label. In: P. Lin, & Y. Yang (Eds.) Proceedings of the 2022 International Conference on Computer Engineering and Artificial Intelligence (ICCEAI) (pp. 794-799). Shijiazhuang, China. https://doi.org/10.1109/ICCEAI55464.2022.00166
  • Kora Venu, S., & Ravula, S. (2020). Evaluation of Deep Convolutional Generative Adversarial Networks for Data Augmentation of Chest X-ray Images. Future Internet, 13(1), 8. https://doi.org/10.3390/fi13010008
  • Lacan, A., Sebag, M., & Hanczar, B. (2023). GAN-based data augmentation for transcriptomics: survey and comparative assessment. Bioinformatics, 39(S1), i111-i120. https://doi.org/10.1093/bioinformatics/btad239
  • Li, Z., Xia, P., Tao, R., Niu, H., & Li, B. (2022). A New Perspective on Stabilizing GANs Training: Direct Adversarial Training. IEEE Transactions on Emerging Topics in Computational Intelligence, 7(1), 178-189. https://doi.org/10.1109/TETCI.2022.3193373
  • Liu, L., Zhang, Y., & Sun, L. (2023). Medimatrix: innovative pre-training of grayscale images for rheumatoid arthritis diagnosis revolutionises medical image classification. Health Information Science and Systems, 11(1), 44. https://doi.org/10.1007/s13755-023-00246-7
  • Neumann, M., King, D., Beltagy, I., & Ammar, W. (2019, August 1). ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing. In: Dina D.-F., Kevin B. C., Sophia A., & J. Tsujii (Eds.) Proceedings of the 18th BioNLP Workshop and Shared Task (pp. 319-327). Florence, Italy. https://doi.org/10.18653/v1/W19-5034
  • Patashnik, O., Wu, Z., Shechtman, E., Cohen-Or, D., & Lischinski, D. (2021, October 10-17). StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV) (pp. 2065-2074). Montreal, QC, Canada. https://doi.org/10.1109/ICCV48922.2021.00209
  • Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021, July 18-24). Learning transferable visual models from natural language supervision. In: M. Meila, & T. Zhang (Eds.) Proceedings of the 38th International Conference on Machine Learning, PMLR (pp. 8748-8763). https://doi.org/10.48550/arXiv.2103.00020
  • Shorten, C., & Khoshgoftaar, T. M. (2019). A survey on image data augmentation for deep learning. Journal of Big Data, 6(1), 60. https://doi.org/10.1186/s40537-019-0197-0
  • Sundaram, S., & Hulkund, N. (2021). GAN-based Data Augmentation for Chest X-ray Classification. https://doi.org/10.48550/arXiv.2107.02970
  • Tarawneh, A. S., Hassanat, A. B., Altarawneh, G. A., & Almuhaimeed, A. (2022). Stop oversampling for class imbalance learning: A review. IEEE Access, 10, 47643-47660. https://doi.org/10.1109/ACCESS.2022.3169512
  • Tov, O., Alaluf, Y., Nitzan, Y., Patashnik, O., & Cohen-Or, D. (2021). Designing an encoder for StyleGAN image manipulation. ACM Transactions on Graphics (TOG), 40(4), 133. https://doi.org/10.1145/3450626.3459838
  • Wang, Y., Ge, X., Ma, H., Qi, S., Zhang, G., & Yao, Y. (2021). Deep Learning in Medical Ultrasound Image Analysis: A Review. IEEE Access, 9, 54310-54324. https://doi.org/10.1109/ACCESS.2021.3071301
  • Wang, X., & Qi, G.-J. (2022). Contrastive Learning With Stronger Augmentations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(5), 5549-5560. https://doi.org/10.1109/TPAMI.2022.3203630
  • Yuan, Z., Yan, Y., Sonka, M., & Yang, T. (2021, October 10-17). Large-scale Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification. In: Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (pp. 3040-3049). Montreal, QC, Canada. https://doi.org/10.1109/ICCV48922.2021.00303
There are 32 citations in total.

Details

Primary Language: English
Subjects: Deep Learning
Journal Section: Information and Computing Sciences
Authors

Hasan Berat Özfidan (ORCID: 0009-0006-0950-2017)

Mehmet Ulvi Şimşek (ORCID: 0000-0001-8017-4753)

Early Pub Date: September 28, 2024
Publication Date: September 30, 2024
Submission Date: June 13, 2024
Acceptance Date: September 12, 2024
Published in Issue: Year 2024, Volume: 11, Issue: 3

Cite

APA Özfidan, H. B., & Şimşek, M. U. (2024). TB-SMGAN: A GAN Based Hybrid Data Augmentation Framework on Chest X-ray Images and Reports. Gazi University Journal of Science Part A: Engineering and Innovation, 11(3), 497-506. https://doi.org/10.54287/gujsa.1501098