Automated retinal disease classification using hybrid transformer model (SViT) using optical coherence tomography images

  • Original Article
  • Neural Computing and Applications

Abstract

Optical coherence tomography (OCT) is a widely used imaging technique in ophthalmology for diagnosis and treatment. Recent advances in deep neural networks (DNNs) and vision transformers (ViTs) have paved the way for automated eye/retinal disease classification and segmentation using OCT or spectral domain OCT (SD-OCT) images. Diabetic macular edema (DME), choroidal neovascularization (CNV), and Drusen are particularly difficult to classify accurately from OCT images because of their subtle differences and intricate features. Existing DNN- and ViT-based algorithms reported in the literature tend to be computationally complex, cover few diseases, and fall short in accuracy. This study proposes a hybrid SqueezeNet-vision transformer (SViT) model that combines the strengths of SqueezeNet and the vision transformer (ViT), capturing both local and global features of OCT images to achieve more accurate classification with less computational complexity. The model is trained, validated, and tested on the OCT2017 dataset and performs both binary classification (normal vs. disorders) and multiclass classification (DME, CNV, Drusen, and normal). Evaluated against state-of-the-art CNN-based and standalone Transformer models, the proposed SViT model achieves an overall classification accuracy of 99.90% for multiclass classification (CNV: 100%, DME: 99.9%, Drusen: 100%, and normal: 100%). With its good generalization ability, the model could help improve patient care and clinical decision-making across a broader range of applications.
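To make the two evaluation settings in the abstract concrete, the following minimal Python sketch shows how four-class labels can be collapsed into the binary task (normal vs. disorders) and how overall and per-class accuracy can be computed from predictions. It is an illustration only, not the authors' evaluation code: the function names, the label strings, and the toy label lists are hypothetical, and the sketch assumes that "per-class accuracy" means the fraction of samples of each true class that are predicted correctly, which this page does not specify.

```python
from collections import Counter

# Hypothetical label strings for the four classes named in the abstract.
CLASSES = ["CNV", "DME", "DRUSEN", "NORMAL"]


def to_binary(label):
    """Collapse a four-class label into the binary task (normal vs. disorder)."""
    return "NORMAL" if label == "NORMAL" else "DISORDER"


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_accuracy(y_true, y_pred, classes):
    """Fraction of samples of each true class that are predicted correctly.

    Assumption: this is the per-class figure of interest; other definitions
    (for example one-vs-rest accuracy) would give different numbers.
    """
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in classes if totals[c]}


# Toy labels invented for illustration; these are NOT results from the paper.
y_true = ["CNV", "CNV", "DME", "DME", "DRUSEN", "DRUSEN", "NORMAL", "NORMAL"]
y_pred = ["CNV", "CNV", "DME", "CNV", "DRUSEN", "DRUSEN", "NORMAL", "NORMAL"]

multiclass_acc = overall_accuracy(y_true, y_pred)
class_acc = per_class_accuracy(y_true, y_pred, CLASSES)

# The binary task reuses the same predictions after collapsing the labels.
binary_true = [to_binary(t) for t in y_true]
binary_pred = [to_binary(p) for p in y_pred]
binary_acc = overall_accuracy(binary_true, binary_pred)
```

In the toy data, one DME sample is mistaken for CNV. That error lowers the multiclass accuracy but not the binary accuracy, since both labels fall under "disorder"; this is why the two settings are typically reported separately.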

Data availability

The OCT2017 dataset is openly available and can be accessed at https://www.kaggle.com/paultimothymooney/kermany2018.

Author information

Corresponding author

Correspondence to M. Murugappan.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hemalakshmi, G.R., Murugappan, M., Sikkandar, M.Y. et al. Automated retinal disease classification using hybrid transformer model (SViT) using optical coherence tomography images. Neural Comput & Applic 36, 9171–9188 (2024). https://doi.org/10.1007/s00521-024-09564-7
