Development and Validation of Deep Learning–based Automatic Detection Algorithm for Malignant Pulmonary Nodules on Chest Radiographs
Abstract
Our deep learning–based automatic detection algorithm outperformed physicians in radiograph classification and nodule detection performance for malignant pulmonary nodules on chest radiographs, and when used as a second reader, it enhanced physicians’ performances.
Purpose
To develop and validate a deep learning–based automatic detection algorithm (DLAD) for malignant pulmonary nodules on chest radiographs and to compare its performance with that of physicians, including thoracic radiologists.
Materials and Methods
For this retrospective study, DLAD, a convolutional neural network, was developed by using 43 292 chest radiographs (normal radiograph–to–nodule radiograph ratio, 34 067:9225) in 34 676 patients (healthy-to-nodule ratio, 30 784:3892; 19 230 men [mean age, 52.8 years; age range, 18–99 years]; 15 446 women [mean age, 52.3 years; age range, 18–98 years]) obtained between 2010 and 2015. The radiographs were labeled and partially annotated by 13 board-certified radiologists. Radiograph classification and nodule detection performances of DLAD were validated by using one internal and four external data sets from three South Korean hospitals and one U.S. hospital. For both internal and external validation, radiograph classification was evaluated by using the area under the receiver operating characteristic curve (AUROC), and nodule detection was evaluated by using the jackknife alternative free-response receiver-operating characteristic (JAFROC) figure of merit (FOM). An observer performance test involving 18 physicians, including nine board-certified radiologists, was conducted by using one of the four external validation data sets. Performances of DLAD, physicians, and physicians assisted by DLAD were evaluated and compared.
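As an illustration of the two metrics named above, the following is a minimal sketch and not the study's analysis code. It assumes that AUROC is computed as the Mann-Whitney probability that a nodule radiograph outscores a normal one, and that the JAFROC FOM is the probability that a lesion's rating exceeds the highest false-positive rating on a normal radiograph, with unmarked lesions and unmarked normal radiographs both rated negative infinity. The function names and the toy scores are hypothetical, and the jackknife significance testing is omitted.

```python
from typing import Sequence

NEG_INF = float("-inf")  # assumed rating for an unmarked lesion or unmarked image


def wilcoxon_kernel(a: float, b: float) -> float:
    """Return 1 if a outranks b, 0.5 on a tie, and 0 otherwise."""
    if a > b:
        return 1.0
    if a == b:
        return 0.5
    return 0.0


def auroc(nodule_scores: Sequence[float], normal_scores: Sequence[float]) -> float:
    """Radiograph-level AUROC as a Mann-Whitney statistic.

    Each score is one per-radiograph output (hypothetical values here).
    """
    total = sum(
        wilcoxon_kernel(p, n) for p in nodule_scores for n in normal_scores
    )
    return total / (len(nodule_scores) * len(normal_scores))


def jafroc_fom(
    lesion_ratings: Sequence[float],
    normal_image_marks: Sequence[Sequence[float]],
) -> float:
    """Simplified JAFROC-style figure of merit.

    lesion_ratings: one rating per true lesion (NEG_INF if it was not marked).
    normal_image_marks: for each normal radiograph, the ratings of all
        false-positive marks placed on it (empty if none).
    """
    # Only the highest-rated false positive on each normal image counts.
    fp_max = [max(marks, default=NEG_INF) for marks in normal_image_marks]
    total = sum(wilcoxon_kernel(les, fp) for les in lesion_ratings for fp in fp_max)
    return total / (len(lesion_ratings) * len(fp_max))


if __name__ == "__main__":
    # Toy inputs, not data from the study.
    print(auroc([0.9, 0.8, 0.4], [0.1, 0.5, 0.4]))
    print(jafroc_fom([0.9, NEG_INF, 0.6], [[], [0.7], [0.2, 0.5]]))
```

Unlike AUROC, the second function rewards correct localization: a radiograph-level score contributes nothing unless a mark actually falls on the lesion.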
Results
Across one internal and four external validation data sets, radiograph classification and nodule detection performances of DLAD ranged from 0.92 to 0.99 (AUROC) and from 0.831 to 0.924 (JAFROC FOM), respectively. In the observer performance test, DLAD showed a higher AUROC and JAFROC FOM than 17 of 18 and 15 of 18 physicians, respectively (P < .05), and all physicians showed improved nodule detection performance with DLAD (mean JAFROC FOM improvement, 0.043; range, 0.006–0.190; P < .05).
Conclusion
This deep learning–based automatic detection algorithm outperformed physicians in radiograph classification and nodule detection performance for malignant pulmonary nodules on chest radiographs, and it enhanced physicians’ performances when used as a second reader.
© RSNA, 2018
Article History
Received: Jan 30 2018
Revision requested: Mar 20 2018
Revision received: July 29 2018
Accepted: Aug 6 2018
Published online: Sept 25 2018
Published in print: Jan 2019