External validation of deep learning-based contouring of head and neck organs at risk

Phys Imaging Radiat Oncol. 2020 Jul 10:15:8-15. doi: 10.1016/j.phro.2020.06.006. eCollection 2020 Jul.

Abstract

Background and purpose: Head and neck (HN) radiotherapy can benefit from automatic delineation of tumor and surrounding organs because of the complex anatomy and the regular need for adaptation. The aim of this study was to assess the performance of a commercially available deep learning contouring (DLC) model on an external validation set.

Materials and methods: The CT-based DLC model, trained at the University Medical Center Groningen (UMCG), was applied to an independent set of 58 patients from the Radboud University Medical Center (RUMC). DLC results were compared with the RUMC manual reference contours using the Dice similarity coefficient (DSC) and the 95th percentile of the Hausdorff distance (HD95). Craniocaudal spatial information was added by calculating these measures in bins along the craniocaudal axis. In addition, a qualitative evaluation compared the acceptance of manual and DLC contours by both groups of observers.
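
To illustrate the quantitative measures named in the methods, the following is a minimal Python sketch, not the study's implementation. It assumes both contours are available as binary voxel masks on the same grid, with the craniocaudal direction along array axis 0 and the voxel spacing given in mm. The function names (`dice`, `surface_points`, `hd95`, `binned_dice`), the equal-width binning scheme, and the choice to define HD95 as the larger of the two directed 95th-percentile surface distances are assumptions made for illustration; the abstract does not specify these details, and other HD95 conventions exist.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient of two binary masks (NaN if both are empty)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return float("nan")
    return float(2.0 * np.logical_and(a, b).sum() / total)


def surface_points(mask, spacing):
    """Physical coordinates (mm) of mask voxels that touch the background."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    interior = padded.copy()
    # A voxel is interior if both neighbours along every axis are in the mask.
    for axis in range(mask.ndim):
        interior &= np.roll(padded, 1, axis=axis) & np.roll(padded, -1, axis=axis)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    surface = mask & ~interior[core]
    return np.argwhere(surface) * np.asarray(spacing, dtype=float)


def hd95(a, b, spacing):
    """Assumed definition: max of the two directed 95th-percentile distances."""
    pa = surface_points(a, spacing)
    pb = surface_points(b, spacing)
    if len(pa) == 0 or len(pb) == 0:
        return float("nan")
    # Brute-force pairwise distances; fine for small masks, slow for large ones.
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    a_to_b = d.min(axis=1)
    b_to_a = d.min(axis=0)
    return float(max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95)))


def binned_dice(a, b, n_bins=5, axis=0):
    """DSC per bin along one axis (hypothetical equal-width binning)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    other_axes = tuple(i for i in range(a.ndim) if i != axis)
    occupied = np.flatnonzero((a | b).any(axis=other_axes))
    if occupied.size == 0:
        return []
    # Bin edges span the slices covered by either contour.
    edges = np.linspace(occupied[0], occupied[-1] + 1, n_bins + 1).round().astype(int)
    scores = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        index = [slice(None)] * a.ndim
        index[axis] = slice(lo, hi)
        scores.append(dice(a[tuple(index)], b[tuple(index)]))
    return scores
```

For two identical boxes offset by one slice along axis 0, a per-slice binning would give a DSC of 0 in the first and last bins and 1 in between, which is the kind of cranial or caudal deviation the binned measures are meant to expose.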

Results: Good correspondence was shown for the mandible (DSC 0.90; HD95 3.6 mm). Performance was reasonable for the glandular OARs, brainstem and oral cavity (DSC 0.78-0.85, HD95 3.7-7.3 mm). The other aerodigestive tract OARs showed only moderate agreement (DSC 0.53-0.65, HD95 around 9 mm). The binned measures displayed the largest deviations caudally and/or cranially.

Conclusions: This study demonstrates that the DLC model can provide a reasonable starting point for delineation when applied to an independent patient cohort. The qualitative evaluation did not reveal large differences in the interpretation of contouring guidelines between RUMC and UMCG observers.

Keywords: Auto-contouring; Contour comparison; Deep learning; Head & neck cancer.