Table 2 Performance metrics of the model and retinal specialists on the primary clinical validation set.

Metric	Model	Specialist 1	Specialist 2	Specialist 3
Positive predictive value (%), 95% CI	61% [56–66%] n = 1033	37% [33–40%] n = 1004	36% [33–40%] n = 987	38% [34–42%] n = 1001
Negative predictive value (%), 95% CI	93% [91–95%] n = 1033	88% [85–91%] n = 1004	89% [85–92%] n = 987	88% [84–91%] n = 1001
Sensitivity (%), 95% CI	85% [80–89%] n = 1033	84% [80–89%] n = 1004	85% [80–89%] n = 987	82% [77–86%] n = 1001
Specificity (%), 95% CI	80% [77–82%] n = 1033	45% [41–48%] n = 1004	45% [41–48%] n = 987	50% [47–54%] n = 1001
Accuracy (%), 95% CI	81% [79–83%] n = 1033	56% [52–59%] n = 1004	56% [52–59%] n = 987	59% [56–62%] n = 1001
Cohen’s Kappa, 95% CI	0.57 [0.52–0.62] n = 1033	0.21 [0.16–0.25] n = 1004	0.21 [0.16–0.25] n = 987	0.24 [0.19–0.28] n = 1001

To calculate the model's metrics, we chose an operating point that matched the sensitivity of the retinal specialists. The performance metrics for the model were calculated on the entire primary clinical validation set; those for the retinal specialists were calculated only on the images that they marked as gradable. Brackets denote 95% confidence intervals. n = number of images.
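The metrics reported in the table follow the standard definitions for a binary 2×2 confusion matrix, and the sketch below illustrates how they could be computed once an operating point is fixed. It is a minimal illustration, not the procedure used in the study: the function names (`threshold_matching_sensitivity`, `confusion_counts`, `binary_metrics`) are hypothetical, the rule for choosing the threshold (the highest score cutoff whose sensitivity reaches the target) is an assumption, and confidence intervals are not computed.

```python
from typing import Dict, Sequence, Tuple


def threshold_matching_sensitivity(
    scores: Sequence[float], labels: Sequence[int], target_sensitivity: float
) -> float:
    """Return the highest threshold whose sensitivity is >= the target.

    Assumption: an image is predicted positive when score >= threshold.
    """
    # Scores of the truly positive images, highest first.
    pos_scores = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not pos_scores:
        raise ValueError("no positive examples")
    for i, s in enumerate(pos_scores, start=1):
        # With threshold s, at least i of the positives are detected.
        if i / len(pos_scores) >= target_sensitivity:
            return s
    raise ValueError("target sensitivity cannot be reached")


def confusion_counts(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[int, int, int, int]:
    """Count (tp, fp, fn, tn) for predictions defined by score >= threshold."""
    tp = fp = fn = tn = 0
    for s, y in zip(scores, labels):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute the table's six metrics; assumes every denominator is nonzero."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # Cohen's kappa compares observed agreement with chance agreement,
    # where chance agreement comes from the row and column marginals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }


# Usage with made-up scores and labels (not data from the study).
example_scores = [0.9, 0.8, 0.4, 0.2, 0.7, 0.1]
example_labels = [1, 1, 1, 1, 0, 0]
example_threshold = threshold_matching_sensitivity(example_scores, example_labels, 0.5)
example_metrics = binary_metrics(
    *confusion_counts(example_scores, example_labels, example_threshold)
)
```

Matching sensitivity before comparing the remaining metrics keeps the comparison fair: once both graders detect the same fraction of true positives, differences in specificity, predictive values, and kappa reflect false-positive behaviour rather than a different trade-off point.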
