Table 2 Performance metrics of the model and retinal specialists on the primary clinical validation set.

From: Predicting optical coherence tomography-derived diabetic macular edema grades from fundus photographs using deep learning

| Metric | Model | Specialist 1 | Specialist 2 | Specialist 3 |
| --- | --- | --- | --- | --- |
| Positive predictive value (%), 95% CI | 61% [56–66%], n = 1033 | 37% [33–40%], n = 1004 | 36% [33–40%], n = 987 | 38% [34–42%], n = 1001 |
| Negative predictive value (%), 95% CI | 93% [91–95%], n = 1033 | 88% [85–91%], n = 1004 | 89% [85–92%], n = 987 | 88% [84–91%], n = 1001 |
| Sensitivity (%), 95% CI | 85% [80–89%], n = 1033 | 84% [80–89%], n = 1004 | 85% [80–89%], n = 987 | 82% [77–86%], n = 1001 |
| Specificity (%), 95% CI | 80% [77–82%], n = 1033 | 45% [41–48%], n = 1004 | 45% [41–48%], n = 987 | 50% [47–54%], n = 1001 |
| Accuracy (%), 95% CI | 81% [79–83%], n = 1033 | 56% [52–59%], n = 1004 | 56% [52–59%], n = 987 | 59% [56–62%], n = 1001 |
| Cohen’s Kappa, 95% CI | 0.57 [0.52–0.62], n = 1033 | 0.21 [0.16–0.25], n = 1004 | 0.21 [0.16–0.25], n = 987 | 0.24 [0.19–0.28], n = 1001 |

  1. For the model, we chose an operating point that matched the sensitivity of the retinal specialists to calculate the metrics. The performance metrics for the model were calculated on the entire primary clinical validation set; for the retinal specialists, they were calculated only on the images that each specialist marked as gradable. Brackets denote 95% confidence intervals. n = number of images.
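As a minimal sketch of how the quantities in this table relate to one another, the Python below computes the six reported metrics from the four cells of a 2x2 confusion matrix and picks a score threshold whose sensitivity reaches a target value. The function names, the `score >= threshold` convention, the "at least the target" matching rule, and the toy scores are all assumptions made for illustration. The source does not state how the operating point was selected, and the confidence intervals are not reproduced here.

```python
import math
from typing import Dict, Sequence, Tuple


def confusion_counts(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), predicting positive when score >= threshold."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard 2x2 metrics; assumes every denominator is non-zero."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }


def threshold_for_sensitivity(
    scores: Sequence[float], labels: Sequence[int], target: float
) -> float:
    """Highest threshold whose sensitivity is at least `target` (hypothetical rule)."""
    positive_scores = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not positive_scores:
        raise ValueError("need at least one positive example")
    # Number of positives that must score at or above the threshold.
    needed = max(1, math.ceil(target * len(positive_scores)))
    return positive_scores[needed - 1]


# Toy data, invented for illustration only (not from the study).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 0]
toy_threshold = threshold_for_sensitivity(toy_scores, toy_labels, target=0.75)
toy_metrics = binary_metrics(*confusion_counts(toy_scores, toy_labels, toy_threshold))
```

Fixing sensitivity first, as the footnote describes, means the remaining metrics (specificity, predictive values, accuracy, kappa) are compared at a common detection rate rather than at an arbitrary threshold.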