Evaluating Uncertainty: SCT-Bench
Measuring how AI adjusts diagnostic thinking when new evidence arrives -- the way real clinicians reason under uncertainty
Based on Script Concordance Testing -- a decades-old medical assessment tool now applied to AI
750 Questions / 10 International Datasets / 9 Previously Unreleased / NEJM AI
v0.2.0