Key Points
- The authors developed and validated an artificial intelligence (AI) algorithm (ADAPT-CEC) that adjudicates multiple cardiovascular endpoints and adapts to new definitions.
- In EUCLID, ADAPT-CEC correctly classified 86.4% of primary endpoint events and 99.4% of non-events compared with human adjudication.
- The hybrid AI-human strategy performed best, correctly classifying 95.6% of events and 99.6% of non-events.
- The primary EUCLID treatment effect remained similar across adjudication strategies, including HR 1.02 with human CEC, HR 0.98 with ADAPT-CEC, and HR 1.04 with the hybrid approach.
Behind every major cardiovascular trial is a less visible but decisive process: determining whether a suspected event truly meets the trial’s endpoint definition. At ACC.26, investigators presented evidence that adaptive artificial intelligence may be able to perform much of that work while preserving the central result of the trial itself.
At the 2026 American College of Cardiology Scientific Sessions, Sreek Vemulapalli, MD, presented an analysis of ADAPT-CEC, an AI system developed to adjudicate cardiovascular events across changing endpoint definitions in the ODYSSEY OUTCOMES and EUCLID trials. The study, also published in Circulation, asked whether a model trained in one trial could be rapidly adapted to a second, handle both familiar and new endpoints, and still reproduce that trial's primary treatment effect.
The model was derived from adjudicated events in the 18,924-patient ODYSSEY OUTCOMES trial and externally validated in the 13,885-patient EUCLID trial. In ODYSSEY OUTCOMES, training covered myocardial infarction, stroke, and heart failure. In EUCLID, the system was adapted using only 20 suspected events per endpoint (10 events and 10 non-events), then tested on myocardial infarction and stroke under new definitions, as well as on new endpoints including cardiovascular death and bleeding. Investigators compared ADAPT-CEC alone, GPT 4.0 adjudication, and a hybrid strategy that sent the 30% of cases with the lowest AI certainty for human review.
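The hybrid routing rule can be illustrated with a minimal sketch. This is not the authors' implementation; the case records, certainty scores, and the exact ranking logic are assumptions for illustration of the stated rule (lowest-certainty 30% of AI classifications go to human reviewers):

```python
def triage(cases, human_fraction=0.30):
    """Split suspected events into AI-final and human-review queues.

    cases: list of (case_id, ai_label, certainty) tuples, where
    certainty is the model's confidence in its own label (0..1).
    """
    ranked = sorted(cases, key=lambda c: c[2])   # least certain first
    cutoff = round(len(ranked) * human_fraction)
    human_queue = ranked[:cutoff]                # lowest certainty -> human CEC
    ai_final = ranked[cutoff:]                   # remainder: accept AI label
    return ai_final, human_queue


# Hypothetical cases, not study data
cases = [("A", "MI", 0.97), ("B", "non-event", 0.55),
         ("C", "stroke", 0.88), ("D", "MI", 0.41),
         ("E", "non-event", 0.99), ("F", "bleeding", 0.62),
         ("G", "non-event", 0.93), ("H", "CV death", 0.76),
         ("I", "MI", 0.84), ("J", "stroke", 0.91)]

ai_final, human_queue = triage(cases)
print([c[0] for c in human_queue])  # → ['D', 'B', 'F']
```

The practical appeal of certainty-based triage is that human effort concentrates on exactly the cases where the model is least reliable, which is consistent with the hybrid strategy outperforming either AI or GPT 4.0 alone.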
The hybrid approach produced the strongest overall performance. Across 13,885 suspected EUCLID primary endpoint events, ADAPT-CEC correctly classified 86.4% of all endpoints and 99.4% of all non-endpoints compared with human adjudication. The hybrid strategy improved this to 95.6% and 99.6%, respectively, while GPT 4.0 classified 76.3% of endpoints and 99.8% of non-endpoints. For individual endpoints, the hybrid strategy also posted the highest F1 scores, including 0.940 for cardiovascular death, 0.795 for myocardial infarction, 0.821 for stroke, and 0.833 for bleeding. ADAPT-CEC was broadly similar to GPT 4.0 for cardiovascular death, myocardial infarction, and stroke, but performed better for bleeding, where F1 was 0.780 versus 0.564.
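For readers less familiar with the metric, F1 is the harmonic mean of precision and recall against the reference (here, human CEC) labels. A minimal sketch, using illustrative confusion counts that are not taken from the study:

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall from confusion counts."""
    precision = tp / (tp + fp)   # of events the model called, how many were real
    recall = tp / (tp + fn)      # of real events, how many the model caught
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts versus human adjudication:
# 80 true positives, 10 false positives, 15 false negatives
print(round(f1_score(80, 10, 15), 3))  # → 0.865
```

Because F1 balances false positives against missed events, it is a stricter summary than raw accuracy for rare endpoints such as bleeding, where the ADAPT-CEC versus GPT 4.0 gap (0.780 vs 0.564) was widest.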
Most importantly, the main trial conclusion did not materially change across adjudication methods. For the EUCLID primary endpoint of cardiovascular death, myocardial infarction, or ischemic stroke, human CEC yielded an HR of 1.02 (0.93-1.13; p=0.65); ADAPT-CEC, 0.98 (0.88-1.09; p=0.74); the hybrid strategy, 1.04 (0.94-1.15; p=0.44); and GPT 4.0, 1.06 (0.95-1.19; p=0.29). As Vemulapalli said in an interview, "the hybrid approach was the best. By best, I mean, it most closely recreated what human beings did."
The analysis was retrospective, performance varied across event types, and training and validation were performed primarily within adjudications from a single academic research organization. Prospective testing will be needed before such an approach can be used routinely in future trials.
The most important implication was not that AI replaced adjudication. It was that the best result came when AI and human review were paired. For clinical trials under pressure to become faster, larger, and more efficient, that may prove to be the more important advance.
