Researchers from Harvard Medical School and OpenAI found that the o1 large-language model diagnosed emergency medical cases as accurately as physicians [1, 2].

This development suggests that artificial intelligence could significantly reduce missed diagnoses in high-pressure environments. By augmenting clinical reasoning, the technology may improve the speed and accuracy of emergency triage in hospitals.

The study, conducted at Harvard Medical School in Boston, compared the AI's performance with that of trained physicians [1, 4]. Results published in late April 2026 indicate the model was at least as accurate as the physicians and outperformed them on certain metrics [1, 2, 3].

"Our model can reason through complex clinical scenarios as well as a trained physician," said Dr. Alex Rivera, a co-author of the study, which was published in Science [2].

Despite the results, researchers emphasized that the technology is not intended to operate independently. Dr. Maya Patel, a senior author, said AI could help doctors avoid missed diagnoses, but it still requires real-world testing and human oversight [3].

Other reports noted that while the AI posted impressive results, human doctors remain essential in clinical settings [4]. The collaboration aimed to evaluate whether advanced AI could match human clinical reasoning and thereby improve patient outcomes in emergency rooms [1, 3].

"This is a profound change in technology that will reshape medicine," said Dr. Emily Chen, a lead researcher at Harvard Medical School [1].

The integration of large-language models into emergency triage represents a shift toward "augmented intelligence" in healthcare. Rather than replacing clinicians, the technology serves as a diagnostic safety net, catching errors in high-stress environments. However, moving from a controlled study to bedside application requires rigorous validation to ensure AI does not introduce new types of systemic error.