AI and the USMLE: Insights from AnsibleHealth’s PLOS Digital Health Study


Researchers from AnsibleHealth examined how ChatGPT performs on medical licensing content in the United States. Their study, published in PLOS Digital Health, explored whether an advanced conversational AI model could approach the level needed to pass the United States Medical Licensing Examination (USMLE). The USMLE is a three-step examination sequence that assesses a physician's ability to apply the knowledge, concepts, and principles underlying health and disease, across disciplines ranging from biochemistry to diagnostic reasoning and medical ethics. The study situates AI performance within the practical context of medical education and real-world clinical communication, highlighting both capabilities and limitations, and it forms part of a broader effort to understand how AI can support learners and clinicians without diminishing the essential human role in medical evaluation and patient care.

In the evaluation, the AI was tested against 350 of the 376 publicly available USMLE questions released as of June 2022. It scored between 52.4 percent and 75.0 percent across the three exam steps. The passing threshold in the licensing process has historically hovered around 60 percent, though exact cutoffs vary by year and test form. Notably, the analysis found that 88.9 percent of the AI's responses contained at least one significant insight, meaning an explanation that was new, non-obvious, and clinically valid, demonstrating the model's potential to translate complex medical knowledge into actionable, patient-relevant reasoning. ChatGPT also outperformed PubMedGPT, a model trained primarily on biomedical literature, which achieved roughly 50.8 percent on the same exam set. These comparative results underscore the model's strong foundation of general medical knowledge while drawing attention to the value of broader clinical training data beyond curated literature alone.

The authors discuss several implications for medical education and clinical practice. They suggest that AI tools of this kind could help students and residents by offering quick explanations, reviewing key concepts, and translating dense medical information into more accessible language for patients. In practice, clinicians at AnsibleHealth have begun using the technology to rewrite complex reports into plain language, potentially improving comprehension, adherence, and shared decision-making. The authors also emphasize the need for ongoing evaluation, transparent reporting of capabilities and limits, and careful integration into curricula and clinical workflows. The findings suggest that AI can augment rather than replace expert medical judgment, especially in areas requiring nuanced interpretation, ethical consideration, and personalized patient communication. Overall, the research points toward a future in which AI support tools contribute to medical education, clinical documentation, and patient engagement under rigorous safeguards and human oversight.
