AI evaluates surgical scenarios with precision

Like every graduate of a six-year medical program, Sber’s artificial intelligence drew a standard exam ticket and answered its questions. The final score was four out of five. An exam ticket typically contains three practical tasks in therapy, surgery, and obstetrics and gynecology, along with three to five questions per area. The tasks include naming the likely diagnosis, outlining a treatment plan, and prescribing additional tests. GigaChat also completed a 100-question assessment and scored 82 percent, well above the 70 percent passing mark.

From a clinician’s perspective, the performance looked solid and reflected nearly a year of observed progress. The supervising professor admitted to initial concerns but acknowledged striking improvement over time, describing the result as unexpected yet highly impressive. The evaluation took place at a national medical research center and marks a milestone in AI-assisted medical assessment.

For the examiner, a score of four on the first attempt is excellent. GigaChat handled the surgical scenarios well, and its answers to the surgical questions were the strongest. The therapy section, while solid, did not reach the same level of clarity.

The professor emphasized that the journey from the starting point has been remarkable and that therapy, being a broad field, inevitably presents room for growth. Gratitude was extended for the collaborative effort that supported the AI’s progress across disciplines.

Unusual experience

The head of the surgical department at the national clinic acknowledged that the experience was unusual. The candid, person-to-person comparison revealed a different dynamic from standard exams. There was no hesitation, only a momentary pause to think before a concise, direct answer. The department head described this as a positive and instructive development worth pursuing further.

Initial challenges showed that obstetrics-surgery posed the primary problem for the AI, with purely surgical questions following closely. The surgical responses were judged the most complete and robust, while therapy responses were somewhat more expansive but less precise in places. The professionals noted a high quality of surgical reasoning and felt the AI demonstrated solid capability overall.

The department head added that some responses were overly detailed, and others included extra analysis and methods that time constraints rule out in surgical settings. Even so, the overall result was encouraging.

Beyond expectations

Committee members were surprised by the outcome, with many agreeing that the results exceeded early expectations. Observers had watched the AI develop from the outset and were consistently pleased with the trajectory. What was seen on exam day surpassed even the most optimistic projections.

The admissions panel deemed the neural network capable of a strong overall rating. While there is still work to be done, the consensus is that the system has the potential to broaden its specialization and become a versatile, first-class professional in the field.
