Title: Verbalized Uncertainty in Medical AI: Differential Diagnosis in Commercial LLMs

Conference: IEEE CBMS 2026

Tags: large language models, medical AI and uncertainty estimation

Abstract: Large Language Models (LLMs) have revolutionized large-scale data processing in healthcare settings, including more efficient and readily available diagnostic models. Differential diagnoses are generated freely and introduced into the clinic by concerned patients. However, many biases are present, and little is known about the relationship between model correctness and the confidence associated with each prediction. The current study analyzed three differently purposed LLMs in light of this relationship and visualized the calibration of medical LLMs. Sex-, age-, and pathology-stratified analyses were also performed separately to evaluate possible biases. Our results indicate that calibration moves from overconfidence to underconfidence when medical LLMs are prompted for a top-5 list of likely diagnoses instead of a single prediction. Moreover, we found no biases for sex or age groups, while a bias might exist for specific pathologies. We show that robust evaluation is key for trust in these medical LLMs and that more information is required before clinical adoption.
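The relationship between correctness and verbalized confidence described in the abstract can be illustrated with a small sketch. This is not the authors' method: the function name `calibration_summary`, the equal-width binning, and the choice of ten bins are assumptions made here for illustration. The sketch takes a list of stated confidences in [0, 1] and a list of 0/1 correctness labels, and returns the signed gap between mean confidence and accuracy (positive suggests overconfidence, negative suggests underconfidence) together with a binned expected calibration error.

```python
def calibration_summary(confidences, correct, n_bins=10):
    """Return (signed_gap, ece) for verbalized confidences and 0/1 correctness.

    Hypothetical helper for illustration; the binning scheme is an assumption.
    """
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    # Signed gap: mean confidence minus accuracy.
    # Positive -> overconfident, negative -> underconfident.
    gap = sum(confidences) / n - sum(correct) / n

    # Group predictions into equal-width confidence bins;
    # a confidence of exactly 1.0 is placed in the last bin.
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, correct):
        b = min(int(c * n_bins), n_bins - 1)
        bins[b].append((c, y))

    # Expected calibration error: bin-size-weighted average of
    # |mean confidence - accuracy| over the non-empty bins.
    ece = 0.0
    for members in bins:
        if members:
            mean_conf = sum(c for c, _ in members) / len(members)
            acc = sum(y for _, y in members) / len(members)
            ece += len(members) / n * abs(mean_conf - acc)
    return gap, ece


# Made-up example: four answers all stated at 90% confidence, half correct.
gap, ece = calibration_summary([0.9, 0.9, 0.9, 0.9], [1, 1, 0, 0])
```

In this made-up example the signed gap is positive, which would be read as overconfidence. A top-5 setting could be scored the same way by marking a case correct when the true diagnosis appears anywhere in the list, though how the study actually scored top-5 predictions is not stated in the abstract.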
