Title: Specializing Large Language Models for Hierarchy-Aware ICD-10 Mapping of Portuguese Cardiology Diagnoses

Authors: Gustavo Cruz, Yohan Gumiel, Carolina Montenegro, Carlo Bonasso Filho, Ramon Moreno, Marina Rebelo, José Krieger and Marco Gutierrez

Conference: IEEE CBMS 2026

Tags: cardiology informatics, hierarchical classification, ICD-10, large language models and Portuguese clinical text

Abstract: Automated ICD-10 coding is a high-impact yet expertise-intensive task requiring precise hierarchical reasoning. We evaluate whether a specialized LLM can approach cardiology specialist-level performance when mapping short Portuguese diagnoses to ICD-10 codes. We introduce (i) a double-specialist benchmark of 381 diagnoses from 89 clinical texts and (ii) a 14,685-pair supervision diagnoses corpus generated by a teacher LLM with structural validation and hierarchical normalization. Across paradigms, Block+Category accuracy improves from 0.5350 (retrieval baseline) to 0.7366 (expanded retrieval), 0.7815 (open frontier model), and 0.9019 (proprietary frontier model). Most residual errors reflect hierarchical near-misses rather than semantic misclassification. Supervised fine-tuning of mid-scale open models achieves 0.8582 accuracy on a stratified test set, approaching frontier performance. Results indicate that ICD-10 coding depends more on hierarchical calibration and domain-specific supervision than model scale alone, supporting compact, clinically deployable assistants for Portuguese cardiology.
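The abstract distinguishes hierarchical near-misses from outright misclassifications but does not define how Block+Category accuracy is computed. The sketch below shows one plausible way to score predictions at the two levels, assuming that the category is the first three characters of an ICD-10 code and that a block is a contiguous range of categories. The `BLOCKS` table is a small illustrative subset of ICD-10 Chapter IX, and the function names (`category`, `block`, `match_level`, `accuracy_by_level`) are hypothetical; none of this is taken from the paper.

```python
from typing import Dict, List, Optional, Tuple

# Illustrative subset of ICD-10 Chapter IX blocks (assumption: not the paper's table).
BLOCKS: List[Tuple[str, str]] = [
    ("I10", "I15"),  # hypertensive diseases
    ("I20", "I25"),  # ischaemic heart diseases
    ("I30", "I52"),  # other forms of heart disease
]


def category(code: str) -> str:
    """Return the three-character category, e.g. 'I21.0' -> 'I21'."""
    return code.strip().upper().replace(".", "")[:3]


def block(code: str) -> Optional[str]:
    """Return the block range containing the code, or None if it is not in BLOCKS."""
    cat = category(code)
    for lo, hi in BLOCKS:
        # Lexicographic comparison is valid for same-length codes with the same letter.
        if lo <= cat <= hi:
            return f"{lo}-{hi}"
    return None


def match_level(gold: str, pred: str) -> str:
    """Classify a prediction as a 'category' match, a 'block' near-miss, or a 'miss'."""
    if category(gold) == category(pred):
        return "category"
    gold_block = block(gold)
    if gold_block is not None and gold_block == block(pred):
        return "block"
    return "miss"


def accuracy_by_level(pairs: List[Tuple[str, str]]) -> Dict[str, float]:
    """Compute category-level and block-level accuracy over (gold, predicted) pairs."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    levels = [match_level(gold, pred) for gold, pred in pairs]
    n_cat = levels.count("category")
    n_block = levels.count("block")
    return {
        "category": n_cat / len(pairs),
        # A category match implies a block match, so both counts contribute here.
        "block": (n_cat + n_block) / len(pairs),
    }
```

Under these assumptions, a prediction of `I25.1` for a gold code of `I21.0` counts as a block-level near-miss, while `I50.0` for the same gold code counts as a miss. The gap between the two accuracies gives a rough measure of how many errors stay within the correct block.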
