Title: A Conversational Agent for Natural Language Access to Public Health Data

Conference: IEEE CBMS 2026

Tags: Health Information Systems, Large Language Models, Natural Language Interfaces, Public Health Data and Text-to-SQL

Abstract: DATASUS, Brazil's national public health data repository, provides access to large volumes of epidemiological data. Among its systems, the Hospital Information System in Reduced Data format (SIH-RD) records millions of hospitalization procedures. Although SIH-RD is one of the world's largest epidemiological repositories, its microdata remains analytically inaccessible to non-technical practitioners: opaque clinical encodings, ambiguous join paths, and coded value mappings confound general-purpose language models, and no Portuguese natural-language interface for DATASUS currently exists. We present what is, to the best of our knowledge, the first Text-to-SQL system for DATASUS SIH-RD, enabling queries over 18.7 million hospitalization records. To handle these schema difficulties, we derive fifteen domain-specific SQL generation rules from a systematic analysis of the SIH-RD schema and embed them in a 9-stage LangGraph pipeline with query routing, automatic table selection, chain-of-thought planning, SQL generation, static validation, and bounded self-repair, requiring no model fine-tuning. We also construct a benchmark of 120 Portuguese healthcare queries stratified into Easy, Medium, and Hard tiers (40 each), with gold-standard SQL over records from two Brazilian states (2008-2023); this constitutes the first formal Text-to-SQL evaluation on SIH-RD. The agent achieves 93.3% execution accuracy (112/120) with 100% pipeline completion. A controlled single-shot baseline sharing the identical model, domain rules, and prompts achieves 90.0% (108/120), with the advantage concentrated in Hard queries (+10.0 pp), isolating the contribution of graph orchestration to complex multi-table reasoning.
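The reported figures can be cross-checked with a few lines of arithmetic. The sketch below is only an illustrative consistency check on the numbers stated in the abstract; the variable names and the `pct` helper are assumptions introduced here, and per-tier counts for Easy and Medium are not given in the source, so they are not reconstructed.

```python
# Counts taken from the abstract; names are illustrative, not from the paper.
TOTAL_QUERIES = 120      # benchmark size
QUERIES_PER_TIER = 40    # Easy, Medium, Hard each
agent_correct = 112      # agent: 112/120
baseline_correct = 108   # single-shot baseline: 108/120
hard_gap_pp = 10.0       # reported agent advantage on the Hard tier


def pct(correct: int, total: int) -> float:
    """Execution accuracy as a percentage, rounded to one decimal place."""
    return round(100 * correct / total, 1)


agent_acc = pct(agent_correct, TOTAL_QUERIES)
baseline_acc = pct(baseline_correct, TOTAL_QUERIES)

# Overall gap in percentage points.
overall_gap_pp = round(agent_acc - baseline_acc, 1)

# A 10.0 pp gap on a 40-query tier corresponds to this many queries.
hard_gap_queries = round(hard_gap_pp * QUERIES_PER_TIER / 100)

# Net difference in correct answers across the whole benchmark.
net_gap_queries = agent_correct - baseline_correct

print(agent_acc, baseline_acc, overall_gap_pp, hard_gap_queries, net_gap_queries)
```

Under these numbers, the 10.0 pp Hard-tier gap corresponds to 4 queries, which matches the net overall difference of 4 correct answers. That is consistent with the abstract's statement that the advantage is concentrated in Hard queries, although it only constrains the net difference on the Easy and Medium tiers, not the individual per-tier scores.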
