Tags: Causal discovery, Prior knowledge, Rare diseases, Soft tissue sarcoma, Survival data
Abstract:
Causal networks go beyond the purely correlational approach that most machine learning models pursue: they explicitly represent cause-effect relationships and explain how human actions can propagate to different outcomes. For these reasons, causal networks are attracting increasing interest in healthcare, where physicians commonly need a mechanistic picture of the system under study and support for effective decision-making. Building a causal network for the problem at hand is a complex task known as causal discovery, which combines prior knowledge with available data. An extensive literature covers the theoretical aspects of causal discovery. However, the task remains challenging in settings characterized by small sample sizes and limited prior knowledge, a typical scenario when trying to disentangle how rare diseases evolve over time. This paper tackles the challenge by developing a novel pre-processing algorithm for survival data, i.e., data measuring whether and when an event of interest occurred, and a highly structured workflow for learning causal networks related to different time windows. Comparing the structures of these causal networks enables domain experts to study how the causal mechanisms governing the system evolve over time. The proposed methodology is unique in how it interacts with experts, and it improves the generalizability and reproducibility of causal discovery studies in similar settings. Moreover, soft tissue sarcoma, a class of rare cancers, is presented as a case study. The results demonstrate the effectiveness of our approach in the rare disease domain and provide the first cause-effect representation of the natural history of soft tissue sarcoma.
Causal Discovery for Rare Disease Research: Iterative Refinement Applied to Soft Tissue Sarcoma