ICSB 2022: THE 21ST INTERNATIONAL CONFERENCE ON SYSTEMS BIOLOGY (ICSB 2022)
PROGRAM FOR SUNDAY, OCTOBER 9TH

09:15-10:00 Session K4: KEYNOTE IV: Peer Bork

Abstract: Environmental sequencing, that is metagenomics, has become a major driver for uncovering microbial biodiversity and increasingly also for cataloging molecular functions on our planet. The exponentially increasing number of metagenomes requires computational tools and resources to allow researchers to access and digest these valuable data. Based on computational methods and resources, often developed in our group, but also by utilizing public resources, here I (i) introduce our work on the gut microbiome, arguably the best-studied microbial community with its internal and environmental (host) interactions, serving as a model for other habitats. Systems approaches include large-scale perturbations by drugs of both individual strains and synthetic communities, with the aims of increasing our still limited understanding and establishing novel diagnostics and individualized medication guidance. I (ii) further show how to apply the underlying concepts to other habitats, like ocean and soil, to arrive at a basic understanding of the underexplored microbial diversity on earth. We study, for example, how molecular function is evolving and spreading and, in analogy to microbial diagnostics and treatment for human health, prepare the grounds for bioindicators and remediation strategies in various habitats towards improving planetary health.

Chair:
Location: Alexander
09:15
Molecular eco-systems biology of microbiomes for human and planetary health


10:00-10:30 Coffee Break
10:30-12:30 Session 1: TUMOR ECOSYSTEMS

Summary: Petabyte-scale cancer big data, ranging from single cells to temporal space, is changing the way we understand cancer at the systems level. Various AI technologies, enhanced by supercomputers and large-scale storage, are its driving force. This session focuses on new discoveries and topics that could not be investigated without such technologies.

Location: Grenander I+II
10:30
Inferring kinase activity of a tumor from phosphoproteomic data
PRESENTER: Kristen Naegle

ABSTRACT. Kinase inhibitors are one of the largest classes of FDA-approved drugs and are major targets in oncology. Although kinase inhibitors have played an important role in improving cancer outcomes, major challenges still exist, including the development of resistance and failure to respond to treatments. Improvements in tumor profiling of kinase activity would be an important step in improving treatment outcomes and identifying effective kinase targets. Here, we present a graph- and statistics-based algorithm, called KSTAR, which harnesses the phosphoproteomic profiling of human cells and tissues by predicting kinase activity profiles from the observed phosphorylation of kinase substrates. The algorithm is based on the hypothesis that the more active a kinase is, the more of its substrates will be observed in a phosphoproteomic experiment. This method is error- and bias-aware in its approach, overcoming challenges presented by the variability of phosphoproteomic pipelines, limited information about kinase-substrate relationships, and limitations of global kinase-substrate predictions, such as training set bias and high overlap between predicted kinase networks. We demonstrate that the predicted kinase activities: 1) reproduce physiologically relevant expectations and generate novel hypotheses within cell-specific experiments, 2) improve the ability to compare phosphoproteomic samples on the same tissues from different labs, and 3) identify tissue-specific kinase profiles. Global benchmarking and comparison to other algorithms demonstrate that KSTAR is particularly superior for predicting tyrosine kinase activities and, given its focus on utilizing more of the available phosphoproteomic data, significantly less sensitive to study bias.
Finally, we apply the approach to complex human tissue biopsies in breast cancer, where we find that KSTAR activity predictions complement current clinical standards for identifying HER2-status – KSTAR can identify clinical false positives, patients who will fail to respond to inhibitor therapy, and clinically defined HER2-negative patients that might benefit from HER2-targeted therapy. KSTAR will be useful for both basic biological understanding of signaling networks and for improving clinical outcomes through improved clinical trial design, identification of new and/or combination therapies, and for identifying the failure to respond to targeted kinase therapies.
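
The hypothesis stated in the abstract above (the more active a kinase is, the more of its substrates are observed) can be illustrated with a one-sided hypergeometric enrichment test. This is a minimal sketch under stated assumptions and not the KSTAR implementation: the function name `hypergeom_enrichment_p` and all toy numbers are hypothetical, and the graph-based and bias-aware parts of the published method are not represented.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_substrates, n_observed, n_hits):
    """Return P(X >= n_hits) for a hypergeometric draw.

    n_background: total phosphosites that could be observed (assumed known)
    n_substrates: sites annotated as substrates of the kinase of interest
    n_observed:   sites actually detected in the experiment
    n_hits:       detected sites that are substrates of the kinase
    """
    upper = min(n_substrates, n_observed)
    total = comb(n_background, n_observed)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(n_substrates, k) * comb(n_background - n_substrates, n_observed - k)
        for k in range(n_hits, upper + 1)
    )
    return tail / total


# Toy example (hypothetical numbers): 1000 possible sites, 50 substrates of
# the kinase, 100 sites observed. About 5 substrate hits are expected by chance.
p_enriched = hypergeom_enrichment_p(1000, 50, 100, 15)
p_chance = hypergeom_enrichment_p(1000, 50, 100, 5)
print(p_enriched < p_chance)
```

A small tail probability indicates that more substrates were observed than chance would predict, which under the stated hypothesis is read as evidence of higher kinase activity.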

10:50
XAI × Fugaku uncovered EMT mechanisms via cancer characteristic specific gene network analysis
PRESENTER: Satoru Miyano

ABSTRACT. Gene network analysis is crucial for understanding the complex mechanisms of cancer, because cancer-related mechanisms involve perturbations in the complex molecular network. We consider cell line characteristic-specific gene networks, which enable us to identify the molecular interplay underlying the varying cancer characteristics of cell lines. Although various computational methodologies have been proposed to estimate personalized gene networks, the interpretation of multilayer massive networks remains a challenge, because cell line characteristic-specific gene network analysis provides hundreds of networks consisting of more than ten thousand genes. In order to comprehensively analyze these large-scale networks, we propose a novel explainable artificial intelligence (XAI) strategy, called Tensor Reconstruction-based Interpretable Prediction (TRIP: https://arxiv.org/abs/2007.03912), in collaboration with the Artificial Intelligence Laboratory at Fujitsu. TRIP is a deep learning strategy for tensor decomposition that interprets multilayer massive networks in a human-readable low-dimensional subspace. Unlike existing deep learning-based AI approaches, TRIP learns the network data to minimize the errors of not only the prediction but also the low-dimensional subspace estimation, where the subspace is estimated to capture crucial features for prediction. It enables us to unbox the black-box AI, i.e., we can explain the results by interpreting the projected subspace. The multilayer massive network analysis needs a huge amount of computation and thus we use the supercomputer Fugaku (TOP500: No. 1 in 2021; No. 2 in June 2022). We focus on the epithelial–mesenchymal transition (EMT) mechanism, which plays a key role in tumor progression. We apply TRIP to 762 gene regulatory networks under varying conditions of EMT status of cell lines.
We then uncover the EMT mechanism and related markers by combining our results with the well-known EMT markers, i.e., ZEB1, ZEB2, SNAIL1, SNAIL2, and TWIST1. Our results suggest 17 novel EMT markers (e.g., IFI16 and TP63) and related pathways (e.g., transcription activation and keratinocyte proliferation).

11:10
Are metastatic triple negative breast cancers closer to their parental tumors or primary tumors of their destination tissues?

ABSTRACT. Triple negative breast cancer (TNBC) metastases are assumed to exhibit similar functions in different organs as in the original primary tumor. The past decade has brought several advances in the understanding of metabolic phenotypes of tumors that are different from their adjacent nonmalignant tissues. However, studies of metastasis are often limited to a comparison of metastatic tumors with primary tumors of their origin, and little is known about the adaptation to the local environment of the metastatic sites. Here we present a comprehensive investigation of the extent of adaptation of TNBC cells to their new microenvironment in the distant tissues. We performed an analysis of RNA-Seq data from TNBC primary tumors, paired distant metastases in six different tissues (n=31), primary tumors (TP), and paired adjacent normal tissue samples corresponding to these tissues from TCGA (n=1289), together with healthy tissue data (GTEx, n=3362), to systematically investigate the tissue-specific characteristics of metastatic tumors. We employed different techniques including dimensionality reduction, clustering, deconvolution analyses, and gene set analyses to study their characteristics. We then reconstructed their metabolic networks to investigate the metabolic features of TNBC metastatic tumors and compared with those of primary tumors of the destination tissue. Principal component analysis showed that the expression profiles of metastatic tumors represent an intermediate state between TPs from the tissue of origin and the tissue of destination, with a profile closer to that of the destination tissue TP. Deconvolution analyses revealed substantial similarities among metastatic tumors and primary tumors of their destinations, although the trends of divergence from their tissue of origin vary across different metastatic sites. 
Metabolic network analyses showed that the TNBC metastatic tumors (TM) display metabolic phenotypes that are distinct from the metabolic strategy of TNBC primary tumors. While the altered metabolic program in TNBC-TMs was consistent with TPs of their destination, the metastatic tumors also employed metabolic programs distinct from both TPs of their origin and destination. These functional changes, as well as the retained metabolic signatures from the primary tumors of the origin, were primarily associated with transport reactions and uptake capacity of different nutrients, emphasizing the capability of metastatic tumors to obtain nutrients from their microenvironments to survive the circulation and specific distal tissues. Our results also highlighted the importance of several metabolic pathways contributing to tumor viability which may hold potential as drug targets in cancer and metastasis treatment. In conclusion, metastatic tumor cells differentially engage distinct metabolic strategies similar to primary tumors of the destination tissues to sustain their survival and proliferation depending on the local microenvironments in the metastatic sites while retaining some key signatures from their parental primary tumors. This study could reveal new therapeutic windows for developing more effective treatments of metastatic tumors.

11:23
Unified Tumor Growth Mechanisms from Multimodel Inference and Dataset Integration

ABSTRACT. Systems approaches to elucidate biological processes that impact human health leverage mathematical models encoding mechanistic hypotheses suitable for experimental validation. However, building a single model fit to one dataset may miss alternate equally valid mathematical formulations, and available data may not be sufficient to fully elucidate mechanisms underlying system behavior. Here, we overcome these limitations via a Bayesian multimodel inference (Bayesian-MMI) approach, which estimates how multiple mechanistic hypotheses explain experimental datasets, concurrently quantifying how each dataset informs each hypothesis. We apply this approach to unanswered questions about heterogeneity, lineage plasticity, and cell-cell interaction dynamics in small cell lung cancer (SCLC) tumor growth. Through available dataset integration, we find that Bayesian-MMI predictions support tumor evolution promoted by high lineage plasticity, rather than through expanding rare stem-like populations. These results highlight that given available data, any SCLC cellular subtype can contribute to tumor repopulation post-treatment, suggesting a mechanistic interpretation for tumor recalcitrance.
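
The core idea of weighing several mechanistic hypotheses against the same data can be sketched with generic Bayesian model averaging. This is an illustrative sketch only, not the authors' Bayesian-MMI pipeline: it assumes a log marginal likelihood (log evidence) has already been computed for each candidate model, and the function names and toy numbers are hypothetical.

```python
from math import exp, log


def posterior_model_probs(log_evidences, priors=None):
    """Posterior probability of each model given its log evidence.

    Uses p(M_k | D) proportional to p(D | M_k) * p(M_k), evaluated with the
    log-sum-exp trick so that very negative log evidences do not underflow.
    """
    n = len(log_evidences)
    if priors is None:
        priors = [1.0 / n] * n  # assumption: uniform prior over models
    log_w = [le + log(p) for le, p in zip(log_evidences, priors)]
    shift = max(log_w)
    unnorm = [exp(lw - shift) for lw in log_w]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def model_averaged(values, probs):
    """Average a per-model prediction, weighted by model probability."""
    return sum(v * p for v, p in zip(values, probs))


# Toy example: three hypothetical models with made-up log evidences and
# made-up predictions of some quantity of interest.
probs = posterior_model_probs([-1050.0, -1048.0, -1055.0])
print(probs)
print(model_averaged([0.2, 0.6, 0.9], probs))
```

Working in log space matters here because evidences for realistic datasets are far too small to represent directly as floating-point numbers.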

11:36
Stochastic Model of Intra-Tumor Heterogeneity (SMITH)
PRESENTER: Adam Streck

ABSTRACT. Cell-based simulations are a popular method for investigating intra-tumor heterogeneity and genome evolution during tumour growth. However, tracking individual cells in a tumour of palpable size (billions of cells) is computationally expensive, and most methods thus represent groups of cells (demes, glands, patches) embedded in a lattice. This means that the models create only a simplified abstraction of the population with rigid, non-biological limitations of the lattice. We argue that the particular feature of lattice-based models is that they create implicit spatial constraints on the cell growth resulting in Darwinian selection. However, these constraints can also be expressed explicitly in terms of algebraic geometry and enforced even on non-spatial, well-mixed models.

To demonstrate this claim we have created a well-mixed, confined model of tumour growth of a solid, spherical tumour with fitness altering mutations. Our model introduces a novel mechanic, so-called confinement, that limits the cell turnover in the tumour to its outer shell of a certain width. We show that, when paired with fitness increase of mutations, confinement is sufficient to introduce the Darwinian selection to tumour growth and that different confinement values lead to different spatial dynamics, ranging from pure surface growth to full volume growth. We further show how a wide range of clonal dynamics naturally emerges from the combination of fitness increase and confinement.

Our model is implemented in the SMITH simulation tool. Due to its computational efficiency, SMITH can simulate a real-sized tumour of around 2 cm in diameter (~1 billion cells) in seconds.
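
The geometric intuition behind confinement can be sketched by computing which fraction of a spherical tumour lies in an outer shell of a given width. This is a hedged illustration, not the SMITH code: the abstract does not specify how confinement is parameterised, so the function `shell_volume_fraction` and its arguments are assumptions made for this example.

```python
def shell_volume_fraction(radius, width):
    """Fraction of a sphere's volume within `width` of its surface.

    For 0 <= width < radius the fraction is 1 - ((radius - width) / radius)**3.
    A width of zero corresponds to pure surface growth, and a width of at
    least the radius corresponds to full volume growth.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    if width < 0:
        raise ValueError("width must be non-negative")
    if width >= radius:
        return 1.0
    inner = (radius - width) / radius
    return 1.0 - inner ** 3


# As the tumour grows with a fixed shell width, a shrinking share of the
# volume remains in the proliferating shell.
for r in (1.0, 2.0, 5.0, 10.0):
    print(r, shell_volume_fraction(r, 1.0))
```

In a well-mixed model, a fraction like this could scale the number of cells allowed to turn over at each step, which is one way an explicit spatial constraint can be imposed without a lattice.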

11:49
Vasculogenic mimicry induced by multicellular interaction in a 3D glioblastoma model-on-a-chip
PRESENTER: Tae-Yun Kang

ABSTRACT. In 1999, Maniotis et al. first reported an unusual, seemingly new fluid-conducting structure, vasculogenic mimicry (VM), through which cancer cells may supplement traditional angiogenesis in oxygen/nutrient supply and the elimination of cell waste in an endothelial cell-free manner. VM is frequently observed in aggressive tumors and is associated with lower patient survival. It might explain why some of the most-hyped drugs in cancer therapy, angiogenesis inhibitors, have underperformed. However, the idea has been controversial, as the presence of fluid-conducting tubes within a tumor mass has not been proved in vitro. Most in vitro VM research has wrongly relied on the intercellular network formed between cancer cells, treating it as analogous to the vasculogenesis of endothelial cells. Studies based on the vasculogenic activity of cancer cells have so far failed to provide a solid basis for VM, which hampers efforts to elucidate the mechanisms behind this phenomenon and to find proper therapeutic strategies. Here we report the discovery and a new mechanism of internal cavity formation within A172 glioblastoma (GBM) spheroids in vitro, which we presume to be VM. We developed a GBM model-on-a-chip recapitulating the essential cellular components of the in vivo GBM environment: GBM cells, endothelial cells as vasculature, and macrophages as an immune component. Multicellular interaction through direct contact or paracrine signaling in the model revealed that vasculogenesis of endothelial cells was triggered by GBM spheroids but inhibited by macrophages. Strikingly, GBM cells formed internal cavities as an alternative means of mass transport in the presence of macrophages. Diffusion through the cavities was visually confirmed with FITC-dextran, and the hypoxic core was significantly diminished, increasing the size of GBM spheroids. These results are consistent with the notion of VM as the tumor's own means of delivering blood.
The histological characteristics of the internal cavities were also consistent with VM observed in vivo: cancer cells in the absence of EC marker lining on the non-luminal side of a glycoprotein-rich matrix. Based on our observations, we propose and demonstrate an alternative mechanism of VM formation consisting of two sequential steps: 1) entosis, a process by which one cell is engulfed by another cell, regulated by heterogeneous FGFR1 expression in GBM cells, and 2) the death of the engulfing cell triggered by lysosomal membrane permeabilization of the engulfed cell. VM formation was controlled by manipulating the ratio of a sub-population with hyperexpression of FGFR1 and by disturbing the pathways with chemical inhibition or gene silencing. By focusing on the multicellular interaction under the GBM environment, we established an in vitro model for studying VM and revealed its key mechanisms. It offers new insight into VM and suggests novel therapeutic strategies for VM prevention.

12:02
Cancer cells depend on environmental lipids for proliferation when electron acceptors are limited

ABSTRACT. Production of oxidized biomass, which requires regeneration of the cofactor NAD+, can be a proliferation bottleneck that is influenced by environmental conditions. However, a comprehensive quantitative understanding of metabolic processes that may be affected by NAD+ deficiency is currently missing. Here, we show that de novo lipid biosynthesis can impose a substantial NAD+ consumption cost in proliferating cancer cells. When electron acceptors are limited, environmental lipids become crucial for proliferation because NAD+ is required to generate precursors for fatty acid biosynthesis. We find that both oxidative and even net reductive pathways for lipogenic citrate synthesis are gated by reactions that depend on NAD+ availability. We also show that access to acetate can relieve lipid auxotrophy by bypassing the NAD+ consuming reactions. Gene expression analysis demonstrates that lipid biosynthesis strongly anti-correlates with expression of hypoxia markers across tumor types. Overall, our results define a requirement for oxidative metabolism to support biosynthetic reactions and provide a mechanistic explanation for cancer cell dependence on lipid uptake in electron acceptor-limited conditions, such as hypoxia.

12:15
Lipid metabolic reprogramming extends beyond histological tumor demarcations in human operable pancreatic cancer
PRESENTER: Abel Szkalisity

ABSTRACT. Pancreatic ductal adenocarcinoma (PDAC, here pancreatic cancer) is one of the deadliest diseases, with a poor expected survival time. As the mutations driving this malignancy are found in other cancers with better prognostic prospects, an increasing number of studies focus on the pancreatic tumor microenvironment to identify additional contributing factors. For instance, PDAC is characterized by a prominent fibrotic stroma infiltrating the neoplastic areas, and this desmoplastic reaction might either promote or limit tumor progression.

Inspired by studies in mouse models that argued for the importance of lipid metabolic interplay between the tumor and the stroma, we performed here a systematic characterization of the proteome of four tissue compartments from human operable pancreatic cancers. Utilizing laser-capture microdissection we isolated neoplastic lesions (neoplastic parenchyma – NP), tumor adjacent, histologically benign exocrine pancreas (adjacent parenchyma – AP), and stromal tissue surrounding both (neoplastic stroma and adjacent stroma) from the diagnostic specimen of 14 treatment-naïve patients operated on in the Helsinki University Hospital. LC-MS/MS proteomics quantified 6979 unique proteins including lipid metabolic enzymes with low abundance.

We found a loss of normal pancreatic secretory functions in the NP compared to AP, as expected, and abundant apolipoproteins in the stromal areas, reflecting vascular supply. On the other hand, lipid metabolism was more active in the parenchymal regions than in the corresponding stromas, with cholesterol biosynthetic enzymes being most pronounced in the NP.

Despite the small cohort, we investigated the prognostic relevance of the proteins in each microdissected tissue compartment by dividing the 14 patients at their median (2-year) survival time. To our great surprise, the tumor-adjacent (AP) regions harbored more proteins significant for survival than any other compartment. The presence of prognostically relevant proteomic variation in the AP suggested that the histologically benign exocrine pancreas adjacent to the tumor is potentially different from truly healthy pancreas. To test this hypothesis, we reprocessed 12 pancreatic samples from healthy individuals and integrated them with our data. We concluded that the tumor-adjacent exocrine regions differed from healthy pancreas by increased lipid metabolism and transport activity, and this difference was prognostically relevant for the survival of PDAC patients.

We verified our findings with immunohistochemical staining of select proteins and by analyzing the proteomes of 51 additional patients from published datasets. Our study underscores the role of altered lipid metabolism in PDAC progression and shows that investigation of the previously neglected histologically benign tumor microenvironment may provide novel possibilities in the quest for effective treatment of pancreatic cancer.

12:18
Disentangling the internal composition of tumour activities through a hierarchical factorization model

ABSTRACT. Genomic heterogeneity represents one of the most distinctive molecular features of any type of cancer, having a considerable impact on the efficacy of available medical treatments, often leading to relapse and subsequent deterioration of patients' health. Tumourigenesis emerges as a strongly stochastic process, producing a variable landscape of genomic configurations organised into cell subpopulations or dominant clones, building the global identity of the tumour. In this context, matrix factorisation techniques represent a suitable approach, as they are able to efficiently capture complex patterns of variability. These methodologies aim to obtain a finite set of latent patterns that represent the basic building blocks of observations, rendering in cancer samples the different molecular strategies that tumours implement to develop the hallmarks of cancer. Furthermore, these patterns are shared between samples belonging to the same phenotypic group, providing valuable insight into the main differences and commonalities between cancer subtypes. From this perspective, we present a protocol[1] designed to explore the different levels of genomic heterogeneity in a cohort of cancer patients. To this end, the protocol is based on a hierarchical factorisation model conceived from a systems biology perspective, which integrates the topology of signalling pathways. For a set of altered biological processes, the model simultaneously decomposes two different matrices representing the activity of genes and signalling pathways, respectively, obtaining for the same group of patients two sets of mutually compatible latent components. The protocol was evaluated using a set of simulations specifically designed to recapitulate the cellular hierarchy between genes and pathways, showing a high degree of accuracy when recovering both the previously introduced components and their weights into the simulated samples. 
In addition, the analysis performed on a real cohort of breast cancer patients recapitulated the internal composition of some of the most relevant altered biological processes in the disease, such as the internal structure of the Her2 subtype in the regulation of epidermal growth factor, the inner composition of the Basal subtype in the Notch signalling pathway and the differences between the Luminal A and Luminal B subtypes in the regulation of oestrogen response and the cell cycle regulation, describing gene- and pathway-level strategies and their combinations across the different breast cancer subtypes. We envisage that hierarchical matrix factorization designs will be essential to better understand the different levels of heterogeneity in tumour cells, revealing how patients who develop the same hallmarks of cancer start from largely different initial genomic configurations.

[1] Carbonell-Caballero, J., López-Quílez, A., Conesa, D., & Dopazo, J. (2021). Deciphering Genomic Heterogeneity and the Internal Composition of Tumour Activities through a Hierarchical Factorisation Model. Mathematics, 9(21), 2833.

12:21
Cell type-specific gene co-expression modules define tumor heterogeneity in melanoma patients
PRESENTER: Michael Prummer

ABSTRACT. Gene co-expression networks govern all cellular processes in health and disease. But the presence or absence of correlated gene pairs is difficult to interpret in bulk samples. For instance, the co-occurrence of two cell types can lead to an apparent co-expression of two genes even when they are completely independent within each individual cell. In single-cell experiments, by contrast, an observed correlation between a pair of genes is truly present within individual cells.

Here we use droplet-based single cell transcriptomics to discover disease-specific robust co-expression networks in different cell types from tissue biopsies of melanoma patients. We analyze each sample independently to arrive at patient-specific networks and subsequently compare them across the cohort. This way, we remove technical variability and perform what is called late integration of the data. To this end, co-expression sub-networks (aka, modules) are identified in each patient using community detection principles. Recurring as well as unique co-expression modules are compared to gene ontology terms to assign a biologically meaningful label. Any difference of the disease and cell type-specific module composition from common gene sets can provide new insight into disease causing mechanisms or novel treatment options. After all, many of the curated gene sets used for enrichment analysis were derived from bulk samples of healthy individuals or non-human model organisms or cultured cell lines. As an outlook, patient-specific gene expression programs in various cell types may give rise to personalized treatment recommendations.
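
The bulk-sample confound described at the start of this abstract can be reproduced with a small simulation. This is a toy sketch under stated assumptions, not the authors' analysis: the two cell types X and Y, the expression levels (arbitrary units), and the function names are all hypothetical. In every simulated cell the two genes are drawn independently, yet the bulk totals correlate strongly because the two cell types co-occur in varying numbers across samples.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_samples=200, max_cells=50, seed=0):
    """Return (bulk correlation, within-cell correlation) for genes A and B."""
    rng = random.Random(seed)
    bulk_a, bulk_b = [], []
    cell_a, cell_b = [], []  # single-cell values from type X cells only
    for _ in range(n_samples):
        # Both cell types infiltrate together: same count in each sample.
        n_cells = rng.randint(1, max_cells)
        total_a = total_b = 0.0
        for _ in range(n_cells):  # type X: gene A high, gene B low
            a, b = rng.gauss(5.0, 1.0), rng.gauss(0.5, 1.0)
            cell_a.append(a)
            cell_b.append(b)
            total_a += a
            total_b += b
        for _ in range(n_cells):  # type Y: gene A low, gene B high
            total_a += rng.gauss(0.5, 1.0)
            total_b += rng.gauss(5.0, 1.0)
        bulk_a.append(total_a)
        bulk_b.append(total_b)
    return pearson(bulk_a, bulk_b), pearson(cell_a, cell_b)


bulk_r, cell_r = simulate()
print("bulk correlation:", bulk_r)
print("within-cell correlation:", cell_r)
```

The within-cell correlation stays near zero while the bulk correlation is close to one, which is the motivation for building co-expression networks per cell type from single-cell data.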

12:24
Hybrid cellular automaton representation of the influence of mechanical stimulation on the development of bone metastases
PRESENTER: Claire Villette

ABSTRACT. Bone metastases (BMs) are among the most debilitating complications for cancer patients. They are associated with poor prognosis and are often incurable. BMs develop through cancer-induced perturbation of the inherent bone remodelling process, which is responsible for healthy bone integrity through balanced resorption of old/damaged bone and formation of new tissue. Osteolytic BMs interfere with this balance in a vicious cycle whereby cancer cells favour bone resorption. Growth factors are released from the degraded matrix and enhance tumour growth, which in turn intensifies bone resorption. Mechanical loading naturally induces an opposite shift to the remodelling balance by stimulating bone apposition. Early in-vitro and in-vivo experiments suggest a therapeutic potential for mechanical stimulation against metastases in bone [1].

The aim of this study was to develop a computational model of load-induced bone remodelling in the context of cancerous metastases, in order to screen for loading regimens with potential therapeutic benefits. This model aimed to recapitulate five main processes at play in this context: differentiation and/or proliferation of healthy cells (osteogenic cells and osteoclasts) and cancer cells, osteogenic/osteolytic activity of healthy cells, influence of cancer cells on healthy cell activity, influence of mechanical stimulation on cellular activity, and changes in mechanical environment due to osteogenesis/osteolysis.

A hybrid cellular automaton framework was implemented in FEniCSx to support this model. Cellular events (proliferation, differentiation, migration) were modelled using a cellular automaton on a regular grid of 10 micrometer resolution. In parallel, separate partial differential equation (PDE) problems were defined to represent the mechanical environment in response to loading and the diffusion of osteoprotegerin (OPG), receptor activator of NF-κB ligand (RANKL), and parathyroid hormone-related protein (PTHrP) signals. The PDEs were solved using Finite Element solvers on linear elements with a mesh resolution of around 5 micrometers. Osteogenic cell secretion of OPG increased in response to increased von Mises stress. In response to PTHrP signals, OPG secretion reduced, and osteolytic activity increased. Osteogenic and osteolytic activities resulted in changes in the system mechanical properties, which in turn influenced local von Mises stress.

This model was evaluated against qualitative observations from in-vitro experiments. It proved capable of capturing changes in extra-cellular matrix deposition by osteoblasts seeded in hydrogel in response to loading, as well as changes in OPG/RANKL expression ratio in response to loading and presence of cancer cells [2, 3].

Next development steps include quantitative model calibration and simulation of different mechanical loading scenarios for comparison of therapeutic benefits in terms of cancer cell proliferation and activity.

[1] Lynch et al., 2013. Journal of Bone and Mineral Research, 28(11), pp.2357-2367. [2] Mc Garrigle et al., 2016. Eur Cell Mater, 31, pp.323-340. [3] Curtis et al., 2020. Journal of the Royal Society Interface, 17(173), p.20200568.
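
The coupling between a cellular automaton and a diffusing signal described in this abstract can be sketched in one dimension. This is a deliberately simplified illustration and not the authors' FEniCSx model: it swaps the finite element solver for an explicit finite-difference update, omits the mechanics entirely, and uses hypothetical cell states ("source", "resting", "active"), rules, and parameter values.

```python
def diffuse(field, d_coef, dx, dt):
    """One explicit finite-difference diffusion step with no-flux boundaries."""
    r = d_coef * dt / dx ** 2
    if r > 0.5:
        raise ValueError("explicit scheme is unstable for d_coef*dt/dx**2 > 0.5")
    n = len(field)
    new = field[:]
    for i in range(n):
        left = field[i - 1] if i > 0 else field[i]
        right = field[i + 1] if i < n - 1 else field[i]
        new[i] = field[i] + r * (left - 2 * field[i] + right)
    return new


def step(cells, signal, secretion=1.0, threshold=0.1,
         d_coef=1.0, dx=1.0, dt=0.25):
    """Advance the hybrid model by one time step."""
    # 1. Source cells secrete the signal at their own grid site.
    signal = [s + secretion * dt if c == "source" else s
              for c, s in zip(cells, signal)]
    # 2. PDE part: the signal diffuses along the grid.
    signal = diffuse(signal, d_coef, dx, dt)
    # 3. Automaton part: resting cells switch state above a threshold.
    cells = ["active" if c == "resting" and s > threshold else c
             for c, s in zip(cells, signal)]
    return cells, signal


cells = ["source", "resting", "resting", "resting", "resting"]
signal = [0.0] * len(cells)
for _ in range(50):
    cells, signal = step(cells, signal)
print(cells)
```

Alternating a field update with a rule-based state update in each time step is the general pattern of a hybrid cellular automaton; a fuller model would add further fields and let cell activity feed back on them.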

12:25
Single cell pseudo-time inference based on copy number variants for identifying a tumor transition state
PRESENTER: Jonghyun Lee

ABSTRACT. Recent in-depth pan-cancer analysis revealed that there is no apparent universal cause of cancer; driver mutations and the rate of mutation accumulation are all unique to the types of tissue and organ which cancer originates from. Nonetheless, transformation events would have happened to alter a healthy tissue into a tumor. We focused on a universal framework for this normal-to-cancer transition where various causes of cancer are implicitly included, from which a transition state between healthy and tumor can be identified and potential targets for anti-cancer treatments can be predicted. There have been extensive studies on transition states for cellular development, differentiation and reprogramming. Typical pseudo-time analysis based on the single-cell transcriptomic data infers the trajectory of the cellular development from stem cells to differentiated states, and further extended concepts such as RNA velocity assign which direction the cells are likely to progress. However, while the differentiation process occurs along the same genetic background, the tumorigenesis progresses along with the accumulation of genetic changes. Therefore, transcriptional profiles are insufficient in describing the transition of normal to tumor cells. One of the well-established factors regarding the genetic changes during tumorigenesis is the accumulation of aneuploidy. Copy number variation (CNV) inference algorithms such as InferCNV and CopyKat deduce the aneuploidy from the single-cell expression data. Based on these algorithms, one can not only differentiate tumor cells from normal cells but also uncover an intermediate clone that consists of the mixture of normal and tumor cells, that is, a tumor transition state. By utilizing the CNV accumulation as a means to identify the transition state, we were able to construct and define transition states in three different types of cancer: Breast, Lung and Colon. 
We describe the CNV profiles of the transition states in different cancers, and the genes affected by these aneuploidies. Our algorithm identifies a group of cells that were previously uncharacterized in cancer research. Possible drug targets identified from the cells in the transition state during tumorigenesis have strong implications for cancer prevention, at both onset and remission, and are a first step towards personalized cancer treatment.

12:26
Deep learning a model of cytotoxic T cell activation in the tumor microenvironment
PRESENTER: Madison Wahlsten

ABSTRACT. Cytotoxic CD8 T cells recognize antigens presented on major histocompatibility class I (MHC-I) molecules expressed on the surface of other cells, including tumors, leading to T cell activation and killing. The level of T cell activation, indicated by surface marker expression, cytokine production, and killing activity, is modulated by many factors including the quality and quantity of presented antigens. Immunotherapies such as checkpoint blockade antibodies function by preventing checkpoint inhibitors such as PD-1 and CTLA-4 from inhibiting tumor-specific T cell cytotoxic responses to cancer cells. These immunotherapy treatments have been successful in several cancers such as non-small cell lung cancer and melanoma, but limited in other types of cancers (e.g., pancreatic or prostate carcinomas) owing to differences in tumor antigenicity. Previous work from our lab has shown that the quality of an antigen for T cell activation can be encoded in a single parameter derived from cytokine dynamics produced in ex vivo co-cultures between antigen presenting cells (APCs) and T cells. This encoding provides an overall measure of antigenicity for a sample. Here we built a model that can capture the quality of tumor antigen seen by an individual T cell. Using a custom robotic platform, we generated high-throughput kinetics of T cell activation in co-culture with APCs by sampling supernatants and analyzing cells at various timepoints. We performed spectral flow cytometry to measure the expression of up to 30 surface markers and intracellular signals per cell from these co-cultures. Typical datasets comprise over ten million cells, characterized by 25-30 features over 72 hours, across up to 96 conditions composed of different quality and quantity of antigens, ratios of T cells to APCs, and drug perturbations. 
To analyze these content-rich datasets, we designed a deep neural network that can classify the antigen seen by an individual cell using marker expression values from flow cytometry with high accuracy (area under the receiver operating characteristic curve > 0.8). By leveraging the multifactorial nature of T cell activation at the single cell level, we aim to provide an in vivo-relevant classification of T cell activation, as well as insight into perturbations that could be applied to immunotherapies to achieve better responses in more patients.
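
The accuracy figure quoted above is an area under the ROC curve. As a minimal sketch of how that metric can be computed for a binary classifier, the following uses the rank-based (Mann-Whitney) formulation; the function name `auroc` and the toy scores are illustrative assumptions, not the authors' code or data.

```python
import numpy as np

def auroc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    scores: classifier outputs, higher means "more likely positive".
    labels: 1/True for positive cells, 0/False for negative cells.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    # Compare every positive score with every negative score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    # Fraction of correctly ordered pairs; ties count as half.
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# Hypothetical toy example: four cells, two per class.
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect separation of the two classes.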

12:27
Exploring the “dark proteome” in hepatocellular carcinoma

ABSTRACT. A key remaining frontier in our understanding of biological systems is the “dark proteome”, that is, proteins encoded by long noncoding RNAs (lncRNAs) whose molecular function is largely unknown. The key aspect of this work is that it combines big data mining and pathology to explore the “dark proteome” in hepatocellular carcinoma (HCC), a highly aggressive cancer with limited therapeutic options. Experiments modulating lncRNA-encoded microprotein expression confirmed a role in proliferation and metastasis in liver cancer. Considering that there are very few accurate molecular biomarkers for HCC detection, understanding the function of the entities involved and their potential role in diagnosis and patient stratification will have a substantial impact on HCC therapy. In this study we identified a subset of HCC-specific lncRNAs that are translated into small functional proteins. Here, abnormal chromatin remodeling in HCC triggers the expression of lncRNA-encoded microproteins. We generated specific antibodies for C20orf204-189AA and Linc013026-68AA, two HCC-specific lncRNA-encoded microproteins. Both proteins promote cancer cell proliferation. At the molecular level, we show that C20orf204-189AA participates in ribosomal RNA transcription, while Linc013026-68AA may be phosphorylated by Epidermal Growth Factor Receptor (EGFR) and extracellular signal-regulated kinase (ERK). Remarkably, C20orf204-189AA protein was detected in 70% of primary HCCs but not in control livers, suggesting that HCC-specific lncRNA-encoded proteins may represent a novel class of biomarkers and HCC targets. Our finding also sheds light on the role of the previously ignored “dark proteome”, which originates from noncoding regions, in the maintenance of cancer.

Publications:

1. Polenkowski, M., Allister, SB., Burbano de Lara, S., Pierce, A., Geary, B., El Bounkari, O., Wiehlmann, L., Hoffmann, A., Whetton, AD., Tamura, T. and Tran, D.D.H. THOC5 Complexes With DDX5, DDX17 and CDK12 Are Essential in Primitive Cell Survival to Regulate R Loop Structures and Transcription Elongation Rate. http://dx.doi.org/10.2139/ssrn.4175592 (under revision in iScience)

2. Polenkowski M, Burbano de Lara S., Allister MB, Nguyen TNQ, Tamura T and Tran DD. Identification of novel micropeptides derived from hepatocellular carcinoma-specific long noncoding RNA. Int. J. Mol. Sci. 2022, 23(1), 58 (IF=6.2)

3. Burbano De Lara, S., Tran, D.D, Allister, A.B, Polenkowski, M., Nashan, B, Koch, M, Tamura, T. C20orf204, a hepatocellular carcinoma-specific protein interacts with nucleolin and promotes cell proliferation. Oncogenesis. 2021 Mar 17;10(3):31. (IF = 6.5)

12:28
Analysis of Heterogeneity in the Tumor Microenvironment in the 4NQO HNSCC Mouse Model
PRESENTER: Lina Kroehling

ABSTRACT. Oral squamous cell carcinoma (OSCC) is the sixth most prevalent cancer. Expression of lysine-specific demethylase 1 (LSD1), a nuclear histone demethylase, progressively increases with tumor grade and stage in clinical OSCC, and blocking LSD1 inhibits preneoplasia. However, the mechanisms through which LSD1 promotes carcinoma are unknown. This study uses single cell RNA-seq analyses to evaluate whether LSD1 inhibition attenuates OSCC by affecting the prevalence of specific cell types within the tumor and the cell-cell interactions occurring between these subtypes.

10:30-12:30 Session 2: IMAGING ALGORITHMS

Summary: In this session we will cover the great challenges in current imaging and image analysis. From deep learning to data analysis, we will host talks that aim to unravel systems properties of complex biological systems including tissues. Multi-scale analysis of biological systems demands both advanced imaging and new algorithms and tools. Talks will include recent examples of large-scale imaging breakthroughs made possible with new algorithms.

10:30
The Systems of Size and Shape Control

ABSTRACT. Cells have evolved mechanisms by which proliferation can generate daughter cells with identical size and shape. For example, during the maintenance of stem cells, division generates identical clones. Size and shape control involves dynamical signalling systems that coordinate cell cycle entry, growth, and mitosis/cytokinesis. However, many of the genes that encode regulators of these processes are mutated in cancer, and participate in the ability of cancer cells to resist therapy. Thus, a major question is how cancer cells regulate size and shape during tumorigenesis, or when acquiring therapeutic resistance.

We have developed a cell biology programme to model the signalling pathways that regulate size during proliferation in cancer cells with statistical and mathematical models. We can now estimate the relationship between signalling dynamics and physical parameters such as cell size, shape, and the spatiotemporal organization/concentration of proteins. Importantly, we can compare models to test hypotheses on behaviour based on genome sequence and gene expression (i.e. different cancer drivers).

The second line of work addresses the basic constraints under which self-organizing image analysis machines operate to differentiate cellular phenotypes. For example, we identify the spatial features (size, shape, cytoskeletal organization) and patterns (morphological heterogeneity across the population) that AI machines use to analyse cells in both laboratory and clinical settings. This work aims to make digital pathology approaches capable of diagnosing cancer earlier and better.

More broadly, our work leads to the development of quantitative signatures to define cell states and aims to answer questions such as: How many cell shapes exist in the body? In cancer cells? How does shape change during differentiation? When combined with other datasets, such as gene or protein expression, we can start to understand the connection between genotype and phenotype.

10:50
Robust oscillations against spatial-temporal noise in intra-inter cellular kinetics

ABSTRACT. The circadian clock generates ~24 h rhythms every day via a negative feedback loop. Although this involves the daily entry of molecules to the nucleus after random diffusion through a crowded cytoplasm, the period is extremely well preserved. Furthermore, the period is well maintained across a cell population in which cell size differs considerably. In this talk, I will illustrate how thousands of molecules work together in time and space to compensate for their spatio-temporal variations and maintain robust rhythms, which we identified using a combination of agent-based modeling and single cell imaging experiments. Furthermore, individual oscillatory cells in a population can communicate via intercellular signals to generate synchronous rhythms. I will describe how cells use an intracellular signal amplification to achieve long-range temporal synchrony with a local signal, which we identified via a combination of a delay PDE model and synthetic biology experiments.

This work is based on Jeong et al, Interface Focus (2022), Bessely et al, PNAS (2020), Kim et al, Nature Chemical Biology (2019).

11:03
Inference of a causal synaptic molecular network from multiplexed imaging and its perturbations in autism and schizophrenia
PRESENTER: Reuven Falkovich

ABSTRACT. The complex functions of neuronal synapses depend on their tightly interacting, compartmentalized molecular network of hundreds of proteins spanning the pre- and post-synaptic sites. Due to the high interconnectivity of any molecular subsystem in the synapse, a whole-network view is needed to understand the biochemical processes and synapse state diversity underlying different forms of plasticity and metaplasticity and, importantly, their dysfunction in cognitive disorders like autism and schizophrenia. This challenge is not restricted to synapses, as it is relevant to all subcellular molecular systems whose composition is controlled by interdependent local translation or recruitment of proteins, for example, more than by global gene expression.

In this work, we leveraged PRISM – a quantitative, high-throughput, single-synapse multiplexed imaging technique – in combination with Bayesian network inference, to derive a graph of causal conditional dependencies among eight proteins in the excitatory synapse. The resulting model predicts new hypotheses for downstream effects of perturbing individual nodes, which we tested directly. Applying this analysis to previously obtained data of RNAi knockdowns of 16 schizophrenia- and autism-associated genes, we show that central features of the network are similarly perturbed across all genetic knockdowns, despite having very different targets and divergent effects on individual synaptic proteins. This offers insight into the convergent molecular etiology of these debilitating, hereditary and highly polygenic disorders.

Thus, the combination of PRISM imaging with Bayesian network inference, which can also be integrated with live imaging data and dynamic network inference, offers a novel data modality and hypothesis-generating tool for understanding complex protein networks in situ in cells, organelles, and subcellular structures, as well as their responses to chemical or genetic perturbations.

11:15
Development of a geometric blood vessel model to quantify morphological changes of endothelial cells in 3D during vascular remodeling
PRESENTER: Daniel Seeler

ABSTRACT. Vascular remodeling is a physiological process that continuously ensures sufficient nutrient supply of tissues by long-term changes of the blood vessel architecture. Here, endothelial cells (ECs) lining the blood vessel interior perceive fluid shear stress (FSS) exerted by blood flow. In response to high FSS ECs elongate and align in flow direction. These cell morphological changes lead to changes in vessel diameter and, subsequently, FSS. To better understand these coupled processes, we aimed to develop a geometric model linking EC shape and dorsal aorta (DA) geometry in zebrafish embryos and use it to quantify morphometric measures of EC morphology in 3D. We obtained endothelial cell contours in the DA of zebrafish embryos (N=7) at 48 hours post fertilization (hpf) and 72hpf by manually annotating 3D data points onto the EC-specific transgenic junctional marker pecam1-EGFP in Imaris software (Bitplane Inc.). As a pre-processing step, we fitted smoothing splines to each cell contour to reduce noise created by the manual annotation process. Then, we locally fitted cross-section shapes along the vessel axis. To describe the family of cross-section shapes, we developed a shape model which accounts for physiologically observed dorsal-ventral asymmetry. We ensured feasibility and locality of the estimation by considering the projection errors of all points in proximity to the cross-section with Gaussian weights decaying with distance. To improve the estimation in case of outliers and data sparsity, we (1) chose the width of the weight function adaptive in space and (2) constrained the deviation of the local cross-section shape’s parameters from the mean shape’s parameters. As a post-processing step, we employed a Gaussian filter in parameter space to smoothen the transitions between cross-section shapes along the vessel axis. We interpolated between the fitted cross-section shapes by triangulation. 
Each EC surface was triangulated such that its projected contour edges were included as triangle edges, resulting in a 3D mesh with a smooth boundary. Using these 3D meshes, we were able to quantify the increase of both EC elongation and alignment in flow direction, and the decrease in EC compactness, between 48hpf and 72hpf. Our estimation process has low projection errors and is robust to annotation errors. Performing the analysis in 3D avoids cartographic distortions, which would be caused by tube unrolling. While our geometric model currently only computes a static description of EC morphology, a dynamic model of EC shape in response to blood flow can be incorporated. This would enable us to study the dynamics of the coupling between EC morphology and vessel geometry. Our long-term aim is to compare EC morphology between wild-type zebrafish and disease models to improve our understanding of pathological vascular remodeling in humans.
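
The local, Gaussian-weighted cross-section estimation described above can be illustrated with a heavily simplified sketch. It assumes (unlike the abstract's asymmetric shape model) a circular cross-section centred on a straight vessel axis along z, so that the weighted least-squares fit reduces to a weighted mean of radial distances. The function name `local_radius` and all parameter values are hypothetical.

```python
import numpy as np

def local_radius(points, z0, sigma):
    """Gaussian-weighted estimate of the vessel radius at axial position z0.

    points: (n, 3) array of annotated 3D points (x, y, z).
    Assumption: the vessel axis is the z-axis and the cross-section is a
    circle centred on it. Under that assumption, minimising the weighted
    squared radial errors gives the weighted mean of radial distances.
    """
    points = np.asarray(points, dtype=float)
    radial = np.hypot(points[:, 0], points[:, 1])
    # Weights decay with axial distance from the cross-section of interest.
    weights = np.exp(-0.5 * ((points[:, 2] - z0) / sigma) ** 2)
    return (weights * radial).sum() / weights.sum()

# Synthetic tapered tube: the radius grows linearly along z.
z = np.linspace(-5.0, 5.0, 201)
r = 3.0 + 0.2 * z
theta = np.linspace(0.0, 40.0 * np.pi, z.size)
tube = np.column_stack([r * np.cos(theta), r * np.sin(theta), z])
print(local_radius(tube, z0=1.0, sigma=0.5))
```

A smaller `sigma` makes the estimate more local but more sensitive to sparse or noisy annotations, which is the trade-off the adaptive weight width in the abstract addresses.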

11:27
Analyzing single-cell trajectories with augmented machine learning methods predicts cancer drugs response in live-cell microscopy assays
PRESENTER: Marielle Péré

ABSTRACT. Cell response heterogeneity upon treatment is a main obstacle in preclinical development of efficacious cancer drugs, due to the emergence of drug-tolerant cells. We have previously developed an approach to profile drug-tolerant persisters, based on predictions of their drug response. To automate and increase the prediction throughput, we present a mathematical framework, composed of machine-learning classification models augmented by an ODE model.

First, an ODE model of the extrinsic apoptosis initiation by death ligands is calibrated on time trajectories of hundreds of treated clonal HeLa cells. The resulting deterministic systems are then analysed, based on drug response, to highlight mechanistic features with predictive value for cell decision. In a second step, we combine the predictions of the ODE system with machine-learning classification models to determine the drug response of each cell before it commits to an irreversible decision that would alter its state before profiling.

Here we show that the ODE model analysis could detect the time of cell decision shortly after treatment, thanks to the emergence of an additional regulation at the receptor level in drug-sensitive cells. Moreover, the parameter distribution of the deterministic system provided a biological threshold that allows the prediction of cell response. Our mechanistically informed approach, combining our ODE system with machine learning classifiers, outperformed classic machine learning approaches and enabled accurate response prediction for otherwise unpredictable cells (Meyer et al., Cell Systems 2020).

11:39
PolarityJaM: An image analysis toolbox for cell polarity, junction and morphology quantification
PRESENTER: Wolfgang Giese

ABSTRACT. Cellular polarity is important in many biological processes, from development and wound healing to angiogenesis. Fundamental processes of living cells such as cell migration, cell division and morphogenesis depend on prior polarization and breaking of spatial symmetry. Spatial reorganization of the plasma membrane, cytoskeleton, cell-cell junctions or organelles is required to establish an axis of polarity with a distinct direction, meaning ‘front and back’, to guide directed processes. In these processes, cells have to adapt and react according to multiple and often conflicting cues from the environment.

We put forth a package with the aim of providing the user with an up-to-date, versatile means to reproducibly conduct exploratory image analysis, exemplified for endothelial cells in culture and tissue. This package can be roughly divided into three parts: (multi-channel) cell instance segmentation, feature extraction, and exploratory analysis. Multi-channel segmentation is carried out using Cellpose; the user can specify a pre-trained model in this step or provide their own model.

After the image or a collection of images has been segmented into individual cells, the features of each cell are extracted. Examples of these features include junctional properties, cellular orientation, and organelle orientation within the cell. With the features extracted, several publication-ready plots are automatically generated to visualize phenotypes such as collective orientation and tissue-wide variation in size and eccentricity.

At this stage in the pipeline, tabulated per-cell features exist and can be uploaded to an R Shiny app (www.polarityjam.com), where periodic features such as collective variation in orientation can be measured and visualized. Circular statistics of cellular polarity, including mean and confidence intervals, comparative circular statistics, circular-linear and circular-circular correlation analysis, as well as spatial statistics, can be computed in the R Shiny app. Metadata across all steps is saved into a text file along with visualization output. Many powerful and sophisticated tools have been developed to analyze the endothelial tissue collective phenotype. We present a package that integrates these analyses reproducibly and with a level of documentation that allows the user to efficiently and quickly carry out analysis of their data.
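
Circular statistics differ from ordinary averages because angles wrap around: 350° and 10° should average to 0°, not 180°. The following minimal sketch shows the standard circular mean and mean resultant length for a set of polarity angles. It is an illustration of the general technique only, not code from the PolarityJaM package or its app, and the function name `circular_mean` is an assumption.

```python
import numpy as np

def circular_mean(angles_rad):
    """Return (mean direction in radians, mean resultant length R).

    R lies in [0, 1]: values near 1 mean the cells are strongly
    co-aligned, values near 0 mean no collective orientation.
    """
    angles = np.asarray(angles_rad, dtype=float)
    # Average the unit vectors rather than the raw angle values.
    c = np.cos(angles).mean()
    s = np.sin(angles).mean()
    return np.arctan2(s, c), np.hypot(c, s)

# Two cells pointing at 350 and 10 degrees: the mean direction is ~0.
mean_dir, r_len = circular_mean(np.deg2rad([350.0, 10.0]))
print(np.rad2deg(mean_dir), r_len)
```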

11:51
On multistability and constitutive relations of cell motion on Fibronectin lanes
PRESENTER: Behnam Amiri

ABSTRACT. Cell motility on flat substrates exhibits coexisting steady and oscillatory morphodynamics, the biphasic adhesion-velocity relation, and the universal correlation between speed and persistence (UCSP) as simultaneous observations common to many cell types. Their universality and concurrency suggest a unifying mechanism causing all three of them. This study suggests a mechanical mechanism controlled by integrin signalling on the basis of a biophysical model and analysis of trajectories of MDA-MB-231 cells on Fibronectin lanes. The experiments exhibit cells with steady or oscillatory morphodynamics that are either spread or moving, with spontaneous transitions between the spread and moving regimes and spontaneous direction reversals. Our biophysical model is based on the force balance at the protrusion edge, the noisy clutch of retrograde flow and a response function of friction and membrane drag to integrin signalling. The theory reproduces the experimentally observed cell states, characteristics of oscillations and state probabilities. Analysis of experiments with the biophysical model establishes a stick-slip oscillation mechanism and explains the multistability of cell states and the statistics of state transitions. It suggests that protrusion competition causes direction reversal events, the statistics of which explain the UCSP. The effect of integrin signalling on drag and friction explains the adhesion-velocity relation and cell behavior at Fibronectin density steps. The dynamics of our mechanism are non-linear flow mechanics driven by F-actin polymerization and shaped by the noisy clutch of retrograde flow friction, protrusion competition via membrane tension, and drag forces. Integrin signalling controls the parameters of the mechanical system.

12:01
Insights on hemodynamic changes in hypertension and T2D through non-invasive cardiovascular modeling
PRESENTER: Kajsa Tunedal

ABSTRACT. One third of all persons worldwide have high blood pressure (hypertension), and it is twice as common in patients with type 2 diabetes (T2D). Uncontrolled hypertension is a risk factor for diseases such as stroke, heart failure, and renal failure. The connection between these diseases, T2D and hypertension can be understood through the hemodynamic mechanisms that describe the complex changes in the regulation of blood flow and blood pressure. Detailed hemodynamic data can be acquired with non-invasive measurements such as 3D magnetic resonance imaging of blood flow over time called four-dimensional magnetic resonance imaging (4D Flow MRI). However, 4D Flow MRI cannot directly measure hemodynamic parameters such as stiffness and blood pressure in the heart and aorta. To acquire these parameters together with other information that otherwise is hard to measure non-invasively, we herein combine a cardiovascular model with 4D flow data. The aim is to investigate hemodynamic differences between controls, T2D patients, hypertensive patients, and patients with both T2D and hypertension, to further elucidate the mechanisms of hypertension and T2D.

For 80 subjects from the SCAPIS Linköping cohort in Sweden, we used patient-specific data from MRI and cuff pressure to create personalized models of the individual hemodynamics. The 80 personalized models were used to group and compare hemodynamic parameters between controls, patients with T2D, patients with hypertension, and patients with both T2D and hypertension. Preliminary results show statistically significant hemodynamic changes in hypertensive and T2D patients compared to controls, as well as differences between hypertensive and T2D patients. Additionally, a large patient-to-patient variation can be seen within each group, showing the importance of a patient-specific approach like our personalized models in the treatment of these patients. These new insights and personalized models could, together with further studies, aid in the treatment planning of patients with both diabetes and hypertension.

12:04
Dynamic analysis framework to detect cell division and cell death in live-cell imaging, using signal processing and machine learning
PRESENTER: Asma Chalabi

ABSTRACT. The detection of cell division and cell death events in live-cell assays has the potential to produce robust metrics of drug pharmacodynamics and return a more comprehensive understanding of tumor cell responses to cancer therapeutic combinations. As cancer drugs may have complex and mixed effects on the biology of the cell, knowing precisely when cellular events occur in a live-cell experiment allows one to study the relative contribution of different drug effects, such as cytotoxic or cytostatic, on a cell population. Yet, classical methods require dyes to measure cell viability as an end-point assay, where the proliferation rates can only be estimated when both viable and dead cells are labeled simultaneously, not to mention that the actual cell division events are often discarded due to analytical limitations.

Live-cell imaging is a promising cell-based assay to determine drug efficacies; however, its main limitation remains the accuracy and depth of the analyses needed to acquire automatic measures of the cellular response phenotype, which makes understanding drug action on cell populations difficult. In this work, we present a new algorithmic architecture integrating machine learning, image and signal processing methods to perform dynamic image analyses of single cell events in time-lapse microscopy experiments of drug pharmacological profiling. Our event detection method is based on a pattern detection approach on the polarized light entropy, which makes it free of any labeling step and exhibits two distinct patterns for cell division and death events. Our analysis framework is an open source and adaptable workflow that automatically predicts cellular events (and their times) from each single cell trajectory, along with other classic cellular features of cell image analyses, as a promising solution in pharmacodynamics.

12:07
Combining single-cell imaging techniques to probe surface-adhesion-dependent morphological features of disease state
PRESENTER: Matthew Lux

ABSTRACT. Single-cell imaging offers a powerful way to probe myriad aspects of cellular behavior. Different techniques offer advantages and disadvantages with respect to ease of analysis, throughput, capturing of single-cell dynamics, and ability to capture images in situ rather than in suspension. Here we compare two such techniques, imaging flow cytometry and confocal microscopy. To do so, we develop a pipeline to segment individual cells from microscopy images and produce sets of individual images analogous to imaging flow cytometry. The approach allows quantitative analysis of morphological features of cells while adhered to surfaces and interacting with neighboring cells rather than in solution as in imaging flow cytometry. We present analysis of images of macrophage infection by Francisella tularensis to identify morphological features that vary with measurement technique.

12:10
Morphometric analysis of macrophage populations in response to animal-derived extracellular matrices

ABSTRACT. Wound healing and tissue regeneration rely on the concerted activities of many cell types, including immune cells. For example, macrophages recruited to a wound site can polarise into different subtypes with different functions. M1 polarised macrophages become more phagocytic with shorter lifespans, whereas M2 polarised cells become more contractile and longer-lived. The balance between M1 and M2 cells impacts how inflammation occurs and resolves. Shifting that balance can mean the difference between healthy regrowth and fibrosis.

The extracellular environment provides critical physical and chemical signals that guide immune cell behaviour. The extracellular matrix (ECM) is a complex mix of structural and signalling proteins and other molecules. The composition and function of ECMs in tissue regeneration are not fully understood. Over the last decade, decellularised ECMs (dECMs) derived from tissues have shown promise for medical use, as injectable hydrogels or as carriers for cells and therapeutics. Understanding the impact of dECMs on macrophage polarisation is therefore important for predicting their effectiveness in tissue engineering and repair.

To investigate the effects of various dECMs on macrophage polarisation, in combination with inflammatory cytokines, we used automated image analysis to morphologically profile millions of cells and compared populations in high-dimensional feature space. Each population of cells contains a mixture of phenotypes; thus, simply comparing averages may mask or skew population-level differences. First, we determined that M1/M2 polarization could be inferred from cell morphology. Next, we asked whether dECM exposure affected macrophage polarization using Maximum Mean Discrepancy (MMD). MMD is a kernel-based statistical test used to determine whether two given distributions are the same or different, without the need for dimensionality reduction. We used two-sample MMD test statistics as proxies for distance in morphological feature space between populations. Projecting these distances in 2D allowed us to situate the cell populations relative to reference populations and one another. These data, supported by quantification of known surface markers, showed that dECMs did impact macrophage polarization and responses to inflammatory cytokines. Furthermore, morphological profiling indicates further functional distinctions or classes within macrophage populations that are not captured by surface marker expression.
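
As a rough illustration of the two-sample MMD statistic mentioned above, the sketch below computes the biased (V-statistic) estimate of squared MMD with a Gaussian kernel. The kernel choice, the bandwidth, the function names and the synthetic data are all assumptions made for illustration; the abstract does not state which kernel or estimator was used.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        (a ** 2).sum(axis=1)[:, None]
        + (b ** 2).sum(axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth ** 2))

def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between samples x and y.

    x, y: (n_cells, n_features) arrays of morphological features.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy

# Synthetic stand-ins for three cell populations in a 5-feature space.
rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 5))
same_dist = rng.normal(size=(200, 5))
shifted = rng.normal(loc=3.0, size=(200, 5))
print(mmd2(reference, same_dist), mmd2(reference, shifted))
```

Pairwise values of this statistic between populations can then be embedded in 2D, for example with multidimensional scaling, to place populations relative to one another.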

10:30-12:30 Session 3: SINGLE CELL MODELING

Summary: The last two decades have seen a dramatic rise in experimental approaches to characterize behaviors at the single-cell level, and this has been accompanied by the development of a wide range of computational methods for both the analysis of single-cell measurements and the modeling of single-cell behavior. The single-cell modeling session broadly covers all areas of computational biology associated with biological behaviors at the single-cell level, including stochastic dynamics, gene regulation, spatiotemporal dynamics, and questions about how cells self-organize, receive, process and respond to information, and communicate with other cells.

Location: Alexander
10:30
Avoiding errant activation of mitophagy: dual factor authentication of the Pink1/Parkin circuit

ABSTRACT. When mitochondria accumulate significant damage, they are targeted for mitophagy (mitochondrial autophagy). A core mitochondrial damage sensing pathway is the Pink1/Parkin positive feedback circuit. An input signal from mitochondrial Pink1 permits continued self-recruitment of Parkin to the mitochondria and subsequent recruitment of mitophagy machinery. Interestingly, healthy mitochondria tolerate natural transient depolarization events without activating the Pink1/Parkin pathway.

How does this positive feedback circuit filter “noise” to avoid false activation? Using quantitative live-cell microscopy and mathematical modeling we mapped the relationship between Pink1 input levels and Parkin recruitment dynamics. We found that initiation of Parkin self-recruitment is only triggered when Pink1 input both exceeds a threshold and is persistent. This “dual factor authentication” is an emergent property of the Pink1/Parkin circuit topology that prevents errant activation of mitophagy.
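
The idea that activation requires an input that is both above a threshold and persistent can be illustrated with a deliberately simple toy model: a first-order low-pass filter followed by a threshold. This is not the Pink1/Parkin circuit model from the talk, which involves positive feedback; the function names, time constant and threshold below are hypothetical values chosen only to show the filtering behaviour.

```python
def is_activated(input_signal, dt=0.01, tau=10.0, threshold=0.5):
    """Toy persistence filter: integrate dx/dt = (u - x) / tau with Euler
    steps and report whether x ever reaches the activation threshold."""
    x = 0.0
    for u in input_signal:
        x += dt * (u - x) / tau
        if x >= threshold:
            return True
    return False

def pulse(amplitude, duration, total=60.0, dt=0.01):
    """Square input pulse of the given amplitude and duration."""
    n_on = int(round(duration / dt))
    n_total = int(round(total / dt))
    return [amplitude] * n_on + [0.0] * (n_total - n_on)

# Strong but brief input (like a transient depolarization): filtered out.
print(is_activated(pulse(1.0, 2.0)))
# Strong and persistent input: activation.
print(is_activated(pulse(1.0, 30.0)))
# Persistent but weak input: never reaches the threshold.
print(is_activated(pulse(0.4, 60.0)))
```

In this toy, the time constant `tau` sets how long the input must persist and `threshold` sets how strong it must be, so both conditions have to hold before activation occurs.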

10:50
How do single bacterial cells make decisions?

ABSTRACT. The operation of genome-wide gene regulatory networks in bacteria presents us with an apparent puzzle. On the one hand, bacteria manage to successfully coordinate their global gene expression patterns to allow them to grow in a huge variety of environments, including complex combinations of nutrients and stresses that natural selection cannot possibly have specifically prepared them for. On the other hand, the more we study gene regulation in bacteria at the single cell level, the more noisy and haphazard it appears. Moreover, given the low molecule numbers involved, there are severe thermodynamic limitations on the accuracy of both sensing and regulation of gene expression in single bacterial cells, which seems to preclude the robust adaptation observed at the population level.

In this talk, I will present a number of recent observations that our lab has made through joint experimental and theoretical study of single-cell gene regulation in E. coli, including

1. That gene expression fluctuations are largely driven by propagation of noise through the gene regulatory network,
2. That, through the effects of dilution, growth rate controls the sensitivity of gene regulatory circuits to fluctuations,
3. That the phenotypic stability of cells systematically increases with growth rate,

and will discuss how these observations combine to paint an overall picture of the way bacteria can successfully adapt to complex unpredictable environments, in spite of highly noisy and inaccurate regulation at the single cell level.

11:10
Dynamic fluctuations in a bacterial metabolic network
PRESENTER: Manika Kargeti

ABSTRACT. Metabolism converts nutrients to energy and biomolecules, in order to sustain all cellular processes. Although the operation of the central metabolism is perceived as deterministic, the dynamics and high connectivity of the metabolic network make it prone to generating fluctuations. Nevertheless, identification and characterization of such fluctuations through time-resolved metabolite measurements in small bacterial cells has remained a challenge. Here we use single-cell metabolite measurements based on Förster resonance energy transfer (FRET), combined with computer simulations, to explore the real-time dynamics of the metabolic network of Escherichia coli. We observe that step-like exposure to glycolytic carbon sources elicits large periodic fluctuations in the intracellular concentration of pyruvate in individual E. coli cells. We then sought the source of these fluctuations by using deletion strains of numerous enzymes in central carbon metabolism and by using many glycolytic and nonglycolytic sugars as the sole carbon source. These experiments suggested that a combination of biochemical reactions is responsible for these fluctuations, with reactions around the pyruvate node being especially crucial. Also, these fluctuations occur on a timescale of several minutes, which is consistent with the predicted oscillatory dynamics of the metabolic network. These fluctuations apparently propagate to other cellular processes, thus affecting multiple aspects of bacterial physiology and leading to post-translational heterogeneity of cellular states within a population. Normally, fluctuations are considered a hindrance to performance, and many bacterial networks have regulatory systems in place to avoid these fluctuations (also known as noise). But recent studies have demonstrated that these fluctuations and the resultant metabolic heterogeneity could be responsible for the survival of bacterial populations, especially ones exposed to highly dynamic environments.
Therefore, it might be beneficial to study metabolic variability in order to decipher underlying metabolic regulation and interactions of different cellular processes.

11:28
Optimal prediction of a noisy signal
PRESENTER: Jenny Poulton

ABSTRACT. Cells exist in dynamic environments. Among other stressors, they must be able to survive changing temperatures, pH, and concentrations of both useful and harmful chemicals. To increase their chance of survival, cells can measure properties of the environment and mount a response accordingly. Since mounting a response takes time, a cell must anticipate environmental changes.

A cell's ability to predict the future is limited by the predictability of the signals available to it. Indeed, only if past and present values of a signal are predictive of future values will the cell choose to store some features of the signal. Additionally, the system has limited resources for prediction; thus, it should choose to compress the signal, extracting a signal's maximally predictive features. The information bottleneck method (IBM) provides a model-free framework for calculating the optimal compression of a given signal with a given amount of resources and extracts the optimal compression kernel. This kernel tells us which signal features are maximally informative about the future.

The data processing inequality tells us that \(I(Y;T)\le I(Y;X)\), i.e., that a cell's ability to predict the future value of a signal is limited by the mutual information between said future and the trajectory of that signal in the past. So what happens if the signal is corrupted by noise? In this case, the amount of predictive information in the signal \(I(Y;X)\) is reduced, leading to a worse prediction of the future value of the signal. Does it also change the kernel's shape, i.e., which signal features are maximally informative?

In this talk, I will discuss an extension to the information bottleneck method for Gaussian variables, which allows us to find the optimal kernels for compressing a signal corrupted by Gaussian noise. This signaling noise is in addition to the compression noise considered in the original method. The specific consideration of this signaling noise allows us to make a clear link between the information bottleneck method and the Wiener filter, the optimal filter for eliminating noise from a signal.

We will show that the kernel found by this method bears a striking resemblance to the bacterial chemotaxis kernel, suggesting that noise averaging is a critical factor for this system.
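The loss of predictive information under signaling noise can be made concrete in the simplest jointly Gaussian case. The sketch below is not the extended information bottleneck method from the talk. It assumes a scalar past signal X with variance `signal_var`, a future value Y with correlation `rho` to X, and independent additive Gaussian noise with variance `noise_var`; all names are hypothetical. For jointly Gaussian scalars the mutual information is -0.5 ln(1 - rho^2), and additive noise shrinks the correlation by the factor sqrt(signal_var / (signal_var + noise_var)).

```python
import math


def gaussian_mutual_information(rho):
    """Mutual information (in nats) between two jointly Gaussian scalars
    with correlation coefficient `rho`."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return -0.5 * math.log(1.0 - rho ** 2)


def noisy_correlation(rho, signal_var, noise_var):
    """Correlation between Y and the corrupted signal X + noise, assuming
    the noise is independent of both X and Y."""
    return rho * math.sqrt(signal_var / (signal_var + noise_var))


if __name__ == "__main__":
    rho = 0.8
    clean = gaussian_mutual_information(rho)
    noisy = gaussian_mutual_information(noisy_correlation(rho, 1.0, 1.0))
    print(f"I(Y;X) without noise: {clean:.4f} nats")
    print(f"I(Y;X) with noise:    {noisy:.4f} nats")
```

With any positive noise variance the second value is strictly smaller than the first, which is the reduction in predictive information described in the abstract.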

11:40
Non-monotonic Na+ dynamics in the malaria parasite Plasmodium falciparum
PRESENTER: Jorin Diemer

ABSTRACT. Malaria is currently responsible for more than 200 million estimated cases and half a million deaths annually, with the majority of cases and deaths attributable to Plasmodium falciparum, one of six strains of malaria parasite able to infect humans (World Health Organization 2021). The P. falciparum parasite has developed varying degrees of resistance against most, if not all, of the antimalarial drugs currently available, and there is an ongoing need to develop new antimalarial agents. Two compounds that are currently in clinical trials against malaria target an ‘ion pump’ on the surface membrane of the malaria parasite (Tse et al. 2019). Ion regulation in the P. falciparum parasite has been the subject of extensive studies over recent decades. This research has led to a general understanding of how the parasite regulates its internal ionic composition (Kirk 2015). However, there has not yet been any attempt to integrate these findings into a quantitative model. We developed a mathematical model for ion regulation in the asexual intra-erythrocytic blood-stage of the P. falciparum parasite. With only a single parameter set, the model is able to explain at least 29 different datasets, with a subset of those showing transient dynamics of the Na+ concentration in the parasite cytosol. A proposed inhibition of a Na+-extruding P-type ATPase by cytosolic protons was shown to be responsible for the transient changes induced in the Na+ concentration, demonstrating the importance of mathematical models for unraveling unintuitive observations. The model was further used to show additive effects of two ion pump inhibitors that are currently in clinical trials, indicating that it might be beneficial to combine both drug candidates. We anticipate the model to be a valuable framework and helpful tool to study various aspects of the ion regulation of Plasmodium falciparum.

Bibliography
Kirk, Kiaran (2015): Ion Regulation in the Malaria Parasite. In Annual Review of Microbiology 69, pp. 341–359. DOI: 10.1146/annurev-micro-091014-104506.
Tse, Edwin G.; Korsik, Marat; Todd, Matthew H. (2019): The past, present and future of anti-malarial medicines. In Malar J 18 (1), p. 93. DOI: 10.1186/s12936-019-2724-z.
World Health Organization (2021): World malaria report 2021. World Health Organization.

11:52
Accounting for low transcript sampling rates in scRNA-seq using the Good-Turing estimator

ABSTRACT. Due to multiple inefficiencies during the various scRNA-seq experimental protocols and the limited sequencing depth per experiment, we only observe reads for a small fraction of the mRNAs contained within a cell. Additionally, the number of reads per cell varies by up to an order of magnitude. To account for this variation, cells are usually normalized and therefore represented by their relative gene expression vectors rather than the absolute read counts per gene. However, since the number of unique reads per cell is usually smaller than even the expected number of expressed genes in a cell, we cannot hope to accurately represent the true relative gene expression vector. The Good-Turing estimator improves the estimation of the relative gene expression vector by accounting for the probability that a next hypothetical read would be of a previously unobserved gene. In simulated datasets, we show that this improves the estimation of the relative gene expression vectors of cells, which, in turn, improves the estimates of cell-cell distances and reduces the inverse correlation between cell-cell distance and the number of reads in the cell. We further show that this enhances our ability to distinguish between truly different cells and those whose relative gene expression estimates only look different due to sampling noise. Good-Turing estimates have the potential to improve many downstream analysis steps based on cell-cell distances, such as clustering, visualization and trajectory inference.
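The core idea of reserving probability for unobserved genes can be sketched with the simplest form of the Good-Turing estimate, in which the probability that the next read comes from a previously unseen gene is estimated as the number of genes seen exactly once divided by the total number of reads. This is a simplified illustration under that assumption, not necessarily the procedure used by the authors; the function names and the toy gene labels are hypothetical.

```python
def good_turing_missing_mass(counts):
    """Estimate the probability that the next read belongs to a gene not
    yet observed in this cell: (genes with exactly one read) / (total reads)."""
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("cell has no reads")
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total_reads


def discounted_relative_expression(counts):
    """Scale the naive relative expression of observed genes by
    (1 - missing mass), so the remainder is reserved for unobserved genes.

    Returns (relative expression per observed gene, missing mass).
    """
    total_reads = sum(counts.values())
    missing = good_turing_missing_mass(counts)
    expression = {
        gene: (1.0 - missing) * c / total_reads
        for gene, c in counts.items()
        if c > 0
    }
    return expression, missing


if __name__ == "__main__":
    cell = {"geneA": 5, "geneB": 3, "geneC": 1, "geneD": 1}
    expression, missing = discounted_relative_expression(cell)
    print(missing)
    print(expression)
```

Note the edge case in this simplified form: if every observed gene has exactly one read, all probability is assigned to unobserved genes, which is one reason fuller Good-Turing variants smooth the count-of-counts before use.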

12:04
Mitotic Memory as Spontaneous Symmetry Breaking in the Cell
PRESENTER: Arran Hodgkinson

ABSTRACT. During development, lineage-committed cells undergo numerous cell divisions. Mitosis represents a challenge to the inheritance of transcriptional states. During mitosis, chromosomes become condensed, packed with nucleosomes and Pol II, whilst most transcription factors are expelled from this particularly hostile chromatin landscape. How a cell maintains transcriptional fidelity across cell divisions is a fundamental question in biology, in healthy organisms as well as for relentlessly dividing cancerous cells. It is now clear that not all traces of transcriptional activation or repression are erased during mitosis (Festuccia et al., Development 2017). Live imaging of transcription dynamics reveals the extent to which transcriptional status is inherited between cell generations, a phenomenon called mitotic memory. Using this technique in Drosophila embryos, our team has recently visualized transcriptional memory for the first time in a multicellular organism (Ferraro et al., Curr Biol 2016). When a mother nucleus is transcriptionally active, its descendants have a higher probability of activating transcription in the following cycle, compared to descendants of inactive mothers. With this tool in hand, we now seek to employ a mathematical model of mitotic memory to formulate hypotheses on the potential supports and timescales of this memory. Indeed, the support of this mitotic memory could involve mitotic retention of transcription factors or epigenetic modification on histone tails (bookmarking). Because too long a memory would prevent activation of new genes or shutdown of old ones when this is needed, mitotic memory should be short-term and the processes involved should be dynamic (Bellec, Radulescu, Lagha, Curr Opinion Sys Biol 2018). Of course, this does not preclude other forms of cellular memory that are long-term.
Previous mathematical models developed by our team (Dufourt et al., Nat Com 2018) were based on Markov chains and described transcriptional activation by a small number of discrete limiting transitions. Here we derive a mitotic memory model from very general principles using statistical field theory. In our model, the cell's transcriptional state is represented as one of the attractors of a multistationary Phi4 model. Contrary to our previous Markov chain model, which does not cope with mitotic events, the new model is the first to describe memory transmission across multiple mitoses. In this model, we interpret mitosis using the general concept of symmetry breaking. Seen as such, our model provides the normal form for a whole class of mitotic memory models. We validate our model by using single nuclei data recording the inheritance of transcriptional states down cell lineages, in vivo. For this we employ the MS2/MCP technique and live imaging of developing early Drosophila embryos (Ferraro et al., 2016 Curr Biol; Dufourt et al., Nat Com 2018; Bellec et al., Nat Com 2022).

12:16
Scalable and flexible inference framework for stochastic dynamic single-cell models

ABSTRACT. How a cell population dynamically responds to a stimulus, such as a drug or a nutrient shift, can today be studied by methods like single-cell microscopy. Ideally, dynamic data can shed light on both the cellular mechanisms behind a stimulus response and the mechanisms giving rise to cell-to-cell variability within a population. An efficient way to test if a hypothesis can rationalise gathered data is mechanistic mathematical modelling. However, mechanistic models typically have several unknown parameters (e.g., protein translation rates). To be able to test if a hypothesis/model is valid, unknown parameters must first be estimated by fitting the model to available data. Parameter estimation for a single-cell mechanistic model is challenging, though, since these models often must account for cell heterogeneity caused by both intrinsic (e.g., variations in chemical reactions) and extrinsic (e.g., variability in protein concentrations) noise. Although several parameter estimation methods exist, an efficient, general and flexible method is lacking, which limits the usage of single-cell mechanistic models. To alleviate this, we have developed a scalable and flexible framework for Bayesian estimation for state-space mixed-effects stochastic dynamic single-cell models. Our approach can infer model parameters when intrinsic noise is modelled by either exact or approximate stochastic simulators, and when extrinsic noise is modelled by either time-varying or time-constant parameters that vary between cells. We demonstrate our approach by studying how cell-to-cell variation in carbon source utilisation affects heterogeneity in the budding yeast Saccharomyces cerevisiae SNF1 nutrient sensing pathway. We identify hexokinase activity as a source of extrinsic noise and deduce that sugar availability dictates cell-to-cell variability.
Besides single-cell studies, our method can be used in several processes where both inter- and intraindividual variability matters, such as in ecology and in pharmacokinetic/pharmacodynamic studies.

12:21
The transcriptome dynamics of single cells during the cell cycle
PRESENTER: Daniel Schwabe

ABSTRACT. The cell cycle is among the most basic phenomena in biology. Despite advances in single-cell analysis, dynamics and topology of the cell cycle in high-dimensional gene expression space remain largely unknown. We developed a linear analysis of transcriptome data which reveals that cells move along a planar circular trajectory in transcriptome space during the cycle. Non-cycling gene expression adds a third dimension causing helical motion on a cylinder. We find in immortalized cell lines that cell cycle transcriptome dynamics occur largely independently from other cellular processes. We offer a simple method (“Revelio”) to order unsynchronized cells in time. Precise removal of cell cycle effects from the data becomes a straightforward operation. The shape of the trajectory implies that each gene is upregulated only once during the cycle, and only two dynamic components represented by groups of genes drive transcriptome dynamics. It indicates that the cell cycle has evolved to minimize changes of transcriptional activity and the related regulatory effort. This design principle of the cell cycle may be of relevance to many other cellular differentiation processes.
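Once cells lie on a planar circular trajectory, ordering them in time reduces to sorting by angle. The sketch below shows only that final step on toy two-dimensional coordinates; it is not the Revelio method itself, which also has to find the relevant plane. The function names, the choice of center, and the convention that time runs counterclockwise from the positive x-axis are all assumptions made for illustration.

```python
import math


def cycle_phase(x, y, center=(0.0, 0.0)):
    """Angle in [0, 2*pi) of a cell's position around the cycle center."""
    return math.atan2(y - center[1], x - center[0]) % (2.0 * math.pi)


def order_cells_by_phase(cells, center=(0.0, 0.0)):
    """Return cell ids sorted by phase.

    `cells` maps a cell id to its (x, y) coordinates in the plane that
    contains the circular trajectory.
    """
    return sorted(
        cells,
        key=lambda cell_id: cycle_phase(*cells[cell_id], center=center),
    )


if __name__ == "__main__":
    cells = {"a": (0.0, 1.0), "b": (1.0, 0.0), "c": (-1.0, 0.0), "d": (0.0, -1.0)}
    print(order_cells_by_phase(cells))
```

The direction of rotation and the zero point of the phase cannot be determined from geometry alone and would need to be fixed using known marker genes.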

12:24
Theory of biochemical information processing with transients
PRESENTER: Manish Yadav

ABSTRACT. Cells in tissues and organisms operate in dynamic environments, continuously sensing and responding to time-varying chemical signals. In order to accurately interpret the complex information from their environment, biochemical networks in single cells actively process these extracellular signals in real-time. The current concept of biochemical computations places a strong focus on attractor-based information processing in cells. Recent studies, however, have shown that cells generate completely opposite phenotypic responses depending upon the frequency of the growth factor, independent of growth factor identity. This breaks down the steady-state description of biochemical information processing. Therefore, we propose to describe biochemical networks embedded in non-stationary environments as non-autonomous systems whose solutions are the dynamic input-dependent trajectories. We show that memory arising through metastable states on the level of the input layer of the biochemical network will enable the system to integrate time-varying signals such that inputs resulting in different phenotypic responses will be uniquely encoded in phase-space trajectories. The extracellular information of different phenotypes is spread throughout the large signaling networks and represented by characteristically different classes of phase-space trajectories. This encoded information will further be decoded downstream by early response genes (ERG) in real-time, where we show that the feed-forward structure of ERG is sufficient for this task.

12:25
Quantification of bacterial resource allocation in changing environments on the single-cell level
PRESENTER: Antrea Pavlou

ABSTRACT. The ability of microbes to colonize the most improbable places can be partly attributed to the efficient coordination between growth and metabolism. Over the last 50 years, the relationship between growth and the environment has been intensely studied, and has led to general empirical relationships or ‘growth laws’. In most studies, however, bacteria are maintained at steady-state growth even though such conditions are rarely found in a natural environment. To investigate bacterial adaptation in changing environments, we have tracked growth and gene expression of single cells of Escherichia coli bacteria growing in a microfluidics device in changing environments. We have examined the behavior of key ribosomal and metabolic genes using fluorescent protein tags. Using inference algorithms, along with models accounting for the maturation kinetics of reporters, we were able to derive dynamic resource allocation profiles of each protein of interest from the time-lapse measurements [PhD thesis, Pavlou et al]. The experimental results provide a detailed view of resource allocation strategies of individual bacteria in dynamically changing environments. Even though the average behavior of the bacteria precisely matches known growth laws during steady-state, resource allocation deviates from the classical growth laws during growth transitions. Furthermore, we identified considerable heterogeneity between bacteria that manifests itself in different strategies for adapting to a new environment. Our results reveal new principles of dynamical resource allocation and could be helpful in improving biotechnological processes involving microorganisms.

12:26
Inference and dynamical simulation of the gene-regulatory network in single cells
PRESENTER: Patrick Stumpf

ABSTRACT. During development the gene regulatory network within the cell achieves an extraordinary feat: the robust specification of all somatic cell types in the body from a single fertilized egg through differential expression of a constant set of genes. However, gene mutations that cause loss or gain of function may alter the activity of the GRN and affect the outcome of cell differentiation. A prominent example for this is found in blood cancer, where gene mutations lead to an increased proliferation of stem cells and decreased production of mature blood cells. The consequence is an insufficient oxygen supply to the body or an incapacitated immune system, both of which can be lethal. To better understand how gene mutations may lead to the altered production of cell types, we need to consider them in the context of the gene regulatory network and find a suitable computer model to predict cell fate. Since the gene regulatory network is largely unknown, network inference is typically used to predict gene interactions directly from data. For this purpose, large single-cell expression data obtained from RNA sequencing have proven useful, although the destructive nature of such measurements prevents the recording of time-resolved expression changes from individual cells. Recently, methods have been developed to extract temporal information from individual cells via RNA splicing dynamics. These methods have been used to visualize cell fate within a population as a phase plane. Here, using a simple machine learning model, we learn to encode the gene regulatory network from RNA splicing dynamics, enabling the simulation of cell trajectories for individual cells. The model enables extraction and manipulation of a gene regulatory network and the simulation of diseased cell trajectories due to gene mutation. 
We demonstrate that the model can learn the correct regulatory logic from synthetic single-cell gene expression data and recapitulate the appropriate dynamics of gene expression. We further apply this model to gene expression data obtained from bone marrow to understand myeloid blood cell differentiation in cancer. Our modelling strategy can be universally applied to any single-cell data to elucidate cell differentiation in health and disease. We anticipate that this approach will lead to new insights on cell differentiation dynamics for a large range of monogenic diseases.

12:27
Virtual Cell Modeling and Simulation Software
PRESENTER: Michael Blinov

ABSTRACT. Virtual Cell (VCell, http://vcell.org) is free open-source software for modeling cell biological systems. Its biology-based interface is designed for researchers with no programming/scripting experience: they enter reactions and reaction rules in an intuitive graphical way, and VCell automatically creates the math. Models and simulations can be accessed from anywhere using the VCell database; models can be shared among collaborators or made publicly available. Simulations can be run locally (on a user’s computer) or using our server: a job can be sent to a server and results can be retrieved any time later. VCell provides a variety of simulation frameworks. The same reaction network can be simulated using deterministic (compartmental ODE) and/or stochastic (SSA solvers) simulators. By adding a geometry for compartments and diffusion for species, the same reaction network can be simulated as a reaction-diffusion-advection PDE with support for 2D kinematics and/or using spatial stochastic simulations (reaction-diffusion with Smoldyn, http://smoldyn.org). Geometries from 2D or 3D microscope images or from idealized analytical expressions can be used, and membrane flux, lateral membrane diffusion and electrophysiology can be incorporated. If elements of the model are defined by reaction rules (using a novel graphical user interface for defining, visualizing and verifying rule-based models), the model can be simulated using the BioNetGen (http://bionetgen.org) engine and/or network-free agent-based simulations. VCell models can be exchanged with other tools using standard formats such as SBML and BNGL, as well as exported as MATLAB, SED-ML and COMBINE archives. The latest developments include better support of SBML features, the possibility to fully annotate models, view models online, interact with ImageJ, and send simulations to the runBiosimulations (https://run.biosimulations.org) service for online simulations and visualization with modified parameters.

12:28
Cells use molecular working memory to navigate in changing chemoattractant fields
PRESENTER: Robert Lott

ABSTRACT. Cellular migration is guided by local gradients of chemoattractants, which in the complex environments of tissues and organisms change over time and space. However, single cells are able to resolve the conflicting local information and generate persistent directional migration over large distances. We have identified a molecular mechanism relying on a metastable signaling state that enables cells to maintain transient polarisation of the signaling activity and shape in the direction of the last encountered signal, while remaining responsive to changes in signal localisation. We provide experimental evidence from live-cell imaging of epidermal growth factor receptor phosphorylation that this transient memory arises from a remnant of the polarized signaling state, a dynamical ‘ghost’, and further drives memory-guided directional migration. We thus identified a basic mechanism that underlies cellular polarization and navigation in changing chemoattractant fields.

12:29
Deciphering transcriptional drivers of stem cell differentiation using causal inference
PRESENTER: Martin Proks

ABSTRACT. During differentiation, progenitor cells undergo numerous transcriptional and epigenetic changes while also being subjected to mechanical forces. In this project, we focus on identifying causal transcriptional drivers and their interactions by predicting gene regulatory networks (GRNs). Current GRN inference algorithms suffer from numerous limitations such as linear assumptions, correlation- or entropy-based calculations, missing directionality, the requirement of prior knowledge or a multimodal setup, and “black box” models (neural networks). To overcome these limitations, we use Convergent Cross Mapping (CCM) from dynamical systems. As a basis we exploit single-cell RNA sequencing (scRNA-seq), which allows us to track individual gene expression throughout each cell. With downstream analysis of scRNA-seq data one can infer directionality (trajectory inference), approximate time of differentiation (pseudo-/latent-time) as well as transcriptional, splicing and degradation kinetics (RNA velocity). We utilize these methods by generating time series of gene expression along pseudo-time and estimating causal directional interactions with CCM. Compared to current GRN algorithms, our tool reports directionality, effect (induction/repression) and its temporal kinetics. Finally, it uses public ChIP-seq atlas data to benchmark predicted interactions. Despite high computational demand, we showcase that CCM can infer main drivers as well as new interactions in mouse pre-implantation development.

12:30-13:30 Session L1: LUNCHEON I: Advancements in single cell multiomic profiling of antigen-specific T cells with dCODE Dextramer® (RiO) on the BD Rhapsody™ Single-Cell Analysis System.

Summary: The advancement of immunotherapies and drug development relies on the characterization of antigen-specific T cells. However, in many cases, detection of these cells has been difficult when their frequencies among T cells are low and their affinity for MHC-peptide complexes is weak. Furthermore, pairing this information with the corresponding variable (V), diversity (D), and joining (J) [V(D)J] sequences of the antigen-specific T cell receptors has also been challenging due to hurdles in sequencing these large regions, particularly at the single-cell level. Here, we demonstrate the combination of two powerful technologies, Immudex® dCODE Dextramer® (RiO) Reagents and the BD Rhapsody™ Single-Cell Analysis System, to detect and characterize low-frequency antigen-specific T cells, including the full sequences of the V(D)J gene segments of the T cell receptors, as well as to profile transcriptome and protein expression.

Presenters: 

Soudabeh Rad Pour, PhD: Field Applications Specialist MultiOmics Solutions

Laurence Monnet, PhD: Senior Single Cell Solution Field Applications Scientist

Location: Alexander
12:30-13:30 Session L2: LUNCHEON II: Tutorial: Modelling with VCell

Summary: VCell (https://vcell.org/) is a comprehensive platform for modeling cell biological systems that is built on a central database and disseminated as a web application. VCell has been used in dozens of published research projects (https://vcell.org/vcell-published-models) and for teaching at numerous universities. The software is free, with automatic installers for Windows, Mac OS and Linux. It provides a biology-based interface for inexperienced modelers and a math interface for math-inclined researchers. The VCell platform can run on the user’s computer or on the provided remote servers, making it possible to run complex simulations from any low-cost laptop. Models can be accessed from anywhere using the VCell database and can be shared among collaborators or made publicly available.

Location: Grenander I+II
13:30-15:30 Session 4: CLIMATE, BIOTOPES & MICROBIOMES

Summary: This session will cover biological discoveries made in a variety of biotopes - from Oceans to the Human Gut. We will be discussing the major challenges involved in systematic sampling of biotopes and also what we can learn from doing so. We also aim to put this in the context of a changing climate and how potentially systems biology can help deal with some of these challenges.

Location: Grenander I+II
13:30
An integrative understanding of the large metabolic shifts induced by antibiotics in critical illness

ABSTRACT. Antibiotics are commonly used in the intensive care unit; however, while the resulting expansion of multi-resistant bacteria is a widely studied problem, the negative impact on the microbiome of critically ill patients has largely been ignored. Here, we performed shotgun metagenomics, ITS2 sequencing and metabolomics to characterize the compositional and metabolic changes of the gut microbiome induced by critical illness and antibiotics in a cohort of 75 individuals together with 2,180 taxonomically and functionally annotated gut microbiome samples from 16 different diseases. We provide evidence that even though antibiotics do not significantly disturb the bacterial and fungal composition of critically ill patients, as observed in healthy individuals, they cause abundance changes in a small number of species that are highly connected with the production of short-chain fatty acids and bile acids, allowing the expansion of pathogenic species. These results suggest that antibiotic treatment induces changes in the essential functional activities in the gut related to immune responses, which may have a dramatic impact on the already “infection-vulnerable” microbiome structure of critically ill patients, and would, in part, explain the unwanted effects of antibiotics in the critically ill.

13:40
Agent-based modeling of spatiotemporal organization of bacteria in the human gut

ABSTRACT. The intestinal microbiome is important for the innate and adaptive immune system. Microorganisms contribute to the maintenance of the gut mucosal barrier and the protection against pathogens. Perturbations of the gut microbiota caused by antibiotics or inflammation can lead to dysbiosis and bacterial overgrowth. Bacterial imbalance is associated with several diseases, among them chronic liver diseases and bloodstream infections. In clinical practice, fecal samples are the most commonly used means of measuring bacterial abundances in patients. However, the bacterial composition in the rectum cannot capture the specific gut biogeography. Hence, it is important to understand bacterial communities along the tract and from lumen to mucosa. We present an agent-based model of the human proximal colon that simulates the spatiotemporal organization of bacteria. The model simulated the bacteria in balance and the bacteria during exposure to antibiotics and dietary changes. We implemented bacteria as autonomous entities with individual properties and actions. The bacteria can interact with other bacteria and with the environment, the human gut. The choice of the parameters of the model was based on microbial experiments and a mathematical model of bacterial growth by Cremer et al. [1]. We aimed to represent the proximal colon as accurately as possible. Thus, we considered the peristalsis of the large intestine. The intestinal motility affected the bacterial movement along the gut. The behavior and interactions of bacteria led to emergent spatiotemporal phenomena in the lumen and near the mucosa. Our model enabled the simulation of bacterial composition and metabolites along and transversely across the proximal colon. The spatiotemporal phenomena have to be validated by experimental methods, for example, by using immunofluorescence images of gut cross-sections. By adding more bacterial families and their interactions, we could improve the accuracy of the model.

[1] Cremer J, Arnoldini M, Hwa T. Effect of water flow and chemical environment on microbiota growth and composition in the human colon. Proceedings of the National Academy of Sciences. 2017;114(25):6438–6443.

13:50
Microbial compound bioactivity descriptors enable large-scale prediction of small molecule impact on microbiota species and pathogens
PRESENTER: Nils Kurzawa

ABSTRACT. Chemical descriptors are the workhorse of chemoinformatics and have enabled increasingly accurate prediction of a wide range of compound properties. Recently, our group has introduced the Chemical Checker which extends the principle of numerical descriptors of chemical structures to various layers of compound bioactivity from targets to clinical parameters. Here, we present an extension of this concept to microbial compound bioactivities capturing the effects of compounds on microbes populating the human body and metabolomic responses of Escherichia coli upon drug treatment. Moreover, we provide an approach that allows inference of these descriptors for uncharacterized compounds and we make use of it by exploring diverse compound libraries for microbial bioactivity. Additionally, we show that microbial descriptors can be used orthogonally to chemical and human bioactivity features for different machine learning tasks such as antibiotic mode of action and ESKAPE species compound sensitivity prediction.

14:00
Towards identification of new life cycle-associated regions in phage genomes using deep learning

ABSTRACT. Background: Bacteriophages are important parts of the ecosystem and natural antibacterial agents that can be applied to treat bacterial infections. Rapidly emerging antibiotic-resistant bacteria pose a threat to the efficacy of antibiotics, endangering public health worldwide. Phage therapy, known for decades but significantly under-researched, provides a promising alternative. The life cycle type that a phage follows after entering the host cell is one of the optimality criteria for the design of phage-based therapeutics. Lytic phages that replicate immediately after entering the host cell, causing lysis and resulting in a cell's death, are the most suitable candidates. Although many genes determining the life cycle are known, selecting or engineering phages with therapeutic potential remains challenging. Further, functional annotations of many known phage genomes are still incomplete. New computational methods are needed to better characterize both available and new genomes, metaviromic samples, and therapeutic phage engineering targets.

Methods: We train deep residual networks (ResNets) to predict if a phage follows the lytic or non-lytic life style directly from both full genomes and unassembled next-generation sequencing reads. To this end, we repurpose the DeePaC framework, initially developed to predict the pathogenic potential of novel species, and use labelled phage data acquired from publicly available databases. We focus on new phages belonging to either known or entirely novel phage clusters. Hence, we perform the training-test split by single phage or phage cluster, respectively. Further, we use a suite of interpretable deep learning tools to visualize the lytic potential of genomes and identify virulence-associated genes or regions-of-interest.

Results: For full genomes, the models achieve over 99% and 91% accuracy for the phage-wise and cluster-wise classification, respectively. For read pairs, accuracy approaches 99% in the phage-wise setting as well. However, we demonstrate that only the cluster-wise models provide an insight into regions of genomes potentially influencing the life cycle, stressing the importance of avoiding information leakage to ensure proper generalization. While life cycle predictions directly from reads can facilitate the analysis of complex metaviromic samples, identification of new, previously unannotated regions associated with lytic potential may open new paths towards rational phage engineering.

14:10
Computationally Modeling the Human Microbiome of the Respiratory Tract

ABSTRACT. Background: The human respiratory tract harbors numerous bacteria and viruses, as it forms an interface between the human body and the outside. Many bacteria are commensals, but some are particularly life-threatening and no longer reliably respond to antibiotic treatment. Mutual interactions between bacteria, their host, and viruses influence the risk of an infection outbreak. Altering the microbiome could lead to novel therapeutics. A deep understanding of our inhabitants and their mutual interactions remains indispensable for exploiting this potential.

Results: We assembled a unique priority list of the respiratory tract’s 148 most relevant bacteria and now systematically reconstruct genome-scale metabolic models of these. We refined existing protocols and standard operating procedures for efficient model reconstruction while ensuring the highest quality measures. We contributed to defining standard file formats for graphical representations (SBGN, SBGN-ML), model distribution (SBML), and the benchmarking tool MEMOTE for quality control and quality assurance. We implemented software for efficient simulation (SBSCL), including a web frontend (SBMLWebApp). We have already reconstructed Corynebacterium simulans, Dolosigranulum pigrum, Finegoldia magna, Moraxella catarrhalis, Pseudomonas aeruginosa, Rothia mucilaginosa, Staphylococcus epidermidis, and 114 Staphylococcus aureus strains by applying our modeling pipeline and careful manual curation. More reconstructions are in development. We refined modeling techniques for simulating virus reproduction and applied them to COVID-19 to identify potential drug targets. Our software pymCADRE reconstructs tissue-specific human cell models. We developed methods to combine multiple models to simulate mutual influences (NCMW).

Conclusion: Computational models provide a valuable basis for integrated wet-lab/dry-lab research. Their predictions guide experimental work and identify potential targets for investigating complicated mutual interactions between microbes, viruses, and their host. Validating model-derived hypotheses may enable us to reduce the growth and colonization abilities of pathogens drastically. We laid the foundation for systematic model building, quality control, open distribution, validation, and simulation. The outcomes support the rapid search for novel antimicrobial and broad-acting antiviral therapies to fight against multi-resistant pathogens or severe viral infections.

14:20
Latent variable models of fungal growth considering the data measurement process
PRESENTER: Tara Hameed

ABSTRACT. Fungal infections are common and are often treated with antifungal drugs. With the growing threat of antifungal resistance, extensive research is underway to develop novel antifungal drugs, mainly through in vitro or in vivo experiments. To evaluate and compare the efficacy of antifungal drugs, we often want to estimate the rate of fungal growth in each treatment group by fitting mathematical models of fungal growth to collected time-course fungal burden data. A popular method for collecting time-course fungal burden data is measuring optical density (OD), owing to its speed and ease of automation. However, OD data are often affected by measurement errors. Ignoring the measurement errors may bias the estimated growth rates. Employing flexible non-parametric models to overcome this issue is not appropriate, as they do not explicitly include a fungal growth rate as a model parameter, and fitting non-parametric models to the data would not provide a biologically interpretable fungal growth rate. To address this issue, we developed a latent variable model of fungal growth that incorporates the measurement process of the OD data. In this model, the measured fungal burden is treated as an observed variable distributed around a latent true fungal burden whose dynamics are described, for example, by logistic growth. We tested the model’s ability to infer a fungal growth rate from the data we collected under different treatment conditions in vitro, using a Bayesian workflow with prior predictive, fake data and posterior predictive checks. We then evaluated the model’s performance in predicting fungal growth using 5-fold cross validation. Our model outperformed baselines (random walk and a basic logistic model) in terms of estimated log predictive density, and inferred a biologically interpretable fungal growth rate. Incorporating the data measurement process in a fungal growth model improved inference and interpretability of fungal growth rates.
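The latent-variable idea in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the Gaussian error on OD readings, the closed-form logistic curve, and all names (`logistic_burden`, `simulate_od`, `log_likelihood`, `sigma`) are assumptions made for illustration. The latent burden follows logistic growth, and each OD reading is treated as that latent value plus measurement noise.

```python
import math
import random


def logistic_burden(t, r, K, n0):
    """Latent (true) fungal burden at time t under logistic growth.

    r: growth rate, K: carrying capacity, n0: initial burden (all hypothetical names).
    """
    return K / (1.0 + (K - n0) / n0 * math.exp(-r * t))


def simulate_od(times, r, K, n0, sigma, seed=0):
    """Simulated OD readings: latent burden plus Gaussian noise (assumed error model)."""
    rng = random.Random(seed)
    return [logistic_burden(t, r, K, n0) + rng.gauss(0.0, sigma) for t in times]


def log_likelihood(times, observed, r, K, n0, sigma):
    """Gaussian log-likelihood of the readings given the latent logistic curve."""
    ll = 0.0
    for t, y in zip(times, observed):
        mu = logistic_burden(t, r, K, n0)
        ll += -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2.0 * sigma ** 2)
    return ll
```

Comparing `log_likelihood` across candidate values of `r` gives a crude point estimate of the growth rate. A full Bayesian treatment, as described in the abstract, would place priors on `r`, `K`, `n0` and `sigma` and sample the posterior instead.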

14:30
The metabolic community model of Staphylococcus aureus and four abundant human nasal microbes

ABSTRACT. Background: The human nose microbiota plays a vital role in the well-being of its host. This is of particular importance because the nose is a source of many respiratory infections. Understanding how this microbial community is formed and maintained can provide powerful insights into microbiota-based therapies to prevent or treat infections. Such approaches require a deep understanding of the underlying mechanisms within the host’s microbial communities.

Results: We assembled an in silico nasal community of five abundant species: Staphylococcus aureus, the most common human nose pathogen in healthcare facilities, together with Haemophilus influenzae, Staphylococcus epidermidis, Moraxella catarrhalis, and Dolosigranulum pigrum. We applied constraint-based and optimization approaches to understand the interaction between the pathogen S. aureus and the other high-abundance species. Due to the complexity of large metabolic models, we applied our previously developed systematic workflow called “NCMW: Nasal Community Nasal Microbial Workflow” to analyze the microbe-microbe interactions based on computational metabolic modeling methods. The in silico simulations predicted that S. aureus and H. influenzae attain significant growth rates, whereas the growth rates of all other community members are close to zero. As glucose strongly affects the growth of S. aureus, this species takes up the majority of the glucose in the medium. Glucose is only shared with D. pigrum, the second major glucose-dependent species. However, this glucose deficit seems to be compensated by S. aureus through the uptake of pyruvate (it has been experimentally observed that pyruvate is consumed instead of glucose to produce lactate for balancing the reducing equivalents (NAD+)). On the other hand, M. catarrhalis and D. pigrum provided fumarate, which is the major contributor to the growth of H. influenzae, which in turn supplies large amounts of succinate to M. catarrhalis and S. aureus.

Conclusion: This study provides an example of a complex commensal interaction cycle that can be revealed by metabolic community modeling. Moreover, it can advance our understanding of how these species contribute to human health, either by producing antibacterial compounds or by inhibiting pathogenic species, and thereby inform future treatment discovery.

14:40
GuaCAMOLE: fragment GC bias-aware abundance estimation from metagenomic data increases accuracy and comparability
PRESENTER: Laurenz Holcik

ABSTRACT. The importance of the microbiome in medical research, ecology, and other fields has become apparent with the rapid advances in sequencing technologies. Instead of simply detecting the presence or absence of specific bacteria, high-throughput sequencing allows for quantitative analysis of species abundances in microbiomes. This quantification is usually based on the number of sequenced reads that stem from each species. Read counts, however, are influenced by factors other than species abundance. Genomic GC content, in particular, is a major cause of systematic errors in microbial abundance estimation. Yet, while genomic GC content has been observed to be a significant source of error in many widely used sequencing protocols, the precise nature and direction of these errors are highly protocol-dependent. This is one of the reasons why, to date, no effective and protocol-independent computational correction of these errors is available. To address this, we present GuaCAMOLE (Guanine Cytosine Aware Metagenomic Opulence Least squares Estimation), a protocol-agnostic and fast computational method to estimate relative taxon abundances from shotgun metagenomic data. The only assumption that GuaCAMOLE makes is that sequencing reads from different species with a similar GC content are similarly affected by GC bias. Based on this assumption, GuaCAMOLE transforms the problem of finding the true species abundances into a least-squares problem. For the class of least-squares optimization problems, highly efficient algorithms are readily available. We show on data from a microbial mock community that GuaCAMOLE reduces the relative error of abundance estimates by up to 50 % compared to commonly used methods like Bracken and MetaPhlAn2. Importantly, our method estimates correct abundances across a wide range of PCR-dependent and PCR-free library preparation protocols. Apart from improving single-experiment accuracy, GuaCAMOLE will thereby contribute to comparability of results in microbiome research across experimental protocols and laboratories.
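The assumption stated in this abstract (reads with similar GC content are affected similarly, regardless of species) can be turned into a toy least-squares problem. The sketch below is not the GuaCAMOLE implementation; it assumes a multiplicative model, count[i][g] = N * abundance[i] * efficiency[g] * gc_fraction[i][g], with every count positive and `gc_fraction` known from reference genomes. All names are hypothetical. Taking logarithms gives an additive two-way model, and for a complete table the least-squares taxon effects are the row means up to a shared constant.

```python
import math


def estimate_abundances(counts, gc_fraction):
    """Estimate relative abundances from per-taxon, per-GC-bin read counts.

    counts[i][g]: reads assigned to taxon i in GC bin g (assumed > 0).
    gc_fraction[i][g]: share of taxon i's genome fragments falling in GC bin g.
    Model (an assumption): log(counts / gc_fraction) = taxon effect + GC-bin effect.
    """
    log_ratio = [
        [math.log(c / w) for c, w in zip(count_row, frac_row)]
        for count_row, frac_row in zip(counts, gc_fraction)
    ]
    # For a complete table, the least-squares taxon effects equal the row means
    # up to an additive constant shared by all taxa.
    row_means = [sum(row) / len(row) for row in log_ratio]
    shift = max(row_means)  # subtract the maximum for numerical stability
    raw = [math.exp(m - shift) for m in row_means]
    total = sum(raw)
    return [value / total for value in raw]
```

Because the GC-bin effect is shared across taxa, it cancels when the taxon effects are normalised to relative abundances, which is why no protocol-specific bias curve has to be supplied in this simplified setting.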

14:50
Macroecological Laws Naturally Arise from Complex Chaotic Dynamics of Gut Microbiota

ABSTRACT. The human gut microbiota consists of hundreds of bacterial species, and bacterial composition varies significantly between individuals, in terms of time, and spatially within the gut. We and others have previously shown that the dynamics of the gut microbiota are characterized by the same quantitative scaling laws as those observed in ecology of plants and animals. Although various mathematical models exist in ecology that aim to explain the origins of individual scaling laws in animal and plant species, the nature of these laws remains an open area of research. More recent models of microbiota also stop short of reproducing all known statistical scaling laws with the correct scaling coefficients. We believe that in order to gain meaningful insights into open questions in mathematical ecology, such as the relationship between biodiversity and stability, a model that can capture all ecological scaling laws with accurate coefficients is currently necessary.

To that end, we adapted a generalized Lotka-Volterra model of gut microbial dynamics, which, with several biologically-motivated modifications, was able to accurately reproduce all considered macroecological scaling laws observed in the gut microbiota, consistently regardless of random realizations of the distribution of interactions and growth rates. This model suggests several conceptually important insights.

A. No environmental stochasticity. Environmental stochasticity is not required for maintaining the necessary temporal dynamics on short time scales. By implementing a model without external sources of noise, we demonstrate that deterministic chaos can reproduce abundance profiles with low short-term autocorrelation that have been observed in nature, without assuming that environmental fluctuations or measurement noise are the primary culprits of this behavior.

B. Spatial heterogeneity. We incorporate spatial variability into the model, which is necessary for maintaining high diversity over long time scales. Explicit modeling of spatial structure provides our analysis with an additional level of interpretability, allowing comparisons with experimental spatial microbiota measurements.

C. Carrying capacities and total load. Our model does not require predetermined fixed carrying capacities for each species. Instead, the power law distributed relationship between rank and abundance of species arises naturally as an emergent property of the dynamical system. Finally, although measurements of the ecological state of the system are performed on normalized, compositional data, our model explicitly accounts for variability in total abundances of all species.

In conclusion, we show that a deterministic gLV model with a small number of hyperparameters can generate long-term ecological dynamics that reproduce all considered ecological scaling laws with accurate coefficients, without relying on any environmental noise or species re-introduction from an external reservoir. The assumptions of the model, and subsequent analyses, give several key insights into the mathematical mechanisms required for generating realistic ecological systems.
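For readers unfamiliar with the model class, the textbook generalized Lotka-Volterra (gLV) equations, dx_i/dt = x_i (r_i + sum_j A_ij x_j), can be integrated with a few lines of code. The sketch below shows only this standard deterministic form with an explicit Euler scheme. It does not include the biologically motivated modifications or spatial structure mentioned in the abstract, and the function names and step size are illustrative assumptions.

```python
def glv_step(x, r, A, dt):
    """One explicit Euler step of dx_i/dt = x_i * (r_i + sum_j A[i][j] * x_j)."""
    n = len(x)
    new_x = []
    for i in range(n):
        per_capita = r[i] + sum(A[i][j] * x[j] for j in range(n))
        # Clamp at zero so a coarse time step cannot produce negative abundances.
        new_x.append(max(x[i] + dt * x[i] * per_capita, 0.0))
    return new_x


def simulate_glv(x0, r, A, dt=0.01, steps=5000):
    """Integrate the gLV system without any noise term and return the trajectory."""
    x = list(x0)
    trajectory = [x]
    for _ in range(steps):
        x = glv_step(x, r, A, dt)
        trajectory.append(x)
    return trajectory
```

With a single species, `r = [1.0]` and `A = [[-1.0]]`, the system reduces to logistic growth and settles at an abundance of 1. Chaotic dynamics of the kind discussed in the abstract require many species and suitable interaction matrices, which this sketch does not attempt to construct.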

15:00
Human host and pathogenic fungi: modeling and deciphering nested metabolic defense strategies

ABSTRACT. Candida albicans is a very widespread human commensal, but it is also the most common fungal pathogen of humans [1]. Candidiasis (C. albicans infection) can be very dangerous, with mortality rates of up to 75%, especially among immunocompromised patients [2,3]. The infection leads to interactions between the host and the fungal pathogen that can be described in terms of a nested defense strategy, also known as a counter-defense strategy: the initial aggressive action by the pathogen causes a reaction (defense), which causes a counter-action (counter-defense), and so on. We have described such a nested defense strategy up to the counter-counter-defense stage for the host-C. albicans interaction. Human macrophages can phagocytose fungal cells and limit the glucose concentration within the phagolysosome. As a result, a key factor for the pathogenicity of C. albicans is its ability to survive within macrophages by utilizing fatty acids via the glyoxylate shunt [4]. The macrophages react by producing itaconic acid (itaconate), an act that can be described as a counter-counter-defense. Itaconate works as an inhibitor of the glyoxylate shunt, more specifically of one of its enzymes, isocitrate lyase [3]. So far, no nested interaction of higher degree is known for C. albicans. We have hypothesized the probable 3rd degree action, namely the itaconate inhibition, based on experimental data. According to molecular docking studies, methylation could possibly be used as a counter^3-defense. We have explored the system behaviour as well as the optimal resource allocation from the pathogen’s side by constructing a model of such a nested defense strategy. Based on optimization methods, this allows us to achieve a deeper understanding not only of the C. albicans infection process, but also of nested defense strategies as a whole.

References
[1] Witchley et al.: "Candida albicans morphogenesis programs control the balance between gut commensalism and invasive infection", Cell Host Microbe. 2019 Mar 13; 25(3)
[2] Brown et al.: "Hidden killers: human fungal infections", Sci Transl Med. 2012 Dec 19; 4(165)
[3] Ewald et al.: "Trends in mathematical modeling of host–pathogen interactions", Cell Mol Life Sci. 2019 Nov 27; 77
[4] Cheah et al.: "Inhibitors of the glyoxylate cycle enzyme ICL1 in Candida albicans for potential use as antifungal agents", PLoS One. 2014 Apr 29; 9(4)

15:10
Resistant starch decreases intrahepatic triglycerides in NAFLD patients via the gut microbiome contribution to the branched-chain amino acids pool

ABSTRACT. Non-alcoholic fatty liver disease (NAFLD) is a hepatic manifestation of metabolic dysfunction for which effective interventions are lacking. To investigate the effects of resistant starch (RS) as a microbiota-directed dietary supplement for NAFLD treatment, we conducted a 4-month randomized placebo-controlled clinical trial in individuals with NAFLD, coupled with metagenomics and metabolomics analysis. Compared to the control, the RS intervention resulted in a 40.32% relative decrease of the intrahepatic triglyceride content (IHTC). Serum branched-chain amino acids (BCAAs) and gut microbial species, in particular Bacteroides stercoris, significantly correlated with IHTC and liver enzymes, and were reduced by RS in comparison to the control. Multi-omics integrative analyses revealed the interplay among gut microbiota changes, BCAA availability, and hepatic steatosis, which was supported by both in vivo and in vitro models. Thus, dietary supplementation with RS might be a strategy for managing NAFLD by altering gut microbiota composition and functionality.

15:13
A probabilistic graphical model for taxonomic profiling of viral and microbiome proteome samples
PRESENTER: Tanja Holstein

ABSTRACT. Probabilistic graphical models provide a concise representation for multivariate models whose dependence structure is given by an underlying graph. Their applications range from causal inference and path analysis to expert systems, and they have been used, for example, in clinical settings and in diverse research fields such as physics, economics and biology. Through their graph structure, graphical models provide a factorization of multivariate distributions that allows for efficient inference algorithms with clear attributions to Bayesian conditional and posterior probabilities. In proteomics, the presence of proteins is inferred from a list of peptides that were identified from mass spectra using a database search. Many proteins are homologous, meaning they share peptides, which leads to the so-called protein inference problem. The peptide-protein relationship can be represented as a bipartite graph. Using this structure, Bayesian inference and graphical models have been used successfully for probability-based protein inference. For proteomic samples of unknown taxonomic origin, the presence of certain taxa must additionally be inferred, which is complicated further because proteins share peptides not only within, but also between taxa. Previous taxonomic profiling approaches rely on strategies such as peptide-taxon match counting or the presence of taxon-unique peptides. We present PepGM, a graphical model that uses belief propagation to compute the marginal distributions of peptides and taxa. We represent the peptide-taxon relationships as a bipartite graph where nodes represent peptides and taxa, respectively. The resulting structure serves as a scaffold for a factor graph. The unknown conditional probability distribution between peptides and taxa is represented using a noisy-OR model, whose parameters are evaluated using grid search. Our model allows for computing the marginal distributions of peptides and taxa. The posterior probabilities are computed through loopy belief propagation. Using various pathogenic viral proteome samples, we show that PepGM successfully classifies viral samples with strain-level resolution using unconstrained reference databases, providing meaningful confidence estimates. The assigned confidence scores, going beyond simple heuristics, could be particularly useful in the clinical context, where therapeutic decisions depend on them. We additionally show that our approach is extensible to more complex metaproteomic samples, where multiple organisms are present.
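The noisy-OR conditional distribution mentioned in this abstract has a simple closed form, sketched below. This is the generic noisy-OR with a leak term, not PepGM's code. The parameter names `alpha` (probability that a present taxon leads to detection of the peptide) and `beta` (probability of detection with no parent taxon present) are assumptions made for illustration.

```python
def noisy_or(parent_states, alpha, beta):
    """P(peptide detected | states of its parent taxa) under a noisy-OR model.

    parent_states: iterable of booleans, True if the parent taxon is present.
    alpha: per-parent detection probability (hypothetical parameter name).
    beta: leak probability of a detection without any present parent.
    """
    n_present = sum(1 for state in parent_states if state)
    # The peptide stays undetected only if the leak and every present parent all fail.
    return 1.0 - (1.0 - beta) * (1.0 - alpha) ** n_present
```

The number of terms grows linearly with the number of parent taxa, rather than exponentially as in a full conditional probability table. This keeps factor graphs with many shared peptides tractable for belief propagation.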

15:16
Lactobacillus rhamnosus colonisation antagonizes Candida albicans by forcing metabolic adaptations that compromise pathogenicity
PRESENTER: Sascha Schaeuble

ABSTRACT. Intestinal microbiota dysbiosis can initiate overgrowth of commensal Candida species, a major predisposing factor for disseminated candidiasis. Commensal bacteria such as Lactobacillus rhamnosus can antagonize Candida albicans pathogenicity. We investigated the interplay between C. albicans, L. rhamnosus, and intestinal epithelial cells by integrating transcriptional and metabolic profiling. Untargeted metabolomics together with in silico genome-scale metabolic modelling indicated that intestinal epithelial cells foster bacterial growth metabolically, which leads to bacterial production of antivirulence compounds. In addition, bacterial growth appeared to modify the metabolic environment, including removal of C. albicans’ favoured nutrient sources. This is accompanied by transcriptional and metabolic changes in C. albicans, including altered expression of virulence-related genes. Our results indicate that intestinal colonization with bacteria can antagonize C. albicans by reshaping the metabolic environment and forcing metabolic adaptations that reduce fungal pathogenicity.

15:19
Cell Type view into tomato stem elongation in Shade avoidance response
PRESENTER: Linge Li

ABSTRACT. In nature and cultivation, plants compete with their neighbours for limited light. Many plants can outgrow their neighbours to avoid further shading; this is called the shade avoidance response (SAR). Under shade, far-red light is enriched compared to normal light conditions. We use two tomato cultivars, M82 and Moneymaker, as a model system. Our research question is: how is the developmental plasticity of tomato cellular anatomy regulated in far-red light? We compared the phenotypic response of these two tomato cultivars under low R:FR conditions (WL+FR) versus control (WL). We found that internode 1 elongated the most after 2 weeks of WL+FR. Furthermore, we quantified the features of all cell types by microscopy and found that the pith showed the most significant response. Transcriptomic analysis was performed on internode and pith cell layers. The analysis revealed enrichment of the GO category auxin. Therefore, we are currently using auxin to simulate SAR in WL, and are also looking into the pith elongation response from an evolutionary perspective.

15:22
Quantitative Impacts of Bumble Bee Ecology on Foraging Behaviour: An Economic Model
PRESENTER: Arran Hodgkinson

ABSTRACT. Bumble bees’ time of exit from the hive is a critical factor in determining foraging success, yet little is known about how ecological factors quantitatively influence this behaviour. Bumble bees also exhibit varying degrees of foraging experience, and this is thought to contribute to their foraging decisions and temporal strategy. They are also known to forage in the mornings and evenings, and may stay in the field overnight. In order to better quantify the hive-level dynamics of behaviour and their consequences for the survival of the colony, we utilise an economic partial differential equation (PDE) model to study the net energetic production of the colony. By accounting for the experience-structured population of bumble bees as well as the diurnal dynamics of plant resource availability, with respect to yield and resource stores, under competition, we look at optimal colony-level resource collection strategies. We find that, if no overnight cost is imposed, the optimal strategy for all foragers is to leave late in the evening and stay out overnight, before foraging and returning in the morning. The imposition of even small differential overnight penalties, however, leads to an optimal strategy of morning departures for inexperienced bees and overnight departures for experienced bees. The rate of departure, or how collectively bees may coordinate their departure from the hive, also has an effect on the optimal strategy, favouring earlier departures for experienced bees. Currently, experiments and data analysis are underway to discover whether such strategies are routinely exhibited in bumble bee foraging and what ecological factors could explain their manifest behaviours. This research is essential for understanding how changes in bumble bees’ environments contribute to their survival as a species, and how ecosystems can be adapted to help them thrive.

15:25
Discovery of Robust and Highly Specific Microbiome Signatures for Non-Alcoholic Fatty Liver Disease
PRESENTER: Emmanouil Nychas

ABSTRACT. Non-alcoholic fatty liver disease (NAFLD) is a metabolic disease with a global prevalence of almost 25%. The pathogenesis of NAFLD is still poorly understood; however, we know that the gut microbiome is highly associated with the development of the disease. Up to now, finding robust bacterial signatures for NAFLD has been a great challenge, mainly because the disease often co-occurs with other metabolic diseases such as type 2 diabetes, obesity, and hypertension, making it difficult to determine what is highly specific for NAFLD and what is masked by the presence of other diseases. Differences in the analytical tools used by different studies, from the sequencing and metabolomic platforms to taxonomic profiling and statistics, can lead to widely differing results that are not comparable among studies. Lastly, previous studies mainly focused on finding single species that have a significant impact on the disease, instead of taking a more community-based approach. To address the issues above, we performed a large-scale meta-analysis collecting very detailed clinical, metagenomic sequencing, and in silico metabolomic data for 1231 Chinese subjects with metabolic diseases. Samples were all processed with the same pipeline and, after regrouping based on their clinical data, were placed in the following categories: NAFLD-Overweight, NAFLD-Lean, Prediabetes, T2D-Overweight, T2D-Lean, Hypertension, Pre-hypertension, Atherosclerosis, and respective controls. In summary, we built highly specific NAFLD diagnostic models using microbial species and metabolites. Moreover, we highlighted important bacterial consortia and metabolites that are unique to and highly associated with NAFLD. Lastly, we revealed key differences and similarities between overweight and lean NAFLD.

15:26
From genome sequencing to the first draft of the genome scale metabolic model of the symbiont fungus Leucoagaricus gongylophorus LEU18496.

ABSTRACT. In this work, we report the draft of the genome-scale metabolic model of the basidiomycete fungus Leucoagaricus gongylophorus LEU18496, a symbiont of the ant Atta mexicana, using its annotated genome. The annotation was obtained from the hybrid genomic assembly, which combines the data from the sequencing platforms MiSeq Illumina and GS+ FL Roche 454. A total of 11,690 predicted genes were found by Augustus using the 48,287 contigs (N50=5241) obtained from the hybrid assembly; the tool was trained using the protein sequences from the Leucoagaricus gongylophorus AC12 genome. The obtained functional annotation showed 96.25% of sequence alignments against the proteins reported in different databases for the Basidiomycota division. Out of the total annotated proteins, EC numbers could be assigned to 3150 proteins, and 391 CAZymes, 52 FOLymes and 38 possible proteases were identified. Based on the annotation of Leucoagaricus gongylophorus LEU18496, three drafts of the metabolic model were constructed using different computational tools. The draft models included 1219 reactions, 897 metabolites and 455 genes using CarveMe; 632 reactions, 800 metabolites and 356 genes using Merlin; and 863 reactions, 1030 metabolites and 863 genes using AuReMe. Due to these differences, it is necessary to work on the refinement of the metabolic models. In addition to the reconstruction of the genome-scale metabolic model, a comparative genome analysis was carried out using four genomes of Leucoagaricus. The pangenome of these species comprises a total of 18,052 gene groups, of which 383 belong to the core genome and 17,669 to the accessory genome.

15:27
Effect of dispersal by inundation on soil bacterial communities depends on soil developmental stage
PRESENTER: Xiu Jia

ABSTRACT. Dispersal is crucial for the dynamics and assembly of bacterial communities during ecological succession. However, the relative importance of dispersal is often not directly measured. Here, a microcosm experiment was performed to directly evaluate the effect of dispersal by seawater inundation on bacterial communities from soils naturally subjected to different inundation regimes, i.e., the early and late stages of a salt marsh ecological succession located on the island of Schiermonnikoog, the Netherlands. Bacterial communities were characterized through 16S rRNA gene sequencing over a treatment period of 20 days. Our results show that bacterial communities from the two successional stages responded differently to inundation. Community structure changed systematically with time in the early-stage soil but was relatively stable in the late-stage soil. In the early-stage soil, the richness of bacterial communities significantly increased over time, mainly driven by the increase of low-abundance bacteria. The different influence of inundation on the two successional stages may be attributed to both contemporary conditions and historical contingency. Taken together, our results highlight that bacterial communities in the early-successional stage of a salt marsh are sensitive to inundation and vulnerable to accelerated sea-level rise.

15:28
In silico analysis of metabolic capabilities in a synthetic community of phycosphere bacteria
PRESENTER: Ali Navid

ABSTRACT. Microalgae play key roles in global nutrient cycles. They can also be used as biomass for production of sustainable biofuels. It is known that optimum growth and robustness of algae are critically dependent on interactions with the bacteria that reside on their surfaces and immediate surroundings (phycosphere). Unfortunately, many biophysical and biochemical aspects of these interactions are unknown. To understand the who and the how of bacteria that interact with biofuel-producing algae, we collected 18 isolates from the phycosphere of Phaeodactylum tricornutum (PT), a model biofuel-producing alga. We sequenced and annotated the genomes of these isolates using several annotation tools and then developed genome-scale metabolic models for each species. We also co-cultured several isolates with PT to examine the outcomes of their interactions. We examined the reactome of the synthetic phycosphere microbiome to find commonalities and differences between the organisms. The community reactome has more than 5000 reactions and, while many of these reactions are shared among all the organisms, we found that approximately 20% of the reactions are unique to individual species. The unique metabolic capabilities are not equally distributed among the community members, and so we hypothesize that our community contains a few metabolic specialists, and that these capabilities might explain the outcome of their one-on-one interactions with PT. One isolate from the Algoriphagus family of microbes (labeled ARW1R1) has the most unique metabolic reactions in the community. Our experiments show that for most conditions co-culturing ARW1R1 with PT has no effect on the growth of PT. But we observed a slight increase in growth of PT when it is co-cultured with ARW1R1 in an environment that is nitrate rich but light limited. We analyzed the metabolism of ARW1R1 in search of an explanation for this behavior.
Our examination showed that among its various metabolic capabilities, ARW1R1 has enzymes that catalyze two unique metabolic pathways. One pathway involves metabolism of arachidonic acid and related poly-unsaturated fatty acids (PUFAs). The other process is oxidative deamination of D- and L-amino acids, resulting in production of H2O2 and α-keto acids. Both processes can improve growth of ARW1R1 in the above environment. PUFAs are a metabolic byproduct of algae, and metabolizing these carbon-rich components of the algal cell wall could be one reason why we find ARW1R1 growing in the phycosphere of PT. Additionally, ARW1R1 does not have the ability to directly use nitrate, and thus the ability to metabolize amino acids produced by PT can sate its need for nitrogen. So, while our analyses have identified possible means of nutrient scavenging by ARW1R1 from PT, we still have not found a mechanism for the observed beneficial interaction between the two organisms.

15:29
Benchmarking Phage-Host Prediction Tools using Real Metagenomics Data
PRESENTER: Levi van Doorn

ABSTRACT. Viruses are the most abundant biological entities on the planet. They can influence the composition of microbial communities and nutrient flow within ecosystems through infection of their hosts. Identifying which host or hosts a virus infects can help us understand their roles and expand our understanding of microbial ecosystem functioning. Currently, the hosts of most viruses remain unknown. Computational tools play an important role in answering the question of which virus infects which host. Many different computational tools have been developed for phage-host prediction. These tools use different methods, reference databases, and biological signals to make their phage-host predictions. This work aims to provide an independent benchmarking of the performance of nine different tools at different taxonomic levels of prediction. We compared their performance using real metagenomic datasets from a tomato soil biome. Soil has a very high microbial diversity and many of the organisms are still unknown. Understanding the microbiome of the soil better could be beneficial for agricultural practices. The samples consist of paired viral and microbial size fractions. The hosts of viruses identified in the viral size fraction are predicted using the nine tools and compared to the microbes identified in the microbial size fraction. Performance of the tools is then assessed based on the similarity between the predicted host abundance profile and the microbial abundance profile, considering the precision, recall, F1 score, and the number of predictions made. As expected, the overall trend shows that all tools perform less well at lower taxonomic ranks. We find that iPHoP is the best-performing tool among the tested tools (average precision: 49.8%, average recall: 85.5%, average F1 score: 59, average prediction percentage: 95.8%). While iPHoP does not have a higher precision or F1 score than RaFHA and WiSH, its higher prediction percentage and recall make it the best-performing tool across taxonomic ranks.

This benchmarking of computational phage-host prediction provides independent insight into the performance of the tools. These results are only a snapshot of the complete analysis, which will take the abundance of the predicted and present microbial hosts into account and will include data from human gut and marine biomes.
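The metrics named in this abstract can be illustrated with a minimal sketch. It assumes a simple set-based comparison at one taxonomic rank, which is not necessarily how the authors scored the tools; the function name and the taxon names are hypothetical placeholders.

```python
def precision_recall_f1(predicted, observed):
    """Compare predicted host taxa with taxa observed in the microbial fraction.

    Assumption of this sketch: both inputs are plain collections of taxon
    names at a single rank, and abundance is ignored.
    """
    predicted, observed = set(predicted), set(observed)
    true_pos = len(predicted & observed)  # predicted hosts that are present
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(observed) if observed else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical genus-level example; the names are placeholders only.
predicted_hosts = {"Bacillus", "Pseudomonas", "Streptomyces", "Rhizobium"}
observed_hosts = {"Bacillus", "Pseudomonas", "Arthrobacter"}
p, r, f1 = precision_recall_f1(predicted_hosts, observed_hosts)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F1 is a harmonic mean computed per comparison, an F1 score averaged over samples or ranks generally differs from the F1 computed from the averaged precision and recall.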

13:30-15:30 Session 5: DEEP HIDDEN PHYSICS

Summary: A grand challenge with great opportunities is to develop a coherent framework that enables blending conservation laws, physical principles, and/or phenomenological behaviors expressed by differential equations with the vast data sets available in many fields of engineering, science, and technology. At the intersection of probabilistic machine learning, deep learning, and scientific computations, this work is pursuing the overall vision to establish promising new directions for harnessing the long-standing developments of classical methods in applied mathematics and mathematical physics to design learning machines with the ability to operate in complex domains without requiring large quantities of data. To materialize this vision, this work is exploring two complementary directions: (1) designing data-efficient learning machines capable of leveraging the underlying laws of physics, expressed by time dependent and non-linear differential equations, to extract patterns from high-dimensional data generated from experiments, and (2) designing novel numerical algorithms that can seamlessly blend equations and noisy multi-fidelity data, infer latent quantities of interest (e.g., the solution to a differential equation), and naturally quantify uncertainty in computations.

13:30
Continuous-time models under uncertainty

ABSTRACT. In contrast to classical mechanistic modelling, deep learning offers new avenues towards free-form learning of the underlying dynamics of natural phenomena. These methods provide efficient machinery for fitting almost arbitrary dynamics with deep ODEs or PDEs, often in latent spaces. A key limitation of these methods is their black-box nature. In this talk I will give a summary of learning free-form dynamics under uncertainty using Bayesian principles. I will highlight such methods in both neural network and kernelised frameworks.

13:50
Path Integral Stochastic Optimal Control for Sampling Transition Paths

ABSTRACT. We consider the problem of sampling transition paths. Given two metastable conformational states of a molecular system, e.g. a folded and unfolded protein, we aim to sample the most likely transition path between the two states. Sampling such a transition path is computationally expensive due to the existence of high free energy barriers between the two states. To circumvent this, previous work has focused on simplifying the trajectories to occur along specific molecular descriptors called Collective Variables (CVs). However, finding CVs is not trivial and requires chemical intuition. For larger molecules, where intuition is not sufficient, using these CV-based methods biases the transition along possibly irrelevant dimensions. Instead, this work proposes a method for sampling transition paths that considers the entire geometry of the molecules. To achieve this, we first relate the problem to recent work on the Schrödinger bridge problem and stochastic optimal control. Using this relation, we construct a method that takes into account important characteristics of molecular systems such as second-order dynamics and invariance to rotations and translations. We demonstrate our method on the commonly studied Alanine Dipeptide, but also consider larger proteins such as Polyproline and Chignolin.

14:15
Posterior marginalization accelerates Bayesian inference for dynamical systems
PRESENTER: Elba Raimundez

ABSTRACT. Bayesian inference is an important method in life and natural sciences for learning from data. It provides information about parameter uncertainties, and thereby the reliability of models and model predictions. Yet, the generation of representative samples from the Bayesian posterior distribution is often computationally challenging. Here, we present an approach that lowers the computational complexity of sample generation for problems with scaling factors, offset and noise parameters. The proposed approach is based on the marginalization of the posterior distribution, which reduces the dimensionality of the sampling problem. We provide analytical results for a broad class of problems and show that the approach is suitable for a large number of applications. Subsequently, we demonstrate the benefit of the approach for various application examples from the field of systems biology. As the approach is broadly applicable and offers the possibility for a substantial acceleration, it will facilitate Bayesian inference in different research fields.

14:28
The Economy and Control of Balanced Cellular Growth
PRESENTER: Hugo Dourado

ABSTRACT. Fundamental quantitative principles are in the foundations of the physical sciences, but in cell biology quantitative analyses have been mostly restricted to numerical simulations of particular systems or highly simplified whole-cell models. Here, we present Growth Balance Analysis (GBA) as a general mathematical framework for the nonlinear modeling and analysis of cellular balanced growth, and show how it helps to uncover fundamental analytical principles general to all optimally growing cells. The cellular states in those nonlinear models are entirely understood as phenomena emerging solely from basic physiological constraints and the optimization of fitness by natural selection, which in many cases is quantified by the growth rate. This analytical approach is in a sense opposite to the predominantly linear modeling frameworks such as FBA and RBA, which are typically analyzed numerically; the efficiency of linear programming facilitates fast solutions for genome-scale models. In contrast to linear modeling frameworks, the construction and solution of genome-scale nonlinear models face the major obstacle of nonlinear reaction rate laws, which make numerical optimizations much more difficult, and have limited studies to nonlinear models with only a handful of reactions. The succinct mathematical formulation of GBA helps to address this problem by simplifying the problem description to a minimal set of variables, the scaled fluxes. Also importantly, the concise formulation of GBA allows a Lagrangian formalism for the study of analytical conditions necessary for optimal cellular states, thus providing a tool for a deeper understanding of the principles shaping optimal cellular resource allocation, even when specific optimal states are not explicitly known.
We show here how these optimality conditions also help to define from first principles the fundamental economic principles of cellular resource allocation shaping the optimal logistics at balanced growth. The main economic principle here is the marginal value of each scaled flux, which is composed of costs and benefits of three different kinds, all connected to marginal changes in protein allocation: one related to changes in protein production, one related to changes in the protein catalytic efficiency due to its saturation with metabolites, and one related to the protein allocated to the maintenance of each reaction. Our analytical approach also allows the derivation of explicit expressions for growth control coefficients, defined as the proportional change in the optimal growth rate by a change in one of the parameters of the optimization problem; these are the external concentrations of reactants defining the growth environment, the kinetic parameters, and the cellular density. Our analytical study here helps to clarify the mathematical links between the usually separate fields of cellular economics and metabolic control analysis (MCA), also extending these to models of growing cells.

14:41
Hybrid mechanistic and deep learning modeling in systems biology
PRESENTER: Maren Philipps

ABSTRACT. Mathematical models are widely used in systems biology to understand complex biological processes. The most common approaches are knowledge-based mechanistic models and data-driven statistical models. Yet, driven by the success of deep learning, a third approach has arisen over recent years: hybrid modeling. In particular, physics-informed neural networks (PINNs) and universal differential equations (UDEs) have gained attention in the field of systems biology. Unfortunately, the potential of these approaches has hardly been evaluated. We set out to thoroughly evaluate the efficiency and suitability of the novel hybrid approaches for systems identification on a comprehensive benchmark collection. The collection features a diverse set of ordinary differential equation (ODE) models, varying in dimensionality, complexity of dynamics and observed data, thus enabling the rigorous examination of the potential for handling large-scale, heterogeneous data, as well as noisy and sparse measurements. Benchmarking stands to establish guidelines for hybrid model construction, including empirically motivated suggestions for hyperparameters and balancing of contributions from data and dynamics. Moreover, it allows us to identify the type of problems that benefit from the hybrid modeling approach. Finally, our contribution will pinpoint technical limitations, hence motivating the development of systems biology-tailored extensions for hybrid modeling.

14:54
Discovery of Hidden Control Variables and modeling Non-Linear Dynamical Biological Systems
PRESENTER: Juan Munoz

ABSTRACT. How to construct robust dynamical models for biological systems constitutes a grand challenge for the systems biology community that is still unresolved. The root of the problem is partly that several putative model architectures could be formulated for a given biological system and, worse, the parameter space is, as a rule, massive for each model. Furthermore, we cannot assume that we have access (observations) to all relevant state variables. That is to say, hidden variables exist controlling the system, but we have no data for their temporal evolution. Therefore, incorporating non-observables in the models, which may also drive the dynamical change and transitions, is usually considered prohibitively tricky. Furthermore, we may not even know the unknown variables which are part of the biological system of interest. Thus, the problem of model discovery is intrinsically linked with the challenge and existence of such hidden variables.

Here we explore the idea of using normal forms as universal, scalable, and minimal dynamical building blocks to capture and model the system dynamics. Our method, called HINNDy, samples observations in the vicinity of a slow manifold and formulates the problem as a constrained optimization problem. We evaluate performance, robustness against noise, and data requirements by benchmarking HINNDy against standard bifurcation models (saddle-node, transcritical, pitchfork, Hopf). Next, we tested HINNDy for the discovery of the Lorenz, Van der Pol, Hodgkin-Huxley, and FitzHugh-Nagumo dynamical systems from data generated by these models. Finally, we show that our approach generalizes to more complicated models where we do not have access to the specific model equations and where hidden control variables follow more sophisticated dynamics. We effectively discover the underlying equations and unobserved hidden variables from data generated from the toggle switch, genetic oscillator, and Waddington landscape model. Thus, HINNDy enables data-efficient, robust and scalable discovery of generative non-linear models, including hidden and observed variables, capturing different dynamical regimes.
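For orientation, the four bifurcation models named above have the following textbook normal forms. These are standard forms, shown here in the supercritical variants for pitchfork and Hopf; the symbols r, μ and ω are conventional parameter names, and the abstract does not state which parameterization HINNDy uses.

```latex
\begin{align*}
\text{Saddle-node:}               \quad & \dot{x} = r + x^{2} \\
\text{Transcritical:}             \quad & \dot{x} = r x - x^{2} \\
\text{Pitchfork (supercritical):} \quad & \dot{x} = r x - x^{3} \\
\text{Hopf (supercritical):}      \quad & \dot{z} = (\mu + i\omega)\, z - |z|^{2} z, \qquad z \in \mathbb{C}
\end{align*}
```

In each case the control parameter (r or μ) plays the role of a slowly varying quantity whose sign change alters the number or stability of attractors, which is why such a parameter is a natural candidate for a hidden control variable.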

In conclusion, there is growing interest in the scientific community in asking how artificial intelligence and machine learning can facilitate scientific discovery. Such a quest is increasingly going beyond data analysis and pattern detection, reflecting a need for robust methods to explain natural processes within neuroscience, genomics, stem cell biology, and developmental biology. We believe that constraining the learning of models from data by using normal forms is the first step towards a more data-driven workflow to formulate systems biology models.

15:07
Combining machine learning and ODEs to build multiscale models for estimation of metabolic trajectories
PRESENTER: Natal van Riel

ABSTRACT. We develop a hybrid modelling framework (called ADAPT, Analysis of Dynamic Adaptations in Parameter Trajectories) that blends biochemical conservation laws in metabolic networks and kinetic and physiological information captured by differential equations with data-driven machine learning. ADAPT offers a coherent framework for selection of model structure, integration of different types of (molecular) data, calibration, and uncertainty quantification. Structural uncertainty in the model, such as missing information about regulation, is translated into a parameter estimation problem. Time-dependent parameters in the ODE model are inferred from time-series data. To control flexibility versus overfitting of the model, multiple regularization functions are introduced in the learning algorithm, resulting in a multi-objective optimization algorithm. Numerical analyses and domain knowledge are used to tune the algorithm’s hyperparameters. ADAPT has been applied to model the metabolic effects of activation of nuclear receptor LXR and effects of diabetes medication in humans with type 2 diabetes. Here the hybrid modelling method is applied to study the development of Metabolic Syndrome (MetS) using data from a diet-induced mouse model. Our computational model couples fast and slow processes and tissue-level phenomena (e.g. insulin resistance, hepatic steatosis) to whole-body manifestation of MetS (glucose intolerance, dyslipidemia). The model captures inter-individual differences in progression of MetS, which are predicted to be rooted in metabolic processes in liver and gut, with bile acids (in conjunction with gut microbiome) as plausible cause. Predicted differences were also observed in gene expression data, and differences in fecal and plasma bile acid composition could be detected using targeted metabolomics.
In an independent study we have modulated bile acid metabolism and show that activation of the bile acid receptor farnesoid X receptor (FXR) resolves dyslipidemia and decreases adiposity in this mouse model of MetS, providing further mechanistic support for ongoing (pre)clinical trials with bile acid agonists for treatment of obesity-associated metabolic disorders.

15:20
Linking models of biochemical dynamics via mass-constrained neural ordinary differential equations

ABSTRACT. In the last decade there has been an explosion of various different experimental methods used to measure the dynamics of biochemical species in chemical reaction networks. However, not all biological systems are amenable to such methods. For example, synapses, the connection points between neurons, are still largely opaque in terms of the dynamics of various biochemical species. There are many dynamical models of synaptic chemical reaction networks. However, the current dynamical models include only a small part of the biochemical species and reactions taking place in a synapse. Due to the lack of data necessary to characterize the chemical reaction network structure in synapses there is a need for alternative modeling approaches. Specifically, approaches that do not rely on the characterization of the full structure of the chemical reaction network would be favourable. In this work we showcase such an approach.

In order to model a chemical reaction network where parts of the reaction network are represented as a neural network, we use neural ordinary differential equations [1]. Moreover, we integrate mass constraints along with additional structural constraints that ensure non-negativity of biochemical species during simulations. As an example of this modeling approach we use a published MAP3K-MAP2K-MAPK cascade capable of oscillations [2]. We use the published model to generate a synthetic data set. Then we use the hybrid model and assume that we know nothing about MAP2K. Instead, we add a neural network term to the MAPK derivatives, linking MAP3K to MAPK dynamics. Finally, we train the hybrid model on time series of pMAPK and ppMAPK. Our results show that it is possible to get reasonably accurate fits on the training data.

Scientific machine learning, the combination of machine learning with traditional scientific modeling approaches and knowledge, is a newly emerging field. Combining the function approximation capabilities of neural networks with known domain-specific laws might lead to solutions to challenges previously thought insurmountable. In this work we used scientific machine learning to produce a hybrid model of a chemical reaction network whose full reaction structure was not known. In the future we aim to expand our approach to larger and more complex systems and synapses.

[1] Chen, R. T., Rubanova, Y., Bettencourt, J., and Duvenaud, D. Neural ordinary differential equations. Advances in Neural Information Processing Systems 31 (2018), 6571–6583.
[2] Sarma, U., and Ghosh, I. Oscillations in MAPK cascade triggered by two distinct designs of coupled positive and negative feedback loops. BMC Research Notes 5 (2012).

15:25
Reinforcement learning and Monte-Carlo tree search in gene expression enable the entire gene regulation in a cell

ABSTRACT. Expression of more than 10,000 genes is precisely controlled in a human cell while adapting to various situations. In systems biology, it remains unclear how genetic and epigenetic mechanisms cooperate to achieve quantitatively proper expression patterns. While recent progress in artificial intelligence indicates the importance of the learning process in complex systems, conventional biology prefers to reveal causal relationships rather than the learning process. In this study, by assuming that cells change the expression pattern if it is inappropriate, I theoretically show that the biological processes known as epigenetic regulation work as reinforcement learning and that the learning model reproduces the changes of expression of all genes in human cells. Performing Monte-Carlo simulations, I reveal how two kinds of agents autonomously approach a target ratio through repetitions of stochastic processes of increase and decrease in an agent-based model. The results show that the increase process should be competitive amplification with a small additive noise and the decrease process should be decay depending on the difference between current and target ratios of agent numbers. In this case, each unit simply changes with an equal probability. The following stochastic differential equation represents this reinforcement learning process: dx_i/dt = A x_i/∑x_j − E(x_i/∑x_j, T) x_i + B(i, x), where x_i is the number of agents of the i-th kind, the decay probability E is an approximated value from the mean squared error between the current ratio x_i/∑x_j and the target ratio T, and the bias B is a small noise. The ratio x_i/∑x_j autonomously approaches the ratio where E becomes smaller. Epigenetic regulation of two genes that have similar promoter sequences would be such a learning pair process, considering that the openness of chromatin increases the accessibility of a histone acetylase. The expression ratio of many genes can be controlled in a hierarchical architecture of these learning pairs.
A gene is selected by Monte-Carlo tree search through the hierarchical pairs while activating the pathway branches in competitive amplification. This is like signal transduction. The model reproduces well the changes of whole-gene expression during human early embryogenesis, by setting the initial and target values in each pair. Gene expression changes during hematopoiesis are also reproduced in the same model without modifying any parameters other than the initial and target values. In this model for whole-gene expression, epigenetic regulation as reinforcement learning is clearly distinguished from a genetically conserved hierarchical-pair architecture. Based on these findings, I propose the law of biological inertia, which means that a living cell basically maintains the expression pattern while metabolizing its contents and achieves learning ability, as represented by the above equation. This principle would give insights for understanding various complex systems.

15:26
Efficient parameter estimation for ODE models with semi-quantitative data using spline approximation
PRESENTER: Domagoj Doresic

ABSTRACT. Quantitative dynamical models facilitate the understanding of biological processes and the prediction of their dynamics. These models usually comprise unknown parameters, which have to be inferred from experimental data. For quantitative experimental data, there are several methods and software tools available. However, for semi-quantitative and qualitative data the available approaches are limited and computationally demanding. We deal with a kind of semi-quantitative data. Specifically, for several measurement techniques, the measurements have an unknown monotone non-linear dependency on the variables of interest. We present a method based on hierarchical optimization and approximation using splines. We consider a reformulation of the inverse problem as a bi-level optimization problem. The linear splines that model the unknown monotone non-linear dependencies are optimized in the inner optimization problem. Using these optimized spline mappings we can obtain mapped observable simulations comparable to the given measurements. This enables us to define a negative log-likelihood objective function which we minimize in the outer optimization problem to obtain maximum likelihood estimates of the parameters of the dynamical system. To improve the performance and efficiency of the method, for both optimization problems we use gradient-based local optimizers. The gradients are computed using a semi-analytical algorithm for gradient calculation. The approach is implemented in the open-source Python Parameter EStimation TOolbox (pyPESTO).

15:27
Hybrid models for cellular signaling: meso-scale pathway identification

ABSTRACT. Experimentation using combinatorial perturbations of biological systems is a common method to extract data and study the underlying, interacting mechanisms that compose them. These interactions can be represented by a network of unknown structure, with the overall output of the entire system, for a small subset of the input space, as the only available information. An accurate description of these processes constitutes an important step towards a more efficient medicine, personalised in treatments and free of undesired side effects.

We develop a method for network reconstruction from static perturbation datasets by means of hybrid models, structured combinations of mechanistic modelling and machine learning techniques, and apply it to the context of signalling pathways. The method is based on the identification of patterns in the data and their relation to graph substructures representing portions of the signalling network. This allows us to avoid restrictive modelling assumptions and possible biases from external sources, but at the same time reduces the level of attained detail. The end result is a quantitative, meso-scale version of the network (it sits in the middle between a macroscopic and a proteomic scale description) that highlights differences and similarities between the response mechanisms of groups of phosphoproteins.

Graph-theoretic approaches might provide a link to a more detailed reconstruction; otherwise, the method could serve as a prior in a multi-scale modelling pipeline including existing proteomic-scale reconstruction algorithms.

15:28
On data-driven learning of an effective energy landscape from trajectories of nonlinear dynamical biological systems
PRESENTER: Sandip Saha

ABSTRACT. Constructing robust dynamical and predictive state-space models (e.g., ODE, PDE, Boolean) has been and still is at the core of systems biology. Yet, it has proven very challenging due to data limitations. As a result, the putative model architectures are under-determined for a given biological system, and the parameter space is, as a rule, huge. Here we explore the idea of constructing effective energy/potential-based models capturing a system's dynamical landscape without specifying each putative state variable and their associated parameters. Such a potential-like function provides sufficient information about a system's dynamics and stability properties. However, constructing such energy-based models is a challenging problem for an arbitrary system. Physics-Informed Neural Networks (PINNs) have emerged as a promising tool to predict a system's future with the help of neural networks under the guidance of basic physical principles such as the Principle of Least Action, the system Lagrangian, or the Hamiltonian. Here we explore how to construct potential landscapes in a purely data-driven sense by suitably employing PINNs for systems biology. We have designed a pipeline to extract approximate potential landscapes by integrating several models such as Lagrangian Neural Network (LNN), Neural New-Physics Detector (NNPhD), and AI Feynman 2.0 while respecting consistency of assumptions. We validated the concept theoretically and computationally for the Double Pendulum model case. We are currently targeting the construction of a Waddington epigenetic landscape using simulated data generated from a coupled transcriptional-epigenetic dynamical systems biology model (Matsushita and Kaneko, PRR, 2020). This pipeline enables us to construct the potential landscapes from any data which contains phase-space information.

15:29
Structural reduction of chemical reaction networks based on topology
PRESENTER: Yuji Hirono

ABSTRACT. Inside living cells, chemical reactions form a large web of networks. Understanding the behavior of those complex reaction networks is an important and challenging problem. We develop a model-independent reduction method for chemical reaction systems based on the stoichiometry, which determines their network topology. A subnetwork can be eliminated systematically to give a reduced system with fewer degrees of freedom. This subnetwork removal is accompanied by rewiring of the network, which is prescribed by the Schur complement of the stoichiometric matrix. Using homology and cohomology groups to characterize the topology of chemical reaction networks, we can track the changes of the network topology induced by the reduction through the changes in those groups. We prove that, when certain topological conditions are met, the steady-state chemical concentrations and reaction rates of the reduced system are ensured to be the same as those of the original system. This result holds regardless of the modeling of the reactions, namely, chemical kinetics, since the conditions only involve topological information. This is advantageous because the details of reaction kinetics and parameter values are difficult to identify in many practical situations. The method allows us to reduce a reaction network while preserving its original steady-state properties, so that complex reaction systems can be studied efficiently. We demonstrate the reduction method in hypothetical networks and the central carbon metabolism of Escherichia coli.
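The rewiring step can be illustrated with a minimal sketch of a Schur complement applied to a toy stoichiometric matrix. The function name, the block partition, the use of a Moore-Penrose pseudoinverse for the removed block, and the four-reaction chain are all assumptions made for illustration; the abstract does not specify the authors' implementation, and the topological conditions they prove are not checked here.

```python
import numpy as np


def reduce_network(S, species_idx, reaction_idx):
    """Eliminate a subnetwork from stoichiometric matrix S (species x reactions).

    Returns S11 - S12 @ pinv(S22) @ S21, where block 22 holds the removed
    species (rows) and removed reactions (columns). The pseudoinverse is an
    assumption of this sketch so that non-square blocks are also handled.
    """
    S = np.asarray(S, dtype=float)
    keep_s = [i for i in range(S.shape[0]) if i not in species_idx]
    keep_r = [j for j in range(S.shape[1]) if j not in reaction_idx]
    S11 = S[np.ix_(keep_s, keep_r)]
    S12 = S[np.ix_(keep_s, reaction_idx)]
    S21 = S[np.ix_(species_idx, keep_r)]
    S22 = S[np.ix_(species_idx, reaction_idx)]
    return S11 - S12 @ np.linalg.pinv(S22) @ S21


# Toy chain (hypothetical): 0 -> A -> B -> C -> 0, rows A, B, C.
S = [[1, -1,  0,  0],
     [0,  1, -1,  0],
     [0,  0,  1, -1]]

# Remove species B (row 1) together with reaction B -> C (column 2).
S_reduced = reduce_network(S, species_idx=[1], reaction_idx=[2])

# The reaction A -> B is rewired into A -> C in the reduced network.
assert np.allclose(S_reduced, [[1, -1, 0], [0, 1, -1]])
print(S_reduced)
```

In this toy case the reduced matrix describes the shorter chain 0 -> A -> C -> 0, so the input and output fluxes are connected exactly as before while the intermediate B no longer appears.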

13:30-15:30 Session 6: MODELS IN SPACE AND TIME

Summary: The behavior of cells is impacted by many factors, such as gene regulation, signaling, metabolism, transport, or mechanical forces. While studying these components in isolation can be informative, they all interact with each other and are ultimately part of the same system. The session will discuss models that capture cellular dynamics and regulation with an emphasis on the role played by the spatial organization of its components.

Location: Alexander
13:30
Directed cell migration by cortical actin and ER-PM contact site-mediated restriction of receptor signaling to the front
PRESENTER: Tobias Meyer

ABSTRACT. My laboratory is interested in understanding how mammalian cells integrate signals to control cell function. A particularly fascinating problem we are focusing on is the interplay between signaling polarity and the polarity of small GTPases and actin organization during directed cell migration. My talk will focus on the question of how the cortical actin network and ER-PM contact sites direct receptor tyrosine kinase, Cdc42 and Rac signaling to the front, which in turn directs membrane protrusions during cell migration. I will discuss the use of a reporter system that we developed to measure the local density of F-actin close to the plasma membrane along with the dynamic organization of ER-PM contact sites in migrating cells. We identified parallel gradients of both membrane-proximal actin and ER-PM contact sites, both high in the back and low in the front, inverse to the gradient in Rac activity. We show that a main purpose of the gradient of membrane-proximal F-actin is to increase Rac and Cdc42 signaling to the front, both to direct local protrusions and to stabilize polarity, ensuring that cells persistently migrate. We found that a main purpose of the gradient in ER-PM contact sites is to localize EGF and other RTK receptor activities to the front by increasing the rate of dephosphorylation of EGFR by PTP1B phosphatase in the back. PTP1B activity is localized to the back since the density of ER-PM contact sites is much higher there compared to the front and the exclusively ER-localized PTP1B can only interact with EGFRs at ER-PM contact sites. Overall, our findings argue that the polarization of receptor signaling and small GTPase activity in cell migration and chemotaxis results from a close interplay between the spatial organization of cortical actin and ER-PM contact sites, along with feedback mechanisms that orient receptor, lipid second messenger and small GTPase signaling.

13:50
Regulation of Mammalian Iron Physiology Across Scales

ABSTRACT. Iron is a transition metal required in large quantities in mammals, mostly as part of hemoglobin for oxygen transport, but also as a cofactor to many enzymes. Due to its redox properties and in order to minimize oxidative stress, iron needs to be transported and stored in conditions that reduce its reactivity. Different organs/systems have competing or antagonistic needs for iron and this requires several levels of regulation. Deficiencies in these regulatory mechanisms can result in iron excess or deficiency, which cause several common diseases. To understand the dynamics of iron regulation in health and disease we have embarked on a long-term project creating computational models at various scales, from macromolecules to a whole individual. I will describe the models at different spatial and time scales and how they are being integrated to provide wider scope and higher resolution. I will also discuss general modeling challenges we encountered and how we addressed them, such as those related to obtaining data for model calibration and validation, and connecting models of different scales.

14:10
Modelling how mRNA flows through subcellular compartments

ABSTRACT. During their life cycle, eukaryotic mRNAs move from the nucleus, where they are synthesised, to the cytoplasm, where they are translated by ribosomes free-floating in the cytosol or bound to the endoplasmic reticulum membrane, and then are eventually degraded. Despite high-throughput sequencing enabling assessment of steady-state gene expression levels, the intracellular dynamics of RNA transcription, processing and localisation remain poorly characterised. In this ongoing study, we quantify mRNA flow rates between subcellular compartments transcriptome-wide in mouse embryonic stem cells. Combining standard sequencing methods with metabolic labelling and cell fractionation, we are able to determine the ratio between newly synthesised and total RNA levels in four compartments, namely nucleus (divided into pre-mRNA and mature mRNA), cytosol and membrane. Using a simple four-step model of ordinary differential equations, we estimate the following kinetic rates for each gene: splicing, nuclear export, cytosolic processing (separated into export and degradation) and membrane degradation. All rates show a high variability across genes, with the splicing rate having the narrowest and the cytosolic processing rate having the widest distribution. On average, nuclear export is the slowest rate, meaning transcripts spend the longest time in the nucleus as mature mRNA before exiting. Membrane-associated mRNAs, here defined as having an abundance in the membrane at least four times higher than in the cytosol, show characteristic dynamics: cytosolic processing is faster, while membrane degradation is substantially slower compared to other genes. With a gene ontology analysis of this category showing strong enrichment of genes coding for endoplasmic reticulum and cell membrane proteins, this suggests that degradation is slower for transcripts associated with endoplasmic reticulum-bound ribosomes than for those associated with cytosolic ribosomes.
Additionally, RNA polymerase II elongation rates were estimated to account for time delays observed in the labelling of exons distant to the 3’ end. The elongation rate correlates with gene length and has a mean of roughly 3 kb / min, coinciding with results from previous studies. To summarise, we quantified kinetic flow rates of mouse mRNA globally, allowing us to investigate characteristic dynamic profiles and their links to biological function.
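
As a rough illustration of how a four-pool ODE model links kinetic rates to the measured ratio of newly synthesised to total RNA, here is a minimal sketch. The equations, the parameter `f_mem` (fraction of cytosolic processing that feeds the membrane pool), the rate values, and the function name are all assumptions for illustration and are not taken from the study.

```python
import numpy as np

def labelled_fraction(rates, t_end, dt=0.001):
    """Ratio of newly synthesised (labelled) to total mRNA in four pools.

    Pools: nuclear pre-mRNA, nuclear mature mRNA, cytosol, membrane.
    Synthesis rate is set to 1; labelling starts at t = 0.
    Integration uses the explicit Euler scheme for simplicity.
    """
    k_splice, k_export, k_cyto, k_mem, f_mem = rates
    # Steady-state totals of each pool for unit synthesis rate
    total = np.array([1 / k_splice, 1 / k_export, 1 / k_cyto, f_mem / k_mem])
    x = np.zeros(4)  # labelled amount per pool
    for _ in range(int(round(t_end / dt))):
        pre, nuc, cyt, mem = x
        dx = np.array([
            1.0 - k_splice * pre,               # synthesis and splicing
            k_splice * pre - k_export * nuc,    # nuclear export
            k_export * nuc - k_cyto * cyt,      # cytosolic processing
            f_mem * k_cyto * cyt - k_mem * mem  # membrane degradation
        ])
        x = x + dt * dx
    return x / total

# Hypothetical rates (per hour): splicing, export, cytosolic, membrane, f_mem
rates = (2.0, 0.5, 1.0, 0.2, 0.3)
early = labelled_fraction(rates, t_end=2.0)
late = labelled_fraction(rates, t_end=60.0)
print(early, late)
```

Fitting such curves to measured ratios at several labelling times is one way rates could be estimated per gene; the estimation procedure actually used is not described in the abstract.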

14:23
Spatio-temporal modeling reveals a layer of tunable control circuits for the distribution of cytokines in tissues.
PRESENTER: Patrick Brunner

ABSTRACT. Cytokines are diffusible mediators of cell-cell communication among immune cells with critical regulatory functions for cell differentiation and proliferation. Previous studies have revealed considerable spatial inhomogeneities in the distribution of cytokine molecules in tissues, potentially shaping the efficacy and range of paracrine cytokine signals. How such cytokine gradients emerge and are controlled within cell populations is incompletely understood. In this work, we employed a spatial reaction-diffusion model to systematically investigate the formation and influence of spatial cytokine gradients. While a distribution of each cell's receptor levels introduces small inhomogeneities, we found the fraction of cytokine-secreting cells to be the main source of spatial inhomogeneity and subsequent activation. These activation levels are not possible in an ODE-based model with a well-mixed extracellular space, highlighting the importance of cytokine inhomogeneities in cell activation. Employing IL-2 and IL-7 as model systems, we show that the positive feedback from local cytokine levels upon IL-2 receptor expression further increases spatial cytokine inhomogeneities [1], while the negative feedback of IL-7 decreases inhomogeneity. In those setups our spatial model allows us to characterize the cells' local environment, establishing a niche of activation around each secreting cell through the DBSCAN algorithm. By analyzing these niches we find that positive feedback enhances niche separation while negative feedback causes niches to fuse, underlining the effect of varying levels of inhomogeneity on cell activation.
Controlling the size and quantity of these niches through the clustering of cytokine-secreting cells and cells with large amounts of receptor expression, such as regulatory T cells or innate lymphoid cells, around antigen-presenting cells, we found that this constrained tissue architecture can have profound effects on the range of paracrine cytokine signals and activation dynamics. Continuing our work, we will employ spatial statistics and imaging data in an effort to recreate in vivo conditions as closely as possible, deepening our understanding of T cell activation and proliferation.

[1] Patrick Brunner, Lukas Kiwitz, Kevin Thurley; bioRxiv 2022.03.17.484722; doi: https://doi.org/10.1101/2022.03.17.484722

14:35
Real-time biochemical computations at criticality

ABSTRACT. Living systems on all scales of organization operate in changing environments. This implies that the biochemical networks, even in single cells, perform computations in real-time in order to navigate or stabilize a phenotypic output based on the dynamic external signals. In contrast to the current frameworks which rely on stable-state computations, we hypothesized that efficient real-time computations can be uniquely realized with metastable states, which are an emergent property of systems organized at criticality. We develop theoretical frameworks to study real-time computations, in particular for biochemical networks in single cells, and demonstrate, using life-time fluorescence imaging of protein activities in single cells as well as migration assays, how single cells utilise these features for efficient navigation in environments where signals are noisy, disrupted or conflicting.

14:47
Stochasticity of Meiotic Entry in Xenopus Oocytes
PRESENTER: Deniz Fettahoglu

ABSTRACT. Frog egg extracts allowed the discovery of the maturation promoting factor (MPF) in the 1970s and opened the way to mechanistic studies, emphasizing the role of cyclins and cyclin dependent kinases (CDKs) in orchestrating cell cycle events. Mathematical modeling reconciled irreversible progression of the cell cycle with dynamical systems concepts such as bistability of the MPF. Fifty years since this discovery one would expect that everything about cell cycle mechanisms is known, both on the experimental and theoretical sides. Until recently, it was believed that accumulation of Cyclin B during the G2 stage leads to a tipping point where the inactive CDK1 becomes unstable and spontaneously releases the inhibitory phosphorylation. This traditional picture is challenged by our recent results (Vigneron et al, Dev Cell 2018) showing that Cyclin B alone is not sufficient to push the system over a tipping point. In the case of mitosis, another complex, Cyclin A – CDK1, plays the role of a trigger and can push the Cyclin B – CDK1 inactive state over a tipping point. Only when this trigger acts can the Cyclin B – CDK1 complex play its role of driver of mitotic events. The full mitotic entry wiring also includes the polo-like kinase Plx1, the kinases Aurora A and Greatwall, the phosphatase PP2A-B55, the activator Bora and the inhibitors Arpp19 and ENSA. We also applied this methodology to meiosis. As for mitosis, the results allowed us to correct the state of the art. The two meiotic cell divisions are controlled by the activity of CDK1 but Cyclin B alone cannot trigger the transitions. Furthermore, it is known that Cyclin A expression is very low in meiosis, which calls for a new candidate for the trigger. We have identified a trigger and tested the novel biochemical wiring using a similar combination of experiments and mathematical modeling.
Interestingly, our mathematical model shows that the meiotic entry is deterministic for high concentration of the trigger and stochastic when this concentration is low. To understand the complex spatio-temporal dynamics of the MPF activation in the oocyte, one can draw an analogy to vapor-liquid first-order phase transitions. In the stochastic regime, the oocyte behaves like a superheated fluid. Waves of MPF activation nucleate spontaneously and propagate by auto-catalysis in the cytoplasm. The nucleation process depends on the properties of intrinsic and extrinsic biochemical noise, being bolstered by temporal and spatial correlation of the noise. The pattern of MPF activation coarsens by diffusion-controlled processes. These theoretical findings have implications for the understanding of the meiotic cleavage timing and progression of maturation. The trigger is important for reliable meiotic entry, whereas MPF activation waves could coordinate the maturation processes, independently or coupled to surface contraction waves.

14:59
Rhythmic protein degradation for cost-effective circadian oscillation
PRESENTER: Junghun Chae

ABSTRACT. Circadian protein oscillations are maintained by the lifelong repetition of protein production and degradation in daily balance. It comes at the cost of ever-replayed, futile protein synthesis each day. This biosynthetic cost with a given oscillatory protein profile is relievable by a rhythmic, not constant, degradation rate that selectively peaks at the right time of day but remains low elsewhere, saving much of the gross protein loss and of the replenishing protein synthesis. Here, our mathematical modeling reveals that the rhythmic degradation rate of proteins with circadian production spontaneously emerges under steady and limited activity of proteolytic mediators and does not necessarily require the rhythmic post-translational regulation on which previous work has focused. Additional (yet steady) post-translational modifications in a proteolytic pathway can further facilitate the degradation’s rhythmicity in favor of the biosynthetic cost saving. Our work is supported by animal and plant circadian data, offering a generic mechanism for potentially wide-spread, time-dependent protein turnover.

15:11
Model-based design of a synthetic oscillator based on an epigenetic memory system
PRESENTER: Viviane Klingel

ABSTRACT. Oscillations are important components in biological systems; grasping their mechanisms and regulation, however, is challenging. Here, we use the theory of dynamical systems to support the design of oscillatory systems based on epigenetic control elements (Klingel et al. 2022). Specifically, we use results that extend the Poincaré-Bendixson Theorem for monotone control systems which are coupled to a negative feedback circuit. The methodology is applied to a synthetic epigenetic memory system based on DNA methylation. This memory system was developed by Maier et al. (2017) by designing a methylation-sensitive Zinc-finger protein, which when bound to DNA can inhibit the transcription of a methyltransferase. There, memory functioning was realized by a positive feedback that leads to bistability. Our study is based on a mathematical model of this system (Klingel et al. 2021). This memory module serves as the monotone control system. We propose to implement the additional negative feedback required for oscillations by introducing novel DNA methylation sites into the autoregulation of the Zinc-finger. Through phase plane analysis of our mathematical model, we then determined necessary parameter conditions required for the system to operate in the oscillatory range. We further used model simulations to distinguish between damped and sustained oscillations. Our proposed system is generally able to exhibit sustained oscillations according to model predictions. However, first experimental implementations showed that several adaptations in the experimental system are required to reproduce our modeling results. Using the insights we gained from our computational model, we explored the experimental design space to shift the system into an oscillatory regime. In particular, we then proposed several modifications which could encourage oscillations in the experimental system, including altered Zinc-finger binding sites or a doubled methyltransferase production.
Overall, our study shows that a model-based design of functional modules, combined with predictions about experimentally realizable design parameters, can support the targeted construction of synthetic modules.

References: Maier et al., 2017, Nat. Commun. 8, 1-10 Klingel et al. 2021, FEBS J. 288, 5692-5707 Klingel et al. 2022, ACS Synth. Biol. 11, 2445-2455

15:16
Advanced Modelling of Multicellular Systems in Morpheus
PRESENTER: Jörn Starruß

ABSTRACT. Computational modeling is increasingly important to analyze tissue dynamics during development and disease progression. Thus, a growing community of (computational) biologists is seeking solutions to construct and simulate multicellular models. A number of software tools have been designed in order to alleviate the computational challenges, but require scientists to encode their models in an imperative programming language. Morpheus [1], however, was established as the first extensible open-source software framework featuring declarative multicellular modelling (MorpheusML) and is thus applicable by a broad community, including experimentalists and trainees.

We present how MorpheusML [2] and our open-source framework [3] allow for advanced scientific workflows that meet today's requirements of educational use, interdisciplinary research groups, and also data-driven modelling approaches of expert users.

In brief, MorpheusML provides a bio-mathematical language describing fundamental cell behaviors, intra- and extra-cellular processes, and supplies data handling. Symbolic identifiers in mathematical expressions describe the dynamics of and coupling between the various model components. It can represent the spatial aspects of interacting motile cells as well as regulatory systems on tissue, cell and subcellular cell membrane level.

Following the rules of separation of model and implementation, our user-friendly GUI can be used to map multicellular models to MorpheusML, to embed an SBML model to account for known regulatory paths and to apply the model on experimental scenarios, e.g. derived from microscopy imaging. A numerical simulation is then composed by scheduling predefined components in the simulator.

Morpheus' parameter estimation and model selection is embedded into the meta-framework FitMultiCell [4], composing Morpheus and pyABC into a high-throughput and high-content data tool that is crucial for the understanding of multicellular processes, the prediction of perturbation experiments and the comparison of competing hypotheses.

References:
[1] Morpheus: a user-friendly modeling environment for multiscale and multicellular systems biology, Bioinformatics, 30, 2014.
[2] Model repository: https://morpheus.gitlab.io/model/published-models
[3] Homepage: https://morpheus.gitlab.io
[4] FitMultiCell: https://fitmulticell.gitlab.io

15:19
Resolving Resolution Hindrances in Spatial Transcriptomics
PRESENTER: Connor King

ABSTRACT. Spatial transcriptomics is a novel methodology that allows for the analysis of gene expression in the context of tissue architecture. In the wake of this incredible new technique, many methods have been developed to understand the complex information contained in these datasets. These methods utilize dimensionality reduction and clustering to dissect these datasets and obtain meaningful information from them. Oftentimes, this is achieved by classifying regions into regions of expression that reflect common functionality. Other times, genes are classified into groups of spatially correlated genes to infer spatial regulation of cellular processes. However, many mainstream approaches to spatial transcriptomics are limited by the resolution of the capture areas. This limits the applicability of these techniques to disorganized tissues. Here, we propose a method for analyzing spatial transcriptomics data that utilizes both emerging methods to look at spatial transcriptomics datasets from different perspectives, while also informing the number of clusters present in each analysis using the MultiK algorithm. This systematic approach to identifying clusters allowed for the extension of these methodologies into the analysis of disorganized breast cancer tissue and extraction of information that has previously been overlooked in other analyses.

15:22
Inferring individual interactions in gene regulatory networks with delays from time-series experiments
PRESENTER: Yu Wang

ABSTRACT. This work considers inferring individual interactions in gene regulatory networks from time-series response data, with a particular focus on dealing with delays in the regulatory interactions. Inferring individual interactions is an important problem faced e.g., in target identification, but also in applications where it is of interest to learn regulatory interactions between a given subset of genes. However, the inference of interactions of interest is greatly hampered by inevitable delays between genes of interest caused by e.g., intermediate unmeasured components, protein synthesis, transportation delays, etc. Existing inference methods essentially rely on fitting a full network model to infer individual interactions of interest, which either do not take such delays into account or consider pure delays rather than rational dynamics. Such methods based on full network inference require a large number of experiments that can be time-consuming and costly. Experience shows that the resulting models typically will contain a large fraction of false positives and false negatives, and it is in general difficult to provide any statement on the reliability of individually inferred interactions. In contrast to existing methods, we here derive a tool to handle delays in network inference based on taking a geometric perspective on the inference problem formulated as a linear regression, by considering the span of individual time-shifted gene response vectors in sample space. Note that the assumption of linearity here corresponds to assuming the dynamics of the transformed variables being linear. Instead of fitting the full network model, the proposed method can identify individual interactions independently with a label of confidence. Using the proposed method, different delays associated with individual identified interactions can also be distinguished from a single time-series perturbation experiment. 
A key feature of the method is that it can deal with uncertainty in the data, resulting from both intracellular noise and measurement noise. The effectiveness of the proposed method is illustrated on a 10-gene DREAM4 in-silico network. We consider inferring interactions from the gene with the highest connectivity to other genes in the network. We collect 21 samples with sampling time Ts = 50 min from one single time-series experiment, containing both intracellular and measurement noise. We consider delays between the genes of up to 150 minutes. By applying the proposed method, 4 out of 5 true interactions are correctly identified with few false discoveries. The proposed method also infers different delays in the identified interactions; one of the identified interactions is with 0-50 min delays, while others with up to 150 min delays. The results show that the proposed method can be instrumental for reliable network inference in the presence of delays.
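
The idea of regressing a target gene on time-shifted copies of a regulator can be illustrated with a deliberately simplified sketch. This is plain least squares over candidate shifts on synthetic data, not the geometric method or the confidence labelling described in the abstract; the function name, coefficients, and noise level are hypothetical, while the 21 samples, Ts = 50 min, and delays up to 150 min follow the abstract.

```python
import numpy as np

def best_delay(x, y, max_shift):
    """Pick the sample shift d that best explains y[k] from x[k - d].

    For each candidate shift, fit y[k] = a * x[k - d] + b by least squares
    on the common sample range k = max_shift .. n-1 and compare residuals.
    """
    n = len(x)
    target = y[max_shift:]
    residuals = []
    for d in range(max_shift + 1):
        regressor = x[max_shift - d : n - d]  # x[k - d] aligned with target
        A = np.column_stack([regressor, np.ones_like(regressor)])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        residuals.append(float(np.sum((target - A @ coef) ** 2)))
    return int(np.argmin(residuals)), residuals

# Synthetic example: 21 samples, Ts = 50 min, true delay of 2 samples (100 min)
rng = np.random.default_rng(1)
Ts = 50
x = rng.normal(size=21)                # regulator response (hypothetical)
y = np.zeros(21)
y[2:] = 0.8 * x[:-2]                   # target follows regulator 2 samples later
y += 0.05 * rng.normal(size=21)        # measurement noise

shift, residuals = best_delay(x, y, max_shift=3)
print(f"estimated delay: {shift * Ts} min")
```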

15:25
Variance of filtered signals: characterization for linear reaction networks and application to neurotransmission dynamics
PRESENTER: Ariane Ernst

ABSTRACT. Neurotransmission at chemical synapses relies on the calcium-induced fusion of synaptic vesicles with the presynaptic membrane. The distance to the calcium channels determines the release probability and thereby the postsynaptic signal. Suitable models of the process need to capture both the mean and the variance observed in electrophysiological measurements of the postsynaptic current. In this work, we propose a method to directly compute the exact first- and second-order moments for signals generated by a linear reaction network under convolution with an impulse response function, rendering computationally expensive numerical simulations of the underlying stochastic counting process obsolete. We show that the autocorrelation of the process is central for the calculation of the filtered signal’s second-order moments, and derive a system of PDEs for the cross-correlation functions (including the autocorrelations) of linear reaction networks with time-dependent rates. Finally, we employ our method to efficiently compare different spatial coarse graining approaches for a specific model of synaptic vesicle fusion. Beyond the application to neurotransmission processes, the developed theory can be applied to any linear reaction system that produces a filtered stochastic signal.

15:26
Dynamical basis of cellular sensing and responsiveness to spatial-temporal signals

ABSTRACT. Under physiological conditions, cells continuously sense and migrate in response to local gradient cues which are irregular, conflicting, and changing over time and space. This suggests cells exhibit seemingly opposed characteristics, such as robust maintenance of polarized state longer than the signal duration while remaining adaptive to novel signals. However, the dynamical mechanism that enables such sensing capabilities is still unclear. Here we propose a generic dynamical mechanism based on the critical positioning of the receptor signaling network in the vicinity of saddle-node of a sub-critical pitchfork bifurcation (SubPB mechanism). The dynamical “ghost” that emerges at the critical organization gives transient memory in the polarized response, as well as the ability to continuously adapt to changes in signal localization. Using weakly nonlinear analysis, an analytical description of the necessary conditions for the existence of this mechanism in a general receptor network is provided. By using a physical model that couples signaling to morphology, we demonstrate how this mechanism enables cells to navigate in changing environments. Comparing to three classes of existing mathematical models for the polarization that operate on the principle of stable attractors (Wave-pinning, Turing, and LEGI models), we show that the metastability arising from "ghost" in the SubPB mechanism uniquely enables sensing dynamic spatial-temporal signals in a history-dependent manner.
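
For readers unfamiliar with the terminology, the textbook one-variable normal form of a subcritical pitchfork bifurcation with a stabilizing quintic term shows where the saddle-node and its "ghost" come from. This is a generic illustration with control parameter $r$ assumed for exposition, not the receptor network model of the abstract.

```latex
\begin{align}
  \dot{x} &= r x + x^{3} - x^{5}, \\
  x^{*} = 0 \quad &\text{or} \quad (x^{*})^{2} = \frac{1 \pm \sqrt{1 + 4r}}{2}, \\
  r_{\mathrm{SN}} &= -\tfrac{1}{4}, \qquad (x^{*}_{\mathrm{SN}})^{2} = \tfrac{1}{2}.
\end{align}
```

For $-1/4 < r < 0$ the state $x^{*} = 0$ coexists with a pair of stable non-zero states. Just below $r_{\mathrm{SN}}$ the non-zero fixed points have vanished, but the flow near $x^{2} = 1/2$ remains slow, so trajectories linger there for a time that scales as $|r - r_{\mathrm{SN}}|^{-1/2}$. This slow region is the "ghost" that can hold a transient memory of a polarized state.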

15:27
Computational modeling of the Hes1 oscillator for the self-renewal of muscle stem cells
PRESENTER: Zsófia Bujtár

ABSTRACT. Oscillatory dynamics in regulatory networks can be critical in development. The decision between self-renewal and differentiation is controlled by a Notch-related signaling network. The key transcription factors of the Hes gene family are regulated by the Notch signaling pathway and show oscillations that are indispensable for self-renewal. Recent experiments demonstrated that expression of Hes1 oscillates in activated muscle stem cells and regulates transcription of the genes encoding the myogenic transcription factor MyoD and the Notch ligand Dll1, thereby driving MyoD and Dll1 oscillations.

We developed a mathematical model describing the network of the co-regulated genes Hes1, MyoD and Dll1 in individual cells based on ordinary differential equations (ODEs). In knockout simulations, the model exhibits Dll1 oscillations with declined amplitude in the case of the MyoD knockout, but shows non-oscillatory, increased levels of Dll1 for Hes1 knockout, confirming corresponding experimental observations. Based on the ODE model, we established a delay differential equation (DDE) model representing Hes1 and Dll1 protein dynamics. The DDE model allows for the direct investigation of modified delays, such as in the Dll1type2 mutant cell line characterized by a prolonged Dll1 transcription time. We studied the case of two cells coupled via Dll1 signaling in detail. The investigation showed out-of-phase oscillations for coupled wild-type cells and quenched oscillations for coupled Dll1type2 mutant cells, as observed experimentally. To investigate the impact of the delays systematically, we performed a bifurcation analysis of the 2-cell-DDE-model. As the strength of intercellular coupling also controls the dynamics, we are currently expanding the bifurcation analysis by considering the effect of coupling strength as well. Our approach demonstrates that computational modelling allows for a systematic investigation of the dynamic properties of the molecular network regulating cell fate decision in muscle stem cells.

15:28
A powerful modelling approach tailored to cellular signalling
PRESENTER: Clemens Kreutz

ABSTRACT. In systems biology, ordinary differential equations (ODEs) are frequently applied for investigating dynamic processes such as signalling pathways. ODEs are typically defined by translating relevant biochemical interactions into rate equations. One disadvantage of such mechanistic dynamic models is that they can become very large in terms of the number of dynamic variables and parameters if entire cellular pathways are described. Moreover, analytical solutions of the ODEs are not available and the dynamics are nonlinear, which poses challenges for numerical approaches as well as for statistically valid reasoning.

We recently introduced a complementary modeling approach based on curve fitting of a tailored retarded transient response function (RTF) [1]. This approach exhibits amazing capabilities in approximating ODE solutions in case of transient dynamics as it is typically observed for cellular signalling. Besides the broad and easy applicability, a benefit of the RTF is the clear-cut interpretation of its parameters as response time, as amplitudes, and time constants of a transient and a sustained part of the response. Dose-dependencies of these parameters are described via Hill functions, allowing for the calculation of half-maximal activating (EC50) or inhibitory (IC50) effects on these dynamic parameters.
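
The dose dependence via Hill functions can be sketched as follows. This is the standard Hill form applied to a generic response parameter; the function name, the parameter names, and the numbers are hypothetical, and the RTF itself is not reproduced here.

```python
def hill(dose, p_min, p_max, ec50, h):
    """Standard Hill function for a dose-dependent response parameter.

    p_min, p_max: parameter value at zero and saturating dose
    ec50: dose giving the half-maximal change
    h: Hill coefficient controlling the steepness
    """
    return p_min + (p_max - p_min) * dose**h / (ec50**h + dose**h)

# Hypothetical example: an amplitude rising from 0.1 to 1.5 with EC50 = 2.0
amplitude_at_ec50 = hill(2.0, p_min=0.1, p_max=1.5, ec50=2.0, h=1.5)
print(amplitude_at_ec50)  # midpoint between p_min and p_max
```

An inhibitory effect (IC50) corresponds to the same form with `p_max` smaller than `p_min`.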

The presented approach offers a data-driven alternative modelling strategy for situations where classical ODE modeling is cumbersome or even infeasible. Moreover, it enables valuable interpretations of traditional ODE models and is also applicable to the analysis of time-course and/or dose response data from omics experiments. Nine benchmark problems for cellular signaling were analyzed to demonstrate the approach in realistic systems biology settings. The performance of the approach is also demonstrated using dose-dependent time-course data of inflammasome activation.

[1] https://doi.org/10.3389/fphy.2020.00070

15:29
A robust model of neural superposition sorting
PRESENTER: Eric Reifenstein

ABSTRACT. Precise brain wiring relies on specific connections between pre- and postsynaptic partners, but the underlying mechanisms for how these connections are formed remain unclear. Here we use non-invasive intravital live imaging of Drosophila photoreceptor neurons in the lamina to establish a model of neural superposition. The model consists of three components: (i) mechanical stiffness of the growing axon, (ii) stochastic growth-cone extension towards regions of low tissue density, and (iii) short-range attraction of the growth cones to the postsynaptic partners to stabilize the wiring pattern. The relative contributions of the three components dynamically change over developmental time. The model reproduces the biological wiring pattern for all photoreceptor subtypes and for different subregions of the lamina. Most parameters of the model are estimated from the live-imaging data. For the few remaining parameters, we show that the modelled growth cones robustly reach their correct target locations for wide ranges of parameter values. In fact, the stochastic component of the model increases the robustness to parameter variations. In summary, our three-component model robustly works for a broad range of conditions and reproduces key experimental findings.

15:30-16:00 Coffee Break
16:00-16:45 Session K5: KEYNOTE V: Chris Sander

Abstract: AI and statistical learning can generate actionable predictions at the level of molecules, cells and humans. (1) EVcouplings: protein structure from experimental evolution; design of proteins for sustainability. (2) Perturbation Biology: designing targeted intervention from large-scale perturbation-response experiments. (3) CancerRiskNet: Identifying high risk of pancreatic cancer from real-world clinical records. (4) EVescape: Forecasting viral antibody escape.

Chair:
Location: Alexander
16:00
Machine learning for hard biological problems

ABSTRACT. AI and statistical learning can generate actionable predictions at the level of molecules, cells and humans.

(1) EVcouplings: protein structure from experimental evolution; design of proteins for sustainability. (2) Perturbation Biology: designing targeted intervention from large-scale perturbation-response experiments. (3) CancerRiskNet: Identifying high risk of pancreatic cancer from real-world clinical records. (4) EVescape: Forecasting viral antibody escape.

16:45-19:30 Session P1: POSTER SESSION I (Odd Submission Numbers)

ODD NUMBERED POSTERS (1, 3, 5, ...)

Location: All Grenanders
FitMultiCell: Simulating and parameterizing computational models of multi-cellular processes
PRESENTER: Emad Alamoudi

ABSTRACT. Biological tissues tend to be dynamic and highly organized. Multi-cellular models are increasingly receiving attention as a means to explain and understand this organization. Researchers have studied many aspects of mathematical modeling; however, parameterizing these models remains difficult. Many established parameter estimation methods are not suitable for these models because the likelihood function is intractable. A method that has been proven to be applicable to multi-cellular models is Approximate Bayesian Computation (ABC). ABC is a likelihood-free method that circumvents the calculation of the likelihood function by sampling the parameter space of the model and simulating data to find parameters that generate data similar to the observed data. Unfortunately, ABC is a computationally expensive approach, as it requires a large number of simulations. Thus, there is a need for a fast and general-purpose pipeline for modeling and simulating multi-cellular systems that can exploit parallelization on high-performance computing infrastructure. To this end, we built a user-friendly, open-source, and scalable platform, called FitMultiCell, that can handle modeling, simulating, and parameterizing multicellular systems. The platform combines Morpheus, a modeling and simulation tool, with pyABC, an advanced likelihood-free inference tool. The platform was successfully tested using various benchmark models that were described by Kumberger et al. (Viruses, 10(4), 2018), Imle et al. (Nature Communications, 10(1), 2019), and Meyer et al. (Molecular Systems Biology, 16(2), 2020).
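The likelihood-free idea behind ABC can be sketched with plain rejection sampling. The example below is a toy illustration only and does not use the FitMultiCell, Morpheus, or pyABC interfaces: the exponential waiting-time model, the uniform prior, the mean as summary statistic, and the tolerance `epsilon` are all assumptions made for this sketch.

```python
import random
import statistics


def simulate(rate, n_obs, rng):
    """Toy stochastic model (assumption): exponential waiting times with a given rate."""
    return [rng.expovariate(rate) for _ in range(n_obs)]


def abc_rejection(observed, n_draws, epsilon, rng):
    """Plain ABC rejection sampling using the sample mean as summary statistic."""
    obs_summary = statistics.mean(observed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.1, 5.0)  # draw a candidate from an assumed uniform prior
        simulated = simulate(rate, len(observed), rng)
        distance = abs(statistics.mean(simulated) - obs_summary)
        if distance <= epsilon:  # keep parameters whose simulated data resemble the observations
            accepted.append(rate)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 200, rng)  # synthetic "observed" data with true rate 2.0
posterior_sample = abc_rejection(observed, n_draws=2000, epsilon=0.05, rng=rng)
print(len(posterior_sample), statistics.mean(posterior_sample))
```

Every candidate requires a full model simulation, which is why the cost grows quickly for expensive multi-cellular models and why parallel execution matters.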

Transcriptional fluctuations govern the serum dependent cell cycle duration heterogeneities in Mammalian cells

ABSTRACT. Mammalian cells exhibit a high degree of intercellular variability in cell cycle period and phase durations. However, the factors orchestrating the cell cycle duration heterogeneities remain unclear. Herein, by combining cell cycle network-based mathematical models with live single-cell imaging studies under varied serum conditions, we demonstrate that fluctuating transcription rates of cell cycle regulatory genes across cell lineages and during cell cycle progression in mammalian cells largely govern the robust correlation patterns of cell cycle period and phase durations among sister, cousin, and mother-daughter lineage pairs. However, for the overall cellular population, alteration in serum level modulates the fluctuation and correlation patterns of cell cycle period and phase durations in a correlated manner. These heterogeneities at the population level can be fine-tuned under limited serum conditions by perturbing the cell cycle network using a p38-signalling inhibitor without affecting the robust lineage level correlations. Overall, our approach identifies transcriptional fluctuations as the key controlling factor for the cell cycle duration heterogeneities, and predicts ways to reduce cell-to-cell variabilities by perturbing the cell cycle network regulations.

Benchmarking of analysis strategies for data-independent acquisition proteomics data
PRESENTER: Eva Brombacher

ABSTRACT. Making decisions can be hard. Even more so if there are an extensive number of options, as in the case of algorithms available for each step of a data-independent acquisition (DIA)-type proteomics analysis workflow.

Benchmark studies objectively compare the performance of such algorithms and, thus, can facilitate making an informed decision. In our benchmark study we evaluated more than a thousand distinct data analysis workflows, i.e. different combinations of DIA software, spectral libraries, sparsity reduction, normalization, and statistical tests, which we assessed based on their ability to correctly identify differentially abundant proteins.

We found that DIA software-library combinations which include gas-phase fractionation were among the best-performing workflows, while also on average detecting the most proteins. Among all investigated statistical tests, non-parametric permutation-based tests consistently performed best.
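As an illustration of what a non-parametric permutation-based test does, the following minimal sketch compares two groups by repeatedly shuffling the group labels. It is not one of the tests evaluated in the benchmark: the test statistic (absolute difference of means), the function name `permutation_test`, and the example intensities are assumptions.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the absolute difference of group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random reassignment of group labels
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(mean_a - mean_b) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical log-intensities of one protein in two conditions.
condition_a = [20.1, 20.4, 19.9, 20.3, 20.2]
condition_b = [21.5, 21.8, 21.4, 21.9, 21.6]
p_value = permutation_test(condition_a, condition_b)
print(p_value)
```

The test makes no distributional assumption about the intensities; its resolution is limited only by the number of permutations.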

A generic framework to coarse-grain stochastic reaction networks by Abstract Interpretation
PRESENTER: Albin Salazar

ABSTRACT. In the last decades, logical models have emerged as a successful paradigm for capturing and predicting the behavior of systems of molecular interactions. Intuitively, they consist in sampling the abundance of each kind of biochemical entity within finite sets of intervals and deriving transitions accordingly. Whereas formally sound derivation from more precise descriptions (such as from reaction networks) includes many fictitious behaviors, direct modeling usually favors dominant interactions with no guarantee on the behaviors that are neglected, such as competition between reactions.

In this work, we developed a generic framework to discretize behaviors emerging from stochastic reaction networks by means of Abstract Interpretation, a tool which has proven successful in the approximation of mathematical structures. Our framework designs overlapping intervals to approximate behaviors and introduces a minimal effort for the system to go back to an abstract region of states, hence limiting fictitious oscillations in the derived models. Here, relations between obtained approximations and the set of all possible behaviors provide a rigorous guide for coarse-graining. Then, for pairs of transitions in the derived model, we compute bounds on the probability of which one will occur first. We illustrate our ideas on two case studies and demonstrate how techniques from Abstract Interpretation can be used to construct more precise discretization methods, while providing a framework to further investigate the underlying structure of logical models.

Aspergillus fumigatus pan-genome analysis identifies genetic variants associated with human infection
PRESENTER: Tongta Sae-Ong

ABSTRACT. Aspergillus fumigatus is an environmentally ubiquitous human fungal pathogen. Despite more than 300,000 cases of invasive disease globally each year, a comprehensive survey of the genomic diversity present, including the relationship between clinical and environmental isolates, and how this genetic diversity contributes to virulence and antifungal drug-resistance, has been lacking. In this study, we define the pangenome of A. fumigatus using a collection of 300 environmental and clinical genomes from a global distribution, 188 of which were sequenced in this study. We found a total of 10,907 orthologous groups, of which 7,563 (69%) are core groups, while 3,344 groups show presence/absence variation, representing 16-22% of each isolate’s genome. Using this large genomic dataset of both environmental and clinical samples, we found that one genetic cluster was enriched for clinical isolates. Their genomes contain more accessory genes, including more transmembrane transporters, proteins with iron-binding activity, and genes involved in both carbohydrate and amino acid metabolism. Finally, we leverage the power of genome-wide association to identify genomic variation associated with clinical isolates and triazole resistance as well as characterize genetic variation in known virulence factors. This characterization of the genomic diversity of A. fumigatus allows us to move away from a single reference genome that does not necessarily represent the species as a whole and better understand its pathogenic versatility, ultimately leading to better management of these infections.

Tissue-specific codon usage: from systems to synthetic biology

ABSTRACT. Although different tissues showcase differences in codon usage and anticodon tRNA repertoires, the codon-anticodon co-adaptation of multicellular eukaryotes is not completely understood. On the one hand, coding sequences are determined by manifold overlapping factors (codons, mRNA stability, splicing, etc.) and, on the other hand, tRNAs are intricately regulated at multiple levels (expression, modification, aminoacylation, fragmentation). Here, we uncover translational determinants of tissue-specificity by applying a systems biology approach to human high-throughput datasets. First, analyzing the tRNA abundance in over 8,000 tumor and healthy samples unveiled that the variability of the tRNA pool is largely related to the proliferative state across tissues, and that cancer patient survival is associated with the translational efficiency of certain codons. To quantify the extent to which codons are efficiently translated, we then leveraged transcriptomics and proteomics datasets to compute the protein-to-mRNA ratios across 36 different healthy human tissues. We detected two clusters of tissues with an opposite pattern of A/T- vs G/C-ending codon preferences. Using these, we then developed and experimentally validated CUSTOM (custom.crg.eu), a codon optimizer algorithm for tissue-specific protein production. Altogether, our work not only provides evidence of tissue-specific tRNA expression and protein synthesis, but also makes this knowledge applicable to the development of tissue-targeted therapies and vaccines.

A Systems Biology Approach To Study The Spatiotemporal Dynamics Of Senescent Cells In Wound Healing And Tissue Repair

ABSTRACT. Cellular senescence is thought to drive age-related pathology through the senescence-associated secretory phenotype (SASP). However, it also plays important physiological roles such as cancer suppression, embryogenesis and wound healing. Wound healing is a tightly regulated process which, when disrupted, results in conditions such as fibrosis and chronic wounds. Senescent cells appear during the proliferation phase of the healing process, where the SASP is involved in maintaining tissue homeostasis after damage. Interestingly, SASP composition and functionality were recently found to be temporally regulated, with distinct SASP profiles involved: a fibrogenic, followed by a fibrolytic SASP, which could have important implications for the role of senescent cells in wound healing. Although senescence plays an important role in physiological wound healing, it has also been implicated in the progression of fibrotic and chronic wound disorders. Given the number of factors at play, a full understanding requires addressing the multiple levels of complexity pertaining to the various cell behaviours individually, followed by investigating the interactions and influence these elements have on each other and on the system as a whole. Here, a systems biology approach was adopted whereby a multi-scale model of wound healing that includes the dynamics of senescent cell behaviour and corresponding SASP composition within the wound microenvironment was developed. The model was built using the software CompuCell3D, which is based on a Cellular Potts modelling framework. We used an existing body of data on healthy wound healing to calibrate the model, and validation was done on known disease conditions. The model provides understanding of the spatiotemporal dynamics of different senescent cell phenotypes and the roles they play within the wound healing process.
The model also shows how an overall disruption of tissue-level coordination due to age-related changes results in different disease states including fibrosis and chronic wounds. Further specific data to increase model confidence could be used to explore senolytic treatments in wound disorders.

Impact of Amyloid-Beta Induced Disruption on Ca2+ Homeostasis in a Simple model of Neuronal Activity

ABSTRACT. Alzheimer’s Disease is characterized by a dysregulation of Ca2+ homeostasis. This dysregulation of a main signalling pathway is partly provoked by the gradual and massive deposition of Amyloid-Beta (Aβ) peptides around Ca2+ channels and transporters. Since Ca2+ is also involved in neuronal activity, Aβ peptides are expected to disrupt electrical excitability. To assess this Aβ-induced disruption, we resorted to mathematical modelling, which allows systematic investigation of the effect of changing the activity of each reported target of Aβ on Ca2+ dynamics and neuronal excitability. The model is based on previous computational analyses that have been validated experimentally. Methods included bifurcation theory, non-linear mathematical analysis and the use of the XPP AUTO software. We explored the impact of Ca2+ dysregulation coming from the formation of Aβ pores in the plasma membrane and of changes in L-type channel activity and plasma membrane Ca2+ ATPases. We also focussed on the importance of the Ca2+ sensitive K+ channels. The model focusses on the behaviour of a single spherical cell. Spatial heterogeneities in Ca2+ concentrations are considered by dividing the cell into three compartments: a fictitious sub-plasmalemmal compartment, the cytoplasm and the endoplasmic reticulum. This simple model revealed that Ca2+ in the subplasmalemmal compartment is influenced by the exchanges with the extra-cellular medium but is practically independent of the Ca2+ exchanges with the endoplasmic reticulum because of the high Ca2+ buffering capacity of the cytoplasm. In contrast, the model predicts that by affecting L-type Ca2+ channel activity, Aβ peptides promote neuronal hyperexcitability. When acting on the plasma-membrane Ca2+-ATPases, Aβ peptides promote hypo-excitability. Lastly, when Aβ peptides create pores in the plasma membrane, they can promote hypo- or hyper-excitability depending on conditions. These computational results were compared to previous experimental studies.
As such, the model provides a simple and helpful tool to understand complex interactions between the Aβ-induced disruption of Ca2+ homeostasis and electrical activity. It provides molecular explanations for the counter-intuitive observation that accumulation of Aβ is accompanied by both an increase in intracellular Ca2+ and an increase in neuronal excitability.

Improving Our Understanding of ADPKD: Computational Analysis of Metabolome Data

ABSTRACT. Autosomal Dominant Polycystic Kidney Disease (ADPKD) is one of the most common genetic disorders leading to kidney failure, with a high incidence of 1:1000. ADPKD is characterized by the bilateral development of hundreds of fluid-filled cysts, leading to gradual kidney enlargement and functional decline in the kidneys. The disruptions caused by ADPKD are not limited to the kidney but also have external manifestations, including cardiovascular abnormalities and extra-renal cyst development predominantly in the liver and pancreas. Accurately measuring kidney function, or the estimated Glomerular Filtration Rate (eGFR), is required to provide suitable medication and slow down the progression of the disease. CKD-EPI is one of the formulas most widely accepted and used by clinicians. It uses serum creatinine level, age, gender, and race to calculate eGFR. However, this formula still has accuracy problems for eGFR values above 90 mL/min/1.73 m², causing incorrect stage assignments for patients. Furthermore, the symptoms become recognizable only in the late stages, when almost half of the filtering units in the kidney are irreversibly damaged. Therefore, early and accurate diagnosis is vitally important for ADPKD patients. To unravel the molecular mechanisms underlying ADPKD and its progression, we developed predictive models with selected biomarkers of ADPKD to estimate GFR with different modeling techniques. There were 450 features in the integrated metabolome and clinical data. LASSO, followed by stability selection, yielded four features: the serum metabolite creatinine, VLDL size, serum urea, and lipase. We used a heat map to investigate how well these variables separate patients by stage. Adding age and gender as features to the four previously selected variables, we generated eight models by employing linear regression (LR) and random forest (RF). Their prediction capacities were compared to the CKD-EPI formula.
The highest explained variance among the generated models was 96.5%. This LR model contained five features and an interaction term. The predicted eGFR values by this LR model also provide the closest estimates to the CKD-EPI formula. Furthermore, the predicted eGFR values were used to identify possible targets related to ADPKD onset and progression by comparison of different stages with LIMMA. Currently, we are focusing on developing a molecular network of ADPKD with the zero-sum approach. This network will provide an overview of how metabolites are affecting each other and aid in identifying possible drug targets to intervene in the progression. After obtaining the longitudinal molecular and clinical data, we will integrate them with the same workflow, and we believe that we will be able to predict future eGFR decline and when the patients reach renal failure.

Molecular principles governing substrate specificity of human SUMO E3 Ligases

ABSTRACT. The conjugation of SUMO molecules to specific target proteins orchestrates several pathways essential for cellular homeostasis, including protein degradation and chromatin remodeling. This post-translational modification is the result of a tightly controlled biochemical cascade. Deregulation of SUMO conjugation is often associated with several disease states. A small set of diverse SUMO E3 ligases have evolved in higher organisms to provide specificity to these cellular pathways. To decipher how these protein complexes are organized within specific cellular compartments, recognize their unique substrates, and function in various cellular pathways, we develop a data-integration approach to effectively combine information from disparate datasets. By using publicly available omics datasets in combination with state-of-the-art bioinformatics methods, we develop a novel framework which incorporates regulatory effects, network properties and biophysics of SUMOylation to identify SUMO E3 ligases and their specific substrate molecules. This framework allows us to decipher key molecular signatures driving enzyme-substrate specificity and predicts, with a high degree of confidence, E3 ligase-substrate pairs that were previously uncharacterized within the SUMO system.

A mechanistic model for endocrine profiles of female puberty maturation

ABSTRACT. The hypothalamus-pituitary-gonadal axis (HPG axis) has a central role in female reproduction and is deactivated during childhood. During pubertal development, the reactivation of the HPG axis causes characteristic physical and psychological changes. Pubertal development in girls is clinically monitored by breast development. A trend of earlier breast development in girls since 1977 has been reported, which creates interest in looking into new approaches to monitor pubertal development. Using a mechanistic model, we aim to describe the dynamics of three reproductive hormones (LH, FSH, and E2) over the time course of female puberty. In the first step, we calibrate a population-average model using cross-sectional data. In a second step, we translate the population-average model into a patient-specific form. To achieve that, we estimate patient-specific model parameters using a Bayesian approach. The Bayesian estimator uses prior knowledge about the model parameters gained from the population-average model. With this project, we assess whether the information in large cross-sectional data samples can be exploited to inform individual patient-specific parameter estimates on sparse longitudinal data sets. Such a patient-specific model forecasts hormone dynamics over the time of pubertal development and could offer an alternative way of monitoring.
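How a population-level prior can inform a patient-specific estimate from sparse data can be sketched with a conjugate normal update for a single parameter. This is a strongly simplified illustration, not the mechanistic model or estimator of the abstract: it assumes the parameter is observed directly with Gaussian noise of known variance, and the function name `posterior_normal` and all numbers are hypothetical.

```python
def posterior_normal(prior_mean, prior_var, observations, noise_var):
    """Conjugate normal update for one parameter with known noise variance."""
    if prior_var <= 0 or noise_var <= 0:
        raise ValueError("variances must be positive")
    n = len(observations)
    # Precisions (inverse variances) of prior and data add up.
    precision = 1.0 / prior_var + n / noise_var
    mean = (prior_mean / prior_var + sum(observations) / noise_var) / precision
    return mean, 1.0 / precision


# Hypothetical numbers: a prior from the population-average fit and
# two sparse measurements from one patient.
post_mean, post_var = posterior_normal(2.0, 1.0, [3.0, 3.4], 0.5)
prior_only = posterior_normal(2.0, 1.0, [], 0.5)
print(post_mean, post_var, prior_only)
```

The posterior mean lies between the population prior and the patient's own measurements, and with no patient data the estimate falls back to the prior.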

Advanced Modelling of Multicellular Systems in Morpheus
PRESENTER: Jörn Starruß

ABSTRACT. Computational modeling is increasingly important to analyze tissue dynamics during development and disease progression. Thus, a growing community of (computational) biologists is seeking solutions to construct and simulate multicellular models. A number of software tools have been designed in order to alleviate the computational challenges, but require scientists to encode their models in an imperative programming language. Morpheus [1], however, was established as the first extensible open-source software framework featuring declarative multicellular modelling (MorpheusML) and is thus applicable by a broad community, including experimentalists and trainees.

We present how MorpheusML [2] and our open-source framework [3] allow for advanced scientific workflows that meet today's requirements of educational use, interdisciplinary research groups, and the data-driven modelling approaches of expert users.

In brief, MorpheusML provides a bio-mathematical language describing fundamental cell behaviors, intra- and extra-cellular processes, and supplies data handling. Symbolic identifiers in mathematical expressions describe the dynamics of and coupling between the various model components. It can represent the spatial aspects of interacting motile cells as well as regulatory systems on tissue, cell and subcellular cell membrane level.

Following the rules of separation of model and implementation, our user-friendly GUI can be used to map multicellular models to MorpheusML, to embed an SBML model to account for known regulatory paths and to apply the model on experimental scenarios, e.g. derived from microscopy imaging. A numerical simulation is then composed by scheduling predefined components in the simulator.

Morpheus' parameter estimation and model selection is embedded into the meta-framework FitMultiCell [4], composing Morpheus and pyABC into a high-throughput and high-content data tool that is crucial for the understanding of multicellular processes, the prediction of perturbation experiments and the comparison of competing hypotheses.

References
[1] Morpheus: a user-friendly modeling environment for multiscale and multicellular systems biology, Bioinformatics, 30, 2014.
[2] Model repository: https://morpheus.gitlab.io/model/published-models
[3] Homepage: https://morpheus.gitlab.io
[4] FitMultiCell: https://fitmulticell.gitlab.io

Resistant starch decreases intrahepatic triglycerides in NAFLD patients via the gut microbiome contribution to the branched-chain amino acids pool

ABSTRACT. Non-alcoholic fatty liver disease (NAFLD) is a hepatic manifestation of metabolic dysfunctions for which effective interventions are lacking. To investigate the effects of resistant starch (RS) as a microbiota-directed dietary supplement for NAFLD treatment, we conducted a randomized placebo-controlled clinical trial of 4 months in individuals with NAFLD, coupled with metagenomics and metabolomics analysis. Compared to the control, the RS intervention resulted in a 40.32% relative decrease of the intrahepatic triglyceride content (IHTC). Serum branched chain amino acids (BCAAs) and gut microbial species, in particular Bacteroides stercoris, significantly correlated with IHTC and liver enzymes, and were reduced by RS in comparison to the control. Multi-omics integrative analyses revealed the interplay among gut microbiota changes, BCAA availability, and hepatic steatosis, which was supported by both in vivo and in vitro models. Thus, dietary supplementation with RS might be a strategy for managing NAFLD by altering gut microbiota composition and functionality.

Data-driven mathematical modeling of human skin aging and its application for natural compound screening
PRESENTER: Masatoshi Haga

ABSTRACT. Internal and external environmental factors cause skin aging, and the accompanying functional decline weakens the barrier function, leading to skin dysfunction. Therefore, to maintain skin homeostasis, it is desirable to reduce the rate of skin aging. In this study, we identified genes whose expression was altered both in vivo and in vitro over time by examining two independent time-course public RNA-seq datasets: in vivo data of primary human skin fibroblasts obtained from a wide range of ages, representing external factors (Fleischer et al., Genome Biol. (2018)), and in vitro data from human foreskin fibroblasts (HFF-1) cultured for long-term cell passage, representing internal factors (Marthandan et al., PLoS One. (2016)). Pathway analysis of the genes identified TGFβ signaling as a skin aging-related pathway. Thrombospondin-1 (THBS1) and Fibromodulin (FMOD) were selected as key regulators that show a high correlation between the age of the in vivo samples and the population doubling level of the in vitro samples. Validation revealed that THBS1 increases and FMOD decreases with aging in human dermal tissue and in HFF-1 cells with induced cellular senescence. We also found that THBS1 induces SA-β-gal, a senescence marker, while FMOD suppresses the induced SA-β-gal level. Thus, we identified a novel regulatory network of skin aging controlled by THBS1 and FMOD.

Based on the experimental findings, we developed an ordinary differential equation model that reproduces the dynamics of the gene expression to comprehensively and quantitatively understand the global regulatory mechanism associated with skin aging. Data fitting of temporal changes in HFF-1 protein expression with and without TGFβ1 stimulation showed that TGFβ1, which increases with skin aging, sustainably activates the SMADs complex, increasing THBS1. FMOD, conversely, was found to be suppressed via persistently activated Akt induced by TGFβ1. The model showed that THBS1 expression is sensitive to changes in TGFβ1 while FMOD expression is associated with a robust regulatory mechanism. These results suggest that THBS1 is a promising drug target for skin aging, and the sensitivity analysis and siRNA-based in vitro validation suggest that SMAD4 is the target for THBS1 regulation.

A natural compound library screening found that retinoic acid (RA) is effective in suppressing THBS1 in HFF-1. We have further found that the RA signaling pathway is involved in the transcriptional repression of THBS1 in the nucleus. This research has clarified one of the mechanisms of action of RA, which is used as an anti-wrinkle active ingredient.

Information Processing by Homo-Oligomeric Proteins: From First Principles to Cardiac Disease

ABSTRACT. Reversible protein homo-oligomerisation, i.e. the formation of larger protein complexes out of identical subunits, is observed for 30-50% of all vertebrate proteins. Despite being a ubiquitous phenomenon, the specific function of protein homo-oligomerisation remains poorly understood. I previously demonstrated theoretically that homo-oligomerisation could be a versatile mechanism for a range of signal processing capabilities such as dynamic signal encoding, homeostasis and bistability via pseudo-multisite modification. In this talk I will present the first dynamical systems model of phospholamban (PLN), a crucial mediator protein of the physiological "fight-or-flight" response triggered by β-adrenergic signaling and a key regulator of calcium cycling in heart muscle cells. Importantly, PLN forms homo-pentamers whose function remained elusive for decades. Simulations and model analyses demonstrate that pentamers enable bistable phosphorylation and further constitute substrate competition based low-pass filters for phosphorylation of monomeric PLN. Both predictions of the model were confirmed experimentally by demonstrating substrate competition in vitro and by demonstrating hysteresis of pentamer phosphorylation in cardiomyocytes. These non-linear phenomena may ensure consistent monomer phosphorylation and calcium cycling despite noisy signaling activity in the upstream network and may be impaired by perturbations (e.g. via genetic mutations or in the context of underlying heart disease) which cause cardiac arrhythmias. These studies show that homo-oligomerisation can play unanticipated and potentially disease relevant roles in biochemical signaling networks.

Modeling the adaptive immune response of a lymph node as Petri net
PRESENTER: Sonja Scharf

ABSTRACT. Background The lymph node is responsible for important tasks of the adaptive immune response. The lymph node consists of different compartments, such as B zone and T zone. The compartments include various cell types, for example macrophages, dendritic cells, antigen-presenting cells, B and T cells. The cells are motile and communicate with each other. Whereas certain cell types, e.g., B cells can move from compartment to compartment, the movement of other cells is restricted to specific regions, for example follicular dendritic cells are located in the germinal centers. Cellular interactions trigger differentiations or movement of cells to another region. Based on the current knowledge, we created a model to better understand and predict immunological reactions.

Methods For model development and analysis, we applied the Petri net (PN) formalism, using the software tool MonaLisa. We analyzed the invariants of the model for network verification. The place invariants of the PN model validated the conservation of cells, and we applied transition invariants to explore the network dynamics. In our PN model, the production of B cells, the influx of antigen, the interaction of cells with antigens, differentiation of B cells and proliferation of differentiated B cells, release of antibodies and degradation of antigens represented the immune response. We simulated the adaptive immune response in an asynchronous way to consider a non-deterministic behavior.

Results The PN model describes movement and interaction of different cell types in and between compartments of the lymph node such as the subcapsular sinus, T zone, germinal center, and medulla. We included the interactions of the lymph node with the human body by modeling the two compartments blood and tissue without internal structure. The PN comprises 65 transitions (interactions and movement processes) and 49 places (cells, antigens, and antibodies). Four place invariants reflect the conservation of T cells, macrophages, antigen-presenting cells, and dendritic cells. The PN is covered by 25 transition invariants, each of which describes immunological reactions and movement of the cells through the lymph node. To analyze the PN, we started the simulation with an influx of antigens. When we stopped the antigen influx, antibody production was still active. We observed a much faster immune response for a second influx of antigens. The PN model was able to describe the dynamic behavior of cells inside the lymph node during an immune response in a semi-quantitative way. The model could adapt to known antigens. With the model, we can therefore predict that new antigens will lead to the production of specific antibodies and memory B cells. Our model demonstrates a functional adaptive immune response.
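The invariant checks used for network verification can be illustrated on a toy Petri net. The sketch below is not the lymph node model and does not use the MonaLisa tool: the three places, three transitions, and the incidence matrix are assumptions made up for illustration. A place invariant is a weight vector y with yᵀC = 0 (a conserved weighted token sum), and a transition invariant is a firing-count vector x with Cx = 0 (a firing sequence that restores the marking).

```python
# Toy net (assumed): rows are places, columns are transitions.
# Places:      T_rest, T_act, antigen
# Transitions: activate (T_rest + antigen -> T_act),
#              deactivate (T_act -> T_rest),
#              influx (-> antigen)
INCIDENCE = [
    [-1, +1, 0],  # T_rest
    [+1, -1, 0],  # T_act
    [-1, 0, +1],  # antigen
]


def is_place_invariant(incidence, weights):
    """True if the weighted token sum is unchanged by every transition (y^T C = 0)."""
    n_transitions = len(incidence[0])
    return all(
        sum(weights[p] * incidence[p][t] for p in range(len(incidence))) == 0
        for t in range(n_transitions)
    )


def is_transition_invariant(incidence, counts):
    """True if firing each transition counts[t] times restores the marking (C x = 0)."""
    return all(
        sum(row[t] * counts[t] for t in range(len(counts))) == 0
        for row in incidence
    )


t_cells_conserved = is_place_invariant(INCIDENCE, [1, 1, 0])
all_tokens_conserved = is_place_invariant(INCIDENCE, [1, 1, 1])
cycle_restores_marking = is_transition_invariant(INCIDENCE, [1, 1, 1])
print(t_cells_conserved, all_tokens_conserved, cycle_restores_marking)
```

In this toy net the total number of T cells is conserved, whereas antigen is not, because it enters through the influx transition and is consumed on activation.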

Resolving Resolution Hindrances in Spatial Transcriptomics
PRESENTER: Connor King

ABSTRACT. Spatial transcriptomics is a novel methodology that allows for the analysis of gene expression in the context of tissue architecture. In the wake of this new technique, many methods have been developed to understand the complex information contained in these datasets. These methods use dimensionality reduction and clustering to dissect the datasets and obtain meaningful information from them. Often, this is achieved by classifying tissue locations into regions of expression that reflect common functionality. In other cases, genes are classified into groups of spatially correlated genes to infer spatial regulation of cellular processes. However, many mainstream approaches to spatial transcriptomics are limited by the resolution of the capture areas, which restricts their applicability to disorganized tissues. Here, we propose a method for analyzing spatial transcriptomics data that uses both of these emerging approaches to look at spatial transcriptomics datasets from different perspectives, while also informing the number of clusters present in each analysis using the MultiK algorithm. This systematic approach to identifying clusters allowed us to extend these methodologies to the analysis of disorganized breast cancer tissue and to extract information that had been overlooked in other analyses.

Digital twins and hybrid modelling for simulation of physiological variables and stroke risk
PRESENTER: Tilda Herrgårdh

ABSTRACT. Stroke is one of the most common causes of death in our society. The underlying aetiology leading to a stroke event is complex and develops over several years, often without symptoms. Predicting a stroke is therefore as desirable as it is difficult. The disease mechanisms act on different levels, comprising many physiological and environmental factors, and involving multiple organs, timescales, and control mechanisms. Therefore, a multiscale and multilevel approach is needed to fully understand and predict disease progression. One such approach is digital twins. A digital twin is a personalized computer model of a patient. So far, digital twins have been constructed using either mechanistic models, which can simulate the trajectory of physiological and biochemical processes in a person, or machine learning models, which can, for example, be used to estimate the risk of having a stroke given a cross-sectional profile at a given timepoint. These two modelling approaches have complementary strengths, which can be combined in a hybrid model. However, even though hybrid modelling combining mechanistic modelling and machine learning has been proposed, there are few, if any, real examples of hybrid digital twins available. We now present such a hybrid model for the simulation of ischemic stroke. On the mechanistic side, we combine a model for blood pressure with a multi-level (intracellular biochemistry to whole-body) and multi-timescale (seconds to years) model for the development of type 2 diabetes. This mechanistic model can simulate the evolution of known physiological risk factors (such as weight, diabetes, and blood pressure) through time, and under different intervention scenarios (change in diet, exercise, and certain medications). These forecast trajectories of the physiological risk factors are then used by a machine learning model to calculate the 5-year risk of stroke. The stroke risk can also be calculated for each timepoint in the simulated scenarios. The hybrid model is now ready to be tested in clinical use, in the preventative health care meetings in Sweden, where we hope to increase doctor-patient communication, facilitate shared decision-making, and improve adherence to prescribed medications.

Dynamical basis of cellular sensing and responsiveness to spatial-temporal signals

ABSTRACT. Under physiological conditions, cells continuously sense and migrate in response to local gradient cues which are irregular, conflicting, and changing over time and space. This suggests that cells exhibit seemingly opposed characteristics, such as robust maintenance of a polarized state for longer than the signal duration while remaining adaptive to novel signals. However, the dynamical mechanism that enables such sensing capabilities is still unclear. Here we propose a generic dynamical mechanism based on the critical positioning of the receptor signaling network in the vicinity of the saddle-node of a subcritical pitchfork bifurcation (SubPB mechanism). The dynamical "ghost" that emerges at the critical organization gives transient memory in the polarized response, as well as the ability to continuously adapt to changes in signal localization. Using weakly nonlinear analysis, we provide an analytical description of the necessary conditions for the existence of this mechanism in a general receptor network. Using a physical model that couples signaling to morphology, we demonstrate how this mechanism enables cells to navigate in changing environments. By comparison with three classes of existing mathematical models of polarization that operate on the principle of stable attractors (Wave-pinning, Turing, and LEGI models), we show that the metastability arising from the "ghost" in the SubPB mechanism uniquely enables sensing of dynamic spatial-temporal signals in a history-dependent manner.

Identification of causal genes at GWAS loci with pleiotropic gene regulatory effects using instrumental variable sets
PRESENTER: Mariyam Khan

ABSTRACT. Genome-wide association studies (GWAS) have shown that the genetic architecture of human health and disease traits is highly complex, with most traits being affected by a large number of small-effect genetic variants spread across the entire genome. At the molecular level, genetic variants affect surrounding epigenetic states, leading to altered transcription of nearby genes by cis-acting mechanisms, which then causes downstream trans effects on gene expression and clinical phenotypes via gene regulatory networks.

We are interested in inferring trans-acting causal relations between gene expression traits at disease risk loci identified by GWAS and clinical phenotypes using Mendelian Randomization (MR). In traditional MR, a variant with a local gene regulatory effect (a cis-expression quantitative trait locus or cis-eQTL) acts as a randomized "instrument" for the expression of the gene, like the random assignment of individuals to treatment groups in randomized controlled trials, such that the statistical associations between the variant, the gene and the phenotype can be used to estimate the causal effect of the gene on the phenotype.

However, in human data it has been found that up to 57% of genetic variants with local gene regulatory effects are linked to expression of multiple nearby genes (regulatory pleiotropy). Here, traditional MR cannot be applied, and identification of causal genes and their relative causal effects at GWAS loci with pleiotropic regulatory effects is an open question in the field.

We have used Wright’s method of causal path coefficients to prove mathematically that if a regulatory site is shared by ‘d’ cis-eGenes, and if ‘d’ genetic variants can be found in the shared-site locus, each associated with at least one of the cis-eGenes, not in perfect linkage disequilibrium with each other, then these variants form a generalized instrumental variable set and allow identification of the relative contributions of each cis-eGene to the phenotype, irrespective of any hidden confounding among the cis-eGenes and the phenotype.

As a proof of principle, we identified candidate causal genes at GWAS loci for coronary artery disease risk with pleiotropic gene regulatory effects.
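The identification argument above can be illustrated with a small simulation. The sketch below is not the authors' derivation or software; it assumes a simplified linear model with d = 2 cis-eGenes, two independent standardized variants, and one hidden confounder, with all effect sizes chosen arbitrarily for illustration. Under these assumptions Cov(Z, Y) = Cov(Z, X) b, so the causal effects b follow from a 2x2 linear system even though the confounder is never observed.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)

def simulate(n=20000, seed=1):
    """Hypothetical data: variants z1, z2; gene expression x1, x2; phenotype y."""
    rng = random.Random(seed)
    z1, z2, x1, x2, y = [], [], [], [], []
    for _ in range(n):
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)  # variants (assumed independent)
        u = rng.gauss(0, 1)                      # hidden confounder
        g1 = 1.0 * a + 0.3 * b + u + rng.gauss(0, 1)  # both variants affect both genes
        g2 = 0.2 * a + 0.8 * b + u + rng.gauss(0, 1)
        pheno = 0.5 * g1 - 0.3 * g2 + 2.0 * u + rng.gauss(0, 1)  # true effects 0.5, -0.3
        z1.append(a); z2.append(b); x1.append(g1); x2.append(g2); y.append(pheno)
    return z1, z2, x1, x2, y

def iv_set_estimate(z1, z2, x1, x2, y):
    """Solve Cov(Z, X) b = Cov(Z, Y) for two genes by Cramer's rule."""
    a11, a12 = cov(z1, x1), cov(z1, x2)
    a21, a22 = cov(z2, x1), cov(z2, x2)
    c1, c2 = cov(z1, y), cov(z2, y)
    det = a11 * a22 - a12 * a21  # nonzero if the variants are not redundant
    return (c1 * a22 - a12 * c2) / det, (a11 * c2 - a21 * c1) / det

b1, b2 = iv_set_estimate(*simulate())
print(round(b1, 2), round(b2, 2))
```

The determinant condition in the sketch plays the role of the requirement that the variants are not in perfect linkage disequilibrium: if they were, the rows of Cov(Z, X) would be proportional and the system could not be solved.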

A model of RNA repair to study antibiotic tolerance
PRESENTER: Hollie Hindley

ABSTRACT. Antibiotic tolerance, the mechanism of bacteria transiently surviving antibiotic treatment, is emerging as a precursor to the development of full antibiotic resistance. An RNA repair system, the Rtc system, has recently been shown to promote antibiotic tolerance upon exposure to ribosome-targeting antibiotics. The role of this system in the absence of antibiotics is largely unknown, and the mechanisms by which tolerance is obtained are even less well understood. In this work, we develop and analyse the first mathematical model of the Rtc system for RNA repair, to investigate the mechanistic action of Rtc leading to antibiotic tolerance in bacteria.

The Rtc system is an RNA repair system found in all domains of life. Recent work has highlighted the role of Rtc in maintaining RNA components of the translational apparatus, allowing bacteria to counteract the translation-inhibiting effects of antibiotics, as well as roles in chemotaxis and motility processes. The system consists of an RNA cyclase, RtcA, and an RNA ligase, RtcB, which together perform an end-healing and -sealing function for RNA ends and are both regulated by RtcR.

The expression of RtcA and RtcB is tightly regulated by a σ54-factor that requires an activator protein, RtcR. Under normal conditions, RtcR exhibits negative autoregulation and requires cooperative activation by a ligand. Once active, RtcR interacts with the σ54-RNA polymerase (RNAP) holoenzyme and, using its ATPase activity, converts RNAP from the closed complex to the open complex, where transcription of RtcA and RtcB can begin.

Building a mathematical model of the Rtc system, we investigate the potential of ribosome maintenance in rescuing growth upon antibiotic exposure. We model expression of the three Rtc genes and their action on ribosomes. We further model ribosomes as three separate species: healthy and damaged ribosomes, and 'healed' ribosomes that have been tagged by RtcA for 'sealing' by RtcB. Tagged ribosomes act as ligands to RtcR, and so a positive feedback loop is created, with tagged ribosomes leading to expression of RtcA and RtcB.

Preliminary analysis indicates a high sensitivity of RtcAB expression to ATP availability. The system further displays potential for bistability, which may explain the heterogeneity observed in the expression of Rtc and in tolerance levels across isogenic cells.

Integrating heterogeneous data on Rtc expression, growth rate and ribosome efficiency, we work to embed the Rtc model within a mechanistic cell model of bacterial growth. This will allow us to study Rtc under various growth conditions and to analyse how antibiotics affect its expression and, in turn, how Rtc affects bacterial growth.

Insights on hemodynamic changes in hypertension and T2D through non-invasive cardiovascular modeling
PRESENTER: Kajsa Tunedal

ABSTRACT. One third of all persons worldwide have high blood pressure (hypertension), and it is twice as common in patients with type 2 diabetes (T2D). Uncontrolled hypertension is a risk factor for diseases such as stroke, heart failure, and renal failure. The connection between these diseases, T2D and hypertension can be understood through the hemodynamic mechanisms that describe the complex changes in the regulation of blood flow and blood pressure. Detailed hemodynamic data can be acquired with non-invasive measurements such as 3D magnetic resonance imaging of blood flow over time called four-dimensional magnetic resonance imaging (4D Flow MRI). However, 4D Flow MRI cannot directly measure hemodynamic parameters such as stiffness and blood pressure in the heart and aorta. To acquire these parameters together with other information that otherwise is hard to measure non-invasively, we herein combine a cardiovascular model with 4D flow data. The aim is to investigate hemodynamic differences between controls, T2D patients, hypertensive patients, and patients with both T2D and hypertension, to further elucidate the mechanisms of hypertension and T2D.

For 80 subjects from the SCAPIS Linköping cohort in Sweden, we used patient-specific data from MRI and cuff pressure to create personalized models of the individual hemodynamics. The 80 personalized models were used to group and compare hemodynamic parameters between controls, patients with T2D, patients with hypertension, and patients with both T2D and hypertension. Preliminary results show statistically significant hemodynamic changes in hypertensive and T2D patients compared to controls, as well as differences between hypertensive and T2D patients. Additionally, a large patient-to-patient variation can be seen within each group, showing the importance of a patient-specific approach like our personalized models in the treatment of these patients. These new insights and personalized models could, together with further studies, aid in the treatment planning of patients with both diabetes and hypertension.

COVRECON: Combining Genome-scale Metabolic Network Reconstruction and Data-driven Inverse Modeling to Reveal Causal Biochemical Regulations
PRESENTER: Jiahang Li

ABSTRACT. One central goal of systems biology is to infer biochemical regulations from large-scale OMICS data. Regulatory interactions can be represented in a Jacobian matrix, which can be inferred from metabolomics data by solving the Lyapunov equation JC+CJ^T=-2D (1-4). However, prior algorithms for this inference are limited by two issues: they rely on structural network information that needs to be assembled manually, and they are numerically unstable due to ill-conditioned regression problems, which makes them inadequate for dealing with large-scale metabolic networks. In this work, we present a novel regression-loss based inverse Jacobian algorithm and the related workflow COVRECON. It consists of two parts: (a) Sim-Network and (b) inverse differential Jacobian evaluation. Sim-Network automatically generates an organism-specific enzyme and reaction dataset from the BiGG and KEGG databases, which is then used to reconstruct the Jacobian's structure for a specific metabolomics dataset. Instead of directly solving a regression problem, the new inverse differential Jacobian part is based on a more robust approach and ranks the biochemical interactions according to their relevance from large-scale metabolomics data. This approach is illustrated by in silico stochastic analysis with different-sized metabolic networks from the BioModels database. The advantages of COVRECON are that 1) it automatically reconstructs a data-driven conceptualized metabolic model; 2) more general network structures can be considered; 3) the new inverse algorithms improve stability, decrease computation time, and extend to large-scale models.

1. Steuer R et al. (2003) Bioinformatics 19(8):1019-26.
2. Sun X & Weckwerth W (2012) Metabolomics 8(1):81-93.
3. Weckwerth W (2019) Frontiers in Applied Mathematics and Statistics 5:29.
4. Weckwerth W (2011) Analytical and Bioanalytical Chemistry 400(7):1967-1978.
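The Lyapunov relation JC+CJ^T=-2D quoted in the abstract above can be made concrete with a small forward calculation. The sketch below is not the COVRECON algorithm: it only solves the equation for the covariance matrix C given a hypothetical 2x2 Jacobian J and fluctuation matrix D, by writing the entries of C as unknowns of a linear system. The inverse problem addressed in the abstract (recovering J from a measured C) is harder because the equation alone leaves J underdetermined, which is where structural network information enters.

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def covariance_from_jacobian(J, D):
    """Solve J C + C J^T = -2 D for C, with C flattened row-wise."""
    n = len(J)
    A = [[0.0] * (n * n) for _ in range(n * n)]
    rhs = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            row = i * n + j
            rhs[row] = -2.0 * D[i][j]
            for k in range(n):
                A[row][k * n + j] += J[i][k]  # (J C)_ij = sum_k J_ik C_kj
                A[row][i * n + k] += J[j][k]  # (C J^T)_ij = sum_k C_ik J_jk
    c = solve_linear(A, rhs)
    return [c[i * n:(i + 1) * n] for i in range(n)]

def residual(J, C, D):
    """Largest absolute entry of J C + C J^T + 2 D (zero for an exact solution)."""
    n = len(J)
    return max(abs(sum(J[i][k] * C[k][j] + C[i][k] * J[j][k] for k in range(n))
                   + 2.0 * D[i][j])
               for i in range(n) for j in range(n))

J = [[-1.0, 0.5], [0.0, -2.0]]  # hypothetical stable Jacobian
D = [[1.0, 0.0], [0.0, 1.0]]    # hypothetical fluctuation matrix
C = covariance_from_jacobian(J, D)
print(C, residual(J, C, D))
```

The linear system is solvable here because J is stable (all eigenvalues have negative real part), which is the usual setting for a steady-state covariance.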

Lipid metabolic reprogramming extends beyond histological tumor demarcations in human operable pancreatic cancer
PRESENTER: Abel Szkalisity

ABSTRACT. Pancreatic ductal adenocarcinoma (PDAC, here pancreatic cancer) is one of the deadliest diseases, with a dismal expected survival time. As the mutations driving this malignancy are found in other cancers with better prognostic prospects, an increasing number of studies focus on the pancreatic tumor microenvironment to identify additional contributing factors. For instance, PDAC is characterized by a prominent fibrotic stroma infiltrating the neoplastic areas, and this desmoplastic reaction might either promote or limit tumor progression.

Inspired by studies in mouse models that argued for the importance of lipid metabolic interplay between the tumor and the stroma, we performed a systematic characterization of the proteome of four tissue compartments from human operable pancreatic cancers. Using laser-capture microdissection, we isolated neoplastic lesions (neoplastic parenchyma, NP), tumor-adjacent, histologically benign exocrine pancreas (adjacent parenchyma, AP), and stromal tissue surrounding both (neoplastic stroma and adjacent stroma) from the diagnostic specimens of 14 treatment-naïve patients operated on at Helsinki University Hospital. LC-MS/MS proteomics quantified 6979 unique proteins, including lipid metabolic enzymes with low abundance.

We found a loss of normal pancreatic secretory functions in the NP compared to AP, as expected, and abundant apolipoproteins in the stromal areas, reflecting vascular supply. On the other hand, lipid metabolism was more active in the parenchymal regions than in the corresponding stromas, with cholesterol biosynthetic enzymes being most pronounced in the NP.

Despite the small cohort, we investigated the prognostic relevance of the proteins in each microdissected tissue compartment by dividing the 14 patients at their median (2-year) survival time. To our great surprise, the tumor-adjacent (AP) regions harbored more proteins significant for survival than any other compartment. The presence of prognostically relevant proteomic variation in the AP suggested that the histologically benign exocrine pancreas adjacent to the tumor is potentially different from truly healthy pancreas. To test this hypothesis, we reprocessed 12 pancreatic samples from healthy individuals and integrated them with our data. We concluded that the tumor-adjacent exocrine regions differed from healthy pancreas by increased lipid metabolism and transport activity, and that this difference was prognostically relevant for the survival of PDAC patients.

We verified our findings with immunohistochemical staining of select proteins and by analyzing the proteomes of 51 additional patients from published datasets. Our study underscores the role of altered lipid metabolism in PDAC progression and shows that investigation of the previously neglected histologically benign tumor microenvironment may provide novel possibilities in the quest for effective treatment of pancreatic cancer.

Non-parametric model-based estimation of the effective reproduction number for SARS-CoV-2
PRESENTER: Jacques Hermes

ABSTRACT. Viral outbreaks, such as the current COVID-19 pandemic, are commonly described by compartmental models formulated as systems of ordinary differential equations (ODEs). The parameter values of these ODE models are typically unknown and need to be estimated from accessible data. In order to describe realistic pandemic scenarios with strongly varying conditions, these model parameters must be assumed to be time-dependent. While parameter estimation for the typical case of time-constant parameters does not pose significant issues, the determination of time-dependent parameters, e.g. the transition rates of compartmental models, remains notoriously difficult, in particular since the function class of these time-dependent parameters is unknown. In this work, we present a novel method which uses the Augmented Kalman Smoother in combination with an Expectation-Maximization algorithm to simultaneously estimate all time-dependent parameters in an SIRD compartmental model. This approach requires only incidence data, with no prior knowledge of model parameters and no further assumptions on the function class of the time-dependencies. In contrast to other approaches for the estimation of the time-dependent reproduction number, no assumptions on the parameterization of the serial interval distribution are required. With this method, we are able to adequately describe COVID-19 data in Germany and to give non-parametric model-based time course estimates for the effective reproduction number. This approach can also be applied in a cell biological context where time-dependent parameters or unknown stimuli need to be estimated.
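To make the link between a time-dependent rate and the effective reproduction number concrete, the following sketch simulates a standard SIRD model forward in time. It is not the estimation method of the abstract (no Kalman smoother or Expectation-Maximization step is involved), and it assumes the textbook SIRD form, a step-shaped transmission rate beta(t), and constant recovery and death rates, all with hypothetical values. Under these assumptions the effective reproduction number is R_eff(t) = beta(t) / (gamma + mu) * S(t) / N.

```python
GAMMA = 0.10   # assumed recovery rate per day
MU = 0.005     # assumed death rate per day

def beta(t):
    """Hypothetical transmission rate: contacts are reduced after day 30."""
    return 0.30 if t < 30 else 0.08

def simulate_sird(days=100, dt=0.1, n=1_000_000.0, i0=100.0):
    """Explicit Euler integration of S, I, R, D; returns (t, s, i, r, d, r_eff) tuples."""
    s, i, r, d = n - i0, i0, 0.0, 0.0
    trajectory = []
    for k in range(int(round(days / dt))):
        t = k * dt
        r_eff = beta(t) / (GAMMA + MU) * s / n
        trajectory.append((t, s, i, r, d, r_eff))
        new_infections = beta(t) * s * i / n
        ds = -new_infections
        di = new_infections - (GAMMA + MU) * i
        dr = GAMMA * i
        dd = MU * i
        s, i, r, d = s + dt * ds, i + dt * di, r + dt * dr, d + dt * dd
    return trajectory

traj = simulate_sird()
print(traj[0][5], traj[-1][5])
```

The estimation problem in the abstract runs in the opposite direction: beta(t) and the other rates are unknown functions, and they are inferred from incidence data rather than prescribed.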

Federated mathematical modelling and machine learning for a multi-national cohort of COVID-19 patients
PRESENTER: Manuel Huth

ABSTRACT. The COVID-19 pandemic has highlighted the need for disease modelling based on mathematical and machine learning techniques in order to understand the impact of the SARS-CoV-2 crisis. ORCHESTRA is an international research project which aims to deliver scientific evidence for analyses by using datasets supplied by cohorts from various European and non-European countries. The combined information provided by these partners allows for more robust evidence-based disease modelling, given the larger sample size and the variation of COVID-19 measures in different regions. However, the sharing of data is a prevalent issue within healthcare in general, as different partners are subject to different data privacy restrictions due to patient concerns and governing legal entities.

To address this problem, we use federated learning, a machine learning technique that trains an algorithm across each cohort's data storage center by exchanging only non-disclosive aggregated information, thereby achieving the same results as pooled analyses while respecting the privacy restrictions of each cohort member. In order to provide data-driven analyses that measure the impact of various factors related to the COVID-19 pandemic, we report on federated algorithms that are central to statistical impact evaluation and well known from the econometrics literature. The implemented methods (supported through the RDataShield package) include the non-parametric Difference-in-Differences with multiple time periods, instrumental variable regression allowing for consistent estimates under endogeneity problems, and propensity score matching methods.

The establishment of this federated analysis framework not only improves the prevention and treatment of COVID-19, but also provides better preparation for future pandemics.
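The statement that federated analysis can reproduce pooled results from aggregated information can be illustrated with the simplest possible case. The sketch below is not the project's implementation and does not use any DataSHIELD API; the function names, the two sites, and the data are hypothetical. Each site returns only five sums, and the coordinator fits a simple linear regression from the combined sums, which gives the same coefficients as a fit on the pooled raw data.

```python
def local_summary(xs, ys):
    """Computed inside each cohort: only aggregated sums leave the site."""
    return {
        "n": len(xs),
        "sx": sum(xs),
        "sy": sum(ys),
        "sxx": sum(x * x for x in xs),
        "sxy": sum(x * y for x, y in zip(xs, ys)),
    }

def fit_from_summaries(summaries):
    """Coordinator: least-squares slope and intercept from summed statistics."""
    n = sum(s["n"] for s in summaries)
    sx = sum(s["sx"] for s in summaries)
    sy = sum(s["sy"] for s in summaries)
    sxx = sum(s["sxx"] for s in summaries)
    sxy = sum(s["sxy"] for s in summaries)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

# Hypothetical data held by two cohorts that cannot share patient-level records.
site_a = ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 8.1])
site_b = ([5.0, 6.0, 7.0], [9.8, 12.1, 14.2])

federated = fit_from_summaries([local_summary(*site_a), local_summary(*site_b)])
pooled = fit_from_summaries([local_summary(site_a[0] + site_b[0], site_a[1] + site_b[1])])
print(federated, pooled)
```

Real deployments add disclosure controls on top of this idea (for example, refusing to return summaries for very small groups); those safeguards are outside the scope of this sketch.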

Integrative analysis on comorbid signatures in psychotic disorders using multi-rank non-negative matrix factorization based graph approach
PRESENTER: Youcheng Zhang

ABSTRACT. The clinical burden of mental illness, in particular schizophrenia and bipolar disorder, is driven by frequent chronic courses and increased mortality, as well as the risk for comorbid conditions such as cardiovascular disease and type II diabetes. Evidence suggests an overlap of molecular pathways between psychotic disorders and somatic comorbidities. Therefore, our task is to disentangle the shared mechanisms underlying mental illnesses and common somatic comorbidities and to provide guidance for personalized therapies and novel intervention strategies.

In this study, we developed a computational framework to perform comorbidity modeling via an improved integrative unsupervised machine learning approach based on multi-rank non-negative matrix factorisation (mrNMF). Using this procedure, we extracted molecular signatures potentially explaining shared comorbid mechanisms. For this, 27 case-control microarray cohorts across multiple tissues were collected, covering three main categories of conditions: psychotic disorders, cardiovascular diseases and type II diabetes. We addressed the limitation of standard NMF in parameter selection (e.g. the undetermined rank) by introducing multi-rank ensembled NMF to identify signatures at various hierarchical levels simultaneously. A reciprocal best hit scoring matrix was computed to integrate signatures across different cohorts and to generate the disease signature graph while controlling for confounding effects. Downstream analysis of the nodes and edges of the graph identified several comorbid pathways and genes, such as CEACAM6 (psychosis and cardiovascular diseases) and GABRA5 (psychosis and diabetes). Reference gene sets were curated to validate the underlying processes of comorbid signatures, such as leukocyte processes in cardiovascular disease comorbid signatures and neuropeptide processes in type II diabetes signatures. Finally, further associations of these comorbid signatures with clinical outcomes in external validation cohorts, including levels of blood high-density lipoprotein and C-reactive protein, were established.
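The reciprocal best hit idea mentioned above can be sketched in a few lines. This is a generic illustration, not the scoring matrix used in the study: the similarity values are invented, and the assumption is that entry sim[i][j] measures how similar signature i from one cohort is to signature j from another. A pair is kept only if each signature is the other's best match, which yields candidate edges for a signature graph.

```python
def reciprocal_best_hits(sim):
    """Return (i, j) pairs where j is the best match of i and i is the best match of j."""
    n_a, n_b = len(sim), len(sim[0])
    best_for_a = [max(range(n_b), key=lambda j: sim[i][j]) for i in range(n_a)]
    best_for_b = [max(range(n_a), key=lambda i: sim[i][j]) for j in range(n_b)]
    return [(i, j) for i, j in enumerate(best_for_a) if best_for_b[j] == i]

# Hypothetical similarities between 3 signatures of cohort A (rows)
# and 3 signatures of cohort B (columns).
similarity = [
    [0.9, 0.1, 0.2],
    [0.8, 0.3, 0.4],
    [0.1, 0.2, 0.7],
]
edges = reciprocal_best_hits(similarity)
print(edges)
```

In this example the second signature of cohort A prefers the first signature of cohort B, but that preference is not returned, so no edge is drawn for it.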

The pharmaco-epigenomic landscape of cancer cell lines reveals the epigenetic component of drug sensitivity

ABSTRACT. Aberrant DNA methylation accompanies genetic alterations during oncogenesis and tumour homeostasis and contributes to the transcriptional deregulation of key signalling pathways in cancer. Despite increasing efforts in DNA methylation profiling of cancer patients, epigenetic biomarkers for predicting treatment efficacy are still lacking. In order to address this, we analysed 721 cancer cell lines across 22 cancer types that were screened across 453 anti-cancer compounds. We systematically detected the predictive component of DNA methylation in the context of transcriptional and mutational patterns. Our results show that DNA methylation in regulatory elements can constitute drug sensitivity biomarkers by mediating the expression of proximal genes, thereby giving additional mechanistic insights and biological signals across multi-omic data modalities. In total, we identified 56 DNA methylation biomarkers for 25 drugs across eight cancer types. Our method reproduced anticipated clinical associations and generated novel hypotheses, e.g. we found that the hypermethylation of the NEK9 promoter conferred sensitivity to the NAE inhibitor pevonedistat in melanoma through the downregulation of NEK9. We envision that epigenomic characterisation will refine existing patient stratification, thus empowering the next generation of precision oncology.

Manually curated genome-scale metabolic reconstructions of the multi-drug resistant Gram-negative pathogens A. baumannii and K. pneumoniae
PRESENTER: Nantia Leonidou

ABSTRACT. With the emergence of multi-drug resistant (MDR) bacteria, a catalog of microorganisms for which new antibiotics are urgently needed was published in 2017 by the World Health Organization (WHO). Within this list, two ESKAPE pathogens were granted the "critical" status: Acinetobacter baumannii and Klebsiella pneumoniae. Over the years, isolates of both organisms resistant to carbapenems have been detected inside health care units. Moreover, each pathogen is associated with multiple antibiotic resistances, posing a global threat for upcoming outbreaks. One way to facilitate a systemic view of bacterial metabolism and allow the formulation of dependable hypotheses based on environmental and genetic alterations is by applying constraint-based modeling and genome-scale metabolic networks.

Here we present two novel genome-scale metabolic models of A. baumannii ATCC 17978 (iACB22LX) and K. pneumoniae HS11286 (iKPMLL). We enriched the models with database cross-references and validated them using multiple existing experimental datasets. Our analysis showed that our reconstructions could reproduce cellular metabolic phenotypes observed during in vitro experiments with high accuracy. iACB22LX achieved over 80% accuracy in the carbon and nitrogen source tests, while iKPMLL correctly predicted growth in all tested conditions. We computationally determined essential genes and examined the existence of human orthologs to assess their suitability as drug targets. In addition, we examined whether the models could predict growth in commonly used media and defined a minimal set of compounds needed for them to grow. We also simulated various growth media depending on where the pathogens are known to grow in the human body. Finally, we created a curated collection of already published reconstructions of distinct strains of the same pathogens and analyzed their growth capabilities.

The presented models are in a standardized and curated format of high quality. The community can easily use them, and they guide the reconstruction of multi-strain networks. Ultimately, they will serve as a knowledge base aiming for reliable predictions regarding various perturbations and the development of effective antimicrobial therapies.

Single cell analysis of colorectal cancers with familial adenomatous polyposis for revealing differential pre-malignant programs
PRESENTER: Najung Lim

ABSTRACT. Identifying and investigating an intermediate state between healthy and tumor tissue is a key step in cancer prevention and anti-cancer treatment. This is also crucial in treating hereditary cancer, where the onset of cancer is certain and the patient is at constant risk. Colorectal cancer (CRC) patients with familial adenomatous polyposis (FAP), who have germline mutations in APC, are extremely vulnerable to CRC as they develop hundreds of polyps even at early ages. In this study, we focused on the pre-cancerous state of CRC patients with FAP to identify the regulatory mechanism by which polyps develop into cancer. By applying pseudo-time analysis to single-cell sequencing data, we reconstructed and traced the progression of tumorigenesis, and identified states that existed between normal and tumor states. As APC-driven polyps and the resulting cancers accumulate abundant CNVs, we hypothesized that monitoring and measuring CNV characteristics may reveal a distinct pre-cancerous state before polyps transform into cancer. To this end, we analyzed single-cell CRC and polyp data using CopyKat, a computational tool that uses integrative Bayesian approaches to identify genome-wide aneuploidy, and identified frequently accumulated CNVs and tumor-specific CNVs through statistical tests. We then used the pseudo-time technique to align polyps based on their transcriptional characteristics and to see how differential pre-malignant programs corresponded to specific CNV accumulations in the pre-cancerous state. We found that while the pseudo-time and CNV accumulations generally agreed, there were distinct patterns in CNV accumulation that were not correlated with transcriptional changes. We will discuss the common pathways influenced by the genes in CNV regions which relate to cancer development and possible targets for preventive medicine. The proposed method would help frame novel and general therapeutic approaches to cancer therapy and cancer prevention for hereditary cancers.

A deep learning-based approach for counting ovarian follicles
PRESENTER: Misbah Razzaq

ABSTRACT. Ovaries are of paramount importance in reproduction as they produce female gametes through a complex developmental process known as folliculogenesis. Treatments allowing the control of reproduction (e.g. contraception for family planing or superovulation in assisted reproduction technologies) aim at blocking or stimulating folliculogenesis. In the prospect of developing novel contraceptive or fertility treatment, it is therefore crucial to accurately and quantitatively assess ovarian folliculogenesis. Manual counting is commonly employed to determine the number of follicles, however it is a laborious task. There are various types of follicles, but we are focused on late follicles that are hallmarks of pre-(antral follicles) and post-(corpus luteum) ovulation. There are various challenges in counting follicles, such as large changes between 3 major stages of antral follicles, overall non-static follicle structure, and the subjectivity of experts. In this poster, we will be presenting how deep learning-based model can be used to count the number of such late follicles in order to resolve the aforementioned challenges. Our dataset comprising of histology images comes from 3 mouse ovaries, which were annotated using cvat. Hematoxylin-eosin-stained histological sections of 8 weeks-old mouse ovaries were indeed manually annotated: 255 antral follicules displaying the ovocyte, 252 antral follicles without the ovocyte, 196 corpus luteum and 486 non-relevant (negative control) structures were identified. The dataset was divided into training, validation, and testing sets. We are optimizing a pre-built convolution neural network (CNN) using the transfer learning method to carry out the classification task on our dataset. The validity of the CNN model is tested by comparing its output with the experts' output. 
The issue of an unbalanced dataset was resolved by assigning weights to different categories in order to increase the penalty in the case of misclassification of under-represented classes. Our preliminary results are promising, with an accuracy of 82% on the testing dataset (data which was not part of the training process). In the future, we are planning to add more data to improve the performance of the proposed CNN. Finally, we will also show the interpretation of the CNN model using different methods such as LIME and DeepLIFT to assess whether the proposed model uses the same features as experts. The proposed model will help to speed up the counting process, save resources, provide objective counting, and may propose new features that can be taken into account in counting follicles.
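
The class weighting mentioned above can be illustrated with the annotation counts reported in the abstract. The sketch below assumes inverse-frequency weights, which is one common choice; the abstract does not state which weighting scheme was actually used, and the class names are shorthand introduced here.

```python
# Annotation counts as reported in the abstract.
counts = {
    "antral_with_oocyte": 255,
    "antral_without_oocyte": 252,
    "corpus_luteum": 196,
    "negative": 486,
}

def inverse_frequency_weights(class_counts):
    """Return weights proportional to 1 / class frequency (assumed scheme)."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {name: total / (n_classes * n) for name, n in class_counts.items()}

weights = inverse_frequency_weights(counts)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

Under this scheme the rarest class (corpus luteum) receives the largest weight and the negative class the smallest, so errors on under-represented classes contribute more to a weighted loss.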

Modeling disease progression in spinocerebellar ataxias

ABSTRACT. Spinocerebellar ataxias (SCAs) are rare neurological diseases that follow an autosomal dominant inheritance. The symptoms include loss of balance, loss of coordination and slurred speech and typically develop only at adult age. The most common SCAs are SCA1, SCA2, SCA3, and SCA6. Each of these disease subtypes is caused by CAG repeat expansion in the protein-coding region of a single, subtype-specific gene, leading to a polyglutamine stretch in the resulting protein. To comparatively investigate determinants of disease progression for the four subtypes, we co-analyzed genetic data on repeat lengths with demographic features and three-year clinical time courses of 39 neurological scales. The dataset comprised 1554 subjects from five different cohorts in total; each disease subtype was covered by four cohorts, enabling us to detect recurring patterns. Mining of the progression data revealed the Scale for the Assessment and Rating of Ataxia (SARA) sum score to be the most representative descriptor of disease progression, reflecting progression of most other scales. Furthermore, progression events for the SARA sum score were predictable from the baseline neurological status. Survival forests outperformed regularized Cox regression models in that task. Remarkably, the top predictive features differed for each of the four SCA subtypes. Also, the set of neurological symptoms that showed progression depended on the SCA subtype. We trained predictive models for the progression of each neurological symptom and determined the top cross-symptom predictors for each subtype by survival forest analysis. The number of repeats in the expanded allele was the top predictor for SCA3 symptom progression and also had significant impact on life-time disease courses for each SCA subtype. For all subtypes except SCA6, gait was the most predictive marker of future transitions to more severe clinical disease stages. 
Beyond the data-driven characterization of relationships between observable features and symptom progression, our work aims at providing tailored models to support subtype-specific clinical monitoring and assessment of therapy effects in further clinical studies.

Computational modeling of the Hes1 oscillator for the self-renewal of muscle stem cells
PRESENTER: Zsófia Bujtár

ABSTRACT. Oscillatory dynamics in regulatory networks can be critical in development. The decision between self-renewal and differentiation is controlled by a Notch-related signaling network. The key transcription factors of the Hes gene family are regulated by the Notch signaling pathway and show oscillations that are indispensable for self-renewal. Recent experiments demonstrated that expression of Hes1 oscillates in activated muscle stem cells and regulates transcription of the genes encoding the myogenic transcription factor MyoD and the Notch ligand Dll1, thereby driving MyoD and Dll1 oscillations.

We developed a mathematical model describing the network of the co-regulated genes Hes1, MyoD and Dll1 in individual cells based on ordinary differential equations (ODEs). In knockout simulations, the model exhibits Dll1 oscillations with declined amplitude in the case of the MyoD knockout, but shows non-oscillatory, increased levels of Dll1 for Hes1 knockout, confirming corresponding experimental observations. Based on the ODE model, we established a delay differential equation (DDE) model representing Hes1 and Dll1 protein dynamics. The DDE model allows for the direct investigation of modified delays, such as in the Dll1type2 mutant cell line characterized by a prolonged Dll1 transcription time. We studied the case of two cells coupled via Dll1 signaling in detail. The investigation showed out-of-phase oscillations for coupled wild-type cells and quenched oscillations for coupled Dll1type2 mutant cells, as observed experimentally. To investigate the impact of the delays systematically, we performed a bifurcation analysis of the 2-cell-DDE-model. As the strength of intercellular coupling also controls the dynamics, we are currently expanding the bifurcation analysis by considering the effect of coupling strength as well. Our approach demonstrates that computational modelling allows for a systematic investigation of the dynamic properties of the molecular network regulating cell fate decision in muscle stem cells.
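
To illustrate why a delay matters in such models, the following is a minimal sketch of a generic one-variable delayed negative-feedback loop, integrated with the Euler method. It is not the authors' Hes1/Dll1 model: the equation, the dimensionless parameters (`beta`, `hill`, `tau`) and the constant initial history are all assumptions chosen for illustration.

```python
def simulate_delayed_feedback(tau, beta=4.0, hill=4, t_end=100.0, dt=0.01, x0=0.5):
    """Euler integration of dx/dt = beta / (1 + x(t - tau)**hill) - x."""
    n_steps = int(round(t_end / dt))
    lag = int(round(tau / dt))
    xs = [x0]
    for i in range(n_steps):
        # Use the initial value as constant history for t < 0.
        x_delayed = xs[i - lag] if i >= lag else x0
        dx = beta / (1.0 + x_delayed ** hill) - xs[i]
        xs.append(xs[i] + dt * dx)
    return xs

def late_amplitude(xs):
    """Peak-to-trough range over the second half of the trajectory."""
    tail = xs[len(xs) // 2:]
    return max(tail) - min(tail)

no_delay = simulate_delayed_feedback(tau=0.0)
with_delay = simulate_delayed_feedback(tau=2.0)
print("amplitude without delay:", late_amplitude(no_delay))
print("amplitude with delay:   ", late_amplitude(with_delay))
```

With these assumed parameters the undelayed system settles to a steady state, while the delayed one keeps oscillating, which is the qualitative reason a change in delay (such as a prolonged transcription time) can switch the dynamics.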

Learning synthetic cell classifier designs with genetic algorithms and logic programming
PRESENTER: Melania Nowicka

ABSTRACT. Background: Cell classifiers are synthetic bio-devices performing type-specific in vivo classification of the cell's molecular fingerprint. In particular, they can recognize cancerous cells and trigger their apoptosis, shaping novel therapies for cancer patients. Here, the classifiers describe the relationship between cells' molecular profiles and their annotation as cancerous or non-cancerous. Such a relationship can be represented as a partially defined logical function where the output indicates the cell condition. A single circuit's processing logic is usually described using a larger individual Boolean function, whereas multi-circuit classifiers are ensembles of simpler logic designs. Such a distributed classifier consists of a group of single-circuit classifiers deciding collectively whether a cell is cancerous according to a predefined threshold function. Both architectures have shown the potential to predict the cell condition with high accuracy. However, the lack of comprehensive workflows to design and evaluate the classifiers, in particular to assess their robustness to noise and novel information, limits their application.

Results: Here, we present a framework for designing miRNA-based distributed cell classifiers, employing genetic algorithms and Answer Set Programming. We develop optimization criteria comprising the accuracy and robustness of the circuits and train classifiers that achieve high performance (89.78% accuracy for the most-perturbed data set), as shown in multiple simulated data studies. The evaluation performed on cancer data demonstrates that distributed classifiers outperform single-circuit designs by up to 13.40%. Our workflow provides inherently interpretable classifiers comprising relevant miRNAs previously described in the literature, as well as more complex regulation patterns underlying the data. Ultimately, we show how our approach can be applied to other binary classification problems employing different biological modalities such as gene expression or mutation patterns providing interpretable classifiers.
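
The distributed-classifier architecture described in the Background (several simple Boolean circuits voting against a threshold) can be sketched as follows. The miRNA names, the three rules and the threshold are hypothetical placeholders, not designs produced by the authors' workflow.

```python
# Hypothetical single-circuit rules over binarized miRNA levels (True = high).
def rule_a(profile):
    return profile["miR-A"] and not profile["miR-B"]

def rule_b(profile):
    return not profile["miR-C"]

def rule_c(profile):
    return profile["miR-A"] or profile["miR-D"]

def distributed_classifier(profile, rules, threshold):
    """Call a cell cancerous when at least `threshold` rules fire."""
    votes = sum(1 for rule in rules if rule(profile))
    return votes >= threshold

rules = [rule_a, rule_b, rule_c]
cell_1 = {"miR-A": True, "miR-B": False, "miR-C": True, "miR-D": False}
cell_2 = {"miR-A": False, "miR-B": True, "miR-C": True, "miR-D": False}
print(distributed_classifier(cell_1, rules, threshold=2))
print(distributed_classifier(cell_2, rules, threshold=2))
```

Because each rule is a short Boolean expression, the overall decision stays readable: one can list exactly which circuits fired for a given cell.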

Machine learning-based prediction of frailty in elderly people - Data from the Berlin Aging Study-II (BASE-II)
PRESENTER: Jeff Didier

ABSTRACT. Frailty is a geriatric medical condition that is highly associated with age and age-related diseases. The multidimensional consequences of frailty heavily impact the quality of life, and will inevitably increase the burden on healthcare systems in the future. Most importantly, the lack of a universal standard to describe, diagnose, let alone treat frailty further complicates the situation in the long term. Nowadays, more and more frailty assessment tools are being developed on a regional and institutional basis, which continues to increase the heterogeneity in the characterization of frailty. Gaining better insights into the underlying causes and pathophysiology of frailty, and how it develops in patients, is therefore required to establish strong and accurately tailored response schemes for frail patients, where currently only symptoms are treated. Thus, in this study, we deployed machine learning-based classification and optimization techniques to predict frailty in the Berlin Aging Study II (BASE-II, N=1512, frail=484) and revealed some of the most informative biomedical information to characterize frailty, including new potential biomarkers. Frailty in BASE-II was measured by the Fried et al. 5-item frailty index, composed of the clinical variables grip strength, weight loss, exhaustion, physical activity, and gait. The level of frailty in BASE-II was adapted for binary classification purposes by merging the pre-frail and frail levels as frail. A configurable in-house pipeline was developed for pre-processing the clinical data, predicting the target disease, and determining the most informative subgroup of clinical measurements with regard to frailty. The best predictive power was obtained with resampling and dimensionality reduction techniques using the F-beta-2 score, and was further increased by adding one item of the Fried et al. frailty index.
We suggest that a combination of the easy-to-obtain biomedical information on frailty risk factors together with one item of Fried et al. phenotype information provided by, e.g., smart wearable devices (gait, grip strength, ...) could significantly improve the frailty prediction power.
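
As a small worked example of the metric, the sketch below computes an F-beta score from confusion-matrix counts, taking the "F-beta-2 score" above to mean F-beta with beta = 2 (an assumption about the abstract's notation). The counts are illustrative only and are not results from BASE-II.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from confusion-matrix counts; beta > 1 favors recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Illustrative counts only (not results from BASE-II).
tp, fp, fn = 80, 40, 20
print("F1:", round(f_beta(tp, fp, fn, beta=1.0), 3))
print("F2:", round(f_beta(tp, fp, fn, beta=2.0), 3))
```

With beta = 2, recall is weighted more heavily than precision, which suits screening settings where missing a frail individual is costlier than a false alarm.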

M4-health: digital twins that follow you throughout your health journey
PRESENTER: Gunnar Cedersund

ABSTRACT. For 20+ years, I have developed and tested mechanistic mathematical models for the main organs in the human body. We have now created a reusable backend to an eHealth platform, where the organ-models are interconnected, and where the models can be personalized, and used to simulate scenarios. The models are Multi-level (intracellular to whole-body), Multi-timescale (seconds to decades), Multi-organ, and Mechanistic (M4).

The M4-models are developed/simulated using differential-algebraic equations, across various platforms: Matlab, OpenCOR, NEURON, OpenSim, Unreal Engine, INCA, etc. The M4-models are extendable to omics-level network models, and are combined with machine-learning models, to calculate e.g. the risk of a stroke. The backend is written in Python, and can be called from any eHealth-platform.

The interconnected M4-model is able to simulate scenarios that agree with data for all levels and timescales. The omics-level model can simulate diabetes on a phosphoproteome level. The digital twins are personalized in appearance (face, proportions, weight, etc), and can be made to move. Intrabody images (MRI, microscopy images of biopsies, etc) can visualize both how the organs and cells are now, and how they gradually change, depending on what the digital twin is doing: diet, exercise, medication, etc.

Because of the physiological M4-core, our model can be re-used across the entire health journey: for personalized computer labs in education, for including your digital twins in performances on stage, and for improving communication with your personal trainer at the gym, or with nurses or specialists (hepatologist, cardiologist). The hypothesis is that seeing such scenarios play out, in your own digital twin, will improve understanding, motivation, and compliance with treatment. In my presentation, I will give examples of how we work with end-users across the different stages of your life journey. Finally, if there is a grand piano, I can show live how dancing digital twins are incorporated in lecture-performances.

Multi-platform model training reduces bias in translational drug sensitivity models for cancer patients

ABSTRACT. Translational models directly relating drug response specific processes that can be observed in gene expression profiles in vitro to their in vivo role in cancer patients constitute a crucial part in the development of personalized medication. In extensive comparative studies of the parameter space of translational models, the impact of diverse preprocessing steps and model settings on the predictive performance can be extracted.

Our research demonstrates that the quality of the in vitro training data set can severely affect the model performance on the in vivo test set. In a comparison of a large set of model settings for translational models trained on the well-established cell line data sets of CCLE, CTRP or GDSC in order to predict the survival of 28 ovarian cancer patients treated with Paclitaxel (GSE51373) and 25 NSCLC patients treated with Erlotinib (GSE33072), we show that the choice of training data can contribute up to 69% of performance variation, biasing the conception of other modeling criteria and their effect on model performance.

Instead of setting the different cell line platforms against each other, our work aims to use the observed disparity and benefit from the multi-angled view on drug response processes to ultimately develop robust drug sensitivity models for cancer patients. In the framework of a consensus model, we combine multiple simple models established on cell line data from different platforms into one collective estimate of patient survival. Our findings indicate that a significant increase of predictivity can be achieved for such consensus models compared to translational models based on solely one platform. Here, we present the consensus concept and selected model settings that yield reliable AUCs of ROC in the range of 0.79 for different patient data sets.
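
The consensus idea can be sketched as averaging per-patient scores from platform-specific models and scoring the result by the area under the ROC curve. The averaging rule, the labels and all score values below are made up for illustration; the abstract does not specify how the individual models are combined.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def consensus(score_lists):
    """Average per-patient scores across platform-specific models."""
    return [sum(col) / len(col) for col in zip(*score_lists)]

# Made-up labels and scores for six patients.
labels = [1, 1, 1, 0, 0, 0]
platform_scores = {
    "CCLE": [0.9, 0.4, 0.7, 0.5, 0.2, 0.3],
    "CTRP": [0.6, 0.8, 0.3, 0.4, 0.5, 0.1],
    "GDSC": [0.7, 0.6, 0.6, 0.2, 0.4, 0.6],
}
for name, scores in platform_scores.items():
    print(name, round(roc_auc(labels, scores), 3))
combined = consensus(list(platform_scores.values()))
print("consensus", round(roc_auc(labels, combined), 3))
```

In this contrived example each single-platform model misranks a different patient, so averaging cancels the platform-specific errors; real data need not behave this cleanly.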

Quantify the role of DMT1 endocytosis, IRPs activity, and FT iron sequestration in duodenal enterocyte iron regulation
PRESENTER: Joseph Masison

ABSTRACT. Iron levels in humans are controlled mainly through regulation of intestinal iron absorption, with the majority of absorption through duodenal enterocytes (epithelial cells of the first part of the small intestine). In addition to balancing intracellular iron deficiency with cytotoxicity, enterocytes face unique iron related challenges because of their iron absorption role. Iron entry into enterocytes depends mostly on dietary availability and intracellular signaling, but its export is regulated systemically by the hormone hepcidin. Enterocytes release absorbed dietary iron to circulation through the exporter ferroportin. Hepcidin promotes the degradation of ferroportin thus limiting iron release. As a result, in conditions where hepcidin level is high, these cells have the potential to absorb a large amount of iron with no way of releasing it to the blood, leading to cytotoxicity through iron induced oxidative stress. There are three mechanisms proposed to simultaneously prevent this cytotoxicity while not diminishing iron absorptive capacity in enterocytes: 1) endocytosis of the luminal iron importer DMT1 after iron exposure, 2) post-transcriptional regulation by iron regulatory proteins (IRPs) in response to changing cytosolic iron levels, and/or 3) an iron buffering effect caused by the cellular iron storage protein ferritin (FT). While isolated studies of these mechanisms suggest all three may be involved to protect enterocytes from iron toxicity, the relative contribution of each one is largely unknown. In this work, the relative contribution of DMT1 endocytosis, IRP regulation, and ferritin buffering toward enterocytic iron dynamics is determined by a cellular-scale biochemical kinetic model of iron metabolism in the enterocyte using the COPASI simulator. Quantification of the contributions of each mechanism to iron absorption and their interplay is assessed via model simulation and sensitivity analysis. 
The model also enables exploration of more general iron dynamics in the enterocyte during fasting, meal, and pathologic states. The model, which was built from a compilation of published data, includes the relevant biochemical species and pathways for individual enterocyte iron absorption, cellular regulation, and export.

Periodic Forcing of the ERK pathway
PRESENTER: Nguyen Tran

ABSTRACT. Introduction: Signal transduction networks (STNs) compute extracellular biochemical information to regulate intracellular biochemistry. Different biochemistries dictate different cell outcomes. This extracellular information is stored in signal dynamics at the cell surface. Therefore, different signaling dynamics can induce different cell outcomes.

We can describe dynamics in terms of frequency. Frequency-dependent cell behaviour has been documented in experimental literature [1,2,3,4].

One physiologically important STN is the ERK pathway. The ERK pathway transmits EGF signals from the extracellular space to a family of proteins in the cytoplasm called ERK. ERK controls a wide range of cell functions in metazoans including survival, growth, metabolism, migration, and differentiation [5]. These functions are vital to understanding and treating cell diseases like cancer. It is therefore fruitful to ask if they are frequency dependent. Knowing which frequencies promote or inhibit functions like cell growth can help develop therapeutics and experimental techniques.

We investigate this frequency response by deriving the transfer function of the ERK pathway. This function tells us how sinusoidal EGF inputs are transformed into ERK outputs. This then sets the basis for obtaining ERK responses for arbitrary periodic forcing patterns, such as triangular and rectangular pulse trains, via Fourier analysis.
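
The combination of a transfer function with Fourier analysis can be sketched on a toy system. The example below assumes a first-order linear model, dy/dt = gain * u(t) - decay * y(t), as a stand-in; it is not the ERK pathway model, and the parameter values and the 30-unit forcing period are arbitrary.

```python
import cmath
import math

def first_order_transfer(omega, gain=1.0, decay=0.1):
    """H(i*omega) for dy/dt = gain * u(t) - decay * y(t)."""
    return gain / (1j * omega + decay)

def square_wave_harmonics(period, n_harmonics=3):
    """Angular frequencies and amplitudes of the odd harmonics of a
    unit-amplitude, zero-mean, 50% duty-cycle square wave."""
    base = 2 * math.pi / period
    return [(k * base, 4 / (math.pi * k)) for k in range(1, 2 * n_harmonics, 2)]

for omega, amp_in in square_wave_harmonics(period=30.0):
    h = first_order_transfer(omega)
    print(f"omega={omega:.3f}  input amp={amp_in:.3f}  "
          f"output amp={amp_in * abs(h):.3f}  phase={cmath.phase(h):.3f} rad")
```

Each harmonic of the periodic input is scaled by |H| and shifted by arg(H); summing the transformed harmonics gives the steady-state response to the full pulse train, which is the step the abstract refers to as Fourier analysis.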

Results: We study the EGF-activated ERK pathway model presented in [6] and derive its transfer function. Using this function, we predict how different periodic forcing dynamics of EGF yield different ERK activation dynamics and relate these to different cell outcomes in experimental literature.

Conclusion: Our work aims to provide a predictive tool for experimentalists to relate desired inputs and outputs in conducting frequency domain experiments. It also provides insight into the regions of dynamics that may be useful for treating diseases associated with dysfunctional ERK activation.

References: [1] A. Mitchell, P. Wei, and W. A. Lim. Science, 350(6266):1379–83, 2015.

[2] J. E. Toettcher, O. D. Weiner, and W. A. Lim. Cell, 155(6):1422–1434, 2013.

[3] P. Hersen, M. N. McClean, L. Mahadevan, and S. Ramanathan. Proc Natl Acad Sci U S A, 105(20):7165–70, 2008.

[4] Z. Ningsih and A. H. A. Clayton. Phys. Biol., 17:044001, 2020.

[5] H. Lavoie, J. Gagnon, and M. Therrien. Nature Reviews Molecular Cell Biology, 21(10):607–632, 2020.

[6] H. Ryu, M. Chung, M. Dobrzyński, D. Fey, Y. Blum, S. S. Lee, ..., and O. Pertz. Molecular Systems Biology, 11(11):838, 2015.

Jinkō Knowledge: An integrated knowledge management tool prior to in silico multiscale model development
PRESENTER: Shiny Martis B

ABSTRACT. Development of multiscale models for in silico clinical trial simulations based on knowledge, known as knowledge based models (KBMs), requires a comprehensive understanding of the underlying physiology and pathophysiology. KBMs need to accurately represent biological mechanisms and related physiological phenomena. Prior to any in silico model development, literature review is the most time consuming step which requires evaluation of the most relevant pieces of knowledge for a particular indication, to manage and to extensively document literature sources. In silico models tend to leverage various resources with the frequent risk of information loss or provenance issues. The literature review can be labour intensive due to the exponential growth in the number of published articles in the past decades and the complexity inherent to the study of biological systems that needs to be represented with a single computational model. Jinkō Knowledge (JK) is a module of the Jinkō platform, developed by Novadiscovery to facilitate smooth transition from literature sources to building a multiscale model. It encapsulates the literature knowledge in the form of assertions, i.e. reference extracts that combine information and the corresponding source. Assertions are evaluated by the researchers and graded with a metric named strength of evidence (SoE). JK is designed as a collaborative platform to import documents, extract assertions by highlighting and categorising knowledge prior to the development of a dynamic computational model. JK aims to integrate the complexity of biological, physiological, clinical, and epidemiological knowledge in a single document that is interpretable, traceable and operational via a white box approach facilitating transparent third party auditing of the models. Although JK has been developed as a first step to develop KBMs, it can also serve as a stand-alone tool to significantly enhance scientific research and writing.

Deciphering dysregulation in Erythroleukemia to design intervention strategies
PRESENTER: Yomn Abdullah

ABSTRACT. Erythropoietin receptor (EpoR) signaling is crucial for the activation and differentiation of erythroid progenitor cells; however, its role in erythroleukemia has not been studied. To characterize Erythropoietin (Epo)-induced signal transduction and elucidate its perturbations in erythroleukemia, a systems medicine approach was used. To develop an integrative dynamic pathway model of EpoR signaling and link it to cell proliferation, we adapted the previously established model for EpoR signal transduction in the murine CFU-E erythroid progenitor cells to the context of erythroleukemia. As a cellular model system, the cell line AS-E2 was examined since it depends on Epo for survival and growth. Quantitative mass spectrometry was used to compare the proteome of AS-E2 and human CFU-E cells, and it was found that AS-E2 cells harbor significantly higher levels of EpoR and a decreased abundance of the negative regulator SHP1. To adapt the parameters of the dynamic pathway model, AS-E2 cells were stimulated with Epo in a dose- and time-resolved manner. Quantitative immunoblotting revealed prolonged phosphorylation of EpoR, which was reflected by a sustained activation of the pro-proliferative pathways MAPK, PI3K/AKT and JAK/STAT. Pathway perturbations with JAK/STAT, MEK and AKT small-molecule inhibitors were used to improve identifiability of model parameters. The calibrated model was able to capture the differential effects of the inhibitors on signal transduction and proliferation and enables us to pinpoint major alterations in the cancer cells compared to the healthy situation. These developments will provide the basis to propose an effective targeted therapy for individual erythroleukemia patients.

The insidious trappings of gene set enrichments

ABSTRACT. Gene set enrichments remain one of the main tools linking statistical results from high throughput techniques with biological interpretation. In short, they rely on categorizing genes into a number of gene sets and using an appropriate statistical test to examine the given gene set as a whole. For example, we may ask whether interferon stimulated genes (ISG) are more likely to be differentially expressed between patients and healthy controls.

However, the apparent simplicity of gene set enrichments is misleading. Recently, we have shown a widespread but incorrect analysis: when two conditions (e.g. patients and healthy controls) are compared independently in two groups of patients, incorrect use of gene set enrichments may lead to false positives. Moreover, we show that such positives are related to the differences between conditions, but not groups, and thus they seem to be "reasonable" in the given context. For example, we might come to the conclusion that ISG are stimulated more strongly by one particular strain of the virus, whereas in reality there is no statistical difference between the groups.

A second common issue is the widespread use of randomization-based tests for gene set enrichments. There are two main approaches for using a randomization test in the context of gene set enrichments: to estimate the null distribution it is possible to randomize either the samples or the genes. Randomizing samples is effective and correct; however, it requires a sufficient number of samples. We show that the popular alternative, randomizing genes rather than samples, leads to false positives and spurious results.
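
The sample-randomization approach can be sketched as follows: sample labels are shuffled while each gene's expression vector stays intact, so correlations between genes are preserved under the null. The gene-set score (mean difference in group means), the gene names and the expression values are assumptions made for illustration, not the authors' statistic or data.

```python
import random

def gene_set_score(expr, labels, gene_set):
    """Mean over genes in the set of (mean in cases - mean in controls)."""
    diffs = []
    for gene in gene_set:
        cases = [v for v, lab in zip(expr[gene], labels) if lab == 1]
        ctrls = [v for v, lab in zip(expr[gene], labels) if lab == 0]
        diffs.append(sum(cases) / len(cases) - sum(ctrls) / len(ctrls))
    return sum(diffs) / len(diffs)

def sample_permutation_p(expr, labels, gene_set, n_perm=2000, seed=0):
    """Two-sided p-value from shuffling sample labels (genes stay intact)."""
    rng = random.Random(seed)
    observed = abs(gene_set_score(expr, labels, gene_set))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(gene_set_score(expr, shuffled, gene_set)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Made-up expression values: 5 patients followed by 5 controls.
expr = {
    "ISG_1": [5.1, 5.3, 4.9, 5.2, 5.0, 3.0, 3.2, 2.9, 3.1, 3.0],
    "ISG_2": [4.0, 4.2, 4.1, 3.9, 4.3, 2.0, 2.1, 1.9, 2.2, 2.0],
}
labels = [1] * 5 + [0] * 5
p_value = sample_permutation_p(expr, labels, ["ISG_1", "ISG_2"])
print("permutation p-value:", p_value)
```

With only five samples per group there are 252 distinct labelings, which bounds how small the p-value can get; this is the sample-size requirement noted above.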

Predicting metabolite accumulation in cancerous metabolic network
PRESENTER: Tin Yau Pang

ABSTRACT. Different theoretical approaches have been developed to predict the phenotype of metabolic networks. “Metabolic network expansion” considers how the presence and absence of enzymes and their reactions affect the functions of the network and the synthesis of metabolic products. “Flux balance analysis” (FBA) assumes that the network is optimal for certain metabolic objectives and predicts the metabolic fluxes across the reactions in a network based on optimization. However, the objective of a cell in a multicellular organism, or of a cancerous cell with aberrant mutations, is elusive. Kinetic modeling may predict the metabolic fluxes in the network, but it requires the kinetic parameters of each reaction in the network.

Methodologies based on machine learning can now predict a protein’s various properties with order-of-magnitude accuracy. The Michaelis constant (KM) or catalytic rate constant (kcat) of an enzyme can be predicted from its sequence and the substrate’s structure. The intensity of a protein in the proteome can also be predicted from its sequence and the transcriptome. Thus, we now can roughly estimate the kinetic parameters of most reactions in a metabolic network.

Here we predict the enzymes’ abundance or their kinetic parameters if they are not empirically measured, and calculate the fluxes in the metabolic network. We calculate the enzyme abundance using data from the Cancer Cell Line Encyclopedia (CCLE). In a cancerous condition, some metabolites accumulate, either because they are the metabolic products of the network, or because of a mismatch of production and consumption fluxes caused by mutations. We compare our predictions of metabolite concentrations and accumulations with metabolomics data. Our modeling framework serves as the basis for the integration of other gene networks, such as small molecule regulation of proteins, which may further improve model predictions.
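
How a mismatch between production and consumption fluxes leads to accumulation can be sketched with Michaelis-Menten kinetics on a toy two-step pathway S -> M -> P. The pathway, the kcat and KM values and the enzyme abundances below are arbitrary assumptions in arbitrary units; they are not predictions from the authors' framework.

```python
def michaelis_menten_flux(enzyme, kcat, km, substrate):
    """v = kcat * [E] * [S] / (KM + [S])."""
    return kcat * enzyme * substrate / (km + substrate)

def simulate_intermediate(e_prod, e_cons, substrate=1.0, t_end=200.0, dt=0.01):
    """Euler integration of dM/dt = v_prod - v_cons for a pathway S -> M -> P."""
    m = 0.0
    v_prod = michaelis_menten_flux(e_prod, kcat=10.0, km=0.5, substrate=substrate)
    for _ in range(int(round(t_end / dt))):
        v_cons = michaelis_menten_flux(e_cons, kcat=5.0, km=0.2, substrate=m)
        m += dt * (v_prod - v_cons)
    return m

balanced = simulate_intermediate(e_prod=0.01, e_cons=0.05)
mismatched = simulate_intermediate(e_prod=0.01, e_cons=0.01)
print("M with sufficient consuming enzyme:", balanced)
print("M with reduced consuming enzyme:   ", mismatched)
```

When the maximal consumption rate (kcat times enzyme abundance) exceeds the production flux, the intermediate settles at a steady state; when it falls below the production flux, the intermediate keeps accumulating.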

Patient-specific modeling of the TP53 pathway in endometrial carcinoma patients

ABSTRACT. With the advent of sequencing technologies and massive inter-institutional projects like The Cancer Genome Atlas, it has become evident that even though a similar array of signaling pathways is altered in different types of cancer, the exact changes differ among cancer types and patient subgroups. We aim to unravel the repercussions of genomic alterations in cancer using mathematical modeling. As a case study, we use Uterine Corpus Endometrial Carcinoma (UCEC) patient data from TCGA to delineate the mechanistic effects of mutations and copy number aberrations (CNAs) in the TP53 signaling pathway. This pathway is frequently affected in those patients and displays distinct combinations of altered genes. For our study we utilize a published ordinary differential equation-based model which describes the cell fate decision between survival and apoptosis upon genotoxic stress. The implementation of patient-specific combinations of mutations and CNAs in the mathematical model leads to a cellular survival phenotype in 80% of simulated patient samples despite a strong induction of DNA damage. Since prolonged survival due to evasion of apoptosis is a hallmark of cancer, our modeling strategy allows us to mirror the expected cancerous behavior in the majority of samples. Subsequently, we identified all processes in the model controlling cellular susceptibility towards a shift from abnormal survival to apoptosis and thereby revealed potential drug targets. In conclusion, our results shed light on the mechanistic details defining differences between patients and therefore enable the identification of distinct cell vulnerabilities that could be exploited for patient-specific therapy.

The abundance and cross-talk of EGFR and MET receptors in lung cancer cell lines is critical for the efficacy of tyrosine kinase inhibitors
PRESENTER: Dario Lucas Frey

ABSTRACT. Non-small cell lung cancer (NSCLC) is the leading cause of cancer-related deaths worldwide. Targeted therapies, such as epidermal growth factor receptor (EGFR) tyrosine kinase inhibitor (TKI) treatment, can prolong survival to a few months. However, due to the development of therapy resistance, the effect is only transient. It was proposed that the hepatocyte growth factor (HGF) receptor MET contributes to resistance to EGFR-TKIs, but the underlying mechanism remained unresolved. To address the cross-talk of EGFR and MET, we examined EGF- and HGF-induced signaling in four NSCLC cell lines harboring distinct EGFR mutations and differing in their expression levels of EGFR and MET. These studies showed a cell type-specific enhancement of MET phosphorylation upon co-stimulation with EGF and HGF. Our mechanism-based dynamic pathway model indicated that the cross-talk of EGFR and MET is determined by the ratio of the abundance of the two receptors, leading to a prolonged half-life of the MET receptor. This hypothesis was validated by live-cell fluorescence microscopy and by altering receptor expression levels by retroviral transduction and siRNA. To verify our observation, we screened an extended panel of NSCLC cell lines by mass spectrometry using data-independent acquisition (DIA) and targeted proteomics using parallel reaction monitoring (PRM) for absolute receptor quantification. These studies showed that the ratio of the receptor abundances indeed varies widely. The dynamic pathway model suggested that a high EGFR to MET ratio improves the efficacy of EGFR TKIs and thus opens new possibilities to avoid the emergence of therapy resistance.

Alterations in hepatocyte growth factor induced signal transduction in fatty liver disease foster cancer development

ABSTRACT. The incidence of fatty liver disease has progressively increased in the past decades. This development correlates with an increase in hepatocellular carcinoma cases that arise in the absence of cirrhosis, suggesting a link between both pathologies. Nevertheless, it remains unclear which molecular mechanisms can contribute to the progression of fatty liver disease and if they may support oncogenesis. To address this, we made use of the high fat and high carbohydrate Western diet mouse model. Isolated primary mouse hepatocytes from Western diet mice showed increased proliferation in the absence of growth factor stimulation. Notably, this effect increased after stimulation with hepatocyte growth factor (HGF). To disentangle molecular mechanisms that contribute to the impact of Western diet on hepatic proliferation, we utilized quantitative immunoblotting and mass spectrometry to analyze longitudinal samples. The time-resolved quantitative data was used to develop a dynamic pathway model capturing HGF signal transduction in healthy and Western diet-exposed hepatocytes. This analysis pointed to a role of alterations in the regulation of the HGF receptor MET and the PI3K signal transduction pathway. Importantly, our model identified specific parameters that describe Western diet-specific alterations. This knowledge sheds light on a previously ignored link between fatty liver disease and cancer formation and could improve the diagnosis of hepatocellular carcinoma.

Inferring individual interactions in gene regulatory networks with delays from time-series experiments
PRESENTER: Yu Wang

ABSTRACT. This work considers inferring individual interactions in gene regulatory networks from time-series response data, with a particular focus on dealing with delays in the regulatory interactions. Inferring individual interactions is an important problem faced, e.g., in target identification, but also in applications where it is of interest to learn regulatory interactions between a given subset of genes. However, the inference of interactions of interest is greatly hampered by inevitable delays between genes of interest caused by, e.g., intermediate unmeasured components, protein synthesis, and transportation delays. Existing inference methods essentially rely on fitting a full network model to infer individual interactions of interest, and they either do not take such delays into account or consider pure delays rather than rational dynamics. Such methods based on full network inference require a large number of experiments that can be time-consuming and costly. Experience shows that the resulting models typically will contain a large fraction of false positives and false negatives, and it is in general difficult to provide any statement on the reliability of individually inferred interactions. In contrast to existing methods, we here derive a tool to handle delays in network inference based on taking a geometric perspective on the inference problem formulated as a linear regression, by considering the span of individual time-shifted gene response vectors in sample space. Note that the assumption of linearity here corresponds to assuming the dynamics of the transformed variables being linear. Instead of fitting the full network model, the proposed method can identify individual interactions independently with a label of confidence. Using the proposed method, different delays associated with individual identified interactions can also be distinguished from a single time-series perturbation experiment.
A key feature of the method is that it can deal with uncertainty in the data, resulting from both intracellular noise and measurement noise. The effectiveness of the proposed method is illustrated on a 10-gene DREAM4 in-silico network. We consider inferring interactions from the gene with the highest connectivity to other genes in the network. We collect 21 samples with sampling time Ts = 50 min from one single time-series experiment, containing both intracellular and measurement noise. We consider delays between the genes of up to 150 minutes. By applying the proposed method, 4 out of 5 true interactions are correctly identified with few false discoveries. The proposed method also infers different delays in the identified interactions: one of the identified interactions has a delay of 0-50 min, while the others have delays of up to 150 min. The results show that the proposed method can be instrumental for reliable network inference in the presence of delays.
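The idea of comparing time-shifted response vectors in a linear regression can be illustrated with a deliberately simplified sketch. It is not the authors' geometric method: it merely regresses a target series on delayed copies of one candidate regulator and picks the shift with the smallest residual. The function name `rss_per_shift`, the synthetic signals, and the noise level are assumptions made for illustration; only the 21 samples and the 50 min sampling time are taken from the abstract.

```python
import numpy as np

def rss_per_shift(x, y, max_shift):
    """Residual sum of squares of regressing y(t) on x(t - k), k = 0..max_shift."""
    n = len(y)
    m = n - max_shift                # number of samples shared by all shifts
    y_win = y[max_shift:]            # target values y(t), t = max_shift..n-1
    rss = []
    for k in range(max_shift + 1):
        start = max_shift - k
        x_shift = x[start:start + m]                 # regulator values x(t - k)
        A = np.column_stack([x_shift, np.ones(m)])   # slope and intercept
        coef, *_ = np.linalg.lstsq(A, y_win, rcond=None)
        rss.append(float(np.sum((y_win - A @ coef) ** 2)))
    return np.array(rss)

# Synthetic example (assumed data): y follows x with a delay of 2 samples.
rng = np.random.default_rng(0)
n_samples = 21                       # as in the abstract
x = rng.normal(size=n_samples)       # hypothetical regulator response
y = np.zeros(n_samples)
y[2:] = 0.8 * x[:-2]                 # true delay: 2 samples
y += 0.01 * rng.normal(size=n_samples)  # small measurement noise

rss = rss_per_shift(x, y, max_shift=3)
estimated_shift = int(np.argmin(rss))
print("estimated delay:", estimated_shift * 50, "min")  # Ts = 50 min
```

With a maximum shift of 3 samples, the candidate delays span 0 to 150 min, matching the range considered in the abstract.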

Multilevel approach characterizing the progression of fatty liver disease
PRESENTER: Ina Biermayer

ABSTRACT. The incidence of non-alcoholic fatty liver disease (NAFLD), characterized by the accumulation of liver fat, is increasing worldwide. Since the disease can advance to liver cancer, it is important to resolve the temporal order of events and identify protein patterns for early detection of disease progression. To this end, we employed a systems medicine approach linking the dynamics of structural changes at the organ and tissue level with molecular alterations in the proteome. As a preclinical model, we studied the development of NAFLD in mice fed a high-glucose, high-fat (“Western”) diet for up to 26 weeks. MicroCT and automated quantification of tissue imaging of steatosis revealed early accumulation of liver fat and a continuous increase of lipid droplets. To characterize underlying molecular alterations, we determined changes in the global proteome of time-resolved liver tissue and blood plasma samples by mass spectrometry. For the identification of changes indicative of disease progression, we complemented current proteome data analysis approaches and developed a method that exploits the information encoded in the occurrence of missing values and transforms it into detection probabilities. The correlation of the dynamics of liver fat accumulation and steatosis development with the relevant proteome alterations present in liver and plasma resulted in the identification of more than 120 proteins as potential indicators for the progression of NAFLD. Based on their correlation coefficients, top-ranking circulating markers are currently being analyzed in NAFLD patients. Thus, our multilevel approach establishes a novel strategy to identify potential early indicators of NAFLD progression.

A generic approach to decipher the mechanistic pathway of heterogeneous protein aggregation kinetics

ABSTRACT. Amyloid formation is a generic property of many protein/polypeptide chains. A broad spectrum of proteins, despite diversity in their precursor sequences and heterogeneity in their aggregation mechanisms, produce a common cross-β spine structure that is often associated with several human diseases. However, a general modeling framework to interpret amyloid formation remains elusive. Herein, we propose a data-driven, ODE-based mathematical modeling approach that elucidates the most probable interaction network for the aggregation of a group of proteins (α-synuclein, Aβ42, Myb, and TTR proteins) by considering an ensemble set of network models, which include most of the mechanistic complexities and heterogeneities related to amyloidogenesis. The best-fitting model efficiently quantifies various timescales involved in the process of amyloidogenesis and explains the mechanistic basis of the monomer concentration dependency of amyloid-forming kinetics. Moreover, the present model reconciles several mutant studies and inhibitor experiments for the respective proteins, makes experimentally feasible non-intuitive predictions, and provides further insights into how to fine-tune the various microscopic events related to amyloid formation kinetics. This might help to formulate better therapeutic measures in the future to counter unwanted amyloidogenesis. Importantly, the theoretical method used here is quite general and can be extended to any amyloid-forming protein. The core model network structure is available in SBML format in a GitHub repository (https://github.com/baichandra05/Genericmodel_protein_aggregation), so that users can customize it and fit any type of kinetic data set according to their preferences. An associated flowchart of the algorithm has also been provided to make the approach easy to understand for general readers.

Dynamic fluctuations in a bacterial metabolic network
PRESENTER: Manika Kargeti

ABSTRACT. Metabolism converts nutrients to energy and biomolecules in order to sustain all cellular processes. Although the operation of central metabolism is perceived as deterministic, the dynamics and high connectivity of the metabolic network make it prone to generating fluctuations. Nevertheless, identification and characterization of such fluctuations through time-resolved metabolite measurements in small bacterial cells has remained a challenge. Here we use single-cell metabolite measurements based on Förster resonance energy transfer (FRET), combined with computer simulations, to explore the real-time dynamics of the metabolic network of Escherichia coli. We observe that step-like exposure to glycolytic carbon sources elicits large periodic fluctuations in the intracellular concentration of pyruvate in individual E. coli cells. We then sought the source of these fluctuations by using deletion strains of numerous enzymes in central carbon metabolism and by using many glycolytic and nonglycolytic sugars as the sole carbon source. These experiments suggested that a combination of biochemical reactions is responsible for these fluctuations, with reactions around the pyruvate node being especially crucial. Also, these fluctuations occur on a timescale of several minutes that is consistent with predicted oscillatory dynamics of the metabolic network. These fluctuations apparently propagate to other cellular processes, thus affecting multiple aspects of bacterial physiology and leading to post-translational heterogeneity of cellular states within a population. Normally, fluctuations (also known as noise) are considered a hindrance to performance, and many bacterial networks have regulatory systems in place to avoid them. But recent studies have demonstrated that these fluctuations and the resultant metabolic heterogeneity could be responsible for the survival of bacterial populations, especially ones exposed to highly dynamic environments.
Therefore, it might be beneficial to study metabolic variability in order to decipher underlying metabolic regulation and interactions of different cellular processes.

The role of the kynurenine metabolism in chronic graft-versus-host-disease: Insights from computational modeling and patient data
PRESENTER: Thomas Stiehl

ABSTRACT. Chronic graft-versus-host disease (cGVHD) is a severe complication after allogeneic haematopoietic cell transplantation (allo-HCT). In cGVHD, grafted immune cells are activated by the host's tissues, which they recognize as non-self. This results in a chronic activation of the immune system. In a significant number of patients, the chronic immune activation triggers fibrosis. This leads to occasionally life-long morbidity and increased mortality in patients who were cured of their haematological malignancies.

The development of fibrosis is poorly understood and current therapeutic strategies succeed only in a few patients. Kynurenine and its metabolites contribute to fibrosis, inflammation and immune-modulation. Kynurenine is a metabolite of tryptophan and is further degraded into anthranilic acid, kynurenic acid, 3-hydroxykynurenine and 3-hydroxyanthranilic acid. We have developed an ordinary differential equation model of the kynurenine metabolism. Combining the model with chromatography-tandem mass spectrometry measurements of kynurenine metabolites in serum helps to quantify how the metabolic fluxes in the kynurenine pathway change during the course of cGVHD and how these fluxes differ across clinically defined subtypes of cGVHD [1].

The model-guided data analysis suggests that fibrosing cGVHD is associated with a shift of the kynurenine metabolism towards anthranilic acid and kynurenic acid [1]. Systematic analysis of the model and dynamic simulations help to understand which enzymatic steps in kynurenine metabolism have to be inhibited to reproduce the metabolic patterns observed in patients. Some of the observed changes correlated with serum levels of immune mediators such as IL18, CXCL9 or cofactors (Vitamin B6). Other changes may require regulations of enzymatic activities.

The proposed model is the theoretical basis for the use of kynurenine metabolites as biomarkers for the risk of fibrosing cGVHD. At the same time, it generates new hypotheses about the fine-tuning of the human kynurenine metabolism that can experimentally be tested.

[1] Orsatti L, Stiehl T, Dischinger K, Speziale R, Di Pasquale P, Monteagudo E, Müller-Tidow C, Radujkovic A, Dreger P, Luft T. Kynurenine pathway activation and deviation to anthranilic and kynurenic acid in fibrosing chronic graft-versus-host disease. Cell Rep Med. 2021 Oct 19;2(10):100409. doi: 10.1016/j.xcrm.2021.100409. PMID: 34755129; PMCID: PMC8561165.

Mechanistic insights into sensitization and desensitization of the Interferon α signal transduction pathway

ABSTRACT. As a key component of the innate immune system, Interferon alpha (IFNα) orchestrates the antiviral response in hepatocytes. The IFNα signal transduction pathway is known to desensitize upon activation which constitutes a major problem for the usage of IFNα as treatment against chronic viral infections or as an anti-tumor drug. However, the mechanisms that lead to this desensitization remain poorly understood.

Here, an ODE model is presented that describes the biochemical reaction network of IFNα signaling in different hepatoma cell lines as well as primary human hepatocytes (PHH). The calibrated model shows that besides a dose-dependent desensitization mediated by the negative feedback components SOCS1 and USP18 that act at the receptor level, the signaling pathway can also show (hyper-)sensitization in consequence of an upregulation of the intra-cellular components IRF9 and STAT2.

The model predicted the dose-dependent dynamics of transcriptionally active complexes in the signaling pathway and their effect on mRNA production, as shown by independent validation experiments. Furthermore, the model-based analysis of measurement data from PHH revealed that each cell system establishes a particular dose-dependent sensitization behavior whose shape is strongly determined by the abundance of the feedback components USP18 and STAT2.

Our findings will help to understand the dynamics of production of Interferon Stimulated Genes (ISGs) which exert numerous antiviral effector functions, and serve as a basis for a patient-individual optimization of the antiviral response upon IFNα stimulation.

Fast parameter estimation for ODE-based models of heterogeneous cell populations
PRESENTER: Yulan van Oppen

ABSTRACT. Single-cell time series data frequently display considerable variability across a cell population. When these data are used to fit dynamic models of intracellular processes, it is more appropriate to infer parameter distributions that capture population variability, rather than fitting the population average to obtain a point parameter estimate. The current gold standard for inferring parameter distributions across cell populations is the Global Two Stage (GTS) approach for nonlinear mixed-effects models, where cell-specific parameter estimates and their associated uncertainties are calculated in the first stage and population parameter distributions are inferred in the second. Although the GTS method is reliable, its current implementation requires repeated use of non-convex optimization, which is not guaranteed to converge, while each optimization run requires multiple simulations of the system. These features make the GTS method computationally expensive.

We propose an alternative, computationally efficient implementation of the GTS method for mixed-effects dynamical systems which are nonlinear in the states but linear in the parameters (a class that encompasses a wide range of models such as those based on mass-action kinetics). For such systems, point parameter estimates can be obtained using least squares regression on time derivatives of smoothed measurement data, an approach called gradient matching. Here, we extend the application of gradient matching to the inference problem for mixed-effects dynamical systems and integrate it into the GTS method by properly accounting for uncertainties in individual cell parameters in the first stage. We also present an Expectation Maximization (EM) algorithm and associated parameter uncertainty estimates which are applicable when not all system states are observed, as is typical for biological systems.
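Gradient matching for a system that is nonlinear in the states but linear in the parameters can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the logistic model dx/dt = r*x - c*x^2, the helper names, and the noise-free synthetic trajectories are all hypothetical, smoothing is skipped because the data are noise-free, and the final averaging is a naive summary rather than the GTS second stage (which also accounts for per-cell uncertainties).

```python
import numpy as np

def logistic(t, r, c, x0):
    """Closed-form solution of dx/dt = r*x - c*x**2 with x(0) = x0."""
    K = r / c
    return K / (1.0 + (K - x0) / x0 * np.exp(-r * t))

def gradient_matching(t, x):
    """Estimate (r, c) by least squares on finite-difference derivatives."""
    dxdt = np.gradient(x, t)[1:-1]         # drop the less accurate endpoints
    xi = x[1:-1]
    A = np.column_stack([xi, -xi ** 2])    # dx/dt = r*x + c*(-x**2): linear in (r, c)
    theta, *_ = np.linalg.lstsq(A, dxdt, rcond=None)
    return theta

# Hypothetical population of 5 cells with normally distributed parameters.
t = np.linspace(0.0, 10.0, 201)
rng = np.random.default_rng(1)
true_params = np.column_stack([rng.normal(1.0, 0.1, 5),
                               rng.normal(0.5, 0.05, 5)])

# First stage: one cheap linear regression per cell, no ODE simulation needed.
estimates = np.array([gradient_matching(t, logistic(t, r, c, 0.1))
                      for r, c in true_params])

# Naive population summary (assumption: ignores per-cell uncertainty).
print("mean of per-cell estimates:", estimates.mean(axis=0))
```

Each per-cell fit is a single linear least-squares solve, which is where the speed-up over repeated non-convex optimization would come from.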

We demonstrate the efficiency of our approach with a small simulation study including three dynamical systems. For each system, we simulate N = 100 noisy trajectories and assume the model parameters follow a joint normal distribution. The computing times and accuracies of inferred distributions in terms of Fréchet distances from the ground truth (in parentheses) are given below for the original GTS method vs our adaptation. As our results demonstrate, gradient matching using linear regression yields a substantial improvement in terms of computational efficiency over the simulation-based GTS approach, at the cost of minor accuracy loss.

Computing time (Fréchet distance), original GTS method vs our adaptation:

- SIMPLE ENZYME KINETICS (three states, three parameters)
  All states observed: 89.23 sec (0.0514) vs 1.12 sec (0.0050)
  Two states observed: 74.71 sec (0.0416) vs 1.95 sec (0.0240)

- FLUORESCENT PROTEIN MATURATION (three states, five parameters)
  All states observed: 117.49 sec (0.0121) vs 1.09 sec (0.0285)
  One state observed: 72.04 sec (0.0302) vs 3.31 sec (0.0404)

- BIFUNCTIONAL TWO-COMPONENT SYSTEM (six states, four parameters)
  All states observed: 77.78 sec (0.1835) vs 4.65 sec (0.2162)
  Three states observed: 74.97 sec (0.2130) vs 5.23 sec (0.2128)

Dynamic analysis framework to detect cell division and cell death in live-cell imaging, using signal processing and machine learning
PRESENTER: Asma Chalabi

ABSTRACT. The detection of cell division and cell death events in live-cell assays has the potential to produce robust metrics of drug pharmacodynamics and to provide a more comprehensive understanding of tumor cell responses to cancer therapeutic combinations. As cancer drugs may have complex and mixed effects on the biology of the cell, knowing precisely when cellular events occur in a live-cell experiment makes it possible to study the relative contribution of different drug effects, such as cytotoxic or cytostatic effects, on a cell population. Yet, classical methods require dyes to measure cell viability as an end-point assay, where the proliferation rates can only be estimated when both viable and dead cells are labeled simultaneously, not to mention that the actual cell division events are often discarded due to analytical limitations.

Live-cell imaging is a promising cell-based assay to determine drug efficacies; however, its main limitation remains the accuracy and depth of the analyses needed to acquire automatic measures of the cellular response phenotype, which makes it difficult to understand drug action on cell populations. In this work, we present a new algorithmic architecture integrating machine learning, image processing and signal processing methods to perform dynamic image analyses of single-cell events in time-lapse microscopy experiments for pharmacological drug profiling. Our event detection method is based on pattern detection in the polarized light entropy, which exhibits two distinct patterns for cell division and death events and makes the method free of any labeling step. Our analysis framework is an open-source, adaptable workflow that automatically predicts cellular events (and their times) from each single-cell trajectory, along with other classic cellular features from cell image analyses, as a promising solution in pharmacodynamics.

Quantification of bacterial resource allocation in changing environments on the single-cell level
PRESENTER: Antrea Pavlou

ABSTRACT. The ability of microbes to colonize the most improbable places can be partly attributed to the efficient coordination between growth and metabolism. Over the last 50 years, the relationship between growth and the environment has been intensely studied, and has led to general empirical relationships or 'growth laws'. In most studies, however, bacteria are maintained at steady-state growth even though such conditions are rarely found in a natural environment. To investigate bacterial adaptation in changing environments, we have tracked growth and gene expression of single cells of Escherichia coli bacteria growing in a microfluidics device in changing environments. We have examined the behavior of key ribosomal and metabolic genes using fluorescent protein tags. Using inference algorithms, along with models accounting for the maturation kinetics of reporters, we were able to derive dynamic resource allocation profiles of each protein of interest from the time-lapse measurements [PhD thesis, Pavlou et al]. The experimental results provide a detailed view of resource allocation strategies of individual bacteria in dynamically changing environments. Even though the average behavior of the bacteria precisely matches known growth laws during steady state, resource allocation deviates from the classical growth laws during growth transitions. Furthermore, we identified considerable heterogeneity between bacteria that manifests itself in different strategies for adapting to a new environment. Our results reveal new principles of dynamical resource allocation and could be helpful in improving biotechnological processes involving microorganisms.

A robust model of neural superposition sorting
PRESENTER: Eric Reifenstein

ABSTRACT. Precise brain wiring relies on specific connections between pre- and postsynaptic partners, but the underlying mechanisms for how these connections are formed remain unclear. Here we use non-invasive intravital live imaging of Drosophila photoreceptor neurons in the lamina to establish a model of neural superposition. The model consists of three components: (i) mechanical stiffness of the growing axon, (ii) stochastic growth-cone extension towards regions of low tissue density, and (iii) short-range attraction of the growth cones to the postsynaptic partners to stabilize the wiring pattern. The relative contributions of the three components dynamically change over developmental time. The model reproduces the biological wiring pattern for all photoreceptor subtypes and for different subregions of the lamina. Most parameters of the model are estimated from the live-imaging data. For the few remaining parameters, we show that the modelled growth cones robustly reach their correct target locations for wide ranges of parameter values. In fact, the stochastic component of the model increases the robustness to parameter variations. In summary, our three-component model robustly works for a broad range of conditions and reproduces key experimental findings.

Decoding cellular deformation from pseudo-simultaneously observed RhoGTPase activities
PRESENTER: Katsuyuki Kunida

ABSTRACT. Limitations in simultaneous live-cell observation of multiple molecules have prevented researchers from elucidating the mechanism of their coordinated dynamic regulation of cellular functions. In this study, we propose Motion-Triggered Average (MTA), a novel method of data analysis that converts multiple individually observed molecular activities in a migrating cell into combined pseudo-simultaneous observations based on reverse correlation analysis. Using MTA, we successfully extracted pseudo-simultaneous activities of individually observed Cdc42, Rac1, and RhoA. To verify that the molecular activity time series extracted by MTA encoded information on cell edge movement, we predicted the edge velocity from the activities of the three molecules using a mathematical model and regression.

Amino Acid Impact on Protein Secretion in B. subtilis: an RBA Approach
PRESENTER: Rafael Moran

ABSTRACT. The production of recombinant proteins plays a major role in the biotechnology industry, both for innovative research and for the development of novel drugs. Microbial systems are receiving increasing attention for this purpose due to their low cost and high productivity. Protein secretion refers to the transport of a protein from the microbial cytoplasm to the outside of the cell. From an industrial perspective, protein secretion can reduce process costs by reducing the cytoplasmic protein-accumulation stress and by simplifying the downstream protein purification process. In the Gram-positive bacterium B. subtilis, protein secretion is affected by a variety of factors such as the type of promoter, deletion or overexpression of chaperones, the signal peptide and media composition. The latter highlights the relevance of studying the metabolic aspects involved in protein secretion. Resource balance analysis (RBA) is a genome-scale metabolic modeling approach based on resource allocation which has been shown to be able to perform quantitative predictions of metabolic fluxes, growth rates, and concentrations of proteins involved in a certain metabolic pathway. We implemented an RBA model in a secretory context to analyse the impact of media composition (focusing on amino acid composition) on the secretion of proteins. A further validation of the method is made for three different proteins in minimal media with varying amino acid concentrations in two different Bacillus strains.

Phylogeny and artificial neural networks

ABSTRACT. In recent years Artificial Neural Networks (ANNs) have become extremely popular. As powerful learning methods, they solve pattern recognition tasks and other challenges. We demonstrate how ANNs can be employed to solve the difficult long branch attraction problem in phylogenetics. When long branches are placed adjacent to each other on a reconstructed phylogeny, it is difficult to tell if this placement is artefactual (Felsenstein-type), or accurate (Farris-type). We developed F-zoneNN, an ANN which infers with high accuracy if the input data evolved under a Farris-type or Felsenstein-type tree. Despite its success, it is difficult to identify the features in the data that F-zoneNN leverages to make the decisions. F-zoneNN’s architecture comprises a composition of 9 linear and 9 non-linear functions including more than 1.2 million parameters, and so it is impossible to tell what drives the decisions of such an ANN. To get deeper insights into the decision-making process we endeavoured to simplify F-zoneNN as much as possible, without sacrificing accuracy. This led to the development of an alternative mathematical representation of sequence alignments. Using this representation as input, we found a simple and explainable rational function which can infer the tree-type with high accuracy. This technique to simplify an ANN in order to identify an explainable function that is already able to perform the task harbours great potential for use in other applications of ANNs.

Discovery of Robust and Highly Specific Microbiome Signatures for Non-Alcoholic Fatty Liver Disease
PRESENTER: Emmanouil Nychas

ABSTRACT. Non-alcoholic fatty liver disease (NAFLD) is a metabolic disease with a global prevalence of almost 25%. The pathogenesis of NAFLD is still poorly understood; however, we know that the gut microbiome is highly associated with the development of the disease. Up to now, finding robust bacterial signatures for NAFLD has been a great challenge, mainly because the disease often co-occurs with other metabolic diseases such as type 2 diabetes, obesity, and hypertension, making it difficult to find what is highly specific for NAFLD and what is masked by the presence of other diseases. Differences in the analytical tools used by different studies, from the sequencing and metabolomic platforms to taxonomic profiling and statistics, can cause results to diverge greatly among studies, making them non-comparable. Lastly, previous studies mainly focused on finding single species that have a significant impact on the disease, instead of taking a more community-based approach. To address the issues above, we performed a large-scale meta-analysis collecting very detailed clinical, metagenomic sequencing, and in silico metabolomic data for 1231 Chinese subjects with metabolic diseases. Samples were all processed in the same pipeline and, after regrouping based on their clinical data, were placed in the following categories: NAFLD-Overweight, NAFLD-Lean, Prediabetes, T2D-Overweight, T2D-Lean, Hypertension, Pre-hypertension, and Atherosclerosis, plus the respective controls. In summary, we built highly specific NAFLD diagnostic models using microbial species and metabolites. Moreover, we highlighted important bacterial consortia and metabolites that are unique to and highly associated with NAFLD. Lastly, we revealed key differences and similarities between overweight and lean NAFLD.

Building the knowledge base to understand cellular signal transduction in different inflammatory phenotypes
PRESENTER: Marcus Krantz

ABSTRACT. Within the X-HiDE project, we aim to understand the establishment and resolution of inflammation, and how different states of the underlying signal transduction network result in different inflammatory phenotypes. However, the biochemistry of this signal transduction network is notoriously complex: each component may be regulated by multiple modifications and interaction partners, which can be combined in a large number of different configurations. Furthermore, single inputs trigger multiple downstream signalling processes, each of which may be triggered or antagonised by multiple inputs. Finally, the function of the signal transduction system differs between individuals and cell types, depending on genetic variation and gene expression differences. Consequently, a useful knowledge base must be comprehensive, to account for all those interacting processes, as well as mechanistically detailed, to account for allele and expression differences as well as the impact of drug treatments. Here, we present a literature-based mechanistic model of the network recognising infection, from the recognition of pathogen-associated molecular patterns by the toll-like receptors to activation of NF-kappa-B and IRF3/IRF7 mediated transcription. By using rxncon, the reaction-contingency language, we avoid the combinatorial complexity associated with microstate-based formalisms. Hence, in contrast to previous efforts, we can integrate all processes into a single network that defines a unique logical model that can be executed without further parametrisation. While limited to qualitative predictions, it provides a powerful tool for network validation and genotype-to-phenotype analysis. Taken together, we present an approach that reconciles mechanistic detail and scalability in signal transduction modelling, opening the door to models of the regulatory network in health and disease that are comprehensive in both scope and detail.

Macroecological Laws Naturally Arise from Complex Chaotic Dynamics of Gut Microbiota

ABSTRACT. The human gut microbiota consists of hundreds of bacterial species, and bacterial composition varies significantly between individuals, over time, and spatially within the gut. We and others have previously shown that the dynamics of the gut microbiota are characterized by the same quantitative scaling laws as those observed in the ecology of plants and animals. Although various mathematical models exist in ecology that aim to explain the origins of individual scaling laws in animal and plant species, the nature of these laws remains an open area of research. More recent models of microbiota also stop short of reproducing all known statistical scaling laws with the correct scaling coefficients. We believe that in order to gain meaningful insights into open questions in mathematical ecology, such as the relationship between biodiversity and stability, a model that can capture all ecological scaling laws with accurate coefficients is currently necessary.

To that end, we adapted a generalized Lotka-Volterra model of gut microbial dynamics, which, with several biologically-motivated modifications, was able to accurately reproduce all considered macroecological scaling laws observed in the gut microbiota, consistently regardless of random realizations of the distribution of interactions and growth rates. This model suggests several conceptually important insights.

A. No environmental stochasticity. Environmental stochasticity is not required for maintaining the necessary temporal dynamics on short time scales. By implementing a model without external sources of noise, we demonstrate that deterministic chaos can reproduce abundance profiles with low short-term autocorrelation that have been observed in nature, without assuming that environmental fluctuations or measurement noise are the primary culprits of this behavior.

B. Spatial heterogeneity. We incorporate spatial variability into the model, which is necessary for maintaining high diversity over long time scales. Explicit modeling of spatial structure provides our analysis with an additional level of interpretability, allowing comparisons with experimental spatial microbiota measurements.

C. Carrying capacities and total load. Our model does not require predetermined fixed carrying capacities for each species. Instead, the power-law relationship between rank and abundance of species arises naturally as an emergent property of the dynamical system. Finally, although measurements of the ecological state of the system are performed on normalized, compositional data, our model explicitly accounts for variability in total abundances of all species.

In conclusion, we show that a deterministic gLV model with a small number of hyperparameters can generate long-term ecological dynamics that reproduce all considered ecological scaling laws with accurate coefficients, without relying on any environmental noise or species re-introduction from an external reservoir. The assumptions of the model, and subsequent analyses, give several key insights into the mathematical mechanisms required for generating realistic ecological systems.
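
As a rough illustration of the model class named in this abstract, the following is a minimal sketch of a plain, deterministic generalized Lotka-Volterra (gLV) system, dx_i/dt = x_i (r_i + sum_j A_ij x_j). It does not include the authors' biologically motivated modifications or spatial structure, and the toy parameters are not tuned to produce chaos. The function names, the random interaction matrix, the growth rates, and the fixed-step Runge-Kutta integrator are all assumptions made for this example.

```python
import numpy as np

def glv_rhs(x, r, A):
    """gLV right-hand side: dx_i/dt = x_i * (r_i + sum_j A_ij * x_j)."""
    return x * (r + A @ x)

def simulate_glv(x0, r, A, dt=0.01, steps=5000):
    """Integrate the gLV system with a fixed-step fourth-order Runge-Kutta scheme.

    No noise term is added anywhere, so the trajectory is fully deterministic.
    """
    x = np.asarray(x0, dtype=float)
    traj = np.empty((steps + 1, x.size))
    traj[0] = x
    for k in range(steps):
        k1 = glv_rhs(x, r, A)
        k2 = glv_rhs(x + 0.5 * dt * k1, r, A)
        k3 = glv_rhs(x + 0.5 * dt * k2, r, A)
        k4 = glv_rhs(x + dt * k3, r, A)
        x = x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        traj[k + 1] = x
    return traj

# Hypothetical toy community: 5 species, weak random interactions,
# self-limitation on the diagonal (illustrative values only).
rng = np.random.default_rng(0)
n = 5
r = rng.uniform(0.5, 1.0, size=n)
A = rng.normal(0.0, 0.1, size=(n, n))
np.fill_diagonal(A, -1.0)

traj = simulate_glv(np.full(n, 0.1), r, A)

# Total load varies in the model; compositional data are obtained by normalizing.
rel = traj / traj.sum(axis=1, keepdims=True)
```

Normalizing each time point by the total abundance mirrors the distinction the abstract draws between compositional measurements and the varying total load that the model tracks explicitly.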

Boolean dynamic modeling of TNFR1 signaling predicts a nested feedback loop regulating the apoptotic response at single-cell level (EXCEPTION - ZOOM)

ABSTRACT. Tumor Necrosis Factor Receptor 1 (TNFR1) signaling in cells, triggered by TNFα, exhibits cell-to-cell variability in pro-survival and apoptotic phenotypic responses. The causal factor that accounts for such variability is the heterogeneity in signal flow within intracellular signaling entities. Signal flow controls the balance between these two phenotypes. However, modulating such signal flow to make cells favor apoptosis, which has been considered in cancer therapies, is still under investigation. We use Boolean dynamic modelling to account for signal flow path variability and identify a 6-node nested feedback loop that facilitates crucial cross-talk regulation between these two phenotypic responses. We achieve this by systematically developing a novel approach, “Boolean Modeling based Prediction of Steady-state probability of Phenotype Reachability (BM-ProSPR)”, to construct a reliable partial state transition graph (pSTG) in a computationally efficient manner and by analysing the pSTG to accurately predict the extent of the network’s long-term response. We show that knocking off the Comp1-IKK* complex redirects the signal flow path, leading to a ~62% increase in the probability of an apoptotic response, and thereby favors phenotype switching from pro-survival to apoptosis. Priming cancerous cells with inhibitors targeting the interaction involving Comp1 and IKK* prior to TNFα exposure could be a potential therapeutic strategy.

Frequency preference in miRNA-mRNA interaction models: competition and cooperativity effects
PRESENTER: Candela Szischik

ABSTRACT. A number of recent studies have highlighted that living cells are inherently dynamic. Live-cell time-lapse microscopy and fluorescent reporter genes have made it possible to track the dynamic behavior of specific molecules over time, thereby uncovering a picture in which many regulatory proteins undergo pulses of activation and deactivation. Recent quantitative studies have mathematically investigated the implications of pulsatility and periodicity in diverse signaling contexts. Results demonstrate that oscillatory signals are able to carry key biological information encoded in the shape of pulses. It has been shown that features such as the duration, frequency and amplitude can determine the behavior of signaling pathways. MicroRNAs (miRNAs) are small RNA molecules that do not code for proteins and regulate their target, messenger RNA (mRNA), through a post-transcriptional mechanism [1]. It is generally accepted that miRNAs repress gene expression by promoting the degradation of target mRNA and/or inhibiting its translation [2]. Thousands of mature miRNAs have been observed in humans, and regulatory circuits involving miRNAs are increasingly being uncovered acting in key biological processes, such as development and differentiation [3]. MiRNAs play fundamental roles in pathological contexts related to tumorigenesis, viral infection, and neurological diseases [4]. A single miRNA regulates multiple target genes, and a gene is typically targeted by several different miRNAs. An mRNA molecule often presents multiple binding sites for a specific miRNA, making the extent of repression dependent on how many of these sites are bound and thus conferring cooperative properties on the interaction between a miRNA and its target [5].
We theoretically address pulsatile signaling within miRNA-mediated regulation, focusing on the role of oscillation frequency and investigating its interplay with significant features of miRNA-mRNA interaction, namely cooperative binding and competition between targets. We feed the system with pulsatile miRNA expression, and we compute frequency-dependent responses in terms of biologically relevant quantities. Specifically, we focus on observables that characterize the strength of miRNA-mediated repression, such as the level of unbound target mRNA and the fold-repression. Our results indicate that modulating some of the parameters that control competition and cooperativity could serve as a tuning mechanism: it can shift and sharpen the frequency preference response, leading to non-intuitive effects. These results can also be addressed experimentally by quantifying target fold-repression with time-lapse microscopy, using optogenetics to induce pulsatile miR-20a expression and transfecting fluorescent targets.

[1] Bartel, D. P. Cell 116, 281–297 (2004).

[2] He, L. & Hannon, G. J. Nat. Rev. Genet. 5, 522–531 (2004).

[3] Ferro, E., Enrico Bena, C., Grigolon, S. & Bosia, C. Cells 8, 1540 (2019).

[4] Tealdi, S., Ferro, E., Campa, C. C. & Bosia, C. Biomolecules 12, 213 (2022).

[5] Bosia, C., Pagnani, A. & Zecchina, R. PLoS ONE 8, 6 (2013).

Coordination of p53 and MAPK dynamics controls heterogeneous responses to genotoxic agents in single cells

ABSTRACT. Heterogeneous cellular responses to chemotherapy are a significant obstacle in cancer treatment. One source of such heterogeneity is variability in the temporal expression and activity of key signal transduction pathways that detect cell stresses and coordinate appropriate responses in individual cells. We have shown that variable p53 expression dynamics can generate distinct cellular responses to genotoxic agents. However, in some cases distinct stresses can generate the same p53 dynamics but different cell fate outcomes, suggesting integration of dynamic information from other pathways is important for cell fate regulation. We focused on pancreatic cells and quantified the dynamics of p53 and the MAPKs, signaling systems frequently mutated in pancreatic cancer. To determine how MAPK activities affect p53-mediated responses to DNA double strand breaks and oxidative stress, we used time-lapse microscopy to simultaneously track p53 and ERK, JNK, or p38 MAPK activities in single cells. While p53 dynamics were comparable between the stresses, cell fate outcomes were distinct. Combining MAPK dynamics with p53 dynamics was important for distinguishing between the stresses and for generating temporal ordering of downstream cell fate pathways. Cross-talk between MAPKs and p53 controlled the balance between proliferation and cell death. These findings provide insight into how individual cells integrate signaling information from separate pathways with distinct temporal patterns of activity to encode stress specificity and drive heterogeneous cell fate decisions. Furthermore, our results identify timing windows during which combination drug treatments can effectively alter cell fate responses to genotoxic agents.

Explainable Machine Learning in Mass Spectrometry-based Proteomics
PRESENTER: Tom Altenburg

ABSTRACT. Mass spectrometry (MS)-based proteomics provides a holistic view of all proteins in cells. Recently, deep learning approaches have begun to emerge in MS-based proteomics. However, the explainability of these approaches typically falls short. We recently developed deep learning models that directly look at tandem mass spectra to make decisions and also enable new downstream analysis approaches. These models either detect high-level peptide properties, such as post-translational modifications (10.1038/s42256-022-00467-7), or embed spectra and peptides jointly (10.1101/2021.12.01.470818), which enabled us to implement a fast and robust open search algorithm. Here, we show our explainability results for phosphoproteomics data on the level of individual peaks and spectra. We also visualize joint embeddings between spectra and peptides, which in turn reflect physico-chemical properties of peptides. Finally, we propose a differential transfer learning scheme, using our previously trained models, to pinpoint fragmentation differences between samples (e.g. diseased/healthy). Consequently, fewer data and annotations are required than for full training. This may spark future explainability applications in single-cell proteomics, including subtyping, and may also allow the detection of biomarkers and pathogens.

Logical modeling of signaling mechanisms in early development
PRESENTER: Yozlem Bahar

ABSTRACT. During early nervous system development, a variety of neural cell types originate from an epithelial sheet of ectodermal cells called the neuroectoderm. The neurogenic region is subdivided into specialized domains where patterning gene expression is directed by signaling mechanisms [1]. Although the regulation of the patterning genes in neuroectodermal progenitors has been widely studied, the signaling mechanisms that integrate positional information for the generation of cell diversity are yet to be resolved.

This project investigates the regulatory mechanisms that lead to cellular diversity by developing a logical model of the neuroectoderm in Drosophila. Given the complexity of developmental networks encompassing multiple regulatory circuits, a logical modeling framework allows us to study and predict the behavior of such networks. Gene regulatory networks underlying neural stem cell differentiation, inferred from bulk and single-cell RNA sequencing data, were combined with information from the literature and refined into logic functions. These functions enabled analysis of the dynamical properties of the regulatory network via attractor search and regulatory circuit analysis.

The computational neuroectoderm model presented here serves as a tool to investigate the role of regulatory mechanisms in cell fate diversification. It comprises three positive, functional circuits, which are known to have the potential to give rise to multiple differentiated states. It captures the regulatory events driving neuroectoderm specification and predicts cell fate based on positional information from signaling pathways.
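
To make the idea of attractor search on a logical model concrete, here is a minimal sketch on a hypothetical two-gene network, not the neuroectoderm model itself. The two genes inhibit each other, which forms a positive circuit (two negative interactions), and the search enumerates all states under synchronous updating. The rules, function names, and update scheme are assumptions chosen for illustration.

```python
from itertools import product

def step(state, rules):
    """Synchronous update: every node is recomputed from the current state."""
    return tuple(int(rule(state)) for rule in rules)

def find_attractors(rules, n_nodes):
    """Follow every initial state until a state repeats and collect the cycles.

    Fixed points are returned as cycles of length 1.
    """
    attractors = set()
    for start in product((0, 1), repeat=n_nodes):
        seen = []
        state = start
        while state not in seen:
            seen.append(state)
            state = step(state, rules)
        cycle = seen[seen.index(state):]
        # Rotate to a canonical starting state so each attractor is stored once.
        i = cycle.index(min(cycle))
        attractors.add(tuple(cycle[i:] + cycle[:i]))
    return attractors

# Hypothetical toggle: each gene is on only when the other is off.
rules = [
    lambda s: not s[1],  # gene 0 is repressed by gene 1
    lambda s: not s[0],  # gene 1 is repressed by gene 0
]

attractors = find_attractors(rules, 2)
fixed_points = sorted(a[0] for a in attractors if len(a) == 1)
```

In this toy case the positive circuit yields two fixed points, (0, 1) and (1, 0), which play the role of two differentiated states. The additional two-state cycle between (0, 0) and (1, 1) arises from the synchronous update scheme assumed here.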

[1] Price, D. J., Jarman, A. P., Mason, J. O. & Kind, P. C. Patterning the Neuroectoderm. Build. Brains 77–104 (2017).

Multi-dimensional immune correlates of early prediction of COVID-19 severe versus asymptomatic disease course

ABSTRACT. Background and objectives: COVID-19 patients present a wide range of severity, from asymptomatic to debilitating symptoms and critical conditions requiring hospitalization. It is still not understood why some patients have a severe or highly symptomatic COVID-19 disease course while others remain asymptomatic. To understand the viral and immune correlates of severity, we investigated multiple dimensions of the immune response, including single-cell analysis, in asymptomatic versus symptomatic versus hospitalized patients early after SARS-CoV-2 infection. Materials and Methods: Patients (N=104) were recruited early (within seven days post symptom onset) after SARS-CoV-2 infection (wild-type and alpha variants) and four samples were taken within the first month. Serum cytokine levels by ultra-sensitive ELISA (Simoa, Quanterix) and oral/nasal SARS-CoV-2 viral load (RNA and Nucleocapsid-antigen) were measured longitudinally. PBMCs were analyzed by single-cell intra-cellular-staining (ICS) cytometry, single-cell CyTOF and EliSpot, before and after stimulation with Spike and Nucleocapsid peptide pools. Bioinformatics analysis (PCA, UMAP and ML) was performed to cluster the patient groups and identify the cellular immune correlates. Results: A cytokine combination, based on the ratio of inflammatory cytokines to type-I interferons, was identified as a highly accurate (>95%) early predictor of both the likelihood of hospitalization and symptom severity. Hospitalized patients have significantly higher levels and longer duration of inflammatory cytokines. Moreover, asymptomatic patients present with significantly lower viral loads, in correlation with higher frequencies and counts of SARS-CoV-2-specific activated CD4 and CD8 T-cells. In particular, asymptomatic patients show a significantly higher frequency of Th1-related cytokine expression and, of note, a higher level of poly-functional CD4 T-cells expressing multiple cytokines.
Interestingly, asymptomatic status is more significantly associated with a potent CD4 response against Nucleocapsid-antigen than against Spike-antigen. Furthermore, inflammatory cytokine (e.g., Interleukin-6) levels do not correlate with counts of activated, or cytokine-expressing, CD4 and/or CD8 T-cells specific for SARS-CoV-2. Rather, higher levels of inflammatory cytokines are correlated with a higher count of classical monocytes. Conclusions: COVID-19 hospitalization and symptom severity can be accurately predicted within one week of symptom onset. This early predictor, using cytokine levels that can feasibly be measured in a point-of-care setting, can guide personalized medicine with anti-viral or cytokine-inhibitor therapy. Furthermore, extensive single-cell analysis allowed us to identify the immune correlates behind this predictor. An asymptomatic disease course is characterized by a potent anti-Nucleocapsid CD4 Th1 response, in association with lower viral loads and lower inflammatory cytokine levels. Conversely, more severe patients have a less potent SARS-CoV-2-specific T-cell response and higher viral loads, as well as higher levels of inflammatory cytokines, apparently produced by monocytes rather than by specific T-cells. These results have important clinical relevance for the development of both personalized therapy and vaccines aimed at reducing severity, by indicating the source of the inflammatory cytokine storm.

Hybrid models for cellular signaling: meso-scale pathway identification

ABSTRACT. Experimentation using combinatorial perturbations of biological systems is a common method to extract data and study the underlying, interacting mechanisms that compose them. These interactions can be represented by a network of unknown structure, with the overall output of the entire system, for a small subset of the input space, as the only available information. An accurate description of these processes constitutes an important step towards a more efficient medicine, personalised in treatments and free of undesired side effects.

We develop a method for network reconstruction from static perturbation datasets by means of hybrid models, that is, structured combinations of mechanistic modelling and machine learning techniques, and apply it to the context of signalling pathways. The method is based on the identification of patterns in the data and their relation to graph substructures representing portions of the signalling network. This allows us to avoid restrictive modelling assumptions and possible biases from external sources, but at the same time reduces the level of attained detail. The end result is a quantitative, meso-scale version of the network (it sits in the middle between a macroscopic and a proteomic-scale description) that highlights differences and similarities between the response mechanisms of groups of phosphoproteins.

Graph-theoretic approaches might provide a link to a more detailed reconstruction. Otherwise, the method could serve as a prior in a multi-scale modelling pipeline that includes existing proteomic-scale reconstruction algorithms.

Gene-essentiality based drug signature helps repurposing non-cancer drugs
PRESENTER: Jing Tang

ABSTRACT. Cancer drugs often kill cancer cells independent of their putative targets. The lack of understanding of drug-target interactions prevents biomarker identification and ultimately leads to high attrition in clinical trials. In this study, we explored whether the integration of loss-of-function genetic and drug sensitivity screening data could help identify the mechanisms of action of drugs. We constructed a gene-essentiality drug signature by integrating loss-of-function genetic and drug sensitivity screening data. A machine learning model was developed, where the coefficients of all the genes were considered as the gene-essentiality signature of the drug. We compared the gene-essentiality signatures against structure-based fingerprints as well as gene expression signatures in both supervised and unsupervised target predictions. We showed that the gene-essentiality signature can better predict drug targets and their downstream signaling pathways. We then confirmed the validity of our framework in the PRISM dataset generated by a large-scale drug screening experiment. Finally, we predicted the targets for the non-cancer drugs in the PRISM screens that better explain their anticancer efficacy, which may pave the way for drug repositioning.

Elucidating the role of TAS2R43 in HGT-1 cells – an integrated omics approach
PRESENTER: Juergen Behr

ABSTRACT. Originally identified on the tongue for their chemosensory role, the receptors for bitter taste are also found to be expressed in non-gustatory tissues and are involved in non-sensory processes, e.g. in the regulation of food intake and in anti-inflammatory processes in immune defense functions. In some types of cancer, they are implicated in important cellular processes such as apoptosis and proliferation. However, knowledge regarding the functional role of bitter taste sensing receptors (TAS2Rs) is still incremental. Here, we aimed at elucidating the role of TAS2R43 in a human gastric cancer cell line (HGT-1) by means of an integrated omics approach applied to HGT-1 wild type cells and CRISPR-Cas9 stably deleted TAS2R43 KO HGT-1 cells. In growth experiments, TAS2R43 KO HGT-1 cells showed a reduced growth rate compared to HGT-1 wt cells, suggesting an overall impact of TAS2R43 on cellular metabolism and proliferation. To gain insight into the impact of the bitter receptor on multiple levels of cellular metabolism, we chose a multi-omics approach (n = 4) that combines transcriptome, proteome and metabolome data. For integration of the multi-omics datasets, task-specific synthetic sample generation was combined with multi-omics data integration based on the Bioconductor R package multiSight. Our goal was to compensate for the lack of extensive biological replication, which is on the one hand often not easily feasible for multi-omics approaches, but on the other hand required for the application of certain data analytical methods such as the one mentioned above. Synthetic sample generation was based on the LoRAS (Localized Randomized Affine Shadow Sampling) oversampling technique, which was extended by a noise generator. In this way, random noise was introduced into each LoRAS-generated sample to reduce the tendency of the LoRAS algorithm to further separate the distributions of experimental groups by synthetically expanding them.
Subsequently, the oversampled data set was analyzed by multiSight, which provides methods for multi-omics classification, functional enrichment and network inference analysis. Initial results indicate the potential usefulness of such an approach when sample replication is limited but high-dimensional data from techniques such as omics are available. We will compare this approach to classical statistical ones, discuss potential advantages and risks, and try to connect the outcomes to the observed phenotypic data in order to elucidate or predict the impact of bitter taste receptors on cellular life.

Dynamics of the circadian clock in models of cancer and its potential for treatment
PRESENTER: Carolin Ector

ABSTRACT. Oscillating with a period of circa 24 hours, the circadian clock is an essential regulator of multiple physiological and behavioral aspects, and it is present at both the individual-cell and whole-organism level. Previous studies demonstrated the involvement of the circadian clock in the control of cellular responses to DNA damage, thereby paving the way for chronotherapy, which can be defined as the optimal timing of administering an anti-cancer drug to maximise its efficiency and/or minimise adverse effects. Even though the importance of the circadian clock in medicine is increasingly recognized, its role in the treatment of various cancer types remains largely unknown. In my PhD project I plan to fully characterize the properties of the circadian clock in different cancer models. We focus on the quantification of the oscillator’s parameters, such as rhythmicity, rigidity, amplitude and period, and on the stability of these parameters in time. In my poster I will present our preliminary results characterizing the oscillator parameters using circadian luciferase reporters and their potential implications for chronotherapy.

Identification of personalised driver gene panels in Melanoma

ABSTRACT. Melanoma has one of the highest mutation burdens among tumours and has been broadly classified into four mutually-exclusive mutational subtypes. Identifying these driver mutation profiles that influence the cancer progression of each patient is necessary to understand their disease risk and to develop personalised treatment regimes. Driver genes are typically identified based on the frequency of occurrence in cohorts, which leaves out many potential drivers in patients. Individuals exhibit very high heterogeneity in their mutational profiles, especially when mutation combinations are considered, so it is essential to address the driver mutations in each patient.

In this work, we report short personalised driver panels in 303 patients with human skin cutaneous melanoma (SKCM) from the TCGA cohort, identified using iCanD, a new unbiased computational algorithm developed by us. The individual panel sizes in patients ranged from 1 to 6 genes. iCanD correctly identified the known driver genes BRAF, NRAS, KIT or NF1, also used as subtype markers, in a subset of patients. In others, it identified drivers more influential than these four, providing a means to achieve alternate subtyping. The panels were patient-specific and essentially captured gold-standard drivers, while a significant number were rare, novel cancer driver genes. No two patients had the same panel of genes, reflecting intra-cohort heterogeneity. The driver genes in the panels have the highest influence on the disease-associated perturbations and are hence connected to a large number of differentially expressed genes.

iCanD also computes a prognosis risk score (PRS) for each panel, which serves as a hazard indicator for each patient. The PRSs on the whole a) were higher for patients who had shorter survival time, b) showed an increasing trend with advancing stages of the disease, and c) showed an increasing trend among the subtypes in the following order: BRAF, NRAS, KIT and NF1. Many panels had at least one actionable gene, leading to new strategies for exploring therapeutic opportunities.

Inference of differential gene regulatory networks from gene expression data using boosted differential trees
PRESENTER: Gihanna Galindez

ABSTRACT. Diseases can be caused by molecular perturbations that induce specific changes in regulatory interactions and their coordinated expression, also referred to as network rewiring. However, the detection of complex changes in regulatory connections remains a challenging task and would benefit from the development of novel non-parametric approaches. We developed a new ensemble method called BoostDiff (boosted differential regression trees) to infer a differential network discriminating between two conditions. BoostDiff builds an adaptively boosted (AdaBoost) ensemble of differential trees with respect to a target condition. To build the differential trees, we propose differential variance improvement as a novel splitting criterion. Variable importance measures derived from the resulting models are used to reflect changes in gene expression predictability and to build the output differential networks. We first applied BoostDiff on simulated data in comparison to existing differential network methods. We then demonstrate the power of our approach when applied to real transcriptomics data in COVID-19 and Crohn’s disease. BoostDiff identifies context-specific networks that are enriched with genes of known disease-relevant pathways and complements standard differential expression analyses. BoostDiff is available at https://github.com/gihannagalindez/boostdiff_inference.

Initial source of heterogeneity in a model for cell fate decision in the early mammalian embryo

ABSTRACT. Cell differentiation is the process during which cells from a population of common progenitors evolve towards distinct cellular fates. These fates are characterized by different transcription factor (TF) levels. This evolution is governed by Gene Regulatory Networks (GRN) and controlled by intercellular signaling. Since cells need to acquire different fates, there must be some source of heterogeneity in the common population of progenitors that translates into the emergence of different cell fates.

We focused on early mammalian embryogenesis, where the zygote, after successive rounds of mitosis, can give rise to two distinct populations: the Inner Cell Mass (ICM) and the Trophectoderm (TE). The ICM population itself then gives rise to the Epiblast (Epi) and to the Primitive Endoderm (PrE). These two populations are characterized by a high level of Nanog and Gata6 transcripts, respectively. In addition, several other components of the FGF/ERK signaling pathway are expressed at different levels in those cells, particularly Fgf4, which acts as an intercellular signal.

We previously derived a model of the gene regulatory network (GRN) involved in the differentiation process described above. This model consists of 25 cells (placed on a 5x5 matrix grid), and, for each cell, the evolution of 5 variables (Nanog (N), Gata6 (G), Fgf4 receptor (FR), ERK signaling (ERK) and extracellular Fgf4 (F)) is described by Ordinary Differential Equations. Cells are coupled to their neighbours via secretion/perception of Fgf4. For the cells to acquire different fates, an initial heterogeneity was applied to the perceived Fgf4 concentration. Simulations were accompanied by bifurcation analyses using the XPP AUTO software. The model was able to reproduce experimental results, including the Epi/PrE population proportions, the temporal evolutions of Nanog and Gata6 expression, and predict the emergence of tri-stability.

We then explored if appropriate behaviour is still obtained with the model when the initial heterogeneity is applied on parameters related to the rates of transcription of specific genes, instead of a heterogeneity in the perceived Fgf4 concentration. Statistical analysis of the time-dependent distributions of the levels of expression of the genes of the network reveals differences in the evolutions of the variance-to-mean ratios of key variables of the system, depending on the source of variability. Comparison with experimental data points to the rate of synthesis of the key transcription factor Nanog as a likely initial source of heterogeneity.
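
The statistic mentioned in this paragraph, the variance-to-mean ratio across cells at each time point, can be sketched as follows. This is only an illustration of the quantity, not the authors' analysis pipeline; the function name, the array layout (time points by cells), and the toy numbers are assumptions.

```python
import numpy as np

def variance_to_mean_ratio(levels):
    """Across-cell variance divided by across-cell mean, per time point.

    levels: array of shape (n_timepoints, n_cells) holding the expression
    level of one variable (for example a transcription factor) in each cell.
    Uses the unbiased sample variance (ddof=1) and assumes non-zero means.
    """
    levels = np.asarray(levels, dtype=float)
    mean = levels.mean(axis=1)
    var = levels.var(axis=1, ddof=1)
    return var / mean

# Toy data: 2 time points, 4 cells (illustrative values only).
levels = np.array([
    [1.0, 1.0, 1.0, 1.0],   # identical cells: no heterogeneity
    [1.0, 2.0, 3.0, 4.0],   # cells have diverged
])
vmr = variance_to_mean_ratio(levels)
```

Computing this ratio for each variable under each assumed source of initial heterogeneity gives time courses that can then be compared with the same statistic derived from experimental data.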

Exact Confidence Regions for Non-Linear Models

ABSTRACT. Since the confidence regions of linearly parametrised models always constitute perfect ellipsoids around the maximum likelihood estimate, their shape can be fully encoded using a positive-definite covariance matrix. In contrast, the confidence regions of non-linearly parametrised models exhibit non-linearly distorted shapes, which strongly complicates a faithful assessment of the parameter uncertainties. Given that virtually all models obtained from theoretical considerations are non-linear with respect to their parameters, this impacts a broad range of research fields in the systems biology community and beyond.
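
For the linear case described above, the ellipsoidal confidence region can be written as the set of parameters θ with (θ − θ̂)ᵀ Σ⁻¹ (θ − θ̂) ≤ c, where θ̂ is the maximum likelihood estimate and Σ the covariance matrix. The sketch below checks membership for a two-parameter model, assuming Gaussian errors with known covariance; in that case c is the chi-square quantile with two degrees of freedom, which has the closed form −2 ln(1 − level). The function name and example values are hypothetical, and this covers only the linear baseline, not the vector-field method of the abstract.

```python
import numpy as np

def in_confidence_ellipse(theta, theta_hat, cov, level=0.95):
    """Check whether theta lies inside the confidence ellipse around theta_hat.

    Assumes a two-parameter, linearly parametrised model with Gaussian errors
    and known covariance matrix cov (2 x 2, positive definite).
    """
    d = np.asarray(theta, dtype=float) - np.asarray(theta_hat, dtype=float)
    # Squared Mahalanobis distance d^T cov^{-1} d, without forming the inverse.
    m2 = d @ np.linalg.solve(np.asarray(cov, dtype=float), d)
    # Chi-square quantile for 2 degrees of freedom: CDF is 1 - exp(-x / 2).
    threshold = -2.0 * np.log(1.0 - level)
    return bool(m2 <= threshold)

theta_hat = [0.0, 0.0]
cov = np.eye(2)
inside = in_confidence_ellipse([2.0, 0.0], theta_hat, cov)
outside = in_confidence_ellipse([3.0, 0.0], theta_hat, cov)
```

For non-linearly parametrised models no such closed-form test exists, which is the gap the method in this abstract addresses.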

Our approach uses a special family of vector fields whose integral manifolds are the confidence boundaries, so that integrating along these vector fields yields the boundaries efficiently. This turns the problem of finding exact confidence regions into numerically solving a system of ODEs. The need to sample the likelihood over large volumes in the parameter space is therefore eliminated, which represents a significant reduction in computational effort. Furthermore, knowledge of exact confidence regions can subsequently be used to quantify the uncertainty in the model predictions via confidence bands.

By making comprehensive parameter uncertainty analyses feasible for a wider class of problems through its improved efficiency, our method allows for more nuanced insights into the mechanisms underlying biological processes.

Towards rational design of antibiotic combination therapies

ABSTRACT. Antimicrobial resistance is on the rise globally. Increased levels of resistance have been reported across bacterial strains and antibiotic compounds. Of special concern are multi-drug resistant bacteria that do not respond to last-resort antibiotics, leaving patients without available treatment options. Some predictions estimate that by 2050, 10 million deaths per year will be attributed to bacterial resistance, making it a major threat to human health [1].

Tackling this issue by developing novel compounds has proved to be a difficult process: economic, regulatory and scientific bottlenecks slow down the antibiotic pipeline, evidencing the need for innovative ways of slowing the emergence and spread of resistance [2]. Antibiotic combination therapies are a promising approach to potentiate treatment and slow resistance evolution. However, administering two compounds can also lead to a loss of effect or an increase in toxicity [3,4]. A key aspect to predicting antibiotic efficacy and developing new therapeutic approaches is a thorough understanding of the relationship between drug susceptibility and bacterial physiology.

Here we present a mechanistic modelling approach to quantitatively predict the effect of single antibiotics on bacterial growth dynamics under different environmental conditions. We focus on ribosome-targeting antibiotics, which constitute more than half of the drugs used to treat bacterial infections and are among the most successful antimicrobials. We model the uptake of antibiotics and their dynamic interplay with ribosomes within an established model of bacterial growth physiology [5]. Integrating data on growth responses to three ribosome-targeting antibiotics (chloramphenicol, tetracycline and streptomycin), we infer drug-associated parameters and obtain estimates consistent with reported literature values. Furthermore, the calibrated model recovers the effects observed in an independent dataset of growth curves with the same antibiotics.

Currently, we are working on expanding this framework to predict the effect of antibiotic combinations on growth behaviour. By integrating theoretical knowledge and data on growth responses, we expect to identify crucial interactions and gain further mechanistic understanding of combined drug action. This will bring us closer to a predictive theory of bacterial responses to antibiotics, and thus on a path to rational antibiotic therapy.

[1] O’Neill, J. Tackling drug-resistant infections globally: final report and recommendations. (2016).

[2] Gupta, S. & Nayak, R. Dry antibiotic pipeline: Regulatory bottlenecks and regulatory reforms. Journal of Pharmacology & Pharmacotherapeutics, 2014.

[3] Tyers, M. & Wright, G. Drug combinations: a strategy to extend the life of antibiotics in the 21st century. Nature Reviews Microbiology 2018, 2019.

[4] Coates, A., Hu, Y., Holt, J. & Yeh, P. Antibiotic combination therapy against resistant bacterial infections: synergy, rejuvenation and resistance reduction. Expert review of anti-infective therapy, 2020.

[5] Weiße, A., Oyarzún, D., Danos, V. & Swain, P. Mechanistic links between cellular trade-offs, gene expression, and growth. PNAS, 2015.

Combining single-cell imaging techniques to probe surface-adhesion-dependent morphological features of disease state
PRESENTER: Matthew Lux

ABSTRACT. Single-cell imaging offers a powerful way to probe myriad aspects of cellular behavior. Different techniques offer advantages and disadvantages with respect to ease of analysis, throughput, capturing of single-cell dynamics, and ability to capture images in situ rather than in suspension. Here we compare two such techniques, imaging flow cytometry and confocal microscopy. To do so, we develop a pipeline to segment individual cells from microscopy images and produce sets of individual images analogous to imaging flow cytometry. The approach allows quantitative analysis of morphological features of cells while adhered to surfaces and interacting with neighboring cells rather than in solution as in imaging flow cytometry. We present analysis of images of macrophage infection by Francisella tularensis to identify morphological features that vary with measurement technique.

Network-based Stratification of Cancer Drugs
PRESENTER: Nurcan Tuncbag

ABSTRACT. Molecular heterogeneity across tumor types may result in different signaling alterations for the same drug. Moreover, drugs simultaneously perturb a set of protein networks beyond their immediate targets. Therefore, network-based approaches may address several unknowns about drug efficacy and mechanisms of action across various cancer types. Integrative multi-omic approaches can provide a realistic view of network-level alterations toward developing better treatment strategies against the complexity and heterogeneity of cancer. In this study, we elaborate on perturbed networks to mechanistically understand the similarities and differences of drugs. Conceptually, we (i) stratified the drugs at the network level beyond their immediate targets, (ii) evaluated the alterations of drug modulation in different cell lines, and (iii) suggested potential drug combinations based on topological separation of the networks. For this purpose, we used transcriptomic and phosphoproteomic data of five cancer cell lines treated with 89 perturbagens and the associated control treatment (CMap). We obtained the upstream regulators (the set of transcription factors) of the significantly expressed genes for each cell line-drug pair from transcriptomic data. Additionally, we retrieved the targets of each drug from the CMap Drug Repurposing tool. Finally, we merged the set of transcription factors, phosphoprotein hits, and drug targets to obtain seed proteins for each cell line-drug pair for the network modeling. We used an integrative approach based on reverse engineering principles that combines a link prediction strategy to modify the underlying interactome with the solution of the prize-collecting Steiner forest problem to reconstruct drug- and cell line-specific networks. In total, we constructed 236 subnetworks from the combination of 70 drugs and five cell lines.
All-pair comparison of the reconstructed networks shows that chemically and functionally different drugs may modulate shared pathways. Additionally, with the help of the drug network models, we revealed a set of tumor-specific hidden pathways that are not detectable from the initial data. Differences in the target selectivity of drugs lead to disjoint networks even when the drugs share a similar mechanism of action, e.g., HDAC inhibitors. Modulating orthogonal sets of proteins or pathways simultaneously is an effective strategy for finding drug combinations. Therefore, we also used the reconstructed network models to study potential drug combinations based on topological separation and found literature evidence for a set of drug pairs. Topological separation between drug networks across cell lines gives clues about their sensitivity levels. For two drugs (CHIR-99021 and PD-0325901), we found that a larger difference in drug response goes along with more separated networks. Therefore, analyzing the networks of these drugs and exploring enriched pathways may give clues about resistance to these drugs. Overall, network-level exploration of drug-modulated pathways and their in-depth comparison may help optimize treatment strategies and suggest new drug combinations.

Deep learning a model of cytotoxic T cell activation in the tumor microenvironment
PRESENTER: Madison Wahlsten

ABSTRACT. Cytotoxic CD8 T cells recognize antigens presented on major histocompatibility complex class I (MHC-I) molecules expressed on the surface of other cells, including tumors, leading to T cell activation and killing. The level of T cell activation, indicated by surface marker expression, cytokine production, and killing activity, is modulated by many factors including the quality and quantity of presented antigens. Immunotherapies such as checkpoint blockade antibodies function by preventing checkpoint inhibitors such as PD-1 and CTLA-4 from inhibiting tumor-specific T cell cytotoxic responses to cancer cells. These immunotherapy treatments have been successful in several cancers such as non-small cell lung cancer and melanoma, but have had limited success in other types of cancer (e.g., pancreatic or prostate carcinomas) owing to differences in tumor antigenicity. Previous work from our lab has shown that the quality of an antigen for T cell activation can be encoded in a single parameter derived from cytokine dynamics produced in ex vivo co-cultures between antigen presenting cells (APCs) and T cells. This encoding provides an overall measure of antigenicity for a sample. Here we built a model that can capture the quality of tumor antigen seen by an individual T cell. Using a custom robotic platform, we generated high-throughput kinetics of T cell activation in co-culture with APCs by sampling supernatants and analyzing cells at various timepoints. We performed spectral flow cytometry to measure the expression of up to 30 surface markers and intracellular signals per cell from these co-cultures. Typical datasets comprise over ten million cells, characterized by 25-30 features over 72 hours, across up to 96 conditions composed of different qualities and quantities of antigens, ratios of T cells to APCs, and drug perturbations.
To analyze these content-rich datasets, we designed a deep neural network that can classify the antigen seen by an individual cell using marker expression values from flow cytometry with high accuracy (area under the receiver operating characteristic curve > 0.8). By leveraging the multifactorial nature of T cell activation at the single cell level, we aim to provide an in vivo-relevant classification of T cell activation, as well as insight into perturbations that could be applied to immunotherapies to achieve better responses in more patients.

Lactobacillus rhamnosus colonisation antagonizes Candida albicans by forcing metabolic adaptations that compromise pathogenicity
PRESENTER: Sascha Schaeuble

ABSTRACT. Intestinal microbiota dysbiosis can initiate overgrowth of commensal Candida species, a major predisposing factor for disseminated candidiasis. Commensal bacteria such as Lactobacillus rhamnosus can antagonize Candida albicans pathogenicity. We investigated the interplay between C. albicans, L. rhamnosus, and intestinal epithelial cells by integrating transcriptional and metabolic profiling. Untargeted metabolomics together with in silico genome-scale metabolic modelling indicated that intestinal epithelial cells foster bacterial growth metabolically, which leads to bacterial production of antivirulence compounds. In addition, bacterial growth appeared to modify the metabolic environment, including removal of C. albicans’ favoured nutrient sources. This is accompanied by transcriptional and metabolic changes in C. albicans, including altered expression of virulence-related genes. Our results indicate that intestinal colonization with bacteria can antagonize C. albicans by reshaping the metabolic environment and forcing metabolic adaptations that reduce fungal pathogenicity.

Nonstationary Biomedical Signal Feature Extraction

ABSTRACT. With the advancements in sensor technologies, data analytics, and machine learning, meaningful feature extraction has become a key area of investigation, especially for biomedical signals. Most real-world signals, and especially the signals from biosensors, possess long-term, non-stationary and non-linear characteristics. Signal representation, information processing and feature extraction from these signals are challenging tasks. This talk will focus on five generations of signal processing algorithms developed for analysis and interpretation of biomedical signals. The talk will touch upon event analysis, spectral analysis, time-frequency domain analysis and multi-modal biomedical signal processing. Specifically, it will cover feature extraction algorithms from the time domain, frequency domain, signal decomposition domain, time-frequency matrix and image domains. Recent advances in using sparse signal representation and compressive sensing of long-term signals for Internet of Medical Things (IoMT) applications will also be covered. The application of the extraction and classification of features from cardiac signals (electrograms and ECG), neural signals (EEG), bio-acoustical signals (pathological voice), and sleep signals (polysomnography) will be discussed in detail. Machine learning (ML) results obtained with different nonstationary signal feature extraction techniques, applied directly to time-domain, frequency-domain or joint spatio-temporal/time-frequency signals, will also be presented to highlight the key advantages in feature analysis using well-known ML techniques such as linear discriminant analysis, decision trees, support vector machines, and Naive Bayes classifiers. Comparisons with the automatic feature analysis provided by deep learning models in certain areas of cardiac and neural applications will also be presented.

Reconstruction of regulatory networks driving patterned expression in the Drosophila embryo based on spatially resolved single-cell sequencing data

ABSTRACT. The project is focused on molecular mechanisms that drive spatial gene expression in the early Drosophila embryo. By stage 6, the embryo is an oval-shaped cellular blastoderm embryo with ~6 000 cells lining the periphery. At this point, maternal cues give way to zygotic regulation and spatial gene expression cascades emerge with exquisite specificity, primarily controlled by cis-regulatory modules (CRMs), which contain clusters of ‘docking sites’ for multiple transcription factors.

Inspired by a spatial reconstruction of the transcriptome performed by a collaboration of the Zinzen and the Rajewsky Lab¹, I performed a Multiome single-cell sequencing assay of the Drosophila embryo for gene expression and chromatin accessibility. To regain the spatial aspect, more than 5000 high-quality cells were fed into NovoSpaRc², which reconstructs a ‘virtual embryo’ approaching single-cell resolution. This ‘virtual embryo’ displays precise spatial gene expression predictions (for 1700 genes) and chromatin accessibility predictions (for 20000 genomic regions).

For identifying TFs, sequences with a common spatial pattern were queried de novo for ‘docking sites’. The common patterns are the Latent Factors of a Multivariate Factor Analysis computed on all chromatin accessibility data. In parallel, chosen CRMs are currently being validated by transgenic reporter assays.

The combination of multiome single-cell sequencing paired with spatial reconstruction enabled an atlas of the fly embryo and helps decipher the regulation of spatial gene expression.

¹ Karaiskos & Wahle (2017). The Drosophila embryo at single-cell transcriptome resolution. Science

² Nitzan & Karaiskos (2019). Gene expression cartography. Nature

Improving our Mechanistic Understanding of Cell Cycle Dynamics
PRESENTER: Paul Lang

ABSTRACT. The mammalian cell cycle is regulated by a well-studied but complex biochemical reaction system. Computational models provide a particularly systematic and systemic description of the mechanisms governing mammalian cell cycle control. They facilitate a detailed understanding of cell cycle control mechanisms and are in part also able to aggregate this knowledge into full cell cycle models that explain periodic cell cycle oscillations. This work aims at improving on these models along four dimensions: model structure, validation data, validation methodology and model reusability.

Presented is a core model structure of the full cell cycle that qualitatively explains the behaviour of unperturbed and perturbed cells. Using rule-based model descriptions, the core model was conveniently extended by a DNA damage checkpoint and a separation into nuclear and cytoplasmic compartments. To estimate the model parameters, the time courses of several cell cycle regulators were reconstructed from single-cell snapshot immunofluorescence data using the reCAT algorithm. These data and the cell cycle model were then cast into the PEtab format for specifying parameter estimation problems in biochemical reaction networks. After optimising these parameters with self-adaptive cooperative enhanced scatter search, a cell cycle model that explains the validation data was obtained. The PEtab specification allows any modeler to reuse the model, the data and/or the optimisation results.

Further experimental conditions, for instance in the form of CRISPR interference, are expected to significantly improve parameter identifiability and provide a way of testing the predictive power of the model. Given the central role of the cell cycle in health and disease, such a predictive model may aid in the discovery of new therapeutic targets.

A graph-based approach to analyze signal transduction involving cytokines and the JAK-STAT pathway in hepatocytes using Petri nets
PRESENTER: Marcus Keßler

ABSTRACT. The pleiotropic cytokines interleukin 6 (IL-6) and interleukin 22 (IL-22) are involved in multiple signaling pathways in a variety of cells. In hepatocytes, they activate the acute phase response by phosphorylating and activating STAT3 homodimers. Other functions, which involve both STAT1 and STAT3 homodimers as well as heterodimers, include cell homeostasis and tissue repair. However, high concentrations of IL-6 and IL-22 are associated with worse outcomes for patients in pathologies such as acute-on-chronic liver failure (ACLF), and continuous overexpression of these cytokines can prove detrimental to the liver and lead to the formation of tumors. Consequently, these pathways are tightly controlled by regulatory proteins such as SOCS.

Despite their importance to liver function in health and disease, the signaling pathways activated by IL-6 and IL-22 have still not been fully described or lack detail. Cross-talk between both cytokines is similarly enigmatic. We have built a Petri net (directed graph-based) model of the IL-6 and IL-22 pathways to improve our understanding of cytokine regulation and to produce testable predictions regarding the dynamics of the interactions between the variety of involved agonists and antagonists. The model includes cytokine binding to their respective receptors, including IL-6 traditional and trans signaling. Following receptor binding, the JAK/STAT pathways can be activated, leading to RNA transcription. This process is regulated in our model by PTPN11 and, through negative feedback, SOCS3, binding to phosphorylated JAK. We applied in silico knockout experiments to analyze how critical different proteins appear for the functioning of the pathways. We undertook further analysis using clustering algorithms to advance our understanding of the model, find similarities between sub-pathways and examine its correctness. We will use our model together with experimentalists in further research to gain a better understanding of ACLF and improve current methods for its treatment.
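The in silico knockout idea can be illustrated with a deliberately small sketch. The net below is a toy with hypothetical place and transition names, not the presenters' model: it omits IL-22, trans signaling, PTPN11 and the inhibitory effect of SOCS3, and it replaces a full token game with a simple "can this place ever be marked" closure. A knockout removes a place and disables every transition that touches it.

```python
# Toy Petri-net-style knockout analysis (illustrative assumption, not the authors' model).
# Each transition maps to (input places, output places).
TRANSITIONS = {
    "binding": (["IL6", "IL6R"], ["IL6_IL6R"]),
    "jak_activation": (["IL6_IL6R", "JAK"], ["pJAK"]),
    "stat3_phosphorylation": (["pJAK", "STAT3"], ["pSTAT3"]),
    "transcription": (["pSTAT3"], ["mRNA"]),
    "socs3_expression": (["pSTAT3"], ["SOCS3"]),
}
INITIAL_MARKING = {"IL6", "IL6R", "JAK", "STAT3"}


def markable_places(transitions, initial, knocked_out=frozenset()):
    """Return every place that can become marked, ignoring token consumption."""
    marked = set(initial) - set(knocked_out)
    changed = True
    while changed:
        changed = False
        for inputs, outputs in transitions.values():
            # A transition touching a knocked-out place is disabled.
            if any(place in knocked_out for place in inputs + outputs):
                continue
            if all(p in marked for p in inputs) and not set(outputs) <= marked:
                marked |= set(outputs)
                changed = True
    return marked


def is_essential(place, target="mRNA"):
    """A place is essential if knocking it out makes the target unreachable."""
    return target not in markable_places(TRANSITIONS, INITIAL_MARKING, {place})


all_places = set(INITIAL_MARKING)
for inputs, outputs in TRANSITIONS.values():
    all_places.update(inputs + outputs)
all_places.discard("mRNA")  # the readout itself is not a knockout candidate

essential = sorted(p for p in all_places if is_essential(p))
print("Places essential for transcription:", essential)
```

In this toy net every protein on the linear path to transcription is essential, while SOCS3 is not, because its negative feedback is not modelled here. A real analysis would work on the full net and its invariants.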

A Reproducibility Scorecard for Self-checking of Systems Biology Models

ABSTRACT. Systems biology modelling involves the mathematical representation of biological processes to study complex system behaviour and might be expected to be least affected by the reproducibility crisis. However, models often fail to reproduce, and the reasons for and prevalence of these failures were not fully understood. In a recent study, we analysed 455 kinetic models published in 152 peer-reviewed journals. Most of these models were manually encoded from scratch to assess their reproducibility. Our investigation revealed that 49% of the models could not be reproduced using the information provided in the manuscripts. Among the corresponding authors we contacted, over 70% did not respond. Models published across the most relevant life science journals failed to reproduce, revealing a common problem in the peer-review process. As a solution, we propose a simple, easy-to-use reproducibility scorecard with 8 yes-or-no questions that can be used by model authors, reviewers and journal editors during the peer-review process. Authors get a first indication of how reusable their model is, reviewers can use the scorecard to evaluate model-based publications, and journal editors can point to missing scores during the revision process. In this work we demonstrate how the scorecard can help modellers evaluate the reproducibility status of their model. We also show that the score is a good indicator of model reusability. The reproducibility crisis in systems biology modelling can be tackled as a community, where model authors, reviewers, journal editors and funding bodies embrace reproducibility more proactively than before.

BioModels: an open repository of mathematical models of the biological and biomedical processes

ABSTRACT. Systems biology modelling involves the representation of biological processes using mathematical notation to analyse and understand emergent system behaviour. BioModels has grown over the past 17 years as a leading repository of manually curated mathematical models of biological and biomedical systems. Modellers can submit and share their new models as well as explore existing ones in the BioModels platform. While community standards such as SBML are encouraged, BioModels accepts the submission of computational models encoded in any modelling format. BioModels currently hosts about 2800 literature-based, physiologically and pharmaceutically relevant mechanistic models in standard formats. With almost 1100 curated models, BioModels has become the world’s largest repository of curated models and has emerged as the third most used data resource after PubMed and Google Scholar among scientists who use modelling in their research. Thus, BioModels benefits modellers by providing access to reliable and semantically enriched curated models in standard formats that are easy to share, reproduce and reuse. The BioModels Parameters search provides easy access to all model parameters, mathematical equations, and initial concentrations of the model entities from the curated and non-curated public models in the BioModels repository. BioModels offers valuable content and functionality, including advanced search and APIs to access models from various modelling approaches, including kinetic, logical and constraint-based modelling, to support areas such as whole-cell modelling, personalised medicine, and pharmacological and toxicological research.

Metabolic profile predictions using efficient and interpretable data descriptors generated with relational learning

ABSTRACT. Metabolic profiles are arguably the type of biological data that most closely represents the functional readout of the physiological state of an organism, and thus, increased understanding of what controls and defines the accumulated abundances of these biochemicals is of high scientific interest. While the yeast Saccharomyces cerevisiae is an extremely well-studied model organism, the amount of high-quality data available on its metabolome is still lacking.

One of the keys to success in applying machine learning to scientific research tasks is the use of meaningful data representations. While popular methods such as deep neural networks (DNNs) are very successful in extracting rich internal representations from seemingly simple inputs, they have poor interpretability and explainability, which are of the utmost importance in systems biology. More explainable models improve the understanding of the implications of systems biology models, and enable models to be rationally improved.

There is a wide range of available knowledge on yeast physiology contained in databases such as the Saccharomyces Genome Database, and in highly curated genome-scale metabolic models such as Yeast8. Being the product of decades' worth of experiments across multiple modalities, these are rich in information and adhere to semantically meaningful ontologies. By representing this prior knowledge in a richly expressive Datalog database, we generate data descriptors using relational learning that make more efficient use of existing propositional data and improve both model predictions and their interpretability.

Model-based design of a synthetic oscillator based on an epigenetic memory system
PRESENTER: Viviane Klingel

ABSTRACT. Oscillations are important components in biological systems; grasping their mechanisms and regulation, however, is challenging. Here, we use the theory of dynamical systems to support the design of oscillatory systems based on epigenetic control elements (Klingel et al. 2022). Specifically, we use results that extend the Poincaré-Bendixson Theorem to monotone control systems which are coupled to a negative feedback circuit. The methodology is applied to a synthetic epigenetic memory system based on DNA methylation. This memory system was developed by Maier et al. (2017) by designing a methylation-sensitive Zinc-finger protein, which, when bound to DNA, can inhibit the transcription of a methyltransferase. There, memory functioning was realized by a positive feedback that leads to bistability. Our study is based on a mathematical model of this system (Klingel et al. 2021). This memory module serves as the monotone control system. We propose to implement the additional negative feedback required for oscillations by introducing novel DNA methylation sites into the autoregulation of the Zinc-finger. Through phase plane analysis of our mathematical model, we then determined necessary parameter conditions for the system to operate in the oscillatory range. We further used model simulations to distinguish between damped and sustained oscillations. According to model predictions, our proposed system is generally able to exhibit sustained oscillations. However, first experimental implementations showed that several adaptations in the experimental system are required to reproduce our modeling results. Using the insights gained from our computational model, we explored the experimental design space to shift the system into an oscillatory regime. In particular, we proposed several modifications that could encourage oscillations in the experimental system, including altered Zinc-finger binding sites or a doubled methyltransferase production.
Overall, our study shows that a model-based design of functional modules, combined with predictions about experimentally realizable design parameters, can support the targeted construction of synthetic modules.

References: Maier et al. 2017, Nat. Commun. 8, 1-10; Klingel et al. 2021, FEBS J. 288, 5692-5707; Klingel et al. 2022, ACS Synth. Biol. 11, 2445-2455

Structural reduction of chemical reaction networks based on topology
PRESENTER: Yuji Hirono

ABSTRACT. Inside living cells, chemical reactions form a large web of networks. Understanding the behavior of those complex reaction networks is an important and challenging problem. We develop a model-independent reduction method for chemical reaction systems based on the stoichiometry, which determines their network topology. A subnetwork can be eliminated systematically to give a reduced system with fewer degrees of freedom. This subnetwork removal is accompanied by rewiring of the network, which is prescribed by the Schur complement of the stoichiometric matrix. Using homology and cohomology groups to characterize the topology of chemical reaction networks, we can track the changes of the network topology induced by the reduction through the changes in those groups. We prove that, when certain topological conditions are met, the steady-state chemical concentrations and reaction rates of the reduced system are guaranteed to be the same as those of the original system. This result holds regardless of how the reactions are modeled, that is, regardless of the chemical kinetics, since the conditions only involve topological information. This is advantageous because the details of reaction kinetics and parameter values are difficult to identify in many practical situations. The method allows us to reduce a reaction network while preserving its original steady-state properties, so that complex reaction systems can be studied efficiently. We demonstrate the reduction method in hypothetical networks and the central carbon metabolism of Escherichia coli.
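To make the rewiring step concrete, the following minimal sketch computes a Schur complement of a stoichiometric matrix for a toy linear pathway, ∅ → A → B → C → ∅, and eliminates the subnetwork made of species B and the reaction B → C. The network, the function names and the assumption that the eliminated block is square and invertible are illustrative choices, not taken from the presented work, and the sketch does not check the topological conditions mentioned in the abstract.

```python
from fractions import Fraction


def inverse(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot_row = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def submatrix(m, rows, cols):
    return [[m[r][c] for c in cols] for r in rows]


def schur_reduce(S, sub_species, sub_reactions):
    """Return S22 - S21 * inv(S11) * S12, where S11 is the eliminated block."""
    keep_species = [i for i in range(len(S)) if i not in sub_species]
    keep_reactions = [j for j in range(len(S[0])) if j not in sub_reactions]
    S11 = submatrix(S, sub_species, sub_reactions)
    S12 = submatrix(S, sub_species, keep_reactions)
    S21 = submatrix(S, keep_species, sub_reactions)
    S22 = submatrix(S, keep_species, keep_reactions)
    correction = matmul(matmul(S21, inverse(S11)), S12)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(S22, correction)]


# Rows: species A, B, C. Columns: 0 -> A, A -> B, B -> C, C -> 0.
S = [
    [1, -1, 0, 0],
    [0, 1, -1, 0],
    [0, 0, 1, -1],
]

# Eliminate species B (row 1) together with reaction B -> C (column 2).
reduced = schur_reduce(S, sub_species=[1], sub_reactions=[2])
for row in reduced:
    print([int(x) for x in row])
```

In the reduced matrix the former reaction A → B now reads A → C: removing the subnetwork rewires the remaining reaction so that the pathway stays connected.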

Combined multiple transcriptional repression mechanisms generate ultrasensitivity and oscillations
PRESENTER: Eui Min Jeong

ABSTRACT. Transcriptional repression can occur via various mechanisms, such as blocking, sequestration and displacement. Although the transcription can be completely suppressed with a single mechanism, multiple repression mechanisms are used together to inhibit transcriptional activators in many systems, such as circadian clocks and NF-κB oscillators. This raises the question of what advantages arise if seemingly redundant repression mechanisms are combined. Here, by deriving equations describing the multiple repression mechanisms, we find that their combination can synergistically generate a sharply ultrasensitive transcription response and thus strong oscillations. This rationalizes why the multiple repression mechanisms are used together in various biological oscillators. The critical role of such combined transcriptional repression for strong oscillations is further supported by our analysis of formerly identified mutations disrupting the transcriptional repression of the mammalian circadian clock. The hitherto unrecognized source of the ultrasensitivity, the combined transcriptional repressions, can lead to robust synthetic oscillators with a previously unachievable simple design.
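As background on why a single mechanism such as sequestration already produces a threshold-like response, the standard mass-action expression for the free activator is sketched below. This is textbook equilibrium binding, not the combined-mechanism equations derived in the talk; the symbols $A_T$ (total activator), $R_T$ (total repressor), $K_d$ (dissociation constant) and $A$ (free activator) are introduced here for illustration only.

```latex
% Equilibrium binding A + R <-> C with K_d = A R / C,
% conservation A_T = A + C and R_T = R + C, gives
A^2 + (R_T - A_T + K_d)\,A - K_d A_T = 0
\quad\Longrightarrow\quad
A = \frac{(A_T - R_T - K_d) + \sqrt{(A_T - R_T - K_d)^2 + 4 K_d A_T}}{2}.
% Tight-binding limit:
\lim_{K_d \to 0} A = \max(A_T - R_T,\, 0).
```

In the tight-binding limit the free activator, and with it transcription, stays near zero until the total activator exceeds the total repressor, which is the threshold behaviour usually associated with sequestration.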

Multi-Omics Visible Drug Activity prediction, interpreting the biological processes underlying drug sensitivity
PRESENTER: Luigi Ferraro

ABSTRACT. Cancer is a genetic disease resulting from the accumulation of genomic alterations in living cells. Large-scale genomics studies have been instrumental in understanding the recurrent somatic genetic alterations within a cell and in characterizing their functional effects in transformed cells. One of the main challenges in this field is how to exploit all this molecular information to identify therapeutic targets and to develop personalized therapies by understanding which molecular features influence sensitivity to drugs. Machine learning models are able to exploit multi-modal screening datasets to develop predictive algorithms that associate omics features with response. The basic approach is to use the data from these screenings to train a machine learning “black box” model that predicts the 50% inhibitory concentration (IC50) of a drug from the multi-omics profile of a cell line, without the possibility of interpreting the biological mechanisms underlying the predicted outcomes and without exploiting the unbalanced nature of the data. To address these limitations, we propose a Multi-Omics Visible Drug Activity prediction (MOViDA) neural network model that extends the visible network approach by incorporating functional information, in terms of pathway activity from gene expression and copy number data, into a neural network. We have identified which pathways and drug features are good predictors of high sensitivity of a cell line to a drug. This explanation is the basis for hypothesizing drug combinations, cell editing and properties of new drugs aimed at the identification of cell vulnerabilities.

DUNE-COPASI: Multi-Compartment Diffusion-Reaction solver for Cell Biology

ABSTRACT. The quantitative study of living cells with systems biology has been traditionally approached assuming spatially homogeneous biochemical species (e.g. using ODEs). However, recent technological progress led to the increasing availability of spatiotemporal data, and thus, caused the need to extend and create new tools that account for the spatial dimension (e.g. using PDEs). With this panorama, we aim to model the spatiotemporal distribution of biochemical species within the cell as well as its immediate surroundings. Such a setting may be characterized with a system of diffusion-reaction equations per compartment/membrane, together with a set of transmission conditions to couple them at the membrane. In this poster, we explore a generic model formulation as well as two applications. Moreover, we present dune-copasi, an open-source multi-compartment diffusion-reaction solver tailored for biological systems.

Computational modeling of DLBCL predicts response to BH3-mimetics
PRESENTER: Simon Mitchell

ABSTRACT. In healthy cells, pro- and anti-apoptotic BCL2 family and BH3-only proteins are expressed in a delicate equilibrium. In contrast, this homeostasis is frequently perturbed in cancer cells due to the overexpression of anti-apoptotic BCL2 proteins. Variability in the expression and sequestration of these proteins in Diffuse Large B cell Lymphoma (DLBCL) likely contributes to variability in response to BH3-mimetics, which has prevented their widespread clinical adoption. While several highly specific BH3-mimetics have been developed, successful deployment of BH3-mimetics in DLBCL requires reliable predictions of which lymphoma cells will respond. Here we show that a computational systems biology approach enables accurate prediction of the sensitivity of DLBCL cells to BH3-mimetics. We found that fractional killing of DLBCL, and the presence of treatment-resistant cells within a cell population, can be explained by cell-to-cell variability in the molecular abundances of signaling proteins. Importantly, by combining protein interaction data with a knowledge of genetic lesions in DLBCL cells we could accurately predict in silico the sensitivity to BH3-mimetics in vitro. Furthermore, the library of virtual DLBCL cells we created was able to predict novel synergistic combinations of BH3-mimetics, which were then experimentally validated. These results show that when computational systems biology models of apoptotic signaling are constrained by experimental data, they can facilitate the rational assignment of efficacious targeted inhibitors in B cell malignancies paving the way for development of more personalised approaches to treatment.

Automated detection and regional classification of Cerebral Amyloid Angiopathy in digital whole slide images
PRESENTER: Lise Minaud

ABSTRACT. The deposition of amyloid beta (Aβ) in cortical and leptomeningeal brain vessel walls, termed Cerebral Amyloid Angiopathy (CAA), can increase susceptibility to brain hemorrhages. CAA is present in 30% of neurologically-normal elderly and frequently co-occurs with Alzheimer's Disease (AD). Characterizing the frequency and anatomic distribution of CAA, typically by human experts visually examining stained brain sections, is a time-consuming task. Furthermore, inter-rater variability in expert assessment limits the ability to harmonize data across studies. Hence, methods are needed for scalable, reproducible assessments of CAA. We introduce an automated tool to reliably overcome these challenges, by training a deep learning model to locate and classify CAAs throughout the brain. We collected 95 whole slide images (WSIs) of postmortem human brain tissue from three institutions, derived from temporal, occipital, and frontal cortices, immunostained with four different antibodies for Aβ. We processed the WSIs into 256x256 pixel tiles, from which we isolated ~20,000 candidate tiles of potentially CAA-affected vessels. Six experts independently annotated each tile, for a total of 120,000 annotations. For each tile, annotators labeled whether CAA was present and, if so, its anatomic location, parenchymal vs. leptomeningeal. By combining multiple and at times discordant expert opinions via a consensus strategy, we developed a convolutional neural network (CNN) that learned from six expert annotators. The model achieved held-out test set performance of AUPRC=0.90 for leptomeningeal CAA vessels and AUPRC=0.88 for parenchymal CAA vessels. We intend to release the model as a foundation for a generalizable, scalable means to detect CAA in human postmortem brain WSIs.
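The abstract does not specify the consensus strategy, so the sketch below only illustrates one common option: a majority vote with a minimum-agreement threshold over six annotators. The label names, tile identifiers and the threshold of four are assumptions made for this example.

```python
from collections import Counter


def consensus_label(votes, min_agreement=4):
    """Return the majority label if enough annotators agree, otherwise None.

    With six votes and min_agreement=4, at most one label can pass the threshold.
    """
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agreement else None


# Hypothetical annotations: six expert votes per candidate tile.
tile_votes = {
    "tile_001": ["leptomeningeal"] * 5 + ["none"],
    "tile_002": ["parenchymal", "parenchymal", "none",
                 "none", "leptomeningeal", "parenchymal"],
    "tile_003": ["none"] * 6,
}

consensus = {tile: consensus_label(votes) for tile, votes in tile_votes.items()}
for tile, label in consensus.items():
    print(tile, "->", label if label is not None else "no consensus")
```

Tiles without consensus could be dropped from training or kept with soft labels; which choice is appropriate depends on the study design.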

Cell type-specific gene co-expression modules define tumor heterogeneity in melanoma patients
PRESENTER: Michael Prummer

ABSTRACT. Gene co-expression networks govern all cellular processes in health and disease. But the presence or absence of correlated gene pairs is difficult to interpret in bulk samples. For instance, the co-occurrence of two cell types can lead to an apparent co-expression of two genes even when they are completely independent within each individual cell. In single cell experiments, by contrast, an observed correlation between a pair of genes is truly present within individual cells.
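The confounding effect described above can be illustrated with a small simulation. This is a minimal sketch under assumed values, not the authors' analysis: the two cell types, their expression levels, and the function name `simulate_bulk_vs_single_cell` are all hypothetical.

```python
import numpy as np

def simulate_bulk_vs_single_cell(n_samples=200, n_cells=500, seed=0):
    """Compare gene-gene correlation in bulk averages vs. within one cell type.

    Assumption: genes A and B are both high in cell type 1 and low in
    cell type 2, but statistically independent within every cell.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical fraction of cell type 1 in each bulk sample.
    frac_type1 = rng.uniform(0.1, 0.9, size=n_samples)
    bulk = np.empty((n_samples, 2))
    for i, p in enumerate(frac_type1):
        n1 = int(round(p * n_cells))
        type1 = rng.normal(10.0, 1.0, size=(n1, 2))           # columns: gene A, gene B
        type2 = rng.normal(2.0, 1.0, size=(n_cells - n1, 2))
        # A bulk measurement averages over all cells in the sample.
        bulk[i] = np.vstack([type1, type2]).mean(axis=0)
    # Single-cell view: many cells of one type, genes drawn independently.
    cells_type1 = rng.normal(10.0, 1.0, size=(5000, 2))
    r_bulk = np.corrcoef(bulk[:, 0], bulk[:, 1])[0, 1]
    r_cell = np.corrcoef(cells_type1[:, 0], cells_type1[:, 1])[0, 1]
    return r_bulk, r_cell

r_bulk, r_cell = simulate_bulk_vs_single_cell()
print(f"bulk correlation: {r_bulk:.2f}, within-cell-type correlation: {r_cell:.2f}")
```

In this toy setting the bulk correlation is driven entirely by the varying cell type composition, while the within-cell-type correlation stays close to zero.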

Here we use droplet-based single cell transcriptomics to discover disease-specific robust co-expression networks in different cell types from tissue biopsies of melanoma patients. We analyze each sample independently to arrive at patient-specific networks and subsequently compare them across the cohort. This way, we remove technical variability and perform what is called late integration of the data. To this end, co-expression sub-networks (also known as modules) are identified in each patient using community detection principles. Recurring as well as unique co-expression modules are compared to gene ontology terms to assign a biologically meaningful label. Any difference between the disease- and cell type-specific module composition and common gene sets can provide new insight into disease-causing mechanisms or novel treatment options. After all, many of the curated gene sets used for enrichment analysis were derived from bulk samples of healthy individuals, non-human model organisms, or cultured cell lines. As an outlook, patient-specific gene expression programs in various cell types may give rise to personalized treatment recommendations.

Universality of form: The case of retinal cone photoreceptor mosaics

ABSTRACT. Cone photoreceptor cells are wavelength-sensitive receptors in the retinas of vertebrate eyes, which are responsible for color vision. The spatial distributions of these cells are commonly referred to as retinal cone photoreceptor mosaics.

Cone mosaic patterns vary among species, which may reflect the evolutionary pressures that give rise to adaptations to the specific visual needs of a species with respect to its lifestyle, although in most cases the adaptive value of a particular cone mosaic is unknown. From the perspective of gene regulatory mechanisms, the most fundamental questions remain unanswered: what mechanisms control the mostly random distributions of the cone subtypes in the human retina, and what migration mechanism determines the highly regular and ordered patterns of the cone subtypes in the zebrafish retina?

By applying the principle of maximum entropy, we demonstrate the universality of the spatial distributions of the cone cells in divergent species (rodent, dog, monkey, human, fish, and bird) without invoking any specific biological mechanisms. This, incidentally, implies a mathematical restriction imposed on the evolution of biological forms.

Dimethyl fumarate shifts CSF protein profile in relapsing remitting MS: a proteomics approach for biomarker discovery
PRESENTER: Sara Hojjati

ABSTRACT. Finding biomarkers for multiple sclerosis prognosis and response to drugs can lead to faster and more efficient decisions on treatment. We have investigated levels of 184 inflammation- and neuro-associated proteins in plasma and CSF to assess effects of dimethyl fumarate (DMF) treatment in patients with relapsing remitting multiple sclerosis (RRMS). In addition, we aimed at identifying candidate proteins capable of predicting response to treatment. Twenty-eight RRMS patients were examined clinically and by MR before treatment and after one year. Using a high throughput immunoassay, we identified that levels of 19 plasma proteins and 10 CSF proteins were significantly changed after DMF treatment (p<0.01). T-helper 1 (Th1)-associated proteins (CXCL10, CXCL11, IL-12, lymphotoxin) decreased, while IL-7 increased in CSF, in line with a shift from a pro-inflammatory to an anti-inflammatory profile. Also, neuro-associated proteins decreased in CSF, including potential regulators of microglia and myelination. The changes in protein levels did not follow the same pattern in CSF and plasma. Levels of 14 proteins in CSF and 2 proteins in plasma differed between responders and non-responders to DMF (p<0.01), making them candidates for predicting response to treatment; they were used to obtain predictive models with high sensitivity and specificity.

Automated improvement of genome-scale models using a first-order logical model of metabolism
PRESENTER: Alexander Gower

ABSTRACT. Genome-scale metabolic models (GEMs), often evaluated with flux balance analysis (FBA), are a popular and useful modelling paradigm for systems biology. GEMs require significant research effort to develop, which is extremely expensive in terms of time, money and physical resources. Automated techniques are one promising way to make scientific discoveries within systems biology at the scale and pace required given the complexity of the challenge. Constructing models that enable automation of scientific discovery in yeast is therefore of high research interest.

First-order logic allows for a rich expression of knowledge about biological processes. Mechanisms such as reactions, enzyme catalysis and gene regulation can be expressed independently of specific genes, species or enzymes. Specific instances of these mechanisms, for example those recorded in a GEM, produce a logical theory which can be used for deductive inference. Previously, such modelling approaches have been used to find genes encoding orphan enzymes and to deduce growth or no-growth for single gene deletions.

We construct a first-order logic theory based on background knowledge as coded by existing GEMs. Using the resultant theory and the theorem proving software iProver we systematically predict gene essentiality using *in silico* single gene knockout experiments for each of the genes included in the given base model. We compare these predictions with experimental data on deletant strain viability. Using Yeast8 (v8.46.4.46.2) our model achieves an overall accuracy of 0.84, a sensitivity of 0.19 and a specificity of 0.95.
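The gap between the reported accuracy and sensitivity is easier to read with the definitions written out. The sketch below assumes that "positive" means an essential gene and uses hypothetical confusion-matrix counts chosen only so that the rounded metrics match the values quoted above; they are not the study's data, and `essentiality_metrics` is an illustrative name.

```python
def essentiality_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # fraction of positives that were predicted positive
    specificity = tn / (tn + fp)   # fraction of negatives that were predicted negative
    return accuracy, sensitivity, specificity

# Hypothetical counts (assumption): 145 positives and 855 negatives out of 1000 genes.
acc, sens, spec = essentiality_metrics(tp=28, fn=117, tn=812, fp=43)
print(round(acc, 2), round(sens, 2), round(spec, 2))  # -> 0.84 0.19 0.95
```

With an imbalanced class distribution like this one, a high overall accuracy is compatible with most positives being missed, which is why the three figures are reported together.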

We present novel automated model improvement algorithms using abductive reasoning and apply them to our logical theory to suggest novel hypotheses. We test these hypotheses by evaluating changes in performance in single-gene essentiality prediction and by designing *in vivo* experiments, laying foundations for autonomous scientific experimentation and discovery.

The potential of PBI for high resolution lung imaging in patients

ABSTRACT. By using propagation based imaging (PBI) with an extreme sample-to-detector distance of 10.7 m, we were able to perform lung imaging with an unprecedented resolution of 67 μm in a human chest phantom equipped with fresh porcine lungs at a dose lower than currently employed for clinical HRCT.

We demonstrate that at this resolution the morphology of structures such as subsolid and solid nodules can be depicted in great detail, which will aid diagnosis. Thus far, PBI was virtually limited to lung imaging in small animals performed at synchrotron light sources. Here we present our PBI imaging results using a human chest phantom equipped with fresh porcine lungs at a radiation dose below that of clinically applied high resolution CT.

Cohort-specific Boolean models highlight different regulatory modules during disease progression
PRESENTER: Ahmed Hemedan

ABSTRACT. Molecular mechanisms of health and disease are often represented as systems biology diagrams, and the coverage of such representation constantly increases. Knowledge resources, such as Parkinson’s disease map (PD map), encode PD-specific mechanisms in a computable and diagrammatic form, integrating different layers of information to allow the analysis of the disease complexity. The map can be used as a scaffold to build in silico models to improve our understanding of the disease behaviour. Despite the lack of kinetic data in the map, parameter-free models, e.g. Boolean models (BMs), can offer powerful analytical tools for dynamic analysis. These models can test hypotheses about mechanisms underlying the pathophysiology of the disease, identifying key drug targets and responses. Furthermore, BMs show a flexibility to be integrated with omics data in a standardised way. This helps to translate the created BMs to testable hypotheses and to simulate precise clinical studies. In this work, we infer the disease mechanisms as BMs from the PD map in an automated fashion. The overall behaviour and signal response pattern of the models are compared and matched with the scientific evidence from literature, ensuring that the models can represent the biological reality. Then, we tailor the BMs by integrating cohort-specific and comorbidities omics datasets, aiming to simulate the disease behaviour and therapeutic responses in multiple conditions. The analysis reveals interesting phenotypic diversity throughout the PD cohorts, and highlights different regulatory modules of the transcriptome during progression and comorbidities.
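As a generic illustration of why Boolean models need no kinetic parameters, the sketch below runs a synchronous Boolean network until it revisits a state and returns the attractor it has reached. The three-node network and its rules are hypothetical and are not taken from the PD map.

```python
# Hypothetical rules (assumption): A is an input, B is activated by A and
# inhibited by C, and C follows B with a one-step delay.
RULES = {
    "A": lambda s: s["A"],
    "B": lambda s: s["A"] and not s["C"],
    "C": lambda s: s["B"],
}

def step(state):
    """Synchronous update: every node is recomputed from the previous state."""
    return {node: bool(rule(state)) for node, rule in RULES.items()}

def find_attractor(state, max_steps=100):
    """Iterate until a state repeats; return the repeating cycle of states."""
    seen = []
    for _ in range(max_steps):
        key = tuple(state[n] for n in sorted(state))
        if key in seen:
            return seen[seen.index(key):]
        seen.append(key)
        state = step(state)
    raise RuntimeError("no attractor found within max_steps")

off = find_attractor({"A": False, "B": False, "C": False})
on = find_attractor({"A": True, "B": False, "C": False})
print(len(off), len(on))
```

In this toy network, switching the input off yields a fixed point, while switching it on yields an oscillation driven by the negative feedback between B and C.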

A bipartite function of ESRRB can integrate signaling over time to balance self-renewal and differentiation

ABSTRACT. Cooperative DNA binding of transcription factors (TFs) integrates external stimuli and context across tissues and time. Naïve mouse embryonic stem cells are derived from early development and can sustain the pluripotent identity indefinitely. Here we ask whether TFs associated with pluripotency evolved to directly support this state, or if the state emerges from their combinatorial action. NANOG and ESRRB are key pluripotency factors that co-bind DNA. We find that when both factors are expressed, ESRRB supports pluripotency. However, when NANOG is not present, ESRRB supports a bistable culture of cells with an embryo-like primitive endoderm identity ancillary to pluripotency. The stoichiometry between NANOG and ESRRB quantitatively influences differentiation, and in silico modeling of bipartite TF activity suggests ESRRB safeguards plasticity in differentiation. Thus, the concerted activity of cooperative TFs can transform their effect to sustain intermediate cell identities and allow ex vivo expansion of highly stable stem cell models.

A comprehensive mechanistic model of adipocyte signaling with layers of confidence
PRESENTER: William Lövfors

ABSTRACT. Adipocyte cellular signaling, normally and in type 2 diabetes, is far from fully studied. We have earlier developed detailed dynamic mathematical models for some well-studied, and partially overlapping, signaling pathways in adipocytes. Still, these models only cover a fraction of the total cellular response. For a broader coverage of the response, large-scale phosphoproteomic data is key. There exists such data for the insulin response of adipocytes, as well as prior knowledge on possible protein-protein interactions associated with a confidence level. However, methods to combine detailed dynamic models with large-scale data, using information about the confidence of included interactions, are lacking. In our new method, we first establish a core model by connecting our partially overlapping models of adipocyte cellular signaling with focus on: 1) lipolysis and fatty acid release, 2) glucose uptake, and 3) the release of adiponectin. We use the phosphoproteome data and prior knowledge to identify phosphosites adjacent to the core model, and then try to add the adjacent phosphosites to the model. The additions of the adjacent phosphosites are tested in a parallel, pairwise approach with low computation time. We then iteratively collect the accepted additions into a layer, and use the newly added layer to find new adjacent phosphosites. We find that the first 15 layers (60 added phosphosites) with the highest confidence can correctly predict independent inhibitor data (70-90 % correct), and that this ability decreases when we add layers of decreasing confidence. In total, 60 layers (3926 phosphosites) can be added to the model while still keeping predictive ability. Finally, we use the comprehensive adipocyte model to simulate systems-wide alterations in adipocytes in type 2 diabetes. This new method provides a tool to create large models that keep track of varying confidence.
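The layer-by-layer expansion described above can be sketched as a breadth-first growth over an interaction graph. This is only an outline of the control flow under stated assumptions: `build_layers`, the `edges` mapping and the `accept` callback are hypothetical names, and the callback stands in for the model-fitting test that decides whether a candidate phosphosite is kept.

```python
def build_layers(core, edges, accept, max_layers=100):
    """Grow a model outward from a core, one layer of accepted sites at a time.

    core   : iterable of node names already in the model
    edges  : dict mapping a node to the set of nodes it may interact with
    accept : callable(candidate, model) -> bool; placeholder for the
             data-driven acceptance test (assumption)
    """
    model = set(core)
    frontier = set(core)
    layers = []
    for _ in range(max_layers):
        # Candidates are sites adjacent to the most recently added layer.
        candidates = {n for f in frontier for n in edges.get(f, ()) if n not in model}
        # Each candidate is tested independently against the current model.
        accepted = {c for c in candidates if accept(c, model)}
        if not accepted:
            break
        layers.append(sorted(accepted))
        model |= accepted
        frontier = accepted
    return layers

# Toy example with a hypothetical graph in which site "C" fails the test.
toy_edges = {"A": {"B", "C"}, "B": {"D"}, "C": {"E"}, "D": {"F"}}
print(build_layers({"A"}, toy_edges, accept=lambda site, model: site != "C"))
```

Because each candidate is evaluated independently of the others in the same layer, the acceptance tests within a layer could be run in parallel.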

Hierarchical gene function prediction using a novel topology-based threshold and comparing 6 correlation metrics. Study case: Rice

ABSTRACT. Pearson's correlation coefficient is widely used for the generation of gene co-expression networks (GCNs), despite its sensitivity to outliers and its limitations in detecting non-linear associations. Besides, the process of extracting a GCN is prone to subjective definitions of a significance threshold. In this study, six different correlation metrics are tested for the computation of the co-expression matrix and a novel method based on the underlying topology of the network is proposed. The resulting networks are assessed for the task of gene functional prediction using four hierarchical multi-label classification models based on gradient-boosted forests. Results show that a model considering all metrics at once performs better than level-wise or single annotation prediction models. Also, results show that the RIC GCN performs better than the others at the prediction task for small GO hierarchies. Likewise, the BICOR GCN performs better for large GO hierarchies.
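The outlier sensitivity of Pearson's coefficient mentioned above can be seen in a small worked example. The sketch compares it with Spearman's rank correlation, used here only as a familiar robust alternative; it is an assumption that this comparison is representative, since the abstract does not list all six metrics. The data are made up.

```python
import numpy as np

def pearson(x, y):
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Pearson correlation of the ranks (valid here because there are no ties)."""
    rank_x = np.argsort(np.argsort(x)).astype(float)
    rank_y = np.argsort(np.argsort(y)).astype(float)
    return pearson(rank_x, rank_y)

# Ten points with a perfect negative relationship.
x = np.arange(10.0)
y = 9.0 - x

# The same points plus one extreme outlier at (100, 100).
x_out = np.append(x, 100.0)
y_out = np.append(y, 100.0)

print(pearson(x, y), spearman(x, y))                  # both close to -1
print(pearson(x_out, y_out), spearman(x_out, y_out))  # about 0.98 and -0.5
```

A single outlier flips the sign of the Pearson coefficient, whereas the rank-based coefficient remains negative.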

Personalized medicine for Multiple Sclerosis treatment strategies at the individual and population level
PRESENTER: Roberta Bursi

ABSTRACT. The landscape of treatment options for Relapsing-remitting Multiple Sclerosis patients has been completely transformed over the last few decades. Over a dozen disease-modifying therapies are now available, targeting several distinct pathways and yielding varying degrees of efficacy and – inevitably – of side effects. Currently, a lack of understanding of the specific benefits, disadvantages and optimal patient profiles for each treatment means that the available therapies are not being optimally leveraged to improve patients’ wellbeing. Thus, translating the increased availability of therapeutic options into treatment strategies and policies that benefit patients’ health remains a major challenge. We have developed a cloud-based MS simulator (MS TreatSim – https://mstreat.insiliconeuro.com) to support understanding of the efficacy and safety profiles of various Relapsing-remitting Multiple Sclerosis therapies at both the personal and population level. The simulator builds on an agent-based model that combines four key elements: (1) immune system architecture and dynamics, (2) Relapsing-remitting Multiple Sclerosis etiology, (3) four commonly prescribed first and second line treatments for Multiple Sclerosis, and (4) immune system heterogeneity. The immune system forms the basis of the model, incorporating fundamental processes and cell types of both the innate and adaptive immune systems. Multiple Sclerosis etiology was incorporated by extending the model with an explicit white matter compartment, in which oligodendrocytes are attacked and destroyed by the autoresponsive immune system during active disease. The four treatment options – interferon β-1a, teriflunomide, natalizumab and ocrelizumab – are each incorporated through their pharmacokinetic characteristics and their mechanism of action.
Finally, heterogeneous virtual Relapsing-remitting Multiple Sclerosis patients are created by mapping demographic and clinical parameters (e.g., age at disease onset, lesion load, immune variability) to underlying mechanistic model parameters, and subsequently selecting the patients of interest with the aid of disease history characteristics. The simulator includes both individual level and population level workflows, allowing user-friendly evaluation of the efficacy of the integrated treatment options. In addition to demonstrating treatment effects on clinical outcomes such as relapse rates, MS TreatSim also allows investigation of the underlying effects on immune system variables. MS TreatSim can thus be a valuable tool to investigate the variability in treatment response, and to inform individual and policy-level treatment guidance. With further clinical validation and advanced options for personalization, MS TreatSim may in the future be applied for personalized treatment planning.

Deciphering transcriptional drivers of stem cell differentiation using causal inference
PRESENTER: Martin Proks

ABSTRACT. During differentiation, progenitor cells undergo numerous transcriptional and epigenetic changes while also being subjected to mechanical forces. In this project, we focus on identifying causal transcriptional drivers and their interactions by predicting gene regulatory networks (GRNs). Current GRN inference algorithms suffer from numerous limitations, such as linear assumptions, correlation- or entropy-based calculations, missing directionality, the requirement of prior knowledge or a multimodal setup, and “black box” models (neural networks). To overcome these limitations, we use Convergent Cross Mapping (CCM) from dynamical systems. As a basis we exploit single-cell RNA sequencing (scRNA-seq), which allows us to track the expression of individual genes in each cell. With downstream analysis of scRNA-seq data one can infer directionality (trajectory inference), approximate time of differentiation (pseudo-/latent-time) as well as transcriptional, splicing and degradation kinetics (RNA velocity). We use these methods to generate time series of gene expression along pseudo-time and estimate causal directional interactions with CCM. Compared to current GRN algorithms, our tool reports directionality, effect (induction/repression) and its temporal kinetics. Finally, it uses public ChIP-seq atlas data to benchmark predicted interactions. Despite high computational demand, we showcase that CCM can infer main drivers as well as new interactions in mouse pre-implantation development.

In silico analysis of metabolic capabilities in a synthetic community of phycosphere bacteria
PRESENTER: Ali Navid

ABSTRACT. Microalgae play key roles in global nutrient cycles. They also can be used as biomass for production of sustainable biofuels. It is known that optimum growth and robustness of algae are critically dependent on interactions with the bacteria that reside on their surfaces and immediate surroundings (phycosphere). Unfortunately, many biophysical and biochemical aspects of these interactions are unknown. To understand the who and the how of bacteria that interact with biofuel-producing algae, we collected 18 isolates from the phycosphere of Phaeodactylum tricornutum (PT), a model biofuel-producing alga. We sequenced and annotated the genomes of these isolates using several annotation tools and then developed genome-scale metabolic models for each species. We also co-cultured several isolates with PT to examine the outcomes of their interactions. We examined the reactome of the synthetic phycosphere microbiome to find commonalities and differences between the organisms. The community reactome has more than 5000 reactions and, while many of these reactions are shared among all the organisms, we found that approximately 20% of the reactions are unique to individual species. The unique metabolic capabilities are not equally distributed among the community members, and so we hypothesize that our community contains a few metabolic specialists, and that these capabilities might explain the outcome of their one-on-one interactions with PT. One isolate from the Algoriphagus family of microbes (labeled ARW1R1) has the most unique metabolic reactions in the community. Our experiments show that for most conditions co-culturing ARW1R1 with PT has no effect on the growth of PT. But we observed a slight increase in growth of PT when it is co-cultured with ARW1R1 in an environment that is nitrate rich but light limited. We analyzed the metabolism of ARW1R1 in search of an answer for the cause of this behavior.
Our examination showed that among its various metabolic capabilities, ARW1R1 has enzymes that catalyze two unique metabolic pathways. One pathway involves metabolism of arachidonic acid and related poly-unsaturated fatty acids (PUFAs). The other process is oxidative deamination of D- and L-amino acids resulting in production of H2O2 and α-keto acids. Both processes can improve growth of ARW1R1 in the above environment. PUFAs are a metabolic byproduct of algae, and metabolizing these carbon-rich components of the algal cell wall could be one reason why we find ARW1R1 growing in the phycosphere of PT. Additionally, ARW1R1 does not have the ability to directly use nitrate, and thus the ability to metabolize amino acids produced by PT can sate its need for nitrogen. So, while our analyses have identified possible means of nutrient scavenging by ARW1R1 from PT, we still have not found a mechanism for the observed beneficial interaction between the two organisms.

GO Term-based Comparative Functional Genomics in Plants

ABSTRACT. Gene Ontology annotations are used to describe the functional roles of genes in an organism. Their applications in research are usually limited to functional enrichment analyses or a quick description of a specific gene of interest. However, now that high-throughput pipelines such as GOMAP (Gene Ontology Meta Annotator for Plants) are available, which predict gene functions in a consistent and reproducible manner across the whole genome, new types of analyses become conceivable. Here we present a basic approach to comparative functional genomics in plants using GO terms, leveraging the hierarchical nature of the ontology to calculate a distance metric for comparing and clustering genomes as well as a binary matrix for parsimony-based clustering. We applied the method to 18 plant genomes across 14 species annotated with the GOMAP pipeline and, quite to our surprise, the resulting dendrograms largely, but not entirely, resemble well-established evolutionary phylogenies. We believe the discrepancies might point to interesting biological convergences, making this a valuable approach for hypothesis generation. However, more work is certainly needed on extending the model and overcoming the limitations of the currently very simple method. The full paper is freely available at https://doi.org/10.1093/gigascience/giac023 , the data and code used at https://github.com/Dill-PICL/GOMAP-Paper-2019.1 .

Exploring the missing heritability in SPG7 heterozygous carriers with Whole Genome Sequencing
PRESENTER: Marie Coutelier

ABSTRACT. SPG7 biallelic mutations are the most frequent cause of autosomal recessive spastic paraplegia. The associated clinical picture is either a pure spastic paraplegia, or a complex phenotype encompassing mitochondrial features, optic atrophy, and cerebellar signs. In recent years, the phenotype has been widened to cerebellar ataxia more generally, with or without pyramidal signs, and to clinical presentations associating extrapyramidal features, mimicking Parkinson’s disease or Multisystemic Atrophy of the cerebellar type in some cases.

In 731 patients with cerebellar ataxia, we sequenced known ataxia genes, either with amplicon-based panel sequencing (n=412) or whole exome sequencing (n=319). We found biallelic mutations in 23 patients (3.1%), often associating spastic or mitochondrial presentations. We also identified 19 heterozygous carriers of loss of function or previously described missense variants in SPG7, without a second mutation. Dominant transmission has been discussed in the literature. While it is suggested in some patients, a recent report described a deep intronic change responsible for an alteration of SPG7 expression, in trans with a missense mutation, advising genetic reexamination of heterozygous carriers.

We performed short-read Whole Genome Sequencing in 13 patients from 12 pedigrees, with a phenotype characteristic of SPG7-related cerebellar ataxia, associating either spasticity or parkinsonism, and an established causative but monoallelic SPG7 variant. In two patients, we identified deletions encompassing exons or enhancers that could explain the missing heritability and were missed with exome sequencing. In three other index cases, we identified conserved and rare mutations in SPG7 brain enhancers. In a family with two patients, three coding variants explain the presentation. Finally, one patient carried a CAG expansion in NOP56, detected with ExpansionHunter.

While further functional validation is required, whole genome sequencing appears to be a promising approach to reach molecular diagnosis in patients carrying heterozygous variants in recessive genes, when the clinical presumption is sufficiently high.

Comparative analysis of molecular fingerprints in prediction of drug combination effects
PRESENTER: Bulat Zagidullin

ABSTRACT. Application of machine and deep learning methods in drug discovery and cancer research has gained a considerable amount of attention in the past years. As the field grows, it becomes crucial to systematically evaluate the performance of novel computational solutions in relation to established techniques. To this end, we compare rule-based and data-driven molecular fingerprints in prediction of drug combination sensitivity and drug synergy scores using standardized results of 14 HTS studies, comprising 64 200 unique combinations of 4 153 drug-like small molecules screened in 112 cancer cell lines. We also evaluate the clustering performance of drug representations and quantify their similarity by adapting the Centered Kernel Alignment metric. Our work demonstrates that to identify an optimal drug representation type, it is necessary to take into account quantitative metrics together with qualitative considerations, such as model interpretability and robustness requirements that vary between and throughout stages of a drug development project.

Proteomics profiling revealed a combination of protein biomarkers for predicting disease trajectory of multiple sclerosis
PRESENTER: Julia Åkesson

ABSTRACT. There is a high demand for clinically useful and easily detectable protein biomarkers for predicting disease trajectory and response to treatment for multiple sclerosis (MS) patients. We previously used mRNAs as proxies combined with protein-protein interaction networks to identify MS modules whose state was measured by a few secreted proteins and used to predict the disease trajectory. Here, for the first time, we have used the highly sensitive proximity extension assay (OLINK Explore) to measure 1,463 proteins in both plasma and cerebrospinal fluid (CSF) from 143 patients at the early stages of the disease and 43 healthy controls. The patients were divided into a discovery cohort from Linköping University Hospital (n=92) and a replication cohort from Karolinska University Hospital (n=51). Consistent with previous studies, differential protein levels are clearly detectable in CSF, but these changes are mostly not transferred to plasma at the individual protein level. Using clinical information about patients' progressing disease activity, we identified proteins whose baseline expression in CSF could predict two different measures of disease progression: NEDA-3 and nARMSS. In line with previous studies, we found NFL to be the only protein needed to predict if a patient would show signs of disease activity (NEDA-3) 2 years after sampling, with a replication AUC of 0.77 (p = 0.02). The severity of disability worsening (nARMSS) proved to be more complex, with the best prediction achieved (replication AUC = 0.81, p = 2×10⁻⁵) using NFL and 10 additional proteins. Importantly, this model could also predict the disability worsening from plasma with an AUC of 0.66 (p = 2×10⁻³). The suggested models could be used to identify patients with a promising disease course who may not need the most effective treatments, thereby reducing costs and side effects.
Lastly, by considering the network connectivity we identified a proteomics MS module in plasma which significantly overlapped with differential proteins in CSF, providing a functional context for the differentially expressed proteins.

Linking models of biochemical dynamics via mass-constrained neural ordinary differential equations

ABSTRACT. In the last decade there has been an explosion of experimental methods used to measure the dynamics of biochemical species in chemical reaction networks. However, not all biological systems are amenable to such methods. For example, synapses, the connection points between neurons, are still largely opaque in terms of the dynamics of their biochemical species. There are many dynamical models of synaptic chemical reaction networks. However, the current dynamical models include only a small part of the biochemical species and reactions taking place in a synapse. Due to the lack of data necessary to characterize the chemical reaction network structure in synapses, there is a need for alternative modeling approaches. Specifically, approaches that do not rely on the characterization of the full structure of the chemical reaction network would be favourable. In this work we showcase such an approach.

In order to model a chemical reaction network where parts of the reaction network are represented as a neural network, we use neural ordinary differential equations [1]. Moreover, we integrate mass constraints along with additional structural constraints that ensure non-negativity of biochemical species during simulations. As an example of this modeling approach we use a published MAP3K-MAP2K-MAPK cascade capable of oscillations [2]. We use the published model to generate a synthetic data set. Then we use the hybrid model and assume that we know nothing about MAP2K. Instead, we add a neural network term to the MAPK derivatives, linking MAP3K to MAPK dynamics. Finally, we train the hybrid model on time series of pMAPK and ppMAPK. Our results show that it is possible to get reasonably accurate fits on the training data.
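One way to obtain mass conservation and non-negativity by construction is to let the network output only positive rate constants and to write the derivatives as differences of fluxes between the three MAPK forms. The sketch below illustrates that idea under stated assumptions; it is not the architecture used in this work. The network size, the untrained random weights, the MAP3K input signal and all function names are hypothetical, and training is omitted.

```python
import numpy as np

def softplus(z):
    return np.log1p(np.exp(z))

def init_params(seed=0, hidden=8):
    """Random, untrained weights for a tiny one-hidden-layer network (assumption)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.5, (hidden, 1)), "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.5, (4, hidden)), "b2": np.zeros(4),
    }

def rate_constants(params, map3k_active):
    """Map the active MAP3K level to four strictly positive rate constants."""
    h = np.tanh(params["W1"] @ np.array([map3k_active]) + params["b1"])
    return softplus(params["W2"] @ h + params["b2"])

def rhs(y, params, map3k_active):
    mapk, pmapk, ppmapk = y
    k1, k1r, k2, k2r = rate_constants(params, map3k_active)
    v1 = k1 * mapk - k1r * pmapk    # net flux MAPK <-> pMAPK
    v2 = k2 * pmapk - k2r * ppmapk  # net flux pMAPK <-> ppMAPK
    # The derivatives sum to zero, so total MAPK is conserved. Each loss term
    # is proportional to the species it consumes, so it vanishes at zero.
    return np.array([-v1, v1 - v2, v2])

def simulate(y0, params, map3k_fn, t_end=10.0, dt=0.001):
    """Explicit Euler integration with a small fixed step."""
    y = np.array(y0, dtype=float)
    trajectory = [y.copy()]
    for i in range(int(round(t_end / dt))):
        y = y + dt * rhs(y, params, map3k_fn(i * dt))
        trajectory.append(y.copy())
    return np.array(trajectory)

# Hypothetical oscillating MAP3K input; all MAPK starts unphosphorylated.
traj = simulate([1.0, 0.0, 0.0], init_params(), lambda t: 0.5 * (1.0 + np.sin(t)))
print(traj[-1], traj.sum(axis=1).max())
```

In a trained version, the weights would be fitted to pMAPK and ppMAPK time series, and a proper ODE solver would replace the fixed-step Euler loop.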

The field of scientific machine learning - combination of machine learning and traditional scientific modeling approaches and knowledge - is a novel emerging field. Combining things like the function approximation capabilities of neural networks and known domain specific laws might lead to solutions to challenges previously thought insurmountable. In this work we used scientific machine learning to produce a hybrid model of a chemical reaction network whose full reaction structure was not known. In the future we aim to expand our approach to larger and more complex systems and synapses.

[1] Chen, R. T., Rubanova, Y., Bettencourt, J., and Duvenaud, D. Neural ordinary differential equations. Advances in Neural Information Processing Systems (2018), 6571–6583.
[2] Sarma, U., and Ghosh, I. Oscillations in MAPK cascade triggered by two distinct designs of coupled positive and negative feedback loops. BMC Research Notes 5 (2012).

Data-driven modeling of mitochondrial metabolism

ABSTRACT. N-acetylaspartate (NAA) is the second most abundant metabolite in the brain, which is linked to Canavan disease, gestational diabetes and cancer. NAA has been shown to affect mitochondrial metabolism.

Our aim is to improve understanding of the functional role of NAA, in particular, in mitochondrial metabolism. To do that, we are developing a data-driven mathematical mechanistic model describing mitochondrial metabolism.

This large-scale compartmental model allows us to combine time-resolved stable isotope labelling data with growth medium metabolite level measurements and to analyze the dynamics of cellular metabolism. We use a rule-based modelling approach and thermodynamics-based parameterization. A computational pipeline for model simulation and parameter estimation has been established that facilitates iterative model development.

We will use this model to check the plausibility of different hypotheses about the effect of NAA on mitochondrial metabolism.

Quantitative Impacts of Bumble Bee Ecology on Foraging Behaviour: An Economic Model
PRESENTER: Arran Hodgkinson

ABSTRACT. Bumble bees’ time of exit from the hive is a critical factor in determining foraging success, yet little is known about how ecological factors quantitatively influence this behaviour. Bumble bees also exhibit varying degrees of foraging experience and this is thought to contribute to their foraging decisions and temporal strategy. They are also known to forage in the mornings and evenings, and may stay in the field overnight. In order to better quantify the hive-level dynamics of behaviour and their consequences for the survival of the colony, we utilise an economic partial differential equation (PDE) model to study the net energetic production of the colony. By accounting for the experience-structured population of bumble bees as well as the diurnal dynamics of plant resource availability, with respect to yield and resource stores, under competition, we look at optimal colony-level resource collection strategies. We find that, when no overnight cost is imposed, the optimal strategy for all foragers is to leave late in the evening and stay out overnight, before foraging and returning in the morning. The imposition of even small differential overnight penalties, however, leads to an optimal strategy of morning departures for inexperienced bees and overnight departures for experienced bees. The rate of departure, or how collectively bees may coordinate their departure from the hive, also has an effect on the optimal strategy, favouring earlier departures for experienced bees. Currently, experiments and data analysis are underway to discover whether such strategies are routinely exhibited in bumble bee foraging and what ecological factors could explain their manifest behaviours. This research is essential in understanding how changes in bumble bees’ environments contribute to their survival, as a species, and how ecosystems can be adapted to help them flourish.

Dynamic single cell analysis of a MAPK signaling cascade and its impact on transcriptional output

ABSTRACT. Mitogen-activated protein kinase (MAPK) pathways are three-tiered signal transduction cascades involved in key cellular processes. They have been shown to participate in cellular proliferation, differentiation, and multiple stress response pathways (hyperosmotic, oxidative, inflammation). It has also been demonstrated that malfunctions in these cascades lead to severe oncogenic phenotypes due to the critical nature of the processes they control. Although the main players implicated in these cascades are known, we still lack a quantitative and dynamic description of their function. As one major role of MAPKs is to activate transcriptional programs, investigating the kinetics of transcriptional induction can provide key insights into the genetic regulation of these critical cellular components. We aim to understand how the signal transduction from the cascade impacts the transcriptional activation of stress-responsive genes. To this end, the yeast Saccharomyces cerevisiae is used to study the dynamics of the MAPK Hog1 during a hyperosmotic stress response. It is possible to quantify the activity of Hog1 by monitoring its relocation upon induction by osmotic stress, as it accumulates in the nucleus. Downstream osmo-stress inducible promoters such as pSTL1, pHSP12 and pGPD1 are functionalized with the PP7 system to label nascent mRNAs. This system comprises 24 stem loop repeats which are bound by a labelled phage coat protein. As soon as the mRNA is transcribed, these loops form and a fluorescent signal which accumulates at the active locus can be observed and quantified. From the signals extracted from these images, key parameters of the promoter output are characterised, such as peak transcriptional activity, duration and total output. These parameters are correlated to the MAPK dynamics in order to study the impact of MAPK activity on the transcriptional response. We report that the MAPK activity pattern we measure is not a predictor of the promoter expression.
We investigate whether the global transcriptional capability of the cell conditions the level of transcriptional response and heterogeneity in single-cell behaviour. In parallel, we are building a mathematical model of MAPK-driven transcription using time-dependent rates based on experimental measurements and estimates.

Targeted Proteomics-Driven Computational Modeling of the Mouse Macrophage Toll-like Receptor Signaling Pathway
PRESENTER: Nathan Manes

ABSTRACT. Introduction. The Toll-like receptor (TLR) signaling pathway is crucial for the initiation of effective immune responses. Subtle variations in the concentration, timing, and molecular structure of the stimuli (e.g., lipopolysaccharide (LPS)) are known to affect TLR signaling and the resulting pathway dynamics. Tight regulation is essential to avoid acute tissue damage and chronic inflammation. Computational modeling can test mechanistic hypotheses about how regulation is achieved and why it sometimes fails, causing pathologies (e.g., sepsis). In this investigation, liquid chromatography-mass spectrometry (LC-MS) is being used to enable TLR4 pathway modeling.

Methods. Mouse (C57BL/6J) bone marrow-derived macrophages were either unstimulated or stimulated with 10 nM LPS for 30 min. A dilution series of stable-isotope labeled internal (phospho)peptide standards was spiked into cell lysates (136 unmodified peptides and 29 phosphopeptides for 54 (phospho)proteins). LC-MS was performed using a Q Exactive HF, and the resulting data were analyzed using Skyline. AlphaFold-Multimer was used to predict protein complex structures, Rosetta was used to idealize and relax the structures, and Simulation of Diffusional Association (SDA) and TransComp were used to estimate protein-protein association rates. Simmune was used to perform rule-based pathway modeling, simulation, and training.

Preliminary Results. Most of the TLR4 pathway was quantified successfully. The protein abundances ranged from 1,332 to 227,000,000 copies per cell. They moderately correlated with transcript abundance values (r = 0.699, p = 1.37e-17), and these data were used to make proteome-wide abundance estimates. Abundance increases and decreases in response to LPS were observed for proteins known to be affected by TLR pathway activation. For example, LPS stimulation increased the abundance of phosphorylated ERK1 from 30,000 to 250,000 copies per cell. Hundreds of protein-protein association rates were estimated. The (phospho)protein absolute abundance values and protein-protein association rates are being used as parameters for TLR4 pathway models. The parameter space is being explored to identify parameter sets that accurately reproduce experimental data.

Conclusions. Experimental and computational techniques are being integrated to generate a strongly data-driven model of the TLR pathway. This work was supported by the Intramural Research Program of NIAID, NIH.

Latent Dirichlet Allocation for Double Clustering (LDA-DC): Discovering patients phenotypes and cell populations within a single Bayesian framework

ABSTRACT. Introduction. Human disorders have a highly multifactorial nature and depend on genetic, behavioral, socioeconomic, and environmental factors. The incidence of metabolic diseases, cancer, and autoimmune pathologies has increased significantly in recent years, making research in this field a public health priority. In parallel, bioclinical routine datasets have expanded in conjunction with all kinds of “omics” data, from both the host and microbiota, as well as metabolomic, proteomic, and cytometry data [1]. All these types of data have their own underlying structure, taking values on different scales, with different variability, and are differently distributed. In addition, human patients are an equally important source of variability even among carefully selected cohorts: phenotypic variability (age, gender, previous conditions), dietary habits, bad vs good responders to the treatment, etc. In particular, new types of data have emerged that yield a description at the cell level, i.e., cytometry or scRNA-seq. These data add a new layer of structure that needs to be taken into account.

Motivations and Results. From the analytical viewpoint, single-cell data are high-dimensional matrices produced for each subject. The data dimension, i.e., the number of cells, varies from one individual to another. Moreover, cell types, as well as the correspondence between the cell populations of the subjects, have to be identified before applying any statistical machine learning method. We refer to the challenge we introduce and consider here as a double clustering problem, where the aim is to simultaneously determine cell types and stratify patients, purely from observations and without any prior knowledge, in order to study mechanisms of pathologies explained by particular cell subpopulations. We propose a novel approach to stratify cell-based observations within a single probabilistic framework, i.e., to extract meaningful phenotypes from both patients and cells simultaneously. Our method is a practical extension of Latent Dirichlet Allocation and is used to solve the double clustering task. The first step of our framework is the identification of the cell types. Once the cell types are fixed, we can efficiently estimate both the probability of a phenotype given a patient and the probability of a cell type given a phenotype. We tested our method on different datasets, ranging from simulated patients to patients with AML (acute myeloid leukemia) or Crohn’s disease, and were able to identify simultaneously clusters of patients and clusters of cells related to patients’ conditions. Furthermore, using a network approach, we were able to stratify patients and identify groups of patients with specific phenotypes.
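For orientation, the two estimated quantities mentioned above correspond to the two factors of the standard Latent Dirichlet Allocation mixture. In the usual LDA reading (an analogy assumed here, with patients playing the role of documents, cell types the role of words, and phenotypes the role of topics; the authors' extension may differ in detail), the probability of observing cell type c in patient d decomposes as:

```latex
P(c \mid d) \;=\; \sum_{k=1}^{K} P(c \mid z = k)\, P(z = k \mid d)
```

Here K is the number of phenotypes, P(z = k | d) is the probability of phenotype k given patient d, and P(c | z = k) is the probability of cell type c given phenotype k.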

References. [1] C. Manzoni et al., Genome, transcriptome and proteome: the rise of omics data and their integration in biomedical sciences. Briefings in Bioinformatics, 19(2):286–302, November 2016.

Combinatorial effects of ligands are predicted by a genome-scale model of signaling based on artificial neural networks
PRESENTER: Avlant Nilsson

ABSTRACT. Immune cells mount an appropriate response to threats by integrating signals from numerous ligands that bind their receptors. This depends on a network of thousands of signaling proteins that activate transcription factors (TFs), which trigger different gene programs. Simulations of this flow of information could help predict transcriptional phenotypes and the effects of mutations and drugs. However, it has been challenging to parameterize system-wide models using traditional methods.

To address this, we developed LEMBAS (Large-scale knowledge EMBedded Artificial Signaling networks). It represents these processes as a recurrent neural network with signaling molecules as hidden nodes and established protein-protein interactions as weights. Applied to synthetic data of ligand-stimulated cells, LEMBAS rapidly parameterizes models that predict unseen test data (Pearson correlation r=0.98) and the effects of knocking out signaling nodes (r=0.8).
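The idea of a recurrent network whose hidden nodes are signaling molecules and whose weights follow known interactions can be sketched as below. This is a toy illustration only and not the LEMBAS code: the four-node network, its weights, the `tanh` activation and the fixed-point iteration are all assumptions made for this example. A knockout is simulated by clamping a node to zero.

```python
import math

# Hypothetical prior-knowledge network: (source, target) -> signed weight.
# Only listed interactions carry a weight; everything else is fixed at zero.
INTERACTIONS = {
    ("ligand", "receptor"): 0.9,
    ("receptor", "kinase"): 0.8,
    ("kinase", "tf"): 0.8,
    ("tf", "receptor"): -0.5,  # negative feedback
}
NODES = ["ligand", "receptor", "kinase", "tf"]

def steady_state(ligand_input, knocked_out=(), max_iter=500, tol=1e-12):
    """Iterate the recurrent update until the node activities stop changing."""
    state = {node: 0.0 for node in NODES}
    for _ in range(max_iter):
        new_state = {}
        for node in NODES:
            # External stimulation enters only at the ligand node.
            total = ligand_input if node == "ligand" else 0.0
            for (src, dst), weight in INTERACTIONS.items():
                if dst == node:
                    total += weight * state[src]
            # Knocked-out nodes are clamped to zero activity.
            new_state[node] = 0.0 if node in knocked_out else math.tanh(total)
        change = max(abs(new_state[n] - state[n]) for n in NODES)
        state = new_state
        if change < tol:
            break
    return state

wild_type = steady_state(1.0)
kinase_ko = steady_state(1.0, knocked_out=("kinase",))
```

In this toy network the transcription factor is active in the unperturbed case and silent when the kinase is knocked out, which is the kind of prediction the knockout benchmark above evaluates. In a trained model the weights would be fitted to data rather than set by hand.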

To test LEMBAS performance on data from human macrophages, we measured their transcriptional response to more than 350 unique combinations of 20 ligands, up to 5 at a time, which were chosen by an algorithm to maximize information content. Models trained on these data attained a good fit (r=0.8) that generalized well to unseen data (r=0.73 under cross validation). We systematically probed the model for interaction effects among all combinations of three ligands. To give an example, the model predicts a strong response to the combination of interferon gamma (IFNg) and a synthetic Toll-like receptor 2 (TLR2) agonist (PAM3CSK4) for expression of tumor necrosis factor (TNF) that is selectively suppressed by interleukin 10 (IL10). We extract the predicted causative signaling cascades using a combination of sensitivity analysis and simulated perturbations. For the previous example, the model predicts a role for Ras-related C3 botulinum toxin substrate 1 (RAC1) and c-Jun N-terminal kinases (JNKs). This work demonstrates the feasibility and utility of genome-scale simulations of intracellular signaling. In future work we are looking to integrate neural network modules of signaling, metabolism, and gene regulation for a more complete mechanistic description of cellular activities.

Agent-based modeling of spatiotemporal organization of bacteria in the human gut

ABSTRACT. The intestinal microbiome is important for the innate and adaptive immune system. Microorganisms contribute to the maintenance of the gut mucosal barrier and to protection against pathogens. Perturbations of the gut microbiota caused by antibiotics or inflammation can lead to dysbiosis and bacterial overgrowth. Bacterial imbalance is associated with several diseases, among them chronic liver diseases and bloodstream infections. In clinical practice, fecal sampling is the most commonly used method to measure bacterial abundances in patients. However, the bacterial composition in the rectum cannot capture the specific gut biogeography. Hence, it is important to understand bacterial communities along the tract and from lumen to mucosa. We present an agent-based model of the human proximal colon that simulates the spatiotemporal organization of bacteria. The model simulated the bacteria in balance and the bacteria during exposure to antibiotics and dietary changes. We implemented bacteria as autonomous entities with individual properties and actions. The bacteria can interact with other bacteria and with their environment, the human gut. The choice of the model parameters was based on microbial experiments and a mathematical model of bacterial growth by Cremer et al. [1]. We aimed to represent the proximal colon as accurately as possible. Thus, we considered the peristalsis of the large intestine. The intestinal motility affected the bacterial movement along the gut. The behavior and interactions of bacteria led to emergent spatiotemporal phenomena in the lumen and near the mucosa. Our model enabled the simulation of bacterial composition and metabolites along and across the proximal colon. The spatiotemporal phenomena have to be validated by experimental methods, for example, by using immunofluorescence images of gut cross-sections. Adding more bacterial families and their interactions could improve the accuracy of the model.

[1] Cremer J, Arnoldini M, Hwa T. Effect of water flow and chemical environment on microbiota growth and composition in the human colon. Proceedings of the National Academy of Sciences. 2017;114(25):6438–6443.

Scalable and flexible inference framework for stochastic dynamic single-cell models

ABSTRACT. How a cell population dynamically responds to a stimulus such as a drug or a nutrient shift can today be studied by methods like single-cell microscopy. Ideally, dynamic data can shed light on both the cellular mechanisms behind a stimulus response and the mechanisms giving rise to cell-to-cell variability within a population. An efficient way to test whether a hypothesis can rationalise gathered data is mechanistic mathematical modelling. However, mechanistic models typically have several unknown parameters (e.g., protein translation rates). To be able to test whether a hypothesis/model is valid, unknown parameters must first be estimated by fitting the model to available data. Parameter estimation for a single-cell mechanistic model is challenging, though, since these models often must account for cell heterogeneity caused by both intrinsic (e.g., variations in chemical reactions) and extrinsic (e.g., variability in protein concentrations) noise. Although several parameter estimation methods exist, an efficient, general and flexible method is lacking, which limits the usage of single-cell mechanistic models. To alleviate this, we have developed a scalable and flexible framework for Bayesian estimation of state-space mixed-effects stochastic dynamic single-cell models. Our approach can infer model parameters when intrinsic noise is modelled by either exact or approximate stochastic simulators, and when extrinsic noise is modelled by either time-varying or time-constant parameters that vary between cells. We demonstrate our approach by studying how cell-to-cell variation in carbon source utilisation affects heterogeneity in the budding yeast Saccharomyces cerevisiae SNF1 nutrient sensing pathway. We identify hexokinase activity as a source of extrinsic noise and deduce that sugar availability dictates cell-to-cell variability.
Besides single-cell studies, our method can be used for other processes where both inter- and intraindividual variability matters, such as in ecology and in pharmacokinetic/pharmacodynamic studies.

Stochastic Model of Intra-Tumor Heterogeneity (SMITH)
PRESENTER: Adam Streck

ABSTRACT. Cell-based simulations are a popular method for investigating intra-tumour heterogeneity and genome evolution during tumour growth. However, tracking individual cells in a tumour of palpable size (billions of cells) is computationally expensive, and most methods thus represent groups of cells (demes, glands, patches) embedded in a lattice. This means that the models create only a simplified abstraction of the population, with rigid, non-biological limitations imposed by the lattice. We argue that the particular feature of lattice-based models is that they create implicit spatial constraints on the cell growth, resulting in Darwinian selection. However, these constraints can also be expressed explicitly in terms of algebraic geometry and enforced even on non-spatial, well-mixed models.

To demonstrate this claim, we have created a well-mixed, confined model of the growth of a solid, spherical tumour with fitness-altering mutations. Our model introduces a novel mechanism, so-called confinement, that limits the cell turnover in the tumour to its outer shell of a certain width. We show that, when paired with a fitness increase due to mutations, confinement is sufficient to introduce Darwinian selection to tumour growth and that different confinement values lead to different spatial dynamics, ranging from pure surface growth to full volume growth. We further show how a wide range of clonal dynamics naturally emerges from the combination of fitness increase and confinement.
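The geometric idea behind confinement can be illustrated with a deterministic toy calculation. The sketch below is not the SMITH model: it ignores mutations and stochasticity, and it assumes a spherical tumour in which each cell occupies unit volume, so that lengths are measured in cell-size units. Only the fraction of cells lying within an outer shell of a given width is allowed to divide.

```python
import math

def shell_fraction(radius, width):
    """Fraction of a sphere's volume within `width` of its surface."""
    if width >= radius:
        return 1.0  # the shell covers the whole tumour: full volume growth
    return 1.0 - (1.0 - width / radius) ** 3

def grow(n0, width, birth_rate=1.0, dt=0.01, steps=1000):
    """Euler integration of dN/dt = birth_rate * N * shell_fraction."""
    n = float(n0)
    for _ in range(steps):
        # Radius of a sphere holding n cells of unit volume (assumption).
        radius = (3.0 * n / (4.0 * math.pi)) ** (1.0 / 3.0)
        n += dt * birth_rate * n * shell_fraction(radius, width)
    return n

surface_like = grow(100, width=1.0)    # thin shell: growth limited to the surface
volume_like = grow(100, width=1e9)     # shell wider than the tumour: exponential growth
```

With a thin shell the dividing fraction shrinks as the tumour expands, so growth falls below the exponential growth obtained when the shell spans the whole tumour. The shell width thus interpolates between the two regimes named in the abstract.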

Our model is implemented in the SMITH simulation tool. Due to its computational efficiency, SMITH can simulate a real-sized tumour of ~2 cm in diameter (~1 billion cells) in seconds.

Temporal Proteomics and Transcriptomics Unravels the Host-Pathogen Interaction Network of Macrophages and Corynebacterium diphtheriae.
PRESENTER: Luca Musella

ABSTRACT. Corynebacterium diphtheriae was the etiological agent of severe diphtheria epidemics in early industrial times, prior to mass immunization against the eponymous secreted toxin. Nevertheless, scientists currently believe that C. diphtheriae is a re-emerging pathogen, as outbreaks, antibiotic resistance and systemic infections caused by non-toxigenic strains have recently been reported. In this study, we investigated the host-pathogen interactions between THP-1 derived macrophages and C. diphtheriae ISS3319, a non-toxigenic strain with superior intracellular survival within the macrophage’s phagolysosome, in comparison to other non-pathogenic bacteria. An ad hoc infection assay was set up and total RNA and protein contents were collected 4 and 24 hours after bacterial inoculation, along with control groups, and processed via RNAseq and HPLC-MS/MS, respectively. Differential gene expression, integrated with enrichment analyses, genomics data, homology mapping and network reconstruction, suggests a mechanistic interpretation of the infection process across time, intracellular compartments and metabolic processes, in a systems biology fashion.

Lineage plasticity in prostate cancer depends on FGFR and JAK/STAT inflammatory signaling
PRESENTER: Joseph Chan

ABSTRACT. The inherent plasticity of tumor cells provides a mechanism of resistance to molecularly targeted therapies, exemplified by adeno-to-neuroendocrine lineage transitions in prostate and lung cancer. Here, we investigate the root cause of lineage plasticity by performing single-cell transcriptomic analysis of time-course experiments in genetically engineered mouse models and murine organoid cultures of castrate-resistant prostate cancer following Trp53 and Rb1 deletion. We observe rapid collapse of cell-type fidelity with the emergence of a mixed luminal and basal phenotype with additional EMT-like features. To quantify dynamic changes in plasticity, we develop scBLender, a suite of methods that measure basal-luminal mixing as a proxy for plasticity. We leverage these plasticity metrics to identify Fgfr and Jak-Stat inflammatory signaling as putative drivers of plasticity that are activated early in the time-course prior to any corresponding morphological changes as well as under therapeutic pressure. Genetic and pharmacologic inhibition of Jak1/2 combined with Fgfr blockade in murine and patient-derived organoids not only reversed the plastic state to wild-type morphology, but also restored sensitivity to antiandrogen therapy in models with residual AR expression. Single-cell analysis of clinical biospecimens confirms the presence of mixed basal-luminal cells with elevated JAK/STAT and FGFR signaling in a subset of patients with metastatic disease, with implications for stratifying patients for clinical trials. Collectively, we show that lineage plasticity initiates quickly as a cell-autonomous process that is further increased in the in vivo setting, and through newly developed computational approaches, we identify a pharmacological strategy that restores lineage identity using clinical grade inhibitors.

Efficient parameter estimation for ODE models with semi-quantitative data using spline approximation
PRESENTER: Domagoj Doresic

ABSTRACT. Quantitative dynamical models facilitate the understanding of biological processes and the prediction of their dynamics. These models usually comprise unknown parameters, which have to be inferred from experimental data. For quantitative experimental data, there are several methods and software tools available. However, for semi-quantitative and qualitative data the available approaches are limited and computationally demanding. We address one kind of semi-quantitative data: for several measurement techniques, the measurements have an unknown monotone non-linear dependency on the variables of interest. We present a method based on hierarchical optimization and approximation using splines. We consider a reformulation of the inverse problem as a bi-level optimization problem. The linear splines that model the unknown monotone non-linear dependencies are optimized in the inner optimization problem. Using these optimized spline mappings, we obtain mapped observable simulations that are comparable to the given measurements. This enables us to define a negative log-likelihood objective function, which we minimize in the outer optimization problem to obtain maximum likelihood estimates of the parameters of the dynamical system. To improve the performance and efficiency of the method, we use gradient-based local optimizers for both optimization problems. The gradients are computed using a semi-analytical algorithm. The approach is implemented in the open-source Python Parameter EStimation TOolbox (pyPESTO).
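The bi-level structure described above can be sketched in a few lines. Note the substitutions: this toy example does not use pyPESTO, the inner problem fits a free monotone step function by isotonic regression (pool-adjacent-violators) rather than a linear spline, and the outer problem uses a plain grid search rather than a gradient-based optimizer. The one-parameter model, the saturating readout and all names are hypothetical and chosen only for illustration.

```python
import math

def simulate(theta, times):
    # Hypothetical one-parameter model with a non-monotone time course.
    return [t * math.exp(-theta * t) for t in times]

def isotonic_fit(values):
    """Pool-adjacent-violators: best non-decreasing least-squares fit."""
    blocks = []  # each block is [mean, number of pooled points]
    for v in values:
        blocks.append([v, 1])
        # Merge neighbouring blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
    fitted = []
    for mean, weight in blocks:
        fitted.extend([mean] * weight)
    return fitted

def inner_loss(theta, times, measurements):
    """Inner problem: optimal monotone map from simulation to measurement."""
    simulated = simulate(theta, times)
    order = sorted(range(len(times)), key=lambda i: simulated[i])
    observed = [measurements[i] for i in order]
    fitted = isotonic_fit(observed)
    return sum((f - o) ** 2 for f, o in zip(fitted, observed))

def outer_fit(times, measurements, candidates):
    """Outer problem: choose the dynamic parameter with the smallest inner loss."""
    return min(candidates, key=lambda th: inner_loss(th, times, measurements))

times = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
true_theta = 0.5
# Synthetic semi-quantitative readout: an unknown saturating function of the state.
measurements = [x / (x + 0.3) for x in simulate(true_theta, times)]
candidates = [0.1 * k for k in range(1, 31)]
best_theta = outer_fit(times, measurements, candidates)
```

Because a free monotone step function only constrains the ordering of the simulated values, several candidate parameters may tie at zero loss in this toy setting. A spline with a limited number of knots, as in the abstract, restricts the mapping further.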