
09:15-10:00 Session K8: KEYNOTE VIII: Alexander Pritzel


Location: Alexander
The workings and impact of AlphaFold

ABSTRACT. AlphaFold.

10:00-10:30 Coffee Break

Summary: In recent years, more and more studies are appearing where different computational approaches and methodologies are combined to address biological phenomena. On one hand, integration of different methodologies poses problems regarding the definition of the respective interface, for example when combining agent-based models with ordinary differential equations, or Boolean with genome-scale metabolic models. On the other hand, the successful combination offers the opportunity to answer biological questions not easily addressable otherwise. This session will present success cases where the combination of different computational methodologies and approaches has shed light on mechanisms underlying emergent properties of biological systems. 

Location: Grenander I+II
Integrating cellular and physiologically-based pharmacokinetic (PBPK) modelling

ABSTRACT. Integrating different levels of abstraction and different types of models and/or modeling approaches poses several kinds of problems. Often, it is not possible to integrate these in a completely seamless manner, and special care has to be taken with the interface between the models and with their synchronization. Here, besides some general considerations, I present our work on integrating cellular kinetic models, which represent signal transduction processes in individual cells, with whole-body pharmacokinetic or, more specifically, physiologically-based pharmacokinetic (PBPK) models. We established such integrated models for IFN-alpha administration and signalling in liver cells for both humans and mice. We used the models to compare in vitro and in vivo findings and to shed some light on inter-species extrapolations.

A Novel and Robust Molecular Switch Actuating the Quantitative Model of Eukaryotic Cell Cycle Control

ABSTRACT. The eukaryotic cell cycle is driven by waves of cyclin-dependent kinase (cyclin/Cdk) activities that rise and fall with a timely pattern called “waves of cyclins”. This pattern guarantees coordination and alternation of DNA synthesis with cell division, and its failure results in altered cyclin/Cdk dynamics and abnormal cell proliferation. Although details about transcription of cyclins are available, the network motifs responsible for this timely pattern are currently unknown. Here I show a novel principle of design that ensures cell cycle time keeping through interlocking transcription with cyclin/Cdk dynamics in budding yeast. Through analyses of kinetic models of the cyclin/Cdk network and of their state and parameter space, and quantitative data of Clb dynamics, a novel regulatory design is unravelled that highlights the Clb/Cdk–TF axis being pivotal for timely cell cycle dynamics. This work rationalizes the quantitative model of Cdk control proposed by the 2001 Nobel Prize recipient Sir Paul Nurse, identifying regulatory motifs underlying cell proliferation dynamics in eukaryotes.

More BANG for the buck: how to distribute measurements to optimize information gain for mathematical models
PRESENTER: Severin Bang

ABSTRACT. Biological measurements are often performed in triplicate, i.e., the same experiment is repeated three times to improve accuracy. When the total number of measurements is restricted, repeating the same measurements means that fewer different biological conditions can be measured. When mathematical models of dynamical processes are fitted to biological data, the quality of the fit is determined not only by the accuracy of the data, but also by how much and which part of the dynamics is observed. In this talk we present a statistical analysis of different sampling strategies and their ability to inform a mathematical model. We compare how the distribution of a fixed number of measurements influences the accuracy of the model fit. The aim is to infer guiding principles for optimizing experimental planning to maximize information gain.
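One way to make the trade-off between replicates and distinct conditions concrete is to compare the Fisher information that two designs carry about a model parameter. The sketch below is illustrative only and is not the analysis presented in the talk: the exponential decay model, the noise level, the time points and the function name `fisher_information_k` are all assumptions chosen for the example.

```python
import math

def fisher_information_k(times, amplitude=1.0, k=0.5, sigma=0.1):
    """Fisher information about the decay rate k for the assumed model
    y(t) = amplitude * exp(-k * t) with independent Gaussian noise (std sigma)."""
    # The sensitivity of y(t) to k is -amplitude * t * exp(-k * t).
    return sum((amplitude * t * math.exp(-k * t)) ** 2 for t in times) / sigma ** 2

# Two hypothetical designs with the same budget of 12 measurements.
triplicates = [t for t in (1.0, 2.0, 4.0, 8.0) for _ in range(3)]  # 4 time points x 3 repeats
spread = [0.7 * (i + 1) for i in range(12)]                        # 12 distinct time points

info_trip = fisher_information_k(triplicates)
info_spread = fisher_information_k(spread)
print(f"triplicates: {info_trip:.1f}, spread: {info_spread:.1f}")
```

Which design scores higher depends on the assumed model, parameter values and time points, so the numbers are only meaningful for this toy setting.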

A generic approach to decipher the mechanistic pathway of heterogeneous protein aggregation kinetics

ABSTRACT. Amyloid formation is a generic property of many protein/polypeptide chains. A broad spectrum of proteins, despite the diversity in their precursor sequences and the heterogeneity in their aggregation mechanisms, produces a common cross β-spine structure that is often associated with several human diseases. However, a general modeling framework to interpret amyloid formation remains elusive. Herein, we propose a data-driven, ODE-based mathematical modeling approach that elucidates the most probable interaction network for the aggregation of a group of proteins (α-synuclein, Aβ42, Myb, and TTR proteins) by considering an ensemble of network models, which include most of the mechanistic complexities and heterogeneities related to amyloidogenesis. The best-fitting model efficiently quantifies various timescales involved in the process of amyloidogenesis and explains the mechanistic basis of the monomer concentration dependency of amyloid-forming kinetics. Moreover, the present model reconciles several mutant studies and inhibitor experiments for the respective proteins, makes experimentally feasible non-intuitive predictions, and provides further insights into how to fine-tune the various microscopic events related to amyloid formation kinetics. This might help to formulate better therapeutic measures in the future to counter unwanted amyloidogenesis. Importantly, the theoretical method used here is quite general and can be extended to any amyloid-forming protein. The core model network structure is available in SBML format in a GitHub repository (https://github.com/baichandra05/Genericmodel_protein_aggregation) for users to customize and fit to any type of kinetic data set according to their preferences. An associated flowchart of the algorithm is also provided to make the approach easy to understand for general readers.

Unraveling the role of network motifs to decipher the origin of robust decision-making in biological systems
PRESENTER: Amitava Giri

ABSTRACT. Living cells make precise decisions under any given physiological conditions. This kind of robust decision-making has been shown to be dynamically organised by complex tri-stable, Mushroom or Isola kinds of bifurcations related to a specific regulatory gene. How these dynamical features emerge from the complex gene regulatory networks organising such cellular processes, and which minimal network motifs can achieve such complex dynamical features, remain poorly understood. Herein, by employing bifurcation analysis and Waddington’s potential landscape analysis, we demonstrate that Mushroom and Isola bifurcations can be realised with four minimal network motifs that are constituted by combining a positive feedback motif with various incoherent feed-forward loops (IFFL). Our study reveals that the intrinsic bi-stable dynamics originating from the positive feedback motif can be fine-tuned by altering the extent of the incoherence of these minimal networks to produce these complex bifurcations. To further examine the relevance of these findings, we investigated the transcriptional network involving Nanog, Oct4 and Gata, which dynamically governs the cell-fate determination of embryonic stem (ES) cells. We showed that tri-stable Oct4 dynamics and mushroom-like steady-state dynamics of Nanog orchestrate the regulation of ES cell differentiation. The model reconciles a range of experimental observations and predicts ways to fine-tune the developmental dynamics of ES cells by altering the nature and extent of positive feedback and incoherent feed-forward interactions existing in the Nanog regulatory network.

Geometric programming to Solve Optimal Concentrations of Metabolites and Enzymes in Constraint-based modelling
PRESENTER: Sabine Peres

ABSTRACT. Constraint-based modelling is a widely used approach to analyze genotype-phenotype relationships. Its key concepts are stoichiometric analyses such as flux balance analysis (FBA), resource balance analysis (RBA) and elementary flux mode (EFM) analysis. While FBA identifies an optimal flux distribution with respect to a given objective, EFM analysis characterizes the whole of the available solution space in terms of minimal pathways. RBA predicts, for a specific environment, the set of possible cell configurations compatible with the available resources and significantly extends the predictive power of FBA. However, when stoichiometric and kinetic constraints are considered together, the set of possible flux configurations does not generally define a convex set, since the kinetic functions are not linear. The optimization problem thus has multiple local maxima. Recent works showed that the optimal solution of constrained enzyme allocation problems with general kinetics is an EFM. Building on this recent result, a first contribution of our work is to prove that the kinetic optimization problem under resource allocation constraints is a geometric program within an EFM, i.e. a convex optimization problem that is easily solved. Thus, to predict optimal flux modes, we compute all the EFMs and solve the convex optimization problem on each EFM, which provides, for this mode, the optimal allocation of resources among enzymes and the associated profile of metabolite concentrations. We applied our method to the central carbon metabolism of E. coli, with a detailed model of the respiratory chains and ATPase (explicitly including the proton motive force). This approach allowed us to explore whether certain experimental properties observed in E. coli are consistent with, and consequences of, an optimal allocation of bacterial resources. Our method is very promising for synthetic biology and increases the ability to efficiently design biological systems.
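For orientation, FBA in its standard textbook form is a linear program over the flux vector. The notation below is generic and assumed here, not taken from the abstract: S is the stoichiometric matrix, v the flux vector, c the objective weights, and v^min, v^max the flux bounds.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{s.t.} \quad & S v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}
```

Adding kinetic rate laws couples v to enzyme and metabolite concentrations through nonlinear functions, which is where the convexity of this feasible set is lost.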

Multi-Omics Regulatory Network Inference in the Presence of Missing Data

ABSTRACT. A key problem in systems biology is the discovery of regulatory mechanisms in the form of multi-level networks. Modern multi-omics profiling techniques probe these fundamental regulatory networks but are often hampered by experimental restrictions leading to missing data or partially measured omics types for subsets of individuals. In such cases, classical regulatory network inference approaches are limited. In recent years, approaches have been proposed to infer sparse regression models in the presence of missing information.

However, these methods have yet to be adopted for regulatory network inference.

In this presentation, we introduce an extension of KiMONo, a Knowledge guIded Multi-Omics Network inference approach, to handle missingness. To this end, we integrated five Lasso regression-based methods from three categories that can be applied to missing data: a kNN-imputation-based Sparse-Group Lasso approach (knnSGLasso), two approaches that are designed to integrate multiple imputed datasets (Group Adaptive Lasso: GALasso, and Stack Adaptive Lasso: SALasso), and two Lasso-based inverse covariance estimators (Convex Conditioned Lasso: CoCoLasso, and Lasso with High Missing Rate: HMLasso).

We benchmarked their performance on an invasive carcinoma multiomics atlas (604 patients) comprising copy number variation (84 features), methylome (1366 features), and transcriptome data (11530 features). We simulated commonly encountered missing data scenarios, including progressively increasing random feature missingness and block-wise sample missingness with increasingly noisy data. Additionally, we evaluated the performance of the different approaches with reduced sample size, removing up to 50% of the total number of samples, as well as the computational runtime.

In general, our results indicated the methods using prior imputation outperformed those based on inverse covariance estimation, although the latter were computationally faster. When comparing the inferred networks on missing data with networks inferred on full data, knnSGLasso showed the highest F1-score closely followed by SALasso and GALasso on scenarios with randomly missing features. However, SALasso and GALasso outperformed knnSGLasso in scenarios with block-missingness.

In conclusion, we show that robust multi-omics network inference in the presence of missingness is feasible with the extended version of KiMONo, which allows users to leverage available multi-omics data to its full extent.

BioCypher: an ontology-driven framework for flexible harmonisation of large-scale biomedical knowledge graphs

ABSTRACT. Although biomedical knowledge is increasingly abundant and available, it is fragmented across providers and research groups. Large-scale pipelines have been built by individual researchers and companies to harmonise the data supplied by complementary primary datasets. These pipelines integrate the heterogeneous primary data into large, harmonised knowledge graphs. However, each of these large secondary sources still operates by its own arbitrary schema and technological foundation, making maintenance and interoperability difficult. The ways in which researchers interact with these platforms are equally heterogeneous and arbitrary, most often in the form of a web interface or software package. As a first step towards the solution of these challenges, we propose a biomedical data interface, which we call BioCypher.

BioCypher aims to facilitate integration and use of biomedical prior knowledge and data via several mechanisms, all implemented in an open-source Python package: 1) Encoding biomedical "objectness" of knowledge graph entities based on a comprehensive public ontology system (the Biolink project); 2) Translation and integration between different data sources and identifier systems using ontological hierarchy and established mapping facilities; 3) Easy-to-use, fast and flexible build mechanism for the creation of individualised task-specific knowledge graphs to allow for rapid prototyping and application-oriented performance.

The BioCypher workflow is simple and hinges on specification of knowledge graph entities via a configuration YAML file that maps the heterogeneous input data to the respective ontological classes (e.g., "Gene" or "SmallMolecule"). Via this file, the user specifies which types of entities and relationships should be represented in the output knowledge graph. Using a simple adapter script, the input data is passed to BioCypher, which then performs harmonisation, integration, and the knowledge graph build procedure.

As a first step, we are focused on migrating secondary knowledge sources such as our own database OmniPath, the Clinical Knowledge Graph, the Dependency Map project, the Open Targets knowledge graph, and others. However, the long-term goal of BioCypher is to represent each of the primary knowledge sources with their own adapter, which can then be combined "on the fly" by specifying a mode of representation, essentially allowing for recreation and recombination of these secondary knowledge sources in a harmonised manner. Down the line, this would allow centralisation of the primary knowledge collections and distributed storage and computing for individualised knowledge graphs.

Once centralised, individualised knowledge graphs can be made available to a wider range of biomedical researchers, not only bioinformatics specialists with access to high-performance computing. Knowledge graph workflows could be provided in a parallelisable cloud environment, for example via Jupyter notebooks. The ontological harmonisation can also allow for further accessibility measures, such as graphical user interfaces or dialogue systems, usable by all biomedical researchers.

Fast parameter estimation for ODE-based models of heterogeneous cell populations
PRESENTER: Yulan van Oppen

ABSTRACT. Single-cell time series data frequently display considerable variability across a cell population. When these data are used to fit dynamic models of intracellular processes, it is more appropriate to infer parameter distributions that capture population variability, rather than fitting the population average to obtain a point parameter estimate. The current gold standard for inferring parameter distributions across cell populations is the Global Two Stage (GTS) approach for nonlinear mixed-effects models, where cell-specific parameter estimates and their associated uncertainties are calculated in the first stage and population parameter distributions are inferred in the second. Although the GTS method is reliable, its current implementation requires repeated use of non-convex optimization, which is not guaranteed to converge, while each optimization run requires multiple simulations of the system. These features make the GTS method computationally expensive.

We propose an alternative, computationally efficient implementation of the GTS method for mixed-effects dynamical systems which are nonlinear in the states but linear in the parameters (a class that encompasses a wide range of models such as those based on mass-action kinetics). For such systems, point parameter estimates can be obtained using least squares regression on time derivatives of smoothed measurement data, an approach called gradient matching. Here, we extend the application of gradient matching to the inference problem for mixed-effects dynamical systems and integrate it into the GTS method by properly accounting for uncertainties in individual cell parameters in the first stage. We also present an Expectation Maximization (EM) algorithm and associated parameter uncertainty estimates which are applicable when not all system states are observed, as is typical for biological systems.
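The core idea of gradient matching for a linear-in-parameters system can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the logistic model dx/dt = a*x - b*x**2, the noise-free data (so the smoothing step is skipped), and all function names are chosen here for the example. The mixed-effects and uncertainty-propagation parts of the GTS method are not shown.

```python
import math

def logistic(t, a, b, x0):
    """Closed-form solution of the assumed model dx/dt = a*x - b*x**2."""
    K = a / b
    return K / (1.0 + (K - x0) / x0 * math.exp(-a * t))

def central_differences(t, x):
    """Derivative estimates at the interior time points."""
    return [(x[i + 1] - x[i - 1]) / (t[i + 1] - t[i - 1])
            for i in range(1, len(x) - 1)]

def gradient_matching(t, x):
    """Least-squares estimate of (a, b) from dx/dt = a*x + b*(-x**2)."""
    dxdt = central_differences(t, x)
    xs = x[1:-1]
    f1 = xs                      # feature multiplying a
    f2 = [-v * v for v in xs]    # feature multiplying b
    # Normal equations for the two-column design matrix [f1, f2].
    s11 = sum(u * u for u in f1)
    s12 = sum(u * v for u, v in zip(f1, f2))
    s22 = sum(v * v for v in f2)
    r1 = sum(u * d for u, d in zip(f1, dxdt))
    r2 = sum(v * d for v, d in zip(f2, dxdt))
    det = s11 * s22 - s12 * s12
    return (r1 * s22 - r2 * s12) / det, (s11 * r2 - s12 * r1) / det

t = [0.05 * i for i in range(201)]
x = [logistic(ti, a=1.0, b=0.5, x0=0.1) for ti in t]
a_hat, b_hat = gradient_matching(t, x)
print(a_hat, b_hat)
```

No ODE simulation is needed during fitting, which is the source of the speed-up; with noisy data the derivative estimates degrade, which is why smoothing precedes the regression.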

We demonstrate the efficiency of our approach with a small simulation study including three dynamical systems. For each system, we simulate N = 100 noisy trajectories and assume the model parameters follow a joint normal distribution. The computing times and accuracies of inferred distributions in terms of Fréchet distances from the ground truth (in parentheses) are given below for the original GTS method vs our adaptation. As our results demonstrate, gradient matching using linear regression yields a substantial improvement in terms of computational efficiency over the simulation-based GTS approach, at the cost of minor accuracy loss.

Computing time (Fréchet distance): original GTS method vs our adaptation

- SIMPLE ENZYME KINETICS (three states, three parameters)
  All states observed: 89.23 sec (0.0514) vs 1.12 sec (0.0050)
  Two states observed: 74.71 sec (0.0416) vs 1.95 sec (0.0240)

- FLUORESCENT PROTEIN MATURATION (three states, five parameters)
  All states observed: 117.49 sec (0.0121) vs 1.09 sec (0.0285)
  One state observed: 72.04 sec (0.0302) vs 3.31 sec (0.0404)

- BIFUNCTIONAL TWO-COMPONENT SYSTEM (six states, four parameters)
  All states observed: 77.78 sec (0.1835) vs 4.65 sec (0.2162)
  Three states observed: 74.97 sec (0.2130) vs 5.23 sec (0.2128)

Statistical estimation of enzyme kinetics and pathway cost predicts overflow metabolism
PRESENTER: Mattia Gollub

ABSTRACT. Constraints on cellular protein concentrations are often used to explain overflow metabolism and might be a major driving factor of microbial cross-feeding and division of labor. Mathematical models of overflow metabolism have the potential of predicting such behaviors, but linking reaction fluxes with enzyme abundances requires quantitative knowledge of enzyme kinetics. However, kinetic parameters have only been measured in few organisms and in-vitro conditions, leaving it unclear what values should be used to construct a model. Moreover, current Enzyme Cost Minimization (ECM) methods require either fixed flux ratios or strong simplifying assumptions, limiting their applicability. To address the challenge of unknown kinetic parameters, we developed mixed-effects models representing Kcat and Km values as a hierarchy of properties from generic to specific: substrate and reaction identifier, EC number, protein family, and protein identifier. We fitted the models to values from kinetics databases in a Bayesian framework, which provides both parameter estimates and their uncertainty. Cross-validation performances are comparable to state-of-the-art deep learning methods, while our approach adds interpretability and robust uncertainty estimates. We found that, on average, the largest effect size for Km was contributed by the substrate (0.97 on the log10 scale). The average effect of organism-specific protein identifiers was 0.44, one third of the overall variation in parameter values, confirming that measurements from characterized organisms are indeed informative for predicting values in new organisms. We obtained similar results for catalytic rates. The average residual standard deviation, which captures unexplained differences between experimental conditions such as temperature and medium composition, was 0.30. 
This value highlights the limitations of directly employing literature values, but also provides an estimate of the error introduced when applying in-vitro values to in-vivo models. To address the limitations of current ECM methods, we present Global ECM (GECM), a multi-start non-convex optimization method that can jointly optimize fluxes and metabolite concentrations to predict cost-optimal pathways. We combined our parameter estimates with standard free energies using eQuilibrator and parameter balancing to obtain thermodynamically consistent estimates for reversible Michaelis-Menten kinetics. The uncertainty in the kinetic and thermodynamic parameters was then propagated to the predictions by running the optimization on multiple parameter sets, sampled from their predicted distribution. Our GECM approach predicts preference for partial fermentation of glucose to acetate and glycerol over complete respiration in E. coli’s core metabolism, a behavior that consistently emerges in long-term evolution experiments. Overall, we established a statistical pipeline from sparse parameter values to robust models of enzyme cost and showed its ability to predict pathway choice under proteome limitations.

Integrating 13C labeling and thermodynamic data for thermodynamically consistent Bayesian Metabolic Flux Analysis (T13C-MFA)

ABSTRACT. The quantitative study of cellular metabolism is central to Systems Biology where it has broad applications ranging from strain design to drug development. The intracellular metabolic reaction rates, in short fluxes, which characterize the metabolic state together with the metabolite concentrations, are key to understanding the cellular phenotype. Since fluxes are not directly measurable under in vivo conditions, they must be inferred from data through computational models.

Two popular, orthogonal approaches to constrain intracellular fluxes are thermodynamics-based flux analysis (TFA) and metabolic flux analysis with 13C labeling data (13C-MFA). TFA is based on the solid foundation of thermodynamics and uses measured or estimated Gibbs energies of reactions, which are available from services such as eQuilibrator [1]. With that, TFA constrains the space of metabolic states allowed in a specific condition, rather than answering the question of the actual metabolic state of the cell. 13C-MFA, on the other hand, is a high-precision flux inference tool, which integrates data from 13C isotope labeling experiments with a small-scale, focused and atom-mapped metabolic model to infer the in vivo metabolic rates under the investigated conditions. Because the fluxes of reversible reactions, which are governed by thermodynamics, are notoriously hard to infer precisely with 13C-MFA, a combination of both approaches promises improvements to flux inference. A few attempts to combine TFA and 13C-MFA sequentially have been made, for example by filtering 13C-MFA flux maps for thermodynamic consistency or by tightening TFA constraints through the addition of further data. However, a correct and unbiased solution that harnesses the best of both techniques in one consistent approach has not been developed until now.
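The thermodynamic constraint behind TFA-type methods can be stated compactly. The notation is generic and assumed here rather than taken from the abstract: S_ij are stoichiometric coefficients, c_i metabolite concentrations (activities), v_j the net flux of reaction j, and Δ_r G'°_j its standard Gibbs energy.

```latex
\Delta_r G'_j \;=\; \Delta_r G'^{\circ}_j \;+\; R T \sum_i S_{ij} \ln c_i ,
\qquad
v_j \, \Delta_r G'_j < 0 \quad \text{whenever } v_j \neq 0 .
```

A net flux can therefore only run in the direction of negative Gibbs energy, which restricts the admissible flux directions but does not by itself fix the flux values.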

To enable unbiased sampling of thermodynamically consistent fluxes with 13C labeling data, we developed T13C-MFA, a method that combines the Probabilistic Thermodynamic Analysis (PTA) toolbox [2] and the high-performance simulator 13CFLUX2 [3]. The challenges of T13C-MFA are twofold: firstly, a specialized MCMC sampling algorithm had to be designed to handle the high-dimensional, non-convex joint space of fluxes and Gibbs energies; secondly, the high-performance 13CFLUX2 simulator had to be interfaced with the open PTA tool to achieve MCMC convergence within practical time frames. T13C-MFA scales to real-world problems, as demonstrated with a community-typical model of E. coli. With this, we study the improvements in the resolution of fluxes, especially those of reversible reactions, which are notoriously hard to identify.


[1] Beber et al. (2022) eQuilibrator 3.0: a database solution for thermodynamic constant estimation. Nucleic Acids Res 50: D603–D609

[2] Gollub et al. (2021) Probabilistic thermodynamic analysis of metabolic networks. Bioinformatics 37: 2938–2945

[3] Weitzel et al. (2013) 13CFLUX2 - High-performance software suite for 13C-metabolic flux analysis. Bioinformatics 29: 143–145

Simulation Based Metabolic Flux Inference
PRESENTER: Thomas Diederen

ABSTRACT. The measurement of metabolic fluxes is crucial for application areas ranging from cancer metabolism to industrial biotechnology. Fluxomics has to date proven a low-throughput affair, typically employed to verify a small number of hypotheses on flux-regulation. We set out to enable flux-inference on large numbers of samples in order to gain biological insight from high-throughput screens. Advances in metabolomics, such as novel column chemistry and higher resolution mass-spectrometers, have enabled the measurement of ever larger numbers of samples. To capitalize on these advances in analytical chemistry in the context of fluxomics, we propose a major overhaul of the computational inference procedure necessary for flux-inference. We develop a simulation-based flux inference pipeline using normalizing flows that enables fast inference on arbitrarily large numbers of measurements. Unlike the inference methods of old, our method is inspired by the Approximate Bayesian paradigm and allows for the flexible specification of prior distributions over fluxes and in the choice of observation models which are tailored to a specific instrument. We develop an observation model for an in-house LC-MS method and calibrate it on 430 separate LC-MS measurements. Lastly, we demonstrate the superiority of our approach over classical flux inference approaches in terms of inductive bias. This approach allows for rigorous and high-throughput fluxomics hypothesis testing and represents a major improvement over existing methods.

BioSimulations: integrated models, model languages, model repositories, simulation experiments, simulation tools and data visualization

ABSTRACT. More predictive models, such as whole-cell models, could transform biological science, medicine, and bioengineering. To collaborate, investigators will need to reuse and combine each other's simulations. Languages, ontologies, and repositories have been developed to help exchange individual aspects of simulation experiments. Nevertheless, it is difficult to reuse simulations for several reasons: the large numbers of methods, formats, and tools needed for different biological systems and scales; numerous gaps within and between these resources; and limited quality controls. To facilitate the use of these resources, we first developed RunBioSimulations (https://run.biosimulations.org/), an extensible web application that simulates a wide range of computational modeling frameworks, algorithms, and formats. RunBioSimulations leverages several community resources, including model formats such as SBML, CellML, BNGL, and VCML; standards for in silico experiments such as the Simulation Experiment Description Language (SED-ML) and the COMBINE archive format; and BioSimulators (https://biosimulators.org), a new open registry of standardized simulation tools. To help investigators share and reuse simulations, we developed BioSimulations, a central repository for models, simulations, and visualizations of simulation results. Importantly, BioSimulations helps authors quality-control and share their projects, and helps others modify and execute these projects and interactively visualize their results. Already, BioSimulations includes over 1,000 projects. To support a broad range of biological systems and scales, BioSimulations supports numerous modeling approaches, model languages, model repositories, simulation algorithms, and simulation tools.
We achieved this by refining the SED-ML standard for describing simulations; expanding the Systems Biology Ontology of modeling frameworks; expanding the Kinetic Simulation Algorithm Ontology of simulation methods; developing new formats for simulation results, logs of simulations, data visualizations of simulation results, and simulation tools; developing tools for quality-controlling simulations and simulation tools; and integrating these resources. Furthermore, the community can easily extend BioSimulations to additional modeling approaches, modeling languages, simulation algorithms, and simulation tools. By facilitating collaboration, we anticipate that BioSimulations will be a driving force toward more comprehensive and more predictive models.

Systematic inference identifies a major source of heterogeneity in non-Markovian cell signaling dynamics: the rate-limiting step number
PRESENTER: Hyukpyo Hong

ABSTRACT. Identifying the sources of cell-to-cell variability in signaling dynamics is essential to understanding drug response variability and developing more effective therapeutics. However, this is challenging because many intermediate signaling reactions are experimentally unobservable. This can be overcome by replacing them with a single random time delay, but the resulting process is non-Markovian, making it difficult to infer cell-to-cell heterogeneity in reaction rates and time delays. In this talk, we present an efficient and scalable moment-based Bayesian method that infers cell-to-cell heterogeneity in the non-Markovian signaling process. We apply this method to single-cell expression profiles from promoters responding to various antibiotics and discover a major source of cell-to-cell variability in antibiotic stress-signal response: the number of rate-limiting steps in signaling cascades. This knowledge can help identify more effective therapies that destroy all pathogenic or cancer cells.
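A small simulation illustrates why the number of rate-limiting steps shapes the variability of a signaling delay. It is a sketch under an assumption that is not stated in the abstract: the total delay is modeled as a sum of identical exponential waiting times (an Erlang distribution), with the mean total delay held fixed. The function names and parameter values are hypothetical.

```python
import random
import statistics

def sample_delays(n_steps, mean_delay=10.0, n_cells=20000, seed=0):
    """Draw total delays as the sum of n_steps exponential waiting times."""
    rng = random.Random(seed)
    step_rate = n_steps / mean_delay  # keeps the mean total delay fixed
    return [sum(rng.expovariate(step_rate) for _ in range(n_steps))
            for _ in range(n_cells)]

def coefficient_of_variation(values):
    """Standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)

# Under this assumption the theoretical CV is 1 / sqrt(n_steps).
cv_by_steps = {n: coefficient_of_variation(sample_delays(n)) for n in (1, 4, 16)}
for n, cv in cv_by_steps.items():
    print(n, round(cv, 3))
```

With the mean delay fixed, more rate-limiting steps give a narrower delay distribution, so cells that differ in step number differ in the timing variability of their response.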

FAIR assessment of biosimulation models - a cross-community project
PRESENTER: Dagmar Waltemath

ABSTRACT. Background: Computational models have been developed and published for many years to study phenomena from biochemical reactions to multiorgan whole-body mechanisms across a spectrum of species. These resources are becoming increasingly relevant for clinician scientists as supporting tools for diagnosis, therapy, and scientific investigations. However, it remains an open issue to determine whether a model is of sufficient quality and can be appropriately reused - specifically in a biomedical setting, where higher standards apply than in biochemical research settings. As a consequence, the uptake of computational models in the clinic is still hindered. Specific reasons include limited availability of semantic information, lack of standardization of properties and settings indicating how the model can be used for computational simulations, insufficiently clear specifications on the model kinetics, and reduced reproducibility of the model simulation results [1,2]. In this work, we investigate how the FAIR principles (Findability, Accessibility, Interoperability, Reusability) can be applied as one pillar to estimate a model’s quality. We believe that adherence to FAIR principles supports reproducibility, semantic descriptions of model components, and accessibility of all model-related data - which are all relevant indicators for model quality.

Methods: We propose and discuss the use of the Research Data Alliance (RDA) indicators [3] for a standardized FAIR evaluation of computational models encoded in COMBINE standards [4]. Specifically, we organized two FAIR-dedicated COMBINE workshops in October 2021 and April 2022, which focused on assessing FAIRness for COMBINE archives, a standard format for storing all data necessary to reproduce a simulation experiment [5]. At the workshops, we adapted the RDA FAIR Data Maturity Model indicators based on community feedback to align them more closely with the requirements for assessing computational models. We also formalized our assessment methodology following the template provided in the IMI FAIRplus project. Ongoing work is on finalizing the FAIR model indicators; we aim to publish them as a recommendation for the COMBINE community to work with in future. We also plan to develop a semi-automatic FAIR evaluation tool to support the application of the FAIR model indicators to biocomputation models. In this talk, we will present the preliminary results of the FAIR indicators for COMBINE archives and will discuss further work including the involvement of the community.

Conclusions: With this project, a key step towards trust building and cross-discipline communication in relation to computational models is taken. We believe that FAIR can be a connecting principle across the clinical and biomedical domain as it is recognised and appreciated in both fields.

Selected references:
1. Tiwari et al. (2021). DOI:10.15252/msb.20209982
2. König et al. (2016). http://ceur-ws.org/Vol-1692/paperC.pdf [accessed 20/08/2022]
3. FAIR Data Maturity Model Working Group (2020). FAIR Data Maturity Model. DOI:10.15497/rda00050
4. Schreiber et al. (2021). DOI:10.1515/jib-2021-0026
5. Bergmann et al. (2014). DOI:10.1186/s12859-014-0369-z

FLOVELO - Pushing Boundaries of Cell Dynamics Inference with Maximum Flow Networks

ABSTRACT. Single-cell RNA-sequencing (scRNA-seq) technologies provide impressive new insights into biological samples at single-cell resolution, allowing a deep understanding of the developmental state of a cell. To infer dynamics of cellular processes, such as the cell cycle, additional temporal information is often obtained by performing time course experiments, but can also be derived from a single static scRNA-seq measurement: knowing that unspliced mRNA is eventually processed into spliced mRNA, the abundance of unspliced and spliced mRNA molecules gives insights into the future expression profile of a cell.

We present FLOVELO, a computational approach that recovers the most likely cell trajectory in two-dimensional unspliced-spliced mRNA expression space (hereafter Unspliced-Spliced Trajectory, UST) for each gene. Interpreting the distribution of cells as a probability measure, a UST appears as a one-dimensional density ridge, which FLOVELO reliably detects by solving multiple, interconnected network flow problems.

Comparable methods, such as RNA velocity implementations velocyto and scvelo, are very much limited to model assumptions such as (piecewise) constant transcription and splicing rates and are known to potentially lead to incorrect biological conclusions when data comprises transcriptional bursts, multiple kinetics, or short developmental time spans. FLOVELO, as a model-free approach, can flexibly adapt to such transcriptional contexts as it does not rely on constrained estimation of rate parameters. On the contrary, it even allows for a much more informed reverse engineering of the underlying consortium of differential equation systems explaining the reconstructed UST.

Furthermore, comparing the UST shapes of multiple genes using optimal transport theory provides a novel, intuitive way to classify genes that are assumed to underlie similar regulatory control mechanisms. In the cell cycle context, this can be directly applied to distinguish between cell cycle-regulated and cell cycle-independent genes. Finally, gene-wise USTs can be combined into a gene shared cell trajectory, summarizing global cellular developmental trends in the respective sample. We will demonstrate FLOVELO’s performance with illustrative examples.

Exact Confidence Regions for Non-Linear Models

ABSTRACT. Since the confidence regions of linearly parametrised models always constitute perfect ellipsoids around the maximum likelihood estimate, their shape can be fully encoded using a positive-definite covariance matrix. In contrast, the confidence regions of non-linearly parametrised models exhibit non-linearly distorted shapes, which strongly complicates a faithful assessment of the parameter uncertainties. Given that virtually all models obtained from theoretical considerations are non-linear with respect to their parameters, this impacts a broad range of research fields in the systems biology community and beyond.

Our approach uses a special family of vector fields which can be integrated along to efficiently obtain confidence boundaries as the respective integral manifolds of said vector fields. This turns the problem of finding exact confidence regions into numerically solving a system of ODEs. Therefore, the need to sample the likelihood over large volumes in the parameter space is eliminated, which represents a significant reduction in computational effort. Furthermore, knowledge of exact confidence regions can subsequently be used to quantify the uncertainty in the model predictions via confidence bands.

By making comprehensive parameter uncertainty analyses feasible for a wider class of problems through its improved efficiency, our method allows for more nuanced insights into the mechanisms underlying biological processes.

Transcription start site signal profiling improves transposable element RNA expression analysis at locus-level
PRESENTER: Natalia Savytska

ABSTRACT. The transcriptional activity of Transposable Elements (TEs) has been implicated in numerous pathological processes, including neurodegenerative diseases such as amyotrophic lateral sclerosis and frontotemporal lobar degeneration. TE expression analysis from short-read sequencing technologies is, however, challenging due to the multitude of similar sequences derived from single TE subfamilies and the exaptation of TEs within longer coding or non-coding RNAs. Specialised tools have been developed to quantify the expression of TEs that either rely on probabilistic re-distribution of multimapper count fractions or allow for discarding multimappers altogether. Until now, benchmarking across those tools was largely limited to aggregated expression estimates over whole TE subfamilies. Here, we compared the performance of recently published tools (SQuIRE, TElocal, SalmonTE) with simplistic quantification strategies (featureCounts in unique, fraction and random modes) at the level of individual loci. Using simulated datasets, we examined the false discovery rate and the primary driver of the false positive hits in the optimal quantification strategy. Our findings suggest a high number of false discoveries that exceeds the total number of correctly recovered active loci for all the quantification strategies, including the best performing tool, TElocal. As a remedy, filtering based on the minimum number of read counts or baseMean expression improves the F1 score and decreases the number of false positives. Finally, we demonstrate that additional profiling of Transcription Start Site mapping statistics (using a k-means clustering approach) significantly improves the performance of TElocal while reporting a reliable set of detected and differentially expressed TEs in human simulated RNA-seq data.

Combined multiple transcriptional repression mechanisms generate ultrasensitivity and oscillations
PRESENTER: Eui Min Jeong

ABSTRACT. Transcriptional repression can occur via various mechanisms, such as blocking, sequestration and displacement. Although the transcription can be completely suppressed with a single mechanism, multiple repression mechanisms are used together to inhibit transcriptional activators in many systems, such as circadian clocks and NF-κB oscillators. This raises the question of what advantages arise if seemingly redundant repression mechanisms are combined. Here, by deriving equations describing the multiple repression mechanisms, we find that their combination can synergistically generate a sharply ultrasensitive transcription response and thus strong oscillations. This rationalizes why the multiple repression mechanisms are used together in various biological oscillators. The critical role of such combined transcriptional repression for strong oscillations is further supported by our analysis of formerly identified mutations disrupting the transcriptional repression of the mammalian circadian clock. The hitherto unrecognized source of the ultrasensitivity, the combined transcriptional repressions, can lead to robust synthetic oscillators with a previously unachievable simple design.

DUNE-COPASI: Multi-Compartment Diffusion-Reaction solver for Cell Biology

ABSTRACT. The quantitative study of living cells with systems biology has been traditionally approached assuming spatially homogeneous biochemical species (e.g. using ODEs). However, recent technological progress led to the increasing availability of spatiotemporal data, and thus, caused the need to extend and create new tools that account for the spatial dimension (e.g. using PDEs). With this panorama, we aim to model the spatiotemporal distribution of biochemical species within the cell as well as its immediate surroundings. Such a setting may be characterized with a system of diffusion-reaction equations per compartment/membrane, together with a set of transmission conditions to couple them at the membrane. In this poster, we explore a generic model formulation as well as two applications. Moreover, we present dune-copasi, an open-source multi-compartment diffusion-reaction solver tailored for biological systems.

10:30-12:30 Session 15: CANCER SYSTEMS BIOLOGY

Summary: Over the last decade, a large amount of data has been collected and made publicly available in cancer research. This has enabled the development of new approaches in cancer research, ranging from predicting the functional nature of genetic alterations and assessing the effect of genetic and pharmacologic perturbations to predicting patient sensitivity to specific drugs and adaptive response in cancer cells. These methodologies represent a critical contribution to the field of precision cancer medicine and support the increasing clinical translation of computational and systems biology approaches. This session will present some of the latest developments in both basic and translational research using mathematical modeling and network-based and deep learning-based approaches for the prediction of biological mechanisms, drug responses and personalized cancer medicine.

Location: Alexander
Gene expression heterogeneity arises from signaling networks

ABSTRACT. Cells make individual fate decisions through linear and nonlinear regulation of gene networks, generating diverse outputs from a single pathway. In this session, I will present our recent work on gene expression heterogeneity mediated by signal-transcription factor kinetics in cell populations. Excessive activation of the NF-kappaB transcription factor has been reported in many types of cancers. By focusing on the NF-kappaB transcription factor in B cells, we found that the opening and closing of chromatin at the DNA regions of the putative NF-kappaB binding sites, the cooperativity in their interactions, and LLPS (liquid-liquid phase separation)-like molecular interactions in the nucleus significantly influenced the cell-to-cell heterogeneity and gene expression levels in cell populations. This study indicates that the noise in gene expression is rather strongly regulated by the DNA side, even if the intracellular signals are tightly regulated. The interaction mechanisms between transcription factors and DNA are important in understanding the signal encoding and decoding of cell fate determination processes and their application to human diseases.

Systematic elucidation and pharmacological targeting of non-oncogene dependencies at the single cell level.

ABSTRACT. We have developed network-based methodologies for the systematic identification, validation, and pharmacological targeting of a new class of therapeutic targets. These targets comprise Master Regulator proteins, whose concerted activity within a Regulatory Checkpoint module is responsible for the mechanistic implementation and maintenance of cell transcriptional state, in both transformed and non-transformed cells. By leveraging these methodologies, we have developed NY CLIA-certified tests (OncoTreat and OncoTarget) that leverage large-scale drug-perturbation assays to systematically identify drugs and drug combinations whose mechanism of action is specifically effective in abrogating tumour checkpoint activity, on an individual patient basis. These tests have shown >80% success rate in 34 drug arms in PDX models established from patients who had failed multiple lines of therapy. We will first introduce the methodological advances supporting the development of these methodologies and then demonstrate their extension to elucidating drugs capable of targeting the master regulator dependencies of transcriptionally distinct tumour subpopulations, at the single cell level. Specifically, we will discuss identification and pre-clinical validation of drug combinations targeting stem-like progenitor and differentiated cells in breast adenocarcinoma as well as Master Regulators of tumour-infiltrating T regulatory cells, thus potentiating the effect of immune checkpoint therapy.

Digital Twin models for the precision diagnosis and therapy of cancer

ABSTRACT. Approaches to personalized diagnosis and treatment in oncology are heavily reliant on computer models that use molecular and clinical features to characterize an individual patient’s disease. Most of these models use genome and/or gene expression sequences to develop classifiers of a patient’s tumor. However, in order to fully model the behavior and therapy response of a tumor, dynamic models are desirable that can act like a Digital Twin of the cancer patient allowing prognostic and predictive simulations of disease progression, therapy responses and development of resistance. We are constructing Digital Twins of cancer patients in order to perform dynamic and predictive simulations that improve patient stratification and facilitate the design of individualized therapeutic strategies. Using a hybrid approach that combines artificial intelligence / machine learning with dynamic mechanistic modelling we are developing a computational framework for generating Digital Twins. This framework can integrate different types of data (multiomics, clinical, and existing knowledge) and produces personalized computational models of a patient’s tumor. The computational models are validated and refined by experimental work and in retrospective patient studies. We present some of the results of the dynamic Digital Twins simulations in neuroblastoma. They include (i) identification of non-MYCN-amplified high-risk patients; (ii) prediction of individual patients’ responses to chemotherapy; and (iii) identification of new drug targets for personalized therapy. Digital Twin models allow the dynamic and mechanistic simulation of disease progression and therapy response. They are useful for the stratification of patients and the design of personalized therapies.

Model ensembling as a tool to form interpretable multi-omic predictors of cancer pharmacosensitivity

ABSTRACT. The determination of the optimal treatment for individual patients presenting with cancer is the major goal of personalized oncology, a rapidly-growing field of modern medicine. One important aspect is the accurate prediction of the response of cancer cells to various chemotherapies, as patients frequently fail to respond adequately to first-line therapies, or develop resistance over time. It is expected that the molecular characteristics of the neoplastic cells (genomic, transcriptomic, etc.) contain enough information to retrieve specific signatures, in turn allowing accurate predictions to be formed based solely on these multi-omic data. Ideally, these predictions should be explainable to clinicians, in order to be integrated into patient care. While a number of computational methods have been developed over the years, very few have been assessed in a clinical setting, and none has been integrated into standard cancer care. We propose a machine-learning framework based on ensemble learning to integrate multi-omic data and predict chemosensitivity to an array of commonly used drugs. We trained an array of classifiers on the different parts of our dataset to produce omic-specific features, and subsequently trained a random-forest classifier on these features to predict chemosensitivity. We used the CCLE dataset, comprising multi-omic and chemosensitivity measurements for hundreds of cell lines, to build the models, and we validated our results using a leave-one-out validation strategy. Our results show superior performance to the state-of-the-art for several drugs, belonging to different compound classes, and across the most frequent cancer types. Furthermore, the relative simplicity of our approach makes it easy to examine which features have a larger importance in the models, as well as why a particular prediction was formed for a specific sample.
As such, we identify new markers of chemosensitivity and their links to known cancer pathways and to druggable targets. Importantly, our models are flexible and can adapt to missing data, for example when the measurement of some omics type is not clinically feasible. Overall, our approach has the potential to be a useful tool in a personalized oncology setting, by helping clinicians to link the characteristics of the tumors to their sensitivity to chemotherapeutic drugs, and ultimately to the clinical response in patients.

MultiOmics Network Embedding for SubType Analysis
PRESENTER: Giovanni Scala

ABSTRACT. Biological systems are complex entities whose behavior emerges from an enormous number of reactions taking place within and among different internal molecular districts. Dissecting and modeling the entities and interactions that constitute these systems is essential to understanding the biological processes behind normal and pathological conditions, as well as the perturbations induced by exposure to external molecules such as drugs. The recent explosion of omics data fueled the creation of diverse systems biology models. The majority of these are focused on the representation of interactions taking place in single molecular districts and have been successfully used to perform sample stratification, especially in cancer. Despite their proven usefulness, these models have not yet reached the level of complexity needed to distinguish different biological conditions.

One step forward in this direction is the creation of multi-omics models capturing the dynamics taking place within and between omics layers. This latter approach needs powerful modeling strategies and is still an open research field. We propose the application of a powerful AI technique based on graph embedding for the creation of a system that, starting from multi-omics measurements, is able to model and generate knowledge about multi-omics interactions.

Here we present a novel approach implemented as an R package named MultiOmics Network Embedding for SubType Analysis (MoNETA) for the identification of relevant multi-omics relationships between biological samples. This approach has been applied in the identification of different cancer subtypes using multi-omics data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC) datasets. MoNETA will be freely available as an R package at https://github.com/BioinfoUninaScala/MoNETA.

Flow Cytometry Combined with Systems Biology Modelling Reveals Heterogeneous “NFkB Fingerprints” in DLBCL
PRESENTER: Eleanor Jayawant

ABSTRACT. NFkB signalling plays a crucial role in lymphoid malignancies and is frequently aberrantly activated by mutations and the tumour microenvironment (TME). As a ubiquitously-expressed regulator of many genes, broad targeting of NFkB has failed due to on-target toxicity. While the roles of specific NFkB molecules in lymphoma remain poorly understood, recent work has implicated non-canonical NFkB signalling in a subset of diffuse large B-cell lymphoma (DLBCL), while NFkB cRel activation is associated with a distinct disease classification. This molecular heterogeneity of DLBCL likely contributes to treatments remaining unchanged over the last decade, despite substantial advances in understanding of the mechanisms of disease.

Systems biology approaches, using computational models, have provided insight into NFkB. However, current models of NFkB signalling do not capture heterogeneity within or between DLBCL cell populations.

We developed a library of computational simulations, informed by multiparametric flow cytometry (FC) data from DLBCL cell lines, to capture the cell-to-cell and line-to-line heterogeneity in NFkB signalling in DLBCL. We aim to establish these models as tools, enabling the prediction of the effect of the TME and targeted therapeutics on NFkB.

Using a bespoke FC analysis pipeline, we have simultaneously collected intracellular FC data for four metrics of NFkB activity and abundance. We validated our pipeline using published data in multiple myeloma cells, and characterised the relative abundance of NFkB subunits in two DLBCL cell lines, SUDHL8 and U2932. We analysed two subclones found within U2932 (R1 and R2). We quantified the balance between canonical and non-canonical activity, and between cRel- and RelA-containing dimers, within each cell population with single-cell resolution. These “NFkB fingerprints” reveal substantial cell-to-cell variability and key differences in subunit activity and abundances, unexplained by current cell-of-origin classifications. SUDHL8 cells showed decreased canonical activity compared to U2932 cells. Even within a single cell line (U2932), the subclones had distinct NFkB signalling states, with increased canonical activity in R1.

These data informed ordinary differential equation models, allowing us to change parameters encoding expression rates of NFkB subunits to reflect the heterogeneity between DLBCL cell lines. Our models were able to recapitulate the “fingerprints” of each cell line, capturing the vast heterogeneity within and between cell populations, with only minor changes to expression levels consistent with epigenetic heterogeneity. We used these models to predict the impact of the TME and targeted therapeutics on NFkB signalling. These differences in NFkB signalling state result in distinct predicted response to microenvironmental stimuli and NFkB targeting therapeutics.

Ongoing work is expanding this approach to additional DLBCL cell lines and patient samples, to capture the diversity of “fingerprints” in the lab and in silico. This work lays the foundation for using models to predict which therapeutics will be most effective for individual DLBCL patients.

Discovery of Latent Drivers from Double Mutations in Pan-Cancer Data
PRESENTER: Nurcan Tuncbag

ABSTRACT. Despite massive advancements in cancer genomics, driver mutations with low frequencies and minor observable translational potential have to date escaped identification. Yet, when paired with other mutations in cis, such ‘latent driver’ mutations can drive cancer. Here, we discover potential ‘latent driver’ double mutations. We applied a statistical approach to identify significantly co-occurring mutations in the pan-cancer data of mutation profiles of ~60,000 tumor sequences from the TCGA and AACR GENIE databases. The components of same-gene doublets were assessed as potential latent drivers. Our comprehensive statistical analysis identified 194 same-gene double mutations, of which 147 individual components are cataloged as latent drivers. Evaluation of the response of cell lines and patient-derived xenograft data to drug treatment indicates that in certain genes double mutations may have a significant role in increasing oncogenic activity, and hence yield a better drug response, as in PIK3CA. A counter-example is that they can promote resistance to the drugs, as in EGFR. With time, drug resistance will arise. Taken together, our comprehensive analyses indicate that same-gene doublet mutations are exceedingly rare phenomena but are a signature for some cancer types, e.g., breast and lung cancers. They also indicate that the load of doublet mutations in tumor suppressors is significantly higher than in oncogenes, indicating their relative robustness. On the other hand, the additivity of co-occurring driver mutations in different genes (in trans) can lead to a powerful oncogenic signal, encoding aggressive proliferation. Rare co-occurring in trans combinations can serve as metastasis markers; excluded combinations may give rise to oncogene-induced senescence (OIS). We identified co-occurring double mutations on different genes that additively can promote tumorigenesis through single or multiple pathways.
We found 4352 statistically significant different gene double mutations that alter non-redundant pathways and interactions and promote cancer-specific tumorigenesis. They are mostly in primary tumors. Rare occurrences can be a signature of metastatic tumors. We identified a strong association between mutations in ESR1, GATA3, and PIK3CA genes in metastatic tumors. ESR1 mutations at positions 536, 537, and 538, which are sequence neighbors, are exclusively paired with the major drivers of PIK3CA at positions 1047, 542, and 545. Interrogation of big genomic data and integration with large-scale small-molecule sensitivity data can provide deep patterns that are rare – but can prompt dramatic phenotypic alterations and serve as clinical signatures. Mapping cancer-specific co-occurring pair signatures, in single and metastatic tumors, is vital in precision oncology.

Improving our Mechanistic Understanding of Cell Cycle Dynamics

ABSTRACT. The mammalian cell cycle is regulated by a well-studied but complex biochemical reaction system. Computational models provide a particularly systematic and systemic description of the mechanisms governing mammalian cell cycle control. They facilitate a detailed understanding of cell cycle control mechanisms and are in part also able to aggregate this knowledge into full cell cycle models that explain periodic cell cycle oscillations. This work aims at improving on these models along four dimensions: model structure, validation data, validation methodology and model reusability.

Presented is a core model structure of the full cell cycle that qualitatively explains the behaviour of unperturbed and perturbed cells. Using rule-based model descriptions, the core model was conveniently extended by a DNA damage checkpoint and a separation into nuclear and cytoplasmic compartments. To estimate the model parameters, the time courses of several cell cycle regulators were reconstructed from single cell snapshot immunofluorescence data using the reCAT algorithm. This data and the cell cycle model were then cast into the PEtab format for specifying parameter estimation problems in biochemical reaction networks. After optimising these parameters with self-adaptive cooperative enhanced scatter search, a cell cycle model that explains the validation data was obtained. The PEtab specification allows any modeler to reuse the model, the data and/or the optimisation results.

Further experimental conditions, for instance in the form of CRISPR interference, are expected to significantly improve parameter identifiability and provide a way of testing the predictive power of the model. Given the central role of the cell cycle in health and disease, such a predictive model may aid in the discovery of new therapeutic targets.

Modeling the tumor microenvironment in patient-derived organoid culture

ABSTRACT. Patient-derived organoids are a model of choice to elucidate inter- and intratumoral heterogeneity to combat therapy resistance. However, their utility is limited by heterologous and poorly-defined extracellular matrices and lack of proper tumor microenvironment, thus failing to model the tumor in its complexity.

Here, we present an approach to identify relevant paracrine interactions between stromal and tumor cells in colorectal cancer. Single cell-RNAseq data of 12 patients were analyzed for ligand-receptor pairs enabling stroma-to-tumor signaling. Physiological relevance was tested by adding stroma-derived ligands to the organoid culture, followed by mass cytometry and scRNAseq analysis. We also aimed to model extracellular matrix composition in colorectal cancer by supplementing the laminin/collagen IV rich environment with other known matrix proteins such as collagen I to identify the impact of a changing substrate on cell plasticity.

We identified paracrine factors and signals affecting proliferation, differentiation, and developmental trajectories of patient-derived organoids in vitro. We hypothesize that environmental factors may limit the phenotypic space in which organoid cells differentiate, disabling the study of more invasive behaviors in vitro. We show that extracellular matrix parameters have a strong impact on cell plasticity and highlight the importance of adjusting and expanding organoid in vitro culture models.

Our data provide guidelines to improve existing tumor organoid models and provide a feasible approach to address common limitations in organoid culture. Based on our findings, we currently identify factors that can interfere with drug efficacy and potentially favor clinically relevant therapy resistance mechanisms.

Transcriptional fluctuations govern the serum dependent cell cycle duration heterogeneities in Mammalian cells

ABSTRACT. Mammalian cells exhibit a high degree of intercellular variability in cell cycle period and phase durations. However, the factors orchestrating the cell cycle duration heterogeneities remain unclear. Herein, by combining cell cycle network-based mathematical models with live single-cell imaging studies under varied serum conditions, we demonstrate that fluctuating transcription rates of cell cycle regulatory genes across cell lineages and during cell cycle progression in mammalian cells largely govern the robust correlation patterns of cell cycle period and phase durations among sister, cousin, and mother-daughter lineage pairs. However, for the overall cellular population, alteration in serum level modulates the fluctuation and correlation patterns of cell cycle period and phase durations in a correlated manner. These heterogeneities at the population level can be fine-tuned under limited serum conditions by perturbing the cell cycle network using a p38-signalling inhibitor without affecting the robust lineage level correlations. Overall, our approach identifies transcriptional fluctuations as the key controlling factor for the cell cycle duration heterogeneities, and predicts ways to reduce cell-to-cell variabilities by perturbing the cell cycle network regulations.

Model Predictive Control of Cancer Cellular Dynamics: A New Strategy For Therapy Design
PRESENTER: Benjamin Smart

ABSTRACT. Recent advancements in Cybergenetics have led to the development of new computational and experimental platforms that enable us to robustly steer cellular dynamics by applying external feedback control. Such technologies have never been applied to regulate intracellular dynamics of cancer cells. Here, we show in silico that adaptive model predictive control (MPC) can effectively be used to steer the simulated signalling dynamics of Non-Small Cell Lung Cancer (NSCLC) cells to resemble those of wild-type cells. Our optimisation-based control algorithm enables tailoring the cost function to force the controller to alternate different drugs and/or reduce drug exposure, minimising both drug-induced toxicity and resistance to treatment. Our results pave the way for new cybergenetics experiments in cancer cells, and, longer term, can support the design of improved drug combination therapies in biomedical applications.

Harnessing cancer heterogeneity for the systematic discovery of treatable cancer-driver exons with spotter

ABSTRACT. Alternative splicing shapes the regulatory and functional diversity in the cell. Cancer cells tend to select alternative splicing programs involved in tumor progression. However, while therapies based on targeting splicing events have been developed to treat cancer and other diseases, the systematic prioritization of potential disease-driver targets still remains unaddressed. Here, by using publicly available gene-level cancer dependencies from RNAi viability screens across 713 cancer cell lines, we define 140,310 exon-level linear models using splicing profiles and mRNA levels. We then identified cancer-driver exons as the ensemble of models that best prioritized experimental cancer dependencies across individual samples, which we call spotter. The 1,073 selected models corresponded to exons that mostly disrupt their gene's ORF or create new isoforms. These exons belong to genes related to the splicing machinery and cell proliferation and show a low rate of aberrant mutations. Interestingly, our ensemble model inferred the effects of single and multiple splicing perturbations on cell proliferation. Integrating pharmacological screens with our predicted splicing-level dependencies, we uncovered cancer-driver exons that mechanistically mediate drug sensitivity and synergize with drug effects. In patients, our ensemble model can not only aid the systematic prioritization of splicing targets across 14 different types of cancer but also identify putative splicing events driving patient response upon drug treatment or pinpoint susceptible splicing events at single-patient resolution. Taken together, in silico RNA isoform screening with spotter sheds light on the weak spots of cancer samples at the splicing level and holds the potential to be implemented for personalizing treatments.

Expanding the disease network of Glioblastoma Multiforme via topological analysis
PRESENTER: Apurva Badkas

ABSTRACT. Even among cancers, Glioblastoma Multiforme (GBM) is a challenge. Classified as a grade 4 glioma, it is one of the most common forms of brain cancer, with poor prognosis and limited therapy options. Understanding the molecular players causing the underlying heterogeneity is a key step in expanding the therapeutic arsenal for GBM. Several computational methods have explored GBM; however, these are top-down approaches that are limited by the challenge of obtaining an adequate number of disease and control datasets and that require comprehensive data integration and batch correction efforts. This study presents a complementary, bottom-up network approach based on minimal inputs and two centrality measures: betweenness and eigenvector centrality. Using a publicly available protein-protein interaction (PPI) dataset, the method corrects for the degree bias commonly encountered in network analysis methodologies. It highlights several topologically important key nodes in the periphery of the known GBM genes. Of the 36 top-ranked genes, 26 have been linked to glioma/GBM in the literature. Several of these candidates are also found to be differentially expressed between other gliomas and GBM. The method thus proposes an expansion of the list of GBM-associated genes. Additionally, some of the highlighted candidates are known drug targets. Thus, establishing the role of these candidates in GBM patients can help expand the available drug repertoire for GBM.

Mutually Antagonistic Protein Pairs of Cancer
PRESENTER: Ertugrul Dalgic

ABSTRACT. Cancer could be viewed as a result of switch-like behavior of cells, which could be best understood by a systems-level view. Antagonist protein pairs with mutual inhibition have critical roles in generating bistability. The two proteins of such an antagonist pair negatively regulate each other directly or indirectly. Mutually acting antagonist proteins could show contrasting expression or activity levels in two different stable states. Unlike the extensive analysis of gene expression, the search for protein-level antagonistic pairs has been limited. Here, potential cancer type-specific antagonist protein pairs with mutual inhibition were obtained by a large-scale analysis. Two proteins underlying a bistable switch could show opposite behavior in two different cancer types. Mutually antagonistic protein pairs were identified by selecting pairs of proteins which are ON-OFF in at least one cancer type, and OFF-ON in at least one other cancer type. Some proteins were found to have a high number of antagonistic relationships with other proteins and to participate in most of the associations. The proteins with a highly antagonistic profile could not be obtained from a differential expression or a correlation-based analysis. Protein-protein and protein-DNA interactions between the antagonist proteins were also investigated. Mutually antagonistic protein pairs with direct or indirect interaction were identified. The identified proteins and their connections provide potentially novel mechanisms that could play critical and cancer type-specific roles. Integrative analysis of mutually antagonistic protein pairs contributes to our understanding of systems-level changes in cancer.
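
The ON-OFF / OFF-ON selection rule described in this abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `antagonistic_pairs`, the boolean encoding of protein states, and the input layout are all assumptions made for illustration, and how ON and OFF are called from real protein data is not specified here.

```python
from itertools import combinations


def antagonistic_pairs(states):
    """Return protein pairs that are ON-OFF in one cancer type and OFF-ON in another.

    `states` is a hypothetical input: a dict mapping each cancer type to a dict
    of protein name -> bool (True means ON, False means OFF).
    """
    # Only consider proteins measured in every cancer type.
    proteins = sorted(set.intersection(*(set(s) for s in states.values())))
    pairs = []
    for a, b in combinations(proteins, 2):
        # A single cancer type cannot be both ON-OFF and OFF-ON for the same
        # pair, so the two conditions are always met by different types.
        on_off = any(s[a] and not s[b] for s in states.values())
        off_on = any(not s[a] and s[b] for s in states.values())
        if on_off and off_on:
            pairs.append((a, b))
    return pairs
```

Called on a dictionary of three toy cancer types, the function returns only those pairs whose states flip between at least two of the types; pairs that are never ON-OFF, or never OFF-ON, are dropped.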

A disease network-based deep learning approach for characterizing melanoma

ABSTRACT. Multiple types of genomic aberrations occur in cutaneous melanoma, and some can impact the prognosis of the disease. Hence, the integration of genomics data with clinical outcomes could facilitate the identification of the most relevant genomic features for melanoma progression. We developed a systems medicine approach that integrates genomics data with a disease network and deep learning model for the prognostic classification of melanoma patients and assessed the impact of different genomic features. Specifically, the deep learning model utilizes clusters (“communities”) identified in the network to effectively reduce the dimensionality of genomics data into a patient score profile. Using this profile, we identified three disease subtypes that differ in survival time. Subsequently, we quantified and ranked the impact of genomic features on the patient score profile using a machine-learning technique. Follow-up analysis of the top-ranking features provided us with a biological interpretation at both pathway and molecular levels, such as their mutation and interactome profiles in melanoma and their involvement in signal transduction, immune response, and cell cycle pathways. Taken together, we demonstrate the power of network-based artificial intelligence to provide personalized prognostic assessment for melanoma patients. The generic nature of the approach suggests that it is applicable to other cancer types.

10:30-12:30 Session 20: PLANT SYSTEMS BIOLOGY

Summary: A rapidly changing environment challenges our understanding of plant metabolism, growth and development. It has become evident that application of systems biology approaches essentially supports the quantitative analysis of plant-environment interactions, plant resilience against diverse stressors and plant performance. This conference session will focus on methods, theories, and approaches in the field of plant systems biology comprising experimental omics analysis, flux analysis, network analysis, mathematical modelling, and bioinformatics.

Co-expression analysis of transcriptome and proteome identifies LRR-VIII-1 kinase and MAPK-kinase (MEK1) regulatory modules associated with P-deficiency adaptation and P use efficiency in maize

ABSTRACT. Maize (Zea mays) is one of the most important crops worldwide. Crop productivity is widely constrained by limited phosphorus (P). Thus, improving P use efficiency (PUE) in newly developed cultivars is one of the long-term goals of breeding programs. However, the function of genes and their regulation for adaptation to P-limitation in maize is largely unknown, and the genetic potential for improving PUE is still under discovery. Therefore, we explored molecular-level regulatory networks under low-P supply using a multi-OMICs approach. We performed transcriptomic and proteomic analyses using six maize genotypes, which have close genomic backgrounds but several contrasting phenotypic traits, including PUE. We constructed co-expression networks for proteome and transcriptome data, and identified associations between co-expression modules and 31 traits. Within these networks, we further investigated protein kinases as potential regulators, and experimentally verified potential interactions between kinases and their substrates using the split YFP system. We propose the LRR-VIII-1 kinase (Zm00001d038522) as a regulator in roots. This kinase may play a fundamental role in adaptations to LP-stress; its regulatory modules are highly associated with tissue P-concentration and root-to-shoot ratios, linking with biological processes of ROS cleavage, flavonoid and anthocyanin synthesis, cell elongation, and secondary cell wall organization. We propose MAPK-kinase (MEK1, Zm00001d043609) as a regulator of different LP-stress adaptations among different genotypes; its regulatory module is significantly associated with genotype-specific traits, namely root dry weight, root hair length, specific root length, and PUE. We show evidence that MEK1 can interact either with Sucrose synthase 1 (SH1, Zm00001d045042) or eEF1B-γ translation elongation factor 1-gamma 3 (eEF1B-γ, Zm00001d046352), suggesting it is involved in sucrose metabolism and translation elongation.
These proteins are suggested as key candidates to develop breeding targets to improve product yield with less P-fertilizer inputs and improved carbon resource allocation. More importantly, these OMICs profiles contain a wealth of information to be mined by the community and may provide clues for further research beyond the work presented.

A modular concept for modeling of plant lipid metabolism
PRESENTER: Sandra Correa

ABSTRACT. Gaining further understanding of the limiting steps of lipid accumulation in plants can provide new biotechnological leads for altering the lipid content and composition. Yet, despite the progress achieved, addressing this question remains challenging due to the recognized complexity of lipid metabolism. While plant metabolic modelling offers the means to better understand the cross-talk between pathways in lipid and central metabolism, comprehensive models of plant lipid metabolism are missing. Here, we provide a new metabolic modeling framework rooted in the concept of metabolic module. We illustrated the modular framework by reconstructing the lipid metabolic module of Arabidopsis thaliana consisting of 5956 reactions and 3108 metabolites organized in 16 compartments that captures the accumulation routes for 2402 lipid species, grouped into 25 classes. The module also considers a set of precursor molecules and essential cofactors to facilitate the integration and guarantee the functionality when plugged in a metabolic model of any size by software designed for this purpose. The framework also includes an accompanying software tool that allows the integration of lipidomics data sets and obtaining flux distributions that respect the measured lipid distributions. By investigating the lipid metabolic module in three case studies we showed that: (1) phenotypes of single and double knock-outs of genes involved in lipid metabolism are in concordance with specific growth rates predicted by the model, (2) fluxes in lipid metabolism correlate well with transcript levels of corresponding genes measured in Arabidopsis thaliana grown under extended darkness, and (3) accession-specific lipid modules pinpoint possible alternative mechanisms that allow different Arabidopsis accessions to cope with stress conditions (e.g., extended darkness). 
The modular framework provides versatile tools to establish and study cross-talk between pathways that rely on precursors from central metabolism. Future extensions will be directed at the estimation of enzymatic kinetic parameters for lipid-related enzymes, and their integration into the framework to gain a better understanding of (1) the functional advantages of the catalytic overlap exhibited by the majority of enzymes participating in the lipid network, (2) how the control of quality and quantity traits is shared among partially redundant reaction steps, and (3) design of engineering strategies to modulate lipid pathways in the context of a dynamic environment.

Predicting plasticity of rosette growth and metabolic fluxes in Arabidopsis thaliana

ABSTRACT. Phenotypic plasticity allows an organism to rapidly mitigate the effects of suboptimal growth environments. While genetic variability for phenotypic plasticity can be used to develop climate-resilient crop lines, accurate genomic prediction models for plasticity of fitness-related traits are still lacking. Here, we employed condition- and accession-specific metabolic models for 67 Arabidopsis thaliana accessions based on our recently proposed genome-scale metabolic network-based framework for genomic prediction (termed netGS), which we used to dissect and predict plasticity of rosette growth in response to different environments. We showed that specific reactions in photorespiration, linking carbon and nitrogen metabolism, as well as key pathways of central carbon metabolism exhibited substantial genetic variability for flux plasticity. We also demonstrated that genomic prediction of flux plasticity improves the predictability of fresh weight under unseen conditions by at least 83.3% in comparison to the classical models. Therefore, the combination of metabolic and statistical modeling provides a stepping stone in understanding the molecular mechanisms and improving the predictability of plasticity for fitness-related traits.

Stress Knowledge Map: A knowledge resource for modeling of plant stress responses
PRESENTER: Kristina Gruden

ABSTRACT. With pressure on global food security set to increase due to a growing human population and the increasingly apparent effect of climate change on agriculture, our understanding of the complexity of plant response to biotic and abiotic stressors is becoming ever more important. Knowledge on molecular processes occurring within the plant cell is currently scattered across various sources, and thus not easily accessible for mathematical modeling. Stress Knowledge Map (SKM) is an attempt at integrating this dispersed information into a freely available resource. It supports interactive exploration of its contents, and represents a basis for various mathematical modelling approaches. Stress Knowledge Map (SKM, available at https://skm.nib.si) is a knowledge graph resulting from the integration of dispersed published information on plant molecular responses to biotic and abiotic stressors. SKM contains two complementary resources: the Plant Stress Signalling model (PSS) and the Comprehensive Knowledge Network (CKN). PSS is a detailed conceptual model of plant stress signalling reactions, implemented as a neo4j database. The types of entities within PSS include genes and gene products, complexes, metabolites, and biotic and abiotic triggers of plant stress. PSS currently includes 1,318 entities and 498 reactions. Reactions are divided into ten reaction types (e.g. protein activation or catalysis). The majority of the contents of PSS were compiled from peer-reviewed manuscripts with targeted methodology, giving them a high degree of confidence. Furthermore, selected stress signalling associated pathways were added from KEGG. Curators review each of the entries to make PSS even more reliable. CKN is a partially directed network of physical interactions between molecular entities, encompassing experimentally confirmed protein-protein interactions, protein-DNA interactions, miRNA-transcript interactions, and metabolic conversions. 
The interactions are compiled from a multitude of curated knowledge bases (such as KEGG, STRING, and miRBase) and published high-throughput experiments (such as Y2H, ChIP-Seq, DAP-Seq). CKN currently includes 20,009 entities and 71,946 interactions, and is implemented in a simple relational structure. A number of download formats are available on the downloads page, including SBGN (a standard format enabling graphical visualisation of the model and the use of diverse systems biology modelling tools), SBML (for mechanistic ODE modelling), and BoolNet (for Boolean network modelling). Furthermore, SKM can be exported in Cytoscape (https://cytoscape.org/) and DiNAR (https://github.com/NIB-SI/DiNAR) compatible formats to give the user easy access to the network analysis features available in these tools. Initial exploration of plant stress signaling using quantitative Boolean modeling and simulations has already proven useful for understanding emerging signaling network properties.

Topological properties accurately predict cell division events and organization of Arabidopsis thaliana’s shoot apical meristem
PRESENTER: Timon W. Matz

ABSTRACT. Cell division and the resulting changes to the cell organization affect the shape and functionality of all tissues. Thus, understanding the determinants of the tissue-wide changes imposed by cell division is a key question in developmental biology. Here, we use a network representation of live cell imaging data from shoot apical meristems (SAMs) in Arabidopsis thaliana to predict cell division events and their consequences at a tissue level. We show that a support vector machine classifier based on the SAM network properties is predictive of cell division events, with test accuracy of 76%, matching that based on cell size alone. Further, we demonstrate that the combination of topological and biological properties, including: cell size, perimeter, distance, and shared cell wall between cells, can further boost the prediction accuracy of resulting changes in topology triggered by cell division. Using our classifiers, we demonstrate the importance of microtubule mediated cell-to-cell growth coordination in influencing tissue-level topology. Together, the results from our network-based analysis demonstrate a feedback mechanism between tissue topology and cell division in A. thaliana’s SAMs.

Computational analysis of cambium activity during plant radial growth
PRESENTER: Ruth Großeholz

ABSTRACT. Plant radial growth is responsible for large parts of terrestrial biomass production. It is mediated by the cambium, a stem cell niche continuously producing wood (xylem) and bast (phloem) in a strictly bidirectional manner. Here, we present a cell-based model of cambium activity in VirtualLeaf, an agent-based modeling software specific for plant tissues. By an iterative cycle of comparing in planta and in silico tissue anatomies, we identified a minimal framework around the receptor-like kinase PXY and its ligand CLE41 that is sufficient for recapitulating central tissue characteristics.

We further investigated the role of biomechanical forces in the computational model during tissue formation, as these are known to play a role experimentally. First, we expanded the base code of VirtualLeaf to allow for cell wall stability values. Using this new, custom version of VirtualLeaf, we were able to investigate the influence of biophysical properties on tissue geometry. In particular, we analyzed the impact of varying cell wall stability in the xylem at the core of the tissue and in the epidermis at the tissue boundary on the direction of cell lineages during tissue growth. Altogether, our model highlights the role of intercellular communication within the cambium and allows a dynamic view of radial growth in plants, which eludes direct access due to obstacles in live cell imaging.

Metabolome-wide analysis of plants exposed to abiotic stress conditions to identify potential metabolite inducers of cross-tolerance to biotic stressors

ABSTRACT. Plant pathogens are major threats to optimal crop yield and preservation of the harvest. Although plant immune responses are well characterised in isolation, pathogen infections frequently occur in combination with other stress conditions. Importantly, the outcome of plant-pathogen interactions under combinatorial stress is unpredictable, as the operating molecular mechanisms differ from those triggered under each individual stressor. Certain abiotic stressors ultimately confer resilience to subsequent pathogen infection, i.e. cross-tolerance; however, the network of molecular events involved remains undeciphered. Recent studies suggested that metabolites could retain “stress memory” and modulate combinatorial stress responses, but their potential role in cross-tolerance has barely been investigated.

My Marie Skłodowska-Curie project aims to identify metabolites that change after periods of abiotic stress and contribute to enhanced plant tolerance to pathogens. To this end, comprehensive time-courses for sequential stress were defined to cultivate Arabidopsis plants under nine frequent adverse environmental conditions (namely fluctuations in light intensity, humidity, water availability and temperature), followed by a recovery phase and an eventual pathogen challenge. Metabolome changes of plants were profiled on a daily basis using untargeted LC-MS and systematically analysed by means of conditional networks. Network analysis and identification of significantly altered metabolites were used to retrieve central features for coping with abiotic stressors. Complementary transcriptome analyses were also conducted. Our first results are consistent with the idea that metabolome changes in response to abiotic stressors are persistent over time, whereas transcriptome changes are reversible. Further investigation of the metabolite composition of plants after abiotic stress uncovered fluctuations in central groups of compounds. The potential relevance of such candidates for modulating immune responses is currently being assessed in planta.

Finally, metabolome profiling is also being employed to evaluate potential mechanisms of cross-tolerance in selected species of agronomic interest, i.e., melon, tomato, and rice. Conditional networks will be employed in that case to evaluate conserved and/or species-specific mechanisms of adaptation for stress memory in evolutionarily distant species.

Deep convolutional neural networks capture the relationship between genetic variation and gene expression in plants

ABSTRACT. The promoter region remains the main regulatory site for gene expression because it contains the recognition sites for transcription factors and RNA polymerase. Mutations in the promoter can therefore break the expression pathway before it even starts. However, because of the complexity of gene promoters, the “soft” sequence definition of recognition sites and their syntax context, predicting the effect of genetic variation in gene promoters is not trivial. We therefore decided to test whether the task can be addressed using convolutional neural networks, deep learning algorithms that have shown outstanding performance in the field of computer vision. These networks apply small filters to images repeatedly to capture the combined effects of neighbouring pixels, e.g. as specific objects in the image. Similarly, in genomic sequences, specific combinations of neighbouring nucleotides produce objects (motifs) which are binding sites for transcription factors. Using convolutional neural networks, we elucidated how the cis-regulatory regions affect the relative transcript abundance in tomato accessions. Our research highlighted the importance of the 5’ UTR in the regulation of gene expression. We also achieved an improvement in our prediction accuracy by enriching the encoding of our input sequences with more biological knowledge.
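
The analogy between image filters and sequence motifs can be made concrete with a small sketch of a single convolutional filter sliding over a one-hot encoded DNA sequence. It is only an illustration of the general idea, not the network used in this work: the helper names `one_hot` and `conv1d_scan`, the hand-set filter, and the example sequence are all hypothetical, and a trained network would learn its filter weights instead.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    # Any symbol outside ACGT (e.g. N) becomes an all-zero row.
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv1d_scan(encoded, kernel):
    """Slide one filter along the sequence and return a score per position."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        # Element-wise product of the window and the filter, summed up.
        score = sum(
            encoded[start + i][j] * kernel[i][j]
            for i in range(width)
            for j in range(len(BASES))
        )
        scores.append(score)
    return scores


# Hand-set filter that responds maximally to the motif "TATA".
tata_filter = one_hot("TATA")
scores = conv1d_scan(one_hot("GGTATAGG"), tata_filter)
```

The highest score marks the window where the sequence matches the filter exactly, which is the sense in which a convolutional filter acts as a motif detector.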

PlantEd - a serious game about plant growth that aims to support metabolic modeling with citizen science

ABSTRACT. Plant growth is a game of survival. To win the game, the plant has to adjust its developmental programs to the changing environment and, as a result, survive and successfully disperse seeds. This is realized by constant regulation of plant metabolism to reach specific objectives according to the developmental stage, time of day, resource availability and environmental parameters. Several studies have successfully used whole-plant metabolic models to simulate some aspects of that process, and dynamic Flux Balance Analysis (dFBA) has provided an effective computational framework. However, the ability to simulate plant growth currently remains in the hands of experts. Therefore, in PlantEd we implement dFBA as an engine for a simple real-time strategy game, linking the molecular complexity of the system with challenging game mechanics and a user-friendly interface. The game enables players to explore survival strategies of plants and collectively participate in science without deep knowledge of plant metabolism and physiology.

From Plants to Plants and Beyond: How Modeling Strategies from the Engineering Field Could Benefit Systems Biology
PRESENTER: Maria Krantz

ABSTRACT. Mathematical Modeling has become a vital part of data analysis in many fields. Two of these are systems biology/medicine and the engineering field. Despite some knowledge transfer between these fields, the potential for beneficial exchange is not being fully exploited. The systems of interest in biology and engineering have many aspects in common. Modeling is carried out in both fields with the intention to understand the system’s behavior, identify abnormal behavior and estimate the effect of interventions. Engineering uses models to predict the behavior of machines or simulate logistics networks. At their core, these systems are very similar to a cell’s metabolism and signaling network. This can be exemplified by looking at a modern production plant. Such systems are combinations of mechanical and electrical parts (machines), which process the product, and the computational parts, which control the behavior of the machines. This is very similar to the way a cell functions – enzymes (molecular machines) process metabolites (products) and the action of these enzymes is controlled by signaling pathways (computational parts in a production plant). These parallels in the systems of interest provide an ideal basis for exchange and collaboration between modelers from the respective fields. However, this exchange is, up to now, rather limited. It would therefore be useful to foster exchange between these two fields of modeling by focusing on modeling approaches, rather than on model outcomes. Furthermore, terms from the engineering field should be linked to terms in the biological field to enable a common ground between researchers from both fields. This can be achieved by presenting models from the engineering field and exemplifying similarities to biological models. 
An exchange and knowledge transfer between researchers from systems biology/medicine and engineering would be beneficial for researchers in both fields and could help advance modeling in the life sciences.

12:30-13:30 Session L5: LUNCHEON V: ICSB Conference Organizers/Community Discussion

Summary: Discussion of ICSB future and next meetings etc.

Location: Grenander I+II
12:30-13:30 Session L6: LUNCHEON VI: Workshop on gene set enrichment

Summary: Gene set enrichments (GSEs) remain an important tool to link statistical results of high-throughput data sets with biological reality. While the principle is simple, there is a substantial number of variations and special applications, such as combining GSEs with correlation analysis or multivariate modelling, or applying them to non-standard data sets (such as OLINK, Nanostring, ATAC-Seq or ChIP-Seq as well as metabolic profiling). There is also a substantial number of algorithms and packages for GSEs. These are not redundant, but rather suitable for different applications. In this tutorial, I will show simple yet effective strategies for gene set enrichment analysis in several different scenarios, including metabolomic profiling, multivariate analyses, single-cell RNA-Seq and small gene universes, and discuss the strengths and weaknesses of different GSE packages along with many tips and tricks. To take full advantage of the workshop, a very basic command of the R programming language is recommended.
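
As a rough illustration of the simple principle behind gene set enrichment, the sketch below computes a one-sided hypergeometric (over-representation) p-value. It is not taken from the workshop material, which uses R and may rely on other enrichment statistics; the function name and its parameters are assumptions made for this example, which is written in Python using only the standard library.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, hits_size, overlap):
    """One-sided p-value P(X >= overlap) under the hypergeometric null.

    universe_size: number of genes that could have been detected
    set_size:      number of universe genes belonging to the gene set
    hits_size:     number of significant genes drawn from the universe
    overlap:       number of significant genes that are in the gene set
    """
    max_overlap = min(set_size, hits_size)
    total = comb(universe_size, hits_size)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total
```

The explicit `universe_size` argument reflects why small gene universes need care: the same overlap can be unremarkable against a targeted panel of a few hundred genes yet highly significant against a whole transcriptome.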

13:30-15:30 Session 14: WHOLE CELL MODELING

Summary: The challenge of whole-cell modeling is one of the most important in computational biology. In metabolism, there has been a series of genome-scale modeling studies, which we aim to include in the session alongside topics such as signaling and regulation.

Can we trust genome-wide metabolic models?

ABSTRACT. The latest genome-wide reconstructions of human metabolism allow the flux of metabolites through the whole cell and tissue to be simulated. The most typical approach consists in identifying a steady-state metabolic flux distribution that optimizes a given objective, such as maximal production of biomass, subject to constraints. An alternative and powerful approach is based on exploring the region of feasible steady-state flux distributions by means of random sampling. I will illustrate our main achievements in using either approach to contextualize variations in bulk or single-cell transcriptomics and/or proteomics data between different cells, tissues, or experimental conditions. We have observed a significant agreement between the predicted variations in reaction fluxes and the observed variations in substrate abundances, as well as between the predicted growth rate of single cells and their cell cycle phase. However, when the aim is to accurately describe the individual metabolic flux distribution of each cell/tissue and its redox balance, rather than the differences/similarities among them, the level of uncertainty rises. I will discuss the main barriers to attaining this goal and our ongoing efforts to overcome them.

The limit on dry mass density is an organizing principle for cellular physiology
PRESENTER: Martin Lercher

ABSTRACT. The dry mass density of E. coli cells has been found to remain almost constant across growth conditions, a phenomenon that may arise from a trade-off between positive and negative effects of molecular crowding on cellular efficiency. Accordingly, cellular dry mass is a limiting resource that should be used parsimoniously. We found that a corresponding optimization principle predicts a general, quantitative relation between the concentrations of enzymes and their substrates. Strikingly, this relationship also provides accurate quantitative predictions for the growth rate-dependence of the concentrations of the most abundant cellular catalysts, the ribosome and the enzyme MetE. The corresponding organizing principle also predicts the apparent offset when extrapolating the concentrations of biosynthetic enzymes to zero growth. These offsets are a central feature of empirical bacterial growth laws, which provide an important basis for understanding the regulation of cellular resource allocation and for constructing predictive models.

How can we unleash the potentials of whole-cell modelling?

ABSTRACT. Whole-cell modelling has been receiving growing attention since the publication of the Mycoplasma genitalium model (Karr et al., 2012). Since then, another whole-cell model for Escherichia coli was published (Macklin et al., 2020), followed by a spatial model of the JCVI-syn3A minimal cell (Thornburg et al., 2022). In my previous work in the Karr Lab, I developed a prototype whole-cell model of human embryonic stem cells using a computational pipeline developed by the lab. Despite this progress, challenges remain in building, simulating and validating whole-cell models. Insufficient mechanistic understanding and lack of kinetic data aside, simulating such models is computationally expensive. These challenges make whole-cell models seem far away from their potential applications to knowledge discovery and solving real-world problems. I believe a flexible hybrid approach to building and simulating whole-cell models could accelerate the usability of these models. I will present what I believe to be some fundamental features of such an approach.

Analyzing Optimal and Non-Optimal Resource Allocations in Whole-Cell Models
PRESENTER: Diana Széliová

ABSTRACT. Next-generation genome-scale metabolic models allow studying the reallocation of cellular resources upon changing environmental conditions. They allow modeling metabolic flux distributions and predicting expression profiles of the catalyzing proteome. Consequently, the biomass composition can no longer be assumed constant but needs to be computed to account for the variable resource allocation. Although computational methods for identifying optimal solutions are available, an unbiased characterization of all feasible solutions has so far been missing. Here we introduce elementary growth modes (EGMs) to comprehensively analyze whole-cell models.

EGMs generalize the concept of minimal functional units -- known from elementary flux mode analysis -- to resource balance models. Thus, they provide an understanding of all possible flux distributions and all possible biomass compositions.

First, we demonstrate the power of an EGM analysis by analyzing biomass variations upon nitrogen limitations across multiple growth rates and find that the accumulation of lipid and/or starch upon nitrogen starvation is a feature of balanced growth and not (necessarily) a result of active regulation.

Next, we asked if the experimentally observed ribosome composition of E. coli (2/3 rRNA + 1/3 protein) can be understood as an (evolutionary) resource allocation problem. However, this is only possible if cellular transcription is constrained too, an observation that strongly questions currently established theories.

In summary, EGMs provide unprecedented comprehensive insights into resource allocation in next-generation genome-scale models.

The E. coli Whole-Cell Modeling Project: Vision and Progress

ABSTRACT. Francis Crick first called for a coordinated worldwide scientific effort to determine a “complete solution” of the bacterium Escherichia coli. We have been working for some years now to complete an E. coli model that takes into account all of the known functions of every well-annotated gene, in order to better understand and predict the behavior of this scientifically-relevant and industrially-significant model organism. I will discuss our ongoing efforts to improve this model, most recently with new modeling added to better describe growth rate control, transcription unit architecture and tRNA metabolism. I will then highlight our newest “whole-colony” models, a multi-scale modeling effort in which every individual within a simulated colony is running the latest version of the whole-cell model, to calculate population-level emergent behaviors based on molecular interactions and events as the colony responds to the sudden introduction of antibiotics.

Building a synthetic cell: search for the minimal requirements for robust, sustainable, and tunable cell cycles

ABSTRACT. Cells control their sizes and cytoplasm properties to ensure proper functioning. The density of cytoplasm, where most cellular reactions occur, is highly variable across various physiological and pathological states of cells undergoing growth, division, differentiation, apoptosis, senescence, etc. The challenge in understanding how cytoplasm affects cellular functions lies in modulating its density in live cells. Here, by combining microfluidic experiments and modeling to perturb a frog egg cytoplasm in vitro, we found that cell cycles, a ubiquitous cellular process, maintain stable functioning across a wide range of relative cytoplasmic density (RCD), from 0.2X up to 1.46X. Cell cycles arrested in a concentrated cytoplasm (>=1.46X RCD) can recover upon dilution, but only once the density falls well below the natural density (to 0.79X RCD), suggesting the system remembers its history. This phenomenon, called hysteresis, is also common in physics, chemistry, and engineering.

We developed a mathematical model that reproduces most experimental observations by assuming that cyclin synthesis and degradation rates and all molecule concentrations depend on cytoplasmic density. Interestingly, Neurohr et al. (Cell 2019) find that in oversized cells (density reduction from ~1.10 to ~1.07), transcription and translation machinery become limiting and do not scale with cell size, suggesting that our assumption of decreased synthesis rate with cytoplasmic density might be appropriate. Our model suggested that a subcritical Hopf bifurcation causes differential thresholds for switching between the oscillating and arrested states, producing the observed hysteresis. The model also predicted that the Cdk1/Wee1/Cdc25 positive feedback does not contribute to the robustness, which was confirmed by experiments applying inhibitors.
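To illustrate how a subcritical Hopf bifurcation can give different thresholds for switching oscillations on and off, the following minimal sketch integrates the amplitude equation of the textbook normal form, dr/dt = mu*r + r^3 - r^5. This is not the authors' cell-cycle model: the control parameter `mu` is a generic stand-in for a density-dependent parameter, and the step size, sweep grid, and perturbation floor are assumptions made for illustration.

```python
def settle(r, mu, dt=0.01, steps=20000):
    """Integrate dr/dt = mu*r + r**3 - r**5 with forward Euler and
    return the oscillation amplitude reached after steps*dt time units."""
    for _ in range(steps):
        r += dt * (mu * r + r**3 - r**5)
    return r

def sweep(mus, r0, floor=1e-3):
    """Step the control parameter through mus, carrying the state along.

    The floor mimics small fluctuations so the system can leave r = 0
    once the resting state loses stability.
    """
    r, amplitudes = r0, []
    for mu in mus:
        r = settle(max(r, floor), mu)
        amplitudes.append(r)
    return amplitudes

mus = [i / 20 for i in range(-10, 5)]        # -0.5, -0.45, ..., 0.2
up = sweep(mus, r0=0.0)                      # slowly increase mu
down = sweep(mus[::-1], r0=up[-1])[::-1]     # then decrease it again

# Parameter values at which the system oscillates in each sweep direction.
oscillating_up = [mu for mu, r in zip(mus, up) if r > 0.5]
oscillating_down = [mu for mu, r in zip(mus, down) if r > 0.5]
print(min(oscillating_up), min(oscillating_down))
```

In this normal form the resting state loses stability at mu = 0, while an established oscillation persists down to mu = -1/4, so the two sweeps switch at different parameter values. That gap is the hysteresis window.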

Studies also connected cytoplasmic density with cell homeostasis. Fission yeast with higher cytoplasmic density tends to undergo supergrowth at a higher rate to achieve proteome homeostasis (Knapp, Cell Syst 2019). In human cells, cytoplasmic dilution of the cell cycle inhibitor Rb through the G1 growth phase triggers cell division, providing a mechanism to promote cell size homeostasis (Zatulovskiy, Science 2020). Interestingly, while we could tune the cell-cycle period to cyclin variations, we found it robust to density changes of the whole cytoplasm that contains cyclin. We hypothesized that instead of the absolute concentrations of each component, ratios between components matter for the oscillator robustness. This divergent response to changes in the overall concentration (where the cell cycle is resilient) versus specific signaling activity (where the cell cycle is sensitive) may help decouple possible cell homeostasis control mechanisms from cell cycle tuning capability.

FROG Analysis - a community standard to foster reproducibility and curation of constraint-based models

ABSTRACT. Community standards for consistent reconstruction, FAIR sharing and curation of constraint-based models such as genome-scale metabolic models (GEMs) are crucial to ensure their reproducibility and reliability. We initiated a community effort for a standardised assessment of reproducibility and curation of constraint-based models. Following the discussions at dedicated breakout sessions at HARMONY and COMBINE meetings over the past two years, we have developed the FROG analysis, a set of standardized analyses of constraint-based models that tests the reproducibility of numerical simulations. FROG analysis encompasses Flux variability analysis (FVA), Reaction deletion analysis, Objective function calculation, and Gene deletion analysis. We have also developed a collection of tools, built on popular constraint-based software and a web service, that generate FROG reports in a standardized schema to enable a reliable assessment of model reproducibility. FROG analysis is currently used in BioModels’ workflow to curate and build a collection of FAIR and reproducible genome-scale metabolic models.

A low-granularity model integrating metabolism, growth and cell cycle: towards multi-level whole cell modeling of budding yeast
PRESENTER: Marco Vanoni

ABSTRACT. In living cells complex cellular functions arise from the non-linear interactions of a large number of molecules whose understanding requires mathematical modeling and simulation. A need exists to integrate different models to develop whole-cell models. The first attempt carried out in Mycoplasma genitalium, a simple prokaryotic parasite with a small genome, divides cell functions into modules, each modeled for short enough periods of time to assume module independence. The application of the same bottom-up approach to more complex cells, such as the budding yeast Saccharomyces cerevisiae, is not straightforward because of a compartmentalized cellular organization, a ten-fold larger genome, sophisticated nutritionally modulated sensing and differentiation pathways, and an asymmetric cell division that results in population heterogeneity in terms of size, age and cellular content of individual cells. Nevertheless some preliminary attempts have been published [2,3]. Here we present a top-down approach. First, we expand at the population level a model interconnecting metabolism, cell mass growth, and cell cycle regulation into one coarse-grained dynamic model [4]. The model couples the type of energy metabolism (fermentation vs. respiration), accumulation of RNA and proteins, which together account for ca. 70% of total biomass in yeast (http://book.bionumbers.org/what-is-the-macromolecular-composition-of-the-cell/) with cell cycle and division in cell populations. The integrated, multi-level model quantitatively accounts for temporal and growth parameters of wild-type cells grown at different glucose concentrations or in ethanol. Model analysis shows the type of metabolism (more specifically, the ratio between glucose fermentation and respiration) is the primary factor that – through regulation of macromolecular biosynthesis - drives cell size setting during S. cerevisiae balanced exponential growth under different growth conditions. 
The model also describes alterations in population structure arising in different metabolism, growth, and cell cycle mutants, pinpointing the primary cellular functions affected in each mutant. The model predicts that nutrient-dependent cell size modulation requires a switch from respiration to fermentation. We validate this prediction by analysis of the growth properties of non-fermenting mutants. The model's modularity allows a low-granularity module to be replaced with a finer-grained one whenever molecular details are required to correctly reproduce specific experiments. As a proof of principle, we show that the simple timer describing the G1/S transition may be replaced by a previously published molecular module that focuses the transition mechanism on the multi-site phosphorylation of the Whi5 transcriptional inhibitor [5]. Plugging in other modules, such as a medium-granularity description of ribosome assembly, is underway. References: [1] JR Karr, et al., Cell 150:389–401, 2012. [2] C Ye, et al., Biotechnology and Bioengineering 117:1562–1574, 2020. [3] U Münzner, et al., Nat. Commun. 10:1308, 2019. [4] P Palumbo, Italian Workshop on Artificial Life and Evolutionary Computation, 165–180, Springer, 2017. [5] Palumbo, et al., Nat. Commun. 7, ncomms11372, 2016.

Multi-scale modeling of B-cell fate decisions required for antibody generation

ABSTRACT. During immune responses, B-cells generate an antibody repertoire, which is characterized by its depth and breadth. Classical immunology describes the mechanism of repertoire generation as a Darwinian process of positive selection based solely on the genetically encoded B-cell receptor affinity for antigen. However, B-cells differ substantially in their epigenetic propensity to undergo cell fate decisions such as survival vs. cell death, growth and division, or differentiation into plasma cells. Furthermore, this epigenetic propensity is remarkably stable and heritable from one generation to the next. To study the characteristics and sources of epigenetic heterogeneity and heritability, we developed a mathematical model of four distinct but interconnected regulatory networks to account for the B-cell fate decisions required for antibody generation. The differential equations-based model describes how dynamic signaling by receptors to transcription factors controls survival vs. death, growth and division, and differentiation. With the model we identify molecular sources of phenotypic heterogeneity, quantify epigenetic heritability, and characterize how regulatory dynamics control cell fate decisions. (MSB 11, pp. 783-96; PNAS 115, E2888-E2897; Immunity 50, pp. 616–628)

13:30-15:30 Session 16: MULTISCALE MODELS

Summary: Systems-biology principles emerge across many orders of magnitude in length and time. This session will highlight leading research that tackles multiscale questions in biology through the integration of models and quantitative experiments. Topics will include the coupling of fast and slow processes, the extrapolation of molecular networks–modules to broader populations of cells and organisms, and the fusion of single-cell mechanisms with tissue-level phenomena.

Location: Grenander I+II
Circadian and cell cycle control of terminal cell differentiation

ABSTRACT. Most mammalian cells have an intrinsic circadian clock that coordinates metabolic activity with the daily rest and wake cycle. The circadian clock is known to regulate cell differentiation, but how continuous daily oscillations of the internal clock can control a much longer, multi-day differentiation process is not known. Here we simultaneously monitor circadian clock and adipocyte differentiation progression live in single cells. Strikingly, we find a bursting behavior in the cell population whereby individual preadipocytes commit to differentiate primarily during a 12-hour window each day corresponding to the time of rest. Daily gating occurs because cells irreversibly commit to differentiate within only a few hours, which is much faster than the rest phase and the overall multi-day differentiation process. The daily bursts in differentiation commitment are driven by a variable and slow increase in expression of PPARG, the master regulator of adipogenesis, overlaid with circadian boosts in PPARG expression driven by fast, clock-driven PPARG regulators such as CEBPA. Our combined computational modeling and experiments support that this fast positive feedback regulation of PPARG drives a brief step increase in PPARG so that some cells can reach the threshold to irreversibly commit to differentiate in the evening phase. Our findings of consecutive daily bursts in cell differentiation at the population level are broadly relevant given that most differentiating somatic cells are regulated by the circadian clock. Having a restricted time each day when differentiation occurs may open therapeutic strategies to use timed treatment relative to the clock to promote tissue regeneration.

Individualized modules and models of enterovirus infection in host cells

ABSTRACT. Enteroviruses are small, fast-acting RNA viruses that are highly prevalent across the globe. There are 7+ decades of research on enteroviruses and the diseases they cause: gastroenteritis, sinusitis, herpangina, myocarditis, infantile paralysis, and others. Despite this wealth of information, there has not been a credible effort to synthesize enterovirus knowledge comprehensively in silico. Recognizing this opportunity, we built and validated a detailed kinetic model for the cardiotoxic enterovirus, coxsackievirus B3 (CVB3; Cell Syst 12:304-23 [2021]). The model initializes with CVB3 docking on cell-surface receptors and encodes—with 90+% of parameters fixed by the literature or our own experiments—the rate processes that culminate in the formation of mature infectious virions. The complete kinetic model is modular, stoichiometrically balanced for all categories of enteroviral proteins, and layered with host-pathogen feedbacks that normally confound experiments. In my talk, I will discuss model applications and extensions, which move down in scale to the stochastic events of individual enterovirions and up in scale to the infection susceptibility of human populations.

Predicting pathogenicity of TPM1 variants using multi-scale models

ABSTRACT. Hypertrophic cardiomyopathy (HCM) is a common inherited heart disorder that carries elevated risk for sudden cardiac death and heart failure. These outcomes can be mitigated through early detection of HCM inheritance in affected individuals. Although genetic screening of HCM families yields meaningful diagnostic information in some cases, many families have private genetic variants whose clinical significance has not yet been established. These variants of unknown significance (VUS) pose a significant challenge for families and their physicians, because they cannot be used as a basis for diagnosis and treatment. Our long-term goal is to develop a multiscale modeling pipeline that can be used for reliable in silico determination of pathogenicity for HCM gene variants. We selected the gene TPM1 as an initial focus for pipeline development. TPM1 encodes cardiac tropomyosin, a coiled-coil molecule that associates with actin filaments and is integral to the regulation of cardiac muscle contraction. The relatively simple and extensively studied molecular structure of tropomyosin constitutes a feasible but meaningful opportunity to scale the effects of genetic variants from the molecular level to physiologically relevant alterations in function. Our process first involves molecular dynamics simulations of tropomyosin. Amino acid substitutions defined by TPM1 variants taken from clinical databases are introduced into the tropomyosin structure, and the resulting alterations in molecular mechanics are determined. Tropomyosin mutations are also introduced into an atomistic model of tropomyosin on actin to determine changes in interaction energies of regulatory conformations in this complex. Next, the molecular-scale mutation-dependent properties are used as inputs to a Markov model of cardiac sarcomere function. This model predicts force production by cardiac myofilaments by tracking the calcium-dependent activity of actin regulatory proteins (including tropomyosin).
In this way, it is possible to predict the impact of TPM1 VUS on contractile behavior of cardiac muscle cells. To validate these predictions, we have developed human engineered heart tissue specimens that can be manipulated to express TPM1 VUS. We have recently used this approach to study two TPM1 variants, M8R and S215L. These studies have allowed us to construct unbroken chains of biophysical cause-and-effect, beginning with amino acid substitutions and ending with pathological changes in contractile behavior. These predictions have good agreement with data obtained in engineered heart tissues expressing the same TPM1 variants. We conclude that predicting physiological consequences of TPM1 variants using multiscale methods is feasible and has the potential to provide clinically important information to clinicians treating HCM families.

Multimodal perception links cellular state to decision making in single cells
PRESENTER: Bernhard Kramer

ABSTRACT. Contextual decision making by individual cells in a collective is a hallmark of multicellular systems. To achieve context-aware behavior, individual cells must integrate the input they receive from growth factors with the complex information about their physicochemical state and their microenvironment (cellular state). Cells perceive this through intracellular signaling networks, but individual signaling nodes are thought to have low information processing capacity, reflected in their heterogeneous responses to growth factor stimulation.

We instead hypothesized that heterogeneous responses of individual signaling nodes do not reflect low information processing capacity but reflect adaptive, cellular state-dependent information processing. When considered holistically, as a multimodal percept, activation across the whole network could provide single cells with sufficient information to enable accurate contextual decision making.

To test this, we utilized EGF stimulation, combined with multiplexed (40-plex), imaging-based quantification to generate comprehensive single-cell data mapping the signaling responses across the EGFR pathway and cellular states in millions of individual cells. We find that signaling nodes in networks indeed display deterministic and adaptive cellular state-dependent information processing, which leads to heterogeneous growth factor responses and enables nodes to capture non-redundant information about the cellular state. Collectively, as a multimodal percept, activation across the network reflects the diversity of cellular states in cell populations. We further find that this multimodal percept accurately reflects varying input concentrations of EGF. Lastly, we find that multimodal perception links the cellular state to the heterogeneous decision of single cells to re-enter the cell cycle or stay quiescent after exposure to EGF. References: Bernhard A. Kramer, Jacobo S. Del Castillo, Lucas Pelkmans. Multimodal perception links cellular state to decision making in single cells. Science. July 14, 2022. DOI: 10.1126/science.abf4062

Understanding Stem Cell Growth Dynamics in Human Cerebral Organoids
PRESENTER: Simon Haendeler

ABSTRACT. During neurogenesis stem cells self-organize and balance symmetric and asymmetric divisions to generate the necessary number and types of cells. To get insight into the stem cell growth dynamics of human neurogenesis we measured the offspring distribution of all stem cells in cerebral organoids with a genetic lineage tracing approach. The resulting offspring distribution is approximately power-law distributed, indicating a self-organized process. To understand which dynamics can generate power-laws, we modelled the cerebral organoid as a stochastic process of symmetrically, asymmetrically and non-dividing cells (S-cells, A-cells and N-cells). Each lineage starts with one S-cell, which can expand and eventually change its division behavior to become an A-cell, which asymmetrically produces N-cells.

We estimated the growth rates from the lineage tracing data and found that the maintenance of a small but actively dividing S-cell population is necessary to explain the emergent power-law. The strong heterogeneity in the offspring distribution is explained by a neutral drift dynamic, where S-cells concentrate into fewer lineages over time. We hypothesized that the S-cell population size is mediated by a capacity mechanism: S-cells expand symmetrically and, when capacity is reached, symmetric divisions trigger a random S-cell to change its identity to an A-cell. To validate our hypothesis, we grew chimeric organoids of puromycin-resistant and puromycin-susceptible populations (puromycin is a chemical agent that induces cell death). Our model predicts that, after cell death is induced, lineages with S-cells replenish quickly and contribute disproportionately to the final organoid.
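The capacity mechanism described above can be sketched as a small stochastic simulation. This is a toy version under stated assumptions, not the authors' fitted model: S- and A-cells are assumed to divide at equal rates, events are drawn one at a time, and the lineage count, capacity, and event number are hypothetical.

```python
import random

def simulate(n_lineages=100, capacity=200, n_events=20000, seed=0):
    """Toy lineage model with S-cells (symmetric), A-cells (asymmetric)
    and N-cells (non-dividing). Returns per-lineage counts (s, a, n)."""
    rng = random.Random(seed)
    lineages = range(n_lineages)
    s = [1] * n_lineages            # every lineage starts with one S-cell
    a = [0] * n_lineages
    n = [0] * n_lineages
    total_s, total_a = n_lineages, 0
    for _ in range(n_events):
        # Pick the dividing cell uniformly among all S- and A-cells.
        if rng.random() < total_s / (total_s + total_a):
            i = rng.choices(lineages, weights=s)[0]
            s[i] += 1               # symmetric division: S -> 2 S
            total_s += 1
            if total_s > capacity:  # capacity reached: a random S-cell
                j = rng.choices(lineages, weights=s)[0]
                s[j] -= 1           # ...changes identity to an A-cell
                a[j] += 1
                total_s -= 1
                total_a += 1
        else:
            i = rng.choices(lineages, weights=a)[0]
            n[i] += 1               # asymmetric division: A -> A + N
    return s, a, n

s, a, n = simulate()
sizes = [si + ai + ni for si, ai, ni in zip(s, a, n)]
lineages_with_s = sum(1 for si in s if si > 0)
print(sorted(sizes, reverse=True)[:5], lineages_with_s)
```

Because the converting S-cell is chosen at random, the fixed-size S-cell pool drifts neutrally, and over time it ends up in fewer lineages, which then produce most of the offspring.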

Surprisingly, we found that even if 95% of the organoid is killed, the final size of the organoid is nearly recovered. To quantify the replenishment performed by each lineage, we used an optimal-transport-based approach to match each resistant lineage in a puromycin-treated organoid to an equivalent resistant lineage in an untreated organoid and calculated the cell replenishment normalized to the lineage size. We found that bigger lineages replenish disproportionately more than smaller lineages, which is in line with our model prediction that bigger lineages have a higher fraction of S-cells. In summary, we found that our model can explain the power-law in the offspring distribution of cerebral organoids, which depends on a maintained symmetrically dividing cell population.

Computational modeling for the elucidation of cell fate decisions in early development

ABSTRACT. Signalling pathways and gene regulatory processes are critical for cell fate decisions. While in some cases quantitative readouts of the networks encode the information, in other cases the dynamics of network components are important. Here we use quantitative and qualitative computational modelling integrated in experimental-theoretical approaches to investigate cell fate decisions in early mouse development. First, we focus on cell fate decisions involved in the development of the hepato-pancreato-biliary organ system. Quantitative mathematical modelling based on detailed experimental data clearly describes the sequence of lineage decisions in the organ system but also indicates the existence of a multipotent progenitor subpopulation in the pancreato-biliary system that can contribute cells not only to the pancreas and gallbladder but also to the liver. This indicates a sustained plasticity within the development of the organ system that was also shown experimentally. In a second case we study cell fate decisions in muscle stem cells. Here oscillatory dynamics play an important role. Muscle stem cells can differentiate or self-renew, and it has been shown that this decision depends on the dynamics of a Notch-related signalling network. We developed models for the underlying network, analysed the dynamics in wild-type cells, and studied the changes in dynamics for various knockout situations.

Willnow et al. (2021), Quantitative lineage analysis identifies a hepato-pancreato-biliary progenitor niche, Nature 597 (7874): 87-91. Zhang et al. (2021), Oscillations of Delta-like1 regulate the balance between differentiation and maintenance of muscle stem cells, Nat. Commun. 12 (1), 1318.

Physiologically based pharmacokinetic (PBPK) models for metabolic phenotyping and evaluation of liver function
PRESENTER: Matthias König

ABSTRACT. Physiologically-based pharmacokinetic models (PBPK) are digital twins of human physiology which allow the study of individual drug metabolism in silico. PBPK models can simulate the absorption, distribution, metabolization and elimination of substances and when coupled to pharmacodynamics models also the effect of the substance on the body.

Here we present our work on PBPK models for metabolic phenotyping and evaluation of liver function [1-5]. For model development and validation we established PK-DB, the first open pharmacokinetics database, with data from over 600 curated clinical studies [1]. The database makes it possible to integrate data for given test substances and to study the effect of lifestyle factors or co-administration, such as smoking or oral contraceptives [2]. Based on the curated data, individualized and stratified computational models have been established and applied to various clinical questions [3-5]. A model of indocyanine green allowed us to simulate individual outcomes after hepatectomy by building patient-specific models from data such as anthropometric information (e.g. sex, weight, age) and pre-existing liver disease [3]. Combining the data with information on study cohorts allowed us to simulate virtual populations [4]. A model of dextromethorphan coupled to drug-gene interactions, which accounts for changes in CYP2D6 enzyme kinetics depending on the activity score (AS), allowed us, in combination with the AS for individual polymorphisms, to study the effect of CYP2D6 gene variants [5]. The model was applied to investigate the genotype-phenotype association and the role of CYP2D6 polymorphisms for metabolic phenotyping using the urinary cumulative metabolic ratio (UCMR). The model is capable of estimating the UCMR dispersion within and across populations depending on activity scores. The model can be applied for individual prediction of UCMR and metabolic phenotype based on CYP2D6 genotype. A model of omeprazole allowed us to study the effect of omeprazole therapy on stomach pH, a model of simvastatin the LDL-C lowering effect of statins, and a model of pravastatin the effect of genetic variants and hepato-renal impairment on pravastatin pharmacokinetics. All models and data are available for reuse, with models encoded in SBML.

We end the talk with an outlook on how such PBPK models can be applied to predict liver function in liver cancer and how coupling of such models to AI could help in clinical decision support.

[1] https://doi.org/10.1093/nar/gkaa990 [2] https://doi.org/10.3389/fphar.2021.752826 [3] https://doi.org/10.3389/fphys.2021.730418 [4] https://doi.org/10.3389/fphys.2021.757293 [5] https://doi.org/10.1101/2022.08.23.504981

A Systems Biology Approach To Study The Spatiotemporal Dynamics Of Senescent Cells In Wound Healing And Tissue Repair

ABSTRACT. Cellular senescence is thought to drive age-related pathology through the senescence-associated secretory phenotype (SASP). However, it also plays important physiological roles such as cancer suppression, embryogenesis and wound healing. Wound healing is a tightly regulated process which, when disrupted, results in conditions such as fibrosis and chronic wounds. Senescent cells appear during the proliferation phase of the healing process, where the SASP is involved in maintaining tissue homeostasis after damage. Interestingly, SASP composition and functionality were recently found to be temporally regulated, with distinct SASP profiles involved: a fibrogenic, followed by a fibrolytic SASP, which could have important implications for the role of senescent cells in wound healing. Although senescence plays an important role in physiological wound healing, it has also been implicated in the progression of fibrotic and chronic wound disorders. Given the number of factors at play, a full understanding requires addressing the multiple levels of complexity: first the various cell behaviours individually, and then the interactions and influence these elements have on each other and on the system as a whole. Here, a systems biology approach was adopted whereby a multi-scale model of wound healing that includes the dynamics of senescent cell behaviour and corresponding SASP composition within the wound microenvironment was developed. The model was built using the software CompuCell3D, which is based on a Cellular Potts modelling framework. We used an existing body of data on healthy wound healing to calibrate the model, and validation was done on known disease conditions. The model provides understanding of the spatiotemporal dynamics of different senescent cell phenotypes and the roles they play within the wound healing process.
The model also shows how an overall disruption of tissue-level coordination due to age-related changes results in different disease states including fibrosis and chronic wounds. Further specific data to increase model confidence could be used to explore senolytic treatments in wound disorders.

Modeling the adaptive immune response of a lymph node as Petri net
PRESENTER: Sonja Scharf

ABSTRACT. Background: The lymph node is responsible for important tasks of the adaptive immune response. The lymph node consists of different compartments, such as B zone and T zone. The compartments include various cell types, for example macrophages, dendritic cells, antigen-presenting cells, B and T cells. The cells are motile and communicate with each other. Whereas certain cell types, e.g., B cells, can move from compartment to compartment, the movement of other cells is restricted to specific regions; for example, follicular dendritic cells are located in the germinal centers. Cellular interactions trigger differentiation or movement of cells to another region. Based on the current knowledge, we created a model to better understand and predict immunological reactions.

Methods: For model development and analysis, we applied the Petri net (PN) formalism, using the software tool MonaLisa. We analyzed the invariants of the model for network verification. The place invariants of the PN model validated the conservation of cells, and we applied transition invariants to explore the network dynamics. In our PN model, the production of B cells, the influx of antigen, the interaction of cells with antigens, differentiation of B cells and proliferation of differentiated B cells, release of antibodies and degradation of antigens represented the immune response. We simulated the adaptive immune response in an asynchronous way to account for non-deterministic behavior.

Results: The PN model describes movement and interaction of different cell types in and between compartments of the lymph node such as the subcapsular sinus, T zone, germinal center, and medulla. We included the interactions of the lymph node with the human body by modeling the two compartments blood and tissue without internal structure. The PN comprises 65 transitions (interactions and movement processes) and 49 places (cells, antigens, and antibodies). Four place invariants reflect the conservation of T cells, macrophages, antigen-presenting cells, and dendritic cells. The PN is covered by 25 transition invariants, each of which describes immunological reactions and movement of the cells through the lymph node. To analyze the PN, we started the simulation with an influx of antigens. When we stopped the antigen influx, antibody production was still active. We observed a much faster immune response for a second influx of antigens. The PN model was able to describe the dynamic behavior of cells inside the lymph node during an immune response in a semi-quantitative way. The model could adapt to known antigens. With the model, we can therefore predict that new antigens will lead to the production of specific antibodies and memory B cells. Our model demonstrates a functional adaptive immune response.
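The two analysis steps named in the abstract, checking place invariants and simulating asynchronously, can be illustrated on a deliberately tiny Petri net. The six transitions and their place names below are hypothetical and far simpler than the 65-transition model described above. The code is plain Python and does not use MonaLisa.

```python
import random

# Each transition maps to (tokens consumed, tokens produced) per place.
# Hypothetical toy net, not the authors' lymph node model.
TRANSITIONS = {
    "t_enter":   ({"T_blood": 1}, {"T_zone": 1}),
    "t_leave":   ({"T_zone": 1}, {"T_blood": 1}),
    "ag_influx": ({}, {"antigen": 1}),
    "activate":  ({"B_cell": 1, "antigen": 1, "T_zone": 1},
                  {"plasma": 1, "T_zone": 1}),
    "secrete":   ({"plasma": 1}, {"plasma": 1, "antibody": 1}),
    "clear":     ({"antibody": 1, "antigen": 1}, {}),
}

def is_place_invariant(weights):
    """True if the weighted token sum is unchanged by every transition."""
    return all(
        sum(w * (post.get(p, 0) - pre.get(p, 0)) for p, w in weights.items()) == 0
        for pre, post in TRANSITIONS.values()
    )

def enabled(marking):
    """Names of transitions whose input places hold enough tokens."""
    return [name for name, (pre, _) in TRANSITIONS.items()
            if all(marking.get(p, 0) >= k for p, k in pre.items())]

def simulate(marking, steps, seed=0):
    """Asynchronous simulation: fire one randomly chosen enabled
    transition per step, so the firing order is non-deterministic."""
    rng = random.Random(seed)
    m = dict(marking)
    for _ in range(steps):
        choices = enabled(m)
        if not choices:
            break
        pre, post = TRANSITIONS[rng.choice(choices)]
        for p, k in pre.items():
            m[p] -= k
        for p, k in post.items():
            m[p] = m.get(p, 0) + k
    return m

initial = {"T_blood": 5, "T_zone": 0, "B_cell": 10,
           "antigen": 0, "plasma": 0, "antibody": 0}
final = simulate(initial, steps=500)
print(is_place_invariant({"T_blood": 1, "T_zone": 1}), final)
```

In this toy net the T-cell count is conserved across compartments, so `{"T_blood": 1, "T_zone": 1}` is a place invariant, whereas antigen is not conserved because of the influx transition.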

Digital twins and hybrid modelling for simulation of physiological variables and stroke risk
PRESENTER: Tilda Herrgårdh

ABSTRACT. Stroke is one of the most common causes of death in our society. The underlying aetiology leading to a stroke event is complex and develops over several years, often without symptoms. To be able to predict a stroke is therefore as desirable as it is difficult. The disease mechanisms act on different levels, comprising many physiological and environmental factors, and involving multiple organs, timescales, and control mechanisms. Therefore, a multiscale and multilevel approach is needed to fully understand and predict disease progression. One such approach is digital twins. A digital twin is a personalized computer model of a patient. So far, digital twins have been constructed using either mechanistic models, which can simulate the trajectory of physiological and biochemical processes in a person, or machine learning models, which can for example be used to estimate the risk of having a stroke given a cross-sectional profile at a given timepoint. These two modelling approaches have complementary strengths which can be combined into a hybrid model. However, even though hybrid modelling combining mechanistic modelling and machine learning has been proposed, there are few, if any, real examples of hybrid digital twins available. We now present such a hybrid model for the simulation of ischemic stroke. On the mechanistic side, we combine a model for blood pressure with a multi-level (intracellular biochemistry to whole-body) and multi-timescale (seconds to years) model for the development of type 2 diabetes. This mechanistic model can simulate the evolution of known physiological risk factors (such as weight, diabetes, and blood pressure) through time, and under different intervention scenarios (change in diet, exercise, and certain medications). These forecast trajectories of the physiological risk factors are then used by a machine learning model to calculate the 5-year risk of stroke.
The stroke risk can also be calculated for each timepoint in the simulated scenarios. The hybrid model is now ready to be tested in clinical usage, in the preventative health care meetings in Sweden, where we hope to increase doctor-patient communication, facilitate shared decision-making, and improve adherence to prescribed medications.

Multiscale Modeling of Dyadic Structure-Function Relation in Ventricular Cardiac Myocytes
PRESENTER: Wilhelm Neubert

ABSTRACT. Cardiovascular disease is often related to defects of subcellular components in cardiac myocytes, specifically in the dyadic cleft, which include changes in cleft geometry and channel placement. Modeling of these pathological changes requires both a spatially resolved description of the cleft and a description at the whole-cell level. We use a multiscale model to create dyadic structure-function relationships to explore the impact of molecular changes on whole-cell electrophysiology and calcium cycling. This multiscale model incorporates stochastic simulation of individual L-type calcium channels and ryanodine receptor channels, spatially detailed concentration dynamics in dyadic clefts, rabbit membrane potential dynamics, and a system of partial differential equations for myoplasmic and lumenal free Ca2+ and Ca2+-binding molecules in the bulk of the cell. We found action potential duration and systolic and diastolic [Ca2+] to respond most sensitively to changes in L-type calcium channel current. The ryanodine receptor channel cluster structure inside dyadic clefts was found to affect all biomarkers investigated. The shape of clusters observed in experiments by Jayasinghe et al. and channel density within the cluster (characterized by mean occupancy) showed the strongest correlation to the effects on biomarkers.

Multilevel approach characterizing the progression of fatty liver disease
PRESENTER: Ina Biermayer

ABSTRACT. The incidence of non-alcoholic fatty liver disease (NAFLD), characterized by the accumulation of liver fat, is increasing worldwide. Since the disease can advance to liver cancer, it is important to resolve the temporal order of events and identify protein patterns for early detection of disease progression. To this aim, we employed a systems medicine approach linking the dynamics of structural changes at the organ and tissue level with molecular alterations in the proteome. As a preclinical model, we studied the development of NAFLD in mice fed with a high glucose and fat (“Western”) diet for up to 26 weeks. MicroCT and automated quantification of tissue imaging of steatosis revealed early accumulation of liver fat and a continuous increase of lipid droplets. To characterize underlying molecular alterations, we determined changes in the global proteome of time-resolved liver tissue and blood plasma samples by mass spectrometry. For the identification of changes indicative of disease progression, we complemented current proteome data analysis approaches and developed a method that exploits the information encoded in the occurrence of missing values and transforms it into detection probabilities. The correlation of the dynamics of liver fat accumulation and steatosis development with the relevant proteome alterations present in liver and plasma resulted in the identification of more than 120 proteins as potential indicators for the progression of NAFLD. Based on their correlation coefficients, top-ranking circulating markers are currently being analyzed in NAFLD patients. Thus, our multilevel approach establishes a novel strategy to identify potential early indicators of NAFLD progression.
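The abstract does not spell out how missing values are turned into detection probabilities. One simple way to picture the idea, offered here purely as an assumption, is to estimate for each protein and time point the smoothed fraction of replicates in which the protein was quantified. The function name, the pseudocount, and the replicate values below are all hypothetical.

```python
def detection_probability(intensities, pseudocount=1.0):
    """Smoothed fraction of replicates in which a protein was detected.

    `intensities` holds one value per replicate; None marks a missing value.
    The pseudocount (Laplace smoothing) is an illustrative choice, not a
    detail taken from the abstract.
    """
    detected = sum(v is not None for v in intensities)
    return (detected + pseudocount) / (len(intensities) + 2 * pseudocount)


# Hypothetical log-intensities of one protein at three time points (weeks).
samples = {
    0: [None, None, None, 11.2],
    12: [10.9, None, 11.4, 11.0],
    26: [12.1, 12.3, 11.8, 12.0],
}
profile = {week: detection_probability(vals) for week, vals in samples.items()}
print(profile)
```

A protein that is mostly missing early and consistently detected late then yields a rising probability profile, which can be correlated with liver fat accumulation in the same way as a measured abundance.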

An interconnected multi-level mechanistic model of the human brain

ABSTRACT. 1. Introduction In the pursuit of gaining a more comprehensive understanding of the brain, we aim to expand and integrate a set of existing and newly developed mechanistic models that describe different aspects of the neuronal and hemodynamic functions of the brain. The goal is to have an interconnected multi-level, multi-scale model that can explain mechanisms on different levels of cerebral physiology. The model starts at the level of ion channel kinetics, where neuronal homeostasis can be explored, and zooms out to large intraneuronal signalling networks, including descriptions of how such signalling activity affects i) the metabolic control and ii) the hemodynamic control of cerebral tissue, allowing changes in local vessels to connect to a global whole-body vessel tree. 2. Materials and Methods The interconnected model of the brain is constructed using ordinary differential equations (ODEs) and incorporates these equations into large-scale neuronal network modelling structures (NEURON and NetPyNE) [1, 2]. The interconnected model utilizes both qualitative and quantitative information from a wide variety of experimental measurements, such as measurement of action potentials (AP), magnetic resonance spectroscopy (MRS), functional magnetic resonance imaging (fMRI), as well as electrophysiological measurements, both on an ion channel level and a cell population level in the form of local field potential (LFP), multi-unit activity (MUA) and electroencephalography (EEG) measurements. 3. Results The interconnected model can currently offer a detailed mechanistic description of the neurovascular coupling [3], with connections to metabolic responses [4]; the connection to neuronal network activity is in development. Further, a versatile ion channel structure and a mechanistic interpretation of neuron facilitation are being integrated. The existing models can describe experimental data as well as independent validation data not used for model training.
The model's fit to data is further validated by statistical hypothesis testing.

4. Discussion and Conclusions By integrating these aspects, we aim to achieve a model that can offer a detailed intracellular description that also reflects the physiological structure of the human brain. The model framework could be used to study and predict different diseases and physiological alterations, such as whether facilitation of neurons can cause epilepsy, how Alzheimer’s disease affects the signalling patterns between neuron populations, and how stroke affects the cerebral tissue. Such an interconnected model would also allow for qualitative information to be gained from multi-species measurements. 5. References [1] Carnevale, N. T., & Hines, M. L. (2006). The NEURON book. Cambridge University Press. [2] Dura-Bernal et al. (2019). NetPyNE, a tool for data-driven multiscale modeling of brain circuits. eLife 2019;8:e44494. [3] Sten S (2021). A multi-data based quantitative model for the neurovascular coupling in the brain. bioRxiv. [4] Sundqvist N (2022). Mechanistic model for human brain metabolism and the neurovascular coupling. bioRxiv.


Summary: In this session we will be covering the challenges associated with delivering real impact in the clinic using systems biology. What are some of the open major challenges currently facing clinicians that could benefit from systems biology? We will host talks that cover recent examples of how systems approaches can be powerfully integrated into clinical research and problem solving flows. 

Location: Alexander
Therapy-related senescence signatures in lymphoma

ABSTRACT. Aggressive B-cell lymphomas, especially diffuse large B-cell lymphoma (DLBCL), come with a high medical need – about one third of the patients cannot be cured by the first-line (1L) standard immune chemotherapy Rituximab-CHOP (R-CHOP) and have, despite novel immune oncology (IO) treatment approaches, a dismal prognosis. Numerous large “all comer” or only cell-of-origin-preselected “R-CHOP + X”-type phase III trials failed, mostly due to the extensive molecular heterogeneity of the disease. Worldwide research efforts aim at designating individual lymphomas to more refined genomic subtypes or clusters, and at dissecting the immunological microenvironment as distinct “ecotypes”. Retrospective re-analyses of negative phase III studies unveiled hitherto unknown links between specific subtypes or ecotypes and the B-cell receptor/NF-κB pathway-targeting Bruton’s Tyrosine kinase (BTK) inhibitor ibrutinib (I) or the proteasome-blocking agent bortezomib (B), which now require prospective confirmation and are under multi-omics investigation in our just fully recruited investigator-initiated “ImbruVeRCHOP” DLBCL 1L “R-CHOP + I + B” trial (PI: CAS). Based on numerous lymphoma biopsies taken prior to, acutely under and in the course of therapy, we seek to characterize therapy-evoked biological-immunological state switches of curable patients as compared to those relapsing from their minimal residual disease (MRD) as the sole potential source of an imminent cancer relapse. Cellular senescence as a therapy-induced stress response with a massive NF-κB-driven pro-inflammatory secretion and epigenetic stem-like reprogramming reflects such a critical state switch, especially if these tumor cells occasionally manage to resume proliferation.
Systems biology-based bioinformatics in mouse lymphoma models and human DLBCL samples led to the identification of various cross-species senescence signatures that provide novel functional insights, prognostic information and conceptually novel therapeutic target principles.

The genotype-phenotype nexus: uncovering causal mechanisms in neurodegenerative disease with experimental and theoretical approaches

ABSTRACT. Currently, one of the biggest challenges in disease research is to understand the underlying molecular causes of genetic diseases and to predict the clinical manifestations of pathogenic genetic variants. This is true for complex diseases with multiple disease-associated genetic variants like cancer but also for monogenic diseases such as Huntington’s disease (HD) where a single causative mutation in one specific gene drives the disease process. We apply quantitative multi-OMICs techniques, disease-relevant experimental model systems and theoretical network modelling approaches to gain insights into the molecular mechanisms of neurodegenerative diseases (NDs) such as Alzheimer’s disease (AD). We have previously generated interactome maps for a large number of ND-associated proteins and computationally predicted subnetworks enriched with aggregation-prone proteins for AD, suggesting that abnormal protein aggregation in NDs is a more widespread phenomenon than commonly appreciated. Also, we have recently generated multiple transgenic Drosophila lines containing different genetic variants of the human HTT gene to model HD in fly brains. Strikingly, we observed a strong correlation between the expression of different pathogenic HTT protein variants in neurons and the lifespans of transgenic fly strains, indicating that the model recapitulates key aspects of human HD and enables the investigation of genotype-phenotype relationships. Differential gene expression analysis of RNAseq data from different HD fly strains revealed “gene clusters” and pathways that are significantly altered in HD brains but not in age-matched controls. Finally, we applied a machine learning approach to identify lifespan-predicting genes using RNAseq data and lifespan measurements as input data. Our computational studies revealed multiple lifespan-predicting genes that finally were experimentally validated. 
From these results, we aim to deduce general molecular mechanisms elucidating the genotype-phenotype interface in HD.

Precision machine-learning identifies a new paradigm for therapeutic discovery in Huntington’s disease: remodeling stress response to re-instate neuronal health and resilience.
PRESENTER: Lucile Megret

ABSTRACT. Loss of cellular homeostasis has been implicated in the etiology of several neurodegenerative diseases (ND). However, the molecular mechanisms that underlie this loss remain poorly understood on a systems level in each case, limiting therapeutic discovery in NDs. Recently, genomic screening technologies have been used to interrogate how specific neurons in the brain of living mice may use hundreds of genes to respond to ND insults. However, extracting systems-level information from these complex datasets is challenging. To understand the dynamics of cell resilience-over-senescence in HD, we developed Geomic, a novel computational approach that is part of our BioGemix platform for precision machine-learning and that is based on the application of shape-analysis concepts to in-depth analysis of complex omics data. We used Geomic to integrate dimensional RNA-seq data (across CAG repeat lengths and age points, in specific cell types) and in vivo neuron survival data (shRNA screen) obtained in the striatum of HD model knock-in mice, mapping the temporal dynamics of homeostatic compensatory and pathogenic causative responses to mutant huntingtin in four striatal cell types. The resulting model shows that most pathogenic responses are mitigated and most homeostatic responses are decreased over time, revealing that neuronal death in HD may be primarily driven by the loss of homeostatic responses, and not by the aggravation of pathogenic responses. Moreover, different cell types may lose similar homeostatic processes. HD relevance is validated by human stem cell, genome-wide association study, and post-mortem brain data. These findings highlight the importance of considering stress resilience dynamics in several brain cell types affected by neurodegenerative conditions, particularly in the early phases of ND processes.
These findings provide a new paradigm for therapeutic discovery in HD, based on remodeling stress response to re-instate neuronal health and resilience in the early phases of the HD process. These findings also provide a database of future targets to probe (see https://elifesciences.org/articles/64984) in view of using molecular reprogramming as an alternative approach to silencing ND triggers such as huntingtin.

Multiple-omics based metabolic modelling reveals effects of drug-induced liver toxicity

ABSTRACT. Adverse drug events are a major burden in drug development and clinical care. Common pre-clinical assays are based on artificial in vitro exposure times and concentrations which typically do not represent in vivo conditions. The inability to translate previous findings to reliable predictions of human toxicity risks emphasizes the need for physiologically relevant in vitro models to investigate the mechanisms of drug-induced toxicity. In the HeCaTos study, we demonstrate the use of primary human liver micro-tissues to model liver toxicity by exposing the primary human hepatocytes to physiologically relevant drug concentration-time profiles. The pharmacokinetic profiles were designed to mimic a therapeutic exposure according to the drug label or a toxic exposure corresponding to concentration-viability IC20 values after 14 days of incubation. The physiologically relevant assay was conducted for ten well-known hepatotoxic drugs, and time-resolved alterations were studied over a 14-day time course. Mass-spectrometry-based proteomics data combined with mRNA sequencing-based transcriptomics analyses were applied to characterize the cellular alterations after therapeutic and toxic drug exposure and to decipher mechanisms underlying drug-specific toxicities. Overall, we found that the accumulated exposure to drugs over time was driving the cellular response rather than the application of either a toxic or therapeutic dose itself. We will illustrate this based on a toxicity timeline for initiation of apoptosis for each drug. In addition, functional enrichment analysis revealed that metabolic processes related to central carbon and nitrogen metabolism were amongst the top altered processes. Hence, we reconstructed context-specific genome-scale metabolic models using iMAT combined with probabilistic simulations integrating transcriptomics, proteomics, substrate availability, and cell viability measurements to unravel the underlying mechanisms.
In our presentation, we will discuss key findings. The study illustrates a generic framework for further investigations to elucidate the impact of drug concentration-time profiles on organ-specific cellular biochemistry and mechanisms of drug-induced liver toxicity in the future.

The role of the kynurenine metabolism in chronic graft-versus-host-disease: Insights from computational modeling and patient data
PRESENTER: Thomas Stiehl

ABSTRACT. Chronic graft-versus-host disease (cGVHD) is a severe complication after allogeneic haematopoietic cell transplantation (allo-HCT). In cGVHD, grafted immune cells are activated by the host’s tissues, which they recognize as non-self. This results in a chronic activation of the immune system. In a significant number of patients, the chronic immune activation triggers fibrosis. This leads to occasionally life-long morbidity and increased mortality in patients who were cured of their haematological malignancies.

The development of fibrosis is poorly understood and current therapeutic strategies succeed in only a few patients. Kynurenine and its metabolites contribute to fibrosis, inflammation and immune modulation. Kynurenine is a metabolite of tryptophan and is further degraded into anthranilic acid, kynurenic acid, 3-hydroxykynurenine and 3-hydroxyanthranilic acid. We have developed an ordinary differential equation model of the kynurenine metabolism. Combining the model with chromatography-tandem mass spectrometry measurements of kynurenine metabolites in serum helps to quantify how the metabolic fluxes in the kynurenine pathway change during the course of cGVHD and how these fluxes differ across clinically defined subtypes of cGVHD [1].
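To make the idea of an ordinary differential equation model of a branching pathway concrete, the following is a minimal sketch with first-order kinetics, integrated by forward Euler. It is not the published model from [1]: the network topology is simplified (kynurenine feeds anthranilic acid, kynurenic acid and 3-hydroxykynurenine, and 3-hydroxykynurenine feeds 3-hydroxyanthranilic acid), and every rate constant is a hypothetical value chosen for illustration.

```python
def simulate_kynurenine(k, v_in=1.0, dt=0.01, t_end=200.0):
    """Forward-Euler integration of a linear toy pathway to near steady state.

    k holds hypothetical first-order rate constants:
      kyn_aa, kyn_ka, kyn_hk : kynurenine -> AA / KA / 3-HK
      hk_haa                 : 3-HK -> 3-HAA
      clear                  : clearance of AA, KA and 3-HAA
    v_in is a constant kynurenine production rate from tryptophan.
    """
    kyn = aa = ka = hk = haa = 0.0
    for _ in range(int(t_end / dt)):
        d_kyn = v_in - (k["kyn_aa"] + k["kyn_ka"] + k["kyn_hk"]) * kyn
        d_aa = k["kyn_aa"] * kyn - k["clear"] * aa
        d_ka = k["kyn_ka"] * kyn - k["clear"] * ka
        d_hk = k["kyn_hk"] * kyn - k["hk_haa"] * hk
        d_haa = k["hk_haa"] * hk - k["clear"] * haa
        kyn += dt * d_kyn
        aa += dt * d_aa
        ka += dt * d_ka
        hk += dt * d_hk
        haa += dt * d_haa
    return {"kyn": kyn, "aa": aa, "ka": ka, "hk": hk, "haa": haa}


baseline_k = {"kyn_aa": 0.2, "kyn_ka": 0.2, "kyn_hk": 0.6,
              "hk_haa": 1.0, "clear": 1.0}
# Illustrative perturbation only: partial inhibition of the 3-HK branch.
inhibited_k = dict(baseline_k, kyn_hk=0.1)

baseline = simulate_kynurenine(baseline_k)
inhibited = simulate_kynurenine(inhibited_k)
print(baseline)
print(inhibited)
```

In this toy setting, inhibiting one branch raises the kynurenine pool and redirects flux into the remaining branches, so anthranilic and kynurenic acid increase while the downstream products of the inhibited branch fall. Which enzymatic steps actually change in patients is a question for the fitted model, not for this sketch.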

The model-guided data analysis suggests that fibrosing cGVHD is associated with a shift of the kynurenine metabolism towards anthranilic acid and kynurenic acid [1]. Systematic analysis of the model and dynamic simulations help to understand which enzymatic steps in kynurenine metabolism have to be inhibited to reproduce the metabolic patterns observed in patients. Some of the observed changes correlated with serum levels of immune mediators such as IL18, CXCL9 or cofactors (Vitamin B6). Other changes may require regulations of enzymatic activities.

The proposed model is the theoretical basis for the use of kynurenine metabolites as biomarkers for the risk of fibrosing cGVHD. At the same time, it generates new hypotheses about the fine-tuning of the human kynurenine metabolism that can experimentally be tested.

[1] Orsatti L, Stiehl T, Dischinger K, Speziale R, Di Pasquale P, Monteagudo E, Müller-Tidow C, Radujkovic A, Dreger P, Luft T. Kynurenine pathway activation and deviation to anthranilic and kynurenic acid in fibrosing chronic graft-versus-host disease. Cell Rep Med. 2021 Oct 19;2(10):100409. doi: 10.1016/j.xcrm.2021.100409. PMID: 34755129; PMCID: PMC8561165.

Personalized Treatment of Anemia in Lung Carcinoma

ABSTRACT. With more than 2 million new cases per year, lung carcinoma (LC) is the cancer with the highest incidence. Prevalence of associated anemia ranges from 50% to 90% in the latest stage. The two main therapeutic options to manage chemotherapy-induced anemia are blood transfusions and/or erythropoiesis-stimulating agents (ESAs). Adverse events have been reported for both options, and the recommendations and guidelines limit their use to the minimum quantity, which in many cases is insufficient and compromises the palliative treatment in the latest stage of the disease. The clinical decision is based on the benefit-to-risk ratio, which is challenging to assess due to the heterogeneity of the patients, the lack of prognostic markers and the dynamics of comorbidities. A dynamic pathway model, which describes the interactions of ESAs with the erythropoietin receptor at the cellular scale, was coupled to ordinary differential equations, and the cellular scale was linked to the body scale by calibrating the model with pharmacokinetic and pharmacodynamic data of individual patients, generating a mechanism-based multiscale model. Individualized data from 253 healthy subjects and 795 LC patients were used for the calibration and retrospective validation of the model. This model stratifies patients based on two estimated patient-specific parameters utilizing time-course data of Hb, CRP values and scheduled chemotherapy. The two patient-specific parameters reflect the anemic status of the patient and the capability to respond to anti-anemic treatment. The model is capable of proposing optimized personalized interventions for anemia management employing a minimal effective dose of ESAs and/or blood transfusions, leading to safer hemodynamic profiles and eventually a reduction in the risk of adverse events and mortality.

Transomic network analysis of glucose metabolism and its dysfunction associated with obesity

ABSTRACT. Blood glucose levels are homeostatically regulated by inter-organ metabolic cycles. Glucose metabolism in each organ is regulated by a large transomics network including metabolome, proteome and transcriptome. Obesity changes metabolic control in each organ and impairs systemic glycemic control, resulting in hyperglycemia and type 2 diabetes. However, the transomic network of glucose metabolism homeostasis and its dysfunction associated with obesity has not been fully elucidated. We established a transomics network of glucose metabolism regulation in the liver by oral glucose administration to healthy wild-type (WT) mice and leptin-deficient obese (ob/ob) mice, a mouse model of obesity (Kokaji et al., Sci. Signal. 2020). Comparison of this network between WT and ob/ob mice revealed that glucose metabolism in the liver of WT and ob/ob mice is largely different, with few common regulations between the two. Glucose metabolism in WT mice was rapidly regulated by the Akt and Erk pathways and by the metabolites themselves, mainly through allosteric regulation on a time scale of ten minutes. On the other hand, in ob/ob mice, most rapid regulation by glucose-responsive metabolites was absent; instead, glucose administration produced slow changes in the expression of carbohydrate, lipid, and amino acid metabolic enzyme-encoding genes to alter metabolic reactions on a time scale of hours. We next examined the transomic network of inter-organ metabolic cycles involving the liver and skeletal muscle via blood in WT and ob/ob mice (Egami et al., iScience, 2021). We identified metabolites, proteins, and transcripts with differential abundance and differential regulation in ob/ob mice.
By constructing and evaluating the trans-omic network controlling the differences in metabolic reactions between fasted WT and ob/ob mice, we provided potential mechanisms of the obesity-associated dysfunctions of metabolic cycles between liver and skeletal muscle involving glucose-alanine, glucose-lactate, and ketone bodies. These results show obesity-associated systemic pathological mechanisms of dysfunction of inter-organ metabolic cycles.

The Mixed Meal Model; quantifying metabolic resilience in overweight and obesity
PRESENTER: Shauna O'Donovan

ABSTRACT. 1. Introduction While once thought of as simply the absence of disease, our concept of health is increasingly being redefined as the ability of a person to respond and adapt to physical, emotional, or social challenges, termed resilience [1]. Challenge tests, such as oral glucose tolerance tests or mixed meal challenge tests, are regularly employed in nutritional research to assess metabolic resilience. The post-meal trajectories of plasma glucose, insulin, and triglycerides can provide insights into disturbances in the metabolic crosstalk between the liver, skeletal muscle, and adipose tissue seen in overweight and obesity that give rise to the development of dyslipidemia, insulin resistance, and ultimately a loss of glycaemic control. However, methods to effectively process and quantify physiologically relevant features of metabolic health from this dynamic multivariate data are still lacking, and valuable information may be lost [2]. In this study we generate a novel mechanistic model to quantify features of metabolic resilience from meal challenge test data.

2. Approach We construct the Mixed Meal Model, a physiology-based computational model of postprandial glucose and lipid metabolism which can be personalised using post-meal time series of plasma glucose, insulin, triglyceride, and free fatty acid concentrations. A population of 342 personalised Mixed Meal Models was generated using data from three independent dietary intervention studies (caloric restriction, improved macronutrient quality, or a combination of both), identifying postprandial metabolic signatures of insulin resistance and elevated liver fat.

3. Results The Mixed Meal Model could capture the diverse individual responses to the standardised meals included in this study. Moreover, personalised parameter estimates quantified features of metabolic health from the meal response data with the model parameter describing lipid metabolism (k11) producing a strong correlation (ρ = -0.76, p < 0.05) with hepatic fat accumulation measured with magnetic resonance spectroscopy (MRS). The Mixed Meal Model derived estimation of insulin sensitivity (k5) produced a correlation of ρ = 0.65 (p < 0.05) with hyperinsulinemic euglycemic clamp, the gold standard measure of insulin resistance, and the model predicted measure of beta-cell functionality (k6) has a correlation of ρ = 0.61 (p < 0.05) with the insulinogenic index. In addition, for some individuals the personalised Mixed Meal Models could infer a reduction in hepatic fat content following a period of caloric restriction using meal responses alone, confirmed using MRS measure of liver fat.
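The correlations reported above compare a personalised model parameter with an independent clinical measurement across individuals. As a minimal sketch, and assuming that ρ denotes Spearman's rank correlation (the abstract does not name the statistic), the computation can be written with the standard library alone. The parameter values and liver fat percentages below are invented for illustration and are not study data.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            result[idx] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


# Hypothetical personalised k11 estimates and MRS liver fat (%) for 5 people.
k11 = [0.9, 0.7, 0.5, 0.4, 0.2]
liver_fat = [2.0, 3.5, 3.0, 6.0, 8.0]
rho = spearman_rho(k11, liver_fat)
print(round(rho, 3))
```

A rank-based statistic is a natural choice here because it only requires the parameter to order individuals consistently with the reference measurement, not to scale linearly with it.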

4. Conclusion The Mixed Meal Model provides an objective and sensitive assessment of metabolic resilience for individuals with overweight and obesity.

References 1. Huber M, Knottnerus JA, Green L, van der Horst H et al. How should we define health? BMJ. (2011) 343:d4163. 2. Vis DJ, Westerhuis JA, Jacobs DM, van Duynhoven JPM et al. Analyzing metabolomics-based challenge tests. Metabolomics. (2015) 11(1):50-63.

Aspergillus fumigatus pan-genome analysis identifies genetic variants associated with human infection
PRESENTER: Tongta Sae-Ong

ABSTRACT. Aspergillus fumigatus is an environmentally ubiquitous human fungal pathogen. Despite more than 300,000 cases of invasive disease globally each year, a comprehensive survey of the genomic diversity present, including the relationship between clinical and environmental isolates and how this genetic diversity contributes to virulence and antifungal drug resistance, has been lacking. In this study, we define the pangenome of A. fumigatus using a collection of 300 environmental and clinical genomes from a global distribution, 188 of which were sequenced in this study. We found a total of 10,907 orthologous groups, of which 7,563 (69%) are core groups, while 3,344 groups show presence/absence variation, representing 16-22% of each isolate’s genome. Using this large genomic dataset of both environmental and clinical samples, we found a genetic cluster that was enriched for clinical isolates. Their genomes contain more accessory genes, including more transmembrane transporters, proteins with iron-binding activity, and genes involved in both carbohydrate and amino acid metabolism. Finally, we leverage the power of genome-wide association to identify genomic variation associated with clinical isolates and triazole resistance as well as characterize genetic variation in known virulence factors. This characterization of the genomic diversity of A. fumigatus allows us to move away from a single reference genome that does not necessarily represent the species as a whole and better understand its pathogenic versatility, ultimately leading to better management of these infections.
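The split into core and accessory orthologous groups can be sketched from a presence/absence table. The example below assumes that "core" means present in every isolate, which the abstract does not state explicitly; the isolate and group names are hypothetical. The last lines only check that the counts reported in the abstract are mutually consistent.

```python
def classify_orthogroups(presence, isolates):
    """Split orthogroups into core (found in all isolates) and accessory.

    `presence` maps an orthogroup name to the set of isolates carrying it.
    Using 100% presence as the core threshold is an assumption.
    """
    all_isolates = set(isolates)
    core = {og for og, found in presence.items() if all_isolates <= set(found)}
    accessory = set(presence) - core
    return core, accessory


# Hypothetical toy data: three isolates, four orthogroups.
isolates = ["iso1", "iso2", "iso3"]
presence = {
    "OG0001": {"iso1", "iso2", "iso3"},
    "OG0002": {"iso1", "iso2", "iso3"},
    "OG0003": {"iso1", "iso3"},
    "OG0004": {"iso2"},
}
core, accessory = classify_orthogroups(presence, isolates)
print(sorted(core), sorted(accessory))

# Consistency check of the numbers given in the abstract.
reported_total, reported_core, reported_variable = 10907, 7563, 3344
core_percent = round(100 * reported_core / reported_total)
print(reported_core + reported_variable == reported_total, core_percent)
```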

Machine learning based pathway deregulation analysis of metabolomics data for Parkinson's Disease

ABSTRACT. Parkinson’s Disease (PD) is a complex and heterogeneous disorder, influenced by both genetic and environmental factors. Accurate diagnosis of PD is still a challenge and even after the onset of clinical motor symptoms, misdiagnoses can still occur. Machine learning analysis of blood plasma metabolomics data may provide a means to identify molecular signatures associated with PD diagnostic status.

Here, the goal was to build machine learning models for motor-stage PD vs. control classification which are both robust and biologically interpretable. We investigated global cellular pathway alterations of metabolomics data through multiple aggregation statistics and dimension reduction methods, which summarize the abundance information from pathways’ metabolite members into global fingerprints of pathway activity. These pathway activity fingerprints were then cross-validated and tested on hold-out data as predictors for classification of PD patients and controls using machine learning methods, while accounting for common confounders. We compared the resulting models’ predictive performance and most informative features derived from both the pathway-based data representations and the original metabolite features.
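The step of summarizing metabolite abundances into pathway activity fingerprints can be illustrated with a simple aggregation. This is a minimal sketch under stated assumptions: the pathway names, metabolite names and z-scored abundances are placeholders, and mean and median stand in for the "multiple aggregation statistics" that the abstract mentions without listing.

```python
from statistics import mean, median


def pathway_fingerprint(sample, pathways, aggregate=mean):
    """Collapse per-metabolite values into one activity score per pathway.

    `sample` maps metabolite names to (for example z-scored) abundances.
    `pathways` maps pathway names to lists of member metabolites.
    Members that were not measured in the sample are skipped.
    """
    fingerprint = {}
    for name, members in pathways.items():
        measured = [sample[m] for m in members if m in sample]
        if measured:
            fingerprint[name] = aggregate(measured)
    return fingerprint


# Placeholder pathway definitions and one hypothetical plasma sample.
pathways = {
    "pathway_A": ["m1", "m2", "m3"],
    "pathway_B": ["m4", "m5", "m6"],
}
sample = {"m1": 0.5, "m2": -0.1, "m3": -1.0, "m4": 0.2, "m5": 0.8}

mean_scores = pathway_fingerprint(sample, pathways, aggregate=mean)
median_scores = pathway_fingerprint(sample, pathways, aggregate=median)
print(mean_scores, median_scores)
```

Each sample is thereby reduced from many metabolite features to a handful of pathway scores, which can then be passed to any classifier inside a cross-validation loop.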

Overall, our results suggest that blood plasma metabolomics data contains significant predictive information for sample classification and that a pathway-based modeling approach can reveal robust and interpretable global deregulations in cellular processes. Additional targeted measurements of the observed metabolite alterations and further validation on independent studies will be needed to corroborate the results.

A systems pharmacology approach reveals robust drug metabolism and altered glucuronide disposition in a mouse model of liver cirrhosis
PRESENTER: Rebekka Fendt

ABSTRACT. Liver cirrhosis impairs the liver’s function and alters drug absorption, distribution, metabolism, and excretion (ADME). Therefore, drug doses for patients with liver cirrhosis might need adjustment to ensure efficacious and safe pharmacotherapy. However, the effect of cirrhosis on pharmacokinetics (PK) is not fully understood. We investigated PK in a mouse model of liver cirrhosis with a systems pharmacology approach consisting of physiologically based pharmacokinetic (PBPK) model predictions and experimental validation.

Liver cirrhosis in mice was induced by repeated injections of carbon tetrachloride (CCl4, twice per week). After 12 months, the mice were administered a drug cocktail of caffeine, codeine, midazolam, pravastatin, talinolol, and torsemide. The drugs served as probes for the metabolic enzymes Cyp1a2, Cyp2d22, Cyp3a11, and Cyp2c29, and for the drug transporters Oatp1b2 and Mdr1. PBPK models were established for all compounds and applied to simulate reduced drug metabolism, altered drug transport, and further cirrhosis-associated pathophysiologies.

The expression of CYP1A, a marker for liver function, was reduced in liver sections of CCl4-treated mice. PBPK model simulations with reduced metabolic enzyme activity predicted increased parent drug concentrations and reduced production of metabolites. Surprisingly, the PK of most drugs was not significantly altered in cirrhotic mice and in vitro assays of liver microsomes also suggested functional drug metabolism.

Furthermore, RNA expression of the drug transporters Oatp1b2 and Mdr1 in the livers of CCl4-treated mice was significantly altered. The pravastatin and talinolol PBPK models predicted only a minor influence of the altered transporter expression on PK, which was in line with the observed data.

However, concentrations of glucuronidated metabolites formed in phase 2 of the biotransformation were increased in cirrhotic mice. We hypothesized that either (1) glucuronosyltransferase activity was increased, (2) biliary excretion was impaired, or (3) basolateral export was increased. PBPK simulations showed that all three mechanisms could explain altered glucuronide disposition. Experiments revealed increased RNA expression of basolateral glucuronide transporters. Therefore, we concluded that increased basolateral export probably caused altered glucuronide disposition.

The CCl4 mouse model recapitulated many features of liver cirrhosis, but drug metabolism was surprisingly robust. In this respect, the mouse model for cirrhosis might differ from patients, who often show reduced metabolic clearance. On the other hand, PK in liver cirrhosis patients is also commonly less affected than predicted by PBPK simulations, which might indicate compensatory mechanisms [1].

Experiments and PBPK modeling mutually contributed to a deeper understanding of pharmacokinetics in a mouse model of liver cirrhosis. PBPK modeling linked altered expression to functional impact and helped to explore scenarios that could not readily be tested experimentally.

Reference: 1. Heimbach, T., et al., Physiologically-Based Pharmacokinetic Modeling in Renal and Hepatic Impairment Populations: A Pharmaceutical Industry Perspective. Clin Pharmacol Ther, 2021. 110(2): p. 297-310.

Machine learning-based prediction of frailty in elderly people - Data from the Berlin Aging Study-II (BASE-II)
PRESENTER: Jeff Didier

ABSTRACT. Frailty is a geriatric medical condition that is strongly associated with age and age-related diseases. Its multidimensional consequences heavily impact quality of life and will inevitably increase the burden on healthcare systems. Most importantly, the lack of a universal standard to describe, diagnose, let alone treat frailty further complicates the situation in the long term. More and more frailty assessment tools are being developed on a regional and institutional basis, which further increases the heterogeneity in how frailty is characterized. Better insight into the underlying causes and pathophysiology of frailty, and into how it develops in patients, is therefore required to establish strong and accurately tailored response schemes for frail patients, for whom currently only symptoms are treated. In this study, we deployed machine learning-based classification and optimization techniques to predict frailty in the Berlin Aging Study II (BASE-II, N=1512, frail=484) and identified some of the most informative biomedical measurements for characterizing frailty, including new potential biomarkers. Frailty in BASE-II was measured by the Fried et al. 5-item frailty index, composed of the clinical variables grip strength, weight loss, exhaustion, physical activity, and gait. For binary classification, the pre-frail and frail levels were merged into a single frail class. A configurable in-house pipeline was developed for pre-processing the clinical data, predicting the target disease, and determining the most informative subgroup of clinical measurements with regard to frailty. The best predictive performance, measured by the F-beta score with beta = 2, was obtained with resampling and dimensionality reduction techniques, and was further increased by adding one item of the Fried et al. frailty index. We suggest that combining easy-to-obtain biomedical information on frailty risk factors with one Fried et al. phenotype item, provided e.g. by smart wearable devices (gait, grip strength, ...), could significantly improve frailty prediction.
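
For readers unfamiliar with the metric, the F-beta score with beta = 2 weights recall more heavily than precision, which suits a setting where missing a frail individual is costlier than a false alarm. The sketch below computes it from confusion-matrix counts; the counts are made up for illustration and are not results from BASE-II.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from true positives, false positives and false negatives.

    F_beta = (1 + beta^2) * TP / ((1 + beta^2) * TP + beta^2 * FN + FP)
    beta > 1 favours recall, beta < 1 favours precision.
    """
    b2 = beta * beta
    denominator = (1.0 + b2) * tp + b2 * fn + fp
    return (1.0 + b2) * tp / denominator if denominator else 0.0


# Hypothetical classifier outcome: 80 frail participants found, 20 missed,
# 40 non-frail participants wrongly flagged.
tp, fp, fn = 80, 40, 20
f2 = f_beta(tp, fp, fn, beta=2.0)
f1 = f_beta(tp, fp, fn, beta=1.0)
print(f"F2 = {f2:.3f}, F1 = {f1:.3f}")
```

Because recall (0.8) exceeds precision (about 0.67) in this made-up example, the F2 score comes out higher than the F1 score.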

M4-health: digital twins that follow you throughout your health journey
PRESENTER: Gunnar Cedersund

ABSTRACT. For 20+ years, I have developed and tested mechanistic mathematical models for the main organs in the human body. We have now created a reusable backend to an eHealth platform, where the organ-models are interconnected, and where the models can be personalized, and used to simulate scenarios. The models are Multi-level (intracellular to whole-body), Multi-timescale (seconds to decades), Multi-organ, and Mechanistic (M4).

The M4-models are developed/simulated using differential-algebraic equations, across various platforms: Matlab, OpenCOR, NEURON, OpenSim, Unreal Engine, INCA, etc. The M4-models are extendable to omics-level network models, and are combined with machine-learning models, to calculate e.g. the risk of a stroke. The backend is written in Python, and can be called from any eHealth-platform.

The interconnected M4-model is able to simulate scenarios that agree with data for all levels and timescales. The omics-level model can simulate diabetes on a phosphoproteome level. The digital twins are personalized in appearance (face, proportions, weight, etc), and can be made to move. Intrabody images (MRI, microscopy images of biopsies, etc) can visualize both how the organs and cells are now, and how they gradually change, depending on what the digital twin is doing: diet, exercise, medication, etc.

Because of the physiological M4-core, our model can be re-used across the entire health journey: for personalized computer labs in education, for including your digital twins in performances on stage, and for improving communication with your personal trainer at the gym, or with nurses or specialists (hepatologist, cardiologist). The hypothesis is that seeing such scenarios play out in your own digital twin will improve understanding, motivation, and compliance with treatment. In my presentation, I will give examples of how we work with end-users across the different stages of your life journey. Finally, if there is a grand piano, I can show live how dancing digital twins are incorporated in lecture-performances.

Defining Design Rules for Next-Generation Snakebite Antivenoms
PRESENTER: Natalie Morris

ABSTRACT. Snakebite envenomation is a priority neglected tropical disease, which globally results in around 100,000 deaths and 400,000 cases of disability per year. Venom is a complex mixture of protein toxins of various families and functions. Systemic envenomation is treated using antivenoms, which are currently produced by hyper-immunising large animals against the venom in question. The animal naturally produces protective toxin-neutralising antibodies, which can be harvested to formulate the serum-based antivenom product. There is an urgent need to innovate the way that we design and produce antivenoms, owing to limitations in the cost, efficacy, and safety of these conventional treatments.

The advent of recombinant protein expression and in vitro antibody selection has greatly expanded the antivenom design space, and has facilitated the production of toxin-neutralising recombinant antibodies in a range of conventional and alternative molecular formats. Different antivenom scaffolds may impart pharmacokinetic benefits to treatment under different circumstances. Whilst there are several next-generation formats under active investigation, there has been little work done to quantitatively compare the performance of different scaffolds. The pharmacokinetic venom-antivenom system is complex, owing to the variable distribution and elimination profiles of different venoms and antivenoms. Computational simulations of venom and antivenom pharmacodynamics can be used to explore this interplay and compare the function of different scaffolds under clinically relevant treatment scenarios. These simulations can facilitate the methodical testing of a much wider area of parameter space than would be feasible in vivo.

We have built a two-compartment pharmacokinetic model of systemic snakebite envenomation and treatment in rabbits, which tracks the movement of toxins through separate blood and tissue compartments. The model was parameterised with existing experimental data and enables the simulation of antivenom scaffolds ranging in size from 15 to 150 kDa. The model additionally enables control of other treatment parameters including antivenom dosing, affinity, treatment time, and venom type. We are performing a range of simulations, including local and global sensitivity analysis and global parameter optimisation, to better understand the most important features in antivenom design. Within these studies, we are exploring and defining the optimal combinations of antivenom molecular size, dosing ratio, and affinity parameters. Thus far, global sensitivity analysis has indicated that the most important antivenom parameters in the neutralisation of low molecular weight venoms are the antivenom-to-venom dosing ratio and the association rate constant (kon). Antivenom molecular size has a much smaller impact on treatment outcome. Global parameter optimisation has indicated that the most effective antivenoms are low molecular weight scaffolds, given at high dosing ratios and with high kon values. This modelling approach can be used to elucidate the dynamics of envenomation-treatment systems, and help inform the development of low-cost, high coverage antivenoms for snakebite.
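
To make the model structure concrete, here is a minimal sketch of a two-compartment venom-antivenom system integrated with a simple Euler scheme. It is not the authors' model: the rate constants, the irreversible-binding assumption, the equal compartment volumes, and all names are illustrative assumptions rather than parameters fitted to rabbit data.

```python
# Hypothetical first-order rate constants (per hour), chosen for illustration.
K_BT_V, K_TB_V, K_EL_V = 0.5, 0.2, 0.10   # venom: blood->tissue, tissue->blood, elimination
K_BT_A, K_TB_A, K_EL_A = 0.3, 0.1, 0.05   # antivenom: same three processes


def free_venom_exposure(dose_ratio, kon=1.0, t_treat=1.0, t_end=24.0, dt=0.01):
    """Return the time-integrated free venom (blood + tissue) over t_end hours.

    dose_ratio: antivenom-to-venom dosing ratio (venom dose fixed at 1.0)
    kon:        association rate constant; binding is treated as irreversible
    t_treat:    time of the antivenom bolus into blood (hours after the bite)
    """
    vb, vt, ab, at = 1.0, 0.0, 0.0, 0.0  # free venom / antivenom in blood / tissue
    steps = int(round(t_end / dt))
    treat_step = int(round(t_treat / dt))
    exposure = 0.0
    for i in range(steps):
        if i == treat_step:
            ab += dose_ratio  # antivenom bolus enters the blood compartment
        bind_b = kon * vb * ab  # neutralisation in blood
        bind_t = kon * vt * at  # neutralisation in tissue
        dvb = -K_BT_V * vb + K_TB_V * vt - K_EL_V * vb - bind_b
        dvt = K_BT_V * vb - K_TB_V * vt - bind_t
        dab = -K_BT_A * ab + K_TB_A * at - K_EL_A * ab - bind_b
        dat = K_BT_A * ab - K_TB_A * at - bind_t
        exposure += (vb + vt) * dt
        vb, vt = vb + dvb * dt, vt + dvt * dt
        ab, at = ab + dab * dt, at + dat * dt
    return exposure


untreated = free_venom_exposure(0.0)
low_dose = free_venom_exposure(1.0)
high_dose = free_venom_exposure(5.0)
print(untreated, low_dose, high_dose)
```

In a sketch like this, scaffold size would enter through the distribution and elimination constants of the antivenom, so that sweeping those constants together with `dose_ratio` and `kon` would mimic the kind of sensitivity analysis described above.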

Neural Circuits Underlying Autism Spectrum Disorders

ABSTRACT. Autism spectrum disorder (ASD) is a common and highly heritable psychiatric disorder. Although its genetic risk factors have been well studied, the perturbed neural circuits responsible for characteristic behaviors are poorly understood. Our analysis uses genetic mutations to ascertain the neural circuits perturbed in ASD, and we find that a strongly interconnected system of neural structures may be the basis for the behavioral phenotypes associated with the disorder. We observe that distal projections constitute a disproportionately large fraction of the network composition, suggesting that the integration of diverse brain regions is a key property of the circuit. We also implicate key cortical and subcortical structures sharing strong functional connections, and we observe that cortical perturbations are associated with more severe intellectual phenotypes. Overall, we present a method that, to our knowledge, is the first unbiased approach to comprehensively discover and identify the neural circuitry affected in ASD.

MeDaX - our vision for bioMedical Data eXploration
PRESENTER: Judith Wodke

ABSTRACT. An immense amount of (bio)medical data is collected in everyday clinical practice, providing an enormous potential for research and evidence-based medicine. However, this data is usually not standardized and, often, simply not accessible. Systematic sharing and use of clinical data in particular is hindered by several factors, such as i) data complexity and heterogeneity, ii) lack of appropriate tools for storage and comparison, and iii) data security and protection of personal information.

Focusing on integration of standardized data formats (e.g. HL7/FHIR [1], OpenEHR [2], bio-ontologies [3], or COMBINE standards [4]) and generation of FAIR [5] data, we will connect diverse (bio)medical data and semantic information in an integrated, formalized, and standardized knowledge graph. Graph databases are an appropriate tool for processing highly interconnected, heterogeneous data [6-10]. Data integration will be accomplished via ETL (extract, transform, load) processes. Data and information sources include the data integration center at Universitätsmedizin Greifswald, local population studies [11,12], biomedical ontologies [13], and public information portals [14,15]. Implementation of our graph database will be accompanied by integrating and advancing methods for data provenance, quality assurance, and similarity measures. Once data connectivity and accessibility are established, we will design and implement methods and software for data analysis and prediction.

In summary, the MeDaX junior research group will develop an innovative and efficient research platform for biomedical data exploration. This includes i) pipelines for semi-automated storage of and access to (bio)medical data in our graph database, ii) methods for data provenance, quality control, and similarity measures, and iii) tools for data analysis and prediction.

Committed to responsible and reproducible science, we will make our results and code publicly available and will consider measures for data privacy at all project stages. To maximize benefits for researchers, clinicians, and most importantly patients, we are interested in collaborations with partners who can inform us about their requirements.

References
[1] D. Bender, K. Sartipi, Proceedings of the 26th IEEE international symposium on computer-based medical systems 2013
[2] D. Kalra et al., Studies in health technology and informatics 2005
[3] M. Salvadores et al., Semantic web 2013
[4] D. Waltemath et al., J integrative bioinf 2020
[5] M. D. Wilkinson et al., Scientific data 2016
[6] C. T. Have, L. J. Jensen, Bioinformatics 2013
[7] S. G. Finlayson et al., Sci data 2014
[8] I. Balaur et al., J Comp Biol 2017
[9] A. Fabregat et al., PLoS comp biol 2018
[10] D. S. Himmelstein et al., Elife 2017
[11] U. John et al., Sozial-und Präventivmedizin 2001
[12] H.J. Grabe et al., J translational medicine 2014
[13] N.F. Noy et al., NAR 2009
[14] nfdi4health, https://nfdi4health.de/
[15] S. Thun et al., BMC Med Inform Decis Mak 2020

Characterization of cardiac fibroblast remodeling dynamics after myocardial infarction
PRESENTER: Laura Sudupe

ABSTRACT. The heart tissue healing process after myocardial infarction (MI) is orchestrated by activated cardiac fibroblasts (CFs). In the last few years, high-throughput technologies have demonstrated CF heterogeneity during ventricular remodeling, and the particular role of each subpopulation is becoming increasingly important. Recently, we described the reparative cardiac fibroblasts (RCFs), an activated subtype of CFs related to the tissue healing process after MI. Our analysis identified RCFs, characterized the associated markers, and quantified their prevalence during MI recovery. However, at 7 days post-infarct (7 dpi) the RCFs were already differentiated, and it was not possible to fully characterize their spatial location or early differentiation dynamics. Therefore, we use a Col1α1-GFP mouse model for MI to investigate the ventricular remodeling process, exploring fibroblasts after MI at single-cell resolution and the entire heart by spatial transcriptomics. With this approach, we aim to decipher the spatial location and activation dynamics of CFs after MI, particularly the RCFs. We first defined the window of activation (WoA) of the RCF transcriptomic signature between 3 and 5 dpi using bulk RNA-seq. Second, using single-cell profiling of the WoA, we characterized RCF dynamics (primarily through the top marker gene Cthrc1) and identified two significant gene expression dynamics in the RCF-specific signature cluster. Finally, we localized those dynamics using 10x Genomics FF Visium spatial profiling on transversal sections from healthy, 3 dpi, and 5 dpi (female and male) hearts. To this end, we manually characterized the different areas of interest (RZ, remote zone; BZ, border zone; and IZ, infarcted zone) and then used an enrichment score analysis to quantify the prevalence of each dynamic at the different time points. We then validated our data both in a preclinical pig model of MI and in patients with various forms of heart failure. In summary, we characterized RCF subtype-specific signatures that advance separately at the different time points of the WoA. Our work uncovers a spatially dependent response in the damaged tissue that implies a complex mechanism in the remodeling process, thus extending the scope and reach of systems biology.

15:30-16:00 Coffee Break
16:00-16:45 Session K9: KEYNOTE IX: Luis Serrano

Abstract: Alternative splicing shapes the regulatory and functional diversity in the cell. Cancer cells tend to select alternative splicing programs involved in tumor progression. However, while therapies based on targeting splicing events have been developed to treat cancer and other diseases, the systematic prioritization of potential disease-driver targets still remains unaddressed. Here, by using publicly available gene-level cancer dependencies from RNAi viability screens across 713 cancer cell lines, we define 140,310 exon-level linear models using splicing profiles and mRNA levels. We then identified cancer-driver exons as the ensemble of models that best prioritized experimental cancer dependencies across individual samples, which we call spotter. The 1,073 selected models corresponded to exons that mostly disrupt their gene's ORF or create new isoforms. These exons belong to genes related to the splicing machinery and cell proliferation and show a low rate of aberrant mutations. Interestingly, our ensemble model inferred the effects of single and multiple splicing perturbations on cell proliferation. Integrating pharmacological screens with our predicted splicing-level dependencies, we uncovered cancer-driver exons that mechanistically mediate drug sensitivity and synergize with drug effects. In patients, our ensemble model can not only aid the systematic prioritization of splicing targets across 14 different types of cancer but also identify putative splicing events driving patient response upon drug treatment or pinpoint susceptible splicing events at single-patient resolution. Taken together, in silico RNA isoform screening with spotter sheds light on the weak spots of cancer samples at the splicing level and holds the potential to be implemented for personalizing treatments.

Location: Alexander
Bacterial therapy for pulmonary infectious diseases

ABSTRACT. Engineering bacteria for treating human diseases presents new opportunities in therapeutics. Although lung diseases are among the top causes of mortality worldwide, there is no treatment for them based on a live biotherapeutic. We have developed a non-pathogenic chassis of the human lung bacterium Mycoplasma pneumoniae (MPN) to treat lung diseases. This strain has a unique genetic code, which hinders gene transfer to most other bacterial genera, and it lacks a cell wall, which allows it to express proteins that target peptidoglycans of pathogenic bacteria. We first determined that removal of the pathogenic factors fully attenuated the chassis strain in vivo. We then designed synthetic promoters and identified an endogenous peptide signal sequence that, when fused to heterologous proteins, promotes efficient secretion. The resulting strain is capable of dissolving Staphylococcus aureus biofilms preformed on catheters in vitro, ex vivo, and in vivo, and of treating an acute murine model of Pseudomonas aeruginosa infection. Moreover, we demonstrate that the engineered chassis can also dissolve biofilms formed in endotracheal tubes (ETTs) of ventilator-associated pneumonia (VAP) patients ex vivo. Our chassis can also express biomolecules in the lung, such as IL-10, which has a powerful anti-inflammatory effect on P. aeruginosa lung infection. MPN can be used in combination with antibiotics that target peptidoglycan layer formation in both Gram-positive and Gram-negative bacteria, thereby increasing the efficacy of some antibiotics, as well as in other lung diseases such as cancer.