Program for Thursday, July 10th

09:00-10:00 Session 13: Keynote 4: Emma Schymanski

Open Science Data Processing and Integration Workflows in Metabolomics and Exposomics

Chair:

Sylvain Prigent

Location: Main amphi

09:00

Emma Schymanski

Open Science Data Processing and Integration Workflows in Metabolomics and Exposomics

ABSTRACT. Exposomics researchers need to identify relevant chemicals covering the entirety of potential exposures over entire lifetimes. With over 100 million chemicals in the largest open chemical databases, coupled with broadly acknowledged knowledge gaps, researchers are faced with too much yet not enough information at the same time. Improvements in analytical technologies and computational mass spectrometry workflows coupled with the rapid growth in databases and increasing demand for high throughput “big data” services from the research community present significant challenges for both data hosts and workflow developers. This talk will showcase FAIR and Open Science developments in the Environmental Cheminformatics group, including the NORMAN Suspect List Exchange (NORMAN-SLE), MassBank, MetFrag, PubChemLite for Exposomics [1], patRoon, ShinyTPs and the Chemical Stripes. Beyond the software developments, it will showcase how these are applied in our active research projects in our data processing and integration workflows to tackle challenges in non-target exposomics studies [2,3]. The case studies will show how enhancing the FAIRness (Findability, Accessibility, Interoperability and Reusability) of open resources can mutually enhance several resources for whole community benefit. Many thanks to all group members, collaborators and colleagues who have been a part of these efforts!

References

1. Schymanski EL et al. Empowering large chemical knowledge bases for exposomics: PubChemLite meets MetFrag, Journal of Cheminformatics, 2021; 13:19, DOI: 10.1186/s13321-021-00489-0

2. Talavera Andújar B et al. Can Small Molecules Provide Clues on Disease Progression in Cerebrospinal Fluid from Mild Cognitive Impairment and Alzheimer's Disease Patients? Environ. Sci. Technol., 2024; 58(9):4181-4192, DOI: 10.1021/acs.est.3c10490

3. Talavera Andújar B et al. Exploring environmental modifiers of LRRK2-associated Parkinson’s disease penetrance: An exposomics and metagenomics pilot study on household dust. Environment International, 2024; 194:109151, DOI: 10.1016/j.envint.2024.109151

10:00-10:30 Coffee Break

10:30-11:30 Session 14A: Systems Biology

Chair:

Damien Eveillard

Location: Main amphi

10:30

Juliette Audemard, Mohamed Mouffok, Charlotte Duval, Jeanne Got, Sebastien Halary, Marie Lefebvre, Julie Leloup, Benjamin Marie, Gabriel Markov, Coralie Muller, Nicolas Creusot, Binta Diémé and Clémence Frioux

Metagenome-scale metabolic modelling for the characterization of cross-feeding interactions in freshwater cyanobacteria-associated microbial communities

PRESENTER: Juliette Audemard

ABSTRACT. Favoured by global changes, freshwater harmful cyanobacterial blooms (HCBs) generate increasing ecological, economic and public health challenges. Microcystis, one of the most pervasive genera of cyanobacteria, grows within a phycosphere, where specialized interactions with its microbiome occur and are suspected to influence bloom appearance and potential toxicity. Through metagenomics, metabolomics and metabolic modelling, we characterized twelve Microcystis phycospheres cultured after isolation from a French pond. Metagenomics revealed that associated bacteria introduce new functions to the phycosphere, while functional redundancy within and across communities remains. Metabolic reaction presence in Microcystis is consistent with their genospecies, whereas the community-level metabolic landscape diverges from the cyanobacterial phylogeny. On the other hand, metabolomic results point to a metabolic output driven by the cyanobacteria. Metabolic modelling and identification of biosynthetic gene clusters for toxic secondary metabolites further highlighted differences between phycosphere metabolic capabilities and the importance of manual curation of secondary metabolism in GSMNs. These findings deepen understanding of Microcystis phycosphere functioning, demonstrate the relevance of multi-omics systems biology approaches, and lay the ground for further characterisation of freshwater HCB microbial interactions and inter-species complementarity.

10:50

Clément Frainay, Ludovic Cottret, Marion Liotier, Louison Fresnais, Meije Mathé and Fabien Jourdan

Met4J: a library, a toolbox and a workflow suite for graph-based analysis of metabolic networks

PRESENTER: Clément Frainay

ABSTRACT. Graph algorithms are essential tools for network analysis in various domains, including biology. Despite successful applications to metabolic networks, including several developments specific to these models, few implementations are openly available. Furthermore, the exchange format adopted for most genome-scale models is incompatible with the main generic graph-analysis libraries. We present Met4J, an open-source library dedicated to the structural analysis of metabolic models and their manipulation, as well as a toolbox encompassing implementations of analyses relevant to metabolism-related research. We exemplify the potential of Met4J by creating a workflow for the construction and analysis of a holobiont network. Met4J's source code, executable JAR and containers are available at https://forgemia.inra.fr/metexplore/met4j and the library artifact is accessible through the Maven central repository. High-level applications are also available on a Galaxy interface.

11:10

Pauline Hamon-Giraud, Anne Siegel, Gabriel Markov, Benoît Bergk Pinto, Jeanne Got, Coralie Rousseau, François Thomas, Simon Dittami and Erwan Corre

Methods for a species-specific genome-scale metabolic model designed for eukaryotes and applied to the Ascophyllum nodosum macroalga

PRESENTER: Pauline Hamon-Giraud

ABSTRACT. Genome-scale metabolic models (GEMs) are essential tools for studying metabolism, either for comparative analyses or to investigate organism interactions. However, genome annotation, biomass formulation, and network gap-filling are key steps in constructing a relevant GEM and ensuring the biosynthesis of specialized metabolites. We present a pipeline to integrate extensive biological knowledge (genomes of closely related species, metabolic profiling studies, potential interactions with microbiota) about a eukaryotic organism in order to generate high-quality GEMs. To manage genome annotation limitations, the pipeline relies on a GEM reconstruction tool that propagates annotations across closely related species through the identification of orthologous genes. It also pays particular attention to biomass formulation, using a set of metabolomic studies to create a consensus biomass composition that seeks to closely reflect biological reality, such as incorporating specialized metabolites and their precursors. The gap-filling stage of the pipeline uses a semi-automated curation process for added reactions, taking into account the presence of orthologous genes, occurrence in phylogenetically related species and potential interactions with the organism's microbiota. The final GEM applied to the brown alga Ascophyllum nodosum comprises 3,536 metabolites and 3,072 biochemical reactions, predicting the synthesis of 1,023 compounds from 38 seawater-derived metabolites. Almost all reactions (99.98%) are linked to an enzyme supported in the algal genome. This refined model provides a framework for studying host-microbiota metabolic complementarity.

This pipeline offers a scalable and robust method for reconstructing high-quality GEMs in non-model eukaryotic organisms, improving metabolic network accuracy and expanding our understanding of species-specific metabolism. It also sheds light on the varying levels of knowledge about the synthesis pathways of the biomass, paving the way for future studies.

10:30-11:30 Session 14B: Algorithms and data structures for sequences

Chair:

Matthias Zytnicki

Location: amphi D

10:30

Bastien Degardins, Charles Paperman and Camille Marchet

Vizitig: context-rich exploration of sequencing datasets

PRESENTER: Bastien Degardins

ABSTRACT. Recent advances in k-mer indexing have facilitated the cataloging and rapid querying of planetary-scale genomic data. While these indices excel at high-throughput sequence lookups, they often lack context-rich exploration capabilities and rely on simplistic match-based queries. This gap hinders deeper investigations into variants, regulatory elements, and other features crucial for pangenomic and transcriptomic analyses. We present Vizitig, a novel system that harnesses a de Bruijn graph as the core data structure. By directly encoding overlapping k-mers from both genome and transcriptome data, Vizitig supports the processing of partially or completely unassembled sequences, making it broadly applicable from collections of genomes to eukaryotic RNA-seq. Vizitig integrates k-mer indices into a database framework, providing an intuitive, metadata-aware approach to querying. Users can select candidate regions by specific annotations (e.g., genes, motifs) or sample-specific features (e.g., abundance, presence or absence in an annotated gene or a sample), retrieving relevant graph neighborhoods and associated metadata from extensive datasets.

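As an illustration of the data structure named in this abstract, the following is a minimal Python sketch of a de Bruijn graph built from overlapping k-mers, with a simple forward neighborhood query. It is not Vizitig's implementation: the function names `build_de_bruijn` and `neighborhood` are hypothetical, the graph is node-centric, and reverse complements and metadata are ignored for brevity.

```python
from collections import defaultdict


def build_de_bruijn(sequences, k):
    """Map each k-mer to the set of k-mers that follow it with a (k-1) overlap."""
    graph = defaultdict(set)
    for seq in sequences:
        kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        for left, right in zip(kmers, kmers[1:]):
            graph[left].add(right)
        for kmer in kmers:
            graph.setdefault(kmer, set())  # keep k-mers without successors as nodes
    return dict(graph)


def neighborhood(graph, start, radius):
    """Return the k-mers reachable from `start` in at most `radius` forward steps."""
    seen = {start}
    frontier = {start}
    for _ in range(radius):
        frontier = {nxt for node in frontier for nxt in graph.get(node, ())} - seen
        seen |= frontier
    return seen


# Toy, made-up reads; unassembled sequences are enough to build the graph.
toy_reads = ["ACGTAC", "CGTACG"]
graph = build_de_bruijn(toy_reads, k=3)
nearby = neighborhood(graph, "ACG", radius=2)
```

In a real index each node would also carry sample and annotation metadata, which is what makes the "select by annotation, then retrieve the neighborhood" style of query described above possible.
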
10:50

Yohan Hernandez Courbevoie, Mikaël Salson, Chloé Bessière, Haoliang Xue, Daniel Gautheret, Camille Marchet and Antoine Limasset

Reindeer2: practical abundance index at scale

PRESENTER: Yohan Hernandez Courbevoie

ABSTRACT. Over the past decade, significant efforts have been made to develop indexing solutions capable of querying sequence presence in large genomic data repositories. Recent indexing approaches have made giant steps toward the ultimate goal of indexing repositories like the SRA and ENA, leveraging k-mers for efficiency. When indexing RNA samples, querying k-mer abundance is as important as querying presence itself. Currently available methods for indexing abundances either fail to scale to the vast number of datasets, lose variants, or lack precision in abundance estimation. Moreover, the rapid accumulation of sequencing data presents a significant computational challenge for these structures, which are mostly static. We introduce REINDEER2, a novel k-mer abundance index that addresses these limitations by providing three key properties: scalability, dynamicity, and tunable precision. REINDEER2 is highly scalable and efficient, capable of indexing thousands of RNA-seq datasets within hours while maintaining low memory usage. Unlike recent methods that sacrifice completeness to save memory, REINDEER2 indexes all k-mers, ensuring nucleotide-level exploration remains possible. Additionally, it supports high-throughput queries, enabling rapid retrieval of k-mer abundance across large-scale transcriptomic datasets. One of the key advantages of REINDEER2 is its tunable abundance precision. Abundance values can be recovered with less than 1% variation from their original counts, providing reliable quantitative insights. Furthermore, REINDEER2 supports updatability: new datasets can be added efficiently through a merge operation, without requiring a complete reindexing process, making it adaptable to evolving sequencing repositories.

We report REINDEER2’s great efficiency at indexing collections of 1,000-10,000 RNA-seq samples, and demonstrate its capacity to provide abundance estimations comparable to the state of the art. Availability: github.com/Yohan-HernandezCourbevoie/REINDEER2

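To make the idea of tunable abundance precision concrete, here is a small Python sketch of one generic way to trade storage for a bounded relative error: logarithmic quantization of counts. The abstract does not say how REINDEER2 encodes abundances, so this scheme, the base value 1.02 and the names `encode_abundance` and `decode_abundance` are all assumptions made for illustration.

```python
import math

# Assumed base: rounding the log to the nearest integer moves a value by at most
# a factor sqrt(1.02), i.e. under 1% relative error in either direction.
BASE = 1.02


def encode_abundance(count, base=BASE):
    """Quantize a positive count to the index of the nearest power of `base`."""
    if count < 1:
        raise ValueError("count must be >= 1")
    return round(math.log(count) / math.log(base))


def decode_abundance(code, base=BASE):
    """Recover an approximate count from its quantized code."""
    return base ** code


# Worst relative error observed over a range of counts.
worst = max(
    abs(decode_abundance(encode_abundance(c)) - c) / c
    for c in range(1, 100_000)
)
```

Under these assumptions, a larger base would give fewer distinct codes and a looser error bound, and a smaller base the opposite, which is the kind of knob "tunable precision" suggests.
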
11:10

Mathilde Girard, Lea Vandamme, Bastien Cazaux and Antoine Limasset

OReO: Optimizing Read Order for practical compression

PRESENTER: Mathilde Girard

ABSTRACT. Recent advances in high-throughput and third-generation sequencing technologies have created significant challenges in storing and managing the rapidly growing volume of read datasets. Although more than 50 specialized compression tools have been developed, employing methods such as reference-based approaches, customized generic compressors, and read reordering, many users still rely on common generic compressors (e.g., gzip, zstd, xz) for convenience, portability, and reliability, despite their low compression ratios. Here, we introduce OReO, a simple read-reordering framework that achieves high compression performance without requiring specialized software for decompression. By grouping overlapping reads together before applying generic compressors, OReO exploits inherent redundancies in sequencing data and achieves compression ratios on par with state-of-the-art tools. Moreover, because it relies only on standard decompressors, OReO avoids the need for dedicated installations and maintenance, removing a key barrier to practical adoption. We evaluated OReO on both ONT and HiFi genomic and metagenomic datasets of varying sizes and complexities. Our results demonstrate that OReO provides substantial compression gains with comparable resource usage and outperforms dedicated methods in decompression speed. We propose that future compression strategies should focus on reordering as a means to let generic compression tools fully exploit data redundancy, offering an efficient, sustainable, and user-friendly solution to the growing challenges of sequencing data storage. The OReO code is open source and available at https://github.com/girunivlille/oreo.
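
The principle of "reorder first, then use a generic compressor" can be sketched in a few lines of Python. This is not OReO's algorithm: OReO groups overlapping reads, whereas the sketch below swaps in a much cruder heuristic (sorting reads by their lexicographically smallest k-mer, a minimizer) purely to show that the output stays readable by a standard decompressor. The function names and toy reads are hypothetical.

```python
import zlib


def minimizer(read, k=4):
    """Lexicographically smallest k-mer of the read (the read itself if shorter than k)."""
    if len(read) < k:
        return read
    return min(read[i:i + k] for i in range(len(read) - k + 1))


def reorder_reads(reads, k=4):
    """Place reads that share a minimizer next to each other."""
    return sorted(reads, key=lambda r: (minimizer(r, k), r))


def compress_reads(reads):
    """Write one read per line and compress with a generic compressor (zlib)."""
    return zlib.compress("\n".join(reads).encode("ascii"), level=9)


def decompress_reads(blob):
    """Only the standard library is needed to get the reads back."""
    return zlib.decompress(blob).decode("ascii").split("\n")


# Toy, made-up reads.
reads = ["GGCATTACGA", "TTACGATTGC", "ACGATTGCCA", "CCCCGGGGAA", "CCGGGGAATT"]
reordered = reorder_reads(reads)
blob = compress_reads(reordered)
restored = decompress_reads(blob)
```

Reordering changes the read order but not the read content, so this design only suits use cases where the original order carries no information; any gain in compression ratio depends on the data and on how well the ordering brings similar reads together.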

10:30-11:30 Session 14C: Workflows, Reproducibility, and Open Science

Chair:

Erwan Corre

Location: Amphi E

10:30

Laurent Bouri, Imane Messak, Baptiste Rousseau, Anakim Gualdoni, Elora Vigo, Matéo Hiriart, Nadia Goué, Julien Seiler and Thomas Denecker

Madbot, a metadata and data brokering online tool to ensure the adoption of standards and FAIR principles in an open science context

PRESENTER: Imane Messak

ABSTRACT. Madbot is a tool designed to help researchers manage and share their scientific data more easily. As research data continues to grow in volume, it becomes harder to ensure that data is accessible, reusable, and easy to understand. While other tools exist to help with parts of this process, they often lack automation, standardization, or flexibility. Madbot solves these issues by providing a simple and comprehensive solution that follows international data standards, making it easier for researchers to publish their data. It automates much of the work involved in organizing and describing data, which saves time and effort for researchers. Madbot also helps ensure that data is described correctly and consistently, following well-established standards. This makes it easier for others to find and use the data in the future. The tool connects to various global platforms like Zenodo and ENA (European Nucleotide Archive), allowing researchers to submit their data directly to these repositories without hassle. Madbot’s easy-to-use interface allows users to interact with the system even if they don't have technical expertise. Behind the scenes, the tool keeps everything organized, automatically checks for mistakes, and helps researchers create accurate and high-quality metadata. Madbot’s architecture is designed to be easily extensible, enabling integration with various data storage solutions, data repositories, and metadata standards. This flexibility allows researchers to adapt the tool to their specific needs, ensuring seamless interoperability with different research infrastructures.

By simplifying the process of submitting research data, Madbot encourages researchers to adopt open science principles, making their work more accessible to others. In the end, Madbot helps reduce the barriers to sharing research data and makes it easier for scientists to contribute to the global scientific community.

10:50

Ulysse Le Clanche, Sarah Cohen Boulakia, Yann Le Cunff, Alban Gaignard and Olivier Dameron

Assessing bioinformatics software annotations: bio.tools case study

PRESENTER: Ulysse Le Clanche

ABSTRACT. Reproducibility and reuse of digital bioinformatics resources are essential for the development of open and cumulative science, in line with FAIR principles. To search for and reuse bioinformatics tools, scientists need to be confident in the reliability of their annotations. Our study focuses on the quantitative and qualitative evaluation of semantic annotations in the bio.tools registry, which serves more than 30,000 bioinformatics tool descriptions, annotated with the EDAM ontology. In this work, we propose to study how the EDAM ontology is used to categorize software based on scientific disciplines and the kind of data processing they allow. We also evaluate the quality of the annotations using Shannon entropy. We emphasize that particular attention should be given to the whole set of annotations inherited from the ontology. Our results underline the need for automatic tools to support annotation curation, reducing the annotation cost for domain experts. This study is a preliminary work aimed at designing novel annotation approaches based on the combination of knowledge graphs and large language models towards more findable and reusable bioinformatics tools.

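For readers unfamiliar with the measure, the Shannon entropy of a set of annotations is H = -Σ p_i log2(p_i), where p_i is the share of annotations using term i. The Python sketch below computes it for a list of term labels. How the authors apply entropy to bio.tools and EDAM is not detailed in the abstract, so the setup here (entropy over the distribution of terms across tools) and the placeholder term labels are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Entropy, in bits, of the empirical distribution of `labels`."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical annotations: one ontology term per tool description.
uniform_terms = ["term_A", "term_B", "term_C", "term_D"]
skewed_terms = ["term_A", "term_A", "term_A", "term_B"]

h_uniform = shannon_entropy(uniform_terms)  # every term used once
h_skewed = shannon_entropy(skewed_terms)    # one term dominates
```

A term distribution dominated by a few very general terms yields low entropy, which is one way such a measure can flag annotations that discriminate poorly between tools.
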
11:10

Ezechiel B. Tibiri, Christine Dubreuil-Tranchant, Romaric K. Nanema, Fidèle Tiendrebeogo and Justin S. Pita

A decade of strengthening bioinformatics in West Africa: HPC infrastructure, training, and scientific collaboration

PRESENTER: Ezechiel B. Tibiri

ABSTRACT. Since 2014, a collaborative and interdisciplinary dynamic has emerged in West Africa to build lasting capacities in bioinformatics. Driven by the growing need to analyze locally produced sequencing data, this initiative has led to the development of regional infrastructures and training programs through strong partnerships between academic and research institutions, including Joseph KI-ZERBO University (UJKZ), INERA, IRD, and the LMI PathoBios. Key milestones include the establishment of bioinformatics platforms in Ouagadougou (Burkina Faso) and, more recently, in Bingerville (Côte d’Ivoire) within the WAVE-CI framework. These platforms have served as training hubs, enabling a wide range of hands-on and theoretical training, from basic GNU/Linux usage to advanced metagenomics data analysis. A major achievement of this initiative is the launch of the International Certificate in Bioinformatics and Genomics (CIBiG) in 2023–2024. This intensive program combines 154 hours of in-person courses and practical sessions with laboratory work, project-based tutoring, and personalized coaching. It covers the entire data lifecycle, from sequencing using Oxford Nanopore Technologies (ONT) to data analysis workflows including assembly, annotation, SNP detection, phylogenetics, and transcriptomic analyses. Anchored in a participatory and inclusive model, CIBiG addresses two main objectives: (1) strengthening local expertise in bioinformatics applied to agriculture and health, and (2) structuring a regional community of practice.

The program is supported by committed institutional stakeholders (UJKZ, IRD, WAVE), a broad network of trainers, and a strong ambition to sustain the initiative through curriculum reforms, long-term funding strategies, and regional thematic working groups. This paper presents a ten-year retrospective on capacity-building activities, the impact of the co-constructed training programs, the pedagogical innovations used (e.g., JupyterBook, Slack, supervised internships), and the perspectives for scaling up this pioneering experience in West Africa.

11:30-12:30 Session 15A: Demos

Chair:

Sylvain Prigent

Location: amphi D

11:30

Louis Paré, Philippe Bordron, Audrey Bihouée and Damien Eveillard

HUMESS: A tool to integrate quantitative transcriptomic and metabolic network modelling to unveil context specific gene signatures.

PRESENTER: Louis Paré

ABSTRACT. Transcriptomic analysis is a powerful tool for elucidating gene expression patterns associated with specific biological conditions, offering invaluable insights into cellular responses and regulatory mechanisms. However, one major challenge in transcriptomic analysis is the need for external knowledge to interpret gene expression changes in a meaningful biological context, which can be time-consuming and prone to biases. Consequently, many gene expression signatures derived from transcriptomic data remain quite superficial and lack the depth necessary for true mechanistic understanding.

On the other hand, multi-omics data allow the reconstruction of genome-scale metabolic networks, which represent all biochemical reactions involved in a given organism and how these reactions interplay. These networks are models of phenotypic metabolism that have many applications, such as the identification of potential therapeutic targets. Nevertheless, these genome-scale metabolic models are difficult to obtain because of the tedious manual curation steps required to reach good quality.

Here we introduce HUMESS (HUman MEtabolism Specific Signature), a tool that seeks to bridge the gap between both approaches. Using a Snakemake implementation, HUMESS integrates quantitative transcriptomic data with metabolic network modeling by (i) identifying significantly expressed genes from quantitative 3'seq-RNA Profiling (3'SRP) sequencing data, and (ii) using a modified version of CarveMe, a top-down approach for metabolic model reconstruction from a universal human metabolic model, to build a metabolic model specific to the differentially expressed genes. The metabolic model is then (iii) extensively analysed to identify the reactions essential to sustain the human metabolic phenotype.

To facilitate the analysis of the results, an online R Shiny interface has been developed, allowing an in-depth exploration of the results. The demo will show how this interface has been used successfully to analyze various stages of human embryonic stem cell development, as described in the HUMESS preprint.

12:00

Marilyne Summo, Gaetan Droc and Gautier Sarah

SynFlow: a SyRI-based interactive viewer

PRESENTER: Marilyne Summo

ABSTRACT. With the increasing accessibility of genome sequencing, a crucial step now involves comparing the sequenced genomes of the same species to identify structural variations and syntenic regions. The SyRI(1) tool currently enables such analyses, while plotsr(2) provides a graphical representation of the results. However, this representation remains static and could be enhanced for a more interactive exploration of the data. We introduce SynFlow, a web application designed to provide a dynamic and interactive visualization of SyRI analysis results. Featuring an intuitive and responsive interface, SynFlow allows users to explore genome alignments and structural variations. The application includes interactive features such as zooming, panning, and filtering of bands based on type and length, thereby facilitating the analysis and interpretation of genomic data. SynFlow also enhances accessibility and usability by integrating real-time interactions and supports the simultaneous visualization of up to ten genomes, making it a powerful tool for researchers in comparative genomics. Code is available at: https://github.com/SouthGreenPlatform/SynFlow

11:30-12:30 Session 15B: Demos

Chair:

Pascal G P Martin

Location: Amphi E

11:30

Sebatien Ravel, Nina Marthe, Camille Carrette, Mourdas Mohammed, Christine Tranchant-Dubreuil and François Sabot

Gratools, a tool for easy manipulation of GFA

PRESENTER: Camille Carrette

ABSTRACT. The use of reference genomes introduces a bias to all genomic studies that rely on them, since a single individual from a population is not representative of the full genetic diversity. Pangenomes, accessible thanks to lower sequencing costs, bring together several complete genomes in a single data structure. A compact way to represent this complex data is the pangenome graph, which groups similar or divergent regions into nodes that may or may not be traversed by the individual genomes. Many tools output a graph in GFA format, but there is a lack of tools to manipulate it. The current graph manipulation tools, VG and ODGI, require specific formats, can be time-consuming, and do not offer as many different manipulations in a single tool as Gratools.

Gratools is an efficient tool for manipulating pangenome graphs in GFA format. It has several features for graph description, extraction and analysis, which this demo will present. Gratools begins with an indexing step that extracts and stores essential information in BAM and BED files; this step only needs to be performed once per graph and significantly accelerates downstream analyses performed by Gratools.

The demo will first list the samples and chromosomes present in the GFA file, with their sizes, using the list_sample and list_chr commands. Second, the depth_nodes command will give a general assessment of node depth, providing a first idea of the distribution of the nodes in the file; nodes can also be filtered by size.

Next, the core_dispensable_ratio command is used to analyze the core genome and the dispensable genome according to user-defined limits. The get_segments_by_depth command is used to list nodes according to the number of individuals in which they are present (for example, you can request the list of nodes that make up the core genome). Finally, we'll show you how to extract a subgraph using the extract_sub_graph command.
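
The notion of node depth used by these commands can be illustrated with a short Python sketch that counts, for each segment of a GFA file, how many paths traverse it, and then lists the segments shared by all paths. This is not how Gratools works internally (it relies on its BAM and BED index); the sketch only reads GFA 1.0 `P` lines, treats each path as one individual, and uses hypothetical function names and a made-up toy graph.

```python
from collections import defaultdict


def segment_depths(gfa_text):
    """Return ({segment: number of paths traversing it}, number of paths)."""
    carriers = defaultdict(set)
    paths = set()
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "P" or len(fields) < 3:
            continue  # only path lines are used in this sketch
        path_name = fields[1]
        paths.add(path_name)
        for step in fields[2].split(","):
            if step:
                carriers[step[:-1]].add(path_name)  # drop the +/- orientation
    return {seg: len(names) for seg, names in carriers.items()}, len(paths)


def core_segments(depths, n_paths, min_fraction=1.0):
    """Segments present in at least `min_fraction` of the paths."""
    return sorted(seg for seg, d in depths.items() if d >= min_fraction * n_paths)


# Toy graph: s1 and s4 are shared by all three samples, s2 and s3 are not.
toy_gfa = "\n".join([
    "H\tVN:Z:1.0",
    "S\ts1\tACGT",
    "S\ts2\tG",
    "S\ts3\tT",
    "S\ts4\tCCA",
    "P\tsampleA\ts1+,s2+,s4+\t*",
    "P\tsampleB\ts1+,s3+,s4+\t*",
    "P\tsampleC\ts1+,s4+\t*",
])
depths, n_paths = segment_depths(toy_gfa)
core = core_segments(depths, n_paths)
```

Lowering `min_fraction` mimics a user-defined limit between core and dispensable segments; graphs that store haplotypes as `W` lines would need an additional parser.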

12:00

Etienne Bardet, Mariène Wan, Johann Confais and Hadi Quesneville

REPET v4.0: A Comprehensive Tool for Transposable Element Analysis

PRESENTER: Etienne Bardet

ABSTRACT. Transposable elements (TEs) play a major role in the structure and evolution of eukaryote genomes. Thanks to their ability to move around and to replicate within genomes, they are probably the most important contributors to genome plasticity. Their detection and annotation are now considered mandatory for any genome sequencing project. The REPET package [1, 2] integrates bioinformatics pipelines dedicated to detecting, annotating and analyzing TEs in genomic sequences. It includes two main pipelines: (i) TEdenovo, which searches for interspersed repeats, builds consensus sequences and classifies them according to TE features [3], and (ii) TEannot, which mines a genome with a library of TE sequences, for instance the one produced by the TEdenovo pipeline, to provide TE annotations. With the latest version, REPET has evolved beyond a simple downloadable archive that required manual dependency management. It is now a Snakemake pipeline, offering a clearer workflow, easier installation, and better compatibility with up-to-date tools. These improvements significantly reduce installation and configuration challenges, making REPET more accessible and efficient. In this demo, we will present the REPET strategy for identifying TEs, highlight the key improvements and new features introduced in the latest version, and demonstrate REPET on a small dataset. By the end of the session, attendees will have acquired the basic knowledge needed to detect and annotate TEs in genomes.

References

1. Flutre T, Duprat E, Feuillet C, Quesneville H (2011) Considering Transposable Element Diversification in De Novo Annotation Approaches. PLoS ONE 6(1): e16526. https://doi.org/10.1371/journal.pone.0016526

2. Quesneville H, Bergman CM, Andrieu O, Autard D, Nouaud D, Ashburner M, et al. (2005) Combined Evidence Annotation of Transposable Elements in Genome Sequences. PLoS Comput Biol 1(2): e22. https://doi.org/10.1371/journal.pcbi.0010022

3. Hoede C, Arnoux S, Moisset M, Chaumier T, Inizan O, Jamilloux V, et al. (2014) PASTEC: An Automatic Transposable Element Classification Tool. PLoS ONE 9(5): e91929. https://doi.org/10.1371/journal.pone.0091929

11:30-12:30 Session 15C: Poster session

#2 Mariène Wan, Françoise Alfama, Etienne Bardet, Johann Confais, Nicolas Francillonne, Christina Gacic, Vanita Haurheeram, Erik Kimmel, Najwa Lakmouri, Maud Marty, Célia Michotey, Cyril Pommier, Raphaël Flores, Michaël Alaux and Anne-Françoise Adam-Blandon "URGI – A scientific facility dedicated to plant bioinformatics"

#50 Gildas Le Corguillé, Anthony Bretaudeau, Bjoern Gruening and Bérénice Batut "Breaking Myths: The Reality of Galaxy’s Capabilities and Impact"

#65 Franck Samson and Sébastien Aubourg "GBOT upgrade : a loci comparison tool dedicated to the exploration of duplications and synteny"

#67 Maximilien Colange, Akpéli Nordor and Abdelkader Behdenna "A priori estimation of reproducibility odds informs the sizing of omic data cohorts"

#155 Lindsay Goulet, Michèle Tixier-Boichard, Alexandre Lecoeur, Marie-Noëlle Rossignol, Florence Valence, Victoria Chuat, Emile Chambellon, Emmanuelle Helloin, Samuel Mondy, Christian Morabito, Benoît Quinquis, Nicolas Pons, Florian Plaza Oñate, Anthony Venon, Carine Remoué, Adrien Falce, Michel-Yves Mistou and Mathieu Almeida "The HARMONI project: Evaluating advanced microbiota characterization methods for host and environmental samples using DNA metabarcoding and metagenomic sequencing"

#159 Ngoc Chau Pham, Delphine Naquin, Erwin Van Dijk, Céline Hernandez, Yan Jaszczyszyn, Magali Perrois and Claude Thermes "A High-Throughput Approach and Pipeline for Single-Bacterium Transcriptomics"

#162 Alexandre Mir, Benjamin Jouen, Patricia Homobono Brito de Moura, Sylvain Prigent, Pierre Pétriacq and Valérie Schurdi-Levraud "Assessing phenomic prediction in the plant species Stevia rebaudiana B."

#164 Clara Emery, Lucas Leclère, Eric Pelletier, Vincent Lefort, Yvan Le Bras and Erwan Corre "French Bioinformatics Institute’s Initiatives for Biodiversity Genomic Data Management"

#165 Nicolas Maurice, Claire Lemaitre, Riccardo Vicedomini and Clémence Frioux "Investigating taxonomy-based clustering of HiFi reads for de novo assembly of complex metagenomes"

#168 Sophia Marguerit, Marc Galland, Cervin Guyomar, Jacques Lagnel, Fabrice Legeai and Amandine Velt "gwas-pipe: a Nextflow pipeline for GWAS analyses incorporating quality control"

#169 Julie Segueni and Kevin Blighe "Identifying subtypes in neurological disease patients"

#180 Elea Pauliat, Paul Tissot, Clément Fraysse, Tristan Hillairet, Stéphane Delmotte, Romain Delunel, Vincent Lacroix, Caroline Leroux, Jérome Lejot, Romuald Marin, Christophe Blanchet, Damien de Vienne, François Mialhe, Dominique Guyot, Christine Oger, Laurence Josset, Jocelyn Turpin, Oldrich Navratil and Vincent Navratil "Virome@tlas, a digital platform for the virosphere surveillance"

#181 Fabien Kambu Mbuangi, Eugeni Belda, Idy Diop, Jean-Daniel Zucker and Edi Prifti "Interpretable Multi-Class Classification of the Microbiome Using Predomics"

#182 Helene Bret and Ingemar Andre "Deciphering codon choice: how deep learning models select between synonymous codons"

#183 Maxime Lecomte, Fabien Jourdan, Louison Fresnais, Kahina Abed, Mickael Le Balch, Romain Grall and Nathalie Poupin "A refined strategy linking transcriptomics and metabolic models for deciphering chemical induced changes"

#185 Victor Lefebvre, Sarah Djebali, Sylvain Foissac and Anamaria Necsulea "Evolutionary divergence of regulatory chromatin contacts following gene duplication"

#187 Nils Giordano, Marie Denoulet, Mia Cherkaoui, Elise Douillard, Magali Devic, Florence Magrangeas, Stéphane Minvielle and Éric Letouzé "Integrative Bulk and Single-Cell Multiomic Framework for Tracing (Sub)clonal Evolution in Multiple Myeloma"

#189 Paul Tissot, Elea Pauliat, Clément Fraysse, Tristan Hillairet, Stéphane Delmotte, Romain Delunel, Vincent Lacroix, Caroline Leroux, Jérôme Lejot, Romuald Marin, Christophe Blanchet, Damien M. de Vienne, Francois Mialhe, Dominique Guyot, Christine Oger, Laurence Josset, Jocelyn Turpin, Oldrich Navratil and Vincent Navratil "Virus–host–ecosystem studies at large-scale: Please check your sequence metadata !"

#192 Alicia Gouge, Rachel Onifarasoaniaina, Sébastien Jacques, Céline Méhats, Julie Guignot and Christophe Le Priol "Spatial transcriptomic analysis of neuroinflammation and brain dysfunction induced by bacterial meningitis in juvenile mice"

#193 Lea Meunier, Guillaume Appé, Maximilien Colange, Éléonore Fox, Lucas Hensen, Camille Marijon, Akpéli Nordor, Solène Weill and Abdelkader Behdenna "Data-Driven Discovery of Novel Antigen Targets: A Scalable Bioinformatics Pipeline"

#194 Philippe Bordron, Julien Touchais and Christine Gaspin "SnoBoard: surfing on ncRNA modifications and snoRNA guides"

#195 Matteo Bettiati, Philippe Nghe and Vaitea Opuu "Size optimization of RNA sequences in the RNA World context."

#196 Raynald De Lahondès, Louison Lesage, Vadim Puller, Fabien Kambu Mbuangi, Eugeni Belda, Jean-Daniel Zucker and Edi Prifti "Gpredomics: rapid, interpretable and accurate prediction models for compositional data"

#199 Rémi Planel and Juliette Bonche "GaaS: Galaxy as a Service"

#200 Antoine Toffano, Jérôme Aze and Pierre Larmande "Protein Function Prediction: Graph Neural Networks as Multi-modal Aggregators of Sequences, Networks, and Ontologies"

#201 Elisabeth Hellec, Gautier Richard, Séverine Hervé, Christelle Fablet, Stéphanie Bougeard, Sarah Thirioux, Céline Deblanc, Mathieu Andraud, Edouard Hirchaud, Pierrick Lucas, Roselyne Fonseca, Nicolas Barbier, Stéphane Gorin, Stéphane Quéguiner, Eric Eveno, Florent Eono, Gilles Poulain, Stéphane Kerphérique, Yannick Blanchard, Nicolas Rose and Gaëlle Simon "Characterization of swine influenza viruses infections in pig herds: A machine learning approach identifying key environmental, physiological, immunological and virological determinants"

#203 Maëla Sémery, Marianyela Petrizzelli, Sylvain Prigent, Mélisande Blein-Nicolas and Christine Dillmann "Integrating proteomics data into a genome-scale metabolic model to predict metabolic fluxes in maize leaf"

#206 Ezechiel B. Tibiri, Romaric K. Nanema, Justin Pita, Fidèle Tiendrebeogo and Christine Tranchant-Dubreuil "Co-building a Sustainable Bioinformatics Training Program: The International Certificate in Bioinformatics and Genomics in West Africa"

#207 Hugo Bellavoir, Anna-Sophie Fiston-Lavier, Sébastien Puechmaille, Sèverine Bérard, Malo Lorenzo, Laureline Sastre and Maxime Lambert "Identification and characterization of inversions’ content in Pseudogymnoascus destructans"

#208 Sasha Darmon, Arnaud Mary and Vincent Lacroix "Models and Methods for de novo Identification of Transposable Elements in short-read RNAseq data"

#210 Alexis Bourdais, Fabrice Legeai, Valentin Guyot and Mikhail Pooggin "A pipeline for Trans-kingdom small RNAs analysis"

#214 Pierre Gérenton, Jean Keller, Philippe Veber, Vincent Lacroix and Bastien Boussau "Methods for identifying associations between plant genomes and symbioses despite the presence of paralogues"

#215 Hélène Collinot, Maryline Favier, Rachel Onifarasoaniaina, Alicia Gouge, Djihane Djeridane, Isabelle Lagoutte, Sébastien Jacques, Daniel Vaiman, Céline Méhats and Christophe Le Priol "Recursivity improves spatial transcriptomics data clustering quality"

#220 Owen Griere, Vera Pancaldi and Matthieu Bernard "Using PhysiCell modelling to simulate the PDAC tumor microenvironment response to therapies based on patients’ spatial omics data"

#223 Aurore Besson, Thomas Derrien, Fabien Degalez, Olivier Godfroy, Sandrine Lagarrigue, Mark Cock, Ahmed Debit and Helena Cruz de Carvalho "SCANS: Assessing lncRNA conservation across species"

#224 Cathleen Mirande-Ney, Santiago Trueba, Yves Gibon and Sylvain Prigent "Predictive metabolomic to decipher plant eco-evolutive tendencies"

#225 Quentin Rott, Odile Lecompte and Laurence Choulier "Single-cell classification of breast cancer and identification of cell surface protein targets."

#227 Mallia Geiger, Victoria Meslier, Elisa Menozzi, Frederico Fierli, Marine Gilles, Kai-Yin Chau, Revi Shahar Golan, Alexandre Famechon, Sofia Koletsi, Nadine Loefflad, Selen Yalkic, Christian Morabito, Aymeric David, Benoît Quinquis, Nicolas Pons, Stanislav Dusko Ehrlich, Jane Macnaughtan, Mathieu Almeida and Anthony Hv Schapira "Exploring the relationship between GBA1 host genotype and gut microbiome in the GBA1L444P/WT mouse model: Implications for Parkinson disease pathogenesis"

#229 Julien Guidihounme, Simon de Givry and Benjamin Linard "Probabilistic motif search and differentials in pangenome graphs"

#233 Alexandre Flageul, Edouard Hirchaud, Céline Courtillon, Flora Carnet, Paul Brown, Beatrice Grasland and Fabrice Touzain "vvv2_align_SE, vvv2_align_PE / vvv2_display: Galaxy workflows and software to compute, summarize and visualize variant calling and annotations of a viral assembly"

#235 Alexis Mergez, Guillermina Hernandez-Raquet, Raphaël Mourad and Matthias Zytnicki "DLScaff: Deep Learning for Hi-C Assembly Correction and Scaffolding"

#236 Lucien Piat and Ludovic Duvaux "MSpangepop: Simulating complex structural variants under advanced demographic scenarios using the coalescent"

#237 Diego Kauer, Céline Hosteins, Marta Avalos, Laurence Delhaes and Raphaël Enaud "Predicting Non-Response to Cystic Fibrosis Modulators from Microbiota Using Standard and Custom Machine Learning"

#239 Emilie Montaut, Jean-Emmanuel Sarry and Vera Pancaldi "Multi cohort investigation of the AML tumour microenvironment in relation to mutational profiles"

#240 Baptiste Hennecart, Eugeni Belda, Florian Plaza-Oñate, Raynald de Lahondès, Jean-Daniel Zucker and Edi Prifti "Comparative Evaluation of Short-Read, Long-Read, and Hybrid Assemblies for MAG Recovery in Human Fecal Metagenomes"

#241 Meron Tsetargachew, Élodie Calvez, Maxime Lambert, David Couvin, Anubis Vega-Rua, Michael C. Fontaine, Anna-Sophie Fiston-Lavier and Emmanuelle Permal "Evolutionary genomics of the Aedes aegypti Mosquitoes in the West Indies: Impact of Transposable Elements in their Adaptation"

#242 Marcelo Hurtado, Vera Pancaldi, Leila Khajavi and Abdelmounim Essabbar "CellTFusion: A novel approach to unravel cell states via cell type deconvolution and TF activity estimated from bulk RNAseq data identifies cell niches potentially predictive of cancer progression"

#247 Zakaria Tougui, Philippe Cinquin, Nelle Varoquaux and Antoine Frenoy "Mapping the Human Small Intestinal Microbiome: A Statistical Learning Approach"

#253 Thomas Stosskopf, Mégane Boujeant, Anaïs Baudot, Christine Brun and Andreas Zanzoni "A computational workflow to map and evaluate the impact of human genetic variation on the interactome"

#255 Mourdas Mohamed and François Sabot "GraDex, a set of indexes for sequence graphs in GFA format"

#256 Pascal G P Martin, Xuhong Yu and Scott D Michaels "A workflow for the analysis of QuantSeq FWD data"

12:30-14:00 Lunch Break

14:00-15:00 Session 16: Keynote 5: Björn Grüning

From Genomes to Galaxies: Two Decades of Scalable, Open data analysis

Chair:

Jacques van Helden

Location: Main amphi

14:00

Björn Grüning

TBA

15:00-18:00 Session 17A: mini-symposium: Methods for Interfacing with Graphs of Genomic Sequences: novel pangenome paradigms