11:30 | Locating the source of forced oscillations in transmission power grids PRESENTER: Robin Delabays ABSTRACT. A forced oscillation event in a power grid refers to a state where malfunctioning or abnormally operating equipment causes persisting periodic disturbances in the system. While power grids are designed to damp most perturbations during standard operation, some perturbations can excite normal modes of the system and cause significant energy transfers, creating large oscillations thousands of miles away from the source. Localizing the source of such disturbances remains an outstanding challenge due to the limited knowledge of system parameters outside of the zone of responsibility of system operators. Here, we propose a new method for locating the source of forced oscillations which addresses this challenge by performing a simultaneous dynamic model identification using a principled maximum likelihood approach. We illustrate the validity of the algorithm on a variety of examples where forcing leads to resonance conditions in the system dynamics. Our results establish that accurate knowledge of system parameters is not required for successful inference of the source and frequency of a forced oscillation. We anticipate that our method will find broader application in general dynamical systems that are well described by their linearized dynamics over short periods of time. |
11:45 | A Framework for Structural Representation of Nodes in Signed Directed Networks PRESENTER: Shu Liu ABSTRACT. We propose a structural embedding framework for signed directed networks, which contain heterogeneous edges: positive and negative directed edges. Existing methods for signed directed networks generally focus on proximity similarity (grouping nodes based on edge attributes) and on fulfilling social psychological theories (balance and status theories). Separate from proximity similarity and social psychological theories, structural similarity captures the connection patterns of nodes and is crucial to network mining. However, few methods based on structural similarity have been proposed thus far. To address this gap, we propose an embedding framework based on structural similarity for signed directed networks, which consists of the following steps. Given a signed directed network, 1) we create the degree vector for each node and 2) calculate the Exponential Biased Euclidean Distance (EBED) as the similarity between two nodes. EBED is a novel distance function we propose that leverages the nature of complex networks (i.e., scale-free and small-world). 3) Inspired by struc2vec, we calculate the multi-scale EBED for each node pair by applying Dynamic Time Warping to its neighbors to construct a weighted multi-layer network. 4) We apply random walks to generate sequences and obtain the embeddings with SkipGram. We conducted experiments on five topology networks and confirmed the results through visualization. We also performed intensive experiments on three real-network datasets for three prediction tasks (edge sign, edge direction, and node degree). The results demonstrate that the proposed method outperforms existing methods in all cases. Our method also achieves considerable scalability improvements when optimization strategies are used. Future challenges are to refine the optimization strategy based on the network's features and to apply the proposed framework to real-world domains. |
12:00 | Inferring missing edges in a graph from observed collective patterns PRESENTER: Selim Haj Ali ABSTRACT. Network inference methods have been developed to estimate missing links, or even whole networks, from available data. Most of them evaluate pairwise relationships between nodes. Here we formulate a new paradigm of network inference that treats the data as self-organized collective patterns. We illustrate this approach, summarized in Figure 1, for the case of patterns emerging in reaction-diffusion systems on graphs, where collective behaviors can be associated with eigenvectors of the network’s Laplacian matrix [1]. Our method combines a truncated spectral decomposition of the network’s Laplacian matrix with eigenvalue assignment obtained by matching the patterns to the eigenvectors of the incomplete graph. For illustration of our approach, we consider a Gierer-Meinhardt reaction-diffusion system [3] and the parameter selection scheme from [4]. We then infer missing links using emergent Turing patterns either as (i) collective modes of the network or (ii) standard pairwise correlations. Results are detailed in Figure 2. We show that knowledge of a few collective patterns can allow the prediction of missing edges and that this result holds across a range of network architectures, provided patterns are ’global’ enough to contain network-wide information. Our framework can be generalized to other types of self-organized collective patterns and corresponding topological indicators, beyond the case of Turing patterns approximated by the graph Laplacian eigenvectors presented in [2], which will constitute the subject of our talk. |
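To make the eigenvector-matching idea concrete, here is a minimal Python sketch (function names, the scoring rule and the toy setup are illustrative assumptions, not the authors' code): observed patterns are compared against the Laplacian eigenvectors of the incomplete graph, and candidate edges are ranked by how much adding them improves the match.

```python
import numpy as np
import networkx as nx

def laplacian_eigvecs(G):
    L = nx.laplacian_matrix(G).toarray().astype(float)
    vals, vecs = np.linalg.eigh(L)
    return vals, vecs

def pattern_mismatch(G, patterns):
    """Sum of squared residuals after projecting each observed pattern
    onto its best-matching Laplacian eigenvector of G."""
    _, vecs = laplacian_eigvecs(G)
    total = 0.0
    for p in patterns:
        p = p / np.linalg.norm(p)
        overlaps = np.abs(vecs.T @ p)        # cosine overlap with each eigenvector
        total += 1.0 - overlaps.max() ** 2
    return total

def score_candidate_edges(G_incomplete, patterns):
    """Rank non-edges by how much adding them reduces the mismatch."""
    base = pattern_mismatch(G_incomplete, patterns)
    scores = {}
    for u, v in nx.non_edges(G_incomplete):
        H = G_incomplete.copy()
        H.add_edge(u, v)
        scores[(u, v)] = base - pattern_mismatch(H, patterns)
    return sorted(scores.items(), key=lambda kv: -kv[1])

# toy usage: remove one edge from a known graph and try to recover it
G_true = nx.random_regular_graph(4, 20, seed=1)
_, vecs_true = laplacian_eigvecs(G_true)
patterns = [vecs_true[:, k] for k in (1, 2, 3)]   # stand-ins for observed patterns
G_obs = G_true.copy()
missing = list(G_true.edges())[0]
G_obs.remove_edge(*missing)
print("top candidate:", score_candidate_edges(G_obs, patterns)[0][0], "missing:", missing)
```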
12:15 | Explaining the Explainers in Graph Neural Networks: a Comparative Study PRESENTER: Antonio Longa ABSTRACT. Following a fast initial breakthrough in graph-based learning, Graph Neural Networks (GNNs) have reached widespread application in many science and engineering fields, prompting the need for methods to understand their decision process. GNN explainers have started to emerge in recent years, with a multitude of methods both novel and adapted from other domains. To sort out this plethora of alternative approaches, several studies have benchmarked the performance of different explainers in terms of various explainability metrics. However, these earlier works make no attempt at providing insights into why different GNN architectures are more or less explainable, or into which explainer should be preferred in a given setting. In our work, we fill these gaps by devising a systematic experimental study, which tests ten explainers on eight representative architectures trained on six carefully designed graph and node classification datasets. With our results we provide key insights on the choice and applicability of GNN explainers, we isolate the key components that make them usable and successful, and we provide recommendations on how to avoid common interpretation pitfalls. We conclude by highlighting open questions and directions of possible future research. |
12:30 | Reconstruction performance of the stochastic block model in empirical networks PRESENTER: Felipe Vaca ABSTRACT. We assess the performance of the stochastic block model (SBM) in reconstructing 248 empirical networks spanning several domains and orders of magnitude in size. We simulate a noisy measurement process and evaluate the model's ability to recover various descriptors of the network structure. We observe that the SBM yields accurate estimates for most networks in the corpus, but this behaviour is not ubiquitous. In particular, we mostly observe large reconstruction errors in networks with large diameter and slow-mixing random walks --- corresponding typically to networks embedded in space. Contrary to what is often assumed, the SBM is able to provide accurate estimates for networks with a high abundance of triangles. We also demonstrate that incorporating a more detailed error assessment during measurement tends to improve the quality of the reconstruction. |
12:45 | Decision-making in uncertain networks via dynamic importance PRESENTER: Erik Weis ABSTRACT. Finding nodes with outsized dynamical influence is a perennial problem in network science. Such information can be used to design interventions that steer dynamics toward desirable outcomes, with applications in contexts such as infectious disease epidemiology (e.g., optimal placement of epidemiological sentinels or deployment of vaccines) or viral marketing (influence maximization). While standard formulations of dynamic importance assume complete information about the network, real-world scenarios require that we infer the network from noisy, incomplete, and often biased data. Though biased structural noise has been shown to considerably impact structural importance, the impact of structural noise on dynamic importance has been studied only in the context of random noise. In this presentation, we formulate a Bayesian decision-theoretic framework for optimizing interventions under general notions of structural and dynamical model uncertainty. As an example, we consider independent cascade dynamics under three distinct but related dynamical importance problems: targeted immunization, epidemic surveillance, and influence maximization. More specifically, we select a set of active nodes that are either infected, immunized, or surveilled. We explore how noisy structural data impact the quality of the decision-making procedures. Additionally, we characterize the ways in which differing notions of dynamic importance deviate from each other, as well as from structural heuristics (such as centrality measures), under noisy conditions. |
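As a rough illustration of evaluating interventions under network uncertainty, the sketch below (an assumption-laden toy, not the authors' framework) averages independent-cascade spread over an ensemble of plausible networks that stand in for posterior samples.

```python
import random
import networkx as nx

def independent_cascade(G, seeds, p=0.1, rng=random):
    """Single run of the independent cascade model; returns the activated set."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            neighbors = G.successors(u) if G.is_directed() else G.neighbors(u)
            for v in neighbors:
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active

def expected_spread(network_samples, seeds, p=0.1, runs=50):
    """Average spread over an ensemble of plausible networks (posterior stand-ins)."""
    rng = random.Random(0)
    total = 0
    for G in network_samples:
        for _ in range(runs):
            total += len(independent_cascade(G, seeds, p, rng))
    return total / (len(network_samples) * runs)

# toy usage: ensemble of noisy observations of one underlying graph
G0 = nx.erdos_renyi_graph(100, 0.05, seed=2)
samples = [nx.double_edge_swap(G0.copy(), nswap=20, max_tries=1000, seed=s) for s in range(5)]
print(expected_spread(samples, seeds=[0, 1, 2]))
```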
11:30 | Physics and complexity of interdependent networks PRESENTER: Ivan Bonamassa ABSTRACT. Despite the large efforts made over the last decade in harnessing the effects that interdependent interactions have on the mutual functioning of interacting macro-systems, understanding their physics principles is a fundamental challenge that has remained out of reach for existing approaches. In this talk, I will present a brief introduction to the concept of interdependent couplings, starting from their first appearance in the framework of percolation theory and their modeling in the presence of dynamical processes, tracing a roadmap for the physics of interdependent systems. Following this route, we will discuss the case of interdependent Ising networks and show that dependency links therein can be rigorously mapped to adaptive thermal couplings or to directed higher-order interactions, depending on the time scales governing the inter- and intra-layer processes. In so doing, we will show that the ground state of randomly interdependent Hamiltonians is governed by the very same equations characterizing percolation cascades in randomly interdependent networks. Following this correspondence, we will identify the structural free energy of percolation cascades by mapping the latter onto the so-called random k-xorsat, a paradigmatic class of constraint satisfaction problems. We will discuss some implications that this surprising mapping has for the study of computational complexity as well as for the analysis of the structural metastability of multilayer networks, offering perspectives of cross-fertilization between both areas of research. Finally, inspired by the theoretical understanding gained in spin networks, we will conclude the talk by presenting our recent experimental realization of interdependent networks as thermally coupled disordered superconductors. We will show experimental and theoretical results characterizing the emergence of discontinuous superconductor-metal transitions with mutual hysteretic cycles and of the underlying microscopic cascading processes, which physically realize interdependent percolation and generalize it beyond structural dismantling. We will mention practical avenues for the development of technologies based on interdependent network materials, while raising the prospect of reigniting the back-and-forth feedback between abstract theory and experimental physics that can drive new discoveries, challenges and frontiers. |
11:45 | On the location and the strength of controllers to desynchronize coupled Kuramoto oscillators PRESENTER: Martin Moriamé ABSTRACT. [See added file MoriameCarlettiAstract.pdf to have the complete abstract page with the mandatory captioned figure] Synchronization is a widespread phenomenon in Nature, in particular in the living kingdom, where interacting subsystems, be they biochemical cycles, cells or organs, are able to synchronize their behaviours to eventually exhibit unison rhythms. Brain activity can also be seen as a succession of synchronized states of subgroups of cerebral regions separated by unsynchronized resting states. The main challenge for the brain is to suitably control the succession of these states. There are however cases where living beings are unable to perform such a control task correctly and hyper-synchronization occurs. A striking example of such malfunctioning are epilepsy seizures. Therefore, interest has been placed in pinning control methods able to reduce global synchronization. For example, Asllani et al. (2018) developed a Hamiltonian-based control term acting on the paradigmatic Kuramoto model of coupled oscillators, aimed at reducing synchronization while being minimally invasive. Here we perform a numerical study of this model in order to determine the best subset of nodes to choose as pinned nodes to achieve the lowest possible synchronization state. We compared several centrality scores (degree, functionability (see Rosell-Tarrago & Diaz-Guilera 2020) and betweenness) and a random selection on several network topologies (scale-free, core-periphery and small-world networks). The results, an example of which is shown in Figure 1, indicate that among all the strategies used, degree-based selection gives rise to the best control efficiency, i.e. a low synchronization rate achieved while still having a small number of controllers with a small injected signal intensity. This property is far more pronounced when the degree distribution is heterogeneous (in scale-free networks the best strategy is to control the hubs, and in core-periphery networks the best way to desynchronize the system is to pin core nodes) but is still quite observable in the opposite case. This strategy is therefore the best if one wants to efficiently control the system with the smallest possible number of controllers, i.e. to be minimally invasive. However, when the number of controllers increases too much, this degree-based selection becomes less efficient. A second observation is the importance of the spreading of the pinned nodes across the network. Indeed, it seems that the best subset of controllers minimizes the shortest-path distance between them and the other nodes. A further interesting observation is that our results share several traits with previous works on pinning control aiming to improve synchronization, despite the opposite goal and the difference of context (those systems have linear coupling, while we used a system with non-linear coupling and control term). It seems, then, that there is a certain universality in the best pinning control strategy: the most influential nodes are globally the same, independently of the actual goal of the controller. |
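A hedged numerical sketch of the comparison described above: a Kuramoto model with a simple desynchronizing feedback applied to a pinned set, chosen either by degree or at random. The control term, parameters and networks here are illustrative stand-ins, not the Hamiltonian-based term of Asllani et al. (2018).

```python
import numpy as np
import networkx as nx

def kuramoto_order(G, pinned, K=1.0, gain=2.0, T=50.0, dt=0.01, seed=0):
    """Integrate a Kuramoto model with a simple desynchronizing feedback on the
    pinned nodes; return the time-averaged order parameter r (illustrative only)."""
    rng = np.random.default_rng(seed)
    A = nx.to_numpy_array(G)
    n = len(G)
    omega = rng.normal(0, 0.1, n)
    theta = rng.uniform(0, 2 * np.pi, n)
    mask = np.zeros(n)
    mask[list(pinned)] = 1.0
    rs = []
    for step in range(int(T / dt)):
        diff = theta[None, :] - theta[:, None]          # theta_j - theta_i
        coupling = K * (A * np.sin(diff)).sum(axis=1)
        z = np.exp(1j * theta).mean()                   # mean field
        # pinned nodes are pushed *against* the mean-field phase (toy control)
        control = -gain * mask * np.sin(np.angle(z) - theta)
        theta += dt * (omega + coupling + control)
        if step * dt > T / 2:
            rs.append(abs(np.exp(1j * theta).mean()))
    return float(np.mean(rs))

G = nx.barabasi_albert_graph(100, 2, seed=1)
k = 10
by_degree = sorted(G, key=G.degree, reverse=True)[:k]
at_random = list(np.random.default_rng(1).choice(100, k, replace=False))
print("degree-pinned r:", kuramoto_order(G, by_degree))
print("random-pinned r:", kuramoto_order(G, at_random))
```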
12:00 | Low-dimension network controllability PRESENTER: Remy Ben Messaoud ABSTRACT. Network controllability is a powerful theoretical framework to identify the driver nodes capable of steering the activity of the network [1]. However, in practice, it often suffers from numerical imprecision due to well-known ill-posedness conditions that affect the estimate of the control signals [2]. Reducing the problem complexity is a backdoor strategy for many real situations, e.g., when only one node can be stimulated at a time. This can be done by focusing on specific target components of the network [3], as well as by controlling the corresponding average activity state [4]. By combining recent advances in output controllability and graph signal processing, we introduce an alternative framework that extends the above-mentioned functionalities. Instead of controlling the states of all the nodes, we control their projections onto the low-dimension eigenmaps resulting from the spectral decomposition of the network Laplacian (Fig.a). We show with extensive simulations on hierarchical modular networks that there is a significant precision improvement when controlling the low-dimension eigenmaps, as compared to standard approaches that only consider the average state of the modules (Fig.b-c). To experimentally validate our approach, we consider the human brain network and show that low-dimension control of the first eigenmaps (up to 6) of the default mode network (DMN) gives more biologically plausible [5] drivers than controlling the original states of all the DMN nodes or their average (Fig.d). In conclusion, our framework provides a novel strategy to improve the significance of network controllability in practical situations by leveraging the network structure and reducing the problem's dimensionality. References 1. Pasqualetti, F., Zampieri, S. & Bullo, F. Controllability metrics, limitations and algorithms for complex networks. in 2014 American Control Conference 3287–3292 (2014). doi:10.1109/ACC.2014.6858621. 2. Sun, J. & Motter, A. E. Controllability Transition and Nonlocality in Network Control. Phys. Rev. Lett. 110, 208701 (2013). 3. Gao, J., Liu, Y.-Y., D’Souza, R. M. & Barabási, A.-L. Target control of complex networks. Nat. Commun. 5, 5415 (2014). 4. Casadei, G., Canudas de Wit, C. & Zampieri, S. Model Reduction Based Approximation of the Output Controllability Gramian in Large-Scale Networks. IEEE Trans. Control Netw. Syst. (2020) doi:10.1109/TCNS.2020.3000694. 5. Uddin, L. Q., Clare Kelly, A. M., Biswal, B. B., Xavier Castellanos, F. & Milham, M. P. Functional connectivity of default mode network components: Correlation, anticorrelation, and causality. Hum. Brain Mapp. 30, 625–637 (2008). |
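A minimal sketch of the kind of computation involved, under stated assumptions that are not taken from the paper: linear dynamics x' = -Lx with a small leak term added so the infinite-horizon Gramian exists, a single driver node, and a readout given by the first Laplacian eigenmaps.

```python
import numpy as np
import networkx as nx
from scipy.linalg import solve_continuous_lyapunov, eigh

def output_gramian_metric(G, driver, k=4, leak=0.05):
    """Smallest eigenvalue of the output controllability Gramian for steering the
    projection of the state onto the first k Laplacian eigenmaps from one driver."""
    L = nx.laplacian_matrix(G).toarray().astype(float)
    n = len(G)
    A = -L - leak * np.eye(n)              # leak term so A is Hurwitz (assumption)
    B = np.zeros((n, 1)); B[driver, 0] = 1.0
    _, V = eigh(L)
    C = V[:, 1:k + 1].T                    # low-dimension eigenmap readout
    W = solve_continuous_lyapunov(A, -B @ B.T)   # A W + W A^T = -B B^T
    Wout = C @ W @ C.T                     # output controllability Gramian
    return np.linalg.eigvalsh(Wout).min()

G = nx.connected_watts_strogatz_graph(60, 4, 0.1, seed=0)
best = max(G, key=lambda i: output_gramian_metric(G, i))
print("best single driver (toy metric):", best)
```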
12:15 | Canalization and entropy improve prediction of disorder in Boolean network dynamics PRESENTER: Jordan Rozum ABSTRACT. Biomolecular network dynamics are thought to operate near the critical boundary between ordered and disordered regimes, where large perturbations to a small set of elements neither die out nor spread on average. Theories about the dynamical regime of Boolean automata networks were originally considered in the thermodynamic limit (N approaching infinity) in random homogeneous networks. However, degree heterogeneity, finite-size effects, and redundancy are important, especially in experimentally derived models of biochemical regulation. A biomolecular automaton (e.g., gene, protein) typically has high regulatory redundancy, where a small subset of regulators determines activation via collective canalization. Previous work has shown that effective connectivity, a measure of collective canalization, leads to improved dynamical regime prediction for Boolean networks with homogeneous in-degree distribution [2]. We expand on this by considering i) random Boolean networks (RBNs) with heterogeneous in-degree distributions, ii) additional experimentally derived models of biomolecular processes, and iii) new measures of heterogeneity in automata update rules. We find that effective connectivity improves dynamical regime prediction in all of the models considered (89% accuracy and AUPRC=0.94 vs 69% accuracy and AUPRC=0.55 for the traditional approach). In RBNs, combining effective connectivity with bias entropy dramatically improves the prediction of disorder, as measured by the Derrida coefficient (our regression error is less than half that of the traditional approach). The strong prediction performance demonstrates that regulatory redundancy (canalization) is a major factor in determining the dynamical regime of biochemical networks. Our work yields a new understanding of criticality in biomolecular networks that accounts for collective canalization, redundancy, and heterogeneity in the connectivity and logic of their Boolean network models. The details of this work can be found in [1]. [1] FX Costa, JC Rozum, AM Marcus, and LM Rocha [2023]. "Effective connectivity and bias entropy improve prediction of dynamical regime in automata networks." Entropy. In Press. [2] Manicka, S., M. Marques-Pita, and L.M. Rocha [2022]. "Effective connectivity determines the critical dynamics of biochemical networks". J. Royal Society Interface. 19(186):20210659. DOI: 10.1098/rsif.2021.0659. |
12:30 | Control and observation of target nodes in large-scale networks PRESENTER: Arthur Montanari ABSTRACT. Controllability and observability establish the minimal conditions for the complete control and estimation of the internal state of a dynamical system. In the context of large-scale complex systems, such as power grids, neuronal networks, and food webs, high dimensionality poses physical and cost constraints on actuator and sensor placement, limiting our ability to make a network controllable or observable. Noting that often only a relatively small number of state variables are essential for intervention and monitoring purposes in large networks, the generalized notions of target controllability and functional observability have been introduced in network science and control theory. A system is functionally observable (target controllable) when a targeted subset of state variables can be reconstructed (steered) using the available measurement (control) signals. Here, we show how these concepts can be generalized to explore the underlying graph of large-scale network systems and to apply to nonlinear dynamics. In particular, we establish a duality relation between target controllability and functional observability, which—unlike their classical counterparts—does not hold immediately for all systems (Fig. 1A). The developed theory has immediate applications to actuator and sensor placement problems (Fig. 1B) as well as to the co-design of controllers and observers for target control and estimation (Fig. 1C). Our methods are applied to cyber-attack detection in power grids, monitoring of the COVID-19 pandemic, and early warning of seizures, demonstrating that the proposed approach can achieve accurate estimation with substantially fewer resources. |
12:45 | Synchronization-induced Taylor’s law of a coupled food chain model on networks PRESENTER: Yuzuru Mitsui ABSTRACT. Taylor's law (TL) is a power-law relationship between the mean and the variance [1]. TL is sometimes called fluctuation scaling. There are two major types of TL, the temporal TL and the spatial TL. For the temporal TL, the mean and variance are computed over time, and for the spatial TL, the mean and variance are computed over space. TL has been observed in various fields, such as ecology, biology, and network science [2, 3]. However, several problems with TL remain unsolved. (i) Why is it observed so widely, especially in ecosystems? (ii) Although it has been proven that the exponent of TL can take any value, why does this exponent often take values around 2 in ecosystems? (iii) Are the temporal and spatial TLs governed by the same mechanism? We found analytically and numerically that synchronization, a phenomenon as widely observed in ecosystems as TL itself, can induce both the temporal and spatial TLs with exponent 2 on complete graphs [4]. However, we need to extend our theory because actual ecosystems are not complete graphs but have some network structure. In this presentation, we discuss network structures for which our theory is valid and those for which it is not, taking the random graph and the one-dimensional lattice as examples. Figure 1 shows examples of synchronization-induced TL of the coupled food chain model on the random graph. When the time series are not synchronized (Figure 1(a)), the temporal and spatial TLs are not observed (Figure 1(b, c)). On the other hand, when the time series are synchronized (Figure 1(d)), the temporal and spatial TLs are clearly observed (Figure 1(e, f)). The theory presented here is a generalization of our previous theory. [1] L. R. Taylor, Aggregation, variance and the mean, Nature 189(4766), 732-735 (1961). [2] Z. Eisler, I. Bartos, and J. Kertész, Fluctuation scaling in complex systems: Taylor's law and beyond, Advances in Physics 57(1), 89–142 (2008). [3] R. A. J. Taylor, Taylor's Power Law: Order and Pattern in Nature (Academic Press, 2019). [4] Y. Mitsui and H. Kori, Temporal and spatial Taylor's law induced by synchronization of periodic and chaotic oscillators, APS March Meeting 2023. |
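The mean-variance fit behind TL is straightforward to reproduce. The sketch below estimates the temporal and spatial TL exponents from a matrix of toy, fully synchronized time series (an illustrative setup, not the food chain model of the talk) and recovers exponents close to 2.

```python
import numpy as np

def taylor_exponents(X):
    """Fit var = a * mean^b on log-log scale.
    X has shape (time, patches): the temporal TL uses statistics over time for each
    patch, the spatial TL uses statistics over patches at each time step."""
    mean_t, var_t = X.mean(axis=0), X.var(axis=0)      # temporal TL
    mean_s, var_s = X.mean(axis=1), X.var(axis=1)      # spatial TL
    b_t = np.polyfit(np.log(mean_t), np.log(var_t), 1)[0]
    b_s = np.polyfit(np.log(mean_s), np.log(var_s), 1)[0]
    return b_t, b_s

# toy example: fully synchronized patches differing only by a scale factor
rng = np.random.default_rng(0)
common = rng.lognormal(0, 0.5, size=500)          # shared (synchronized) signal
scales = rng.uniform(0.5, 5.0, size=40)           # patch-specific amplitudes
X_sync = np.outer(common, scales)
print(taylor_exponents(X_sync))                   # both exponents close to 2
```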
11:30 | Strong Connectivity and Influence in Real Directed Networks PRESENTER: Niall Rodgers ABSTRACT. I present a series of recent results relating to the structure and dynamics of directed networks, building on the technique of Trophic Analysis, which measures the hierarchical ordering and global directionality of a directed network while being simple to calculate and interpret. Firstly, we tackle the problem of predicting strong connectivity in directed networks. In many real directed networks, the strongly connected component of nodes which are mutually reachable is very small. This does not fit with current theory, based on random graphs, according to which strong connectivity depends on mean degree and degree-degree correlations. And it has important implications for other properties of real networks and the dynamical behaviour of many complex systems. We find that strong connectivity depends crucially on the extent to which the network has an overall direction or hierarchical ordering – a property measured by trophic coherence. Using percolation theory, we find the critical point separating the weakly and strongly connected regimes, and confirm our results on many real-world networks, including ecological, neural, trade and social networks. We show that the connectivity structure can be disrupted with minimal effort by a targeted attack on edges which run counter to the overall direction. We also link to recent results which show how the notions of network influence and influenceability can be understood through the lens of Trophic Analysis. |
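For readers unfamiliar with Trophic Analysis, the sketch below computes trophic levels and the associated incoherence using one commonly used formulation (Lambda h = v with Lambda = diag(k_in + k_out) - A - A^T); treating this as the formulation meant here is an assumption, and the toy graph is illustrative.

```python
import numpy as np
import networkx as nx

def trophic_levels_and_incoherence(G):
    """Trophic levels h and incoherence F0 (assumed formulation:
    Lambda h = v, Lambda = diag(k_in + k_out) - A - A^T, v = k_in - k_out;
    F0 = weighted mean of (h_j - h_i - 1)^2 over edges i -> j)."""
    A = nx.to_numpy_array(G)            # A[i, j] = weight of edge i -> j
    k_in, k_out = A.sum(axis=0), A.sum(axis=1)
    Lam = np.diag(k_in + k_out) - A - A.T
    v = k_in - k_out
    h, *_ = np.linalg.lstsq(Lam, v, rcond=None)   # levels defined up to a constant
    h -= h.min()
    i, j = np.nonzero(A)
    F0 = (A[i, j] * (h[j] - h[i] - 1.0) ** 2).sum() / A.sum()
    return h, F0

G = nx.gnp_random_graph(50, 0.08, directed=True, seed=3)
h, F0 = trophic_levels_and_incoherence(G)
print("trophic incoherence F0 =", round(F0, 3))   # 0 = perfectly coherent hierarchy
```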
11:45 | Emergent stability in complex network dynamics PRESENTER: Chandrakala Meena ABSTRACT. The stable functionality of networked systems is a hallmark of their natural ability to coordinate between their multiple interacting components. Yet, real-world networks often appear random and highly irregular, raising the question of what are the naturally emerging organizing principles of complex system stability. The answer is encoded within the system’s stability matrix — the Jacobian — but is hard to retrieve due to the scale and diversity of the relevant systems, their broad parameter space, and their nonlinear interaction dynamics. Here, we introduce the dynamic Jacobian ensemble, which allows us to systematically investigate the fixed-point dynamics of a range of relevant network-based models. Within this ensemble, we find that complex systems exhibit discrete stability classes. These range from asymptotically unstable, where stability is unattainable, to sensitive, in which stability abides within a bounded range of the system’s parameters. Alongside these two classes, we uncover a third, asymptotically stable class, in which a sufficiently large and heterogeneous network acquires guaranteed stability, independent of its microscopic parameters and of external perturbations. Hence, in this ensemble, two of the most ubiquitous characteristics of real-world networks - scale and heterogeneity - emerge as natural organizing principles to ensure fixed-point stability in the face of changing environmental conditions. |
12:00 | Color-avoiding connected spanning subgraphs with minimum number of edges PRESENTER: József Pintér ABSTRACT. The robustness of networks against random errors and targeted attacks has attracted a great deal of research interest. The robustness of a network refers to its capacity to maintain some degree of connectivity after the removal of some edges or vertices. On the other hand, little work has been done for the case when the attack-tolerances of the vertices or edges are not independent but certain classes of vertices or edges share a mutual vulnerability. The shared vulnerabilities can be modeled by assigning a color to each class, which may represent a shared eavesdropper, a controlling entity or correlated failures. In the traditional approach, a single path provides connectivity; here, however, connectivity corresponds to the ability to avoid all vulnerable sets of vertices or edges via multiple paths, such that no single color is needed on every path. Formally, we say that an edge-colored graph is edge-color-avoiding connected if, after the removal of the edges of any single color, the graph remains connected; similar concepts can be introduced for vertex-colored graphs as well. The first articles introducing this concept analyzed how the color frequencies affect the robustness of the networks: they found that the colors with the largest frequencies largely control the robustness of the network, while colors of small frequency play only a little role. The study of color-avoiding percolation was continued by Giusfredi and Bagnoli in diluted lattices, and then with mathematical rigor in Erdős–Rényi random graphs by Ráth et al., Lichev and Schapira, and Lichev. From a computational complexity point of view, color-avoiding connectivity was also studied by Molontay and Varga. In this work, we study the problem of finding a color-avoiding connected spanning subgraph with a minimum number of edges. From a practical point of view, this problem emerges when we want to minimize the maintenance costs of the network by removing some of its edges while preserving color-avoiding connectivity. First, we show that this problem is NP-hard, then we present some polynomial-time approximation algorithms for it. Finally, we illustrate the utility of the algorithms on transportation and infrastructure networks. |
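Checking edge-color-avoiding connectivity is simple to state in code. The sketch below verifies the property and then greedily drops edges while preserving it; the greedy pruning is a naive illustration, not one of the approximation algorithms from the paper.

```python
import random
import networkx as nx

def is_edge_color_avoiding_connected(G, color_attr="color"):
    """True iff removing all edges of any single color leaves G connected."""
    colors = {d[color_attr] for _, _, d in G.edges(data=True)}
    for c in colors:
        H = G.copy()
        H.remove_edges_from([(u, v) for u, v, d in G.edges(data=True)
                             if d[color_attr] == c])
        if not nx.is_connected(H):
            return False
    return True

def greedy_sparsify(G, color_attr="color"):
    """Naive heuristic (illustrative, not the paper's algorithm): try removing
    edges one by one, keeping only removals that preserve the property."""
    H = G.copy()
    for u, v in sorted(G.edges()):
        c = H[u][v][color_attr]
        H.remove_edge(u, v)
        if not is_edge_color_avoiding_connected(H, color_attr):
            H.add_edge(u, v, **{color_attr: c})
    return H

random.seed(0)
G = nx.erdos_renyi_graph(30, 0.3, seed=1)
nx.set_edge_attributes(G, {e: random.choice("rgb") for e in G.edges()}, "color")
if is_edge_color_avoiding_connected(G):
    H = greedy_sparsify(G)
    print(G.number_of_edges(), "->", H.number_of_edges())
```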
12:15 | Extended-range percolation in complex networks PRESENTER: Lorenzo Cirigliano ABSTRACT. Classical percolation theory underlies many processes of information transfer along the links of a network. In these standard situations, the requirement for two nodes to be able to communicate is the presence of at least one uninterrupted path of nodes between them. In a variety of more recent data transmission protocols, such as the communication of noisy data via error-correcting repeaters, both in classical and quantum networks, the requirement of an uninterrupted path is too strict: two nodes may be able to communicate even if all paths between them have interruptions/gaps consisting of nodes that may corrupt the message. We propose a general model of extended-range percolation on complex networks, aiming to provide a mathematical basis for information transmission involving path interruptions. We obtain exact results for infinite random uncorrelated networks with arbitrary degree distribution, using the generating functions formalism. We also present an efficient message-passing formulation of the theory that works well in finite real-world networks. The interplay of the extended range and heterogeneity leads to novel critical behavior in scale-free networks. |
12:30 | Robustness and volume exclusion in physical networks PRESENTER: Luka Blagojevic ABSTRACT. Physical networks are networks composed of volume-occupying objects embedded in three-dimensional space. For example, a biological neural network is composed of neurons that are physical objects, which also have a corresponding connectome network that encodes their synaptic connections. Due to technological advances, data describing the full three-dimensional structure of physical networks is becoming increasingly available, providing an opportunity to ask fundamental questions about the relationship between physical and network structure. Recent work hypothesized that volume exclusion (the fact that physical nodes and links cannot overlap) and spatial embeddedness are the key properties that affect the structure and evolution of physical networks. Building upon these ideas, we propose methods to quantify how strongly a link is affected by volume exclusion and how robust it is to physical damage. Specifically, we investigate the connection between physicality and network structure in empirical data, which includes individual neurons, neural networks, vascular networks, plant roots, molecular networks in the mitochondria, and an imprint of an anthill. In most physical networks, a link is a tube-like object that connects two nodes through a non-straight path, such as neural branches in the brain or vessels in a vascular network. By definition, a physical network link does not overlap with other components of the network. Intuitively, if volume exclusion does not play a role in the formation of a link, an alternative link following a statistically similar trajectory would also avoid intersections by chance. Therefore, to quantify how strongly a physical link is affected by volume exclusion, we randomly re-shuffle segments of the link while keeping the rest of the network fixed and measure the number of intersections created (Fig. 1A). The results show that the links of the neural network are the most affected by volume exclusion and those of the mitochondrial network the least, as quantified by their intersection counts. Furthermore, we observe that the distribution of link volume exclusion is heterogeneous, indicating that some parts of the network are strongly confined in space, while others are largely unaffected by physical constraints. Additionally, we found a positive correlation between link centralities and volume exclusion measures, indicating that the three-dimensional shape and network structure of some networks are intertwined. To investigate the robustness of physical networks, we simulated two types of physical damage - random damage and spatially correlated physical damage, the latter inspired by empirical phenomena. The impact of the damage is displayed quantitatively with 3D layouts that are associated with regions in space (Fig. 1B). Our results show that tree-like physical networks (like the roots of a plant) are more vulnerable than biological neural networks, which are robust to all types of damage. In conclusion, our results indicate three important findings: i) they provide evidence that volume exclusion plays a role in the growth of real physical networks, ii) they show that physical and network structures are strongly connected, and iii) they show that 3D layouts can reveal vulnerabilities of physical networks to spatially correlated damage. |
12:45 | Assessing the risk of infrastructural networks to natural catastrophes PRESENTER: Tomas Scagliarini ABSTRACT. Networks of physical infrastructures such as power grids, water supplies and air transportation play a central role in modern societies. Natural disasters like storms and earthquakes often result in the failure of several nodes or links, leading to the disruption of large parts of these networks and causing huge economic losses and great distress in the general population. Here we address the problem of studying the vulnerability of a network to external shocks caused by natural disasters, such as earthquakes and tropical storms. In this work, we build a risk map for the US power grid and air transportation networks, associating with each node a risk coefficient representing the probability that the failure of that node triggers the failure of the entire network. |
11:30 | Analysing tripartite, manufacturer-supplier-product networks: the case-study of the automotive sector PRESENTER: Massimiliano Fessina ABSTRACT. Over the last twenty years, the growth of network science has impacted several disciplines by establishing new empirical facts as well as novel methodologies for their analysis. In the fields of economics and finance, a class of systems that has recently gained attention is that of interfirm networks, or supply chains, where nodes, i.e. firms, are linked by buying/selling relationships of products and services. The present contribution focuses on a peculiar representation of data concerning the automotive sector, obtained from the 'Marklines' portal (www.marklines.com), i.e. a tripartite, manufacturer-supplier-product one (see fig. 1, left). From a purely empirical point of view, the tripartite representation of the 'Marklines' dataset is characterized by heavy-tailed degree distributions and disassortative patterns: in words, few hubs co-exist with a plethora of 'leaves' and degree correlations are negative. Of greater interest is the (statistically significant) abundance of pairs of nodes sharing the same two neighbors [1]: such a result confirms what has already been pointed out in [2], i.e. the presence of a class of motifs which are expressions of a self-organizing principle that differs from the homophily-driven one shaping social networks [3] but is analogous to the complementarity-driven one shaping protein-protein interaction networks. On the theoretical side, we extend the Exponential Random Graphs formalism to the analysis of n-partite networks: specifically, we consider the cases n=2 and n=3 and employ the corresponding null models to project our data structure onto each of its three layers. One of these projections is shown in fig. 1 (right): it reveals the presence of homogeneous clusters of manufacturers (e.g. the ones belonging to General Motors are tightly interconnected); remarkably, these clusters are also 'geographically' homogeneous, i.e. they allow us to distinguish Europe and the US (within the same cluster but belonging to different communities - respectively, the yellow one and the green one) from China, India and Japan. Moreover, the analysis of our dataset via the Economic Fitness and Complexity toolbox reveals the presence of a nested ecosystem of suppliers where a few of them 'serve' only one manufacturer, thus representing a source of fragility for interfirm networks: should the unique manufacturer default, its 'exclusive' suppliers would probably default as well; should one of the 'exclusive' suppliers default (and assuming no redundant suppliers), the corresponding manufacturer would suffer a large loss. Besides shedding light on the organizing principles of interfirm networks, the framework we have explored in this contribution opens up the possibility of generating realistic economic scenarios, to be used for testing the effects of disruptive events such as the recent Covid-19 pandemic. 1. Saracco et al. Scientific Reports 5, 10595 (2015). 2. Mattsson et al. Frontiers in Big Data 4, 666712 (2021). 3. Istvan et al. Nature Communications 10, 1240 (2019). |
11:45 | Reconstructing firm-level interactions: the Dutch input-output network PRESENTER: Leonardo Niccolò Ialongo ABSTRACT. There is increasing interest in the importance of production networks for the analysis and forecasting of economic systems. Unfortunately, input-output relationships at the firm level are difficult to observe due to the sensitive nature of the information and, for most countries, the absence of a dataset capturing the structure of these relationships in a reliable way. In this work [1] we extend the density-corrected Gravity Model (dcGM) [2] to incorporate the information available on sector linkages, and define a fitness-induced configuration model that embodies the knowledge of the sector-wise in-strength per firm and the empirical link density. The proposed model is identical to the dcGM when no information on how the total in-strength is divided by sector is available. We test the improvement of our methodology on two complementary datasets of Dutch firms constructed from inter-client transactions on the bank accounts of the two major Dutch banking institutions. In figure 1 we can observe that, as the total in-strength by sector is divided into more detailed sector definitions (layers in the figure), the performance of the model improves significantly, and that this improvement is greater than what could be expected if a random partitioning of firms into sectors were applied. This confirms our hypothesis that the in-strength by sector contains significant information about a non-trivial meso-structure that must be constrained. We find that this is due to the fact that the real relationships between firms are extremely sparse in the number of sectors and that firms in the same sectors have correlated behaviour. In conclusion, we find that this is a suitable methodology for estimating a realistic structure of the inter-firm network from information on the firms' flows by sector and the network density. We believe this can provide a foundation for more realistic simulations concerning the propagation of distress in the economy. |
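For reference, a sketch of the density-corrected gravity model baseline as it is commonly stated (the per-sector extension introduced in this work is not reproduced here): link probabilities p_ij = z s_i^out s_j^in / (1 + z s_i^out s_j^in), with z calibrated so the expected number of links matches the observed one.

```python
import numpy as np
from scipy.optimize import brentq

def dcgm_link_probs(s_out, s_in, n_links):
    """Density-corrected gravity model (as commonly stated):
    p_ij = z*s_out_i*s_in_j / (1 + z*s_out_i*s_in_j), with z calibrated so that
    the expected number of links equals the observed one."""
    s_out, s_in = np.asarray(s_out, float), np.asarray(s_in, float)
    X = np.outer(s_out, s_in)
    np.fill_diagonal(X, 0.0)                       # no self-loops
    f = lambda z: (z * X / (1.0 + z * X)).sum() - n_links
    z = brentq(f, 1e-12, 1e12)
    return z * X / (1.0 + z * X)

# toy usage with made-up strengths and an observed density of 5%
rng = np.random.default_rng(0)
s_out = rng.lognormal(1, 1, 200); s_in = rng.lognormal(1, 1, 200)
P = dcgm_link_probs(s_out, s_in, n_links=0.05 * 200 * 199)
print("expected number of links:", P.sum())
```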
12:00 | Estimating the impact of supply chain contagion on financial stability PRESENTER: Zlata Tabachová ABSTRACT. Credit risk assessment - estimating potential losses from a counter-party's failure to repay its debt - is central to sound banking business. Traditionally, credit risk models focus on the borrower's financial conditions only, not taking the risk of supply chain contagion into account. However, crises such as pandemics, geopolitical instabilities, or natural disasters have drastically revealed that the propagation of shocks along supply chains can potentially lead to large financial losses of firms. Based on a unique country-wide micro-dataset containing all major supply chain links of Hungarian firms and the loans of these firms from banks, we simulate how an initial failure of firms spreads along the supply network, leading to additional firm defaults and consequently additional losses to banks. In particular, we simulate how an initial shock triggers a shock propagation cascade in the supply network with the model of [1]. The shock propagation cascade causes production losses to firms that are not affected by the initial shock. We use the income statement variables of each firm to translate the production losses into financial losses. If the latter exceed the short-term liquidity or equity of a firm, that firm defaults and fails to meet its contractual obligations towards banks. As a result, banks exposed to the defaulted firms suffer additional losses due to supply chain contagion. Based on our simulation, we first define a financial systemic risk index (FSRI) of a firm that measures the financial losses of the banking system caused by the firm's own default and by the loan defaults caused by the supply network shock propagation triggered by the initial failure of that firm. We show that a small fraction of firms poses sizeable risks to the financial system, affecting up to 16% of the banks' equity. These losses are chiefly caused by supply network contagion. Second, we simulate the propagation of 10,000 initial shock scenarios and calculate the expected loss (EL), value at risk (VaR) and expected shortfall (ES) of each bank with and without supply network contagion. Our simulations show that on average the EL, VaR, and ES of banks can be amplified by factors of 4.3, 4.5 and 3.2, respectively, by supply network contagion. Our findings show that credit risk assessment should take supply network contagion into account. This channel might be important for regulators' systemic risk assessment and for obtaining a more complete picture of threats to financial stability. |
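The risk measures involved are standard. The sketch below computes EL, VaR and ES from a sample of simulated scenario losses, using a made-up contagion amplification factor purely for illustration (not the numbers or model of the study).

```python
import numpy as np

def risk_measures(losses, alpha=0.99):
    """Expected loss, value at risk and expected shortfall from a sample of
    simulated scenario losses for one bank."""
    losses = np.asarray(losses, float)
    el = losses.mean()
    var = np.quantile(losses, alpha)
    es = losses[losses >= var].mean()
    return el, var, es

# toy comparison: losses with and without supply-network contagion
rng = np.random.default_rng(1)
direct = rng.lognormal(0, 1, 10_000)                 # direct credit losses only
contagion = direct * rng.lognormal(1, 0.5, 10_000)   # illustrative amplification
for name, x in [("direct", direct), ("with contagion", contagion)]:
    el, var, es = risk_measures(x)
    print(f"{name:15s} EL={el:6.2f}  VaR99={var:6.2f}  ES99={es:6.2f}")
```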
12:15 | Mapping the economic complexity of green supply chains ABSTRACT. As the world is transitioning into a more sustainable system of production and consumption, the booming demand for green products enables new development opportunities, and many countries consider a deeper integration into green supply chains as a national strategy. However, the green supply chain represents a complex system of multiple products, of which a full landscape view is lacking. Each green product requires specialized knowledge to gain comparative advantage in the global market, and the pathway to success depends on the place-specific portfolio of existing capabilities. In this work, 11 green supply chains were identified from research papers, reports and a life-cycle inventory database, based on their physical input-output relations, forming a network of 472 nodes and 588 edges spanning the fields of renewable energy generation & storage, CCUS, green hydrogen, etc. Machine learning algorithms were used to match the identified products to 344 standard 4-digit HS products via their semantic similarity. The methods of economic complexity and the product space were further applied to the identified green products, revealing the competitiveness of countries and the relatedness between products via their co-occurrence at the country level. It was found that: 1) the identified green supply chain network is weakly connected, exhibits clear community structure for the different fields, and accounts for a wide range (~20%) of all HS products. 2) The products involved in green supply chains are mostly located in the center of the product space network, with higher relatedness when a direct input-output relation exists or when a pair of products belongs to the same community, which indicates more shared production capabilities (e.g., shared labor and skill requirements, infrastructure, etc.) beyond their shared supply chain features. 3) The involved products are mostly complex products in the categories of machinery/electronics/chemicals, except the ores of critical minerals, which have low complexity. The green supply chain of semiconductors is on average the most complex, followed by various forms of renewable energy generation, while energy storage is less complex. Most countries with a high comparative advantage in exporting the involved products have a high economic complexity score and a diversified production structure. 4) Countries are more likely to diversify into green products with many existing related products and shared capabilities, and almost half of the countries have the potential to diversify into at least one product involved in a green supply chain. However, this probability shrinks quickly as more products are considered, which means only a few diversified countries like Germany and China could potentially gain a comparative advantage along the whole green supply chain. Less-diversified developing countries need a smart strategy to target their high-potential sectors and must invest more effort to establish and maintain their advantage in green products. In summary, the results of this analysis provide valuable insights into the structure of the green supply chain and its relationship with economic complexity, especially the significance of acquiring and developing the necessary knowledge and capabilities for effective management of the green supply chain. |
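The economic-complexity machinery referred to above can be sketched compactly: Balassa's revealed comparative advantage followed by a fitness-complexity style iteration. The exact algorithm and data used in the work are not reproduced; this is a generic illustration on synthetic exports.

```python
import numpy as np

def rca_matrix(X):
    """Balassa revealed comparative advantage from an export matrix X[c, p]."""
    share_cp = X / X.sum(axis=1, keepdims=True)
    share_p = X.sum(axis=0) / X.sum()
    return share_cp / share_p

def fitness_complexity(M, n_iter=200):
    """Fitness/complexity-style iteration on a binary country-product matrix M."""
    F = np.ones(M.shape[0]); Q = np.ones(M.shape[1])
    for _ in range(n_iter):
        F_new = M @ Q
        Q_new = 1.0 / (M.T @ (1.0 / F))
        F, Q = F_new / F_new.mean(), Q_new / Q_new.mean()
    return F, Q

rng = np.random.default_rng(0)
X = rng.pareto(1.5, size=(40, 120)) * 100          # synthetic country-product exports
M = (rca_matrix(X) >= 1).astype(float)
M = M[M.sum(axis=1) > 0, :]                        # drop empty rows/columns for safety
M = M[:, M.sum(axis=0) > 0]
F, Q = fitness_complexity(M)
print("most complex (synthetic) product index:", Q.argmax())
```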
12:30 | Estimating the loss of economic predictability: Comparing industry- and firm-level production network models PRESENTER: Christian Diem ABSTRACT. The development, employment and growth of economies is crucially determined by firm-level production networks (FPNs), as is their resilience to disturbances propagating along corporate supply chains. The networks' resilience fundamentally depends on the details of firms' input-output relations. However, widely used input-output (IO) models are almost exclusively calibrated with highly aggregated industry-level production networks (IPNs), in the form of input-output tables. This raises the question of what the limits of predicting economic outcomes with industry-level models are. Here we leverage a nearly complete nationwide FPN containing 243,399 Hungarian firms with 1,104,141 supplier-buyer relations, and self-consistently compare the production losses from COVID-19 related shocks propagating on the aggregated IPN and the granular FPN. Industry- and firm-level shocks are of the same size, but the latter affect firms within industries differently and realistically represent the initial effects of the COVID-19 pandemic. The shock size is inferred from Hungarian firms' actual employment reductions in the course of the early phase of the pandemic. To arrive at the shocked production levels at the industry level, for every NACE2 industry, k, we aggregate the firm-level shock, ψ, and obtain the remaining industry-level shock, ϕ_k. To quantify the size of mis-estimations if the initial shocks on the firm level were slightly different, we sample 1,000 different, synthetic realizations of the COVID-19 shock, Ψ, that are of the same size when aggregated to the industry level, but affect different firms within industries. Following the initial COVID-19 shock, we simulate how the adaptation of firms' supply and demand propagates downstream and upstream along the production network, once on the firm level and once on the industry level. We employ the simulation model of Diem et al. 2022 [1], where each firm (industry) is equipped with a generalized Leontief production function. The simulation continues up to a time, T, when the production levels of firms have reached a new stable state. The final production level represents the fraction of the original production a firm (sector) maintains after the shock has propagated. We define the FPN-based economy-wide production loss, L_firm(ψ), as the fraction of the overall revenue in the network (measured in out-strength, s_i^out) that is lost due to the initial shock and the consequences of its propagation. The IPN-based economy-wide production loss, L_ind(ϕ), is defined accordingly. Our findings reveal that using aggregated IPNs leads to large estimation errors of economy-wide production losses of up to 37%. While the industry-level model yields a 9.6% loss, the FPN-based losses range from 10.5% to 15.3%, suggesting that sector-level IO models have a natural limitation in forecasting economic outcomes of firm-level production networks. We ascribe the discrepancy to the large heterogeneity of firms within industries, as firms within the same sector on average only sell 23.5% to and buy 19.3% from the same industries. This underlines the inadequacy of industries for representing the firms they comprise. Similar errors are likely when estimating economic growth, CO2 emissions, and policy interventions with industry-level IO-like models.
Our study emphasizes that using granular data is key for understanding and predicting the behavior of economic systems. |
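A plausible way to write down the loss measures described verbally above (an assumption consistent with the text, not a formula quoted from the paper):

```latex
L_{\mathrm{firm}}(\psi) \;=\; 1-\frac{\sum_i s_i^{\mathrm{out}}\, x_i(T;\psi)}{\sum_i s_i^{\mathrm{out}}},
\qquad
L_{\mathrm{ind}}(\phi) \;=\; 1-\frac{\sum_k s_k^{\mathrm{out}}\, x_k(T;\phi)}{\sum_k s_k^{\mathrm{out}}},
```

where x_i(T; ψ) denotes the fraction of its original production that firm i (respectively industry k) maintains in the new stable state reached at time T.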
12:45 | Inequality in economic shock exposures across the global firm-level supply network PRESENTER: Tobias Reisch ABSTRACT. For centuries, national economies have created wealth by engaging in international trade and production. The resulting international supply networks not only increase wealth for countries, but also create systemic risk: economic shocks, triggered by company failures in one country, may propagate to other countries. When working with aggregate data, the effect of these shocks is typically dramatically underestimated. Using global supply network data at the firm level, we present a method to estimate a country's exposure to direct and indirect economic losses caused by the failure of a company in another country. Figure 1a illustrates schematically how we simulate a cascade of supply chain disruptions by iteratively reducing firms' outputs proportionally to their lack of inputs. Subsequently, we aggregate the exposures to the country level. In Fig. 1b we show the network of systemic risk-flows across the world as the expected exposure Ecd of country c to the default of a (random) firm in country d. We find that rich countries expose poor countries to systemic risk much more than the other way round. We demonstrate that higher systemic risk levels are not compensated by a risk premium in GDP, nor do they correlate with economic growth. Systemic risk around the globe appears to be distributed more unequally than wealth. These findings put the often-praised benefits for developing countries from globalized production in a new light, since they relate them to the risks involved in the production processes. Exposure risks present a new dimension of global inequality that most affects the poor in supply shock crises, and it becomes fully quantifiable with the proposed method. |
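A greatly simplified sketch of the downstream propagation step described above, with Leontief-like input caps on a random toy network; this is an illustration of the mechanism, not the calibrated model used in the study.

```python
import numpy as np

def propagate_shock(W, x0, n_iter=50):
    """Simplified downstream shock propagation on a supply network.
    W[i, j] = share of buyer j's total inputs sourced from supplier i
              (columns sum to at most 1);
    x0[i]   = fraction of firm i's production remaining after the initial shock.
    Each firm's output is capped by the availability of its inputs."""
    x = x0.copy()
    for _ in range(n_iter):
        # input share not covered by modeled suppliers is treated as unaffected
        inputs_available = W.T @ x + (1.0 - W.sum(axis=0))
        x = np.minimum(x0, inputs_available)
    return x

rng = np.random.default_rng(2)
n = 500
W = (rng.random((n, n)) < 0.01) * rng.random((n, n))
np.fill_diagonal(W, 0.0)
W = W / np.maximum(W.sum(axis=0), 1.0)             # normalize input shares per buyer
x0 = np.ones(n); x0[rng.choice(n, 5, replace=False)] = 0.0   # 5 firms fail initially
x_final = propagate_shock(W, x0)
print("average output remaining after propagation:", x_final.mean())
```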
11:30 | Measuring Polarization Dynamics from Signed Interaction Networks: an Application to Misinformation Crowdsourcing on Twitter PRESENTER: Emma Fraxanet Morales ABSTRACT. Online polarization mechanisms have been studied from different perspectives (e.g., text analysis or network structure) and using different methodologies (Bail et al. 2018; Martín-Gutiérrez et al., 2016; Barberá et al. 2015; Waller et al. 2022). The study of online polarization over time presents challenges such as identifying appropriate time scales and developing a framework that can accurately describe the various mechanisms at play. The aim is not only to measure the degree of ideological separation between communities but also to understand their evolution and response to endogenous and exogenous events, which cannot be captured by analyzing temporally aggregated data. Although signed networks can provide an accurate framework to analyse polarization by distinguishing positive and negative interactions, current methodologies cannot perform fine-grained temporal analysis while maintaining a comprehensive and global overview of the communities. This research gap is exacerbated by the lack of explicitly signed interaction datasets with temporal information. Inspired by balance theory, we define a signed network from aggregated votes, which is partitioned according to maximal balance (Aref et al., 2016). We extend the existing computational framework by proposing a new Signed Polarization Index (SPI), based on quantifying the relevance of individual votes that are not in agreement with our optimal partition within a given time window. Unlike previous balance indices, this index is naturally applicable to temporal networks and displays stronger statistical properties. A key feature of our approach is that we normalize against a null model of sign configurations, rather than using a fixed constant given by the network size. We apply this framework to Birdwatch, a crowd-based fact-checking platform linked to Twitter, in which users add notes to Tweets stating their reliability. These notes can be further up-voted or down-voted by other users. Previous work has shown high political involvement and polarization on this platform (Allen et al., 2022; Pröllochs et al., 2022). As a proxy for ground-truth communities (Republicans, R, and Democrats, D), we infer the political ideology of the original tweets' authors (Barberá et al., 2015); when aggregated together with the nature of the notes (i.e. misleading or not misleading), this is directly informative of the ideology of each group. Moreover, we can contextualize events by studying the word frequency of the tweets published and tagged in specific time frames. We find that Birdwatch is strongly polarized and that polarization fluctuates significantly over time. We analyse these variations in terms of internal cohesion within a group (cohesiveness) and external division amongst groups (divisiveness), and find that polarization peaks reflect a complex interplay between these mechanisms. For example, peaks in the SPI that map to Covid-19 vaccination discussions are related to higher divisiveness between the groups. Conversely, the mix of mechanisms at play in conversations about Trump and the US elections is more heterogeneous. This work has important implications for improving our understanding of online polarization and finding conciliation strategies, in order to create platforms with healthier discussions. |
11:45 | SHEEP: Signed Hamiltonian Eigenvector Embedding for Proximity PRESENTER: Shazia Ayn Babul ABSTRACT. In this paper, we analyze signed networks, where the edges can have either positive or negative weights. The relationship between the local and global structure of signed networks is typically understood through the concept of structural balance theory. Structural balance relates local motifs in the graph to clusterability, the ability to partition the network into antagonistic factions (clusters with internal positive edges and negative edges in between). Identifying the optimal partition of a signed graph into k clusters, where k is unknown, is difficult, and different methods have been proposed to solve this problem numerically. However, there remain many open questions in this field, particularly in the case when a network does not have a natural cluster structure, since the formation of unbalanced signed social networks may be influenced by a combination of processes resulting in different patterns than structural balance alone. When ground-truth clusters do not exist, it may be more useful to quantify the similarity between nodes in terms of a continuous distance variable, eliminating the need to make assumptions about faction numbers and memberships. Here, we present a physically inspired method for signed network embedding called SHEEP, incorporating multi-scale information into a proximity measure between nodes. We construct a Hamiltonian from the network, modelled as a system of interacting particles with positive (negative) edges mapped to attractive (repulsive) forces. The Hamiltonian admits a minimum-energy configuration, which can be recast as a computationally efficient eigenvector problem. We show that the embedding is intrinsically related to structural balance, outputting a “ground state energy” which we show is a statistical test for bi-polarization. The algorithm is distinct in that it (1) can be used to understand continuous, proximal node relationships; (2) does not require a priori assumptions about the existence of (discrete) clusters; and (3) locates the optimal embedding dimension. Since the Hamiltonian is a function of distance, SHEEP provides a continuous metric for understanding node relationships and relative extremism. We quantitatively evaluate the performance on synthetic and empirical networks in recovering continuous node attributes, including on a signed network representation of the members of the USA House of Representatives, obtaining embedding positions that are highly correlated with the Nokken-Poole continuous ideology scores of the members. |
12:00 | Enmity Paradox PRESENTER: Amir Ghasemian ABSTRACT. The “friendship paradox” of social networks states that, on average, “your friends have more friends than you do” [1]. This phenomenon has only been investigated from the vantage point of positive networks, however. Here, we theoretically and empirically explore a new paradox we refer to as the “enmity paradox.” We use empirical data from 24,687 people living in 176 villages in rural Honduras. We show that, for a real negative undirected network denoted as (ur), created by defining an edge as a reciprocated interaction, the paradox does not exist or, if it exists, it is very small; but for a real negative undirected network denoted as (us), created by symmetrizing an interaction, the paradox exists as it does in the positive world of friendship (Figure 1). In a mixed world of positive and negative ties, we study the conditions for the existence of the paradox, both theoretically and empirically, finding that, for instance, one’s friends have more enemies than one does. In order to understand these paradoxes in greater depth, we examine them in higher orders as well as in directed networks. Finally, we evaluate the generalized enmity paradox for nontopological attributes in real data, finding that the generalized enmity paradox is much more limited in comparison with the generalized friendship paradoxes, a finding which depends sensitively on the empirical properties of these villages. |
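The paradox itself is easy to check numerically on any undirected layer: it reduces to comparing the mean degree with the mean degree of the node at the end of a random edge, i.e. <k^2>/<k>. A toy comparison on synthetic graphs only (not the Honduras data), mimicking a broad "symmetrized" layer versus a narrow "reciprocated" one:

```python
import networkx as nx

def paradox_gap(G):
    """Difference between the mean degree of a node reached along a random edge
    (<k^2>/<k>) and the mean degree <k>. A positive gap is the paradox:
    'your neighbours have more contacts than you do', on average."""
    degs = [d for _, d in G.degree()]
    mean_k = sum(degs) / len(degs)
    mean_k2 = sum(d * d for d in degs) / len(degs)
    return mean_k2 / mean_k - mean_k

G_broad = nx.barabasi_albert_graph(1000, 3, seed=1)     # heterogeneous degrees
G_narrow = nx.random_regular_graph(6, 1000, seed=1)     # homogeneous degrees
print(paradox_gap(G_broad))    # clearly positive
print(paradox_gap(G_narrow))   # 0.0: the paradox vanishes for a regular network
```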
12:15 | Local balance reveals major historical events in signed networks of international relations PRESENTER: Fernando Diaz-Diaz ABSTRACT. Ever since Heider’s seminal work, the notion of structural balance has played a key role in the analysis of alliances and enmities in social networks -from internet users to international powers. However, the binary criterion ”balanced/unbalanced” has proven to be insufficient for many empirical complex systems. Because of this, many authors have proposed different indices to measure the level of unbalance of a signed network. These indices measure the balance of a signed network as a whole, without detailing which nodes are responsible for unbalancing the network. In contrast, we now propose a local balance index and an average balance index. The latter does not coincide in general with the global balance index, suggesting some kind of emergent behavior. Armed with this mathematical framework, we turn our attention to the network of international diplomatic relations between the years 1814 and 2014 and analyze the time series of the local balance index of each country. We find that the drops in the local balance are strongly correlated not only with armed conflicts between countries, but also with systemic instabilities within a country, even in the absence of war. This is the case of the revolutionary wave of 1848, where several European countries suddenly reduce their local balance despite the absence of explicit interstate conflict. These findings show the complex, non-local nature of geopolitical conflicts and suggest that the topological structure of diplomatic networks plays a major role in the onset and evolution of international conflict. |
12:30 | Competitive influence maximisation in the presence of negative ties PRESENTER: Sukankana Chakraborty ABSTRACT. Network-based interventions have shown promise in prompting behaviour changes in populations. Their implementation in practice is however riddled with challenges. An important development in this aspect is the influence maximisation framework, commonly used to study interventions in a theoretical setup with the aim of determining best practices that can optimise outcomes in the real world. We explore this problem in a competitive setting where two contenders compete to maximise the spread of their influence in a social network (e.g. political campaigns). Historically the problem has been studied in networks with strictly positive edges, where influence propagates by word-of-mouth or positive recommendations. Here we study the impact of negative ties (that naturally occur in many real-world networks) on influence maximisation efforts in a competitive environment. More specifically, we study how negative edges can impact influence spread in networks when not accounted for in the intervention strategy. Negative ties here are antagonistic relationships between agents in a social network by virtue of which an agent influences their neighbour to adopt an opposing opinion. We study the propagation of binary opinion states, A and B, in a population of N individuals interacting through positive and negative edges. At any given point in time, individuals in the network strictly conform to one of the two states, at a rate proportional to the strength of the influence experienced by them (from social neighbours and external controllers). We assume that controllers strictly exert positive influence on the network and that opinion propagation in the network follows voter dynamics. Our results demonstrate that compared to naïve influence maximisation approaches (which assume all ties to be positive), knowledge of negative ties and accounting for such knowledge can yield high gains in opinion-shares in the population (nearly 20% in some networks). We further show how this gain in opinion-shares varies with network topology, the ratio of competitor budgets and competitor allocations, and we highlight scenarios where the knowledge of negative ties offers no gains to the controller. In addition, we present an analytical framework that illustrates how optimal allocations to a node depend on its negative degree and the amount of competitor influence on it. Finally, we examine the problem in a game-theoretic setting (where competitor strategies are not fixed), and we illustrate cases where information about negative ties surprisingly leads to a loss in opinion-shares. |
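A minimal sketch of voter dynamics with negative ties, assuming binary opinions and leaving out the external controllers that the talk's framework includes; the sign convention and update rule below are illustrative:

```python
import random
import networkx as nx

def signed_voter_step(G, state, rng):
    """One asynchronous update of a binary-opinion voter model on a signed graph:
    a random node copies a random neighbour's opinion over a positive tie and
    adopts the opposite opinion over a negative tie."""
    i = rng.choice(list(G.nodes()))
    neighbours = list(G.neighbors(i))
    if not neighbours:
        return
    j = rng.choice(neighbours)
    s = G[i][j].get("sign", +1)
    state[i] = state[j] if s > 0 else 1 - state[j]

# Toy run on a small random graph with ~25% negative ties.
rng = random.Random(0)
G = nx.erdos_renyi_graph(50, 0.1, seed=0)
for u, v in G.edges():
    G[u][v]["sign"] = rng.choice([+1, +1, +1, -1])
state = {n: rng.randint(0, 1) for n in G.nodes()}   # 0 = opinion A, 1 = opinion B
for _ in range(5000):
    signed_voter_step(G, state, rng)
print(sum(state.values()) / len(state))   # final share holding opinion B
```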
11:30 | Workplace Recommendation with Temporal Network Objectives PRESENTER: Kiran Tomlinson ABSTRACT. N/A |
11:45 | LEXpander: applying colexification networks to automated lexicon expansion PRESENTER: Anna Di Natale ABSTRACT. Thematic word lists, that is, lists collecting words that deal with a chosen topic, are at the basis of most text analysis applications. Be it to collect text snippets that address a specific topic or to run text analysis on the texts themselves, word lists are ubiquitous. Usually, word lists are hand-crafted by researchers, adapted to a novel setting from previously published resources, or created with the help of automatic word list expansion algorithms. Despite the importance of the problem of their creation, previous work has not systematically assessed the quality of word list expansion methods nor compared existing tools against each other in a comprehensive way. Our work is the first to provide a benchmark to evaluate and compare word list expansion methods. This novel framework takes into account more than 70 different topics for word lists and can be used to test algorithms developed for languages other than English. Additionally, we propose a new word list expansion algorithm based on a semantic network, LEXpander, and show that it outperforms previous approaches in two different languages, thus demonstrating that network science can be successfully applied to NLP problems. LEXpander is based on the network built from colexification records. A colexification is a linguistic phenomenon that occurs when one language uses the same word to convey two different meanings. LEXpander maps the seed words given as inputs onto the colexification network, retrieving the neighboring words, which form the expanded word list. One of the advantages of using a colexification network for this task is that its structure is independent of language, that is, the same algorithm can be used to expand word lists in languages other than English. In this work, we show that the network-based algorithm LEXpander outperforms other widely used methods: neural word embeddings like FastText and GloVe and other semantic networks like WordNet and its German counterpart, OdeNet, in the expansion of both English and German word lists. For example, LEXpander achieves an F1 score of 0.15 when expanding 30% randomly selected words from the English ground truth word list, while the best method we compared it with yields a score of 0.12. The results of our study confirm the potential of linguistic resources and network science for addressing NLP problems and improving the analysis of texts in behavioral research. Moreover, we show that our approach can be applied to German, for which the existing resources are traditionally limited in comparison to English. |
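The core lookup step is simple to illustrate; the sketch below assumes a colexification network is already available as an undirected graph and is not the published LEXpander tool:

```python
import networkx as nx

def expand_word_list(colex_graph, seed_words):
    """Expand a seed word list with the neighbours of the seed words in a
    colexification network: any word that shares a colexification link with a
    seed word is added to the expanded list."""
    expanded = set(seed_words)
    for w in seed_words:
        if w in colex_graph:
            expanded.update(colex_graph.neighbors(w))
    return expanded

# Toy colexification network (an edge = two meanings expressed by one word in some language).
C = nx.Graph()
C.add_edges_from([("happy", "lucky"), ("happy", "glad"), ("sad", "sorry")])
print(expand_word_list(C, {"happy"}))   # {'happy', 'lucky', 'glad'}
```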
12:00 | Complementarity vs. Similarity in Semantic Networks PRESENTER: Gabriel Budel ABSTRACT. There is a growing understanding that many networks are shaped by complementarity rules: the links in these networks appear between nodes with complementary properties. Examples of networks where complementarity plays an important role include protein-protein interaction networks, networks of interdisciplinary collaboration, and production networks. Indeed, protein molecules with complementary properties are more likely to interact and companies in production networks are similar to their competitors, but prefer to trade with partner companies that complement them. Complementarity networks are routinely analyzed with methods that were originally developed for social networks where links are established among nodes with similar properties. Unlike similarity, complementarity is not transitive: if node A complements node B, and B complements C, then A is not expected to complement C. As a result, similarity-based methods applied off-the-shelf to complementarity-driven systems are prone to errors. Examples include the inefficiency of similarity-based link prediction for missing protein interactions and inconsistencies in community detection problems. In the present work, we asked if semantic networks are also organized according to complementarity mechanisms. Our main hypothesis is that sentences are constructed such that words with different meanings and functions complement each other. In our study, we used the ConceptNet database to construct several semantic networks with different semantic relationships. Through the analysis of 3-cycle and 4-cycle densities in these semantic networks, we established that semantic networks are shaped by both similarity and complementarity principles. We found that among the seven relationship types with the most links in ConceptNet, the networks based on the "Antonym" and "Has-A" relationships are predominantly complementarity-based. We build upon the complementarity embedding framework developed by us earlier to learn a complementarity representation of the Antonym network, see Fig. 1. In the obtained complementarity representation, each node i is represented by two points x_i and y_i, shown as circles and squares, and distances between points of different types quantify the complementarity between the corresponding nodes: the smaller distances d(x_i, y_j) or d(x_j, y_i), the higher the complementarity between nodes i and j. We find that the complementarity representation of the antonym network allows us to infer not only antonym, but also synonym relationships. In summary, we hope that complementarity-based representations of semantic networks will prove instrumental in improving and understanding Natural Language Processing tasks, such as analogy solving, sentiment analysis, and text completion. |
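One way such 3-cycle and 4-cycle statistics could be computed directly from the adjacency matrix; the normalization used for "density" below is an arbitrary choice for illustration, not necessarily the authors' convention:

```python
import numpy as np
import networkx as nx
from math import comb

def cycle_densities(G):
    """Triangle (3-cycle) and square (4-cycle) counts from powers of the
    adjacency matrix, divided by the number of node triples / quadruples.
    Standard identities: triangles = tr(A^3)/6 and
    squares = (tr(A^4) - 2*sum(k_i^2) + 2m)/8."""
    A = nx.to_numpy_array(G)
    k = A.sum(axis=1)
    m = G.number_of_edges()
    n = G.number_of_nodes()
    tri = np.trace(np.linalg.matrix_power(A, 3)) / 6
    sq = (np.trace(np.linalg.matrix_power(A, 4)) - 2 * (k ** 2).sum() + 2 * m) / 8
    return tri / comb(n, 3), sq / comb(n, 4)

G = nx.erdos_renyi_graph(200, 0.05, seed=0)
print(cycle_densities(G))
```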
12:15 | Unveiling the Impact of Employee Exits on Neighboring Social Interactions PRESENTER: David Gamba ABSTRACT. Amidst growing uncertainty and frequent restructurings, the impacts of employee exits are becoming a central concern for organizations. While prior research has examined implications on team performance and career mobility, this study delves into the effects on socialization patterns of local coworkers connected to exiting employees. Using rich communication data from a large Chinese Fortune 500 company, we track the longitudinal evolution of network metrics within communication subgroups of neighbors associated with exiting employees, contrasting these with the networks of neighbors of employees who stayed. We additionally compare these effects across two periods with varying degrees of organizational uncertainty. This research provides critical insights into managing workforce changes and preserving communication dynamics in the face of employee exits. |
12:30 | Understanding guest entrances into the podcast ecosystem PRESENTER: Sydney DeMets ABSTRACT. One key finding emerging from research on social media-based mis/disinformation is the critical role that a small set of actors play in repeatedly drawing attention to misleading information [1]. Across a variety of domains, from vaccine hesitancy to election fraud narratives, scholars have demonstrated the outsized role that these key actors play in structuring information flow online. However, most of these studies have been focused on Twitter and Facebook, leaving critical gaps in our understanding of how false and misleading information spreads. In particular, little to no work looks at the role of podcasts [2], despite the prevalence of this medium for information and news consumption. The knowledge gap around podcasts—episodic, on-demand talk-radio shows—is particularly concerning, due to their popularity, loyal listeners, and the relatively permissive content policies coming from the largest platforms. As of 2019, the Joe Rogan Experience boasted more average downloads per episode than the two widely viewed U.S. conservative television talk show hosts, Tucker Carlson and Sean Hannity, had per night, combined. Despite widespread criticism from the press and medical community, some of the largest podcast platforms signed multi-million dollar exclusivity contracts with disinformation-promoting shows [3]. Many questions about the spread of mis/disinformation through audio media remain, and this work aims to build this knowledge. Podcasts are often structured in the fashion of talk radio—a show host invites and interviews a guest, who often is or purports to be an expert on a given topic. The guest and host benefit by exchanging visibility with their respective audiences: hosting and being hosted can be a boost for both of their fan bases. As guests who spread mis/disinformation (e.g. anti-vaccine advocates) are invited to more shows, they spread their misleading messages to the audiences of the podcasts they are hosted on. However, the paths that guests take through the podcasting world are unstudied. For instance, a popular individual may first appear on a very central podcast, then slowly accept interviews from less central podcasts, or not accept any new interview requests (top-down approach). Alternatively, someone may start by accepting interviews from lesser-known podcasts and work their way up to being interviewed on a very central podcast (bottom-up approach). Determining how guests enter and navigate their way through the podcasting network provides insight into how key actors (both hosts and guests) confer attention onto others. These attention dynamics are a core component of understanding how mis/disinformation spreads. Taking a social network approach, this research has two aims: first, we map the relationships among hosts and guests surrounding a disinformation-heavy podcast as a two-mode network; second, we evaluate if guests seemingly navigate the podcasting network using a “top down” approach, leveraging the preferential attachment mechanism. Beginning with a seed node, the Conservative Daily podcast, we compile a list of each person who appeared as a guest on the show. We use these guests as additional query nodes to grow our network using a one-step snowball sampling strategy. We query these names using the Spotify Search API to identify other shows these guests had appeared on. We alternate across the modes of hosts and guests for five steps and use the resulting network in our analysis. To assess if guests enter this network according to a preferential attachment process, we build a dyadic relational events model and examine if the normalized total degree predicts a node’s tendency to interview new guests. This project contributes a novel network dataset of relationships among individuals on podcasts, and it provides an examination of the processes that dictate information flow in an audio misinformation-heavy network. Understanding how misleading information and narratives flow through podcasts has significant implications for studies of mis/disinformation as well as for policy makers and communities. |
12:45 | Tinkering and Innovation in Collapsing Cultural Systems PRESENTER: Sergi Valverde ABSTRACT. Diversity drives both biological and artificial evolution. In cultural evolution, it is commonly assumed that the generation of novel traits is innate to a subset of the population (e.g., experts). In contrast, diversity demonstrates collective dynamics, such as oscillations, that cannot be reduced to merely individual traits. Here, we investigate how a popular cultural domain can expand so rapidly that the demand for subject-specific experts exceeds the supply, resulting in a balance that favors imitation over invention. We anticipate a decline in diversity and a rise in information redundancy as more ideas are copied rather than invented. Three case studies are utilized to evaluate model predictions: early personal computers and home consoles, social media posts, and cryptocurrencies (see Figure). During the exponential growth of imitators, each example deviates abruptly from conventional diffusion models. We attribute this transition to a “dilution of expertise." (Duran-Nebreda, S., O’Brien, M. J., Bentley, R. A., and Valverde, S. (2022) Dilution of expertise in the rise and fall of collective innovation. Humanit Soc Sci Commun 9, 365. https://doi.org/10.1057/s41599-022-01380-5). Our theoretical model predicts the observed patterns of linguistic diversity, network complexity, information density, and collective boom-and-bust dynamics. |
11:30 | Gender differences in collaboration and career progression in physics PRESENTER: Mingrong She ABSTRACT. We examine gender differences in collaboration behaviour and academic career progression in physics. We use co-authorship ego networks to capture collaborative behaviour and use the likelihood and time to become a principal investigator (PI) and the length of an author's career to measure career progression. We used generalised linear models (GLMs) and accelerated failure time models (AFTs) to examine whether collaborative behaviour and career progression in physics vary by gender. We found that, controlling for number of publications, the relationship between collaborative behaviour and career progression was independent of gender. Researchers who published with more unique co-authors more frequently and in more distinct sets of collaborators (i.e., lower clustering coefficient) were more likely to become PIs and became PIs faster. However, we found that men and women had different collaboration behaviour. We observed that women had fewer collaborators (smaller network size), published fewer times with the same co-authors (lower mean tie strength), and published more often with the same group of collaborators (higher clustering coefficient). In terms of career progression, women were less likely to become PIs, became PIs more slowly, and had shorter careers. |
11:45 | Gender representation and homophily as barriers to women’s career in science PRESENTER: Ana Jaramillo ABSTRACT. In the last few decades, women’s participation in academia has greatly increased. As a result, new research priorities have been realized, such as the recognition of women’s health as a priority in medicine [2] and the use of women’s experiences in designing interventions and public policies to mitigate social inequalities [1]. However, the increase in women’s participation and integration has not been equal across different fields and career stages. For instance, there is still low participation of women in positions of power and decision-making [3], which affects resource availability, recognition, and representation of women, and in general the entire society. We analyse the gender participation and coauthorship networks formed by more than 200 million papers published from 1955 to 2020, divided into 19 academic fields from the Semantic Scholar Open Research Corpus [6]. Specifically, we concentrate on women’s participation in different career stages, the gender differences in drop-out rates, and women’s representation in top-ranking positions considering productivity (number of papers), citations, and degree (number of co-authors). Finally, we use the Mixing Matrix and Adjusted Mixing Matrix to estimate gender homophily, assortativity and adjusted assortativity [5] per field. We plot in Figure 1 the characteristics of three fields organized by women’s participation: physics with 14% (lowest), philosophy with 24% (median), and psychology with 39% (highest). The proportion of women in early and medium career stages grows over time (humanities and social sciences are the fastest). However, only the social sciences show constant growth of women in senior stages (Figure 1A). We measure the representation of women (men) in top-ranked positions as the proportion of women (men) in the x% top ranking of each metric over the proportion of women (men) in each field. Here, values smaller (larger) than one indicate under- (over-) representation. Women are underrepresented in the highest percentiles, and when the participation of women increases, the transition from under- to fair representation is smoother (see the case of psychology in Figure 1B). Finally, our results show a relationship between women’s participation and homophily. As expected, when the participation of women increases, there is a higher proportion of women-women and women-men co-authorships (top panel going from lighter to darker yellows, Figure 1C). Interestingly, when the participation of women increases, their homophily decreases, while for men the homophily values remain constant. Also, we found that gender assortativity (r) increases when the participation of women increases, but when correcting it by their group size (r_adj) [4], the values remain high but constant. Our work aligns with previous evidence about shorter careers for women as an indicator of the under-representation of women in top-ranked positions and the low participation in senior career stages [3]. However, we also observe that for many fields, the increase in women’s participation seems to plug the “leaky pipeline” and retain women in longer-lasting careers, with better representation in top-ranking positions and more diverse research groups with smaller values of internal homophily. |
12:00 | The impact of users’ homophily and recommendation biases on social network inequalities PRESENTER: Stefania Ionescu ABSTRACT. Millions of professional content creators earn a living in today’s Creator Economy, but their income and visibility depend not only on the content they upload but also on complex and opaque moderation processes, recommender systems, and viewer biases. There are concerns about the lack of fair remuneration and income imbalance with respect to specific protected attributes, but it remains unclear how these factors interact to contribute to inequality. To address this challenge, we use an agent-based modeling approach to simulate different interventions and measure fairness for both content creators and viewers. Our results suggest that reducing biases in moderation and recommendations should precede reducing viewer homophilic tendencies, and boosting the visibility of protected creators may not produce fair outcomes with respect to all metrics. The study shows the potential of using ABMs to understand biases and interventions in complex sociotechnical systems. |
12:15 | Intersectional Inequalities in the Impact of Online Visibility on Citations PRESENTER: Orsolya Vasarhelyi ABSTRACT. Although recent years have seen proactive conversations about the underlying factors of gender and ethnic minorities' under-representation in science, progress has been slow. Among the crucial issues that have been shown to impede the ability of scientists from underrepresented groups to advance at various career stages are the citation gap and bias in visibility, which also extends to the online dissemination of scholars' work. Research has shown a positive, albeit mostly weak, link between online visibility and ensuing citation impact. Prior studies have only analysed the impact of gender or race on the citation gap and visibility and did not take into account the potentially accumulated disadvantage that non-white women might face. Therefore, it is unclear how the intersectional relationship between co-author teams' gender and ethnic diversity potentially influences the link between online visibility and citation impact. To empirically address this gap in the literature, we compile a comprehensive data set that includes 14 different broad research areas. We then examine longitudinally (over the span of 7 years) the relationship between articles' online visibility and their citations for teams of varying gender and ethnic diversity. Our findings suggest that teams with higher gender or ethnic diversity are more likely to produce articles that are among the top 25% most cited ones, but there is an intersectional penalty that teams with high gender and ethnic diversity suffer. |
12:30 | FairSNA: Algorithmic Fairness in Social Network Analysis PRESENTER: Akrati Saxena ABSTRACT. In recent years, designing fairness-aware methods has received much attention in various domains, including machine learning, natural language processing, and information retrieval. However, understanding structural bias and inequalities in social networks and designing fairness-aware methods for various research problems in social network analysis (SNA) have not received much attention. In our work (Saxena et al., 2022), we showed that very few works have considered fairness and bias while proposing solutions, and even these works are mainly focused on a few research topics, such as link prediction, influence maximization, and PageRank. However, fairness has not yet been addressed for other research topics, such as influence blocking and community detection. In Fig. 1, we show a small example of unfairness in link prediction using the Dutch School social network (Knecht et al., 2010), which has 26 nodes (17 boys and 9 girls) and 63 edges. The homophily value of the network is 0.7 (Newman, 2003). In Fig. 1(a), the network is shown, and the nodes are divided into two groups based on gender; blue nodes represent boys and pink nodes represent girls. Next, we remove around 10% of intra-community and inter-community edges uniformly at random, and the missing links are shown using dashed lines in Fig. 1(b). Now, we compute the similarity scores for predicting the missing links using two heuristic methods, (i) the Jaccard coefficient (Liben-Nowell and Kleinberg, 2007) and (ii) the Adamic-Adar index (Adamic and Adar, 2003); the similarity scores are shown next to the missing links in Fig. 1(c) and (d), respectively. We can observe that the similarity scores for inter-community links are lower than those for intra-community links. Besides this, similarity scores are lower for small and sparse communities. For example, in Fig. 1(d), the Adamic-Adar values for the links from the pink community are smaller than those from the blue community. However, in fair link prediction, the aim is to efficiently predict all kinds of links with high accuracy, irrespective of users' attributes, their communities, or community sizes. In this talk, we will highlight how the structural bias of social networks impacts the fairness of different SNA methods. We will further discuss fairness aspects that should be considered while proposing network structure-based solutions for different SNA problems, such as link prediction, influence maximization, centrality ranking, and community detection. The talk will be based on our work (Saxena et al., 2022). Finally, we will highlight various open research directions that require researchers' attention to bridge the gap between fairness and SNA. |
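A generic sketch of how such a per-group comparison can be run on any attributed graph using networkx's built-in Jaccard and Adamic-Adar scores; the synthetic two-block graph and the held-out edges below are illustrative, not the Dutch School data:

```python
import networkx as nx
from statistics import mean

def grouped_link_scores(G, held_out_pairs, attr="gender"):
    """Average Jaccard and Adamic-Adar scores for a set of held-out node pairs,
    split into intra-group and inter-group pairs, to expose the kind of gap
    discussed above."""
    jac = {(u, v): p for u, v, p in nx.jaccard_coefficient(G, held_out_pairs)}
    aa = {(u, v): p for u, v, p in nx.adamic_adar_index(G, held_out_pairs)}
    def split(scores):
        intra = [s for (u, v), s in scores.items() if G.nodes[u][attr] == G.nodes[v][attr]]
        inter = [s for (u, v), s in scores.items() if G.nodes[u][attr] != G.nodes[v][attr]]
        return {"intra": mean(intra), "inter": mean(inter)}
    return {"jaccard": split(jac), "adamic_adar": split(aa)}

# Toy homophilous network with two groups and a few held-out pairs.
G = nx.planted_partition_graph(2, 13, 0.4, 0.05, seed=1)
for n in G.nodes():
    G.nodes[n]["gender"] = "boy" if n < 13 else "girl"
held_out = [(0, 1), (2, 14), (15, 16)]   # two intra-group pairs, one inter-group pair
G.remove_edges_from(held_out)
print(grouped_link_scores(G, held_out))
```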
12:45 | Structural marginalization in networks - How inequality of opportunities generates it: the homophily-fitness model PRESENTER: Nicola Cinardi ABSTRACT. Inequalities in social, economic, and political settings play a major role in the unequal distribution of opportunities and well-being. Very often the effects of inequality may be visible while the mechanisms producing them are hidden and therefore difficult to eradicate or at least to mitigate. Despite the fact that, among the mechanisms that generate disparity, inequality of opportunities (resources) is one of the strongest, it is not quantitatively taken into account in social network models. Here, we explore the emergence of structural marginalization as a result of the inequality of resource distributions (opportunities) together with other key factors which lead to structural inequalities when social interactions are considered. We present a growing network model, the homophily-fitness model, with binary groups and three main social mechanisms: i) preferential attachment, ii) a homophily mechanism (tendencies in connecting to people of similar attributes), and iii) the fitness of the individuals, which reflects access to opportunities and resources. The outcomes of the dynamics allow us to draw conclusions about the mechanisms behind structural inequalities, with potential applications to policy-making in online and offline social networks. We show to what extent homophilic/heterophilic behavior and the distribution of resources impact individuals' ranking (their degree) and the inter-connectivity in networks. Our results show that inequalities in resources lead to structural inequalities, disadvantaged communities, and a propensity for rich-club creation. Regarding the rich-club effects, our analysis shows that heterophilic behavior does not allow for the presence of a rich club. As homophily increases, we observe the formation of rich clubs when resources are considered, while the network remains overall degree-non-assortative. Interestingly, if resources are excluded from the model, rich clubs do not form. This leads to the conclusion that what determines the emergence of rich clubs is the presence of resources, independently of their distributions. The work contributes to broadening the understanding of structural marginalization in networks. We provide a new model of networks that includes the effect of equality of opportunities on the structure of social networks. The analytical insights outlined in this work help push a step forward in understanding the emergence of structural marginality and explore ways to mitigate it. |
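A toy growth process combining the three mechanisms above, purely as an illustration; the attachment kernel, parameter values, and the degree offset are our assumptions, not the authors' exact homophily-fitness model:

```python
import random
import networkx as nx

def homophily_fitness_graph(n, m, h, fitness, group, seed=0):
    """Growing network: each new node attaches to m existing nodes with
    probability proportional to (degree + 1) x fitness x homophily weight,
    where the weight is h for same-group pairs and (1 - h) otherwise."""
    rng = random.Random(seed)
    G = nx.complete_graph(m + 1)
    for new in range(m + 1, n):
        existing = list(G.nodes())
        weights = [
            (G.degree(old) + 1) * fitness[old] * (h if group[old] == group[new] else 1 - h)
            for old in existing
        ]
        targets = set()
        while len(targets) < m:
            targets.add(rng.choices(existing, weights=weights)[0])
        G.add_node(new)
        G.add_edges_from((new, t) for t in targets)
    return G

# Example: a 20% minority given lower fitness (fewer resources).
n, rng = 500, random.Random(42)
group = {i: ("minority" if rng.random() < 0.2 else "majority") for i in range(n)}
fitness = {i: (0.5 if group[i] == "minority" else 1.0) for i in range(n)}
G = homophily_fitness_graph(n, m=2, h=0.7, fitness=fitness, group=group)
nx.set_node_attributes(G, group, "group")
print(nx.attribute_assortativity_coefficient(G, "group"))
```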
0. Connected Reality: Virtual Immersion in Social Networks PRESENTER: Alexander Gates ABSTRACT. Have you ever walked into a party and wondered who was the most popular person in the room? Or which of your colleagues knows the conference speaker and could make an introduction? We are all embedded in complex webs of social relationships with our friends, families, colleagues, peers, neighbors, etc. Intuitively, we understand and navigate these social networks, but it is not always clear beyond our immediate ego networks who else is connected to whom. Social networking applications have begun to provide this information in an abstract form, but mapping the networks onto the physical world still remains a challenge for their use in everyday social situations. This project builds on recent fundamental advances in augmented reality and the science of science to provide a virtual immersive experience within a social network. Augmented reality facilitates the exploration of large datasets while leveraging the natural 3D vision perception capabilities of the human brain. Specifically, the Connected Reality application for Android and iOS uses OpenCV facial recognition models combined with ARCore/ARKit to display a social network embedded within the user's 3D space. For NetSci 2023, we propose to align i) the presenter and attendance lists with ii) Google image search and iii) the OpenAlex bibliometric database. The resulting dataset will enable conference attendees to explore the Network Science co-authorship network in the conference space. Attendees can also add themselves to the database. The Connected Reality application constitutes the first step in a larger project exploring how we search and navigate social networks, and the potential for link prediction algorithms to facilitate social connection. Therefore, the demo of this application at the conference will provide valuable feedback before its deployment in experimental settings at future conferences. |
1. Electronic Implementation of Networks of Kuramoto Oscillators PRESENTER: Arthur N. Montanari ABSTRACT. Complex networks of coupled oscillators can represent the dynamical behavior of many different real-world systems, including power grids, neuronal activities, economic relations, ecosystem interactions, and opinion formation. The celebrated Kuramoto oscillator, despite its simple sinusoidal behavior when isolated, can exhibit more complex dynamics when interconnected in large networks, such as explosive synchronization, chaos, and chimera states, among others [1]. These emergent behaviors make the Kuramoto model a powerful benchmark to discover new phenomena, develop novel techniques, and provide a mathematically tractable basis for new theories in network synchronization and control. Yet, despite the widespread use of the model, most results are based on analytical derivations, numerical simulations, or uncontrolled experiments (e.g., using available data of real-world networks). Few results using this model come from controlled experiments [2], which are fundamental to validate and establish theories and methods in a reproducible environment that includes noise, modeling errors, and parametric uncertainties—as in most realistic applications. Here, we develop an experimental platform of electronic oscillators that implements the dynamical behavior of networks of coupled Kuramoto oscillators. The Kuramoto oscillator is implemented using Wien-bridge circuits [3], [4]. Composed of a single operational amplifier and other off-the-shelf components (Fig. 1a), the Wien-bridge circuit is both robust and low cost, ensuring scalability of the platform. In the designed benchmark, each oscillator has a different (adjustable) natural frequency and can be arbitrarily coupled to form different (weighted and directed) network topologies (Fig. 1b). The results from the circuit and PCB design software Proteus® showed good agreement with the mathematical model and numerical simulations, achieving phase synchronization as illustrated in Fig. 1c. Experimental results will be presented at the conference. The presented benchmark will be used to design a miniaturized version for validation studies in larger networks. |
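For reference, the network Kuramoto model that such circuits implement can be simulated directly; a minimal Euler-integration sketch, where the topology, natural frequencies, and coupling strength are arbitrary choices for illustration:

```python
import numpy as np
import networkx as nx

def simulate_kuramoto(G, omega, K=1.0, dt=0.01, steps=5000, seed=0):
    """Euler integration of the network Kuramoto model
    dtheta_i/dt = omega_i + K * sum_j A_ij * sin(theta_j - theta_i),
    returning the time series of the order parameter r(t)."""
    rng = np.random.default_rng(seed)
    A = nx.to_numpy_array(G)
    theta = rng.uniform(0, 2 * np.pi, G.number_of_nodes())
    r = np.empty(steps)
    for t in range(steps):
        coupling = (A * np.sin(theta[None, :] - theta[:, None])).sum(axis=1)
        theta = theta + dt * (omega + K * coupling)
        r[t] = np.abs(np.exp(1j * theta).mean())
    return r

# Example: heterogeneous natural frequencies on a small-world topology.
G = nx.watts_strogatz_graph(50, 6, 0.1, seed=1)
omega = np.random.default_rng(1).normal(0.0, 0.5, 50)
r = simulate_kuramoto(G, omega, K=0.5)
print(r[-1])   # order parameter close to 1 indicates phase synchronization
```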
2. Phase transitions in network robustness PRESENTER: Laura Barth ABSTRACT. The theory of network robustness has grown significantly in the last two decades and is now one of the main pillars of network science due to its broad range of applications [1]. In this context, the elegant formalism for computing the giant component size was studied on different types of random networks and under several attack scenarios, reviewed in [2]. Here we explore which degree distributions lead to a maximal giant component after a random attack of known size. Using the mathematical properties of generating functions, we construct a parameterized configuration model network that is robust against a random attack. The mean degree is prescribed, and the fractions of nodes in each degree class are subject to a boundary condition. We then optimize to investigate which degree distribution maximizes the giant component. We show mathematically that regular graphs optimally withstand small attacks, up to a threshold attack size at which a continuous phase transition occurs. Beyond this threshold, the optimal network is heterogeneous, where the degree of heterogeneity is a function of the size of the attack. Eventually, a second phase transition is encountered, after which the optimal network has the maximally heterogeneous degree distribution (Fig. 1a). One important parameter that plays a role in the computation of the giant component size is the probability that a random link connects to the giant component. We show that maximizing this indicator also leads to a homogeneous distribution when the attack is small and a maximally heterogeneous one if the attack is large. In this case, the transition between these regimes is explosive, and no networks of intermediate heterogeneity are ever optimal (Fig. 1b). In summary, to determine which degree distribution maximizes the giant component, it is necessary to know the size of the attack. We also found an abrupt transition in the proportion of links connected to the giant component as the size of the attack changes. |
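The abstract does not state its exact formulation, but the standard configuration-model relations behind this kind of calculation (in the generating-function formalism) are, for a degree distribution p_k and a random attack in which each node survives independently with probability φ:

```latex
G_0(x) = \sum_{k} p_k\, x^{k}, \qquad
G_1(x) = \frac{G_0'(x)}{G_0'(1)},
```
```latex
S = \phi \left[ 1 - G_0(u) \right], \qquad
u = 1 - \phi + \phi\, G_1(u),
```

where S is the expected giant-component fraction and u is the probability that a randomly followed edge does not lead to the giant component; the optimization described above would then search over {p_k}, with Σ_k p_k = 1 and a fixed mean degree Σ_k k p_k, for the distribution maximizing S.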
3. Network Robustness against Random, Localized, and Targeted Attacks: From the Perspective of Loops PRESENTER: Masaki Chujyo ABSTRACT. Our modern society is exposed to various disturbances, including random failures, large-scale natural disasters, and intentional terrorist attacks. Thus, building robust systems that can function stably against various attacks is necessary. In this work, we investigate the robustness of network connectivity against random, localized, and targeted attacks from the perspective of loops. We focus on the relation between the robustness against different types of attacks and the size of the Feedback Vertex Set (FVS), the minimum set of nodes whose removal leaves the network without any loops. From our results, we conclude that, for localized and targeted attacks, the robustness and the FVS size are strongly correlated. |
4. To be or not to be: Uncovering the Interplay between Phenotypic Robustness and Plasticity in Gene Regulatory Networks. PRESENTER: Anantha Samrajya Shri Kishore Hari ABSTRACT. Cancer is a complex disease characterized by uncontrollable cell proliferation and the ability of cells to colonize multiple organs in the body through a process called metastasis. While genetic mutations are thought to be the primary drivers of cancer, these changes are not sufficient to explain the various aspects of cancer, particularly metastasis. Metastatic cells must adapt dynamically to different biochemical and biomechanical changes in their environment as they migrate through tissue barriers, travel through the bloodstream, and colonize distant sites in the body. This dynamic adaptation is facilitated by two important properties of metastatic cells: phenotypic plasticity and phenotypic robustness. Phenotypic plasticity is the ability of cells to adapt dynamically to their environment by acquiring appropriate phenotypes, such as immune evasion and adhesion to the surroundings during colonization and lack thereof during migration. On the other hand, phenotypic robustness refers to the ability of cells to maintain their phenotypes that are conducive to their survival despite environmental fluctuations. Although these two properties have conflicting natures, a balance between the two is necessary for successful metastasis. We analyze gene regulatory networks underlying metastasis to better understand the emergence of phenotypic plasticity and robustness. Specifically, we focus on the complex gene-regulatory networks underlying Epithelial-Mesenchymal Plasticity (EMP), a critical process in metastasis. Our research reveals that the topological traits of these networks hold key information for explaining these emergent properties. While plasticity and robustness have an antagonistic relationship, common topological features such as positive feedback loops support both properties. These features are uniquely enriched in biological networks, indicating their evolutionary importance. Our findings offer new avenues for developing therapeutic targets to control plasticity and reduce the robustness of hybrid E/M phenotypes of cancer cells, which contribute greatly to their metastatic potential. By targeting the positive feedback loops that support both plasticity and robustness, it may be possible to reduce the ability of cancer cells to adapt to their environment and colonize distant organs in the body. Our findings could have significant implications for developing new cancer treatments that are more effective and targeted. |
5. Simplifying functional network representation and interpretation through causality clustering ABSTRACT. Functional networks, i.e. networks representing the interactions between the elements of a complex system and reconstructed from the observed elements' dynamics, are becoming a fundamental tool to unravel the structures created by the movement of information. Thanks to this approach, the connectivity structure of such systems stops being information required to correctly understand the dynamics, and becomes instead a result of the analysis itself. Functional networks have found applications in many scientific fields, the most prominent one being neuroscience, but also biology, econophysics, air transport, and epidemiology. A problem inherent to functional networks is the complexity associated with their representation and interpretation. The graphical representation of a system composed of even a moderate number of nodes and links is usually a seemingly unstructured cloud of points and lines. To make things even more complex, causality metrics yield links that are directional, with each node potentially being both at the sending and receiving ends of multiple links. Except for some very simple cases and very small networks, it becomes difficult to manually understand the role of each node. In other words, functional networks yield a very detailed representation of the trees, but at the same time they do not help one perceive the global forest. In this contribution we overcome these interpretation challenges through the application of a novel causality clustering approach [1]. The starting point is the hypothesis that the functional networks we observe are the sum of two contributions: a main flow of information, and additional secondary flows. A clearer representation could be obtained if these secondary causality links were deleted, or somehow excluded from the final representation. In order to achieve this, a global target causality pattern is first fixed, i.e. a small graph whose vertices represent clusters (sets of nodes in the original network that share the same causality role) and whose edges represent the main flows of information. Secondly, the nodes of the network are assigned to the clusters so as to maximise the statistical significance of the pattern. To illustrate, in the simplest case of two clusters, only one asymmetric pattern is possible, with information flowing from the first to the second cluster. The result is easy to interpret, as elements in the first cluster are mostly "forcing", while those in the second are mostly "being forced". We will present how this approach can easily be generalised to higher-order patterns and to different causality metrics. We will further discuss the applicability of this approach on a set of synthetic and real-world data sets, the latter representing neuroscience and technological problems. [1] Zanin, M. (2021). Simplifying functional network representation and interpretation through causality clustering. Scientific Reports, 11(1), 15378. |
6. Waking up from slow wave sleep perturbs brain connectivity patterns leading to sleep inertia PRESENTER: Luis Jimenez ABSTRACT. Sleep inertia refers to the state of transition between sleep and wake characterized by impaired alertness, confusion, and reduced cognitive and behavioral performance. While the behavioral symptoms of sleep inertia are well described, the neurological changes that lead to this state remain elusive. Here, to understand the state of sleep inertia and the reorganization that the brain undergoes, we took a graph theoretical approach and compared the EEG derived brain connectivity patterns before sleep and after waking up while participants (n = 10) performed multiple tasks that differed in cognitive complexities. We focused on how the degree and the clustering coefficient of brain regions (EEG sensors) change immediately after participants wake up from slow wave sleep. During a psychomotor vigilance task (PVT), designed to assess vigilance, we find that the brain regions with strong network connectivity (degree) before sleep show a reduction in connectivity after waking. In contrast, those with low connectivity before sleep have greater connectivity after waking. The regions that undergo these changes are specific to each subject and these findings are unique to the beta frequency range, which plays a key role in sensorimotor functioning and preserving the current state of the brain. Moreover, in tasks that required inhibitory control and arithmetic reasoning, we found that only the regions with weak connectivity before sleep exhibited more connectivity after waking, but the regions with high connectivity prior to sleeping remained unchanged, highlighting task specific effects. Furthermore, we find that during the PVT, the clustering coefficient within low frequency oscillations of the brain is reduced upon waking while it remains unchanged during other tasks. These results suggest that the connections between regions that are lost after abrupt awakening can be reallocated to other regions in order to renormalize the brain. However, this response may only be evident during specific cognitive states and may be more nuanced during complex task performance. |
7. Threshold-free estimation of entropy from a Pearson matrix PRESENTER: Helcio Felippe ABSTRACT. The entropic brain hypothesis states that key functional parameters should exhibit increased entropy during psychedelic-induced altered brain states. This hypothesis has gained significant support over the years, particularly via thresholding Pearson correlation matrices of functional connectivity networks. However, the thresholding procedure is known to have drawbacks, mainly its arbitrariness in the threshold value selection. Here we propose an objective procedure for directly estimating a unique entropy of a general Pearson matrix. We show that, upon rescaling, the Pearson matrix satisfies all necessary conditions for an analog of the von Neumann entropy to be well defined. No thresholding is required. To demonstrate the generality and power of the method, we calculate the entropy of functional correlations of the brains of volunteers given the psychedelic ayahuasca. We find that the entropy increases for the ayahuasca-induced altered brain state, as predicted by the entropic brain hypothesis for psychedelic action. |
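A minimal sketch of the procedure described above: rescale the Pearson correlation matrix to unit trace and take the von Neumann-style entropy of its eigenvalues. This is our reading of the abstract, not the authors' released code:

```python
import numpy as np

def pearson_entropy(C):
    """Threshold-free entropy of a Pearson correlation matrix C: divide by its
    trace (the number of variables) to obtain a unit-trace, positive
    semidefinite matrix, then compute -sum_i lambda_i * log(lambda_i) over its
    eigenvalues."""
    rho = C / np.trace(C)
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]              # drop numerical zeros
    return float(-(lam * np.log(lam)).sum())

# Toy example: entropy is lower when the signals are strongly correlated.
rng = np.random.default_rng(0)
common = rng.normal(size=1000)
strong = np.corrcoef(np.vstack([common + 0.1 * rng.normal(size=1000) for _ in range(10)]))
weak = np.corrcoef(rng.normal(size=(10, 1000)))
print(pearson_entropy(strong), pearson_entropy(weak))
```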
8. A Manifold Minimization Principle for Physical Networks PRESENTER: Benjamin Piazza ABSTRACT. From tree branches to the connectome, physical networks combine both a graph structure, describing their topological connectivity patterns, and a physical structure, capturing the geometry of all nodes and links. While the former is abstract, the latter requires material resources for constructing the physical nodes/links and is therefore intrinsically costly. In one-dimensional systems, this cost is directly proportional to the lengths of all links, leading to the classic wiring economy that seeks to connect all nodes via finite paths while minimizing overall link length. However, collecting data from an array of real-world physical networks, we find here that such length minimization is frequently violated, leading to seemingly sub-optimal connectivity. This discrepancy, we show, is rooted in the deficiency of the one-dimensional approach, which ignores the higher-dimensional geometry of physical networks. Indeed, real networks typically have a two- or three-dimensional structure, and hence their cost is invested in their surface area or volume, not just their length. To model this, we employ a string-theory-based approach, mapping the one-dimensional network structure into a higher-dimensional smooth manifold, which naturally allows for much richer morphological patterns, such as varying node geometry and link twisting. With this mathematical formulation, we find that the sub-optimal network structures observed in our real network data can be explained as a balance between cost minimization and the network's functional constraints, such as its capacity for nutrient or blood flow. |
9. Communicability in embedded human connectomes PRESENTER: Laia Barjuan ABSTRACT. Historically, brain network communication has focused on optimal routing, which proposes that information travels through topological shortest paths. However, this approach requires each element of the nervous system to have global information about the topology of the network and this assumption is highly unlikely in a physiological system. To overcome this limitation, other more decentralized routing protocols that take advantage of the geometric nature of the brain have been proposed. One of these methods is the navigation or greedy routing, which involves moving to the neighbor closest to a specific target destination. Several studies have used geometric distances to guide navigation in brain networks and have shown that combining topology and geometry can lead to near-optimal decentralized communication. Most of these studies are based on deterministic navigation protocols and only take into account global distance information. Our study explores communication processes in brain networks using a stochastic approach and combines global distance information with local knowledge of the Euclidean distances from the current node to its neighbors. |
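For reference, the deterministic version of greedy (navigation) routing takes only a few lines; the stochastic variant described above would instead sample among neighbours with probability decreasing in their Euclidean distance to the target. A sketch on a synthetic geometric graph, not connectome data:

```python
import numpy as np
import networkx as nx

def greedy_route(G, pos, source, target, max_hops=1000):
    """Greedy routing: at every step move to the neighbour closest (in Euclidean
    distance) to the target. Returns the path, or None if the walker gets stuck
    in a local minimum before reaching the target."""
    path, current = [source], source
    for _ in range(max_hops):
        if current == target:
            return path
        neighbours = list(G.neighbors(current))
        if not neighbours:
            return None
        nxt = min(neighbours, key=lambda v: np.linalg.norm(pos[v] - pos[target]))
        if np.linalg.norm(pos[nxt] - pos[target]) >= np.linalg.norm(pos[current] - pos[target]):
            return None          # no neighbour is closer to the target: stuck
        path.append(nxt)
        current = nxt
    return None

# Toy example on a random geometric graph, whose coordinates act as the embedding.
G = nx.random_geometric_graph(200, 0.15, seed=2)
pos = {n: np.array(G.nodes[n]["pos"]) for n in G.nodes()}
print(greedy_route(G, pos, 0, 100))
```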
10. Gene regulatory networks and their spatial embedding PRESENTER: Eda Cakir ABSTRACT. For a long time it has been hypothesized that bacterial gene regulation involves an intricate interplay of the transcriptional regulatory network (TRN) and the spatial organization of genes along the chromosome. In this talk we explore this hypothesis both on a structural and on a functional level, using a bacterial gene regulatory network as the main example. On the structural level, we study the TRN as a spatially embedded network, where the embedding space is the circular chromosome. Our work is motivated by ’wiring economy’ research in Computational Neuroscience. On the functional level, we analyze gene expression patterns from a network perspective (’digital control’), as well as from the perspective of the spatial organization of the chromosome (’analog control’). Our structural analysis reveals the outstanding relevance of the symmetry axis defined by the origin (Ori) and terminus (Ter) of replication for the network embedding and, thus, suggests the co-evolution of two regulatory infrastructures, namely the transcriptional regulatory network and the spatial arrangement of genes on the chromosome, to optimize the cross-talk between two fundamental biological processes: genomic expression and replication. |
11. Transitions between polarisation and radicalisation in a temporal bi-layer echo chambers model PRESENTER: Janusz Holyst ABSTRACT. Echo chambers and polarisation dynamics have lately become a very prominent topic in scientific communities around the world. As these phenomena directly affect our lives, and seemingly more and more so as our societies and communication channels evolve, it becomes ever more important for us to understand the intricacies of opinion dynamics in the modern era. Here we extend an existing echo chambers model with activity-driven agents onto a bi-layer topology and study the dynamics of the polarised state as a function of the interlayer couplings. Different cases of such couplings are presented: a unidirectional coupling that can be reduced to a mono-layer facing an external bias, as well as symmetric and non-symmetric couplings. We have assumed that the initial conditions impose system polarisation and that agent opinions differ between the two layers. Such a pre-conditioned polarised state can be sustained without explicit homophilic interactions provided the coupling strength between agents belonging to different layers is weak enough. For a strong unidirectional or attractive coupling between the two layers, a discontinuous transition to a radicalised state takes place, in which the mean opinions in both layers become the same. When the coupling constants between the layers are of different signs, the system exhibits sustained or decaying oscillations. Transitions between these states are analysed using a mean-field approximation and classified in the framework of bifurcation theory. |
12. Chimera states on non-regular higher-order structures PRESENTER: Thierry Njougouo ABSTRACT. Chimera states are dynamical states where coherent and incoherent behaviors coexist in the same system. In the framework of network systems, where the basic units are coupled in pairs, the presence of a regular, non-local and linear coupling topology very often leads to the emergence of chimera states. In general, the term ‘nonlocal’ is applied to large systems of interacting particles where a single particle can interact not only with its nearest neighbors but also with particles far away [D. Balagué et al., Physica D, 2013]. Evidence of such collective behavior has been gathered in several domains, such as chemical reactions, power grids, coupled biochemical oscillators, neural networks, etc. In contrast to pairwise interactions, higher-order interactions refer to cases where more than two entities are allowed to interact, i.e., many-body interactions, and they play an important role in various fields of science, e.g., social science, economics and engineering. Chimera states have recently been investigated on higher-order structures by Srilena K. et al. [Srilena K. and Dibakar G., Phys. Rev. E, 2022], where it was shown, using the Kuramoto model, that chimeras are enhanced by non-local and regular higher-order interactions. In this work, we focus our attention on higher-order interactions, showing that chimera states can be observed also when the coupling, besides being non-local, is also non-regular, as shown in the figure below (left panel). The non-regular topology of the skeleton network underlying the higher-order structure can be observed from the image below (left panel), where it is clear that not all nodes have the same degree. We analyzed the case of Stuart-Landau coupled oscillators in a setting where the system yields chimera states for the pairwise case with linear, regular, attractive-repulsive coupling. We extended the above framework by considering nonlinear, non-local coupling, but also a non-regular structure. The numerical analysis carried out supports the conclusion that, while in the non-regular pairwise case chimeras are elusive and not robust, the non-regular higher-order coupling greatly enhances the chimera behavior. An illustrative figure for the case of 5-body interactions showing chimera states in a hypergraph of 501 Stuart-Landau coupled systems is presented below. We can appreciate a perfect coexistence of coherent and incoherent states in the spatio-temporal plot (middle panel), also visible by looking at the imaginary part of the Stuart-Landau complex variable at a fixed time (right panel). |
13. Higher-order random walkers: Blob diffusion on complex networks PRESENTER: Moritz Nikolaus Laber ABSTRACT. Random walks are among the most fundamental types of dynamics on networks. They have not only been used as the basis for community detection [1], centrality measures [2] and opinion formation [3], but also have been the subject of intensive study in their own right [4]. In this work, we extend the notion of a random walker to higher-order objects of arbitrary size. We refer to this dynamical process as subgraph diffusion or more colloquially blob diffusion. In this framework, diffusion involves a connected subgraph (a “blob”), B_t with size b traversing the network. A blob of size b occupies not only a single node but an entire connected subgraph of size b. At every timestep, the blob transitions into a new configuration B_(t+1) subject to the constraint that the new configuration retains b-1 nodes in B_t while again forming a connected subgraph. Note as well that a classic random walker in this framework is simply a blob of size b=1. While dynamics on the microscopic scale of single nodes and the macroscale of whole networks have been thoroughly explored, dynamics of mesoscopic objects in networks have seen far less attention. Here we explicitly focus on the behavior of the interaction of a mesoscopic object—the blob—with the structure of the underlying network. We examine the properties of this novel dynamical system and explore its connections to previous work on community detection [5,6], motif counting [7] and higher-order random walks [8]. Moreover, we investigate the mereological properties of these higher-order random walkers. This means we study the relationship between the blob and its constituents. From this perspective, we can think of blob diffusion as a toy model for context dependence in complex systems and the role of the individual in the collective. In our view blob diffusion represents a new family of dynamics on networks that has both theoretical importance and practical relevance. References [1] M. Rosvall and C. T. Bergstrom; Proc. Natl. Acad. Sci. 105, 1118 (2008). [2] S. Brin and L. Page; Comput. Netw. ISDN Syst. 30, 107 (1998). [3] C. Cooper et al.; SIAM J. Discrete Math. 27, 1748 (2013). [4] N. Masuda, M. A. Porter & R. Lambiotte; Phys. Rep. 716–717, 1 (2017). [5] M. Rosvall et al.; Nat. Commun. 5, 1 (2014). [6] I. Derényi, G. Palla, & T. Vicsek; Phys. Rev. Lett. 94, 160202 (2005). [7] G. Han & H. Sethu;2016 IEEE 16th ICDM (2016), pp. 181–190. [8] F. Battiston et al.; Phys. Rep. 874, 1 (2020). |
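A simple rejection-sampling sketch of one blob move, assuming an undirected graph and a blob given as a set of nodes; the implementation details are ours, not the authors':

```python
import random
import networkx as nx

def blob_step(G, blob, rng):
    """One move of a size-b 'blob' random walker: drop one node of the current
    connected subgraph and add one outside neighbour, accepting the move only if
    the new node set is still connected (so b-1 nodes of B_t are retained)."""
    while True:
        leave = rng.choice(sorted(blob))
        boundary = sorted({v for u in blob for v in G.neighbors(u) if v not in blob})
        enter = rng.choice(boundary)
        candidate = (blob - {leave}) | {enter}
        if nx.is_connected(G.subgraph(candidate)):
            return candidate

# Toy trajectory of a blob of size b = 4 on the karate club graph.
rng = random.Random(0)
G = nx.karate_club_graph()
blob = {0, 1, 2, 3}
for _ in range(10):
    blob = blob_step(G, blob, rng)
    print(sorted(blob))
```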
14. Seed Selection for Linear Threshold Model in Multilayer Networks PRESENTER: Michał Czuba ABSTRACT. The problem of selecting an optimal seed set to maximise influence in networks has been a subject of intense research in recent years. However, while one can list many works concerning this issue, there is still a missing part to be tackled: multilayer networks. As has been proved in the literature, methods that are robust for single-layer networks are not easily applicable to their multilayer counterparts, which narrows the usability of state-of-the-art research in real-world scenarios such as marketing campaigns, misinformation tracking, or epidemiology, where multilayer networks usually express real conditions better. In this work, we evaluate the efficiency of various metrics used to determine the initial seed set for the multilayer Linear Threshold Model (LTM). |
15. The limitations of outbreak control: A metapopulation network modeling study PRESENTER: Clara Bay ABSTRACT. During an infectious disease outbreak, public health measures such as contact tracing, isolating infectious individuals, and travel restrictions are implemented in an attempt to control and contain disease transmission. The intrinsic properties of a disease, such as the amount of presymptomatic/asymptomatic transmission, the reproduction number, and the generation time, play a critical role in determining the strength of interventions needed to control the outbreak [1]. Here, we will study the combination of factors needed to control a global infectious disease outbreak using a networked, metapopulation epidemic model. Specifically, we will examine two public health measures aimed at controlling an outbreak, which affect the local and global dynamics: (1) isolating detectable, infectious individuals and (2) travel/border screening. To simulate this process, we first analyze the local dynamics that occur within a single homogeneously mixed subpopulation. We implement an SLIR-like compartmental modeling scheme where individuals are classified as either susceptible, latent, infectious, or recovered. The infectious compartment is further divided into individuals that are not detected (e.g., asymptomatic carriers that are harder to detect using symptom-driven public health interventions) and those that are detected, further stratified by isolation status (i.e., due to interventions such as contact tracing and testing policies). Isolated individuals do not contribute to future transmission chains. In Figure 1 we analytically calculate the effective reproduction number (Reff) varying the percentages of undetected transmission and isolated infections. The baseline R0 values, i.e., the reproduction numbers when there are no interventions, are taken from the early COVID-19 pandemic (left) [2] and the 2009-10 Swine Flu pandemic (right) [3]. As expected, Reff decreases from the baseline R0 as the amount of undetected transmission decreases and as the isolation rate increases. Given estimates of the amount of undetected transmission for COVID-19 and the 2009-10 flu pandemic, we see that it would take a much higher level of isolation to decrease the effective reproduction number of COVID-19 below one than it would for the 2009 influenza. We further explore the dynamics of this model within a metapopulation network where subpopulations or communities are coupled by mobility flows. Within this framework, we study the effect of isolation and travel screenings at both the local and global scales by deriving an expression for the global epidemic threshold [4]. This critical threshold, which determines the necessary conditions for the epidemic to spread globally, is a function of the isolation rates, the amount of undetected transmission, and the border screening detection rates. We will validate our analytical results through stochastic, numerical simulations. The methodology presented here can inform future responses to outbreaks by characterizing the controllability of a disease given its intrinsic properties and the strength of both local and global interventions. [1] Fraser, C., Riley, S., Anderson, R. M. & Ferguson, N. M. Factors that make an infectious disease outbreak controllable. Proc Natl Acad Sci USA 101, 6146–51 (2004). URL https://www.ncbi.nlm.nih.gov/pubmed/15071187. [2] Davis, J. T. et al. Cryptic transmission of sars-cov-2 and the first covid-19 wave. Nature 600, 127–132 (2021). 
URL https://www.ncbi.nlm.nih.gov/pubmed/34695837. [3] Balcan, D. et al. Seasonal transmission potential and activity peaks of the new influenza a(h1n1): a monte carlo likelihood analysis based on human mobility. BMC Med 7, 45 (2009). URL https://www.ncbi.nlm.nih.gov/pubmed/19744314. [4] Colizza, V. & Vespignani, A. Epidemic modeling in metapopulation systems with heterogeneous coupling pattern: theory and simulations. J Theor Biol 251, 450–67 (2008). URL https://www.ncbi.nlm.nih.gov/pubmed/18222487. [5] Sah, P. et al. Asymptomatic sars-cov-2 infection: A systematic review and meta-analysis. Proc Natl Acad Sci USA 118 (2021). URL https://www.ncbi.nlm.nih.gov/pubmed/34376550. |
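As a rough illustration of the kind of calculation shown in Figure 1, the sketch below evaluates a toy effective reproduction number over a grid of undetected-transmission shares and isolation coverages. The closed-form expression and the baseline R0 values are our own simplifying assumptions for illustration, not the analytical result derived in the talk.

```python
import numpy as np

def r_eff(r0, undetected, isolated):
    """Toy effective reproduction number: undetected transmission is unaffected
    by interventions, while a fraction `isolated` of detected infections is
    removed from onward transmission. The closed form is an assumption made
    for illustration only."""
    return r0 * (undetected + (1.0 - undetected) * (1.0 - isolated))

undetected = np.linspace(0, 1, 101)      # share of transmission that goes undetected
isolated = np.linspace(0, 1, 101)        # isolation coverage among detected infections
U, Q = np.meshgrid(undetected, isolated)

for r0 in (2.5, 1.5):                    # illustrative baselines (COVID-19-like, flu-like)
    controlled = r_eff(r0, U, Q) < 1.0
    print(f"R0 = {r0}: share of the (undetected, isolation) grid with Reff < 1 "
          f"= {controlled.mean():.2f}")
```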
16. Attitudes towards booster, testing and isolation, and their impact on COVID-19 response in winter 2022/2023 in France, Belgium, and Italy PRESENTER: Giulia de Meijere ABSTRACT. To counter the 2022/23 winter surge due to Omicron subvariants, European countries are currently focusing on testing, isolation, and boosting strategies. However, widespread pandemic fatigue and limited compliance threaten mitigation efforts. To establish a baseline for interventions, we ran a multicountry survey to assess respondents’ willingness to receive booster vaccination and comply with testing and isolation mandates. The vast majority of survey participants (N=4,594) were willing to adhere to testing (>91%) and rapid isolation (>88%) across the three countries. Pronounced differences emerged in the declared senior adherence to booster vaccination (73% in France, 94% in Belgium, 86% in Italy). We inferred the vaccine-induced population immunity profile at the start of winter from prior vaccination data, immunity waning estimates, and declared booster uptake. Integrating survey and estimated immunity data in a branching process epidemic spreading model, we evaluated the effectiveness and costs of protocols that were implemented in France, Belgium, and Italy throughout 2022 to manage the 2022/23 winter wave. Model results estimate that testing and isolation protocols would confer a significant benefit in reducing transmission (17-24%) with declared adherence. While achieving a mitigation level similar to that of the French protocols, the Belgian protocols would require 30% fewer tests and would avoid the long isolation periods of the Italian protocols (6 days on average vs. 11). A cost barrier to testing would significantly decrease adherence in France and Belgium, undermining the protocols’ effectiveness. Simpler isolation mandates may increase awareness and actual compliance, reducing testing costs without compromising mitigation. High booster vaccination uptake remains key for the control of the winter wave. |
17. Biases in prediction of COVID-19 cases through the analysis of genetic copies concentration in wastewater plants PRESENTER: Mattia Mattei ABSTRACT. DISCLAIMER: This work does not use networks. However, we believe it could be of special interest for all those network scientists who work in epidemiology. ABSTRACT: In the context of the SARS-CoV-2 coronavirus global emergency, wastewater-based epidemiology (WBE), i.e. the surveillance of epidemic spreading through the analysis of virus concentration in wastewater plants, is presenting itself as a potential complementary tool to clinical testing. The concept of WBE centers around the knowledge that SARS-CoV-2 RNA can be detected in stool samples excreted by human bodies and then shed into the sewage system. The interest in WBE relies on two main aspects: wastewater data can potentially account for unreported cases, and they can provide an estimate that is ahead in time with respect to diagnostic tests. In this work, we analyzed data on absolute concentrations of the SARS-CoV-2 gene biomarker N1 in wastewater samples collected weekly at the entrance of 16 wastewater treatment plants (WWTP) in Catalonia, Spain. We considered the period between October 2021 and March 2022. We first quantified the delay between sewage data and reported cases by calculating the Pearson correlation between the number of genetic copies in each WWTP (linearly interpolated) and the 7-day averaged number of reported cases summed over all the municipalities served by each specific plant; we performed the correlation analysis shifting back the reported cases from 0 to 20 days. The 16 plants display an average correlation of 0.88 (0.71 - 0.96) and an average delay of 8.7 days (0 - 20). Such high correlation indicates that wastewater data, although extremely volatile and affected by uncertainties, can broadly capture the current trend of the epidemic. Moreover, they seem to anticipate voluntary testing by a substantial number of days, in general more than reported in other studies. Subsequently, we proposed a slight variation of the Susceptible (S), Infected (I), Recovered (R) model that also incorporates wastewater data. In our theoretical framework, infected people are divided into infected but not detected I_N and detected and isolated I_D, with a transition probability p(t) between them which has a non-trivial time dependency. We argue that p(t) is fundamentally related to the general perception that the population has of the ongoing epidemic, and we assumed that p(t) is proportional to the ratio of detected infections at time t. Moreover, at each time step we simulated the shedding of genetic copies of the virus into the sewage through the convolution of daily new infections with a shedding profile distribution. We validated this model using Approximate Bayesian Computation (ABC) for parameter estimation, using both the data sets of genetic copies concentrations and daily reported cases. We were able to estimate the actual prevalence of infection, which turned out to be about 53% against the 19% detected in the same period in Catalonia. In the period between November and December 2021, the daily reported cases were up to 10 times lower than the actual new infections. The model predicts an average anticipation of about 5 days of the simulated wastewater data with respect to the simulated reported cases curve. It also enabled us to estimate a parameter of interest in WBE, i.e. the maximum quantity of genetic copies shed in a gram of feces by an individual during the course of the infection, which results in good agreement with the literature. |
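The shedding step described above (daily new infections convolved with a shedding-profile distribution) can be sketched in a few lines. Everything below—the toy incidence curve, the gamma-shaped shedding kernel, the detection fraction, and the reporting delay—is an assumed placeholder, not the fitted model or the Catalan data.

```python
import numpy as np
from scipy.stats import gamma

days = 150
t = np.arange(days)
incidence = 200.0 * np.exp(-((t - 70) / 20.0) ** 2)   # toy daily new infections

# Assumed shedding profile: relative amount of viral RNA shed on each day since
# infection (gamma-shaped kernel, normalised to sum to one).
d = np.arange(30)
shedding = gamma.pdf(d, a=2, scale=2.5)
shedding /= shedding.sum()
copies_per_infection = 1e7                             # assumed scale factor

# Simulated wastewater signal: convolution of new infections with the kernel
wastewater = copies_per_infection * np.convolve(incidence, shedding)[:days]

# Toy reported cases: 30% detection with a 7-day reporting delay (assumptions)
delay = 7
reported = 0.3 * np.concatenate([np.zeros(delay), incidence[:-delay]])

# Lag maximising the correlation between the wastewater signal and reported cases
corrs = [np.corrcoef(wastewater[:days - k], reported[k:])[0, 1] for k in range(21)]
print("lag with maximal correlation (days):", int(np.argmax(corrs)))
```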
18. The lock-down communities management: Optimal spatio-temporal clustering of the inter-provincial Italian network during the Covid-19 crisis PRESENTER: Jules Morand ABSTRACT. The COVID-19 pandemic has highlighted the need to better understand the dynamics and evolution of epidemics. Moreover, the unprecedented amount of information collected about it, at different levels of resolution, motivates a data-based modelling of pandemics. Besides, companies like Meta have made available co-location and movement data based on cell-phone tracking of their users. These data are completely anonymous and can be used in aggregated form to estimate probabilities of movement of people between different areas, at different scales as well. Numerous studies in statistical physics have shown that it is possible to map an epidemic dynamic onto a network whose nodes correspond to persons or groups, and whose links represent the contacts between them. Using movement data from Meta's Data for Good program and publicly available data on the Italian population from ISTAT, we model the circulation of people in Italy before and during the Covid pandemic. By grouping the data at the level of provinces, we reconstructed the transition matrix for each day for the whole network. Using these matrices, we performed a spatial and a temporal clustering of the Italian movement network. Interestingly, the temporal clustering successfully identifies the first two lockdowns without any further information about them. The spatial clustering results in 11 to 23 clusters depending on the period (normal travelling vs lockdown). The spatial clusters coincide well with proposed Italian macro-regions and can be explained based on the available infrastructure and the economic exchanges between different Italian regions. |
19. Short- and long-term temporal network prediction based on network memory PRESENTER: Li Zou ABSTRACT. Temporal networks like physical contact networks are networks whose topology changes over time. Predicting future temporal networks is crucial, e.g., to forecast and mitigate the spread of epidemics and misinformation on the network. The classic temporal network prediction problem, which aims to predict the temporal network in the short-term future based on the network observed in the past, has been addressed mostly via machine learning algorithms, at the expense of high computational costs and limited interpretation of the underlying mechanisms that form the networks. This motivates us to develop network-based models that predict the future network based on the network properties of node pairs observed in the past. Firstly, we investigate temporal network properties to motivate our network prediction models and to explain how the performance of these models depends on the properties of the temporal networks. We explore the similarity between the network topologies (snapshots) at any two time steps with a given time lag/interval. We find that the similarity is relatively high when the time lag is small and decreases as the time lag increases. Inspired by such time-decaying memory of temporal networks and recent advances, we propose two models that predict a link's activity (i.e., connected or not) at the next time step based on past activities of the link itself or also of the neighboring links, respectively. Via seven real-world physical contact networks, we find that our models outperform baseline methods in both prediction quality and computational cost, and predict better in networks that have a stronger memory. We also reveal how different types of neighboring links contribute to the prediction of a given link's future activity, again depending on the properties of the temporal networks. Furthermore, we adopt both models as well as baseline models for long-term temporal network prediction, that is, predicting temporal networks multiple time steps ahead based on the network topology observed in the past. We find that our models still perform better than baseline models at each step ahead in long-term prediction, that networks with stronger memory show higher prediction quality, and that the decay speed of prediction quality is positively correlated with the decay speed of network memory. |
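A minimal sketch of the two ingredients discussed above—time-decaying similarity between snapshots and a link's self-memory as a predictor of its next activity—is given below on a synthetic link-activity matrix. The persistence parameters, decay factor, and memory length are illustrative assumptions, not the values or the exact estimators used in the study.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_links = 200, 300

# Synthetic link-activity matrix A[t, l] with persistence, so that the
# temporal network has memory (persistence/birth rates are assumptions).
A = np.zeros((T, n_links), dtype=int)
A[0] = rng.random(n_links) < 0.1
for t in range(1, T):
    active = A[t - 1].astype(bool)
    stay = active & (rng.random(n_links) < 0.8)
    born = ~active & (rng.random(n_links) < 0.02)
    A[t] = stay | born

def jaccard(x, y):
    union = np.logical_or(x, y).sum()
    return np.logical_and(x, y).sum() / union if union else 0.0

# Time-decaying similarity between snapshots separated by a given lag
for lag in (1, 5, 20, 50):
    sims = [jaccard(A[t], A[t + lag]) for t in range(T - lag)]
    print(f"lag {lag:>2}: mean snapshot similarity {np.mean(sims):.2f}")

# Self-memory predictor: score each link by an exponentially weighted sum of
# its own past activities (decay and memory length are assumed parameters).
decay, memory = 0.7, 10
weights = decay ** np.arange(1, memory + 1)
t0 = T - 1
past = A[t0 - np.arange(1, memory + 1)]          # shape (memory, n_links)
score = (weights[:, None] * past).sum(axis=0)
k = int(A[t0].sum())                             # predict as many links as are truly active
top = np.argsort(score)[::-1][:k]
print("precision of the self-memory predictor at the final step:",
      round(float(A[t0][top].mean()), 2))
```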
20. Testing Causal Timescales in Temporal Networks PRESENTER: Anatol E. Wegner ABSTRACT. Temporal networks capture how interactions in complex systems evolve over time. In general, mechanisms underlying real-world complex systems can operate at multiple timescales. Therefore, quantitative methods for identifying the characteristic timescales at which temporal correlations and causal structures take place have been an active field of research. Although in some cases one might refer to domain knowledge to determine the characteristic timescales, in most cases there is no known ground truth, which is one of the major challenges in testing timescale detection methods. In this work, we consider synthetic and empirical data sets that address these challenges in the context of an information-theoretic method that is aimed at identifying timescales at which causal paths are most predictable. |
21. Graph Neural Networks for temporal graphs: State of the art, open challenges, and opportunities PRESENTER: Veronica Lachi ABSTRACT. The ability to process temporal graphs is becoming increasingly important in a variety of fields such as social interaction, contact tracing, recommendation systems, and many others. Traditional graph-based models are not well suited for analyzing temporal graphs as they assume a fixed structure and are unable to capture their temporal evolution. Therefore, in the last few years, several models able to directly encode temporal graphs have been developed, such as random walk-based methods, temporal motif-based methods and matrix factorization-based approaches. Recently, GNNs have been successfully applied also to temporal graphs, achieving state-of-the-art results on tasks such as temporal link prediction, node classification and edge classification. However, despite the potential of GNN-based models for temporal graph processing and the variety of different approaches that emerged, a systematization of the literature is still missing. Existing surveys either discuss general techniques for learning over temporal graphs, only briefly mentioning temporal extensions of GNNs, or focus on specific topics, like temporal link prediction or temporal graph generation. This work aims to fill this gap by providing a systematization of existing GNN-based methods for temporal graphs, or Temporal GNNs (TGNNs), and a formalization of the tasks being addressed. Our main contributions are the following: i) We propose a coherent formalization of the different learning settings and of the tasks that can be performed on temporal graphs, unifying existing formalism and informal definitions that are scattered in the literature, and highlighting substantial gaps in what is currently being tackled. ii) We organize existing TGNN works into a comprehensive taxonomy that groups methods according to the way in which time is represented and the mechanism with which it is taken into account. iii) We highlight the limitations of current TGNN methods, discuss open challenges that deserve further investigation and present critical real-world applications where TGNNs could provide substantial gains. |
22. Common spatial pattern for time-varying networks PRESENTER: Juliana Gonzalez-Astudillo ABSTRACT. Time-varying networks are graphs whose topology changes in time. These mathematical models capture how the interaction architecture of the units in a complex system evolves dynamically [1]. In many real situations, an important question is how to discriminate between behaviors from the underlying dynamic changes of the network nodes. Here, we address this problem through the lens of machine learning and adopt a signal processing perspective to improve the classification of the network behavior. To do so, we first transform the temporal network into a set of multivariate node strength time series, which measure how nodes change their connectivity over time (Fig. 1A). We then establish a formal link with the common spatial pattern (CSP) algorithm, a supervised data-driven method to extract the signal sources maximizing the separation between two conditions [2]. Finally, we evaluate the global performance through a support vector machine (SVM) classifier. We validate our framework on a set of temporal networks obtained from EEG brain recordings in a healthy subject during a motor task involving their left and right hand [3]. Fig. 1B shows the resulting spatial filters, which lead to a classification accuracy of 0.92. Notably, these topographies emphasize motor areas, while the power spectrum of the highlighted node corroborates a difference between conditions (Fig. 1C). Altogether, these preliminary results provide a new framework that leverages signal processing and machine learning to study complex networks. |
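The pipeline described above (node-strength time series, CSP spatial filters, log-variance features, SVM) can be sketched as follows. The synthetic trials below simply stand in for node-strength time series extracted from temporal networks, and all dimensions and parameters are assumed for illustration; this is the textbook CSP construction via a generalized eigendecomposition, not the authors' exact implementation.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_nodes, n_times, n_trials = 16, 200, 40

# Synthetic trials standing in for node-strength time series (nodes x time);
# the two conditions differ in which node carries large fluctuations (assumption).
def make_trials(boosted_node):
    mix = np.eye(n_nodes)
    mix[boosted_node, boosted_node] = 3.0
    return [mix @ rng.standard_normal((n_nodes, n_times)) for _ in range(n_trials)]

cond1, cond2 = make_trials(2), make_trials(9)

def mean_cov(trials):
    return np.mean([x @ x.T / x.shape[1] for x in trials], axis=0)

C1, C2 = mean_cov(cond1), mean_cov(cond2)

# CSP: generalized eigenvectors of (C1, C1 + C2); extreme eigenvectors maximise
# variance under one condition while minimising it under the other.
_, vecs = eigh(C1, C1 + C2)
filters = np.concatenate([vecs[:, :2], vecs[:, -2:]], axis=1).T   # 4 spatial filters

def log_var_features(trials):
    return np.array([np.log(np.var(filters @ x, axis=1)) for x in trials])

X = np.vstack([log_var_features(cond1), log_var_features(cond2)])
y = np.array([0] * n_trials + [1] * n_trials)
clf = SVC(kernel="linear").fit(X, y)
print("training accuracy:", clf.score(X, y))
```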
23. The exploration-exploitation paradigm for modelling dynamic networks PRESENTER: Vito Dichio ABSTRACT. A number of natural systems – notably in biology – can be thought of as arising from the interplay between the exploration of the configuration space and the selection of advantageous patterns that appeared by chance, in a time-dependent fashion. This is strictly true in the case of Darwinian evolution, where random genetic mutations may lead to the appearance of individuals fitter to the environment than the others at a given time. We take the latter as a general principle and draw analogies from evolutionary models to design a theoretical framework for the dynamics of networked systems – unweighted, undirected. Notably, we assume the existence of one or more optimal states encoded – either explicitly or implicitly – as maxima of a function F(G). We explore two simple case studies: a distance-like F with a single maximum and an energy-like F akin to the Hamiltonian of the well-known 2-star model. In the first case, we are able to derive an analytical solution for the dynamics of the graph density. In the second case, we study the asymptotic behaviour and find a phase transition from a disordered to an ordered phase. Our theoretical contribution, together with the simulation tool we devise, paves the way for novel statistical network models for systems apt to be investigated under the lens of exploration-exploitation dynamics. |
24. Information transfer in co-location networks PRESENTER: Zexun Chen ABSTRACT. Social structures influence human behaviour, including their movement patterns. Indeed, latent information about an individual's movement can be present in the mobility patterns of both acquaintances and strangers. We develop a "co-location" network to distinguish the mobility patterns of an ego's social ties from those of people not socially connected to the ego but who arrive at a location at a similar time as the ego. Using entropic measures, we analyse and bound the predictive information of an individual's mobility pattern and its flow to both types of ties. While the former generically provide more information, replacing up to 94% of an ego's predictability, significant information is also present in the aggregation of unknown co-locators, which contains up to 85% of an ego's predictive information. Such information flow raises privacy concerns: individuals sharing data via mobile applications may be providing actionable information on themselves as well as on others whose data are absent. |
25. Scaling laws associated with percolation transition in the city size distribution PRESENTER: Naoya Fujiwara ABSTRACT. How to define a "city" is a difficult problem. For example, the city size depends on its definition: if we adopt the definition of a city as a connected component of square grid cells in which the population is greater than a certain threshold value, the city size follows a power-law distribution. This definition is reminiscent of the percolation model with spatial correlation in the population distribution. It is well known that the cluster size follows a power-law distribution, together with the scaling associated with critical phenomena, if we set the parameter value close to the percolation threshold. This fact leads us to carry out analyses related to critical phenomena in the city size distribution. Although there are some previous studies on cities using percolation, our study focuses on modelling based on real data such as the population distribution and topography. In this study, we used the 500-meter square grid population mesh data from the Japanese census. We set a threshold n_c and consider the connected components (clusters) of grid cells whose population is greater than n_c as cities (metropolitan areas). The obtained clusters show properties similar to those observed in site percolation that indicate critical phenomena, and the population distribution follows a power law. In addition, we randomly shuffled the population of the square grid cells and found that the clusters obtained after randomization are still located close to real cities. These results suggest the importance of geographical effects on city location, and a relation between the power-law city size distribution and percolation. |
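A minimal sketch of the clustering step—thresholding a gridded population field and taking connected components as cities—is given below. The smoothed random field stands in for the 500 m census mesh, and the threshold choice is an assumption; only the general procedure follows the abstract.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)

# Spatially correlated toy population field standing in for the 500 m mesh data
field = ndimage.gaussian_filter(rng.standard_normal((400, 400)), sigma=5)
population = np.exp(3 * field)                      # positive, heavy-tailed values

n_c = np.quantile(population, 0.7)                  # occupation threshold (assumed)
occupied = population > n_c

# Connected components of above-threshold cells play the role of cities
labels, n_clusters = ndimage.label(occupied)
cluster_pop = ndimage.sum(population, labels, index=np.arange(1, n_clusters + 1))

# Rank-size view of the cluster-population distribution (Zipf-style data)
ranked = np.sort(cluster_pop)[::-1]
for r in (1, 10, 100):
    if r <= ranked.size:
        print(f"rank {r}: cluster population {ranked[r - 1]:.0f}")
```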
26. The resilience of post-lockdown mobility networks PRESENTER: Lucila G. Alvarez-Zuzek ABSTRACT. Cities are complex systems, evolving and adapting to their environment. The analysis of the networks describing urban mobility flows is a central aspect of transportation analysis and modelling. Moreover, their understanding is also at the core of many other important applications ranging from urbanism to epidemics containment, as they represent the backbone of all human interactions, favouring the social and economic functioning of the city, as well as productivity and innovation. This has become ever so evident since the COVID-19 pandemic affected our cities in an unprecedented way. Governmental containment measures massively altered the mobility of citizens and changed the functional characteristics of the urban landscape. Quantifying in which way the structure of urban movements changed during this crisis can help inform possible alternative containment measures in upcoming pandemics and, more generally, represents a great opportunity to reach a deeper understanding of the organisation of cities. The goal of this work is indeed to characterise the evolution of the functional organisation of cities before, during, and after the COVID-19 lockdowns through the lens of network science. We do this by using a large privacy-enhanced dataset describing the trajectories of anonymous opted-in individuals. This dataset allowed us to reconstruct the aggregated mobility flows in four major cities across the U.S.A. (New York City, Seattle, Washington D.C., and Boston). Understanding changes in human behaviour through these events is indeed crucial from a public health perspective. The lockdown scenario presented a natural experiment not only to help enhance future policy decision-making but also to lead toward a better general understanding of the complex structure of cities, and therefore to help improve their inhabitants' sustainability, health, and well-being. We first approach this problem by applying the methodology introduced by Gallotti et al., which focuses on how urban networks are able to process information, measuring to what extent different areas of the city facilitate human flows -- functional integration -- and to what extent there are separate clusters of connected areas -- functional segregation. By considering those measures simultaneously, it is possible to characterise how well human flows mix through the city according to the existing distribution of venues and the way residents use them. We observe, in the cities studied and using different temporal granularities (see the example of Boston in the figure), that the integration of the urban flow network was largely restored after the lockdowns. In parallel, the level of network segregation instead grew during the lockdown, but progressively dropped afterwards, approaching pre-pandemic values by the end of summer 2020. This resilience is remarkable, since total mobility was still greatly reduced in summer, as measured in our data, which still register far smaller mobility counts than before the onset of the epidemic. Not only do cities appear to have successfully recovered from this shock, but we even observe some evidence of anti-fragile behaviour in urban mobility networks. Post-lockdown flow networks, even though they comprise fewer movements and are formed of a smaller number of edges, surprisingly display slightly larger Largest Connected Components. 
Finally, we test the universal visitation law by Schlapfer et al. by computing the product $rf$ of travel distance $r$ and visiting frequency $f$. Although we do not reproduce the exact scaling found in the original paper, possibly due to methodological differences, the distribution $p(rf)$ appears likewise robust to the shock of the lockdowns and the associated reduced flows. The visitation law proves universal not only across different cities, but also when comparing before, during and after the lockdowns. These results suggest that, despite the unprecedented reduction in mobility flows due to governmental interventions aimed at containing the spread of COVID-19, the underlying mechanisms ruling the distribution of urban mobility were not disrupted. The cities analysed were extremely resilient, as they recovered their natural functioning, characterised by functional integration and segregation, regardless of major post-lockdown changes in mobility, including significant changes in individual habits such as those associated with remote working. |
27. Potential landscape of human flow in cities PRESENTER: Takaaki Aoki ABSTRACT. People move from one location to another in their daily lives, for commuting, shopping, entertainment, school, etc. This human flow provides vital information for unfolding the actual shape of cities based on real human behavior, through place-to-place interactions from origin to destination. However, it is not easy to handle massive data on human flows as they are because, for example, when there are 1,000 locations on a map, the flow dataset is described by a million origin-destination links. In this study, we identified the potential of human flow directly from a given origin-destination matrix. Using the metaphor of water flowing from a higher place to a lower place, the potential landscape gives an intuitive perspective on the human flow and determines the map of the urban structure behind the massive movements of people. From this map, we can easily identify the sinks (attractive places) and the sources of human flow, rather than just populated places. The detected attractive places provide valuable information for location decision-making for commercial or public buildings, optimisation of transportation systems, urban planning by policy makers, and measures for movement restrictions under a pandemic. |
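One simple way to make the "water flowing downhill" metaphor concrete is to fit a scalar potential to the net origin-destination flows by least squares, so that flow tends to run from high- to low-potential locations. The sketch below illustrates this idea on a toy OD matrix; this particular formulation is our own assumption and need not coincide with the authors' definition of the potential.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
OD = rng.integers(0, 200, size=(n, n)).astype(float)   # toy origin-destination counts
np.fill_diagonal(OD, 0.0)

F = OD - OD.T                                           # antisymmetric net flow

# Fit a scalar potential phi so that net flow runs "downhill":
# for every ordered pair (i, j), phi_i - phi_j should match the scaled net flow.
pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
A = np.zeros((len(pairs), n))
rhs = np.empty(len(pairs))
scale = np.abs(F).mean()
for k, (i, j) in enumerate(pairs):
    A[k, i], A[k, j] = 1.0, -1.0
    rhs[k] = F[i, j] / scale
phi, *_ = np.linalg.lstsq(A, rhs, rcond=None)
phi -= phi.mean()                                       # potential fixed up to a constant

print("most attractive locations (lowest potential):", np.argsort(phi)[:5])
```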
28. Quantifying and forecasting the mobility change in Shanghai during the SARS-CoV-2 Omicron Outbreak PRESENTER: Bin Sai ABSTRACT. We explore the spatiotemporal evolution of human mobility patterns, across multiple demographic attributes and at high geographical resolution, during COVID-19 transmission. In this study, we use large-scale mobile signaling data from Shanghai, China during the SARS-CoV-2 Omicron outbreak in 2022, covering five distinct phases: 1) pre-outbreak (February 14-28, excluding the effects of China's Lunar New Year), 2) pre-lockdown (March 1-31), 3) Shanghai lockdown (April 1-30), 4) late lockdown (May 1-31), and 5) post-lockdown (June 1-30). We find that mobility patterns are heterogeneous across age groups, gender, and time of movement, and that these demographic differences are more prominent in the younger groups. The pandemic and non-pharmaceutical interventions have reshaped movement patterns, with different degrees of impact on human movement in different periods. These findings provide a better understanding of the role of mobility network structure in determining the spread of major public health incidents, and serve as a reference for public health and preventive medicine practice. |
29. Mobility Census for the analysis of rapid urban development PRESENTER: Gezhi Xiu ABSTRACT. Monitoring urban structure and development requires high-quality data at high spatio-temporal resolution. In comparison to the acceleration and aggregation of human activity in ever-larger cities and the increased pace of urban development, traditional censuses are out of pace. An alternative is offered by the analysis of other big-data sources, such as human mobility data. However, these often noisy and unstructured big data pose new challenges. Here we propose a method to extract meaningful explanatory variables and classifications from such data. Using movement data from Beijing, which are produced as a byproduct of mobile communication, we show that meaningful features can be extracted, revealing for example the emergence and absorption of subcenters. This method allows the analysis of urban dynamics at a high spatial resolution (here, 500m) and near real-time frequency. |
30. Course-Prerequisite Networks for Analyzing and Understanding Academic Curricula PRESENTER: Konstantin Zuev ABSTRACT. Understanding a complex system of relationships between courses is of great importance for the university's educational mission. This talk is dedicated to the study of course-prerequisite networks that model interactions between courses, represent the flow of knowledge in academic curricula, and serve as a key tool for visualizing, analyzing, and optimizing complex curricula. We show how course-prerequisite networks can be used by students, faculty, and administrators for detecting important courses, improving existing and creating new courses, navigating complex curricula, allocating teaching resources, increasing interdisciplinary interactions between departments, revamping curricula, and enhancing the overall students' learning experience. The proposed methodology is illustrated with a network of courses taught at the California Institute of Technology. |
31. Scholarly Recognition and Transition Patterns in the Scientific Awards Network PRESENTER: Yixuan Liu ABSTRACT. Introduction: Scientific awards recognize individual achievements and contributions to advancing scientific research. Some of these awards are highly prestigious and widely recognized within the academic community, while others have also gained attention and awareness among the general public. In this study, we seek to understand how scholars have broken out of academic circles to achieve broader recognition through the analysis of scientific award networks. By examining the proceedings of these awards and their network structures, we aim to gain insights into the factors that have facilitated the promotion of scientific advancement and the recognition of individual researchers in the broader society. Methods: We used three different datasets, namely Wikidata, OpenAlex, and a recently published notability dataset (https://www.nature.com/articles/s41597-022-01369-4), to collect data on over 7,000 awards and 41,000+ academic scholars, along with their associated publications and citations. We constructed a directed award network by connecting nodes representing awards with weighted edges that capture the flow of people transitioning between the awards. The weight of each edge is based on the number of individuals who have received both awards represented by the connected nodes. Using the directed award network, we computed transition probabilities between awards based on the outgoing link weight of each award divided by the total in-going strength for that award. This method allowed us to identify the most significant transitions between awards and to analyze the patterns of how scholars have moved between different awards over time. Results: The analysis of the scientific awards network revealed the presence of large hubs of popular awards and clusters by academic discipline, as shown in Figure 1(A). The transition probabilities also captured these patterns, with high within-discipline transition probabilities found for chemistry, biology, and physics, as illustrated in Figure 1(B) using Nobel Prize laureates as an example. Furthermore, strong directed transitions toward Nobel Prizes were identified, with certain awards, such as the Wolf Prize in Physics and the Oliver E. Buckley Condensed Matter Prize, serving as precursors to Nobel Prizes, as shown in Figure 1(D). Our analysis also revealed that winning a Nobel Prize significantly increases a scholar's reputation (K-S statistic=0.55, p-value=0.004), as demonstrated in Figure 1(C). |
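The transition computation described in the Methods can be sketched as follows on a toy weighted, directed award network (edge weights are hypothetical, and only the award names mentioned in the abstract are real). We follow the abstract's wording literally—each outgoing edge weight divided by the source award's total in-going strength—so the result is a transition score rather than a normalized probability; normalizing by the destination's in-strength or the source's out-strength would be equally plausible readings.

```python
import networkx as nx

# Toy weighted, directed award network: an edge (a, b, w) means w scholars
# received award a before later receiving award b (weights are made up).
edges = [
    ("Wolf Prize in Physics", "Nobel Prize in Physics", 14),
    ("Oliver E. Buckley Prize", "Nobel Prize in Physics", 9),
    ("Award X", "Wolf Prize in Physics", 5),
    ("Oliver E. Buckley Prize", "Wolf Prize in Physics", 4),
    ("Nobel Prize in Physics", "Award Y", 3),
]
G = nx.DiGraph()
G.add_weighted_edges_from(edges)

# Literal reading of the abstract: each outgoing edge weight of an award divided
# by that award's total in-going strength (awards with zero in-strength skipped).
in_strength = dict(G.in_degree(weight="weight"))
scores = {
    (a, b): d["weight"] / in_strength[a]
    for a, b, d in G.edges(data=True)
    if in_strength[a] > 0
}
for (a, b), s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{a} -> {b}: {s:.2f}")
```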
32. A Fine-Grained Map of All Sciences: Visualizing Nations' Scientific Production PRESENTER: Filipi N. Silva ABSTRACT. Science is a heterogeneous and complex system that encompasses a vast array of disciplines and concepts. To gain situational awareness of current trends and knowledge gaps, we visualize the landscape of all sciences by embedding hundreds of millions of papers using SPECTER [1], a language model that learns the similarity of paper titles connected by citations. We employ the Microsoft Academic Graph [2] to train the model and generate a map of science for the entire set of papers in the Web of Science dataset. We then use UMAP [3] to reduce the original dimension from 756 to 2D and 3D, trained on a uniformly random 2.5% sample. Our map provides a comprehensive view of the highly structured landscape of scientific disciplines, capturing the nuanced relationships between subjects. For example, "Chemistry" lies between Physics, Tech and Engineering, and Biology, and "Health" is surrounded by Medicine and Psychology. The map also reveals potential holes that lie both between and within disciplines. To facilitate communication among different stakeholders, we develop an interactive Web-based exploratory visualization that provides an intuitive, holistic understanding of science. The visualization is powered by Helios-Web [4] and a GPU-accelerated density estimation algorithm, enabling real-time visualization of millions of papers on a standard laptop. As an example, we showcase our map to gain situational awareness of the current state of national science enterprises. We associate papers with countries based on the funding agencies acknowledged in the paper, extracted from the Web of Science. We visualize the production of papers associated with the United States and China, revealing differences in research interests between the two countries. Specifically, the United States appears to have a stronger focus on health and medicine, while China's research production is more pronounced across the Tech and Engineering field. Overall, our visualization offers a powerful way to foster communication among a wider audience, including researchers, policymakers, and the general public, making a valuable contribution to the fields of Network Science and the Science of Science. [1] A. Cohan, S. Feldman, I. Beltagy, D. Downey, and D. S. Weld, “Specter: Document-level representation learning using citation informed transformers,” arXiv preprint arXiv:2004.07180, 2020. [2] K. Wang, Z. Shen, C. Huang, C.-H. Wu, Y. Dong, and A. Kanakia, “Microsoft academic graph: When experts are not enough,” Quantitative Science Studies, vol. 1, no. 1, pp. 396–413, 2020. [3] L. McInnes, J. Healy, and J. Melville, “UMAP: Uniform manifold approximation and projection for dimension reduction,” arXiv preprint arXiv:1802.03426, 2018. [4] http://github.com/filipinascimento/helios-web |
33. It’s in the syllabus: A data infrastructure for innovation studies PRESENTER: Qing Ke ABSTRACT. We present it’s in the syllabus, a full-stack, continuously updating data infrastructure suitable for innovation studies and seek the community’s use and feedback as well as contributions of additional data. |
34. Network Resiliency to Systemic Failures within Time Horizon by Managing Robustness/Recoverability Capabilities: Work in Progress ABSTRACT. Numerous systemic failures of various networked infrastructures demonstrate that interconnectivity creates various risks, including the systemic risk of undesirable contagion. This risk is especially pronounced in the case of discontinuous instability, due to higher potential losses and the possibility of high-loss metastable, i.e., persistent, equilibria within the system stability region. We propose a framework for maximizing system resilience with respect to the allocation of robustness/recoverability capabilities, given the system time horizon of interest, risk tolerance, and total budget. As an example, we demonstrate that an increase in risk averseness, adjusted for the time horizon, shifts the optimal resource allocation from systemic failure prevention to recovery. |
35. A network-based strategy of price correlations for optimal cryptocurrency portfolios PRESENTER: Ruixue Jing ABSTRACT. A cryptocurrency is a digital asset maintained by a decentralized system using cryptography. Investors in this emerging digital market are exploring the potential to profit in the same way they do in the traditional financial market, by considering robust portfolios. Since the cryptocurrency market is a self-organized complex system, we aim to exploit the complex inter-dependencies between cryptocurrencies to understand the dynamics of the market and to build efficient portfolios using the framework of network theory. By mapping the correlations between cryptocurrencies within specific periods, we can select highly decorrelated cryptocurrencies to form diversified portfolios using Markowitz Portfolio Theory. Our methodology allows us to study the optimal number of cryptocurrencies in the portfolio. We found that 46 coins give the maximum average return within a 1-day investment horizon. The performance of our portfolios is superior to benchmarks up to an investment horizon of 5 days, reaching up to a 1,066% average return within the 1-day horizon. Such large returns come with an associated risk that we deem reasonable, given that our analysis assumes completely agnostic knowledge about the future market. We also show that popular cryptocurrencies are typically not the most interesting ones to include in the portfolio to maximize the average returns. Short-term cryptocurrency investments may be competitive with traditional high-risk investments such as the stock or commodity markets, but call for caution given the high variability of prices. Taking into account the past correlations of prices provides a means to reduce risk and improve the performance of cryptocurrency portfolios, and may be an alternative indicator for building reasonable portfolios in comparison to other methodologies based on cryptocurrency price prediction. |
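A compact sketch of the portfolio construction idea—select weakly correlated assets from the correlation structure, then apply Markowitz-style optimization—is shown below on synthetic returns. The greedy decorrelation rule, the correlation threshold, and the risk-aversion parameter are our own assumptions; the authors' actual selection and benchmark procedures may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n_assets, n_days = 60, 180
returns = rng.normal(0.001, 0.05, size=(n_days, n_assets))   # toy daily returns

corr = np.corrcoef(returns, rowvar=False)

# Greedy selection of mutually decorrelated assets (threshold is an assumption)
threshold, selected = 0.3, [0]
for j in range(1, n_assets):
    if all(abs(corr[j, s]) < threshold for s in selected):
        selected.append(j)

sub = returns[:, selected]
mu = sub.mean(axis=0)
cov = np.cov(sub, rowvar=False)

# Markowitz mean-variance weights: maximise mu'w - (gamma/2) w'Cov w subject to
# sum(w) = 1, solved in closed form via a Lagrange multiplier (gamma assumed).
gamma = 5.0
inv = np.linalg.inv(cov)
ones = np.ones(len(selected))
w = inv @ (mu / gamma)
w += inv @ ones * (1.0 - ones @ w) / (ones @ inv @ ones)
print("portfolio size:", len(selected))
print("expected daily portfolio return:", float(mu @ w))
```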
36. Multidimensional Economic Complexity: How the Geography of Trade, Technology, and Research Explain Inclusive Green Growth PRESENTER: Viktor Stojkoski ABSTRACT. To achieve inclusive green growth, countries need to consider a multiplicity of economic, social, and environmental factors. These are often captured by metrics of economic complexity derived from bipartite trade networks, which, however, are typically based on trade data alone. To bridge this gap, we introduce a multidimensional approach to economic complexity that combines data on the geography of exports by product, patents by technology, and scientific publications by field of research (Figure 1). We use this approach to estimate the trade Economic Complexity Index (ECI (trade)), the technology Economic Complexity Index (ECI (technology)), and the research Economic Complexity Index (ECI (research)). We use these indexes to explain variations in economic growth, income inequality, and greenhouse emissions. We show that measures of complexity built on trade and patent data combine to explain future economic growth and income inequality, and that countries that score high in all three metrics tend to exhibit lower emission intensities. These findings illustrate how a multidimensional network approach to economic complexity, using data on the geography of trade, technology, and research, can explain inclusive green growth. |
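For readers unfamiliar with how an Economic Complexity Index is typically computed from a bipartite geography-activity matrix, the sketch below shows the standard eigenvector-based construction on synthetic data. The same recipe can be run on export, patent, or publication matrices; the specific pipeline (RCA threshold, normalization, sign convention) is the common textbook version and is only assumed to approximate the one used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_countries, n_activities = 40, 120
X = rng.pareto(1.5, size=(n_countries, n_activities)) + 1e-9   # toy volumes

# Revealed comparative advantage and binary specialisation matrix
rca = (X / X.sum(axis=1, keepdims=True)) / (X.sum(axis=0) / X.sum())
M = (rca >= 1).astype(float)

# Standard ECI: second eigenvector of D_c^{-1} M D_p^{-1} M^T
kc = np.maximum(M.sum(axis=1), 1)     # diversity (guarded against empty rows)
kp = np.maximum(M.sum(axis=0), 1)     # ubiquity (guarded against empty columns)
Mcc = (M / kc[:, None]) @ (M / kp).T

vals, vecs = np.linalg.eig(Mcc)
second = vecs[:, np.argsort(-vals.real)[1]].real
eci = (second - second.mean()) / second.std()
if np.corrcoef(eci, kc)[0, 1] < 0:    # fix the sign so ECI grows with diversity
    eci = -eci
print("top-5 toy countries by ECI:", np.argsort(-eci)[:5])
```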
37. Product Progression: a machine learning approach to forecasting industrial upgrading PRESENTER: Giambattista Albora ABSTRACT. The relatedness between an economic actor (for instance a country, or a firm) and an economic activity (a product) is a measure of the feasibility of that economic activity. As such, it is a driver for investments at both a private and an institutional level. Traditionally, relatedness is measured using complex network approaches derived from co-occurrences, like the Product Space: a country is related to a product p if it already exports similar products (products that in the product network are close to p). In our work, we introduce a method to measure relatedness based on machine learning and compare it with the usual complex network approaches, finding that machine learning, in particular decision-tree-based algorithms like Random Forest and XGBoost, performs better. In order to quantitatively compare the different measures of relatedness, we use them to predict the future exports of countries, assuming that more related products have a higher likelihood of being exported in the near future. Although they provide a better relatedness measure, machine learning algorithms have the disadvantage of being less interpretable than networks. Indeed, when a network approach suggests that a country c is close to a product p, one can look at the network of products and visualize that country c exports products that are close to p. This issue can be addressed with a feature importance study, whose output is a measure of which products are useful in order to start exporting another product. |
38. A Weighted and Normalized Gould‒Fernandez brokerage measure PRESENTER: Zsófia Zádor ABSTRACT. The Gould and Fernandez local brokerage measure defines brokering roles based on the group membership of the nodes of the incoming and outgoing edges. This talk extends this brokerage measure to account for weighted edges and introduces the Weighted and Normalized Gould‒Fernandez measure (WNGF). We define brokerage as follows: if the sum of the inverses of the transaction flows between the focal node and either neighbor is smaller than the inverse of the flow directly from the incoming to the outgoing neighbor, then the focal node is identified as a broker. Fig. 1 visualizes when node r will be identified as a broker in the different roles. The added value of this new measure is demonstrated empirically with both a macro-level trade network and a micro-level organization network. The results gained from the WNGF measure are compared to those from two dichotomized networks: a threshold network and a multiscale backbone network. The results show that the WNGF generates valid results, consistent with those of the dichotomized networks. In addition, it provides the following advantages: (i) it ensures information retention; (ii) since no alterations and decisions have to be made on how to dichotomize the network, the WNGF frees the user from the burden of making assumptions; (iii) it provides a nuanced understanding of each node's brokerage role. These advantages are of special importance when the role of less connected nodes is considered. The two empirical networks used here are for illustrative purposes. Possible applications of WNGF span beyond regional and organizational studies, into all those contexts where retaining weights is important, for example by accounting for persisting or repeating edges compared to one-time interactions. WNGF can also be used to further analyze networks that measure how often people meet, talk, text, like, or retweet. WNGF makes a relevant methodological contribution as it offers a way to analyze brokerage in weighted, directed, and even complete graphs without information loss, and it can be used across disciplines and different types of networks. |
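The weighted brokerage condition quoted above translates directly into a small test, sketched below with networkx. The treatment of a missing direct edge as zero flow (hence an infinite inverse) is our reading of the definition, and the group-membership step that assigns the specific brokerage role is omitted.

```python
import networkx as nx

def is_wngf_broker(G, i, r, j):
    """Weighted brokerage test from the abstract: r brokers the ordered pair
    (i, j) if 1/w(i,r) + 1/w(r,j) < 1/w(i,j). A missing direct edge i -> j is
    treated as zero flow, i.e. an infinite inverse (our reading)."""
    indirect = 1.0 / G[i][r]["weight"] + 1.0 / G[r][j]["weight"]
    if not G.has_edge(i, j):
        return True
    return indirect < 1.0 / G[i][j]["weight"]

G = nx.DiGraph()
G.add_weighted_edges_from([("A", "R", 10.0), ("R", "B", 8.0), ("A", "B", 0.5)])
# Group membership of A, R and B would then determine the specific role
# (coordinator, gatekeeper, liaison, ...); here we only test the condition.
print(is_wngf_broker(G, "A", "R", "B"))   # True: the path through R carries more flow
```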
39. Effects of syndication network on specialisation and performance of venture capital firms PRESENTER: Qing Yao ABSTRACT. The Chinese venture capital (VC) market is a young and rapidly expanding financial subsector. Gaining a deeper understanding of the investment behaviours of VC firms is crucial for the development of a more sustainable and healthier market and economy. Contrasting evidence supports that either specialisation or diversification helps to achieve a better investment performance. However, the impact of the syndication network is overlooked. The syndication network has a great influence on the propagation of information and trust. By exploiting an authoritative VC dataset of thirty-five years of investment information in China, we construct a joint-investment network of VC firms and analyse the effects of syndication and diversification on specialisation and investment performance. There is a clear correlation between the syndication network degree and the specialisation level of VC firms, which implies that well-connected VC firms are diversified. More connections generally bring about more information or other resources, and VC firms are more likely to enter a new stage or industry with some new co-investing VC firms when compared to a randomised null model. Moreover, autocorrelation analysis of both specialisation and success rate on the syndication network indicates that the clustering of similar VC firms is roughly limited to the secondary neighbourhood. When analysing local clustering patterns, we discover that, contrary to popular belief, there is no apparent club of successful investors. In contrast, investors with low success rates are more likely to cluster. Our discoveries enrich the understanding of VC investment behaviours and can assist policymakers in designing better strategies to promote the development of the VC industry. This work is under review in the Journal of Physics: Complexity. |
40. Unveiling the Dynamics of Private Capital Networks: An Application of Temporal Exponential Random Graph Models PRESENTER: Yuanyuan Shang ABSTRACT. This study examines the dynamic relationships between actors in the private capital investment ecosystem, using Temporal Exponential Random Graph Models (TERGMs). Despite a growing body of literature on the prevalence of networks in financial markets, the generative process of private capital networks remains largely unknown. Using yearly investment network data from Preqin’s alternative assets database, this study investigates the complex relationships between General Partners (GPs) and Limited Partners (LPs), and provides a comprehensive understanding of the patterns of network formation and evolution. The results offer valuable insights into the decision-making processes of GPs and LPs and the significant factors that impact their interactions. This study has practical implications for both GPs and LPs, as it provides a framework for informed investment decisions and optimized investment strategies. Additionally, this work contributes to the advancement of TERGMs in the field of finance research. |
41. Incorporating Social Network Structure into Discrete Choice Models PRESENTER: Kiran Tomlinson ABSTRACT. N/A |
42. An agent-based model of cultural change for a low-carbon transition PRESENTER: Daniel Torren-Peraire ABSTRACT. Meeting climate goals requires radical changes in the consumption behaviours of individuals. This necessitates an understanding of how the diffusion of low-carbon behaviours will occur. The speed and inter-dependency of changes in behavioural choices may be modulated by individuals’ culture. We develop an agent-based model to study how behavioural decarbonisation interacts with longer-term cultural change, composed of individuals with multiple behaviours that evolve due to imperfect social learning in a small-world network. The model incorporates a cultural evolutionary framework, where culture is defined as socially transmitted information. This is represented in the model as an environmental identity which consists of slow long-term change driven by a faster behavioural diffusion process. The strength of interaction between individuals is determined by the similarity in their environmental identity, leading to inter-behavioural dependency and spillovers in green attitudes. The presence of environmental identity helps stimulate consensus formation in behavioural choices, relative to the case of behavioural independence, as it facilitates interactions between individuals of different behavioural backgrounds in large groups. We find that the speed of consensus formation in environmental identity is strongly influenced by exposure to individuals with contrasting opinions. This may be derived from sources such as inter-behavioural spillovers, confirmation biases in social interactions or breaking of homophily effects. Moreover, with the inclusion of green influencers, who act as broadcasters of zero-emission behaviours, we find that the extent of decarbonisation is a function of both confirmation bias and the attitude distance between normal individuals and green influencers. This indicates the need for an individual-specific tailored approach when providing green information, to avoid alienating individuals. |
43. More people too poor to move: divergent effects of climate change on global migration patterns PRESENTER: Albano Rikani ABSTRACT. The observed temperature increase due to anthropogenic carbon emissions has impacted economies worldwide. At the same time, national income levels in origin and destination countries influence international migration. In particular, emigration is relatively low not only from high-income countries but also from very poor regions, which is explained in current migration theory by credit constraints and lower average education levels, among other reasons. These relationships suggest a potential non-linear, indirect effect of climate change on migration through this channel. Here we explore this effect through a counterfactual analysis using observational data, a global international migration model, and two different methods accounting for the macroeconomic impact of climate change. We show that a world without climate change would have seen less migration during the past 30 years, but this effect is strongly reduced due to inhibited mobility. Our framework suggests that migration within the Global South has been strongly reduced because these countries have seen less economic growth than they would have experienced without climate change. Importantly, climate change has impacted international migration in the richer and poorer parts of the world very differently. In the future, climate change may keep increasing global migration as it slows down countries' transition across the middle-income range associated with the highest emigration rates. |
44. Birds of a feather: A method for detecting suspicious clusters of companies in the UK PRESENTER: Kathyrn Fair ABSTRACT. The international financial system facilitates the construction of complex networks of company ownership that are often exploited for malicious intent. The industry of offshore service providers helps hide, obscure, and launder trillions of dollars each year while leaving little trace. Although this issue has long been recognised, governments and the consortiums designed to fight corruption acknowledge that we are no closer to dismantling corruption networks, as these networks adapt and innovate faster than policymakers can make policy. Recent implementations of databases on company directors, beneficial ownership, and their physical location were intended to reduce the effectiveness of shell companies as a vehicle for obscuring information. Nevertheless, shell companies continue to be the preferred instrument for sheltering profits from oversight. We assert that databases on companies are currently under-exploited and propose a method for detecting clusters of suspicious companies. We argue that company ownership structures adhere to patterns of specialists and generalists, where some clusters of companies under shared ownership self-organize based on their domain expertise ('specialists'), while others diversify the industries they participate in by owning companies in many areas ('generalists'). Though there exist legitimate reasons for clusters of companies to be diversified, we find evidence that a handful of generalist clusters may actually be specialists at hiding wealth through the formation of shell companies. We generate a network of company ownership in the United Kingdom and use data from the International Consortium of Investigative Journalists to train a machine learning algorithm to detect suspicious companies. Our findings help re-conceptualize the role of firms who use domain expertise to circumvent government efforts at revealing the real owners of companies, and offer a methodology that can adapt to evolving strategies for obscuring beneficial ownership. Specifically, we rely on a sample of 500,000 companies and their officers in the UK's Companies House registrar. Each company is required to list its directors and address, and must comply with other legal requirements reported in the registrar. We collect this information and match companies that appear in the Panama and Pandora paper leaks. These data are used to construct a network of company ownership within the UK. Data on leaked companies' compliance, registration, and position within the network (characterised through node-level metrics and community membership) are used to train a random forest model to identify companies resembling the shell companies that appear in the leaked documents. The model is applied to the full sample of companies and used to generate a "suspiciousness score" for each company, indicating the extent to which it resembles those companies named in the leaks. Including information on community membership not only reduces the number of false-positive suspicious companies identified by the machine learning algorithm but also facilitates a focus on communities containing a large concentration of potential shell companies. We argue that although clusters of nodes with shared ownership within these communities may look like generalist or diversified investments, they may actually be specialists in the creation of companies that can be used for illicit purposes. 
Our approach provides law enforcement and policymakers with a tool to identify groups of companies that merit attention, facilitating a more efficient use of enforcement resources. |
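The classification step described in the abstract above could, under the stated setup (node-level network metrics plus registrar compliance information feeding a random forest), look roughly like the following sketch. This is a minimal illustration, not the authors' pipeline: the feature names, the toy data, and the probability-based score are assumptions made for the example.

```python
# Hypothetical sketch: scoring companies by similarity to leaked shell companies.
# Feature names and data are illustrative, not the authors' actual pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.poisson(3, n),          # degree in the ownership network
    rng.random(n),              # betweenness centrality
    rng.integers(0, 50, n),     # community id (assumed categorical-as-int here)
    rng.integers(0, 2, n),      # late-filing flag from the registrar
])
y = rng.integers(0, 2, n)       # 1 = appears in the Panama/Pandora Papers leaks

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
suspiciousness = model.predict_proba(X)[:, 1]   # probability-like score per company
print(suspiciousness[:5])
```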
45. Color-Noun Matching Analysis of Color Images in Literature PRESENTER: Sungpil Wang ABSTRACT. We understand color to be primarily a visual phenomenon, but it is also well known to have a deep connection to human psychology and emotion on both individual and group levels. The study of color symbolism in literature can therefore shed light on its writer's personal psyche and the sociocultural context in which a work was produced. Traditional studies on color symbolism in literature, however, have exhibited a number of limitations. First, often only explicit color adjectives (red, blue, etc.) were considered, although a writer may use many other words to conjure up implicit color imagery (apple, ocean, etc.). Second, the matching between literary images and colors depends on the researcher's subjective judgments. Here we propose a framework to overcome these limitations in the spirit of the so-called “distant reading” based on computational means and large-scale data. Color words are divided into two groups, color adjectives and color images. The former are straightforward enough, but the latter need careful consideration because there exists a wide range of possibilities for matching objects with colors. Therefore it is necessary to establish a method to assign one or more colors to a given word. To do this we set a threshold value for classifying what color(s) a given noun represents, using the color palettes of 11 basic color adjectives and the list of nouns that are already associated with a specific color in ConceptNet. For example, consider the noun “apple”, classified as red in ConceptNet. We project the color palettes of thirty images representing red, and of thirty images of red apples, onto the HSV (Hue-Saturation-Value) color space, then compute the AHD (Average Hausdorff Distance) between the palettes. We obtain the distances for other noun-color pairs based on ConceptNet in a similar manner, which now form the basis for setting the threshold, taken as the average of the JSD (Jensen-Shannon Divergence) of these distances. A new noun can then be classified as having a certain color (among the 11 base colors) if its JSD falls below the threshold. These results can be expressed as a kind of discrete distribution over the colors of “apple”, with the x-axis giving the names of the colors and the y-axis the 11 distances. If the distribution for red is obtained in the same way, the distance between the distributions of red and apple can be calculated using the JSD as a metric. To be robust, we utilize ConceptNet to create more than one pair for each color, such as red and apple, and set the maximum value of the JSD over all pairs as the threshold. If the JSD between the distribution for a color obtained from actual images of a particular noun and the discrete distribution of a color adjective is smaller than the threshold, the noun can be defined as having that color. The color images identified in this way are analyzed together with, or separately from, color adjectives, enhancing the overall understanding of the color usage of the subject. With this framework we analyze the 950 works of Seo Jeong-Ju (1915–2000), a giant in Korean poetry of the 20th century. We assign colors to a poem if it contains a color word determined using the method, and create a bipartite network between the poems and the base colors, shown in Figure 1. The edge weight is the number of the corresponding color words. The figure shows the poet's differing propensity for each color. 
Based on metrics for analyzing bipartite networks, the framework is developed to study the color usage of poets. In the future, we expect to provide a basis for comparing artists on a large scale from the perspective of color symbolism and to find clusters of artists with similar color usage patterns. |
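As a rough illustration of the JSD-based color assignment described above, the following sketch compares two discrete distributions over the 11 base colors and applies a threshold rule. The distributions, the threshold value, and the color list are invented for the example; they are not the authors' data.

```python
# Minimal sketch of the JSD-based color assignment described above.
# The distributions and threshold below are made-up illustrative numbers.
import numpy as np
from scipy.spatial.distance import jensenshannon

colors = ["red", "blue", "green", "yellow", "black", "white",
          "brown", "purple", "pink", "orange", "grey"]

# Discrete distributions over the 11 base colors (e.g., derived from palette distances).
dist_red = np.array([0.40, 0.02, 0.03, 0.08, 0.05, 0.05, 0.10, 0.07, 0.10, 0.07, 0.03])
dist_apple = np.array([0.35, 0.03, 0.05, 0.10, 0.04, 0.06, 0.09, 0.08, 0.10, 0.07, 0.03])

threshold = 0.25  # e.g., the maximum JSD over known ConceptNet noun-color pairs

# scipy returns the Jensen-Shannon *distance* (square root of the divergence).
jsd = jensenshannon(dist_red, dist_apple) ** 2
if jsd < threshold:
    print(f"'apple' is assigned the color red (JSD = {jsd:.3f})")
```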
46. Coexistence of structural balance and ego-hierarchies PRESENTER: Adam Sulik ABSTRACT. Why and how are human relationships formed? This is still an unsolved problem of the social sciences. Among the main theories proposed to explain these phenomena are structural balance theory and status theory. Although the balance theory introduced by Heider is widely applied in multiple models, due to the diverse nature of positive and negative relationships this theory alone is often unable to capture the observations from real datasets. One example is the under-representation of nonhierarchical triads, which are the triads that are less stable according to status theory. That is why here, in addition to stability as understood by Heider's hypothesis, which is contained in structural balance theory, we also take into account dynamics based on social status. We present two new models: an extended model of structural balance on directed networks and a first model combining the dynamics of structural balance and status theories. For the latter model, as in real datasets, the under-representation of nonhierarchical triads is observed. Along with numerical results, we present analytical expectations capturing the changes in positive link density and the phase transitions. |
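For context on the balance side of the abstract above, the sketch below counts balanced and unbalanced triads under the simple undirected Heider rule (a triad is balanced when the product of its edge signs is positive). The example graph is arbitrary, and the directed balance-plus-status dynamics studied in the abstract are not reproduced here.

```python
# Illustrative check of Heider balance on a small signed (undirected) network.
# The example graph is arbitrary; sign convention: +1 friendly, -1 hostile.
from itertools import combinations
import networkx as nx

G = nx.Graph()
G.add_edge("a", "b", sign=+1)
G.add_edge("b", "c", sign=-1)
G.add_edge("a", "c", sign=-1)
G.add_edge("c", "d", sign=+1)

balanced, unbalanced = 0, 0
for u, v, w in combinations(G.nodes, 3):
    if G.has_edge(u, v) and G.has_edge(v, w) and G.has_edge(u, w):
        product = G[u][v]["sign"] * G[v][w]["sign"] * G[u][w]["sign"]
        if product > 0:
            balanced += 1      # balanced triad (even number of negative edges)
        else:
            unbalanced += 1
print(balanced, unbalanced)
```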
47. Position in the supply network and digitalisation activity of manufacturing firms PRESENTER: László Lőrincz ABSTRACT. Purpose A new wave of digitalisation called Industry 4.0 is transforming manufacturing companies. Besides the size, capabilities or leadership of firms, supply chain management practices are recognized as key factors behind digitalisation processes. However, the few empirical works that study the relationship between supplier connections and evolving digitalisation mainly rely on small-scale, survey-based datasets that often focus on the connections of a specific company. The aim of this study is to uncover the relationship between the nation-wide supply network position of manufacturing firms and their engagement with digitalisation. Data We combined three firm-level databases in the research facility of the Hungarian Central Statistical Office. The first is the 2019 Eurostat “Survey on ICT Usage in Enterprises” for the case of Hungary, which targeted firms with over 10 employees. We restricted our sample to manufacturing companies (N=2,369). To determine firm characteristics (such as the industry, size or productivity), we complement the survey data with balance sheet information. To construct the supply network, we use the interfirm VAT report database. Corresponding to the survey, we only considered firms with at least 10 employees (N=15,564). We defined links between companies when their annual transaction value exceeded 10,000 EUR in the three consecutive years 2017-2019. The resulting “supply network” has an average degree of 5.5 (the median number of partners is 2) and its giant component connects 95% of the observed firms. The average distance between companies in the giant component is 4.6, and its diameter is 14. Methods To measure digitalisation, we consider 122 items covering IT infrastructure, skills, EDI, E-business, big data and machine learning, cloud, social media, CRM, 3D printing, IoT, and robots. We use multiple correspondence analysis for dimension reduction. We find that the first factor represents the complex digital program (i.e., exploiting many technologies). The other factors represent dimensions related to only one or two technologies. To assess the relationship between digitalisation (measured by the complex digital program factor) and the network position of firms, we applied multivariate regression models. Network measures included eigenvector centrality, degree, and the number of foreign-owned (MNE) partners. We control for firm characteristics such as size, revenue, productivity, ownership, and exporting. Results The regression analysis (Table 1) shows that a central position in the network, together with being connected to MNE buyers, is positively related to the digitalisation intensity of firms. |
48. Politicians, millionaires, and directors of Mexican companies: are directors' profiles associated with their network position? ABSTRACT. The objective of this project is to investigate whether the profile of the directors of corporations listed on the Mexican Stock Exchange (MSE) is associated with the structural position they hold in the network of directors. In particular, whether the directors' profile as politician, millionaire, or their position type on the board of their company (independent, manager, owner) is associated with their degree centrality, betweenness centrality, and Burt's constraint index. Two types of sources were used: network data and directors' profile data. The network data was obtained by scraping the names of board directors and corporations listed in the MSE, then creating the board of directors' network, in which nodes are directors and edges are corporations to which they belong. The profile data of directors includes politicians, millionaires, and position type on the board of directors. Politicians' names were collected from Wikipedia and included presidents, cabinet secretaries and senators from 1980 to 2022. Millionaires' names were collected from the Forbes 2022 list. Position type was included in the data extracted from the MSE. The analysis was done by searching for mean differences among directors' profiles in terms of degree, betweenness and constraint index, and then comparing them with a null network model. In addition, several logistic models were implemented to predict the type of director using only the mentioned network indicators as predictors. The results show that politicians have a high number of connections and work both as local (Burt's constraint) and global intermediaries (betweenness centrality). Millionaires have a high number of connections, though lower than politicians, and work only as local intermediaries, not as global intermediaries. Owners have few connections and avoid positions of intermediation. Millionaires are owners, but the difference between an owner in Forbes and an owner outside Forbes' list is their position in the network. |
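The three node-level indicators mentioned above (degree, betweenness centrality, and Burt's constraint) are all available in standard network libraries; a minimal sketch on an invented toy director network is shown below.

```python
# Sketch of the three node-level indicators mentioned above, on a toy
# director co-membership network (the graph itself is invented).
import networkx as nx

G = nx.Graph()
G.add_edges_from([("d1", "d2"), ("d1", "d3"), ("d2", "d3"),
                  ("d3", "d4"), ("d4", "d5")])

degree = dict(G.degree())                       # number of board connections
betweenness = nx.betweenness_centrality(G)      # global intermediation
constraint = nx.constraint(G)                   # Burt's constraint (local intermediation)

for d in G.nodes:
    print(d, degree[d], round(betweenness[d], 3), round(constraint[d], 3))
```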
49. KAIST Knowledge Graph (KKG) of creative associations between concepts PRESENTER: Halla Kim ABSTRACT. Despite a great level of interest in understanding human creativity, there have not been many network studies on how creative thoughts (novel and useful combinations of concepts) manifest themselves in the cognitive conceptual space. The quantification of lexical creativity [1] or the examination of the semantic networks of creative people [2] have yet to observe the temporal dynamics of creativity or explain its rise. Here we introduce the KAIST Knowledge Graph (KKG), a free-text knowledge graph dataset constructed through the voluntary participation of KAIST members (students, staff, and faculty), made of nodes and edges with detailed descriptions in natural language sentences. This dataset differs from existing knowledge graphs such as Freebase and Wikidata in that it reflects the cognitive and metaphoric relationships constituting people's thoughts rather than dictionary-based relationships. As the structure and characteristics of the underlying network significantly affect creativity through knowledge transfer or diffusion, one needs to explore and analyze the conceptual space [4] in order to understand how humans play with different concepts to display creativity inside the conceptual space. As one such conceptual space, the dataset in this paper contains 817 nodes and 1,495 edges, created by the participation of 737 university members through a dedicated website (https://kcn.kaist.ac.kr/). Starting from the initial node set by the president of the university, participants were asked to create and add new nodes reminiscent of existing nodes. The properties of each node or edge include a name, a description in free-text form, the time created, the id of the user who created it, and a related image (for some nodes). By harvesting natural language sentences describing the creative process, our dataset may serve as an example case of a social experiment constructing a knowledge base for the expression of human creativity. |
50. From communities to outliers: a portrait of WallStreetBets users PRESENTER: Anna Mancini ABSTRACT. The GameStop (GME) short squeeze of January 2021 unveiled for the first time the influence that social networks can have, once coupled with trading apps, on financial markets. The unprecedented event, mainly initiated by the r/WallStreetBets (WSB) community of Reddit, has been studied extensively, both on community-wide and individual-user levels. In this work we focus on the microscopic level, where two main aspects are analyzed: one concerning the behavior of WSB users, showing how it converges toward a similar trend as the short squeeze approaches; the other focusing on those users that we have labeled as "GME outliers", who contributed early on to spreading interest in the stock. |
51. Academic Mobility as a Driver of Productivity: A Gender-centric Approach PRESENTER: Mariana Macedo ABSTRACT. pSTEM fields (Physical Sciences, Technology, Engineering and Mathematics) are known for showing a gender imbalance favouring men. This imbalance can be seen at several levels, including in universities and industry, where men hold the majority of posts. Academic success is partly dependent on the value of the researchers' co-authorship networks. One of the ways to enrich one's network is through academic mobility: the change of institutions in search of better opportunities within the same country or internationally. In our work, we look at the data for one specific pSTEM field, Computer Science, and describe the productivity and co-authorship patterns that emerge as a function of academic mobility. We find that women and men both benefit from national and international mobility, that women who never change affiliations over their career are rarely well-cited or highly productive, and that women are not well-represented among the overall top-ranking researchers. |
52. Capturing trends in emerging AI technologies and their application to business domains for strategic foresight PRESENTER: Kosuke Yamamoto ABSTRACT. Background and purpose of the study: New technologies emerging one after another in the AI domain are expanding their range of application in various fields and creating new business opportunities. Capturing the dynamics of AI technology-based business development provides us with strategic foresight for startups and investors, as well as policy makers aiming to build an innovation ecosystem [1]. In this study, we aim to capture trends in the emergence of new AI technologies and their application to market needs through a network approach using real data, and to identify the relationship between technology and business growth. Data: We capture the emergence of AI technologies through patent data, and its domain application through company establishment data. We identified 642,767 patents published between 1990 and 2021 as AI-related patents, following the definition proposed by the World Intellectual Property Organization [2]. For the company establishment data, we extracted information of 687,778 companies founded between 1990 and 2021, from Crunchbase [3], which is one of the largest-scale company databases. Constructing the networks: We processed the data by conducting the following four steps. [Step 1] TF-IDF was applied to the nouns extracted from each patent abstract, and nouns with TF-IDF values above a threshold were extracted as characteristic nouns. [Step 2] For each year, the Jaccard coefficient was calculated for each pair of extracted nouns. The edges for which the Jaccard coefficient exceeded the threshold value were extracted. [Step 3] The two nouns from all years in all extracted edges were identified as network nodes. [Step 4] Nodes (=nouns) were searched for in the companies’ business descriptions. When a noun pair appeared in the description of a company founded in a certain year, a ‘company’ edge was created connecting these two nodes for the year. These steps were carried out for each year, resulting in networks with time-varying structures within the same set of nodes. We created three networks: (i) an AI patent network, (ii) a company network using information of all companies, and (iii) a company network using information of companies claiming to be in the AI business domain (17,383 companies in total). Results: Figure 1-(A) overlays networks (i) and (ii), while Figure 1-(B) overlays networks (i) and (iii). From Figure 1-(A), it can be seen that the patent edges gradually spread from the left side of the network to the right side over time. (Note that the ForceAtlas2 method on Gephi was used for the visualization.) We found that company edges precede those patent edges, already connecting between nodes located on the right-hand side. This means that the domains where AI technology will penetrate are already explored by companies. On the other hand, when we focus only on AI companies (Figure 1-(B)), we found that, although the domains where the company edges spread was ahead of the domains where the patent edges spread until around 2017, the patent edges move faster after that and precede the AI company edges. That is, technological development in the AI field precedes the business of AI companies, suggesting a shift from a market needs-oriented to a seeds-oriented approach. (* These findings are difficult to read from Figure 1 – we quantified the time-series changes of the three networks and their comparison. 
The details will be presented at the conference.) We believe that the further investigation of these networks and their changes will provide novel strategic foresight regarding the emergence of new technologies and their associated businesses (– e.g., we could identify which business domains will soon be approached by new AI technologies, which domains are likely to be connected in near future by emerging technologies, etc.) |
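Steps 1 and 2 of the network construction above (TF-IDF term selection followed by Jaccard-based edges) could be sketched as follows on a tiny invented corpus; the thresholds and tokenisation are assumptions for illustration only and do not reflect the authors' actual pipeline on patent abstracts.

```python
# Rough sketch of Steps 1-2 above: TF-IDF noun selection and Jaccard-based edges.
# Thresholds, the tokeniser, and the tiny corpus are assumptions for illustration.
from itertools import combinations
from sklearn.feature_extraction.text import TfidfVectorizer

abstracts = [
    "neural network image recognition model",
    "speech recognition neural model training",
    "reinforcement learning agent control policy",
]

vec = TfidfVectorizer()
tfidf = vec.fit_transform(abstracts)
terms = vec.get_feature_names_out()

# Step 1: keep terms whose maximum TF-IDF value exceeds a threshold.
keep = {terms[j] for j in range(len(terms)) if tfidf[:, j].max() > 0.3}

# Step 2: Jaccard coefficient over the sets of documents containing each term.
docs_with = {t: {i for i, a in enumerate(abstracts) if t in a.split()} for t in keep}
edges = []
for t1, t2 in combinations(sorted(keep), 2):
    inter = len(docs_with[t1] & docs_with[t2])
    union = len(docs_with[t1] | docs_with[t2])
    if union and inter / union > 0.2:
        edges.append((t1, t2))
print(edges)
```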
53. Higher-Order Estimate of Open Mindedness in Online Political Discussions PRESENTER: Valentina Pansanella ABSTRACT. Opinion Dynamics (OD) models study in synthetic settings how different social and psychological factors may lead to different long-term outcomes on public opinion. The current lack of data-driven approaches that validate models on real data has led us to develop a data-driven time-aware methodology that estimates users’ open-mindedness, starting from users’ interactions represented as networks. However, in many online contexts (e.g. Reddit), people mainly participate in group discussions, which could be better captured by exploiting higher-order structures. In the present work, we extend our previous approach to hypergraphs. We applied this methodology to three different discussions on controversial topics in the American political landscape during the first two years of Trump’s presidency. We modelled such discussions both as networks and hypergraphs, to unveil the different insights that may emerge from using different underlying structures. We analyzed the estimated open-mindedness distributions on the political sphere discussion, where interactions are modelled as hypergraphs and as networks. In both settings, the temporal dynamics of open-mindedness are similar: Democrats and Moderates show a more consistent level of open-mindedness over time, while Republicans show a decrease in the level of open-mindedness during the first three semesters, followed by an increase in the last semester. However, when considering group interactions instead of pairwise ones, we can see that the opinion dynamics seems to be driven by a lower level of open-mindedness on average, especially when considering the Republicans subpopulation. Preliminary analyses suggest that this phenomenon may be due to an increase in the opinion variability of the interaction contexts, which may include users that would not be considered when using networks. |
54. The diffusion of information in social media – How complex is it? PRESENTER: Pedro Duarte ABSTRACT. With the dawn of the social media age, the amount of available content has radically increased. Information has to compete for our time and we have to choose what we prefer - the diffusion of information is therefore dictated by the economy of attention. As a result, while some pieces of information are shared a lot, most of them are not shared at all. The cause of this heavily skewed distribution is often attributed to the preferential attachment phenomenon, in which each new node added to the network has a higher probability of connecting to an already well connected node. The end result is a power law distribution: a few nodes have a large degree value but most of them have a very low one. The observed distribution of cascade sizes might arise from this underlying network structure. However, the popularity of a particular piece of information or the attractiveness of a node may also play important roles. We call this popularity fitness. The main question we want to address is whether fitness, the network, or a combination of both best describes the spread of information. To answer this question, we use data from Twitter, a social media platform. Twitter provides a quantitative framework, essential for the study of the diffusion of information "in the wild". We first show that a simple four-parameter equation describes the growth of Twitter cascades well and, consequently, the cascade size distribution. To understand the contribution of the social network, we simulate different types of networks and propagate information through them. We then use Approximate Bayesian Computation (ABC) to fit the parameters of the model. Preliminary results indicate that the network alone cannot produce a good fit to the cascade size distribution. We are now in the process of introducing a distribution of fitnesses as well as the network. |
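A generic rejection-based ABC loop, of the kind the abstract above refers to, can be sketched as follows; the one-parameter toy model, the summary statistic, and the tolerance are all assumptions for illustration and are unrelated to the authors' four-parameter cascade model.

```python
# Generic rejection-ABC sketch (not the authors' pipeline): fit a single
# parameter of a toy cascade-size model by comparing summary statistics.
import numpy as np

rng = np.random.default_rng(1)
observed = rng.pareto(1.5, 5000) + 1          # stand-in for observed cascade sizes
obs_stat = np.mean(np.log(observed))          # summary statistic

def simulate(alpha, n=5000):
    return rng.pareto(alpha, n) + 1           # toy generative model

accepted = []
for _ in range(2000):
    alpha = rng.uniform(0.5, 3.0)             # draw from the prior
    sim_stat = np.mean(np.log(simulate(alpha)))
    if abs(sim_stat - obs_stat) < 0.05:       # accept if summaries are close
        accepted.append(alpha)

print(np.mean(accepted), len(accepted))
```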
55. Temporal Activations in Continuous Opinion Dynamics PRESENTER: Fatemeh Zarei ABSTRACT. During social encounters and their naturally ensuing discussions, our viewpoints and personal opinions are influenced by our interactions with others within the social networks in which we are situated. Understanding the connection between increasing online exposure and the observed rise of polarisation is of crucial importance for predicting cultural developments. As human interaction and opinion formation are often too complex, toy models may be designed to study critical mechanisms, relevant variables, and symmetries that give rise to specific macro-behaviours. The Deffuant model is an opinion dynamics model treating non-binary opinions with bounded confidence, which is of particular interest for studying extremism and the propagation of polarisation. The agents' continuous opinions are updated by dyadic encounters, simulating social interactions between connected agents within the network. In addition to the effect of network structure, we study the influence of bursty activity patterns on the stabilization time and the number of final clusters of opinion in the Deffuant model. Our results show that bursty activity patterns slow down the opinion evolution, and the slowdown effect is proportional to the burstiness. Moreover, the final state of the system depends on the inter-event time distribution, in such a way that bursty activity patterns increase the number of final clusters of opinions. |
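For readers unfamiliar with the Deffuant model mentioned above, a minimal bounded-confidence simulation on a random graph is sketched below; the parameters are illustrative, and the bursty activation times that are central to the abstract are not modelled in this sketch.

```python
# Minimal Deffuant-style bounded-confidence dynamics on a random graph.
# Parameters (confidence bound, convergence rate) are illustrative only;
# the bursty activation times studied in the abstract are not modelled here.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
G = nx.erdos_renyi_graph(200, 0.05, seed=0)
opinions = rng.random(G.number_of_nodes())

epsilon, mu = 0.2, 0.5          # confidence bound and convergence parameter
edges = list(G.edges)
for _ in range(50000):
    i, j = edges[rng.integers(len(edges))]      # dyadic encounter on a random edge
    if abs(opinions[i] - opinions[j]) < epsilon:
        diff = opinions[j] - opinions[i]
        opinions[i] += mu * diff
        opinions[j] -= mu * diff

# Number of final opinion clusters (simple binning).
print(len(np.unique(np.round(opinions, 1))))
```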
56. Generating interactions for friendship networks PRESENTER: Piotr Górski ABSTRACT. A prominent model for generating interactions is the activity-driven model [Perra et al.]. This model is able to successfully replicate certain observables (e.g., the degree distribution) of interaction patterns for systems such as citation networks. Such a model, however, fails to reproduce interactions in systems with a high global clustering coefficient or in systems consisting of densely connected communities. An example of such a network is a community of high school students connected via face-to-face interactions. We propose a model that uses knowledge of declared friendships between students and knowledge of previous events to predict future interaction patterns. Our model is able to reproduce observables such as edge weight and clustering coefficient distributions. Although the starting point is the input distributions from friendship networks, as seen in Fig. 1, the final distributions are significantly different and resemble the true distributions of interactions. |
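The baseline the abstract compares against, the activity-driven model, can be sketched in its textbook form as follows; the activity distribution and the number m of links per activation are illustrative assumptions, not the authors' settings.

```python
# Textbook-style activity-driven model step (the baseline the abstract
# compares against); activities and m are illustrative.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
N, m, steps = 500, 2, 100
activity = rng.pareto(2.5, N) * 0.01 + 0.001    # heterogeneous node activities

aggregated = nx.Graph()
aggregated.add_nodes_from(range(N))
for _ in range(steps):
    active = np.where(rng.random(N) < activity)[0]   # nodes that fire this step
    for i in active:
        targets = rng.choice(np.delete(np.arange(N), i), size=m, replace=False)
        aggregated.add_edges_from((i, int(j)) for j in targets)

print(aggregated.number_of_edges())
```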
57. Framework for developing quantitative agent-based models based on qualitative expert knowledge: an organised crime use-case PRESENTER: Frederike Oetker ABSTRACT. In order to model criminal networks for law enforcement purposes, a limited supply of data needs to be translated into validated agent-based models [1, 2]. What is missing in current criminological modelling is a systematic and transparent framework for modellers and domain experts that establishes a procedure for computational criminal modelling, including the translation of qualitative data into quantitative rules [3, 4]. For this, we propose FREIDA (Framework for Expert-Informed Data-driven Agent-based models). Throughout the paper, the criminal cocaine replacement model (CCRM) will be used as an example case to demonstrate the FREIDA methodology. For the CCRM, a criminal cocaine network in the Netherlands is modelled in which the kingpin node is removed, the goal being for the remaining agents to reorganize after the disruption and return the network to a stable state. The agents are simultaneously embedded in multiple social and business networks and possess heterogeneous individual attributes which determine the probability of shared ties and the possibility of new relations being formed. Qualitative data sources such as case files, literature and interviews can be translated into empirical laws and, combined with quantitative sources such as databases, form the three dimensions (environment, agents, behaviour) of a networked ABM. Finally, FREIDA introduces sensitivity statements and validation statements to transition to the computational model and application phases, respectively. In the last phase, iterative sensitivity analysis, uncertainty quantification and scenario testing eventually lead to a robust model that can help law enforcement plan their intervention strategies. Keywords: methodological framework, criminological modelling, computational networks, validation methods, mixed methods [1] Roks, Robert & Bisschop, Lieselot & Staring, Richard. (2021). Getting a foot in the door. Spaces of cocaine trafficking in the Port of Rotterdam. Trends in Organized Crime, 24. doi: 10.1007/s12117-020-09394-8 [2] Rosés Brüngger, Raquel & Kadar, Cristina & Pletikosa, Irena. (2016). Design of an Agent-Based Model to Predict Crime (WIP). [3] Müller et al. (2013). Describing human decisions in agent-based models - ODD+D, an extension of the ODD protocol. [4] Towards better modeling and decision support: Documenting model development, testing, and analysis using TRACE (TRAnsparent and Comprehensive model Evaluation), 2014. |
58. Contact networks have small metric backbones that maintain community structure and are primary transmission subgraphs PRESENTER: Luis M. Rocha ABSTRACT. The structure of social networks strongly affects how different phenomena spread in human society, from the transmission of information to the propagation of contagious diseases. It is well-known that heterogeneous connectivity strongly favors spread, but a precise characterization of the redundancy present in social networks and its effect on the robustness of transmission is still lacking. This gap is addressed by the metric backbone, a subgraph that is sufficient to compute all shortest paths of weighted graphs. This subgraph is obtained via algebraically-principled axioms and does not require statistical sampling based on null models. We show that the metric backbones of nine contact networks obtained from proximity sensors in a variety of social contexts are generally very small, 49% of the original graph for one and ranging from about 6% to 20% for the others. This reflects a surprising amount of redundancy and reveals that shortest paths on these networks are very robust to random attacks and failures. While many edges involved in local structure are removed to reveal the backbone, the latter preserves all shortest paths, whether these characterize local, short-range or long-range distances. Therefore, the metric backbone preserves the complete distribution of multi-scale distances and the natural hierarchy of complex networks. Indeed, we show that the metric backbone preserves the community structure of all the original contact networks studied. Additionally, using Susceptible-Infected (SI) epidemic spread models, we show that the metric backbone is a primary subgraph in epidemic transmission, almost preserving the transmission times of the original network, especially in comparison to random and thresholded graphs of the same size. This suggests that the organization of social contact networks is based on large amounts of shortest-path redundancy which shapes epidemic spread in human populations. Importantly, other backbone and sparsification methods remove edges that are not redundant for shortest paths. They are based on comparison with an expected connectivity distribution (i.e., a null model) or desired network properties (e.g., degree, betweenness, or effective resistance), sometimes altering retained edge weights. All those methods remove edges (and potentially nodes) based on thresholding edge weights (retaining only the edges with a proximity weight larger than a given value) or comparing to a null-model distribution. Thus, in either case there is an arbitrary parameter that tunes the removal of edges (and nodes). In contrast, the metric backbone is a parameter-free, principled method to obtain a (typically very small) subgraph that fully preserves all shortest paths on the original graph. In summary, the metric backbone (and the generalized distance backbone for any path length measure) is an important subgraph with regard to epidemic spread, the robustness of social networks, and any communication dynamics that depend on shortest paths. This is important for studying and disrupting spreading phenomena on social networks, which we report in detail in [1]. At this conference we will further discuss newer developments such as a multilayer distance backbone, the ultra-metric backbone, and the role of distance backbones in more complex spreading phenomena. [1] R. B. Correia, A. Barrat, and L. M. Rocha. 
Contact networks have small metric backbones that maintain community structure and are primary transmission subgraphs. PLOS Computational Biology, 2023. doi: 10.1371/journal.pcbi.1010854. |
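The defining property of the metric backbone discussed above (an edge belongs to the backbone exactly when its distance weight already equals the shortest-path distance between its endpoints) can be checked by brute force on a toy weighted graph, as in the sketch below. This only illustrates the definition, not the algebraically-principled computation used by the authors.

```python
# Brute-force illustration of the metric backbone definition: keep an edge
# only if its distance weight equals the shortest-path distance between its
# endpoints. This is not the authors' algorithm, just the defining property.
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([("a", "b", 1.0), ("b", "c", 1.0),
                           ("a", "c", 3.0), ("c", "d", 2.0)], weight="distance")

dist = dict(nx.all_pairs_dijkstra_path_length(G, weight="distance"))

backbone = nx.Graph()
backbone.add_nodes_from(G.nodes)
for u, v, d in G.edges(data="distance"):
    if d <= dist[u][v]:                # the edge is itself a shortest path
        backbone.add_edge(u, v, distance=d)

print(backbone.number_of_edges(), "of", G.number_of_edges(), "edges are metric")
```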
59. Spreading and Structural Balance on Signed Networks PRESENTER: Yu Tian ABSTRACT. Two competing types of interactions often play an important part in shaping system behaviour, such as activatory or inhibitory functions in biological systems. Hence, signed networks, where each connection can be either positive or negative, have become popular models over recent years. However, the primary focus of the literature is on unweighted and structurally balanced ones, where all cycles have an even number of negative edges. Hence here, we first introduce a classification of signed networks into balanced, antibalanced or strictly unbalanced ones, and then characterise each type of signed network in terms of the spectral properties of the signed weighted adjacency matrix. In particular, we show that the spectral radius of the matrix with signs is smaller than that without if and only if the signed network is strictly unbalanced. These properties are important for understanding the dynamics on signed networks, both linear and nonlinear ones. Specifically, we theoretically find consistent patterns in a linear and a nonlinear dynamics, depending on the type of balance. We also propose two measures to further characterise strictly unbalanced networks, motivated by perturbation theory. Finally, we numerically verify these properties through experiments on both synthetic and real networks. |
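The spectral-radius comparison stated above can be illustrated numerically: compare the spectral radius of a signed weighted adjacency matrix with that of its unsigned (absolute-value) version. The small matrix below is an arbitrary example assumed for illustration.

```python
# Numerical illustration of the spectral-radius comparison discussed above:
# compare the spectral radius of a signed weighted adjacency matrix A with
# that of its unsigned version |A|. The example matrix is arbitrary.
import numpy as np

A = np.array([[0.0,  1.0, -0.5],
              [1.0,  0.0,  1.0],
              [-0.5, 1.0,  0.0]])     # signed, symmetric for simplicity

rho_signed = max(abs(np.linalg.eigvals(A)))
rho_unsigned = max(abs(np.linalg.eigvals(np.abs(A))))

# For a strictly unbalanced network one expects rho_signed < rho_unsigned.
print(rho_signed, rho_unsigned)
```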
60. Perturbation-based graph theory: an integrative dynamical perspective for the study of complex networks PRESENTER: Gorka Zamora-López ABSTRACT. Built upon the shoulders of graph theory, the field of complex networks has become a central tool for understanding complex systems. Represented as a graph, empirical systems across domains can thus be studied using the same concepts and the same metrics. However, this simplicity is also a major limitation since graph theory is defined for a binary and symmetric description where the only relevant information is whether a link exists or not between two vertices. Despite the successful adaptation of graph theory to directed graphs, its application to weighted networks has been rather clumsy. Empirical relations are usually weighted and we daily face the need to make arbitrary choices such as, for example, thresholding the real data to obtain a binary matrix on which the graph tools can then be applied. Here, we propose a reformulation of graph theory from a dynamical point of view that can help alleviate these limitations, valid at least for the class of real networks that accept flows. First, we show that classical graph metrics are derived from a simple but common generative dynamical model (a discrete cascade) governing how perturbations propagate along the network. The Green’s function R(A, t) of the adjacency matrix A for the discrete cascade represents the network response to unit external perturbations at consecutive discrete times t. All the relevant information needed to describe the network and to define graph metrics is unfolded via the generative dynamics from the adjacency matrix A onto its Green’s function R(A, t), see Fig. 1A. From this perspective, graph metrics are no longer regarded as combinatorial attributes of a graph A, but they correspond to spatio-temporal properties of the network’s response to external perturbations. Second, seen from this dynamical angle, we learn that the difficulties of graph theory in dealing with weighted networks are a consequence of the constraints of its “hidden” dynamical model, rather than a limitation imposed by the binary representation. Therefore, by replacing the underlying discrete cascade with other generative models (either discrete or continuous, conservative or non-conservative), network metrics can be redefined from the corresponding Green’s function R(t) of each model. For example, graph distance is typically evaluated as the minimal number of “hops” needed to traverse between two nodes. But in the case of weighted networks “hops” are no longer a valid measure of distance. Instead, from a dynamical point of view, the time that a perturbation on node i takes to significantly affect other vertices j can be used to redefine their distance, as shown in Fig. 1B. Another limitation of graph theory is the difficulty of comparing across networks. Under this framework, a simple renormalization of the connectivity matrices allows us to align networks of the same architecture but of different densities or sizes, as shown in Fig. 1C. In summary, we propose a dynamical formulation of graph theory in which the underlying generative model is explicit and tunable. This allows us to define metrics in which both directionality and link weights are natural – built-in – aspects of the metrics. 
This flexibility provides the opportunity to calibrate network analyses by choosing generative models that are better suited to the specific system under study, thus balancing between simplicity and interpretability of results. A plethora of past efforts have employed different types of dynamics to study and characterise complex networks, e.g., by navigation on them, the propagation of random walkers, or via routing models. We envision that the perturbative formulation here proposed serves |
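Reading the abstract above literally for the discrete cascade, the Green's function at time t is the t-th power of the adjacency matrix, and a dynamical "distance" between two nodes can be taken as the first time a unit perturbation at one node measurably reaches the other. The sketch below illustrates this on a small path graph; the matrix and tolerance are assumptions for the example.

```python
# Sketch of the discrete-cascade reading of the abstract: the response to a
# unit perturbation at node i after t steps is (A^t)[i, :], and a dynamical
# "distance" can be read off as the first time a target node is affected
# above some tolerance. Matrix and tolerance are illustrative assumptions.
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

def response(A, t):
    return np.linalg.matrix_power(A, t)        # Green's function R(A, t) = A^t

def dynamical_distance(A, i, j, tol=1e-9, t_max=20):
    for t in range(1, t_max + 1):
        if response(A, t)[i, j] > tol:
            return t                           # first time j feels a kick at i
    return np.inf

print(dynamical_distance(A, 0, 3))             # equals the hop distance here
```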
61. Theoretical analysis of co-occurrence network structure derived from frequency distribution ABSTRACT. In analyzing the characteristics of a dataset, such as electronically stored sentences, we sometimes construct a co-occurrence network by focusing on the paired presence of elements. In the definition of a co-occurrence network, each element in a unit of data is considered as a node, and two elements appearing in the same unit are linked to each other. This is applied to all units of the data in the dataset to form a network. The characteristics of the co-occurrence network are considered to reveal the potential relationships between the elements embedded in the data. However, given the frequency of occurrence of the elements, some features of the co-occurrence network become dependent on that frequency, causing a bias in the co-occurrence structure that we should originally see. In this study, we attempted a theoretical derivation of the co-occurrence network structure when the frequency of occurrence of an element is given, and expressed the features of the network that depend on the frequency of occurrence. Specifically, the theoretical values of the link probability, the expected value of the weights, and the Jaccard similarity between nodes were derived from the frequency of occurrence and confirmed numerically. We also discussed the significance of co-occurrence compared to these theoretical values. |
62. Nearest-neighbour directed random hyperbolic graphs PRESENTER: Mikhail Tamm ABSTRACT. Hyperbolic graph models have been extensively used as models of scale-free small-world networks with high clustering coefficients. However, their use has so far been limited to undirected networks. Here we present a simple directed hyperbolic model, where nodes randomly distributed on a hyperbolic disk are connected to a fixed number m of their nearest spatial neighbours. We also introduce a canonical counterpart of this model, which we call the "model with varied connection radius" (VCR), where the maximal length of an outgoing bond is space-dependent and is determined by fixing the average out-degree to m. We show that the resulting networks consist of two distinct parts: a central core, where the in-degree has an approximately Poisson distribution with average m, and a peripheral part, where the average in-degree is position-dependent and increases exponentially with increasing distance from the periphery of the disk. The distribution of nodes between core and periphery is controlled by the dimensionless density of the nodes: in high-density networks the core dominates, while low-density networks are dominated by the periphery. As a result, in the periphery-dominated regime the networks have a very asymmetric degree distribution: while the out-degree distribution is narrow (delta-functional for the $m$-nearest-neighbour model and Poisson for the VCR model), the in-degree is a truncated power law with exponent -3, spanning values of the degree from $\sim m$ to $\sim m^2/\nu$, where $\nu$ is the dimensionless density of the nodes. We calculate the local and global reciprocity in the $m$-NN and VCR networks and show that it is of order 1 in all cases. In fact, for the VCR networks reciprocity converges to 1 for high densities and to 1/2 for low densities. The reciprocity can be regulated independently of other properties of the model by varying an additional temperature-like parameter, which regulates the stochasticity of bond formation. |
63. Structural Redundancy in Directed Weighted Graphs PRESENTER: Felipe Xavier Costa ABSTRACT. Network models of multivariate interactions in nature and society usually consider the strength of observed interrelations, such that connectivity is best represented by weighted graphs. In some cases, those relationships are not symmetric, as in flow networks, the likelihood of gene regulation, financial transactions, and nonreciprocal friendships, among other examples. In those scenarios, weighted directed graphs are better suited to model the structure of interactions. The distribution of edge weights, along with network connectivity, can lead to topological redundancies, whereby two nodes can be more strongly connected via a third one than directly. In other words, the shortest path between two nodes is not necessarily their direct connection, and a distance backbone subgraph of the original network suffices to compute all of the shortest paths [2]. Here, we show that this subgraph exists for both undirected and directed graphs (details in the recently published paper [1]). Identifying the distance backbone is a parameter-free and algebraically-principled network sparsification methodology that preserves all shortest paths, nodes, and connectivity. Also, it does not alter edge weights. The metric backbone is of particular interest because it derives from the most common way of computing path lengths on graphs: summing all the edge distance weights that comprise a path. Using additional networks beyond those in [1], we show that the metric backbone is typically very small in networks across domains, but directed backbones tend to be larger than their undirected counterparts. We further study (semi-metric) edges not on the metric backbone using their semi-metric distortion s_{ij}: the ratio of the direct edge distance over the shortest-path distance. For backbone edges s_{ij} = 1, otherwise s_{ij} > 1, identifying edges that are redundant for shortest path computation [2]. A large s_{ij} denotes nodes x_i and x_j that are weakly related directly, but very strongly related indirectly via the backbone. A U.S. airport network of domestic nonstop segments exemplifies the effect of directionality on shortest paths. Its undirected (directed) metric backbone is composed of 16.14% (27.59%) of the original network, and the median semi-metric distortion of this network is 40.4 (17.7). This shows that considering directionality significantly reduces the redundancy in this network, as more edges are necessary for preserving directed shortest paths, and the central tendency of the “strength” of that redundancy is also smaller in the directed case. |
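The semi-metric distortion defined above, s_{ij} = (direct edge distance) / (shortest-path distance), is straightforward to compute on a toy directed weighted graph, as in the following sketch with invented weights.

```python
# Toy computation of the semi-metric distortion s_ij = d_ij / (shortest-path
# distance from i to j) on a small directed weighted graph (made-up weights).
import networkx as nx

G = nx.DiGraph()
G.add_weighted_edges_from([("a", "b", 1.0), ("b", "c", 1.0),
                           ("a", "c", 5.0), ("c", "a", 2.0)], weight="distance")

sp = dict(nx.all_pairs_dijkstra_path_length(G, weight="distance"))
for u, v, d in G.edges(data="distance"):
    s = d / sp[u][v]                 # s = 1 on the backbone, s > 1 otherwise
    print(f"{u}->{v}: distortion {s:.2f}")
```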
64. Finite Size Scaling Approach for Revealing Inherent Scale-Freeness in Heterogeneous Networks PRESENTER: Yeonsu Jeong ABSTRACT. Various systems in nature consist of interacting entities. One of the ways to understand the characteristics of these systems is to analyze their interaction structures, which often give rise to intriguing macroscopic phenomena. Many systems in nature appear to follow a scale-free (SF) degree distribution (i.e., a degree distribution that follows a power law), such as social networks, the Internet, protein interaction networks, and metabolic networks. Numerous empirical and theoretical studies have supported this view. Recently, however, the discussion about the 'scale-freeness' of empirical networks has been reopened. This debate ultimately stems from the inherent finiteness of real-world systems, which calls for careful consideration and comprehension of the size effect of the system of interest. A recent study has suggested a method to identify true scale-freeness in networks by employing finite-size scaling, inspired by the popular method for analyzing criticality in the statistical physics community. This method is based on a scaling hypothesis of the form $P(k)k^{\gamma}=\mathcal{F}(kN^{d})$, where $k$ is the degree, $P(k)$ is the cumulative degree distribution with power-law exponent $\gamma$ for system size $N$, and $\mathcal{F}$ is an arbitrary scaling function with an exponent $d$ obtained by the moment ratio test. Using the suggested quantity $S$, named the quality of the scaling collapse, one can classify a given network as a strong, weak, or non-SF network. The method performs well for the Barabási-Albert (BA) network, the Erdős-Rényi graph, and others (Serafino et al., 2021). There are various ways to generate heterogeneous networks by different underlying mechanisms, such as preferential attachment (for the BA network) and maximal randomization (for the static model), which yield different exact functional forms of the degree distributions. In this study, we extend the previous work by applying the method to other heterogeneous networks with different mechanisms. We consider the static-model network and the BA network, including the generalized BA model. The degree exponents can be freely adjusted in the static model and the generalized BA model. Figure 1 shows the scaling results. We reduce the size $N$ of the network from each starting network by random sampling. The reduction of the size to prepare the various sizes is motivated by a real-world situation where it is hard to collect and aggregate more data (equivalently, to increase $N$). The scaling hypothesis says that the $P(k)$ curves collapse onto a single curve for proper values of $\gamma$ and $d$, and it is supported by the good collapse in Figs. 1(a), 1(b), and 1(c) for three types of network model. The deviation from the desired curve $\mathcal{F}$ for a given $\gamma$ is equivalent to the quality $S$ of the collapse, with a smaller value of $S$ corresponding to a nicer collapse. The results for $S$ in Figs. 1(d), 1(e), and 1(f) signal the optimal $\gamma$ giving the best collapse in the heterogeneous networks, and the respective values of $\gamma$ coincide with those imposed for the starting networks in simulation and with those estimated by maximum likelihood. |
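A crude version of the scaling-collapse check described above can be sketched as follows: rescale the cumulative degree distributions of networks of several sizes according to the scaling hypothesis and measure how tightly the curves overlap. This is only an illustration with assumed trial exponents, not the exact quality function S of the cited method.

```python
# Crude illustration of a scaling collapse check: rescale cumulative degree
# distributions of networks of different sizes N as P(k) * k**gamma versus
# k * N**d and measure their spread. This is only a sketch, not the exact
# quality function S used in the works discussed above.
import numpy as np
import networkx as nx

def cumulative_pk(G):
    degs = np.array([d for _, d in G.degree()])
    ks = np.arange(1, degs.max() + 1)
    pk = np.array([(degs >= k).mean() for k in ks])
    return ks, pk

gamma, d = 2.0, 0.5                      # trial exponents to test the collapse
curves = []
for N in (500, 1000, 2000):
    G = nx.barabasi_albert_graph(N, 2, seed=0)
    ks, pk = cumulative_pk(G)
    x = ks * N**d                        # rescaled abscissa
    y = pk * ks**gamma                   # rescaled ordinate
    curves.append((x, y))

# Spread of the rescaled curves evaluated on a common grid (smaller = better).
grid = np.logspace(1, 3, 30)
vals = [np.interp(grid, x, y) for x, y in curves]
print(np.mean(np.std(vals, axis=0)))
```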
65. A Multi-Layer Network Model of Climate Reinsurance: Company Benefits vs. Climate Resilience? PRESENTER: Roger Cremades ABSTRACT. A Multi-Layer Network Model of Climate Reinsurance: Company Benefits vs. Climate Resilience? Roger Cremades ^1,2, Paul Hudson ^3, Hocine Cherifi ^4 ^1Wageningen University, Wageningen, The Netherlands ^2Fondazione Eni Enrico Mattei, Venice, Italy ^3University of York, York, United Kingdom ^4University of Burgundy, Dijon, France Decreased insurability has been observed in selected locations related to increased climate risks. Decreased insurability could contribute to limits to adaptation to climate change, posing major challenges to the economy across world regions [1]. One of the visible features of the current global (re)insurance market is its limited capacity to absorb risk [2]. Furthermore, the capacity of this market might be decreasing in areas with increasing climate risks. In addition, just 4 companies dominate the market with assumed reinsured premiums above $20B, which indicates that a certain degree of oligopoly might be limiting the capacity of the global reinsurance market to absorb climate risks. Traditional economic models of (re)insurance markets fail to capture the influence of market structures, which we capture in a multilayer network reflecting the activities of public and private entities with insurance cover, primary insurers, and re-insurers. These are represented as nodes forming multiple layers in a network model. The main attributes of the nodes relate to their insurance-related activities, and those of the links to their fees, coverage and claims. Aiming to understand the impact of higher levels of risk related to a changing climate, we explore the role of the structure of the network, and in particular we investigate the risk transfer and the claim activation conditions across nodes, and the number of nodes per layer. A first version of this multilayer model was built on the basis of the functioning of the (re)insurance market and the dynamics of its actors, represented as nodes, and their transactions, represented as a network linking them; this was coded in NetLogo. One of the limitations of studying the economics of insurance is that it is often not possible to obtain data from the insurance industry due to the competitive advantage that such information could provide to other businesses. In this context, our model stands out as a stylized model capturing the most important features of the market at hand. With this model, we investigate a number of market structures and reproduce the dynamics of the nodes under different levels of climate risk, representing policy scenarios leading to higher levels of resilience due to a larger number of different node types (e.g. reinsurers) spreading risk in the network. The results show that market structures populated with different numbers of reinsurers have an influence on the resilience of the network under higher levels of risk. Fig. 1 shows how a network with a larger number of reinsurer nodes (see the X axis, variable 'n-re-insurer-nodes') is able to cope with larger climate impacts (see the variable ‘impacted-people-netneighbours’ on the Y axis, which is the size of a connected component representing the size of the simulated climate impact). 
These results indicate that some current (re)insurance market properties, particularly the existing limited number of sizeable reinsurers in the market, call for policy incentives to increase the number of reinsurance providers, which is a challenge for the somewhat oligopolistic nature of this market, and which would increase the insurability and climate resilience of the overall system. Fig. 1: The heatmap shows the aggregated annual balance of re-insurance companies in the market without accounting for the revenue from the reinvestment of insurance premiums, thus solely analyzing the influence of stylized climate shocks and market structure, comparing different assumptions on policy interventions to promote climate resilience. The size of the simulated climate impacts, involving a larger impact on neighboring communities, is represented on the Y axis (the variable ‘impacted-people-netneighbours’ is the size of the impacted connected component), while the X axis (variable 'n-re-insurer-nodes') represents the number of re-insurers available in the market. [1] Cremades, R., Surminski, S., Máñez Costa, M., Hudson, P., Shrivastava, P. and Gascoigne, J. Using the adaptive cycle in climate-risk insurance to design resilient futures. Nature Climate Change 8 (2018). [2] Wynes, S., Garard, J., Fajardo, P., Aoyagi, M., Burkins, M., Chaudhari, K., ... & Matthews, D. (2022). Climate action failure highlighted as leading global risk by both scientists and business leaders. Earth's Future, 10(10). |
66. Spacing ratio statistics of multiplex directed networks PRESENTER: Tanu Raghav ABSTRACT. Eigenvalue statistics of various many-body systems have been widely studied using the nearest neighbor spacing distribution under the random matrix theory framework. Here, we numerically analyze the eigenvalue ratio statistics of multiplex networks consisting of directed Erdős-Rényi random network layers represented, first, as weighted non-Hermitian random matrices and then as weighted Hermitian random matrices. We report that the multiplexing strength rules the behavior of the average spacing ratio statistics for multiplex networks represented by the non-Hermitian and Hermitian matrices, respectively. Additionally, for both these representations of the directed multiplex networks, the multiplexing strength appears as a guiding parameter for the eigenvector delocalization of the entire system. These results could be important for driving dynamical processes in several real-world multilayer systems, particularly for understanding the significance of multiplexing in comprehending network properties. |
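The consecutive-spacing-ratio statistic analyzed above is standard in random matrix theory; the sketch below computes it for a plain symmetric (GOE-like) random matrix, which is only a stand-in for the multiplex network matrices studied in the abstract.

```python
# Standard consecutive-spacing-ratio statistic r_i = min(s_i, s_{i+1}) /
# max(s_i, s_{i+1}) computed for a random symmetric (GOE-like) matrix as a
# simple illustration of the quantity analyzed in the abstract.
import numpy as np

rng = np.random.default_rng(0)
N = 500
M = rng.normal(size=(N, N))
H = (M + M.T) / 2                        # symmetric (Hermitian) random matrix

eigs = np.sort(np.linalg.eigvalsh(H))
s = np.diff(eigs)                        # nearest-neighbour spacings
r = np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])
print(np.mean(r))                        # roughly 0.53 is expected for GOE
```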
67. Multilayer disease networks via multipartite projections: linking risk factors to CVD-depression multi-morbidities via molecular mediators PRESENTER: Jie Li ABSTRACT. Extensive epidemiological studies have demonstrated many surprising multi-morbidities, such as between cardiovascular diseases (CVD) and depression. They can have a significant impact on a person’s quality of life. However, their underlying biological pathways are still poorly understood. Traditional disease networks are constructed using a bipartite network between disease variables and a single type of biomarker (or a single mixture of multiple types). Here we propose a method based on multipartite projection to construct multilayer disease networks, use Shannon mutual information to capture non-linear interactions, and consider the resulting layers to decompose the total correlation between disease variables. We apply our method to a dataset from the Cardiovascular Risk in Young Finns Study (YFS), a longitudinal cohort on CVD risk factors including lifestyle, biological and psychological measures. The three phenotype modules studied are risk factors, CVD and depression, whose indirect correlations emerging from two layers of metabolites and lipids lead to a weighted multilayer network consisting of links between the risk factors and CVD-depression phenotypes. In this weighted network, the intensity of a projected correlation should be a function of the local structure of the multipartite network; we test and compare four methods. Based on the expectation that the weighted multilayer networks should function as a decomposition of the total (pairwise) correlation between disease variables, we find that using the sum of the average correlations between the risk-phenotype pairs and their shared neighbors performs best. The projection method yields some interesting findings. Firstly, it places all such risk factors in a single network so that their relative importance can be assessed. In the YFS, the reconstructed network finds that sex and BMI are the two most important risk factors for CVD and depression phenotypes in young adults. Secondly, it detects the most common significant molecular mediators between risk factors and phenotypes. When investigating the indirect correlations between the CVD and depression phenotype modules, the most common significant mediators are creatinine and triglycerides in small low-density lipoprotein among the metabolites, and triacylglycerols and acylcarnitine (18:2) among the lipids. This finding suggests that these detected metabolites and lipids may play an important role in the development of CVD-depression multi-morbidities due to exposure to the risk factors. Our method generalizes to any number of biomarker layers and disease modules, leading to a truly system-level overview of the biological pathways contributing to multi-morbidities. |
68. Dynamics-based Reconstruction of the Multilayer Structure from an Aggregated Network PRESENTER: Aobo Zhang ABSTRACT. Multi-type interactions are common in complex systems. In many cases, we can only observe whether there is a link between two individuals without knowing the type of the link. The distinction of link types within a complex network is crucial for understanding the dynamics on the network especially when the dynamics behave differently on each type of links. We propose in this paper a network decomposition method using propagation time series, which decomposes an aggregated single-layer network into a multilayer network with each layer consisting of links of the same type. We apply the method to various model networks and real networks and find that it works accurately even when diverse network structural characteristics are present. We also investigate the method's effectiveness and resilience under various restrictions, finding that it is applicable in networks with more than two layers. This work offers an effective and universal framework for untangling the multilayer structure in complex networks. |
69. Similarities and differences between phonological and orthographic networks PRESENTER: Pablo Lara-Martinez ABSTRACT. The study of natural language using a network approach has made it possible to characterize novel properties ranging from the level of individual words to phrases or sentences. Here, we study the differences and similarities between phonological and orthographic networks. We consider both the written and spoken representations of words to construct the corresponding networks. Two words are linked if their “similarity” is above a given threshold. Specifically, we consider a database with 10^3 different words from 12 natural languages (with their corresponding IPA transcription), where the links were defined in terms of the Damerau-Levenshtein distance, which is a similarity distance between strings that measures the minimum number of operations necessary to transform one string of characters into another: insertion, deletion, substitution, and transposition. In order to compare both networks, we apply several global and local metrics, such as the link density or the number of connected components, as well as the degree, the clustering coefficient and the average degree of the neighboring nodes. Our results for 12 different languages from 4 different language families (Romance, Germanic, Slavic and Uralic) reveal that languages like French and English lead to similar metric values that differ from those obtained for their corresponding linguistic families. In summary, it is found that the difference between the phonetic and written networks is markedly higher for French and English, while for all the other languages analyzed this separation is relatively small and comparable to that of the other languages belonging to the same linguistic family. These results also agree with properties such as homophony and transparency in natural languages, where asymmetries between spoken and written language have been reported. We conclude that our approach allows us to explore additional properties of the interaction between spoken and written language. The present study can be naturally extended with the incorporation of additional layers, for example containing semantic information or polarity information, among others, to explore additional properties with potential use in contexts of text classification, automatic speech recognition systems and pattern identification in natural languages. |
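The linking rule described above (connect two words when their Damerau-Levenshtein distance is small enough) can be sketched as follows; the word list and the threshold are invented for illustration, and the restricted variant of the distance is used.

```python
# Small sketch of the construction above: link two words when their
# (restricted) Damerau-Levenshtein distance is at most a threshold.
# The word list and threshold are illustrative.
from itertools import combinations
import networkx as nx

def dl_distance(a, b):
    # Restricted Damerau-Levenshtein: insertion, deletion, substitution,
    # and transposition of adjacent characters.
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]

words = ["cat", "cart", "card", "act", "dog"]
G = nx.Graph()
G.add_nodes_from(words)
G.add_edges_from((w1, w2) for w1, w2 in combinations(words, 2)
                 if dl_distance(w1, w2) <= 1)
print(sorted(G.edges))
```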
70. Structural precision: evaluating link prediction algorithms from the perspective of edge significance PRESENTER: Xin Lu ABSTRACT. Recent years have seen a lot of interest in link prediction in complex networks. A fundamental but still unsolved issue is how to evaluate prediction algorithms fairly across applications. Traditional metrics evaluate algorithmic performance based on the ratio of correctly predicted edges rather than their significance. The widely adopted metrics to measure the accuracy of link prediction algorithms are the area under the receiver operating characteristic curve (AUC) and precision, although their effectiveness has recently come under debate. In light of the fact that different edges play different roles in the function and evolution of networks, we propose a new metric based on edge significance, defined as structural precision (SP), that considers the impact of edges on the network topology from two perspectives: improving the clustering characteristics and maintaining global connectivity. We address the problem of evaluating link prediction algorithms by considering the cost of the edges predicted, which can, for example, be specified as a function of the network’s edge centrality properties or other topological variables. Experiments on empirical datasets demonstrated that our metric can effectively discriminate between different algorithms and help select more accurate methods for link prediction in various applications. |
71. The geography of innovation dynamics PRESENTER: Matteo Straccamore ABSTRACT. This study investigates the patterns and drivers of technological diffusion among the world’s metropolitan areas (MAs) and the role that geography plays in this process, with a focus on the different paths of innovation followed by countries and MAs with varying levels of technological competitiveness. Here, we find that nations play a critical role in determining the likelihood of technologies spreading between metropolitan areas. Our analysis shows that, for the same distance, technologies are more likely to spread between metropolitan areas in the same country. We then built a predictive model for technology diffusion by combining similarities between MAs and technologies and taking into account the country of origin, while considering the technological priorities of different countries and MAs. These predictions outperform traditional algorithms and have potential applications for policymakers, who can use them to guide investment decisions and prioritize areas for innovation. Moreover, we found that the role of countries in the innovation process changes over the years. Specifically, the technological competitiveness of countries is becoming increasingly important in shaping the paths of technological growth. To capture this changing dynamic, we introduced a new metric that compares the technological competitiveness of different countries using data from their respective metropolitan areas. To visualize the paths of technological growth and understand the changing role of countries, we applied a visualization technique using the UMAP dimensionality reduction algorithm. This allowed us to identify a primary path of technological growth characterized by increasing diversification, as well as several other paths that represent countries prioritizing different technology portfolios. Overall, our research provides insights for policymakers seeking to promote economic growth and technological advancement. By understanding the different paths of technological growth and the changing role of countries in this process, countries can better position themselves to enhance their technological competitiveness and contribute to the global innovation landscape. |
72. Brain network flexibility as a marker of mutual adaptation of humans and machines. PRESENTER: Kanika Bansal ABSTRACT. Technology is rapidly developing more intelligent capabilities, displaying flexible and adaptive behaviors, giving rise to mutually adaptive human-machine intelligent systems. Such systems require that not only humans adapt, learning to use these new technologies, but the intelligent technology itself must be capable of adapting, augmenting its behavior to better suit system goals - a synthetic metacognition of sorts that could enable future symbiotic emergent behavior. Understanding the neurodynamics that underlie such processes is essential to develop intelligent devices that seamlessly adapt to the user's needs. In this work we explored, through the lens of complex networks, the adaptation of subjects to an externally-worn “intelligent” exoskeleton boot (ExoBoot) designed to assist walking long distances by applying torque bilaterally at the ankle on each step. Interestingly, use of the ExoBoot can pose challenges to individuals who struggle to adapt to this capability. In this study, we attempt to characterize the neural network dynamics underlying easier adaptation to this assistive technology. We used EEG functional connectivity to construct a network representation of the brain dynamics and submitted the networks to a community detection analysis, a Louvain-like algorithm, to detect dynamical patterns related to adaptation to the device. We divided the sensors into three spatial groups (frontal, mid, and posterior) and compared the neural flexibility at rest to adaptation metrics derived from electromyography (EMG) and motion tracking while subjects used the ExoBoot. Results suggest that “trait-based” neural flexibility relates to the extent of adaptation an individual displays while walking with the ExoBoot, especially in the sensor group in the posterior region. This predictive association, if found to generalize across tasks and potentially across human-systems, could be harnessed to accelerate the learning and usability of mutually adaptive human-machine systems. |
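As a minimal illustration of the "network flexibility" quantity referenced above: given community assignments of each EEG sensor across consecutive time windows, flexibility can be computed as the fraction of window transitions in which a sensor changes community. The multilayer community detection itself is not shown; the toy labels below are assumptions for illustration only.

```python
# Per-node flexibility from community labels over time windows.
import numpy as np

def flexibility(assignments):
    """
    assignments: (n_windows x n_nodes) array of community labels.
    Returns per-node flexibility in [0, 1].
    """
    A = np.asarray(assignments)
    changes = (A[1:] != A[:-1]).sum(axis=0)     # community switches per node
    return changes / (A.shape[0] - 1)

# toy example: 4 time windows, 3 sensors
labels = [[0, 0, 1],
          [0, 1, 1],
          [0, 1, 2],
          [1, 1, 2]]
print(flexibility(labels))   # sensor 0 switches once in 3 transitions -> 1/3
```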
73. Semantic Graphs Reveal the Narrative Framing in News PRESENTER: Elisabeth Lex ABSTRACT. News should convey objective information on current events. However, the perception of news depends not only on their neutrality but also on their framing. According to Entman (1993), the framing of a communicating text depends on the selection and saliency of certain aspects. One such aspect is the narrative information embedded within texts. Such narrative framing can be observed, for example, in the climate change debate, where the framing of news, although neutral in tone, is noticeably distinct between sources, and specific narratives (e.g., naturally-caused vs. human-made) are propagated. Therefore, our research investigates how and which narratives can be extracted from news articles. We leverage a semantic representation for text called abstract meaning representation (AMR) to encode textual content as graphs and mine those graphs for their narrative information (refer to Figure 1 for an example). By identifying common elements and sub-graphs, we can reveal the narrative framing of a collection of articles. For instance, in previous research, we successfully identified noteworthy distinctions in the reporting between mainstream and conspiracy media on health-related news (e.g., COVID-19). In sum, the mainstream narratives are more science-oriented (e.g., have scientists as actors), while conspiracy narratives are belief-oriented (e.g., are embedded in a religious context). We are currently broadening our application domain to climate change and striving for a longitudinal study of frame adoption. |
14:15 | Fast Multiplex Graph Association Rules for Link Prediction PRESENTER: Michele Coscia ABSTRACT. Multiplex networks allow us to study a variety of complex systems where nodes connect to each other in multiple ways, for example friend, family, and co-worker relations in social networks. Link prediction is the branch of network analysis allowing us to forecast the future status of a network: which new connections are the most likely to appear in the future? In multiplex link prediction we also ask: of which type? Because this last question is unanswerable with classical link prediction, here we investigate the use of graph association rules to inform multiplex link prediction. We derive such rules by identifying all frequent patterns in a network via multiplex graph mining, i.e. all those subgraphs that appear $\sigma_x \geq \sigma_{min}$ times, with $\sigma_x$ being the number of occurrences of pattern $x$ in the network. To keep the problem computationally tractable, we only focus on patterns with at most $n$ nodes, with $n$ specified by the user. Then, a $b \rightarrow a$ rule can be built if $a$ is a graph pattern that includes $b$ plus one edge. Its confidence is $\sigma_a / \sigma_b$, which is the probability that $b$ can be extended to $a$ by adding one edge. We can then score each unobserved $(u,v)$ link's likelihood by summing the confidences of all rules that a $(u,v)$ edge can complete. Since different rules can be completed with edges in different layers, this leads to a natural multiplex link predictor. Association rules add new abilities to multiplex link prediction. We can predict new node arrivals, because the new edge in $a$ can connect to a node that was not present in $b$. We compare with the state of the art of multiplex link prediction, including graph neural network approaches, and we find that this ability of predicting incoming nodes dramatically increases the performance of link prediction in large online social networks -- AUC of 0.741 on the Pardus massively multiplayer online game network, versus a neural network AUC of 0.584, a low score due to the inability of seeing incoming nodes. When applied to signed networks, association rules allow us to extend theories such as social balance theory, by allowing us to study higher-order structures with four or even five nodes, rather than being limited to the study of triangles. We can show how most rules in the Pardus network tend to decrease frustration -- a measure estimating how many connections with the unexpected sign are present in the network. Only 1.7\% of rules increase frustration, and 97.5\% of rules result in a pattern with zero frustration. |
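The scoring step described above can be sketched in a few lines once pattern mining has been done. The sketch below assumes precomputed pattern occurrence counts and a list of candidate rules; names, pattern labels and thresholds are illustrative assumptions, and the (hard) mining step is deliberately omitted.

```python
# Rule-confidence scoring step for multiplex link prediction (illustrative only).
from collections import defaultdict

def rule_confidences(sigma, rules, sigma_min=5):
    """Confidence of b -> a is sigma[a] / sigma[b], for frequent patterns only."""
    conf = {}
    for b, a, layer in rules:
        if sigma.get(a, 0) >= sigma_min and sigma.get(b, 0) >= sigma_min:
            conf[(b, a, layer)] = sigma[a] / sigma[b]
    return conf

def score_candidate_links(candidates, conf):
    """
    candidates maps an unobserved link (u, v, layer) to the rules it would
    complete; the link score is the sum of those rules' confidences.
    """
    scores = defaultdict(float)
    for link, completed_rules in candidates.items():
        for rule in completed_rules:
            scores[link] += conf.get(rule, 0.0)
    return dict(scores)

# toy example with made-up pattern counts
sigma = {"edge_L1": 40, "wedge_L1L2": 12}
rules = [("edge_L1", "wedge_L1L2", "L2")]
conf = rule_confidences(sigma, rules)
print(score_candidate_links({(1, 2, "L2"): [("edge_L1", "wedge_L1L2", "L2")]}, conf))
```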
14:30 | De Bruijn goes Neural: Causality-Aware Graph Neural Networks for Time Series Data on Dynamic Graphs PRESENTER: Lisi Qarkaxhija ABSTRACT. We introduce De Bruijn Graph Neural Networks (DBGNNs), a novel time-aware graph neural network architecture for time-resolved data on dynamic graphs. Our approach accounts for temporal-topological patterns that unfold in the causal topology of dynamic graphs, which is determined by causal walks, i.e. temporally ordered sequences of links by which nodes can influence each other over time. Our architecture builds on multiple layers of higher-order De Bruijn graphs, an iterative line graph construction where nodes in a De Bruijn graph of order $k$ represent walks of length $k-1$, while edges represent walks of length $k$. We develop a graph neural network architecture that utilizes De Bruijn graphs to implement a message passing scheme that considers non-Markovian characteristics of causal walks, which enables us to learn patterns in the causal topology of dynamic graphs. Addressing the issue that De Bruijn graphs with different orders can be used to model the same data, we apply statistical model selection to determine the optimal graph to be used for message passing. An evaluation on synthetic and empirical data sets suggests that DBGNNs can leverage temporal patterns in dynamic graphs, which substantially improves performance in a node classification task. |
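The underlying higher-order construction can be illustrated compactly. The sketch below (not the authors' code) builds a weighted k-th order De Bruijn graph from a set of causal walks: nodes are walks of length k-1 and an edge connects two nodes that overlap in a walk of length k; the example walks are made up.

```python
# Build a k-th order De Bruijn graph from causal walks (illustrative sketch).
from collections import defaultdict

def de_bruijn_graph(walks, k):
    """walks: iterable of node sequences (causal walks); returns weighted edges."""
    edges = defaultdict(int)
    for walk in walks:
        # every window of k+1 consecutive nodes is a walk of length k
        for i in range(len(walk) - k):
            window = tuple(walk[i:i + k + 1])
            u, v = window[:-1], window[1:]   # two overlapping walks of length k-1
            edges[(u, v)] += 1
    return dict(edges)

walks = [["a", "b", "c", "d"], ["a", "b", "c", "e"], ["x", "b", "c", "d"]]
print(de_bruijn_graph(walks, k=2))
# e.g. edge (('a','b'), ('b','c')) has weight 2: a->b->c occurs in two walks
```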
14:45 | Bayesian Detection of Mesoscale Structures in Pathway Data on Graphs PRESENTER: Vincenzo Perri ABSTRACT. Mesoscale structures are an integral part of the abstraction and analysis of complex systems. They reveal a node's function in the network, and facilitate our understanding of the network dynamics. For example, they can represent communities in social or citation networks, roles in corporate interactions, or core-periphery structures in transportation networks. We usually detect mesoscale structures under the assumption of independence of interactions. Still, in many cases, the interactions invalidate this assumption by occurring in a specific order. Such patterns emerge in pathway data; to capture them, we have to model the dependencies between interactions using higher-order network models. However, the detection of mesoscale structures in higher-order networks is still under-researched. In this work, we derive a Bayesian approach that simultaneously models the optimal partitioning of nodes into groups and the optimal higher-order network dynamics between the groups. Our method can be seen as an extension of the stochastic block model (SBM) to higher orders, or as an adaptation of hidden Markov models (HMMs) to paths on graphs. In synthetic data, we demonstrate that our method can recover both standard proximity-based communities and role-based groupings of nodes. In synthetic and real-world data, we show that it can compete with baseline techniques, while additionally providing interpretable abstractions of network dynamics. In summary, we propose a Bayesian method that identifies mesoscale structures in dynamic complex networks where edges have temporal interdependencies. Figure 1 illustrates the type of general mesoscale patterns our method can recover from pathway data. Given the common need to better understand the behavior of complex systems through identifying subgroups in the network dynamics, we expect our contribution to be of interest to many researchers in the network science community. |
15:00 | Learning the right layers: a data-driven layer-aggregation strategy for semi-supervised learning on multilayer graphs PRESENTER: Sara Venturini ABSTRACT. Clustering (or community detection) on multilayer graphs poses several additional complications with respect to standard graphs as different layers may be characterized by different structures and types of information. One of the major challenges is to establish the extent to which each layer contributes to the cluster assignment in order to effectively take advantage of the multilayer structure and improve upon the classification obtained using the individual layers or their union. However, making an informed a-priori assessment about the clustering information content of the layers can be very complicated. In this work, we assume a semi-supervised learning setting, where the class of a small percentage of nodes is initially provided, and we propose a Laplacian-regularized model that learns an optimal nonlinear combination of the different layers from the available input labels. The learning algorithm is based on a Frank-Wolfe optimization scheme with inexact gradient, combined with a modified Label Propagation iteration. We provide a detailed convergence analysis of the algorithm and extensive experiments on synthetic and real-world datasets, showing that the proposed method compares favourably with a variety of baselines and outperforms each individual layer when used in isolation. |
15:15 | Detecting relationships in multivariate time series using reduced auto-regressive modeling and its network representation PRESENTER: Toshihiro Tanizawa ABSTRACT. An information theoretic reduction of auto-regressive modeling called the Reduced Auto-Regressive (RAR) modeling is applied to several multivariate time series as a method to detect the relationships among the components in the time series. The results are compared with the results of the transfer entropy, one of the common techniques for detecting causal relationships. These common techniques are pairwise by nature and could be inappropriate in detecting the relationships in highly complicated dynamical systems. When the relationships between the dynamics of the components are sufficiently linear and the time scales in the fluctuations of each component are in the same order of magnitude, the results of the RAR model and the transfer entropy are consistent. When the time series contain components that have large differences in the amplitude and the time scales of fluctuation, however, the transfer entropy fails to detect the correct relationships between the components, even though the time series are generated by linear equations. In contrast, the results of the RAR modeling in this case are correct. For a highly complicated dynamics such as human brain activity observed by electroencephalography measurements, the results of the transfer entropy are drastically different from those of the RAR modeling. |
15:30 | Flexible inference in heterogeneous and attributed multilayer networks PRESENTER: Martina Contisciani ABSTRACT. Community detection is one of the most popular approaches to define and identify the mesoscale organization of real-world networks. Recent studies have shown that accounting for node attributes can improve prediction performance, as the additional information can be exploited for a variety of tasks, such as predicting missing links. Moreover, the interplay between edge structure and node metadata can yield significant insights into the underlying organization and functional relations in the network. Current approaches mainly focus on integrating metadata and the structural information of single-layer networks. Multilayer networks, nevertheless, allow for a more complex and nuanced understanding of real-world data. However, apart from a few contributions, how to perform inference on multilayer networks together with node attributes (known as attributed multilayer networks) remains an unexplored topic. In particular, it is not clear how to combine conveniently and in a principled way various sources of information, together with the network topology, and assess how these impact downstream inference tasks. Here we present ADLALM, a probabilistic model to perform community detection in directed and undirected attributed multilayer networks, that takes as input any number of layers and attributes, regardless of their data types. Our approach differs from previous studies in that ADLALM flexibly adapts to any combination of input data, while standard methods rely on model-specific analytic derivations that highly depend on the data types given in input. Our formulation assigns mixed community memberships to the nodes, and transforms the parameters into a shared space where their distributions can all be modeled with Gaussians, both priors and posteriors. Using ideas from probabilistic Machine Learning, we derive an inference procedure that is simple to utilize--it is based on automatic differentiation and does not need any explicit derivations--and scales efficiently to large real-world systems. ADLALM estimates full posterior distributions and, thanks to the Laplace Matching technique (Hobbhahn and Hennig, 2021), it conveniently maps them to different desired domains to ease interpretation. For instance, to provide a probabilistic interpretation of the inferred communities, our method properly maps the parameters of a Gaussian distribution to those of a Dirichlet distribution, which has positive domain and enforces a normalization on a simplex. We validate our algorithm through a variety of experiments. Via synthetic studies with known ground truth, we find that ADLALM accurately recovers parameters and successfully reconstructs previously unseen data in link and covariate prediction tasks. Furthermore, we provide a thorough investigation of the choice of the prior distributions and transformations of the posteriors, and show how these can be flexibly employed to model a variety of different data types. To conclude, we conduct a comprehensive study of a real-world social support network that describes interactions between individuals in an Indian village. Here, we highlight how using different combinations of interactions and attributes leads to identifying different sets of communities and thus highlights different aspects of the problem. The manuscript is in preparation. |
15:45 | The effect of Collaborative-Filtering based Recommendation Algorithms on opinion diversity PRESENTER: Alessandro Bellina ABSTRACT. A central role in shaping the experience of users online is played by recommendation algorithms. On the one hand they help retrieve content that best suits users' taste, but on the other hand they may give rise to the so-called "filter bubble" effect, favoring the rise of polarization. In this work we study how a user-user collaborative-filtering algorithm affects the behavior of a group of agents repeatedly exposed to it. By means of analytical and numerical techniques we show how the system's stationary state depends on the strength of the similarity and popularity biases, quantifying respectively the weight given to the most similar users and to the best rated items. In particular, we derive a phase diagram of the model, where we observe three distinct phases: disorder, consensus and polarization. In the latter, users spontaneously split into different groups, each focused on a single item. We identify, at the boundary between the disorder and polarization phases, a region where recommendations are nontrivially personalized without leading to filter bubbles. Finally, we test the model on music listening histories from the "last.fm" dataset, observing good agreement between real data and simulations and showing that the model captures the main features of real-world situations. |
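A minimal sketch of a user-user collaborative-filtering step with tunable similarity and popularity biases, loosely following the description above, is shown below. This is not the authors' model; the parameter names (beta_sim, beta_pop) and the toy rating matrix are assumptions used only to illustrate the two biases.

```python
# One recommendation step of a user-user collaborative filter (illustrative).
import numpy as np

def recommend(R, user, beta_sim=2.0, beta_pop=1.0, rng=None):
    """
    R: (n_users x n_items) matrix of ratings/plays.
    beta_sim weights the most similar users; beta_pop weights the best rated items.
    Returns one recommended item index for `user`.
    """
    rng = rng or np.random.default_rng()
    # cosine similarity between the target user and all others
    norms = np.linalg.norm(R, axis=1) * np.linalg.norm(R[user]) + 1e-12
    sim = (R @ R[user]) / norms
    sim[user] = 0.0
    w_users = sim ** beta_sim          # similarity bias
    scores = w_users @ R               # items favored by similar users
    scores = scores ** beta_pop        # popularity bias
    p = scores / scores.sum()
    return rng.choice(len(p), p=p)

R = np.random.default_rng(0).poisson(1.0, size=(50, 20)).astype(float)
print(recommend(R, user=3))
```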
16:00 | Old embeddings and novel AIs for network prediction in temporal graphs and phylogenetics ABSTRACT. In this talk we will consider the problem of predicting the interaction structure of a network from partial, and potentially indirect, information. We will tackle that problem with a combination of a classic statistical embedding framework and a variety of novel machine learning and neural network techniques. To make it concrete, we will consider three scenarios, where: 1. the information is constituted by metadata at the level of the nodes, which we suspect may influence the probability of two nodes interacting; 2. as a special example of 1., we consider an ecology & evolution scenario where nodes represent species, edges represent trophic interactions, and the metadata is expressed as phylogenies (rooted trees encoding the (shared) evolutionary history of the species); 3. the information is expressed as a sequence of networks that constitute the (potentially incompletely observed) history of a dynamical complex network, and we are asked to predict its temporal evolution. The low-rank graph embedding method we adopt comes from the statistical theory of Random Dot Product Graphs (RDPG) [1], which provides a robust framework for interpreting the truncated Singular Value Decomposition of a complex network. Within the RDPG framework, each node i is described by two vectors of latent features Li and Ri, each of which can be seen as a point in a d-dimensional space (where d is usually small). Then, the probability of node i interacting with node j is given by the dot product of their latent features, Li·Rj. The RDPG framework provides an efficient way of computing these latent features. As illustrated in a recent series of papers, we marry this embedding technique with different machine learning approaches depending on the scenario and the available data: 1. In scenario (1) we use classic neural networks to predict latent features, and eventually interaction probabilities, from node tabular metadata. We apply this to a very large network of touristic trips to Aotearoa New Zealand, to predict where tourists will likely go [2]; 2. In scenario (2) we use phylogenetic ancestral state reconstruction to predict the (unobserved) food web through knowledge transfer and evolutionary information [3]; 3. In scenario (3) we interpret the temporal evolution of the networks, through the lens of the embedding, as a dynamical system; this allows us to apply neural differential equations and Universal Differential Equations (UDEs) to (a) predict the network dynamics and (b) discover the differential equations [4]. In the future, we intend to try to apply this approach to the temporal evolution of ecological networks under the pressure of forcing effects determined by global warming. In conclusion, we hope our talk may convince the audience that the proposed analytic flow (embed -> AI -> reconstruct) is interpretable, as it exploits a well-studied statistical framework; efficient, as the embedding step is extremely simple and the AI can focus on just the remaining part of the problem; and elastic, as the AI step can be chosen ad hoc to match the available data and the given question. Whether this interpretability, efficiency, and elasticity are paid for by a loss in accuracy and generality with respect to a fully AI approach is, for us, an open question. References: [1] Athreya, Avanti, et al. "Statistical inference on random dot product graphs: a survey." 
The Journal of Machine Learning Research 18.1 (2017): 8393-8484. [2] Runghen, Rogini, Daniel B. Stouffer, and Giulio V. Dalla Riva. "Exploiting node metadata to predict interactions in bipartite networks using graph embedding and neural networks." Royal Society Open Science 9.8 (2022): 220079. [3] Strydom, Tanya, et al. "Food web reconstruction through phylogenetic transfer of low‐rank network representation." Methods in Ecology and Evolution 13.12 (2022): 2838-2849. [4] Smith, Connor, and Giulio V. Dalla Riva, Work in Progress |
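The embedding step itself is indeed simple; the sketch below shows the generic RDPG-style construction (not the speakers' code): a truncated SVD of the adjacency matrix gives left and right latent positions L and R, and the dot product L[i]·R[j] estimates the probability of an i -> j edge. The random test graph and embedding dimension are illustrative assumptions.

```python
# Truncated-SVD (RDPG-style) embedding and dot-product link probabilities.
import numpy as np

def rdpg_embed(A, d=4):
    """Truncated SVD embedding of a (possibly directed) adjacency matrix."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    sqrt_s = np.sqrt(s[:d])
    L = U[:, :d] * sqrt_s          # left latent positions
    R = Vt[:d, :].T * sqrt_s       # right latent positions
    return L, R

def edge_probabilities(L, R):
    """Estimated connection probabilities, clipped to [0, 1]."""
    return np.clip(L @ R.T, 0.0, 1.0)

rng = np.random.default_rng(1)
A = (rng.random((30, 30)) < 0.15).astype(float)
L, R = rdpg_embed(A, d=3)
P_hat = edge_probabilities(L, R)
print(P_hat.shape, P_hat[0, :5])
```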
16:15 | Identifying key players in networks through variational quantum algorithm with deep Q-learning PRESENTER: Xiao-Long Ren ABSTRACT. Identifying a set of key nodes in a network that, if removed, would lead to the dismantling of the network is a fundamental problem in Network Science. In this study, we propose a novel approach that combines reinforcement learning with quantum convolutional neural networks. Our method captures the long-range correlations of nodes while maximizing the capture of network information, and it does so while ensuring the network structure remains unchanged. The proposed method can be applied to a wide range of network scenarios after being trained on small networks. We introduce quantum computing and utilize tensor products and unitary matrices to establish the network framework in reinforcement learning. This exponentially reduces the number of model parameters and computational complexity compared to traditional neural networks. We compare the results with other network dismantling algorithms and obtain better performance. This research sheds light on the design of faster algorithms for many hard optimization problems. |
16:30 | ‘Stealing fire or stacking knowledge’ by machine intelligence to model link prediction in complex networks PRESENTER: Alessandro Muscoloni ABSTRACT. Current methodologies to model connectivity in complex networks either rely on network scientists' intelligence to discover reliable physical rules, or use artificial intelligence (AI) that stacks hundreds of inaccurate human-made rules to make a new one that optimally summarizes them together. Here, we provide an accurate and reproducible scientific analysis showing that, contrary to the current belief, stacking more good link prediction rules does not necessarily improve the link prediction performance to nearly optimal as suggested by recent studies. Finally, in light of our novel results, we discuss the pros and cons of each current state-of-the-art link prediction strategy, concluding that none of the current solutions are what the future might hold for us. Future solutions might require the design and development of next-generation ‘creative’ AI that is able to generate and understand complex physical rules for us. |
14:15 | Diffusion approximation of a network model of meme popularity PRESENTER: James Gleeson ABSTRACT. Models of meme propagation on social networks, in which memes compete for limited user attention, can successfully reproduce the heavy-tailed popularity distributions observed in online settings. While system-wide popularity distributions have been derived analytically, the dynamics of individual meme trajectories have thus far evaded description. To address this, we formulate the diffusion of a given meme as a one-dimensional stochastic process, whose fluctuations result from aggregating local network dynamics using classic and generalised central limit theorems, with the latter based on stable distribution theory. Ultimately, our approach decouples competing trajectories of meme popularities, allowing them to be simulated independently, and thus parallelised, and expressed in terms of Fokker-Planck equations. |
14:30 | Analysis of mean-field approximation for Deffuant opinion dynamics on networks PRESENTER: Alina Dubovskaya ABSTRACT. The Deffuant–Weisbuch model [1] is a bounded-confidence type opinion formation model. The model can demonstrate consensus, polarisation and fragmentation of opinions, depending on the value of the confidence bound parameter. In this work, we present an asymptotic and linear stability analysis of the mean-field Deffuant–Weisbuch model [2] defined on networks composed of two degree classes. With the use of asymptotic analysis, we explain how opinions evolve on such networks and how opinion clusters form. We present an approximate model that is independent of the confidence bound parameter, allowing for the dynamics to be derived from a single solution for all confidence bound values. With the use of linear stability analysis, we derive an analytical estimate for the number and locations of final opinion clusters for any given confidence bound value [3]. Comparison with numerical simulations shows that our estimate accurately predicts the location of major clusters for both the network-based model and the model with a fully-mixed population. References [1] G. Deffuant, D. Neau, F. Amblard, and G. Weisbuch: Mixing beliefs among interacting agents. Adv. Complex Syst. 03(01n04):87–98, (2000) [2] S. C. Fennell, K. Burke, M. Quayle, and J. P. Gleeson: Generalized mean-field approximation for the Deffuant opinion dynamics model on networks. Phys. Rev. E. 103(1), (2021) [3] A. Dubovskaya, S. C. Fennell, K. Burke, J. P. Gleeson, and D. O’Kiely: Analysis of mean- field approximation for Deffuant opinion dynamics on networks. arXiv:2210.07167 (2022) |
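For readers unfamiliar with the underlying dynamics, a minimal agent-based version of the Deffuant–Weisbuch model on a network is sketched below; the confidence bound, convergence rate and network are illustrative assumptions, not the parameters analyzed in the talk (which works with the mean-field equations rather than simulation).

```python
# Simple Deffuant-Weisbuch bounded-confidence simulation on a network.
import numpy as np
import networkx as nx

def deffuant(G, eps=0.2, mu=0.5, steps=100_000, seed=0):
    """eps: confidence bound; mu: convergence rate."""
    rng = np.random.default_rng(seed)
    x = {v: rng.random() for v in G}            # initial opinions in [0, 1]
    edges = list(G.edges())
    for _ in range(steps):
        i, j = edges[rng.integers(len(edges))]  # random interacting pair
        if abs(x[i] - x[j]) < eps:              # interact only if opinions are close
            xi, xj = x[i], x[j]
            x[i] += mu * (xj - xi)
            x[j] += mu * (xi - xj)
    return x

G = nx.barabasi_albert_graph(200, 3, seed=1)
opinions = deffuant(G, eps=0.25)
print(min(opinions.values()), max(opinions.values()))
```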
14:45 | Modeling critical connectivity constraints in random and empirical networks PRESENTER: Laurent Hébert-Dufresne ABSTRACT. Random networks are a powerful tool in the analytical modeling of complex networks as they allow us to write simple mathematical models specifically designed to study interesting properties and behaviors of networks. One notable shortcoming of these models is that they are often used to study processes in terms of how they affect the giant connected component of the network -- its robustness or ability to spread a supercritical epidemic -- yet they fail to properly account for that component. For example, random network models are used to answer questions such as how robust the network is to random damage, but fail to capture the structure of the network even under zero damage. Here, we introduce a simple conceptual step to account for such connectivity constraints in existing models. We distinguish network neighbors into two types of connections that can lead or not to a component of interest, like a spanning tree or a giant component, and call those critical and subcritical degrees. Our model can be captured by a joint distribution P(k,c) of subcritical degree k and critical degree c. In doing so, we can for example solve a bond percolation process with a simple equation, which can in some cases approximate state-of-the-art models like message passing that require a number of equations linear in system size. We discuss potential applications of this simple framework for the study of infrastructure networks where connectivity constraints are critical to the function of the system. |
15:00 | Spreading of opinions with different qualities in heterogeneous networks PRESENTER: Thierry Njougouo ABSTRACT. Humans and animals often choose between options with varying quality, requiring consensus formation through interaction with others. This phenomenon is widely studied in the context of opinion dynamics, with results showing that the individual mechanism of information processing affects the outcome and the time required to reach consensus. The main models used are the Voter Model (VM) and its variations, and the Majority Model (MM), both of which focus on the propagation of discrete opinions in a population. In the original VM, agents have a limited cognitive load as they only use social information from one (randomly chosen) neighbour. Instead, in the MM, the agents have a high cognitive load as they must count all neighbours' opinions and select the most frequent one. In this study, we build and analyze a model for collective decision making between two options characterized by an objective quality. The option's quality defines the probability that the option is voted for and communicated to the neighbours. Such a quality-based decision-making model is useful to describe best-of-n decisions in collective animal behaviour [Reina et al. PRE 2017] and swarm robotics applications [Valentini et al. Frontiers. 2016]. Our model generalizes the previous two opinion dynamics models (VM and MM) by regulating the agents' cognitive load and, in turn, their information processing mechanism. The key parameter of our model is the agents' cognitive load, which leads to a continuous range of solutions, interpolating between the VM, the MM, and even models with lower levels of cognitive load (the extreme is zero load and socially independent, i.e. random, opinion changes, as shown in the figure below). We analyze the dynamics of our model in homogeneous and heterogeneous networks. Our results show that the individual cognitive load can regulate the speed-accuracy trade-off in the collective dynamics. Increasing the individual cognitive load can lead to quicker but less accurate collective decisions (consensus on the lower quality option). Instead, slower and more accurate decisions can be obtained for intermediate cognitive load values. Through heterogeneous mean-field theory, we also study the impact of the network topology on the population dynamics. Our analysis shows that, as network heterogeneity increases (increasing the exponent of the scale-free degree distribution), the population becomes more robust to errors, i.e., the population selects the best option even when a large majority momentarily selects the inferior option. Finally, we investigate the robustness of the population dynamics against the presence of a minority of zealot agents that never change opinion and only vote for the inferior option. Our analysis shows that zealots can be used to overcome the effect of option quality and also the effect of the proportion of agents with the opposite opinion. |
15:15 | Threshold Cascade Dynamics on an Adaptive Network PRESENTER: Byungjoon Min ABSTRACT. The propagation of opinions in a networked system and the change of social connections according to the opinions of nodes are intertwined and coevolve with each other. In this presentation, we study the coevolutionary dynamics of social contagion with network evolution. We consider opinion spreading that follows the threshold cascade model, where a node adopts the opinion if the fraction of its adopted neighbors is greater than a prescribed threshold. In addition to the contagion processes, each non-adopted node breaks its social ties with an adopted neighbor with probability p, and finds a new neighbor. We explore the coevolutionary dynamics of social contagion and network evolution along these processes using extensive numerical simulations. Our study offers the potential to deepen our understanding of complex and evolving systems. |
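An illustrative simulation of the coupled process described above is sketched below: a Watts-style threshold adoption rule combined with rewiring away from adopted neighbors with probability p. The threshold, rewiring probability, seed count and network are assumptions chosen for illustration, and the rewiring target is picked uniformly at random, which is only one of several plausible choices.

```python
# Threshold cascade with adaptive rewiring (illustrative sketch).
import random
import networkx as nx

def adaptive_threshold_cascade(G, phi=0.3, p=0.1, seeds=5, max_iter=200, seed=0):
    random.seed(seed)
    adopted = set(random.sample(list(G.nodes()), seeds))
    for _ in range(max_iter):
        changed = False
        # rewiring step: susceptible nodes may drop adopted neighbors
        for u in list(G.nodes()):
            if u in adopted:
                continue
            for v in list(G.neighbors(u)):
                if v in adopted and random.random() < p:
                    G.remove_edge(u, v)
                    candidates = [w for w in G.nodes() if w != u and not G.has_edge(u, w)]
                    if candidates:
                        G.add_edge(u, random.choice(candidates))
        # adoption step: threshold rule
        for u in list(G.nodes()):
            if u in adopted or G.degree(u) == 0:
                continue
            frac = sum(1 for v in G.neighbors(u) if v in adopted) / G.degree(u)
            if frac > phi:
                adopted.add(u)
                changed = True
        if not changed:
            break
    return adopted

G = nx.erdos_renyi_graph(300, 0.02, seed=1)
print(len(adaptive_threshold_cascade(G)))
```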
15:30 | Contagion dynamics on hypergraphs with nested hyperedges PRESENTER: Jihye Kim ABSTRACT. Many complex social systems, which are prolific in higher-order interactions taking place among any number of individuals, can be encoded as hypergraphs consisting of nodes and hyperedges. Hyperedges, representing multi-body interactions, can have diverse sizes, and can be included (nested) within other ones as illustrated in Fig. 1 (a). To quantitatively investigate the effects of such nested structure on spreading dynamics, here we introduce a nested-hypergraph model where the average fraction $\varepsilon_s$ of nested hyperedges of size $s$ is adjustable, and we address a simplicial susceptible-infectious-susceptible (SIS) model on the nested hypergraphs. In the contagion model, a susceptible node in a hyperedge of size $s$ can catch a disease at rate $\beta_s$ only when all the other nodes in the hyperedge are infectious; and an infected node turns into a susceptible one at rate $\mu$. We propose an analytical framework called the facet approximation (FA), in which the density of infected nodes in the nested hyperedges is approximated by using the density of infected nodes in the largest hyperedges (called the facets) of size $s_m$, which is evaluated explicitly, while the infection density in non-nested (free) hyperedges is approximated by the global average as in the mean-field approach. In the FA, we establish equations for the time evolution of the fraction of facets with a given number of infected nodes and of the fraction of susceptible nodes distinguished by their degree vectors; the FA can therefore capture dynamical correlations existing in the nested structure. We then apply the FA to nested hypergraphs with facets of size three. Applying the FA to the simplicial SIS model on nested hypergraphs with hyperedges of size two and three, we obtain the stationary-state fraction of infected nodes $I^*$, which is a function of $\varepsilon_2$ and $\lambda_s \equiv m_s \beta_s / \mu$ ($s \in \{2, 3\}$), where $m_s$ is the expected number of hyperedges of size $s$ to which a node belongs; we find that a continuous or discontinuous transition between endemic and disease-free phases occurs for large enough $\lambda_2$, as shown in Fig. 1 (b). From the FA, we can also obtain the phase diagram presented in Fig. 1 (c), indicating that an increase in $\varepsilon_2$ makes it easier for an infectious disease to spread over the nested hypergraphs. In future work, we could utilize the FA as a tool for studying contagion dynamics on hypergraphs containing overlapping-community structure. |
15:45 | Distinguishing simple and complex contagion processes on complex networks PRESENTER: Elsa Andres ABSTRACT. The reasons why we adopt a certain behavior, such as the decision to stop smoking, are strongly influenced by our peers, with adoption mechanisms changing from one individual to another. While some people might be prone to change their behavior after the influence of only one of their friends, others would need several stimuli from their social circles before acting the same way. On top of this, individuals can also decide to adopt a behavior independently of their peers, due to intrinsic attitudes or external factors such as advertisements. In the complex networks literature, the three adoption mechanisms just described correspond to three well-known contagion processes, where adopters are usually called infected (and susceptible otherwise). Simple contagion (SI) is a probabilistic process in which a susceptible node can be contaminated with probability β after each contact with an infected node [1]. By contrast, complex contagion (CP) includes social reinforcement: the proportion of infected neighbors of an ego node must be above a certain threshold φ in order to contaminate it [2]. Finally, a node can undergo a spontaneous adoption (Sp) with probability r, no matter the infection status of its neighbors. Until now, the distinguishability of these three phenomena has only been studied looking at macroscopic quantities, such as contagion curves or the order of contagion of all the nodes of a graph [3, 4]. These approaches have two main drawbacks: they (i) require full knowledge of the contagion process and (ii) assume that every node behaves the same way, thus limiting their applicability to real-world scenarios. In this work, we instead shift the focus to the microscopic level. In particular, we address the problem of how the Sp, SI and CP mechanisms can be distinguished at the egocentric level for single adoption cases. In other words, we rely exclusively on the information available from the point of view of the adopter, who has only local knowledge of its neighborhood. We develop two classification methods based on random forest (ML) and Bayesian likelihood (LLH), where the parameters r, β and φ can be known a priori or not. We first test our methods on synthetic networks assuming known parameters, and show that the ML approach performs better than the LLH. In general, we reach accuracies over 70%, with values that depend on the underlying parameters. We then classify the adoption of different hashtags on Twitter, where the parameters r, β and φ are inferred individually for each node. Figure 1 shows the case of a single hashtag, #GiletsJaunes. Most of the contagions are classified as SI by both methods. This might be due to the political nature of the hashtag, which gives less importance to social reinforcement, while for some other hashtags of a different nature, such as #10YearsChallenge, both of our methods predominantly predict CP. Overall, this study enhances our comprehension of how to differentiate seemingly similar global contagion processes driven by different contagion mechanisms at the microscopic level, opening the door to individual-level identification in empirical data. |
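The three adoption mechanisms being distinguished can be stated compactly from the point of view of a single susceptible ego node at one time step, as in the sketch below; the parameter values in the example call are illustrative assumptions.

```python
# Sp, SI and CP adoption rules for a single susceptible ego node (illustrative).
import random

def spontaneous_adoption(r):
    """Sp: adopt with probability r, independently of neighbors."""
    return random.random() < r

def simple_contagion(infected_contacts, beta):
    """SI: each contact with an infected neighbor independently infects with prob. beta."""
    return any(random.random() < beta for _ in range(infected_contacts))

def complex_contagion(n_infected_neighbors, n_neighbors, phi):
    """CP: adopt only if the infected fraction of the neighborhood exceeds phi."""
    return n_neighbors > 0 and n_infected_neighbors / n_neighbors > phi

random.seed(0)
print(spontaneous_adoption(r=0.01),
      simple_contagion(infected_contacts=3, beta=0.1),
      complex_contagion(n_infected_neighbors=4, n_neighbors=10, phi=0.3))
```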
16:00 | Belief propagation algorithm for improving opinion formation heuristics PRESENTER: Enrico Maria Fenoaltea ABSTRACT. With the recent Covid-19 pandemic or the succession of news reports on the war in Ukraine, it has become evident how difficult it is for nonexperts to form a reliable opinion on a given topic. Indeed, even with a huge amount of information easily available online, many individuals deny the efficacy of vaccines or are confused by propaganda operations concerning the war. So, how do people form opinions when they have so many sources of (often conflicting) information at their disposal? According to behavioral sciences, people frequently develop opinions about complicated subjects using simple heuristics \cite{kahneman2011thinking}. Recent models of opinion formation \cite{medo2021fragility, meng2021whom} have shown that trivial heuristics make individuals' opinions unreliable when the number of information sources is very large. How can people do better? And how can they avoid drowning in a sea of news in which they can no longer distinguish true from false? Motivated by these questions, we introduce a model to compare different heuristic rules and show that an opinion formation process whose rules are inspired by the belief propagation algorithm \cite{yedidia2003understanding} performs better than other heuristic rules already studied in the literature. In particular, we describe a model in which a single agent gradually develops opinions on $N$ sources of information connected by signed links that indicate positive and negative relations, respectively. The sources of information on which opinions are formed could be, for instance, news media or politicians belonging to different parties. The positive and negative relationships indicate whether the two information sources have converging or conflicting positions. In this way, the information sources are the nodes of a signed undirected network. Each opinion, or node, can be in two states: positive or negative (e.g., reliable or unreliable). In principle, if two nodes are in the same state, the link connecting them is positive, and negative otherwise. The individual initially knows the state of only one node, and their goal is to infer the states of all other nodes when there is noise in the system. The noise represents how much the sign of the links is related to the state of the nodes: if there is no noise, the rule mentioned earlier applies (i.e., two nodes in the same state have a positive link, etc.); if the noise is maximal, the sign of the links gives no information about the states of the nodes. We define an opinion formation process based on message passing: the individual walks over each node and, by aggregating messages from its neighbors, determines its confidence level that each node is positive or negative. The messages are defined as in the standard belief propagation theory \cite{yedidia2003understanding}. We show numerically and analytically that there is a critical value of noise below which the individual can infer the correct state of most nodes (Fig. 1). This is also true for $N\to \infty$. This is a remarkable result, since the individual exploits only local information (only messages from the neighbors of each node). At the same time, however, we show that, with other local heuristic rules existing in the literature, individuals cannot extract any information from the network when $N\to \infty$. |
16:15 | Generalized network density matrices for analysis of multiscale functional diversity PRESENTER: Arsham Ghavasieh ABSTRACT. The network density matrix formalism allows for describing the dynamics of information on top of complex structures, and it has been successfully used for tasks ranging from analyzing a system's robustness to perturbations and coarse-graining multilayer networks, to characterizing emergent network states and performing multiscale analysis. However, this framework is usually limited to diffusion dynamics on undirected networks. Here, to overcome some limitations, we propose an approach to derive density matrices based on dynamical systems and information theory that allows for encapsulating a much wider range of linear and non-linear dynamics and richer classes of structure, such as directed and signed ones. We use our framework to study the response to local stochastic perturbations of synthetic and empirical networks, including neural systems consisting of excitatory and inhibitory links and gene-regulatory interactions. Our findings demonstrate that topological complexity does not necessarily lead to functional diversity, i.e., complex and heterogeneous responses to stimuli or perturbations. Instead, functional diversity is a genuine emergent property that cannot be deduced from the knowledge of topological features such as heterogeneity, modularity, presence of asymmetries, or dynamical properties of a system. |
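For context, the standard diffusion-based density matrix that this talk generalizes can be computed in a few lines: rho(tau) = exp(-tau L) / Tr(exp(-tau L)), with the von Neumann entropy as a multiscale descriptor. The sketch below shows only this undirected baseline, not the generalized formulation presented in the talk; the graph and the value of tau are illustrative.

```python
# Diffusion-based network density matrix and von Neumann entropy (baseline).
import numpy as np
import networkx as nx
from scipy.linalg import expm, logm

def density_matrix(G, tau=1.0):
    L = nx.laplacian_matrix(G).toarray().astype(float)
    P = expm(-tau * L)                 # diffusion propagator
    return P / np.trace(P)

def von_neumann_entropy(rho):
    return float(-np.trace(rho @ logm(rho)).real)

G = nx.karate_club_graph()
rho = density_matrix(G, tau=2.0)
print(round(von_neumann_entropy(rho), 3))
```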
16:30 | Transition to a structurally balanced paradise state in the system with agents' opinions PRESENTER: Piotr Górski ABSTRACT. Homophily and structural balance are two important processes in the formation and evolution of relations in social systems. Here, we describe a model of pair and triad dynamics. We consider a system of $N$ agents that possess $G$ attributes each. Following homophily, agent similarity is used to determine the signs of relations between agents. Following structural balance, unbalanced triads induce changes in agents' attributes that may resolve the triadic tensions. Using Fokker-Planck equations, we reconstruct the phase diagram for such a model and we confirm the analytical results by extensive numerical simulations. We show that in the presence of agents' attributes, the phase transition to a paradise state (all links positive) takes place only when the number of attributes $G$ is much higher than the number of agents $N$ in the system, $G>O(N^2)$, $N \to \infty$. If the condition is not fulfilled, then the transition to a paradise state does not take place. It follows that the lack of structural balance observed in many social datasets can be explained by a subcritical number of social attributes of the interacting parties. |
14:15 | Taxonomy of cohesion coefficients for weighted and directed multilayer networks PRESENTER: Paolo Bartesaghi ABSTRACT. Clustering and closure coefficients are among the most widely used indicators to describe the topological structure of a network. Many different definitions have been proposed over time, especially in the case of weighted networks, where the choice of the weight assigned to the triangles plays a crucial role. The present work extends clustering and closure coefficients to the most general context of weighted directed multilayer networks. The first aim is to provide general definitions and to show how specific coefficients already proposed in the literature arise as special cases. Indeed, the tensor formalism makes it possible to incorporate the new definitions, as well as all those existing in the literature, into a single unified notation. We also introduce a new coefficient, here called the clumping coefficient, as a natural extension of the two existing ones. The clumping coefficient overcomes the distinction between clustering and closure coefficients, which is based on a conventional choice in completing open triads of nodes, and combines them into a single measure, providing an overall view of the level of cohesion of a node in the network. For each coefficient, new local and global versions are proposed. In particular, local coefficients are introduced for a single node on a single layer, for all the replicas of the same node on all layers, or for all the nodes within a single layer. Finally, a global coefficient is introduced based on the classical concept of transitivity. In the directed case, it is also often useful to take into account only directed triangles of a certain type, specifically out, in, cycle and middleman triangles. We provide explicit versions of all the coefficients adapted to directed triangles of each type and show how they can be used to recover the coefficients known in the literature. From a methodological point of view, this work aims to systematise the current knowledge in the field, providing a complete taxonomy of all the cohesion coefficients. We also analyse a number of applications to simulated and real networks, in particular an application to the multiplex temporal financial network based on the returns of the S&P 100 assets, and to the multilayer world trade network, disaggregated into sectoral layers. |
14:30 | Fitting degree distributions of complex networks PRESENTER: Shane Mannion ABSTRACT. This work introduces a method for fitting to the degree distributions of complex network datasets, such that the most appropriate distribution from a set of candidate distributions is chosen while maximizing the portion of the distribution to which the model is fit. Current methods for fitting to degree distributions in the literature are inconsistent and often assume a priori what distribution the data are drawn from. Much focus is given to fitting to the tail of the distribution, while a large portion of the distribution below the tail is ignored. It is important to account for these low degree nodes, as they play crucial roles in processes such as percolation. Here we address these issues, using maximum likelihood estimators to fit to the entire dataset, or close to it. This methodology is applicable to any network dataset (or discrete empirical dataset), and we test it on over 25 network datasets from a wide range of sources, achieving good fits in all but a few cases. We also demonstrate that numerical maximization of the likelihood performs better than commonly used analytical approximations. |
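The general idea of fitting candidate discrete distributions to a degree sequence by maximum likelihood and selecting among them can be sketched as below. The candidate set (a discrete power law versus a geometric distribution), the use of AIC, and the synthetic degree sequence are simplified assumptions for illustration, not the talk's actual methodology or data.

```python
# MLE fits of two candidate degree distributions with AIC-based selection.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import zeta

def fit_power_law(degrees, k_min=1):
    """MLE for a discrete power law P(k) ~ k^-alpha, k >= k_min (Hurwitz-zeta normalized)."""
    k = np.asarray([d for d in degrees if d >= k_min], dtype=float)
    nll = lambda a: len(k) * np.log(zeta(a, k_min)) + a * np.log(k).sum()
    res = minimize_scalar(nll, bounds=(1.01, 6.0), method="bounded")
    return res.x, -res.fun

def fit_geometric(degrees, k_min=1):
    """MLE for a geometric distribution P(k) = p (1-p)^(k - k_min), k >= k_min."""
    k = np.asarray([d for d in degrees if d >= k_min], dtype=float)
    p = 1.0 / (k.mean() - k_min + 1.0)
    loglik = len(k) * np.log(p) + np.log(1 - p) * (k - k_min).sum()
    return p, loglik

def aic(loglik, n_params=1):
    return 2 * n_params - 2 * loglik

degrees = np.random.default_rng(0).zipf(2.3, size=5000)
alpha, ll_pl = fit_power_law(degrees)
p, ll_geo = fit_geometric(degrees)
print(f"power law alpha={alpha:.2f} AIC={aic(ll_pl):.1f}  geometric AIC={aic(ll_geo):.1f}")
```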
14:45 | Node anonymity in networks: The infectiousness of uniqueness PRESENTER: Rachel de Jong ABSTRACT. Introduction. Ensuring privacy of individuals is of paramount importance to social network analysis research. An important part of this problem is determining how to measure anonymity of nodes in a network, which requires a thorough understanding of how individuals can be identified, i.e., the “attacker side”. Previous work uses anonymity measures that are often based on local node properties such as the degree or 1-neighborhood structure [1]. However, our results show that this might not be sufficient, as additional knowledge of the 2-neighborhood, or just knowledge of the existence of one extra link, results in greatly decreased node anonymity. Approach. We use the notion of k-anonymity and say a node is k-anonymous if there are k−1 equivalent nodes. If there are no such nodes, the node under consideration is unique and thus not anonymous. We focus on two different measures for equivalence. Code can be found at https://github.com/RacheldeJong/dkAnonymity. 1) d-k-Anonymity. In the first measure, d-k-anonymity, equivalent nodes are structurally indistinguishable with perfect knowledge of their d-neighborhood, where d is the distance to the focal node. More specifically, two nodes are equivalent if 1) their d-neighborhoods are isomorphic and 2) the nodes have the same structural position in these d-neighborhoods. Note that the ego network is equal to the 1-neighborhood. However, d-k-anonymity is a strict measure that accounts for a lot of knowledge, especially when the graph is dense. 2) Anonymity-Cascade. To alleviate the abovementioned concern, we secondly investigate the de-anonymizing effect of knowledge about 1-neighborhoods and one additional link. Anonymity-Cascade is a procedure that extends d-k-anonymity. It starts by finding all nodes with a unique 1-neighborhood (i.e., setting d=1). Then, if among the neighbors of such a node there is a node with a unique 1-neighborhood, it is possible to uniquely identify this node with the additional knowledge that the nodes are connected (Cascade “one level”). This case is also captured by d-k-anonymity with d = 2. Next, we can repeat this process by reusing these uniquely identified nodes to iteratively identify more nodes in the network (Cascade “final level”). Experimental results. To understand how these two methods assess the anonymity of nodes in real-world networks, we perform experiments on a set of 36 diverse (social) network datasets from well-known repositories, varying in size, density and domain. The results for d-k-anonymity with d=1 and d=2 can be found in Figure 1. Networks are sorted by size, and range from 167 nodes to 3 million nodes and up to 18 million edges. The figure shows that knowledge of the 2-neighborhood instead of the 1-neighborhood, for example as used in [1], very strongly decreases anonymity. We find that networks that have a high level of anonymity at d=1 either lose a lot of anonymity when d>1, or that most nodes remain anonymous even for larger neighborhoods. Experiments with Anonymity-Cascade reveal that this approach reports strongly decreased node anonymity in real-world networks, even with the additional knowledge of only one extra link. Conveniently, this measure is much faster to compute than d-k-anonymity with d=2, enabling fast anonymity measurement in large networks. Conclusion and outlook. 
Measuring structural anonymity of nodes is an important step in assessing anonymity and identity disclosure risk in networks. We proposed and evaluated two methods that demonstrate how this risk severely increases when more than a node's direct neighborhood is considered. The proposed methods and associated findings have potential implications in practical settings where researchers are sharing sensitive social network data and wish to guarantee the anonymity, and thereby the privacy, of people in the network. References [1] D. Romanini, S. Lehmann, and M. Kivelä, “Privacy and uniqueness of neighborhoods in social networks,” Scientific Reports 11: 20104, 2021. |
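The d=1 case of d-k-anonymity can be illustrated with a brute-force check: two nodes are equivalent if their ego networks are isomorphic with the two egos mapped onto each other. The sketch below is only meant to make the definition concrete and is far less efficient than the authors' tool linked above; the example graph is illustrative.

```python
# Brute-force 1-neighborhood (d=1) equivalence classes and uniqueness count.
import networkx as nx
from networkx.algorithms.isomorphism import categorical_node_match

def ego_net(G, v):
    E = nx.ego_graph(G, v, radius=1)
    nx.set_node_attributes(E, {u: (u == v) for u in E}, "is_ego")  # mark the focal node
    return E

def equivalence_classes_d1(G):
    node_match = categorical_node_match("is_ego", False)
    egos = {v: ego_net(G, v) for v in G}
    classes = []
    for v in G:
        for cls in classes:
            # egos must map onto each other, hence the node_match on "is_ego"
            if nx.is_isomorphic(egos[cls[0]], egos[v], node_match=node_match):
                cls.append(v)
                break
        else:
            classes.append([v])
    return classes

G = nx.karate_club_graph()
classes = equivalence_classes_d1(G)
unique = [c[0] for c in classes if len(c) == 1]
print(f"{len(unique)} of {G.number_of_nodes()} nodes are unique at d=1")
```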
15:00 | Comparing Link Filtering Backbone Techniques in Real-World Networks PRESENTER: Ali Yassin ABSTRACT. Networks are valuable representations of complex systems. They can be analyzed for various purposes, such as identifying communities and influential nodes, or studying network formation. However, large networks can be computationally challenging. Multiple techniques have been developed to reduce the network size while keeping its main properties. One can distinguish two approaches to deal with this issue: 1) structural and 2) statistical methods. Structural techniques reduce the network while preserving a set of essential properties. In contrast, statistical techniques tend to filter nodes or links that blur the original network. They rely on a statistical hypothesis testing model or estimate to filter noisy edges or nodes. In this study, we carry out a comprehensive comparison of seven statistical filtering techniques on a collection of 39 weighted real-world networks of various sizes (number of nodes ranging from 18 to 13,000; number of links ranging from 78 to 5,574,233) and origins (character, web, biological, economic, infrastructural, and offline/online social). First, we investigate the similarities between the filtering techniques. Indeed, each link has an associated probability value (P-value), allowing us to compare the methods through correlation analysis. In a second set of experiments, we investigate the relationship between the basic local properties of the nodes and the underlying statistical model through the P-values. Then we turn to the global backbone properties. More precisely, we compare the weight distribution of the extracted backbones to that of the original network for a given significance level ($\alpha = 0.05$). Finally, we study the backbone's criticality. We iteratively remove edges in ascending order of their P-value from the original network and measure the size of its largest connected component (LCC). |
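As an example of the kind of statistical filter being compared (the talk's seven methods are not listed here), the disparity filter of Serrano et al. (2009) assigns each edge a P-value under a null model in which a node's strength is split uniformly among its links, and keeps edges significant for at least one endpoint. A compact illustrative implementation, on a small weighted network shipped with networkx, is shown below.

```python
# Disparity filter backbone (illustrative implementation of one known method).
import networkx as nx

def disparity_filter(G, alpha=0.05, weight="weight"):
    """Keep an edge if its P-value is below alpha for at least one endpoint."""
    B = nx.Graph()
    B.add_nodes_from(G.nodes(data=True))
    for u, v, w in G.edges(data=weight):
        keep = False
        for x, _ in ((u, v), (v, u)):
            k = G.degree(x)
            if k > 1:
                s = sum(d[weight] for _, _, d in G.edges(x, data=True))  # node strength
                p_xy = w / s
                p_value = (1.0 - p_xy) ** (k - 1)   # null model: uniform weight split
                if p_value < alpha:
                    keep = True
        if keep:
            B.add_edge(u, v, **{weight: w})
    return B

G = nx.les_miserables_graph()   # weighted co-occurrence network
backbone = disparity_filter(G, alpha=0.05)
print(G.number_of_edges(), "->", backbone.number_of_edges())
```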
15:15 | Structural measures of similarity and complementarity in complex networks ABSTRACT. The structure of complex networks commonly reflects their functional properties and processes that created them. Seminal studies have shown that different systems, from neural networks to the World Wide Web, tend to be characterized by the presence of statistically over-represented small subgraphs, known as network motifs. While it is natural to expect different motifs to be related to particular functions or properties of a given system, it is often not easy to determine what they are exactly. In general, principles that would explain the prevalence of specific motifs across different application domains are still mostly unknown. An important exception is the well-known abundance of triangles in many types of real-world networks, which is linked to triadic closure and transitive relations driven by similarity between nodes in a (latent) metric space. In other words, in similarity-driven systems adjacent nodes are likely to share a lot of neighbors, and this implies the abundance of triangles and a latent geometric structure. Thus, one may ask whether there are other principles, akin to similarity, that are linked to their characteristic motifs, possibly thanks to some kind of intrinsic geometry? Here we show that one such principle is complementarity, which organizes relations driven by differences and synergies and is linked to the abundance of quadrangles (4-cycles). Indeed, many important phenomena, from trade and division of labor to protein-protein binding, may be better explained by complementarity, or differences and synergies, between diverse features of the involved parties. For instance, two types of wine may be often bought together with the same kinds of bread and cheese but rarely both of them will occur in the same transaction. In this talk, starting from simple geometric arguments, we define two families of structural coefficients of similarity and complementarity measuring the density of triangles and quadrangles. The similarity coefficients can be seen as a generalization of clustering and closure coefficients, while complementarity coefficients are a combined measure of bipartivity and bipartite clustering. We validate the proposed theory by demonstrating that the proposed coefficients discriminate effectively between different kinds of social relations. In particular, we show that they distinguish between friendship relations, which are typically driven by similarity or homophily, and health advice relationships, which can be expected to be driven by complementarity as an act of advice depends on a knowledge differential between an adviser and an advisee. |
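The structural coefficients introduced in this talk are its own contribution; as a rough off-the-shelf proxy for the intuition above, one can contrast triangle-based clustering with square (quadrilateral) clustering, both available in networkx, on a similarity-like graph and a complementarity-like graph.

```python
import networkx as nx

# Two toy graphs: a similarity-like graph rich in triangles (a clique),
# and a complementarity-like graph rich in quadrangles (a complete bipartite graph).
similarity_like = nx.complete_graph(6)
complementarity_like = nx.complete_bipartite_graph(3, 3)

for name, G in [("triangle-rich", similarity_like), ("quadrangle-rich", complementarity_like)]:
    tri = sum(nx.clustering(G).values()) / G.number_of_nodes()          # triangle density around nodes
    quad = sum(nx.square_clustering(G).values()) / G.number_of_nodes()  # 4-cycle density around nodes
    print(f"{name}: mean triangle clustering = {tri:.2f}, mean square clustering = {quad:.2f}")
```

The bipartite graph has no triangles but many 4-cycles, while the clique scores maximally on the triangle-based coefficient, matching the similarity-versus-complementarity contrast described above.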
15:30 | Network Classification Based Structural Analysis of Real Networks and their Model-Generated Counterparts PRESENTER: Roland Molontay ABSTRACT. Data-driven analysis of complex networks has been a focus of research for decades. An important area of research is to study how well real networks can be described with a small selection of metrics, and how well network models can capture the relations between graph metrics observed in real networks. In our paper, we apply machine learning techniques to investigate the aforementioned problems. We study 500 real-world networks along with 2,000 synthetic networks generated by four frequently used network models with previously calibrated parameters to make the generated graphs as similar to the real networks as possible. Our paper unifies several branches of data-driven complex network analysis, such as the study of graph metrics and their pair-wise relationships, network similarity estimation, model calibration, and graph classification. We find that the correlation profiles of the structural measures significantly differ across network domains and the domain can be efficiently determined using a small selection of graph metrics. The structural properties of the network models with fixed parameters are robust enough to perform parameter calibration. The goodness-of-fit of the network models highly depends on the network domain. By solving classification problems, we find that the models lack the capability of generating a graph with a high clustering coefficient and a relatively large diameter simultaneously. On the other hand, the models are able to capture the degree-distribution-related metrics exactly. |
15:45 | Node-layer duality: exploring the dark side of multilayer networks PRESENTER: Charley Presigny ABSTRACT. Multilayer networks have profound implications for the characterization of complex systems whose nodes exhibit multiple types of interactions. While common approaches have focused on the role of the nodes across layers (node-centric view), the role of layers across nodes has been poorly investigated (layer-centric view) (Fig. 1a). In other words, only one side of multilayer networks has been taken into account so far, thus ignoring the possibly relevant information contained in the other side. Here, we introduce the concept of node-layer duality and provide a new complementary characterization of multilayer networks from both nodewise (X) and layerwise (Y) perspectives. To obtain a first intuitive characterization, we adopt a comparative framework that computes the multilayer node degree-based Euclidean distances dX and dY between different networks. We show analytically, and confirm with extensive simulations, that the two sides always provide complementary information and that, depending on how the links are reorganized within and between layers, one side can discriminate between networks better than the other (Fig. 1b). Notably, both distances depend on the number of nodes N and layers M in the Erdös-Renyi multilayer network, i.e., dX ∝ M√N, dY ∝ N√M (Fig. 1c). Based on these findings, we then use dX and dY to propose a novel characterization of real multilayer networks, from transport to social systems. Results show the importance of considering the node-layer duality concept to uniquely characterize such real-world systems (Fig. 1d). In particular, we observe that the dX-dY space distinguishes well between spatial and non-spatial networks. Altogether, our results shed light on the node-layer duality properties of multilayer networks and provide the first tools to quantify their effect on the structure of real networks. |
16:00 | Quantifying the Topological Stability of a Simplicial Complex PRESENTER: Anton Savostianov ABSTRACT. Simplicial complexes are generalizations of classical graphs. Their homology groups are widely used to characterize the structure and the topology of data in chemistry, neuroscience, transportation networks, etc. Exploiting the isomorphism between homology groups and so-called higher-order Laplacian operators, our work investigates the complex's topological stability: how does the homology group change when some edges in the complex are perturbed? By introducing suitable weighted graph Laplacian operators, the question is formulated as a matrix-nearness problem, with a spectral objective function that suitably takes into account potential homological pollution due to eigenvalues inherited from previous groups. Given the initial Laplacian operator, we introduce a continuous flow over the edges of the initial simplex and we develop a bi-level optimization procedure that computes the nearest simplex (or, equivalently, the smallest edge perturbation) with a different homology by integrating an alternated matrix gradient flow. |
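A minimal numerical illustration of the homology-Laplacian correspondence the abstract builds on (not the authors' matrix-nearness or gradient-flow machinery): for a toy simplicial complex, the number of zero eigenvalues of the first Hodge Laplacian L1 = B1ᵀB1 + B2B2ᵀ equals the number of one-dimensional holes, so perturbing edges until an eigenvalue crosses zero is exactly a change of homology. The toy complex below is an assumption for illustration.

```python
import numpy as np

# Toy simplicial complex: 4 nodes, 5 edges, one filled triangle {1,2,3}.
# The empty triangle {0,1,2} is the single 1-dimensional hole we expect to detect.
# B1: node-edge incidence (edges oriented low -> high), B2: edge-triangle incidence.
B1 = np.array([  # columns: (0,1) (0,2) (1,2) (1,3) (2,3)
    [-1, -1,  0,  0,  0],
    [ 1,  0, -1, -1,  0],
    [ 0,  1,  1,  0, -1],
    [ 0,  0,  0,  1,  1],
], dtype=float)
B2 = np.array([[0], [0], [1], [-1], [1]], dtype=float)  # boundary of triangle (1,2,3)

assert np.allclose(B1 @ B2, 0)  # boundary of a boundary is zero

# First-order Hodge Laplacian; dim ker(L1) = first Betti number (number of 1D holes)
L1 = B1.T @ B1 + B2 @ B2.T
eigvals = np.linalg.eigvalsh(L1)
betti_1 = int(np.sum(np.isclose(eigvals, 0.0, atol=1e-10)))
print("spectrum of L1:", np.round(eigvals, 3))
print("beta_1 =", betti_1)  # expected: 1
```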
16:15 | igraph — the network analysis package PRESENTER: Szabolcs Horvát ABSTRACT. Originally released in 2006, igraph is one of the pioneering open-source software tools for the analysis of large-scale complex networks, as well as for graph theoretical computations. In the past two years, igraph began a process of redesign and rejuvenation, aiming to create a solid software foundation for serving the network science community's needs over the next decade. The goal of this presentation is to give insight into the development process of a modern network analysis library, provide opportunity for audience feedback, and encourage the broader network science community to participate. As an open-source project, igraph partly relies on volunteer contributions by its user base for long-term sustainability, and for keeping up with the latest developments in network science. igraph's design goals are to be high-performance and reliable, while staying easy to use and easy to contribute to. To this end, its algorithms are implemented in the C programming language, but the system is designed to be used primarily from high-level languages. Currently, official interfaces are provided for Python, R and Mathematica. A unique feature of the igraph C library is that it was designed from the ground up for deep integration into high-level host languages, providing features such as interruptible computations, native error handling, making use of the host language's random number generator, progress reporting, etc. Each interface is designed to fit in well with the unique features of its host language, and to feel familiar to its users. |
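A small usage example of the Python interface (python-igraph), illustrating the division of labour described above: graph generation, structural metrics and community detection all run in the C core, while the calling code stays high-level. Exact numbers depend on the random generator state.

```python
import igraph as ig

# Build a random graph with the C-core generator exposed through Python
g = ig.Graph.Barabasi(n=1000, m=3)

print(g.summary())                       # vertex/edge counts
print("diameter:", g.diameter())
print("max degree:", max(g.degree()))
print("global transitivity:", round(g.transitivity_undirected(), 3))

# Community detection (multilevel/Louvain) also runs in the C layer
communities = g.community_multilevel()
print("communities:", len(communities), "modularity:", round(communities.modularity, 3))
```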
16:30 | Towards big multilayer network data management PRESENTER: Georgios Panayiotou ABSTRACT. Multilayer networks (MLNs) are a popular model across a variety of disciplines for representing, manipulating and analyzing complex systems. A typical application is social networks, which often consist of millions of actors and associated relationships (e.g. online social networks, country-scale population networks). Considering the increasingly large size of typical MLN data, there is a prominent need to facilitate their efficient storage and access, which can allow cost-effective data preprocessing and retrieval, and thus support development of interactive analysis systems. However, current practices run into a multitude of issues. On the one hand, the vast majority of available MLN software focuses on analytical tasks like community detection and visualization. As such, they usually lack features necessary for big data management such as a principled query language, which can allow easily expressible and highly portable queries, for example dynamic generation of different views and aggregations. More importantly, the delicate balance between the libraries' underlying storage format scalability and process efficiency should be addressed, as performance of even simple layer manipulation operations greatly differs depending on data model choice. On the other hand, established database applications neither consider the layer as a first-class abstraction, nor provide appropriate algorithms to manipulate it, thus requiring additional overhead in attempting to model an MLN. Based on an attribute-extended version of the generic multilayer network model, we propose a taxonomy for layer-specific definition, manipulation and query operators covering and extending the ones found in major MLN software. We also discuss a reduced set of operators necessary for designing a layer-supporting framework for graph databases. Finally, we benchmark current libraries on operator performance, which can later provide a basis for spotlighting processes in need of optimization. |
14:15 | Networks of climate finance reveal systemic failure of fossil fuel divestment PRESENTER: Max Falkenberg ABSTRACT. Effective climate action is critically dependent on a rapid and sustained energy transition from fossil fuels to green energy. The banking sector is a key player in this, funding new energy projects totalling several hundred billion dollars each year. Using Bloomberg New Energy Finance data, we first identify key banks in the sector and show how energy investments have undergone a significant transition between 2010 and 2021, principally characterised by an increase in green investment, but with little evidence of a system-wide reduction in fossil fuel spend. Then, by developing a network model for the reassignment of capital, we show how the substitution effect, the phenomenon whereby the capital divested from one bank is replaced by new capital from a competing bank, prevents effective, system-wide divestment. We show that unless multiple major banks divest from the fossil fuel sector in parallel, the divestment of individual banks has little to no actual effect on the total value of fossil fuel projects which are funded in a given year. However, if banks are subject to regulations which restrict their fossil fuel investments according to the bank's own assets - for instance the “one-for-one” capital requirements rule recently proposed by the European Parliament but ultimately rejected - then the individual divestment of banks can have a non-zero impact on the sector, with a phase transition in divestment efficiency as the number of divesting banks increases. Our results highlight the need for collective action, stressing the importance of regulatory oversight to ensure that fossil fuel divestment at the banking level has the desired effect at the project level. |
14:30 | Complex dynamics of multihoming in dark web markets PRESENTER: Elohim Fonseca dos Reis ABSTRACT. Dark web marketplaces are unregulated online platforms that facilitate the trade of illicit goods among users. Due to their illegal nature, they exhibit unique dynamics not found in typical marketplaces. Users are offered anonymity, but have no protection from the dark markets. Despite closures from authorities or exit scams, the ecosystem continued to grow and has shown resilience as new markets are created and users migrate, resulting in dynamics metaphorically called a game of “Whack-a-Mole”. This resilience has typically been observed through global measures, such as traded volume or number of users. By developing a method to classify users either as sellers or buyers, we were able to uncover unseen complex dynamics emerging from the networks of transactions among those parties. We analysed the evolution of the dark market ecosystem over more than a decade. Our results are based on the temporal networks of the 32 largest dark markets by volume and the associated peer-to-peer (P2P) network of transactions between the users. We showed that the network of sellers is composed mostly of sellers that trade only in the P2P network, although most buyers trade only in markets. In particular, we analysed the dynamics of “multihomers”, defined as users that are simultaneously trading in more than one market. By identifying sellers and buyers among all traders, we could identify “multisellers” (i.e., multihomers that are sellers) and “multibuyers” (i.e., multihomers that are buyers). We found that the dynamics of multihomers has a central role in the connectivity of the ecosystem. Moreover, by studying the effects of external shocks, we observed that the ecosystem's resilience is mostly supported by the network of buyers rather than sellers. The major shock to the ecosystem was caused by a law enforcement operation by the end of 2017. The traded volume of the markets notably drops afterwards but rapidly recovers previous values and continues to grow. However, we show in the first row of the associated figure that the multiseller network, where nodes are active markets and edges are multisellers, suffered a structural change after the operation. Nevertheless, we found that the median net income of multisellers was persistently larger than that of non-multisellers throughout the whole period of observation. Conversely, the network of multibuyers exhibited strong resilience with almost imperceptible changes, as shown in the second row of the figure. We observed an intermediate resilience regime in the seller-to-seller (S2S) network, the network of transactions between sellers, as shown in the third row of the figure. Although the S2S network also suffers a structural change, unlike the multiseller network, it shows signs of recovery, albeit slower than the multibuyer network. These findings unveil complex patterns beneath the temporal networks of transactions in dark web marketplaces and the associated P2P network. By studying the networks and the dynamics of buyers and sellers, we characterize the resilience and evolution of the ecosystem of dark web markets. |
14:45 | Venture Capital Networks: Lead Investors Fit (& Win) PRESENTER: Marta Zava ABSTRACT. Venture capital markets are characterized by strong relationships built on networks and reputation, which have rarely been quantified and considered jointly. Using a novel methodological framework stemming from network theory, we represent and model how funding from investors to companies is provided across rounds. Based on a temporal adaptation of the topological overlap T, we introduce a node centrality measure, which we call the multiplier M, that captures an investor's power to catalyze investments in the companies she backs. We then use the M index as a measure of fitness in a temporal bipartite model, outlining the leader-follower dynamics embedded in investors' decisions. We validated the theoretical model with a pilot dataset comprising companies founded in California between 2015 and 2017 that received at least two rounds of investment, for a total of 20,173 deals. Results show that companies' success and having a high-performing investor in the first round are correlated. |
15:00 | Portfolio diversification using network science and machine learning PRESENTER: Miroslav Mirchev ABSTRACT. Maintaining a balance between returns and volatility is a common strategy for portfolio diversification, whether investing in traditional equities or digital assets like cryptocurrencies. One approach to diversification is the application of clustering, or community detection when relationships between assets are represented by a graph. We examine two graph representations, a standard distance matrix based on correlation (Cor) and another based on mutual information (MI). The Louvain (LV) and Affinity Propagation (AP) algorithms were employed for finding the network communities based on annual data on a minimal spanning tree (MST) graph representation. Furthermore, we examine building assets' co-occurrence graphs, where communities are detected for each month throughout a whole year and the links then represent how often assets belong to the same community. Portfolios are then constructed by selecting several assets from each community based on local properties (degree centrality), global properties (closeness centrality), or explained variance (PCA), with three value types (max, med, min), calculated on the MST or the fully connected community subgraph. Recently, graph neural networks (GNN) have received tremendous attention, so we also consider the application of graph convolutional networks (GCN) for portfolio construction represented as a quadratic unconstrained binary optimization problem, which maximizes return while minimizing volatility by penalizing assets that are close in the graph representation. We explored these various strategies on data from the S&P 500 and the top 203 cryptocurrencies with a market cap above 2M USD in the period from Jan 2019 to Sep 2022. Each stock/cryptocurrency portfolio is equally weighted and composed of 25/20 assets, respectively. The portfolios are built using annual data and evaluated on the following year. For comparison we also include a randomly chosen portfolio of the same number of assets, as well as a portfolio composed of all possible assets. This procedure is repeated 20 times with a monthly rolling window, and we analyze the averaged annual returns (%) vs. volatility for several selected strategies yielding maximal return or minimal volatility compared to other asset selection criteria from the same community detection procedure. GNNs could be applied to portfolio diversification through community detection, but it is more efficient to directly address the portfolio construction problem. Moreover, we extend our study and explore various other applications of GNNs for portfolio management by incorporating other knowledge besides asset values, such as sector information, supply-chain relationships, news sentiment, etc., thus building a multi-relational graph. |
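A compact sketch of one of the pipelines described above (correlation distance, MST, Louvain communities, then one asset per community by degree centrality), using synthetic returns in place of the S&P 500 and cryptocurrency data; the correlation-to-distance mapping and the "max centrality" selection rule follow standard practice and the abstract's description, while everything else (sizes, seeds) is an assumption.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
n_assets, n_days = 40, 250
returns = rng.normal(0, 0.01, size=(n_days, n_assets))   # synthetic daily returns (assumption)

corr = np.corrcoef(returns, rowvar=False)
dist = np.sqrt(2 * (1 - corr))                            # standard correlation-to-distance map

# Fully connected distance graph -> minimal spanning tree
G = nx.Graph()
for i in range(n_assets):
    for j in range(i + 1, n_assets):
        G.add_edge(i, j, weight=dist[i, j])
mst = nx.minimum_spanning_tree(G)

# Louvain communities on the MST topology, then pick the most central asset per community
communities = nx.community.louvain_communities(mst, weight=None, seed=0)
portfolio = []
for com in communities:
    sub = mst.subgraph(com)
    centrality = nx.degree_centrality(sub)
    portfolio.append(max(centrality, key=centrality.get))  # "max degree" selection rule

print(f"{len(communities)} communities -> portfolio of {len(portfolio)} assets: {sorted(portfolio)}")
```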
14:15 | Non-backtracking considerations for a modelling framework for the spread of epidemics on temporal networks PRESENTER: Philipp Hövel ABSTRACT. We present a modelling framework for the spread of epidemics on temporal networks from which both the Individual-Based (IB, [1]) and Pair-Based (PB) models can be recovered. A temporal PB (TPB) model is systematically derived using this framework [2], which offers an improvement over existing PB models. It moves away from edge-centric descriptions, such as the contact-based model, while the description is concise and relatively simple. For the contagion process, we consider a Susceptible-Infected-Recovered (SIR) model, which is realized on a network with time-varying edges. We demonstrate that the shift in perspective from IB to PB quantities enables exact modelling of Markovian epidemic processes on temporal networks that contain no more than one non-backtracking path between any two vertices. In short, the non-backtracking graph is the same as the standard graph of vertices and edges with the added constraint that only adjacent edges that are non-backtracking are included. To gauge the quality of the TPB model, we investigate the spreading on empirical networks. In addition, we consider the non-backtracking path density, which is defined as the ratio of non-backtracking cycles and paths. We show that the higher this ratio, the better the agreement of the TPB model with Monte-Carlo simulations of Markovian epidemic processes. Finally, we explore limit cases for which the ratio can be analytically calculated. For example, we show that the maximum non-backtracking cycle density at time step n is (n−1)/(1+2(n−1)), which tends to 0.5 for large times. [1] E. Valdano, L. Ferreri, C. Poletto, V. Colizza, Physical Review X 5(2), 021005 (2015). [2] R. Humphries, K. Mulchrone, J. Tratalos, S.J. More, P. Hövel, Applied Network Science 6(1), 23 (2021). PH acknowledges support by the Deutsche Forschungsgemeinschaft (German Research Foundation) – Project-ID 434434223 – SFB 1461. |
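A trivial numerical check of the closed-form expression quoted above for the maximum non-backtracking cycle density and of its limit of 0.5 for large times.

```python
# Maximum non-backtracking cycle density at time step n, as quoted above,
# and its convergence to 1/2 for large n.
def max_nb_cycle_density(n: int) -> float:
    return (n - 1) / (1 + 2 * (n - 1))

for n in (2, 5, 10, 100, 10_000):
    print(n, round(max_nb_cycle_density(n), 6))
# 2 -> 0.333..., 100 -> 0.497..., 10000 -> 0.49997...: tends to 0.5
```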
14:30 | Trading contact tracing efficiency for finding patient zero PRESENTER: Petter Holme ABSTRACT. As the COVID-19 pandemic has demonstrated, identifying the origin of a pandemic remains a challenging task. The search for patient zero may benefit from the widely-used and well-established toolkit of contact tracing methods, although this possibility has not been explored to date. We fill this gap by investigating the prospect of performing the source detection task as part of the contact tracing process, i.e., the possibility of tuning the parameters of the process in order to pinpoint the origin of the infection. To this end, we perform simulations on temporal networks using a recent diffusion model that recreates the dynamics of the COVID-19 pandemic. We find that increasing the budget for contact tracing beyond a certain threshold can significantly improve the identification of infected individuals but has diminishing returns in terms of source detection. We unravel a seemingly-intrinsic trade-off between the use of contact tracing to either identify infected nodes or detect the source of infection. This trade-off suggests that focusing on the identification of patient zero may come at the expense of identifying infected individuals. |
14:45 | Exact solution of Markovian SIR epidemics on heterogeneous networks PRESENTER: Massimo Achterberg ABSTRACT. We study heterogeneous, continuous-time Markovian Susceptible-Infected-Recovered (SIR) epidemics on a heterogeneous network with N nodes. Each node is either susceptible, infected or recovered, thus the total number of configurations is 3^N. We use a trinary numeral system to describe all possible configurations. The corresponding infinitesimal generator is shown to be upper-triangular, allowing for direct computation of eigenvalues and eigenvectors. Under the assumption of heterogeneous infection and curing rates, we show that the exact time-dependent solution can be computed. Our exact solution matches Monte Carlo simulations, but deviates substantially from the frequently used mean-field approximation. |
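A sketch of the trinary encoding idea, under the assumption that the digits are ordered S = 0, I = 1, R = 2: both allowed transitions (S→I on contact, I→R on curing) raise one digit and therefore the base-3 index of the configuration, so the infinitesimal generator built this way is upper-triangular and its eigenvalues can be read off the diagonal. The toy graph and rates are assumptions; the paper's exact ordering conventions may differ.

```python
import itertools
import numpy as np

# Small contact graph with heterogeneous rates (assumed toy values)
N = 3
beta = {(0, 1): 0.8, (1, 2): 0.5}        # infection rate along each undirected edge
beta.update({(j, i): r for (i, j), r in list(beta.items())})  # make symmetric
delta = [0.3, 0.2, 0.4]                  # curing rate of each node

S, I, R = 0, 1, 2                        # trinary digits; allowed transitions only raise a digit
index = lambda config: sum(c * 3**i for i, c in enumerate(config))

Q = np.zeros((3**N, 3**N))
for config in itertools.product((S, I, R), repeat=N):
    x = index(config)
    for i, state in enumerate(config):
        if state == S:                   # S -> I at the sum of rates from infected neighbours
            rate = sum(r for (j, k), r in beta.items() if k == i and config[j] == I)
        elif state == I:                 # I -> R at the node's curing rate
            rate = delta[i]
        else:
            continue
        if rate > 0:
            y = index(config[:i] + (state + 1,) + config[i + 1:])
            Q[x, y] += rate
            Q[x, x] -= rate

# Every transition increases the trinary index, so Q is upper-triangular
evals = np.sort(np.linalg.eigvals(Q).real)
print("upper-triangular:", np.allclose(Q, np.triu(Q)))
print("eigenvalues equal the diagonal:", np.allclose(evals, np.sort(np.diag(Q))))
```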
15:00 | Vaccination Strategies for COVID-19 in the Omicron Era: A Mathematical Model Analysis PRESENTER: Piergiorgio Castioni ABSTRACT. The emergence of Omicron variants, with their high rates of reinfection, has brought a new level of urgency to the already pressing issue of vaccine distribution. This heightened potential for further spread only amplifies the challenge of allocating limited vaccine doses effectively. To address this challenge, we conducted a mathematical analysis of the impact of age-stratified vaccine prioritization strategies. Our findings indicate that while vaccines have a short-term impact in reducing infections and fatalities, improper timing of distribution can result in a delayed decline in population immunity and a subsequent increase in the severity of a second wave. The key to success is balancing timely roll-out with optimal prioritization. Our model provides a means to compare the effects of various prioritization strategies across different contexts. |
15:15 | Epidemic spreading and contact tracing on clique networks PRESENTER: Abbas Karimi Rizi ABSTRACT. Contact tracing, the practice of isolating individuals who have been in contact with infected individuals to prevent the further spread of disease, is an effective and practical way of containing disease spread. Here, we show that this strategy is particularly effective in the presence of social groups: once the disease enters a group, contact tracing not only cuts direct infection paths but can also pre-emptively quarantine group members, thereby cutting indirect spreading routes, resulting in complex contagion-like dynamics within social networks that contain groups. We show these results using a deliberately stylized model that allows us to isolate the effect of these complex contagion-like dynamics within the clique structure of the network where the contagion is spreading. This allows us to derive mean-field approximations and epidemic thresholds to demonstrate the efficiency of contact tracing in social networks with small groups. Additionally, this allows us to show that two slightly different formulations of the dynamics yield similar results. Our results illustrate how contact tracing in real-world settings can be more efficient than what would be predicted by models that treat the system as fully mixed or the network structure as locally tree-like. |
15:30 | Statistical evidence of proximity-sensitive awareness in clusters of identical genetic sequences during the COVID-19 pandemic PRESENTER: Gergely Odor ABSTRACT. Understanding the impact of superspreading events on the outcome of an epidemic has become an important challenge in network science during the COVID-19 pandemic. Previous efforts in contact tracing and genetic sequence analysis revealed a few case studies where superspreading events in different social groups led to vastly different downstream infection patterns. We hypothesize that these discrepancies are, at least partially, caused by the different levels of proximity-sensitive awareness in different social groups (i.e., the willingness/ability to change individual behaviour based on information about the closeness of infection events in the social network), and that such differences are ubiquitous during a pandemic. To support our hypothesis, we consider an SIR epidemic model with an aware and a non-aware population mixed with a certain level of assortativity, and we analyse the genetic sequences that are produced by the infection under elementary mutation models. Simulations of percolation cluster sizes show that, for most parameter ranges, the size distribution of clusters of identical genetic sequences that become extinct during the epidemic is a power law, and that the exponent is higher for the aware population, suggesting that superspreading events are contained more quickly. Finally, we observe that the empirical cluster size distribution is a power law on the genetic sequence dataset downloaded from the GISAID platform in several countries as well, and that the exponent is higher in younger people living in metropolitan areas, supporting previous findings from a telephone survey of self-reported levels of proximity-sensitive awareness. Besides showing evidence for proximity-sensitive awareness, understanding cluster size distributions has further significance. Similar analyses are often performed in the field of phylodynamics with more involved methods. In turn, our proposed method based on cluster sizes provides a non-parametric, conceptually simpler and computationally more tractable way of drawing conclusions about human behaviour based on genetic datasets, paving the way for biostatistical pipelines without phylogenetic reconstruction, which are able to process the genetic datasets of unprecedented size created since the COVID-19 pandemic. |
15:45 | Comparing the efficiency of forward and backward contact tracing PRESENTER: Jonas L Juul ABSTRACT. Contact tracing is important when mitigating the spread of many infectious diseases. During the COVID-19 pandemic, much attention has been paid to the effectiveness (ability to mitigate infections) and cost efficiency (number of prevented infections per isolation) of different contact tracing strategies. In particular, one influential paper by Kojaku et al. investigated the effectiveness and efficiency of backward contact tracing. Kojaku et al., in addition to deriving a novel and interesting sampling bias related to contact tracing in populations, reported that backward contact tracing was “profoundly more effective” than forward contact tracing, that contact tracing effectiveness “hinges on reaching the ‘source’ of infection”, and that contact tracing outperformed case isolation in terms of cost efficiency. Here we show that these conclusions are not true in general. They were based in part on simulations that vastly overestimated the effectiveness and efficiency of contact tracing. Instead, we show that the efficiency of contact tracing strategies is highly contextual; faced with a disease outbreak, the disease dynamics determine whether tracing infection sources or new cases is more impactful. In their investigation of what makes contact tracing efficient in networked populations, Kojaku et al. carried out simulations that overestimated contact tracing efficiency and effectiveness. The culprit is Kojaku et al.'s choice of first simulating unhindered disease spread, and then implementing contact tracing on the obtained chains of infection (rooted directed trees indicating who infected whom). We show that simulating disease spread and contact tracing simultaneously, rather than separately, reduces the efficiency of contact tracing in Barabási-Albert (BA) networks by an order of magnitude: from Kojaku et al.'s approximate estimate of 20 prevented infections per isolation to a more modest 2 prevented infections per isolation (Fig. 1A). Next, we demonstrate that whether forward or backward contact tracing is more efficient and effective on BA networks can depend on the infectiousness profile (the curve specifying at what times infectious nodes are most infectious) for the disease in question. In our simulations, backward contact tracing is more efficient than forward contact tracing if the infected are equally infectious at any time in their course of disease (Fig. 1A). For a disease with an infectiousness profile like the empirically reported infectiousness of COVID-19 patients, forward contact tracing is preferable (Fig. 1B). We obtain similar results when simulating epidemics and contact tracing on other networks (Erdös-Rényi and person-gathering networks). Finally, we demonstrate that simple case isolation sometimes beats contact tracing in terms of efficiency and confirm the intuition that backward tracing increases in efficiency in networks with high degree heterogeneity. Faced with an epidemic, authorities will have to strike a balance between the efficiency and effectiveness of mitigation measures. Here, we demonstrate that whether forward contact tracing, backward contact tracing, or even case isolation without contact tracing is more efficient depends on the disease in question, contrary to recent reports. Even so, backward tracing could still be valuable as a means to uncover new branches of the transmission tree that could then be forward traced. |
16:00 | Effects of local interactions in epidemic outbreaks in networks of structured populations PRESENTER: Cédric Simal ABSTRACT. Spreading phenomena constitute a class of processes that models applications spanning from the diffusion of information to contagious diseases [1,2]. Depending on the spatial extension of the social interactions, mathematical modeling has focused either on the role that the pairwise interactions between individuals have on the overall dynamics or on the geometry of the spatial support where the agents move before they enter into contact. However, little is known about the effects that the local and global network structures simultaneously have on the dynamical outcomes, although the unequal interactions between these two parts might strongly affect the overall behavior. In this work, we introduce a mean-field model that explicitly considers the role that local interactions play in spreading models of structured populations [3]. Such a concatenated network model has been named a metaplex network [4,5]. Based on linear stability analysis, we show that the average degree of the local networks can capture their contribution to the global reaction-diffusion system and detect which of them drives the spreading of infection. Furthermore, we show that, counterintuitively, the final size of the epidemic is not necessarily proportional to the global degree of the node that spreads the infection. |
16:15 | The impact of timescale separation in the co-evolution of contagions and institutions PRESENTER: Jonathan St-Onge ABSTRACT. Epidemic models are used to study the spread of an undesired agent through a population, be it diseases infecting countries, misinformation adulterating social media, or pests blighting regions. In fighting these epidemics, we do not exclusively depend on either global top-down interventions or individual adaptations. Interventions often come from local institutions such as public health departments, moderation teams on social media, or other forms of group governance. We leverage recent development of institutional dynamics to investigate the intermediary scale of groups, which is understudied compared to macro-level top-down interventions or micro-scale adaptive individual behaviour. Using principles adapted from group selection theory, we model meso-scale groups attempting local control of an epidemic through adaptation based on successes and failures of other groups. This modeling approach results in a hypergraph model where institutions can emerge and grow on hyperedges (representing groups) to locally affect the epidemic dynamics. In this model, we find complex co-evolutionary dynamics which we summarize as five possible dynamical regimes (see figure). Across all regimes, we find that a faster rate of policy imitation leads to a higher steady-state prevalence. Fast imitation is beneficial in the early phase, but high reactivity does not give enough time for stronger policies to prove their efficacy, eventually leading to an abundance of weak institutions unable to control the epidemic. Additionally, the initial conditions determine the transient behaviors. In particular, if groups with stronger policies are present from the outset, the magnitude of epidemic waves is greatly reduced and slightly delayed in time. Altogether our results illustrate the complex dynamics missed by models that ignore the dynamical interplay of contagions with group interventions. |
16:30 | Effect of initial infection size on network SIR model PRESENTER: Guilherme Machado ABSTRACT. We consider the effect of a non-vanishing fraction of initially infected nodes (seed) on the SIR epidemic model on random networks (Machado et al., 2022). This is relevant when the number of arriving infected individuals is large, and may be applicable to other forms of contagion such as publicity campaigns. The SIR model is frequently studied by mapping it to a bond percolation problem, in which edges in the network are occupied with the probability $p$ of eventual infection along an edge connecting an infected individual to a susceptible neighbor. The connected component to which a single infected individual belongs matches the set of individuals that will eventually be infected (and eventually recover). This gives accurate measures of the final size of the infection and the epidemic threshold in the limit of a vanishingly small seed fraction. We show, however, that when the initial infection occupies a non-vanishing fraction $f$ of the network, this method yields ambiguous results, as the correspondence between edge occupation and contagion transmission no longer holds. Thus the calculated giant component does not accurately represent the size of the epidemic or the paths of infection through the network. We propose instead to measure the giant component of recovered individuals within the original contact network. This gives consistent behavior, and captures the dependence of the epidemic size on the seed fraction. We give exact equations for the size of the epidemic and the epidemic threshold in the infinite size limit. We observe a second-order phase transition of the same kind as found in the original formulation, however with a critical point (epidemic threshold) dependent on $f$. Small changes in $f$ can produce significant changes in this threshold. We give the phase diagram for the epidemic threshold as a function of the transmission probability $p$ and the seed fraction $f$. When the seed fraction $f$ tends to zero we recover the standard results. |
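A simulation sketch of the quantity proposed above, using a percolation-style SIR run (each edge transmits once with probability $p$) with a non-vanishing seed fraction $f$ on an Erdös-Rényi graph; the network size, mean degree and parameter values are assumptions for illustration.

```python
import random
import networkx as nx

def sir_final_recovered(G, p, f, seed=0):
    """One realisation of the percolation-style SIR: a fraction f of nodes starts
    infected, and each edge transmits (once) with probability p."""
    rng = random.Random(seed)
    infected = set(rng.sample(list(G.nodes), int(f * G.number_of_nodes())))
    recovered, frontier = set(), infected
    while frontier:
        new = set()
        for u in frontier:
            for v in G.neighbors(u):
                if v not in recovered and v not in frontier and v not in new and rng.random() < p:
                    new.add(v)
        recovered |= frontier
        frontier = new
    return recovered

G = nx.erdos_renyi_graph(20_000, 3 / 20_000, seed=1)   # mean degree ~ 3 (assumption)
p, f = 0.4, 0.05

recovered = sir_final_recovered(G, p, f)
# Giant component of recovered individuals *within the original contact network*,
# the quantity proposed above, compared with the total epidemic size
H = G.subgraph(recovered)
giant = max(nx.connected_components(H), key=len)
print(f"epidemic size = {len(recovered)/G.number_of_nodes():.3f}, "
      f"giant recovered component = {len(giant)/G.number_of_nodes():.3f}")
```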
14:15 | Twitter cascade reconstruction to find misinformation amplifiers PRESENTER: Matthew DeVerna ABSTRACT. Identifying problematic misinformation spreaders on Twitter is a difficult problem because the platform's data assume all retweets are of the original poster, or "originator." Accounts that often reshare content to larger audiences, i.e., "amplifiers," are rendered invisible. We address this challenge by presenting a novel approach to reconstruct social media cascades and then utilize it to investigate the role of amplifiers in disseminating misinformation on Twitter. The proposed "Probabilistic Diffusion Inference" (PDI) method relies on assumed probability distributions to weigh the likelihood of potential parents in a cascade. This approach is stochastic, allowing the generation and analysis of multiple versions of a single cascade, and can flexibly adopt any researcher-formulated probability distribution. Here we assume that accounts with more followers, and tweets that occurred more recently, have a greater likelihood of being retweeted. Probabilities generated from these assumptions are combined to select the parent of each retweet (i.e., the account that was actually retweeted). Our study defines misinformation at the source level [1,2,3,5]. Tweets linking to the 100 most shared low-credibility sources are gathered in real-time from Twitter between Apr--Jul 2020. We apply the PDI approach to 5K misinformation cascades randomly selected from the first week of each month. The reconstructed cascades are utilized to create a retweet network for each month. To account for the stochastic nature of PDI, we create 1K versions of each month's retweet network and report average results across these 1K networks. Our first analysis defines amplifiers as those who do not originate a misinformation cascade but who ultimately earn > 50% of that cascade's retweets. Our data suggest that 3.7--5.7% of all cascades contain amplifiers (See Fig. 1, left in PDF). However, further analysis reveals that many of these amplifiers are themselves originators of other misinformation cascades. Excluding these users, the fraction drops to 1.5--2.1% (See Fig. 1, right in PDF). Next, we measure the proportion of misinformation retweets from accounts via a network dismantling procedure that removes the highest-ranking accounts one-by-one, based on various metrics, and compare their performance to an optimal ranking [4]. Results suggest that finding superspreaders of misinformation does not require cascade reconstruction, as misinformation originators tend to amplify each other's content. This study introduces a flexible approach for reconstructing social media cascades and clarifies the role of amplifiers in the spread of misinformation on Twitter. Results show a modest percentage of misinformation cascades are fueled by amplifiers. Many of these accounts are originators of other misinformation, implying a tight-knit community of misinformation influencers. As a result, our analysis suggests that ranking users based on platform data outperforms reconstructed network rankings in finding superspreaders of misinformation. [1] A. Bovet et al. doi:10.1038/s41467-018-07761-2, 2019. [2] N. Grinberg et al. doi:10.1126/science.aau2706, 2019. [3] G. Pennycook et al. doi:10.1073/pnas.1806781116, 2019. [4] C. Shao et al. doi:10.1371/journal.pone.0196087, 2018. [5] C. Shao et al. doi:10.1038/s41467-018-06930-7, 2018. |
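A toy sketch of the parent-sampling idea behind cascade reconstruction: each retweet's parent is drawn from the earlier participants with a probability that grows with follower count and recency. The specific weighting used below (followers divided by time lag) and the account names are illustrative assumptions, not the authors' PDI distributions; the stochastic, multi-version aspect is mimicked by averaging over many sampled cascades.

```python
import random

def reconstruct_cascade(tweets, followers, seed=None):
    """tweets: list of (user, timestamp) ordered by time; the first entry is the originator.
    Returns one sampled parent assignment {user: parent}, weighting candidate parents
    by follower count and by recency (an assumed, illustrative weighting)."""
    rng = random.Random(seed)
    parents = {}
    for i in range(1, len(tweets)):
        user, t = tweets[i]
        candidates = tweets[:i]
        weights = [followers[u] / max(t - t_u, 1.0) for u, t_u in candidates]
        parent, _ = rng.choices(candidates, weights=weights, k=1)[0]
        parents[user] = parent
    return parents

# toy cascade (hypothetical accounts and follower counts)
tweets = [("originator", 0), ("amplifier", 5), ("user_a", 10), ("user_b", 12), ("user_c", 30)]
followers = {"originator": 1_000, "amplifier": 500_000, "user_a": 50, "user_b": 80, "user_c": 120}

# stochastic method: sample many cascade versions and average who earns the retweets
counts = {u: 0 for u, _ in tweets}
n_samples = 1_000
for s in range(n_samples):
    for parent in reconstruct_cascade(tweets, followers, seed=s).values():
        counts[parent] += 1
share = {u: c / (n_samples * (len(tweets) - 1)) for u, c in counts.items()}
print(share)  # the high-follower "amplifier" captures most retweets despite not originating
```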
14:30 | Predicting Persuasion Success with Network-Based Machine-Learning Tools PRESENTER: Agnes Horvat ABSTRACT. Persuasion is one of the most important challenges of social interaction in a variety of settings ranging from interpersonal conversations to political and marketing campaigns. The underlying process, however, is difficult to generalize without large-scale data because people have different ways of viewing the world and different people are swayed by different types and styles of arguments. Our study relies on a corpus of discussions from /r/ChangeMyView, an active community on Reddit that invites individuals to post an opinion that they accept may be flawed, in an effort to understand other perspectives on the issue. In this community, an original poster (OP) begins by stating their point of view and the reasoning behind their beliefs. Challengers are then invited to contest the OP's viewpoint. OPs explicitly recognize challengers' successful arguments by replying with the "Δ" character and explaining how and why their original viewpoint changed. Using natural language processing (NLP) techniques, we follow each discussion post, complete with its comment threads, to identify persuasive/non-persuasive challengers (i.e., those awarded/not awarded Δ points), their interaction patterns, and the linguistic properties of their arguments. While previous studies demonstrate that persuasive arguments are characterized by key linguistic factors (e.g., word count and the use of articles and pronouns) and patterns of interaction (e.g., challengers' entry time and number of comments in the discussion), less is known about how people's embeddedness in social networks impacts their ability to persuade others in group settings. To fill this research gap, we construct user interaction networks based on comments and their replies within each discussion post. In our main analysis, we evaluate the performance of five supervised classification models (Decision Trees, Random Forests, Adaptive Boosting, Logistic Regression, and Gaussian Naive Bayes) in predicting which challengers will succeed in persuading an OP to change their mind based on the challengers' (a) local network measures, (b) argument linguistic style, (c) linguistic matching with the OP's text (i.e., word interplay features), and (d) interaction with the OP. Using 5-fold cross-validation, we observed that the Random Forest classifier had the best performance (AUC = 0.93) with all features combined. Across the different feature categories, network-based features yielded the best performance (AUC = 0.90). Thus, network features play an essential role in persuasion. Our study provides new insights into the structure of real-world influence networks that emerge from group discussions and how individuals' local network properties (e.g., how well-connected a challenger is to other challengers in the network, as well as how important/credible/trustworthy a challenger is based on their ability to receive comments from other important challengers in the network) enable or inhibit their ability to persuade others. Broadly, we apply network science methods to provide a better understanding of social influence and persuasion in online discussions. |
14:45 | Improving discourse quality in social media networks: A large-scale, longitudinal study of influencing speech characteristics PRESENTER: Alina Herderich ABSTRACT. Introduction & General Aim. Political discussions in social media networks are often overshadowed by hate speech. Uncivil practices can keep people with moderate attitudes from participating in online discussions, distorting public opinion. Citizen-organized counter speech can break the cycle of hate speech on the Internet. While several counter speech strategies have been identified, their effectiveness is still largely unknown. We examine which speech characteristics are most effective in mitigating online hate. Methods. Our data set includes 130,127 conversations on German Twitter in response to posts of prominent news outlets from 2015 to 2018, with 1,150,469 tweets in total. We used autoregressive distributed lag (ARDL) models to analyze short- and long-term effects of four speech characteristics (a speaker's argumentative strategy, the presence of in- and outgroup thinking, the social psychological goal with respect to those groups, and the presence of emotions) in the networks formed by the replies of the discussion trees. As outcomes we defined the presence of hate speech, toxicity, and political extremity. We combined expert annotation with several twitter-xlm-roberta-base models to identify different speech characteristics and used pre-existing classifiers where applicable. Results. We show descriptive results for the development of speech characteristics over time. Offering opinions (not necessarily facts, but without insults) and sarcasm are negatively related to hate, toxicity and extremism. Other constructive comments such as providing information show mixed or even negative effects, possibly because of backfire effects. Comments including in- and outgroup thinking were negatively associated with discourse quality compared to neutral speech. Similarly, emotionally charged comments, positive (e.g., pride) or negative (e.g., anger), increased hate, toxicity and extremism. Conclusion. Building on an unprecedentedly large, longitudinal dataset, we provide new and nuanced insights into the effects of speech characteristics on discourse quality, useful for individuals and groups wishing to reduce hate in online spaces. |
15:00 | Who is driving the conversation? Analysing the nodality of British MPs and journalists on online platforms PRESENTER: Leonardo Castro Gonzalez ABSTRACT. In political science, nodality - the capacity to share and receive information - is one of the key tools that government uses to make policy. Nodality used to be something that government possessed almost by virtue of being government, as the “water mill” at the centre of society’s information networks [1]. But in a digital era, governmental actors face greater competition for nodality. Widespread use of the internet means that journalists, public figures and even citizens themselves can acquire nodality, which can challenge government’s capacity to capture public attention and to communicate efficiently with society at large. In every policy-related conversation, there is constant jostling for position. The nodality of an agent needs to be seen with respect to the other agents, and governmental actors have no monopoly on centrality. We study the nodality of British Members of Parliament (MPs) and journalists on Twitter in conversations around four different topics: 1) The Ukraine War, 2) the so-called Cost of Living Crisis, 3) Brexit and 4) the COVID-19 pandemic. Each topic is directly related to an important public policy issue facing the UK. Our research questions are: How can we measure nodality in online conversational networks, and how do network metrics contribute to this? Which set of actors has the most nodality for a given topic over time? How does the relative nodality of actors and topics change over time? We analyse Twitter data obtained for 581 MPs and 8606 journalists between January 14, 2022 and January 13, 2023. We classify tweets using a weak-supervision classifier [2] and construct a directed graph for each topic, where actors form nodes and any interaction from j to i is captured as an edge i → j. We perform a Granger-causality analysis on the activity time series of journalists and MPs (Figure 1a). We complement this analysis by exploring how well centrality measures can explain the nodality of actors, and which of these actors are the most “nodal” (Fig. 1b). Finally, we compare the interaction network to a configuration model and examine biases in the way journalists interact with political parties (Fig. 1c, where positive values indicate more interaction than expected at random). The final aim of this work is to understand how these actors are able to increase and deploy their nodality, drive the conversation and influence others on online platforms. |
15:15 | Supply, demand and spreading of news during COVID-19 and assessment of questionable sources production PRESENTER: Pietro Gravino ABSTRACT. We exploit the burst of news production triggered by the COVID-19 outbreak through an Italian database partially annotated for questionable sources [1]. We compare news supply with news demand, as captured by Google Trends data. We identify the Granger causal relationships between supply and demand for the most searched keywords, quantifying the inertial behaviour of the news supply. Focusing on COVID-19 news, we find that questionable sources are more sensitive than general news production to people’s interests, especially when news supply and demand are mismatched. We introduce an index assessing the level of questionable news production solely based on the available volumes of news and searches. Furthermore, we introduce an analysis of the spreading layer of the dynamics. Measured from social media data, it can represent the bridge of interplay between supply and demand. We contend these results can be a powerful asset in informing campaigns against disinformation and providing news outlets and institutions with potentially relevant strategies. |
15:30 | “Doing the Search”: Differential Tracking in Disinformation Websites and its Impact on Search Engine Results PRESENTER: José Reis ABSTRACT. Internet users are constantly being tracked and their behavioral data collected and used by third parties. Recent studies have exposed not only problematic differential and invasive tracking practices, but also how our online data is used to build a customized “bubble” of information shaping our worldview. Therefore, in this personalized world, where the spread of disinformation increases and its consequences are felt, one central question arises: What happens to our browsing experience when one clicks on fake news links? We aim to answer this question by studying (1) whether disinformation websites exhibit differential tracking behavior and (2) how previous consumption of disinformation websites impacts search engine results and recommendations. To do so, we have developed an ethical two-step audit relying on virtual agents, or bots, that mimic users browsing the web. Our preliminary results indicate that 1) disinformation websites are tracked more heavily by third parties and that 2) search engine results differ according to the history (disinformation vs. non-disinformation consumption) of the bot. |
14:15 | Characterising air transport delays through complex networks: results and challenges ABSTRACT. The characterisation of delay propagation is one of the major research topics in air transport management, due to its negative effects on the cost-efficiency, safety and environmental impact of this transportation mode. In spite of a substantial body of literature, the mechanisms underlying how and why delays propagate are still poorly understood, with all deployed mitigation policies being broad in scope, i.e., policies tend to penalise all delays, irrespective of their role in the global dynamics. The reasons for this can be traced back to the limitations inherent in simulation-based studies, including the limited availability of real data (e.g., on connecting passengers and on airlines' operational policies), the intrinsic uncertainty of the system's dynamics, and the difficulty of validating synthetic models. A better understanding of air transport architectural interactions may come from the study of how the system processes information. When aircraft travel between two airports, they do not only transport passengers and goods, but also transmit information about the status of the departure airport (and of the whole crossed airspace) to the destination. One airport receiving (possibly delayed) flights and dispatching them to other airports is not just managing the movement of the aircraft, but is also receiving, processing and retransmitting information about the system. A parallelism can be envisioned between the human brain and air transport: similarly to how information processing in the brain is studied using functional networks, delay propagation processes can also be studied from this point of view. Instances of delay propagation can be detected through causality metrics, such as, e.g., the celebrated Granger causality test, and then mapped as links of a complex network. Even if it comes with its own limitations, this approach allows representing the dynamics of the system in a simple and intuitive way, and the latter can further be characterised by relying on the many techniques available to quantify properties of networks. In this contribution we will review the main results and insights obtained using this and similar approaches in recent years. We will further discuss some open problems and challenges, including the use of different causality metrics; the different ways of reconstructing and pre-processing delay time series, and their effects on the results; and the possibilities offered by complex networks theory for guiding the deployment of more effective mitigation policies. |
14:30 | Modular gateway-ness connectivity and structural core organization in maritime network science PRESENTER: Mengqiao Xu ABSTRACT. Around 80% of global trade by volume is transported by sea, and thus the maritime transportation system is fundamental to the world economy. To better exploit new international shipping routes, we need to understand the current ones and their complex-systems association with international trade. We investigate the structure of the global liner shipping network (GLSN), finding it is an economic small-world network with a trade-off between high transportation efficiency and low wiring cost. To enhance understanding of this trade-off, we examine the modular segregation of the GLSN; we study provincial- and connector-hub ports and propose a definition of gateway-hub ports, using three respective structural measures. The gateway-hub structural-core organization appears to be a salient property of the GLSN and proves to be importantly associated with network integration and function in realizing the cargo transportation of international trade. This finding offers new insights into the complexity of the GLSN's structural organization and its relevance to international trade. |
14:45 | Feel Old Yet? Updating Mode of Transportation Distributions from Travel Surveys using Data Fusion with Mobile Phone Data PRESENTER: Eduardo Graells-Garrido ABSTRACT. Cities often lack up-to-date data analytics to evaluate and inform transport planning interventions to achieve sustainability goals, as traditional data sources are expensive, infrequent, and suffer from data latency. Mobile phone data provide an inexpensive source of geospatial information to capture human mobility at unprecedented geographic and temporal granularity. This paper proposes a method to estimate updated mode of transportation usage in a city, with novel usage of mobile phone application traces to infer previously hard to detect modes, such as bikes and ride-hailing/taxi. By using data fusion and matrix factorisation over a network of datasets, we integrate official data sources (e.g., travel surveys, census) with mobile phone data, reconstruct the official data using the model, and incorporate knowledge from digital traces into an updated dataset. We tested the method in a case study of Santiago (Chile) and inferred four modes of transportation: mass-transit, which includes all public transportation; motorised, which refers to cars and similar vehicles; active, which includes pedestrian and cycle trips; and taxi, which includes traditional taxi and ride-hailing applications. As a result, we quantified how much mass-transit reduced its share in most areas of the city, with exception of those where new metro/rail lines started to operate in the last five years; we also quantified how motorised transport increased throughout the city. Additionally, the taxi share increased, and the active share decreased as well, showcasing how Santiago is moving toward less sustainable transportation regardless of the interventions implemented in the city. The results were validated with official data from smart card transactions. |
15:00 | Bilevel optimization for flow control in optimal transport networks PRESENTER: Alessandro Lonardi ABSTRACT. Increasing infrastructure robustness while also ensuring high efficiency of transport is a compelling organizing principle of transportation networks at all scales. Nevertheless, the interplay between these two quantities is often poorly understood. We study their relation by formulating a bilevel optimization problem, which can be interpreted as the competition between the objectives of greedy passengers and a network manager. Passengers aim at minimizing their origin-destination travel cost (lower-level problem), whereas the network manager aims at guaranteeing global infrastructural efficiency (upper-level problem) by mitigating traffic bottlenecks. The key point is that the lower-level objective depends on parameters set at the upper level. For instance, a manager can introduce tolls or capacity limits to force passengers to redirect to lower-traffic routes. Hence the need for a bilevel optimization framework that entangles the two settings. To solve the problem, we propose BROT (Bilevel Routing on networks with Optimal Transport), a method based on optimal transport theory for the lower-level problem, and exploit projected gradient descent to optimize the upper-level one, conditioned on the lower-level constraints. In particular, we leverage the formalism of capacitated networks, treating passengers' flows as electrical currents, to find efficient closed-form updates of the edge capacities (measuring the infrastructure size needed to allocate passengers) and their weights (the cost of travel). Our model admits a clear interpretation. On one side, greedy passengers travel along the shortest path while disregarding each other's routes, hence high capacity is essential to allocate many passengers to central links of transportation infrastructures. On the other side, since congested links can lead to infrastructural failures, the network manager tunes the weights of the edges to incentivize passengers to move away from jammed connections. By alternately updating weights and capacities with our method, we are able to numerically obtain topologies that trade off between shortest-path but highly congested networks and distributed but longer-path networks. We validate our method on synthetic data and investigate several metrics to quantify the Pareto efficiency of BROT's optimal solutions. We then show how this approach can be applied to real networks, which illustrates how decision-makers could be informed on how to design carbon-efficient infrastructures with high transportation performance. This manuscript is in preparation. |
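The following is a toy alternation in the spirit of the bilevel scheme, not the BROT algorithm itself: passengers repeatedly take shortest paths under the current edge weights (lower level), then a hypothetical manager raises the weight of the most congested edges to push flow elsewhere (upper level). The network, OD pairs and step size eta are illustrative.

```python
import networkx as nx

G = nx.grid_2d_graph(5, 5)                         # toy street lattice
nx.set_edge_attributes(G, 1.0, "weight")
od_pairs = [((0, 0), (4, 4)), ((0, 4), (4, 0)), ((2, 0), (2, 4))]
eta = 0.2                                          # hypothetical manager step size

for it in range(20):
    # lower level: greedy passengers, each OD pair routed on the current weights
    flow = {tuple(sorted(e)): 0.0 for e in G.edges()}
    for s, t in od_pairs:
        path = nx.shortest_path(G, s, t, weight="weight")
        for u, v in zip(path, path[1:]):
            flow[tuple(sorted((u, v)))] += 1.0
    # upper level: increase the travel cost of the most congested edges
    max_flow = max(flow.values())
    for (u, v), f in flow.items():
        G[u][v]["weight"] += eta * (f / max_flow)

print("most penalised edges:",
      sorted(G.edges(data="weight"), key=lambda e: -e[2])[:3])
```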
15:15 | Variation in Cluster Formation between Free and Congested Flow in Urban Road Networks PRESENTER: Yongsung Kwon ABSTRACT. Human social activity hinges on a society's mobility infrastructure. In particular, traffic conditions on urban road networks strongly impact regular daily human activity, such as commuting at the city level. Accordingly, understanding traffic behavior has been a significant concern in urban planning. Percolation-based network analysis has contributed to understanding how traffic congestion spreads and is resolved on road networks. Percolation analysis can simulate the transition between two representative states: traffic congestion and free flow. Most previous studies have explored various transition behaviors but usually focused on only one of the two states. Comparatively little attention has been paid to comparing the two processes, for instance whether congestion and free flow behave similarly or not. To fill this research gap, we investigate and compare the percolating processes of the congestion and free flow states on the same road network. We empirically find differences between the two processes in terms of the giant connected component and suggest a hypothesis on the reasons behind the phenomena with a preliminary model study. We apply the percolation analysis to the traffic network of Chengdu, China, using taxi data from 1st June 2015 to 15th July 2015. We define the weight of a link as the ratio of the taxi speed on the road at each hour to the maximum speed observed during that day, assigning the traffic condition on the link at an hourly resolution. We conduct percolation simulations by occupying the roads in order of low weight for the traffic congestion model and high weight for the free flow model. Fig. 1(a) displays the emergence of a giant component as the link density p increases for the traffic jam model (or q for free flow). It is noteworthy that the cluster formation processes of the two cases differ significantly, which is shown as the gap between the two curves. The gap appears for all hours [the inset of Fig. 1(a)]. It also signals that the structures of the free-flow and jamming clusters differ from each other. As seen from the snapshots of the detailed structure in Fig. 1(b), the jamming cluster forms in the central area, while the free-flow cluster mostly appears near the outer ring of the city. We conjecture that the different cluster-forming behaviors may be attributed to spatial correlations between the speeds of roads, in other words, the tendency for traffic conditions to be similar on adjacent roads. To investigate the effect of the correlation, we generate weight-shuffled (uncorrelated) networks from the original ones. Figure 1(c) shows that the weight-shuffled networks have higher thresholds than most original networks. This indicates an earlier onset of the giant component in correlated networks, in line with the previous finding that the percolation threshold decreases in systems with positive correlations. The distinct difference in threshold points between the uncorrelated and original networks signifies that the correlations contribute to the different transition behaviors in empirical road networks. Next, we measure the correlation coefficient to understand whether it relates to the gap sizes. Figure 1(d) illustrates the weight-weight correlation as a function of the hopping distance d for the networks at 08:00 (rush hour and large gap) and 22:00 (non-rush hour and small gap).
It shows that the correlation level at 08:00, when the gap is large, is consistently higher than at 22:00. This result reinforces our hypothesis that correlation induces a difference in the cluster formation processes of congestion and free flow. However, further analysis is necessary, for instance of how the correlation is organized in detail; we thus need to clarify how each cluster formation process is related to the local correlation pattern. |
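A minimal sketch of the percolation procedure described above, assuming a weighted road graph where each edge weight is the hourly speed divided by the daily maximum (the network and weights below are synthetic stand-ins):

```python
import random
import networkx as nx

def giant_component_curve(G, ascending=True):
    """Occupy edges in order of weight (low first = congestion model,
    high first = free-flow model) and track the giant component size."""
    edges = sorted(G.edges(data="weight"), key=lambda e: e[2], reverse=not ascending)
    H = nx.Graph()
    H.add_nodes_from(G)
    curve = []
    for i, (u, v, w) in enumerate(edges, 1):
        H.add_edge(u, v)
        gcc = max(len(c) for c in nx.connected_components(H))
        curve.append((i / G.number_of_edges(), gcc / G.number_of_nodes()))
    return curve

random.seed(3)
G = nx.random_geometric_graph(200, 0.12, seed=3)   # stand-in road network
for u, v in G.edges():
    G[u][v]["weight"] = random.random()            # stand-in relative speed

jam_curve = giant_component_curve(G, ascending=True)    # congestion model
free_curve = giant_component_curve(G, ascending=False)  # free-flow model
```

Comparing the two curves on real data reproduces the jam/free-flow gap discussed in Fig. 1(a).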
15:30 | More Accurate Demand Forecasting in Open Systems using Cartograms PRESENTER: Sangjoon Park ABSTRACT. As deep learning has developed, it has also been applied to demand forecasting, for example for traffic and transportation. For traffic demand, graph neural networks (GNNs) have been used to capture not only demand in adjacent areas but also demand that arises over long distances, since trips often cover long distances. However, there are some difficulties in using GNN models: (1) high computational complexity with many nodes; (2) defining the structure of the adjacency matrix from real data. To reduce the computational cost, previous studies have used regional demand, i.e., the sum of the demand for shared transport at the individual level. Regional demand, however, only shows a general pattern in which shared transport is densely placed in space, and thus the characteristics of individual transport demand are ignored. These studies have also defined the adjacency matrix via the Pearson correlation coefficient between regional demands, because the actual data do not contain connection-structure information. However, structures helpful for demand prediction may exist beyond the one obtained from the Pearson correlation coefficient. We propose a cartogram method that reflects the characteristics of individual transport demand in the regional demand, and utilize a graph attention network (GAT) to address the problems above. The cartogram method spreads transport points less densely by using a Voronoi tessellation, which divides the space into polygonal cells each containing one transport point. After dividing the space, we move each point to the center of its polygon, resulting in a less dense spread of points. We evaluate the new method using public bicycle rental history in Seoul for the years 2018 and 2019. In Figs. 1(a) and (b), the stations are uniformly placed in space after using the cartogram method. We compare the performance before and after using the cartogram method by modifying the spatial-temporal convolutional graph attention network (ST-CGA Network). The ST-CGA Network contains a GAT that learns the weights of the edges. The model learns the degree of association between regions so as to predict demand well, expresses it as a weight (attention score), and identifies important connections according to this weight. We analyze the attention scores of the GAT, which play the role of the adjacency matrix. In Fig. 1(c), the high-value cluster contains stations in the outskirts of the city, while the low-value cluster contains stations in the city center. In the center of the city, demand is higher than in the outskirts, so outlying regions need more information to update their node representations than central regions. Therefore, the position of a region affects the structure. We obtain better performance and a better connection structure through the cartogram method and the ST-CGA Network. This study can be applied to other transportation modes, even when the connection structure is unknown. |
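A hedged sketch of the cartogram step only: build a Voronoi tessellation of the station coordinates and move each station to an approximate centroid of its cell, which spreads dense clusters apart. The coordinates below are synthetic, and unbounded boundary cells are simply skipped.

```python
import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(7)
stations = rng.random((50, 2))                 # stand-in bike-station coordinates

def voronoi_relax(points):
    """Move each point to the vertex average of its (bounded) Voronoi cell."""
    vor = Voronoi(points)
    relaxed = points.copy()
    for i, region_idx in enumerate(vor.point_region):
        region = vor.regions[region_idx]
        if -1 in region or len(region) == 0:
            continue                           # unbounded cell at the boundary: keep as is
        relaxed[i] = vor.vertices[region].mean(axis=0)
    return relaxed

spread_stations = voronoi_relax(stations)      # input to the forecasting model
```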
15:45 | Resource-driven movements of livestock herds: impact of climate change on network dynamics PRESENTER: Rowland Kao ABSTRACT. Contact patterns through resource-driven movements play a significant role in the dynamics of infectious disease spread between livestock populations. Due to the projected scarcity of resources resulting from climate change, policy change, and urbanisation, the rate and duration of contact will change in response to socio-environmental and climate change, which could intensify disease risk in rural Tanzania. An understanding of how populations are connected through the use of shared resources, and how these connections might change in the future, is therefore important when modelling infectious diseases of livestock. In this study, we developed and parameterized a spatially explicit, distance-dependent movement probability model of livestock herds to shared resources in an agropastoral community of northern Tanzania. Model parameters were estimated using Approximate Bayesian Computation (ABC), fitting the model to summary statistics of the observed network data, in each season, for each resource type (grazing and watering). Observed monthly village-to-village contact networks were generated from community participatory mapping carried out in a previous study between January 2016 and December 2017. The Normalized Difference Vegetation Index (NDVI) was used as a proxy for the availability of resource areas in each season. Estimated model parameters were used to simulate monthly bipartite networks for each resource type (grazing and watering), which were converted to both monthly and yearly village-to-village networks. Our model performed well in capturing key metrics of the yearly observed networks, including degree, betweenness, eigenvector centralities, network size, transitivity, and assortativity. Additionally, the monthly number of contacts and the distances between connected villages in the simulated networks were similar to those in the observed networks. Finally, we used estimates of NDVI values, based on a projection of temperature and precipitation using a previously developed statistical model, to assess changes in livestock movement configuration. Our model provides opportunities to explore the impacts of future changes related to climate and policy through the removal or alteration of resources. |
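A minimal ABC rejection sketch (not the authors' implementation): draw candidate parameters of a distance-dependent movement probability, simulate a village-to-resource network, and keep parameters whose summary statistics fall close to the observed ones. The distances, prior, summary statistics and tolerance are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
dist = rng.uniform(1, 20, size=(30, 10))       # stand-in village-to-resource distances (km)

def simulate(beta, dist):
    p = np.exp(-beta * dist)                   # movement probability decays with distance
    return (rng.random(dist.shape) < p).astype(int)

def summary(net):
    return np.array([net.sum(), net.sum(axis=1).mean()])  # n links, mean village degree

observed = simulate(0.3, dist)                 # pretend this is the mapped network
s_obs = summary(observed)

accepted = []
for _ in range(20000):
    beta = rng.uniform(0.0, 1.0)               # prior draw
    if np.linalg.norm(summary(simulate(beta, dist)) - s_obs) < 5.0:   # tolerance
        accepted.append(beta)

print("approximate posterior mean of beta:", np.mean(accepted))
```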
16:00 | Detecting the sensitive spots in the African interurban transport network PRESENTER: Andrew Renninger ABSTRACT. Transport systems are vulnerable to congestion, weather, and other events that create delays and disruptions or isolate entire regions (1). This situation is particularly relevant in parts of the African transport network for three reasons. First, the density of highways is relatively low, so few alternatives exist in the case of a network disruption. Second, the lack of resources and infrastructure makes the market for goods on the continent susceptible to increases in transportation costs. Third, violence against civilians in the region is increasing, and terrorist groups use the fragility of the network to increase their control of an area and the impact of their attacks. It is estimated that between 2001 and 2021, the number of casualties attributed to violence against civilians in Africa increased by 260% (5). Here, we measure the risk to the African transport network based on two separate indices: the intensity of future events μ and the impact ν of one event on the flow that travels through the network. To estimate the intensity of future events happening in city i, we use data from the Armed Conflict Location & Event Data project (ACLED), which monitors political violence across Africa, and other parts of the world, mainly from local media reports (5). We construct a self-exciting point process to determine the intensity of future events based on past events. To estimate the impact of an event, we use data from Africapolis and the network of all highways on the continent constructed from OpenStreetMap (2; 3; 4). Major highways and roads are considered within the urban network, so it is possible to compute a gravity-based estimate for the flow between any pair of cities and assign it to the shortest route between them based on existing infrastructure. We then measure the impact νi as the flow that would have to be rerouted if node i were removed. Thus, for each city in the urban network, we construct the intensity μi and its impact on the network flow νi (Figure). Results show that certain cities in the network have a high risk and increase the vulnerability of Africa's infrastructure. These cities have a high propensity for suffering subsequent violence against their civilians and, given their connectivity structure, they also substantially affect the overall regional functioning. We show that the removal of just 10 edges would require rerouting 50% of trips according to our model; looking at where conflict is likely, the top 100 edges by μ account for 17% of trips. Aggregating risk to cities, we find that the cities with the highest μν risk are those characterised by small or medium size and large degree, that is, cities that act as local or regional hubs. The correlation between intensity and impact is strongest in West and Central Africa, but there are also "crossing" links that join regions with both high impact and high conflict. We show that these areas tend to be characterised by the presence of terrorist groups like Boko Haram in Nigeria and Al-Shabaab in Somalia, with the key difference that Boko Haram operates at important junctures while Al-Shabaab exists at the fringe. |
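A hedged sketch of the impact index: assign gravity-model flows to shortest routes on the highway graph and ask how much of the assigned flow must be rerouted when a segment is removed. The cities, populations and road lengths below are toy values, and the gravity exponent is an assumption.

```python
import itertools
import networkx as nx

G = nx.Graph()
population = {"A": 5.0, "B": 2.0, "C": 1.0, "D": 3.0}      # millions (toy values)
roads = [("A", "B", 100), ("B", "C", 80), ("C", "D", 120), ("A", "D", 300), ("B", "D", 150)]
G.add_weighted_edges_from(roads, weight="km")

def assign_flows(G):
    """Gravity flow pop_i*pop_j/d^2 assigned to the shortest route of each city pair."""
    flow = {tuple(sorted(e)): 0.0 for e in G.edges()}
    for i, j in itertools.combinations(population, 2):
        d = nx.shortest_path_length(G, i, j, weight="km")
        path = nx.shortest_path(G, i, j, weight="km")
        for u, v in zip(path, path[1:]):
            flow[tuple(sorted((u, v)))] += population[i] * population[j] / d ** 2
    return flow

base = assign_flows(G)
total = sum(base.values())
for e in list(G.edges()):
    H = G.copy()
    H.remove_edge(*e)
    if nx.is_connected(H):
        share = base[tuple(sorted(e))] / total
        print(f"removing {e}: {share:.1%} of the assigned edge flow must be rerouted")
```

In the study itself, this flow-based impact ν is combined with the conflict intensity μ estimated from the self-exciting point process fitted to ACLED events.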
16:15 | Emergence of complex topologies from flux-weighted optimization of network efficiency PRESENTER: Sebastiano Bontorin ABSTRACT. Transportation and distribution networks are a class of spatial networks that have attracted considerable interest in recent years. These networks are often characterized by the presence of complex structures such as central loops paired with peripheral branches, which can appear both in natural and man-made systems, such as subway and railway networks. In this study, we investigate the conditions for the emergence of these non-trivial topological structures in the context of human transportation in cities. We propose a minimal model for spatial network generation, where a network lattice acts as a spatial substrate and edge velocities and distances define an effective temporal distance which quantifies the efficiency in exploring the urban space. Complex network topologies can be recovered from the joint optimization of network paths, and we study how the interplay between the flow probability between two nodes in space and the associated travel cost influences the resulting optimal network. From the perspective of urban transportation, we simulate these flows by means of human mobility models to obtain Origin-Destination matrices. We find that, when using simple lattices, the obtained optimal topologies transition from tree-like structures to more regular networks, depending on the spatial range of flows. Remarkably, we find that branches paired with large loop structures appear as optimal when the network is optimized for an interplay between heterogeneous mobility patterns of short-range travel and the longer-range travel typical of commuting. Finally, we show that our framework is able to recover the statistical spatial properties of the Greater London Area subway network. |
16:30 | Quantifying road network vulnerability by access to healthcare PRESENTER: Hannah Schuster ABSTRACT. The resilience of transportation networks is highly variable. Some components are crucially important: their failure can cause problems throughout the system. One way to probe a system for weak points needing reinforcement is via simulated stress tests. Network scientists have applied node or edge removal simulations to many systems: interbank lending markets, power grids, software networks, etc. Reliable transit via roads is especially crucial in healthcare: delays in travel to hospitals have a significant negative effect on patient outcomes including mortality [1]. Yet past studies of road network resilience focus on general mobility or on specific kinds of events like floods [2], and it is unclear how classical resilience analysis applies to geographically embedded road networks with homogeneous degree distributions. We address this gap by using a coarse-grained representation of the Austrian road network in which nodes are municipalities and edges connect municipalities directly linked by roads. We stress this network by removing individual edges and groups of edges in geographic clusters, observing the resulting changes in a population-weighted measure of hospital accessibility [3]. Under specific scenarios, certain segments play a critical role in accessibility. We also observe changes in the burdens on individual hospitals as road closures change which hospitals are nearest. These results are valuable for scheduling road maintenance, extending the road network, or evaluating hospital capacity. |
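A minimal stress-test sketch in the spirit of the abstract: remove road segments one at a time and recompute a population-weighted travel time to the nearest hospital. The toy network, travel times, populations and hospitals below are hypothetical.

```python
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([("v1", "v2", 10), ("v2", "v3", 15), ("v3", "h1", 5),
                           ("v1", "h1", 40), ("v2", "h2", 25)], weight="minutes")
population = {"v1": 4000, "v2": 1500, "v3": 800}
hospitals = ["h1", "h2"]

def mean_access_time(G):
    """Population-weighted travel time to the nearest reachable hospital."""
    total_pop, weighted = 0, 0.0
    for v, pop in population.items():
        times = [nx.shortest_path_length(G, v, h, weight="minutes")
                 for h in hospitals if nx.has_path(G, v, h)]
        if times:
            weighted += pop * min(times)
            total_pop += pop
    return weighted / total_pop if total_pop else float("inf")

baseline = mean_access_time(G)
for e in list(G.edges()):
    H = G.copy()
    H.remove_edge(*e)
    delta = mean_access_time(H) - baseline
    print(f"closing {e}: mean access time changes by {delta:+.1f} minutes")
```

Removing geographically clustered groups of edges follows the same pattern, with several adjacent edges removed before re-evaluating accessibility.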
15:30 | Investigation of the relationship between the existence of zombie firms and their positions in the business trade network PRESENTER: Rei Kinoshita ABSTRACT. Firms that are close to bankruptcy in terms of performance, but continue to exist thanks to loans from financial institutions, are called zombie firms [1]. In addition to low labor productivity, zombie firms have a negative impact on the macroeconomy by distorting the market. Such firms should be eliminated by market competition, but in practice they continue to survive. There is a great deal of public interest in the reality of zombie firms and the mechanisms by which they emerge. In this study, we investigate the reality of zombie firms using data from Tokyo Shoko Research, which covers information on 2 million Japanese firms. Existing studies on zombie firms focus only on their financial information and their relationships with financial institutions. In contrast, this study analyzes the positioning of zombie firms within the business-to-business network. The bankruptcy of a zombie firm affects its business partners, who may therefore support the zombie firm to prevent this from happening; this may be a factor that allows zombie firms to continue to exist. The data used in the analysis are financial data (2001-2020) and information on business partners (2011-2020) for approximately 2 million firms in Japan. The data include financial indicators based on balance sheets (B/S) and profit-and-loss statements (P/L), and the information on business partners is a list of each company's main suppliers and sales partners. Using the financial data, we first compared the following two existing definitions of "zombie firms" to see which captures the reality of zombie firms better, a question that has thus far not been quantitatively discussed. Definition (1): a firm whose Interest Coverage Ratio (operating income divided by interest expense) is less than 1 for three consecutive years. Definition (2): a zombie firm is defined using the prime rate (see [2]). Our analysis showed that definition (1) better captures the main characteristic of zombie firms, namely low productivity. Next, a directed transaction network was created for the 10 years from FY 2011 to FY 2020, in which firms are nodes and edges are drawn from supplier firms to their sales partners based on the business-partner data. We then categorized the firms into "zombie firms", "all firms", "zombie firms' suppliers (predecessors)", "zombie firms' sales partners (successors)", and "all other firms", and calculated the median network centrality index for each group (see Figure 1). The figure shows that the median value for all firms exceeds the median value for zombie firms in all years, indicating that zombie firms do not occupy important positions in the transaction network. The median of the sales partners ("successors") is larger than that of the supplier firms in all years for PageRank and eigenvector centrality, indicating that zombie firms' sales partners are more important in the network and are more closely connected to the important firms. Other findings of our study show that firms closely related to zombie firms tend to be in industries that are less susceptible to changes in the external environment, such as public institutions and the electricity, gas, heat supply, and water supply industries. This research is the first attempt to clarify the reasons for the survival of zombie firms, which should have been eliminated, from the perspective of networks of dependence on other firms.
If the structural characteristics of these dependence relationships are further clarified through our future analysis, we can expect to contribute to the search for the causes of economic stagnation. |
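As a hedged sketch of the comparison reported in Figure 1, the snippet below computes PageRank and eigenvector centrality on a supplier-to-customer network and compares group medians; the random network and the set of "zombie" nodes are purely illustrative stand-ins for the Tokyo Shoko Research data.

```python
import numpy as np
import networkx as nx

G = nx.gnp_random_graph(500, 0.01, directed=True, seed=5)   # stand-in trade network
zombies = set(range(50))                                     # hypothetical zombie firms

pr = nx.pagerank(G)
ev = nx.eigenvector_centrality_numpy(G)

suppliers = {u for z in zombies for u in G.predecessors(z)} - zombies   # predecessors
customers = {v for z in zombies for v in G.successors(z)} - zombies     # sales partners

for name, group in [("zombie firms", zombies), ("suppliers", suppliers),
                    ("sales partners", customers), ("all firms", set(G))]:
    print(name,
          "median PageRank:", round(float(np.median([pr[n] for n in group])), 5),
          "median eigenvector:", round(float(np.median([ev[n] for n in group])), 5))
```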
15:45 | A new method for mapping global phosphorus flows ABSTRACT. Phosphorus is one of the key elements in the production of fertilizers and thus in the production of food. Since phosphorus is constantly removed from the soil in the process of agricultural production, its reliable availability in the form of fertilizers is essential for food security and economic development. In this paper we present a new method to trace the flows of phosphorus from the countries where it is mined to the countries where it is used in agricultural production. We achieve this by combining data on phosphorus production with data on fertilizer use and data on international trade of phosphorus-related products. We show that by making certain adjustments to value-based data on net exports we can reconstruct the matrix of material phosphorus flows to a large degree, a result that is important for devising measures on sustainable development and environmental accounting, since it allows research on material flows to be connected to the analysis of international trade networks and supply chains. |
16:00 | Regional value trees in Europe PRESENTER: Jan Schulz ABSTRACT. Production processes have become more fragmented, interrelated, and global in recent decades, highlighting the need for input-output models of production. This paper models input linkages between disaggregated sectors as value trees and seeks to explore industrial and regional specificities in their topology. Value trees represent a hybrid between the stylized model of a value chain, where each producer has exactly one upstream and downstream connection, and a star network, where one central hub is connected to several counterparties which are in turn unconnected among themselves (also known as a hub-and-spoke configuration). We introduce a branching process that includes star- and chain-like structures as limiting cases but also nests trees as a generalization. This process is parsimonious and has merely two parameters: breadth and length. The process predicts a power-law relationship between the total number of nodes in the sub-tree rooted at an individual node (tree size) and the cumulative tree size, with a characteristic (allometric) scaling exponent of unity for the star network, two for the chain graph, and intermediate values for trees. Using EU regional input-output data at the NUTS-2 level, we employ a breadth-first search algorithm to construct value trees for European regions and for both backward (demand-side) and forward (supply-side) linkages. Empirical estimates confirm the theoretically predicted scaling law and tree structure of production for both perspectives, with estimates between unity and two (see Fig. 1). Supply-side linkages are significantly closer to an exponent of two and thus to a chain-like configuration, in line with the intuition that the supply-side perspective emphasizes long chains of processing inputs. We also document several regional and sectoral specificities in the scaling exponents. Among those, we show that industries located in the regional core of Europe rely on longer and more chain-like forward-linkage production structures than peripheral regions, indicating a larger degree of specialization. Indeed, in a sectoral disaggregation, the industrial sector exhibits the most chain-like forward linkages, with the raw materials sector being much more star-like. Raw materials can thus be interpreted as something close to a universal input for most sectors. These applications demonstrate the potential of the allometric scaling exponent as a summary statistic for production structure. |
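The allometric exponent can be made concrete with a small worked example: for every node of a tree, let A be the size of the subtree rooted there and C the sum of A over that subtree; the slope of log C versus log A is the exponent, close to one for a star and close to two for a chain, as stated above. The trees below are toy graphs, not the EU input-output value trees.

```python
import numpy as np
import networkx as nx

def allometric_exponent(tree, root):
    T = nx.bfs_tree(tree, root)                        # orient edges away from the root
    A, C = {}, {}
    for n in reversed(list(nx.topological_sort(T))):   # children before parents
        A[n] = 1 + sum(A[c] for c in T.successors(n))      # subtree size
        C[n] = A[n] + sum(C[c] for c in T.successors(n))   # cumulative subtree size
    xs = np.log([A[n] for n in T])
    ys = np.log([C[n] for n in T])
    return np.polyfit(xs, ys, 1)[0]                    # slope of log C vs log A

chain = nx.path_graph(200)
star = nx.star_graph(200)
print("chain exponent:", round(allometric_exponent(chain, 0), 2))   # close to 2
print("star exponent: ", round(allometric_exponent(star, 0), 2))    # close to 1
```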
16:15 | Structural correlations and economic decline spreading in international trade hypergraphs PRESENTER: Sudo Yi ABSTRACT. Networks consisting of pairwise links often fall short of representing the full interaction and relationship profile of complex systems, and there has recently been a surge of investigations of higher-order networks such as simplicial complexes and hypergraphs. The network approaches taken so far to international trade are thus probably incomplete, as trade involves numerous exporter and importer countries and traded products. In this study, we analyze annual international trade data from 1962 to 2018 to construct international trade hypergraphs consisting of triangular hyperedges representing the exporter-importer-product relationship, and investigate their structure and the dynamics on them [1]. First, by computing the background correlation that remains even in exponential random hypergraphs, which are maximally random while preserving the given empirical hyperdegree sequence, we identify the true correlations between the hyperdegrees of an exporter and its neighboring importers or products [Figs. 1(a) and (b)]. These results reveal a bias of exporters of low hyperdegree towards importers of high hyperdegree and products of low hyperdegree, which is not readily accessible in pairwise networks but is robustly identified in the trade hypergraphs every year. Regarding the dynamics of international trade, individual trade volumes fluctuate from year to year for various reasons, such as a short supply of raw materials or political issues. A large drop in the volume of an individual trade may be considered a local economic decline. We consider a hyperedge h "declined" if its trade volume W_h(t) decreases significantly over consecutive years, i.e., if r_h(t) ≡ log[W_h(t+1)/W_h(t)] < r_* for a negative constant r_*. The distribution of such declined hyperedges is found to be far from random: it is clustered and correlated with the trade volumes of the hyperedges [Fig. 1(c)]. By adopting the Susceptible-Infected-Recovered (SIR) model to simulate the spread of economic decline over hyperedges, with the infection rate decreasing with trade volume, we show that the correlation between the degree and weight (trade volume) of a hyperedge, identified in the empirical data, significantly affects the decline-spreading phenomena [Fig. 1(d)]. |
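A hedged sketch of the decline-spreading simulation: an SIR process over triangular hyperedges in which the probability of passing the "decline" to a neighbouring hyperedge decreases with that hyperedge's trade volume. The hyperedges, volumes and rate function are toy stand-ins for the empirical trade data.

```python
import random

random.seed(1)
# each hyperedge is (exporter, importer, product); two hyperedges are neighbours
# if they share at least one of the three nodes
hyperedges = [("US", "DE", "cars"), ("US", "FR", "cars"), ("CN", "DE", "steel"),
              ("CN", "FR", "steel"), ("US", "DE", "steel")]
volume = {h: random.uniform(1, 100) for h in hyperedges}

def neighbours(h):
    return [g for g in hyperedges if g != h and set(g) & set(h)]

def sir_decline(seed, beta0=0.8, steps=10):
    state = {h: "S" for h in hyperedges}
    state[seed] = "I"
    for _ in range(steps):
        new_state = dict(state)
        for h in hyperedges:
            if state[h] != "I":
                continue
            for g in neighbours(h):
                # infection probability decreases with the neighbour's trade volume
                if state[g] == "S" and random.random() < beta0 / (1 + volume[g] ** 0.5):
                    new_state[g] = "I"
            new_state[h] = "R"          # a declined hyperedge stops spreading
        state = new_state
    return sum(s != "S" for s in state.values())

print("hyperedges reached by the decline:", sir_decline(hyperedges[0]))
```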
16:30 | Chance, biases, and data incompleteness. Uncovering patterns in Roman time maritime trade. PRESENTER: Luce Prignano ABSTRACT. While network science has been applied to a wide range of research fields, there remain some domains where such applications are still emerging and not yet fully explored. In this work, we present a novel application of the network approach to the analysis of archaeological data. We present a paradigmatic case study based on a dataset of special interest for classical archaeologists: we analyzed the cargo composition of shipwrecks (2BC-7AD) to investigate the evolution of Mediterranean connectivity in Roman times. We considered amphoric types, not only because they are the most frequent category of findings on board, but mostly for the useful additional information associated with them: in particular, their provenance and the date range of their production are especially useful for the purpose of our study. The basic idea is that co-occurrences in the same shipwreck (links) of amphoric types (nodes) from different regions can be regarded as a proxy for the economic interactions between those regions. Hence, we built a bipartite network of contexts (shipwrecks) and categories of artifacts (amphoric types) in the corresponding assemblages and projected it onto one of the two classes of nodes. Additionally, we approximately estimated when such interactions took place by dating the cargoes from the overlap between the dating ranges of all the types of amphorae in them. We built five networks of amphoric types by slicing the chronological arc of the whole dataset into periods with a comparable amount of findings. However, the dataset is affected by some important biases concerning the geographical distribution of the shipwrecks (discovery-related issues) and the relative representation of different periods (conservation issues). To perform a rigorous, quantitative characterization of the evolution of the connectivity patterns of amphoric types, we devised an ad hoc re-sampling technique. This technique allowed us to generate a statistical ensemble of synthetic (randomized) cargoes that do not violate any of the constraints provided by the archaeological evidence, thus keeping all the biases unaltered. By adopting a probabilistic approach, we determined which features of the inter-regional connectivity patterns inferred from the empirical networks differ significantly from those observed in the projections of the randomized cargoes. In this way, we identified the features that cannot be explained as a mere product of chance combined with the effect of the biases in the data and can therefore be considered inherent to the historical dynamics. Finally, we compared the information inferred by means of our approach to the established historical knowledge about the evolution of trade connectivity in the Mediterranean. |
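A simplified sketch of the null-model logic (not the authors' exact resampling scheme, which preserves additional archaeological constraints): shuffle amphoric types across cargoes while preserving cargo sizes and type abundances, then compare empirical co-occurrence counts of provenance regions with the resampled ensemble. The cargoes and provenances below are illustrative.

```python
import random
from collections import Counter
from itertools import combinations

random.seed(0)
cargoes = [["Dressel1", "Lamboglia2"], ["Dressel1", "Dressel20", "Gauloise4"],
           ["Dressel20", "Gauloise4"], ["Lamboglia2", "Dressel1"]]
region = {"Dressel1": "Italy", "Lamboglia2": "Adriatic",
          "Dressel20": "Baetica", "Gauloise4": "Gaul"}

def region_cooccurrences(cargoes):
    c = Counter()
    for cargo in cargoes:
        for a, b in combinations(sorted({region[t] for t in cargo}), 2):
            c[(a, b)] += 1
    return c

empirical = region_cooccurrences(cargoes)
pool = [t for cargo in cargoes for t in cargo]    # type abundances are preserved
null = Counter()
n_samples = 1000
for _ in range(n_samples):
    random.shuffle(pool)
    it = iter(pool)
    shuffled = [[next(it) for _ in cargo] for cargo in cargoes]   # cargo sizes preserved
    null += region_cooccurrences(shuffled)

for pair, obs in empirical.items():
    print(pair, "observed:", obs, "expected under the null:", round(null[pair] / n_samples, 2))
```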
15:30 | Competition for popularity and interventions on a Chinese microblogging site PRESENTER: János Kertész ABSTRACT. Microblogging sites are important vehicles for users to obtain information and shape public opinion, and are thus arenas of continuous competition for popularity. The most popular topics are usually indicated on ranking lists. In this study, we investigate public attention dynamics through the Hot Search List (HSL) of the Chinese microblog Sina Weibo, where trending hashtags are ranked based on a multi-dimensional search volume index. We characterize the rank dynamics by the time hashtags spend on the list, the time of day they appear there, the rank diversity, and the ranking trajectories. We show how the circadian rhythm affects the popularity of hashtags, and identify categories of rank trajectories with a machine-learning clustering algorithm. By analyzing patterns of ranking dynamics using various measures, we identify anomalies that are likely to result from the platform provider's intervention in the ranking, including the anchoring of hashtags to certain ranks on the HSL. We propose a simple model of ranking that explains the mechanism of this anchoring effect. We found an over-representation of hashtags related to international politics at 3 out of 4 anchoring ranks on the HSL, indicating possible manipulation of public opinion. |
15:45 | Entropy-based detection of Twitter's echo-chambers PRESENTER: Fabio Saracco ABSTRACT. The process by which information reaches the virtual world favours the formation of homogeneous communities of individuals, polarised on the same opinions, known in the literature as echo chambers: 'a bounded, enclosed media space that has the potential to both magnify the messages delivered within it and insulate them from rebuttal' [1]. As pointed out in the reference above, the nature of echo chambers depends on two different phenomena: 1. in online social networks, users interact with similar ones; 2. similar users build their opinions on similar news sources. When the two phenomena appear at the same time, users form an echo chamber. In this presentation, we propose an entropy-based procedure to spot the presence of echo chambers on Twitter that is completely agnostic about the nature of the dataset. The procedure is pictorially represented in Fig. 1. We first aim at detecting groups of users sharing similar information diets: the bipartite network of users and shared URLs is projected onto the layer of users and then validated, using an entropy-based null model as a benchmark [2]. We then detect clusters of users interacting among themselves via retweets: communities of users sharing interests and/or political opinions can be detected, again using an entropy-based null model as a benchmark, as proposed in Ref. [3]. In the datasets we analysed, we found a limited presence of echo chambers, since only a few validated groups of users engage with the same pieces of news. Stated otherwise, most users' "information diets" are compatible with random noise and thus do not carry any signal. Remarkably, in most cases, the clusters of users grouped by significantly similar information diets focus their attention on non-reputable sources, as classified by NewsGuard [4]. We then compared these clusters with the discursive communities. It is remarkable that all clusters identified by users following the same information diets fall into the same discursive community, which has a clear political orientation. |
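The validation step relies on an entropy-based (maximum-entropy) null model; as a simpler stand-in that illustrates the same logic, the sketch below validates user-user links by testing the number of co-shared URLs against a hypergeometric null. The users, URL sets and significance level are illustrative, and no multiple-testing correction is applied.

```python
from itertools import combinations
from scipy.stats import hypergeom

shared = {                       # user -> set of shared URLs (toy data)
    "u1": {"a", "b", "c", "d"},
    "u2": {"a", "b", "c"},
    "u3": {"x", "y"},
    "u4": {"a", "x", "y"},
}
n_urls = len(set().union(*shared.values()))
alpha = 0.05

validated = []
for u, v in combinations(shared, 2):
    k = len(shared[u] & shared[v])                          # co-shared URLs
    # P(co-sharing >= k) if the two users had picked URLs independently at random
    p = hypergeom.sf(k - 1, n_urls, len(shared[u]), len(shared[v]))
    print(u, v, "co-shared:", k, "p-value:", round(p, 3))
    if p < alpha:
        validated.append((u, v))

print("validated similar-diet links:", validated)
```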
16:00 | Filter Bubble effect in opinion dynamics models PRESENTER: Giulio Iannelli ABSTRACT. Social media influence online activity by recommending to users content strongly correlated with what they have preferred in the past. In this way, they constrain users within filter bubbles that strongly limit their exposure to new or alternative content. We investigate this type of dynamics by considering a (multistate) voter model where, with a given probability λ, a user interacts with "personalized information" suggesting the opinion they have most frequently held in the past. By means of theoretical arguments and numerical simulations, we show the existence of a nontrivial transition between a region where consensus is reached and a region (above a threshold λc) where the system gets polarized and clusters of users with different opinions persist indefinitely. The threshold always vanishes for large system size N, showing that consensus becomes impossible for a large number of users. This finding opens new questions about the side effects of the widespread use of personalized recommendation algorithms. |
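A minimal simulation of the model described above (the network, number of opinion states, λ and number of updates are illustrative): with probability λ a user adopts the opinion they have held most often in the past, the "personalized information"; otherwise they copy a random neighbour, as in the standard voter model.

```python
import random
from collections import Counter
import networkx as nx

random.seed(3)
n_states, lam, steps = 3, 0.4, 100_000
G = nx.erdos_renyi_graph(200, 0.05, seed=3)
nodes = list(G)
opinion = {i: random.randrange(n_states) for i in nodes}
history = {i: Counter([opinion[i]]) for i in nodes}

for _ in range(steps):
    i = random.choice(nodes)
    if random.random() < lam:
        opinion[i] = history[i].most_common(1)[0][0]          # filter-bubble update
    elif G.degree(i) > 0:
        opinion[i] = opinion[random.choice(list(G[i]))]       # standard voter update
    history[i][opinion[i]] += 1

print("final opinion shares:", Counter(opinion.values()))
```

Sweeping λ and the system size in such a simulation is one way to locate the threshold λc separating consensus from persistent polarization.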
16:15 | Counterfactual Bots PRESENTER: Homa Hosseinmardi ABSTRACT. Socio-technical systems have raised questions about whether what we observe on platforms such as Instagram or YouTube reflects society or the platform's design, i.e. its algorithms. It is difficult to determine which is more likely, but when it comes to technology's disturbing consequences, it is easy to overlook the role that human behavior plays. To what extent do recommendations influence engagement? This study seeks to disentangle the role of recommendation algorithms from the intentions of users. Specifically, we design experiments utilizing logged-in accounts (bots) that mimic user behaviors, personalized by following real YouTube viewership trajectories. A commonly held perception is that recommender systems systematically lead users to increasingly ideologically biased content, yet previous research on the subject has not reached consensus on the matter. On the one hand, audits simulating user behavior on YouTube [Brown et al. 2022; Haroon et al. 2022] have found that, as users blindly follow the recommender system, they indeed receive recommendations that are increasingly ideologically biased. On the other hand, studies with real user traces show that the consumption of highly partisan content on YouTube is driven by a combination of user preferences and platform features [Hosseinmardi et al. 2021; Chen et al. 2022]. Hosseinmardi et al. (2021), for instance, indicate that far-right content is often accessed through external websites and not at the end of long, recommendation-driven sessions; and Chen et al. (2022) further found that subscriptions (i.e., when users ask to receive updates about a YouTube channel) play a big role in how users consume extreme content on the platform. A possible explanation for these seemingly contradictory findings is that existing audits of YouTube's recommender system do not meaningfully disentangle the role of the algorithm from user preferences, e.g., a majority of YouTube users might prefer right-leaning content. Using bots, i.e., automated programs that simulate user behavior on YouTube, we estimate the influence of recommender systems on what users consume by contrasting recommendations obtained when bots follow real user behavior (which we obtain from user traces) and when they blindly follow algorithmic recommendations. Unlike previous work, we are able to compare the content users are recommended when randomly following the recommender system with a meaningful counterfactual: the content they have consumed while surfing the Web (and, naturally, interacting with the algorithm according to their own preferences). Specifically, we trained a set of four bots to watch a real user's history for N_train = 60 videos (the personalization phase); then three bots continue watching N_heldout = 60 videos based on a predefined rule while one continues following the real user's trace. The experiments are conducted on ten users representing different news viewership archetypes, and we repeat each experiment 20 times per user. We observe that in all experiments the highest concentration is around the center, and the recommended videos are more moderate when bots are guided by one of the algorithmic paths (Fig. 1). |
Vito Latora (Queen Mary University of London, UK): The dynamics of social systems with higher-order interaction