Shlomo Havlin (Bar-Ilan University, Israel): Interdependent Networks: Novel Physical Phase Transitions
Mirta Galesic (Santa Fe Institute, USA; Complexity Science Hub, Austria): Dynamics of belief networks
11:00 | Higher-order Laplacians for graph embedding PRESENTER: Franziska Heeg ABSTRACT. The Laplacian matrix is fundamental in machine learning applications on graphs because its eigenvectors provide a natural and principled way to obtain vector representation of nodes, i.e., embeddings. The Laplacian describes a heat diffusion process on a network, representing how heat (or information) diffuses through the network topology through direct and indirect interactions, i.e., through edges and paths, respectively. In this context, the eigenvectors relate to the temperature distribution of nodes in the diffusion process; therefore, they contain information on how heat spreads through the graph via sequences of edges, i.e., via paths. A simple diffusion process is often too simple to characterize the rich behaviour of dynamical processes in real networked systems. This is, for instance, true for sequential or temporally resolved interactions, such as time-stamped social interactions, click stream data, financial transactions, or dynamic ecological networks. A simple diffusion process implicitly assumes that we can obtain paths based on a transitive expansion of edges, i.e., given an edge from a to b and an edge from b to c, we can always obtain a path abc by transitively extending edges. However, numerous works have underlined that real paths from temporal networks differ from those obtained based on the transitivity assumption. This discrepancy is problematic for various applications of network science, and it questions the validity of vector space representations obtained based on standard graph models. Specifically, a representation obtained from a standard Laplacian matrix cannot encode information on the sequential ordering in which nodes are traversed. In this work, we address this issue by proposing embeddings of temporal networks that are based on a higher-order generalization of graph Laplacians. 
The method uses higher-order network models, which encode sequential patterns in causal walks or paths in temporal networks, using an approach that resembles higher-order Markov chains. First, we apply standard spectral methods to the topology of the higher-order network. Then, we define a projection that maps vector representations obtained for nodes in a higher-order network to a time-aware vector representation of nodes in the first-order network. Our results on a synthetic example of causal paths in a lattice network show that, unlike the standard Laplacian, the higher-order Laplacian generated by our method recovers over- and under-representations of certain paths by stretching the embedding in the corresponding directions. Given the fundamental role played by the Laplacian matrix in network science and machine learning applications on networks, and the growing interest in higher-order networks, we expect our contribution to be of broad interest to the NetSci community. We would thus be happy to present our work in an oral contribution. |
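The gap between observed causal paths and the transitivity assumption can be made concrete with a small sketch. The code below is illustrative only and is not the authors' implementation: the function names and the toy `paths` list are hypothetical. It counts second-order transitions (a, b) -> (b, c) that actually occur in the observed paths, and compares them with the two-step paths obtained by transitively chaining first-order edges.

```python
from collections import Counter

def second_order_edges(paths):
    """Count second-order transitions (a, b) -> (b, c) seen in observed paths."""
    counts = Counter()
    for path in paths:
        for a, b, c in zip(path, path[1:], path[2:]):
            counts[((a, b), (b, c))] += 1
    return counts

def transitive_two_paths(paths):
    """All two-step paths a-b-c implied by chaining first-order edges."""
    edges = {(u, v) for path in paths for u, v in zip(path, path[1:])}
    return {(a, b, c) for (a, b) in edges for (b2, c) in edges if b == b2}

# Hypothetical causal paths: everything passes through b, but walkers
# arriving from a always continue to c, and walkers from d continue to e.
paths = [("a", "b", "c"), ("a", "b", "c"), ("d", "b", "e")]

observed = {(a, b, c) for ((a, b), (_, c)) in second_order_edges(paths)}
implied = transitive_two_paths(paths)

# Two-step paths that the first-order model assumes but the data never shows.
missing = sorted(implied - observed)
print(missing)
```

In this toy case the first-order model implies the paths a-b-e and d-b-c, which never occur in the data; a second-order model keeps only the observed transitions.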
11:15 | Starting a Fire with Twigs: Influence of Encapsulation Relations on Bottom-up Dynamics on Hypergraphs PRESENTER: Timothy LaRock ABSTRACT. Hypergraphs are important network models for representing interactions occurring between two or more nodes simultaneously. Given that hyperedges may be of arbitrary size, smaller hyperedges may or may not be subsets of larger hyperedges. While some work has begun to address the question of hyperedge overlap, questions about how overlapping hyperedges influence dynamical processes remain open [1,2,3]. We study the influence of the subset relationship between hyperedges of different sizes, which we call encapsulation, on the spread of complex contagion. For two hyperedges e and e' with sizes |e| < |e'|, we say that e' encapsulates e if the smaller hyperedge e is a subset of the larger hyperedge e'. We expect this relationship to occur in many higher-order interaction datasets such as co-authorship, where an author A may write one paper each with authors B and C. If the 3 authors collaborate together on a paper, the third interaction encapsulates the first two. After verifying in empirical data that encapsulation is a key property of real-world hypergraphs, we study a particular kind of complex contagion process for which we show that encapsulation structure is vital to spreading. Our work builds on advances in the study of dynamical processes on higher-order structures, including the relationship between spreading dynamics on hypergraphs compared with simplicial complexes, where encapsulation relationships are implied [4,5]. The spreading in the dynamical process we study occurs at both the node and hyperedge level; each node and hyperedge is in a binary state, either inactive or active. At each timestep in the process, a hyperedge is chosen randomly and activated if the number of already-activated nodes within the hyperedge is larger than a threshold. 
We show that when this threshold is based on the size of the hyperedge, encapsulation relationships are necessary for spreading to happen from the lowest to highest orders. To use an analogy to building a campfire, smaller hyperedges are twigs, which need to be lit in order to light larger kindling (mid-sized hyperedges), which in turn must be lit before the logs (largest hyperedges) can catch fire. In Figure 1, we illustrate the effect of encapsulation in a toy hypergraph model on N=8 nodes. Two 4-node hyperedges h_1 and h_2 with 3 common nodes are fixed. We then add an increasing number of 2-node hyperedges (horizontal axis) randomly, sampled in two different ways. In uniform sampling, we choose from all of the 28 possible 2-node hyperedges with equal probability. In biased sampling, we sample from the same set of edges with probability proportional to the maximum overlap of the 2-node hyperedge with either of the fixed 4-node hyperedges: the probability of sampling e is proportional to the size of its largest intersection with h_1 or h_2. To avoid scenarios where full activation is impossible, we only run simulations on sampled hypergraphs that are connected. Results on biased hypergraphs show that only a relatively small number of 2-node hyperedges are required before more than half of the simulations result in full activation, whereas under uniform sampling one more edge is required to reach similar results. Since we require that our sampled hypergraphs are connected, this suggests that it is not only connectivity that is driving the biased hypergraphs towards full activation. Instead it is a combination of connectivity and encapsulation among hyperedges that determines the feasibility of bottom-up activation. Our presentation will expand on these toy model results with simulations on large random and empirical hypergraphs. 1. Chodrow. Configuration Models of Random Hypergraphs. Journal of Complex Networks. 2020;8(3). 2. Sun and Bianconi. 
Higher-order Percolation Processes on Multiplex Hypergraphs. Physical Review E. 2021; 104(3). 3. Lee, Choe, Shin. How Do Hyperedges Overlap in Real-World Hypergraphs? - Patterns, Measures, and Generators. The Web Conference. 2021; 3396-3407. 4. Zhang, Lucas, and Battiston. Do Higher-Order Interactions Promote Synchronization? arXiv:2203.03060. 2022. 5. Iacopini, Petri, Baronchelli, and Barrat. Group Interactions Modulate Critical Mass Dynamics in Social Convention. Communications Physics. 2022; 5(1). |
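The role of encapsulation in bottom-up activation can be sketched in a few lines. This is a simplified illustration, not the authors' simulation: it assumes a hyperedge activates once at least |e| - 1 of its nodes are active (one possible size-based threshold), that an activated hyperedge activates all of its nodes, and it sweeps deterministically over hyperedges until nothing changes instead of picking hyperedges at random. The function name and the two toy hypergraphs are hypothetical.

```python
def spread(hyperedges, seeds):
    """Activate hyperedges until no further change; return the active nodes.

    Assumed rule (illustrative): a hyperedge e activates once at least
    |e| - 1 of its nodes are active, and then activates all of its nodes.
    """
    active_nodes = set(seeds)
    active_edges = set()
    changed = True
    while changed:
        changed = False
        for e in hyperedges:
            if e in active_edges:
                continue
            if len(e & active_nodes) >= len(e) - 1:
                active_edges.add(e)
                active_nodes |= e
                changed = True
    return active_nodes

# Nested chain: the pair is inside the triple, which is inside the 4-edge.
nested = [frozenset({0, 1}), frozenset({0, 1, 2}), frozenset({0, 1, 2, 3})]
# Same sizes, but the triple no longer encapsulates the pair.
not_nested = [frozenset({0, 1}), frozenset({1, 2, 3}), frozenset({0, 1, 2, 3})]

full_nested = spread(nested, seeds={0})
full_not_nested = spread(not_nested, seeds={0})
print(full_nested, full_not_nested)
```

With a single seed node, the nested chain activates every node, while the second hypergraph stalls after the pair because no larger hyperedge has enough active members.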
11:30 | PRESENTER: Yujie Zeng ABSTRACT. Simplicial complexes have recently been in the limelight of higher-order network analysis, where a minority of simplices play crucial roles in structures and functions due to network heterogeneity. However, it remains elusive how to characterize simplices’ influence and identify vital simplices of order p (termed p-simplices), despite the relative maturity of research on vital nodes (0-simplices) identification. Meanwhile, graph neural networks (GNNs) are potent tools that can exploit network topology and node features simultaneously, but they struggle to tackle higher-order tasks. In this paper, powered by GNN techniques, we propose hierarchical simplicial convolutional networks (HiSCN) to identify vital p-simplices incorporating real influence scores derived from samples or propagation simulations. It can tackle higher-order tasks by leveraging novel higher-order representations: hierarchical bipartite graphs and higher-order hierarchical (HoH) Laplacians, where targeted simplices are grouped into a hub set and can interact with other simplices. In addition, HiSCN employs learnable graph convolutional operators in each HoH Laplacian domain to capture interactions among simplices, and it can identify influential simplices of arbitrary order by changing the hub set. Empirical results demonstrate that HiSCN significantly outperforms existing methods in ranking both 0-simplices (nodes) and 2-simplices. In general, this novel framework excels in identifying influential simplices and promises to serve as a potent tool in higher-order network analysis. |
11:45 | Mapping biased higher-order walks reveals overlapping communities PRESENTER: Anton Holmgren ABSTRACT. Flow-based community-detection methods uncover communities that trap network flows for a relatively long time. Conventionally, they model flows as first-order Markov chains with memoryless random walks. But for many systems, memoryless random walks cannot accurately describe the dynamics on the network, and the identified flow-based communities fail to capture the actual organization. For example, when modeling citation flows between journals using random walks without memory, multidisciplinary journals such as Nature and Science tend to cluster with the single research field where they are most prominent. In contrast, modeling citation flows with second-order Markov chains based on two-step citation paths reveals the multidisciplinary journals' presence in multiple fields. How can we identify these overlapping flow-based communities that capture real organization when no multistep path data are available but only links? To reveal overlapping communities, we model higher-order processes on standard first-order networks with biased random walks. Inspired by the biased random-walk model proposed in the representation learning algorithm node2vec, we let the previously visited link guide the next step. This model lets us control the effective community sizes by varying the so-called return parameter p and in-out-parameter q. But instead of simulating walks, we express the higher-order dependencies directly using so-called state nodes. To avoid the curse of dimensionality in expressing every possible second-order path, we use a variable-order model that only introduces state nodes where the information gain is significant. Using the map equation framework, we partition the variable-order networks into overlapping communities that trap higher-order flows. To evaluate the performance of our method, we analyze synthetic and real-world networks. 
We can recover the planted overlapping communities in the synthetic networks and find overlapping communities in the real-world networks. We illustrate our method in Fig 1: a subset of the S. cerevisiae protein-protein interaction network containing Zds1 and Zds2 gene products. With our approach, we assign Zds1 to all three complexes, while Zds2 splits between the orange and green complexes. For both gene products, most flow belongs to the green complex, where conventional community detection would assign them. Our approach is a scalable alternative to overlapping community-detection methods such as clique percolation. It enables researchers to model and map higher-order dependencies in ordinary networks with tunable community sizes and overlap. |
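As a reminder of how the node2vec-style bias works, the sketch below computes the next-step probabilities of a walker that has just traversed the link prev -> curr. It follows the standard node2vec weighting (1/p for returning, 1 for moving to a common neighbor of prev, 1/q for moving further away) on an unweighted graph. The function name and the toy graph are hypothetical, and this is not the authors' state-node implementation, only the transition rule that such state nodes would encode.

```python
def biased_step_probs(adj, prev, curr, p=1.0, q=1.0):
    """Next-step probabilities from curr, given the walker arrived from prev."""
    weights = {}
    for nxt in adj[curr]:
        if nxt == prev:
            weights[nxt] = 1.0 / p      # return to the previous node
        elif nxt in adj[prev]:
            weights[nxt] = 1.0          # stay close: neighbor of prev
        else:
            weights[nxt] = 1.0 / q      # move outward, away from prev
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}

# Hypothetical undirected toy graph given as neighbor sets.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}

# The walker has just stepped a -> b; large p discourages returning,
# small q encourages exploring away from a.
probs = biased_step_probs(adj, "a", "b", p=2.0, q=0.5)
print(probs)
```

Each (prev, curr) pair plays the role of a state node: its outgoing probabilities depend on the previously visited link, which is what makes the process second-order.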
12:00 | Analysing multiscale clustering with persistent homology PRESENTER: Dominik Schindler ABSTRACT. In many clustering applications it is desirable to find not only a single partition but a sequence of partitions that describes the data at different scales. For multiscale clustering methods such as Markov Stability (MS) analysis, the problem then becomes determining representative partitions from a long sequence of semi-hierarchical partitions. We address this problem of scale selection from the perspective of topological data analysis (TDA) and define a novel filtration, the Multiscale Clustering Filtration (MCF), which measures the level of hierarchy in the 0-dimensional persistent homology and tracks the emergence and resolution of conflicts between cluster assignments in the higher-dimensional persistent homology. Numerical experiments on synthetic networks show that robust partitions are located at distinct gaps in the persistence diagram, meaning that "good partitions resolve many conflicts". To our knowledge, MCF is the first TDA application to semi-hierarchical clustering, and a sequential application of MS and MCF provides an alternative to the purely combinatorial clique complex for inferring higher-order network interactions. |
12:15 | Higher-order interactions: an efficient heuristic to identify synergistic associations in big data PRESENTER: Stavroula C. Tassi ABSTRACT. Complex systems are characterized by different types of relationships between their interacting parts. These relationships can involve higher-order interactions at different levels of hierarchy and magnitude. During the past decades networks have been used as an interpretation tool of complex systems; however, they often consist only of pairwise interactions, or include higher-order interactions which are computed directly from the pairwise interactions. Synergistic interactions, on the other hand, involve more than two variables working together to produce a combined correlation that is greater than the sum of their individual correlations. To understand these interactions in the context of big data, researchers have focused on multivariate extensions of Shannon’s mutual information to quantify and uncover relationships that go beyond simple pairwise links [1]. Recently, O-information has emerged as a promising metric for capturing synergistic phenomena, and has gained recognition for its ability to efficiently distinguish the synergy-redundancy (shared) balance in a system [2]. Synergistic interactions are truly complementary to pairwise (or lower-order) interactions, in the sense that they strictly cannot be computed from them, unlike for instance the simplices resulting from the well-known topological data analysis. This implies that a naive approach to identify such interactions suffers from combinatorial explosion, since all possible combinations of triplets, quadruplets, and so on, would have to be tested independently. This makes the approach unfeasible for large data sets. To avoid this computational hurdle, we leverage a network-based approach, which decomposes the problem and computes pairwise interactions based on conditional entropies to predict where highly synergistic interactions could be found. 
Specifically, our approach, which is illustrated in Fig. 1, uses triangle detection in this specific pairwise network to identify the strongest synergistic 3-way combinations (triplets). The approach was developed and tested on a subset of a high-dimensional dataset, the Cardiovascular Risk in Young Finns Study (YFS) [3]. Applying the proposed approach to 1,673 participants and 99 features, we achieved a recognition rate of about 85%. This rate is stable across a wide range of parameters. Our approach can easily generalize to even higher-order interactions, as long as the data is rich enough to ensure statistical significance. Our heuristic makes the analysis of these interactions in real data feasible, which is crucial for exploring hidden structures, identifying key drivers, and understanding the different commonalities of complex systems. |
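For readers unfamiliar with O-information, a small worked example may help. The sketch below uses the standard definition, Omega = (n - 2) H(X) + sum_j [H(X_j) - H(X without X_j)], estimated from empirical frequencies, where negative values indicate a synergy-dominated system and positive values a redundancy-dominated one. It is a brute-force illustration on toy data, not the network-based heuristic of the abstract; the function names and the two toy datasets are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2

def entropy(samples):
    """Shannon entropy (bits) of the empirical distribution of samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())

def o_information(rows):
    """Omega = (n - 2) H(X) + sum_j [H(X_j) - H(X without X_j)], in bits."""
    n = len(rows[0])
    total = (n - 2) * entropy(rows)
    for j in range(n):
        marginal = [r[j] for r in rows]
        rest = [r[:j] + r[j + 1:] for r in rows]
        total += entropy(marginal) - entropy(rest)
    return total

# Toy triplet 1: z = x XOR y. No pair is correlated, yet any two
# variables determine the third (purely synergistic).
xor_rows = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]
# Toy triplet 2: three copies of the same bit (purely redundant).
copy_rows = [(x, x, x) for x in (0, 1)]

omega_xor = o_information(xor_rows)
omega_copy = o_information(copy_rows)
print(omega_xor, omega_copy)
```

The XOR triplet gives Omega = -1 bit and the copied bit gives Omega = +1 bit. Evaluating this quantity for every triplet of 99 features would require 156,849 separate estimates, which illustrates the combinatorial cost that a pairwise pre-screening step aims to avoid.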
12:30 | Synergistic Hypergraphs: A Method for Removing Unnecessary Hyperedges in Information-Theoretic Hypergraphs through O-information. ABSTRACT. Synergistic interactions are a prevalent phenomenon in complex systems and can take many forms. A clear sign of a synergistic effect can be seen when independent variables individually cannot predict a target (zero correlation), but when considered together they can (non-zero correlation). It can also be seen when the multivariate correlation of three or more variables together is larger than the sum of the corresponding pairwise correlations. Information-theoretic hypergraphs have been proposed to capture such synergistic associations that are often missed in popular pairwise-based association networks. However, the abundance of synergistic interactions in real-world data, and density of the corresponding hypergraphs, makes it challenging to analyse these hypernetworks and identify the most critical synergistic interactions that impact network dynamics. In this work, we propose an approach for filtering out interactions in synergistic hypergraphs, based on the O-information heuristic, without loss of information. By comparing overlapping information in synergistic hyperedges, we keep only the most salient ones and construct a lower-density synergistic hypergraph. This approach also provides a heuristic to determine the order of certain hyperedges to be included in the network, where hyperedges are merged, removed, or retained depending on their overlapping information content. We compare our lower-density synergistic hypergraphs to dense synergistic hypergraphs computed without filtering and traditional networks based on dyadic associations, using synthetic and real medical datasets. Our results show that this approach reduces the hypergraph density by approximately 30%. 
These sparser hypernetworks are effective in identifying the most important synergistic interactions that offer a better understanding of the system when considered in conjunction with the dyadic-association networks. Our approach has the potential to provide a more accurate representation of complex systems with synergistic relationships. |
12:45 | XGI: A Python package for higher-order interaction networks PRESENTER: Nicholas Landry ABSTRACT. CompleX Group Interactions (XGI) is a library for higher-order networks, which model interactions of arbitrary size between entities in a complex system. This library provides methods for building hypergraphs and simplicial complexes; algorithms to analyze their structure, visualize them, and simulate dynamical processes on them; and a collection of higher-order datasets. XGI is implemented in pure Python and integrates with the rest of the Python scientific stack. XGI is designed and developed by network scientists with the needs of network scientists in mind. We demonstrate this library's effectiveness by choosing a higher-order dataset and then analyzing it. We start by extracting basic statistics of this dataset such as its size, degree distribution, and density. We also calculate structural measures such as connectedness, assortativity, and centrality and compute different statistics of these metrics. We show not only how to easily represent higher-order datasets with different data structures, but also how to use statistics to easily filter the dataset. Lastly, we visualize this dataset using the visualization capabilities of XGI. |
11:00 | Fourier Transform of temporal networks PRESENTER: Alain Barrat ABSTRACT. Very diverse systems, such as people in contact with each other, the connection of neurons, or public transportation, can be represented as temporal networks [1]. These networks potentially evolve on multiple temporal scales that cannot always be observed through the simple observation of the temporal evolution of standard measures such as the instantaneous density of links. Furthermore, interaction activities, motifs, or even features of a given temporal network can vary and recur on different time-scales. Some methods have already been proposed to find recurrences on networks [2, 3], focusing on exact similarities of labelled edges through time, without considering structural similarities at intermediate or global scales. Here, we address the problem of measuring the characteristic time-scales of structural and density changes in temporal networks. To tackle this issue, we introduce a new methodological pipeline to detect periodic changes in time-varying structures. First, we extract sub-networks of the initial temporal network using a sliding time-window of a certain time span and at a certain frequency: this yields a time series of temporal sub-networks G_T(n) (n = 1, ...). We then represent each G_T(n) as a static network through two approaches: the supra-adjacency (SA) method [4] and the event-graph (EG) method [5] (obtaining G_SA(n) and G_EG(n) for each n). Next, we compute the dissimilarity between each pair of consecutive static networks (G_SA(n) and G_SA(n + 1), or G_EG(n) and G_EG(n + 1)), using the method of [6], to obtain the dissimilarity functions D_SA(n) and D_EG(n). Finally, we compute the Fourier transform of these signals. We first test our methods on synthetic networks with periodic variations of their density and their structure. 
While both methods indicate the correct periods, the SA method is better at detecting changes in the density of the network, and the EG method identifies structural changes more efficiently. We use both methods to measure time-scales in empirical temporal networks (Figure 1) and obtain results that could not be obtained by looking only at simple interaction counts. Our new methods are able to measure the periods of temporal networks and may help to better understand the different phenomena occurring at multiple time-scales on them. |
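The last step of the pipeline, reading a characteristic period off the Fourier transform of a dissimilarity signal, can be sketched as follows. This is a minimal illustration with a naive discrete Fourier transform and a synthetic signal; the function name and the synthetic series are hypothetical, and the network representations and dissimilarity measure of the abstract are not reproduced here.

```python
import cmath
import math

def dominant_period(signal):
    """Period (in samples) of the strongest non-zero frequency in signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # drop the zero-frequency term
    best_k, best_power = None, -1.0
    for k in range(1, n // 2 + 1):
        # Naive DFT coefficient at frequency k / n.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k

# Hypothetical dissimilarity series D(n) that repeats every 8 windows.
signal = [1.0 + 0.5 * math.sin(2 * math.pi * t / 8) for t in range(64)]
period = dominant_period(signal)
print(period)
```

For long series a fast Fourier transform routine would replace the quadratic loop; the naive version is used here only to keep the sketch dependency-free.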
11:15 | Randomized reference models for temporal networks PRESENTER: Christian Lyngby Vestergaard ABSTRACT. Empirically measured networks and dynamic processes that take place in these situations show heterogeneous, non-Markovian, and intrinsically correlated topologies and dynamics. This makes their analysis particularly challenging. Randomized reference models (RRMs) have emerged as a general and versatile toolbox for studying such systems. Defined as random networks with given features constrained to match those of an input (empirical) network, they are notably used as null models for hypothesis testing and, more generally, to investigate the relationship between different network features and their roles in dynamic phenomena. RRMs are typically implemented as procedures that reshuffle an empirical network, making them very generally applicable. However, while a multitude of different randomization techniques are found in the literature, the effects of most shuffling procedures on network features remain poorly understood, rendering their use non-trivial and susceptible to misinterpretation. We propose a unified framework for classifying and understanding microcanonical RRMs (MRRMs) that sample networks with uniform probability. Focusing on temporal networks, we use this framework to build a taxonomy of MRRMs that proposes a canonical naming convention, classifies them, and deduces their effects on a range of important network features. We furthermore show that certain classes of MRRMs may be applied in sequential composition to generate new MRRMs from existing ones. As an application of the framework, we show how to apply a series of MRRMs to analyze how different network features affect a dynamic process in an empirical temporal network. 
Our taxonomy provides a reference for the use of MRRMs, and the theoretical foundations laid by the framework may further serve as a base for the development of a principled and automatized way to generate and apply randomized reference models for the study of networked systems. Reference: Gauvin et al., "Randomized reference models for temporal networks." SIAM Review 2022, 64:763--830. arXiv:1806.04032 |
11:30 | Flow of temporal network properties under local aggregation and time shuffling PRESENTER: Didier Le Bail ABSTRACT. Most studies of temporal networks have focused on either instantaneous structures or on the network aggregated on the whole timespan. Reshuffling procedures to create null models are also usually performed on the whole network timeline. Intermediate time scales have not received as much interest so far. Here we propose to characterize a temporal network by its behaviour under two transformations that are local in time: (i) a local time shuffling, which consists in dividing the timeline of the temporal network into successive blocks of size b, and then independently shuffling the snapshots within each block; this destroys correlations at time scales smaller than b but preserves large time scales (intuitively, it is the temporal-network analog of a low-pass filter acting on the time correlation function); and (ii) a local temporal aggregation on successive time windows of length n. The flow of the network statistical properties, as a function of the aggregation and time shuffling scales, constitutes a new way to characterize temporal networks at multiple time scales. It allows us to detect characteristic time scales in empirical data sets, as well as differences between temporal networks with otherwise very similar statistical properties. This characterization tool, namely comparing the way the model and the empirical data’s properties transform when flowing from one time scale to another, could thus be used to validate models of temporal networks at multiple time scales, in addition to the usual comparison between the model and the data’s statistical properties. |
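The two local transformations can be sketched directly on a list of snapshots. The code below is an illustrative reading of the description, not the authors' code: a temporal network is assumed to be a list of edge sets (one per time step), aggregation is assumed to be an unweighted union of edges within each window, and the function names and toy data are hypothetical.

```python
import random

def local_time_shuffle(snapshots, b, seed=None):
    """Shuffle the order of snapshots independently within blocks of size b."""
    rng = random.Random(seed)
    shuffled = []
    for start in range(0, len(snapshots), b):
        block = snapshots[start:start + b]  # slicing copies the block
        rng.shuffle(block)
        shuffled.extend(block)
    return shuffled

def local_aggregate(snapshots, n):
    """Merge each window of n consecutive snapshots into one edge set."""
    return [set().union(*snapshots[start:start + n])
            for start in range(0, len(snapshots), n)]

# Hypothetical temporal network: one set of edges per time step.
snapshots = [{(0, 1)}, {(1, 2)}, {(0, 1)}, {(2, 3)}, {(0, 3)}, {(1, 3)}]

shuffled = local_time_shuffle(snapshots, b=2, seed=0)
aggregated = local_aggregate(snapshots, n=3)
print(shuffled)
print(aggregated)
```

Shuffling with block size b leaves the content of every block unchanged, so any statistic computed on windows that are multiples of b is preserved, while ordering information inside a block is destroyed.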
11:45 | Lyapunov Exponents for Temporal Networks PRESENTER: Annalisa Caligiuri ABSTRACT. By interpreting a temporal network as a trajectory of a latent graph dynamical system, we introduce the concept of dynamical instability of a temporal network, and construct a measure to estimate the network Maximum Lyapunov Exponent (nMLE) of a temporal network trajectory. Extending conventional algorithmic methods from nonlinear time-series analysis to networks, we show how to quantify sensitive dependence on initial conditions, and estimate the nMLE directly from a single network trajectory. We validate our method for a range of synthetic generative network models displaying low and high dimensional chaos, and finally discuss potential applications. |
12:00 | A longitudinal data-driven causal discovery analysis for depressive disorders, incorporating domain experts' perception of underlying causal interrelations PRESENTER: Angela Koloi ABSTRACT. A Causal Graph (CG) is estimated to identify potential causes of depressive disorders as well as promising targets for prevention strategies. Depressive disorders, modifiable risk factors and lifestyle attributes were selected from the Netherlands Study of Depression and Anxiety (NESDA). This study proposes a new method for learning CG by (i) extending a well-known causal discovery algorithm to handle longitudinal cohort data via temporal cohort structure and (ii) integrating domain experts’ knowledge. |
12:15 | Mapping flows on multilayer networks with incomplete layers PRESENTER: Jelena Smiljanic ABSTRACT. Detecting communities in multilayer networks becomes challenging when the observed data are incomplete, containing missing links in one or several layers: community detection methods can overfit and find spurious communities, giving misleading system characterizations. Here we propose an approach to regularize flows in multilayer networks that reduces overfitting in the flow-based community detection method known as the map equation. The multilayer map equation framework models network flows using a higher-order Markov chain approach such that the random walker's next step depends on the current layer. It can assign nodes to several communities and enables capturing overlapping communities across layers. However, modelling flows on incomplete network data can lead to the detection of spurious communities because the random walker's transition rates are distorted. We have recently proposed a Bayesian approach that overcomes the problem of overfitting in single-layer networks with incomplete data. Here we generalise our approach and derive a Bayesian estimate of the random walker's transition rates to regularize flows on multilayer networks. We assume a random network without modular structure and corresponding random-walker transitions between any two nodes such that the map equation detects multilayer communities only if there is sufficient evidence in the empirical data. In multilayer networks with no interlayer links, the map equation uses a relax rate that allows the random walker to move between layers. Relaxing the random walk is necessary to handle the interplay between layers, but there is no principled approach to selecting an optimal value. The relax rate affects the random walk's dynamics, and a bad choice can deteriorate the quality of the identified community structure. 
The map equation with regularized flows overcomes this issue by making a prior assumption about the random walker's transitions between layers. As a result, the relax rate becomes superfluous. We find that the map equation with regularized flows outperforms the standard map equation and enables more robust community detection in incomplete multilayer networks. In our analysis, we use synthetic multilayer networks with planted community structure, and study how community detection performance changes as we vary the number of observed links. With enough observations available, both the standard map equation and the map equation with regularized flows detect robust communities in multilayer networks. However, revealing significant communities becomes challenging when insufficient network data is available: The standard map equation can overfit and identify spurious communities, irrespective of the relax rate. In contrast, without the need to choose a relax rate, our generalized prior prevents overfitting. |
12:30 | Temporal network compression via network hashing PRESENTER: Rémi Vaudaine ABSTRACT. Temporal networks provide a good representation of systems with pairwise interactions evolving over time. Dynamical processes taking place on temporal networks, such as epidemic or information spreading, influence or cascading failure, depict different phenomena of interest [1,2]. But, since their collective patterns cannot be larger than the largest connected component of the underlying network, the percolating structure of the temporal network largely determines the outcome of any of these dynamical processes. However, the computation of the connected components of temporal networks is difficult because of the temporal dimension. Indeed, in temporal networks, paths between pairs of nodes must respect time: interactions have to be in the right time-order for an effect to propagate from node to node [3,4]. Thus, a time-respecting path is a sequence of adjacent events, where two events are adjacent if they share at least one node and the first event occurs before the second one. Then, the out-component of a node can be defined as the set of nodes reachable by any time-respecting path in the temporal network starting from that specific node. Moreover, the size of the out-component, i.e. the number of nodes reachable from a source node, is the maximum number of nodes that can be involved in a dynamical process starting from this source. Let n be the number of nodes; the component matrix can then be defined as the n by n matrix where the (True or False) entry (i, j) indicates whether node i is in the out-component of node j or not. The size of the out-component of node j is then simply the number of True entries in the j-th column of the component matrix. To compute the size distribution of out-components efficiently, we propose a streaming matrix algorithm that only requires one scan through the list of events and outputs the component matrix. 
We also propose a general-purpose compression scheme via hashing, described in Fig. 1, to further improve our algorithm. Our solution outperforms the state of the art in the efficiency of computing the largest out-components, giving an upper estimate for the size of any macroscopic phenomenon. |
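The one-scan construction of the component matrix described in this abstract can be sketched as follows. This is only an illustrative reading of the text, not the authors' implementation: the function name `out_component_sizes`, the `(time, u, v)` event format, the use of integer bitsets for matrix rows, and the convention that a node belongs to its own out-component are all assumptions. The hashing-based compression step is not sketched because the abstract does not specify it.

```python
from itertools import groupby


def out_component_sizes(n, events):
    """Return the out-component size of every node in a temporal network.

    n      -- number of nodes, labelled 0..n-1
    events -- iterable of (time, u, v) undirected contacts (hypothetical format)
    """
    # Row i of the component matrix, stored as an integer bitset:
    # bit j of reach[i] is set when node i is in the out-component of node j.
    # Assumption: every node belongs to its own out-component.
    reach = [1 << i for i in range(n)]

    # Single chronological scan. Events sharing a timestamp are batched so
    # that they cannot chain into one another (time-respecting paths need
    # strictly increasing times).
    for _, batch in groupby(sorted(events), key=lambda e: e[0]):
        updates = {}
        for _, u, v in batch:
            merged = reach[u] | reach[v]  # sources that reach u or v so far
            updates[u] = updates.get(u, reach[u]) | merged
            updates[v] = updates.get(v, reach[v]) | merged
        for node, bits in updates.items():
            reach[node] = bits

    # Size of the out-component of j = number of set bits in column j.
    return [sum((row >> j) & 1 for row in reach) for j in range(n)]
```

For example, with contacts 0-1 at time 1 and 1-2 at time 2, node 0 can reach all three nodes, while node 2 can reach only itself and node 1, because the 0-1 contact has already happened by the time node 2 meets node 1.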
12:45 | Innovation and Order in Citation Networks PRESENTER: Tim Evans ABSTRACT. We study citation data on eight vaccines, including four COVID-19 vaccines, approved between 2013 and 2022 and each based on one of four different vaccine platforms. Our data covers four types of document (drug approvals, clinical trials, patents and journal articles) and comes from ClinicalTrials.gov, Lens.org and Dimensions.ai. Each type of document forms a distinct layer of the network. Starting from the approval document issued by the US Food and Drug Administration (FDA) for one of our eight chosen vaccines, we follow the bibliographical references back for a number of steps to produce citation networks with between 12 and 113 thousand nodes and an average degree of around 14.5. The sense of order encoded in a directed acyclic graph (DAG) allows us to assign a unique height h(v) and depth d(v) to every node v. We define the criticality c(v) of a node to be c(v)=H-h(v)-d(v), where H is the largest height in the network. Any node lying on the longest path in the DAG will have zero criticality, and nodes that lie on paths that are slightly shorter than the longest path will have small criticality values. Our conjecture is that nodes with low criticality are the most important documents for the innovation process. In geometric terms, our inspiration comes from the fact that, when a DAG is embedded in a Minkowski space-time, the longest path in the DAG is the closest path to the space-time geodesic, the path of least resistance, least action. Our method gives us a path of events that narrates innovation bottlenecks. We quantify the position and proximity of documents to these innovation paths to identify key innovation events. We also have information on funders. 
We show that when it comes to vaccine innovation, diffusion-oriented entities are preoccupied with basic research; biopharmaceuticals tend to participate in applied research and development activities; while challenge-led entities tend to sit in the middle. |
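The criticality c(v)=H-h(v)-d(v) defined in this abstract can be computed with two longest-path sweeps over a topological order. The sketch below is a minimal illustration under stated assumptions: the abstract does not define height and depth precisely, so here h(v) is taken to be the length of the longest path ending at v and d(v) the length of the longest path starting at v, both counted in edges. The function name `criticality` and the edge-list input are hypothetical.

```python
from collections import defaultdict, deque


def criticality(edges):
    """Return {node: c(v)} for a DAG given as a list of (source, target) edges."""
    succ = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for a, b in edges:
        succ[a].append(b)
        indegree[b] += 1
        nodes.update((a, b))

    # Kahn's algorithm: repeatedly remove nodes with no remaining predecessors.
    queue = deque(v for v in nodes if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in succ[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    if len(order) != len(nodes):
        raise ValueError("the edge list contains a cycle, so it is not a DAG")

    # Assumed definition: height = longest path (in edges) ending at the node.
    height = {v: 0 for v in nodes}
    for v in order:
        for w in succ[v]:
            height[w] = max(height[w], height[v] + 1)

    # Assumed definition: depth = longest path (in edges) starting at the node.
    depth = {v: 0 for v in nodes}
    for v in reversed(order):
        for w in succ[v]:
            depth[v] = max(depth[v], depth[w] + 1)

    largest_height = max(height.values())
    return {v: largest_height - height[v] - depth[v] for v in nodes}
```

With these definitions, h(v)+d(v) is the length of the longest path passing through v, so every node on a longest path gets criticality zero, matching the property stated in the abstract. In a toy DAG with paths a-b-c-d and a-e-d, node e gets criticality 1 and all other nodes get 0.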
11:00 | Inferring the stochastic dynamics of complex networks via message-passing mechanism PRESENTER: Tingting Gao ABSTRACT. The dynamics of many real complex systems are stochastic rather than deterministic, i.e. complex system dynamics should be captured by stochastic differential equations (SDEs) rather than ordinary differential equations (ODEs). Owing to the increasing availability of node activity data, there have been recent attempts to infer ODEs of complex network dynamics. However, inferring SDEs of network dynamics from observation data is still an outstanding and challenging problem, especially given the fact that stochasticity is an intrinsic property of system dynamics and cannot be removed by frequency-based filters from activity series data. Here, we propose an efficient inference framework, which combines message-passing and library-based sparse learning techniques, to tackle these challenges. Specifically, we separate the system dynamics using three specific deep neural networks (NNs) that together learn the explicit equation. The new framework extends our previous one (T.-T. Gao, G. Yan, Nature Computational Science 2022) to stochastic complex systems, and is effective at inferring various network dynamics, including stochastic Lorenz, Rössler, and Hindmarsh-Rose coupled dynamics on weighted and signed networks, as well as the second-order Vicsek model that describes flocking dynamics. Interestingly, our framework confirms that the second-order Vicsek model is indeed able to capture the stochastic flocking dynamics inferred from four empirical flock datasets. Taken together, the results not only offer a new method for data-driven discovery of stochastic dynamics of real complex systems but also pave a path for downstream tasks such as intervention and control. |
11:15 | Maximizing anti-coordination in network games PRESENTER: Ceyhun Eksin ABSTRACT. Anti-coordination games can be used to study competition among firms, public goods scenarios, free-rider behavior during epidemics, network security, etc. In each of these scenarios, there is a desired action for each agent in the absence of other agents, e.g., not taking costly preemptive measures during a disease outbreak or not investing in insurance/protection. When other agents are around, they can affect the benefits of the desired action, providing incentives for agents to switch. Despite peer influences, some agents continue to take the individually desired action, endangering their peers and the rest of the population. That is, rational behavior can lead to the failure of anti-coordination in the population, harming the well-being of the system overall. In such scenarios, we envision the existence of a central coordinator with the goal of inducing behavior that supports global well-being. Here, we consider one such mechanism where the coordinator intervenes by controlling a few agents in the network. In this setting, actions of agents have inverse dependencies, i.e., agents tend to naturally differentiate themselves from their neighbors. For instance, an agent can be encouraged to wear a mask (adopt the costly strategy) when its neighbors flout protocols. Rational behavior is not always optimal, and resulting equilibria can retain a lot of the disease transmission links (agents flouting protocols on both ends). The centralized player can then steer the convergent action profile toward socially desirable outcomes by controlling the actions of a few players throughout the learning phase. We define the goal of the central coordinator as maximizing anti-coordination (MAC) on convergence of dynamics between connected pairs of agents by deactivating network links. 
Thus, we define the goal of the central coordinator as maximizing anti-coordination on convergence of said dynamics between connected pairs of agents by deactivating network links (ensuring at least one agent per link plays 0). First, we analyze the hardness of MAC in general graph instances. Then, we investigate how specific network structures can affect the equilibria of such controlled learning dynamics and prove approximate submodularity and almost monotonicity of MAC for line-networks using a recursive argument. Thereby, we propose a greedy strategy that takes advantage of cascades of influence diffusion in the network, and prove performance guarantees by exploiting the aforementioned properties of MAC for line-instances. For the general bipartite graph, we provide an inapproximability result, showing that the violation of submodularity on a specially designed bipartite graph can be of the order of the number of edges in the graph, thereby showing failure of submodularity in the worst case. However, numerical results strongly suggest effectiveness of our algorithm for solving MAC in most practical bipartite instances. We are able to show that MAC is monotone and submodular, in expectation, in dense bipartite networks. We begin by defining a function that measures one-step influence which encourages agents to adopt the costly strategy level at each step of the dynamics. We establish that the one-step influence function is monotonic and submodular for dense networks. Next, we describe the stochastic process of the set of agents taking the costly action (0) when the dynamics are seeded by a set of agents who are controlled to play 0. We prove that the distribution of agents choosing to play strategy level 0 on convergence remains intact, if, instead of controlling the entire set from the get-go, we break it into a set of smaller subsets (a partition) and control each set one by one in stages. 
This property allows us to provide an alternate equivalent description of the process based on a selection rule. The proof of the submodularity result entails designing such a coupling between the actual (greedy) selection method and another equivalent selection method for which we can show the diminishing returns property (submodularity). Together, these results imply that the expected worst case performance of the greedy selection protocol is indeed bounded by a fraction of the optimal solution. |
11:30 | Link overlap influences opinion dynamics on multiplex networks of Ashkin-Teller spins PRESENTER: Cook Hyun Kim ABSTRACT. Social consensus matters to society: language, a tool of communication, is formed through consensus, and so are morality and ethics. Therefore, it is important to investigate under what conditions social public opinion is formed in complex networks and how it changes according to dynamic social variables and network structures. Opinion dynamics seeks to understand the process of public opinion formation through mathematical models. It is known that opinion dynamics are closely related to the Ising spin model, and studies that use this relationship to understand the formation of public opinion continue to be carried out. Thus, research on what kinds of phases appear and what phase transitions occur has been actively conducted for the Ising spin model defined on complex networks. However, considering only single-layer (singlex) networks has clear limitations, because interactions in modern society are too complex and diverse to be described by a single type of link. To overcome this limitation, multiplex networks were introduced, and we consider the spin model defined on a multiplex network. The multiplex network introduces a new dynamical property: link overlap interaction. This link overlap interaction refers to a higher-order interaction that acts through $(1,1)$ multilinks. This overlap interaction presents an interesting concept from the perspective of opinion dynamics. Through this overlap interaction, it is possible to describe the formation of consensus related to a correlation between multiple topics, beyond simply agreeing or disagreeing on one topic. We found that various types of phase transitions occur according to the structure of multiplex networks. 
Based on this result, we would like to suggest an important implication that it is necessary to consider the network structure in more detail to quanlitatively understand and predict social public opinion. |
11:45 | The evolutionary dynamics of multiplayer cooperation in networks of mobile agents PRESENTER: Diogo L. Pires ABSTRACT. The self-organisation of collective behaviour is observed in populations across all levels of complexity, from microorganisms to human societies. Numerous evolutionary models are used to study these phenomena, often incorporating population structure due to its ubiquitous presence and long-known impact on emerging behaviour. Modelling agents interacting pairwise via first-order interaction networks shows that structure may indeed promote cooperation when the average degree is low enough and the interaction network is far from complete. However, this result vanishes under some evolutionary processes. Realistic multiplayer interactions can be studied by considering interacting groups to arise from the encounters of agents moving on spatial or virtual networks, leading to an emerging higher-order interaction network. We focus on a Markov movement model, under which individuals move contingent on the composition of the groups they met in the previous time step. In this context, complete networks solidly promote the evolution of multiplayer cooperation, contrary to other more structured topologies, which are detrimental to it. In our work, we seek to understand this departure from previous pairwise interaction network models and whether similar dependencies on update mechanisms are maintained. Following previous work, we modelled the co-evolution of interaction strategies (cooperate or defect) and movement strategies (propensity to not move). We ran a large set of agent-based simulations of strategy fixation processes, with stochastic exploration phases between evolutionary steps. We find that the evolution of cooperation is mainly dependent on network topology (complete networks always lead to cooperation most frequently), with evolutionary dynamics having little effect on evolutionary outcomes. 
We believe that movement dependent on group composition led to this robustness. On one hand, it erased the locality of interactions, thus partially suppressing the impact of evolutionary structural viscosity on the fitness of individuals. On the other hand, the emerging assortative behaviour was much more powerful in promoting cooperation, overshadowing the impact of other effects. These two factors hinder the exceptional significance of the DBB (and BDD) dynamics in promoting the evolution of cooperation. There were two lasting quantitative differences between results. Comparing dynamics where selection acts in events of the same order in time, we observed that acting on birth (death) generally favoured cooperation (defection). Finally, the dynamics where selection acted in the second event tended to amplify selection. Nonetheless, these differences very rarely led to fundamentally different evolutionary outcomes. |
12:00 | Ordering dynamics and aging in Threshold models PRESENTER: David Abella Bujalance ABSTRACT. This study investigates the effect of aging on cascade dynamics in Watts' threshold model. Aging is defined as the decreased tendency of agents to change state the longer they remain in the current state. The study shows that aging leads to slower cascade dynamics, resulting in a power-law increase of the fraction of adopted agents instead of an exponential increase as in the original model. This behavior is universal for different networks and control parameters. The study derives an approximate master equation and shows that the power-law dynamics with aging and the exponential increase in the original model share the same exponent. Additionally, the study introduces a symmetrical version of the threshold model and characterizes it for different network structures, identifying three phases as the threshold parameter is varied. When aging is included in this model, phase I (disordering) is destroyed, but the transient regime still characterizes a new phase I*. The study also provides theoretical and numerical simulation results for the behavior of the symmetrical model and the version with aging. |
12:15 | Is cooperation sustained under increased mixing in evolutionary public goods games on networks? PRESENTER: Wei Zhang ABSTRACT. Well-mixed populations model population-wide interactions, since everyone shares the same set of interacting partners, the entire population, whereas networks model local rather than global interactions by restricting them to social neighborhoods. This raises a question: when individuals interact in groups on networked populations, and there is a probability of connecting two groups to form an additional global group, do the additional mixing links always play a positive or a negative role in the evolution of cooperation? In this work, we propose an evolutionary game model that is able to capture the effect of long-range links mixing local neighborhood and global group interactions in a finite networked population. We derive dynamical equations for the evolution of cooperation under weak selection by employing the mean-field and pair approximation approaches. Using properties of Markov processes, we analyze theoretically the effect of the density of mixing links. We find a rule governing the emergence and stabilization of cooperation, which shows that the positive or negative effect of mixing-link density for fixed group size depends on the global benefit in the public goods game. With mutations, we study the average abundance of cooperators and find that increasing mixing links promotes cooperation in strong dilemmas and hinders cooperation in weak dilemmas. These results are independent of whether strategy transfer is allowed via mixing links or not. |
12:30 | Evolutionary game selection creates competitive environments PRESENTER: Onkar Sadekar ABSTRACT. We propose a novel paradigm for studying strategic decision-making where both strategies and the environment can evolve over time. To do this, we model the strategic interactions in a population of decision-makers as an evolutionary pairwise symmetric game where each player can copy both the strategies and the games (i.e. the payoff matrix, representing the environment) of the other players. We show that, when the interactions among the players are modeled as a network, different games and strategies are selected in the long term, compared to the well-mixed population scenario. Moreover, we characterize the equilibrium of our game of games in terms of the network topology, showing the crucial role played by network heterogeneity and the presence of hubs. By unveiling the link between the interaction structure of the social network and the evolution of competition/cooperation, our model can shed new light on the origin of social dilemmas ubiquitously observed in real-world social systems. |
12:45 | From subcritical behavior to a correlation-induced transition in rumor models PRESENTER: Guilherme Ferraz de Arruda ABSTRACT. Rumor and information spreading are natural processes that emerge from human-to-human interaction. Such processes have a growing impact on people's daily lives due to increasing and faster access to information, whether trusted or not. A popular mathematical model for spreading rumors, data, or news is the Maki-Thompson (MT) model. In this model, individuals can be in one of three states: ignorant, spreader, or stifler. The spreading evolves through the contact between nodes defined by an undirected network. Our process is defined in continuous time as a collection of Poisson processes. If the contact is between a spreader and an ignorant, the second node will learn the rumor and become another spreader at rate $\lambda$. On the other hand, if the contact happens between a spreader and someone who already knows the rumor (spreader or stifler), then the spreader that initiated the contact will lose interest in the rumor, thus becoming a stifler at rate $\alpha$. Existing work based on first-order mean-field approximations suggested that this model does not have a phase transition, with rumors always reaching a finite fraction of the population irrespective of the spreading rate. Here, we show that a second-order phase transition is present in this model, which is not captured by first-order mean-field approximations. Since the MT model has infinitely many absorbing states, the critical point is the spreading parameter that separates the two scaling regimes. Before this point, the final number of stiflers when the process reaches an absorbing state does not scale with the system size, and hence its fraction goes to zero in the thermodynamic limit. After the critical point, the number of stiflers scales with the system size. 
This transition is shown in Fig. 1 (a) and (b), where we present the order parameter (the fraction of stiflers) and the time to reach the absorbing state, respectively. Moreover, we propose and explore a modified version of the Maki-Thompson model that includes a forgetting mechanism, where each stifler spontaneously becomes ignorant at a rate $\delta$. This modification changes the Markov chain's nature from infinitely many absorbing states in the classical setup to a single absorbing state and allows us to use a plethora of analytic and numeric methods to characterize the model's behavior. In particular, we were able to provide an estimation of the critical point by accounting for the correlations between states. More importantly, we find a counter-intuitive behavior in the subcritical regime, where the lifespan of a rumor increases as the spreading rate drops, following a power-law relationship. These results are summarized in Fig. 1 (b), where we present the time to reach the absorbing state for different sizes, demonstrating the power-law subcritical behavior. This behavior implies that, even below the critical threshold, rumors can survive for a long time. Furthermore, using an asymptotic analysis where we scale the model's parameters, we were able to show that no phase transition is expected in the first-order mean-field approximation. Our results emphasize the role of correlations in the MT model phase transition and motivate further research on developing more sophisticated mean-field approximations. Together, our findings are at odds with most classical results and show that the dynamic behavior of rumor models is much richer than previously thought. Thus, we hope our results motivate further analytical and numerical research and investigations involving real-world systems. The work described in this abstract has been published in Nat Commun 13, 3049 (2022). |
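The continuous-time rules of the classical Maki-Thompson model stated in this abstract (spreader meets ignorant at rate $\lambda$, spreader meets spreader or stifler at rate $\alpha$) can be simulated with a Gillespie-style loop. The sketch below is an illustration only, not the authors' code: the function name `simulate_maki_thompson`, the adjacency-dictionary input, and the choice to recompute all rates at every step (simple but slow) are assumptions. The forgetting mechanism with rate $\delta$ is not included.

```python
import random


def simulate_maki_thompson(adj, lam, alpha, seed_node, rng):
    """Run the MT rumor model on an undirected network until it is absorbed.

    adj       -- dict mapping each node to a list of its neighbours
    lam       -- rate at which a spreader converts an ignorant neighbour
    alpha     -- rate at which a spreader becomes a stifler after contacting
                 a neighbour who already knows the rumor
    seed_node -- the single initial spreader
    rng       -- a random.Random instance
    Returns (final_state, absorption_time).
    """
    state = {v: "ignorant" for v in adj}
    state[seed_node] = "spreader"
    t = 0.0
    while True:
        # Enumerate every possible event as (new_state, affected_node).
        events, rates = [], []
        for s, neighbours in adj.items():
            if state[s] != "spreader":
                continue
            for nb in neighbours:
                if state[nb] == "ignorant":
                    events.append(("spreader", nb))  # nb learns the rumor
                    rates.append(lam)
                else:
                    events.append(("stifler", s))    # s loses interest
                    rates.append(alpha)
        total = sum(rates)
        if total <= 0:
            break  # no event can fire: absorbing state reached
        t += rng.expovariate(total)                  # waiting time to next event
        new_state, node = rng.choices(events, weights=rates)[0]
        state[node] = new_state
    return state, t
```

On a connected network with positive rates, the loop stops only when no spreaders remain, so the final fraction of stiflers can be read off as the order parameter and `t` as the time to absorption.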
11:00 | The advantage of Quantum Annealer Devices in Brain Connectome Community Detection PRESENTER: Alessandro Crimi ABSTRACT. The brain can be viewed as a small-world system with distinct regions interconnected to support various cognitive tasks and functions. In this context, community detection is a crucial problem in computational neuroscience. In this study, we investigated the advantage of quantum annealers, and in particular Leap's Hybrid Solver, for discovering communities within brain connectomes. Our research demonstrates that when computing communities of brain connectomes, quantum annealers can obtain a higher modularity index than a classical annealer. These encouraging preliminary findings suggest that, when compared to classical computing optimization techniques, quantum annealers may be the best option. |
11:15 | Functional Desynchronization and Structural Rewiring in Brain Tumor Networks PRESENTER: Joan Falcó-Roget ABSTRACT. Despite extensive mapping of brain networks in the presence of brain tumors, functional and diffusion MRI signals inside the tumors are often neglected. Using a surgical dataset of pre- and post-surgery multimodal MRI acquisitions [1], we explored how intra-tumor signals deviate from healthy ones in the same tissue. Furthermore, we study the reorganization of functional signals by defining a score based on intrinsic properties of the signal [2]. Finally, by fiber tracking within the tumor, we aim at predicting post-surgery structural reorganization by means of a recently validated machine learning model [3]. We analyze all these scores and predictions in different types of tumors based on grading, size, histology and periventricular location, aiming at finding properties that explain differences in network recovery and survival. Overall, we provide a proof of concept with regard to MRI signals and networks that explicitly acknowledges the presence of tumors and demonstrates the complexity of the signals derived from them. |
11:30 | Structure of brain connectome and contactome in fly, mouse, and human PRESENTER: Anastasiya Salova ABSTRACT. Analyzing networks of neurons is fundamental to understanding the structure and function of the brain as a whole. However, it poses unique challenges, since even the simple questions of what the node positions are and what counts as an edge require careful consideration. Additionally, standard spatial network generative models do not take into account the interplay of the complex fractal neuron structure, spatial orientation, and biological wiring rules. Here, we study complex neuronal networks in the fruit fly, mouse, and human brain [1-3] with the goal of gaining insight into their structure to build generative models. We focus on two network layers: synaptic networks (connectomes) and physical proximity networks (contactomes) that represent important spatial constraints on synapse formation. Having these two networks for each organism provides an opportunity to explore distance-based generative models of both network layers. Comparing the properties of the "true" contactome and network models allows us to gain an understanding of the role of physical constraints on synaptic networks. Moreover, our results highlight prevailing motifs in connectomes that are informative of the underlying biological wiring rules. [1] Xu, C. Shan, et al. "A connectome of the adult drosophila central brain." BioRxiv (2020) [2] Bae, J. Alexander, et al. "Functional connectomics spanning multiple areas of mouse visual cortex." BioRxiv (2021) [3] Shapson-Coe, Alexander, et al. "A connectomic study of a petascale fragment of human cerebral cortex." BioRxiv (2021) |
11:45 | Characterising the structural organization of the whole-brain: encompassing the cortex, subcortex, and cerebellum PRESENTER: Julian Schulte ABSTRACT. The white matter is made of anatomical fibers that constitute the highway of long-range connections between different parts of the brain. This network is referred to as the brain’s structural connectivity and lays the foundation of network interaction between brain areas as it has been shown to constrain functional connectivity and to also relate to various psychiatric and neurological disorders. When analyzing the architectural principles of this global network most studies have mainly focused on cortico-cortical and partly on cortico-subcortical connections. Here we show, for the first time, how the integrated cortical, subcortical, and cerebellar brain areas synergistically shape the structural architecture of the whole brain. Considered individually, the cortical, subcortical, and cerebellar sub-networks show distinct network features despite some similarities, which underline their individual structural fingerprints. Whereas the three sub-networks are characterized by a nearly optimal short-average pathlength and capacity to transmit information, they differ regarding their degree distribution, clustering, and assortativity. Taken together, the global structural network displays a modular and hierarchical organization – similar to the one typically described for the cortex alone. However, (i) community detection reveals a modular organization that transcends the classical – cortical, subcortical, cerebellar – subdivision pointing to functional communities that encompass regions of the three parts. Also, we find that (ii) the most prominent hubs of the global rich-club correspond to subcortical regions whose lesioning leads to a major disruption in network efficiency and signal propagation, more so than lesions to cortical hubs. 
Our results, exposing the heterogeneity of internal organization across cortex, subcortex and cerebellum, and the crucial role of the subcortex for the integration of the global anatomical pathways, highlight the need to overcome the prevalent cortex-centric focus towards a global consideration of the structural connectivity. |
12:00 | Exploring coarse-graining effects in structural brain connectivity PRESENTER: Máté Józsa ABSTRACT. There are indications that the axonal length distribution in the brain decays exponentially, a phenomenon known as the exponential distance rule (EDR) (left-hand side of Fig. 1). However, individual axon-level data is limited, while brain-region-level connectomes are available for a number of species (drosophila, mouse, macaque and human). Our simple mathematical model reproduces the observed deviation from the EDR of the weighted length distribution of interregional connectomes by accounting for the inherent coarse-graining effects (right-hand side of Fig. 1). The limitations of the model are carefully scrutinized, including the effect of curvature and dimensionality. The robustness of the model is demonstrated by numerical simulations and by matching it to a neuronal null model. The results indicate the universality of the EDR by extending its validity to a number of widely different species. |
12:15 | Assortative mixing in micro-architecturally annotated brain networks PRESENTER: Vincent Bazinet ABSTRACT. The wiring of the brain connects micro-architecturally diverse neuronal populations. These neuronal populations have distinct anatomical and cellular makeups and, thanks to modern technological advances, this heterogeneity in the brain’s micro-architecture can be imaged with unprecedented detail and depth. The conventional graph model, however, encodes brain connectivity as a network of nodes and edges, and abstracts away the rich biological detail of each node. How is brain network connectivity related to regional micro-architecture? In this work, we investigated the systematic arrangement of brain network connections with respect to multiple biological attributes. More specifically, we asked whether brain regions with similar attributes are more likely to be connected with each other. To disentangle the relationships between the brain’s connectivity, regional heterogeneity, and spatial embedding, we implemented novel null models that control for the spatial autocorrelation of nodal attributes. We considered a range of molecular, cellular, and genetic attributes and we performed all experiments using four brain network datasets from three different species (human, macaque, and mouse). This allowed us to uncover universal principles of organization across network reconstruction techniques, species, spatial scales, and attributes. We find that regions with similar attributes tend to connect with each other above and beyond spatial proximity and that communication between micro-architecturally diverse populations is supported by long-distance projections. We also uncover intricate mixing patterns between neurotransmitter systems and cortical layers. Finally, using meta-analytic decoding, we find that the arrangement of connectivity patterns with respect to biological attributes shapes patterns of regional functional specialization. 
Specifically, regions that connect to biologically similar regions are associated with executive function; conversely, regions that connect with biologically dissimilar regions are associated with memory function. In summary, the present work bridges microscale attributes and macroscale connectivity. While carefully controlling for the background effect of the brain’s spatial embedding, we systematically assessed how connectivity is interdigitated with a broad range of micro-architectural attributes and empirically tested multiple theories about the wiring of cortical brain networks. |
12:30 | Hierarchical modularity optimizes memory capacity near criticality in echo state networks PRESENTER: Filip Milisav ABSTRACT. Introduction. The complex architecture of brain networks is believed to be organized in a hierarchy of increasingly polyfunctional nested circuits (Bazinet et al., 2021, NeuroImage; Hilgetag & Goulas, 2020, Philos. Trans. R. Soc. B; Mesulam, 1998, Brain; Meunier et al., 2010, Front. Neurosci.). Previous reports have suggested that this hierarchical modular structure contributes to maintain a balance between information segregation in specialized communities and global integration via intermodular communication (Hilgetag & Goulas, 2020, Philos. Trans. R. Soc. B; Meunier et al., 2010, Front. Neurosci.). Yet, how hierarchical modularity shapes network function remains unclear. Here, we constrain artificial neural networks with wiring patterns inspired by the hierarchical modular topology of biological neural networks. This allows us to causally relate hierarchical network architectures to cognitive performance in a simulated memory task. Furthermore, we characterize the computational capacity of these neuromorphic networks across a range of dynamics, bridging stable and chaotic regimes. Methods. We take advantage of reservoir computing (Lukoševičius & Jaeger, 2009, Comput. Sci. Rev.), a paradigm particularly well adapted to the design of neuromorphic networks. The classic reservoir architecture consists of a recurrent neural network of interacting nonlinear neurons (echo state network) complemented by a linear readout module. Only the readout module is trained, allowing us to constrain the network with arbitrary connectivity patterns that remain unchanged throughout learning. Here, we use a stochastic block model (Holland et al., 1983, Soc. Networks) to specify 3 hierarchical levels of modularity. 
Starting from 8 modules of 100 highly interconnected nodes, we systematically modify intermodular connectivity, defining 2 other hierarchical levels further containing, in a nested fashion, 4 modules of 200 nodes and 2 modules of 400 nodes (Fig. 1a). For each level, we generate 100 synthetic graphs and randomly assign weights to the produced edges by sampling a uniform distribution between 0 and 1. By maintaining the same total edge probability across levels, we ensure that all the generated networks share a very similar density. Furthermore, we uniformly scale the connection weights to produce a range of spectral radii α. This allows us to consider a corresponding range of dynamics gradually transitioning from a stable (α < 1) to a chaotic (α > 1) regime. To evaluate the computational capacity of the networks, we use a memory task in which the readout module is trained to reproduce a time-delayed version of a random input signal sampled from a uniform distribution between -1 and 1. The 8 original modules are used as input and output nodes across all hierarchical levels. After training, memory capacity is evaluated as the Pearson correlation between the target and the predicted signal, averaged across all pairwise combinations of the original 8 modules as input and output nodes and 16 time-lags, incremented in single timesteps from a one-unit time-lag. Results. By contrasting the memory capacity of the synthetic network ensembles across levels of hierarchy, we consistently find that a higher hierarchical modularity level is associated with better task performance (Fig. 1b). However, only at criticality (α = 1), where the brain is believed to operate (Cocchi et al., 2017, Prog. Neurobiol.), do we find significant differences in memory capacity between all levels of hierarchy according to Wilcoxon-Mann-Whitney rank-sum tests (p < 10^-8). 
It is also in this dynamical regime that we observe the largest differences in performance between hierarchical levels, but also the highest memory capacity across α values for each level. Conclusion. We find that higher-order hierarchical modularity optimizes memory capacity in echo state networks. The impact of such a nested modular organisation reaches its peak at criticality, which might explain recent results showing that reservoirs constrained with human brain connectivity patterns perform optimally near criticality (Suárez et al., 2021, Nat. Mach. Intell.). Here, this boost in performance is achieved by tuning intermodular connectivity, possibly modulating permeability of information flow between functionally segregated communities in a way that strikes a balance between information segregation and integration. |
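The weight-scaling step described in the Methods of the abstract above can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, and the power iteration assumes that all weights are non-negative (as when sampled uniformly between 0 and 1) and that the matrix is primitive, so that the dominant eigenvalue is real and positive.

```python
def spectral_radius(W, iters=500):
    """Estimate the spectral radius of a non-negative square matrix
    (list of rows) by power iteration. Assumes a primitive matrix."""
    n = len(W)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(abs(x) for x in w)  # growth factor in the max norm
        if lam == 0.0:
            return 0.0  # the iterate vanished (e.g. nilpotent matrix)
        v = [x / lam for x in w]
    return lam


def scale_to_radius(W, alpha):
    """Uniformly rescale all weights so the spectral radius equals alpha."""
    rho = spectral_radius(W)
    if rho == 0.0:
        raise ValueError("cannot rescale a matrix with zero spectral radius")
    factor = alpha / rho
    return [[factor * x for x in row] for row in W]


if __name__ == "__main__":
    W = [[1.0, 2.0], [3.0, 4.0]]
    for alpha in (0.8, 1.0, 1.2):  # stable, critical, chaotic regimes
        print(alpha, round(spectral_radius(scale_to_radius(W, alpha)), 6))
```

Because the scaling is uniform, the relative weight structure (and hence the modular topology) is unchanged; only the overall gain of the recurrent dynamics varies with α.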
12:45 | Geometrical congruence, greedy navigability and myopic transfer in complex networks and brain connectomes PRESENTER: Carlo Vittorio Cannistraci ABSTRACT. We introduce in network geometry a measure of geometrical congruence (GC) to evaluate the extent to which a network topology follows an underlying geometry. This requires finding all topological shortest paths for each nonadjacent node pair in the network: a nontrivial computational task. Hence, we propose an optimized algorithm that reduces 26 years of worst-case computation to one week of parallel computing. Analysing artificial networks with patent geometry, we discover that, contrary to current belief, hyperbolic networks do not in general show high GC and efficient greedy navigability (GN) with respect to the geodesics. The myopic transfer which rules GN works best only when the degree-distribution power-law exponent is strictly close to two. Analysing real networks, whose geometry is often latent, we find that GC outperforms GN as a marker to differentiate phenotypical states in macroscale structural-MRI brain connectomes, suggesting that connectomes might have a latent neurobiological geometry accounting for more information than the visible three-dimensional Euclidean one. |
11:00 | [CANCELLED] Connecting intercity mobility with urban welfare PRESENTER: Gourab Ghoshal ABSTRACT. While significant effort has been devoted to understand the role of intraurban characteristics on sustainability and growth, much remains to be understood about the effect of interurban interactions and the role cities have in determining each other’s urban welfare. Here we consider a global mobility network of population flows between cities as a proxy for the communication between these regions, and analyze how it correlates with socioeconomic indicators. We use several measures of centrality to rank cities according to their importance in the mobility network, finding PageRank to be the most effective measure for reflecting these prosperity indicators. Our analysis reveals that the characterization of the welfare of cities based on mobility information hinges on their corresponding development stage. Namely, while network-based predictions of welfare correlate well with economic indicators in mature cities, for developing urban areas additional information about the prosperity of their mobility neighborhood is needed. We develop a simple generative model for the allocation of population flows out of a city that balances the costs and benefits of interaction with other cities that are successful, finding that it provides a strong fit to the flows observed in the global mobility network and highlights the differences in flow patterns between developed and developing urban regions. Our results hint towards the importance of leveraging interurban connections in service of urban development and welfare. |
11:15 | Impact of COVID-19 on Chile’s Internal Migration PRESENTER: Leo Ferres ABSTRACT. We study the phenomenon of long-term internal relocation within a country during the COVID-19 pandemic by analyzing eight months of anonymized eXtended Detail Records for 1.3 million mobile phone devices over three years. Our results show that 2.17% of the population permanently left SCL in 2020, that preferred destinations stayed relatively stable, that the exodus from Santiago during 2020 was most significant for richer comunas and that people predominantly stayed in urban comunas. |
11:30 | Human Mobility in China: A National Overview Before and After the COVID-19 Outbreak PRESENTER: Suoyi Tan ABSTRACT. A systematic understanding of the mobility patterns of populations and their subsequent outcomes is clearly an important agenda item for urgent policy decisions. In this study, we use unique, nationwide mobility data extracted from mobile phones to examine the general and exceptional mobility patterns in China in 2020, covering diverse periods: 1) normal travel, 2) probably the world’s largest mass human migration, China’s Lunar New Year travel season (chunyun), 3) the halt of population flow under COVID-19 quarantine measures, and 4) the recovery stage. We find that cross-city movements, which increased substantially during chunyun and then dropped sharply during the lockdown, depend primarily on travel distance and the socio-economic development of cities. Following the Lunar New Year holiday, national mobility remained suppressed until mid-February. The COVID-19 outbreak and interventions held back more than 72.89 million people from returning to large cities. Mobility network analysis reveals clusters of highly connected cities, conforming to the socio-economic division of urban agglomerations in China. While the mass migration to large cities was held back, smaller cities connected more densely to form new clusters. During the recovery stage after travel restrictions were lifted, the net flows of over 55% of city pairs reversed the direction of movement observed before the lockdown. These findings offer the most comprehensive understanding of national mobility at fine resolution across various scenarios in China and are of critical importance for decision-making regarding public health emergency response, transportation planning, and regional economic development, among others. |
11:45 | Changes in the time-space dimension of human mobility during the COVID-19 pandemic PRESENTER: Clodomir Santana ABSTRACT. Society produces digital records of, for example, the places we visit, the products we buy, and the people we call. These digital records have proved valuable in studying different aspects of human behaviour (Buckee, 2020 Science). Here, we leverage Location-Based Service (LBS) data from mobile phone users to study how citizen mobility patterns were affected during the COVID-19 pandemic in the UK. Assessing the effects of the mobility restriction policies on daily routines relies on investigating the relationship between space- and time-based population mobility patterns. We employ the radius of gyration to gauge the span of urban movement (spatial dimension). We also define mobility synchronisation as a time metric that quantifies the co-temporal occurrence of the daily mobility motifs (Santana, 2022 arXiv) – i.e. leaving home for work and returning home happen periodically at the same time (temporal dimension). Combining these space and time metrics, we can estimate the effect of the mobility restrictions on the population. We noticed a recovery in the radius of gyration after the governments started easing the restrictions and gradually reopening businesses (Fig. 1 A). The results also indicate that the two lockdowns affected the synchronisation of people’s movement differently (Fig. 1 B). The mobility synchronisation displays a recovery latency compared to the gyration radius (Fig. 1 A). Furthermore, how we respond to mobility restriction measures is interwoven with the characteristics of the geographical space, such as income groups, economic activities, and population density. In particular, we focus on disentangling how areas of different population density, in terms of rural-urban classification, and different socio-economic groups have adjusted their routines to comply with the mobility restrictions imposed. 
We observed that the radius reduction was slightly larger in rural areas than in urban ones. In contrast, the decrease in synchronisation levels was more notable in urban areas than in rural ones. We noticed that high-income groups displayed a larger reduction in the radius and synchronisation than the low-income groups. We also studied the differences concerning the duration and the type of trips. Fig. 1 B illustrates that high-income groups have the largest reduction in the duration of their work-related trips, while Fig. 1 C shows that rural areas presented the greatest increase in park trips compared to the baseline year of 2019. In summary, the analysis of the spatial dimension of human mobility coupled with the insights from the study of the time dimension allows us to characterise the impact of stay-at-home policies on the population of different areas and socio-economic groups. These differences suggest that each group experiences, in a particular way, the emergence of asynchronous mobility patterns primarily due to mobility restriction policies and the rise of new habits (e.g. home office and home education). |
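For readers unfamiliar with the spatial metric used in the abstract above, the radius of gyration can be sketched as the root-mean-square distance of a user's visited locations from their centre of mass. The snippet below is only an illustration under stated assumptions, not the authors' pipeline: the function name is hypothetical, coordinates are assumed to be already projected to a planar system (e.g. metres), and every recorded visit is weighted equally.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of (x, y) points from their centre of mass.

    `points` is a non-empty list of planar coordinates; a location visited
    several times should appear several times in the list.
    """
    if not points:
        raise ValueError("at least one point is required")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)


if __name__ == "__main__":
    # A user alternating between home (0, 0) and work 2 km to the east.
    print(radius_of_gyration([(0.0, 0.0), (2000.0, 0.0)]))
```

A shrinking value over time indicates that a user's movements are concentrated in a smaller area, which is how a lockdown would show up in this metric.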
12:00 | Neighborhood Detection from Mobility Data PRESENTER: Gergő Pintér ABSTRACT. A growing literature investigates the mobility of individuals to better understand social segregation in cities. The socio-economic status of neighborhoods as well as physical and administrative barriers in urban areas are known to influence mobility patterns and separate social strata from one another. Yet, it is not clear how these factors of separation are interrelated. In this paper, we examine aggregate networks generated from individual mobility with the Louvain community detection algorithm to detect different scales of neighborhoods, analyze the role of administrative and physical barriers in shaping the neighborhoods’ scales and quantify the socio-economic coherence of detected neighborhoods. We use GPS-based mobility data between 2019 September and 2020 February in Budapest provided by a data aggregator company that collects and combines anonymous location data from smartphone applications. Using the Infostop algorithm, we detect stops where a user spent some time during the day. House block polygons are extracted from OpenStreetMap. Two blocks are connected by an edge if a user had consecutive stops between the given blocks within a day. Then, the Louvain community detection algorithm is applied to the stop-network with different resolution parameters that clusters the blocks into communities. The communities were compared with administrative boundaries, the districts of Budapest, (Figure 1a), and infrastructural barriers, e.g., higher-order roads like highways (Figure 1b). The changes in the community areas are evaluated in respect of the administrative and infrastructural barriers during the change of the resolution parameter with the Fréchet distance method and comparing the symmetric area differences between the clustered blocks and the administrative districts. 
Finally, we use residential real estate sale contracts collected by the Hungarian Central Statistical Office to characterize the housing prices of every block. This enables us to determine the mean and standard deviation of real estate prices of communities detected on the mobility network using different resolution parameters (Figure 1c). The detected communities on the mobility network tend to fit administrative and physical barriers as the resolution parameter grows (not presented). However, administrative barriers matter more in one area (Figure 1b/marker 1), whereas physical barriers matter more in another area (Figure 1a/marker 2). District 21 (marker 3), which is bounded by the river Danube, is a special case where the physical and the administrative barriers match. Note that the features of a physical barrier can affect its community-forming power, as Figure 1b shows: lower-order roads (displayed by dotted lines) seem to have no impact on communities, but higher-order roads do (e.g., Figure 1b/marker 2 or 3). Figure 1c shows the average standard deviation of real estate prices per community across Louvain resolution values. The standard deviation decreases significantly until resolution 4.0 and then starts to increase moderately, which signals an optimal resolution value for capturing coherent neighborhoods in terms of socio-economic status. |
12:15 | A Graph Attention Network for human mobility prediction PRESENTER: Erjian Liu ABSTRACT. Predicting human mobility patterns between locations has practical applications in transportation science, epidemic spread, environmental quality, and many other fields. For more than 100 years, researchers have proposed a variety of trip distribution models to address this challenging problem. The two most influential models are the gravity model and the radiation model, which have been successfully used to predict commuting, migration and intercity travels. Despite both models being widely used in predicting mobility patterns at different spatial scales, they only focus on nodes and link features, such as local population and distance between areas, without considering features distributions among the node mobility network neighbors. Recently, the introduction of the deep gravity model provided new insights into the use of machine learning for studying human mobility. The model exploits many features (e.g., land use, road network, health facilities, etc.) extracted from voluntary geographic data and uses deep neural networks to reproduce mobility flows. However, some studies have pointed out that neighboring traffic zones (TZs) have an important impact on trip distribution, while this is not considered in previous models. Here, we propose a graph neural network model that synthesizes neighboring TZs features to compute structural information for each location node and predict human mobility flows in a variety of cities. Our model uses several input features to compute the probability of traveling from origin to other locations in three steps. First, we inform the model with the features of all locations, their neighboring TZs, the distance and shortest path travel time between them. 
In each location, we have 41 features including residential and working population, number of subway stations, road length, building outline data, points of interest (e.g., hotel, shop), areas of interest (e.g., industrial area, commercial area), etc. Secondly, we use the Graph Attention network (GAT) to obtain the weights between each location and its neighboring TZs and use the Fully Connected (FC) layer to perform linear and nonlinear transformations. It is known that the distribution of mobility flows between locations obeys the power-law distribution, which is also known as the 80/20 rule. During the training process, the model takes into account this heterogeneity by assigning higher weights to larger flows according to a weighted loss function. The output of the last layer is the probability of moving from one location to another that, multiplied by the origin total outflow, produces the predicted flow. To validate the performance of our model, we employ cell phone signaling data from more than 300 Chinese cities. We systematically compare the travel distance distribution, the inflow distribution and travel fluxes between all pairs of locations produced by our model with respect to multiple other machine learning models in terms of three accuracy metrics. Results show that our model out-competes all the other models considered both for small and large flows between location pairs. Finally, we rank the importance of features and find that the four most influential features to reproduce mobility flows are the distance, the working population of destination, the resident population of origin, and the shortest travel time between origin and destination, among which distance ranks by far as the most important one. Our model introduces a novel framework that goes beyond node and link features by introducing locations structural features regarding local urban structure to inform a new model that out-competes previous traditional and machine learning models. 
From a policy perspective, the model could help improve urban mobility networks resilience in multiple cities and help traffic authorities to predict demand changes and avoid traffic jams in case of transportation network disruptions due to extreme events of any type, like quarantines, floods, or earthquakes. |
12:30 | Modeling the resilience of businesses using mobility-based dependency networks PRESENTER: Takahiro Yabe ABSTRACT. Quantifying the economic costs to businesses caused by extreme shocks, such as the COVID-19 pandemic and natural disasters, is crucial for developing preparation, mitigation, and recovery plans. Drops in foot traffic quantified using large-scale human mobility data (e.g., mobile phone GPS) have recently been used as low-cost and scalable proxies for losses of businesses that rely on physical visits to stores, such as restaurants and cafes. Studies have so far neglected the interdependent relationships that may exist between businesses and other facilities. For example, university campus lockdowns during the pandemic may severely impact foot traffic to student-dependent local businesses. Such dependency networks could cause secondary and tertiary cascading impacts of shocks and policies, posing a significant threat to the economic resilience of business networks. To model such cascading effects, we construct, analyze, and simulate dependency networks of businesses using mobility data of millions of co-visits to different points of interest (POIs) in US urban areas. We compute the dependence of a POI i on another POI j as w_ij = |s_i ∩ s_j| / |s_i|, where s_i and s_j denote the sets of users who visit POIs i and j, respectively. We obtain the weighted directed dependency network with adjacency matrix W and measure its network statistics against null networks (weighted spatial configuration model), revealing its high clustering properties. We further find significant associations between dependency on certain POI categories and positive and negative impacts during the pandemic. Finally, we predict the propagation of changes in visits to POIs ΔV under hypothetical external shock scenarios f^* via the Leontief (input-output) model using the dependency network W. 
An example of the simulation is shown in the Figure, where we simulate the spatial cascades of the impact of 50% reduction of visits to colleges on nearby POIs, which has substantial negative effect on POIs with high dependency on colleges, even in areas far away from college campuses. Future applications of this method include applications to assessing the cascading impacts of various urban policies including rewiring of the transportation network, and addition and removal of nodes (e.g., parks, transit hubs) to and from the urban network via public investments. |
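The dependency weights and the cascade computation described in the abstract above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function names and the toy data are hypothetical, and the propagation assumes the standard Leontief form ΔV = f^* + W ΔV, solved by fixed-point iteration, which converges only if the spectral radius of W is below 1. The abstract does not state how W is normalised in practice.

```python
def dependency_weights(visitors):
    """Return {(i, j): w_ij} with w_ij = |s_i & s_j| / |s_i| for i != j.

    `visitors` maps each POI name to the set of user ids who visited it.
    """
    W = {}
    for i, s_i in visitors.items():
        if not s_i:
            continue  # a POI without visitors has no defined dependence
        for j, s_j in visitors.items():
            if i != j:
                w = len(s_i & s_j) / len(s_i)
                if w > 0:
                    W[(i, j)] = w
    return W


def leontief_propagate(W, shock, nodes, iters=200):
    """Iterate dV = shock + W dV (assumed Leontief form, needs rho(W) < 1)."""
    dv = {n: shock.get(n, 0.0) for n in nodes}
    for _ in range(iters):
        dv = {
            i: shock.get(i, 0.0)
            + sum(W.get((i, j), 0.0) * dv[j] for j in nodes)
            for i in nodes
        }
    return dv


if __name__ == "__main__":
    # Toy data: half of the cafe's visitors also visit the college.
    visitors = {"college": {1, 2, 3, 4}, "cafe": {1, 2, 5, 6}}
    W = dependency_weights(visitors)
    # Hypothetical external shock: 50% reduction of visits to the college.
    print(leontief_propagate(W, {"college": -0.5}, list(visitors)))
```

In this toy case the cafe receives no direct shock, yet its visits drop because it depends on the college; the feedback from the cafe back to the college also amplifies the college's own loss beyond the initial 50%.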
12:45 | Messengers: the strength of weak ties in swarm robotic networks for a spatial collective estimation scenario PRESENTER: Mohsen Raoufi ABSTRACT. The performance of collectives is largely influenced by the connectivity of their interaction network. For spatial networks, the connectivity is determined primarily by the interaction range, which is often a given configuration of the system and is difficult to alter. The performance of collectives decreases significantly where connectivity is extremely limited. However, in systems consisting of mobile agents, e.g. animal groups or mobile robots, it is possible to dynamically modify the network by the movement of agents in space. This facilitates the diffusion of information on the effective dynamic network. In our previous work [3], inspired by the wisdom-of-crowds effect, we studied the speed-vs-accuracy tradeoffs (SAT) in a distributed spatial collective estimation scenario. We highlighted the link between exploration-vs-exploitation and SAT, and quantified the impact of network connectivity on SAT using the DeGroot naïve social learning model [1] to achieve consensus in a static network, using real robotic swarms as our case study [4]. We introduced an artificial homophily for agents, where they move in space and get closer to neighbors with similar opinions. The so-called Exploitation helps the collective to increase precision in estimation and also results in an emerging collective contour-capturing behavior. This co-evolution of the network structure and opinion of agents shows rich dynamics. However, in low-connectivity settings, when the network becomes disconnected, the system can get trapped in local minima, causing the formation of echo chambers, as shown in Figure 1-a. This inhibition of information flows between echo chambers negatively impacts the collective performance, consensus achievement, and the collective contour-capturing task. 
A potential solution to the problem could restore the effective connectivity of the network. By doing so, the information can diffuse on the network and inform the other clusters with different estimations. A trivial, yet costly solution is to improve the hardware and increase the interaction range, but a more cost-effective alternative is to leverage the mobility of the robots to transmit information through physical movement, allowing for information exchange between clusters over distances larger than the communication range. To achieve this, we propose a new state for individual agents called “messenger”. Randomly moving Messenger agents can be considered as embodied data that can diffuse through space on length scales far larger than the actual communication range. A Messenger migrating from one cluster to another resembles and implements long, weak ties in a dynamic spatial network [2]. We propose a Dichotomous Markov Process (DMP) as a decentralized method for agents to decide when to behave either as a Messenger or an Exploiter. By changing the parameter of the DMP (PM, PE, probability of switching to Messenger, and Exploiter, respectively), we can vary the population and time-duration of Messengers, which relates to the number and length of corresponding new links in the dynamic network. Here, we evaluate the parameters of the DMP in terms of consensus precision in the opinion domain as well as contour capturing in physical space. Figure 1-b demonstrates the sub-optimal regions for each set of the DMP parameters. Please check this video on two example simulations with and without Messengers: https://tubcloud.tu-berlin.de/s/tL5TbJFKBSGjEP [1] Morris H DeGroot. Reaching a consensus. Journal of the American Statistical Association, 69(345):118–121, 1974. [2] Mark S Granovetter. The strength of weak ties. American journal of sociology, 78(6):1360–1380, 1973. [3] Mohsen Raoufi, Heiko Hamann, and Pawel Romanczuk. 
Speed-vs-accuracy tradeoff in collective estimation: An adaptive exploration-exploitation case. In 2021 International Symposium on Multi-Robot and Multi-Agent Systems (MRS), pages 47–55. IEEE, 2021. [4] Mohsen Raoufi, Pawel Romanczuk, and Heiko Hamann. Estimation of continuous environments by robot swarms: correlated networks and decision-making. In 2023 International Conference on Robotics and Automation (ICRA). IEEE, 2023. |
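The role of the two switching probabilities in the Messengers abstract above can be illustrated with a small simulation. This sketch is not the authors' controller: it assumes a discrete-time two-state Markov chain in which each agent independently switches to Messenger with probability PM and back to Exploiter with probability PE per step, and all function names and parameter values are hypothetical. Under that assumption, the long-run share of Messengers is PM / (PM + PE) and the mean time spent as a Messenger is 1 / PE steps.

```python
import random


def stationary_messenger_fraction(p_m, p_e):
    """Long-run fraction of Messengers for a two-state Markov chain."""
    return p_m / (p_m + p_e)


def simulate_dmp(p_m, p_e, n_agents=500, steps=400, burn_in=200, seed=0):
    """Simulate independent agents and return the mean Messenger fraction
    observed after the burn-in period."""
    rng = random.Random(seed)
    is_messenger = [False] * n_agents  # everyone starts as an Exploiter
    total = 0.0
    for t in range(steps):
        for a in range(n_agents):
            if is_messenger[a]:
                if rng.random() < p_e:
                    is_messenger[a] = False
            elif rng.random() < p_m:
                is_messenger[a] = True
        if t >= burn_in:
            total += sum(is_messenger) / n_agents
    return total / (steps - burn_in)


if __name__ == "__main__":
    print(stationary_messenger_fraction(0.1, 0.3))  # analytic value
    print(simulate_dmp(0.1, 0.3))                   # simulated estimate
```

Raising PM increases how many Messengers exist at any time (more long-range links), while lowering PE lengthens each Messenger episode (longer links), which matches the two tuning knobs the abstract describes.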
11:00 | A global bibliometric perspective on the social structure of science PRESENTER: Aliakbar Akbaritabar ABSTRACT. We reconstruct career-long productivity, impact, (inter)national collaboration, and (inter)national mobility trajectory of 8.2 million scientists worldwide using more than 28 million article and review publications in Scopus. We aim to debunk three well-established bibliometric myths in previous research about academics’ productivity, collaboration and mobility. Debunking these myths is only possible with a global perspective simultaneously considering all the influential bibliometric variables alongside the network of collaboration among scientists. To do so, we use multiple correspondence analysis and consider a combination of 12 widely-used bibliometric variables. We further analyse the networks of collaboration among these authors in form of a bipartite co-authorship network and detect densely collaborating communities using Constant Potts Model (CPM)'s Bipartite extension. Our results show that the claims of literature on increased productivity, collaboration, and mobility are principally driven by a small fraction of highly prolific, collaborative, mobile, and impactful scientists. These top 10% are driving the observed trends in the bibliometric literature. We find a hierarchically clustered structure with a small top class, and large middle and bottom classes. Investigating the composition of communities of collaboration networks in terms of these top to bottom classes and the academic age distribution shows that those at the top succeed by collaborating with a varying group of authors from other classes and age groups. Nevertheless, they are benefiting disproportionately to a much higher degree from this collaboration and its outcome in form of impact and citations. |
11:15 | Mapping Philanthropic Support of Science PRESENTER: Alexander Gates ABSTRACT. Nonprofit organizations and foundations play a crucial role in supporting science, representing around 44% of basic research funding at US universities. Unfortunately, our current understanding of scientific funding has primarily been limited to federal agencies, while in philanthropy, we have only been able to study subsets of donors. However, the recent release of US nonprofit tax records by the Internal Revenue Service (IRS) has enabled us to capture the complete swath of philanthropic foundation donors. Our analysis, the first of its kind, reveals a large number of nonprofit funders offering a wide range of support. We find that philanthropic funders tend to support geographically close recipients, with grant-giving relationships becoming increasingly entrenched over time. Furthermore, the bipartite network of supporters and recipients contains predictive power, allowing us to forecast future funder-recipient relationships. To track philanthropic funding to science we collect and disambiguate over 10 million grants listed on tax returns of 685,397 non-profit organizations in the US from 2010-2019. We identify a subset of organizations involved in scientific research and higher education, finding 695,917 grants, totaling over $200 billion to science. We next constructed a network in which nodes represent organizations, and a directed weighted link captures the grant amount from a funder to a receiver. In contrast to federal funds distributed nationally, philanthropic funding is strongly focused locally. If grants were distributed randomly across the nation (preserving the number of recipients in each state), about 5% of grants would be awarded in the donor’s home state. 
In contrast, we find that 49% of grants are awarded in the donor’s home state, a locality nearly 10-fold the random baseline. Likewise, when we examine grants over time, we find that 71% of grants repeat one year later, and for the 27,384 funding relationships ongoing for 7 years there is a nearly 90% likelihood of continuing. Funding relationships that persist over multiple years are also likely to involve higher annual amounts. Finally, a common phrase in philanthropy says “if you’ve met one funder, you’ve met one funder,” implying that each foundation has its own priorities and little can be gleaned from one funder's priorities about another. However, we find evidence of predictability in funding patterns. We examine a subset of funders active in both 2018 and 2019 and use the bipartite Adamic-Adar Index (AA) to predict new relationships. We find that the predictions obtained from the AA index from 2018 have strong predictive value for 2019, resulting in a remarkably high area under the receiver operating characteristic curve (AUROC) of 0.87. These findings have important implications for researchers, foundation funders, and government policymakers. Applying novel tools from machine learning and network science to philanthropic data could improve funding allocation, help funders better provide for recipients, boost recipient access to philanthropic resources, and enable policymakers to increase the impact of philanthropic funding. |
11:30 | Quantifying progress in research topics across nations PRESENTER: Kimitaka Asatani ABSTRACT. A scientist's choice of research topic affects the impact of their work and future career. While the disparity between nations in scientific information, funding, and facilities has decreased, scientists on the cutting edge of their fields are not evenly distributed across nations. Here, we quantify relative progress in research topics of a nation from the time-series comparison of reference lists from papers, using 71 million published papers from Scopus. We discover a steady leading–following relationship in research topics between Western nations or Asian city-states and others. Furthermore, we find that a nation's share of information-rich scientists in co-authorship networks correlates highly with that nation's progress in research topics. These results indicate that scientists' relationships continue to dominate scientific evolution in the age of open access to information and explain the failure or success of nations' investments in science. |
11:45 | Social Contagion in Science PRESENTER: Satyaki Sikdar ABSTRACT. Modern science has become increasingly collaborative over the past decades. Coauthors often expose us to new tools, methods, and theories, and large teams have become almost necessary to tackle complex problems in various disciplines. In this setting, collaboration networks, where nodes are authors and edges connect two nodes if they have coauthored a paper (see Fig. 1B), are platforms where dynamic processes unfold, facilitating the transmission of knowledge and ideas and allowing scientists to influence their peers in choosing future research directions, much like in a social contagion process. Model setup and data. Given a scientific topic $t$, reference year $T_0$, and window size $T$, we construct two consecutive non-overlapping exposure (EW) and observation windows (OW), spanning years $[T_0 - T, T_0)$ and $[T_0, T_0 + T)$ respectively. We identify active authors who published papers on the topic $t$ during the EW and use them to build the coauthorship graph. Inactive authors are those in the graph who are not active and are the candidates for influence in the OW. We use the OpenAlex dataset and report the results for 6 topics (in Fig. 1D) across three fields: Physics, Computer Science, and Biology & Medicine. We set the window size ($T$) as 5 years and consider multiple reference years ($T_0$). Results. In Fig. 1D, we plot the activation probability, the inverse cumulative probability (orange) that an inactive author becomes active in the OW, as a function of the number of active contacts ($k$) in the EW [1]. As expected, we see an increasing trend. In particular, the jump from $k = 0$ to $k = 1$ is remarkable, showing that the probability of spontaneous activation in the absence of previous contacts ($k = 0$) is much lower than that of induced activation via contagion ($k \geq 1$). Most growth occurs for low values of $k$, after which the curve flattens. 
The observed probabilities are compared with a baseline for simple contagion (dashed line), which assumes that authors act independently of each other. The empirical curves deviate from the baseline as the number of contacts grows, providing evidence of complex contagion. In conclusion, we shed light on complex information diffusion dynamics in collaboration networks by studying how peer pressure manifests in future topic switches for a scientist. References [1] G. Kossinets and D. J. Watts, “Empirical analysis of an evolving social network”, Science, vol.311, no.5757, pp.88–90, 2006. |
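The comparison between the empirical curve and the simple-contagion baseline can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the baseline takes the common independent-exposure form $P(k) = 1 - (1 - p_0)(1 - p)^k$, which the abstract does not spell out. It also estimates the empirical curve as a plain per-$k$ fraction rather than the inverse cumulative probability used in the talk. The function names and the `(k, activated)` record layout are hypothetical.

```python
def simple_contagion_baseline(k, p_contact, p_spont=0.0):
    """Activation probability after k independent exposures.

    Assumed form (not stated in the abstract): each active contact
    independently activates the author with probability p_contact,
    on top of a spontaneous activation probability p_spont.
    """
    return 1.0 - (1.0 - p_spont) * (1.0 - p_contact) ** k


def empirical_activation(records):
    """Fraction of inactive authors that became active, grouped by k.

    records: iterable of (k, activated) pairs, one per inactive author,
    where k is the number of active contacts in the exposure window and
    activated is True if the author published on the topic in the
    observation window.
    """
    totals, hits = {}, {}
    for k, activated in records:
        totals[k] = totals.get(k, 0) + 1
        hits[k] = hits.get(k, 0) + (1 if activated else 0)
    return {k: hits[k] / totals[k] for k in sorted(totals)}
```

Under this assumed baseline, an empirical curve that departs from `simple_contagion_baseline` as `k` grows is the kind of deviation the abstract reads as evidence of complex contagion.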
12:00 | Uncovering the universal nature of citation networks: From science of science to law of law and patterns of patents PRESENTER: Robert Mahari ABSTRACT. Human culture is built on our ability to record and accumulate knowledge. Perhaps one of the most sophisticated examples is the scientific system. Science accumulates knowledge over time by building on existing work through citations, which allow scientific communities to compress and use existing knowledge. Examining how scientists cite existing work has revealed many insights into the ways scientists combine existing knowledge to produce new knowledge. Although it is tempting to consider these insights as universal “laws” of citation, these patterns may be a result of the unique procedures and incentives of the scientific enterprise, and may not generalize to other systems. We explore the universality of citation dynamics by focusing on two additional, sophisticated knowledge systems – the common-law legal system and the U.S. patent system. All three systems – science, law, and patents – rely on citations to build on the past. In all three systems, citations are employed as the primary mechanism to draw upon the existing base of knowledge. While these systems are collaborative knowledge systems built on citations, they are distinct in terms of procedures and incentives. Anyone can attempt to publish science or file a patent, and merit is determined at the time of publication, while judges are carefully preselected but can publish opinions without review; scientists and inventors choose their own research problems while judges are assigned to cases; the number of scientists and inventors has been growing rapidly while entry to the judicial system is limited; science aims to be egalitarian whereas the legal system has a codified hierarchy. 
These contrasts in how the systems are organized and operate provide an ideal opportunity to test whether the “laws” of citation can be generalized beyond science. We show that, despite the stark differences between the three systems, the fundamental citation dynamics are remarkably universal, suggesting that the citation dynamics are largely shaped by intrinsic human constraints and robust against the numerous factors that distinguish the three systems. We demonstrate that the systems share similar characteristics across preferential attachment (F, J, N), citation recency (G, K, O), and diamonds in the rough (H, L, P). While some of the observed patterns can be explained by preferential attachment models, some emerge from collective behavior. We propose a new Collective Citation Model (A-D) that bases citation dynamics on the entire knowledge system, rather than just individual publications, and show that it gives rise to the empirical dynamics we observe. This model is able to better predict the trajectory of the most successful papers, the emergence of diamonds in the rough, and the growing number of references per publication. Our results build a strong bridge across three disparate systems, suggesting that theories and tools that describe human-based reference mechanisms (e.g., science, common law) can be translated into one another. |
12:15 | Hidden Directionality of Co-citation Network and Its Relation to the Impact of Scientific Papers PRESENTER: Dahan Choi ABSTRACT. Many scientific accomplishments result from the coevolution of knowledge. Investigating the relationships between scientific discoveries is crucial in comprehending the evolutionary paths of science and suggesting potential pathways for future scientific progress. In this study, we explore the latent directionality among scientific findings in a co-citation network while tracing the features of the papers. Co-citation is a measure of the relationship between two documents based on the number of times they are cited together in other documents. It provides valuable insights into the interconnections between scientific discoveries. For instance, co-citation analysis is widely employed by researchers to assess the similarity between two documents, as documents that are related tend to be cited together more often [1]. However, the level of relatedness between co-cited documents in a given document can differ, and many of them may be negligible. To better understand these relationships, it is necessary to identify essential edges, which are considered more informative connections than the others. To accomplish this, we establish a co-citation network using the Microsoft Academic Graph dataset, one of the largest and most comprehensive databases of scientific papers. Then, we apply the information entropy approach to extract the essential edges from the co-citation network. Information entropy can be used to quantify the heterogeneity of the weight of edges attached to each node. By utilizing this property, it is possible to quantify the number of connections deemed effective in a given node [2]. The Rényi entropy for node $i$ with the parameter $\alpha$ is given by $S_\alpha(i) = \frac{1}{1-\alpha} \ln\left(\sum_j \tilde{w}_{ij}^{\alpha}\right)$, where $j$ runs over the neighboring nodes of $i$ and the normalized weight is $\tilde{w}_{ij} = w_{ij} / \sum_j w_{ij}$. 
Note that the directionality of edges arises from the inequality between $\tilde{w}_{ij}$ and $\tilde{w}_{ji}$. The Rényi entropy $S_\alpha(i)$ approaches $\ln k(i)$ if all of the $\tilde{w}_{ij}$ are similar, while $S_\alpha(i) \simeq 0$ if a single dominating $\tilde{w}_{ij}$ exists. Therefore, we define the effective out-degree of node $i$ as $k_\alpha^{\rightarrow}(i) = \exp[S_\alpha(i)]$. To extract the essential neighbors from the viewpoint of each individual node, we choose only the top $k_\alpha^{\rightarrow}(i)$ neighbors. Then, we obtain the subnetwork composed of the most essential edges [2, 3]. We examined the relationship between the characteristics of papers (such as relative publication year, citation count, and novelty) and the direction of edges that arose during the normalization process (Fig. 1). Additionally, our investigation shows that the average similarity across the co-citation network increases during the extraction process while the average similarity of removed edges remains relatively constant. |
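The entropy-based edge extraction described above can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation. The abstract does not say how the effective out-degree is turned into an integer number of neighbors or which $\alpha$ is used, so the rounding of $\exp[S_\alpha(i)]$ and the default `alpha=2.0` are assumptions. The function names are hypothetical.

```python
import math


def renyi_entropy(weights, alpha):
    """Renyi entropy of the normalized edge weights of one node.

    weights: positive co-citation weights w_ij of the node's edges.
    """
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    if abs(alpha - 1.0) < 1e-12:
        # alpha -> 1 recovers the Shannon entropy
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** alpha for p in probs)) / (1.0 - alpha)


def essential_neighbors(neighbor_weights, alpha=2.0):
    """Return (kept neighbors, effective out-degree) for one node.

    neighbor_weights: dict mapping neighbor id -> weight w_ij.
    Assumption: the effective out-degree exp(S) is rounded to the
    nearest integer (at least 1) to decide how many neighbors to keep.
    """
    s = renyi_entropy(list(neighbor_weights.values()), alpha)
    k_eff = math.exp(s)
    n_keep = max(1, round(k_eff))
    ranked = sorted(neighbor_weights, key=neighbor_weights.get, reverse=True)
    return ranked[:n_keep], k_eff
```

Because the selection is made from each node's own viewpoint, node `i` may keep `j` while `j` does not keep `i`, which is where the directionality mentioned in the abstract comes from.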
12:30 | Modeling Scientific Recognition Process in Complex Awarding Systems – Going Beyond Citation PRESENTER: Ching Jin ABSTRACT. Scientific prizes confer credibility to persons, ideas, and disciplines, provide financial incentives, and promote community-building celebrations. Despite considerable efforts to predict prizewinning in the scientific community, most studies consider only citation-based metrics, and little is known about how interactions between prizes influence the prizewinning process. How do prizes rely on other prizes to make judgements? Do they award existing winners of high-status prizes to legitimate their own status, or do they prefer providing more opportunities to junior scholars with fewer prizewinning records? Answering these questions can deepen our understanding of the scientific recognition process and help us accurately predict future prize winners. In this study, we curated a large-scale dataset containing 13,000 prize winners receiving more than 3,000 prizes (Fig. 1D). For each prize-winning event, several pieces of information are collected, including prizewinning time, award type, and scholar records, enabling us to generate a prizewinning timeline for each individual scholar. To quantify how likely a prize is to be awarded to existing prizewinners, we propose a collective measure for each prize, the recognized elite fraction (REF, α), which is the fraction of winners who had won other prizes (recognized elite) before they won the particular prize. This measure not only provides a continuous generalization of the famous binary classification of prizes, awarding the future (e.g. MacArthur Fellowship) and awarding the past (e.g. Copley Medal) (Fig. 1A), but is also very stable over time (Fig. 1B), firmly locating prizes in a complex hierarchical system (Fig. 1C). Furthermore, this new measure allows us to quantitatively explore the prizewinning selection process (Fig. 1EF). 
Although the selection patterns appear very distinct for prizes of different prestige, they can be described by a universal function, enabling us to build a minimal selection model, which accurately captures the individual prizewinning trajectories and the complex collective behavior of the entire system (Fig. 1GH). This study provides new insights into the social function of prizes, which may have strong policy implications. |
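The recognized elite fraction can be computed directly from a list of prizewinning events. The sketch below is a minimal illustration under stated assumptions, not the authors' code. It counts each winner once (at the first win of the target prize) and treats "before" as a strictly earlier year. Neither choice is specified in the abstract, and the function name and `(scholar, prize, year)` layout are hypothetical.

```python
def recognized_elite_fraction(events, prize):
    """Fraction of a prize's winners who had already won another prize.

    events: list of (scholar, prize_name, year) tuples.
    Returns None if the prize has no recorded winners.
    """
    # Earliest year in which each scholar won the target prize
    first_win = {}
    for scholar, name, year in events:
        if name == prize:
            first_win[scholar] = min(year, first_win.get(scholar, year))
    if not first_win:
        return None

    elite = 0
    for scholar, win_year in first_win.items():
        # Recognized elite: won some other prize in a strictly earlier year
        if any(s == scholar and n != prize and y < win_year
               for s, n, y in events):
            elite += 1
    return elite / len(first_win)
```

A value near 0 would correspond to a prize that mostly awards the future, and a value near 1 to one that mostly awards the past, in the sense of the classification mentioned above.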
12:45 | Exploring visualizations of large networks and embeddings using Helios-Web ABSTRACT. The study of complex systems is a crucial aspect of modern science, and network science provides a framework for examining the intricate interactions that exist within many real-world systems. To address the challenge of handling networks with more than 10,000 nodes, we have developed a new network visualization and exploration tool named Helios-web [1]. This tool uses GPU-based rendering and continuous force-directed layouts to visualize networks of millions of nodes in real-time, making use of a variety of rendering techniques such as billboards [2] and signed distance fields [3]. The tool includes an API and interactive features, allowing users to search, filter, and highlight nodes or edges based on their attributes. It has been integrated into the OSoMe platform [4] for social media network visualization and exploration. A preliminary tool to visualize OpenAlex citation networks is also available [5]. Helios-web outperforms existing open-source solutions for network visualization, including Graphviz, Gephi, Cytoscape, graph-tool, igraph, networkx, Graphia, and 3d-force-graph, being capable of rendering networks with more than 50,000 nodes on modest hardware and more than one million nodes on better hardware. In addition to networks, projections of embeddings can also be explored with the tool, which also provides capabilities to display the neighbors of entities in the original space. In conclusion, Helios-web provides a platform that can handle large networks with millions of nodes and facilitates real-time visualization and exploration of dynamic complex networks. However, its main objective is not to replace existing open-source solutions, but to provide an alternative backend that can be used standalone or integrated into existing or new software and platforms. [1] https://github.com/filipinascimento/helios-web [2] L. Wagner, D. Limberger, W. Scheibel, M. 
Trapp, and J. Döllner, “A framework for interactive exploration of clusters in massive data using 3d scatter plots and webgl,” in The 25th International Conference on 3D Web Technology, pp. 1–2, 2020. [3] C. Green, “Improved alpha-tested magnification for vector textures and special effects,” in ACM SIGGRAPH 2007 courses, pp. 9–18, 2007. [4] https://osome.iu.edu/tools/networks [5] https://observablehq.com/@filipinascimento/openalex |
11:00 | Behavioral and Topological Heterogeneities in Network Versions of Schelling’s Segregation Model PRESENTER: Hiroki Sayama ABSTRACT. Agent-based network models of residential segregation have been of persistent interest to various research communities since their origin with Thomas Schelling. Frequently, these models have sought to elucidate the extent to which the collective dynamics of individuals’ preferences may cause segregation to emerge. This open question has sustained relevance in U.S. jurisprudence. Previous investigation of heterogeneity of behaviors (preferences) by Xie & Zhou (2012) has shown reductions in segregation on networks. Previous investigation of heterogeneity of topologies by Gandica, Gargiulo, & Carletti (2016) has shown no significant impact on observed segregation levels. Recent work by Sayama and Yamanoi (2020) has shown the importance of representing realistic heterogeneities in dynamical social network models. In this work, the necessity of concurrent representation of both behavioral and topological heterogeneities in network segregation models is examined. Extending the previous works, additional network simulations were conducted using both Xie & Zhou’s and Schelling’s preference models on 2D lattices with varied levels of densification to create topological heterogeneities (i.e., clusters, hubs). Results show a richer variety of outcomes, including novel differences in resultant segregation levels, fragmentation, and hub composition. Notably, with concurrent, increased representations of heterogeneous preferences and heterogeneous topologies, reduced levels of segregation and fragmentation emerge. Implications and areas for future study are discussed. Gandica, Y., Gargiulo, F., & Carletti, T. (2016). Chaos, Solitons, and Fractals, 90, 46-54. Sayama, H., & Yamanoi, J. (2020). NetSci-X 2020 Proceedings, pp. 171-181. Xie, Y., & Zhou, X. (2012). PNAS, 109(29), 11646-11651. |
11:15 | Segregation in high-resolution residential mobility network PRESENTER: Louis Claude Bernard Olivier Boucherie ABSTRACT. We present a study on residential segregation, utilizing high-resolution data from the Danish population registry. Our network-based approach employs a modified version of the Infomap community detection algorithm, which takes into account high-order flow with memory and the gravity law. We also generate representative samples by randomly selecting addresses within a community's convex hull. Our findings demonstrate that these overlapping communities exhibit greater homogeneity in socio-economic indicators compared to traditional administrative units. This suggests that the community-based administrative units are more suitable for designing public policy. |
11:30 | Socioeconomic segregation in friendship networks: Prevalence and determinants of same- and cross-SES friendships in US high schools. ABSTRACT. Chetty and colleagues (2022) show that friendship networks in high school are socioeconomically segregated and that parents’ school choices explain about half of the segregation, attributing the other half to “friending bias”. However, apart from SES homophily, many other processes within schools can induce socioeconomic segregation in friendship networks, which are not well understood. In this paper, I examine the prevalence and determinants of socioeconomic segregation in high school friendship networks using the Add Health survey, which collected friendship networks and detailed information about students. I exploit recent advances in exponential random graph modeling to understand how SES-stratified settings (e.g., ability tracking), homophilous tendencies (e.g., racial homophily), and endogenous network mechanisms (e.g., triadic closure) contribute to socioeconomic segregation in friendship networks. In particular, racial homophily is often thought to be a key driver of socioeconomic segregation, owing to the strong association between race and SES. However, since racial homophily promotes ties both within and between socioeconomic boundaries, its impact on socioeconomic segregation is unclear. The results show that friendship networks are socioeconomically segregated, especially at the bottom of the SES distribution. Surprisingly, segregation is neither determined by SES-stratified settings within schools nor by endogenous network mechanisms as their effects on segregation are ambivalent. Instead, the results reveal that socioeconomic segregation in high school friendship networks is determined by the interplay of race and SES homophily. |
11:45 | A Mobility-Informed Segregation Model PRESENTER: Daniele Gambetta ABSTRACT. Schelling's famous model suggests that even when individuals are open to living in neighborhoods with different ethnicities and would move only when their ethnicity is a small minority, a city may still end up sharply segregated. Although disarmingly simple, Schelling's model provides a fascinating look at how individuals might produce a non-desirable collective outcome even without intending to. Many variants of the model have been proposed so far, in all of which unhappy agents choose the next location randomly on the grid, pursuing individual happiness without any constraints related to mobility in the choice of the destination. However, extensive research on human mobility and migration suggests that distance and location popularity do play a crucial role in location choices. In this work, we implement a mobility-informed segregation model (MISMO), where agents move according to dynamics inspired by the gravity law of human mobility, i.e., agents prefer nearby locations to distant ones and more relevant locations to less relevant ones. In detail, an agent in location $A$ selects the next destination, $B$, based on a probability $P(B)$ that depends on the distance $d(A, B)$ and the relevance $\mathrm{val}(B)$: $P(B) \propto \mathrm{val}(B)^{\alpha}\, d(A, B)^{\beta}$. We assume two agent types and conduct several simulations on grids of different sizes and proportions of agent types, cell occupancy rates, and homophily thresholds. A simulation ends when all agents are happy or after a maximum of 500 steps. We quantify the final level of segregation as the average segregation of each agent, calculated as the ratio of neighbors belonging to the same group to the total number of neighbors. We find that, for $\beta < 0$ (i.e., the agent prefers nearby locations over far ones), MISMO converges to a final level of segregation lower than the classic Schelling model and in a higher number of steps. 
Moreover, the lower $\beta$, the lower the final level of segregation and the longer the model's convergence time. In other words, adding mobility constraints still leads to a segregated city, but more slowly and with lower segregation levels than in a city with random movements. We also investigate the role of $\mathrm{val}(B)^{\alpha}$, varying $\alpha$ while $\beta = 0$. We find that increasing the importance of cell relevance lengthens the convergence time, suggesting that agents compete for the same relevant cells and are therefore frequently close to the opposite group, which increases the time needed to find an equilibrium. Our study provides interesting insights into the relationship between segregation dynamics and human mobility laws, and opens the question of how to measure this relationship in real data about real urban dynamics. |
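The destination rule of MISMO can be illustrated with a short sketch. It is a minimal example, not the authors' implementation. Euclidean distance on grid coordinates is an assumption, since the abstract does not name the metric. The function names and the `relevance` dictionary are hypothetical. The origin cell must not appear among the candidates, because a zero distance raised to a negative exponent is undefined.

```python
import math
import random


def destination_probabilities(origin, candidates, relevance, alpha, beta):
    """P(B) proportional to val(B)**alpha * d(A, B)**beta for each candidate B.

    origin: (x, y) of the agent's current cell A.
    candidates: list of free cells (x, y), excluding the origin.
    relevance: dict mapping cell -> val(cell), with positive values.
    """
    weights = []
    for cell in candidates:
        d = math.dist(origin, cell)  # assumed Euclidean distance
        weights.append((relevance[cell] ** alpha) * (d ** beta))
    total = sum(weights)
    return [w / total for w in weights]


def choose_destination(origin, candidates, relevance, alpha, beta, rng=random):
    """Sample the next cell for an unhappy agent."""
    probs = destination_probabilities(origin, candidates, relevance, alpha, beta)
    return rng.choices(candidates, weights=probs, k=1)[0]
```

With `alpha = 0` and `beta = 0` every free cell is equally likely, which corresponds to the random relocation of the classic model. A negative `beta` shifts probability toward nearby cells.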
12:00 | Mobility segregation dynamics during pandemic interventions PRESENTER: Rafiazka Hilman ABSTRACT. The COVID-19 outbreak emerged as an external shock in all countries that altered the typical configurations of mobility networks. Its catastrophic impacts may perpetuate individual mobility that is already constrained by socioeconomic stratification [1, 2]. The existing literature suggests that in such situations people of higher income are more flexible in reducing their mobility, while low-income people have lower capacity to adjust their mobility and social distancing. Such differences can amplify mobility biases and lead to stronger mobility segregation patterns than usual [3, 4]. Moreover, individual adaptation in response to the pandemic depends on the strength of non-pharmaceutical interventions (NPIs) and the local severity, resulting in gradual recovery in visit patterns [5]. We analyse how residual isolation and segregation in mobility patterns change in response to COVID-19 in large urban areas such as New York, London, Jakarta and Bogota. We analyse mobility data for approximately 600K people and 200 million trajectories, covering transitions between characteristic pandemic periods: before lockdown (BL), during lockdown (L), and reopening (R). We combine this mobility dataset with source income data from the Central Bureau of Statistics at the census tract level to determine the socioeconomic status of both people (based on home inference) and the places they visited. In addition, we refer to the stringency index from the Oxford COVID-19 Government Response Tracker (OxCGRT) to identify periods of expected behavioural changes due to the different interventions. The structure of mobility patterns can be modelled as an attributed bipartite network $G=(U,P,E)$ with nodes identified as people’s homes and the places they visit, each of them associated with a socioeconomic label, and weighted links that describe visiting patterns and frequencies. 
The mobility stratification matrix $M_{ij}$, computed from the corresponding bipartite mobility network for each city, appears with a strong diagonal, indicating a high assortativity index $r$ already before the pandemic. Moreover, a dynamical plot of the assortativity index signals abrupt changes due to interventions, doubling the assortativity value for some cities (Fig. 1a). The difference in stratified mobility patterns between two consecutive periods (i.e. the difference of the corresponding $M_{ij}$ matrices) is captured by the Mobility Adjustment Matrix $S_{ij}$ and the derived Average Residual Isolation Effects $\mu_{re}$. Interestingly, we find residual segregation in most cities (Fig. 1b), indicating long-term effects of pandemic interventions on socioeconomic mixing. Finally, we address the question of which interventions contributed most strongly to the change of segregation in different intervention periods. Our results highlight population-level dynamical segregation phenomena observed at the individual level, which provide important conclusions for better policy design with more equal consequences among people from all socioeconomic classes. |
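To make the matrix quantities above concrete, the sketch below computes an assortativity index from a stratification matrix and the element-wise difference between two periods. It is an illustration under assumptions, not the authors' definition. It takes $r$ to be the Pearson correlation between the socioeconomic class index of people (rows) and of visited places (columns), weighted by the matrix entries. It does not attempt to reproduce $\mu_{re}$, whose formula is not given in the abstract. Function names are hypothetical.

```python
import math


def assortativity(matrix):
    """Pearson correlation of (row class, column class) weighted by M_ij.

    matrix: list of rows; entry [i][j] is the visit weight from people
    of class i to places of class j. Returns 0.0 if either margin has
    zero variance.
    """
    total = sum(sum(row) for row in matrix)
    cells = [(i, j, v / total)
             for i, row in enumerate(matrix)
             for j, v in enumerate(row)]
    mean_i = sum(i * p for i, _, p in cells)
    mean_j = sum(j * p for _, j, p in cells)
    cov = sum((i - mean_i) * (j - mean_j) * p for i, j, p in cells)
    var_i = sum((i - mean_i) ** 2 * p for i, _, p in cells)
    var_j = sum((j - mean_j) ** 2 * p for _, j, p in cells)
    if var_i == 0.0 or var_j == 0.0:
        return 0.0
    return cov / math.sqrt(var_i * var_j)


def adjustment_matrix(m_before, m_after):
    """Element-wise difference of two stratification matrices."""
    return [[a - b for a, b in zip(row_after, row_before)]
            for row_after, row_before in zip(m_after, m_before)]
```

A matrix concentrated on its diagonal gives `r` close to 1, so a rise in `r` between periods reflects visits becoming more confined to places of the visitors' own class.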
12:15 | Highways are barriers to urban social connections PRESENTER: Anastassia Vybornova ABSTRACT. Geographic distances influence social connections inside cities - even in our digital age [1,2]. Both the perception of physical distance and the likelihood of forming social ties across locations are influenced by infrastructural elements, as previous studies showed: public transport increases social connectivity along routes [3]; barriers to physical mobility influence the creation of social connections [4]. To directly investigate the impact of highways on social connections, we analyse how the spatial configurations of a city's social and infrastructural networks are correlated. We use a highly granular, georeferenced social network of mutual follower relationships between Twitter users in the top 50 US metropolitan areas [5]. For each of these cities, we create a gravity-law inspired configurational null model [6] that reflects the population density and distance distribution of users' home locations. We then overlay the spatial network of social connections with the network of highways from OpenStreetMap, and measure the average number of highways crossed by social network ties. We find that the probability of an edge crossing at least one highway is significantly lower for real social connections than for the null model, and validate these findings by several multivariate regression models. Our results confirm that urban highways, apart from causing spatial segregation patterns, also have a directly measurable negative correlation with the density of social connections, and unveil the importance of infrastructure to the study of increasing inequalities and social network fragmentation in urban areas. [1] D. Liben-Nowell et al. “Geographic routing in social networks”. In: Proceedings of the National Academy of Sciences 102.33 (2005), pp. 11623–11628. [2] G. Krings et al. “Urban Gravity: A Model for Inter-City Telecommunication Flows”. 
In: Journal of Statistical Mechanics: Theory and Experiment 2009.07 (2009), p. L07003. [3] M. Bailey et al. “Social Connectedness in Urban Areas”. In: Journal of Urban Economics 118 (2020), p. 103264. [4] G. Tóth et al. “Inequality Is Rising Where Social Network Segregation Interacts with Urban Topology”. In: Nature Communications 12.1 (2021), p. 1143. [5] E. Bokányi et al. “Universal Patterns of Long-Distance Commuting and Social Assortativity in Cities”. In: Scientific Reports 11.1 (2021), p. 20829. [6] P. Expert et al. “Uncovering Space-Independent Communities in Spatial Networks”. In: Proceedings of the National Academy of Sciences 108.19 (2011), pp. 7663–7668. |
12:30 | Mobility and transit segregation in urban spaces. PRESENTER: Nandini Iyer ABSTRACT. The ability for individuals to move throughout a city creates opportunities for reducing the impact of social and economic disadvantages. By facilitating movement within urban areas, transit systems can democratize accessibility to resources and opportunities, while also fostering social integration and interactions among individuals from different areas and/or sociodemographic backgrounds. Conversely, inequalities in transport services can hinder individuals from fulfilling their travel demands. In this work, we explore socioeconomic segregation in cities from the perspective of their transit systems and how they interact with the other layers of the urban segregation landscape. In our analyses, we combine socioeconomic data from the 2020 American Community Survey (ACS) with amenity visitation patterns from anonymized mobile phone traces provided by SafeGraph to estimate the mobility flows between areas (i.e., Census Block Groups - CBGs) in a given city. From these flows, we sampled the public transit networks for 15 US cities, leveraging on the General Transit Feed Specification (GTFS) data for those cities. Thus, we use the volume of mobility flows between block groups – and their respective economic breakdowns – to estimate the socioeconomic profiles of the travellers along the routes within the transit networks. From these mobility networks, we estimate the economic segregation level using the Index of Concentration at the Extremes (ICE). This allows us to characterise how segregated each edge in the transit network would be under different assumptions of demographic transit use. We compute two dimensions of experiential segregation (ES) for a given neighbourhood: the ES at the amenities its residents visit (Destination Segregation) and the ES while using the transit system to reach said destinations (Transit Segregation). 
Our stochastic approach to measuring transit use allows us to test different assumptions as to which economic groups are likely to use the transit system. Figure 1 depicts segregation under the assumption that every demographic uses the transit system with equal probability. We compare our empirical results to the measured segregation of a null model, which hypothesises that destination segregation is what fuels the identified disparities in transit segregation across income groups. Our findings suggest that the segregation experienced while using the transit system is reflective of underlying inequality in a city’s transport service. Panel A in Figure 1 shows that the transit system exhibits segregation disparities across income groups, but to a smaller extent than residential segregation. Panel B shows that transit segregation is not solely an artefact of destination segregation by comparing empirical results to those of the null model, in which levels of destination segregation converge to reflect the city’s economic composition. While segregation still exists in the transport and destination dimensions, Figure 1 conveys how individuals are exposed to the highest magnitudes of segregation in the residential dimension, with destination and, in particular, transit segregation offering potential avenues for reducing experiential segregation. |
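The Index of Concentration at the Extremes used above is conventionally defined as $(A - P)/T$, where $A$ and $P$ are the numbers of people in the most and least privileged groups and $T$ is the total population. The sketch below applies that formula to a single area and, as an assumption about how edge-level values could be obtained, to the pooled travellers of all flows routed over one transit edge. The function names and the `(affluent, deprived, total)` tuple layout are hypothetical.

```python
def ice(n_affluent, n_deprived, n_total):
    """Index of Concentration at the Extremes, in [-1, 1].

    +1: everyone belongs to the affluent extreme,
    -1: everyone belongs to the deprived extreme.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return (n_affluent - n_deprived) / n_total


def edge_ice(flows):
    """ICE of the travellers pooled over all flows using one transit edge.

    flows: list of (affluent, deprived, total) traveller counts, one
    tuple per origin-destination flow routed over the edge (assumed
    layout, not taken from the abstract).
    """
    affluent = sum(f[0] for f in flows)
    deprived = sum(f[1] for f in flows)
    total = sum(f[2] for f in flows)
    return ice(affluent, deprived, total)
```

Different assumptions about which economic groups ride transit would change the traveller counts passed to `edge_ice`, which is how the alternative scenarios mentioned in the abstract could be compared.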
12:45 | How networks shape diversity for better or worse PRESENTER: Andrea Musso ABSTRACT. Socio-diversity, the social analog of bio-diversity, is fundamental for innovation, productivity, and collective intelligence. How can it be promoted? This paper studies how social structure can promote and hinder socio-diversity, employing models of behavioral dynamics and numerical simulations. By introducing the structural diversity index, a quantifier revealing the propensity of a social structure to sustain diversity, we investigate how fundamental characteristics of social networks---degree heterogeneity, clustering, distance, and size---affect behavioral diversity. We show that degree-heterogeneity obstructs diversity, while clustering and distance favor it. These results open new perspectives for understanding how to change social structures to sustain more (behavioral) diversity and, thereby, societal innovation, collective intelligence, and productivity. |
0. The attractor structure of functional connectivity in networks of coupled logistic maps. PRESENTER: Venetia Voutsa ABSTRACT. Functional connectivity (FC) has been used in the field of complex systems as a means to capture dynamical processes on graphs by translating dynamical observations into pairwise relations of nodes. The comparison of functional connectivity to structural connectivity (SC), i.e. network architecture, quantifies the dominant dynamical phenomena on the graph and indicates to what extent the structural graph is visible through dynamics. Using the concept of two classes of functional connectivity, one based on simultaneous and one based on sequential activity of the nodes, in this talk we suggest a method to mechanistically understand the relationships between topology and dynamics for coupled logistic maps using cellular automata (CA) as a data analysis tool. In the case of networks of coupled logistic maps in their chaotic regime, we find that as the coupling strength increases, positive sequential SC/FC gives way to slight positive simultaneous SC/FC, followed by higher sequential activity again, and then succeeded by dominant synchronous activity of the system. Symbolic dynamics is often used as an approach for a better understanding of coupled logistic maps. Our results show that the system’s behaviour, indicated by the SC/FC correlations extracted from the symbolic series, is maintained in the coarse-grained space. This observation allows us to proceed further and use symbolic encoding to formulate equivalent CA models. The numerical experiments suggest that our method is able to replicate the behaviour of the initial system. The added noise enhances the SC/FC correlations since it favours a more uniform sampling of the CA attractors. 
Overall, our strategy gives insight into the question of how network architecture shapes the collective behaviour in the dynamical system of coupled logistic maps, by relating the SC/FC relationships to structural properties of the underlying attractors of the symbolic dynamics. |
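A network of coupled logistic maps and its symbolic encoding can be sketched as follows. The abstract does not give the coupling scheme, so the commonly used diffusive form $x_i(t+1) = (1-\varepsilon) f(x_i(t)) + \frac{\varepsilon}{k_i} \sum_{j \in N(i)} f(x_j(t))$ with $f(x) = r x (1 - x)$ is an assumption. The binary partition at 0.5 and all function names are assumptions too.

```python
import random


def logistic(x, r=4.0):
    """Logistic map; r = 4 lies in the chaotic regime."""
    return r * x * (1.0 - x)


def step(state, neighbors, eps, r=4.0):
    """One synchronous update of all nodes (assumed diffusive coupling)."""
    fx = [logistic(x, r) for x in state]
    new_state = []
    for i, nbrs in enumerate(neighbors):
        if nbrs:
            mean_nbr = sum(fx[j] for j in nbrs) / len(nbrs)
            new_state.append((1.0 - eps) * fx[i] + eps * mean_nbr)
        else:
            new_state.append(fx[i])  # isolated node: uncoupled map
    return new_state


def simulate(neighbors, eps, steps, seed=0, r=4.0):
    """Return the list of states visited from a random initial condition."""
    rng = random.Random(seed)
    state = [rng.random() for _ in neighbors]
    series = []
    for _ in range(steps):
        state = step(state, neighbors, eps, r)
        series.append(state)
    return series


def symbolize(series, threshold=0.5):
    """Coarse-grain each value to 0 or 1 (assumed binary partition)."""
    return [[1 if x > threshold else 0 for x in state] for state in series]
```

Simultaneous and sequential functional connectivity could then be estimated from the symbolic series by comparing node states at equal or successive time steps, and correlated with the adjacency structure stored in `neighbors`.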
1. Multi-agent herding control of complex systems PRESENTER: Andrea Lama ABSTRACT. Herding is a type of collective behaviour observed in complex systems, emerging in situations where a group of agents suddenly start to behave collectively in the same way [Zhao et al., PNAS, 2011]. Examples include people in a crowd starting to move in the same direction or investors buying the same stocks. In this talk, we discuss the herding control problem where a group of agents (the herders) is tasked with the goal of controlling the collective dynamics of another group of agents (the targets) so as to make some desired collective behaviour emerge [Auletta et al., Auton. Robots, 2022]. Differently from the problem of controlling a complex network by acting on a fraction of its nodes or edges, this problem entails steering the dynamics of a network of target agents via controlling the dynamics of a complex network of herders interacting with them. Specifically, after reviewing the current state-of-the-art, we focus on the problem of herding a group of stochastically diffusing target agents in the plane towards a desired region by orchestrating the collective behaviour of a group of cooperating herders with limited sensing. We design the herders as a network of coupled oscillators whose dynamics is adapted in a distributed manner to that of the targets. We split the herders' dynamics into a set of hierarchical actions that the herders can perform on different time scales. We then adjust the time scale separation to achieve the final goal. We investigate the effectiveness of the proposed control strategy and uncover the scaling relations involving the number of herders and targets, the sensing area of the herders and other key parameters. 
Numerical simulations show that, unlike the infinite sensing case, where the number of herdable target agents scales as the square of the number of herders, in the finite sensing case the number of targets that can be successfully herded by a given number of herders is, counterintuitively, not only upper bounded but also lower bounded. Based on this observation, we propose a set of sufficient conditions that guarantee the success of the herding task and make a group of target agents herdable. The theoretical results are validated numerically by analyzing the scaling laws between the sensing capability of the herders, the number of agents, and the initial configurations of herders and targets. Finally, we will discuss the applicability of the results to a set of representative real-world applications. |
2. A Shortcut to Stable Power Grids PRESENTER: Yunju Choi ABSTRACT. Frequency stability in modern AC-based power systems is crucial, particularly in renewable power grids, where there is typically a large regional bias in power generation and consumption. Modifying the connection structure of a power grid may improve its performance and stability. In reality, however, grid operators can rarely install new transmission lines because of public resistance motivated by environmental protection. It is therefore necessary to find a strategy that improves a power grid's stability while requiring only limited modification of the grid. This study aims to understand how to improve the synchronization stability of a power grid using a single additional transmission line, called a shortcut. In a simple ring structure that is divided into two parts symbolizing the biased power distribution, we analyze the synchronization stability of the nodes by adjusting the connection location of the shortcut. We find that adding a shortcut that divides the network equally in terms of the number of nodes increases the overall stability of the network. Counterintuitively, however, synchronization stability is not highest when the shortcut connects the two topological centers of the producers and consumers. We further observe and analyze the effects of the characteristics of the shortcut by systematically varying the network structure, in order to understand the nature of complex power grids and to build robust structures that can respond effectively to rapidly changing power environments. |
3. Role of a Community Structure in a Power–Grid Network PRESENTER: Jonghoon Kim ABSTRACT. The reliability of the power grid is crucial for modern society. Understanding the structural properties of power grids is important for maintaining a reliable power supply. In this study, we investigate community structure, one of the key factors influencing the dynamic stability of power grids, which has only been partially explored in previous studies. A community in a network is a subset of nodes that have more connections within the group than with the rest of the network. Since the dense connections within a community can act as an interaction mass that relaxes disturbances inside it, it is important to understand how different groups of nodes interact and eventually influence the overall stability of the grid. We evaluate the synchronization stability of power-grid nodes using three measures: Basin stability, Functional Secureness, and Functional Robustness. The measures are designed to estimate the nodes’ synchronization stability based on full- or partial-synchronization recovery from active or passive perspectives. As a comparative study, we investigate model networks that consist of a node chain or a community chain. The results show that the distribution of the synchronization stability remains similar in both topology models, which provides insights for designing more robust and reliable power grids. |
4. Counting Non-Isomorphic Graphs PRESENTER: Rana Shojaei ABSTRACT. Graph theory has unquestionably had a major influence on network science. One of the longstanding challenges in this domain is counting the number of simple graphs that can be constructed between a given number of unlabeled (indistinguishable) nodes. The main difficulty in counting arises from possible symmetries which result in identical graphs. Pólya’s theorem has been widely and successfully used to enumerate objects under permutation, from the various possible color patterns on a network with desired topology to the number of isomers for a chemical molecule. Indeed, it was in the context of this problem that the term ’graph’ was first coined. Given the popularity of both the theorem and the problem, it seems unlikely that this connection has not previously been made. However, we were not able to locate previous works that pursued this angle, and many researchers in modern network science seem to be unaware of the possibility. We have therefore chosen to revisit the counting of graphs with Pólya's theorem from a modern perspective. To count the number of graphs that can be constructed from $N$ unlabeled nodes, we consider the fully connected graph between $N$ nodes and ask how many different graphs can be constructed by removing a subset of the edges. This is identical to the number of non-isomorphic colorings of the edges in the fully-connected graph with $2$ colors, which can in turn be computed by Pólya's theorem. Hence, we need to consider all permutations of edges induced under the permutations of the nodes that leave the graph unchanged. For this purpose, we construct the set of all non-increasing sequences of positive integers, called $L$, so that $j_1 +2 j_2+3 j_3+ ... +nj_l= N$ in which $j_l$ indicates a partition of the graph with size $l$. 
We use $(n_1,n_2,...)$ to refer to permutation groups where there is one permutation between $n_1$ objects, another between $n_2$ objects, and so on, such that $(1,1,1,1)$ is the trivial permutation of $4$ objects. If $k_j$ is the multiplicity of the part $j$ in a sequence $l \in L$, the number of simple graphs $M$ that contain $N$ vertices can be computed as follows: \begin{equation} M = \sum_{l\in L} \frac{2^{A+B}}{C}, \quad \text{where} \quad A = \sum_{i} \left\lceil \frac{l_i - 1}{2} \right\rceil, \end{equation} \begin{equation} B = \sum_i\sum_{j>i} \gcd(l_i,l_j), \quad \text{and} \quad C=\prod_{j=1}^{N} k_j! \, j^{k_j}. \end{equation} In summary, there is a simple closed-form formula for the number of simple graphs between unlabeled nodes. We are likely not the first to discover this formula, and it does not lead immediately to an efficient algorithm. Nevertheless, we discuss some indications that future advances leading to an efficient algorithm may be possible, and that even the inefficient formula may be of use in analytical calculations. |
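As an illustration of the counting formula in the abstract above, the following minimal Python sketch sums $2^{A+B}/C$ over all integer partitions of $N$. It assumes that the first exponent is read as $\lceil (l_i-1)/2 \rceil$, which equals $\lfloor l_i/2 \rfloor$ for integers; the function names `partitions` and `count_graphs` are hypothetical and not taken from the authors' work.

```python
from collections import Counter
from fractions import Fraction
from math import factorial, gcd


def partitions(n, max_part=None):
    """Yield every non-increasing sequence of positive integers summing to n."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, max_part), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest


def count_graphs(n):
    """Number of simple graphs on n unlabeled nodes via the partition sum."""
    total = Fraction(0)
    for l in partitions(n):
        # A: edge cycles created inside each node cycle, floor(l_i / 2)
        a = sum(part // 2 for part in l)
        # B: edge cycles created between two different node cycles
        b = sum(gcd(l[i], l[j])
                for i in range(len(l)) for j in range(i + 1, len(l)))
        # C: prod_j k_j! * j^k_j, where k_j is the multiplicity of part j
        c = 1
        for part, k in Counter(l).items():
            c *= factorial(k) * part ** k
        total += Fraction(2 ** (a + b), c)
    # Exact rational arithmetic: the sum must come out as an integer.
    assert total.denominator == 1
    return int(total)


if __name__ == "__main__":
    print([count_graphs(n) for n in range(1, 8)])
```

The number of partitions grows quickly with $N$, which is in line with the abstract's remark that the formula does not immediately yield an efficient algorithm.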
5. Heterogeneous graph neural networks for Academic Collaboration Network characterisation from spurious data PRESENTER: Daniele Pretolesi ABSTRACT. This work examines how Heterogeneous Graph Neural Networks (HGNN) may be used to characterise the scientific production of Academic Collaboration Networks, using potentially flawed and incomplete data as a starting point. We cast our investigation on a dataset based on the publications of Machine Learning Genoa Center (MaLGa) faculty members. Leveraging our direct knowledge of MaLGa, we may accurately assess and interpret the obtained results and their quality. We start by collecting the list of papers published by MaLGa faculty since 1984. The publicly available dataset [1], sourced from an institutional public repository of scientific results, presents two key challenges: non-normalized author data and incomplete semantic attributes, including missing keywords and abstracts, due to its heterogeneity and potential sparsity. We tackle these issues by employing a preprocessing pipeline that uses authoring files and prior knowledge to normalise authors' names. We also complete the missing keyword using a keyword attribution strategy based on NLP and LLM, selected among several state-of-the-art methods [2]. The preprocessing phase described above facilitates the construction of an information-rich heterogeneous graph, which is utilized to characterize the Academic Collaboration Network (ACN) in terms of node-type prediction. We compare several embedding techniques on heterogeneous graphs with or without authors' information to identify the most effective approach. Specifically, we employ a predictive model based on Heterogeneous Graph Neural Networks (HGNN), as proposed in [3-4], which achieves excellent performance in our evaluation. |
6. Discrete model emulating properties of hyperbolic random graphs ABSTRACT. Geometrical random graphs on a hyperbolic disk are scale-free, small-world and have a large clustering coefficient. This is why they are widely used to model real-world networks. However, their widespread application is hampered by the fact that researchers in various applied fields are typically not familiar with hyperbolic geometry. Here we suggest a simple discrete model which reproduces the main properties of hyperbolic geometrical random graphs. Consider a regular tree with degree $p$ and $n$ generations, and add a bond between any two vertices if the shortest path connecting them on the tree is not longer than some given $m$. It is easy to show that for $m = n$ such a regular graph has a log-periodic discrete power-law degree distribution with scaling exponent -2, diameter 2, and clustering coefficient of order 1. The average degree of this graph is of the order of the square root of the number of vertices. We consider three modifications of this model, which remain sparse in the large n limit. First, we study the behavior of the networks if n tends to infinity while m stays finite. In this case the graph has a distinct core-periphery structure, with the core consisting of vertices of generations smaller than (n-m), which all have the same degree. However, in the periphery the power-law degree distribution is preserved, while the fraction of monomers in the core is finite and exponentially small for large m. As a result, the degree distribution converges to a truncated power law, while the diameter of the network diverges as n/m, i.e. logarithmically in the number of bonds. The other two generalizations correspond to introducing stochasticity in network formation by (i) introducing a random occupancy of nodes (i.e., nodes can be either occupied or empty, and only occupied nodes connect) or (ii) randomness in bond formation (i.e., nodes at distances less than m can connect with probability q < 1). 
Finally, we discuss the properties of a directed nearest-neighbour network on a tree, i.e., network where each node is connected by directed links to its k nearest neighbours. We show that the properties of these networks are similar to hyperbolic nearest-neighbour networks in continuous space. Thus, the suggested simple model reproduces the main properties of continuous hyperbolic graphs without relying on complicated calculations. We argue that such a simple formulation of a hyperbolic geometrical graph can be useful for pedagogical purposes, lead to further insights into their structure and help promote their interdisciplinary applications. |
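The deterministic construction in the abstract above (a regular tree plus bonds between all vertices at tree distance at most $m$) can be sketched in a few lines of Python. This is only an illustration under one assumption that the abstract leaves open: here "degree $p$" is taken to mean that every non-leaf vertex has $p$ children. The helper names are hypothetical.

```python
from collections import deque


def build_tree(p, n):
    """Rooted tree with n generations below the root; every non-leaf
    vertex has p children (an assumed reading of 'degree p')."""
    adj = {0: []}
    frontier, next_id = [0], 1
    for _ in range(n):
        new_frontier = []
        for v in frontier:
            for _ in range(p):
                adj[v].append(next_id)
                adj[next_id] = [v]
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return adj


def within_distance(adj, src, m):
    """Vertices whose tree distance from src is between 1 and m."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        if dist[v] == m:
            continue  # do not expand beyond depth m
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    del dist[src]
    return set(dist)


def threshold_graph(p, n, m):
    """Bond between any two vertices at tree distance <= m."""
    tree = build_tree(p, n)
    return {v: within_distance(tree, v, m) for v in tree}


def diameter(graph):
    """Largest shortest-path length, by BFS from every vertex."""
    longest = 0
    for src in graph:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        longest = max(longest, max(dist.values()))
    return longest


if __name__ == "__main__":
    g = threshold_graph(p=2, n=3, m=3)
    print(len(g), diameter(g), sorted(len(nb) for nb in g.values()))
```

For $m = n$ the root is bonded to every other vertex, so any two vertices are at most two steps apart, which matches the diameter stated in the abstract.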
7. A Principled, Flexible and Efficient Framework for Hypergraph Benchmarking PRESENTER: Nicolò Ruggeri ABSTRACT. In recent years hypergraphs have emerged as a powerful tool to study systems with multi-body interactions which cannot be trivially reduced to pairs. While highly structured benchmark models have proved fundamental for the standardized evaluation of algorithms and the statistical study of real-world networked data, these are scarcely available in the context of hypergraphs. Here we propose a flexible and efficient framework for the generation of hypergraphs with many nodes and large hyperedges, which allows one to specify general community structures and tune different local statistics. We illustrate how to use our model to sample synthetic data with desired features (assortative or disassortative communities, mixed or hard community assignments, etc.), benchmark community detection algorithms, and generate hypergraphs structurally similar to real-world data. Overcoming previous limitations on the generation of synthetic hypergraphs, our work constitutes a substantial advancement in the statistical modeling of higher-order systems. |
8. Characterizing spatial networks using β-skeletons PRESENTER: Szabolcs Horvát ABSTRACT. Most classic network analysis techniques were designed to be applicable to arbitrary, generic graphs. However, the nodes of many real-world networks exist in physical space, with only nearby nodes being connected. This strongly constrains their possible connectivity structures, rendering many classic graph measures uninformative and of limited use for classification. This is even more true in networks where only direct spatial neighbours are connected and long-range connections are completely missing. Examples include various transport networks in biological organisms (such as vasculature), networks of streets, fungal networks, etc. In all these cases, node locations almost completely determine connectivity. We propose a novel approach to characterizing such networks through the concept of β-skeletons, a family of parametrized proximity graphs that naturally capture spatial neighbour relations. Despite its great potential, this concept has so far been mostly ignored within the field of spatial network analysis. We study the statistical properties of β-skeletons using both exact and numerical approaches. Building on these results, we introduce an innovative way of characterizing spatial point patterns by analysing their skeletons. Finally, we use three-dimensional biological network datasets to demonstrate that β-skeletons accurately capture the structure of most direct-neighbour spatial networks based on their node locations, and can thus be used to gain insight into their local network structure. |
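To make the notion of a β-skeleton concrete, here is a minimal brute-force Python sketch. It assumes the common lune-based definition for β ≥ 1 (two points are joined when no third point lies strictly inside the intersection of two balls of radius β·d/2); the abstract does not say which variant the authors use, and the function name `beta_skeleton` is hypothetical. With β = 1 this reduces to the Gabriel graph and with β = 2 to the relative neighbourhood graph.

```python
from itertools import combinations
from math import dist


def beta_skeleton(points, beta=1.0):
    """Edges (i, j) of the lune-based beta-skeleton, for beta >= 1.

    points: list of coordinate tuples (any dimension).
    Brute force, cubic in the number of points; for illustration only.
    """
    if beta < 1:
        raise ValueError("this sketch only covers beta >= 1")
    edges = []
    for i, j in combinations(range(len(points)), 2):
        p, q = points[i], points[j]
        radius = beta * dist(p, q) / 2
        # Centres of the two balls whose intersection forms the lune.
        c1 = tuple((1 - beta / 2) * a + (beta / 2) * b for a, b in zip(p, q))
        c2 = tuple((beta / 2) * a + (1 - beta / 2) * b for a, b in zip(p, q))
        blocked = any(
            dist(r, c1) < radius and dist(r, c2) < radius
            for k, r in enumerate(points)
            if k not in (i, j)
        )
        if not blocked:
            edges.append((i, j))
    return edges


if __name__ == "__main__":
    triangle = [(0, 0), (4, 0), (2, 3)]
    print(beta_skeleton(triangle, beta=1.0))  # Gabriel graph
    print(beta_skeleton(triangle, beta=2.0))  # relative neighbourhood graph
```

Increasing β enlarges the forbidden region, so the skeleton can only lose edges; in the small example the longest side of the triangle is dropped at β = 2.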
9. Beyond space and blocks: Generating networks with arbitrary structure PRESENTER: Remy Cazabet ABSTRACT. The mesoscale organization of networks is one of the most studied topics in network science. Some structures in particular have attracted a lot of attention, such as the block or community structure, the spatial structure, or the core-periphery structure. Many works have been published on how to detect these structures in observed networks, and how to generate random graphs having such a structure. In this work, we propose a framework to generate random graphs 1) having a desired number of nodes and edges, 2) following a desired structure (not limited to blocks and spatial structures), and 3) whose structure strength is controlled with a single parameter, from deterministic to fully random. |
10. Heuristic Modularity Maximization Algorithms for Community Detection Rarely Return an Optimal Partition or Anything Similar PRESENTER: Samin Aref ABSTRACT. Community detection is a classic problem in network science with extensive applications in various fields. The most commonly used methods are the algorithms designed to maximize modularity over different partitions of the nodes into communities. Using 91 real and random graphs from a wide range of contexts, we investigate the extent to which current heuristic modularity maximization algorithms succeed in maximizing modularity by evaluating (1) the fraction of their output modularity value for a graph over the maximum modularity of that graph and (2) the similarity between their output partition and a modularity-maximizing partition. Our computational experiments involve eight existing modularity-based heuristic algorithms, which are used by no less than tens of thousands of peer-reviewed studies. We compare them against an exact integer programming method that returns a modularity-maximizing partition. The average modularity-based heuristic algorithm returns optimal partitions for only 15.1\% of the 91 graphs considered. We also observe a considerable dissimilarity, in terms of normalized adjusted mutual information, between the sub-optimal partitions and optimal partitions of the graphs in our experiments. More importantly, our results show that near-optimal solutions tend to have partitions disproportionally dissimilar to an optimal partition. Taken together, our analysis points to a crucial limitation of commonly used modularity-based algorithms for discovering communities: They rarely return an optimal partition or a partition closely resembling an optimal partition. Given this finding, developing an exact or approximate algorithm for modularity maximization is a much-needed requirement for a more methodologically sound usage of modularity for discovering communities. |
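The two evaluation quantities in the abstract above (the ratio of a heuristic's modularity to the maximum modularity, and the comparison against a modularity-maximizing partition) can be illustrated on a toy graph. The sketch below is not the authors' integer programming method: it finds an optimal partition by exhaustive enumeration, which is only feasible for a handful of nodes. All names and the example graph are hypothetical.

```python
def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    label = {v: c for c, block in enumerate(communities) for v in block}
    internal = [0] * len(communities)  # edges inside each community
    degree_sum = [0] * len(communities)  # total degree of each community
    for u, v in edges:
        degree_sum[label[u]] += 1
        degree_sum[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in range(len(communities)))


def set_partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        yield [[first]] + partition


def best_partition(nodes, edges):
    """Exhaustive search for a modularity-maximizing partition (tiny graphs only)."""
    return max(set_partitions(list(nodes)),
               key=lambda part: modularity(edges, part))


# Toy graph: two triangles joined by a single bridge edge.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

if __name__ == "__main__":
    optimum = best_partition(range(6), EDGES)
    q_max = modularity(EDGES, optimum)
    # A made-up sub-optimal output standing in for a heuristic's result.
    heuristic = [[0, 1], [2, 3], [4, 5]]
    print(optimum, q_max, modularity(EDGES, heuristic) / q_max)
```

The partition count grows as the Bell numbers, so an exact method on graphs of realistic size needs a formulation such as the integer program mentioned in the abstract.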
11. Degree distributions under general node removal: Power-law or Poisson? PRESENTER: Mi Jin Lee ABSTRACT. Perturbations made to networked systems may result in partial structural loss, such as a blackout in a power-grid system. Investigating the resultant disturbance in network properties is quintessential to understand real networks in action. The removal of nodes is a representative disturbance, but previous studies are seemingly contrasting about its effect on arguably the most fundamental network statistic, the degree distribution. The key question is about the functional form of the degree distributions that can be altered during node removal or sampling, which is decisive in the remaining subnetwork's static and dynamical properties. In this work, we clarify the situation by utilizing the relative entropies with respect to the reference distributions in the Poisson and power-law form. Introducing general sequential node removal processes with continuously different levels of hub protection to encompass a series of scenarios including random removal and preferred or protective removal of the hub, we classify the altered degree distributions starting from various power-law forms by comparing two relative entropy values. From the extensive investigation in various scenarios based on direct node-removal simulations and by solving the rate equation of degree distributions, we discover in the parameter space two distinct regimes, one where the degree distribution is closer to the power-law reference distribution and the other closer to the Poisson distribution. |
12. Characterization of Resilience/Efficiency Optimized Networks ABSTRACT. The network structure of ecosystems has been widely investigated and found to possess distinct topological properties that demonstrate the ability to persist when faced with system perturbations. Understanding and utilizing these properties could have important implications for real-world applications to complex networks. To identify and utilize these properties, this paper attempts to characterize this topology by describing three conventional network models. The methods used to generate the network models are presented, as well as the analysis results and a discussion of potential implications. The network topology present in modern ecosystems is a balanced function of efficiency and resilience, indicating that the ability to persist is a function of two information-theoretic metrics: Ascendancy, a measure of network efficiency, and Reserve, a measure of network resiliency quantifying the available actions with which adaptability may occur. This balance is necessary as a highly efficient network has an increased probability of failure, given perturbations. A highly resilient network wastes resources and is susceptible to cascading failures as they propagate through hubs and communities. Throughout the academic literature, this Resilience/Efficiency metric has been used at the system level to characterize network flows in such varied disciplines as Economics, Industrial Ecology and Civil Engineering. Random network models (N = 1,000) were generated using three different methods: Erdős-Rényi Random Network, Watts-Strogatz Small World and Barabási-Albert Scale Free. Resilience/Efficiency metrics were calculated for each network and plotted along the Resilience/Efficiency curve to provide a full spectrum of networks. These networks were partitioned (N = 100) into three distinct categories: Highly Resilient, Highly Efficient and Optimally Resilient/Efficient. 
Network property means were calculated for each network category. These properties include Edge Density, Average Degree, Transitivity, Average Path Length, Assortativity and Global Efficiency. Characterization of these three models, when viewed through a Resilience/Efficiency Optimality perspective, indicates significant differences, with network centrality measures offering further detail. These differences may indicate how the ability to persist varies, implying that their topological characteristics play an essential role. These differences may help explain why real-world network representations of these systems possess differences in stability and the ability to propagate failure or are more able to adapt. |
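The abstract above does not spell out how Ascendancy and Reserve are computed. As a hedged illustration, the Python sketch below uses the standard flow-based definitions from Ulanowicz's ecological network analysis, assuming these are the ones intended; the function name `flow_metrics` and the input format are hypothetical. With flows $T_{ij}$, row sums $T_{i\cdot}$, column sums $T_{\cdot j}$ and total $T_{\cdot\cdot}$, Ascendancy is $\sum T_{ij}\log\frac{T_{ij}T_{\cdot\cdot}}{T_{i\cdot}T_{\cdot j}}$, Reserve is $-\sum T_{ij}\log\frac{T_{ij}^2}{T_{i\cdot}T_{\cdot j}}$, and their sum is the development capacity.

```python
from math import log2


def flow_metrics(flows):
    """Return (ascendancy, reserve, capacity) for a weighted directed network.

    flows: dict mapping (source, target) to a positive flow T_ij.
    Definitions assumed from Ulanowicz's framework; logs are base 2 (bits).
    """
    total = sum(flows.values())
    out_flow, in_flow = {}, {}
    for (i, j), t in flows.items():
        out_flow[i] = out_flow.get(i, 0.0) + t
        in_flow[j] = in_flow.get(j, 0.0) + t
    ascendancy = sum(
        t * log2(t * total / (out_flow[i] * in_flow[j]))
        for (i, j), t in flows.items()
    )
    reserve = -sum(
        t * log2(t * t / (out_flow[i] * in_flow[j]))
        for (i, j), t in flows.items()
    )
    capacity = -sum(t * log2(t / total) for t in flows.values())
    return ascendancy, reserve, capacity


if __name__ == "__main__":
    # A directed 3-cycle: every flow is fully constrained, so reserve is zero.
    cycle = {(0, 1): 1.0, (1, 2): 1.0, (2, 0): 1.0}
    # All six directed links between three nodes: many redundant pathways.
    dense = {(i, j): 1.0 for i in range(3) for j in range(3) if i != j}
    for name, net in (("cycle", cycle), ("dense", dense)):
        a, r, c = flow_metrics(net)
        print(name, round(a, 3), round(r, 3), round(a / c, 3))
```

The ratio of Ascendancy to capacity moves from 1 for the fully constrained cycle towards lower values as redundant pathways are added, which is the efficiency versus resilience trade-off the abstract refers to.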
13. Bow-tie structures of Twitter discursive communities PRESENTER: Fabio Saracco ABSTRACT. Bow-tie structures were initially introduced for the description of the World Wide Web (WWW) [1]: if we represent websites as nodes in a directed network in which the edges are the hyperlinks connecting them, then most websites belong to a weakly connected component (WCC) with a peculiar structure. In a nutshell, the definition of the bow-tie’s groups of nodes, or "sectors", depends on whether they can reach or can be reached by the largest strongly connected component (SCC) of the WCC. In recent analyses of Twitter debates, the literature has focused on discursive communities, i.e. communities of accounts interacting among themselves via retweets, that is, by sharing the messages produced by other accounts [2]. In the present work [3], we analyse the network structure of the discursive communities of 8 different thematic Twitter data sets in different languages, related to various debates, political or not, in Europe. Surprisingly, we observe that almost all discursive communities therein display a bow-tie structure, with small differences. In general, discursive communities displaying a bow-tie structure are frequent in political debates, while they are absent when the topic of the discussion is different, such as sport events, as in the case of the Euro2020 data sets. A closer inspection of the quality of the shared content gives a clearer idea of the implications of the presence of such structures in the debate. Using the domain annotation from the fact-checking website Newsguard, we consider the quality of the contents flowing inside or between the various groups: the content with the lowest quality is the one produced and shared in the SCC. In most of our datasets, we observe that the largest sector is OUT, i.e. all nodes that can be reached by nodes in the SCC. 
A large OUT block implies that the greatest part of the accounts has access to a great variety of contents (since, according to the definition in [1], it is accessible by nearly all sectors of the bow-tie, see Fig. 1), but the quality of the content shared is, in general, quite low. Indeed, access to a great number of questionable sources (here, accounts sharing non-reliable URLs) is exactly the definition of an infodemic according to the World Health Organization (WHO). [1] A. Broder et al., 2000, Graph structure in the web, Computer Networks [2] C. Becatti et al., 2019, Extracting significant signal of news consumption from social networks: the case of Twitter in Italian political elections, Palgrave Commun. [3] M. Mattei et al., 2022, Bow-tie structures of twitter discursive communities, Scientific Reports 2022 12:1 |
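The sector definitions used in the abstract above (SCC, IN as the nodes that can reach the SCC, OUT as the nodes reachable from it) can be sketched directly with two reachability searches. This is a simplified illustration, not the authors' pipeline: it works on the whole directed graph rather than on the largest WCC, it lumps tubes, tendrils and disconnected nodes into one "OTHER" group, and it finds the largest SCC by a simple but slow intersection of forward and backward reachable sets. The function names are hypothetical.

```python
def reachable(adj, sources):
    """All nodes reachable from `sources` by following edges in `adj`."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        v = stack.pop()
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def bow_tie(edges):
    """Split the nodes of a directed graph into SCC, IN, OUT and OTHER."""
    nodes = {v for edge in edges for v in edge}
    forward, backward = {}, {}
    for u, v in edges:
        forward.setdefault(u, []).append(v)
        backward.setdefault(v, []).append(u)

    # Largest strongly connected component: the SCC of v is the set of
    # nodes that v can reach and that can also reach v.
    scc, unassigned = set(), set(nodes)
    while unassigned:
        v = next(iter(unassigned))
        component = reachable(forward, [v]) & reachable(backward, [v])
        unassigned -= component
        if len(component) > len(scc):
            scc = component

    out_sector = reachable(forward, scc) - scc   # reached from the SCC
    in_sector = reachable(backward, scc) - scc   # can reach the SCC
    other = nodes - scc - in_sector - out_sector
    return {"SCC": scc, "IN": in_sector, "OUT": out_sector, "OTHER": other}


if __name__ == "__main__":
    # 0 -> 1 -> 2 -> 0 forms the core; 3 feeds it, 4 is fed by it,
    # and 5 hangs off node 3 without touching the core.
    toy = [(0, 1), (1, 2), (2, 0), (3, 0), (2, 4), (3, 5)]
    print(bow_tie(toy))
```

In a retweet network the edge direction has to be fixed by convention (for example, from the retweeted account to the retweeting one); the abstract does not state which convention is used, so the sketch leaves that choice to the caller.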
14. How does news sentiment affect propagation on Twitter? ABSTRACT. Since its creation in 2006, the microblogging platform Twitter has been slowly transforming into a platform for receiving news. Twitter has many unique characteristics, like the retweet button, which aid in the propagation of news. Previous studies have shown that over 70 percent of Twitter users receive news through Twitter. Additionally, sentiment analysis of a large tweet database has shown that negative tweets are more prevalent on Twitter, and that demographic characteristics affect the response to negative and positive news. In this study, we provide a model to simulate the diffusion of news based on demographics and network structure. We simulate the demographic and degree distribution of Twitter and observe the diffusion of positive and negative news by calculating the utility of nodes if they share the news. We compare different types of networks and demographics to investigate the speed, reach, and propagation of positive and negative news. This report finds that the retweet volume of negative news is more than double that of positive news, and the speed of diffusion is twice that of positive news as well. |
15. A network framework to assess space sustainability PRESENTER: Matteo Romano ABSTRACT. Networks are used as descriptive and representative tools for several applications involving complex systems. This work uses networks as a tool to represent the complex reality of the orbital environment around the Earth, populated by thousands of satellites and debris which every day are involved in close encounters with a high probability of collision. Instead of focusing on the single encounters, the proposed approach considers the larger picture produced by the interactions between the various objects of the population. The members of this population that can be observed are listed in an online catalogue which is updated daily and contains information about their orbital state (position and velocity), allowing for the precise propagation of their trajectories in the near future. The large number of trajectories is then reduced to avoid performing thousands of direct comparisons: a filtering process compares the relative geometry of the orbits to predict whether there is the potential for an encounter between two objects and, finally, whether one will occur. The remaining objects are then represented as the nodes of a network, in which a link between two nodes is established whenever a possible collision between two objects is predicted. By weighting the links using the values of collision probability, the topology and properties of the network are studied to obtain a global assessment of the risk of collisions within the population and of their consequences on the safety of space activities. Different measures are defined to quantify how dangerous or unsafe an object is with respect to others or how severe the consequences of a collision may be. This method is used to formulate criteria to judge the hazard level of the population and guide the creation of new standards for space sustainability. |
16. A Centrality Ranking Strategy in Modular Complex Networks PRESENTER: Hocine Cherifi ABSTRACT. Centrality measures rank nodes using the classical descending order ranking scheme, with the top being the most influential. However, this ranking scheme has a major drawback. It is agnostic of the mesoscopic organization of most real-world networks. To tackle this issue, we propose a community-aware ranking scheme identifying influential nodes across all the network communities [1]. Given a centrality measure, the proposed strategy selects the top central nodes in each community and ranks them in decreasing order of their community size. Then it moves to the next most central node in each community and proceeds with the same ordering strategy. One iterates this process until reaching the given budget of nodes. This strategy allows the selection of influential nodes in all the communities without saturating their zone of influence. We investigate the proposed approach with the Susceptible-Infected-Recovered (SIR) diffusion model on a set of synthetic and real-world networks using two local centrality measures (Degree and Maximum Neighborhood Component), two path-based measures (Betweenness and Closeness), and two iterative refinement measures (Katz and PageRank). We compare it with the classical descending order ranking scheme based on the final outbreak size (i.e., the nodes in the recovered state). Figure 1 (A) illustrates the relative difference in the outbreak size (ΔR) between the ranking schemes for the Degree and Closeness centrality in the EU Airlines and AstroPh networks. A positive value of (ΔR) indicates that the proposed ranking scheme is more effective in igniting a larger diffusion. One can notice that the community-aware ranking scheme consistently outperforms its alternative. Figure 1 (B) represents the EU Airlines and AstroPh networks. The big nodes are the top 15% extracted by Degree centrality in the two ordering strategies. 
The nodes selected by the proposed ranking scheme are spread across the network. In contrast, with the classical descending order ranking scheme, they concentrate on the core of the networks. Consequently, the diffusion dies out internally, leaving the remaining communities intact. We observe similar behavior in the other networks and centrality measures under investigation. One can use this generic community-aware ranking scheme with various centrality measures in all network types (undirected/directed and unweighted/weighted). Its main advantage is that it naturally selects distant nodes to expand any diffusion phenomenon for any given budget. It is suitable for use cases such as viral marketing, awareness campaigns, hindering misinformation on networks, and vaccination strategies in networks with community structure. [1] Rajeh, Stephany, and Hocine Cherifi. "Ranking influential nodes in complex networks with community structure." Plos one 17.8 (2022): e0273610, https://doi.org/10.1371/journal.pone.0273610 |
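The selection procedure described in the abstract above can be written as a short round-robin over communities. The Python sketch below is one reading of that description, not the authors' reference implementation: communities are visited in decreasing order of size, and in round $r$ each community contributes its $r$-th most central node until the budget is reached. The function name and the toy data are hypothetical, and ties are broken by input order.

```python
def community_aware_ranking(centrality, communities, budget):
    """Pick `budget` nodes by cycling over communities, largest first.

    centrality:  dict node -> centrality score (any measure)
    communities: list of lists of nodes (a partition of the network)
    """
    # Communities in decreasing size; inside each, nodes by decreasing score.
    ordered = sorted(communities, key=len, reverse=True)
    queues = [sorted(block, key=lambda v: centrality[v], reverse=True)
              for block in ordered]

    ranking, round_index = [], 0
    while len(ranking) < budget and any(round_index < len(q) for q in queues):
        for queue in queues:
            if round_index < len(queue) and len(ranking) < budget:
                ranking.append(queue[round_index])
        round_index += 1
    return ranking


if __name__ == "__main__":
    scores = {"a": 5, "b": 4, "c": 3, "d": 2, "e": 1}
    blocks = [["d", "e"], ["a", "b", "c"]]
    print(community_aware_ranking(scores, blocks, budget=4))
    # Classical descending-order ranking for comparison:
    print(sorted(scores, key=scores.get, reverse=True)[:4])
```

In the toy example the classical ranking spends three of its four picks inside the larger community, while the community-aware ranking reaches the smaller community on its second pick.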
17. Inferring networks from the microgenomic abundance data PRESENTER: Gorka Buenvaron-Campo ABSTRACT. Microorganisms are agents that play a decisive role in hundreds of large-scale processes in the macroscopic world. Around 1-3\% of the body mass of a living being is made up of trillions of microorganisms that participate symbiotically in metabolic processes and are essential for the correct functioning of these processes \cite{10.1111/j.1753-4887.2012.00493.x}. The evolution of macro and microscopic beings is linked to the evolution of the interactions established between them. These microscopic beings are also largely responsible for other processes on a planetary scale; for example, despite accounting for only 1-2\% of global plant carbon, phytoplankton are responsible for processing around 40\% of total CO2 through photosynthesis \cite{falkowski1994role}. Prokaryotic carbon is 60-100\% of the estimated total carbon in plants, and prokaryotes represent the largest pool of N and P on Earth. An estimated $4$-$6 \times 10^{30}$ prokaryotic cells occur on Earth, mainly in the open ocean, soil, and oceanic and terrestrial surfaces \cite{whitman1998prokaryotes, eguiluz2019scaling}. Understanding how micro-life interacts and forms consortia may be extremely helpful for facing global challenges such as climate change, developing more efficient drugs and treatments, or optimizing industrial processes. The current work focuses on the eukaryotic metagenome. Several experiments have performed an extensive data acquisition task, in which the abundance of different phylogenetic units has been identified and accounted for in numerous samples. 
Specifically, the data from the Tara \cite{Tara} and Malaspina \cite{Malaspina} expeditions, as well as experiments in Mesocosmos \cite{Mesocosm} conducted at KAUST, consist of very rich datasets that have phylogenetic data of the identified organisms, the abundance found of each of these in different samples, and some metadata about the samples themselves, such as wind/water direction, geolocation, temperature, sample depth, etc. Given the heterogeneity of the protocols used in each of the experiments, we are currently developing the methodology on a selection of samples, but the intention is to apply it in the future to the Gene-Sphere database collected at KAUST. Specifically, our methodology focuses on the study of pairwise similarity between eukaryotic and prokaryotic organisms using geographic co-abundance as a measure of similarity, comparing abundance profiles between pairs. This approach enables clustering techniques to identify groups of organisms in the samples. To identify these groups, community detection algorithms can be applied that identify modules in the network based on pairwise relationships. Alternatively, more sophisticated techniques such as embedding methods allow us to find clusters that aggregate different organisms according to their properties. |
18. Effect of spatial correlations on Hopfield Neural Network and Dense Associative Memories PRESENTER: Giulio Iannelli ABSTRACT. The Hopfield model is one of the few neural networks for which analytical results can be obtained. However, most of these results are derived under the assumption of random uncorrelated patterns, while in real-life applications the data to be stored show non-trivial correlations. In the present paper we study how the retrieval capability of the Hopfield network at null temperature is affected by spatial correlations in the data we feed to it. In particular, we use as patterns to be stored the configurations of a linear Ising model at inverse temperature beta, thus limiting our analysis to exponentially decaying spatial correlations. Exploiting the signal-to-noise technique, we obtain a phase diagram in the load of the Hopfield network and the Ising temperature where a fuzzy phase and a retrieval region can be observed. Remarkably, as the spatial correlation inside patterns is increased, the critical load of the Hopfield network diminishes, a result also confirmed by numerical simulations. The analysis is then generalized to Dense Associative Memories with arbitrary odd-body interactions, for which we obtain analogous results. |
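The setup described in this abstract can be illustrated with a minimal sketch. It assumes an open Ising chain with coupling J = 1 and zero field, the standard Hebbian rule, and synchronous zero-temperature updates; the function names are hypothetical and none of this is taken from the authors' implementation.

```python
import math
import random

def ising_chain_pattern(n, beta, rng):
    """Sample an open 1D Ising chain (assumed J = 1, zero field) at inverse temperature beta."""
    # For an open chain, neighbouring spins agree with probability e^beta / (2 cosh beta).
    p_same = math.exp(beta) / (2.0 * math.cosh(beta))
    spins = [rng.choice((-1, 1))]
    for _ in range(n - 1):
        spins.append(spins[-1] if rng.random() < p_same else -spins[-1])
    return spins

def hebbian_weights(patterns):
    """Hebbian couplings w_ij = (1/N) sum_mu xi_i^mu xi_j^mu with zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for xi in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += xi[i] * xi[j] / n
    return w

def zero_temperature_sweep(w, state):
    """One synchronous update s_i <- sign(sum_j w_ij s_j); a zero field keeps the spin."""
    new_state = []
    for i, row in enumerate(w):
        field = sum(wij * sj for wij, sj in zip(row, state))
        new_state.append(state[i] if field == 0 else (1 if field > 0 else -1))
    return new_state

def overlap(a, b):
    """Mattis overlap between two spin configurations."""
    return sum(x * y for x, y in zip(a, b)) / len(a)
```

Increasing `beta` makes neighbouring spins more correlated, so storing several such patterns and tracking `overlap` after a few sweeps gives a rough numerical probe of how correlation affects retrieval.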
[CANCELLED] 19. Linear Clustering Process on Networks PRESENTER: Ivan Jokić ABSTRACT. We propose a linear clustering process on a network consisting of two opposite forces: attraction and repulsion between adjacent nodes. Each node is mapped to a position on a one-dimensional line. The attraction and repulsion forces move the nodal position on the line, depending on how similar or different the neighbourhoods of two adjacent nodes are. Based on each node position, the number of clusters in a network, together with each node’s cluster membership, is estimated. The performance of the proposed linear clustering process is benchmarked on synthetic networks against widely accepted clustering algorithms such as modularity, the Louvain method and the non-backtracking matrix. The proposed linear clustering process outperforms the most popular modularity-based methods, such as the Louvain method, while possessing a comparable computational complexity. |
20. Co-Evolving Dynamics and Topology in a Coupled Oscillator Model of Resting Brain Function PRESENTER: Maria Pope ABSTRACT. Dynamic models of ongoing BOLD fMRI brain dynamics and models of communication strategies have been two important approaches to understanding how brain network structure constrains function. However, dynamic models have yet to widely incorporate one of the most important insights from communication models: the brain may not use all of its connections in the same way or at the same time. We present a variation of a phase-delayed Kuramoto coupled oscillator model that dynamically limits communication between nodes on each time step. An active subgraph of the empirically derived anatomical brain network is chosen in accordance with the local dynamic state on every time step, thus coupling dynamics and network structure in a novel way. We analyze this model with respect to its correlation to empirical time-averaged functional connectivity, finding that it significantly outperforms standard Kuramoto models when an appropriate edge selection rule is chosen. We also perform analyses on the novel structural edge time series produced by activating edges connecting nodes to their most synchronized neighbors. We demonstrate a slowly evolving structural network topology moving through transient episodes of integration and segregation. We hope to demonstrate that the exploration of novel modeling mechanisms and the investigation of dynamics of networks in addition to dynamics on networks may advance our understanding of the relationship between brain structure and function. Figure caption: The schematic shows the calculation of the phase update for a single node on a single time step of both the classic Kuramoto-Sakaguchi model (blue arrows) and the proposed variation of the model (orange arrows) for comparison. In the classic model, the update to each node is summed across all structurally connected neighbors. 
In the proposed model, each node selects m neighbors with the smallest phase difference (most synchronized), and the corresponding 'active edges' are retained - all other edges are set to zero. Influence is then summed over the active edges. |
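The edge selection rule in the figure caption can be sketched as follows. This is an illustrative Euler discretisation of a Kuramoto-Sakaguchi update, not the authors' code; the parameter names (`coupling`, `lag`, `dt`) and the wrapped phase distance used to rank neighbors are assumptions.

```python
import math

def phase_distance(a, b):
    """Absolute phase difference wrapped to [0, pi]."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def active_neighbors(i, theta, adjacency, m):
    """The m structural neighbors of node i with the smallest phase difference."""
    neighbors = [j for j, a in enumerate(adjacency[i]) if a != 0 and j != i]
    neighbors.sort(key=lambda j: phase_distance(theta[i], theta[j]))
    return neighbors[:m]

def kuramoto_step(theta, omega, adjacency, coupling, lag, dt, m=None):
    """One Euler step; m=None sums over all neighbors (classic model)."""
    new_theta = []
    for i in range(len(theta)):
        if m is None:
            nbrs = [j for j, a in enumerate(adjacency[i]) if a != 0 and j != i]
        else:
            nbrs = active_neighbors(i, theta, adjacency, m)
        # Influence is summed over the retained (active) edges only.
        drive = sum(adjacency[i][j] * math.sin(theta[j] - theta[i] - lag) for j in nbrs)
        new_theta.append((theta[i] + dt * (omega[i] + coupling * drive)) % (2.0 * math.pi))
    return new_theta
```

Recording the output of `active_neighbors` for every node at every step would yield the kind of structural edge time series the abstract analyzes.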
21. Predicting attractors from spectral properties of stylized regulatory networks PRESENTER: Dzmitry Rumiantsau ABSTRACT. How the architecture of gene regulatory networks ultimately shapes gene expression patterns is an open question, which has been approached from a multitude of angles. The dominant strategy has been to identify non-random features in these networks and then argue for the function of these features using mechanistic modelling. Here we establish the foundation of an alternative approach by studying the correlation of eigenvectors with synthetic gene expression data simulated with a basic and popular model of gene expression dynamics -- attractors of Boolean threshold dynamics in signed directed graphs. Eigenvectors of the graph Laplacian are known to explain collective dynamical states (stationary patterns) in Turing dynamics on graphs. In this study, we show that eigenvectors can also predict collective states (attractors) for a markedly different type of dynamics, Boolean threshold dynamics, and category of graphs, signed directed graphs. However, the overall predictive power depends on details of the network architecture, in a predictable fashion. Our results are a set of statistical observations, providing the first systematic step towards a further theoretical understanding of the role of eigenvectors in dynamics on graphs. |
22. Protein residue networks (PNR) and discrete Markov chains to characterize active sites in SARS-CoV2 main protease Mpro through graph theory ABSTRACT. Proteins are biological macromolecules that mediate the vast majority of catalytic reactions in all organisms. Proteins’ folded states are quasicrystalline three-dimensional conformations in which specific regions known as catalytic and/or binding sites are defined. These regions play a fundamental role since they participate in and promote crucial biological functions. To perform these functions, not only is an energetically stable overall structure needed, but an efficient flow of information along the macromolecular structure must also be achieved. In this work we present an analysis of the efficiency of the topological structure of the SARS-CoV2 main protease Mpro, which is a key enzyme involved in the viral replication cycle. For this reason, Mpro has been targeted as a good candidate for in-vitro antiviral drug design. For Mpro and its variants we compute the Kemeny constant and the mean first-passage time matrix M, from which relevant roles of individual amino acid residues can be identified, together with potential candidates for binding or catalytic sites within the enzyme. |
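As a rough illustration of the two quantities named in this abstract, the sketch below computes the mean first-passage time matrix M and the Kemeny constant for a simple random walk on a small graph. Several things are assumed here: the network is undirected and connected, the walk steps to a uniformly chosen neighbor, and M has a zero diagonal, so that K = sum_j pi_j M_ij (some texts add 1 to this value). How the residue network itself is built is not specified in the abstract and is not modelled.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def random_walk(adjacency):
    """Transition matrix and stationary distribution (degree / total degree)."""
    degrees = [sum(row) for row in adjacency]
    p = [[a / d for a in row] for row, d in zip(adjacency, degrees)]
    total = sum(degrees)
    return p, [d / total for d in degrees]

def mean_first_passage(p):
    """M[i][j] = expected number of steps to first reach j from i, with M[j][j] = 0."""
    n = len(p)
    m = [[0.0] * n for _ in range(n)]
    for j in range(n):
        others = [i for i in range(n) if i != j]
        # Solve (I - P restricted to non-target states) m = 1.
        a = [[(1.0 if i == k else 0.0) - p[i][k] for k in others] for i in others]
        for i, value in zip(others, solve_linear(a, [1.0] * len(others))):
            m[i][j] = value
    return m

def kemeny_constant(m, pi, start=0):
    """K = sum_j pi_j M[start][j]; the value does not depend on the start node."""
    return sum(pi_j * m[start][j] for j, pi_j in enumerate(pi))
```

Comparing the Kemeny constant of the full graph with that of the graph after removing one node is one possible way to rank residues, though whether the authors do this is not stated.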
24. Restore structure evolution trajectory of networked complex systems ABSTRACT. At the core of complexity science, networked systems are ubiquitous across a broad range of fields from biology, neuroscience to sociology. The formation processes of complex systems carry key information in the systems’ functional properties. Here applying machine learning algorithms, we demonstrate for the first time that the historical formation process of various empirical networks can be extracted reliably with high precision, including protein-protein interaction, ecology, economic, and social networks. Intriguingly, we discover that if the performance of machine learning model is slightly better than random guess on pairwise order of links, reliable restoration of network formation process can be achieved. It means that in general history restoration is highly feasible on empirical networks. Furthermore, we find that recovering the historical formation processes reveals the underlying network evolution mechanisms. In addition to its fundamental scientific value, the recovered link temporal sequence can significantly enhance the performance of widely adopted structure or link prediction algorithms, demonstrating its immense potential in practical applications. |
25. Temporal Network Prediction and Interpretation PRESENTER: Li Zou ABSTRACT. Temporal networks are networks, such as physical contact networks, whose topology changes over time. Predicting the future of a temporal network is crucial, e.g., to forecast epidemics. Existing prediction methods are either relatively accurate but black-box, or white-box but less accurate. The lack of interpretable and accurate prediction methods motivates us to explore what intrinsic properties/mechanisms facilitate the prediction of temporal networks. We use interpretable learning algorithms, Lasso Regression and Random Forest, to predict, based on the current activities (i.e., connected or not) of all links, the activity of each link at the next time step. From the coefficients learned from each algorithm, we construct the prediction backbone network that presents the influence of all links in determining each link’s future activity. Analysis of the backbone, its relation to the link activity time series, and to the time aggregated network reflects which properties of temporal networks are captured by the learning algorithms. Via six real-world contact networks, we find that the next step activity of a particular link is mainly influenced by (a) its current activity and (b) links strongly correlated in the time series to that particular link and close in distance (in hops) in the aggregated network. |
26. Neighbourhood matching creates realistic surrogate temporal networks PRESENTER: Antonio Longa ABSTRACT. Temporal networks are essential for modeling and understanding systems whose behavior varies in time, from social interactions to biological systems. Often, however, real-world data are prohibitively expensive to collect or unshareable due to privacy concerns. A promising solution is `surrogate networks', synthetic graphs with the properties of real-world networks. Until now, the generation of realistic surrogate temporal networks has remained an open problem, due to the difficulty of capturing both the temporal and topological properties of the input network, as well as their correlations, in a scalable model. Here, we propose a novel and simple method for generating surrogate temporal networks. By decomposing graphs into temporal neighborhoods surrounding each node, we can generate new networks using neighborhoods as building blocks. Our model vastly outperforms current methods across multiple examples of temporal networks in terms of both topological and dynamical similarity. We further show that beyond generating realistic interaction patterns, our method is able to capture intrinsic temporal periodicity of temporal networks, all with an execution time lower than competing methods by multiple orders of magnitude. |
27. Modeling gentrification as a dynamic spatio-temporal process PRESENTER: Nicola Pedreschi ABSTRACT. Gentrification, "the rapid increase in cost and standard of living in a disadvantaged neighbourhood" [1], causes the relocation of lower-income inhabitants in favor of wealthier citizens. Relocation is caused by socio-economic inequalities and may be influenced by the presence of amenities and infrastructures. In this work, we focus on relocation trajectories and develop an agent-based gentrification model. We model society as a mixture of agents belonging to three (Low-, Middle-, High-) income profiles, with different needs and possibilities to improve their living conditions. We represent the urban system as a hierarchical, modular network whose nodes are housing units, communities are coarse-grained neighborhoods, and agents (individuals) relocate between two housing units. Agents of different types base their choice of whether to move on the current cost of living at different spatial network scales (micro, meso, macro). For example, high-income agents have access to temporal information regarding the fluctuations of cost-of-living across the nodes and communities in the underlying network, i.e., its rate of change. A visual representation of agents’ behaviours can be found in Figure 1. A Low-Income agent l moves from its current node when its income at time t is lower than the average income on the node (micro-scale), $w_l(t) < w_{n_1}(t) \equiv \langle w_i(t) \rangle_{i \in n_1}$. Agent l relocates to a new node $n_i$ if $w_l(t) > w_{n_i}(t) - \varepsilon$, where $w_{n_i}(t)$ is the average income in $n_i$ and $\varepsilon \equiv w_{n_1}(t) - w_{n_i}(t)$ is the cost-of-living gap between the two nodes. Agent l relocates when the cost of living increases and moves to a new, affordable node in the network. 
A Middle-Income agent m moves from its current node if (1) the agent is poorer than the average wealth in its current community (neighborhood, meso-scale) C1 by more than a fraction α of its own income; or (2) the agent is richer than the average wealth in its current community by more than α times its own income. Agent m relocates either when the cost of living in its current community is not affordable (1) or when it aims for a better standard of living (2). In the former case, m moves to a random node in a new affordable community Ci, whose affordability depends on the cost-of-living gap w.r.t. the starting community. In the latter case, m relocates to (a random node in) a new community where the cost of living is not higher than the agent’s income by more than the tolerance. A High-Income agent h moves from its current node to maximize the profit of an investment in a new, up-and-coming neighborhood. At any time step, h evaluates the (discrete) rate of change of the average wealth of each community Ci in the network, $\delta \bar{w}_{C_i}(t)$, and moves to the community that maximizes this rate, if higher than its own by an investment-risk tolerance α. The overall model outcome is a collection of relocation trajectories of the three types of agents over time, which can naturally be translated into the framework of temporal networks. Our model does not have a specific termination condition [2]. In fact, we model gentrification as a continuous process over time to understand the emerging and recurring spatio-temporal patterns. Ultimately, we introduce a temporal network-based gentrification indicator Gi of a neighborhood (community) defined as the difference between the out-flow of LI agents from the neighborhood, and the coordinated in-flow of MI agents. 
We conduct extensive experiments to characterize the gentrification phenomenon by integrating our Gi with other socioeconomic segregation indices, showing that gentrification patterns arise in neighborhoods that have previously undergone periods of gradual segregation. |
28. Using network science in veterinary epidemiology PRESENTER: Gavrila Amadea Puspitarani ABSTRACT. Infectious disease outbreaks in livestock populations compromise animal health and welfare and may lead to high economic losses. Historical epidemics have been associated with movements of animals; therefore, several European countries have developed livestock registration and movement databases to trace animal movements. In this study, we analyze seven years (2015-2021) of daily records of pig movement in Austria and select epidemiologically relevant network metrics that have practical applications for surveillance and control of infectious disease outbreaks. We first explore the topology of the network and its structural changes over time, including seasonal and long-term trends in the pig production activities. We then investigate the temporal dynamics of the network community structure using the InfoMap algorithm. We show that the Austrian pig trade network exhibits a scale-free topology but is very sparse, suggesting a moderate impact of infectious disease outbreaks. However, our findings highlight two regions (Upper Austria and Styria) that present higher farm density and a significant number of hubs. These regions also host 77% of pig holdings in the largest strongly connected component, making them more vulnerable to infectious disease outbreaks. Dynamic community detection revealed a stable behavior of the clusters over the study period. This study provides important insight into the interplay between network theory and veterinary epidemiology and valuable information for designing cost-effective infectious disease surveillance and control plans. Notably, we argue that enhancing biosecurity and surveillance in the highly connected holdings (hubs) would greatly reduce the structural risk and favor timely detection of pathogens. Similarly, trade communities may offer an optimized approach to managing infectious diseases through data-driven zoning. |
29. Using a co-incidence network to investigate interaction patterns across geospatial areas PRESENTER: Emily Harvey ABSTRACT. Schools and workplaces are some of the most important connectors within societies, inducing interactions between people who often live quite far apart and would be otherwise unconnected. The significance of these interactions has been highlighted in the last 3 years due to their role in the transmission of COVID-19. More generally, these co-incident interactions play an important role in sharing information and cultural practices. We used linked micro-data within the Statistics NZ Integrated Data Infrastructure (IDI) to construct a bipartite network of ~5M individuals linking them to their ~1.7M dwellings, ~488K workplaces, and ~7K schools. We then project onto the dwelling nodes by identifying when individuals from different dwellings are linked to (co-incident in) the same employment or education context. The network is finally simplified by aggregating dwellings, and the connections between them, into geospatial area units. The resulting Aotearoa Co-incidence Network (ACN) is a weighted, multiplex network that describes how different geospatial areas are connected to one another through people interacting at work and in education (self-loops indicate co-incident interactions by individuals living within the same geospatial area). There are a number of applications of such a co-incidence network: - Using the ACN directly produces estimates of which areas a detected case of concern could have spread from or to, and thus where to prioritise surveillance efforts. - Centrality measures can be used as a proxy for the relative transmission risk in different areas due to the density of connections through workplaces and schools. - Community detection methods reveal clusters of geospatial areas that are most densely connected through co-incident interactions. 
These communities can be produced for different education levels, as well as different industry sectors, and can be used for example as an initial estimate for defining spatial boundaries for movement restrictions, used to control spread of infectious disease. |
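The construction described in this abstract (bipartite links, projection through shared contexts, aggregation into area units) can be sketched on toy data. The data layout, the function name, and the choice of edge weight (number of co-incident pairs of individuals from different dwellings) are assumptions made for illustration; the abstract does not state how the ACN weights are defined.

```python
from collections import defaultdict
from itertools import combinations

def coincidence_network(person_home, person_contexts):
    """
    person_home: person -> (dwelling_id, area_id)
    person_contexts: person -> iterable of (layer, context_id), e.g. ("work", "w1")
    Returns {layer: {(area_a, area_b): weight}} with area_a <= area_b;
    an edge with area_a == area_b is a self-loop.
    """
    members = defaultdict(list)
    for person, contexts in person_contexts.items():
        for layer, context_id in contexts:
            members[(layer, context_id)].append(person)

    network = defaultdict(lambda: defaultdict(int))
    for (layer, _), people in members.items():
        for p, q in combinations(sorted(people), 2):
            dwelling_p, area_p = person_home[p]
            dwelling_q, area_q = person_home[q]
            if dwelling_p == dwelling_q:
                continue  # only individuals from different dwellings are linked
            edge = tuple(sorted((area_p, area_q)))
            network[layer][edge] += 1
    return {layer: dict(edges) for layer, edges in network.items()}
```

Each layer of the returned dictionary (for example work or school) corresponds to one layer of the multiplex network, which is where per-layer centrality or community detection would be applied.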
30. Understanding COVID-19 pandemic trajectories: why changes in online behavior matter for now-casting PRESENTER: Joana Gonçalves-Sá ABSTRACT. Online behaviour has been used as a tool for close to real-time study of different health-related behaviours, including identifying disease outbreaks. However, its validity in predicting infections has been questioned by many, particularly during extraordinary events, such as the COVID-19 pandemic. Here, we present a new approach and argue that monitoring online behaviour, from social networks to search engines, during the worst possible moment, with highest media hype, can help us better understand motivations. In particular, we will show how it allows us to distinguish online searches that are more associated with the disease from searches that are prompted by other factors, such as media exposure. Our work indicates that it is possible to learn from pandemics to improve the now-casting of future infection waves and that including more data is not necessarily better. |
31. Forecasting Human Mixing Behavior Using Online Surveys PRESENTER: Mehdi Zahedi ABSTRACT. Many respiratory infectious diseases are transmitted through proximity contacts among individuals. Therefore, predicting infection spread and the impact of non-pharmaceutical interventions such as stay-at-home orders, recommendations to work from home, or the closure of schools or public-facing businesses relies on monitoring and predicting the reduction in individual mixing patterns that all those different interventions might induce. Traditionally, in epidemic transmission models, the average number of contacts is assumed to be constant over time. However, the average number of contacts, like many other social behaviors, changes dynamically, particularly in response to exogenous factors (e.g. non-pharmaceutical interventions) or as individuals adapt their behavior due to situational awareness or increased fear of infection during a pandemic (e.g. as we observed particularly during the first wave of the COVID-19 pandemic). In this work, our goal is to forecast short-term behavioral changes, such as the average number of contacts, using both earlier values of the target variable and other measurable exogenous observables. We rely on individual survey data collected via an online anonymous questionnaire in Hungary during the whole course of the COVID-19 pandemic involving over 2.4% of the country’s population [1]. For each respondent, we have information on their self-reported number of contacts, domestic and international traveling patterns, smoking habits, self-protection attitudes and multiple socio-demographic characteristics including respondents’ age, gender, location, occupation, education, family structure, and many other details. Combining all these behavioral signals with temporal stringency information we train a multivariate autoregression model [2] to make short-term forecasts of average contact numbers at the population level. In Fig. 
1 we show the performance of our model in producing 1-day-ahead and 7-day-ahead forecasts for the average number of contacts. Our work provides a framework that can be a first step towards effective models of behavior forecasting. In our presentation, we will show how this model can be used to produce predictions for other countries and how these predictions can be fed into traditional epidemic modeling approaches to add realism when modeling emerging outbreaks. |
32. A performance characteristic curve for model evaluation: applications in information diffusion prediction PRESENTER: Radosław Michalski ABSTRACT. In machine learning, a model can be regarded as a collection of the algorithm, the parameters and other things that can recognize a certain pattern in the data and utilize the pattern to forecast something unknown. While different models for the same task all claim to outperform others, a general framework to systematically evaluate different models remains to be explored. In traditional engineering fields, the instruments are usually evaluated more comprehensively. For example, the performance of an engine can be evaluated by the performance characteristic curve which illustrates how the output power and torque vary with the engine's rotation speed. In this way, the optimal region for an engine can be identified, allowing us to select the tool right for a given task. This motivates us to seek the performance characteristic curve for a machine learning model, which tells how the performance of the model changes with different complexity of the task. Here, we take the information diffusion prediction task as a particular case, which aims to predict a message's retweet sequence from a starting node. We use the mean average precision (MAP) as the measure of the match between the predicted and actual spreading sequence, and the average pairwise comparison entropy (APCE) as the measure of the inherent randomness in the data. In general, MAP decreases with APCE, reflecting the fact that prediction is less accurate when the task is more complicated. But data points are scattered (Fig. 1a). By properly scaling these two quantities, we identify a scaling curve between the randomness of the data and the prediction accuracy of the model, which holds under different spreading mechanisms, in both empirical and synthetic data on different networks (Fig. 1b). 
The scaling curve captures a model's inherent capability of making correct predictions against increased uncertainty, allowing us to compare different models in a variety of circumstances. The validity of the performance curve is tested by three prediction models in the same family, giving rise to conclusions in line with existing studies (Fig. 1c). The performance characteristic curve is also successfully applied to evaluate two distinct models from the literature (Fig. 1d). Taken together, our work reveals a pattern underlying the data randomness and prediction accuracy. The performance characteristic curve provides a way to systematically evaluate models' performance, and sheds light on future studies on other frameworks for model evaluation. |
33. Mapping networks as spindles to improve diffusion prediction PRESENTER: Jianhong Mou ABSTRACT. Understanding the dynamics of diffusion on networks is of critical importance for a variety of processes in real life, such as information propagation, viral marketing and disease spreading. However, predicting the temporal evolution of diffusion on networks remains challenging owing to the complexity, non-linearity, coupling of network topology, and adaptation behaviors involved. Rather than developing complex mathematical models to depict the diffusion process with the whole network structure, in this study we propose a new network topology feature called the spindle vector, characterizing the intrinsic property of node spatial distribution, to approximate the spatio-temporal evolution of diffusion on networks. Experiments on synthetic and empirical networks validate that our method achieves high precision and outperforms the classic Quenched Mean-Field (QMF) method. We further simplify the approximation and show that the new approach can achieve comparable prediction performance with much less information in most networks. The new metric provides a general and computationally efficient approach to network diffusion prediction and has potential for wider network applications. |
34. The Impact of Reservoir Networks in Recurrent Neural Networks PRESENTER: Seonho Nam ABSTRACT. A recurrent neural network (RNN) is a type of artificial neural network that exhibits a memory effect and is suited to learning data in temporal order, making RNNs the most fundamental form of recent deep learning techniques. The memory effect differentiates RNNs from ordinary feedforward neural networks (FNNs): RNNs possess recurrent weights, which allow them to produce their own dynamic structures even without any external input. One can say that, mathematically, RNNs simulate dynamics, while FNNs simulate functions. RNNs are able to learn from and retain information from the past, which is achieved by the use of the so-called RNN cell, where the RNN possesses a memory. The RNN cell is designed to receive the current input, as well as the state from the previous RNN cell, allowing for the continual learning of new data and the retention of previous information. Reservoir computing approaches to recurrent neural networks (RCRNNs) differ from conventional RNNs in that they do not learn the weights of all layers, but only those of the read-out layer $W_{\text{out}}$. This is enabled by the presence of a reservoir inside RCRNNs, which is composed of nodes that are ten times smaller and one thousand times larger than the input layer size. These nodes are randomly connected to form a random network, although the effects of this structure are yet to be fully elucidated. For RCRNN models, it is important to know how the reservoir state is updated. The current reservoir state is a combination of the reservoir state from the previous time step, the changes in the state caused by the input data, and the changes caused by the reservoir network. The equation for the reservoir state is as follows: $S(t) = (1-a) \cdot S(t-1) + a \cdot \tanh(W_{\text{in}} \cdot x(t) + W \cdot S(t-1))$. The output of RCRNN models is generated by interpreting the reservoir state. 
The process of generating the output is as follows: $y(t) = f(W_{\text{out}} \cdot S(t))$, where RCRNNs consist of a $W_{\text{in}}$ matrix which receives the input data, a reservoir network whose nodes have a state vector $S(t)$, and a read-out matrix $W_{\text{out}}$. To investigate the effects of the ratio of signed edges in a reservoir network on the performance of reservoir computing, we constructed an RNN model utilizing reservoir computing, which is composed of two main parts: an untrained reservoir network and a read-out part which interprets the reservoir state. Our model used a feedforward neural network for the read-out part and was expected to provide better performance than the previously adopted single-layer linear regression. Typically, an ER random network model is used for the untrained reservoir network, with an edge sign ratio that is statistically equal. To verify our hypothesis, we artificially adjusted the sign ratio of the reservoir network. To do this, we pre-configured the positive and negative signs according to the set ratio. We have found that artificially adjusting the ratio of the reservoir network requires the spectral radius of the adjacency matrix to be less than unity. In this study, we examine how the performance of RCRNNs changes according to the sign ratio of the reservoir. Our study aims to uncover the effects of an unknown reservoir network and to present the characteristics of a network that can increase the performance of RCRNNs. Furthermore, we also perform research on the edge weight of the network, in addition to the sign ratio. |
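The two equations in this abstract, the leaky reservoir update and the read-out, translate directly into a few lines of code. The sketch below uses plain Python lists and hypothetical function names; the default read-out function `f` is the identity, i.e. a linear read-out, which is an assumption made to keep the example short (the abstract states that the authors use a feedforward network instead).

```python
import math

def matvec(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def reservoir_step(state, x, w_in, w, a):
    """S(t) = (1 - a) * S(t-1) + a * tanh(W_in x(t) + W S(t-1)), applied elementwise."""
    pre = [u + r for u, r in zip(matvec(w_in, x), matvec(w, state))]
    return [(1.0 - a) * s + a * math.tanh(p) for s, p in zip(state, pre)]

def readout(state, w_out, f=lambda z: z):
    """y(t) = f(W_out S(t)); f defaults to the identity (assumed linear read-out)."""
    return [f(z) for z in matvec(w_out, state)]
```

Here `a` is the leak rate: `a = 0` freezes the state, while `a = 1` discards the previous state apart from its contribution through `W`. Only `w_out` would be trained; `w_in` and `w` stay fixed.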
35. Finding polarised communities and tracking information diffusion on Twitter: The Irish Abortion Referendum PRESENTER: Caroline Pena ABSTRACT. The analysis of social networks enables the understanding of social interactions, polarisation of ideas, and the spread of information and therefore plays an important role in society. We use Twitter data, as it is a popular venue for the expression of opinion and dissemination of information, to identify opposing sides of a debate and, importantly, to observe how information spreads between these groups in our current polarised climate. Cascade trees allow us to identify where the spread originated and to examine the structure it created [1]. This allows us to further model how information diffuses between polarised groups using mathematical models based on the popular Independent Cascade Model via a discrete-time branching process, which have proved fruitful in understanding cascade characteristics such as cascade size, expected tree depth and structural virality [2]. Previous research [3] examined the 2015 Irish Marriage Referendum and successfully used Twitter data to identify groups of users who were pro- and anti-same-sex marriage equality with a high degree of accuracy. Our research improves upon this work in two ways. 1) We show that we can obtain better classification accuracy of users into polarised communities on two independent datasets (Irish Marriage referendum of 2015 and Abortion referendum of 2018) while using substantially less data. 2) We extend the previous analysis by tracking not only how yes- and no-supporters of the referendum interact individually but also how the information they share spreads across the network, within and between communities, via the construction of retweet cascades. To achieve this, we collected over 688,000 Tweets from the Irish Abortion Referendum of 2018 to build a conversation network from users’ mentions with sentiment-based homophily. 
From this network, community detection methods allow us to isolate yes- or no-aligned supporters with high accuracy (97.6%) — Figure 1a shows that these yes and no-aligned groups are indeed stratified by their sentiment. We supplement this by tracking how information cascades (Figure 1b) spread via over 31,000 retweets, which are reconstructed from a user mention’s network in combination with their follower’s network. We found that very little information spread between polarised communities. This provides a valuable methodology for extracting and studying information diffusion on large networks by isolating ideologically polarised groups and exploring the propagation of information within and between these groups. (a) (b) Figure 1: C1 (Yes supporters) in blue, C2 (No supporters) in red. (a) Average sentiment-out for each community cluster per day. (b) Example of a retweet cascade. [1] S. Goel, A. Anderson, J. Hofman, and D. J. Watts. The structural virality of online diffusion. Management Science, 62(1):180–196, 2016. [2] Gleeson, James P., et al. Branching process descriptions of information cascades on Twitter. Journal of Complex Networks 8(6): cnab002, 2020. [3] D. J. O’Sullivan, G. Garduño-Hernández, J. P. Gleeson, and M. Beguerisse-Díaz. Integrating sentiment and social structure to determine preference alignments: the Irish marriage referendum. Royal Society open science, 4(7):170154, 2017. |
36. Individual differences in knowledge network navigation PRESENTER: Manran Zhu ABSTRACT. As online information accumulates at an unprecedented rate, it is becoming increasingly important and difficult to navigate the web efficiently. To create an easily navigable cyberspace for individuals across different age groups, genders, and other characteristics, we first need to understand how they navigate the web differently. Previous studies have revealed individual differences in spatial navigation, yet very little is known about their differences in knowledge space navigation. To close this gap, we conducted an online experiment where participants played a navigation game on Wikipedia and filled in questionnaires about their personal information. Our analysis shows that participants' navigation performance in the knowledge space declines with age and increases with foreign language skills. The difference between male and female performance is, however, not significant in our experiment. Participants' characteristics that predict success in finding routes to the target do not necessarily indicate their ability to find innovative routes. |
37. Difference in emotional transitions between COVID-19 and other natural disasters’ tweets revealed by emotional evolution trees PRESENTER: Sota Watanabe ABSTRACT. In social networking services such as Twitter, people's emotions have a significant impact on the diffusion of information. It has been found that tweets which contain more emotion are more likely to cause a reversal of opinions and spread faster. Therefore, when a natural disaster occurs, it is necessary to quickly analyze people's emotions and the diffusion process of information in order to prevent confusion. However, previous analytical methods have yet to explore those diffusion processes, including people's emotions, because those processes have been based on retweets or follows. In this study, we constructed a network based on the similarity of sentences, called an evolution tree, rather than retweets or follows. Furthermore, we developed an emotional evolution tree, in which each node in the network is assigned an emotional value, to analyze the information diffusion process that includes people's emotions when a disaster occurs. Here, we collected tweets of COVID-19 and general natural disasters, such as earthquakes, especially in Japan, and compared how people’s emotions change during disasters by visualizing each information diffusion process in a network. The analysis showed that people's emotions in COVID-19 waves 1 and 7 frequently switched from positive to negative content (or vice versa), even on a single pathway in the network. In natural disasters such as earthquakes, on the other hand, such switching was less frequent on a single pathway, and negative and positive content tended to persist for longer periods of time. These results may indicate that COVID-19 has a heterogeneity when spreading compared to other natural disasters. |
38. On the role of network topology and committed minority in facilitating social diffusion PRESENTER: Tianshu Gao ABSTRACT. Social diffusion is a key phenomenon in human societies, whereby individuals collectively replace a status quo by adopting a newly introduced alternative$^{1,2}$. Evolutionary game theory has been widely employed to develop agent-based models (ABMs) tailored to study such diffusion dynamics. The network topology defining agent-to-agent interactions$^3$ and the presence of committed minority$^4$ (agents who stubbornly select the new alternative) are known factors that can facilitate and unlock social diffusion. In our recent work$^1$, we conducted online group experiments and used the data to propose and parametrise a game-theoretic ABM for social diffusion; in order to focus on identifying the key behavioural mechanisms in the model, an all-to-all network was assumed. Here, we build on that study to shed light on the impact of the network structure on social diffusion. We conducted an extensive campaign of Monte Carlo simulations to examine how i) the network topology, ii) the location of the committed minority on the network, and iii) the introduction of committed minority over time, can shape the diffusion dynamics. A key advantage of our work is that the ABM and its parameters are estimated from experimental data. We are thus able to comprehensively explore these three factors using simulations, without making strong assumptions on the model parameters or performing extensive parametrical studies. To investigate network topology effects, we considered synthetic networks with different characteristics. We found that diffusion occurred more frequently and at a faster speed as the average degree of the network decreased, irrespective of network class (Fig.~1a). However, for a fixed average degree, highly-clustered networks (e.g., Watts–Strogatz small-world networks) were the best for facilitating diffusion (Fig.~1b). 
Next, we explored positioning committed minority at node locations with largest centrality values to enhance diffusion dynamics, using a variety of different measures, such as eigenvector, degree, closeness, betweenness, and Bonacich. We found that Bonacich centrality$^5$ with negative attenuation factor had the largest effect in accelerating diffusion in all network structures (Fig.~1c); briefly, committed minority should be positioned as neighbours of agents with low degree, supporting the insights into average degree described above. Different strategies were examined for introducing committed minority over time, with the goal of exploiting the trend-seeking mechanism in the normal agents$^{1,6}$. Finally, a case study based on a real-world network confirmed our findings$^7$ (see Fig.~1c). |
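To make the role of a negative attenuation factor concrete, here is a minimal sketch of Bonacich power centrality, assuming the standard form $c(\alpha, \beta) = \alpha (I - \beta A)^{-1} A \mathbf{1}$ and computing it by fixed-point iteration, which converges when $|\beta|$ is below the inverse of the largest eigenvalue of $A$. The abstract does not state the normalisation or the attenuation value used, so $\alpha = 1$, $\beta = -0.2$, and the toy star graph are assumptions for illustration only.

```python
def bonacich_centrality(adj, beta, alpha=1.0, iters=500, tol=1e-12):
    """Iterate c = A*1 + beta * A * c on an undirected adjacency map.

    Assumes |beta| < 1 / lambda_max(A), otherwise the iteration diverges.
    """
    degree = {v: float(len(adj[v])) for v in adj}
    c = dict(degree)
    for _ in range(iters):
        new = {v: degree[v] + beta * sum(c[u] for u in adj[v]) for v in adj}
        converged = max(abs(new[v] - c[v]) for v in adj) < tol
        c = new
        if converged:
            break
    return {v: alpha * c[v] for v in adj}


# Toy star graph (hypothetical): node 0 is the hub, nodes 1-3 are leaves.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
scores = bonacich_centrality(star, beta=-0.2)
print(scores)
```

With a negative `beta`, a node's score is reduced by the scores of its neighbours, so nodes whose neighbours are themselves poorly connected rank higher; this matches the abstract's reading that committed agents should sit next to low-degree agents.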
39. Voter-like dynamics with conflicting preferences on homophilic networks PRESENTER: Filippo Zimmaro ABSTRACT. Two of the main forces shaping an individual’s opinion are social coordination and personal preference or personal bias. To understand the roles of these factors and that of the topology of the interactions, we study an extension of the voter model proposed by Masuda and Redner (2011), where the agents are divided into two populations with opposite personal biases. We consider a modular graph with two communities that reflect the bias assignment, modeling the phenomenon of epistemic bubbles (Nguyen, 2020). We analyze the models by approximate analytical methods and by simulations. Depending on the network and the biases’ strengths, the system can either reach a consensus or a polarized state, in which the two populations stabilize to different average opinions. The modular structure has generally the effect of increasing both the degree of polarization and its range in the space of parameters. When the difference in the bias strengths between the populations is large, the success of the very committed group in imposing its preferred opinion onto the other one depends largely on the level of segregation of the latter population, while the dependency on the topological structure of the former is negligible. We compare the simple mean-field approach with the pair approximation and test the goodness of the mean-field predictions on a real network. |
40. Dynamics of Ideological Bias Shifts of Users on Social Media Platforms PRESENTER: Boleslaw Szymanski ABSTRACT. The political biases of users shared over social media evolve over time. Although broad analyses of the mosaics of political biases on social media platforms are common, there is a paucity of studies focusing on understanding the motivation behind individual users' political bias evolution. Using NetSense data, we found that on a university campus, changes in students' opinions are driven by their desire to agree with popular opinions. Our study used two platforms, Twitter and Parler, and on both tracked 10,000 of their most prolific users for five months centered on the 2020 U.S. Presidential election. Only users active in September were included in the study, and those who were inactive in December were considered dropouts from the platform. For each tracked user, we used the weighted average of the political biases of the news media they propagated over a short period of time to assign them initial and final political biases. The results show that platform-wide political bias evolution is driven by a similar desire as on campus: to align with political biases that are both popular and close to the user's current bias on each platform. When this was impossible, users dropped out of the platform faster than students left the campus. The results show that users prefer to move toward the local majority group representing their bias or to remain unchanged. The influence of center bias diminishes over time. On Parler, many movements are within a distance of 2, so users can converge toward a single popular right bias, creating a unimodal distribution. On Twitter, both left and right biases have local maxima of popularity, which attract users with the same polarization, resulting in many movements over a distance of just 1. 
In the talk, we will discuss in detail how attraction to popular biases fueled by homophily is balanced by reluctance to change opinions, which may cause weakening of the user’s current interactions or even cause them to drop out of the platform. |
41. Mechanistic interplay between opinion polarization and spread of information PRESENTER: Henrique De Arruda ABSTRACT. One of the most important challenges in researching information diffusion is understanding how the spread of information affects opinion dynamics. In this work, we conduct a computational investigation of how the competition of posts for users' attention influences the mechanisms of an opinion polarization agent-based model. In that model, users connected in a network have a continuous opinion between -1 and 1, incrementally influenced by posts they receive or send. As posts are sent, the model may rewire connections based on a bounded-confidence comparison, which leads to several network configurations. Here we extend this model to allow users to store old posts in an individual memory list and introduce a probability of re-sharing one of them instead of creating a new post. We find that the emergent organization from this post competition is capable of producing transient system states not observed previously, such as the formation of echo chambers. Furthermore, enabling the re-sharing of previous posts allows the simulation to track the spread of each post and produce cascade statistics, which elucidates mechanistic aspects of the interaction between opinion dynamics and the diffusion of information. |
42. International Fisheries Conflict and Spatial Overlap Networks to Identify High-Risk Dyads PRESENTER: Keiko Nomura ABSTRACT. As modern distant water fishing fleets expand their coverage over the world’s oceans, the potential for international conflict over fisheries resources is evolving. However, it remains difficult to understand the conditions driving such conflict. Two factors that may increase the likelihood of fisheries conflicts are a) sharing geographic space and b) having a history of previous conflicts (i.e., enduring rivalry). Here, we combine these ideas to investigate which pairs of countries fishing in the Pacific Ocean are more likely to engage in future fisheries-related disputes based on geographic and conflict connectivity. We construct separate networks representing spatial overlap of fishing grounds and history of fisheries conflict, then use node correspondence methods to identify common dyads that we call “high-risk” dyads. We find that spatial extents of fishing activities are dominated by relatively few countries (e.g., China, Taiwan, Japan). These countries also frequently overlap and conflict with others, reflected in their high centrality scores. For the high-risk network, distinct modules of high-risk countries emerge, suggesting that certain groups of countries are more likely to conflict with each other than others. We also map the overlapping fishing grounds between high-risk dyads to identify geographic hotspots where conflict may occur, and find conflict-prone regions in the South China Sea and ungoverned high seas. By using a network approach that incorporates geographic and international relations perspectives, we describe which actors and regions may be at an elevated risk of future fisheries conflict. Such network approaches are useful for a systematic view of the highly integrated modern global fisheries system. |
43. Mapping cultural differences via community detection on networks of inter-area lexical similarity PRESENTER: Thomas Louf ABSTRACT. Cultural diversity makes up a great part of humanity's heritage, and as such, its preservation has attracted considerable efforts. But there is increasing evidence of loss of this diversity, and many point to the acceleration of globalization as the primary cause of a trend of cultural uniformization. As of now, the mere measurement of cultural diversity remains tremendously complex, and only a few indicators allow us to capture some aspects of its loss. The pace at which languages go extinct is one of them: today, most of the estimated 6000 languages of the world are endangered. But language is not only a cultural trait, it is also the most important vehicle of cultural transmission. This first implies that language preservation is paramount to preserving cultural diversity. But a less obvious implication is that the many aspects that define an individual's cultural identity should be reflected in their language. By analyzing what people speak about and how, it should be possible to infer some prominent cultural traits. That is the starting premise of this work. Culture is such an elusive concept that one cannot arbitrarily select a few linguistic markers to look for, and then infer cultural differences from those without introducing a very limiting a priori bias. That is why we do not define any marker of interest, but rather strive to infer both cultural regions and the topics that define them. We do so through the analysis of a corpus of billions of geo-tagged Tweets posted around the world between 2015 and 2021. Up to this point, this work builds on the same foundation laid out in a previous study. But it takes a radically different route because its scope is more general. Here, we wish not only to study a single country, but also to be able to infer cultural differences between countries sharing a common language. 
That is why we first do away with any kind of spatial smoothing. Second, because there can be dialectal differences with some words being unique to some areas, and due to the potentially high heterogeneity of areas in terms of Twitter activity, we also do not select words based on their frequency over the whole Twitter corpus. Instead, we consider the area with the smallest token count (after filtering out areas with too few tokens), and determine the number of words that exceed a minimum frequency. We then take the union of the sets of words that have a rank lower than this value in all areas. This defines our list of words, in which, importantly, globally rare but locally important word forms are included. Further, we take a more robust approach to the inference of clusters of similar areas, as we leverage two complementary methods. The first (i) infers groups of areas and groups of words (topics) from a bipartite network of areas and words, in which the multiplicity of the edge linking a word to an area is equal to its absolute frequency. In the second (ii), we first compute the polarization in the use of each word in each area, that is $\frac{p_{w, i} - p_w}{p_{w, i} + p_w}$, with $p_{w, i}$ the relative frequency of word $w$ in area $i$, and $p_w$ its global frequency. We then define a network where the nodes are the areas, all connected by edges weighted by the correlation between their polarization vectors. We apply the hyperbolic tangent transformation to those, and infer partitions from the version of the weighted stochastic block model with normal distributions. In Fig. 1, we show some first results we obtained by applying the two inference methods to France. Interestingly, (i) shows rural and urban divides, with more use of slang (ntm) and contractions (jcrois), and more talk of politics (totalitaire) in urban areas (blue), while (ii), despite the absence of any spatial smoothing, remarkably identifies a geographic division between North East and South West. 
This shows the potential of such methods, which we will further apply to our readily available corpus of worldwide Tweets, for instance to include other French-speaking countries, but also studying other world languages such as English, Spanish, Arabic or Portuguese. |
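The polarization measure in method (ii) can be sketched in a few lines. The example below computes $\frac{p_{w, i} - p_w}{p_{w, i} + p_w}$ per area and the Pearson correlation between two areas' polarization vectors. It assumes that $p_w$ is the relative frequency of $w$ in the pooled counts of all areas, uses hypothetical toy counts, and stops before the transformation and block-model inference described in the abstract.

```python
import math


def polarization(counts):
    """counts: {area: {word: absolute frequency}} -> {area: {word: polarization}}."""
    vocab = sorted({w for area in counts.values() for w in area})
    grand_total = sum(sum(area.values()) for area in counts.values())
    # Assumption: global frequency is computed on the pooled corpus.
    p_global = {
        w: sum(area.get(w, 0) for area in counts.values()) / grand_total
        for w in vocab
    }
    result = {}
    for name, area in counts.items():
        area_total = sum(area.values())
        result[name] = {}
        for w in vocab:
            p_local = area.get(w, 0) / area_total
            result[name][w] = (p_local - p_global[w]) / (p_local + p_global[w])
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


# Hypothetical toy counts for two areas and two word forms.
counts = {"A": {"x": 3, "y": 1}, "B": {"x": 1, "y": 3}}
pol = polarization(counts)
words = sorted(pol["A"])
weight = pearson([pol["A"][w] for w in words], [pol["B"][w] for w in words])
print(pol, weight)
```

A word that an area never uses gets polarization -1, and a word used only in that area approaches +1, so the measure is bounded regardless of how active each area is.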
44. Networks of national preference in scientific recognition PRESENTER: Alexander Gates ABSTRACT. The increasing globalization of science is built on the open exchange of knowledge. Yet, the modern scientific enterprise is subject to strong regional and national cultural biases that shape which discoveries, methods, and frameworks are recognized by the scientific community. As a consequence, quantitative indicators of scientific recognition, especially citation counts, reflect a geographical bias. Here, to capture the international variability in scientific citation and recognition, we focus on country-specific citation rankings. The ranking of citation counts is arguably a more accurate indicator of institutional performance, and is regularly used as a measure of top performance despite temporal and field variations. Specifically, we adapt a statistical test for rank over-representation or under-representation of a country's publications within the reference lists of publications from another country, empowering us to measure when one nationality over- or under-cites the work from another nationality. We confirm that scientists from all nationalities tend to over-cite work from the same country, even when accounting for tendencies to self-cite their own publications, or cite other publications affiliated with the same institution. Next we map the network structure of international scientific recognition. The international scientific recognition network is a signed, directed network in which each country is a node, and a source country is linked to a target country by a positive (negative) edge if the source country over-cites (under-cites) the target country's publications. 
We find that of the 2,450 international relationships in 2015, 558 (22.78%) are positive over-citations (forming the positive network shown in Figure 1), 887 (36.20%) are negative under-citations, and the remaining 1,005 (41.02%) reflect random citation relationships. Stochastic block models reveal a bi-partition of the network such that countries within each community are much more likely to over-cite each other's work, while under-citing the publications from countries in the other community. Analysis of the dynamic scientific recognition network empowers further revelations about changing geopolitical recognition in science. |
45. The role, function and impact of automated agents in protest communication networks: a network analysis approach PRESENTER: Linda Li ABSTRACT. Social bots are becoming a significant element in influencing political participation and communication on social media as automation becomes more prevalent. However, only a few previous studies have looked into bots’ specific behaviours and strategies and their influence on social media platforms. As such, our research attempted to identify, describe and understand automated agents (social bots) in political communication on Twitter from a network science perspective. We used the case of discourse related to two protests, Extinction Rebellion (XR) and Black Lives Matter (BLM), to answer two research questions: 1) How can social bots be categorized by their activity and function? 2) How do social bots impact information diffusion in political debate on Twitter? We used a self-trained random-forest-based model to identify bots. Approximately 40 to 50 percent of users engaged in protest-related discussions showed bot-like behavior in both cases. Roughly 34% of the tweets in our dataset were generated by bots. Overall, the activities of bots were predominantly (31% out of the 34% of all tweets by bots) posting original messages or retweeting each other, and only a small proportion of the tweets by bots (3%) were reposted messages by human users. Human-like users, at the same time, spread more messages (16% out of the total 66%) produced by bot users than vice versa. Furthermore, we classified users with k-means clustering based on their network metrics. Four distinct categories emerged, including high-impact users (high centrality, high out-degree in the following network, mostly verified), bot pseudo fans (high in-degree in the following network, low centrality), bot astroturfers (high out-degree in the communication network, mostly retweeting each other) and common actors. 
Of all those groups, the high-impact users were predominantly humans (95%) but included verified media users who used some degree of automation to aid their tweet posting. The pseudo-fans and the astroturfers were predominantly bots that served different purposes to impact political communication online. We then used an Exponential Random Graph Model (ERGM) and other network-based statistical methods to investigate the impact of bots on information diffusion and protest organization online. Statistical analysis suggested that bots successfully generated multiple information cascades online in protests, mostly by astroturfing (retweeting the same message together in a short time). They spread extreme, anti-protest messages, and humans seemed to catch up and spread the messages after 30 to 40 minutes. Network-based analysis indicated that those cascades had a prolonged impact: although most of the bots tweeted only once, some of the bots persisted temporally, and they impacted protest-related information diffusion and protest organisation at the structural level. Temporal analysis revealed that these bots’ messages made the human users’ sentiment more extreme and caused a decline in human-to-human communication. These results add to the burgeoning literature of network science, social bots and political communication by shedding light on the descriptive traits of bots and their impact on communication in the online contentious politics realm with network science methods. |
46. Flexible agent-based models for explaining population booms and busts in prehistory PRESENTER: Dániel Kondor ABSTRACT. Cyclic patterns in history have long been the focus of research. Considering the dynamics of large states ("empires"), it is often assumed that increased fragility is the price of the complexity created by governance institutions and economic relations. However, there is accumulating evidence on boom and bust cycles in population numbers of small-scale, non-state societies, going back to the Neolithic period. Such evidence suggests a dynamic landscape of interactions with an inherent complexity that results in emergent behavior on spatial scales significantly larger than the apparent scale of political organization. To study potential causes and underlying mechanisms, we have developed a flexible agent-based model using the HoloSim framework that is able to represent a range of possible hypotheses and allows quantitative statistical analysis of outcomes. In this study, we have applied this framework to Mid-Holocene Europe (ca. 7000-3000 BCE), covering the expansion of farmers from Western Anatolia to most of the continent. This time period encompasses several well-studied examples of rapid population growth followed by a decline within a few hundred years. Scholars continue to argue about the precise nature and cause of the demographic patterns observed in the archaeological record. Debate largely centres on whether external drivers, such as changing climatic conditions, or internal pressures, such as density-dependent conflict between and within groups, acted as the main cause of these "boom and bust" cycles. Here, we put these competing hypotheses to the test by developing robust agent-based simulations and assessing results against a large body of empirical material. We quantitatively evaluated model results by utilizing a large dataset of radiocarbon (14C) measurements that we took as proxies for regional population during the study period. 
We performed a statistical evaluation of typical time scales and amplitudes of boom and bust events based on calculating temporal autocorrelation (ACF) and variances (CV). We compared the statistics of ACF minima and CV values of 14C-derived and simulated regional time series. Results show that climate variations alone are insufficient to account for the typical population patterns seen in this period. While climate events often cause "shocks" that manifest as fast population declines, followed by recovery, the time scale of these is significantly faster than the several hundred years long patterns typically seen in the radiocarbon dataset. On the other hand, density-dependent conflict, when modeled as a second-order process, can reproduce the main statistical properties of booms and busts in a robust way. This means that the conflict-based hypothesis presents a plausible explanation as a major causal force behind Mid-Holocene population dynamics of Europe. At the same time, it underscores the importance of understanding how large-scale patterns may arise from local interactions in a self-organized way. Our HoloSim framework is adaptable to a wide range of scenarios, from interactions among stateless groups to larger scales of political organization such as the complex dynamics of interstate competition. |
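The two summary statistics named in this abstract can be illustrated with a short sketch. It computes the sample autocorrelation function of a time series, the lag of its minimum (a rough indicator of half the characteristic cycle length), and the coefficient of variation, which is assumed here to be what "CV" denotes. The synthetic series is hypothetical and stands in for a regional population proxy; this is not the HoloSim evaluation code.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + lag] for t in range(n - lag)) / var
        for lag in range(max_lag + 1)
    ]


def coefficient_of_variation(series):
    """Population standard deviation divided by the mean."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return sd / mean


# Synthetic boom-and-bust series with a period of 8 time steps (toy data).
series = [10 + 5 * math.sin(2 * math.pi * t / 8) for t in range(64)]
rho = acf(series, max_lag=16)
lag_of_minimum = min(range(len(rho)), key=rho.__getitem__)
cv = coefficient_of_variation(series)
print(lag_of_minimum, cv)
```

For a cyclic series the first strong ACF minimum falls near half the cycle length, so comparing the lag and depth of this minimum between simulated and observed series gives a compact check on time scales, while the CV summarises amplitude.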
47. The Role of Friend Recommendation Algorithms in Shaping Informational Ecosystems Online PRESENTER: Kayla Duskin ABSTRACT. Recommendation algorithms are now ubiquitous across social media platforms, affecting not only the content that users see but also how they construct their social network. Recommended social ties have the potential to exacerbate or mitigate the political polarization present in many online communities. In this work we create a set of automated accounts on Twitter to probe the effect of the platform’s “Who to Follow” feature on a user’s personal network, and specifically how this feature may affect the formation of political echo chambers. It has been shown that, in aggregate, the introduction of this feature to Twitter in 2010 increased the number of followers for most users and additionally increased triadic closure and promoted the formation of asymmetric network ties. However, less is known about the impact of the recommendations on users’ personal networks and how this algorithm may contribute to or protect against a new user’s likelihood of ending up in an ideological echo chamber. To shed light on these critical questions, we created ten pairs of Twitter accounts and initialized the social network of each pair by following the campaign account of a candidate in one of five 2022 US Senate races. Both accounts in each pair began by following the same candidate account; one account grew its network by following the users suggested by the recommendation system while the other grew its network by following accounts retweeted in its home timeline. We monitored an eight-week period of network growth, additionally capturing the account metadata and posts of the followed accounts. We use the resulting dataset of egocentric networks on Twitter to answer key questions about the structure and content of personal social networks influenced by Twitter’s recommendation algorithm and to what degree they resemble echo chambers. 
We assess differences in network closure (i.e. triangles) as well as homogeneity along various dimensions including audience size, tenure on the platform, and political views, identified both formally (i.e. partisan elected officials) and informally (i.e. based on position in the larger political network on Twitter and by the source of shared content such as news). We conduct a longitudinal analysis to assess the growth and stability of these metrics as personal networks grow under the influence of the recommendation algorithm. Additionally, we examine how the recommendations differ based on the party affiliation of the first account followed by each automated account. This work contributes to a growing literature on the impact of algorithmic design on online informational ecosystems and has implications that extend beyond individual platforms. By shedding light on the downstream impacts of network recommendation algorithms, this work can inform those designing and building these systems to promote healthy informational systems online. |
49. One Graph to Rule them All: Using NLP and Graph Neural Networks to analyse Tolkien's Legendarium PRESENTER: Albin Zehe ABSTRACT. Natural Language Processing and Machine Learning have considerably advanced Computational Literary Studies. Similarly, the construction of co-occurrence networks of literary characters, and their analysis using methods from social network analysis and network science, have provided insights into the micro- and macro-level structure of literary texts. Combining these perspectives, in this work we study character networks extracted from a text corpus of J.R.R. Tolkien's Legendarium. We first extract character mentions and their co-references using NLP methods. We aggregate the extracted character mentions to interactions by a proxy method, saying that two characters interact whenever they are mentioned in the same sentence. From these co-occurrences, we create character networks that show the interaction topology of characters in the text. We train different random walk methods like DeepWalk and Node2Vec and message passing models such as Graph Convolutional Network (GCN) and Graph Attention Networks (GAT) to predict which book a character is most strongly affiliated with (affiliation classification). We evaluate the models by their performance for affiliation classification, but also for link prediction, that is, reconstructing edges/interactions that were artificially removed from the network. Our best models reach an F1-score of 92.32% for affiliation classification and 88.53% ROC/AUC for link prediction, respectively. This shows that our model can represent the character networks of the novels very well. Our work further highlights the benefit of integrating graph-based methods with NLP techniques, specifically the integration of word embeddings as node features in graph neural network models. 
We plan to use this approach for further tasks in computational literary studies that heavily depend on the topology of character constellations like, e.g., genre classification or character relation classification. |
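The sentence-level co-occurrence proxy described in this abstract can be sketched as follows. The example assumes that mention detection and co-reference resolution have already produced, for each sentence, the list of characters mentioned; the input data and the function name are hypothetical, and the NLP and graph-learning stages are not reproduced.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(mentions_per_sentence):
    """Count, for each unordered character pair, the sentences mentioning both."""
    edges = Counter()
    for mentions in mentions_per_sentence:
        # Deduplicate so a character named twice in a sentence counts once.
        present = sorted(set(mentions))
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical resolved mentions, one list per sentence.
mentions_per_sentence = [
    ["Frodo", "Sam"],
    ["Sam", "Gollum"],
    ["Frodo", "Sam", "Gollum"],
    ["Gandalf"],
]
edges = cooccurrence_network(mentions_per_sentence)
print(edges)
```

The resulting counts act as edge weights of the character network; sentences with a single mention add no edge, so isolated characters have to be added as nodes separately if they are needed.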
50. Identifying Medical Provider's Contribution to Medical Cooperation Using Graph Convolutional Networks PRESENTER: Yu Ohki ABSTRACT. Regional medical cooperation is promoted to differentiate and coordinate medical providers in response to the increasing medical needs of an aging population. We construct a statistical model that expresses the relationship between a medical provider network and the quality of regional healthcare delivery using graph convolutional networks (GCNs). We also aim to estimate the contribution of each medical provider to medical cooperation from the constructed model. This study uses inpatient and outpatient claims data related to femoral neck fractures from 2014 to 2019 for patients who resided in one prefecture in Japan and were hospitalized due to a femoral neck fracture. First, we construct the medical provider network, representing patient-sharing relationships among medical providers. Second, we consider a statistical model using GCNs. The graph-level output variable is the rate of home discharges from all medical providers, and the input variable is the number of inpatients and outpatients in each medical provider. We evaluate the model's fit to the test data by 10-fold cross-validation; using RGCN improves the coefficient of determination to 0.49. Finally, we calculate a relevance score for each node with respect to the output variable using Grad-CAM. This result suggests that the score of each node represents the level of impact of each medical provider in the regional medical cooperation. |
51. Novelties and Polarization in Recommendation Algorithms PRESENTER: Giordano De Marzo ABSTRACT. The widespread use of recommendation algorithms has raised concerns about their potential impact on individuals and society. While they allow us to access content that would otherwise be difficult to find, they may also enclose us in algorithmic bubbles and limit the diversity of what we see. This work investigates how different recommendation algorithms affect the discovery of novelties, like listening to a new song or watching a new movie. We find that user-user collaborative filtering favors reaching a collective consensus and increases the rate at which each user experiences novelties. The situation is different for the more sophisticated and widespread matrix factorization algorithm. While it still enhances the discovery of novelties, it also leads to fragmented configurations of users. The conclusion is that recommendation algorithms can improve our online experience by enhancing the discovery of novelties, but the specific algorithm used must be carefully evaluated to avoid opinion polarization. |
52. Innovation and performance: inside the organizational network ABSTRACT. This research examines the impact of intra-organizational networks on innovation and performance using a large and granular dataset provided by a multinational firm. The dataset contains information on the communication and collaboration patterns of employees within the firm, as well as their individual performance metrics and specific characteristics of the projects they work on, including project complexity and an innovation score. This dataset includes both a formal collaboration network and an informal collaboration network, elicited through a name-generator survey. Using social network analysis techniques, we want to investigate the structure of the intra-organizational networks and their impact on innovation and performance. By using network measures, we can predict individual and project performance. This research has important implications for organizations seeking to enhance their innovation and performance. The large and granular dataset provided by the multinational firm allows us to conduct a comprehensive analysis of intra-organizational networks and their impact on innovation and performance, providing valuable insights for both academics and practitioners. |
53. Bipartite Network Analysis of Sample-Based Music PRESENTER: Dongju Park ABSTRACT. Musical sampling is a composition technique of popular music that borrows elements from existing recordings to produce new songs. The sampled music is thus deeply related to the newly created music based on it. We can therefore surmise that the sampling practice of an artist reflects the characteristics of the subgenre or the music community that the artist operates in and belongs to. In this study, we present a complex network analysis of the communities of artists connected via sampling relationships. We establish an artist-song bipartite network of artists who perform the sampling and songs that are the subject of sampling, and detect communities through the BiLouvain algorithm. We show that the sample-based music scene has a clear community structure, and identify six communities that differ significantly in size compared to other communities. These six communities represent the main styles of sample-based music. We consider both local and global structures of the network by referring to the degree centrality and BiRank values when identifying important artists in each community. We also derive a coarser or finer community structure by varying the resolution parameter of Barber's modularity, which is maximized during community detection. While this study focuses on the understanding of sample-based music that forms the basis of an overwhelming majority of contemporary popular musical paradigms, we believe this framework is general enough to be applied to many other creative fields that involve referencing of existing works. |
54. The evolution of freshmen networks during and after lockdown: Comparing two cohorts PRESENTER: Judith Gilsbach ABSTRACT. We investigate the role of tutorials as social foci (Feld 1981) in the evolution of a freshman network during and after a period of online-only lectures in a German university due to Covid-19. The first cohort consists of 42 freshmen who began their bachelor’s studies in sociology in autumn 2020 with all lectures, including tutorials, online. The second cohort (N = 66) began their bachelor’s studies in autumn 2021 with most lectures, including all tutorials, given in person on campus. We collected three waves of network and attribute data for each cohort throughout the course of their first semester. In the online survey, students were asked to nominate their acquaintances, friends, and people they talk to about important issues by clicking on their pictures. The photos were collected during the welcome week. Accordingly, we ensured that contacts could be nominated even if students did not know the full names of their new acquaintances. We also ensured that only those who signed the data protection agreement could be nominated by others. In both cohorts about 20 students did not participate at all. They are not considered a part of the network. Of those who participated, not all filled in each of the three surveys. Therefore, we used model-based imputation for the missing ties. For the first wave we used a stationary Stochastic Actor Oriented Model (SAOM) as suggested by Krause (2018). For the later waves we used the SAOM-internal mechanism that is integrated in the RSiena software. The analysis was conducted by specifying SAOMs for each cohort. We conducted several quality checks for the imputation and analysis models. We find a larger impact of the tutorial groups in the first cohort, which was taught online only, and a huge improvement of the analysis results by imputation. 
In our talk we would like to present the collection of relational data via photos and the model-based imputation of missing tie variables as well as our results regarding the role of tutorials for the network formation and some first explanations. |
55. Power-laws and preferential attachment across social media platforms PRESENTER: Joao Neto ABSTRACT. We analyze 100 statistical observables of content across 11 different social media platforms. We find that almost all observables follow heavy-tailed distributions, and most follow power-laws. With a preferential attachment model, we identify aging and the comment-to-submission ratio as key parameters in explaining the variability of power-law distributions. Our results could inform platform design and user experience, and may have implications for understanding information and influence dynamics in social media ecosystems. |
56. Statistics on Reddit: entropy and Gini coefficient PRESENTER: Guilherme Machado ABSTRACT. The social website Reddit has a unique structure in which users submit and interact with content in numerous communities or subreddits. Reddit can be viewed as a bipartite network, with users represented by one kind of node, and the subreddits by another. In this work we study the statistics of Reddit through Shannon entropy and the Gini coefficient. Entropy has been widely used to characterise the predictability of agents’ behaviour [1] [2]. We adapt this concept to Reddit by looking at the entropy of users with respect to their comment distribution among subreddits. Higher entropy indicates less predictable behaviour. The mean entropy of users who frequent a given subreddit then indicates the overall predictability of that community [3]. We find a large variation of mean user entropy between subreddits, indicating significant heterogeneity in the patterns of usage of different subreddits. We also study the distribution of the number of user comments within each subreddit. We find that this comment distribution tends to have a heavy tail, but one whose shape changes with the subreddit’s size. We verify that these changes are not explained by finite size or sampling effects for communities of different sizes. Instead we observe a systematic broadening of the distribution with subreddit size. This suggests that subreddits have a core-periphery structure, as the most active users (in the tail of the distribution) are disproportionately responsible for greater and greater activity as the subreddit size increases. To measure this disproportionality, we calculated the Gini coefficient of each subreddit, as shown in the left panel of figure 1. We observe that the typical Gini coefficient increases monotonically with increasing subreddit size. 
However this increase saturates as subreddits grow very big, as it becomes harder to increase the fraction of users in the core (even with a heavy tail), see right panel of figure 1. |
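The two statistics named in abstract 56 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the entropy is taken in bits over one user's comment counts across subreddits, and the Gini coefficient uses the standard rank-based formula over the per-user comment counts of a single subreddit. The function names are hypothetical.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (in bits) of one user's comment counts across subreddits."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-based formula for data sorted in ascending order.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n
```

A user who comments in a single subreddit has entropy 0 (fully predictable), while a user who spreads comments evenly over four subreddits has entropy 2 bits. For a subreddit in which one of four users writes every comment, the Gini coefficient is 0.75, the maximum for four users.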
57. Political Participation and Voluntary Association: A Hypergraph Case Study PRESENTER: Amina Azaiez ABSTRACT. Policy formation in representative democracy is constantly influenced by various types of civic organizations, from interest groups to voluntary associations. In this work, we provide a local case study that focuses on the interconnection between voluntary associations and local political institutions such as the city hall, in a city of almost two thousand residents. While traditionally, sociologists address this issue by studying the social characteristics of agents such as age, gender, or socio-professional status, here, we focus on the structure of interactions between members of organizations to reveal interactional behaviours that explain the political involvement of citizens. To this end, we model group interactions using hypergraph theory. We study the overall architecture of interactions and find that there is a community-based structure, where communities gather members from similar organizations. We define an interaction-based measure to quantify political participation and provide two centrality measures that tend to identify ’openness’ and ’integration’ of individuals. We find that ’openness’ is the behaviour that best explains the political participation of members of associations. |
58. Exploring the Communication network of the Civil Society organizations in Austria dealing with Ukrainian refugees and humanitarian aid PRESENTER: Kateryna Maikovska ABSTRACT. The full-scale Russian invasion of Ukraine in 2022 has triggered the biggest forced migration wave in the recent history of Europe. Civil society immediately mobilized to assist the refugees and organize humanitarian aid. In this paper I present a study of the communication network of the civil society organizations (CSOs) and volunteers located in Austria that aim to help Ukrainians. I use network interviews and social network analysis to investigate the structure, communication problems, and communication strategies in the CSO network. The findings suggest that the network has been rapidly changing and evolving from a self-organized connective action frame to an organizationally enabled connective action frame. This is expressed in the specialization of volunteer groups and their transformation into emergent CSOs. Further, I discuss the structure of the CSO network and elaborate on the differences and similarities of the communication strategies of its actors. The study provides a broad overview of the CSOs' operation and perspectives for future research and policymaking. It is also a relevant contribution to network science, as it provides an example of collecting network data via semi-structured qualitative interviews and field work. |
60. Echo chambers and information transmission biases in homophilic/heterophilic networks PRESENTER: Fernando Diaz-Diaz ABSTRACT. We study how information transmission biases arise by the interplay between the structural properties of the network and the dynamics of the information in synthetic scale-free homophilic/heterophilic networks. We provide simple mathematical tools to quantify these biases. Both Simple and Complex Contagion models are insufficient to predict significant biases. In contrast, a Hybrid Contagion model — in which both Simple and Complex Contagion occur—gives rise to three different homophily-dependent biases: emissivity and receptivity biases, and echo chambers. Simulations in an empirical network with high homophily confirm our findings. Our results shed light on the mechanisms that cause inequalities in the visibility of information sources, reduced access to information, and lack of communication among distinct groups. |
61. Correlation distances in social networks PRESENTER: Padraig MacCarron ABSTRACT. The degree assortativity measures the correlations in the degree of neighbouring nodes. Its importance was underlined because it helps distinguish the structure of social networks from other types of complex networks [1] (though this is found to not always be the case for online social networks [2]). Studies of assortativity usually focus on nearest neighbour correlations, since nodes interact directly only with their nearest neighbours. Here instead, we extend its usual definition beyond that of nearest neighbours. In social networks, assortativity is related to homophily, whereby individuals associate with people similar to themselves. This leads to a tendency of individuals to associate mostly with others of a similar race or ethnicity, age, religion or interest. Previous studies have suggested that nodes’ influence may extend beyond the immediate neighbourhood in social networks (e.g. [3]). However, serious questions have been raised about the methods of these studies [4]. Here we use the assortativity at different distances to test whether there is any correlation, or “influence”, beyond nearest neighbours in many social network datasets. We also test this on some temporal networks, such as a scientific co-authorship network and a mobile phone dataset, to see if correlations increase over time. However, we find that these positive correlations diminish after one step in most of the empirical networks analysed. Properties besides degree support this, such as the number of papers in scientific coauthorship networks, with no correlations beyond nearest neighbours (Fig. 1). Beyond next-nearest neighbours we also observe a disassortative tendency for nodes three steps away, indicating that nodes at that distance are more likely different than similar. We apply this method to model networks and compare these results to the real networks. 
In all assortative generated networks, as well as the social networks, the correlation also diminishes after the first step. This result is the opposite of what the “three degrees of influence” work would suggest. Our work instead implies that, in a social network, the more distant you are from someone, the more likely you are to be different from them. References: [1] Newman M.E.J. and Park J., Why social networks are different from other types of networks, Phys. Rev. E 68, 036122 (2003). [2] Hu H.B. and Wang X.F., EPL (Europhysics Letters) 86(1), 18003 (2009). [3] Christakis N. and Fowler J., The New England Journal of Medicine 357, 370-379 (2007). [4] Lyons R., Statistics, Politics, and Policy 2(1), Article 2 (2011). |
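One way to read "assortativity at different distances" in abstract 61 is as the Pearson correlation of degrees over all node pairs that are exactly d steps apart. The sketch below implements that reading; it is an assumption about the definition rather than the authors' exact estimator, and the function names are hypothetical. For d = 1 it reduces to the usual degree assortativity.

```python
from collections import deque
import math


def distances_from(adj, source):
    """Breadth-first search: shortest-path length to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_correlation_at_distance(adj, d):
    """Pearson correlation of degrees over ordered node pairs exactly d steps apart.

    `adj` maps each node to the set of its neighbours (undirected graph).
    Returns NaN when there are no such pairs or the degrees do not vary.
    """
    xs, ys = [], []
    for u in adj:
        for v, steps in distances_from(adj, u).items():
            if steps == d:
                xs.append(len(adj[u]))
                ys.append(len(adj[v]))
    n = len(xs)
    if n == 0:
        return float("nan")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


# Toy graphs for illustration: a star and a four-node path.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

On the star, the hub only neighbours leaves, so the distance-1 correlation is -1; the leaves at distance 2 all have the same degree, so the correlation is undefined there. The same attribute-agnostic loop could be applied to properties other than degree by replacing `len(adj[u])` with a node attribute.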
62. Let's Tweet about Soccer? A Gender-centric Question PRESENTER: Akrati Saxena ABSTRACT. Fans, sports organizations, as well as players use social networks like Twitter to build and maintain their identity and sense of community. Breaking news about soccer sometimes appears on Twitter before traditional media channels, and Twitter provides an excellent means to access instantaneous information from official and unofficial sources. Soccer has more than 3.5 billion fans across the world, and it is estimated that 1.3 billion of them are female (around 38%). Our work examines whether, in a male-dominated environment, women and men differ in communication patterns. Women soccer fans tend to experience biases and prejudices, and our question concerns how this translates to online spaces. Through our study, we look into the patterns of interaction between men and women and how communication evolves over a period of 3 months (March 7 to June 6, 2022) for English and Portuguese languages. After our data preprocessing, we have 7,676,624 tweets in English (6,365,239 are from males and 1,313,731 are from females) and 2,958,443 tweets in Portuguese (2,312,415 are from males and 648,395 from females). The highest women rate in our data reaches 28%, i.e., close to the overall Twitter ratio (29.6%). We first study both networks and observe that the Portuguese network has a higher women ratio in interactions (Figure 1A) and a lower homophily (Figure 1B) than the English network. One of the reasons is that the dataset seems to originate from people belonging to specific places, mostly Brazil and Portugal, where people are more familiar with soccer. 
The network structure from women's tweets tends to be denser, with higher average clustering and lower assortativity than that from men's tweets, in line with previous research on other types of communication such as research collaborations. Then, we further carried out a text analysis of whether men and women exhibit emotions differently in soccer. The emotions are computed using the NRC Emotion Intensity Lexicon, VADER, and Google's Perspective API. We plot, in Figure 1C, the gender differences of the emotions extracted from the tweets over time. We found that, in soccer, women tend to express higher levels of joy and anticipation than men in both languages, and disgust tends to be more gender-neutral with a slightly higher level for males. Interestingly, we did not find any qualitative difference in the gender differences in emotion between the English and Portuguese collected tweets. We thus found that the emotional response of different genders seems independent of the overall network structure. In this talk, we will also discuss other insightful results on sentiments, emoji usage, retweeting behavior, and communication patterns observed in our dataset. In future work, we expect to investigate further the patterns found in the communities extracted from our networks and the patterns extracted from a female-dominated sport. |
63. Understanding Changes in Ideas in Science Fiction with Word Embedding PRESENTER: Minsang Namgoong ABSTRACT. Word embedding is a Natural Language Processing (NLP) technique for finding the vector representation of a word in terms of other words that reflects the syntactic or semantic context of word usage, enabling us to quantify the association between two words. In this study we use the technique to examine how the authors' "ideas" - the perception of the subject materials - have changed over time in Science Fiction. We employ Dynamic Bernoulli Embeddings (DBE), a type of word embedding that hypothesizes that embedding vectors shift through the vector space over time. DBE produces different embedding vectors for the same word in different time periods, enabling us to explore the transitions in word meanings and associations. We can visualize this using the egocentric network of the word of interest. In the Figure, we show two egocentric networks of the word "world" in the 1910s and the 1970s, where words are connected when the cosine similarity between their embedding vectors is 0.5 or larger. The associated words that form a densely connected community (e.g. a clique) around the ego show how the context or the word meaning may differ in different eras. We anticipate that this approach will aid us in the quantitative evaluation of literary works. |
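The egocentric-network construction in abstract 63 (connect words whose cosine similarity is 0.5 or larger) can be sketched for a single time period as follows. The embedding vectors here are made-up two-dimensional toy values, not DBE output, and the function names are hypothetical; in practice one such network would be built per period from that period's vectors.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ego_network(vectors, ego, threshold=0.5):
    """Edges among the ego word and the words similar enough to it.

    A word joins the network if its similarity to the ego is >= threshold;
    any two members are then linked under the same rule.
    """
    members = [ego] + [
        w for w in vectors
        if w != ego and cosine(vectors[ego], vectors[w]) >= threshold
    ]
    return {
        (a, b) for a, b in combinations(members, 2)
        if cosine(vectors[a], vectors[b]) >= threshold
    }


# Toy vectors for one hypothetical time period.
vectors = {
    "world": (1.0, 0.0),
    "earth": (0.9, 0.1),
    "planet": (0.6, 0.6),
    "ship": (0.0, 1.0),
}
edges = ego_network(vectors, "world")
```

In this toy example "earth" and "planet" join the ego and are also linked to each other, forming a clique of three, while "ship" is excluded.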
64. Gender homophily in acknowledgement mention network PRESENTER: Keigo Kusumegi ABSTRACT. Gender bias in academia has been identified as a prevalent issue, particularly in the form of male-gender homophily. This has resulted in lower citations for female researchers and a higher likelihood of male researchers collaborating with other male colleagues (but not for female researchers). The complexity of gender homophily is driven by factors such as socially constrained gender roles and a preference for same-sex colleagues. In this study, we used a new dataset [1] on academic acknowledgments to determine whether such gender homophily also appears in the acknowledgment mention network. Our findings indicate that papers with a higher proportion of male authors are less likely to mention female researchers in their acknowledgments (Fig. 1). Conversely, papers with a higher percentage of female authors exhibited an equal proportion of male and female mentions in the acknowledgments. This pattern aligns with previous research on gender homophily in collaborations. Our results imply that male-dominated collaborations may result in a heightened tendency to exclude female researchers. |
65. Minority group size moderates inequity-reducing strategies in homophilic networks PRESENTER: Sam Zhang ABSTRACT. Members of minority groups often occupy less central positions in networks, which translates into a lack of access to essential information, social capital, and financial resources. This marginalization in relation to the majority group can arise from simple structural features of social networks: the tendency of individuals to connect to their own in-group (homophily) and the preference for connecting to popular individuals in the network (preferential attachment). Yet in real life, groups of actors exist at the boundaries of minority and majority group memberships whose presence affects the visibility of the minority group. By modeling a social system with these boundary groups, we illuminate the hidden role of the minority group size in moderating (even non-linearly) the benefit of these intermediating groups. First, we introduce allied majority group members who support the minority group by forming ties to it. Second, we introduce incorporated minority members, whose minority characteristics are mostly invisible or unreadable and who thus face no discrimination by the majority group. We build on the directed network model with preferential attachment and homophily. To capture marginalization, we measure inequity in the generated network as the proportion of the minority in the top-k ranked nodes using the PageRank algorithm. Guided by the social-scientific literature, we design three scenarios capturing different homophily preferences of majority and minority groups towards allied and incorporated members, ranging from being penalized by everyone but the majority to being accepted by all groups. Our results suggest that interventions on structural inequities on networks can depend sensitively on the relative sizes of the groups involved. 
We observe the increasing difficulty of a minority group to achieve parity as the group shrinks: both external strategies (increasing allies) and internal strategies (increasing incorporation) require a higher relative threshold of participation before equity is reached. Allied majority members are equally successful in helping the minority overcome inequity regardless of whether minority members associate with them or not; yet, the efforts of incorporated minority members are facilitated if other incorporated and minority members affiliate with them. Policymakers attempting to adopt policies that have been successful in improving representation for one group may need to account for the relative group sizes, as well as the nuances in relationships between different groups and allied and incorporated members, as basic variables. |
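The inequity measure in abstract 65 (the proportion of the minority among the top-k nodes ranked by PageRank) can be sketched with a plain power-iteration PageRank. This is a minimal illustration, not the authors' model: the damping factor of 0.85, the fixed iteration count, the group labels and the five-node toy graph are all assumptions made for the example.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph given as {node: [targets]}.

    Every target must also appear as a key of `out_links`.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out_links[u])
        new = {u: (1 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            for v in out_links[u]:
                new[v] += damping * rank[u] / len(out_links[u])
        rank = new
    return rank


def minority_share_top_k(out_links, group, k, minority="min"):
    """Fraction of the k highest-PageRank nodes that belong to the minority."""
    rank = pagerank(out_links)
    top = sorted(rank, key=rank.get, reverse=True)[:k]
    return sum(group[u] == minority for u in top) / k


# Toy directed network: three majority nodes (a, b, c), two minority nodes (x, y).
out_links = {"a": ["b"], "b": ["a"], "c": ["a"], "x": ["a", "y"], "y": ["a"]}
group = {"a": "maj", "b": "maj", "c": "maj", "x": "min", "y": "min"}
```

In the toy network the minority makes up 40% of the nodes but none of the top two and only one of the top three by PageRank, which is the kind of under-representation the measure is meant to capture.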
66. A mixed-methods approach to study socio-semantic networks PRESENTER: Alexandre Hannud Abdo ABSTRACT. We argue that more effort should be put in making state-of-the-art results from network science actionable by those trained as qualitative researchers. Our contribution, which targets the analysis of socio-semantic networks, has been to develop an extensive methodology in tandem with different groups engaged in field research, incorporating recent advances in community detection and higher-order graphs by means of specially crafted interfaces for graph data and their model representations. The computational tools of this approach — nicknamed sashimi — are appropriately available as a suite of no-code methods in a gratis cloud service, and also as a free-and-open-source software library. We present results from the latest research projects experimenting with the approach: (a) in the field of Transition Studies, an investigation into the variety of disciplinary manifestations throughout the social sciences of the "research problem of destabilisation of socio-technical systems", that seeks to inform current destabilisation/discontinuation/phase-out scholarship with a wider understanding of the problem; (b) in Science and Technology Studies, an analysis of policy documents pertaining to the regulation of artificial intelligence, identifying the interplay between actors associated with different themes, sectors and perspectives (solutionism, contestation, regulation); (c) still in STS, an analysis of social media interactions concerning environmental controversies, focusing on the issue of pesticides. In the context of these examples, we'll explain a set of concepts and practices, emerging from our usage, to productively co-construct meaning between the representations exposed through the interfaces, and the goals, inputs and choices of a researcher with field and experiential knowledge. 
In particular, how to interpret the clusters and the specificity and commonality scores of inter-cluster relationships employed in the maps, how to formulate inquiries based on sequences of network filtering and dimension chaining operations, and finally how to construct coherent groups of document clusters we call constellations, and identify attribute flows in their cores and frontiers. |
67. The Network Effect of Deplatforming on Social Media PRESENTER: Max Falkenberg ABSTRACT. Deplatforming, or the removal of malicious accounts from mainstream social media, has become the key tool for countering misinformation and protecting users on social media. However, little consideration has been given to how such moderation policies impact the wider social media ecosystem. In part, this is because few studies have been able to track social media users after being suspended from mainstream platforms. Here we address this gap, presenting a comprehensive network analysis of the ban-induced platform migration from Twitter to Gettr on the US political right. Our dataset covers the near-complete evolution of Gettr from its founding in July 2021 to May 2022, including 15 million posts from 790 thousand active users ("Not verified" cohort in figures). Of these users 6,152 are verified, 1,588 of which self-declare as active on Twitter ("Matched" cohort). For these 1,588 self-declared Twitter users, we download their Twitter timeline from July 2021 to May 2022, totalling 11 million posts ("Twitter" cohort). For other verified Gettr users, we identify the 454 accounts that have been suspended by Twitter ("Banned" cohort). Analyzing these cohorts, we first show that retention of migrated users varies significantly: users banned from Twitter are more active (between 3 and 6 times), and remain on Gettr for significantly longer, than users who have not been suspended from Twitter. Second, we analyse the structure and content of Gettr, demonstrating that there is little difference between those users suspended on Twitter, and those still active on Twitter, see figure 1(a); both groups, and Gettr in general, are broadly representative of the US far-right. Third, we show that Gettr content is, on average, significantly more toxic than the Twitter content of matched users. However, the most toxic Twitter content is more toxic than on Gettr, see figure 1(b/c). 
To better understand this phenomenon, we focus on the behaviour of the matched cohort on Twitter and study the Twitter users with whom they interact, see figure 2. We define the "quote-ratio" as the number of matched accounts who quote-tweet a user on Twitter, normalised by all mentions. This shows that the cohort of matched users on Gettr broadly aligns with far-right media sources and differs significantly from left-leaning groups. Studying the toxicity of posts on Twitter, we find that posts that mention users who are disproportionately quote-tweeted rather than retweeted are significantly more toxic than other posts. We find that many of these posts target US Democrat politicians, and specifically female politicians. Finally, we discuss the broader impact of fringe platforms on democracy by showing the impact of Gettr on the Brasilia insurrections in January 2023. Together, these results emphasise the critical importance of carefully considered social-media moderation policies, and the need for high quality data on, and monitoring of, fringe social-media platforms. |
68. How to measure trends on Twitter? PRESENTER: Ana Meštrović ABSTRACT. Here we introduce new indicators for measuring trends on Twitter. We perform a quantitative analysis of tweet content and related reactions in terms of retweets, replies, and quotes. One of the problems is how to recognize content which seems to be important according to a large number of reactions but actually is not so relevant. More specifically, there is always a certain group of users who continuously share the same content, while there are no other, external users sharing that same content. This behavior leads to a quantitative bias, where some content appears to have a large number of incoming reactions, even though they all come from the same user and are not shared throughout the network. To detect such bias, we introduced two new indicators: α and β which are calculated by combining the unique number of users who shared the hashtag at least once and the total number of times that the hashtag was shared. We perform an experiment on the dataset CroTW11-2022 consisting of 386,168 Tweets posted during the 30 days of November 2022, by 6,887 users from Croatia. Figure 1 illustrates hashtag popularity according to α and β indicators. [Figure 1] According to our preliminary results, the most popular content is spread using the Retweet reaction. The spread of the most popularly shared content, hashtag "Vatreni", is related to the FIFA World Cup, which takes place in the analyzed time period (November 2022). It should be noted that this analysis is very short-sighted due to the short time span. In addition, popular content that is shared by a large number of users has a high α, while content that is shared by a small number of users has a low α. Popular content that is included in the majority of shared content has a high β, while content that is included in the minority of shared content has a low β. |
69. An unsupervised network-based anomaly detection of credit card fraud PRESENTER: Aida Sadeghi ABSTRACT. In today's world, the growth of online payments has expanded the reach of e-commerce and shifted society away from cash dependency. Alongside this development, online payments have become a target for fraudsters who exploit the openness of such purchases and the transmission of credit card information. Transaction fraud detection has become a crucial task to identify fraudulent activities effectively. Fraud is characterized by its rare, hidden, constantly changing and time-sensitive nature. These challenges give rise to recognized limitations of current techniques, including (1) the acquisition of expensive labeled data, which may still not encompass all the fraudulent cases, i.e., anomalies, (2) a need for domain specialization, (3) an inability to encompass a broad spectrum of anomaly types, and (4) challenges in detecting collectives of fraudsters who work together. It is therefore appropriate to approach fraud detection in general using networks and, for credit card fraud detection in particular, to examine the network of transactions. Research shows that incorporating network information into the fraud detection task greatly improves performance. However, existing research typically uses supervised machine learning for this task, which often requires extensive balancing of the data and may lead to sub-optimal results. In this research we apply state-of-the-art unsupervised anomaly detection approaches with the goal of identifying fraudsters in a network of credit card transactions. We consider the problem as a link detection task to evaluate the viability of network analytics in the context of credit card fraud. First, we build a network by linking cardholders with merchants; next, we assign attributes to nodes and edges, after which we study the networks based on their characteristics. 
We extract network features from the network, based on the node and edge centrality and the connectivity in the network. Then, with an augmented dataset, to achieve this objective, we utilize contemporary downstream techniques that incorporate different assumptions regarding data distribution. Specifically, we employ two shallow non-parametric methods, namely the Empirical-Cumulative-distribution-based Outlier Detection (ECOD), the Copula Based Outlier Detector (COPOD). In addition to that, we apply two jointly parametric and non-parametric deep auto-encoders, the Deep Support Vector Data Description (DeepSVDD) and the Deep Auto-encoding Gaussian Mixture Model (DAGMM). Our preliminary results indicate that analyzing transaction fraud with relational and structural aspects of the transaction network gives beneficial information which might outperform cutting-edge techniques. Meanwhile, our contribution provides insightful results where, in the world of shallow versus deep methods for real-time financial data, using network features would improve the performance of all the models. Furthermore, we will apply model agnostic methods to look into the models' predictions to understand the anomalies and enable the detection of fraud early, which is so crucial for financial loss reduction. |
70. Multi-level social relationships among Michelin-starred chefs PRESENTER: Agustí Canals ABSTRACT. Michelin-starred restaurants are relevant actors in the hospitality industry, especially within developed countries. Their importance resides not so much in their business volume but in their role as a flagship of the industry in the region. In Catalonia there is an important geographical concentration of Michelin-starred outlets due in part to traditional factors like the variety of products from the Catalan territory, its cultural and socio-economic level, and the proximity of French cuisine. In the last decades, though, other additional factors have become relevant: the presence of great groundbreaking chefs such as Ferran Adrià, Santi Santamaria, Joan Roca or Carme Ruscalleda, the formal education of new chefs, the presence of the sector in social media, and the increasing interaction among chefs creating relationships that foster knowledge sharing and innovation. In this paper, we examine these relationships by studying the structure of the social network of Michelin-starred chefs in Catalonia and its possible effects on the development of the sector. We build the Catalan chefs network from data gathered from an existing and accepted sample of restaurants: the list of Michelin-starred restaurants for Spain and Portugal, specifically the list for Catalonia, which is historically the most awarded region in this area. Data were collected between 2013 and 2017 in a process involving in-depth and inductive interviews with the help of a semi-structured questionnaire, in which chefs were asked about their relationships with each of the other members of the sample and the nature of these relationships. The final sample consists of 64 chefs. We obtain a final network of 64 nodes and 779 links, with a density of 0.39 and a diameter of 3. Having information about the nature of the relationships (personal or professional) allows us to treat it as a two-layer network. 
By looking at different features of the network, like the centrality of the nodes, presence of structural holes, communities, k-core decomposition, or ego-networks, and relating them to the information we obtained from each member of the network in our interviews, we analyze aspects such as the role of top-tier chefs (awarded with 2 or 3 Michelin stars) in the sector, the characteristics of the processes of knowledge sharing and collaboration, or the relationship between professional success and position in the network. |
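As a quick consistency check of the figures reported above, and assuming the network is treated as an undirected simple graph (an assumption, since the abstract does not state this), the density of a network with $N = 64$ nodes and $L = 779$ links can be worked out as:

```latex
\rho = \frac{2L}{N(N-1)} = \frac{2 \times 779}{64 \times 63} = \frac{1558}{4032} \approx 0.386
```

This rounds to the reported density of 0.39.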
71. Systemic Risk of Cascading Failures/Overloads in a Recoverable Network under Redistributed Elastic Load: Work in Progress ABSTRACT. While systemic failure/overload in recoverable networks with load redistribution is a common phenomenon, the prohibitively high dimension of the problem explains the current reliance on simulations for evaluating, and moreover mitigating, the corresponding systemic risk, to the detriment of theoretical understanding. The model proposed in this paper for systemic risk evaluation and mitigation, which relies on approximate dimension reduction at the onset of systemic failure/overload, is intended to reduce this gap. We assume that component failure/overload rates increase with the corresponding Lagrange multipliers, that contagion at the micro-level is described by a Markov process with locally interacting components, and that the macro-level system dynamics is approximated by a 2-state Markov process alternating between operational and systemically failed system states. |
72. Continuous Projection Space - Technology codes and Mergers & acquisitions predictions PRESENTER: Matteo Straccamore ABSTRACT. Measures of relatedness in bipartite networks are fundamental tools in complex systems that can be used for policy implications. In this study, we use a bipartite firm-technology network and show how these measures can be used to predict future mergers and acquisitions (M&A) and to identify technology trends. A classic approach is the simple calculation of Co-Occurrences, a network-based measure that counts how many objects two entities have in common. For example, if two technologies t1 and t2 have a high Co-Occurrence, i.e. are implemented together by a significant number of firms, we say that a firm that produces only t1 has all the capabilities to do the same with t2, i.e. that the firm has a high relatedness with t2. However, there are different types of Co-Occurrences in the literature that depend on different formulations and are not fully capable of finding nontrivial relationships between technologies. In this study, we compare the relatedness measures obtained by various Co-Occurrences formulations with the approaches of supervised machine learning (ML) algorithms. Our results show that the supervised ML algorithms outperform the Co-Occurrences formulations in terms of prediction of future technologies in firms, while the Jaffe measure outperforms the others in terms of M&A predictions. Furthermore, we present a new method, the Continuous Projection Space (CPS), which can solve the problem of low interpretability of the supervised ML algorithms results. By applying the CPS method to the firm and technology layers, we obtain the Continuous Technology Space (CTS) and Continuous Company Space (CCS), respectively. 
The CTS can be used to obtain measures of relatedness used to make predictions about what technology will be used by a company, while the CCS can be used to make predictions about future mergers and acquisitions (M&A) between two companies. Our study shows that the use of relatedness measures in bipartite networks can have important policy implications in complex systems. Furthermore, the CPS method provides a visual tool to inform and justify strategic decisions and overcome the low interpretability problem of the supervised ML algorithms results. Future research could explore the application of the CPS method to other types of bipartite networks and different policy implications. |
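The simple Co-Occurrence count described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `co_occurrence` and the toy firm-technology data are hypothetical, and none of the alternative formulations mentioned in the abstract (such as the Jaffe measure) is reproduced here.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(firm_techs):
    """Count, for each unordered technology pair, the firms holding both."""
    counts = Counter()
    for techs in firm_techs.values():
        # Sort so that each pair is always stored in the same order.
        for t1, t2 in combinations(sorted(set(techs)), 2):
            counts[(t1, t2)] += 1
    return counts


# Hypothetical toy data: firm -> set of technology codes.
firm_techs = {
    "firm_a": {"t1", "t2"},
    "firm_b": {"t1", "t2", "t3"},
    "firm_c": {"t1"},
}
cooc = co_occurrence(firm_techs)
```

In this toy data, t1 and t2 are held together by two firms, so `firm_c`, which holds only t1, would be considered more related to t2 than to t3.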
73. Determining Trade Execution Timing with Stock Market Networks: Performance Comparison between Network Node Degree-based EMA and Traditional EMA PRESENTER: Hiroki Sayama ABSTRACT. Determining the right timings of buying and selling stocks is often led by technical indicators such as Exponential Moving Average (EMA) and Relative Strength Index (RSI). Although they are heavily used by traders in general, they often mislead investors due to the complex nature of stock markets. In this study, we propose the use of stock market network measurements such as node degrees in the calculations of the indicators and evaluate how effective they become compared to the original indicators. Particularly, the objective of this study is to 1) build time-series of Standard & Poor’s 500 Index (S&P 500) networks, 2) compute EMA6 and EMA13 by using the skewness of the degree distributions of the networks, 3) simulate the trade executions of both network-based EMA and traditional EMA crossover strategy and 4) compare the trading performances between them. The networks were constructed following our previous study in stock market prediction [1] with the S&P 500 companies as the nodes and the mutual information of 60-minute pair-wise price movements of the companies as the edge weights. With a total of 5,340 consecutive minutes of price records, we built 89 time-series of networks and computed the skewness of the node degree distribution per network. We visualized both network-based EMAs and traditional EMAs along with the hourly S&P 500 index movements and buy and sell transactions made by an execution rule (Figure 1). 
Although trade execution algorithms can be more complex in real-life scenarios, in order to focus on examining the preliminary result of using network measurements, the trade simulation was made through a simple rule-based execution following the EMA-Crossover strategy: execute a buy transaction when EMA6 goes above EMA13 and a sell transaction in the opposite situation. The total trade return was calculated by adding up, over all trades, the difference between the index value at sale and the index value at purchase. As shown in Figure 1, we found that the network-based EMA strategy not only captured more trading opportunities, but also achieved a significantly higher return per trade than the original EMA. This finding may suggest that the collective information from the networks of underlying companies is an effective factor to improve the predictability of the indicator. In the future, we would extend this study to further examine the effectiveness with other network measurements such as centrality along with other indicators such as RSI. |
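The crossover rule and return calculation described in this abstract can be sketched as follows. This is a hedged illustration, not the authors' code: the function names, the smoothing factor 2 / (span + 1) (a common EMA convention, assumed here), and the toy price series are all assumptions. The `signal` argument stands for whatever series feeds the EMAs: the index itself for the traditional strategy, or a network measurement such as degree-distribution skewness for the network-based one.

```python
def ema(values, span):
    """Exponential moving average, assuming smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(out[-1] + alpha * (v - out[-1]))
    return out


def crossover_trades(index, signal, fast=6, slow=13):
    """Buy when the fast EMA of `signal` rises above the slow EMA, sell on
    the opposite cross. Returns (total return, number of closed trades),
    where the return sums index-at-sale minus index-at-purchase."""
    fast_ema, slow_ema = ema(signal, fast), ema(signal, slow)
    bought_at = None
    total, trades = 0.0, 0
    for t in range(1, len(index)):
        was_above = fast_ema[t - 1] > slow_ema[t - 1]
        is_above = fast_ema[t] > slow_ema[t]
        if is_above and not was_above and bought_at is None:
            bought_at = index[t]
        elif was_above and not is_above and bought_at is not None:
            total += index[t] - bought_at
            bought_at = None
            trades += 1
    return total, trades


# Hypothetical toy series: flat, a short rise, then a steady decline.
prices = [10, 10, 11, 12, 13, 14, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5]
total, trades = crossover_trades(prices, prices)
```

Passing a different series as `signal` while keeping `index` fixed is one way to compare a network-based indicator against the traditional one under the same execution rule.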
74. MNEs and multilayer economic networks behind industrial coagglomeration ABSTRACT. Firms benefit from geographic proximity to other firms in case they can exchange skilled labor, inputs or know-how (Ellison et al. 2010). However, companies are very heterogeneous in their capabilities to develop such connections to other firms and to get advantages from the local presence of similar firms, which makes local connection patterns unequal (Giuliani and Bell 2005). This is especially true in economies such as Hungary, where multinational enterprises (MNEs) with strong global value chain connections have a strong presence (Elekes et al. 2019). In this study we use the unique setting of a multilayer, nation-wide economic network, where we connect companies through economic transactions and labor exchange. We explore connection patterns at the level of industries, industry-regions and firms. Besides verifying the previously identified channels behind industrial co-agglomeration in the case of Hungary, we illustrate that firms in industries with high local concentration indeed dominantly connect to other firms in locally concentrated industries. These network patterns are especially true for the most successful firms inside industrial agglomerations. However, we show that MNEs mainly exchange inputs and labor with other MNEs or use the capabilities of local firms purely as inputs for their production. |
75. Multilayer fusions: a mixed percolation problem on families of fixed-degree graphs PRESENTER: Ashlesha Patil ABSTRACT. Entanglement, a key quantum information resource, is routed in a quantum network using nodes called repeaters. In an entanglement generation protocol, M>=1 Bell states (links) are generated between neighboring repeater pairs with probability p. Every repeater then performs "fusion" on group(s) of three or more links, which succeeds with probability q. A successful fusion glues the ends of all the links in the group, while a failed fusion deletes all the links in the group. This protocol maps to a mixed site-bond percolation problem. The aim of this work is to expand the percolation region. We use three different strategies to form fusion groups for a given k, which lead to percolation problems that have not been studied earlier, and we study how the site percolation threshold of each strategy scales with M. |
14:15 | Weighted network motifs as random walk patterns PRESENTER: Rossana Mastrandrea ABSTRACT. In the last decades, network theory has been successfully applied to explore a great variety of complex systems. Particularly interesting is the study of local network patterns, such as motifs, that can shed light on the emergence of global properties. Only a few works have proposed an extension of the motif concept to the weighted case. The method proposed in this work has a few ingredients: (i) unbalanced weighted networks; (ii) a sink node; and (iii) a random walker with a limited number of steps. The sink node compensates for the excess of ingoing flows and balances the network. It makes it possible to highlight the role of weight heterogeneity in shaping the structural organization of the network into subpatterns. The approach gives the frequency of paths of any possible length observable within a fixed number of steps of a random walker placed on an arbitrary node. We applied our approach to different real networks and tested the significance of weighted motif occurrences using different null models. We consider a maximum of 3 steps, offering both analytical results and simulation outcomes. In fig.1(a) we show all paths detectable by construction. Figure 1(b) reports the frequency of the eight weighted motifs for some networks. It offers an intuitive idea about the different nature of the systems according to their local organization in subgraphs. We found that networks belonging to the same field (economic, social, or ecological networks) show very similar motif significance profiles. Furthermore, focusing on specific weighted motifs we can shed light on the underlying functioning mechanisms of different systems. The motif significance profiles (fig.1(c)) reveal insights about similarities and differences in network organization. 
Furthermore, it is possible to focus on a selected node, study the variation over time of the weighted motif frequencies around it, and relate this variation to exogenous shocks such as economic, political, and social events (fig.1(d)). |
14:30 | The dynamic nature of percolation on networks with triadic interactions PRESENTER: Hanlin Sun ABSTRACT. Higher-order networks encode the many-body interactions of complex systems such as ecosystems, brain networks, and biochemical systems. Recently, it has become increasingly recognized that taking into account higher-order interactions changes significantly our understanding of the interplay between the structure and function of complex systems. Triadic interactions are an important type of higher-order interaction in which one node regulates the interaction between the other two nodes of the network. Triadic interactions are intrinsically signed as the regulatory node can act as an enhancer or an inhibitor of the interaction between the other two nodes. Interestingly, triadic interactions are widely found in real-world systems such as ecosystems and neuronal networks. However, so far the role of triadic interactions in determining the dynamics of the higher-order networks has been only investigated for small ecological networks. Here [1] we show that triadic interactions can dramatically change the large-scale dynamical properties of higher-order networks. We assume that positive or negative triadic interactions can regulate which links are active and which links are damaged and can turn percolation into a fully-fledged dynamical process. In this dynamical process, the giant connected component intermittently involves a different set of nodes, and the order parameter undergoes period doubling and a route to chaos. In absence of negative regulatory interactions, this percolation process always reaches a steady state and the phase diagram displays a discontinuous hybrid phase transition. 
The model is a high-dimensional dynamical process depending on the activity of each node and link of the network. However, the dynamics is well captured by a low-dimensional set of equations, which is in very good agreement with extensive Monte Carlo simulations of the dynamical process. In conclusion, this work shows that signed triadic interactions can induce blinking and chaos in the connectivity of higher-order networks. In the future, the proposed framework can be applied to other generalized network structures including hypergraphs and multiplex networks. [1] H. Sun, F. Radicchi, J. Kurths and G. Bianconi, The dynamic nature of percolation on networks with triadic interactions, arXiv preprint arXiv:2204.13067 (2022). |
14:45 | Hyper-core decomposition of hypergraphs: the role of hyper-cores in higher-order dynamical processes PRESENTER: Marco Mancastroppa ABSTRACT. Going beyond networks, in order to include higher-order interactions involving groups of elements of arbitrary sizes, has been recognized as a major step in reaching a better description of many complex systems. In the resulting hypergraph representation, tools to identify particularly cohesive structures and central nodes are still scarce. We propose here to decompose a hypergraph in hyper-cores. We illustrate this procedure on empirical data sets, showing how this suggests a novel notion of centrality for nodes in hypergraphs, the hyper-coreness. We assess the role of the hyper-cores and of nodes with large hyper-coreness in several dynamical processes based on higher-order interactions. We show that such nodes have large spreading power and that spreading processes are localized in hyper-cores with large connectedness along groups of large sizes. In the emergence of social conventions moreover, very few committed individuals with high hyper-coreness can rapidly overturn a majority convention. |
15:00 | Complex hypergraphs ABSTRACT. Providing an abstract representation of natural and human complex structures is a challenging problem. Accounting for the heterogeneous components of a system while allowing for analytical tractability is a difficult balance. Here I introduce complex hypergraphs (chygraphs), bringing together concepts from hypergraphs, multi-layer networks, simplicial complexes and hyperstructures. To illustrate the applicability of this combinatorial structure, I calculate the component size statistics and identify the transition to a giant component. To this end I introduce a vectorization technique that tackles the multi-level nature of chygraphs. I conclude that chygraphs are a unifying representation of complex systems allowing for analytical insight. |
15:15 | Temporal-topological properties of higher-order evolving networks PRESENTER: Alberto Ceria ABSTRACT. Human social interactions are typically recorded as time-specific dyadic interactions, and represented as evolving (temporal) networks, where links are activated/deactivated over time. However, individuals can interact in groups of more than two people. Such group interactions can be represented as higher-order events of an evolving network. Here, we propose methods to characterize the temporal-topological properties of higher-order events to compare networks and identify their (dis)similarities. We analyzed 8 real-world physical contact networks, finding the following: a) events of different orders that are close in time tend to be also close in topology; b) nodes participating in many different groups (events) of a given order tend to be involved in many different groups (events) of another order, so individuals tend to be consistently active or inactive in events across orders; c) local events that are close in topology are correlated in time, supporting observation a). By contrast, in 5 collaboration networks, observation a) is almost absent; consistently, no evident temporal correlation of local events has been observed in collaboration networks. Such differences between the two classes of networks may be explained by the fact that physical contacts are proximity based, in contrast to collaboration networks. Our methods may facilitate the investigation of how properties of higher-order events affect dynamic processes unfolding on them and possibly inspire the development of more refined models of higher-order time-varying networks. |
15:30 | Percolation and critical dynamics of (k,q)-core decomposition of hypergraphs PRESENTER: Jongshin Lee ABSTRACT. We propose a pruning process to find specific core structures of hypergraphs, named the $(k,q)$-core pruning process. This process recursively removes nodes with degree less than a given number $k$ and hyperedges of size less than $q$. We derive exact time evolution equations describing the changes in the degree distribution during this process. Percolation properties of the resulting giant core structures are numerically and theoretically examined. We discover that the type of phase transition changes according to the values of $k$ and $q$. For $k \ge 3$ or $q\ge 3$, we observe a hybrid phase transition where the jump and critical phenomena of the order parameter coincide, quantifying them through a finite-size scaling method and confirming that they are consistent with theoretically predicted results. We also scrutinize the critical dynamics of the pruning process at the percolation transition point. For $k \ge 3$ or $q \ge 3$, we observe the universal power-law relaxation dynamics $P^{(v)}(k-1,t)\sim t^{-2}$ at the critical point. However, for $k=q=2$, we obtain $P^{(v)}(k-1=1,t)\sim t^{-3}$, which implies a different mechanism of the dynamic critical behavior. Novel degree-dependent critical relaxation dynamics $P^{(v)}(z,t)\sim t^{-z}$ is deduced for $k=q=2$. Our method can effectively select highly interconnected modular structures within the given hypergraph and avoid the overestimated structures obtained from the $k$-core of graph representations. Also, our study sheds light on understanding the cascading failure of diverse higher-order networked systems, such as the collapse of metabolic processes or the sudden spread of infectious diseases. |
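The pruning rule described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' algorithm: the function name `kq_core` is hypothetical, hyperedges are represented as plain sets of node labels, node removals are applied in synchronous rounds, and hyperedges that become identical after pruning are kept as separate copies.

```python
def kq_core(hyperedges, k, q):
    """Repeatedly drop hyperedges of size < q and nodes of degree < k
    until neither rule applies; return the surviving hyperedges."""
    edges = [set(e) for e in hyperedges]
    while True:
        # Drop hyperedges that are (or have become) too small.
        edges = [e for e in edges if len(e) >= q]
        # Node degree = number of surviving hyperedges containing the node.
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        weak = {v for v, d in degree.items() if d < k}
        if not weak:
            return edges
        # Remove low-degree nodes from every hyperedge, then re-check sizes.
        edges = [e - weak for e in edges]


# Hypothetical toy hypergraph.
toy = [{1, 2, 3}, {1, 2, 3, 4}, {1, 2, 3, 5}, {5, 6}]
core = kq_core(toy, k=2, q=3)
```

Each round either removes at least one node or stops, so the loop terminates on any finite hypergraph.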
15:45 | The simpliciality of empirical datasets PRESENTER: Nicole Eikmeier ABSTRACT. Higher-order networks model group interactions in a complex system and usually are represented with one of two mathematical objects: (i) hypergraphs and (ii) simplicial complexes. Simplicial complexes are hypergraphs with the additional requirement that if an interaction occurs between m individuals, then every sub-interaction must also occur. Many null models of higher-order networks either do not consider the inclusion structure of group interactions or explicitly assume a simplicial complex. In contrast, empirical datasets do not often exist at either of these extremes. We define the simpliciality as a measure of where a dataset lies on this spectrum, as well as several concrete ways to quantify the simpliciality of a dataset. We show that existing null models are inadequate for generating datasets with the same inclusion structure as empirical datasets. In addition, we show that the location of the missing subfaces is extremely important for quantifying the simpliciality of a dataset. We introduce null models aimed at bridging the gap between hypergraph null models and null models of simplicial complexes by specifying the inclusion structure. |
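To make the inclusion requirement concrete, here is one plausible way to place a dataset on the hypergraph-to-simplicial-complex spectrum: the fraction of hyperedges whose sub-interactions are all present. This is an illustrative sketch only; the function name `simplicial_fraction`, the choice to ignore sub-interactions smaller than `min_size`, and the toy data are assumptions, and the measure is not claimed to match any of the quantities defined by the authors.

```python
from itertools import combinations


def simplicial_fraction(hyperedges, min_size=2):
    """Fraction of hyperedges larger than `min_size` whose every
    sub-interaction of size >= `min_size` is also a hyperedge."""
    edge_set = {frozenset(e) for e in hyperedges}
    candidates = [e for e in edge_set if len(e) > min_size]
    if not candidates:
        return None  # undefined when no hyperedge has proper sub-interactions

    def is_simplex(e):
        return all(
            frozenset(sub) in edge_set
            for r in range(min_size, len(e))
            for sub in combinations(e, r)
        )

    return sum(is_simplex(e) for e in candidates) / len(candidates)


# Hypothetical toy data: {1,2,3} has all its pairs, {3,4,5} is missing two.
toy = [{1, 2, 3}, {1, 2}, {1, 3}, {2, 3}, {3, 4, 5}, {3, 4}]
frac = simplicial_fraction(toy)
```

A value of 1 would correspond to a simplicial complex (above `min_size`), while lower values indicate missing subfaces. The enumeration grows exponentially with hyperedge size, so this form is only practical for small groups.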
16:00 | Simplicially driven simple contagion PRESENTER: Thomas Robiglio ABSTRACT. Single contagion processes are known to display a continuous transition from an epidemic-free state to an epidemic one, for contagion rates above a critical threshold. This transition can become discontinuous when two simple contagion processes are coupled in a bi-directional symmetric way. However, in many cases, the coupling is not symmetric and the nature of the processes can differ. For example, risky social behaviors—such as not wearing masks or engaging in large gatherings—can affect the spread of a disease, and their adoption dynamics via social reinforcement mechanisms are better described by complex contagion models rather than by simple contagions, more appropriate for disease spreading. Here, we consider a simplicial contagion (describing the adoption of a behavior) that uni-directionally drives a simple contagion (describing a disease propagation). We show, both analytically and numerically, that, above a critical driving strength, such a driven simple contagion can exhibit both discontinuous transitions and bi-stability, absent otherwise. Our results provide a novel route for a simple contagion process to display the phenomenology of a higher-order contagion, through a driving mechanism that may be hidden or unobservable in practical instances. |
16:15 | Higher-order interactions reveal complex memory in temporal networks PRESENTER: Luca Gallo ABSTRACT. Temporal networks are a widely used tool to model how interactions among the elements of a complex system evolve in time. So far, most of the studies on the temporal organization of complex systems have been limited to pairwise interactions only. However, real-world systems are often characterized by higher-order interactions, which cannot be captured by a network description. Though recent research has considered temporal properties of higher-order networks, a detailed characterization of their temporal structure is still lacking. Here, we analyze the temporal correlations existing in systems with higher-order interactions. We define a set of measures to quantify correlations through a variety of dimensions. We study how much interactions of a given order are related to interactions of the same order, i.e., intra-order correlations. Also, we examine whether interactions of a given order are related to interactions of a different order, i.e., cross-order correlations. We consider different social systems, for which we have high-resolution data about their temporal evolution. We observe a variety of temporal patterns, from periodicity to decaying autocorrelations. Regarding this latter case, we find that intra-order correlation decreases for every order of interaction, with a power-law decay followed by an abrupt loss of correlation at typical timescales, highlighting a hierarchy across different orders. We observe that interactions of higher order tend to have a lower value of correlation, and that higher orders of interaction remain correlated for less time compared to lower order interactions. We also find that different orders of interactions display a non-trivial temporal organization, with some orders of interaction anticipating the others. This suggests the existence of an asymmetry between the mechanisms of formation and dissolution of groups. 
We investigate the mechanisms shaping temporal correlations in social systems by introducing a minimal model of temporal higher-order networks with memory, i.e., the cDARH model. In such a model, each hyperedge updates its state either by copying from its own past, i.e., self memory, or from that of a hyperedge of a different order, i.e., cross-order memory, or randomly. Our model reveals that memory can drive the emergence of temporal correlations, with different orders of interactions possessing different degrees of memory (panel A of Fig.1). Also, the existence of cross-order memory is revealed to be the fundamental factor underlying the emergence of a gap in the cross-order correlations (panel B of Fig.1). Our work is the first to study non-Markovianity and memory in systems characterized by higher-order interactions. It sheds light on various processes underlying social interactions, including periodicity, aggregation and schisming, paving the way to further analyses of this subject. |
16:30 | Percolation and topological properties of temporal higher-order networks PRESENTER: Leonardo Di Gaetano ABSTRACT. Many time-varying networks are characterized by non-pairwise interactions, whose dynamics cannot be captured by traditional graph models. Here we introduce a hidden variables formalism to analytically characterize higher-order networks. We apply our framework to a higher-order activity-driven model, providing analytical expressions for the main topological properties of the time-integrated hypergraphs, depending on the integration time and the distribution of activity potential characterizing the model. We provide analytical estimates for the percolation times of general classes of uncorrelated and correlated hypergraphs, showing on empirical data that neglecting the higher-order nature of interactions leads to systematically underestimating the percolation threshold. Our work contributes to a deeper understanding of the interplay between group dynamics and the dynamical processes unfolding over them. |
14:15 | Embedding directed networks into hyperbolic spaces PRESENTER: Gergely Palla ABSTRACT. Hyperbolic networks offer a natural approach for modelling scale-free, highly clustered and small-world networks, and can often display a pronounced community structure as well. In the meantime, there is growing evidence for the existence of hidden geometric spaces behind the structure of real networks as well. Inspired by that, a number of embedding methods were developed over the years that are capable of providing an optimal arrangement of the nodes in some representation of the hyperbolic space solely based on the network topology. Although these algorithms proved to be efficient from several aspects, most of them were designed for handling undirected networks, while embedding approaches that can take into account the possible directionality of the links are still in their infancy. Motivated by that, here we introduce a family of hyperbolic embedding methods specifically for handling directed input networks. Since the research on the embedding of directed networks into Euclidean spaces is already in a more matured phase, a natural idea is to define a general, model-independent conversion from Euclidean to hyperbolic node coordinates, and use it on the output of well-established Euclidean embedding methods. Our suggestion for such conversion is to preserve the angular coordinates and transform the radial coordinates based on the idea that if the Euclidean and the hyperbolic radial arrangements are converted to the same space, then the resulting node arrangements must be easily reconcilable to each other. 
According to that, we use the linearly expanding half-line as the pass-through between the polynomially expanding Euclidean and the exponentially expanding hyperbolic spaces, and assume that the node arrangements obtained on the half-line from the Euclidean and the hyperbolic radial coordinates reflect the same radial attractivity of any node compared to the highest one. This way we can obtain a conversion similar to the one used previously in the literature, but without the need to assume that the network was generated by any specific hyperbolic network model. We first tested this approach both on the original and two slightly modified versions of the High-Order Proximity preserved Embedding (HOPE). The central idea of the HOPE method is to apply singular value decomposition on a proximity matrix between the nodes formed from the Katz indexes, whereas our modified versions either shift the mean of the proximity matrix elements to 0 before the dimension reduction (HOPE-S) or treat the first dimension of the embedding to be redundant and use the singular values from the second to the d+1th one to create a d-dimensional embedding (HOPE-R). These modifications are intended to lift the inherent angular restriction of the point cloud in the embedding obtained with HOPE and achieve a more circular arrangement of the nodes. In parallel, we propose an alternative Euclidean embedding method as well, relying on the dimension reduction of a proximity matrix of exponentiated shortest path lengths between the nodes (abbreviated as TREXPEN for TRansformation of EXponential shortest Path lengths to EuclideaN measure). Naturally, the output of TREXPEN can also be transformed into a hyperbolic embedding using the suggested model-independent conversion method. 
On top of these, we also developed an embedding based on a dimension reduction technique applied to a distance matrix of directed topological distances that maps the nodes directly into the hyperbolic space, without the need for any conversion from Euclidean to hyperbolic coordinates (abbreviated as TREXPIC for TRansformation of EXponential shortest Path lengths to hyperbolIC measure). |
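The HOPE step described in the abstract above (singular value decomposition of a Katz proximity matrix) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `hope_embedding`, the default choice of the decay parameter `beta`, and the `variant` flag are assumptions, and the "hope-s" and "hope-r" branches are only one reading of the one-sentence descriptions of HOPE-S and HOPE-R given in the abstract. The dense matrix solve is used for clarity and would not scale to large networks.

```python
import numpy as np

def hope_embedding(adj, dim, beta=None, variant="hope"):
    """Illustrative HOPE-style embedding of a directed graph.

    adj[i, j] = 1 means a directed edge i -> j.
    Returns (source vectors, target vectors, proximity matrix used).
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    if beta is None:
        # Assumption: pick beta safely below 1 / spectral radius,
        # which is what the Katz series needs in order to converge.
        rho = np.max(np.abs(np.linalg.eigvals(adj)))
        beta = 0.5 / rho if rho > 0 else 0.5
    # Katz proximity: sum_{k>=1} (beta*A)^k = (I - beta*A)^{-1} (beta*A)
    prox = np.linalg.solve(np.eye(n) - beta * adj, beta * adj)
    if variant == "hope-s":
        # Reading of HOPE-S: shift the mean of the matrix elements to 0.
        prox = prox - prox.mean()
    u, s, vt = np.linalg.svd(prox)
    # Reading of HOPE-R: skip the first singular direction.
    start = 1 if variant == "hope-r" else 0
    keep = slice(start, start + dim)
    root = np.sqrt(s[keep])
    source = u[:, keep] * root      # roles of nodes as link sources
    target = vt[keep, :].T * root   # roles of nodes as link targets
    return source, target, prox

if __name__ == "__main__":
    A = np.array([[0, 1, 1, 0],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1],
                  [1, 0, 0, 0]])
    src, tgt, prox = hope_embedding(A, dim=2)
    print(src.shape, tgt.shape)
```

Each node receives two vectors, one for its role as a source and one for its role as a target, so that the inner product `source[i] @ target[j]` approximates the directed proximity from i to j. The Euclidean-to-hyperbolic conversion of the radial coordinates is not reproduced here because the abstract does not give its formula.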
14:30 | D-Mercator: multidimensional hyperbolic embedding of real networks PRESENTER: Robert Jankowski ABSTRACT. One of the pillars of the geometric approach to networks has been the development of model-based mapping tools that embed network topologies in their latent geometry. In particular, Mercator embeds real networks into the hyperbolic plane. However, some real networks are better described by the multidimensional formulation of the underlying geometric model. Here, we introduce D-Mercator, an embedding method that goes beyond Mercator to produce multidimensional maps of real networks into the $(D+1)$-dimensional hyperbolic space, where the similarity dimension is represented on a $D$-sphere. We evaluated the quality of the embeddings using synthetic $\mathbb{S}^D$ networks. We also produced multidimensional hyperbolic maps of real networks that provide more informative descriptions than their two-dimensional counterparts and reproduce their structure more faithfully. Having multidimensional representations will help to reveal the correlation of the dimensions identified with factors known to determine connectivity in real systems and to address fundamental issues that hinge on dimensionality, such as universality in critical behavior. D-Mercator also allows us to estimate the intrinsic dimensionality of real networks in terms of navigability and community structure, in good agreement with embedding-free estimations. |
14:45 | Dynamics of random hyperbolic graphs PRESENTER: Fragkiskos Papadopoulos ABSTRACT. Random hyperbolic graphs (RHGs) have been shown to be adequate models of real-world complex networks, as they naturally and simultaneously possess many of their common structural characteristics. However, existing work on RHGs has mainly focused on structural properties of network snapshots, i.e., of static graphs, while little is known about the dynamical properties of RHGs. In this talk, we will consider the simplest possible model of dynamic RHGs in the cold regime (network temperature $T < 1$) and derive its most basic dynamical properties, namely the distributions of contact and intercontact durations. These distributions decay as power laws in the model with exponents that depend only on the network temperature $T$ and are consistent with (inter)contact distributions observed in some real systems, cf. Fig. \ref{figAll}. Interestingly, these results hold irrespective of the nodes' expected degrees, suggesting that broad (inter)contact distributions in real systems are due to node similarities, instead of popularities. We will also see that several other properties, such as weight and strength distributions, group size distributions, abundance of recurrent components, etc., are also consistent with real systems, justifying why epidemic and rumour spreading processes perform remarkably similarly in real and modelled networks \cite{Papadopoulos2019}. Furthermore, we will discuss a recent generalization of the model that incorporates link persistence \cite{Papadopoulos2022cold}, as well as results from dynamic RHGs in the hot regime (network temperature $T > 1$) \cite{Papadopoulos2022hot}. In the hot regime, the intercontact distribution is nonnormalizable, meaning that hot RHGs (including the configuration model that emerges for $T \to \infty$) cannot be used as null models for real temporal networks, in stark contrast to cold RHGs. 
We will conclude with future work directions. |
15:00 | Inherent uncertainty of hyperbolic embeddings of complex networks PRESENTER: Simon Lizotte ABSTRACT. Network geometry is a versatile yet simple framework that captures several of the observed properties of empirical networks, such as their non-vanishing clustering, sparsity, and power-law degree distribution. This accurate description is achieved by placing vertices in a metric space (usually hyperbolic) and by connecting them according to their proximity in that space. To use this framework to describe observed networks, one must find vertex coordinates that best reproduce the observed topology, a difficult task generally solved with a mixture of greedy algorithms, simplifying heuristics and machine learning techniques. By approaching the embedding task from a Bayesian perspective, we quantify the uncertainty of the inferred positions and parameters, thereby going beyond the point-wise estimates obtained with current state-of-the-art embedding techniques. We make use of recent advances in probabilistic programming and implement a Hamiltonian Monte Carlo algorithm to estimate a complete posterior distribution for the embedding. In so doing, we overcome several technical challenges: 1) we introduce a continuous approximation of the likelihood gradients to manage discontinuities caused by vanishing angular distances; 2) we make use of standardization techniques to handle graph and space symmetries and align samples; and 3) we explore the influence of initial conditions. With these techniques, we uncover genuine multimodality in the posterior distributions that cannot be explained by algorithmic issues or structural or spatial symmetries. As such, our work highlights the irreducible uncertainty inherent to the hyperbolic embedding task, thereby paving the way for more comprehensive and accurate embedding algorithms of empirical networks in hyperbolic space. |
15:15 | Hidden Geometry Reveals Multiscale Organization of Real Multiplex Networks PRESENTER: Gangmin Son ABSTRACT. Multiplex networks raise an issue of interlayer comparison for characterizing their structure. In this context, it has been revealed that real multiplex networks have hidden geometric correlations (GCs) [1]. Specifically, since each layer in a multiplex network can be embedded in a hidden metric space, the GC is derived from the correlation between the inferred coordinates. Meanwhile, the geometric approach also provides a natural framework for the multiscale unfolding of complex networks, since the hidden geometry provides a rich reservoir of distances in contrast to the geometry based on the shortest paths between nodes [2]. The multiscale unfolding provides downscaled replicas of single-layer networks, which has led to both theoretical and practical advances such as multiscale greedy routing. However, little is known about its extension to multiplex networks. Here, we extend the multiscale unfolding to multiplex networks to explore the multiscale organization of real multiplex networks. We introduce a new concept, the GC spectrum of a multiplex network, which can be obtained by measuring GCs for its downscale replicas. Interestingly, real multiplex networks exhibit non-monotonic patterns in their GC spectra, while an existing model for GC, called the geometric multiplex model (GMM), predicts monotonically increasing ones. We suppose that the intriguing behaviors are reproduced in multiplex networks where their microscopic geometric organization is preserved across layers and their macroscopic one does not. Our model, the multiscale GMM, validates the scenario. Furthermore, we investigate how the multiscale organization affects cascading dynamics in multiplex networks. Our results may enrich the understanding of the interplay between hidden geometry and dynamics in multiplex networks. |
15:30 | The geometry of constellation line figures ABSTRACT. In past and current indigenous astronomies across the world, groups of stars in the night sky formed constellations. One way to represent a constellation is the line figure: a spatial graph on the fixed background of stars, whose links are geodesics (arcs) between pairs of stars. The spatial graph need not be spatially planar, nor connected. Constellation data was digitised by the author; the dataset currently contains 1893 constellation line figures from 71 world cultures. From recent work (PLOS ONE 17(7): e0272270 2022), we know that certain sky regions (with their distinct star patterns) produce a high diversity of line figures across cultures, while other sky regions do not. The latter holds for the regions of the Western Scorpius and Corona Borealis, whose graph shapes appear universal, and thus predictable. The question that naturally follows is: what are the causal factors behind the line constructions, and to what is this universality due? Research question. We ask which factors explain the geometry of the line figures, and to what extent the line figure can be predicted if the star pattern is given. A large set of factors is in play: endogenous factors (features pertaining to the sky pattern, such as the aspect ratio of the star group taken as a point cloud), but also exogenous factors (features pertaining to the cultural imagination, such as the type of culture, or the semantics of the constellation: a humanoid figure is generated differently than a bird figure). We answer this question by analysing how close generative models for spatial graphs are to the empirical line figure (and when this match is statistically significant). Many deterministic models are relevant: from concave hulls to alpha shapes (including minimum spanning trees and convex hulls) to k-nearest neighbour graphs, triangulations, but also extensions which model specific geometric shapes (such as lassos). 
They embody Gestalt principles, which have been claimed to guide the perception of star patterns: neighbouring stars are grouped preferentially, and an economical structure is preferred. Descriptive machine-learning models are trained to make this prediction, and the patterns they find are explained. Results. We find that the majority of line figures are correctly reproduced over a given star pattern by one or another spatial graph generative model. The exact percentage varies greatly with the culture: line figures from oral, folk traditions (such as those of landlocked tribal cultures) most often match one or more deterministic models. A star pattern with a skewed aspect ratio (the point cloud is much longer than it is wide) preferentially leads to minimum spanning trees (including long lines such as the body of the Western Scorpius). The relative neighbourhood graph models line figures most accurately when drawing the convex hull over the group of stars leaves few unconnected stars, but the aspect ratio is still skewed (if the latter condition is not fulfilled, a concave or convex hull matches the line figure better). Cultural factors also play a role, such as the phylogeny of the culture (particularly Chinese versus Mesopotamian) and the semantics of the line figure (large humanoid figures are hardest to predict). |
15:45 | Towards a Better Understanding of the Characteristics of Fractal Networks PRESENTER: Marcell Nagy ABSTRACT. The fractal nature of complex networks has received a great deal of research interest in the last two decades. Similarly to geometric fractals, the fractality of networks can also be defined with the so-called box-covering method. A network is called fractal if the minimum number of boxes needed to cover the entire network follows a power-law relation with the size of the boxes. The fractality of networks has been associated with various network properties throughout the years, for example, disassortativity, repulsion between hubs, long-range-repulsive correlation, small edge betweenness centralities, non-small-worldness, and weak correlation between degree and betweenness centrality. However, these assertions are usually based on tailor-made network models and on a small number of real networks, hence their ubiquity is often disputed. Since fractal networks have been shown to have important properties, such as robustness against intentional attacks, there is a pressing need to uncover the underlying mechanisms causing fractality. Hence, the main goal of our work is to get a better understanding of the origins of fractality in complex networks. To this end, we systematically review the previous results on the relationship between various network characteristics and fractality. Moreover, we perform a comprehensive analysis of these relations on five network models and a large number of real-world networks originating from six domains. We clarify which characteristics are universally present in fractal networks and which features are just artifacts or coincidences. |
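The box-covering idea mentioned in the abstract above can be illustrated with a simple greedy heuristic. This is a sketch under stated assumptions, not the procedure used by the authors: finding the true minimum number of boxes is computationally hard, so the code below grows balls of a given hop radius around uncovered nodes in a fixed order and therefore only gives an upper bound on the minimum box count. The function names `ball` and `count_boxes` are hypothetical. A ball of radius r has diameter at most 2r, so it corresponds to a box of size 2r + 1 in the convention where all distances inside a box are smaller than the box size.

```python
from collections import deque

def ball(graph, center, radius):
    """Return the set of nodes within `radius` hops of `center` (BFS)."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand beyond the requested radius
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return set(dist)

def count_boxes(graph, radius):
    """Greedy cover: each still-uncovered node, taken in sorted order,
    becomes a box centre and claims all uncovered nodes in its ball."""
    uncovered = set(graph)
    boxes = 0
    for node in sorted(graph):
        if node in uncovered:
            uncovered -= ball(graph, node, radius)
            boxes += 1
    return boxes

if __name__ == "__main__":
    # Path graph on 9 nodes as an adjacency-list dictionary.
    path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 9] for i in range(9)}
    for r in (0, 1, 2, 8):
        print(r, count_boxes(path, r))
```

For a fractal network, plotting the logarithm of the box count against the logarithm of the box size would give an approximately straight line whose negative slope estimates the box dimension. On the path graph used here the count simply falls roughly in inverse proportion to the box size.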
16:00 | Spacetime Random Geometric Graph Neural Networks PRESENTER: Chester Tan ABSTRACT. Causal set theory is a proposal for quantum gravity, which models spacetime as a discrete partially ordered set of spacetime atoms \cite{glaser2014closed}. A causal set can be generated by sprinkling points in spacetime, and connecting points if they lie within each other's light cones, much like random geometric graphs. Random geometric graphs have recently been shown to be a powerful and interesting way to model graph neural networks \cite{paolino2022unveiling}: instead of approximating a known continuous Laplacian with a graph Laplacian, a graph Laplacian is learnt from an observed graph that is guaranteed to approximate an unknown continuous Laplacian of the metric-probability sampling space that generates the random geometric graph. This study builds on previous work presented at NetSci2020 on spacetime random geometric graphs by Chester Tan and Tim S. Evans (https://easychair.org/smart-program/NETSCI2020/2020-09-24.html#talk:156732). It investigates how spacetime random geometric graphs can be used in place of Riemannian random geometric graphs in the framework of modelling graph neural networks with random geometric graphs, to model graph neural networks of temporal networks and other directed acyclic graphs that have a partial order structure \cite{badie2022directed,clough2017embedding}. The causal discrete set d'Alembertian, which has been shown to recover the continuous d'Alembertian, is investigated as the graph Laplacian in this framework and solves problems of non-locality and infinite valences in spacetime random geometric graphs. This study bridges causal set theory, temporal networks, and graph neural networks, and we hope it will be of interest to a broad range of communities including network science, physics, and geometric deep learning. |
16:15 | Embedding networks into hyperbolic spaces by greedy routing optimisation PRESENTER: Bendegúz Sulyok ABSTRACT. Finding the optimal embedding of networks into low-dimensional hyperbolic spaces is a challenge that has received considerable interest in recent years, with several different approaches proposed in the literature. In general, these methods take advantage of the exponentially growing volume of the hyperbolic space as a function of the radius from the origin, allowing a (roughly) uniform spatial distribution of the nodes even for scale-free small-world networks, where the connection probability between pairs decays with hyperbolic distance. One of the motivations behind hyperbolic embedding is that optimal placement of the nodes in a hyperbolic space is widely thought to enable efficient navigation on top of the network. Accordingly, one of the measures used to quantify the quality of different embeddings is the greedy routing score, intended to measure the efficiency of a simple navigation protocol based on the hyperbolic coordinates. In the present work, we develop an optimisation scheme for this score in the native disk representation of the hyperbolic space. This optimisation algorithm can either be used as an embedding method on its own, or it can be applied to improve this score for embeddings obtained from other methods. According to our tests on synthetic and real networks, the proposed optimisation can considerably enhance the greedy routing score in several cases, improving the given embedding from the point of view of navigability. |
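The navigation protocol referred to in the abstract above can be sketched as follows. This is an illustration under assumptions, not the authors' optimisation scheme: nodes carry polar coordinates (r, theta) in the native disk with curvature -1, a packet is always forwarded to the neighbour closest to the target in hyperbolic distance, and routing is declared failed when the packet would revisit a node. The quantity computed by `greedy_success_rate` is the plain fraction of successful source-target pairs; the exact definition of the greedy routing score used in the talk may differ (for instance by weighting paths by their length). All function names are hypothetical.

```python
import math
from itertools import permutations

def hyp_dist(p, q):
    """Hyperbolic distance between points given as (r, theta) in the
    native disk: cosh d = cosh r1 cosh r2 - sinh r1 sinh r2 cos(dtheta)."""
    (r1, t1), (r2, t2) = p, q
    arg = (math.cosh(r1) * math.cosh(r2)
           - math.sinh(r1) * math.sinh(r2) * math.cos(t1 - t2))
    return math.acosh(max(arg, 1.0))  # clamp guards against rounding below 1

def greedy_route(graph, coords, source, target):
    """Return the hop count of the greedy path, or None if it fails."""
    current, hops, visited = source, 0, {source}
    while current != target:
        nxt = min(graph[current],
                  key=lambda v: hyp_dist(coords[v], coords[target]),
                  default=None)
        if nxt is None or nxt in visited:
            return None  # dead end or loop: the packet is dropped
        visited.add(nxt)
        current, hops = nxt, hops + 1
    return hops

def greedy_success_rate(graph, coords):
    """Fraction of ordered node pairs for which greedy routing succeeds."""
    pairs = list(permutations(graph, 2))
    ok = sum(greedy_route(graph, coords, s, t) is not None for s, t in pairs)
    return ok / len(pairs)

if __name__ == "__main__":
    # Star graph: hub at the origin, three leaves at radius 2.
    star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
    pos = {0: (0.0, 0.0), 1: (2.0, 0.0),
           2: (2.0, 2 * math.pi / 3), 3: (2.0, 4 * math.pi / 3)}
    print(greedy_success_rate(star, pos))
```

An optimisation of the kind described in the abstract would adjust the coordinates so that fewer packets get stuck; the sketch only evaluates a given set of coordinates.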
16:30 | Dynamic geometries and clonal interference create heterogeneous adaption patterns on networks PRESENTER: Adrian Zachariae ABSTRACT. In the strong-selection weak-mutation regime, rare mutations accumulate sequentially and the adaption rate depends linearly on mutation rate and population size. However, for asexual populations, if population size and mutation rates are large enough, the phenomenon known as clonal interference (CI) will decrease the effective rate of adaption. CI occurs when mutation events are so frequent that new mutations can occur before previous mutations have fixated. This phenomenon has been extensively studied in the context of meta-populations and lattice-like population structures. From these studies we know that CI is stronger and adaption slower in meta-populations with slow migration rates and if lattice-like structures are not broken up by long-range links. On a uniform population structure, positive mutations spread in wave-like patterns, as already noted by Fisher in 1937, while on a complex network structure this simple pattern will mostly vanish. However, process-derived quasi-metrics and geometric properties have been very successful in predicting epidemic spreading and similar phenomena, defining a notion of distance that restores wave-like propagation on complex systems. With a strong dependence on network properties, it is unsurprising that heterogeneous networks can have varying adaption speeds in different parts of the network, depending on local or meso-scale properties of their neighborhood. Because of the inherent non-linearity of CI, we find patterns that depend on the mutation rate not only in strength but also through multiple distinct regimes. Within each regime, different parts of the network can become the dominant source of new fixating strains. 
Here, we want to use the toolbox of the dynamic geometry of networks to provide a better understanding of clonal interference in networks. |
14:15 | Economic complexity at the company level PRESENTER: Andrea Zaccaria ABSTRACT. The Economic Complexity (EC) [1,2] approach consists in applying tools from statistical mechanics, complex networks, and machine learning to economics, with special attention to comparing theoretical results with empirical evidence. The starting point is usually a bipartite country-product network or, more in general, a network connecting economic actors with activities. EC provides two key concepts. The first is a measure of the competitiveness of a country in terms of the complexity of its capabilities, for instance, ECI [1] or the fitness measure [2]. The second one is relatedness [3], which measures the similarity between two economic sectors and the feasibility of entering into a new economic activity for an economic actor (e.g., a country starting to export a product). Both can provide powerful quantitative tools to investigate present economies and design future strategies of development, as proven by their adoption by both the World Bank and the European Commission. Usually, the data considered for the EC investigations are at the country level: for instance, international trade, reshaped as a bipartite country-product network. It is well-known that this network exhibits a nested structure, which is the foundation of different algorithms that have been used to investigate countries' development and forecast national economic growth. In this talk, I critically review the use of this kind of data by comparing it to a unique database provided by the Italian Institute of Statistics (ISTAT). The ISTAT database connects more than 200000 Italian companies to the products they export. Changing the subject from countries to companies, a significantly different scenario emerges. While a globally nested structure is observed at the country level, a local, in-block nested structure emerges at the level of firms (fig. a) [4]. 
Remarkably, this in-block nestedness is statistically significant with respect to suitable null models and the algorithmic partitions of products into blocks have a high correspondence with exogenous product classifications. These findings lay a solid foundation for developing a scientific approach to investigating companies, and possibly computing the fitness of a company. Traditionally, relatedness is also measured using complex network approaches derived from country-level co-occurrences. I compare complex networks and machine learning algorithms trained on both country and firm-level data. In order to quantitatively compare the different measures of relatedness, I use them to predict future exports at the country and firm level, assuming that more related products have a higher likelihood of being exported in the near future. Our results show that relatedness is scale-dependent: the best assessments are obtained by using machine learning on the same typology of data one wants to predict [5]. Moreover, while relatedness measures based on country data are not suitable for firms, firm-level data are also quite informative for predicting the development of countries. In this sense, models built on firm data provide a better assessment of relatedness with respect to country-level data. I also discuss the effect of using community detection algorithms and parameter optimization, finding that a partition into a higher number of blocks decreases the computational time while maintaining a prediction performance that is well above the network-based benchmarks. In conclusion, I show the importance of considering data at the company level when applying complexity and network-based concepts to economics. ------------- [1] Hidalgo, C. A., and Hausmann, R. PNAS 106, no. 26 (2009): 10570-10575. [2] Tacchella, A., et al. Scientific reports 2, 1 (2012): 1-7. [3] Hidalgo, C. A., et al. In International conference on complex systems, pp. 451-457. Springer, Cham, 2018. [4] Laudati, D., et al. 
arXiv:2202.01804 (2022). [5] Albora, G. and Zaccaria, A. arXiv:2202.00458 (2022). |
14:30 | Resolving the complexity puzzle: positions in global value chains contribute to explaining economic development PRESENTER: Tamás Sebestyén ABSTRACT. Recent studies have revealed that economic complexity correlates with income level, and the deviation from this correlation predicts future growth (Hidalgo and Hausmann, 2009). If a country is able to produce and export more varieties of unique goods, it is said to have a more complex economy. In the background, more complex economies must have special skills, capabilities, and knowledge which are the main driving factors of growth. The ranking of the Economic Complexity Index (ECI) shows that some transitioning countries in Central-Eastern Europe (e.g. Czechia, Hungary, Slovenia, or Slovakia) have more complex economies than some more developed European countries (e.g. Denmark, France, Netherlands, or Norway). This would point to more skills and knowledge in these CEE countries to produce complex goods and services that predict better growth potential. We call this phenomenon the "complexity puzzle", and provide a potential explanation of it in this study. An important intuitive observation in this respect is that countries in Central-Eastern Europe occupy a specific position within global value chains (GVCs) (Nölke and Vliegenthart, 2009; Rugraff, 2006). While they are specialized in complex industries in their export portfolio (automotives, electronics), their import portfolio shows similar patterns, resulting in remarkable records on the export side in their complexity ranking, with a rather limited domestic value-added content. Indeed, the structure of the global value chains has gained importance in the global economy. It can determine the intensity of shock propagation between countries and reveal the countries' role in global production systems, which can also be associated with their level of development and welfare (Timmer et al. 2019; Meng et al., 2020). 
In this study, we focus on the countries' role, in other words, countries' position in GVCs and use it as a complementary explanatory factor to growth in addition to economic complexity. We determine GVC position in two different ways. First, we take into account the value of imported inputs and measure export values through domestic value-added content. Second, we also take into consideration how far the production phase of a country in the GVCs is from the final product. With this measure of 'upstreamness', we can assess what effect the countries' exports have on the production chain. The empirical results of the study first show that the complexity indices calculated on the basis of domestic value-added content of export rather than gross export numbers fit better in terms of correlation with development. The reason behind this result is that, by filtering out imported inputs, we can quantify more accurately the skills and knowledge of the countries that provide the background of export activities. In addition, our results reveal that countries' level of 'upstreamness' in GVCs correlates with their level of development, especially in the case of service sectors rather than manufacturing. This provides a convincing resolution to the 'complexity puzzle', pointing out that while export complexity is a good first approximation of the economic capabilities of a country, controlling for its position within GVCs can further refine this picture. Our results are summarized in Figure 1. Panel a) (see attached file) shows that more developed European countries have a much stronger (central) position in GVCs in service sectors compared to CEE countries. Panel b) on the other hand suggests an inverted pattern with respect to manufacturing. This shows that CEE countries made the effort to reach a strong position in manufacturing but this development path offers only constrained opportunities for growth. 
Their avenue for further growth lies in strengthening their position within services as well. |
14:45 | Modular network comparison of inter-industry labour mobility networks reveals differences in industrial clusters across six European countries PRESENTER: Neave O'Clery ABSTRACT. There is an emerging consensus in the literature that locally embedded capabilities and industrial know-how are key determinants of growth and diversification processes. In order to model these dynamics as a branching process, whereby industries grow as a function of the availability of related or relevant skills, industry networks are typically employed. These networks describe the complex structure of the capability or skill overlap between industry pairs, measured here via inter-industry labour flows. We argue that communities in these networks represent industrial clusters in which workers circulate and diffuse knowledge. In this paper we seek to understand the extent to which countries share a common industrial cluster structure. In order to do this we collect inter-industry labour mobility networks for six European countries and seek to compare their modular structure. However, there exists a relative lack of methods to compare network topology beyond simple set-based metrics which neglect the underlying topology. Here we introduce two new methods to compare the modular structure of networks, one ‘global’ (full partition) and one ‘local’ (per community). We find that, for example, Irish industrial clusters are similar to the UK for complex services but similar to Germany and the Netherlands for hospitality, agriculture, manufacturing and food. |
15:00 | Emerging Labour Flow Networks PRESENTER: Kathyrn Fair ABSTRACT. Constructing labour flow networks (LFNs) allows us to explore labour mobility dynamics. Nodes within a LFN contain pools of jobs that share characteristics (e.g. they belong to the same industry, occupation, and geographical region). Additionally, the weights of flows (edges) between nodes indicate the relative frequency with which people move from a job on one node to a job on the other, reflecting structural constraints caused by labour market frictions. By mathematically modelling employment dynamics on LFNs it is possible to explore the formation of these networked flows. Previously developed LFN models assume that the network structure is static and exogenously generated, supported by evidence that firm-to-firm labour flows tend to persist through time. However, this assumption does not always hold. For example, a substantial shift in the skillsets required to perform certain jobs or in the types of jobs that are available (e.g. due to a shock such as a drastic technological transformation) could drastically alter the weights of flows between nodes. Thus, developing models in which realistic LFNs emerge from fundamental economic behaviour has significant value. We introduce a novel agent-computing model in which the job-switching decisions of individual agents (who are attempting to maximise their utility) result in the endogenous generation of labour flows. Our model, informed by microdata from the United Kingdom (UK), recreates the observed UK LFNs with a high level of accuracy. We validate the model by showing that while the fundamental variables contributing to agents' job-switching decisions (e.g. occupational skill similarities, geographical proximity) cannot explain the observed LFNs, our model generates new information that does. Finally, we use the model to explore how shocks impacting the underlying distributions of jobs and wages alter the topology of the LFN. 
This framework represents a crucial step towards the development of models that can answer questions about the future of work in an ever-changing world. |
15:15 | Synchronization of Regional Development - Influence of Clusters in the Regional Production Network ABSTRACT. Understanding regional, sub-national, and developmental economic processes is of high political interest. There are different strands of economic theories of regional development, neoclassical theories, new growth as well as new trade theory, and theories of evolutionary economics, which can be divided by their respective conclusions. The recent advent of complexity economics, that is, applying complexity thinking to understand processes in the economy, changes the focus when studying phenomena such as regional development. One of these foci lies in the connectedness between entities of a system in networks through clusters and how macro outcomes emerge from their interactions. However, how the connectedness of regions through clusters in production networks influences regional development is unexplored. In this paper, statistically sound signals are derived from the share of trade within clusters to predict synchronization patterns of regional development between German counties. These results contribute to the economic literature on regional development by showing that the connectedness between regions through clusters can enhance the analysis of synchronized developmental patterns which is an aspect largely ignored by usual pairwise correlation analyses. This opens new perspectives on how to conceptualise regional development in order to adapt regional, national, and European policies. Another contribution of this paper extends beyond the scale of regional development. This paper is motivated by concepts derived from understanding the economy as a complex system and thus, serves as a first step towards conceptualising a framework that studies regional development by means of complexity economics. Also, it can serve as a starting point to further study regional development and regional inequalities from a complex systems point of view. 
With whom you are connected helps to predict your developmental outcomes. |
15:30 | Urban Economic Fitness and Complexity from Patent Data PRESENTER: Matteo Straccamore ABSTRACT. This study investigates the relationship between technology, innovation, and economic development in metropolitan areas worldwide. Using patent data from 1980 to 2014, we employ network-based techniques and the Fitness and Complexity framework [2] to analyze patterns of specialization and diversification among metropolitan areas. The Fitness and Complexity framework is a tool that allows us to quantify the power of a metropolitan area to be technologically competitive (Fitness) and how difficult a technology is to implement (Complexity). The metropolitan area level is a particularly interesting scale for the application of the Fitness and Complexity algorithm, since it has been shown that the interplay between specialization and diversification can change at different scales. Through the application of this framework, we find coherent, distinct groups of metropolitan areas that belong to the same geographical area or are similar from an economic point of view (Fig. 1a), using only the information about patents. Our results reveal that technological innovation is a key driver of economic development (Fig. 1b) and that Chinese metropolitan areas exhibit a coherent pattern of diversification (Fig. 1c). We investigate whether this pattern is coordinated at a national level and explore potential causes for this behavior. Additionally, we apply novel methods for bipartite networks to further understand the technological production of metropolitan areas. The results of this study provide a deeper understanding of the dynamics of technological innovation at the metropolitan level and have implications for policy-making and economic development. 
Our findings also open up new research avenues, such as investigating the long-term, all-purpose strategy that China may be implementing for the development of technologies and the definition of production baskets of individual metropolitan areas. |
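As an illustration of the Fitness and Complexity framework mentioned above, the following minimal sketch implements the non-linear iteration commonly used in the Economic Fitness and Complexity literature. The abstract does not spell out the implementation, so the toy binary matrix `M` (rows are metropolitan areas, columns are technologies), the function name, and the iteration count are assumptions made for illustration only.

```python
def fitness_complexity(M, iters=20):
    """Iterate the Fitness-Complexity map on a binary area-by-technology matrix.

    M[c][p] = 1 if area c is active in technology p (hypothetical toy input).
    Returns (fitness per area, complexity per technology), each normalized
    to mean 1 at every step.
    """
    n_c, n_p = len(M), len(M[0])
    F = [1.0] * n_c
    Q = [1.0] * n_p
    for _ in range(iters):
        # Fitness: sum of the complexities of the technologies an area holds.
        F_new = [sum(M[c][p] * Q[p] for p in range(n_p)) for c in range(n_c)]
        # Complexity: penalized when low-fitness areas also hold the technology.
        Q_new = [1.0 / sum(M[c][p] / F[c] for c in range(n_c)) for p in range(n_p)]
        mean_F = sum(F_new) / n_c
        mean_Q = sum(Q_new) / n_p
        F = [f / mean_F for f in F_new]
        Q = [q / mean_Q for q in Q_new]
    return F, Q


# Toy nested matrix: area 0 holds every technology, area 2 only the most common one.
M = [[1, 1, 1],
     [1, 1, 0],
     [1, 0, 0]]
F, Q = fitness_complexity(M)
```

In this toy example the most diversified area receives the highest Fitness, and the technology held only by that area receives the highest Complexity.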
15:45 | A Network Science Approach to Model Algorithmic Competition and Collusion PRESENTER: Luc Rocher ABSTRACT. Algorithms are now playing a central role in digital marketplaces, setting prices and automatically responding in real time to competitors' behavior. The deployment of automated pricing algorithms is scrutinized by economists and regulatory agencies, concerned about its impact on prices and competition. Existing research has so far been limited to cases where all firms use the same algorithm, suggesting that anti-competitive behaviour might spontaneously arise in that setting. Here, we introduce and study a general anti-competitive mechanism, adversarial collusion, where one firm manipulates other sellers that use their own pricing algorithm. We propose a network-based framework to model the strategies of pricing algorithms on iterated 2 and 3-firm markets. In this framework, an attacker learns to endogenize competitors' algorithms and then derive a strategy to artificially increase its profit at the expense of competitors. Facing a drastic loss of profits, competitors will eventually intervene and revise or turn off their pricing algorithm. To disincentivize this intervention, we show that the attacker can instead unilaterally increase both its profits and the profits of competitors. This leads to a collusive outcome with symmetric and supra-competitive profits, sustainable in the long run. Together, our findings highlight the need for policymakers and regulatory agencies to consider adversarial manipulations of algorithmic pricing, which might currently fall outside of the scope of current competition laws. |
16:00 | Measuring the velocity of money PRESENTER: Carolina Mattsson ABSTRACT. Modern payment infrastructure is increasingly digital, and money changing hands now often produces transaction records in real-time. This high-granularity temporal network data opens up tremendous possibilities, including the prospect of characterizing transaction dynamics down to the level of individuals. This work presents a new mathematical and empirical methodology for studying the rate of such dynamics. |
16:15 | Modeling innovation in the cryptocurrency ecosystem PRESENTER: Giordano De Marzo ABSTRACT. Blockchains are among the most relevant emerging technologies of recent times and, according to many, they will have a central role in shaping the future of our society. Since the introduction in 2009 of Bitcoin, the first widely known blockchain system bound to a cryptocurrency, the blockchain ecosystem has experienced huge growth, driven by innovations in both conceptual and algorithmic terms and by the creation of a large number of new cryptocoins. New blockchains and their associated cryptocoins emerge mostly as the result of forking already existing projects. Here, we show that the appearance of new cryptocoins can be well described by a sub-linear power law (Heaps’ law) of the total crypto-market capitalization. At the same time, we propose a model that reproduces well the evolution of the cryptocurrency ecosystem. Our model suggests that each cryptocurrency triggers, on average, the creation of ca. 1.58 novel cryptocoins, a result confirmed by the analysis of the Bitcoin historical forking tree. Moreover, we deduce that the largest cryptocurrency, currently Bitcoin, will comprise around 50% of the whole crypto-market and that this fraction is going to stabilize in the near future, provided that the present fundamental macro-economic conditions do not change radically. |
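A sub-linear power law of the Heaps type, N ∝ M^β with β < 1, can be checked by a straight-line fit in log-log space. The sketch below is only illustrative: the fitting method (ordinary least squares on logarithms), the function name, and the synthetic data are assumptions, not the authors' procedure or data.

```python
import math


def fit_power_law_exponent(xs, ys):
    """Least-squares fit of log(y) = log(a) + beta * log(x); returns (beta, a)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    beta = sxy / sxx
    prefactor = math.exp(mean_y - beta * mean_x)
    return beta, prefactor


# Synthetic example (hypothetical numbers): coin count grows as 2 * cap**0.5.
caps = [10.0 ** k for k in range(1, 7)]
coins = [2.0 * c ** 0.5 for c in caps]
beta, prefactor = fit_power_law_exponent(caps, coins)
```

An exponent below 1 corresponds to the sub-linear growth described in the abstract; on real data, a log-log least-squares fit is a rough first check rather than a rigorous estimator.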
14:15 | Generalized contact matrices for epidemic modelling PRESENTER: Adriana Manna ABSTRACT. Contact matrices have become an integral part of epidemic modelling. They typically capture the stratification of individual contacts across age groups. Indeed, interaction rates between age groups are not homogeneous. Children, for example, tend to interact more with other children, or the elderly, than with young adults. Furthermore, contact matrices might be further stratified by accounting for the context where interactions take place (i.e., home, work, school, community). Disaggregating contacts by age and context allows capturing key features of our interactions, and possible heterogeneities in the burden of diseases across groups of the population [1], [2]. Despite their pivotal role, age and context are far from being the only important variables shaping contact patterns, disease outcomes, and diffusion. Socio-economic status, a generic term linked to different socio-demographic variables such as income, wealth, gender, ethnicity, and/or education among others, is also an extremely important factor. From the influenza pandemics of 1918 and 2009 to the West African Ebola outbreak and the COVID-19 pandemic, lower socio-economic status has been consistently associated with higher rates of infection and death, as well as reduced access to care and reduced ability to comply with non-pharmaceutical interventions (NPIs). Although the importance of socio-economic status in disease transmission dynamics is well recognized, the overwhelming majority of epidemic models neglect this dimension. Often, it is analyzed only retrospectively when looking at models’ outputs (e.g., number of deaths or cases). The roots of this shortcoming can be traced back to the lack of (i) modelling frameworks designed to consider socio-economic status as one, or more, of their structural features [3], and (ii) data describing the stratification of contacts across different socio-economic status dimensions. 
In this work, we tackle the first limitation by developing a general epidemic framework able to accommodate generalized contact matrices stratified according to multiple socio-demographic dimensions. The proposed framework goes from classic contact matrices $C_{ij}$, where individuals are grouped according to their age bracket $i$, to $G_{\mathbf{a},\mathbf{b}}$, where individuals are characterized by their age and $m$ other categorical variables. Hence, $\mathbf{a} = (i, \alpha_1, \alpha_2, \ldots, \alpha_m)$ and $\mathbf{b} = (j, \beta_1, \beta_2, \ldots, \beta_m)$ are index vectors (i.e., tuples) representing individuals' membership of each category (Fig. 1). Consider, for example, stratifying a population according to age, income ($\alpha_1$), and biological gender ($\alpha_2$). In this case, the generalized contact matrix $G_{\mathbf{a},\mathbf{b}}$ would describe the average number of contacts that an individual in age bracket $i$, income $\alpha_1$, and biological gender $\alpha_2$ has with people in age group $j$, income $\beta_1$, and biological gender $\beta_2$ in a given time window. We denote by $K$ the number of age groups and by $V_p$ the number of groups in the $p$-th of the $m$ other dimensions. The resulting generalized matrix $G$ can be naturally described as a multi-dimensional matrix. There are $T = K \prod_{p=1}^{m} V_p$ index vectors, which allows a flattened representation as a square two-dimensional matrix of size $T \times T$. In order to analyze the consequences of integrating such a matrix in mathematical models for infectious disease, we focus on the prototypical Susceptible-Exposed-Infectious-Recovered (SEIR) compartmental models and we derive a closed-form expression for the basic reproductive number $R_0$. We validate our mathematical formulation with numerical simulations of hypothetical scenarios where we model random mixing and assortative contact patterns along the additional dimensions that we are considering. Namely, starting from real-world data on age-stratified contacts of four countries – Hungary, Zimbabwe, Peru and Vietnam – we model different generalized contact matrices in each of them. 
We use such matrices to perform a sensitivity analysis of their spectral radius (ρ), which is the mathematical quantity that determines the value of R0. In addition, we study how the disease spreads among different population sub-groups. Preliminary results show that (i) our mathematical formulation for R0 holds and captures the threshold value of the epidemic phase space, and (ii) taking into account additional socio-economic dimensions not only changes the theoretical value of R0 but also reveals striking differences in epidemic outcomes across population groups. Overall, our work contributes to the literature by bringing socio-economic dimensions to the forefront of epidemic modelling. Tackling this issue is crucial for developing more precise descriptions of epidemics, and thus designing better strategies to contain them. |
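The flattening of a generalized contact matrix and the role of its spectral radius can be sketched as follows. This is a toy construction under stated assumptions, not the authors' model: a hypothetical 2×2 age matrix `C`, one extra dimension with population shares `shares` that do not depend on age, and random mixing defined here as splitting each age-to-age contact rate in proportion to those shares. The spectral radius is estimated by plain power iteration.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def spectral_radius(M, iters=500):
    """Power iteration; adequate for matrices with strictly positive entries."""
    n = len(M)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(M[r][c] * v[c] for c in range(n)) for r in range(n)]
        lam = max(abs(x) for x in w)
        v = [x / lam for x in w]
    return lam


# Hypothetical age contact matrix (K = 2 age groups).
C = [[3.0, 1.0],
     [1.0, 2.0]]
# Hypothetical population shares of V = 2 groups in one extra dimension.
shares = [0.7, 0.3]
# Random mixing (assumption): contacts with age j are split by group share,
# regardless of the contacting individual's own group.
mixing = [shares for _ in shares]
# Flattened T x T matrix with T = K * V; index (i, alpha) maps to i * V + alpha.
G = kron(C, mixing)

rho_C = spectral_radius(C)
rho_G = spectral_radius(G)
```

Under this particular random-mixing assumption the extra stratification leaves the spectral radius unchanged, which makes it a convenient baseline against which assortative variants of `mixing` can be compared.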
14:30 | Monkeypox spread among Parisian venues PRESENTER: Mattia Mazzoli ABSTRACT. The first case of the French monkeypox outbreak was detected on May 18, 2022. By the beginning of June, 72% of French cases had been detected in the Île-de-France region, primarily in Paris, and mainly among men-having-sex-with-men (MSM). Incidence rose rapidly with 440 confirmed cases by the end of June, of which 312 from Île-de-France. Using sexual behavioural data collected in the pre-epidemic period, we studied the early phase of the reported outbreak in Paris and investigated whether attendance to MSM commercial venues (e.g. saunas, backrooms, bars, etc.) fuelled the observed spread. Using the 1,089 respondents of the 2015 PREVAGAY survey in Paris, we built a bipartite network linking respondents to the MSM commercial venues they visited. We projected the data into the space of venues to build a network of venue co-visits, in which venues are nodes and links connect venues if they share co-visitors. Links were weighted by the number of co-visitors. We used outbreak data on the onset date from the post-infection survey answered by cases detected in Paris from the start of the outbreak till mid-July 2022 to determine the date of the first case in each commercial venue visited by cases in the 3 weeks prior to symptoms onset. We then fitted a mathematical model of transmission based on the empirical network of venue co-visits to the observed invasion times using Markov Chain Monte Carlo (MCMC) sampling. We repeated the estimation each time removing certain co-visits features from the network. We found that the original network of venue co-visits best reproduced the observed invasion dynamics among venues. Including information about the number of visits by each individual to each venue did not change the predictive accuracy of the model, suggesting that the existence of a co-visit is sufficient to predict the spread, irrespective of visit frequency. 
Our findings indicate that the pattern of co-visits is a good predictor of spatial transmission among MSM venues in Paris. They underline the need for early intervention at co-visited sites and targeted information to users of co-visited sites to reduce the spread of monkeypox to yet unaffected venues or in the event of resurgence. |
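The projection step described in this abstract, from a bipartite respondent-venue network to a weighted network of venue co-visits, can be sketched as below. The respondent and venue labels are made up for illustration and the function name is hypothetical; the survey data themselves are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def project_to_venues(visits):
    """Project respondent -> venues data onto venues.

    Returns a dict mapping an (alphabetically ordered) venue pair to the
    number of respondents who visited both venues.
    """
    weights = defaultdict(int)
    for venues in visits.values():
        # Each respondent contributes one co-visit to every pair of venues visited.
        for u, v in combinations(sorted(set(venues)), 2):
            weights[(u, v)] += 1
    return dict(weights)


# Hypothetical toy data: three respondents, three venues.
visits = {
    "r1": ["sauna_A", "bar_B"],
    "r2": ["bar_B", "sauna_A", "club_C"],
    "r3": ["club_C"],
}
covisits = project_to_venues(visits)
```

Here `covisits` gives the link weights of the venue network; respondents who visited a single venue contribute no links.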
14:45 | Impact of the human contact network on the selective advantage of emerging viral variants. PRESENTER: Pourya Toranj Simin ABSTRACT. The first years of SARS-CoV-2 spread in the human population have witnessed the emergence of variants of concern (VOCs) with increased transmissibility and/or immune escape (i.e. the ability to reinfect individuals who acquired immunity to the virus). Upon emergence, VOCs were able to grow in frequency, outcompeting the previously circulating resident variants. Their advantage can be quantified by their selection coefficient, i.e. the logarithm of their frequency relative to the resident virus. While the selection coefficient is clearly affected by VOC traits (i.e. advantage in transmission and/or strength of immune escape), a given VOC was found to have different selection coefficients in different populations. Furthermore, when two VOCs (for example, one with transmission advantage and the other with immune escape, as in Alpha and Beta) emerged one after the other, the competition resulted in different outcomes in different places. Population immunity to SARS-CoV-2 plays an important role as it may favour an immune-escaping variant. However, the topology of the host contact network could also be important. Previous studies highlighted the role of the network in variant emergence. However, the complex interplay between variants’ traits and network topology in determining variants’ interaction still presents open questions, as limited work has analysed the transient dynamics of emergence, which are more relevant for SARS-CoV-2 VOCs. Here, we analyse the joint role of the human contact network and population immunity to SARS-CoV-2 on the selection coefficient of an emerging variant. We build a two-layer network model accounting for household and outside-household contacts. 
We assume contacts outside households have a negative binomial degree distribution and we compare different means and variances to mimic the diversity of social mixing scenarios at the time of variant emergence - e.g. due to the level of social restrictions. We then model the emergence of VOCs considering either a single VOC or two VOCs emerging at different moments in time. We run stochastic computer simulations of the spreading dynamics and systematically quantify VOCs’ selection coefficient as a function of the parameters tuning the transmissibility of the resident virus, variants’ traits (transmission advantage and/or immune escape) and population immunity at the time of emergence. Results show that the selection coefficient is highly sensitive to the properties of the host population and that the phase space of the possible dynamical regimes is complex. Contact network parameters play a major role. The selection coefficient of a more transmissible variant decreases with population immunity, while the opposite holds for the immune-escaping variant, as expected. However, the dispersion in the degree distribution of the host contact network hinders the emergence of a more transmissible variant for population immunity above a certain threshold, while favouring an immune-escaping one. Analysis of simulation output reveals the dynamical mechanisms leading to this outcome and shows that hubs may act as super-spreaders or super-blockers depending on the level of immunity and the variant characteristics. These results shed light on how human behavioural heterogeneities and their interaction with the epidemic history mediate variants’ interaction dynamics, thus providing ground knowledge that could help interpret records of emerging VOCs. |
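Comparing negative binomial degree distributions with different means and variances requires translating a target (mean, variance) pair into the usual (r, p) parameters. The sketch below uses the standard parameterization in which k counts failures before the r-th success; the numerical targets and function names are hypothetical and are not taken from the study.

```python
import math


def nb_params(mean, variance):
    """Convert a target mean and variance into negative binomial (r, p).

    Uses mean = r(1 - p)/p and variance = r(1 - p)/p**2, which requires
    variance > mean (overdispersion).
    """
    if variance <= mean:
        raise ValueError("negative binomial requires variance > mean")
    p = mean / variance
    r = mean * mean / (variance - mean)
    return r, p


def nb_pmf(k, r, p):
    """Probability of degree k, computed in log space; r may be non-integer."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)


# Hypothetical target: mean degree 4 with variance 12.
r, p = nb_params(4.0, 12.0)
probs = [nb_pmf(k, r, p) for k in range(2000)]
check_mean = sum(k * q for k, q in enumerate(probs))
check_var = sum((k - check_mean) ** 2 * q for k, q in enumerate(probs))
```

Holding the mean fixed while increasing the variance lowers both `p` and `r`, which fattens the tail of the degree distribution and produces more hubs.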
15:00 | An informational approach to uncover the age group interactions in epidemic spreading from macro analysis PRESENTER: Alberto Aleta ABSTRACT. Identifying causal relationships in time series analysis is essential in many disciplines. One of the most popular techniques to accomplish this task is the information-theoretic measure transfer entropy (TE). TE quantifies the amount of information that the past of a source process $X$ provides about the current value of a target process $Y$, in the context of $Y$'s own past. However, in certain processes, there might be multiple sources interacting with one target. These interactions may be independent, but they can also be synergistic or even redundant. Thus, it is necessary to implement multivariate TE estimations to account for all relevant sources of a target. The challenge is then to define and identify which of those sources are relevant. But in a multivariate setting, the size of the source set can quickly grow, and thus it is necessary to rely on greedy algorithms to estimate multivariate TE. Among the disciplines that can benefit from these approaches, we can find epidemic spreading. Individuals are heterogeneous in both their biological and social characteristics, something that has important implications in terms of the spreading of infectious diseases. For these reasons, theoretical models may distinguish groups of individuals that have some particular characteristics and study their interactions. For instance, one can classify individuals into age groups and use age-contact matrices extracted from surveys to encode the interactions among these groups. However, information collected this way is inherently static and does not consider the complexity of human behavior and how it changes over time. In this work, we use TE to extract the causal network of interactions between age groups from surveillance data. 
In particular, we analyze the incidence time series of each age group in Spain from early 2020 to mid-2022 to infer, for each epidemic wave, the interactions among these groups. Our analysis shows that the network varies across waves, with some age groups being drivers of the spreading in some of them while not in others. These results can be used to understand how human behavior changes and adapts during an ongoing pandemic and, in turn, devise models that can incorporate this information in real-time rather than relying on static - and potentially outdated - pictures of the system. |
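For intuition about transfer entropy, the following sketch computes a plug-in estimate for two discrete time series with a history length of one step. It is a deliberately simplified bivariate version: the multivariate, greedy estimation mentioned in the abstract is not reproduced, and the synthetic binary series are an assumption for illustration.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in transfer entropy (in bits) from source to target, history length 1.

    TE = sum p(y1, y0, x0) * log2[ p(y1 | y0, x0) / p(y1 | y0) ],
    where y1 is the target's next value and y0, x0 are the previous values
    of target and source.
    """
    n = len(target) - 1
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    pairs_yx = Counter(zip(target[:-1], source[:-1]))
    pairs_yy = Counter(zip(target[1:], target[:-1]))
    singles = Counter(target[:-1])
    te = 0.0
    for (y1, y0, x0), count in triples.items():
        p_given_both = count / pairs_yx[(y0, x0)]
        p_given_self = pairs_yy[(y1, y0)] / singles[y0]
        te += (count / n) * math.log2(p_given_both / p_given_self)
    return te


# Synthetic example: y copies x with a delay of one step.
random.seed(0)
x = [random.randint(0, 1) for _ in range(5000)]
y = [0] + x[:-1]
te_x_to_y = transfer_entropy(x, y)
te_y_to_x = transfer_entropy(y, x)
```

The estimate is close to one bit in the driving direction and close to zero in the reverse direction, which is the asymmetry that makes transfer entropy useful for inferring directed interactions. Real incidence time series are continuous and short, so they would need discretization or a different estimator.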
15:15 | The Populated Aotearoa Interaction Network for modelling spread of COVID-19 in Aotearoa New Zealand PRESENTER: Dion O'Neale ABSTRACT. Interaction networks are important for modelling, and understanding, a range of processes; spread of infectious disease being one recent high-profile application. In order to provide contagion modelling advice to decision makers, as part of the government response to COVID-19 in Aotearoa New Zealand, we have used a range of data sources to build the Populated Aotearoa Interaction Network (PAIN). The PAIN is a bipartite network that explicitly represents approximately five million individuals connected to the contexts in which they interact. The network is built as four layers representing dwellings (~1.7M), workplaces (~488K), schools & childcare (~7K) and a “community” layer that captures remaining interactions such as shared public transport, church and sporting events, shopping, and social interactions. Each layer of the PAIN is built from empirical data, much of which is drawn from Statistics New Zealand’s Integrated Data Infrastructure (IDI). The IDI contains linked microdata including individual and dwelling level attributes (Census), employment (wage and salary tax records), and enrolment at educational institutions. In order to accurately represent the important topological features of the network of interactions captured in such linked microdata, without risking re-identification, we have developed a process to convert the microdata into a synthetic population, along with associated correlation data. This synthetic population is then used to construct an ensemble of synthetic interaction networks that can be used with a network-based contagion process in order to simulate various scenarios involving spread of COVID-19, including estimating the effects of a range of interventions. 
The detailed attributes used in the network construction make it well suited to including equity-relevant effects in contagion modelling, such as ability to work from home or household size. |
15:30 | The interplay between social inequalities, human contact patterns and the spread of epidemic processes PRESENTER: Marton Karsai ABSTRACT. Human contact patterns represent the routes of infectious disease spreading by shaping the underlying transmission chain among susceptible individuals. Although they have been largely studied [3] [2], the growing need for the application of realistic approaches to complex public health questions is calling for a deeper understanding of the heterogeneities and dynamics of human mixing patterns. Physical contact networks are usually represented in the aggregate form of an age contact matrix ($M_{ij}$), which encodes information on the average number of contacts that individuals of different age groups have with each other. While age is a recognised determinant of people’s contact patterns, the current literature falls short of understanding the role of other social, demographic, and economic factors in shaping individuals’ contact patterns. Meanwhile, due to the lack of data, it is often impossible to characterize social mixing patterns over time, especially during times of emergencies (i.e. a pandemic or during non-pharmaceutical intervention periods). The goal of this work is to shed light on the main determinants, beyond age, which shape human contact patterns, and to understand how the relative importance of such determinants changes over time. We argue that it is crucial to decouple age contact matrices along other dimensions when modeling the spread of an infectious disease. To demonstrate this, we propose a data-driven mathematical framework to account for such differences in contact patterns in epidemiological models. The data used in this study comes from the MASZK study [1], a large data collection effort to observe social mixing patterns during the COVID-19 pandemic. The data collection was carried out in Hungary from April 2020 to July 2022 on a monthly basis. 
The data was collected via representative phone surveys using CATI methodology, involving a large representative sample of 1000 respondents each month. Beyond information on contacts before and during the pandemic, the MASZK dataset provided us with an extensive set of information on the participants' socio-demographic characteristics (gender, education level etc.), health condition (chronic and acute illness etc.), financial and working conditions (income, employment status, home office etc.), and behaviour and attitude towards pandemic-related measures (attitude towards vaccination, mask wearing etc.). To explore how these dimensions shape contact patterns, we model the expected total number of contacts $\mu_i$ for a respondent $i$ using a negative binomial regression as follows: $\mu_i = \alpha + \beta_1\,\mathrm{age\_group}_i + \beta_2 X_i + \beta_3\,\mathrm{age\_group}_i \ast X_i + \varepsilon_i$ (1). Here $\mathrm{age\_group}_i$ is the age class of $i$; $X_i$ is the variable of interest (e.g., education level, working condition); $\mathrm{age\_group}_i \ast X_i$ is the interaction term of age group and the variable of interest; and $\varepsilon_i$ is the error term. To be able to provide a meaningful description of the interactions, we analyse the average marginal effect (AME) of $X_i$ for different age groups. After identifying significant differences of the effects among the age categories, we rank the variables according to their importance in shaping contact patterns in the different periods of the pandemic. Our results show that multiple variables have significant effects in determining the number of contacts in different age groups. Employment situation and education level seem to play the most important roles throughout the pandemic. This demonstrates the need to incorporate a more precise stratification of the modelled population in epidemiological studies, especially considering social inequalities beyond the age of individuals. 
Further, we propose an innovative mathematical framework, which adapts the classical age-stratified SEIR model to accommodate contact matrices differing for each subgroup of the population defined by the new variable taken into account. In doing so we are able (i) to show how adapted SEIR models predict different spreading patterns of the epidemic as compared to the traditional, only age-stratified settings, and (ii) we can identify subgroups of the population that are strongly affected by the epidemic. Figure 1 shows that highly educated and employed individuals have higher attack rates in all age groups. Interestingly, the model that accounts for differences in contact patterns among different income classes does not reveal large differences among these subgroups in terms of the modelled attack rates. Fig. 1. Attack rates by age class for different subgroups of the population as predicted by our adapted SEIR model. Results are shown for the 4th wave, considering R0 = 2.5. |
15:45 | Robust Immunization of the Herd: Mosaic Vaccination against Escape Mutants PRESENTER: Simon Rella ABSTRACT. Immune escape due to pathogen evolution erodes vaccine efficacy and can lead to increased pathogen growth rates in an otherwise vaccinated population. In order to create robust immunity, various types of “universal” multi-epitope vaccines have been proposed. These vaccines are tailored towards granting a broad immunity that protects against multiple variants of the pathogen in question, thereby blocking vaccine escape before it can be transmitted. However, individuals with immunity towards diverse epitopes but a prolonged diseased state pose an obstacle to this strategy, as their immune system can serve as a selective environment for super-resistant mutants. In this work we model the evolutionary implications of population-based mosaic vaccination, in which an array of related multi-epitope vaccines is distributed on a given contact network. Considering diverse hypergraph architectures of epitope similarity, different contact networks and a range of models for within-body evolution, we show that distributing targeted epitopes across a population can drastically reduce the rate of pathogen fixation compared to a single “universal” vaccine. Our results are specifically relevant for pathogens with high mutation rates and moderate contagiousness (see Figure). |
16:00 | The strength and weakness of disease-induced herd immunity PRESENTER: Takayuki Hiraoka ABSTRACT. When a substantial portion of a population acquires immunity against infectious disease, the risk of a widespread outbreak decreases as immune individuals become less likely to contract the pathogen or to transmit it to others. This provides collective protection for the entire population, including individuals without immunity—a phenomenon known as herd immunity. Historically, the herd immunity threshold has been calculated assuming that the population is homogeneously mixed and the immunity is randomly distributed. While network epidemiology has clearly revealed the limitations of the first assumption, the deviations caused by the second are less well understood, in particular when it comes to immunity acquired through natural infection. Consider a situation where a disease attempts to invade a population, part of which already has immunity to it. Herd immunity works differently depending on whether it is induced by past infection or by random, homogeneous immunization. It is well known that population-level protection against a second wave is enhanced by degree heterogeneity because high-degree nodes are more likely to be immunized by the first wave of infection. However, as we show here, there are subtle effects that work in the opposite direction, making herd immunity weaker. These are rooted in the contiguous nature of the epidemic spread, which gives rise to mesoscale mixing heterogeneities similar to those discussed in the context of vaccination. We show that the total amount of disease-induced herd immunity is determined by the balance of these competing mechanisms. To study their effects, we consider the susceptible-infected-recovered (SIR) dynamics on contact networks. 
We evaluate the strength of herd immunity quantified by the largest possible epidemic size $C'$ (i.e., the size of an epidemic with very large transmissibility) after the removal of immune nodes from the network, both when immunity is induced by natural infection and in the case of uniformly random immunization. The two competing mechanisms are best illustrated by inspecting the average degree of the immune subgraph and the number of edges between immune and susceptible nodes (Fig. 1). When the contact network is a random regular graph, the infection cannot exploit degree heterogeneity and we only see the effects of mixing heterogeneity, which results in a smaller number of edges between recovered and susceptible nodes than in the case where the same number of nodes is randomly immunized. Therefore, disease-induced herd immunity is weaker. However, in the case of an Erdős–Rényi network with a Poisson degree distribution, the two effects balance each other exactly so that $C'$ is equal to that under homogeneous immunization. Even in this case, we see that the spread of natural infection leaves more edges within the immune subgraph but fewer edges on the interface. Most real-world contact networks are embedded in lower-dimensional spaces and highly clustered (i.e., with many short cycles) as a result. To this end, we discuss our systematic investigation into the cases where the contact networks exhibit such loopy structures as well. |
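The quantity $C'$ can be illustrated on a toy network. The sketch below uses an assumption not stated in the abstract: in the limit of very large transmissibility, an outbreak reaches every node connected to its seed, so the largest possible epidemic is taken to be the largest connected component of the network left after immune nodes are removed. The five-node path graph and the function name are hypothetical.

```python
from collections import deque


def largest_component_after_removal(adj, immune):
    """Size of the largest connected component among non-immune nodes.

    adj maps each node to a list of its neighbors (undirected network).
    """
    susceptible = set(adj) - set(immune)
    seen = set()
    best = 0
    for start in susceptible:
        if start in seen:
            continue
        # Breadth-first search restricted to susceptible nodes.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v in susceptible and v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


# Toy path graph 0 - 1 - 2 - 3 - 4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
c_centre = largest_component_after_removal(adj, immune={2})
c_end = largest_component_after_removal(adj, immune={0})
```

Immunizing the central node fragments the network far more than immunizing an end node, even though one node is removed in both cases. This shows why the placement of immunity, and not only its amount, matters for herd immunity.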
16:15 | Mixed bubbles: Maximizing epidemics with geographic and categorical assortativities in modular geometric graphs PRESENTER: Samuel Rosenblatt ABSTRACT. Some of the most influential results in network science show how important a few random long-range links can be on the connectivity of otherwise local networks. In reality, these long-range links are rarely random and instead stem from alternate connection mechanisms. One notable example concerns pathogen-spreading networks dominated by two types of assortativity: geographic proximity (e.g., neighbors or colleagues) and in-community preferences (e.g., travel to family or conferences). We propose a mixture model to explore epidemic dynamics on networks with varying mixing of geometric and categorical assortativity. We combine soft random geometric networks with a community-specific configuration model. Simulations of Susceptible-Infectious-Recovered dynamics (SIR) on our network model show that outbreaks are small when networks are highly assortative in terms of geographic location or category, and quickly rise in scale when “shortcuts” in the system are provided by mixing mechanisms. Specifically, when the number of communities is large, the worst outbreaks occur when both types of assortativity are equally important. With fewer communities, we find that outbreaks are maximized when categorical assortativity is relatively more dominant than geographic proximity. These results inform many important real-world scenarios. For the ongoing COVID-19 pandemic, our contact networks could maximize spread when they feature an equal mix of assortativity with local communities (neighbors and colleagues) and distributed ones (family). Our work is of particular interest in engineered systems, such as the livestock supply chains, which often mix opportunistic local assortativity with company-driven long range connections. 
We focus on such engineered systems as our model can suggest structural interventions which could be enacted through laws, incentives, or corporate actions. We therefore use our abstract network model to generate hypotheses and interventions which we then test on a large, complex, agent-based model of the swine production network in the United States. |
14:15 | Distinguishing simple and complex contagion processes on networks PRESENTER: Giulia Cencetti ABSTRACT. Contagion processes on networks, including disease spreading, information diffusion, or social behaviors propagation, can be modeled as simple contagion, i.e. involving one connection at a time, or as complex contagion, in which multiple interactions are needed for a contagion event. Empirical data on spreading processes however, even when available, do not easily allow to uncover which of these underlying contagion mechanisms is at work. We propose a strategy to discriminate between these mechanisms upon the observation of a single instance of a spreading process. The strategy is based on the observation of the order in which network nodes are infected, and on its correlation with their local topology. |
14:30 | Optimal Mesoscopic Structure of General Binary-State Dynamics on Networks PRESENTER: Jérémi Lesage ABSTRACT. The mesoscopic structure, which encapsulates connectivity patterns emerging at the scale of groups, plays a fundamental role in information diffusion on complex networks. In recent years, theoretical works have provided strong evidence supporting this for specific types of mesoscopic structures and dynamics. For instance, it has been shown to partly govern the localization phenomenon in spreading processes, a realization behind the design of better confinement measures. The optimal modularity of the mesoscale organization of connectomes has also been shown to explain some aspects of brain function, such as memory capacity. However, the mathematical models behind these results are usually tailored to the problem at hand, which limits their scope and calls for more generalized frameworks. We propose a general framework for describing arbitrary binary-state dynamics on mesoscopic structures. Our work is a direct generalization of the approximate master equation (AME) framework, but where the underlying network structure is specified by a configuration model with node types. In our framework, the nodes are compartmentalized by their state, their type, their generalized degree and their generalized active degree, the latter two of which encode the number of neighbors of each type. In turn, this compartmentalization enables us to write multi-Type Approximate Master Equations (TAME) that accurately approximate the time evolution of the dynamics for each type of node. We use our framework to investigate the optimality of information diffusion for the threshold model on modular graphs with different mesoscopic structures. We show that TAME accurately captures the optimal modularity region and the time evolution of the fraction of active nodes in each community. 
Finally, we show how the generality of our framework naturally extends the concept of optimal modularity to richers mesoscopic structures and to other binary-state dynamics, including complex contagions and consensus models. |
14:45 | Combining social science with network epidemiology to model the elimination of an endemic livestock disease PRESENTER: Ewan Colman ABSTRACT. Could consumer concern for animal health ripple through livestock industry supply chains and lead to the elimination of a cattle disease? This question is relevant to efforts to eliminate bovine viral diarrhoea (BVD) in England. While some countries have enforced mandatory laws to control the disease, England hopes to achieve elimination through a voluntary scheme that requires farmers to meet certain bio-security standards. To prevent BVD from spreading farm-to-farm through the network of cattle movements, farms in the scheme are only permitted to purchase from farms with high levels of bio-security, potentially causing a cascade of good bio-security behaviour through the network of supplier-purchaser relationships. For some farmers, however, it is not financially beneficial to join the scheme. These tend to be "fatteners": farms that grow beef cattle during the latter stages of their life. Owing to the connectivity of these farms within the network, the disease has been able to persist despite general improvements in bio-security across the industry. Pressure to join the scheme could potentially come from consumers; experiments show that consumers are willing to pay extra for meat that has not come from animals with BVD. If this knowledge could be used to incentivise behaviour change then it could be the key to nationwide elimination. Here we ask how much change is sufficient to achieve this? We approached the problem with a network-based model of co-evolving disease spread and social contagion. We used a database of cattle movements across the UK and our own survey asking farm managers about their attitudes to bio-security. Using these data, we constructed a novel model of coupled circulating contagions, one simple (BVD) and one complex (membership of the scheme). 
We surveyed farmers about their disease concerns and management practices including testing, vaccination and membership of the BVD control scheme. Clustering analysis showed a clear separation between dairy and beef producing farms. Dairy farmers adopted more bio-security measures than beef suckler farmers, while fattener farms were the least bio-secure. Farmers who have had to deal with the infection in the past were highly likely to adopt bio-security, although this did not extend to joining the control scheme. Qualitative insights from the questionnaire informed the parameter space of the model. To simulate consumer pressure we selected a percentage of "fattening" nodes to have higher rates of detection and protection and lower susceptibility to infection. Other parameters were chosen to model a scenario where the effect of the intervention is apparent. We find that the transition between endemic persistence and elimination is abrupt and significant changes are observed if the number of nodes with changed behaviour exceeds a critical threshold. In conclusion, targeting fattening farms for an intervention against BVD, through consumer pressure or otherwise, will only result in elimination if the scope and effectiveness of the intervention is substantial. |
15:00 | Quantifying population dynamics of complex contagions PRESENTER: Christoph Riedl ABSTRACT. Complex contagions [1] are used widely to model spreading phenomena for which social reinforcement influences the adoption of a behavior. Many works have shown how different these complex contagions can be from simple contagions, leading to superexponential growth and discontinuous phase transition for instance. Here, we show how similar they can be. We develop a theoretical model of contagion on networks with tunable social reinforcement (Fig. 1A). We find two regimes: the first characterized by standard exponential growth—as predicted by simple contagion—and the second by superexponential growth. However, we also show that it is almost impossible to observe superexponential growth near criticality (when the basic reproduction number is close to 1) even when social reinforcement is strong. Intuitively, this is because multiple exposures are rare when the prevalence is low. In that case, the statistical properties of the spreading phenomenon are described well by a simple contagion process. To support this hypothesis, we analyze empirical data from a country-scale product diffusion experiment [2] with a simple branching process (Fig. 1B). While there is evidence for social reinforcement at the individual level in this experiment [3], the branching process describes well the statistics of the reconstructed transmission trees and the distribution of small clusters, in line with our theoretical framework. Our work suggests that a broad category of social phenomena could be modeled accurately with simple contagions, even if social reinforcement is at play at the individual level. Conversely, it shows that global statistical properties of social contagions are insufficient to test whether they are complex or simple at the individual level. [1] S. Lehmann and Y.-Y. Ahn, eds., Complex Spreading Phenomena in Social Systems. Computational Social Sciences, Springer, 2018. 
[2] C. Riedl, J. Bjelland, G. Canright, A. Iqbal, K. Engø-Monsen, T. Qureshi, P. R. Sundsøy, and D. Lazer, “Product diffusion through on-demand information-seeking behaviour,” J. R. Soc. Interface, vol. 15, p. 20170751, 2018. [3] J. Lee, D. Lazer, and C. Riedl, “Complex contagion in viral marketing: Causal evidence and embeddedness effects from a country-scale field experiment,” SSRN 4092057, 2022. |
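To make the "simple branching process" picture concrete, here is a minimal sketch of a Galton-Watson process with Poisson offspring, which generates transmission-tree (cluster) sizes. The Poisson offspring distribution, the value r0 = 0.5, the sample size, and the helper names are assumptions for illustration only; they are not taken from the abstract or the cited experiment.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def cluster_size(rng, r0, max_size=10_000):
    """Total number of adopters in one transmission tree (capped)."""
    size, active = 1, 1  # start from a single seed adopter
    while active and size < max_size:
        # Each active adopter independently recruits Poisson(r0) others.
        offspring = sum(poisson(rng, r0) for _ in range(active))
        size += offspring
        active = offspring
    return size


rng = random.Random(42)
sizes = [cluster_size(rng, r0=0.5) for _ in range(5000)]
mean_size = sum(sizes) / len(sizes)  # theory for r0 < 1: 1 / (1 - r0)
```

Below criticality (r0 < 1) the expected cluster size is 1 / (1 - r0), so the simulated mean should sit close to 2 here; comparing such simulated size distributions with observed small clusters is one way a branching-process description can be checked.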
15:15 | Investigating Churn in Online Wellness Programs: Evidence from a U.S. Online Social Network PRESENTER: Ankur Mani ABSTRACT. Online wellness activity platforms are increasingly utilizing wellness programs and social support to motivate healthy activities and improve user engagement. However, many wellness programs suffer from high churn rates that discount their expected efficacy, and negative social influence may lead to a churn contagion that amplifies the speed and scale of churn. Hence, a need arises to understand why users churn from wellness programs and how social contagion contributes to churn. Leveraging the exercise challenge setting, the exercise data, and a large social network on a renowned U.S. online fitness platform, we investigate the effect of peers' exercise challenge churn on the ego. To achieve the research goal, we employ an instrumental variable framework, using the exogenous variation of peers' weather in locations that differ from the ego's location as instruments. The framework untangles the endogeneity of the estimated effect using variations created by peers' weather as a shock to the ego's churn. We measure churn as a decision an ego makes after being inactive for one to two weeks and define peers as the users an ego follows on the platform. We find that exercise challenge churn is socially contagious and exhibits complex contagion. Interestingly, our analyses reveal that the social contagion of churn diffuses from the sub-central or peripheral egos, who have fewer friends in the social network, to central egos, who have more friends in the social network. Such churn contagion is mostly confined to low-density network communities whose members are poorly connected with one another. Our findings have important implications for designing intervention plans to stop wellness program churn based on social contagion. |
14:15 | Complex network analysis of electronic health records from multiple populations reveals biases in the administration of known drug-drug interactions PRESENTER: Luis M. Rocha ABSTRACT. The co-administration of drugs known to interact has a high impact on morbidity, mortality, and health economics. We study the drug-drug interaction (DDI) phenomenon with a large-scale longitudinal analysis of age and gender biases found in drug administration data from three distinct health care systems. Using network science, risk and statistical analysis, we study drug administrations from population-wide electronic health records (EHR) in Blumenau (Brazil; pop. 330K), Catalonia (Spain; pop. 7.5M), and Indianapolis (USA; pop. 864K) with an observation window ranging from 1.5 to 11 years. Indeed, our work is the first to characterize in detail the heavy burden of DDI on health systems that are very distinct in geography, population, and policy (a paper detailing the analysis is currently under review but available [1]). We first compute a stratified risk of DDI for several severity levels per patient’s gender and age at the time of dispensation. To investigate the highly multivariate causes of DDI risks, we use network science and statistical null models. DDI networks built from various measures of co-administration allow us to explore and identify key drugs involved in the DDI phenomenon, including interactions with increased gender and age risk. An interactive version of these networks is available at http://disease-perception.bsc.es/ddinteract/, allowing public health and pharmacology specialists to study in detail any drugs and symptomatology of interest. The role of polypharmacy is also elucidated via statistical null models that shuffle drug labels while accounting for cohort-specific drug availability. 
We find a large risk of known DDIs in all populations that affect 13%, 13%, and 20% of patients in Indianapolis, Blumenau, and Catalonia, respectively, and identify 149 DDI instances that appear in all three health care systems. The increasing risk of DDI as patients age is very similar across all three populations but is not explained solely by higher co-administration rates in the elderly. Although the risk of DDIs increases with age, administration patterns point to a complex phenomenon that cannot be solely explained by polypharmacy and comorbidity. We also find that women are at higher risk of DDI overall—with the exception of men over 50 years old in Indianapolis. Additionally, to exemplify how actionable interventions can result from the analysis, we simulate alternative polypharmacy regimens for DDIs involving the proton-pump inhibitor (PPI) Omeprazole. We show how substituting this drug for other PPI drugs with fewer known interactions, can reduce the number of patients affected by known DDIs by up to 21% in both Blumenau and Catalonia, and 2% in Indianapolis, exemplifying how analysis of EHR data can lead to significant reduction of DDI and its associated human and economic costs. Our work characterizes the heavy burden of DDIs for health systems that are very distinct in geography, population, and management. Although the risk of DDI increases with age, especially for patients with comorbidities, our results point to a complex DDI phenomenon driven by culture and economics, in addition to biological factors, which interact to overburden health systems with patients taking a multitude of DDIs. From a broader perspective, our work demonstrates that an integrative, systems science methodology to analyzing EHR can be used to study the DDI phenomenon in different cultures and health systems. 
The lack of safer drug alternatives we identify, particularly for chronic conditions, further overburdens health systems, thus highlighting the need for disruptive drug research. [1] J. Sanchez-Valle, R. B. Correia, M. Camacho-Artacho, R. Lepore, M. M. Mattos, L. M. Rocha, and A. Valencia. Analysis of electronic health records from three distinct and large populations reveals high prevalence and biases in the co-administration of drugs known to interact. medRxiv, 2023. doi: 10.1101/2023.02.06.23285566. URL https://www.medrxiv.org/content/early/2023/02/08/2023.02.06.23285566. |
14:30 | The human exposure network: a multi-scale study of the impact of chemicals in human health PRESENTER: Salvo Danilo Lombardo ABSTRACT. Diseases and phenotypic manifestations result from the combination of genetics and environmental factors. Although many chemical exposures have not yet been thoroughly studied and most of our current knowledge comes from individual epidemiological and toxicological studies, increasing evidence attributes a large variety of diseases to water, soil, and air pollution. Here, we used a network-based approach to construct a comprehensive map in which 9,887 exposures are linked through their shared impact at the genetic level. The map can be used to identify groups of exposures that have similar biological effects, even if they are chemically different. By using the human interactome of protein-protein interactions, we found that exposures affect well-defined neighborhoods and that high interactome connectivity is the prime indicator of the harmfulness of an exposure. A systematic comparison between the interactome modules affected by exposures and disease-associated modules suggested that interactome overlap can be used to predict exposure-disease relationships. To evaluate the validity of these predictions, we cross-referenced country-wide disease prevalence data with reports of environmental exposures. We found that elevated levels of a particular exposure in air or water correlated with an increase in the prevalence of diseases whose interactome modules overlapped with those of the respective exposures. Taken together, we provide a framework for relating the genetic component of chemical exposures with their epidemiological observation. |
14:45 | Scale-dependent landscape of semi-nested community structures of 3D chromosome contact networks PRESENTER: Sang Hoon Lee ABSTRACT. Mammalian DNA folds into 3D structures that facilitate and regulate genetic processes such as transcription, DNA repair, and epigenetics. Several insights derive from chromosome capture methods, such as Hi-C, which allow researchers to construct contact maps depicting 3D interactions among all DNA segment pairs. To better understand the organizing principles, several groups analyzed Hi-C data assuming a Russian-doll-like nested hierarchy where DNA regions of similar sizes merge into larger and larger structures. However, while successful, this model is incompatible with the two competing mechanisms that seem to shape a significant part of the chromosomes' 3D organization: loop extrusion and phase separation. The first part of our work [1] aims to map out the chromosome's actual folding hierarchy from empirical data, by treating the measured DNA-DNA interactions by Hi-C as a weighted network. From such a network, we extract 3D communities using the generalized Louvain algorithm with an adjustable resolution parameter, which allows us to scan seamlessly through the community size spectrum, from A/B compartments to topologically associated domains (TADs) [2]. By constructing a hierarchical tree connecting these communities, we find that chromosomes are more complex than a perfect hierarchy. Analyzing how communities nest relative to a simple folding model, we find that chromosomes exhibit a significant portion of nested and non-nested community pairs alongside considerable randomness. In addition, by examining nesting and chromatin types, we discover that nested parts are often associated with actively transcribed chromatin. 
Another reoccurring issue that seems to reflect the fundamental limitation of community detection in the case of stochastic algorithms is the possibility of inconsistent detection results (the same community detection method may disagree with itself) [3, 4]. If too strong, such inconsistencies may cause problems if the data interpretation relies too heavily on a specific community structure when there are others equally feasible. This is a fundamental problem pertaining to any data clustering scheme that cannot be solved using better community detection algorithms. In the second part of our work, we investigate the inconsistency of 3D communities in Hi-C data. We utilize an inconsistency metric [4], map out the community spectrum at different scales of the Hi-C contact network, and quantify where the community separation is most inconsistent. As a result, we find that the nodal inconsistency or functional flexibility [3] are also related to the local chromatin activity as in the nestedness analysis. [1] D. Bernenko, S. H. Lee, P. Stenberg, and L. Lizana, Mapping the semi-nested community structure of 3D chromosome contact networks, bioRxiv 10.1101/2022.06.24.497560 (2022). [2] S. H. Lee, Y. Kim, S. Lee, X. Durang, P. Stenberg, J.-H. Jeon, and L. Lizana, Mapping the spectrum of 3D communities in human chromosome conformation capture data, Scientific Reports 9, 6859 (2019). [3] H. Kim and S. H. Lee, Relational flexibility of network elements based on inconsistent community detection, Phys. Rev. E 100, 022311 (2019). [4] D. Lee, S. H. Lee, B. J. Kim, and H. Kim, Consistency landscape of network communities, Phys. Rev. E 103, 052306 (2021). |
15:00 | Multilayer networks of plasmid genetic similarity reveal potential pathways of gene transmission PRESENTER: Shai Pilosof ABSTRACT. Antimicrobial resistance (AMR) is a significant threat to public health. Plasmids---circular DNA entities that reside in bacteria---are the principal vectors of AMR genes. It is well known that plasmids carrying AMR genes can move across animal hosts via bacteria (Fig 1A). However, it remains unknown how the dynamics of gene exchange between plasmids scale up to between-host transmission. Given that cattle is a major source for AMR, the cow rumen is a highly relevant study system. Here, we used theory and methodology from network science and disease ecology to investigate gene transmission between plasmids and cows in a dairy cow population. Our data included the collection of plasmids within 21 cows. We constructed a multilayer network based on pairwise plasmid genetic similarity (Fig. 1B), which is a signature of past genetic exchange events. Plasmids containing genes allowing them to spread had a higher degree. Moreover, the degree distribution had a signature of super-spreading whereby a few plasmids had a disproportionately high degree. Using multilayer modularity analysis, we discovered a module composed of plasmids with the same AMR gene and one containing plasmids with a high spreading ability (Fig. 1C). Cows that contained both modules also shared transmission pathways with many other cows, making them candidates for AMR gene super-spreading. Using simulations of AMR gene spreading we show that a new gene invading the cow population will likely reach all cows (Fig 1D). Finally, we showed that the distribution of edge weights contained a non-random signature for the mechanisms of gene transmission, allowing us to differentiate between plasmid dispersal and genetic exchange. Our results provide insights into how AMR genes spread across animal hosts. 
From a network-science perspective, we inferred the processes underlying an observed network structure and the ecological function of the network. |
15:15 | Finding Shortest and Nearly Shortest Paths in Substantially Incomplete Networks PRESENTER: Maksim Kitsak ABSTRACT. Dynamic processes on networks, be it information transfer in the Internet, contagious spreading in a social network, or neural signaling, take place along shortest or nearly shortest paths. Computing shortest paths is a straightforward task when the network of interest is fully known, and there is a plethora of computational algorithms for this purpose. Unfortunately, our maps of most large networks are substantially incomplete due to either the highly dynamic nature of networks, or high cost of network measurements, or both, rendering traditional path-finding methods inefficient. We find that shortest paths in large real networks, such as the network of protein-protein interactions (PPI) and the Internet at the autonomous system (AS) level, are not random but are organized according to latent-geometric rules. If nodes of these networks are mapped to points in latent hyperbolic spaces, shortest paths in them align along geodesic curves connecting endpoint nodes. We find that the latent-geometric alignment is sufficiently strong to allow for the identification of shortest path nodes even in the case of substantially incomplete networks, where the numbers of missing links exceed those of observable links. Our finding can be either a curse or a blessing, depending on the circumstances. One could exploit the geometric localization of shortest paths to disrupt or eavesdrop on communication paths of interest. On the other hand, the knowledge of geodesic fairways may help identify alternative optimal paths and rule out inefficient or fraudulent paths in communication networks.
Additionally, through the analysis of latent geometric localizations of cellular pathways, we find that some of them, including ubiquitin proteasome (UPP), transforming growth factor beta (TGFb), and cell cycle pathways, are organized similarly to communication pathways. Not only are these pathways aligned along geodesic curves, but other genes in the latent geometric vicinity of them appear to be functionally related. We emphasize that there is no one-size-fits-all solution to the shortest path problem. In order to identify shortest path nodes in a partially known network, one needs to know both the mechanisms of network formation and the character of missing data. Distance to geodesic, in this respect, assumes that link formation in the network is captured by its latent geometry, and unknown links are missing uniformly at random. The first condition is a must: one cannot expect to identify shortest paths in non-geometric networks using geometric methods. The second assumption could probably be relaxed. While our embedding algorithm is designed for networks with uniformly missing links, it should be straightforward to generalize it to the special case when the probability of a missing link is also a function of a latent distance between the nodes. |
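The geometric idea can be sketched with the standard distance formula for the hyperbolic plane of curvature -1 in polar coordinates (r, θ): cosh d = cosh r1 cosh r2 - sinh r1 sinh r2 cos(θ1 - θ2). The coordinate convention is an assumption here, and the `detour` score below is only a hypothetical proxy for closeness to a geodesic (the triangle-inequality excess); it is not the authors' distance-to-geodesic measure.

```python
import math


def hyperbolic_distance(p, q):
    """Distance between points p=(r, theta) and q=(r, theta), curvature -1."""
    (r1, t1), (r2, t2) = p, q
    cosh_d = (math.cosh(r1) * math.cosh(r2)
              - math.sinh(r1) * math.sinh(r2) * math.cos(t1 - t2))
    # Clamp to guard against rounding just below 1 for (nearly) equal points.
    return math.acosh(max(1.0, cosh_d))


def detour(a, b, c):
    """Extra length of going a -> c -> b instead of a -> b directly.

    The value is (numerically) zero when c lies on the geodesic segment
    between a and b, and grows as c moves away from it. Hypothetical proxy.
    """
    return (hyperbolic_distance(a, c) + hyperbolic_distance(c, b)
            - hyperbolic_distance(a, b))


a, b = (3.0, 0.0), (3.0, math.pi)      # two endpoints on opposite sides
on_path = detour(a, b, (0.0, 0.0))     # the origin lies on their geodesic
off_path = detour(a, b, (3.0, math.pi / 2))
```

Ranking candidate nodes by such a score is one simple way to ask which nodes sit near the geodesic between two endpoints, under the stated assumption that link formation follows the latent geometry.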
15:30 | Characterising Evolutionary Pathways for Adaptations using Analysis of Metabolic Correlations Networks PRESENTER: Sharon Samuel ABSTRACT. After the inactivation of essential genes, adaptation often includes “peripheral mutations” in the genome, which allow the rerouting of metabolic pathways to increase fitness. The effects of such rewiring are likely to propagate through the metabolic network. However, identifying these consequences is challenging given the complexity of the cell’s metabolic network, which comprises hundreds of interdependent chemical reactions. Here, we combine an experiment with network analysis to investigate causal node- and network-level effects of adaptive processes on metabolic correlation networks. We replaced an essential gene in wild type (WT) Escherichia coli bacteria with a less efficient homolog, causing a decrease in fitness. The perturbed, unevolved cells were then allowed to evolve. The evolved cells regained the lost fitness (Fig. 1A). We quantified the relative amounts of metabolites in each strain using mass spectrometry. For each bacterial strain (WT, unevolved, evolved) we constructed a network in which links encode the correlation in relative amounts between metabolites (nodes) (Fig. 1B). Correlation analysis is a powerful strategy to identify systemic metabolic changes, capturing direct and indirect relationships. A comparison of link turnover using the Jaccard index showed low link sharing between the networks. However, the networks were more similar than random expectations. We identified a group of core correlations that appeared in all networks. This stable network backbone is likely to contain crucial metabolic reactions necessary for homeostasis. Moreover, the networks showed a similar modular topology, consisting of a large module and approximately 40 smaller modules. We also identified network hubs using various centrality measures. The experimental perturbation led to the loss of many hubs.
Some of these were regained in the evolved bacteria, along with the emergence of new ones. In addition, nodes that possessed core correlations remained central in the three networks. In conclusion, we showed that perturbation and subsequent evolution led to considerable modifications of the metabolic network, which can be captured using a network-science perspective. This network approach has a strong potential to bridge the genotype-phenotype-fitness divide—an ongoing challenge in biology. |
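The link-turnover comparison can be illustrated with a small sketch of the Jaccard index on edge sets, together with the set of "core" links shared by all networks. The metabolite labels and edge lists below are made up for the example and do not come from the study.

```python
def edge_set(edges):
    """Treat links as undirected: (u, v) and (v, u) are the same link."""
    return {frozenset(edge) for edge in edges}


def jaccard(edges_a, edges_b):
    """Shared links divided by all links present in either network."""
    a, b = edge_set(edges_a), edge_set(edges_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical correlation networks for the three strains.
wt = [("A", "B"), ("B", "C"), ("C", "D")]
unevolved = [("A", "B"), ("D", "E")]
evolved = [("B", "A"), ("C", "D"), ("D", "E")]

similarity = jaccard(wt, evolved)
core = edge_set(wt) & edge_set(unevolved) & edge_set(evolved)
```

Here `similarity` is 0.5 (two shared links out of four distinct ones) and `core` contains only the A-B link, the toy analogue of a backbone present in every network.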
15:45 | Predicting genetic interactions with a network based method PRESENTER: Ruiting Xie ABSTRACT. An unexpected phenotype often emerges from a combination of two or more gene variants, resulting from genetic interactions. A negative (positive) genetic interaction leads to a more (less) severe fitness effect than expected. Here, we focus on pair-wise genetic interactions to study the relationship between genotypes and phenotypes in yeast. Our input is a global genetic interaction network of ~550,000 negative and ~350,000 positive genetic interactions[1]. Like all large-scale biological datasets, this genetic interaction network is still incomplete. 27% of gene pairs remain unmeasured while the false negative rate for measured gene pairs is estimated to be above 40% in a replicate screen [2]. To infer missing genetic interactions, we use a network-based method that has succeeded in predicting other biological networks such as missing protein-protein interactions [3] and synaptic polarities between neurons [4]. We predicted ~7,000 new genetic interactions between essential genes and ~14,000 genetic interactions between non-essential genes with a threshold chosen such that the inferred interactions are at least as meaningful biologically as the known genetic interactions, based on comparisons to additional datasets, see Figure. With these predictions, we aim to unveil more details on gene function based on their similarity in genetic interaction profiles. Our results and prediction methods have the potential to be further applied to other organisms, including ongoing experiments in humans. [1] M. Costanzo et al., Science 353, aaf1420 (2016). [2] M. Costanzo et al., Science 372, eabf8424 (2021). [3] I. A. Kovács et al., Nature Communications 10, 1240 (2019). [4] M. R. Harris, T. P. Wytock, and I. A. Kovács, Advanced science 9, e2104906 (2022). |
16:00 | Are “hubs” in beta-cell clusters an emergent network property or do they exist independently? PRESENTER: Marco Patriarca ABSTRACT. The cell network structure of the pancreatic islets of Langerhans has been the subject of several experimental and theoretical studies. A long-standing dilemma is whether the collective oscillations of beta-cells require the presence of specialized pacemaker cells, named “hubs”, or synchronization occurs through a “democratic” mechanism, where the collective cell network behavior is a nonlinear average of the properties of its individual elements. The topic has received enough attention to justify a review entirely focused on the “hub” dilemma [1]. In a recently published work [2], we mimicked the architecture of a beta-cell network by a cubic lattice of heterogeneous FitzHugh-Nagumo (FHN) elements. This topology resembles the experimentally known features of a beta-cell islet. We introduced heterogeneity in the network in the form of diversified external currents Ji acting on the network elements, drawn from a Gaussian distribution with standard deviation s. In our simulations, we varied the width s of the distribution, finding a clear “Diversity-induced Resonance” effect co-occurring with a fraction f = 5% of “hubs” (units with Ji corresponding to an intrinsically oscillatory state), in good agreement with observations [3]. While the results of our previous study support the existence of hubs, they do not allow us to draw any conclusion on whether these hubs are an emergent network property or exist independently of the network. To dig deeper into this question, here we present the results of new simulations where, using the same model, we damage the beta-cell network by selectively disconnecting either hubs or nonhubs, thus changing the number of units, their heterogeneity, and the network topology.
We found rather surprising and apparently contradictory results, some of which are summarized in Fig. 1. In a series of simulations, we disconnected from the network 1/3 of the hubs, by setting their coupling constant C=0 in the coupled FHN equations. This means the corresponding FHN units had no interaction with other network units. As shown by the red bars in Fig. 1, this caused a dramatic drop of the collective oscillatory activity vs. the reference network configuration, where no elements were disconnected, shown by the green bars. On the other hand, upon disconnecting the same number of nonhubs by the same approach, we found virtually no change in oscillatory activity, as shown by the yellow bars. This seems to suggest that hubs do play a crucial role as a distinct subset of network elements. However, if we build a truncated distribution of Ji values, where the central range corresponding to oscillatory FHN states is missing, so that the network is formed by nonhubs only (without any disconnected units), then the global oscillatory activity is also maintained vs. the reference system (blue bars). Therefore, hubs seem to be crucial for global network oscillations if they are initially present and get disconnected, whereas their complete absence does not prevent the network from being in a fully resonant oscillatory state. In our contribution we will present additional data and hypotheses to explain this apparent contradiction. Our findings help to shed light on the “hub” dilemma but, at the same time, raise new questions that will require more work to be fully understood. [1] B. E. Peercy, A. S. Sherman, J. Biosci. 47, 14 (2022). [2] S. Scialla, A. Loppini, M. Patriarca, E. Heinsalu, Phys. Rev. E 103, 052211 (2021). [3] N. R. Johnston et al., Cell Metab. 24, 389 (2016). |
16:15 | Complex Networks in Dengue-induced endothelial cell plasticity PRESENTER: Nelson Fernández ABSTRACT. From the study of embryonic development, it has been found that there are different types of Cellular Plasticity (CP), also when cell injuries or damage have occurred. Recently, it has been discovered that a particular type of CP can be found in endothelial cells from some viral infection, such as the Dengue virus (DENV). This infection causes an Endothelial Dysfunction (EndDys) characterized by a change in cell phenotype (Endothelial to Mesenchymal Transition- EndMT), increased vascular permeability, and low expression of some molecules such as E-cadherin [1]. Since EndDys may be involved in various physiological processes, such as cardiovascular malfunction, congenital heart disease, systemic and organic fibrosis, pulmonary arterial hypertension, and atherosclerosis, its extraction has many promising potential applications in regenerative medicine [2]. Thus, modeling molecular components and their interactions is required to understand and elucidate the structure and dynamics of EndDys phenomena caused by DENV as a new type of CP. To this end, we present a Boolean network model of the molecules involved in EndDys obtained from a bibliometric study on several databases as the first phase of the laboratory experiments. Our meta-analysis and systematic review in PubMed, Scopus, CORE, GitHub, and GitLab repositories found 79,531 papers, resulting in 393 selected articles. From these documents, just 33 studies were extracted by using exclusion criteria and filters for search equations. In these 33 articles, we found 54 molecules related to CP phenomena related to EndDys that generate 128 interactions. According to signaling pathways, we found VEGF involved in EC activation during vascular remodeling (Fig 1.). Typically, VEGFA binds to a VEGFR2 homodimer during angiogenesis and activates signaling from other molecules. 
Tie-2 receptor-mediated signaling of RhoA by Ang-1 results in activation of Rac1, which inactivates RhoA. The activated Rac1 promotes the accumulation of VE-Cadherin at inter-endothelial junctions. At the same time, the inactivation of RhoA prevents the formation of actin stress fibers and connections, leading to the integrity of the endothelial barrier. Finally, the activity of SNAI1 and SNAI2 molecules is needed for the differentiation of mesenchymal cells. In addition, we found that the GATA2 transcription factor is involved in the regulation of EC identity, and that its loss induces EndMT. Also, GATA2 in endothelial cells activates the transcription of Vegfr2, Nrp1, and GATA2 itself. Our model constitutes a theoretical framework that incorporates public experimental data, summarizes the current state of the subject, and allows computational analysis. At the same time, our model can generate hypotheses and guide empirical research to understand CP's regulatory mechanisms when DENV is involved. |
16:30 | Social learning in sperm whale language: evidence from acoustic data. PRESENTER: Antonio Leitao ABSTRACT. This study provides quantitative evidence of clan-specific accents in sperm whales. These animals are known for their hierarchically organised societies and complex vocal communication system. Their communication consists of clicks emitted in precise rhythmic patterns to form codas. By analysing acoustic data from various locations in the Pacific and Atlantic oceans, we found clear differences in the acoustic structure of codas that reflect the hierarchy of social units. Specifically, we built network-based variable–length Markov chains (VLMCs) from the data. These are essentially network models that are capable of reproducing the statistical structure of sperm whales' language. Our results show that geographically overlapping clans tend to have more similar accents for non-identity codas, that is codas used by multiple clans. This suggests a form of social language sharing that preserves each clan's identity. Our results show that differences in coda generation reflect the presence of cultural transmission and group-specific traditions in sperm whale communication. Finally, these findings offer further evidence of the sophistication of the sperm whale language, while at the same time emphasising the importance of continued investigation into the complexity of sperm whale communication and its potential implications for our understanding of cultural transmission in animal societies. The proposed method provides a general framework to study and compare the language of other animal species, and the insights offered by this research provide a foundation for future research in this field. |
14:15 | Measuring polarization in multipolar social contexts PRESENTER: Rosa M. Benito ABSTRACT. Social polarization is a growing concern worldwide, as it strains social relations, erodes trust in institutions, and thus hurts democratic societies. Polarization has been traditionally studied in binary conflicts where two groups support opposite ideas. However, in many social systems, such as multi-party democracies, political conflicts involve multiple dissenting factions. Despite the prevalence of multipolar systems, there is still a lack of suitable analytical tools to study their polarization patterns. In this work, we develop a methodology that extracts the ideological structure of multipolar contexts from social networks and propose several polarization metrics. A representative example of a multipolar context is a multi-party system, in which more than two parties have a realistic chance of obtaining significant representation. We model these systems by considering each party as an opinion pole. If there are n opinion poles, we place each pole (party) at a vertex of a regular simplex of dimension n-1 (a multidimensional generalization of the equilateral triangle used in the tripolar case), so that each pole is at the same distance from the others. Our multidimensional opinion inference technique is a generalization of a bipolar (one-dimensional) methodology [1] based on models of opinion dynamics. The process consists of building a network of social interactions from empirical data and identifying the opinion leaders and their respective ideological positions [2]. Then, we use the model to propagate the leaders’ opinions throughout the rest of the nodes. Finally, we take the model’s outputs (the converged opinions) as the inferred opinions of the nodes. 
To characterize and measure the polarization of the inferred opinion distribution we propose different metrics [3] based on the covariance matrix, which is the multidimensional generalization of the variance, a quantity often adopted as a one-dimensional measure of polarization [4]. In particular, we use the trace of the covariance matrix (the total variation) as a global measure of opinion extremeness, and its eigendecomposition to quantify pole alignment (a multipolar analogue of opinion alignment), obtaining the direction of maximum polarization in the ideological space by principal components (PC). Fig 1 illustrates this methodology applied to Twitter data from the 2015 Spanish general elections, a four-pole system. Projections of the opinion distribution onto the faces of the tetrahedron (the simplex) are shown as heat maps and contour plots (A). The centers of mass of the projected opinion distributions are represented as white squares, and the projection of the direction of maximum polarization (PC 1) as a double-headed arrow. 1D projections onto each edge of the simplex are shown on the sides of the triangles (A). The opinion distribution projected onto the first two principal components (PC1 and PC2), with the proportion of explained variance included in the axis labels, is shown in B. |
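The covariance-based metrics described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the choice of centered standard basis vectors as simplex vertices, and the Dirichlet-generated toy opinions are all assumptions made for the example.

```python
import numpy as np

def simplex_poles(n_poles):
    """Return n_poles mutually equidistant points (a regular simplex).

    Assumption for illustration: the standard basis vectors of R^n, shifted
    to zero mean, which span an (n-1)-dimensional regular simplex.
    """
    return np.eye(n_poles) - 1.0 / n_poles

def polarization_summary(opinions):
    """Total variation, first principal direction, and explained variance.

    opinions: array of shape (n_users, dim), one inferred opinion per row.
    """
    cov = np.cov(opinions, rowvar=False)
    total_variation = float(np.trace(cov))      # global opinion extremeness
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()         # proportion of variance per PC
    return total_variation, eigvecs[:, 0], explained

# Toy data (hypothetical): 500 users in a four-pole system, each opinion a
# convex combination of the poles.
rng = np.random.default_rng(0)
poles = simplex_poles(4)
weights = rng.dirichlet(alpha=[0.3] * 4, size=500)
opinions = weights @ poles
total_variation, pc1, explained = polarization_summary(opinions)
```

Here `pc1` plays the role of the direction of maximum polarization, and `explained[:2]` gives the proportions of variance that would label the PC1 and PC2 axes.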
14:30 | Quantifying Ideological Polarization on a Network Using Generalized Euclidean Distance PRESENTER: Marilena Hohmann ABSTRACT. An intensely debated topic is whether political polarization on social media is on the rise. We can investigate this question only if we can quantify polarization, by taking into account how extreme the opinions of the people are, how much they organize into echo chambers, and how these echo chambers organize in the network. Current polarization estimates are insensitive to at least one of these factors: they cannot conclusively clarify the opening question. Here, we propose a measure of ideological polarization which can capture the factors we listed. The measure is based on the Generalized Euclidean (GE) distance, which estimates the distance between two vectors on a network, e.g., representing people’s opinion. This measure can fill the methodological gap left by the state of the art, and leads to useful insights when applied to real-world debates happening on social media and to data from the US Congress. |
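As a rough illustration of a network-aware distance between two vectors, the sketch below assumes the definition of the GE distance commonly given in the literature, the square root of (a-b)^T L^+ (a-b), where L^+ is the pseudoinverse of the graph Laplacian. The abstract does not state the formula, so this definition, the function name, and the toy graph are assumptions.

```python
import numpy as np

def ge_distance(adjacency, a, b):
    """Generalized Euclidean distance between node vectors a and b.

    Assumed definition: sqrt((a - b)^T L^+ (a - b)), with L^+ the
    Moore-Penrose pseudoinverse of the Laplacian of an undirected graph.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    l_pinv = np.linalg.pinv(laplacian)
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    # Clip at zero to guard against tiny negative values from round-off.
    return float(np.sqrt(max(diff @ l_pinv @ diff, 0.0)))

# Toy path graph 0 - 1 - 2 (hypothetical example).
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
far = ge_distance(path, [1, 0, 0], [0, 0, 1])    # mass on opposite ends
near = ge_distance(path, [1, 0, 0], [0, 1, 0])   # mass on adjacent nodes
```

Under this definition, two vectors concentrated on distant nodes are further apart than two concentrated on adjacent nodes, which is what lets a polarization score respond to network structure and not only to opinion values.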
14:45 | Modelling how social network algorithms can influence opinion polarization PRESENTER: Henrique Ferraz de Arruda ABSTRACT. The influence of social networks on society has gained increasing attention in recent years. In particular, one of the main focuses of research has been on the dynamics of information processes. To further understand this topic, we propose an opinion model that simulates the information changes on social networks, focusing on the information from external sources and how users and algorithms handle it. In the first step, a piece of news that can be posted on the social network is simulated as a random number. Next, we model the willingness of a user to share the information (called post transmission) using probability functions based on the difference between the post and the user's opinion. After this step, we simulate how the social network algorithm distributes the post to other users (called post distribution). Finally, users who disagree with a received post can rewire their contacts, which is modeled by an additional probability function. The analysis is based on the measure proposed in (Cota et al., 2019), where the opinions of the users are compared with the average opinion of their neighbors. To understand our dynamics, we tested several different probability functions for both post transmission and post distribution. Our dynamics converged into numerous scenarios, including opinion consensus, polarization, and the formation of echo chambers. For instance, there is the possibility of opinion polarization with or without echo-chamber formation. Polarization without echo chambers indicates that even with polarization, users with different opinions can exchange posts, which is rarer in the case of echo chamber formation. Furthermore, the results were found to be similar to those observed in real social networks. For other sets of parameters, the outcomes indicate diversity and consensus of opinions. 
Our study suggests that the social network algorithm could play a key role in mitigating or promoting opinion polarization. |
15:00 | Multidimensional political polarization in online social networks PRESENTER: Antonio Fernández Peralta ABSTRACT. Political polarization in social platforms is a phenomenon gaining increasing attention. Understanding the structure of polarized states and how they emerge and evolve is a fundamental issue that we address in this work. We analyze the community structure of a two-layer, interconnected network of French Twitter users, where one layer contains members of the Parliament and the other one general users. We obtain an embedding of the network in a four-dimensional political opinion space by combining network embedding methods and political survey data. We find structural groups sharing common political attitudes in our opinion space and relate them to the political party landscape. The distribution of opinions of professional politicians is narrower than that of the rest of the users, indicating the presence of more extreme attitudes in the latter layer. We find that politically extreme communities interact less with other groups than more centrist communities do. We apply an empirically tested social influence model to the two-layer network to investigate the possible interaction mechanisms that can describe the political polarization seen in data. The model reproduces the opinion distributions well at the global scale, and especially for centrist groups, and sheds light on the possible social behaviors that drive the system towards polarization. |
15:15 | On the side-effects of compromising: coupling agents' heterogeneity with network effects in a bounded confidence opinion dynamics model PRESENTER: Rémi Perrier ABSTRACT. The role of heterogeneous confidence in the bounded Hegselmann-Krause (HK) confidence model has recently been extensively studied in a mixed population setting, revealing a complex phase diagram with a reentrant consensus phase for confidence values where the homogeneous model leads to fragmentation [1]. On the other hand, the homogeneous HK model in networks has also been studied [2] and a very recent work shows how networks promote consensus [3]. However, the results of HK dynamics combining confidence and neighborhood heterogeneities were still unclear. In this work, we present a detailed study of HK bounded confidence opinion dynamics with heterogeneous confidences drawn from a uniform distribution in different intervals $[\varepsilon_l, \varepsilon_u]$, on different network topologies, from fully connected to sparse networks and regular lattices. Exploration of the entire parameter space, as well as a thorough finite-size analysis in the region with very closed-minded agents, reveals rich, highly complex, and non-monotonic behaviors. More precisely, we highlight a phenomenon of phase coexistence induced by the combination of a topological effect previously described [3] and the agents' heterogeneous confidences, where both the size of the largest group of opinions and the actual value of the majority opinion must be taken into account. We show that consensus is lost when increasing the number of open-minded agents (i.e. increasing $\varepsilon_u$), this phenomenon happening faster in the phase associated with the majority opinion being moderate than in the phase associated with the majority opinion being extreme. 
We also uncover a dynamical mechanism in which a small fraction of agents holding an extreme opinion is able to completely overturn the majority opinion from one extreme to the other. References: [1] H. Schawe, L. Hernández, Scientific Reports volume 10, Article number: 8273 (2020) [2] S. Fortunato, International Journal of Modern Physics C16, 259 (2005) [3] H. Schawe, S. Fontaine, L. Hernández, Phys. Rev. Research 3, 023208 (2021) |
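For readers unfamiliar with the update rule, one synchronous step of HK bounded confidence dynamics with heterogeneous confidences on a network can be sketched as follows. This is a minimal sketch under stated assumptions: each agent averages over itself and those network neighbors whose opinion lies within its own confidence, and the function name and toy values are hypothetical.

```python
import numpy as np

def hk_step(opinions, eps, adjacency):
    """One synchronous Hegselmann-Krause update on a network.

    opinions:  array of shape (n,), current opinions.
    eps:       array of shape (n,), confidence of each agent (heterogeneous).
    adjacency: boolean array of shape (n, n), True where i and j are linked.
    Agent i averages the opinions of itself and of the neighbors j that
    satisfy |x_i - x_j| <= eps_i.
    """
    opinions = np.asarray(opinions, dtype=float)
    adjacency = np.asarray(adjacency, dtype=bool)
    n = len(opinions)
    new = np.empty(n)
    for i in range(n):
        within_confidence = np.abs(opinions - opinions[i]) <= eps[i]
        reachable = adjacency[i] | (np.arange(n) == i)   # neighbors plus self
        new[i] = opinions[within_confidence & reachable].mean()
    return new

# Toy example (hypothetical): three fully connected agents.
full = ~np.eye(3, dtype=bool)
x0 = np.array([0.0, 0.1, 1.0])
closed_minded = hk_step(x0, np.array([0.2, 0.2, 0.2]), full)
one_open_minded = hk_step(x0, np.array([0.2, 0.2, 1.0]), full)

# Confidences drawn uniformly from [eps_l, eps_u], as in the abstract.
rng = np.random.default_rng(1)
eps_l, eps_u = 0.05, 0.3
eps_random = rng.uniform(eps_l, eps_u, size=3)
```

In the second call only the third agent is open-minded, so it alone moves toward the others, which illustrates how heterogeneous confidences make the interaction asymmetric.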
15:30 | Self-induced consensus formation among Reddit users on the GameStop short squeeze PRESENTER: Anna Mancini ABSTRACT. The short squeeze of GameStop (GME) shares in mid-January 2021 was primarily orchestrated by retail investors of the Reddit r/wallstreetbets community. As such, it represents a paramount example of collective coordination action on social media, resulting in large-scale consensus formation and significant market impact. In this work we characterise the structure and time evolution of Reddit conversation data, showing that the occurrence and sentiment of GME-related comments (representing how much users are engaged with GME) increased significantly well before the short squeeze actually took place. Taking inspiration from these early warnings as well as evidence from previous literature, we introduce a model of opinion dynamics where user engagement can trigger a self-reinforcing mechanism leading to the emergence of consensus, which in this particular case is associated with the success of the short squeeze operation. Analytical solutions and model simulations on interaction networks of Reddit users feature a phase transition from heterogeneous to homogeneous opinions as engagement grows, which we qualitatively compare to the sudden hike of the GME stock price. Although the model cannot be validated with available data, it offers a possible and minimal interpretation for the increasingly important phenomenon of self-organized collective actions taking place on social networks. |
15:45 | Did the Russian invasion of Ukraine depolarize political discussions on Finnish social media? PRESENTER: Mikko Kivelä ABSTRACT. It is often thought, yet rarely observed, that an external threat increases the internal cohesion of a nation, and thus decreases polarization. We examine this proposition by analyzing political discussion dynamics on Finnish social media following the Russian invasion of Ukraine in February 2022. In Finland, public opinion on joining NATO had long been polarized along the left-right partisan axis, but the invasion led to a rapid convergence of the opinion, and eventually led the country to apply for NATO membership. We investigate how this depolarization took place on Finnish Twitter, where the politically active and partisan segment of the population is likely to be present. We collected Finnish tweets from Dec 30, 2021 to Mar 30, 2022 that contain any NATO-related keyword. We mainly examined four time periods: before (Feb 10 to Feb 23), right-after (Feb 24 to Mar 2), 1-week-after (Mar 3 to Mar 9), and 4-weeks-after (Mar 24 to Mar 30). For each period, we constructed a retweet network of users, where a directed link connects user A to user B if A retweeted B within the period. Using a graph partitioning algorithm, we find three separated user groups before the invasion: a pro-NATO, a left-wing anti-NATO, and a conspiracy-charged anti-NATO group. After the invasion, members of the left-wing anti-NATO group broke out of their retweeting bubble and connected with the pro-NATO group despite their difference in partisanship, while the conspiracy-charged anti-NATO group mostly remained a separate cluster. Our content analysis reveals that the left-wing anti-NATO group and the pro-NATO group were likely bridged by a shared condemnation of Russia's actions and shared democratic norms. 
Meanwhile, members of the conspiracy-charged anti-NATO group, who built arguments mainly upon conspiracy theories and disinformation, consistently demonstrated a clear anti-NATO attitude and retained strong within-group cohesion. In contrast to existing empirical evidence that suggests the resilience of partisanship-based polarization, our results show that polarization in partisanship-divided issues can be weakened overnight by an external threat. Instead, we find that polarization led by groups built around conspiracy theories and disinformation may persist even in the face of a strong external threat. |
16:00 | Changes of COVID-19 Vaccine Opinions on Twitter: A Combination of Dynamical Network Community Detection and Content-based Analysis PRESENTER: Qianyun Wu ABSTRACT. From the outbreak of COVID-19 in 2020 to the recent return to normal life, the vaccine topic has continuously sparked public discussion on social media. Many studies investigated the formation of pro-vaccine and anti-vaccine opinions from the perspective of network communities. They revealed that social media networks are clustered into a few communities with distinctly different ideologies (e.g. vaccine-related or political opinions). However, these studies have not revealed the dynamical change of communities and opinions. Additionally, most of them relied on manual sample checking to determine the ideology of each community, which is neither reliable nor scalable for evolving network analysis. We aim to combine a content-based approach and evolving network community detection to explore the dynamic changes in opinions on Japanese Twitter. To achieve this, we collect 45 million tweets and 80 million retweets by using the Twitter API and setting the search criteria to include the keyword "vaccine (in Japanese)" and a time range of January 2022 to June 2022. We first construct a retweet network and use the Louvain method to detect communities. Using data from the full period, we discover 6 main communities (Fig.1) with clear ideological polarization. We then analyze these communities over time by using sliding time windows of n weeks to track their evolution. Next, we develop an opinion classifier to better understand the ideology of these evolving communities. This classifier, which is trained using supervised learning with lexicon and network features, has a precision of 80% in identifying pro-vaccine and anti-vaccine content. 
Analyzing the dynamic vaccine communities and their opinion profiles provides better insight into how communities evolve and how they contribute to changes in collective opinions towards vaccines. |
16:15 | Structural polarization and hate speech in Twitter during electoral campaign: A local Spanish election as a case study PRESENTER: Emanuele Cozzo ABSTRACT. The presence of hate speech in our society is a danger to the coexistence between different social groups. This kind of speech can be particularly relevant in electoral campaigns or in periods of high political polarization when they can be promoted for partisan purposes [2]. In this work, we study the relationship between hate speech and polarization during the 2022 Andalusian regional elections in Spain. These elections were especially relevant since they were the first elections after the entrance of the far-right party VOX in the parliament. We use Twitter discussions during the electoral campaign and right after the election day. The data includes both two months before the elections and two months after. We then computed for each day the percentage of tweets that contained hate speech using a mBERT model [1]. And we also computed for each day the structural polarization of the interaction network between users, i.e., the modularity of the partition in two communities [3]. We find two change points [4] in the time series, one at the beginning of the Twitter discussion, and the other the day of the elections. After the election and at the beginning of the discussion, both the percentage of hate speech and the polarization are constant, although they have a high variance. Before the election day, we can observe that the polarization increases when the election day gets closer. The number of tweets with hate speech also increases, but the number of tweets without hate speech increases even more, resulting in a decreasing percentage of tweets with hate speech (Figure 1). On one hand, we see that political campaigns can increase polarization in the network of users interested in politics. That confirms the expected impact of electoral campaigns polarizing the political behaviours offline. 
But on the other hand, we see no increase in hate speech while polarization increases. The relevance of these results lies in their counter-intuitive nature. To understand the mechanisms that underlie the anti-correlation between hate speech and polarization, we check how the communities change every day and which actors are the most influential in each one. We also investigate which communities are the main originators of hate speech. With that, we gain further insight into why hate speech can decrease when polarization increases. References [1] Sai Saketh Aluru et al. “Deep learning models for multilingual hate speech detection”. In: arXiv:2004.06465 (2020). [2] Christian Ezeibe. “Hate speech and election violence in Nigeria”. In: Journal of Asian and African Studies 56.4 (2021), pp. 919–935. [3] Ali Salloum, Ted Hsuan Yun Chen, and Mikko Kivelä. “Separating polarization from noise: comparison and normalization of structural polarization measures”. In: Proceedings of the ACM on Human-Computer Interaction 6.CSCW1 (2022), pp. 1–33. [4] Charles Truong, Laurent Oudre, and Nicolas Vayatis. “Selective review of offline change point detection methods”. In: Signal Processing 167 (2020), p. 107299. |
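The structural polarization score mentioned above, the modularity of a partition into two communities, can be illustrated with the standard Newman modularity formula. The sketch assumes an undirected, unweighted interaction network and a partition that is already given; how the two communities are found is not specified in the abstract, and the function name and toy graph are hypothetical.

```python
import numpy as np

def modularity(adjacency, labels):
    """Newman modularity Q of a partition of an undirected graph.

    Q = (1 / 2m) * sum_ij (A_ij - k_i * k_j / 2m) * [c_i == c_j]
    """
    A = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    degrees = A.sum(axis=1)
    two_m = degrees.sum()                          # twice the number of edges
    same = labels[:, None] == labels[None, :]      # True where c_i == c_j
    expected = np.outer(degrees, degrees) / two_m  # null-model edge weight
    return float(((A - expected) * same).sum() / two_m)

# Toy network (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
A = np.zeros((6, 6))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # the two triangles
q_single = modularity(A, [0, 0, 0, 0, 0, 0])  # everyone in one group
```

Computing this value on each day's interaction network would give a daily polarization series of the kind that is then compared with the daily percentage of hate speech tweets.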
16:30 | Finding polarised communities and tracking information diffusion on Twitter: The Irish Abortion Referendum PRESENTER: Caroline Pena ABSTRACT. The analysis of social networks enables the understanding of social interactions, polarisation of ideas, and the spread of information and therefore plays an important role in society. We use Twitter data - as it is a popular venue for the expression of opinion and dissemination of information - to identify opposing sides of a debate and, importantly, to observe how information spreads between these groups in our current polarised climate. Cascade trees allow us to identify where the spread originated and to examine the structure it created [1]. This allows us to further model how information diffuses between polarised groups using mathematical models based on the popular Independent Cascade Model via a discrete-time branching process, which have proved fruitful in understanding cascade characteristics such as cascade size, expected tree depth and structural virality [2]. Previous research [3] examined the 2015 Irish Marriage Referendum and successfully used Twitter data to identify groups of users who were pro- and anti-same-sex marriage equality with a high degree of accuracy. Our research improves upon this work in two ways. 1) We show that we can obtain better classification accuracy of users into polarised communities on two independent datasets (Irish Marriage referendum of 2015 and Abortion referendum of 2018) while using substantially less data. 2) We extend the previous analysis by tracking not only how yes- and no-supporters of the referendum interact individually but also how the information they share spreads across the network, within and between communities, via the construction of retweet cascades. To achieve this, we collected over 688,000 Tweets from the Irish Abortion Referendum of 2018 to build a conversation network from users’ mentions with sentiment-based homophily. 
From this network, community detection methods allow us to isolate yes- or no-aligned supporters with high accuracy (97.6%) — Figure 1a shows that these yes and no-aligned groups are indeed stratified by their sentiment. We supplement this by tracking how information cascades (Figure 1b) spread via over 31,000 retweets, which are reconstructed from a user mention’s network in combination with their follower’s network. We found that very little information spread between polarised communities. This provides a valuable methodology for extracting and studying information diffusion on large networks by isolating ideologically polarised groups and exploring the propagation of information within and between these groups. (a) (b) Figure 1: C1 (Yes supporters) in blue, C2 (No supporters) in red. (a) Average sentiment-out for each community cluster per day. (b) Example of a retweet cascade. [1] S. Goel, A. Anderson, J. Hofman, and D. J. Watts. The structural virality of online diffusion. Management Science, 62(1):180–196, 2016. [2] Gleeson, James P., et al. Branching process descriptions of information cascades on Twitter. Journal of Complex Networks 8(6): cnab002, 2020. [3] D. J. O’Sullivan, G. Garduño-Hernández, J. P. Gleeson, and M. Beguerisse-Díaz. Integrating sentiment and social structure to determine preference alignments: the Irish marriage referendum. Royal Society open science, 4(7):170154, 2017. |
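One of the cascade characteristics named above, structural virality [1], is the average shortest-path distance between all pairs of nodes in a cascade tree. A minimal sketch of that computation follows; the edge-list input format, the function name, and the two toy cascades are assumptions for illustration.

```python
from collections import deque

def structural_virality(edges):
    """Average distance between all pairs of nodes in a cascade tree.

    edges: iterable of (parent, child) pairs describing a connected tree.
    Distances are computed on the undirected tree by breadth-first search.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    n = len(neighbors)
    if n < 2:
        raise ValueError("a cascade needs at least two nodes")
    total = 0
    for source in neighbors:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != n:
            raise ValueError("the cascade must be connected")
        total += sum(dist.values())
    return total / (n * (n - 1))    # mean over ordered pairs of distinct nodes

# Toy cascades (hypothetical): a broadcast and a chain of retweets.
broadcast = structural_virality([("root", "a"), ("root", "b"), ("root", "c")])
chain = structural_virality([("root", "a"), ("a", "b"), ("b", "c")])
```

A broadcast-like cascade, where everyone retweets the root, scores lower than a chain of the same size, where the content passes from user to user.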
15:30 | Ensemble inference of unobserved infections in networks using partial observations PRESENTER: Sen Pei ABSTRACT. Undetected infections fuel the dissemination of many infectious agents. However, identifying unobserved infectious individuals remains challenging due to limited observations of infections and imperfect knowledge of key transmission parameters. Here, we use an ensemble Bayesian inference method to infer unobserved infections using partial observations. The ensemble inference method can represent uncertainty in model parameters and update model states using all ensemble members collectively. We perform extensive experiments in model-generated and real-world networks in which individuals have differential but unknown transmission rates. The ensemble method outperforms several alternative approaches for a variety of network structures and observation rates, even though the model is misspecified. The inference method can potentially support decision-making under uncertainty and be adapted for use in other dynamical models in networks. |
15:45 | Towards inferring network properties from epidemic data PRESENTER: Istvan Kiss ABSTRACT. Epidemic propagation on networks represents an important departure from traditional mass-action models. However, the high-dimensionality of the exact models poses a challenge to both mathematical analysis and parameter inference. By using mean-field models, such as the pairwise model (PWM), the complexity becomes tractable. While such models have been used extensively for model analysis, there is limited work in the context of statistical inference. In this paper, we explore the extent to which the PWM with the susceptible-infected-recovered (SIR) epidemic can be used to infer disease- and network-related parameters. The widely-used MLE approach exhibits several issues pertaining to parameter unidentifiability and a lack of robustness to exact knowledge about key quantities such as population size and/or proportion of under reporting. As an alternative, we considered the recently developed dynamical survival analysis (DSA). For scenarios in which there is no model mismatch, such as when data are generated via simulations, both methods perform well despite strong dependence between parameters. However, for real-world data, such as foot-and-mouth, H1N1 and COVID19, the DSA method appears more robust to potential model mismatch and the parameter estimates appear more epidemiologically plausible. Taken together, however, our findings suggest that network-based mean-field models can be used to formulate approximate likelihoods which, coupled with an efficient inference scheme, make it possible to not only learn about the parameters of the disease dynamics but also that of the underlying network. |
16:00 | Population-wide network model of multimorbidity reveals trends in incidence and prevalence of 132 chronic disease patterns PRESENTER: Katharina Ledebur ABSTRACT. In Austria, life expectancy increased by 3.8% for women and 6.0% for men from 1999-2019. However, life expectancy in (very) good health decreased from 65.9 to 63.1 years for women and 66.6 to 64.7 years for men from 2014-2019. Austrians have been found to live longer but less healthy compared to populations of other countries in Europe. Because chronic diseases, a major healthcare burden, are strongly related to ageing, it is feasible to expect an increase in incidence of numerous chronic diseases. Moreover, incidence of multimorbidity, the co-occurrence of multiple chronic diseases, is also expected to rise. Research on multimorbidity, despite its pressing development, is poorly reported and diverges regarding study population, diseases included and timescale. Here, we develop a network-based model to study the development of chronic diseases in the Austrian population. Based on an extensive dataset covering all hospital diagnoses in Austria between 1997 and 2014, a comorbidity network describing patterns of co-occurring diseases (ICD-10 codes covered range from A00 to N99) was developed by Haug et al. (2020). We model, using the same dataset, the Austrian population and simulate from 2003 to 2030 the demographic changes (migration, birth and death) as well as individuals acquiring new diagnoses based on the comorbidity network. The standard scenario simulation follows the main scenarios on birth, migration and death forecasts of Statistics Austria. To demonstrate the usefulness of this model we also simulate two Covid Shock scenarios (one pessimistic, one optimistic) to investigate the impact of sequelae of a SARS-CoV-2 infection on the disease burden of the Austrian population. 
In total, we find 132 distinct multimorbidity patterns in the population and forecast year-on-year changes in the prevalence of each of these patterns until 2030. We reveal highly heterogeneous trends across these patterns. Our approach can therefore be used for the early identification of rising multimorbidity patterns in the population, for a more precise characterization of their corresponding at-risk populations, providing the basis for more targeted and personalized prevention efforts. |
16:15 | Estimate of COVID-19 school transmission during weekly screening in the Auvergne-Rhône-Alpes region, France, week 47 to 50, 2021 PRESENTER: Elisabetta Colosi ABSTRACT. Reactive screening strategies were applied nationwide in France in late 2021 in response to the Fall Delta wave. In the same period (week 47 to 50, 2021), an experimental weekly screening protocol was proposed in 24 primary schools in the Auvergne-Rhône-Alpes region to identify and isolate cases early and avoid further transmissions. Here, we estimate the impact of the experimental protocol in terms of school transmission and reduction of cases, compared to the national school protocol. We extended an agent-based model for SARS-CoV-2 transmission over a temporal contact network composed of teachers and primary school students (Colosi et al, The Lancet Infectious Diseases, 2022). We parametrized the model to reproduce the Delta variant dominant in the study period, accounting for introductions from community surveillance data. We then fitted the model to the observed prevalence in the 17 schools selected for the analysis (Figure 1A). We estimated a relative contribution of school transmission compared to the introduction of 67% (IQR 53-78) in the Rhône department and 67% (IQR 50-82) in the Savoie department under the study period (Figure 1B). The school transmission reduction achieved by the experimental protocol was estimated to be 41% (IQR 21-55) over the four weeks of experimentation compared to the reactive strategy (Figure 1C). Through field estimates, these findings confirm previous model predictions anticipating the efficacy of systematically screening the school population to reduce the number of infections generated at school through early case detection and isolation. They also provide key information to improve the design and implementation of future prevention and control strategies at school in case of a new resurgence. |
16:30 | Probability generating functions for epidemics on metapopulation networks PRESENTER: Guillaume St-Onge ABSTRACT. Models of contagion on metapopulation networks are effective at assessing the cryptic transmission phase at the beginning of an outbreak when data are scarce and the epidemic is driven by long-range mobility patterns [1]. However, metapopulation frameworks either describe the average state of the system [2] or rely on costly large-scale simulations [1,3], which take time and resources to deploy during emergent disease outbreaks. Here, we provide a flexible and computationally efficient alternative for describing the early phase of an emerging outbreak by leveraging probability generating functions (PGFs). We map the system of mobile agents (and potential carriers of diseases) to a multitype branching process (Fig. 1A), from which we can probe full probability distributions. To give a sense of scale, we get similar results to stochastic simulations in a matter of seconds, as opposed to hours of computation on supercomputers. We are able to estimate important quantities, such as the current (and future) state of an outbreak (Fig. 1B), and to perform exact Bayesian inference, as in Fig. 1C, where we show a posterior distribution on the basic reproduction number and the time of onset of an epidemic. Other important applications include inferring the location of the source of an outbreak, defining more constrained prior distributions for large-scale simulations, and performing an early assessment of changes in the mobility of individuals. Altogether, our approach provides timely situational awareness when the spreading of concerning diseases is detected, and we plan to leverage its computational advantage to help design more robust global surveillance systems. [1] J. T. Davis et al., “Cryptic transmission of SARS-CoV-2 and the first COVID-19 wave,” Nature, vol. 600, pp. 127–132, 2021. [2] A. Arenas et al., “Modeling the spatiotemporal epidemic spreading of COVID-19 and the impact of mobility and social distancing interventions,” Phys. Rev. X, vol. 10, p. 041055, 2020. [3] M. Chinazzi et al., “The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak,” Science, vol. 368, pp. 395–400, 2020. |
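As a rough illustration of why PGFs give full probability distributions cheaply, the sketch below treats a much simpler case than the multitype process of the abstract: a single-type branching process with Poisson-distributed secondary cases. This is not the authors' model. The Poisson offspring assumption and the function names (`poisson_pgf`, `generation_size_distribution`, `extinction_probability`) are hypothetical choices made only for this example. The PGF of generation n is the n-fold composition of the offspring PGF; evaluating it on the unit circle and applying an FFT recovers the probabilities.

```python
import numpy as np


def poisson_pgf(z, mean):
    """PGF of a Poisson offspring distribution: G(z) = exp(mean * (z - 1))."""
    return np.exp(mean * (z - 1.0))


def generation_size_distribution(mean, generations, n_points=256):
    """Distribution of the number of cases in a given generation, from one seed.

    The PGF of generation n is G composed with itself n times. We evaluate it
    at the n_points-th roots of unity and invert with an FFT. n_points must be
    large enough that the probability mass beyond it is negligible (assumption).
    """
    z = np.exp(2j * np.pi * np.arange(n_points) / n_points)
    g = z
    for _ in range(generations):
        g = poisson_pgf(g, mean)
    probs = np.fft.fft(g).real / n_points
    # Remove tiny negative values caused by floating-point round-off.
    return np.clip(probs, 0.0, None)


def extinction_probability(mean, tol=1e-12, max_iter=10_000):
    """Smallest fixed point q = G(q), found by iterating from q = 0."""
    q = 0.0
    for _ in range(max_iter):
        q_next = poisson_pgf(q, mean)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


if __name__ == "__main__":
    probs = generation_size_distribution(mean=2.0, generations=3)
    print("P(no cases in generation 3):", probs[0])
    print("Probability the outbreak dies out:", extinction_probability(2.0))
```

The whole distribution comes from a handful of vectorized function evaluations and one FFT, with no stochastic simulation, which is the kind of computational saving the abstract describes. A multitype version would track one PGF variable per agent type or location.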
Natasa Przulj (Barcelona Supercomputing Center, Spain): Omics Data Fusion for Understanding Molecular Complexity Enabling Precision Medicine