
11:00-12:00 Session 4: Poster I
Preference for Number of Friends in Online Social Networks

ABSTRACT. Preferences or dislikes for specific numbers are ubiquitous in human society. In traditional Chinese culture, people show a special preference for certain numbers, such as 6, 8, 10, 100, and 200. By analyzing data on 6.8 million users of Sina Weibo, one of the largest online social media platforms in China, we discover that users exhibit a distinct preference for the number 200, i.e., a significant fraction of users choose to follow exactly 200 friends. This number, which is very close to the Dunbar number predicting the cognitive limit on the number of stable social relationships, motivates us to investigate how the preference for numbers in traditional Chinese culture is reflected on social media. We systematically portray users who prefer 200 friends and analyze several of their important social features, including activity, popularity, attention tendency, regional distribution, economic level, and education level. We find that the activity and popularity of users with the preference for the number 200 are relatively lower than those of other users. They are more inclined to follow popular users, and their social portraits change relatively slowly. Besides, users with a stronger preference for the number 200 are more likely to be located in regions with less developed economies and education. This indicates that users with the preference for the number 200 are likely to be vulnerable groups in society and are easily affected by opinion leaders.
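The reported spike at 200 can be illustrated with a toy detector. The sketch below uses synthetic data, not the Weibo dataset: it compares the frequency of an exact friend count against the average frequency of its neighbouring values.

```python
import numpy as np

def preference_spike(counts, target, window=5):
    """Ratio of the frequency of `target` to the mean frequency of the
    surrounding values within +/-window; values well above 1 indicate a
    preference spike at `target`."""
    counts = np.asarray(counts)
    hit = np.mean(counts == target)
    neighbours = [target + d for d in range(-window, window + 1) if d != 0]
    base = np.mean([np.mean(counts == n) for n in neighbours])
    return hit / base

# Synthetic friend counts: a uniform background plus an excess at exactly 200
rng = np.random.default_rng(0)
data = np.concatenate([rng.integers(150, 250, size=9000), np.full(1000, 200)])
print(preference_spike(data, 200))  # well above 1
```

A count with no special status (say 160) scores close to 1 under the same measure, which is what makes the spike at 200 stand out.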


ABSTRACT. As the number of Weibo users continues to grow, Weibo has become an important source of online public opinion, and introducing supernetworks to identify the key elements of Weibo public opinion has practical value for the analysis and monitoring of online opinion. Based on a hypergraph-based supernetwork, this study constructs a supernetwork analysis model for Weibo public opinion and applies LDA, SnowNLP, and Python-based simulation analysis to identify the key public-opinion elements on Weibo and to analyze and discuss their features and sentiment. Finally, applying the model to a real public-opinion topic, six classes of key elements are identified: active users, spreading users, hot posts, potentially hot posts, hot topics, and central topics, and the sentiment orientation of each is analyzed. The results show that the proposed supernetwork model can effectively identify the key public-opinion elements in a specific opinion environment, enabling public-opinion analysis and monitoring of trending online events.

A Link-Based Attacker-Defender Game Model of Combat Networks

ABSTRACT. Links are an essential part of combat networks: they represent the relations between two nodes, such as information transmission, transfer, and transport. The offensive and defensive game over combat network links is of great significance for identifying the enemy’s vulnerable segments and improving the efficiency of our planning and decision-making tasks [1, 2]. From the perspective of network science, this paper proposes a link-based attacker-defender game model of combat networks to provide link protection strategies and analyze the attacker’s actions. First, the combat network model is built considering cooperation between equipment based on the kill web. Combat equipment can be divided into scout, decision, assault, and communication equipment according to function. Then, a link-based attacker-defender game model of the kill web is established to accurately describe both the attacker’s and the defender’s game activities. In our model, if the attacker attacks an edge and the defender does not protect it, the edge fails; if the attacker attacks an edge and the defender defends it, the edge continues to provide its efficiency. We take the capability value of the network as the payoff function in order to reflect the influence of a single connection on the whole, and we introduce a solution method that can find at least one solution in limited time. Finally, an example of the presented game model is conducted to verify the validity and feasibility of our proposed model. The Nash equilibrium under two typical strategies is calculated, and the influence of different numbers of edges on the Nash equilibrium is analyzed.
The results reveal that the defender chooses a deliberate defense strategy and the attacker’s choice is influenced by cost; the Nash equilibrium over the entire strategy space is calculated, which shows that the defender tends to defend the more essential links with higher probability.

Motif structure for the four subgroups within the suprachiasmatic nuclei affects its entrainment ability

ABSTRACT. Circadian rhythms of physiological and behavioral activities are regulated by a central clock. This clock is located in the bilaterally symmetrical suprachiasmatic nucleus (SCN) of mammals. Each nucleus contains a light-sensitive group of neurons, named the ventrolateral (VL) part, with the rest of the neurons being insensitive to light, named the dorsomedial (DM) group. While the coupling between the VL and DM subgroups has been investigated quite well, the communication among the four subgroups across the nuclei has not received much attention. In this article, we theoretically analyze seven motif-like connection patterns to investigate the network of the two nuclei of the SCN as a whole in relation to the function of the SCN. We investigate the entrainment ability of the SCN and find that the entrainment range is larger in the motifs containing a link between the two VL parts across the nuclei, but smaller in the motifs containing a link between the two DM parts across the nuclei. The SCN may strengthen or weaken connections between the left and right nucleus to accommodate changes in external conditions, such as resynchronization after jet lag, adjustment to the photoperiod, or aging of the SCN.

Network Modeling and Effectiveness Evaluation of Island Air Defense Electronic Countermeasure Equipment System

ABSTRACT. A networked weapon equipment system is a complex system composed of a large number of decentrally configured weapons and their coupled interconnections. Considering the complex electromagnetic environment and poor mobility of island defense, this paper builds a networked system of island air defense electronic countermeasure equipment based on Mosaic warfare and evaluates it using complex network theory. Given the operational background, the characteristics of decision-centric warfare represented by Mosaic warfare, and the concept of the OODA loop, we first model the networked weapon equipment system based on DoDAF 2.0 and apply SysML to visualize the viewpoints. Second, combining complex network analysis methodology, the natural connectivity index is used to evaluate the structural destruction resistance of the networked weapon equipment system, and the maximum size of the connected component after cascading failure is used to evaluate its robustness. We then run simulation experiments on the destruction resistance and robustness of a center-based operational network and a Mosaic-based operational network, respectively. It is found that the destruction resistance and robustness of the Mosaic-based operational network are better, which provides references for the construction of weapon systems of systems and the development of the technology.

Frequency-Amplitude Correlation Inducing Explosive Phase Transition in Coupled Oscillators

ABSTRACT. Explosive phase transition, a typical phenomenon with discontinuity and irreversibility in systems of coupled oscillators, has attracted wide attention. In previous works, such phenomena occur steadily under a positive correlation between oscillators’ natural frequencies and their external coupling, e.g., the frequency-degree correlated model and the frequency-weighted model. However, previous model settings have ignored the role of intrinsic amplitude, especially for oscillator models containing amplitude information. Here, we propose a new correlation mechanism between intrinsic properties in phase-amplitude oscillators. Specifically, a surprising explosive phenomenon emerges when each oscillator satisfies a complementary relationship between intrinsic amplitude and natural frequency. The highlight of our design is that the explosive phase transition can occur independently of any correlation with coupling, or of any particular interaction between coupled oscillators. Moreover, it occurs stably for both symmetric and asymmetric frequency distributions. Therefore, we provide a new perspective for understanding explosive phase transitions through the correlation of intrinsic properties rather than the heterogeneity of external coupling.

Evolution of cooperation on Reinforcement-Learning Driven Adaptive Network

ABSTRACT. From bacterial systems and human social networks to electric power grids and the Internet, networks are everywhere in the real world. Remarkably, real-world networks form spontaneously through the actions of individual agents, and the degree distributions of almost all of them obey a power law. Inspired by this phenomenon, we construct an evolutionary game model in which each agent plays the prisoner’s dilemma game (PDG) with all its neighbors and uses reinforcement learning to decide whether to change neighbors so as to escape a bad environment. We find several interesting results. The reinforcement-learning-driven adaptive network substantially promotes cooperation compared to the traditional prisoner’s dilemma game on a homogeneous network. Meanwhile, the network topology evolves over time from a homogeneous to a heterogeneous state, because a player gains experience from past games and becomes better at deciding whether to play the PDG with its current neighbors or to break the edge to its poorest neighbor and reconnect to a high-payoff second-order neighbor in search of a better environment. We also calculate the degree distribution and modularity of the adaptive network in the steady state and find that it obeys a power law and has an obvious community structure, indicating that the adaptive network resembles real-world networks. In 1999, Barabasi proposed a scale-free network generation algorithm based on network growth and preferential attachment. Our work not only reports a new phenomenon in evolutionary game theory on networks but also proposes a new way to generate scale-free networks: by evolution of a homogeneous network rather than by growth and preferential attachment.

Measuring and utilizing temporal network dissimilarity

ABSTRACT. Quantifying the structural and functional differences of temporal networks is a fundamental and challenging problem in the era of big data. This work proposes a temporal dissimilarity measure for temporal network comparison based on the fastest-arrival distance distribution and a spectral-entropy-based Jensen-Shannon divergence. Experimental results on both synthetic and empirical temporal networks show that the proposed measure can discriminate diverse temporal networks with different structures by capturing various topological and temporal properties. Moreover, the proposed measure can discern functional distinctions and finds effective applications in temporal network classification and spreadability discrimination.
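The divergence ingredient of such a measure can be sketched as follows. This is a generic Jensen-Shannon divergence between two fastest-arrival-distance histograms, not the authors' full pipeline, and the `d1`/`d2` histograms are made up for illustration.

```python
import numpy as np

def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base-2 logs, so bounded by 1) between
    two discrete distributions given as histograms."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0                     # 0 * log(0) contributes nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical fastest-arrival distance histograms of two temporal networks
d1 = [0.5, 0.3, 0.15, 0.05]   # short arrival distances dominate
d2 = [0.1, 0.2, 0.3, 0.4]     # long arrival distances dominate
print(jensen_shannon(d1, d2))  # 0 = identical, 1 = maximally different
```

Identical histograms give exactly 0, which is the property that makes the divergence usable as a dissimilarity measure between networks.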

Studying reaction systems from the perspective of complex networks

ABSTRACT. Please check the uploaded file. Thanks.

Detecting time-varying delays from time series in complex dynamical networks

ABSTRACT. Time-varying delays are common phenomena in complex dynamical networks; they can interfere with the normal operation and evolution of the network and affect the evaluation of network performance. However, few studies have successfully detected time-varying delays. In this paper, a compressive sensing method based on the inverse scheme of network reconstruction is established to detect time-varying delays in complex dynamical networks. The time delay at a given time point can be identified by searching for a suitable time series among all the obtained time series. Compared with other methods, the main advantages of our method lie in three aspects. First, it is not necessary to know the expression of the time-varying delay in advance, since compressive sensing is a completely data-driven method. Second, there is no special limitation on the type of the time-varying delay, whether it is linear or nonlinear. Third, after a large number of experiments, regularities of the search step are found that optimize our method and make it more efficient in some typical chaotic systems. Numerical simulations are given to demonstrate the accuracy of our method.

The Controllability of Complex Networks Based on Connectivity

ABSTRACT. The study of network controllability has attracted more and more attention in both the automatic control and complex network communities. In particular, research on structural control theory has shown great power in analyzing the controllability of large-scale complex systems. Using structural control theory, one can study the controllability of complex networks from their graphs without knowing the exact weight of each link. Although structural control theory reveals the relationship between network structure and network controllability, further exploration is still required to deal with the complexity of actual network structures. In network topology research, connectivity is one of the most important features. Researchers have found that giant connected components are essential for the stability of a network. Furthermore, the properties of networks depend strongly on the connectivity between the nodes that enables them to interact cooperatively as one network, for example in airline routes, electric power grids, and the Internet. Therefore, this paper focuses on the study of network controllability based on connectivity. To reveal the relationship between network connectivity and controllability, a new framework is created to analyze network controllability. Here, a type of condensation graph is proposed to simplify the controllability analysis from both graph and computational perspectives. The results also demonstrate that the analytical framework has certain advantages in dealing with dynamic changes of the network structure.

Estimating the power-law exponent

ABSTRACT. The power-law distribution has a wide range of practical applications, such as in network science and engineering. Its statistical theory focuses on the network degree distribution. Based on empirical data and theoretical research, references [1] and [2] pointed out that network degree distributions obey a power law. There is an extensive literature on how to estimate the exponent of a power-law distribution, giving point estimates of the exponent. This paper gives interval estimates of the exponent; with an interval estimate, the error of the point estimate can be quantified. The presence of the Riemann ζ function makes statistical inference for the discrete power-law distribution more complex than for the continuous one. Based on the Euler product formula, this paper gives a summation formula for the derivative of the logarithm of the Riemann ζ function and derives an approximate calculation method for the MLE. For commonly used power-law distributions, the exponent lies between 2 and 3, so the variance does not exist; this makes it more difficult to give an interval estimate of the exponent. Using a compression transformation, this paper gives an interval estimation method based on the central limit theorem; the method is also suitable for interval estimation of the parameters of many other heavy-tailed distributions. In addition, the paper gives an interval estimation method based on the acceptance region of the likelihood ratio test. These methods are illustrated with applications to the empirical data obtained by Lotka [7] on the relationship between the frequency of authors of scientific papers and the number of papers published.

A Binary Evolutionary Competition

ABSTRACT. Competition among organizations and individuals is everywhere. The Matthew effect, commonly summarized by the proverb "the rich get richer and the poor get poorer", has a significant impact on competition for limited resources. To investigate the Matthew effect on competition, we propose a simple model that mimics a competition process in which two groups compete for supporters. In our model, we find that although both sizes increase, the ratio of the expected larger and smaller sizes and the expected ratio of the larger and smaller sizes both converge, at different speeds and to different values. In addition, both are quite sensitive to the initial sizes. The model can easily be applied and extended to various fields.
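A minimal stand-in for such dynamics is a Polya-urn-style recruitment process (our own illustrative choice, not necessarily the paper's exact model):

```python
import random

def compete(a0, b0, steps, rng):
    """Two groups recruit supporters one at a time; each new supporter
    joins a group with probability proportional to its current size, so
    an early lead tends to be self-reinforcing (the Matthew effect)."""
    a, b = a0, b0
    for _ in range(steps):
        if rng.random() < a / (a + b):
            a += 1
        else:
            b += 1
    return a, b

# Sensitivity to initial sizes: a 2:1 head start keeps, on average,
# about two thirds of all supporters in the long run.
print(compete(2, 1, 10000, random.Random(0)))
```

In the urn version the final share of the leading group is random but its expectation is pinned by the initial sizes, which mirrors the abstract's observation that the limits are sensitive to the starting configuration.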

Fuzzy Community Detection Based on Elite Symbiotic Organisms Search and Node Neighborhood Information

ABSTRACT. Recently, fuzzy community detection has received increasing attention, since it can not only uncover the community structure of a network but also reflect the membership degrees of each node to multiple communities. Although pioneers have proposed a few algorithms for finding fuzzy communities, there is still room for improvement in the quality of the detected fuzzy communities. In this study, a metaheuristic-based modularity optimization algorithm named Symbiotic Organisms Search Fuzzy Community Detection (SOSFCD) is proposed. On the one hand, an improved bio-inspired metaheuristic algorithm, Elite Symbiotic Organisms Search (Elite-SOS), is designed as the optimization strategy to improve the global convergence of fuzzy modularity optimization. On the other hand, a Neighbor-based Membership Modification (NMM) operation is proposed to intensify the exploitation ability and speed up convergence by efficiently utilizing local information (i.e., node neighborhoods) of the network topology. Experimental results on both synthetic and real-life networks with different scales and characteristics show that SOSFCD can find maximum-modularity fuzzy partitions and coverings, and outperforms many state-of-the-art algorithms in terms of accuracy and stability.

Identification of differentially abundant cell-type in scRNA-seq data using node-attributed community detection approach

ABSTRACT. Differential abundance (DA) analysis of cell-type composition addresses the question of how a predefined cell-type population changes with experimental conditions across an overall scRNA-seq dataset. Such analysis helps unravel the cell types that are enriched or depleted upon drug treatment, with aging, or with disease progression. Graph network approaches facilitate the analysis of complex biological data, and community detection is one of the most important and critical tasks in the network field. The classical approach decomposes a network, based on graph topology, into sub-networks or communities of highly connected nodes. For a node-attributed graph, however, the classical approach does not explicitly guarantee attribute homogeneity within the derived communities. To address this, researchers have proposed various optimization methods that improve attribute homogeneity in the detected communities along with graph topology. To the best of our knowledge, no DA method uses predefined cell-type annotation as an extra layer of information to determine cell subpopulations. This caveat drives us to develop a new DA method using a node-attributed (cell-type annotation) community detection approach. First, we apply our code to real-world datasets with node attributes to check its performance. The results suggest that, in these contexts, the node-attributed community detection approach works better than the conventional topological clustering approach. Next, we perform DA analysis of M1/M2 cell-type polarization in a lung alveolar macrophage dataset with mouse aging.

Link prediction for long-circle-like networks

ABSTRACT. Link prediction is the problem of predicting the uncertain relationship between a pair of nodes from observed structural information of a network. Link prediction algorithms are useful in gaining insight into different network structures from partial observation of exemplars. Existing local and quasi-local link prediction algorithms with low computational complexity focus on regular complex networks with sufficiently many closed triangular motifs or on tree-like networks dominated by open triangular motifs. However, the three-node motif cannot describe the local structural features of all networks, and we find that the main structure of many networks is a long line or closed circle that cannot be predicted well by traditional link prediction algorithms. Meanwhile, some global link prediction algorithms are effective but carry high computational complexity. In this paper, we propose a local method based on the natural characteristics of a long line, in contrast to the preferential attachment principle. Next, we test our algorithm on two emblematic long-circle-like networks: a metropolitan water distribution network and a sexual contact network. We find that our method is effective and performs much better than many traditional local and global algorithms. We adopt a community detection method to improve the accuracy of our algorithm, which shows that long-circle-like networks also have clear community structure. We further suggest that these structural features are key to the link prediction problem. Finally, we propose a long-line network model to show that our core idea is of universal significance.
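The "in contrast to preferential attachment" idea can be illustrated with a toy local score (our own hypothetical stand-in, not the authors' method): rank unlinked distance-2 pairs by the inverse of their degree product, so that low-degree chain ends score highest.

```python
from itertools import combinations

def anti_preferential_scores(adj):
    """Score each unlinked node pair at distance 2 by 1/(deg(u)*deg(v)):
    the opposite of preferential attachment, favouring low-degree
    endpoints such as the ends of a long line."""
    deg = {u: len(vs) for u, vs in adj.items()}
    scores = {}
    for u, v in combinations(adj, 2):
        if v not in adj[u] and adj[u] & adj[v]:   # unlinked, distance 2
            scores[(u, v)] = 1.0 / (deg[u] * deg[v])
    return scores

# A 5-node path 0-1-2-3-4: pairs touching a chain end outrank the middle pair
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(anti_preferential_scores(path))
```

On the path, the candidate links (0,2) and (2,4), each involving a degree-1 chain end, outrank the interior pair (1,3), which is the opposite ordering to what a preferential-attachment score would give.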

Effect of resource constraints on knowledge diffusion in knowledge collaborative networks

ABSTRACT. The process of knowledge diffusion requires the input of resources. We assume that no external resources enter the networks and that resources within the networks are provided by unacquired and acquired users. Until now, most existing studies have ignored the internal relationship between resource constraints and knowledge diffusion. Therefore, we introduce a budget-constrained Susceptible-Infected-Recovered (bSIR) model. We perform numerical simulations on a school friendship network with the characteristics of a realistic social network. The expected results of the experiment are that explosive transitions in diffusion levels occur above a critical cost when the diffusion budget is generated by unacquired and acquired users. We will also emphasize that the behavior is very general and that our key results obtained using the Heaviside step function hold for any budget function that satisfies the general conditions defined above.

Diffusion localization transition on Japanese business network

ABSTRACT. The gravity interaction model [1] (see Eq. 1 below) has been introduced as a mathematical model of the gravity law of annual money transport between firms [2]. In this model, the money flow between customer and supplier firms depends on the nonlinearity parameter γ of the gravity law, which controls the dependency on the supplier's sales; by changing the value of γ the model exhibits various behaviors of the stationary solution. Intuitively speaking, a larger firm attracts more money flow for larger values of γ. The model shows a bifurcation phenomenon called the diffusion-localization transition: the diffusion-dominant solution for small γ becomes unstable, and a localized solution is realized for γ > γ_c. The transition point is numerically estimated to be about 0.9 [1]; however, detailed properties of this transition are yet to be clarified.

$\frac{dx_i}{dt} = \sum_k \frac{A_{ki} x_i^\gamma}{\sum_j A_{kj} x_j^\gamma} x_k - (1+\nu) x_i + 1$ (1)

We recently found rigorously that regular ring lattices in which each vertex is connected to almost 90 percent of all vertices realize the same transition point 0.9 [3]. However, this network topology is very different from the Japanese business network. In this study, we find that the Japanese business network shows a complicated bifurcation such that the money flow gradually concentrates on particular transaction links even in the diffusion phase as γ increases, a novel property not observed in the regular ring lattice. By intensive numerical analysis we find a unique transition where the ordering of sales changes drastically: some firms' sales increase about 1,000-fold while others' decrease sharply to about 1/500 of the previous level. We will report more precise properties of this strange transition.
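Eq. (1) can be integrated numerically with a simple explicit-Euler sketch; the network and parameter values below are illustrative, not those of the Japanese business network study.

```python
import numpy as np

def gravity_step(x, A, gamma, nu, dt=0.01):
    """One explicit-Euler step of Eq. (1).  W[k, i] is the share of node
    k's outflow x_k routed to node i: A_ki x_i^gamma / sum_j A_kj x_j^gamma."""
    xg = x ** gamma
    denom = A @ xg                        # denom_k = sum_j A_kj x_j^gamma
    W = A * xg[None, :] / denom[:, None]  # rows sum to 1
    return x + dt * (W.T @ x - (1 + nu) * x + 1)

# Small gamma on a symmetric ring, uniform start: the diffusion-dominant
# state, with each x_i relaxing to 1/nu and total sales to n/nu.
n, gamma, nu = 4, 0.5, 0.5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[i, (i - 1) % n] = 1.0
x = np.ones(n)
for _ in range(5000):
    x = gravity_step(x, A, gamma, nu)
print(x)  # approaches the uniform solution x_i = 2
```

Because each row of W sums to one, the transport term conserves total money, so the aggregate obeys dS/dt = -νS + n and the stationary total n/ν holds in both the diffusion and localized phases.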

Wired and Wireless Failure Cross-domain Root Cause Analysis to Operate Telecommunication Network

ABSTRACT. Communication infrastructure has a representative multi-layer network structure. In a telecommunication network, each layer is broadly divided into areas such as physical fiber, transport, IP, and wireless, and the network consists of the connections between devices in each area. Before the 5G network, the operation of telecommunication networks concentrated on each layer separately. However, with the advent of the 5G era, in which the network core is distributed to the edge, telecommunication services based on multi-layer interaction have become common, and multi-layer integrated operation has become inevitable. KT therefore created a multi-layer telecommunication fault operation solution using Cross-domain Root Cause Analysis (Cross-domain RCA). In the preprocessor of Cross-domain RCA, we first designed a topology DB that manages multiple domains; in this topology DB, commonalities between domains are found through a 'trans-line', and fault alarms are clustered according to trans-line and time correlation. Second, Cross-domain RCA analyzes the clustered alarms using a rule-based CEP engine so that the root cause of a fault alarm can be found. In addition, the postprocessor of this solution merges the RCA results by causality. The solution supports not only the Cross-domain RCA engine but also a ticket-type root failure display and an office floor plan view. As a result, it reduced the time for operators to recognize root causes of failures and take countermeasures. The Cross-domain Root Cause Analysis solution, applied to the domestic KT 5G network in Busan in 2021, showed the effect of reducing operating time in case of failure by 30%.

Peeking strategy for Online News Diffusion Prediction via Machine Learning

ABSTRACT. For computational social scientists, cascade size prediction and fake news detection are two primary problems in news diffusion and computational mass communication research. Previous studies predict news diffusion by peeking at social process (temporal structure) data in the initial stage, a technique we summarize as the Peeking strategy. However, using the Peeking strategy to detect false news has remained a blind spot, and the advantages and limitations of the strategy have not been thoroughly investigated. To predict cascade size and detect false news, we adopt the Peeking strategy with well-known machine learning algorithms. Our results show that the Peeking strategy can effectively improve the accuracy of cascade size prediction; moreover, we can peek at a smaller time window and still achieve high performance in predicting cascade size compared with previous methods. Nevertheless, we find that the Peeking strategy with network structures fails to significantly improve the performance of false news detection. Finally, we argue that cascade structure properties can aid in the prediction of cascade size, but not in false news detection.

12:00-13:30 Session 5A: Structure II
Robustness on modular interacting networks

ABSTRACT. With the spread of IoT applications, the acceleration of the smart city process, and the comprehensive popularization of the mobile Internet, the amount of data related to people's livelihood is increasing by the day. These massive data exhibit characteristics such as multi-source heterogeneity, wide distribution, and dynamic growth, and indicate from different perspectives that each system sits within a large complex network system of interconnection, dynamic evolution, and coupling dependence. This lecture proposes a class of mathematical framework models to study the destructibility of complex networks with association structure from the standpoint of structural robustness. It is found that changing the ratio of interdependent edges and interdependent nodes can make the continuous phase transition that occurs in a single network disappear and stabilize the system. Furthermore, it is shown that interdependent edges act like the external field in the ferromagnetic phase transition. With the help of this external-field effect, two types of critical exponents at the critical phase transition point are defined and found to follow the universal Widom scaling law.

However, current theoretical models tend to assume homogeneous coupling among sub-networks, that is, all the different sub-networks interact, while real-world systems often have a variety of coupling patterns. We propose two frameworks to explore structural robustness under such coupled giant networks, covering specific deterministic coupling patterns and coupling patterns in which specific sub-networks are randomly connected. It is found analytically and numerically that, when the total number of connected edges in the network is kept constant, the location of the critical point varies non-monotonically with the proportion of interconnected nodes. There exists an optimal ratio of interconnected nodes at which the system becomes optimally robust and resilient to external shocks. The results provide a deeper understanding of network resilience and show how networks can be optimized according to their specific coupling patterns. The results are published in PNAS (2021, 118, 22, e1922831118; 2018, 115 (27), 6911-6915).

Research on Cascading Failure Model and Robustness of Weighted Interdependent Networks

ABSTRACT. Cascading failures often occur in real systems, and modeling with weighted multilayer networks can come closer to the physical system. However, cascading failures have rarely been studied in weighted interdependent networks. On the one hand, the influence of network weights on node capacity and load-sharing strategy is seldom considered in the design of cascading failure models; on the other hand, the cascading failure process in weighted interdependent networks is seldom considered. To resist cascading failures, a new cascading failure model is proposed on top of constructed weighted interdependent networks, in which the initial load incorporates node strength, and the load-sharing rules combine the remaining capacity of nodes with node importance as measured by weight and degree. Second, considering the time-varying characteristics of load and the inter-layer dependence during the cascading failure process, a cascading failure algorithm for weighted interdependent networks is proposed. Finally, the effects of the model parameters on network robustness and the performance of the model on different interdependent networks are obtained through the analysis of network cascading failures, which verifies the effectiveness of the proposed method.
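The core load-capacity mechanism behind such models can be sketched in a few lines. This toy version shares a failed node's load equally among surviving neighbours, rather than using the paper's weight- and importance-based sharing rules or its inter-layer dependencies.

```python
def cascade(adj, load, capacity):
    """Propagate failures: a node whose load exceeds its capacity fails,
    and its load is redistributed equally among surviving neighbours;
    repeat until no further node is overloaded."""
    failed = set()
    changed = True
    while changed:
        changed = False
        for n in adj:
            if n in failed or load[n] <= capacity[n]:
                continue
            failed.add(n)
            alive = [m for m in adj[n] if m not in failed]
            for m in alive:
                load[m] += load[n] / len(alive)
            changed = True
    return failed

# A 3-node chain: node 0 is overloaded, its load tips node 1 over as well,
# but node 2 has enough spare capacity to absorb the cascade.
chain = {0: {1}, 1: {0, 2}, 2: {1}}
print(cascade(chain, load={0: 2, 1: 1, 2: 1}, capacity={0: 1, 1: 2, 2: 5}))
```

Even this minimal version shows the defining feature of cascades: the final failed set depends on capacities several hops away from the initially overloaded node.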

Cost-effective Network Disintegration through Targeted Enumeration

ABSTRACT. We live in a hyperconnected world---connectivity that can sometimes be detrimental. Finding an optimal subset of nodes or links to disintegrate harmful networks is a fundamental problem in network science, with potential applications to anti-terrorism, epidemic control, and many other fields of study. The challenge of the network disintegration problem is to balance the effectiveness and efficiency of strategies. In this paper, we propose a cost-effective targeted enumeration method for network disintegration. The proposed approach includes two stages: searching candidate objects and identifying an optimal solution. In the first stage, we use rank aggregation to generate a comprehensive node importance ranking, upon which we identify a small-scale candidate set of nodes to remove. In the second stage, we use an enumeration method to find an optimal combination among the candidate nodes. Extensive experimental results on synthetic and real-world networks demonstrate that the proposed method achieves a satisfying trade-off between effectiveness and efficiency. The introduced two-stage targeted enumeration framework can also be applied to other computationally intractable combinatorial optimization problems, from team assembly, via portfolio investment, to drug design.
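The two-stage idea can be sketched as follows, with degree and neighbour-degree sum as simple stand-ins for the paper's aggregated rankings and giant-component size as the disintegration objective:

```python
from itertools import combinations

def giant_size(adj, removed):
    """Size of the largest connected component after removing `removed`."""
    seen, best = set(), 0
    for s in adj:
        if s in removed or s in seen:
            continue
        stack, comp = [s], 0
        seen.add(s)
        while stack:
            u = stack.pop()
            comp += 1
            for v in adj[u]:
                if v not in removed and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, comp)
    return best

def disintegrate(adj, k, pool=6):
    """Stage 1: shortlist `pool` nodes by aggregated rank of degree and
    neighbour-degree sum.  Stage 2: enumerate all k-subsets of the
    shortlist and keep the one minimising the giant component."""
    deg = {u: len(vs) for u, vs in adj.items()}
    nbr = {u: sum(deg[v] for v in vs) for u, vs in adj.items()}
    def rank(metric):
        order = sorted(adj, key=metric.get, reverse=True)
        return {u: i for i, u in enumerate(order)}
    r1, r2 = rank(deg), rank(nbr)
    shortlist = sorted(adj, key=lambda u: r1[u] + r2[u])[:pool]
    return min(combinations(shortlist, k),
               key=lambda s: giant_size(adj, set(s)))

# Two triangles joined through a bridge node 3: removing the bridge splits
# the network into two small pieces, beating any single high-degree hub.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4},
       4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
print(disintegrate(adj, 1))
```

The example also shows why enumeration over a shortlist helps: the best single removal (the bridge) is not the top-degree node, so a pure ranking-based strategy would miss it.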

Non-Markovian random walks characterize network robustness to non-local cascades

ABSTRACT. Understanding the interplay between structure and dynamics remains one of the major challenges in network science. A central question concerns the robustness of a system against perturbations, since it can advance the development of powerful analytical techniques to explain and unravel rich phenomenology, and it can provide a solid ground for informed interventions.

A main assumption behind the analysis of robustness is that for a system to be functional, it needs to be connected. Hence, concepts and techniques from percolation theory become useful and are frequently employed. This is a completely static approach, where a fraction of nodes (or links), either selected uniformly at random or based on topological or non-topological descriptors, is removed from the network. From a dynamical point of view, small failures placed in the network may evolve ---according to some rules that depend on the phenomenon one is trying to model--- causing system-wide catastrophic cascades. For the sake of mathematical tractability, cascades are assumed to spread via direct contacts. However, be it because the physical mechanisms behind the failure propagation permit far-off malfunctions, be it because the knowledge on the observed network topology is incomplete and the failure propagates through hidden or unobserved edges, real-world cascades display non-local features. From a modeling standpoint, some mechanisms like flow redistribution can lead to non-local spreading of failures but the mathematical treatment has been hitherto under-researched due to its sophistication and there is no direct way to control the underlying properties of the non-local events, seriously undermining our understanding of the phenomenon.

To better reconcile theory and observations, we propose a dynamical model of non-local failure spreading that combines local and non-local effects. We assume that the cascade unfolds on a timescale much faster than the recovery of nodes, and that a disrupted unit cannot be visited more than once by the failure. This renders the failure non-Markovian and, for modeling purposes, a natural choice is to consider a Self-Avoiding Random Walk-like (SARW) dynamics on the network. To cope with the non-locality, we introduce a teleporting probability: at each step $t$ the failure proceeds as in a SARW ---uniformly choosing an operational neighbor and transitioning there--- with probability $1 - \alpha \in [0, 1]$, otherwise with probability $\alpha$ it teleports to any operative node according to a teleporting rule $T_t(k)$, time- and degree-dependent. We name this the self-avoiding teleporting random walk (SATRW), which interpolates between percolation (purely non-local process, $\alpha = 1$) and the growing SARW (purely local process, $\alpha = 0$).
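A minimal simulation of the SATRW dynamics as described, with the simplifying assumption of a uniform teleporting rule $T_t(k)$ over operational nodes (the paper's rule is time- and degree-dependent, so this is only a sketch):

```python
import random

def satrw(adj, alpha, start, rng):
    """One SATRW cascade; returns the set of failed (visited) nodes."""
    failed = {start}
    current = start
    while True:
        operational = [u for u in adj if u not in failed]
        if not operational:
            return failed                      # whole network has failed
        if rng.random() < alpha:               # non-local step
            current = rng.choice(operational)  # uniform T_t(k) assumption
        else:                                  # local SARW step
            nbrs = [v for v in adj[current] if v not in failed]
            if not nbrs:
                return failed                  # cascade stops
            current = rng.choice(nbrs)
        failed.add(current)

# alpha = 0 reduces to a growing SARW: on a cycle it always sweeps all nodes,
# since the walker sits at the end of a contiguous visited arc
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
swept = satrw(ring, alpha=0.0, start=0, rng=random.Random(1))
```

Setting `alpha = 1` instead removes operational nodes uniformly at random, recovering the percolation limit mentioned above.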

We have characterized the rich critical behavior of our model by providing analytical expressions for several quantities employed to assess the system’s robustness, such as the time-dependent degree distribution $p_t(k)$, the size $s_t$ of the giant component in the residual network as the process evolves, the cascade first-stop time distribution, and the mean value of $S^{\text{(STOP)}}$, the giant component at the cascade stop. These robustness descriptors display an excellent agreement with simulations in synthetic systems characterized by different types of complexity in terms of the heterogeneity of their structural connectivity. We find remarkable differences between homogeneous and heterogeneous systems, e.g., their dependence, or lack thereof, on the particular network parameters. However, we also report some hidden similarities between them, such as a dynamical version of the popular \textit{robust-yet-fragile} feature under static attacks. It is worth noticing that, although our framework is only expected to hold for locally tree-like networks lacking topological correlations, such as degree-degree ones, it still works in empirical settings, as we have shown for the case of a biomolecular system, namely the interactome of the nematode \textit{C. elegans}, and an infrastructural system, namely a national air traffic network, shown in Fig.~1.

Our findings provide a solid ground for the analytical study of network robustness, in particular, and for non-local non-Markovian processes, in general. The article is currently under review.

Resilience-informed infrastructure network dismantling

ABSTRACT. Large-scale networked infrastructure systems contribute significantly to modern society. Highly intra- and inter-connected systems enable communities to be more productive, at the expense of becoming more vulnerable to extreme events, cascading failures, and operational demands, either random or deliberate. The resilience of infrastructure systems against common but random failures and rare but intentional attacks is critical for safe communities, as it addresses other types of contingencies in between. Network dismantling is a process that makes the network dysfunctional by removing a fraction of components, which provides insights for robustness and resilience design under many events. In particular, to protect networks from uncertain dismantling, we need to understand how to optimally fragment networks into small clusters by removing a fraction of their assets with minimal cost. Approximation methods are desirable because finding the optimal dismantling strategy is NP-hard, thus impractical on infrastructure networks. Existing methods rely on the iterative removal of the nodes with the highest adaptive importance, either from basic centralities, such as degree and betweenness, or from more advanced metrics like collective influence. However, the additive nature of such methods fails to capture the synergistic nature of the dismantling problem, such as the phase transition in the percolation process. Also, algorithms connecting network dismantling problems with network decycling problems tend to identify better dismantling sets. Other recent strategies add realism by adopting nonuniform node removal costs and iteratively applying a bisecting algorithm based on weighted spectral approximations to approximate the optimal solution under nonuniform costs. Despite these efforts, the combinatorial optimization nature of the network dismantling problem still requires better global solutions, even if approximated.
Additionally, the cost to remove components is the only factor considered in most previous methods. Network resilience, which can inform what to protect from dismantling to facilitate recovery, is seldom included as part of the cost. In this work, we propose a method employing Karger’s contraction algorithm and a node-transferring heuristic optimization to approximate the optimal dismantling set, considering both component removal cost and network resilience after dismantling. By taking the cost associated with the resilience of the dismantled network into account, the proposed method not only dismantles the network as desired, but also makes it hard to recover. The proposed method, resilDism, obtains good performance compared to state-of-the-art network dismantling methods, and provides valuable insights to guide network design and resilience enhancement in practice.
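The contraction primitive at the heart of the method can be sketched as plain Karger-style random edge contraction; the removal costs and the resilience term are omitted here, so this shows only the probabilistic bisection step, not resilDism itself.

```python
import random

def karger_cut(edges, n_nodes, rng):
    """Contract random edges until two supernodes remain; return the
    edges crossing the resulting bipartition (a candidate cut)."""
    parent = list(range(n_nodes))

    def find(x):  # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    groups = n_nodes
    while groups > 2:
        u, v = edges[rng.randrange(len(edges))]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[rv] = ru          # contract the chosen edge
            groups -= 1
    return [(u, v) for u, v in edges if find(u) != find(v)]

# two triangles joined by a single bridge (2, 3): the true min cut has size 1
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
rng = random.Random(0)
best = min((karger_cut(edges, 6, rng) for _ in range(30)), key=len)
```

Because each single run succeeds only with some probability, the contraction is repeated and the smallest cut found is kept, which is also why such methods scale to large infrastructure networks.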

Defense Against Shortest Path Attacks
PRESENTER: Benjamin Miller

ABSTRACT. Identifying shortest paths in a graph is a common problem in applications involving routing of resources, such as packets in a computer network or vehicle traffic on a road. We consider the case where a malicious actor wants a particular path p* to be the shortest between two nodes and can remove edges to achieve this goal. This work aims to raise the attacker's cost by obscuring the true graph while retaining its utility for legitimate users.

In prior work, we showed that finding the optimal attack is NP-hard, but there is an approximation that can be efficiently computed using an algorithm called PATHATTACK. In our present work, the goal is to increase the cost of an attack by concealing the true weights and providing only approximate weights. Our approach is inspired by a technique for privacy-preserving approximations of shortest paths. We release a graph where each edge weight has had noise from a Laplace distribution added to it. There are two attackers we consider: (1) an informed attacker, who is given an uncertainty interval to consider around each observed weight, and (2) an oracle-enhanced attacker, who is able to check whether candidate attacks are successful on the true graph. If an attacker assumes an uncertainty interval of width 2α around the observed weight, we say that attacker is α-informed. Each attacker uses a variant of PATHATTACK modified for the context. PATHATTACK uses an approximation algorithm to iteratively optimize the cost of edge removal to make a target path the shortest. The attackers in this case optimize the expected cost given the observed graph. The informed attacker continues until all remaining paths are longer than p*, for any edge weights within the assumed uncertainty intervals.
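The release step can be sketched in a few lines, assuming independent Laplace(0, b) noise per edge weight with a small positive floor to keep released weights valid; the floor and the noise scale are our illustrative assumptions, not part of the described mechanism.

```python
import math
import random

def laplace(b, rng):
    """Sample Laplace(0, b) via the inverse CDF."""
    u = rng.random() - 0.5
    return -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def obfuscate(weights, b, rng, floor=0.01):
    """Released graph: each true weight plus Laplace noise, floored at a
    small positive value so weights stay valid (the floor is an assumption)."""
    return {e: max(w + laplace(b, rng), floor) for e, w in weights.items()}

rng = random.Random(7)
true_w = {("a", "b"): 3.0, ("b", "c"): 1.0, ("a", "c"): 5.0}
released = obfuscate(true_w, b=0.5, rng=rng)
```

An α-informed attacker would then treat each released weight w as lying anywhere in [w - α, w + α] when checking whether an attack forces p* to be shortest.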

In each scenario, we compare the attacker's required budget to the budget using PATHATTACK on the original (noiseless) graph. For the informed attacker, we also consider whether or not the attack succeeds. We evaluate the increase in the adversary's budget on a variety of graph models, including Erdős–Rényi, Barabási–Albert, and Kronecker graphs, as well as real networks of interest, such as roads and computer networks. In unweighted graphs, we give the true graph random weights. Our principal evaluation metric is the cost incurred by the attacker using one of the given strategies. Even when the adversary has access to an oracle, the cost incurred in the optimization is appreciably increased by the uncertainty in the data. The informed attackers become more reliably successful as the uncertainty intervals are widened, but this comes with a significant increase in cost. In the presentation, we will discuss the tradeoffs between increasing the adversary's budget and enabling users to compute distances between nodes.

12:00-13:30 Session 5B: Social Networks I
Understanding European Integration with Bipartite Networks of Comparative Advantage

ABSTRACT. Labor division and specialization across nations in the global value chain is a natural process and a potential area of policy intervention to foster convergence. Models of European integration suggest such diverging structures of specialization [1]. However, specialization in identical products signals competition across countries, and less developed countries can benefit less from integration if they co-specialize with more developed countries. We study the evolution of the co-specialization motif [2] directly on the bipartite network between 21 European countries and 31 industry sectors, using the Relative Comparative Advantage (RCA) [3,4] to capture whether two countries are specialized in the same industry during the period 2000-2014. Moreover, we aim to disentangle the relation of co-specialization within and across the groups of EU15 and CEE member states. By measuring the statistical significance of this motif, comparing the occurrence of observed motifs to a null model built with the Bipartite Configuration Model [5], we can assess how the production structures are changing (overlapping or diverging) across the EU states. Our measurement reveals no significant overlaps of comparative advantages across EU15 countries after 2000, which is probably due to their gradual integration. However, we find that CEE countries tend to be specialized in similar, if not identical, sectors. The number of co-specialization network motifs including both EU15 and CEE countries decreases after enlargement, signalling [6] a deeper division of production between EU15 and CEE. We do, however, find that productivity increases in those CEE industries that had no significant overlap in specialization before EU accession but experience increasing co-specialization with other CEE countries after entering the EU. In the meantime, co-specialization across EU15 and CEE countries contributes slightly positively, if at all significantly, to productivity growth after accession.
The findings point to the role of labor division between EU15 and CEE in fostering co-specialization of CEE industries and facilitating their integration and convergence in the common market. Our new approach provides novel insights for the current EU policy on Smart Specialization. These results highlight the need to apply these policy tools in new member states differently from old member states, to reflect a labor division that can support convergence.
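The RCA-based construction can be illustrated with toy numbers, taking the standard Balassa definition RCA_{c,i} = (x_{c,i} / x_c) / (x_i / x) and the usual RCA > 1 specialization cutoff; both conventions are assumptions here, since the abstract does not state the exact thresholds.

```python
def rca(x):
    """x[c][i]: output of country c in sector i -> Balassa RCA values."""
    total = sum(sum(row.values()) for row in x.values())
    sector_tot = {}
    for row in x.values():
        for i, v in row.items():
            sector_tot[i] = sector_tot.get(i, 0) + v
    out = {}
    for c, row in x.items():
        c_tot = sum(row.values())
        out[c] = {i: (v / c_tot) / (sector_tot[i] / total)
                  for i, v in row.items()}
    return out

def cospecialization(x):
    """Pairs of countries specialized (RCA > 1) in a common sector."""
    spec = {c: {i for i, v in row.items() if v > 1}
            for c, row in rca(x).items()}
    countries = sorted(spec)
    return {(a, b): spec[a] & spec[b]
            for k, a in enumerate(countries) for b in countries[k + 1:]
            if spec[a] & spec[b]}

# toy data: A and B both specialize in cars, C in food
x = {"A": {"cars": 8, "food": 2}, "B": {"cars": 6, "food": 4},
     "C": {"cars": 1, "food": 9}}
motifs = cospecialization(x)
```

The statistical test in the paper then compares such motif counts against the Bipartite Configuration Model null; this sketch covers only the motif extraction.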

Investigating and Modeling the Dynamics of Long Ties

ABSTRACT. Long ties, the social ties that bridge different communities, are widely believed to play crucial roles in spreading novel information in social networks. However, some existing network theories and prediction models indicate that long ties might dissolve quickly or eventually become redundant, thus putting into question the long-term value of long ties. Our empirical analysis of real-world dynamic networks shows that contrary to such reasoning, long ties are more likely to persist than other social ties, and that many of them constantly function as social bridges without being embedded in local networks. Using a novel cost-benefit analysis model combined with machine learning, we show that long ties are highly beneficial, which instinctively motivates people to expend extra effort to maintain them. This partly explains why long ties are more persistent than what has been suggested by many existing theories and models. Overall, our study suggests the need for social interventions that can promote the formation of long ties, such as mixing people with diverse backgrounds.

Characterising the role of human behaviour in the effectiveness of contact-tracing applications

ABSTRACT. Despite the widespread use of contact-tracing (CT) applications against the COVID-19 pandemic, as of today, the debate around their effectiveness is still open. Most studies indicate that very high levels of adoption are required for epidemic control [1], which has placed the main interest of policymakers in promoting app adherence. However, other factors of human behaviour, like delays in adoption or heterogeneous compliance, could also have a relevant impact on app effectiveness and are often ignored by policymakers. To characterise their importance for CT app effectiveness, we propose a dynamic agent-based model describing the interplay between disease progression and app adoption in a single population. The model relies on a two-layer multiplex network to reflect the connectivity patterns on which each dynamic evolves (in-person contacts for the epidemic and Bluetooth interactions for the CT app) (figure 1a). Epidemic progression is described through a SEPIR model [2] with preventive quarantines triggered by the CT app, initialised to reflect the first wave of the COVID-19 pandemic (figure 1c). App adoption is incorporated through a threshold dynamic dependent on the intrinsic reluctance level of each individual and the evolution of the pandemic. The model also considers heterogeneous compliance in terms of CT app reporting (figure 1b). Using this model, we quantified the impact of three features of human behaviour (time of adherence, level of compliance, and maximal level of adoption) on the effectiveness of the CT app. This was achieved by exploring the parameter space in three hypothetical scenarios: the voluntary adoption scenario (where complete compliance is assumed), the imposed adoption scenario (with zero reluctance towards app adoption), and the “adherence & compliance" scenario (only constraining the maximal level of adoption).
The results obtained agree with prior literature, indicating the relevance of high percentages of adoption for the performance of CT apps. However, they also evidence the importance of early adoption and moderate levels of compliance (above 20%) for obtaining effective strategies. The insight obtained was also used to identify a bottleneck in the implementation of the Spanish CT app, where we hypothesise that a simplification of the reporting system could result in increased effectiveness through a rise in the levels of compliance.

Self-induced emergence of consensus in social networks: Reddit and the GameStop short squeeze

ABSTRACT. The short squeeze of GameStop (GME) shares in mid-January 2021, primarily orchestrated by retail investors on the Reddit r/wallstreetbets community, caused major losses for short-selling hedge funds and a drastic surge of the stock price. Such an unprecedented event in finance represents a paramount example of a collective coordination action on online social media resulting in consensus formation at large scale. Here we characterize the structure and time evolution of Reddit conversation data, showing that the occurrence of GME-related comments and their sentiment grew well before the short squeeze actually took place. These early signs of the collective action can be associated with a self-reinforcing, increasing level of commitment and social identity of individual users. In order to understand such dynamics, we develop a model of opinion formation that combines peer interaction with a self-induced feedback from the global state of the community. Analytic mean-field solutions and simulations on extracted social networks of Reddit users display a phase transition from a disordered state to full consensus depending on the level of social identity, with the presence of hubs favouring the emergence of consensus. Our results can shed light on the increasingly important phenomenon of self-organized collective actions on social networks.

The Steadiness of Transient Relationships

ABSTRACT. Humans are social animals and having strong and supportive relationships with others has large effects on both physical and mental health. Because such strong relationships are frequently those with long duration, research has traditionally focused on long-term relationships. However, human communication networks have a considerable number of short-term transient relationships, and far less is known about their temporal evolution. Among the few things known about such relationships, previous literature suggests that ratings of relationship emotional intensity decay gradually until the relationship ends. Using mobile phone data from three countries (US, UK, and Italy), we show that the volume of communication between ego and its contacts is stable for most of the duration of a transient relationship. Contacts with longer durations receive more calls, with the duration of the relationship being predictable from call volume within the first few weeks of first contact. Relationships typically start with an early elevated period of communication, settling into the longer steady regime until eventually terminating abruptly. This is observed across all three countries, which include samples of egos at different life stages. These results are consistent with the suggestion that individuals initially engage frequently with a new alter so as to evaluate their potential as a tie, and then settle back into a contact frequency commensurate with the alter's match to ego on the so-called Seven Pillars of Friendship hypothesis. Our results also indicate that, while objective data such as mobile phone contact provides one important dimension about people's communication patterns, subjective measures that may not always parallel objective ones are still needed to develop a complete picture of a person's ego network.

12:00-13:30 Session 5C: Epidemics I
Minimizing school disruption under high incidence conditions due to the Omicron variant in early 2022

ABSTRACT. Countries in Europe have suffered large disruptions in schools due to the exceptionally high rates of Omicron incidence recorded in the community, and in particular in children [1], during January 2022. As a consequence, school protocols were put under stress, requiring repeated quarantines or leading to large and sudden testing demand for children, overloading saturated surveillance systems [2], [3]. Extending our previous modeling of SARS-CoV-2 transmission in schools in France [4], we simulated the disease spread over a temporal contact network composed of teachers and primary school students. Then, we compared school protocols in terms of resource peak demands, infection prevention, and reduction of schooldays lost, specifically under the high incidence conditions due to the Omicron variant. We estimated that at high incidence rates, reactive screening protocols (as applied in France in January 2022) require test resources comparable to weekly screening (as applied in some Swiss cantons), for considerably lower control. Our findings can be used to define incidence levels triggering school protocols and to optimize their cost-effectiveness.

Assessing spread risk of COVID-19 associated with multi-mode transportation networks in China

ABSTRACT. The spatial spread of COVID-19 during early 2020 in China was primarily driven by outbound travelers leaving the epicenter, Wuhan, Hubei province. Existing studies focus on the influence of aggregated outbound population flows originating from Wuhan; however, the impacts of different modes of transportation and the network structure of transportation systems on the early spread of COVID-19 in China are not well understood. Here, we assess the roles of the road, railway, and air transportation networks in driving the spatial spread of COVID-19 in China. We find that the short-range spread within Hubei province was dominated by ground traffic, notably railway transportation. In contrast, long-range spread to cities in other provinces was mediated by multiple factors, including a higher risk of case importation associated with air transportation and a larger outbreak size in hub cities located at the center of transportation networks. We further show that, although the dissemination of SARS-CoV-2 across countries and continents is determined by the worldwide air transportation network, the geographic dispersal of COVID-19 within China is better predicted by railway traffic.

The Role of Masks in Mitigating Viral Spread on Networks

ABSTRACT. Mask-wearing has been an important measure in curbing the spread of the virus during the COVID-19 pandemic. While it is well known that masks qualitatively mitigate viral spread by limiting the transmission of respiratory droplets, many important questions about the quantitative impact of masks remain open. In this work, we provide a comprehensive quantitative analysis of the impact of mask-wearing where people can wear one of several types of masks with different levels of protection, or wear no mask. Interestingly, we find that masks with high outward efficiency and low inward efficiency are most useful for controlling the spread in the early stages of an epidemic, while masks with high inward efficiency but low outward efficiency are most useful in reducing the size of an already large spread.
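The two-sided discounting implied above can be sketched as follows: per-contact transmission is reduced by the infector's outward efficiency (source control) and the contact's inward efficiency (wearer protection). The baseline probability and the efficiency values below are illustrative assumptions, not the paper's parameters.

```python
# (outward efficiency, inward efficiency) per mask type -- illustrative values
MASKS = {"none": (0.0, 0.0), "surgical": (0.5, 0.3), "n95": (0.9, 0.9)}

def transmission_prob(beta, infector_mask, contact_mask):
    """Per-contact transmission probability, discounted on both sides."""
    out_eff = MASKS[infector_mask][0]   # source control on the infector
    in_eff = MASKS[contact_mask][1]     # wearer protection on the contact
    return beta * (1.0 - out_eff) * (1.0 - in_eff)

p = transmission_prob(0.2, "surgical", "none")  # 0.2 * 0.5 * 1.0 = 0.1
```

Under this form, a mask with high outward but low inward efficiency mainly shrinks the effective infectiousness of early cases, matching the finding quoted above.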

Effect of delayed awareness and fatigue on the efficacy of self-isolation in epidemic control

ABSTRACT. The isolation of infectious individuals is a key public health measure for the control of communicable diseases. However, because it involves a strong perturbation of daily life, it often causes psychosocial distress and severe financial and social costs. These may act as mechanisms limiting the adoption of the measure in the first place or the adherence throughout its full duration. In addition, the difficulty of recognizing mild symptoms, or the lack of symptoms, may impact awareness of the infection and further limit adoption. We study an epidemic model on a network of contacts accounting for limited adherence and delayed awareness of self-isolation, along with fatigue causing overhasty termination. The model allows us to estimate the role of each ingredient and analyze the tradeoff between adherence and duration of self-isolation. We find that the epidemic threshold is very sensitive to an effective compliance that combines the effects of imperfect adherence, delayed awareness, and fatigue. If adherence improves for shorter quarantine periods, there exists an optimal duration of isolation, shorter than the infectious period. However, heterogeneities in the connectivity pattern, coupled with reduced compliance of highly active individuals, may almost completely offset the effectiveness of self-isolation measures in controlling the epidemic. The analysis is run analytically within the heterogeneous mean-field (HMF) and quenched mean-field (QMF) frameworks and checked against numerical predictions carried out with a Gillespie dynamics running over a network built according to the uncorrelated configuration model.

Multimorbidity profiles and infection severity in COVID-19 population using network analysis in the Andalusian Health Population Database

ABSTRACT. Introduction. Identifying the population at risk of COVID-19 infection severity is a priority for health systems. Most studies to date have only focused on the effect of specific disorders on infection severity, without considering that patients usually present multiple chronic diseases and that these conditions tend to group together in the form of multimorbidity patterns. Network analysis is a powerful tool to detect networks of individuals through similarities amongst them based on their baseline chronic conditions, with high applicability in epidemiology to unlock the potential of real-world data for health research, specifically from a person-centered perspective. As far as we know, this approach was previously used in just one multimorbidity pattern study, focused on COVID-19 patients but considering only patients' diagnoses recorded in the primary care setting. By applying network analysis to real-world data, this study explores multimorbidity profiles in the whole population with laboratory-confirmed SARS-CoV-2 infection in the Spanish region of Andalusia and estimates each patient network's likelihood of hospitalization and mortality considering age, gender, and the whole spectrum of patients' chronic diseases from both primary and hospital care.

Materials and Methods. We performed an observational, retrospective study in the Andalusian Health Population Database, which includes demographic and clinical information of all the users of the public health system in Andalusia (Spain). The Andalusian Health System provides universal and free health coverage for all citizens and is used by approximately 99% of the reference population in the region. For this study, we included all 166,242 individuals aged 15 years or older with laboratory-confirmed SARS-CoV-2 infection and at least one chronic condition from June 15, 2020, to December 19, 2020. The Clinical Research Ethics Committee of Andalusia (CCEIBA) approved the research protocol for this study (2309-N-21). For each person, we analyzed sex, age (stratified into three groups: 15-64, 65-79, ≥ 80 years), and all baseline chronic conditions from patients' electronic health records present at the time of inclusion in the study; and we analyzed patient mortality and hospitalization during the follow-up. First, we described the demographic and clinical characteristics of the study population. Then, we applied network analysis to the population with multimorbidity in each sex and age interval subgroup, to identify multimorbidity profiles (i.e., groups of similar patients based on all their baseline chronic conditions). We used the Jaccard index (JI) to measure the similarity between patients due to the binary nature of the diagnostic variables (i.e., absence/presence), as done in previous studies on multimorbidity patterns. A link between two given patients was created if the JI between them was ≥ 0.33, to include patients who share half or more of their chronic diseases with another patient. Thus, each node represents a different patient, and a link means a JI ≥ 0.33 between patients, as already applied in previous work combining clinical and statistical criteria for building the patient networks.
This cut-off allowed the inclusion of almost all the patients with multimorbidity (144,990 out of 145,070 patients). At the same time, it only included 3.27% of all possible combinations between patients (111,371,820 out of 3,404,243,854 of all possible combinations), which saved computation memory. We used the network's modularity to search for communities of patients within each network, and the Leiden algorithm to guarantee well-connected communities. Community detection methods allow the number and size of the clusters to be determined by the network's structure and not by the researcher. Once the clusters of patients were identified for each subpopulation, and with the aim to characterize multimorbidity patterns obtained, the prevalence of each chronic condition was calculated. We also measured their observed/expected (O/E) prevalence ratio (i.e., the disease prevalence observed in a specific cluster divided by the observed disease prevalence in the stratum of reference). We included a chronic condition in a pattern if 1) the disease prevalence was ≥ 25%; or 2) the O/E prevalence ratio was ≥ 2, and the disease prevalence was ≥ 1%. Then, all clinicians named the patterns by consensus, considering the most relevant diseases within each profile according to their disease prevalence and O/E prevalence ratio, and in line with the names given in the literature. Finally, to calculate the impact on infection severity of each multimorbidity profile, we obtained age-adjusted logistic regression models in each subpopulation.
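The patient-linking rule can be sketched directly, using the stated Jaccard threshold of 0.33 over sets of chronic conditions; the patient data below is invented for illustration.

```python
def jaccard(a, b):
    """Jaccard index of two condition sets (0 when both are empty)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def patient_network(patients, threshold=0.33):
    """patients: {id: set of chronic conditions} -> list of linked pairs."""
    ids = sorted(patients)
    return [(p, q) for k, p in enumerate(ids) for q in ids[k + 1:]
            if jaccard(patients[p], patients[q]) >= threshold]

# toy cohort: 1-2 share a cardio-metabolic profile, 3-4 a mental health one
patients = {
    1: {"diabetes", "hypertension"},
    2: {"diabetes", "hypertension", "copd"},
    3: {"depression", "anxiety"},
    4: {"depression"},
}
edges = patient_network(patients)
```

On the resulting network, the study then runs Leiden community detection; the all-pairs loop above is also why the 0.33 cut-off mattered for memory, as it keeps only a small fraction of the possible patient pairs.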

Results and Discussion. Our results showed that multimorbidity was a risk factor for COVID-19 severity and that this risk increased with the morbidity burden. We saw that the clinical component of each multimorbidity profile determines the risk of COVID-19 severity, as shown in Figure 1 as an example of the results obtained. In general, we found that individuals with advanced cardio-metabolic profiles, frequently including respiratory and renal diseases, presented the highest infection severity risk in both sexes. Patients with mental health patterns also showed one of the highest risks of COVID-19 severity, especially in women. Network analysis, and this approach in particular, could facilitate the replicability and automation of the analysis compared to other methods such as classical clustering approaches, which also allow analyzing patients and their impact but with frequent computational limitations (e.g., agglomerative hierarchical clustering) and a higher degree of subjectivity (e.g., the k-means algorithm). The findings of this study strongly recommend the implementation of personalized approaches to patients with multimorbidity and SARS-CoV-2 infection, especially in the population with high morbidity burden, and confirm the potential and applicability of network science to the study of the epidemiology of chronic diseases and multimorbidity.

Mobility induces a network interpretation of the Reproductive Number that explains COVID-19 spatial spread and informs surveillance

ABSTRACT. The COVID-19 pandemic is still active, with worldwide circulation of Omicron variant sub-lineages. Due to the recent increase in the prevalence of Omicron sub-lineages, many countries have experienced a resurgence in case incidence. Under these circumstances, preventing seeding from high-prevalence areas is of primary importance. Surveillance systems, however, mainly rely on local indicators, like the effective reproductive number R (the average number of secondary cases generated by each case), neglecting the impact of the spatial dynamics in sustaining the epidemic spread through mobility. We introduce here a theoretical framework, based on a metapopulation scheme, that defines the reproductive number as a spatial network among the 94 departments of mainland France. By integrating co-location records from Facebook at high spatial resolution together with official epidemic data and estimates, we define metrics for the risk of case importation (Rimp) and exportation (Rexp) at each location. Focusing on the summer-fall 2020 period, we show that spatial importation of cases at specific locations at the start of the summer (e.g. Bouches du Rhône, Rhône and Loire in Figure 1, with a peak in Rimp) triggered a local epidemic (in green in the plot), opening the path to the second wave in the fall, with a large risk of exportation to other locations (peak in Rexp). Our indicator is able to identify early signals highlighting the potential for seeding events, which should be the focus of interventions (at the source, to prevent exportation, and at the destination, to prevent importation) and are not identifiable through local surveillance.

12:00-13:30 Session 5D: Spatial Analysis I
When People Move in Groups: Effects of Spatial Edge and Landmark in the Public Space

ABSTRACT. There have been long-standing interests in the association between space and social interaction. Building upon the early observational work led by Jacobs and Whyte in urban public space, we address the question of how dynamic patterns of social interaction in public space vary with the spatial configuration. In particular, we are interested in quantifying the effects of the edge phenomenon and of spatial landmarks on the site. We apply computer vision techniques and machine learning algorithms to a 9-hour time-lapse video of an urban park. Using a graph-based method, we first identified dynamic visitor groups in the video. Then, by applying a computer vision algorithm, we delineated the fixed objects on the site and the dynamic edges (the flexible blue blocks on-site) that formed as park visitors moved the moveable furniture around. Our results show that social group activities are not random in space. Echoing Whyte's claim that what attracts people most is other people, a carousel on-site was successful in attracting groups all day long. Second, the edge phenomenon prevails, especially for dynamic edges: people in groups tend to use flexible furniture more frequently. Moreover, play activities also attract people's attention, leading to more newly formed groups. Lastly, there exists a hierarchy of fixed objects in space. The carousel and heart-shaped objects are dominant in attracting social interaction activities. These objects trigger the "triangulation" process that potentially initiates more meaningful interaction among people. This study contributes to the intersection of urban design and social ties. We capture the dynamics of social activities in an open public space and empirically test the connection between spatial configurations and people's social behavior.
In particular, our study provides support for incremental urban interventions that allow for flexibility, encourage engagement, and curate participation.

Vulnerability analysis of urban public transport multilayer network under combined attack strategy

ABSTRACT. Vulnerability is one of the most important performance standards of an urban public transport network (UPTN). It helps traffic management and planning departments locate network problems, facilitates the management and improvement of the UPTN, and improves its efficiency. Existing vulnerability studies mostly focus on single-layer networks representing different transportation modes, and often use random or intentional attacks based on node degree. However, real-world UPTNs are interconnected and interdependent, forming a multilayer network. In such networks, damage to any layer may affect the functions of other layers or of the whole system, which cannot be expressed by a single-layer network alone. The main contributions of this paper are: (1) an urban public transport multilayer network model (UPTMN) is established to capture the interaction between the rail transit network and the public transport network, and the coverage coincidence degree of station service areas is proposed to judge whether there is an association between nodes of the two different networks; (2) a new Combined Attack Strategy (CAS) method is proposed to analyze the vulnerability of the UPTMN model. This method considers the importance of inter-layer nodes as transit hubs connecting different traffic networks. A new variable is used to represent the distance between other nodes and inter-layer nodes in the multilayer network, and it is combined with node degree and node betweenness to form the criterion for removing nodes in intentional attacks. Based on the CAS, a new index for evaluating network performance before and after an attack is established. In our experiment, the urban public transport system of Qingdao in 2021 is used for simulation.
The results show that the UPTMN model captures the impact of the coupling between different networks, and our new method outperforms existing attack methods in the vulnerability analysis of multilayer networks. This study reveals the coordination between different transportation modes of the UPTN and addresses some deficiencies in the vulnerability analysis of multilayer networks.

Experienced Social Mixing of Urban Streets

ABSTRACT. Diversity is one of the chief assets of a desirable city. Urban planning professionals advocate for mixed-income neighborhoods hoping to curate socially mixed communities. However, as people move around cities beyond where they live, it is essential to understand how much social mixing can be increased through people's daily experience in activity spaces, such as a street sidewalk. Using a mobile data set of 0.5 million users from three metropolitan areas in the U.S., we show that street-level experienced social mixing is not only related to the residential composition but also correlates with the number of adjacent amenities and the perception of street design. This relationship is temporal, corresponding to the distribution of amenity functions along the streets. More importantly, a longitudinal study shows that an increase in food-related businesses and education level is significantly related to a rise in experienced social mixing, further implying the potential to operationalize urban theories for a more diverse city.

(CANCELLED) Revealing Mobility Patterns of Shared Mobility through Mobility Network

ABSTRACT. The rapid adoption of shared mobility services can repurpose urban transportation resources, reduce urban car ownership, and improve the efficiency of urban transportation. However, the effectiveness of shared mobility remains uncertain. This paper attempts to reveal the mobility patterns of shared mobility. We first clustered drivers of shared mobility services based on their service hours, using open data from DIDI in Chengdu, China, and discussed the mobility patterns of these drivers via human mobility pattern analysis. Then we constructed mobility networks of shared mobility and compared their structures to interpret the variance in drivers' service patterns and their possible external effects. The results show no significant differences in the mobility patterns of drivers but significant differences in the structure of their mobility networks. The assortativity coefficients of the networks of drivers in the morning and evening peak hours differ significantly from those in other hours, and the degree distribution parameters of the networks of drivers who provide services around the clock are smaller than those of other types of drivers. We tried to explain the formation of the network structure using the EPR model and found that the pattern of visiting locations of different drivers is the main reason for the variance in mobility network structure. These findings provide meaningful guidance for developing shared mobility services and optimizing overall urban transportation efficiency.

Urban morphologies determining road-travel route choices – from the rush hours to the wee hours

ABSTRACT. Urban data science offers us a variety of methodologies, perspectives, and insights from real-world data analysis that have greatly advanced our understanding of cities. One established strand of research uses network data, such as road maps and traffic networks, to quantify the geometric structure of cities. For example, an indicator that quantifies the griddedness of a road network by measuring the entropy of road azimuth angles [1] has been proposed. However, while road networks often consist of a combination of roads radiating out from the center of a city and multiple ring roads around the city, there has been no indicator to measure the degree of such a structure. Some studies in urban data science pursue not only transport network structures but also their functions. For example, ref. [2] studied how travel routes between a given pair of locations change during congestion and normal times, and how that change varies depending on the road network structure. In this study, we carried out the following two steps: (I) We proposed a simple measure to capture the "circularity and radiality" of the transportation network, and used both this indicator and the griddedness measure [1] to capture the characteristics of the transportation network. (II) We analyzed how travel routes between two points change over the course of a day depending on the structure of the transportation network, and how the pattern of change varies with distance from the center of the network for ten cities worldwide. First, for step (I), the traffic network data for each city was obtained from OpenStreetMap, and the following value was calculated for each road that constitutes the traffic network (origin i and destination j, with the origin being the point farthest from the center of the city): C_ij = |L_i − L_j| / d_ij, (1) where L_i and L_j are the Euclidean distances from i and j, respectively, to the city center coordinates, and d_ij is the Euclidean distance between i and j.
Note that 0 < C_ij ≤ 1. A very small C_ij suggests that i and j are located along a ring road. Here, we defined a city's circularity as the percentage of roads with C_ij ≤ 0.2. In the figure, panel A shows the distribution of griddedness and circularity for the 10 cities; the two indicators together successfully captured the characteristics of the road networks. Next, for step (II), we investigated how the average travel time between two points changes over the course of a day (the details of our procedure for systematically selecting origin-destination pairs are omitted in this abstract). In all cities, travel times tend to be longer at certain times of the day, i.e., morning and evening (see panel B in the figure). The change in travel times was particularly large in cities with relatively little high-capacity road infrastructure, such as Moscow, Delhi, and Mexico City (how we assessed the degree of road infrastructure is also omitted in this abstract). This change in travel time arises not because traffic congestion lengthens the travel time of the same route, but because the shortest route between two points differs at different times of the day. The Detour Index (measuring the route's expected elongation compared to the straight-line distance between origin and destination) is shown for each time period of the day (see panel C). The results indicate that the diurnal average of the detour index tends to fluctuate in cities with low griddedness and high circularity, such as London, Istanbul, Sao Paulo, and Paris. We also found that in cities with high-capacity, well-maintained roads, the detour index in fact decreases during congestion.
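The circularity measure in Eq. (1) can be sketched in a few lines; the function name and the toy segments below are illustrative, not from the paper, and real input would come from OpenStreetMap road geometries.

```python
import math

def circularity(roads, center, threshold=0.2):
    """Fraction of road segments with C_ij <= threshold.

    roads: list of ((x_i, y_i), (x_j, y_j)) endpoint pairs.
    center: (x, y) coordinates of the city center.
    C_ij = |L_i - L_j| / d_ij, where L is the Euclidean distance of an
    endpoint to the center and d_ij is the segment length; the triangle
    inequality guarantees 0 <= C_ij <= 1.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    ring_like = 0
    for i, j in roads:
        d_ij = dist(i, j)
        if d_ij == 0:
            continue  # skip degenerate zero-length segments
        c = abs(dist(i, center) - dist(j, center)) / d_ij
        if c <= threshold:
            ring_like += 1
    return ring_like / len(roads)

# A segment tangent to a circle around the center scores C_ij = 0
# (ring-like); a segment pointing straight away scores C_ij = 1 (radial).
ring = ((1.0, 0.0), (0.0, 1.0))    # both endpoints at distance 1 from origin
radial = ((1.0, 0.0), (2.0, 0.0))  # points directly away from the origin
print(circularity([ring, radial], center=(0.0, 0.0)))  # 0.5
```

A city dominated by ring roads would score close to 1, a purely radial layout close to 0.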

Effects of mobility-based dependency relationships on economic resilience

ABSTRACT. Quantifying the economic costs to businesses caused by extreme shocks, such as the COVID-19 pandemic and natural disasters, is crucial for developing preparation, mitigation, and recovery plans. Conventionally, surveys have been the primary source of information for measuring economic losses; however, drops in foot traffic quantified using large-scale human mobility data (e.g., mobile phone GPS) have recently been used as low-cost and scalable proxies for such losses, especially for businesses that rely on physical visits to stores, such as restaurants and cafes. Such studies often quantify the losses in foot traffic at individual points-of-interest (POIs), neglecting the interdependent relationships that may exist between businesses and other facilities. For example, university campus lockdowns imposed during the COVID-19 pandemic may severely impact foot traffic to student-dependent local businesses. Such dependency relationships between businesses could cause secondary and tertiary cascading impacts of shocks and policies, posing a significant threat to the economic resilience of business networks. To identify such cascading effects, we build a "dependence demand network" of businesses using mobility data. To that end, we compute the dependence of a target POI i on a source POI j by dep(i, j) = |s_i ∩ s_j| / |s_i|, where s_i and s_j denote the sets of users who visit POIs i and j, respectively. Because the denominator is the number of users who visit the target POI i, dep(i, j) ≠ dep(j, i). This is a simple but intuitive measure that accounts for the asymmetry of dependencies between POIs. It aims to quantify how some places depend on others for customers; for instance, a coffee shop might depend heavily on workers from a nearby office. The set of users who visit each POI in a specific period is computed using mobility data collected from mobile phone devices.
To measure how such dependency relationships may affect the resilience of businesses to external shocks, we analyze how dependency on visits from office POIs and university/college POIs affected the magnitude of disruption that restaurant and café POIs experienced during the COVID-19 pandemic. We used large-scale anonymous, privacy-enhanced mobility data of more than 200K devices from the Metropolitan Boston Area collected in 2020. The dependency scores were computed based on the foot traffic patterns before the pandemic (January and February 2020). The reduction in foot traffic to restaurant and café POIs was calculated relative to the foot traffic in January and February 2020. The figure shows the relationship between the dependency on visits from offices and university/college POIs (x-axis, between 0 and 1) and the foot traffic levels compared to pre-pandemic levels (y-axis, in %). The relationships are shown for two periods (April 2020 and October 2020). We observe a negative trend between dependency and foot traffic for both periods, indicating that restaurants and cafés that were more dependent on offices and universities experienced more substantial negative impacts of the non-pharmaceutical intervention policies during COVID-19. The findings here can be used to plan land use and development policies that foster more resilient businesses.
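The asymmetric dependency measure dep(i, j) reduces to a set-overlap computation; this is a minimal sketch in which the POI names and visitor sets are invented for illustration.

```python
def dependence(visits, target, source):
    """dep(target, source) = |s_target ∩ s_source| / |s_target|.

    visits: dict mapping POI id -> set of user ids observed there.
    The denominator is the target's own visitor set, so the measure is
    asymmetric: dep(i, j) != dep(j, i) in general.
    """
    s_i, s_j = visits[target], visits[source]
    return len(s_i & s_j) / len(s_i) if s_i else 0.0

visits = {
    "cafe":   {"u1", "u2", "u3", "u4"},
    "office": {"u1", "u2", "u3", "u5", "u6", "u7", "u8", "u9"},
}
print(dependence(visits, "cafe", "office"))   # 0.75: the cafe leans on office visitors
print(dependence(visits, "office", "cafe"))   # 0.375: the office barely depends on the cafe
```

The small café is highly exposed to the office closing, while the office is not exposed to the café, which is exactly the asymmetry the measure is designed to capture.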

12:00-13:30 Session 5E: Network Embedding
CANE: Causality Aware Network Embedding for Time Series Data

ABSTRACT. Learning vector representations of nodes is a fundamental step for using networks in machine learning pipelines. Embedding methods address this by finding parsimonious vector representations of the topological patterns in network structures. However, structure is often not enough to capture the patterns occurring on a network due to dynamics: while the network structure constrains the dynamics, it does not completely determine them. This distinction is due to patterns that, while present in temporal networks, cannot be expressed by their static counterparts. The most straightforward example is that of systems whose patterns differ at different points in time, i.e., the central nodes, communities, and other structures at time $t_1$ differ from those at time $t_2$, making them unpredictable from the network structure alone. However, the temporal dynamics of networks can also be characterized by so-called higher-order patterns, i.e., patterns that do not depend on when interactions happen but on their order. Higher-order patterns lead to edge sequences occurring with higher or lower frequency than would be expected based on the network structure. The significance of these patterns for network analysis was recently highlighted by works demonstrating that correlations in the chronological order of interactions can invalidate standard methods for network-based data analysis, calling for higher-order modeling and analysis techniques.

Despite their importance, most existing embedding methods do not account for higher-order patterns. They either learn representations for static networks or create representations that change as the network dynamics vary in time. This work addresses this gap by proposing a method to incorporate higher-order patterns into vector representations of nodes. The method builds on the multi-order network model presented in a previous paper and uses a simple yet significant feature of that model: multi-order networks are themselves networks. This fact is very convenient, as it allows the application of existing network methods to capture patterns in the higher-order network topology. We apply the community detection algorithm Infomap to the multi-order network, thus capturing higher-order community patterns that are potentially invisible to network analysis methods applied to the standard topology. Then, we use the communities identified in the multi-order network to define vector representations that are expressive of the so-identified higher-order patterns.
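The key observation that higher-order networks are themselves networks can be illustrated with a plain second-order construction (a simplification of the multi-order model described above); the walks below are toy data, and in the actual pipeline a community detection algorithm such as Infomap would then be run on the resulting graph.

```python
import networkx as nx

# Observed walks (chronologically ordered node sequences) on the network.
walks = [
    ["a", "b", "c"], ["a", "b", "c"],
    ["d", "b", "e"], ["d", "b", "e"],
]

# Second-order network: its nodes are the edges of the first-order
# network; a link joins two edges traversed consecutively in a walk.
g2 = nx.DiGraph()
for walk in walks:
    edges = list(zip(walk, walk[1:]))
    for e1, e2 in zip(edges, edges[1:]):
        w = g2.get_edge_data(e1, e2, default={"weight": 0})["weight"]
        g2.add_edge(e1, e2, weight=w + 1)

# The first-order topology alone would also allow a->b->e and d->b->c;
# the second-order network shows these transitions never occur.
print(g2.has_edge(("a", "b"), ("b", "c")))  # True
print(g2.has_edge(("a", "b"), ("b", "e")))  # False
```

Because `g2` is an ordinary directed graph, any standard network method (community detection, centralities, embeddings) applies to it unchanged.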

Impact of Heterogeneity on Network Embedding

ABSTRACT. In recent years, network embedding has attracted much attention from researchers and achieved excellent performance. However, few works investigate the adaptability of network embedding, especially its performance on different network structures. Heterogeneity, as a universal topological characteristic, plays a prominent role in network behaviors. In this study, we investigate the effect of heterogeneity on the effectiveness of existing network embedding approaches. We conduct experiments in scale-free networks with varying power-law exponents from both macro and micro perspectives to address the link prediction and node similarity tasks, respectively. The results indicate that network embedding approaches can be divided into two classes according to their performance in the link prediction task. The first class includes GF, SDNE, and Line1, and the second class includes Line2 and GraRep. As the network heterogeneity decreases, the performance of approaches in the first class declines, while the performance of approaches in the second class initially improves and then declines. Moreover, our simulations show that, based on the node similarity metric, the approaches partition nodes into two clusters, corresponding to large-degree nodes and small-degree nodes, respectively. Furthermore, approaches in the same class present similar characteristics across large-degree and small-degree nodes, which makes the embeddings interpretable to some extent. Specifically, approaches in the first class assume that large-degree nodes are similar to both large-degree and small-degree nodes, whereas approaches in the second class assume that large-degree nodes are only similar to other large-degree nodes. Performance variations in the link prediction task can be explained by these characteristics, and similar characteristics are confirmed in experiments on real networks.
Based on the findings for link prediction, we offer a brief guide for choosing an appropriate method based on the extent of heterogeneity. The investigation provides insight into network embedding and offers some interpretation of embedding, which could further establish the connection between network science and machine learning.

Influence of clustering coefficient on network embedding in link prediction

ABSTRACT. Multiple network embedding algorithms have been proposed to perform the prediction of missing or future links in complex networks. However, we lack an understanding of how network topology affects their performance, or of which algorithms are more likely to perform better given the topological properties of the network. In this paper, we investigate how the clustering coefficient of a network, i.e., the probability that the neighbours of a node are also connected, affects network embedding algorithms' performance in link prediction, in terms of the AUC (area under the ROC curve). We evaluate classic embedding algorithms, i.e., Matrix Factorisation, Laplacian Eigenmaps [1] and node2vec [2], in both synthetic networks and (rewired) real-world networks with variable clustering coefficient ($C_L$). Specifically, a rewiring algorithm is applied to each real-world network to change the clustering coefficient while keeping key network properties. We find that a higher clustering coefficient tends to lead to a higher AUC in link prediction, except for Matrix Factorisation, which is not sensitive to the change of the clustering coefficient. To understand this influence of the clustering coefficient, we (1) explore the relation between the link rating (probability that a node pair is the missing link) derived from the aforementioned algorithms and the number of common neighbours of the node pair, and (2) evaluate these embedding algorithms' ability to reconstruct the original training (sub)network. All the network embedding algorithms that we tested tend to assign a higher likelihood of connection to node pairs that share an intermediate or high number of common neighbours, independently of the clustering coefficient of the training network. As a result, the predicted networks have more triangles and thus a higher clustering coefficient. As the clustering coefficient increases, all the algorithms but Matrix Factorisation could also better reconstruct the training network.
These two observations may partially explain why increasing the clustering coefficient improves the prediction performance.

[1] Belkin, M., Niyogi, P.: Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering. In: Dietterich, T.G., Becker, S., Ghahramani, Z. (eds.) Advances in Neural Information Processing Systems 14, pp. 585–591. MIT Press (2002).
[2] Grover, A., Leskovec, J.: node2vec: Scalable Feature Learning for Networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16, pp. 855–864. ACM, New York, NY, USA (2016).
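The evaluation protocol described above (hold out links, score candidate pairs, compute the AUC, compare networks with different clustering coefficients) can be sketched on synthetic graphs. As a stand-in for the embedding-based scores, this sketch uses a plain common-neighbour score, which the abstract identifies as the signal the embeddings pick up; the graphs, sample sizes, and function names are illustrative.

```python
import random
import networkx as nx

def cn_auc(g, n_samples=2000, seed=0):
    """AUC of a common-neighbour link-prediction score: the probability
    that a held-out (positive) pair outranks a sampled non-edge
    (negative) pair, with ties counted as 0.5."""
    rng = random.Random(seed)
    edges = list(g.edges())
    held_out = rng.sample(edges, len(edges) // 10)  # 10% test links
    train = g.copy()
    train.remove_edges_from(held_out)

    def score(u, v):
        return len(set(train[u]) & set(train[v]))

    hits = 0.0
    for _ in range(n_samples):
        u, v = rng.sample(list(g.nodes()), 2)
        while g.has_edge(u, v):  # resample until we hit a true non-edge
            u, v = rng.sample(list(g.nodes()), 2)
        a, b = rng.choice(held_out)
        s_pos, s_neg = score(a, b), score(u, v)
        hits += 1.0 if s_pos > s_neg else 0.5 if s_pos == s_neg else 0.0
    return hits / n_samples

# Same size and degree, different clustering via the rewiring probability.
high_c = nx.watts_strogatz_graph(500, 10, p=0.05, seed=1)  # clustered
low_c = nx.watts_strogatz_graph(500, 10, p=0.9, seed=1)    # near-random
for g in (high_c, low_c):
    print(round(nx.average_clustering(g), 2), round(cn_auc(g), 2))
```

On the highly clustered graph the score separates held-out links from non-edges almost perfectly, while on the heavily rewired graph it degrades toward chance, mirroring the trend reported for the embedding algorithms.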

Explainable Node Embeddings

ABSTRACT. Node embedding methods take a network and embed its nodes in a low-dimensional latent space. The user specifies the size of the low-dimensional space. We address the following questions: (1) Can we explain the meaning of each dimension in the low-dimensional latent space? If so, how? (2) Can we change the loss function of any node embedding method to produce embeddings that are more explainable? If so, how? We answer yes to both of these questions and provide methods to achieve them. We define a set of sense features that capture global and local node features in a network. Given a matrix of sense features, F ∈ R^n×f, where n is the number of nodes and f is the number of sense features, and an embedding matrix V ∈ R^n×d, where d is the number of embedding dimensions, we use Non-negative Matrix Factorization to learn an Explain matrix E ∈ R^d×f such that V E = F. Henderson et al. called this procedure sense-making. Each row of the Explain matrix E describes how much each dimension contributes to explaining a given sense feature. Figure 1(a) shows the results of our sense-making (as a heat map) on the EU Email Network when it is embedded into 32 dimensions with Structural Deep Network Embedding (SDNE). By inspecting the heat map in Figure 1(a), we can explain each dimension by its sense features. For example, Dimension 15 is explained by degree, personalized PageRank, number of egonet edges, and node betweenness. While the sense-making procedure mentioned above allows us to explain each dimension in terms of the user-defined sense features, it can still be difficult to explain a dimension when many sense features contribute to it. To improve explainability, we impose two constraints on the Explain matrix E during the training phase. First, we require the rows of the Explain matrix to be orthogonal to one another. That is, we want dimensions that are explained by different sense features.
Second, we require the columns of the Explain matrix to be sparse. That is, we want each sense feature to contribute to the explanation of as few dimensions as possible. We impose these constraints by training an augmented version of SDNE. We first run the standard SDNE implementation and use the learned weights to initialize the model for the version with our augmented loss function, which we call SDNE+. Figure 1(b) shows the heat map for the Explain matrix produced by SDNE+ for the same EU Email Network. We observe that the heat map is sparser and fewer dimensions strongly correspond to many sense features. For example, Dimension 5 only corresponds strongly to node betweenness. For brevity, we have omitted the results showing that SDNE+ performs as well as SDNE on downstream tasks (such as link prediction).
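The sense-making step, learning a non-negative E with V E ≈ F, can be sketched as one non-negative least-squares problem per sense feature (one column of E at a time); the random V and E_true below are synthetic stand-ins for a learned embedding and its ground-truth explanation, not the paper's data.

```python
import numpy as np
from scipy.optimize import nnls

def explain_matrix(V, F):
    """Learn a non-negative Explain matrix E (d x f) with V @ E ≈ F by
    solving one non-negative least-squares problem per sense feature."""
    d, f = V.shape[1], F.shape[1]
    E = np.zeros((d, f))
    for j in range(f):
        # Column j of F should equal V @ E[:, j] with E[:, j] >= 0.
        E[:, j], _residual = nnls(V, F[:, j])
    return E

rng = np.random.default_rng(0)
n, d, f = 100, 8, 3
V = rng.normal(size=(n, d))              # stand-in for learned embeddings
E_true = rng.uniform(0, 1, size=(d, f))  # non-negative ground truth
F = V @ E_true                           # sense features consistent with V
E = explain_matrix(V, F)
print(np.allclose(V @ E, F, atol=1e-6))  # True: features fully recovered
```

Each column of E then reads off how strongly every embedding dimension contributes to one sense feature, which is what the heat maps in Figure 1 visualize.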

Topological data analysis of truncated contagion maps

ABSTRACT. Contagion processes have been a central focus in the investigation of dynamical processes on networks. It has been demonstrated that contagions can be used to obtain information about the embedding of nodes in a Euclidean space. Specifically, one can use the activation times of threshold contagions to construct contagion maps as a manifold-learning approach. One drawback of contagion maps is their high computational cost. Here, we demonstrate that a truncation of the threshold contagions may considerably speed up the construction of contagion maps. For synthetic networks, we find that a carefully chosen truncation may also improve the recovery of hidden geometric structures. Finally, we show that contagion maps may be used to find an insightful low-dimensional embedding for single-cell RNA-sequencing data in the form of cell-similarity networks and so reveal biological manifolds. Overall, our work makes the use of contagion maps as manifold-learning approaches on empirical network data more viable.
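A single truncated threshold contagion, the building block of a contagion map, can be sketched as follows; the graph, threshold, and truncation horizon are illustrative, and the full construction would run one such contagion per seed node and embed each node by its vector of activation times.

```python
import networkx as nx

def activation_times(g, seed, threshold, t_max):
    """Threshold contagion seeded at `seed`'s closed neighbourhood,
    truncated after t_max steps. Returns node -> activation step, with
    t_max + 1 marking 'not activated within the truncation'."""
    active = set(g[seed]) | {seed}
    times = {n: (0 if n in active else t_max + 1) for n in g}
    for t in range(1, t_max + 1):
        # A node activates once the fraction of its active
        # neighbours reaches the threshold.
        newly = {
            n for n in g if n not in active
            and sum(1 for m in g[n] if m in active) / g.degree(n) >= threshold
        }
        if not newly:
            break
        for n in newly:
            times[n] = t
        active |= newly
    return times

# Pure ring lattice (no shortcuts): the contagion front advances one
# band per step, so activation time grows with distance from the seed.
g = nx.watts_strogatz_graph(30, 4, p=0.0, seed=0)
times = activation_times(g, seed=0, threshold=0.3, t_max=5)
print(times[3], times[5], times[7])  # 1 3 5
```

Truncating at `t_max` bounds the cost of each contagion while keeping the short-range activation times that encode local geometry, which is the trade-off the abstract exploits.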

Increasing the Stability of Embeddings of the Degenerate Core

ABSTRACT. Given that real-world networks are noisy and dynamic, how stable are graph embeddings to perturbations in the graph periphery? We show that existing graph embedding algorithms produce embeddings that are unstable when removing periphery nodes leads to spikes in edge density. We introduce a method for identifying instability. Our method measures changes to the pairwise embedding distances among nodes in the graph's degenerate core (i.e., its k-core with maximum k) as outer k-shells (i.e., the "periphery") are iteratively shaved off. After removing the outermost remaining k-shell, we re-embed the graph. We quantify instability as the amount of change in the degenerate core's pairwise distance distribution as each periphery shell is removed.
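The iterative shell-shaving loop can be sketched with networkx; the toy graph and function name are illustrative, and at each yielded step the actual method would re-embed the remaining graph and compare the degenerate core's pairwise embedding distances.

```python
import networkx as nx

def shave_periphery(g):
    """Yield (k, subgraph) pairs as non-empty outer k-shells are removed,
    stopping when only the degenerate core (max-k k-core) remains."""
    g = g.copy()
    core = nx.core_number(g)  # node -> its core number k
    k_max = max(core.values())
    for k in range(min(core.values()), k_max):
        shell = [n for n, c in core.items() if c == k]
        if not shell:
            continue  # no nodes in this k-shell
        g.remove_nodes_from(shell)
        yield k, g.copy()

# Toy graph: a 5-clique (the degenerate core, k = 4) plus a pendant chain.
g = nx.complete_graph(5)
g.add_edges_from([(4, 5), (5, 6)])
for k, sub in shave_periphery(g):
    print(k, sorted(sub.nodes()))  # 1 [0, 1, 2, 3, 4]
```

Here the whole chain sits in the 1-shell, so a single shave exposes the degenerate core; on real graphs the sequence of yielded subgraphs gives the perturbation ladder on which instability is measured.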

We apply our method to embeddings of real-world and synthetic graphs produced by a variety of matrix-factorization, skip-gram, and deep-learning embedding algorithms. We find three patterns: 1) degenerate-core embeddings are sensitive to the removal of specific k-shells; 2) degenerate-core embeddings for Erdos-Renyi (ER) and Barabasi-Albert (BA) random graphs are stable; 3) as the periphery is removed, the degenerate-core pairwise distribution becomes smoother, suggesting a loss of community structure. To explain these patterns, we show that points of instability are correlated with increases in edge density. For dense subgraphs, we find that graph embedding algorithms fail to distinguish core and periphery nodes. Patterns 1 and 3 are consistent with this result because we find that edge density increases as particular k-shells are removed, causing inseparability between core and periphery embeddings. On the other hand, the degenerate cores for ER and BA graphs span a large percentage of the entire graph so removing the periphery has little impact on the degenerate-core embeddings.

To mitigate embedding instability, we introduce a generic graph embedding algorithm. Our algorithm augments an existing algorithm's loss function by adding an instability penalty. We augment Laplacian Eigenmaps and LINE and show that the embeddings are indeed stable while preserving link-prediction results.

12:00-13:30 Session 5F: Temporal Networks I
Detection of Causal Structures in Temporal Networks

ABSTRACT. Technological advances allow us to measure dynamic complex systems with high temporal resolution. While temporal networks representing such systems offer new opportunities for obtaining a deeper understanding of dynamic complex systems, adapting existing methodologies to the temporal setting can be challenging. For instance, temporal correlations and causal structures in temporal networks can invalidate the use of static network models and representations, and have led researchers to consider higher-order models that can represent such structures more faithfully.

A key challenge in determining higher-order correlations in temporal networks is finding the time-scale at which they occur~\cite{caceres2013temporal,pan2011path}. In most practical settings, one is forced to resort to domain expertise to set the time scale of processes that govern a given temporal network. In this work we attempt to bridge this gap and propose a principled and efficient approach for detecting causal time-scales in temporal networks that combines higher-order network models and techniques of information theory.

Specifically, we analyse causal paths consisting of chronologically ordered edges where subsequent transitions occur within a time scale $\Delta t$. We then define an entropy function $\mathcal{H}_T(\Delta t)$ given by the average entropy of higher-order transitions occurring within a given time scale $\Delta t$. The entropy measures how predictable higher-order transitions at a certain time scale are; hence, when considered as a function of $\Delta t$, the local minima of the entropy function correspond to points where transitions are most predictable/ordered.
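The quantity $\mathcal{H}_T(\Delta t)$ can be illustrated with a brute-force sketch (the paper's implementation uses a far more efficient counting scheme); the event list and function name below are toy illustrations.

```python
import math
from collections import defaultdict

def transition_entropy(events, dt):
    """Average entropy (bits) of next-edge transitions (u,v)->(v,w)
    occurring within dt. events: list of (t, u, v) sorted by time t."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, (t1, u, v) in enumerate(events):
        for t2, x, w in events[i + 1:]:
            if t2 - t1 > dt:
                break  # events are time-sorted, so we can stop early
            if x == v and t2 > t1:
                counts[(u, v)][w] += 1
    entropies = []
    for nxt in counts.values():
        total = sum(nxt.values())
        h = -sum(c / total * math.log2(c / total) for c in nxt.values())
        entropies.append(h)
    return sum(entropies) / len(entropies) if entropies else 0.0

# Deterministic causal paths a->b->c occur at lag 1; at dt >= 3 a later
# continuation b->d also falls in the window, raising the entropy.
events = [(0, "a", "b"), (1, "b", "c"), (3, "b", "d"),
          (10, "a", "b"), (11, "b", "c"), (13, "b", "d")]
print(transition_entropy(events, dt=1))  # 0.0: perfectly predictable
print(transition_entropy(events, dt=3))  # 1.0: transitions less ordered
```

Sweeping `dt` and locating minima of this curve is the time-scale detection idea described above: at the planted lag the transitions are maximally ordered.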

We validate our approach on synthetic data sets with planted causal structures at a known time scale as well as on real-world data. In the case of synthetic data, the method is able to recover the correct time scale, and for real-world data we observe clear transitions at time scales relevant to the system's dynamics (See Fig.~\ref{fig1}). We further observe that the minima of the entropy coincide with statistics of causal paths when these are known a priori. Our implementation is based on an efficient method for computing statistics of higher-order transitions in temporal networks~\cite{petrovic2021paco} and hence is applicable to large temporal networks.

Long-term behavioral changes during COVID-19 have increased urban segregation

ABSTRACT. Cities have been the central drivers of economic productivity and opportunities with their dense social networks of knowledge and labor; however, inequality and segregation are rising rapidly in cities, eroding our social fabric. Socioeconomic segregation leads to inequitable access to urban amenities and services, ultimately affecting the social, economic, and health outcomes of people living in urban areas. A recent study using large-scale mobile phone GPS data revealed that people's mobility behavior accounts for 55% of income segregation, providing a more comprehensive understanding of segregation in urban environments, beyond the residential segregation that has been studied extensively using census data. Non-pharmaceutical interventions (NPIs) in response to COVID-19, including lockdowns and mobility restrictions, substantially reduced mobility, potentially disrupting the quality and quantity of social interactions in urban environments. However, little is understood about how the social interactions between sociodemographic groups in cities dynamically responded to the different stages of the pandemic. Understanding how urban income segregation changed during the pandemic could be valuable in developing policies to better prepare for future pandemics. To address this gap, we analyze how urban income segregation changed during the pandemic using large-scale anonymous, privacy-enhanced mobility data of more than 200K devices from the Metropolitan Boston Area across three years from 2019 to 2021. Income segregation scores for points-of-interest (POIs) and individuals are computed based on the methods presented in Moro et al. (2021), with an additional seasonality correction procedure to denoise the estimations. The figure highlights how experienced income segregation increased for all sociodemographic categories in Boston even two years into the pandemic.
While more low-income and less-educated demographic groups (colored in reds) had high-income segregation before and especially during the pandemic, older, more educated, and higher-income groups (colored in blues), have consistently had relatively lower segregation levels during the pandemic. A demographic group that substantially increased segregation during the pandemic despite low pre-pandemic segregation was the ‘Asian Hispanic Mix, Mid Income’ group (colored in green), which could reflect the rise in hate crimes and racial discrimination towards the Asian communities. We further find that such increase in segregation is not only driven by reduction in mobility, but drastic changes in urban explorative behavior patterns.

Temporal Network Prediction and Interpretation

ABSTRACT. Temporal networks are networks, such as physical contact networks, whose topology changes over time. Predicting future temporal networks is crucial, e.g., for forecasting epidemics. Existing prediction methods are either relatively accurate but black-box, or white-box but less accurate. The lack of interpretable and accurate prediction methods motivates us to explore which intrinsic properties and mechanisms facilitate the prediction of temporal networks. We use interpretable learning algorithms, Lasso regression and Random Forest, to predict the activity of each link at the next time step based on the current activities (i.e., connected or not) of all links. From the coefficients learned by each algorithm, we construct the prediction backbone network, which represents the influence of all links in determining each link's future activity. Analysis of the backbone, its relation to the link activity time series, and its relation to the time-aggregated network reveals which properties of temporal networks are captured by the learning algorithms. Across six real-world contact networks, we find that the next-step activity of a given link is mainly influenced by (a) its current activity and (b) links that are strongly correlated with it in the time series and close to it (in hops) in the aggregated network.
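The Lasso branch of this pipeline can be sketched as follows. This is an illustrative toy, not the authors' code: the binary activity matrix, the artificially persistent link 0, and the regularization strength `alpha=0.01` are all assumptions made for the example.

```python
# Sketch: predict each link's next-step activity from the current activity of
# all links; the learned Lasso coefficients form a "prediction backbone".
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
T, L = 300, 10                                   # time steps, number of links
X = (rng.random((T, L)) < 0.3).astype(float)     # toy binary activity matrix
for t in range(1, T):                            # link 0: persistent activity
    X[t, 0] = X[t - 1, 0] if rng.random() < 0.95 else 1 - X[t - 1, 0]

backbone = np.zeros((L, L))                      # backbone[i, j]: weight of link j
for i in range(L):                               # in predicting link i's next step
    backbone[i] = Lasso(alpha=0.01).fit(X[:-1], X[1:, i]).coef_
```

For the persistent link, its own current activity dominates its row of the backbone, mirroring finding (a) of the abstract.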

Reciprocity, community detection, and link prediction in dynamic networks

ABSTRACT. Many real networks are dynamic, i.e., the patterns of interaction between their nodes vary over time; citation networks are one example. As a consequence, many inference methods have been generalized to the dynamic case with the aim of modeling dynamic interactions. Particular attention has been devoted to extending the stochastic block model and its variants to capture community structure as the network changes in time. While these models assume that edge formation depends entirely on community memberships, recent work on static networks shows the importance of including additional parameters that capture structural properties, such as reciprocity. Remarkably, such models generate more realistic network representations than those that consider only community membership. To this end, we present a probabilistic generative model with hidden variables that integrates reciprocity and communities as structural information of networks that evolve in time. The model assumes a fundamental order in observing reciprocal data: an edge is observed conditional on its reciprocated edge in the past. We deploy a Markovian approach to construct the network's transition matrix between time steps. At each time step, the transition rates for the appearance and disappearance of a directed edge between two nodes depend on the current community memberships of the nodes, as well as on the existence of a reciprocated edge between them. Parameter inference is performed with an Expectation-Maximization algorithm that achieves high computational efficiency by exploiting the sparsity of the dataset. We consider two varieties of our model. In one, the community membership vectors remain static over time and only the affinity matrix carries temporal information. In the other, the affinity matrix is treated as a static parameter, like the community memberships; in both cases, the reciprocity parameter and the rate of edge removal are kept static.
These two scenarios enable us to thoroughly analyze the model and its performance on networks with different interaction patterns. For instance, for a community structure that is non-homogeneous over time, the first version would be the more suitable approach, since it can capture the evolving community structure. We validate the applicability of the proposed model and its inference approach through experiments on real and synthetic networks for community detection and link prediction. The results on synthetic data show good performance in terms of link prediction. In addition, the analysis of real networks highlights that our model captures the reciprocity of real networks better than standard models with only community structure, while also performing well at link prediction tasks. The manuscript is available at \url{}.

NEtwork STatistic Generation Tool For Static and Temporal Graphs With Attributes

ABSTRACT. Conducting network analysis is a complex endeavour, owing to large high-dimensional data sets and complex dependency structures. Exploratory data analysis is complicated, as simple marginals are not sufficient to characterise the complex structure. Depending on the task at hand, graphs can take many forms, directed or undirected, and carry a large variety of attributes, which can pose additional challenges. When analysing a network, some typical properties may affect other graph attributes. Web graphs, for instance, exhibit power-law degree distributions [2], whereas social networks tend to have ego-network properties [1]; these structures would affect, for example, closeness centrality and community structure. Moreover, due to privacy concerns, a network scientist may not have direct access to the underlying data but may be able to request network summaries.

As networks are a data structure omnipresent in many real-world applications, there is a need for an easy and accessible tool for creating graph statistics in an automated fashion. To address this issue we propose the NEtwork STatistic (NEST) generation package, a framework that, given an input graph of any type (directed or not, with node and/or edge attributes, temporal or static), provides an exhaustive range of statistics and produces output in PDF, CSV, and HTML to suit multiple use cases (see Figure 1). Our current set of statistics encompasses marginal graph attributes; static network summaries including degree distributions, community structure, and motif structures; and temporal summaries exploring how each of these quantities changes over time.

The package has been designed to be fully extensible, based on an object-oriented structure that allows new statistics, including plots and summaries, to be added with only a very small number of additional lines of code. We strongly encourage contributions from the community, and we welcome additional network summary statistics as pull requests on the soon-to-launch GitHub site. The tool is fully open source, has been tested in several network science projects, and is easy to use without expertise in the underlying techniques.

References
[1] V. Arnaboldi, M. Conti, M. Gala, A. Passarella, and F. Pezzoni. Ego network structure in online social networks and its impact on information diffusion. Computer Communications, 76, December 2015.
[2] M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the internet topology. In Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, SIGCOMM '99, pages 251–262, New York, NY, USA, 1999. Association for Computing Machinery.
[3] J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. Knowledge and Information Systems, 42(1):181–213, 2015.

Maximum-entropy temporal network models

ABSTRACT. An important question in temporal network modeling is that of null model selection. We address this problem by applying the principle of maximum entropy to temporal networks. This allows the derivation of probability distributions on graph sequences that are as unbiased as possible while satisfying a chosen set of constraints, be they exact or expectation-based. We investigate the many options for constraint choices and show how such choices impact the resulting dynamic networks. Constraints can be devised such that the model simultaneously yields (a) tunable network properties within each snapshot, and (b) tunable dynamic properties relating graph structure across multiple snapshots. Depending on the form of the latter constraint types, the resulting temporal networks can be sequences of independently sampled graphs, first-order Markov chains, higher-order Markov chains, or altogether non-Markovian. We construct such models in both discrete and continuous time, with time-homogeneous and time-inhomogeneous constraints, studying them analytically and in large-scale simulations. We show that only certain combinations of constraints are allowed, and we explore the parameter spaces of valid constraint-value combinations. We derive maximum-entropy temporal versions of well-known static network models, study their properties, compare them to various well-known temporal network models, and apply the framework to data from several real-world temporal networks. Additionally, we highlight interesting open questions and remaining challenges.
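As a generic illustration of the construction (standard maximum-entropy notation; the symbols here are ours, not necessarily the authors'), the distribution over graph sequences that maximizes entropy subject to expectation constraints $\langle C_a \rangle = c_a^{*}$ takes the exponential-family form

```latex
P(G_1,\dots,G_T) \;=\; \frac{1}{Z(\boldsymbol{\theta})}
\exp\!\Big(-\sum_a \theta_a\, C_a(G_1,\dots,G_T)\Big),
\qquad
Z(\boldsymbol{\theta}) \;=\; \sum_{G_1,\dots,G_T}
\exp\!\Big(-\sum_a \theta_a\, C_a(G_1,\dots,G_T)\Big).
```

If every constraint $C_a$ depends on a single snapshot, $P$ factorizes into independently sampled graphs; constraints coupling consecutive pairs $(G_t, G_{t+1})$ yield first-order Markov chains, and longer-range couplings yield higher-order or non-Markovian sequences, matching the cases enumerated in the abstract.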

13:40-15:25 Session 6: Invited Talks (N. Masuda, J. Tang & C. L. Porta)
Recurrence view of temporal network data: System-state dynamics, recurrence plot, and embedding
WuDao: Pretrain the World
Artificial intelligence in breast cancer diagnostics
15:25-16:20 Session 7: Lightning Session II
Game Theoretic Causality for ML Models: Case Study on COVID-19 Prevalence in the USA

ABSTRACT. While complex machine learning models, such as deep neural networks and ensembles, often have superior predictive performance compared to interpretable models when applied to problems in computational mathematics, the sciences, engineering, and the social sciences, their notorious opacity has raised numerous concerns. The field of explainable AI aims to develop methods for either increasing the interpretability of black-box models or externally explaining their behavior, for instance by answering questions about why or how a model prediction was made based on its inputs. Explanation methods based on Shapley values, a solution concept from cooperative game theory, have over recent years become popular for interpretable feature attribution, owing in large part to their theoretical foundation, including four “favorable and fair” attribution axioms for transferable-utility games. A recent NeurIPS paper introduced a framework for causal Shapley values, which uses do-calculus to estimate the conditional probabilities needed to compute Shapley values based on causal chain graphs representing the underlying causal structure of the variable space. In this paper, we use causal Shapley values computed for a machine learning model to analyze socioeconomic disparities that may have a causal link to the spread of COVID-19 in the USA. We study several phases of the disease's spread to show how the causal connections change over time. We demonstrate the distinct advantages nonlinear machine learning models have over linear models in multivariate analysis by performing a causal analysis with random-effects models and discussing the correspondence between the two methods to validate our results. In addition, the causal Shapley values allow us to extract the importance of the causal connections modeled by the machine learning algorithm.
The approach outlined in this work can be applied broadly to assess causality for machine learning models built on high-dimensional, large-scale time series data sets.
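For readers unfamiliar with the underlying solution concept, here is a minimal generic Shapley-value computation for a small cooperative game. This is only the textbook definition, not the paper's causal Shapley estimator (which additionally uses do-calculus over causal chain graphs); the value function `v` is a made-up toy.

```python
# Exact Shapley values for a transferable-utility game: each player's value is
# the weighted average of its marginal contribution over all coalitions.
import itertools
from math import factorial

def shapley(players, v):
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for r in range(n):
            for S in itertools.combinations(others, r):
                # weight |S|! (n - |S| - 1)! / n! for coalition S
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += w * (v(set(S) | {i}) - v(set(S)))
        phi[i] = total
    return phi

# toy game: value 10 is created only when A and B cooperate; C is a null player
v = lambda S: 10.0 if {"A", "B"} <= S else 0.0
phi = shapley(["A", "B", "C"], v)
```

By symmetry, A and B each receive half the total value, while the null player C receives nothing, illustrating the "fair attribution" axioms mentioned above.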

Delayed Impact of Interdisciplinary Research

ABSTRACT. Interdisciplinary research is often considered an under-fueled engine of innovation that straddles disciplinary boundaries. However, it suffers from several hurdles, or “paradoxes”, in applying for grants, establishing research centers, conducting interdisciplinary work as an individual scientist, and obtaining prestigious prizes. Such observations raise a crucial yet underexplored question: does interdisciplinary research manifest delayed impact? To offer quantitative answers, we leverage the Microsoft Academic Graph dataset, spanning more than 35 years, to track knowledge integration and citation dynamics for each journal article. Several observations stand out. First, interdisciplinary research has substantial long-term citation impact. Yet, despite this, its impact is significantly delayed: interdisciplinary research takes significantly longer to attract citations than its monodisciplinary counterparts. We go one step further to study the underlying mechanisms, finding that peer bias and content features are possible reasons for this delayed impact. Given that interdisciplinary research is often considered the space for the innovative work that is key to economic growth, our work may have broad implications for developing, nurturing, and supporting interdisciplinary research.

Power-law redundancy of urban transportation networks

ABSTRACT. As a lifeline of urban systems, the resilience of the urban transportation network against disastrous events directly influences city operations and residents' activities. Compared with enhancing emergency response and recovery during or after disasters, moderately improving transportation network redundancy in the pre-disaster stage can help resist potential disruptions more proactively. Transportation network redundancy can be defined as the number of alternative efficient paths provided to travelers between each origin-destination pair. Departing from the traditional topological approach, this paper examines the structural characteristics of transportation networks in 49 major Chinese cities through the lens of transportation network redundancy, a new perspective that accounts for travelers' behavior. The results show that the redundancy of urban transportation networks follows a power-law distribution. The exponent of the power-law distribution systematically reflects the heterogeneity and uniformity of the network's redundancy performance: the larger the exponent, the higher the general redundancy of the network. Furthermore, we propose node redundancy centrality as a new measure of the influence of nodes on network redundancy. We find that node redundancy centrality also follows a power-law distribution, with similar exponents across cities. By analyzing the structural characteristics of networks with diverse redundancy, we find that transportation network redundancy is not directly related to network scale, and that grid-type transportation networks with reasonable road gradation have better redundancy to withstand disruptions. This study can assist in better understanding, planning, and improving the resilience of urban transportation systems.

Preventing Comorbidity of Distress and Suicidality: A Network Analysis

ABSTRACT. Background: Substantial evidence in the literature suggests that a prolonged episode of psychological distress may lead the distressed individual to develop mental disorders such as anxiety & depressive disorders or, worse, suicidal behaviors. Suicide among youths, in particular, has quickly become a pressing public health challenge in recent years and is now a leading contributor to youths' disability-adjusted life years despite clinical and policy efforts. Treating symptoms that connect distress and suicidality can be an effective strategy for preventing their comorbidity, i.e., a distressed youth developing suicidal behaviors, subsequently reducing the incidence of suicidal behaviors.

Objectives: By conceptualizing each of distress and suicidality as a community of symptoms, we sought to identify which nodes (symptoms) served as bridges between the two communities.

Methods: Community-dwelling young people in Hong Kong were recruited through the authors' affiliated institutions and territory-wide community outreach organizations, and were asked to fill in online surveys after giving written consent. The CHQ-12 and SIDAS scales in the survey measured distress and suicidality, respectively. Each item in the scales measured a symptom and was conceptualized as a node, altogether forming a network consisting of one community of 12 distress symptoms and another of 5 suicidality symptoms. We estimated a regularized partial correlation network using the graphical LASSO (gLASSO) and the extended Bayesian Information Criterion (EBIC) with a conservative tuning hyperparameter γ = 0.5 to ensure high specificity. Network visualization used the Fruchterman-Reingold algorithm. We calculated bridge strength and bridge betweenness centrality to quantify each node's role in linking the two communities of symptoms. Before interpreting the network, we conducted (i) case-drop subset bootstraps to ensure the accuracy & stability of the centrality indices, and (ii) bootstrapped confidence intervals of edge weights to evaluate the stability of edges.
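The core gLASSO step can be sketched as below. This is only an illustration of the estimator on simulated data: psychological-network studies typically tune the penalty with EBIC (γ = 0.5) via R's qgraph, whereas here a fixed penalty `alpha=0.1` and Gaussian placeholder responses stand in for that procedure.

```python
# Sketch: regularized partial-correlation network via the graphical lasso.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(1)
n, p = 500, 17                        # participants, 12 + 5 symptom items
Z = rng.normal(size=(n, p))           # placeholder survey responses
Z[:, 1] += 0.8 * Z[:, 0]              # induce one strong symptom-symptom edge

prec = GraphicalLasso(alpha=0.1).fit(Z).precision_
d = np.sqrt(np.diag(prec))
pcorr = -prec / np.outer(d, d)        # regularized partial correlations
np.fill_diagonal(pcorr, 0.0)          # edge weights of the symptom network
```

The L1 penalty drives spurious partial correlations to exactly zero, so only the induced edge survives as a strong connection.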

Results: 1968 participants completed the online survey. Correlation stability coefficients for bridge strength and bridge betweenness were 0.44 and 0.52 respectively, indicating that the calculated indices could be reliably interpreted. The strongest link between the two communities was the edge between frequency of suicidal thoughts (SIDAS1) and feelings of hopelessness (CHQ10). These two nodes also had the highest bridge centrality values, indicating their prominent role in linking the symptoms of distress with those of suicidality.

Implications: The findings provide empirical evidence that hopelessness is a key affect facilitating the onset of suicidal ideation in a distressed young individual. For mental healthcare professionals, then, a distressed youth expressing hopelessness should be taken as an early warning signal that additional support is needed. Instilling hope in distressed individuals may also be integrated as a central component of suicide prevention training and strategies.

Circulation of a digital community currency

ABSTRACT. This work analyzes the circulation of a digital community currency, Sarafu, which saw tremendous growth and substantial use in Kenya during the COVID-19 pandemic. Our analysis focuses on the weighted, directed, time-aggregated network of ordinary transfers; this captures the real, observed flow of money and lets us ask where and among whom Sarafu circulated over this period. Our results demonstrate the interpretability of walk- and cycle-based analyses on flow networks representing currency systems.

A simple network model for describing structural plasticity in vascular brain networks

ABSTRACT. The brain is densely perfused by the vascular network, which is composed of branching points (i.e., the nodes) and arteries, capillaries, and veins (i.e., the links). The architecture and topology of this network are particularly constrained by its underlying biological function and the metabolic demand of the neural substrate. Even though whether and how neuronal activity controls the vascular network topology is still debated, it is believed that abnormally high or low neuronal activity affects the vasculature [1]. To test this hypothesis, we propose a generative graph model of vascular growth based on a Voronoi tessellation network, simulating the self-organization of capillaries in the developing mouse brain. The underlying idea of the model is that highly activated neurons create gradients of growth-factor proteins that trigger vascular sprouting. To better investigate the relationship between vascular patterns and neuronal activity during development, we used the Navier-Stokes equations to simulate blood flow at different time points and applied a stochastic block model to compare the results with the vascular networks of real mouse pups. Our main result is that the growth model reproduces the mesoscale physical network properties of the healthy mouse brain vasculature while following only simple rules, allowing us to easily test several biological hypotheses concerning neurovascular coupling. In sum, this work represents a preliminary step towards identifying the network mechanisms of vascular plasticity.
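A Voronoi-tessellation substrate of the kind the model builds on can be sketched in a few lines. This is a hypothetical illustration (random seed points in the unit square, no growth-factor dynamics): Voronoi vertices play the role of branching points and finite ridges the role of vessel segments.

```python
# Sketch: planar network from the Voronoi tessellation of random seed points.
import numpy as np
import networkx as nx
from scipy.spatial import Voronoi

rng = np.random.default_rng(3)
vor = Voronoi(rng.random((60, 2)))    # tessellate 60 random points

G = nx.Graph()
for a, b in vor.ridge_vertices:       # each ridge joins two Voronoi vertices
    if a != -1 and b != -1:           # skip ridges extending to infinity
        G.add_edge(a, b)

avg_degree = 2 * G.number_of_edges() / G.number_of_nodes()
```

In general position every Voronoi vertex has exactly three incident ridges, so the resulting network has degree at most 3, a plausible stylization of capillary branching points.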

Do higher-order interactions promote synchronization?

ABSTRACT. Several examples of higher-order interactions promoting synchronization have recently been found, raising speculations that this might be a general phenomenon. Here, however, we show numerically and analytically that even for simple systems such as Kuramoto oscillators, the effects of higher-order interactions are highly nuanced. Specifically, hyperedges typically enhance synchronization in random hypergraphs but have the opposite effect in simplicial complexes. As an explanation, we identify higher-order degree heterogeneity as the key structural determinant of synchronization stability in systems with a fixed coupling budget. Typical to nonlinear systems, we also capture regimes where pairwise and nonpairwise interactions synergize to optimize synchronization.

“Tempological” Control: Synchronizing Unstable Networks Through Strategic Switching

ABSTRACT. The normal functioning of many complex systems requires them to operate near a prescribed dynamical state. To this end, built and natural systems alike, from flocking birds to power grids, employ a strategy of feedback control to stabilize desirable (but otherwise unstable) dynamical states. Unfortunately, this strategy requires continuous control input to counteract deviations from the functional state, which can be invasive and demand considerable resources.

Here, for networked systems whose function relies on the synchronization of their components, we introduce tempological (temporal + topological) control, a noninvasive feedback control scheme that stabilizes synchronization without any control input. Instead, our strategy works by making sporadic but deliberate changes to the coupling network on the fly, based on the dynamical states of the nodes. We show that by implementing this control strategy, a temporal network can achieve synchronization even if all of its subnetworks (snapshots) are individually unstable (Figure 1).

We demonstrate the utility of our approach numerically using two canonical nonlinear models for synchronization: the Kuramoto and Stuart-Landau systems. We also establish probabilistic guarantees on the success rate of tempological control. In particular, we show that in the limit of large temporal networks, success is virtually guaranteed.

Quantifying Success in Ballet

ABSTRACT. Performance in visual and other fine arts is notoriously difficult to quantify objectively. Yet previous work has shown that social signals, reflected in the social network and the prestige of affiliations, can serve as indicators of successful careers. Here, we explore the influence of social networks on success in ballet, a historical performing art with a unique hierarchical social structure that may influence dancers' success beyond their performance ability alone. We focus on ballet dancers competing in today's most influential ballet competition, the Youth America Grand Prix (YAGP) in the United States (US), capturing the competition outcomes of 7,000 ballet students from 2000 to 2021. The competition takes place in two stages, multiple regional semi-finals and one final competition each year. Success is measured in a two-fold fashion: competition success, for those winning awards at the YAGP finals, and career success, for those placed in a renowned ballet company following competition. Critically, the competition recognizes dancers with two types of awards: competition medals (gold, silver, bronze), based on scores aggregated across several judges and dance categories, and the Grand Prix, a subjective appreciation of performance conferred by a committee of judges without explicit criteria. We observe that only 0.65\% of dancers received both a YAGP award in the finals and a successful job placement, while 3.9\% of dancers received a successful job placement without an award from the competition finals. Overall, higher ranks in the semi-finals predict potential awards in the competition finals, and awards in the finals are significantly more predictive of job placements. To identify potential social influences on success, we construct the co-competition network between dancers' affiliated schools, where two schools are linked if they competed at the same semi-final location.
Interestingly, we find no effect of the competitions' geographic locations on success, even though detected communities suggest a strong influence of geographic regions in shaping the competitive environment. From this network, we compute the schools' eigenvector centrality to capture their level of influence in the competitive setting. We find that dancers affiliated with influential schools (top quartile of eigenvector centrality) have higher expected success both in YAGP awards and in their job placements. Affiliation with an influential school enhances potential recognition by the jury's Grand Prix award and job placement significantly more than competition medals do. Overall, we observe that the composition of the jury and competition pool across the US does not seem to distort the patterns of success. Our results indicate that the YAGP fairly rewards dancers for their complex artistic performance according to their division level, yet it is their affiliations that drive successful job placement. In further research, we aim to explore the relationship between high performance and school choice, as well as the influence of the local network and social support on success in the performing arts.
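The co-competition construction and centrality ranking can be sketched as follows. The venue and school names are made up for illustration; the YAGP records themselves are not public in this form.

```python
# Sketch: link schools that competed at the same semi-final, then rank them
# by eigenvector centrality as a proxy for influence.
import itertools
import networkx as nx

semifinals = {                        # toy data, not the YAGP records
    "NYC":     ["School A", "School B", "School C"],
    "Chicago": ["School B", "School C", "School D"],
    "LA":      ["School D", "School E"],
}
G = nx.Graph()
for schools in semifinals.values():
    G.add_edges_from(itertools.combinations(schools, 2))

centrality = nx.eigenvector_centrality(G, max_iter=1000)
top = max(centrality, key=centrality.get)
```

Schools B and C bridge two venues here, so they end up with the highest eigenvector centrality, the "influential school" signal the abstract correlates with job placement.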

Quantifying the maximum capability of a topological feature in link prediction

ABSTRACT. Link prediction aims to predict links that are not directly visible due to incomplete information about the network, and it has profound applications in biological and social systems. Recent studies on the link predictability of a network shed light on the extreme performance that any prediction tool could ever reach. Yet it is still unclear to what extent a specific topological feature can be leveraged to infer the missing links. Moreover, a feature can be utilized in a supervised or an unsupervised manner, but the inherent performance difference between the two approaches remains unexplored. Here, we show that the maximum capability of a topological feature follows a simple yet theoretically validated expression, which depends on the extent to which the feature is held in missing and nonexistent links, but is independent of how the feature is quantified by an index. Hence, a family of indexes based on the same feature shares the same upper bound, allowing us to estimate the potential of all others from a single index. The maximum capability of the supervised approach is higher than that of the unsupervised one, and the improvement can be mathematically quantified; hence, the benefit of applying machine learning algorithms can be known in advance. Our work benefits from a large corpus of 550 structurally diverse networks, on which the universality of the uncovered pattern is empirically verified. Taken together, we reveal previously unknown regularities underlying the link prediction task. These findings can be applied to optimize feature selection and stacking, and they also advance our understanding of the network characteristics associated with the use of a topological feature in link prediction.
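The basic evaluation setup behind such results can be sketched as follows, an illustration only (karate-club stand-in network, a fixed random split, and one member of the common-neighbors index family, whose performance, per the abstract, is capped by the feature's presence in missing versus nonexistent links):

```python
# Sketch: unsupervised link prediction with the common-neighbors (CN) feature,
# evaluated by AUC over hidden (missing) vs. nonexistent node pairs.
import itertools
import random
import networkx as nx

random.seed(42)
H = nx.karate_club_graph()                       # "complete" network
missing = random.sample(list(H.edges()), 8)      # hide 8 true edges
G = H.copy()
G.remove_edges_from(missing)                     # observed network
nonexistent = [p for p in itertools.combinations(G.nodes(), 2)
               if not H.has_edge(*p)]

score = lambda u, v: len(list(nx.common_neighbors(G, u, v)))  # the CN index
wins = ties = 0
for m in missing:
    for q in nonexistent:
        sm, sq = score(*m), score(*q)
        wins += sm > sq
        ties += sm == sq
auc = (wins + 0.5 * ties) / (len(missing) * len(nonexistent))
```

Any other index built on the same common-neighbors feature (e.g. Jaccard or Adamic-Adar style normalizations) would, by the abstract's argument, share the same theoretical ceiling on this AUC.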

Beyond Residence: Political Segregation in the Activity Space

ABSTRACT. In the past few years, the US and other countries have seen a rise in political polarization. This has kindled research on political segregation; notably, recent work has revealed that, in residential terms, one's political preference is shared by one's neighbors. An important implication is that the urban map can become a "physical echo chamber". In this work, we use transaction data to study whether segregation is preserved when people move around their city through the day, i.e., in their activity space. The study is set in Mexico City, focusing on the 2021 midterm election. The first contribution of our work is revealing that segregation between residents of different zip codes extends beyond their residence: neighbors from one zip code interact more with neighbors from another zip code if they share political preferences. Second, we show that this effect is not driven solely by residential segregation, because even when we study pairs of zip codes at a fixed distance, increased interactions are seen between neighborhoods with similar political preferences. We also show this by comparing the interactions to those predicted by a simulated gravity model. Finally, we show that ~25% of the variance in voting behavior that is not explained by income can be explained by the political preference of the areas to which each zip code is most connected.

16:20-17:20 Session 8: Poster II
Information Flow Between Stock Markets: A Koopman Decomposition Approach

ABSTRACT. The world's stock markets are linked by complicated and dynamic relationships into a temporal network. Extensive work has provided rich findings on the topological properties of this network and their evolutionary trajectories, but the underlying dynamical mechanism remains unclear. In the present work, we propose a technical scheme to reveal the dynamical law stored in the temporal network. The index records of the global stock markets form a multivariate time series. We separate the series into segments and calculate the information flows between the markets, resulting in a temporal market network representing the system's state and its evolution. We then adopt Koopman operator decomposition to find the law stored in the information flows. The results show that the stock market system is highly flexible, i.e., it jumps easily between different states. Information flows mainly from high- to low-volatility stock markets. The dynamical process of information flow is composed of many dynamic modes distributed homogeneously over a wide range of periods, from one month to several decades, but only nine modes dominate the macroscopic patterns.
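The decomposition step can be illustrated with a minimal exact-DMD (dynamic mode decomposition) sketch, a standard finite-dimensional approximation of the Koopman operator. The synthetic oscillatory series below stands in for the information-flow series; none of the numbers come from the study.

```python
# Sketch: fit a linear one-step propagator A to snapshot pairs and read off a
# dynamic mode's oscillation period from the phase of its eigenvalue.
import numpy as np

t = np.arange(400)
X = np.vstack([np.cos(2 * np.pi * t / 50),        # planted 50-step cycle
               np.sin(2 * np.pi * t / 50),
               0.5 * np.cos(2 * np.pi * t / 50 + 1.0)])

X1, X2 = X[:, :-1], X[:, 1:]                      # snapshots and their successors
A = X2 @ np.linalg.pinv(X1)                       # best linear propagator
eigvals = np.linalg.eigvals(A)

lead = eigvals[np.argmax(np.abs(eigvals.imag))]   # leading oscillatory mode
period = 2 * np.pi / abs(np.angle(lead))          # its period, in time steps
```

The recovered period matches the planted 50-step cycle; on the real information-flow series, the spectrum of `A` would yield the full distribution of mode periods the abstract describes.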

An ecology of film industry networks

ABSTRACT. The film industry has long been recognised by network science as a worthwhile area of interest, with the movie-actor network derived from the Internet Movie Database (IMDb), and the six degrees of Kevin Bacon in particular, at times rivaling Zachary's karate club network in scholarly attention. Yet, more broadly, studying the film industry as an integrated complex system, as a network of complex networks, has so far been difficult due to a lack of available data. Here we present the analysis of a large, so far proprietary dataset rooted in Cinando, an online platform operated by the Marché du Film – Festival de Cannes. The platform services film festivals and their associated markets by facilitating global film industry operations, including rights sales and investments, as well as business-to-business video-on-demand (VOD) viewing. Beyond classic networks, such as the association of actors, directors, and crew with particular films, the Cinando data harbors a whole ecology of meaningful networks, including around 140k instances of film-festival participation, as well as several other networks that are not covered elsewhere (cf. Fig.). The core purpose of our presentation is to unfold this inherent ecology of complex networks, which includes a variety of interactions connecting films, concepts (including genres), individuals, companies, locations, and events. The data further allows for temporal analysis in granular detail. Interesting aspects include the dynamics of gender in film production networks, the creation of public value through film-festival participation, the structure of film genre classification, and the annual dynamics of the international film festival circuit, including effects of the coronavirus pandemic. Rounding off the presentation, we will compare specific aspects with results based on complementary sources, which capture additional materials and time frames.
In general, our analysis sets the base for multiple follow-up studies of specific aspects, while highlighting the film industry as a system of complex networks.

Non-Preferential Patterns in Node Ranking Dynamics

ABSTRACT. Numerous studies over the past decades have established that real-world networks typically follow preferential attachment and detachment principles. This implies that degree fluctuations increase monotonically while rising up the "degree ladder", causing high-degree nodes to be prone both to attachment of new edges and to detachment of existing ones. Despite the extensive study of the absolute degrees of nodes, many domains consider the relative ranks of nodes as of greater importance. This raises intriguing questions: what dynamics are expected to emerge when observing the ranking of network nodes over time? Does the ranking of nodes present monotonous patterns similar to the dynamics of their corresponding degrees? In this paper we show that, surprisingly, the answer is not straightforward.

By performing both theoretical and empirical analyses, we demonstrate that preferential principles do not apply to the temporal changes in node ranking. We show that the ranking dynamics follows a non-monotonic curve whose functional form suggests an inherent partition of the nodes into qualitatively distinct stability categories. These findings provide plausible explanations for observed yet hitherto unexplained phenomena, such as how superstars fortify their ranks despite massive fluctuations in their degrees, and why stars are more prone to rank instability.

Assessment of the effectiveness of Omicron transmission mitigation strategies for European universities using an agent-based network model

ABSTRACT. Returning universities to full on-campus operations while the COVID-19 pandemic is ongoing has been a controversial discussion in many countries. The risk of large outbreaks in dense course settings is contrasted by the benefits of in-person teaching. Transmission risk depends on a range of parameters, such as vaccination coverage and efficacy, number of contacts, and adoption of non-pharmaceutical intervention measures (NPIs). Due to the generalised academic freedom in Europe, many universities are asked to autonomously decide on and implement intervention measures and regulate on-campus operations. In the context of rapidly changing vaccination coverage and virus parameters, universities often lack sufficient scientific insight on which to base these decisions. To address this problem, we analyse a calibrated, data-driven agent-based simulation of transmission dynamics among 10,755 students and 974 faculty members in a medium-sized European university. We use a co-location network reconstructed from student enrollment data and calibrate transmission risk based on outbreak size distributions in education institutions. We focus on actionable interventions that are part of the already existing decision-making process of universities to provide guidance for concrete policy decisions. Here we show that, with the Omicron variant of the SARS-CoV-2 virus, even a reduction to 25% occupancy and universal mask mandates are not enough to prevent large outbreaks, given the vaccination coverage of about 80% recently reported for students in Austria. Our results show that controlling the spread of the virus with available vaccines in combination with NPIs is not feasible in the university setting if the presence of students and faculty on campus is required.

Network-based emotional profiling of mainstream and alternative narratives about COVID-19 vaccines, AstraZeneca and Pfizer

ABSTRACT. COVID-19 vaccines have been widely debated by the press. To understand how mainstream and alternative media debated vaccines, we introduce a paradigm reconstructing time-evolving narrative frames via cognitive networks and natural language processing. We study Italian news articles massively re-shared on Facebook/Twitter (up to 5 million times), covering 5,745 vaccine-related news items from 17 news outlets over 8 months (see Fig. 1). We find consistently high trust/anticipation and low disgust in the way mainstream sources framed "vaccine/vaccino". These emotions were crucially missing in alternative outlets (see Fig. 1C). News titles from alternative sources framed "AstraZeneca" with sadness, absent in mainstream titles (see Fig. 1D). Initially, mainstream news linked mostly "Pfizer" with side effects (e.g. "allergy", "reaction", "fever"). With the temporary suspension of "AstraZeneca", negative associations shifted: mainstream titles prominently linked "AstraZeneca" with side effects, while "Pfizer" underwent a positive valence shift, linked to its higher efficacy. Simultaneously, thrombosis and fearful conceptual associations entered the frame of vaccines, while death changed context: rather than hopefully preventing deaths, vaccines could be reported as potential causes of death, increasing fear. Our findings expose crucial aspects of the emotional narratives around COVID-19 vaccines adopted by the press, highlighting the need to understand how alternative and mainstream media report vaccination news.

Randomising hypergraphs by permuting their incidence matrix

ABSTRACT. Network theory has emerged as a powerful paradigm to explain phenomena where units interact in a highly non-trivial way. So far, however, research in the field has mainly focused on pairwise interactions, disregarding the possibility that more than two constituent units could interact at a time. Hypergraphs are a class of mathematical objects well suited to describing this kind of many-body interaction. In this paper, we propose benchmark models for hypergraph analysis that generalise the usual Erdős-Rényi and Configuration Models in the simplest possible way, i.e. by randomising the hypergraph incidence matrix while preserving the corresponding connectivity/topological constraints, whose definitions are adapted to the novel framework. After exploring the mathematical properties of the proposed benchmark models, we consider two different applications: first, we define a novel quantity, the hyperedge assortativity, whose expected value we derive theoretically for all the introduced null models and which we then use to detect deviations in the corresponding real-world hypergraphs; second, we define a principled procedure for testing the statistical significance of the number of hyperedges connecting any two nodes.

Measuring the speed of a real-world walk process

ABSTRACT. Digital accounting records let us observe transactions that move money, i.e. the "steps" in a real-world walk process, as it unfolds. The rate at which money changes hands can be described by a distribution of holding times; funds enter accounts, are held there for some period, and then are transferred out. The inverse of a holding time has the units of a speed, and can be used to compute an average transfer velocity. We extend this distributional approach to measure transfer velocity for payment systems that are far from stable over time.
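As a rough sketch of the holding-time approach, assuming a FIFO allocation of outflows to earlier inflows (one of several possible accounting conventions, not necessarily the one used in the talk):

```python
from collections import deque

def holding_times(events):
    """FIFO allocation of outflows to earlier inflows for one account.
    events: list of (time, amount), sorted by time, with amount > 0 for
    deposits and amount < 0 for withdrawals.
    Returns a list of (held_amount, holding_time) pairs."""
    queue = deque()  # [deposit_time, remaining_amount]
    held = []
    for t, amount in events:
        if amount > 0:
            queue.append([t, amount])
        else:
            out = -amount
            while out > 1e-12:
                t_in, avail = queue[0]
                used = min(avail, out)
                held.append((used, t - t_in))
                queue[0][1] -= used
                out -= used
                if queue[0][1] <= 1e-12:
                    queue.popleft()
    return held

def transfer_velocity(held):
    """Amount-weighted average of inverse holding times (per unit time)."""
    total = sum(a for a, _ in held)
    return sum(a / max(dt, 1e-12) for a, dt in held) / total
```

For example, a deposit of 10 at time 0 followed by withdrawals of 5 at times 1 and 3 yields holding times of 1 and 3 for the two halves, and an average transfer velocity of 2/3 per unit time.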

A Modular Backbone For Weighted Complex Networks

ABSTRACT. Real-world networks of millions of nodes and billions of edges pose a major obstacle to analysis. It is therefore of vital importance to select relevant nodes and edges while conserving the network's key information. One of the fundamental topological features of real-world networks is their community structure. While most works aim to reduce a network's size while preserving its core information, none has done so by targeting the network's community structure. In this work [1], we propose a backbone extraction method named the "modularity vitality backbone", which filters nodes based on their contribution to the network's overall modularity, quantified using the concept of vitality. The proposed method is compared against the "overlapping nodes ego backbone", which has been shown to outperform the popular disparity filter [2]. On a set of seven real-world networks, the modularity vitality backbone outperformed the overlapping nodes ego backbone with respect to weighted modularity, weighted average degree, and average link weight. Nonetheless, this comes at the expense of lower average betweenness. Figure 1 shows the backbones extracted by the two methods to obtain 30% of the Les Misérables network's original size. These results demonstrate how the proposed backbone preserves the core internal part of communities and the bridges connecting them, both of which are essential in characterizing the network.

[1] Rajeh, Stephany, et al. "Modularity-based Backbone Extraction in Weighted Complex Networks." Network Science (NetSci-X 2022), Lecture Notes in Computer Science, vol. 13197.

[2] Ghalmane, Zakariya, et al. "Extracting backbones in weighted modular complex networks." Scientific Reports 10.1 (2020): 1-18.
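Modularity vitality can be illustrated directly from the definition of weighted modularity: the vitality of a node is the change in modularity when that node (and its edges) is removed. A minimal pure-Python sketch, not the authors' code:

```python
def modularity(edges, part):
    """Weighted Newman modularity from an edge list [(u, v, w), ...]
    and a node -> community mapping."""
    m = sum(w for _, _, w in edges)
    deg, internal = {}, {}
    for u, v, w in edges:
        deg[u] = deg.get(u, 0.0) + w
        deg[v] = deg.get(v, 0.0) + w
        if part[u] == part[v]:
            internal[part[u]] = internal.get(part[u], 0.0) + w
    comm_deg = {}
    for node, d in deg.items():
        comm_deg[part[node]] = comm_deg.get(part[node], 0.0) + d
    return sum(internal.get(c, 0.0) / m - (comm_deg[c] / (2 * m)) ** 2
               for c in comm_deg)

def modularity_vitality(edges, part, node):
    """Contribution of `node` to modularity: Q(G) - Q(G without node)."""
    reduced = [(u, v, w) for u, v, w in edges if node not in (u, v)]
    return modularity(edges, part) - modularity(reduced, part)
```

Ranking nodes by this vitality and keeping the top contributors is the filtering idea behind the backbone; the paper's method adds the machinery needed to make this scale to large networks.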

Structural changes in the mobility network of Spain during the COVID-19 pandemic

ABSTRACT. As the COVID-19 pandemic spread to more than 200 countries around the world, governments and health agencies had to weigh difficult decisions to mitigate its impact. The fast progression of the COVID-19 pandemic has been mainly related to the high contagion rate of the virus and the worldwide mobility of humans. At the beginning of the pandemic, the absence of pharmacological therapies pushed governments in different countries to introduce non-pharmaceutical interventions (NPIs) to reduce human mobility and social contact. The enforcement of NPIs such as partial or total lockdowns changes the mobility patterns of people. Previous studies have shown that mobility reduction measures change the structure of mobility networks in a non-linear way. Here, we study the structural changes in the mobility network of Spain in response to the COVID-19 pandemic, with special emphasis on the non-pharmaceutical interventions applied by the government. For this purpose, we collected a dataset from a mobility study based on anonymized cell phone records conducted by the Mobility and Transportation Ministry of Spain (MITMA) during the period from February 2020 to May 2021. The dataset contains origin-destination matrices between 2,850 mobility areas, corresponding to a combination of districts and municipalities, reported on an hourly basis. For the purpose of this study, we aggregated the mobility matrices on a daily and on a weekly basis. We calculate different metrics in the mobility networks to analyze how the structural properties of the mobility network in Spain changed in response to the NPIs applied by the government. Our results show that, after the lockdown was implemented in March 2020, there was a 50% reduction in the overall mobility Fij(t), measured as the total number of trips relative to a baseline (Fig. 1A). We found that the reduction was stronger than the one observed in Germany (Schlosser et al. 2020).
We compared the changes in total trips aggregated by the distance between mobility areas and found that during the lockdown period there was a greater reduction in long-range trips (distance > 500 km) than in short-range trips (distance < 50 km), showing that the lockdown measures more strongly affected long-range movement (Fig. 1B). Notably, we also found an opposite trend in the summer after the lockdown, when we observed a greater increase in long-range trips than in short-range trips. We then considered the evolution of two network properties, namely the average clustering coefficient <C(t)> and the average shortest path length <L(t)>. When analyzing how these two structural properties responded to the mobility restrictions, we observed a 3-fold increase in <L(t)> and <C(t)> at the beginning of the lockdown, which later relaxed to their baseline values (Fig. 1C). Altogether, our results show that the application of mobility restrictions has a non-linear effect on the structural properties of Spain's mobility networks (Fig. 1D-E), characterized by a strong reduction in long-distance trips which resulted in an increase of the average shortest path length and the local clustering. Interestingly, our observations are in close agreement with those reported for other countries, such as Germany (Schlosser et al. 2020).
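The two structural properties tracked in this abstract can be computed as follows. This is a minimal sketch for small unweighted, connected networks; the study itself works on much larger weighted origin-destination networks.

```python
from collections import deque
from itertools import combinations

def avg_clustering(adj):
    """Mean local clustering coefficient of an undirected graph
    given as {node: set(neighbours)}."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # clustering of degree-0/1 nodes counts as zero
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def avg_shortest_path(adj):
    """Mean shortest-path length over all ordered node pairs,
    via breadth-first search from every source (connected graph)."""
    dists = []
    for src in adj:
        seen = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    q.append(v)
        dists.extend(d for node, d in seen.items() if node != src)
    return sum(dists) / len(dists)
```

On a triangle with a pendant node, for instance, the average clustering is 7/12 and the average shortest path length is 4/3.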

An Empirical Investigation of Backbone Filtering Techniques in Weighted Complex Networks

ABSTRACT. Several approaches have been developed over the years to reduce the network size while representing the original network as faithfully as possible. Among these so-called backbone extraction techniques, filtering techniques focus on removing nodes and edges that are statistically insignificant to, or have a low impact on, the topological properties of the original network. In this study, we perform an extensive comparative investigation of seven filtering techniques on a set of real-world networks originating from various domains. We use two types of performance criteria. First, we look at the evolution of the original network's topological properties (density, weighted modularity, average weighted degree, average link weight, average betweenness, entropy) while removing links. Second, we consider a specific use case: assessing the relevance of the backbone to the diffusion process in the original network using the popular Susceptible-Infected-Recovered (SIR) model. The top left panel of Fig. 1 illustrates the effectiveness of the High Salience Skeleton, the Noise Corrected Filter, and the Marginal Likelihood Filter in preserving the nodes while filtering a high fraction of links. The top right panel represents the evolution of the average weighted degree versus the fraction of remaining links in the backbone. One can see that the Disparity Filter and the Polya Filter preserve the average weighted degree of the original network across all edge fractions. In contrast, the average weighted degree is higher for the Global Threshold technique, which focuses on high-weight links. The average weighted degree increases monotonically with the fraction of remaining links for the other backbone extractors. The bottom left panel reports the evolution of the average link weight. One can observe that the Polya Filter, the Global Threshold, and the Disparity Filter tend to preserve high-weight links in priority.
In contrast, the High Salience Skeleton and the Noise Corrected Filter retain many low-weight links, decreasing the average link weight. The Marginal Likelihood Filter lies between these two extremes, with a balanced set of high- and low-weight links. The bottom right panel illustrates the evolution of the community structure strength. One can see that all backbones exhibit similar behavior. As the fraction of retained edges grows, the weighted modularity increases to a maximum; then it decreases until it reaches the value observed in the original network. A backbone with a low fraction of edges contains few inter-community links, strengthening the community structure. Adding edges weakens the community structure because the proportion of inter-community links grows. The table on the right reports the Pearson correlation between each method's edge scores and the infection participation. It illustrates how some filtering techniques assign high scores to the influential edges that participate in information diffusion or an epidemic in a network. Overall, it appears that there is no universal solution. Indeed, edge filtering techniques exhibit diverse behaviors depending on the performance measure under scrutiny.
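Of the techniques compared, the Disparity Filter has a particularly compact closed form: an edge of weight w at a node with strength s and degree k gets the p-value (1 - w/s)^(k-1), and is kept if it is significant at either endpoint. A minimal sketch under that formulation (names are illustrative):

```python
def disparity_filter(edges, alpha=0.05):
    """Disparity-filter backbone: keep an edge if it is statistically
    significant (p < alpha) for at least one of its endpoints.
    edges: [(u, v, w)] undirected, with w > 0."""
    strength, degree = {}, {}
    for u, v, w in edges:
        for node in (u, v):
            strength[node] = strength.get(node, 0.0) + w
            degree[node] = degree.get(node, 0) + 1
    backbone = []
    for u, v, w in edges:
        significant = any(
            degree[node] > 1
            and (1 - w / strength[node]) ** (degree[node] - 1) < alpha
            for node in (u, v))
        if significant:
            backbone.append((u, v, w))
    return backbone
```

On a star whose hub carries one heavy edge among many light ones, only the heavy edge survives: its weight is an unlikely share of the hub's strength under the filter's uniform null model, while the light edges are not significant at either endpoint.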

Robustness of Cohesion in Group Formation

ABSTRACT. Cohesion is fundamental for the function of a social group. To study the interplay between group growth and cohesion, we propose a model where a group grows by a noisy admission process of new members who can be of two different types. Cohesion is defined as the fraction of members of the same type. The model can reproduce the empirically reported decrease of cohesion with the group size. However, when admissions require a consensus of several members, we find a phase transition belonging to the mean-field universality class. Below a critical noise level, the growing group can remain cohesive.

Reaching Consensus in a Temporal Epistemic Network

ABSTRACT. This work develops a temporal network epistemology model enabling the simulation of the learning process in dynamic networks. The results of the research, conducted on a temporal social network generated using the CogSNet model and on static topologies as a reference, indicate a significant influence of the network's temporal dynamics on the outcome and flow of the learning process. It is shown that not only is the dynamics of reaching consensus different compared to baseline models, but also that previously unobserved phenomena appear, such as uninformed agents or different consensus states for disconnected components. It has also been observed that sometimes a change of the network structure alone can contribute to reaching consensus. The introduced approach and the experimental results can be used to better understand how human communities collectively solve complex problems at the scientific level, and to inquire into the correctness of less complex but common and equally important beliefs spreading across entire societies.

A network science-based approach to study branded food

ABSTRACT. The quality of ingested food plays an essential role in health and well-being. Nonetheless, choosing high-quality food is not a trivial task. Some effort has been put into developing indices that quantify food quality. Two of these initiatives are broadly used: NOVA and Nutri-Score. The first accounts for how processed the food is and divides food into four groups: 1 - unprocessed or minimally processed, 2 - processed culinary ingredients, 3 - processed, and 4 - ultra-processed. The second is based on the nutritional information of the food, in which "good nutrients" (e.g., proteins, fiber, and vegetables) increase the score and "bad nutrients" (e.g., energy, sugars, and salt) decrease it. The score is then converted into one of five letters, from the best (A) to the worst (E). In this study, we analyzed a large dataset of branded food, namely "Open Food Facts", in which the data is manually provided by volunteers. This dataset holds various information on each food item, such as the ingredients, nutritional information, and quality indices (NOVA and Nutri-Score). First, we selected only the items with the composition described in English and including vegetable oils. Furthermore, on the basis of the product name and brand, we kept a single entry for each product. Since some nutritional information is not compatible with what is expected for real food, we adopted the interquartile range (IQR) criterion to eliminate outliers. As assumed by Nutri-Score, we consider that foods containing similar nutrients can have similar health impacts. Our analysis is based on the similarity between the selected food items, aiming at connecting food with similar nutritional information. For this purpose, we created a weighted network where nodes are food items and edges are weighted according to the cosine similarity between their nutrient quantities.
Using this approach, we expect to infer hidden information concerning these indices. To simplify this network, we kept only the ten edges with the highest weights for each node. Next, we computed the network communities. For each community, we display the most frequently used oils, as well as their nutritional scores. The vast majority of the food is classified as NOVA 4 (ultra-processed) in all communities. Considering Nutri-Score, communities A and B tend to hold a slightly higher number of healthy items than the others. Independently of the nutritional characteristics of the food, the most frequent oils tend to be the same; in particular, soybean and palm oil were among the five most frequent oils in all communities. Altogether, the discrepancies found between the network communities and the indices illustrate that the information they carry is incomplete. We hope that our network representation can be used to develop novel food quality indices.
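The construction described above, cosine similarity between nutrient vectors with only each node's strongest edges retained, can be sketched as follows; the function names and the tiny toy vectors are illustrative, not the authors' code:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length nutrient vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0

def top_k_edges(items, k=10):
    """items: {name: nutrient vector}. For every item keep its k most
    similar neighbours; the union of these picks forms the pruned
    similarity network, as a dict {(u, v): weight} with u < v."""
    edges = {}
    for u, vec_u in items.items():
        sims = sorted(((cosine(vec_u, vec_v), v)
                       for v, vec_v in items.items() if v != u),
                      reverse=True)
        for s, v in sims[:k]:
            edges[(min(u, v), max(u, v))] = s
    return edges
```

Because each node nominates its own top-k neighbours, a node can end up with more than k incident edges (it may also be nominated by others); this is the usual behaviour of such k-nearest-neighbour sparsification.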

Latent learning in networks explains perception change in social gatherings

ABSTRACT. How people perceive their social worlds determines how they behave in society. Perceptions, regardless of their factuality, shape people's aspirations and willingness to engage in different behaviors, ranging from voting and energy consumption to health behavior, drinking, and smoking. Though these perceptions are confined to individuals' standpoints, they depend on social ties. Moreover, changes in people's perceptions are expected to be affected by social interaction. However, the mechanisms that drive the change of individuals' perceptions within social gatherings are still unexplored.

While social learning theory suggests that communicating information through a network can enhance or worsen individuals' estimates, there is no empirical or theoretical model explaining whether individuals could learn such cues in social networks even without communicating the information. In our work, we define latent network learning as the process of learning about a specific latent variable via social interactions without explicit communication about this variable.

We investigate this phenomenon in social gatherings by collecting high-resolution face-to-face interaction data and conducting a survey at two different time points within the gathering. We carried out this study at two recent conferences; each dataset consists of individuals' interactions captured via close-range proximity directional sensors that individuals wore during the conferences. In this setting, the latent variable is the perceived proportion of female participants. Before and after the conferences, individuals were asked in surveys about their perception of the proportion of females at the conference, and this perception changed between the two surveys.

To analyze the dynamics of social learning in the datasets, we examine different information aggregation models and compare the actual individuals’ perceptions with the models using the mean-square error (MSE) between the model predictions and the actual values, as an indicator of how well the model can explain the perception change. The bounded confidence model, where at each time step, each individual updates the perception based on the neighbors’ perception within a certain range (i.e., latent network learning), outperforms the other settings, such as when the individuals update their perceptions based on their observations. To further investigate the network effects, we construct null models where we randomize (i) the edges in the interaction networks, (ii) the assignment of individuals’ perceptions, and (iii) both. The results rule out the effect of randomness and show that latent network learning explains perception change in social gatherings.
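A bounded confidence update restricted to the interaction network can be sketched as below. This is a Hegselmann-Krause-style synchronous step written for illustration; the study's exact update rule and confidence range may differ.

```python
def bounded_confidence_step(perception, adj, eps):
    """One synchronous bounded-confidence update on a network.
    perception: {agent: current estimate}; adj: {agent: set(neighbours)};
    eps: confidence range. Each agent averages its own estimate with the
    estimates of those neighbours that lie within eps of its own."""
    new = {}
    for u, p in perception.items():
        close = [perception[v] for v in adj[u]
                 if abs(perception[v] - p) <= eps]
        new[u] = (p + sum(close)) / (1 + len(close))
    return new
```

On a triangle of agents with estimates 0.2, 0.3, and 0.9 and eps = 0.2, one step pulls the first two agents together at 0.25 while the outlier at 0.9 ignores all neighbours and stays put, illustrating how confidence bounds can leave parts of a network uninformed.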

Monetization in online streaming platforms: An exploration of inequalities in

ABSTRACT. Twitch is a social live streaming platform that allows creators to produce cultural content while reacting and interacting with their spectators via a chat system. The platform was initially centered around the gaming subculture, but it has consistently gained popularity (from 1.26 million average concurrent viewers in 2019 to 2.78 million in 2021; TwitchTracker) and has substantially extended its content offer [1]. Nevertheless, the growth dynamics of the Twitch spectator network has not yet been studied in detail.

The platform has built itself around technical features allowing for great social interactivity and connectivity affordances, both in its consumption modes and in the way it allows for content monetization. The Twitch experience is centered around the idea of community [2]. On one hand, streamers aim at nudging their spectators toward the paying features of their channel via parasocial cues; on the other, spectators find in Twitch streams a place to engage with a community and interact with creators.

While the literature on streamers' ways of monetizing content is well established [3], the macro-level effects of such practices are still unclear. In this work, we use public data from a recent Twitch data leak concerning streamer revenues, as well as data scraped from Twitch and TwitchTracker, to explore the effects of the streamers' subscriber growth process and monetization techniques on the associated economic inequalities.

We were able to show that, even though it is a platform where the role of attention-maximizing algorithms is only minimal, and which provides diminishing returns to massive channels, forcing streamers to find external ways to monetize their content, Twitch is not an egalitarian platform by any metric. First, we observe that the revenue distribution among the top 10,000 streamers is significantly unequal (Gini 0.57, Fig. a), and the prolongation of the curve using a parabolic fractal indicates that considering all users would reveal even higher inequalities (Gini 0.94, Fig. a). Moreover, looking at the subscriber ranking for the top 1,000 streamers (February 2022, source: TwitchTracker) and its extension, we find comparable results (Gini 0.63 for the top 10,000, Fig. b), suggesting that subscribers constitute the primary resource.

Secondly, following Wolff and Shen [4], we looked into the ways streamers capitalize on their audience and found that creators gathering larger audiences draw less money from individual subscribers and followers. Streamers with the most followers (source: Twitch) also have the worst conversion rate from followers to subscribers (Fig. c) and draw smaller revenues from subscribers. These dynamics are further characterized by analyzing time series from StreamCharts tracking the evolution over time of the subscribers, followers, and views of the different channels (Fig. d).

Finally, we scraped the Twitch website to isolate the affiliation, sponsorship, and direct donation links provided by samples of streamers with different revenues, and found that creators gathering bigger audiences include significantly more sponsorship links (F-test: 47.5; p-value: 8e-12), revealing their wish to monetize their viewers in a different manner.

In conclusion, it appears that inequalities are high, both in terms of community size (prolonged subscriber curve Gini: 0.63) and, consequently, in revenue. Nonetheless, Twitch's features and affordances still favor the monetization of smaller communities, causing top creators to set up additional ways to capitalize on their audience.

[1] K. Ask, H. S. Spilker, and M. Hansen, "The politics of user-platform relationships: Co-scripting live-streaming on," FM, June 2019.
[2] W. A. Hamilton, O. Garretson, and A. Kerne, "Streaming on twitch: fostering participatory communities of play within live mixed media," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, (Toronto, Ontario, Canada), pp. 1315–1324, ACM, Apr. 2014.
[3] M. R. Johnson and J. Woodcock, ""And today's top donator is": How live streamers on Monetize and Gamify Their Broadcasts," Social Media + Society, vol. 5, p. 205630511988169, Oct. 2019.
[4] G. H. Wolff and C. Shen, "Audience size, moderator activity, gender, and content diversity: Exploring user participation and financial commitment on," New Media & Society, p. 146144482110699, Jan. 2022.
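The Gini coefficients reported in the abstract above can be reproduced from a revenue vector with the standard sorted-rank identity; a minimal sketch:

```python
def gini(values):
    """Gini coefficient of a non-negative sample
    (0 = perfect equality, (n-1)/n = maximal inequality for n values)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # mean-absolute-difference formulation via the sorted-rank identity:
    # G = sum_i (2i - n - 1) x_(i) / (n * sum_i x_i), ranks i = 1..n
    cum = sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(xs))
    return cum / (n * total)
```

For instance, four equal revenues give a Gini of 0, while one streamer taking everything among four gives 0.75, the maximum for a sample of that size.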

Functional symmetries in brain networks

ABSTRACT. In a networked system, nodes are influenced not only by their nearest neighbors but by the overall network structure. The influences on a node can be estimated by quantifying the statistics of its topological profile, that is, the whole network structure as seen from that node's standpoint. We compare topological profiles between nodes of functional networks of human brains using information parity. We detected an increase in information parity in the brain networks of subjects under the effects of a psychedelic relative to their ordinary state. We also report a noteworthy enhancement in functional symmetries among key cortical brain regions. We will discuss this variation in the context of network resilience and brain activity under psychedelic effects.

Pearson Correlations on Complex Networks

ABSTRACT. Complex networks are useful tools to understand propagation events like epidemics, word-of-mouth, adoption of habits, and innovations. Estimating the correlation between two processes happening on the same network is therefore an important problem with a number of applications. However, at present there is no way to do so: current methods either correlate a network with itself, a single process with the network structure, or calculate a network distance between two processes. In this paper, we propose to extend the Pearson correlation coefficient to work on complex networks. Given two vectors, we define a function that uses the topology of the network to return a correlation coefficient. We show that our formulation is intuitive and returns the expected values in a number of scenarios. We also demonstrate how the classical Pearson correlation coefficient is unable to do so. We conclude the paper with two case studies, showcasing how our network correlation can facilitate tasks in social network analysis and economics. We provide examples of how we could use our network correlation to infer user characteristics from their activities on social media; and relationships between industrial products, under some assumptions as to what should make two exporting countries similar.
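The abstract does not give the paper's formula. One plausible construction, shown purely for illustration and not claimed to be the authors' definition, weights the Pearson cross-products by a node-proximity matrix W, so that W = I recovers the classical coefficient:

```python
def network_pearson(x, y, W):
    """Illustrative network-aware correlation between two node-indexed
    vectors x and y: classical Pearson with the cross-products weighted
    by a node-proximity matrix W (list of lists). With W = I this reduces
    to the ordinary Pearson coefficient; W should be symmetric positive
    semidefinite so the denominator is real."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    def quad(a, b):
        return sum(W[i][j] * a[i] * b[j]
                   for i in range(n) for j in range(n))
    return quad(dx, dy) / (quad(dx, dx) * quad(dy, dy)) ** 0.5
```

Setting W to, say, a normalized adjacency or diffusion kernel would let a process on one node correlate with a nearby (rather than identical) node in the other process, which is the kind of behavior the abstract argues the classical coefficient cannot capture.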