COMPLEX NETWORKS 2022: ELEVENTH INTERNATIONAL CONFERENCE ON COMPLEX NETWORKS & THEIR APPLICATIONS
PROGRAM FOR TUESDAY, NOVEMBER 8TH

09:00-09:40 Session Speaker S1: Luís A. NUNES AMARAL - Northwestern University, USA
09:00
An opinionated evaluation of where we are after a quarter century of complex networks

ABSTRACT. It has been nearly 25 years since the seminal paper by Watts and Strogatz. In this period, the study of complex networks has fostered tremendous advances in our mathematical understanding of the structure, dynamics, and formation of complex networks. Those mathematical advances have made inroads into the work of scholars in biological, technological, and social studies.  I speculate on how the community of researchers studying complex networks can have the greatest impact on the creation of new knowledge.

09:40-10:40 Session Lightning L1: Information Spreading in Social Media - Human Behavior
09:40
Will You Take the Knee? Italian Twitter Echo Chambers' Genesis during EURO 2020
PRESENTER: Virginia Morini

ABSTRACT. Echo chambers can be described as situations in which individuals encounter and interact only with viewpoints that confirm their own, thus moving, as a group, to more polarized and extreme positions. Recent literature mainly focuses on characterizing such entities via static observations, thus disregarding their temporal dimension. In this work, departing from that trend, we study, at multiple topological levels, the genesis of echo chambers in the social discussions that took place in Italy during the EURO 2020 Championship. Our analysis focuses on a well-defined topic (i.e., BLM/racism) discussed on Twitter during a perfectly time-bounded (sporting) event. These characteristics allow us to track the rise and evolution of echo chambers over time, thus relating their existence to specific episodes.

09:45
Formation and Development of Echo Chambers on Twitter under the Pandemic
PRESENTER: Hiroshi Iyetomi

ABSTRACT. The term echo chamber was coined in politics to describe an isolated situation in which people share and self-amplify restricted (often stereotyped and polarized) thoughts, information, and beliefs. The formation of echo chambers can increase social and political polarization, ultimately giving rise to divided societies. The objective of this study is to give new insight into this fascinating research target with network-theoretic methods, including bow-tie analysis and the Helmholtz-Hodge decomposition. We construct a retweet network from Japanese-language Twitter data collected between October 1, 2019 and May 1, 2021 with vaccine as a keyword. The resulting retweet network has a giant strongly-connected component (SCC), which can serve as the core of echo chambers. FALCON (Flow Analysis Tools for Large-Scale Complex Networks), developed by one of the authors with his collaborators, works very well to make the bow-tie structure of the network clearly visible. The giant SCC is so heterogeneous that it decomposes predominantly into three communities. Characterizing those communities in terms of the hashtags associated with their retweets, we find that the controversy over vaccination and the political conflict combine to produce three echo chambers opposed to one another.

09:50
Social Networks in Times of Turkey’s Currency Crisis

ABSTRACT. In the last two months of 2021, the Turkish Lira (TRY) lost about half of its value against the US dollar (USD). This happened after a relatively stable period of almost 20 months (from the beginning of 2020 until mid-October 2021) during which the USD/TRY exchange rate steadily increased by about 50%. The TRY regained some of its value after the evening on which a new financial instrument (“FX-Protected Deposit Accounts”) was introduced by the government. For this study, we collected users and Tweets around the currency crisis, built follow and favorite networks, and analyzed user behaviors before and after the deposit intervention. Our results for both network types show that influence is higher before the intervention than after it. People tend to follow users with more followers, especially before the event. Favorite networks show that users also like accounts that tweet about economic analysis, cryptocurrency, and financial solutions.

09:55
A simple model of knowledge scaffolding
PRESENTER: Franco Bagnoli

ABSTRACT. We introduce a simple model of knowledge scaffolding, simulating the process of building a corpus of knowledge through logical derivations starting from a set of "axioms". The idea around which we developed the model is that each new contribution, not yet present in the corpus of knowledge, can be accepted only if it is based on a certain number of items already belonging to the corpus. When a new item is acquired by the corpus, we impose a limit, which we call the ``jump'' in knowledge, on the maximum growth of knowledge per step. We analyze the growth over time of the corpus size and of the maximum knowledge, and our simulations show that both follow a power law. Using an approach based on a death-birth Markov process, we derive an analytical approximation of this behavior.

10:00
Gradual Network Sparsification and Georeferencing for Location-Aware Event Detection in Microblogging Services

ABSTRACT. Event detection in microblogging services such as Twitter has become a challenging research topic within the fields of social network analysis and natural language processing. Many works focus on the identification of general events, with event types ranging from political news and soccer games to entertainment. However, in application contexts like crisis management, traffic planning, or monitoring people’s mobility during pandemic scenarios, there is a strong need to detect localisable physical events. To address this need, this paper extends an existing event detection framework by combining machine learning-based geo-localisation of tweets with network analysis to reveal events on Twitter distributed in time and space. Gradual network sparsification is introduced to improve the detection of events of different granularity and to derive a hierarchical event structure. Results show that the proposed method is able to detect meaningful events together with their geo-locations. This constitutes a step towards using social media data to inform, for example, traffic demand models, to warn about infection risks at certain places, or to identify points of interest.

10:05
Exploring shareability networks of probabilistic ride-pooling problems
PRESENTER: Michal Bujak

ABSTRACT. Ride-pooling, where travellers share a vehicle to reach their destinations, is becoming an increasingly popular urban mobility alternative.

Thanks to pooling, co-travellers can reduce their travel costs, mobility platforms (like Uber and Lyft) can more efficiently utilise their fleets, and cities can reduce congestion.

Naturally, travellers sharing their rides form graph structures. While these so-called shareability networks are central to solving ride-pooling problems, they have not hitherto been explicitly analysed.

Here, we experiment with 147 travellers who want to share rides in New York and pool them with our ExMAS algorithm. We identify two kinds of graph structures emerging from ride-pooling problems: shareability graphs, in which travellers are nodes linked if they can travel together, and RV graphs, bipartite graphs with two kinds of nodes (travellers and rides/vehicles) linked if a traveller is part of a given ride. Both can be further divided into potential and matching graphs.

To further explore the structure of the underlying networks, we propose a probabilistic version of the pooling algorithm. By adding probabilistic noise to the utility formulas of the ride-pooling algorithm, we introduce behavioural non-determinism into the pooling problem. When the experiment is replicated multiple times, we observe the richer structures of weighted, probabilistic graphs.

We can then observe both stable matches with large weights and larger components in which pooling compositions vary substantially from day to day.

Such weighted shareability graphs, consistent with actual travel behaviour, can be useful not only for a deeper understanding of the still-challenging ride-pooling problems, but also to identify communities in ride-pooling networks, improve the attractiveness of pooling services, or control virus spreading in ride-pooling networks.

10:10
Effects of online social capital on academic performance
PRESENTER: Agusti Canals

ABSTRACT. The idea of social capital as the value obtainable from an individual’s social relationships has been used to study many organizational and social settings, but rarely virtual environments. We use data from an online higher education institution to examine how an individual’s social capital, derived from her position in the social structure, influences her performance levels. We confirm that social capital has a significant effect on achievement. Firstly, we find that centrality in the social network and cohesion have a positive effect. Secondly, we show that not only the network structure counts: the diversity of relationships is also important. Having access to heterogeneous peers also increases the performance level in learning processes. These findings suggest that the proven importance of social capital in face-to-face situations might translate into virtual environments, and they support the need to build or enhance ICT-mediated students' social networks in distance learning.

10:15
Executive women’s networks: the affinity rule of social capital

ABSTRACT. Social capital is a key asset individuals have to nurture for career progression. Yet access to and benefits from social capital may be contingent on a person’s gender. However, there is a dearth of empirical evidence about structural differences between the networks formed by men and women. In particular, we do not know much about the networks of those who have made it to the top. This paper aims at bridging this gap. Specifically, we apply network theory to analyze and compare differences, if any, in network formation between men and women executives.

10:20
Law and Success: A Complexity Perspective

ABSTRACT. This study, situated at the interface of complexity science and legal policy, explores how the understanding of market success provided by complex networks theory can influence the design of legal rules in various fields (such as intellectual property, contract law, corporate law, or constitutional law). The study’s method of inquiry is threefold. First, it investigates how legal systems perceive success. This exploration reveals that the law assumes a linear and proportionate relationship between quality and success, and that legal rules often use observable measures of success in market settings as proxies for (unobservable) qualities which society wishes to promote, reward, or protect. For example, legal rules often award the most successful trademarks and technologies broader intellectual property protection than less successful objects, assuming that their success represents remarkable value or extraordinary investment. Second, the paper contrasts this understanding with insights from complex networks theory about the nonlinear relations between success and quality, the power-law distribution of success in market settings, and the sensitivity of the success-quality relation to network properties such as connectivity. Finally, the analysis shows how the understanding of the quality-success relation provided by complexity theory can improve the design of legal rules, and gestures at ways of reforming legal theory and doctrine to reflect these insights.

10:25
Bayesian Approach to Uncertainty Visualization of Heterogeneous Behaviors in Modeling Networked Anagram Games
PRESENTER: Xueying Liu

ABSTRACT. Heterogeneous player behaviors are commonly observed in games. It is important to quantify and visualize these heterogeneities in order to understand variability in behaviors. Our work focuses on developing a Bayesian approach for uncertainty visualization in a model of networked anagram games. In these games, team members collectively form as many words as possible by sharing letters with their neighbors in a network. Heterogeneous player behaviors include great differences in numbers of words formed and the amount of cooperation among networked neighbors. Our Bayesian approach provides meaningful insights for inferring worst, average, and best player performance within behavioral clusters, overcoming previous model shortcomings. These models are integrated into a simulation framework to understand the implications of model uncertainty and players’ heterogeneous behaviors.

10:40-11:15 Coffee Break
10:40-11:15 Session Poster P1A: [1-7] Network Models
Structure of Core-Periphery Communities
PRESENTER: Peter Marbach

ABSTRACT. It has been experimentally shown that communities in social networks tend to have a core-periphery topology. However, there is still a limited understanding of the precise structure of core-periphery communities in social networks including the connectivity structure and interaction rates between agents. In this paper, we use a game-theoretic approach to derive a more precise characterization of the structure of core-periphery communities.

Analysis of Perturbed Networks

ABSTRACT. It is a frequent situation that the idealized assumptions on which a network model is built do not hold with sufficient accuracy. This motivates the question: what happens if some perturbations are introduced into the idealized network model? The perturbations may be arbitrary, but are assumed relatively small. At first glance, the question seems hopelessly entangled with the specific model considered. After all, the effect of a perturbation may well be strongly model-dependent. Surprisingly, however, we show that it is still possible to develop model-independent tools for this problem.

Sandpiles in Networks with Variable Topology: reconnected 2D grid

ABSTRACT. We study a BTW-like sandpile model, over a network which is obtained by a random sequence of reconnections of a square grid, avoiding the existence of isolated nodes and ensuring energy release once an avalanche starts. The distribution of the released energies departs smoothly from the case of the square grid, while still having a power law behavior. The Gini coefficient of grain distribution in the stationary state decreases monotonically as the number of reconnections increases, in contrast with the 1D case, where a transition at about half the maximum number of reconnections occurs.

N-actors conflict model on scale-free networks

ABSTRACT. We present the results of some numerical simulations of a non-linear conflict model in which the nodes (actors) interact with each other on networks with different scale-free topologies and different amounts of cooperative, competitive and mixed interactions. To measure the changes of the states of the actors as a function of the network topology and the different types of interactions among them, some metrics, in terms of the final steady state values of the actors, were defined. We find that the opinions of the actors are more similar to each other, which implies less possibility of conflict occurrence, when the interactions in the networks are mostly mixed, regardless of the network topology. Moreover, by using hierarchical networks we note that hub-to-hub connections exert a strong influence on the organization of the final states of the network actors. Our findings make testable predictions on how the dynamics of a conflict depends on the strategies chosen by the actors.

Optimal network robustness in continuously changing degree distributions
PRESENTER: Masaki Chujyo

ABSTRACT. Realizing networks that are highly tolerant to malicious attacks is an important issue, since many real-world networks are extremely vulnerable to attacks. We therefore investigate the optimal robustness of connectivity against attacks on networks whose degree distributions range continuously from power-law to exponential or narrower ones. We find numerically that smaller variances of the degree distribution lead to higher robustness in this range. Our results provide important insights toward optimal robustness against attacks under changing degree distributions.

Analyzing Community-aware Centrality Measures Using The Independent Cascade Model
PRESENTER: Stephany Rajeh

ABSTRACT. Key nodes are crucial in enhancing or inhibiting the diffusion process in complex networks. Classical centrality measures are one of the main methods of identifying key nodes. However, such centrality measures ignore the network's community structure. Because communities are so important in real-world networks, using community-aware centrality measures is more effective. Previous research has examined the Susceptible-Infected-Recovered (SIR) and the Linear Threshold (LT) models on real-world networks without considering synthetic networks with controlled properties. This study looks into the consistency of prior assessments using the Independent Cascade (IC) model on real-world and synthetic networks. Using a set of three real-world networks and three synthetic networks, we analyze the behavior of eight community-aware centrality measures. Results show that targeting bridges or highly inter-linked nodes results in better diffusion with limited resources. When the budget is high, targeting distant hubs is more effective. Moreover, setting a uniform threshold and a weaker community structure strength hinders the diffusive power of the community-aware centrality measures.
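
The Independent Cascade model used in the evaluation above is straightforward to simulate. The sketch below uses a uniform activation probability p and a hypothetical toy graph (both assumptions for illustration, not the paper's experimental setup):

```python
import random

def independent_cascade(adj, seeds, p=0.1, rng=None):
    """Simulate one Independent Cascade run on a graph.

    adj: dict mapping node -> list of neighbours.
    seeds: initially active nodes.
    p: uniform activation probability (a common simplification).
    Returns the set of nodes active when the cascade dies out.
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj[u]:
                # Each newly active node gets exactly one chance to
                # activate each inactive neighbour, with probability p.
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active

# Hypothetical toy graph: a triangle attached to a short path.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
spread = independent_cascade(adj, seeds={0}, p=1.0)
```

Averaging the spread over many runs estimates the diffusive power of a seed set, which is how community-aware centrality measures can be compared under the IC model.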

A Biased Random Walk Scale-Free Network Growth Model with tunable Clustering
PRESENTER: Anurag Singh

ABSTRACT. Complex networks appear naturally in many real-world situations. A power law is generally a good fit for their degree distribution. The popular Barabasi-Albert model (BA) combines growth and preferential attachment to model the emergence of the power law. One builds a network by adding new nodes that preferentially link to high-degree nodes in the network. One can also exploit random walks. In this case, the network growth is determined by choosing parent vertices by sequential random walks. The BA model's main drawback is that the sample networks' clustering coefficient is low, while typical real-world networks exhibit a high clustering coefficient. Indeed, nodes tend to form highly connected groups in real-world networks, particularly social networks. In this paper, we introduce a Biased Random Walk model with two parameters allowing us to tune the degree distribution exponent and the clustering coefficient of the sample networks. This efficient algorithm relies on local information to generate more realistic networks reproducing known real-world network properties.
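
Random-walk attachment can be illustrated with a minimal sketch: each new node links to the endpoint of a short random walk started at a uniformly chosen existing node, which acts as implicit preferential attachment because walks terminate at high-degree nodes more often. This is a plain illustration of the mechanism only; the two tunable parameters of the authors' biased model are not reproduced here.

```python
import random

def random_walk_growth(n, walk_len=2, m=1, seed=None):
    """Grow an undirected graph to n nodes.

    Each new node links to the endpoints of `m` random walks of
    length `walk_len`, each started at a uniformly chosen node.
    """
    rng = random.Random(seed)
    adj = {0: [1], 1: [0]}              # start from a single edge
    for new in range(2, n):
        targets = set()
        while len(targets) < min(m, len(adj)):
            v = rng.choice(list(adj))   # uniform start node
            for _ in range(walk_len):   # short random walk
                v = rng.choice(adj[v])
            targets.add(v)              # walk endpoint becomes a parent
        adj[new] = []
        for t in targets:
            adj[new].append(t)
            adj[t].append(new)
    return adj

g = random_walk_growth(200, walk_len=2, m=1, seed=42)
```

Longer walks push the attachment kernel closer to pure degree-proportional selection, while walk_len and m are the natural knobs for tuning degree heterogeneity and clustering in variants of this scheme.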

10:40-11:15 Session Poster P1B: [8-10] Multilayer Networks
On the Effectiveness of Using Link Weights and Link Direction for Community Detection in Multilayer Networks
PRESENTER: Daiki Suzuki

ABSTRACT. Multilayer networks are useful representations of real-world complex networks, and community detection in multilayer networks has been an area of active research. There are several options for the graph representation of a multilayer network, for example, directed or undirected and weighted or unweighted. Although these options may affect the results of community detection in a multilayer network, which representations are effective for community detection has not yet been clarified. In this paper, we experimentally investigate how the graph representation of a multilayer network affects the results of community detection. Through experiments using multilayer networks of Twitter users, we show that using a directed graph for each layer of a multilayer network contributes to improved accuracy in estimating the communities of Twitter users. We also show that when there is a clear oppositional structure among nodes in a network, manipulating link weights to emphasize the oppositional structure improves the accuracy of community detection.

Random Matrix Analysis of Multiplex Networks
PRESENTER: Tanu Raghav

ABSTRACT. We investigate the spectra of adjacency matrices of multiplex networks within the random matrix theory (RMT) framework. Through extensive numerical experiments, we demonstrate that upon multiplexing two random networks, the spectra of the combined multiplex network exhibit a superposition of two Gaussian orthogonal ensembles (GOEs) for very small multiplexing strength, followed by a smooth transition to GOE statistics as the multiplexing strength increases. Interestingly, randomness in the connection architecture of at least one layer, introduced by randomly rewiring a 1D lattice, may govern the nearest-neighbor spacing distribution (NNSD) of the entire multiplex network, and can in fact drive a transition from Poisson to GOE statistics or vice versa. Notably, this transition occurs for a very small number of random rewirings, corresponding to the small-world transition. Hence, a single layer represented by a small-world network is enough to yield GOE statistics for the entire multiplex network. Since the spectra of adjacency matrices of underlying interaction networks are thought to be related to the dynamical behavior of the corresponding complex systems, the investigations presented here have implications for achieving better structural and dynamical control of systems represented by multiplex networks under structural perturbation of only one of the layers.

Structural Cores and Problems of Vulnerability of Partially Overlapped Multilayer Networks

ABSTRACT. The concept of the aggregate-network of a multilayer network (MLN), which in many cases significantly simplifies the study of intersystem interactions, is introduced, and the properties of its k-cores are investigated. The notion of p-cores is defined, with the help of which the components of an MLN that are directly involved in the implementation of intersystem interactions are distinguished. Methods of reducing the complexity of multilayer network models are investigated, which allow us to significantly decrease their dimensionality and to better understand the processes that take place in intersystem interactions of different types. Effective scenarios of simultaneous group and system-wide targeted attacks on partially overlapped multilayer networks are proposed, focused on the transition points of the MLN through which the intersystem interactions are actually implemented. It is shown that these scenarios can also be used to solve the inverse problem, namely, determining which elements of the MLN should be blocked first to prevent, for example, the accelerated spread of dangerous infectious diseases.

11:15-13:00 Session Oral O1A: Structural Network Measures
11:15
Local Assortativity in Weighted and Directed Complex Networks
PRESENTER: Uta Pigorsch

ABSTRACT. Assortativity measures the tendency of a vertex to bond with another based on similarity. It is commonly defined as the correlation coefficient between the excess degrees of both ends of an edge, and is often associated with the robustness of a network against exogenous shocks. In this context, it is interesting to know which vertices or edges of a network are the most endangering on the one hand, and which are the most protective on the other. The assortativity coefficient, being a global measure, however, cannot answer such questions. A local assortativity measure is needed, either vertex- or edge-based, to identify the vertices or edges that contribute most to the global assortativity structure of a network. Many real-world networks are weighted; however, local assortativity has so far been considered exclusively for unweighted networks. By generalizing this concept to weighted and (un)directed networks, we unify two approaches used in the literature, and derive distinct measures that allow us to determine the assortativeness of individual edges and vertices as well as of entire components of a weighted network. We demonstrate the usefulness of our measures by applying them to theoretical and real-world networks. Along the way, we also explain how to compute local assortativity profiles, which are informative about the pattern of local assortativity with respect to edge weight.

11:30
Comparison of metrics for measuring the editor scatteredness and article complexity on Wikipedia

ABSTRACT. As collective knowledge spaces such as Wikipedia grow and become more popular, evaluating them and their constituents is becoming increasingly important. To measure the quality of the editors and articles on Wikipedia, self-consistent metrics for editor scatteredness and article complexity have previously been introduced, based on an editor–article bipartite network constructed from the edit relationship. Article complexity works well as a measure of an article’s level of complexity, and we find that the self-consistent metrics of editor scatteredness and article complexity are equivalent to their relative degrees when editors select articles randomly for editing. To clarify the effects of randomization on the scatteredness–complexity measure, we test it on a network randomized with different shuffling methods using actual edit histories from English Wikipedia. Our results suggest that the self-consistent metrics reflect not only the degree distribution of the editors or articles but also the local nested structure.

11:45
Estimating Affective Polarization on a Social Network
PRESENTER: Marilena Hohmann

ABSTRACT. Political polarization is a much-discussed topic in both academia and public debate. Especially the affective dimension of polarization – i.e., increasing hostility between political adversaries – is often said to pose threats to social collaboration and democracy. Despite the timeliness and relevance of this topic, current approaches to quantifying affective polarization do not fully capture 1) the interplay between the difference among two individuals and the hostility of their interaction, and 2) the structure of the network of interactions at large. To address this methodological gap, we propose a network-aware measure of affective polarization that is based on the Pearson correlation on complex networks. We conduct several experiments that show how our measure captures both components of affective polarization and how it compares to other approaches used in the literature. Subsequently, we apply our measure to a large-scale Twitter data set on Covid-19. The results show that affective polarization in the Covid-19 debate on Twitter was low in early February 2020 and then increased to moderately high levels in the following months before reaching very high levels in July 2020.

12:00
Generalizing Homophily to Simplicial Complexes
PRESENTER: Arnab Sarker

ABSTRACT. Group interactions occur frequently in social settings, yet their properties beyond pairwise relationships in network models remain unexplored. In this work, we study homophily, the nearly ubiquitous phenomenon wherein similar individuals are more likely than random to form connections with one another, and define it on simplicial complexes, a generalization of network models that goes beyond dyadic interactions. While some group homophily definitions have been proposed in the literature, we provide theoretical and empirical evidence that prior definitions mostly inherit properties of homophily in pairwise interactions rather than capturing the homophily of group dynamics. Hence, we propose a new measure, $k$-simplicial homophily, which properly identifies homophily in group dynamics. Across 16 empirical networks, $k$-simplicial homophily provides information uncorrelated with homophily measures on pairwise interactions. Moreover, we show the empirical value of $k$-simplicial homophily in identifying when metadata on nodes is useful for predicting group interactions, whereas previous measures are uninformative.

12:15
Reconstructing degree distribution and triangle counts from edge-sampled graphs
PRESENTER: Naomi Arnold

ABSTRACT. It is often the case that, due to prohibitively large size or to limits to data collecting APIs, it is not possible to work with a complete network dataset and sampling is required. A type of sampling which is consistent with Twitter API restrictions is uniform edge sampling. In this paper, we propose a methodology for the recovery of two fundamental network properties from an edge-sampled network: the degree distribution and the triangle count (we estimate the totals for the network and the counts associated with each edge). We use a Bayesian approach and show a range of methods for constructing a prior which does not require assumptions about the original network. Our approach is tested on two synthetic and two real datasets with diverse degree and triangle distributions.
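
For intuition: under uniform edge sampling with retention probability p, each edge survives with probability p, so each triangle survives with probability p^3, and a naive scaling estimate divides the observed triangle count by p^3. The sketch below implements only this naive estimator on a hypothetical toy graph; it is not the Bayesian method proposed in the paper, which avoids such high-variance scaling:

```python
import random
from itertools import combinations

def sample_edges(edges, p, seed=None):
    """Keep each edge independently with probability p (uniform edge sampling)."""
    rng = random.Random(seed)
    return [e for e in edges if rng.random() < p]

def count_triangles(edges):
    """Count triangles via neighbourhood intersections."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    t = 0
    for u, v in edges:
        t += len(nbrs[u] & nbrs[v])  # each triangle counted once per edge
    return t // 3                    # every triangle has three edges

def naive_triangle_estimate(sampled_edges, p):
    """A triangle survives sampling with probability p**3, so rescale."""
    return count_triangles(sampled_edges) / p ** 3

# Hypothetical example: the complete graph K6 has C(6,3) = 20 triangles.
k6 = list(combinations(range(6), 2))
full = count_triangles(k6)
est = naive_triangle_estimate(sample_edges(k6, 0.8, seed=1), 0.8)
```

The estimator is unbiased but its variance explodes as p shrinks, which is one motivation for the prior-based Bayesian recovery described in the abstract.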

12:30
Winner does not take all: contrasting centrality in adversarial networks
PRESENTER: Anthony Bonato

ABSTRACT. In adversarial networks, edges correspond to negative interactions such as competition or dominance. We introduce a new type of node called a low-key leader in adversarial networks, distinguished by contrasting the centrality measures of CON score and PageRank. We present a novel hypothesis that low-key leaders are ubiquitous in adversarial networks and provide evidence by considering data from real-world networks, including dominance networks in 172 animal populations, trading networks between G20 nations, and Bitcoin trust networks. We introduce a random graph model that generates directed graphs with low-key leaders.

12:45
Low-order descriptors of high-order dependencies

ABSTRACT. O-information is an information-theoretic metric that captures the overall balance between redundant and synergistic information shared by groups of three or more variables. To complement the global assessment provided by this metric, here we propose the gradients of the O-information as low-order descriptors that can effectively describe how high-order effects are localised across a system of interest. Our theoretical and empirical analyses demonstrate the potential of these gradients to highlight the contribution of variables in forming high-order informational circuits.

11:15-13:00 Session Oral O1B: Machine Learning & Networks
11:15
Model-independent hyperbolic embedding of directed networks
PRESENTER: Bianka Kovács

ABSTRACT. Hyperbolic network models are known to be able to generate scale-free networks that are also highly clustered and can even have a strong community structure. The success of hyperbolic network models provides a strong motivation for the development of hyperbolic embedding algorithms that tackle the inverse problem, aiming to find the optimal hyperbolic coordinates of the nodes based on the network topology. Recently, several embedding algorithms have been proposed that couple Euclidean dimension reduction techniques with likelihood maximisation according to a hyperbolic network model. However, these methods assumed undirected graphs as inputs and could not grasp the asymmetry of the connections occurring in directed networks. In this work, we propose a new, model-independent method with which we convert $d$-dimensional Euclidean embeddings, created by HOPE and our new method named TREXPEN, into hyperbolic ones, even for directed networks. In addition, we introduce a new algorithm (TREXPIC) capable of mapping the topology of directed or undirected networks directly into the hyperbolic space.

11:30
SignedS2V: structural embedding method for signed networks
PRESENTER: Shu Liu

ABSTRACT. Signed networks are widely observed and constructed from the real world, and they are valuable for the rich information carried by the signs of their edges. Several embedding methods have been proposed for signed networks, mainly focusing on proximity similarity and the fulfillment of social psychological theories; however, no signed network embedding method has focused on structural similarity. In this research, we therefore propose a novel notion of degree in signed networks, a distance function to measure the similarity between two such complex degrees, and a node-embedding method based on structural similarity. Experiments on five network topologies, an inverted karate club network, and three real networks demonstrate that our proposed method embeds nodes with similar structural features close together and that link sign prediction from its embeddings outperforms state-of-the-art methods.

11:45
HM-LDM: A Hybrid-Membership Latent Distance Model
PRESENTER: Nikolaos Nakis

ABSTRACT. A central aim of modeling complex networks is to accurately embed networks in order to detect structures and predict link and node properties. The latent space models (LSM) have become prominent frameworks for embedding networks and include the latent distance model (LDM) and the latent eigenmodel (LEM) as the most widely used LSM specifications. For latent community detection, the embedding space in LDMs has been endowed with a clustering model whereas LEMs have been constrained to part-based non-negative matrix factorization (NMF) inspired representations promoting community discovery. We presently reconcile LSMs with latent community detection by constraining the LDM representation to the D-simplex forming the hybrid-membership latent distance model (HM-LDM). We show that for sufficiently large simplex volumes this can be achieved without loss of expressive power whereas by extending the model to squared Euclidean distances, we recover the LEM formulation with constraints promoting part-based representations akin to NMF. Importantly, by systematically reducing the volume of the simplex, the model becomes unique and ultimately leads to hard assignments of nodes to simplex corners. We demonstrate experimentally how the proposed HM-LDM admits accurate node representations in regimes ensuring identifiability and valid community extraction. Importantly, HM-LDM naturally reconciles soft and hard community detection with network embeddings through a simple continuous optimization procedure on a volume-constrained simplex that admits the systematic investigation of trade-offs between hard and mixed membership community detection.

12:00
The Structure of Interdisciplinary Science: Uncovering and Explaining Roles in Citation Graphs

ABSTRACT. Role discovery is the task of dividing the set of nodes on a graph into classes of structurally similar roles. Modern strategies for role discovery typically rely on graph embedding techniques, which are capable of recognising complex local structures. However, when working with large, real-world networks, it is difficult to interpret or validate a set of roles identified according to these methods. In this work, motivated by advancements in the field of explainable artificial intelligence (XAI), we propose a new framework for interpreting role assignments on large graphs using small subgraph structures known as graphlets. We demonstrate our methods on a large, multidisciplinary citation network, where we successfully identify a number of important citation patterns which reflect interdisciplinary research.

12:15
Detection of Sparsity in Multidimensional Data Using Network Degree Distribution and Improved Supervised Learning with Correction of Data Weighting
PRESENTER: Shinya Ueno

ABSTRACT. Multidimensional data are ubiquitous in a wide range of applications, from the latest state-of-the-art science and technology to specific social issues, and they have long been analyzed using methods such as regression analysis and machine learning. However, they are rarely obtained as complete data and contain biases and deficiencies to varying degrees. In this study, we formed a network from a multidimensional dataset and used its degree distribution to detect data sparsity. Although model analysis based on the degree distribution has been conducted for many years, sparsity detection has not previously been a target of degree distribution analysis. Furthermore, we attempted to increase the accuracy and precision of supervised learning by applying regressive weighting according to node grouping in the degree distribution spectrum. This algorithm expands the range of utilization of incomplete data, together with other promising advances in complex networks.

12:30
Link prediction in blockchain online social networks with textual information
PRESENTER: Manuel Dileo

ABSTRACT. In network science, link prediction is one of the most powerful tools, successfully applied in different settings, such as predicting protein-to-protein interactions or network evolution in online social networks. For the latter, good performance has been obtained by leveraging structural features only and considering coarse-grained temporal resolutions. However, we have a more limited understanding of how structure-based approaches perform on attribute-enriched temporal networks and to what extent the temporal resolution affects performance. In this study, we focus on the former issue by evaluating the impact of content-based similarity on link formation and by highlighting the role of node attributes inferred from the produced textual content. Specifically, we apply state-of-the-art graph neural networks to a high-resolution temporal dataset gathered from a growing online social network, along with attributes derived from textual content created by the users. The results on link prediction show that textual features can enhance prediction performance, but deep learning models, despite being promising solutions for this task, may also suffer from the introduction of structured properties inferred from text.

12:45
The Price of a Skill: Why the Value of a Skill Depends on its “Neighbours”
PRESENTER: Fabian Stephany

ABSTRACT. The global workforce is urged to constantly re-skill, as technological change favours particular new skills while making others redundant. For many of the emerging jobs, precise skill requirements are constantly evolving. In this fast-changing labour market, systematic oversight is key: to know which skills are most marketable and have a sustainable demand. We propose a model for skill evaluation that attaches a “price tag” based on near real-time online labour market data. Thereby, we can isolate the economic return of an individual skill measured in USD per hour. Our model suggests that the value of a specific skill is determined by supply, skill domain, and skill complementarity. Our findings confirm contemporary theoretical assumptions on skill evaluations and highlight that the return on a specific skill depends on the value of its complements. We illustrate why “AI skills” are particularly valuable, as they are frequently combined with other high-value skills. The value of these in-demand skills has been increasing in recent years, as our model indicates. Our findings reveal the most valuable specialisations in AI skills for a set of 97 occupations. The model and metrics of our work can inform digital re-skilling to reduce labour market mismatches. In cooperation with online platforms and education providers, researchers and policy makers should consider using this blueprint to provide learners with personalised skill recommendations that complement their existing capacities and fit their occupational background.

11:15-13:00 Session Oral O1C: Network Analysis
11:15
The Global Network of Embodied Nitrogen Flows - Across Countries and Sectors
PRESENTER: Kaan Hidiroglu

ABSTRACT. Through industrialization and globalization, anthropogenic alterations of the biogeochemical nitrogen (N) cycle have caused more countries to deal with excess N in the environment. Most food products and all livestock contain a certain amount of N, as ~16% of protein consists of N. This embodied N is traded worldwide, causing an imbalance in many countries. FABIO, FAOSTAT, IFASTAT and Exiobase data on global trade and production of food products and livestock were converted to nitrogen and combined with fertilizer usage to build the global embodied nitrogen network. Properties of this embodied nitrogen network can be further revealed through network theory and ecological network analysis, modelling the global trade of nitrogen as a food web. The trade balance of countries and their efficiency vary widely across the globe, with many countries reaching high accumulation or loss of nitrogen. The fitness of the network was determined to be suboptimal, making it vulnerable to disruption. To increase the sustainability of the network (global nitrogen trade), further policy regulation is needed. This would allow a better organization of flows, reducing the harmful effects of decoupling animal husbandry from feed production.

11:30
From time series to networks in R with the ts2net package

ABSTRACT. Extended abstract

11:45
A Path-Based Approach to Analyzing the Global Liner Shipping Network
PRESENTER: Timothy LaRock

ABSTRACT. The maritime shipping network is the backbone of global trade. Data about the movement of cargo through this network comes in various forms, from ship-level Automatic Identification System (AIS) data, to aggregated bilateral trade volume statistics. Multiple network representations of the shipping system can be derived from any one data source, each of which has advantages and disadvantages. In this work, we examine data in the form of liner shipping service routes, a list of walks through the port-to-port network aggregated from individual shipping companies by a large shipping logistics database. This data is inherently sequential, in that each route represents a sequence of ports called upon by a cargo ship (see Figure~\ref{fig:explanatoryFigure}(a)).

Previous work has analyzed this data without taking full advantage of the sequential information \cite{xu2020modular}. Our contribution is to develop a path-based methodology for analyzing liner shipping service route data. We achieve this by computing navigational trajectories through the network that both respect the directional information in the shipping routes and minimize the number of cargo transfers between routes, a desirable property in industry practice. We call these trajectories \emph{minimum-route paths} \cite{larock_path-based_2022}. We compare these paths with those computed using other network representations of the same data, finding that our approach results in paths that are longer in terms of both network and nautical distance. We further use these trajectories to re-analyze the role of a previously-identified structural core through the network, as well as to define and analyze a measure of betweenness centrality for nodes and edges.

\begin{figure} \centering \includegraphics[scale=0.6]{fig/explanatoryFigure.pdf} \caption{} \label{fig:explanatoryFigure} \end{figure}

In Figure~\ref{fig:explanatoryFigure}(a), we illustrate the input data, which consist of three routes, labeled $r_1$, $r_2$, and $r_3$, visiting the ports A, B, C, D, E. Figure~\ref{fig:explanatoryFigure}(b) shows three graph representations of the input data. The path graph is the traditional directed network representation of the routes, where an edge exists from u to v if that edge appears in at least one route. This graph represents how ships and cargo can move through the network. The directed co-route graph is also a directed graph, but an edge exists from port u to port v if, in at least one route, port v appears among the ports of call succeeding port u. The length of the shortest path between any pair of nodes in the co-route graph is the minimum-route distance (distances from A shown in (b)). In the undirected co-route graph, every route is made into a clique, i.e. a fully connected undirected graph. This representation was used for service route data in previous work \cite{xu2020modular}, emphasizing that cargo transportation between any two ports in the same route can be realized by a single vessel. All minimum-route paths between A and D, which require two routes and do not allow any port to appear more than once, are shown in (c).

In Figure~\ref{fig:linkandlengthpct}, we analyze the role of the \emph{structural core} of the network, identified previously by Xu et al. \cite{xu2020modular}. Ports in the structural core are part of a dense subset of ports that mediate between different modules of the network. Edges between ports are categorized as \emph{local} edges if neither node is in the core; \emph{feeder} edges if one node is in the core; or \emph{core} edges if both nodes are in the core.

Following Xu et al. \cite{xu2020modular}, we compute the percentage of edges and shipping length (real shipping distance between ports, measured in kilometers) on paths between non-core nodes by edge type in the original work using the undirected co-route graph (left) and with our updated minimum-route path approach (right). The first bar in each plot of Figure~\ref{fig:linkandlengthpct} shows the percentage of edges in the graph that fall into each category (sum of in and out feeders in gray to the left of each bar in the right plot, reflecting the fact that minimum-route paths maintain directionality information). The second bar shows the percentage of total shipping length for edges in each category in the graph. The numbers in parentheses are the length percentage divided by the edge percentage. The majority of edges in both representations are local (64.1\% and 64.8\%), but local edges make up relatively less of the total length among all edges in the graphs.

The third bar in each plot shows the percentage of length by category for all paths between peripheral ports (75\% of shortest paths in the previous analysis vs. 26\% here), while the fourth bar shows the same quantity but only for paths that pass through at least 1 core edge (25\% previously vs. 74\% here). Using shortest paths through the undirected co-route graph we find that the role of the core was under-estimated for all paths (16.6\% vs. 24.2\%). However, the length-to-edge percentage ratios are similar for all categories (core: 5.2 vs. 4.6; feeder: 1.7 vs. 1.3; local: 0.4 vs. 0.5). Limiting to only paths that pass through the structural core, the role of the core appears to have been overestimated (62.0\% vs. 27.7\%) and the role of local edges was underestimated in the previous work (4.6\% vs. 33.3\%). In this case, the length-to-edge percentage ratio for the core is much smaller (19.4 vs. 5.3), while the feeder (1.0 vs. 1.3) and local (0.1 vs. 0.5) ratios are again similar. Out-feeder edges, which are edges that leave the structural core, make up a higher percentage of both edges and lengths than in-feeder edges, which go from outside the core to inside. Therefore we find that the role of the core was simultaneously \emph{underestimated} for all shortest paths between non-core ports and \emph{overestimated} for only paths between non-core ports that pass through at least 1 core edge.

\begin{figure} \centering \includegraphics[width=0.47\textwidth]{fig/linkAndLengthPctReproduce.pdf}\hspace{0.25cm}\includegraphics[width=0.47\textwidth]{fig/linkAndLengthPct-115.pdf} \caption{} \label{fig:linkandlengthpct} \end{figure}

While our analysis indicates that previous work did not accurately estimate the role of the structural core, the main conclusion from that work--that the structural core plays an outsized role in mediating navigation of cargo through the network--still follows from our analysis. We further propose a method of computing the centrality of ports in the shipping network using minimum-route paths and validate this measure against external measures of port and edge importance, finding that our measure is at least as good as other indicators for throughput- and capacity-based node and edge rankings, although simpler network indicators are better correlated with country-aggregated edge importance.
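The co-route construction described above can be sketched in a few lines. The following toy example (hypothetical routes and code, not the authors' implementation or data) builds the directed co-route graph and computes a minimum-route distance as a BFS shortest path, where each hop corresponds to a single route:

```python
from collections import deque

# Toy routes: edge u -> v in the directed co-route graph whenever v
# appears anywhere after u in at least one route, i.e. v is reachable
# from u on a single vessel. Illustrative data, not the paper's.
routes = [
    ["A", "B", "C"],   # r1
    ["B", "D", "E"],   # r2
    ["C", "E"],        # r3
]

co_route = {}
for route in routes:
    for i, u in enumerate(route):
        for v in route[i + 1:]:
            co_route.setdefault(u, set()).add(v)

def min_route_distance(src, dst):
    """BFS shortest-path length in the directed co-route graph; each
    hop uses one route, so the result is the number of routes needed."""
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == dst:
            return d
        for nxt in co_route.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return None  # dst not reachable respecting route direction

print(min_route_distance("A", "E"))  # 2, e.g. A -(r1)-> C -(r3)-> E
```

Recovering the minimum-route \emph{paths} themselves (with the no-repeated-port constraint) requires additional bookkeeping on top of this distance computation.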

12:00
Understanding sectoral integration in energy systems through complex network analysis
PRESENTER: Andrea Diaz

ABSTRACT. This paper studies the concept of sectoral integration of energy systems from a network perspective. In the energy arena, the transition towards a cleaner use of energy has led to a series of changes in how we consume energy, the energy vectors we use to satisfy our needs and, in general, the configuration of our energy systems. These developments add complexity to our systems as their production and consumption configurations evolve. The concept of sectoral integration is recent and does not yet have a commonly agreed-upon definition or a consistent measuring approach. We show that network analysis can be used to explore this evolution, allowing the degree of integration of existing systems to be quantified. Using a stylized model, we propose a series of global and local measures, focusing on different parts of the energy system and allowing system integration to be measured quantitatively. We then illustrate the developed measures by analysing the evolution of two European countries’ energy systems over the recent past (1990–2019).

12:15
Building co-morbidity networks via Bayesian network reconstruction

ABSTRACT. Patients that simultaneously suffer multiple long-term health conditions pose a problem to current healthcare systems, as these are configured for individual conditions and overlook their interaction. A particularly useful approach to represent co-morbidity data is via networks, where nodes correspond to conditions and edges represent their relations. However, determining the degree of co-morbidity and deciding on their relevance is not trivial. Different measures have been used in the literature for this purpose (e.g. relative risks, $\phi$-correlations, cosine index, etc), but they tend to be biased against certain conditions, and determining thresholds for pruning the network remains an arbitrary process. Here, we adapt Bayesian network reconstruction to infer the network of relevant associations between long-term health conditions without any bias or need to choose an arbitrary threshold. Instead, relevant associations are determined via a statistically sound method that accounts for the noise and uncertainty in the data. We apply the method to a primary care dataset and show how it is able to detect relevant co-morbidity pairs that previous methods in the literature are known to overlook.
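As a point of reference for the conventional measures the abstract mentions, here is a minimal sketch of one of them, the relative risk of a condition pair, computed from toy binary patient records (hypothetical data and names, not from the study):

```python
# Relative risk of co-occurrence: RR(a, b) = (C_ab * N) / (C_a * C_b),
# where C_a, C_b count patients with each condition, C_ab counts
# patients with both, and N is the number of patients.
# Illustrative sketch with made-up records, not the paper's data.
patients = [
    {"diabetes", "hypertension"},
    {"diabetes", "hypertension", "asthma"},
    {"asthma"},
    {"hypertension"},
]

def relative_risk(a, b, records):
    n = len(records)
    c_a = sum(a in r for r in records)              # patients with a
    c_b = sum(b in r for r in records)              # patients with b
    c_ab = sum(a in r and b in r for r in records)  # patients with both
    return (c_ab * n) / (c_a * c_b)

# RR > 1 suggests the pair co-occurs more often than expected by chance.
print(relative_risk("diabetes", "hypertension", patients))
```

Thresholding scores of this kind is precisely the arbitrary step that the Bayesian reconstruction approach described above is designed to avoid.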

12:30
Graph clustering of large trade network: case study of Russian wood industry
PRESENTER: Dmitriy Rusakov

ABSTRACT. Introduction. The Russian government's digitalization drive has produced a great variety of traceability systems in the country. Milk, furs, drugs, alcoholic drinks, clothes, tobacco, and wood are among the controlled goods, and it is planned to trace all traded goods by 2024. The Unified State Automated Information System (USAIS) “Accounting Timber and Transactions with It” was launched in 2015 [1]. Legal entities and sole traders must provide information about their wood trade (timber and lumber) transactions. A significant part of the data is open: transaction id, ITN or TIN and organization name for both buyers and sellers, date, and deal volume in cubic meters. We analyzed wood trade relationships between Russian companies in 2020. Address and economic activity were extracted from the Federal Tax Service of Russia; addresses were geocoded with OSM Nominatim. The table of deals was converted into a graph, where vertices are companies and edges are transactions; the edge weight is the volume of a wood deal. The obtained graph has 16 thousand vertices and 26 thousand edges, with a total trade volume (sum of edge weights) of 113 million cubic meters. The graph was clustered using the Leiden algorithm [2]. A special method was developed to avoid the resolution-limit problem, whereby modularity-based algorithms merge some medium and small clusters into large ones: every Leiden-created cluster was drawn in abstract space using the Fruchterman-Reingold layout [3], and clusters were then extracted from the layout using the meanshift algorithm [4]. Thus, many Leiden-created clusters were divided into component parts. Both the Leiden algorithm and the Fruchterman-Reingold layout are non-deterministic, so we also developed an approach to restrict the uncertainty of the clustering. Results. The defined clusters have various sizes and content and are rather autonomous: the average share of intra-cluster trade is 93%, and the modularity score of our partition is 0.863.
Every cluster was described using open sources from the Internet. The content of the clusters does not contradict the logic of wood industry processes. Some clusters look like classical Porter clusters [5]: many companies linked in technological chains, with competition inside them, as a considerable share of the companies have the same specialization. Other clusters look like Kolosovsky's spatial-industrial complexes (SIC) [6]. SIC clusters usually consist of several large manufacturers, each with a unique specialization within the cluster, and their many small “satellites”. Importantly, the main participants of SIC clusters do not compete; they can only cooperate with each other within production chains. We defined five morphological types of clusters. A vertical monocentric cluster consists of one large enterprise and many medium-sized and small firms. A vertical polycentric cluster consists of several (usually 2 or 3) large enterprises and many medium-sized and small firms. All links in vertical clusters run between small or medium-sized firms and a large enterprise; the small and medium-sized firms have no or very few links with each other. Vertical clusters often look like SICs. Horizontal clusters are well-connected groups of enterprises of different sizes; their link density is usually very high, and they sometimes contain several subclusters. Horizontal clusters often look like Porter clusters. Dendritic clusters are linear chains of enterprises (or small groups of enterprises) related to each other sequentially. They are widespread in Siberia, and most are situated in underdeveloped territories; they may be kernels or constituent parts of a future vertical or horizontal cluster. Simple clusters are very small groups of enterprises whose graph layout looks like a little chain, triangle, square, or the like.
Their nature is very diverse: a little local group (several small firms from neighboring towns), a pair of huge plants in the middle of the Siberian periphery, a special wood-trade company of the Russian railroads, etc. There are three factors of clustering in the wood industry of Russia. The first is production chains: the direct sale and purchase of raw materials by wood industry enterprises without traders. There are four types of production chains: pulp, plywood, lumber and chipboard chains. The second is demand chains: the sale and purchase of raw materials by wood industry enterprises via traders. There are three types of demand chains: redistribution of raw material between manufacturers within the cluster, accumulation of wood volumes for enterprises outside the cluster, and wood trading between traders within the cluster (perhaps a kind of speculation). The third is a common holder: there are plenty of clusters containing groups of enterprises that belong to the same holder or group of holders. The last step of our study was to plot the clusters on the map; most clusters form rather compact areas, which we call wood trade regions. Their borders often coincide with existing administrative boundaries, but sometimes their areas occupy part of a region or several regions simultaneously (fig. 1). Economic cluster detection is usually based on computing the location quotient by units of administrative division: a cluster exists in a unit if the location quotient exceeds a certain threshold. This method is common practice all over the world, both in the EU [7] and the USA [8], but it is a rather rigid approach: enterprises from one unit of administrative division may have no economic links. Wood trade clusters of the Russian forest industry sometimes do not correspond to the borders of regions, cities and towns (fig. 1).
Summary. Information about trade deals from USAIS is a new data source: analysis methods have not yet been developed, and its potential is not clear. Information about all deals between enterprises makes it possible to study economic relations at the most granular level. The key question of our study is whether enterprises break up into separate groups. A method of graph clustering was developed, combining the Leiden algorithm, the Fruchterman-Reingold layout and the meanshift algorithm. The resulting clusters have various sizes and content; the average share of intra-cluster trade is 93%, and the modularity score of our partition is 0.863. We defined five morphological types of clusters: vertical monocentric, vertical polycentric, horizontal, dendritic and simple. There are three main factors of clustering in the wood industry of Russia: production chains, demand chains, and a common holder. Finally, the map of the defined clusters was plotted; most clusters form compact areas, i.e. wood trade regions. We have shown that existing approaches to cluster detection based on the location quotient often reflect economic networks incorrectly: proximity often does not mean connectivity.
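The layout-splitting step of the pipeline can be illustrated with a flat-kernel mean shift in pure Python, applied to toy 2-D coordinates standing in for the Fruchterman-Reingold layout of one Leiden-created cluster. This is a hedged sketch under simplifying assumptions, not the study's code:

```python
from math import dist

# Flat-kernel mean shift: repeatedly move each point to the mean of
# its neighbours within `bandwidth`, then label points by converged
# mode. Toy stand-in for the layout-based cluster splitting described
# in the abstract; not the authors' implementation.
def mean_shift(points, bandwidth, iters=30):
    shifted = list(points)
    for _ in range(iters):
        moved = []
        for p in shifted:
            neigh = [q for q in shifted if dist(p, q) <= bandwidth]
            moved.append((sum(q[0] for q in neigh) / len(neigh),
                          sum(q[1] for q in neigh) / len(neigh)))
        shifted = moved
    # Assign labels, merging modes closer than bandwidth / 2.
    modes, labels = [], []
    for p in shifted:
        for i, m in enumerate(modes):
            if dist(p, m) < bandwidth / 2:
                labels.append(i)
                break
        else:
            modes.append(p)
            labels.append(len(modes) - 1)
    return labels

layout = [(0.0, 0.0), (0.2, 0.1), (0.1, -0.1),   # one spatial blob
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]    # a second, distant blob
print(mean_shift(layout, bandwidth=1.0))  # [0, 0, 0, 1, 1, 1]
```

In the pipeline described above, two such spatially separated blobs within a single Leiden community would be split into two final clusters.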

References

1. USAIS (EGAIS) "Accounting timber and transactions with it" data, accessed: 07 June 2022, https://lesegais.ru/open-area/deal

2. Traag, V.A., Waltman, L., van Eck, N.J.: From Louvain to Leiden: guaranteeing well-connected communities. Sci. Rep. 9(1), 5233 (2019)

3. Fruchterman, T.M.J., Reingold, E.M.: Graph drawing by force-directed placement. Softw.: Pract. Exper. 21, 1129--1164 (1991)

4. Fukunaga, K., Hostetler, L.: The estimation of the gradient of a density function, with applications in pattern recognition. in IEEE Transactions on Information Theory. 21(1), 32--40 (1975)

5. Porter, M.: The Competitive Advantage of Nations. Free Press, New York (1990)

6. Kolosovsky, N.N.: Izbrannye trudy, edited by N.N. Kazanskiy and etc. Oykumena, Smolensk (2006)

7. Hollanders, H.: Methodology report for the European Panorama of Clusters and Industrial Change and European cluster database. Luxembourg, Publication Office of the European Union (2020)

8. Delgado M., Porter M., Stern S.: Defining Clusters of Related Industries. NBER Working Paper. 20375 (2014)

12:45
Ecological validation of soil food-web robustness for managed grasslands

ABSTRACT. The predominant idea that the stability of systems increases with their complexity was questioned by Lord Robert M. May (1973, Ecology 54:638-641) who, using methods related to dynamic models, concluded that the stability of an ecosystem decreases with an increasing number of species and interactions. May’s criterion suggested that a community remains stable if a decrease in connectance C is accompanied by an increase in diversity S, so that SC remains constant. Several studies at first seemed to corroborate May’s hypothesis and found approximately constant values of linkage density SC ≈ 2, although more detailed empirical data exhibited a higher degree of interaction between species, namely SC ≈ 10, and a positive relationship between C and S, contrary to what is predicted by May’s criterion. But, with all the ongoing pressures on species and ecosystems, to what extent do biodiversity and connectance matter for ecosystem functioning? The aim of our work is to analyse the topology and the robustness of three networks corresponding to reference soil systems, highlighting similarities and differences [1]. Through a model whose dynamics is dictated by an extension of the Lotka-Volterra equations [2], we studied the consequences of artificial perturbations induced in the system. Unlike previous studies, which are often based on a purely topological network analysis despite the availability of many software tools, this study derives the robustness of the food webs through a strictly dynamical analysis. This represents an upgrade, considering that the structure of a given network has a strong impact on the outcomes of dynamics. In this regard, the dynamics of species in complex ecosystems is more tightly connected than conventionally thought, which has profound implications for the impact and spread of perturbations.
In [1] we reproduced the ecological network of the soil food web in an abandoned pasture within a multi-agent, fully programmable modeling environment in order to dynamically simulate the cascading effects due to the removal of entire functional guilds. To quantify the consequent structural changes we introduced an Alteration Index (AI), which considers the sum of the absolute variations in abundance that the species present in the food web undergo due to the forced removal of some other species, normalized with respect to their abundance in the steady state. In the present study [2], the Alteration Index has been used for the first time to enable a direct comparison between different food-web architectures. The results show that all soil networks have a disassortative nature, as expected for theoretical food webs. The values of the clustering coefficient, of the connectance and of the complexity, together with the calculation of the robustness, suggest that the fallowed pasture under low-pressure management (#247) is more robust than the other two grasslands under medium-intensity management. The robustness shown by ecological networks could be useful elsewhere for evaluating the sustainability of the agricultural practices to which the soil system is subject.

[1] Di Mauro, L.S., Pluchino, A., Conti, E., Mulder, C.: Ecological validation of soil food-web robustness for managed grasslands. Ecological Indicators (2022), in press.
[2] Conti, E., Di Mauro, L.S., Pluchino, A., Mulder, C.: Testing for top-down cascading effects in a biomass-driven ecological network of soil invertebrates. Ecology and Evolution 10, 7062–7072 (2020).

13:00-14:30Lunch Break
14:30-16:00 Session Oral O2A: Human Behavior
14:30
Robustness of Social Cohesion in Growing Groups

ABSTRACT. The cohesion of a social group is the group's tendency to remain united and has important implications for the stability and survival of social organizations, such as political parties, research teams, or online groups. Empirical studies suggest that cohesion is affected by both the admission process of new members and the group size. Yet, a theoretical understanding of their interplay is still lacking. To this end, we propose a model where a group grows by a noisy admission process of new members who can be of two different types. In this framework, cohesion is defined as the fraction of members of the same type and the noise in the admission process represents the level of randomness in the evaluation of new candidates. The model can reproduce the empirically reported decrease of cohesion with the group size. Moreover, we show that, when the admission of new candidates involves the decision of only one group member, the group growth causes a loss of cohesion even for infinitesimal levels of noise. However, when admissions require a consensus of several members, there is a critical noise level below which the growing group can remain cohesive. The nature of the transition between the cohesive and non-cohesive phases depends on the model parameters and forms a rich structure reminiscent of critical phenomena in ferromagnetic materials.
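A toy simulation can convey the flavour of such a model. The exact voting rule below (unanimous approval, same-type preference, random verdicts with probability `noise`) is an illustrative assumption of ours, not the authors' specification:

```python
import random

# Toy growth-by-admission simulation in the spirit of the model above.
# The voting rule is an illustrative assumption, not the paper's.
def grow_group(target_size, committee_size, noise, seed=0):
    rng = random.Random(seed)
    group = [0]  # a single founder of type 0
    while len(group) < target_size:
        candidate = rng.choice([0, 1])
        committee = [rng.choice(group) for _ in range(committee_size)]
        votes = []
        for member in committee:
            vote = (member == candidate)      # prefer same-type candidates
            if rng.random() < noise:          # noisy evaluation: random verdict
                vote = rng.random() < 0.5
            votes.append(vote)
        if all(votes):                        # admission requires consensus
            group.append(candidate)
    majority = max(group.count(0), group.count(1))
    return majority / len(group)              # cohesion: majority-type fraction

# With zero noise, only founder-type candidates are ever admitted.
print(grow_group(50, committee_size=1, noise=0.0))  # 1.0
```

Varying `committee_size` and `noise` in a sketch like this mimics the two regimes discussed above: single-member admission versus consensus of several members under different noise levels.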

14:45
Individual Fairness for Social Media Influencers
PRESENTER: Stefania Ionescu

ABSTRACT. Many social media platforms are nowadays centered around content creators (CCs). On such platforms, the tie formation process depends on two factors: (a) the exposure of users to CCs (decided by, e.g., a recommender system), and (b) the subsequent decision-making process of users about whom to follow. Recent research has underlined the importance of content quality by showing that under exploitative recommendation strategies, the network eventually converges to a state where the higher the quality of the CC, the higher their expected number of followers. In this paper, we extend prior work by (a) looking beyond averages to assess the fairness of the process and (b) investigating the importance of exploratory recommendations for achieving fair outcomes. Using an analytical approach, we show that non-exploratory recommendations usually lead to unfair outcomes. Moreover, even with exploration, fair outcomes are guaranteed only for the highest- (and lowest-) quality CCs.

15:00
Extremists of a Feather: Homophily in Violent Extremist Networks

ABSTRACT. Violence is exacerbated in small, close-knit groups and through social identification processes. Relying on a new database of relationships among US Islamist extremists, we explore the structure of this highly violent social network. Social identity theory postulates that individuals tend to associate because of shared identities. We analyze whether social, cultural, and ideological identities increase the probability of forming connections in Islamist extremist networks. We find that the US Islamist network is highly clustered, modular, and includes only a few larger communities. Furthermore, results from exponential random graph modeling (ERGM) point to the roles of friendship and ideological affinity in driving connections between US Islamist extremists. Overall, these findings suggest that different social identification processes are at play and that some social identities shape violent social networks more strongly than others.

15:15
Multidimensional online American politics: Mining emergent social cleavages in social graphs

ABSTRACT. Dysfunctions in online social networks (e.g., echo chambers or filter bubbles) are studied by characterizing the opinions of users, for example, as Democrat- or Republican-leaning, or on continuous scales ranging from most liberal to most conservative. Recent studies have stressed the need to study these phenomena along additional dimensions of social cleavage, including anti-elite polarization and attitudes towards changing cultural issues. The study of social networks in high-dimensional opinion spaces remains challenging in settings such as the US, both because of the dominance of a principal liberal-conservative cleavage and because two-party political systems structure the preferences of users and the tools to measure them. This article builds on embeddings of social graphs in multi-dimensional ideological spaces and on NLP methods to identify additional cleavage dimensions linked to cultural, policy, social, and ideological groups and preferences. Using Twitter social graph data, I infer the political stance of nearly 2 million users connected to the political debate in the US along several issue dimensions of public debate. The proposed method shows that it is possible to identify several dimensions structuring social graphs that are not aligned with liberal-conservative divides and that relate to newly emergent social cleavages. These results also shed new light on ideological scaling methods gaining attention in many disciplines, making it possible to identify and test the nature of spatial dimensions mined from social graphs.

15:30
Sometimes Less is More: When Aggregating Networks Masks Effects
PRESENTER: Jennifer Larson

ABSTRACT. A large body of research aims to detect the spread of something through a social network. This research often entails measuring multiple kinds of relationships among a group of people and then aggregating them into a single social network for analysis. The aggregation is typically done by taking the union of the various tie types. Although this has intuitive appeal, we show that in many realistic cases this approach adds enough error to mask true network effects. We demonstrate that the problem depends on: (1) whether the effect diffuses generically or in a network-specific way, and (2) the extent of overlap between the measured network ties. Aggregating ties when diffusion is network-specific and overlap is low will negatively bias, and potentially mask, network effects that are in fact present.

15:45
Classical and quantum random walks to identify leaders in criminal networks
PRESENTER: Annamaria Ficara

ABSTRACT. Random walks model the stochastic motion of objects and are key instruments in fields such as computer science, biology, and physics. The counterpart of classical random walks in quantum mechanics is the quantum walk, and quantum walk algorithms can provide an exponential speedup over their classical counterparts. Classical and quantum random walks can be applied in social network analysis and can be used to define centrality metrics in terms of node occupation on single-layer and multilayer networks. In this paper, we apply these new centrality measures to three real criminal networks derived from an anti-mafia operation named Montagna, as well as to a multilayer network derived from them. Our aims are to (i) identify leaders in our criminal networks, (ii) study the dependence between these centralities and the degree, and (iii) compare the results obtained for the real multilayer criminal network with those of a synthetic multilayer network that replicates its structure.
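As a minimal illustration of occupation-based centrality on the classical side, the sketch below iterates a lazy random walk to its stationary distribution; on an undirected network this distribution is proportional to node degree, which is exactly the kind of degree dependence examined in aim (ii). The function name and the toy hub-and-leaves graph are our own, not taken from the paper:

```python
def walk_centrality(adj, steps=200):
    """Occupation centrality of a lazy classical random walk:
    the long-run probability that the walker sits on each node.
    `adj` maps each node to its list of neighbors (undirected)."""
    nodes = sorted(adj)
    p = {v: 1.0 / len(nodes) for v in nodes}     # uniform start
    for _ in range(steps):
        nxt = {v: p[v] / 2 for v in nodes}       # lazy: stay with prob 1/2
        for v in nodes:
            share = p[v] / (2 * len(adj[v]))     # spread the rest evenly
            for u in adj[v]:
                nxt[u] += share
        p = nxt
    return p
```

On a star graph (one hub, three leaves) the hub ends up with occupation probability 1/2 and each leaf with 1/6, matching the degree-proportional stationary distribution; the laziness is only there to avoid the oscillation a plain walk exhibits on bipartite graphs.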

14:30-16:00 Session Oral O2B: Community Structure
14:30
Distance Dependent Bayesian Community Detection
PRESENTER: Marcelo Mendoza

ABSTRACT. Community detection is a fundamental problem in network science that aims at finding highly connected groups of nodes in complex networks. Of particular interest is the development of methods to detect communities in attributed graphs, that is, in networks with nodal features. Morup and Schmidt [ref] proposed a Bayesian community detection approach with a realistic generative process that guarantees a lower link density across clusters than within clusters for any pair of communities. Despite the recent introduction of community detection methods using Bayesian inference, to our knowledge no such methods exist for attributed graphs. In this work, we propose a novel Bayesian community detection algorithm for attributed graphs. To do so, we define a graph model based on the distance-dependent Chinese Restaurant Process (dd-CRP), a probabilistic model that defines a class of distributions over node partitions allowing for dependencies between the nodes. This extended abstract shows how our algorithm generates networks based on the node attributes. In addition, we present a Bayesian inference approach to learn the model parameters.

14:45
Dynamic Local Community Detection with Anchors

ABSTRACT. Community detection is a challenging research problem, especially in dynamic networks, since communities in such networks do not remain stable as the network evolves: new communities emerge, and existing communities disappear, grow, or shrink. In many cases, one is more interested in the evolution of a particular community, to which an important node belongs, than in the global partitioning of a dynamic network. However, due to the drifting problem, whereby one community can merge into a completely different one, it is difficult to track the evolution of communities. Our aim is to identify the community that contains a node of particular importance, called the anchor, and to follow its evolution over time. The framework we propose circumvents the drifting problem by allowing the anchor to define the core of the relevant community, partially or fully. Experiments with synthetic datasets demonstrate the ability of the proposed framework to identify communities with high accuracy.

15:00
Community Detection Supported by Node Embeddings (Searching For a Suitable Method)
PRESENTER: Bartosz Pankratz

ABSTRACT. The most popular algorithms for community detection in graphs have one serious drawback: they are heuristic-based and in many cases unable to find a near-optimal solution. Moreover, their results tend to exhibit significant volatility. These issues can be mitigated by properly initializing such algorithms with a carefully chosen partition of the nodes. In this paper, we investigate the impact of such initialization applied to the two most commonly used community detection algorithms: Louvain and Leiden. We use a partition obtained by embedding the nodes of the graph into a high-dimensional space of real numbers and then running a clustering algorithm on this latent representation. We show that this procedure significantly improves the results. A proper embedding filters out unnecessary information while retaining the proximity of nodes belonging to the same community. As a result, clustering algorithms run on these embeddings merge nodes only when they are similar with a high degree of certainty, resulting in a stable and effective initial partition.

15:15
Modeling Node Exposure for Community Detection in Networks

ABSTRACT. In community detection, datasets often suffer from a sampling bias in which nodes that would normally have a high affinity appear to have zero affinity. This happens, for example, when two affine users of a social network were never exposed to one another. Community detection on this kind of data then suffers from treating affine nodes as non-affine. To solve this problem, we explicitly model the (non-)exposure mechanism in a Bayesian community detection framework by introducing a set of additional hidden variables. Compared to approaches that do not model exposure, our method is able to better reconstruct the input graph while maintaining similar performance in recovering communities. Importantly, it makes it possible to estimate the probability that two nodes have been exposed to each other, a possibility not available with standard models.

15:30
Community Detection for Temporal Weighted Bipartite Networks

ABSTRACT. Community detection in temporal (time-evolving) bipartite networks is challenging because it can be performed either on the temporal bipartite network itself or on various projected networks, composed of only one type of node, via diverse community detection algorithms. In this paper, we aim to systematically design detection methods addressing both the choice of network and the choice of community detection algorithm, and to compare the community structures detected by the different methods. We illustrate our methodology using a telecommunications network as an example. We find that the three proposed methods identify evident community structures: one operates on each snapshot of the temporal network, and the other two on temporal projections. We characterize the community structures detected by each method through an evaluation network in which the nodes are the services of the telecommunications network and the weight of the link between two services is the number of snapshots in which both were assigned to the same community. Analyzing the evaluation networks of the three methods reveals their similarities and differences in identifying node pairs, or groups of nodes, that often belong to the same community. We find that the two methods based on the same projected network identify consistent community structures, whereas the method based on the original temporal bipartite network complements this vision of the community structure. Moreover, we find a non-trivial number of node pairs that belong consistently to the same community across all the methods applied.

15:45
Community Detection using Moore-Shannon Network Reliability: Application to Food Networks
PRESENTER: Stephen Eubank

ABSTRACT. Community detection in networks is extensively studied from a structural perspective, but very few works characterize communities with respect to dynamics on networks. We propose a generic framework based on Moore-Shannon network reliability for defining and discovering communities with respect to a variety of dynamical processes. This approach extracts communities in directed edge-weighted networks that satisfy strong connectivity properties as well as strong mutual influence between pairs of nodes through the dynamical process. We apply this framework to food networks, compare our results with a modularity-based approach, and analyze community structure across agricultural commodities, its evolution over time, and its relation to dynamical system properties.

14:30-16:00 Session Oral O2C: Biological Networks
14:30
Exploring Glioblastoma multiforme through Network Analysis highlights topologically critical nodes
PRESENTER: Apurva Badkas

ABSTRACT. Glioblastoma Multiforme (GBM) represents one of the most challenging diseases: it is the most common form of malignant brain cancer, the prognosis is bleak, and limited treatments are available. Inter- and intra-tumoural heterogeneity, along with variable patient response, adds to its complications. Network analysis approaches have been exploring the GBM landscape for the past few years. We present an application of a simple, PPIN-based analysis that identifies several critical nodes in the GBM-specific network. These candidates link, or lie in the vicinity of, known disease-associated proteins. Transcriptomic data and survival analysis indicate that the predicted candidates may contribute to the disease phenotype and may affect patient survival. Together, the results show that the method highlights novel, topologically critical candidates that are likely to contribute to GBM progression. These candidates could be prioritized to assess molecular signatures of GBM and explored as drug targets.

14:45
Network theory as a novel tool to analyze cardiac arrhythmia
PRESENTER: Sirius Fuenmayor

ABSTRACT. Introduction

The management of cardiac arrhythmia remains the largest problem in cardiac electrophysiology. The prevalence of the most frequent arrhythmia, atrial fibrillation (AF), is expected to rise steeply due to the aging population. Despite intensive research, the mechanism of AF remains unclear, leading to poor results in its treatment. Ablation (burning of tissue) for AF often results in complex atrial tachycardias (AT), which are difficult to treat. Ventricular tachycardias (VT) and fibrillation (VF) are also a major cause of sudden cardiac death, and eliminating VT with ablation has achieved only modest success in complex cases. There is therefore an urgent need to better understand and localize the sources of arrhythmia in order to improve its treatment. We have invented a radically novel approach that applies network theory to study the mechanisms of AT, VT, and AF, called Directed Graph Mapping (DGM) [1, 2].

Methods

As the electrical wave passes through the heart tissue, the electrodes of an electroanatomic mapping system are activated at specific times, often referred to as Local Activation Times (LATs). As described in Figure 1, the DGM algorithm constructs a directed network based on the LATs at the coordinates of the corresponding electrodes. The electrodes are treated as the nodes of the network, and a directed link is drawn from one electrode to another (neighboring) electrode if the spatial distance divided by the time difference in LATs lies between an allowed minimal and maximal conduction velocity. Representing the pattern of electrical activity as a directed network opens the possibility of finding ablation targets using network theory with better reliability than current methods. It also opens the way to characterizing the patterns using network metrics. There are two known types of arrhythmia sources: rotational activity and focal activity. Rotational activity manifests itself as cycles in the directed network, while focal sources are represented as nodes with only outgoing arrows. This enables rapid and reliable localization of these sources using common network algorithms. The DGM software package performs all these analyses automatically while remaining user-friendly via a GUI [2].
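The construction rule described above can be sketched in a few lines. This is our own illustration, not the released DGM package; the function names, the all-pairs neighbor test, and the toy electrode line are assumptions for the example:

```python
from math import dist  # Euclidean distance (Python 3.8+)

def build_activation_graph(coords, lats, v_min, v_max):
    """Draw a directed edge i -> j when the apparent conduction velocity
    |x_i - x_j| / (LAT_j - LAT_i) lies in the allowed [v_min, v_max]."""
    edges = []
    n = len(coords)
    for i in range(n):
        for j in range(n):
            dt = lats[j] - lats[i]
            if dt <= 0:
                continue  # the wave only travels forward in time
            v = dist(coords[i], coords[j]) / dt
            if v_min <= v <= v_max:
                edges.append((i, j))
    return edges

def focal_sources(edges, n):
    """Focal sources appear as nodes with outgoing but no incoming edges."""
    return [v for v in range(n)
            if any(a == v for a, _ in edges)
            and not any(b == v for _, b in edges)]
```

For three electrodes in a line activated at times 0, 1, 2, a velocity window of [0.5, 1.5] yields the chain 0 -> 1 -> 2 (plus 0 -> 2), and node 0 is correctly flagged as the focal source; rotational activity would instead show up as a directed cycle.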

Results

In this work we first present the versatility of DGM by showing how clinical examples of AT and VT, and simulated examples of AF, can be analyzed with this tool. Moreover, for AT, we show that DGM exceeds the tools currently available on the market, and we show how DGM can improve the clinical workflow. Apart from the primary goal of finding sources of arrhythmia using network theory, we also show how characterizing the network associated with the underlying spatiotemporal pattern of the arrhythmia can give additional insights. Using different network metrics, we can distinguish focal sources from rotational sources, even without the need to analyze any cycles. Dynamical processes on networks, such as random walks or diffusion and spreading, help to localize the core of rotational activity and focal sources in more complicated cases where only incomplete data is available. Moreover, important characteristics of arrhythmia, such as its complexity, currently measured using the time series from electrode recordings, can be reformulated using complex network metrics.

Conclusion

Ablation, the main treatment for arrhythmia, is currently performed in a qualitative manner based on the judgment of the electrophysiologist, which is error-prone. DGM has proven to be significantly better than the methods currently in use. Finally, the network representation of arrhythmia offers a new way to characterize and study cardiac arrhythmia, which we seek to explore.

15:00
Real-world data in rheumatoid arthritis: patient similarity networks as a tool for clinical evaluation of disease activity
PRESENTER: Ondřej Janča

ABSTRACT.

Ondrej Janca1,2, Eva Kriegova2, Pavel Horak3, Martina Skacelova3, Jakub Savara1,2, Anna Petrackova2, and Milos Kudelka1

1 Department of Computer Science, Faculty of Electrical Engineering and Computer Science, VSB-Technical University Ostrava, Ostrava, Czech Republic, ondrej.janca@vsb.cz
2 Department of Immunology, Faculty of Medicine and Dentistry, Palacky University Olomouc, Olomouc, Czech Republic
3 Department of Internal Medicine III – Nephrology, Rheumatology and Endocrinology, University Hospital Olomouc, Olomouc, Czech Republic

1 Introduction
Rheumatoid arthritis (RA) is a highly heterogeneous autoimmune inflammatory disease characterized by swelling and tenderness of the joints, severely affecting physical function and quality of life [1]. Disease activity measures are used for patient management and treatment selection [2]. These measures evaluate the levels of inflammatory markers in peripheral blood, affected-joint counts, and patients' general health questionnaires. Because of the additive nature of the scores' formulas, patients with a similar activity score often have disparate profiles. This information is essential for individual evaluation but is lost during the score calculation. Real-world data are further complicated by the different treatment strategies of individual patients, disease duration, demographic backgrounds, comorbidities, and inflammation of other origins. In this work, we aimed to provide an alternative that is independent of questionnaires, with an accuracy comparable to that of the activity measures. Following the evidence of the impact of inflammation on lipid metabolism [3], we focused the feature pre-selection on serum lipid parameters and clinical data routinely collected on RA patients. We constructed patient similarity networks (PSNs) from combinations of these features using the LRNet method [4]. A PSN is an undirected graph whose vertices represent patients and whose weighted edges represent their similarity. LRNet constructs the PSN based on nearest-neighbour analysis, applying a Gaussian function to rescaled and normalized vector data to transform the Euclidean distance between patients into a similarity metric. After a PSN is constructed, the Louvain algorithm is used to detect clusters of patients based on local representativeness. We compared the resulting PSNs to three of the recommended activity measures [2]: DAS28 (Disease Activity Score-28 joint count [6]), SDAI (Simple Disease Activity Index [7]) and CDAI (Clinical Disease Activity Index [8]).
These activity measures classify patients into four groups, 1–4 (low to high activity), based on different combinations of the following features: swollen and tender joint counts (SJC, TJC), patient global health assessment (PGA), evaluator global health assessment (EGA), and C-reactive protein serum level (CRP). The score interval widths of the individual groups vary, with the boundary between groups 2 and 3 being particularly significant, since moving from group 2 to group 3 is usually the breaking point for starting a patient's treatment.
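The distance-to-similarity step mentioned above (a Gaussian kernel applied to rescaled, normalized feature vectors) can be sketched as follows. This is a generic illustration, not the LRNet implementation; the function name and the kernel width `sigma` are our own assumptions:

```python
import math

def gaussian_similarity(patients, sigma=1.0):
    """Turn Euclidean distances between (already rescaled and normalized)
    patient feature vectors into edge weights in [0, 1] via a Gaussian
    kernel: w_ij = exp(-d_ij^2 / (2 sigma^2))."""
    n = len(patients)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(patients[i], patients[j])
            w[i][j] = w[j][i] = math.exp(-d * d / (2 * sigma ** 2))
    return w
```

Identical patients receive weight 1 and distant patients a weight near 0; a sparsification step (such as LRNet's nearest-neighbour selection) would then decide which of these weighted pairs become actual PSN edges.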

2 Results
Multiple PSNs were constructed using combinations of patient features. The quality of each PSN was assessed using four factors: the number of detected clusters, the modularity of the clusters, the embeddedness of the clusters [5], and the silhouette. Model quality was assessed based on the separation of inactive vs. active patients (activity groups 1+2 vs. 3+4) and on the number of inactive patients in active clusters and vice versa. The best results were obtained with a PSN constructed using SJC, CRP, total cholesterol, LDL-cholesterol, and triglycerides. This PSN showed the best accuracy for DAS28. The Louvain method detected four clusters: one composed mainly of inactive patients, two mainly of active patients, and one of patients from all four activity groups. For several patients that seemed to be incorrectly classified, a further clinical background check was performed by a clinician. In three of these patients, a factor altering serum lipid levels and thus influencing the data was discovered (2 concurrent diseases, 1 medication). In clinical practice, when a new patient is added to the dataset, the network is recalculated and re-constructed. In the updated network, the patient is placed near other similar patients, and since the clinician already has more extensive data on these neighbours, they can make better decisions regarding the new patient.

3 Conclusions
Real-world RA data is complicated by varied clinical profiles, medication, and concurrent diseases. PSNs prove to be a valuable tool for analyzing and visualizing this data. Regardless of the activity measure, the network provides clinically relevant information that aids clinicians' decision-making. The network maps the real-world data with the expected inaccuracy, comparable to the activity measures. This model shows promise for use in clinical practice, but to provide more refined results it would require an extension of the dataset.

4 Support
IGA LF 2022 11, IGA LF 2022 03, MH CZ – DRO (FNOL, 00098892)

References
1. Sparks, J. A.: Rheumatoid Arthritis. Annals of Internal Medicine 170(1), ITC1–ITC16 (2019)
2. England, B. R., Tiong, B. K., Bergman, M. J., Curtis, J. R., Kazi, S., Mikuls, T. R., O'Dell, J. R., Ranganath, V. K., Limanni, A., Suter, L. G., Michaud, K.: 2019 Update of the American College of Rheumatology Recommended Rheumatoid Arthritis Disease Activity Measures. Arthritis Care Res 71, 1540–1555 (2019)
3. Choy, E., Sattar, N.: Interpreting lipid levels in the context of high-grade inflammatory states with a focus on rheumatoid arthritis: a challenge to conventional cardiovascular risk actions. Ann. Rheum. Dis. 68(4), 460–469 (2009)
4. Ochodkova, E., Zehnalova, S., Kudelka, M.: Graph Construction Based on Local Representativeness. In: Cao, Y., Chen, J. (eds) Computing and Combinatorics. COCOON 2017. Lecture Notes in Computer Science, vol. 10392 (2017)
5. Hric, D., Darst, R. K., Fortunato, S.: Community detection in networks: Structural communities versus ground truth. Physical Review E 90(6) (2014)
6. Prevoo, M. L. L., van 't Hof, M. A., Kuper, H. H., van Leeuwen, M. A., van de Putte, L. B. A., van Riel, P. L. C. M.: Modified disease activity scores that include twenty-eight-joint counts: development and validation in a prospective longitudinal study of patients with rheumatoid arthritis. Arthritis Rheum 38, 44–48 (1995)
7. Smolen, J. S., Breedveld, F. C., Schiff, M. H., Kalden, J. R., Emery, P., Eberl, G., van Riel, P. L., Tugwell, P.: A simplified disease activity index for rheumatoid arthritis for use in clinical practice. Rheumatology (Oxford) 42, 244–257 (2003)
8. Aletaha, D., Nell, V. P., Stamm, T., Uffmann, M., Pflugbeil, S., Machold, K., Smolen, J. S.: Acute phase reactants add little to composite disease activity indices for rheumatoid arthritis: validation of a clinical activity score. Arthritis Research & Therapy 7(4), R796–R806 (2005)

15:15
The sample size value in Network Medicine: an application of gene co-expression networks

ABSTRACT. 1. INTRODUCTION

The sequencing of mRNA levels (RNAseq) in specific contexts has improved our understanding of disease phenotypes and drug mechanisms. Network Medicine uses RNAseq data to infer gene co-expression networks, which capture similarity between the expression patterns of genes, unraveling potential associations between them (1). The influence of sample size on the predictive power of gene co-expression networks is still not understood. Previous studies suggest that larger sample sizes increase the reproducibility of networks (2,3), converging to more stable models, and improve the prediction of functional associations (4,5). However, these studies are limited to a few dozen samples. Moreover, we still do not understand how the significant interactions in gene co-expression networks evolve with sample size, which parts of the network are most affected, or how this could affect the outcome of a case study.

2. RESULTS

We created multiple gene co-expression networks with sample sizes ranging from 20 to 10,000, obtained from subsamples generated by bootstrapping our datasets (TCGA and GTEx). The networks were created using Pearson's correlation, obtaining a correlation value and a significance p-value for each pair of genes. We kept only the pairs of genes with p-values below a given threshold (e.g., p-value < 0.05, Bonferroni corrected). We observed that the probability of revealing new significant edges in gene co-expression networks follows a power law in the number of samples, $p(N) \sim N^{-\alpha}$. Based on this observation, we derived an analytical solution that explains how the growth in the number of significant edges slows as sample size increases. The final analytical solution is known as the stretched exponential function, originally used by Kohlrausch to describe the discharge of a capacitor (6) and later applied to the study of bodies in the solar system (7) and to model the diffusion-weighted MRI signal in the brain (8). We validated our model on different independent datasets, demonstrating a good fit (Figure 1A). The model provides an accurate estimate of the fraction of information that can be obtained from gene co-expression networks with a given number of samples (Figure 1B), quantifying the uncertainty of Network Medicine analyses based on RNAseq data.
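The edge-significance filter described above can be sketched as follows. We use the Fisher z-transformation as a stdlib-only approximation of the Pearson correlation p-value; the study's exact test may differ, and all function names are our own:

```python
import math

def pearson_r(x, y):
    """Plain Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def significant_edges(expr, alpha=0.05):
    """Keep gene pairs whose correlation is significant after Bonferroni
    correction over all tested pairs, using the Fisher z approximation
    for the two-sided p-value."""
    genes = list(expr)
    m = len(genes) * (len(genes) - 1) // 2     # number of tested pairs
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson_r(expr[g], expr[h])
            n = len(expr[g])
            r = max(min(r, 0.999999), -0.999999)   # keep atanh finite
            z = math.atanh(r) * math.sqrt(n - 3)   # Fisher z statistic
            p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail
            if p < alpha / m:                      # Bonferroni threshold
                edges.append((g, h))
    return edges
```

Running this on bootstrap subsamples of increasing size N, and counting how many new edges pass the threshold at each N, is the kind of measurement behind the reported $p(N) \sim N^{-\alpha}$ scaling.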

15:30
The orthoBackbone: an evolutionarily-conserved backbone of genetic networks

ABSTRACT. Extracting relevant knowledge from large gene expression datasets is a complex task, especially if the analysis requires cross-species comparisons, as much irrelevant information can cloud inference. We present the orthoBackbone, an evolutionarily conserved multi-layer subgraph of gene regulatory networks across multiple species. We cast gene regulatory networks (GRNs) as weighted graphs where nodes are genes of interest and edges correspond to the strength of observed functional interactions. The latter can be derived from large-scale integrated databases of protein interactions, such as StringDB, which compile high-throughput experiments, literature reports, and genomic context predictions.

For this analysis, we build a network for three species of interest---human, mouse, and insect---where nodes and edges denote the interaction knowledge of the biological system under study, as synthesized by StringDB. In particular, the strength of interaction $s_{ij}$ is converted to a distance weight via $d_{ij} = 1/s_{ij} - 1$. Cross-species network comparison allows for the identification of key biological processes that have been conserved throughout evolution. We therefore cast the GRN as a multi-layer network in which each layer represents a particular species and orthologous genes are connected across layers---orthologous genes being homologous sequences that have kept the same functional role across evolution and speciation. Since sequences can diverge during speciation, a gene in one layer may connect to multiple orthologous genes in another.

Producing large multi-layer networks from the integration of data in StringDB (or a similar source) leads to large and dense networks. It is thus useful to employ methods that reduce their complexity, integrate cross-species knowledge, and facilitate biological interpretation by domain experts. For these reasons, we developed the orthoBackbone, building upon the concept of the distance backbone of single-layer complex networks. For each species network (i.e., each layer) we compute its metric backbone---although other types of distance backbones, such as the ultra-metric or Euclidean backbones, can be considered. The metric backbone is a subgraph that is sufficient to compute all shortest paths, obtained by removing all (semi-metric) edges that break the triangle inequality of distance graphs: $d_{ij} > d_{ik} + d_{kj}$ (see \cite{Simas:2021} for details and open-source Python code). In other words, edges that are redundant for computing shortest paths in a given species layer are removed. This drastically lowers network density while preserving connectivity, all shortest paths, and even community structure; indeed, in our case 80-90% of all edges are removed. By removing edges that are redundant for shortest paths, we assume that the preserved biochemical pathways are the most important in regulating the entire set of genes of interest, as distance is inversely proportional to the known (and experimentally derived) strength of gene interaction. Many of the removed edges are likely to offer alternative paths in regulation and signaling, thus providing robustness to perturbation via redundancy, which is at the core of the distance backbone methodology. Here, however, we are not studying robustness but identifying the most likely regulatory and signaling pathways in gene regulation.
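A compact way to compute the metric backbone of one (small) species layer is to compare each direct distance with the all-pairs shortest-path distance and drop the semi-metric edges. The Floyd-Warshall sketch below is our own illustration under that definition; it is cubic in the number of nodes and would not scale to StringDB-sized layers, where the cited open-source implementation applies:

```python
INF = float("inf")

def metric_backbone(d):
    """Keep edge (i, j) only if no indirect path is shorter, i.e. drop
    semi-metric edges with d_ij > d_ik + d_kj along some path.
    `d` maps node -> {neighbor: distance} (symmetric for undirected)."""
    nodes = sorted(d)
    # All-pairs shortest paths via Floyd-Warshall.
    sp = {i: {j: (0.0 if i == j else d[i].get(j, INF)) for j in nodes}
          for i in nodes}
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if sp[i][k] + sp[k][j] < sp[i][j]:
                    sp[i][j] = sp[i][k] + sp[k][j]
    # An edge is metric (kept) iff it already realizes the shortest path.
    return {(i, j) for i in nodes for j in d[i] if sp[i][j] == d[i][j]}
```

On a triangle with distances d(a,b) = d(b,c) = 1 and d(a,c) = 3, the edge (a, c) is semi-metric (3 > 1 + 1) and is removed, while the two unit edges survive, exactly the pruning the triangle inequality test describes.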

Because ortholog genes relate the various species layers, it follows that a fraction of the interactions in the backbones of each layer must be evolutionarily preserved and likely to encode functional relationships crucial for the survival of different species. Therefore, we produce the orthoBackbone of the multi-layer network as the intersection of the metric backbones of all layers. Thus, the orthoBackbone only contains edges that connect orthologous genes in all constituent metric backbones. In cases where there is a one-to-many gene orthology relation, an edge is kept in the \textit{orthoBackbone} if at least one backbone edge in the respective orthologous set exists in the metric backbone of another species. In our study this further reduces the number of edges in each layer to 1.7-2.7%.

In a collaboration with experimental biomedical researchers we show that the \textit{orthoBackbone} identifies fundamental biological programs underlying human infertility across species. Specifically, genes with an old evolutionary origin serve as a genetic scaffold from which transcriptional complexity has evolved, resulting in a core set of 79 ancient functional interactions at the heart of male germ cell identity (see Fig. 1). By silencing 920 candidate genes likely to affect the acquisition and maintenance of this identity, we uncover 164 previously unknown spermatogenesis genes. Integrating this information with human whole-exome sequencing data reveals three novel genetic causes of male infertility shared between species that have diverged for more than 600 million years.

We note that the orthoBackbone, as constructed here, is not itself a metric backbone. This restrictive form was pursued because identifying a small set of orthologous interactions is important for experimental validation. In other words, this orthoBackbone construction is the most conservative when looking for evolutionarily preserved interactions, as we obtain interactions that are on the metric backbones of every species. In future work, we will study less conservative versions that preserve the characteristics of distance backbones. For instance, if layers are collapsed according to orthology, the metric backbone of the resulting network would itself be a metric orthoBackbone. Whereas the more conservative orthoBackbone is ideal for studying important evolutionarily preserved interactions, as in the male infertility study, the less restrictive one would be useful for studying longer interaction pathways that are not necessarily preserved by evolution.

In conclusion, our methodology and results demonstrate a novel synergy between comparative network biology and medical genetics, which resulted in the discovery of novel genes involved in human infertility and highlights the importance of evolutionary history in disease. Our open-access and easily adaptable interdisciplinary effort can be used for translational studies involving other diseases and cell types. A preprint of the work, currently under review, is available.

15:45
Nonlinear machine learning pattern recognition and bacteria-metabolite multilayer network analysis of perturbed gastric microbiome

ABSTRACT. The stomach is inhabited by diverse microbial communities co-existing in a dynamic balance. Long-term use of drugs such as proton pump inhibitors (PPIs), or bacterial infections such as Helicobacter pylori, cause significant microbial alterations. Yet, studies revealing how the commensal bacteria re-organize due to these perturbations of the gastric environment are at an early stage and rely principally on linear techniques for multivariate analysis. Here we disclose the importance of complementing linear dimensionality reduction techniques with nonlinear ones to unveil hidden patterns that remain unseen by linear embedding. Then, we demonstrate the advantages of completing multivariate pattern analysis with differential network analysis, to reveal mechanisms of bacterial network re-organization which emerge from perturbations induced by a medical treatment (PPIs) or an infectious state (H. pylori). Finally, we show how to build bacteria-metabolite multilayer networks (Fig. 1) that can deepen our understanding of the metabolite pathways significantly associated with the perturbed microbial communities.

16:00-16:35 Coffee Break
16:00-16:35 Session Poster P2A: [1-7] Network Analysis & Measures
Gig economy and social network analysis: topology of inferred network

ABSTRACT. Unparalleled advances in information technology have resulted in the virtualization of the workplace, as well as in a surge of non-traditional work arrangements based on short-term contracts (“gigs”). Work done remotely through online platforms may be hidden by technology. Thus, how could we possibly access information about the social network of workers? What business and social value is hidden in the underlying informal social networks of workers? Here, we propose applying methods from complex network analysis to study data from a Brazilian food-delivery company. We present the steps used to infer a social network that relates delivery men according to their co-location patterns. The obtained network thus offers a valuable framework to explore, in the future, questions related to the role of informal social networks in the spreading of innovations and in the coordination of behaviors and business strategies.

Analysis of the Structure and Dynamics of European Flight Networks
PRESENTER: Matteo Milazzo

ABSTRACT. We analyze the structure and dynamics of the flight networks of 50 airlines active in the European airspace in 2017. Our analysis shows that the concentration of the degree of nodes of different airlines' flight networks is markedly heterogeneous among airlines, reflecting the heterogeneity of airline business models. We obtain an unsupervised classification of airlines by performing a hierarchical clustering that uses, as similarity measure, a correlation coefficient computed between the average occurrence profiles of 4-motifs of airline networks. The hierarchical tree is highly informative with respect to properties of the different airlines (for example, the number of main hubs, airline participation in intercontinental flights, regional coverage, and nature as a commercial, cargo, leisure, or rental airline). The 4-motif patterns are therefore distinctive of each airline and reflect information about the main determinants of different airlines. This information is different from what can be found by looking at the overlap of directed links.

The valuation of information by dynamic decentralised criticality measures in complex data flow networks
PRESENTER: Yaniv Proselkov

ABSTRACT. Smart manufacturing uses data-driven solutions to improve performance and operations resilience, requiring large amounts of data delivered quickly, enabled by telecom networks and network elements such as routers or switches. Disruptions can render a network inoperable; avoiding them needs responsiveness to network usage, achievable by embedding autonomy into the network, providing fast and scalable algorithms that use key metrics to manage disruptions, such as impact of failure in a network element on system functions. Centralised approaches are insufficient for this as they need time to transmit data to the controller, by which time it may have become irrelevant. Decentralised and information bounded measures solve this by placing computational agents near the data source. We propose an agent-based model to assess the value of the information for calculating decentralised criticality metrics, assigning a data collection agent to each network element, computing relevant indicators of the impact of failure in a decentralised way. This is evaluated by simulating discrete information exchange with concurrent data analysis, comparing measure accuracy to a benchmark, and with measure computation time as a proxy for computation complexity. Results show losses in accuracy are offset by faster computations with fewer network dependencies.

Evaluating network membership with an application to the European Research Programmes

ABSTRACT. Although the topic of networks has received significant attention from the scientific literature, it remains to be seen whether it is possible to quantify the degree to which an organisation benefits from being part of a network. Starting from the concept of network market share, this paper introduces and defines the Collective Network Effect (CNE). CNE is based on the concept that a network member is not only affected by its friends but also by the friends of its friends. By taking into account network connection patterns, CNE provides a proxy for quantifying the benefit of network membership. We computed the CNE for the nodes of a large network built using the whole set of common projects among the participants of the 7th Framework Programme for Research and Technological Development and H2020 of the European Commission. The obtained results show that nodes with a higher CNE have access to substantially larger funding than nodes with a lower CNE. In general, such a measure could supplement other centrality measures and be useful for organisations and companies aiming to evaluate both their current situation and the potential partners they should link with in order to extract the highest benefits from network membership.

Graph Partitions in Chemistry
PRESENTER: Ioannis Michos

ABSTRACT. In this work we study partitions of molecules (equitable, almost equitable or other); provide examples that show how these are related with symmetries and physical properties of the molecules; and discuss how such relations can be rendered suitable for predicting properties and/or inverse engineering in the context of materials discovery and design. Graph invariants, such as graph spectra, play a pivotal role and provide features that can fit in machine learning schemes. This work extends an ongoing research on crystalline solids such as perovskites, metal organic frameworks, zeolites, but also amorphous materials like polymer melts and glasses, and polymer networks.

Ecological Networks and State Shifts in the Earth’s Biosphere
PRESENTER: Sabin Roman

ABSTRACT. With increasing degrees of industrialization around the world and the conversion of natural habitats for economic purposes, significant portions of the biosphere are at risk of disappearing (e.g., species extinction) or no longer contributing to broader ecological processes (e.g., prominence of mono-cultures in crop production). It is possible that above a certain threshold of biosphere conversion and utilization, critical natural feedback mechanisms are broken, which can lead to a global ecological collapse. We present a way to account for the contribution of high-order feedback mechanisms to the stability of the biosphere. The method relies on building power-series equations that encapsulate higher-order processes in ecological systems. The solutions provide candidate thresholds which vary from cautionary to optimistic, and we provide a metric to quantify the spectrum of possible values.

Food webs and ecological networks are considered in a discrete case where percolation thresholds are compared to our results. Furthermore, we take a continuum limit of the network structure and interactions that leads to a partial differential equation for the biomass of the different species. We investigate effects on the biosphere of distinct shocks and how extinction events propagate throughout the ecological chains. The considerations are not limited to biological systems: similar patterns can hold for various economic systems, such as supply and distribution chains of goods and financial markets.

Analysis of Oscillatory Time series using the Visibility Graph method

ABSTRACT. A visibility graph analysis is carried out for sinusoidal time series, with various modifications inspired by the issues observed in light curves of pulsating variable stars, previously studied. Frequency modifies the metrics of the resulting networks, especially for lower frequencies, whereas asymptotic values are reached, in general, for higher values. We have also found that although relevant statistical results are expected for longer curves, short time series may yield consistent results as well. Regarding noise, even a small amount of noise leads to degree distributions which are exponential (HVG) or power-law (VG), consistent with the universal behaviors found for observed light curves. Finally, gaps are found to have no major effect on the studied metrics, which is also consistent with the results for observed light curves.
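For reference, the horizontal visibility criterion used in such analyses links two points of the series whenever every point between them lies strictly below both. A straightforward O(n²) sketch (the published algorithms are faster, but the criterion is the same):

```python
import networkx as nx

def horizontal_visibility_graph(series):
    """Connect indices i and j if all intermediate values lie strictly
    below min(series[i], series[j]) -- the horizontal visibility rule."""
    G = nx.Graph()
    G.add_nodes_from(range(len(series)))
    for i in range(len(series)):
        for j in range(i + 1, len(series)):
            # Adjacent points are always mutually visible (empty range).
            if all(series[k] < min(series[i], series[j]) for k in range(i + 1, j)):
                G.add_edge(i, j)
    return G
```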

16:00-16:35 Session Poster P2B: [8-11] Human Behavior
Analyzing the commentator network within the French YouTube Environment

ABSTRACT. YouTube is by far the largest video hosting platform. The site began in 2005 and has achieved a sustained pattern of growth since its conception (Burgess and Green, 2018). Today YouTube hosts an immense amount of content from established media networks, multi-channel networks, and third-party developers, as well as producing its own in-house content (Burgess, 2010).

In order to reward these content creators, YouTube developed the YouTube Partner Program (YPP). This practice of sharing advertising revenue with some of the content creators is what really separates YouTube from other video sharing platforms (Caplan and Gillespie, 2020). Of course, with the introduction of such a program, the level of competition for views/subscribers on YouTube grew immensely.

Content creators have to build a reputation, and this reputation is constructed through the emergent networks on YouTube. Central to this is what content creators call 'cross-promotion': links to other channels (Grünewald & Haupt, 2014). We find these links in the comments posted under the videos produced.

Since YouTube content is separated into topics (e.g., Sports, Gaming, Beauty, Music), links are mostly found between related videos/channels. These networks of related videos/channels exhibit small-world characteristics (Cha et al., 2009).

Through this research we aim to identify the motivation behind viewer decisions: what drives a viewer who watched video 1 to watch video 2? By observing whether the viewer has commented on both video 1 and video 2, we are able to establish a viewing pattern. We can then go further and analyse what kind of features (like video length or quality) are significant to viewer choice.

The impact of diversity on group performance
PRESENTER: Fabian Baumann

ABSTRACT. Diversity based on different skills or preferences has been shown empirically to drive innovation, creativity as well as economic growth. It is therefore expected to play a key role in collective problem-solving. Theoretical approaches to investigate group performance, however, have predominantly considered homogeneous populations composed of indistinguishable agents. Challenging the assumption of homogeneity, we investigate the effects of diversity on group performance within a simple model of social learning. We find that diversity has different effects on group performance depending on the difficulty of the task and the social network structure. For simple tasks, diversity generally impairs group performance. In contrast, for complex tasks, the network structure modifies the effect of diversity. On inefficient networks with low link density, where the spread of information is rather slow, homogeneous populations outperform diverse ones. By contrast, on efficient networks with high link density, diverse populations improve and discover on average better solutions than homogeneous ones. Our findings have implications for the composition of problem-solving teams in an increasingly interconnected world: the more we are connected, the more we can benefit from diversity to solve complex problems.

Correlation Networks measure Technology Evolution on Stack Overflow

ABSTRACT. Theories of technological development suggest combinatory evolution and competition as its mode of progress with adjacent technologies sharing similar trajectories. While prior research captured technology relationships on Stack Overflow via networks, we know little about the rules governing the evolution of technologies on the platform. We propose correlation networks as a method to study the evolution, composition, and cohesion of technology domains on Stack Overflow over time through tag use patterns. We find that the strongest technology links form two distinct technology clusters. Recent clusters adequately capture the increasing relevance of Python and machine learning frameworks given their now wide-scale accessibility. We flag further work in validating both our methodologies and assumptions via job market data.

16:35-17:15 Session Speaker S2: Giulia IORI - City, University of London, UK
16:35
Information diffusion in trading networks

ABSTRACT. In this talk I will present results from recent work with my co-authors, which explores experimentally how private information percolates through Over the Counter (OTC) markets. Despite the relevance of OTC markets, very few studies have explored how the distribution of trading links affects the way in which uninformed traders learn the private information of insiders, influencing in turn the efficiency and fairness properties of these markets. We constrain trading to happen on three predetermined network structures, namely the ring, the small-world, and the Erdős-Rényi random network. This allows us to introduce heterogeneity in node centrality and clustering, as well as to change the network's average diameter, while keeping the density of the network fixed. In doing so we can isolate the effect of changes in the network structure on the trading behaviour of market participants, on traded prices, and on the distribution of traders' profits.

We complement laboratory experiments with extensive computational experiments via a behavioural agent-based model (ABM). By calibrating the ABM to the experimental data, we can specify the learning rule and shed light on the roles of clustering and centrality in learning.

Our experiments provide support to the theory that information diffusion is affected by the network structure and by the number and position of insiders.

We show that learning is a collaborative, network process, enabled by synergistic interactions rather than by independent, pairwise interactions with better informed agents, and eased by clustering rather than by node degree or centrality. While regular, clustered, networks strengthen learning locally around the insiders, the higher closeness centrality of random networks better supports its diffusion. The competition between the speed of individual learning and the speed at which information spreads to others determines which market is more efficient and fair. We show that clustered networks can dominate random networks in some cases.

We find that information remains more localized, and is more accurate, around the insider in clustered networks, and more dispersed, but less precise, in random networks. As a result, markets with lower average misbelief may lead to more efficiency but not necessarily to more equality in profits.

17:15-19:15 Session Oral O3A: Dynamics on/of Networks
17:15
The Biased-Voter model: How persuasive can a small group be?

ABSTRACT. We study the voter model dynamics in the presence of confidence and bias. We assume two types of voters: unbiased voters, whose confidence is indifferent to the state of the voter, and biased voters, whose confidence is biased towards a common fixed preferred state. We study the problem analytically on the complete graph using mean-field theory and on an Erdős-Rényi random network topology using the pair approximation, where we assume that the topology of the network of interactions is independent of the type of voters. We find that, for the case of a random initial setup and for a sufficiently large number of voters N, the time to consensus increases proportionally to log(N)/(γv), with γ the fraction of biased voters and v the parameter quantifying the bias of the voters (v = 0 means no bias). We verify our analytical results through numerical simulations. We study this model on a bias-dependent topology of the network of interactions and examine two distinct, global average-degree preserving strategies (model I and model II) to obtain such bias-dependent random topologies, starting from the bias-independent random topology case as the initial setup. Keeping all other parameters constant, in model I, µBU, the average number of links between biased (B) and unbiased (U) voters, is varied at the expense of µUU and µBB, i.e., the average number of links among only unbiased and only biased voters, respectively. In model II, µBU is kept constant, while µBB is varied at the expense of µUU. We find that if the agents follow the strategy described by model II, they can achieve a significant reduction in the time to reach consensus as well as an increase in the probability to reach consensus on the preferred state. Hence, the persuasiveness of the biased group depends on how well its members are connected among each other, compared to how well the members of the unbiased group are connected among each other.
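A minimal simulation consistent with this description might look as follows on the complete graph. The acceptance rule for biased voters (rejecting the non-preferred state with probability v) is our assumption for illustration, not necessarily the authors' exact specification:

```python
import random

def biased_voter(N, gamma, v, seed=0):
    """Voter model on the complete graph: a fraction gamma of voters is
    biased towards state 1 and adopts the non-preferred state 0 only
    with probability 1 - v. Returns (final_states, update_steps)."""
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(N)]
    biased = set(range(int(gamma * N)))   # first gamma*N voters are biased
    steps = 0
    while 0 < sum(state) < N:             # run until consensus
        i, j = rng.randrange(N), rng.randrange(N)
        if i == j:
            continue
        s = state[j]
        # Biased voters resist adopting the non-preferred state 0.
        if i in biased and s == 0 and rng.random() < v:
            continue
        state[i] = s
        steps += 1
    return state, steps
```

Averaging the consensus time over many runs and seeds would let one check the log(N)/(γv) scaling reported above.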

17:30
Double-Threshold Models for Network Influence Propagation

ABSTRACT. We consider new models of activation/influence propagation in networks based on the concept of double thresholds: a node will "activate" if at least a certain minimum fraction of its neighbors are active and no more than a certain maximum fraction of neighbors are active. These models are more flexible than standard threshold models, as they allow us to incorporate more complex dynamics of diffusion processes in which nodes can activate and deactivate, with possible interpretations in the context of social networks. In a social network, consistent with the hypothesis originally mentioned in the seminal work by Granovetter (1978), a person may "activate" (e.g., adopt and/or repost an opinion) if sufficiently many, but not too many, of their friends (i.e., neighbors in a network) have adopted this opinion. We study several versions of this problem setup under different assumptions on activation/deactivation mechanisms and initial choices of seed nodes, and compare the results to the well-known "single threshold" (e.g., linear threshold) models.
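The double-threshold rule itself can be stated compactly. This schematic synchronous-update sketch assumes active nodes also re-evaluate their state (so they may deactivate), which is only one of the model variants mentioned:

```python
import networkx as nx

def double_threshold_step(G, active, lo, hi):
    """One synchronous update: a node is active at the next step iff
    the fraction of its currently active neighbors lies in [lo, hi]."""
    nxt = set()
    for n in G:
        deg = G.degree(n)
        if deg == 0:
            continue
        frac = sum(nb in active for nb in G[n]) / deg
        if lo <= frac <= hi:
            nxt.add(n)
    return nxt
```

Setting hi = 1 recovers an ordinary (single lower-threshold) update, which makes the comparison to linear threshold models direct.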

17:45
Is consensus robust against nodes removal in the Deffuant model during homophily scale-free growing?
PRESENTER: Yerali Gandica

ABSTRACT. Opinion dynamics models express mathematically some hypotheses about social interactions and provide means to investigate their effect in large populations. For instance, bounded confidence models assume that when an agent's opinion is too far from that of its interlocutor, it has no influence. This hypothesis can explain the emergence of macro-behaviours such as consensus, polarization or plurality of opinions. However, opinion dynamics are generally simulated once connections are established; little importance is given to the fact that the creation or destruction of connections and the opinion influences are interdependent dynamics. In this contribution, we study this interplay in a particular case. We simulate pairwise bounded-confidence interactions within a scale-free network growth process with community formation. This procedure mixes the preferential attachment mechanism with the tendency to connect agents with similar opinions (homophily or community formation). In a previous work, the authors showed that the homophily effect on scale-free networks promotes global consensus in the same bounded confidence model, a counter-intuitive result because polarisation is expected as a consequence of echo chambers. In this work, we aim to answer the questions: Is consensus also achieved in homophilic scale-free networks when bounded confidence dynamics are part of the growth of the networks? What happens if there is also node removal?
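For reference, a single pairwise bounded-confidence (Deffuant) interaction can be sketched as below. The parameter names `eps` (confidence bound) and `mu` (convergence rate) follow common usage in the literature, not necessarily the authors' notation:

```python
import random

def deffuant_step(x, eps=0.3, mu=0.5, rng=random):
    """One pairwise bounded-confidence interaction: two randomly chosen
    agents move toward each other only if their opinions differ by
    less than eps. Opinions in list x are updated in place."""
    i, j = rng.sample(range(len(x)), 2)
    if abs(x[i] - x[j]) < eps:
        shift = mu * (x[j] - x[i])
        x[i] += shift
        x[j] -= shift
```

In the coupled setting studied here, such steps would be interleaved with preferential-attachment growth and homophilic link creation rather than run on a fixed population.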

18:00
Stochastic approximation of Adaptive Voter models

ABSTRACT. In this abstract, we present a methodology to tackle typical issues arising in co-evolutionary frameworks.

18:15
Memory Based Temporal Network Prediction
PRESENTER: Li Zou

ABSTRACT. Temporal networks are networks, like physical contact networks, whose topology changes over time. Predicting the future temporal network is crucial, e.g., to forecast and mitigate the spread of epidemics and misinformation on the network. Most existing methods for temporal network prediction are based on machine learning algorithms, at the expense of high computational costs and limited interpretation of the underlying mechanisms or properties that form the networks. This motivates us to develop network-based models to predict the temporal network at the next time step based on the network observed in the past. Firstly, we investigate temporal network properties to motivate our network prediction models and to explain how the performance of these models depends on the temporal networks. We explore the similarity between the network topology (snapshot) at any two time steps with a given time lag/interval. We find that the similarity is relatively high when the time lag is small and decreases as the time lag increases. Inspired by such time-decaying memory of temporal networks and recent advances, we propose two models that predict a link's future activity (i.e., connected or not) based on the past activities of the link itself, or also of neighboring links, respectively. Via seven real-world physical contact networks, we find that our models outperform existing methods in both prediction quality and computational complexity, and predict better in networks that have a stronger memory. Beyond that, our models also reveal how different types of neighboring links contribute to the prediction of a given link's future activity, again depending on properties of the temporal networks.
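The time-decaying-memory idea behind the first model can be illustrated by an exponentially discounted activity score. This is a schematic reduction, not the paper's exact estimator, and `decay` is an assumed parameter:

```python
def memory_scores(history, decay=0.5):
    """history: {link: [0/1 activity per past time step]}.
    Score each link by exponentially discounted past activity;
    higher scores predict activity at the next time step."""
    scores = {}
    for link, acts in history.items():
        T = len(acts)
        # Most recent step (t = T-1) gets weight 1, older steps decay.
        scores[link] = sum(a * decay ** (T - 1 - t) for t, a in enumerate(acts))
    return scores
```

A link active one step ago thus outweighs a link active only in the distant past, matching the time-decaying snapshot similarity described above.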

18:30
Understanding the Evolution of Reddit in Temporal Networks induced by User Activity

ABSTRACT. Online social networks are ubiquitous and have become an essential part of our daily lives. They not only mirror society but act as Petri dishes for discourse, sometimes with fatal consequences. Understanding the dynamics underlying such platforms promises new models for predicting, e.g., the consequences of misinformation. Compared to individual-based platforms like Twitter or Facebook, Reddit is inherently topic-based. The Reddit universe consists of subreddits that represent sets of individuals gathering around certain themes.

This paper aims to understand the dynamics of evolving fields of common interest by analyzing temporal networks induced by Reddit user activity using community detection.

18:45
Socially-enhanced discovery processes
PRESENTER: Gabriele Di Bona

ABSTRACT. Recently, different mathematical approaches have been proposed to investigate and model the dynamics leading to the emergence of the new. Although these models successfully replicate the basic signatures of real-world discovery processes, they neglect the effects of social interactions. In the first part of the talk, I introduce a collaborative exploration model, where each explorer is associated with an urn model with triggering (UMT). Urns are coupled through the links of a complex network, so that explorers can exploit opportunities (possible discoveries) coming from their social contacts. We find that the pace of discovery of an explorer can be predicted analytically by using the eigenvector centrality. This highlights that the structural properties of the network can strongly affect the agents' ability to discover novelties. In the second part of the talk I investigate a data set containing the whole listening histories of a large, socially connected sample of users from the online music platform Last.fm. We find that more explorative users tend to interact with peers more prone to explore new content. We capture this phenomenology in a data-driven modeling scheme where users are represented by random walkers exploring a graph of artists, and interacting with each other through their social links. Interestingly, even starting from a uniform population of agents, our model predicts the emergence of strong heterogeneous exploration patterns, with users clustered according to their musical tastes and propensity to explore. We hope our work can represent a significant step forward to develop a general framework to understand how social interactions shape discovery and innovation processes.
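A single (uncoupled) urn model with triggering can be sketched as follows; `rho` (reinforcement) and `nu` (novelty-triggered expansion) follow the standard UMT formulation, while the network coupling between explorers described in the talk is omitted:

```python
import random

def umt(steps, rho=2, nu=1, seed=0):
    """Urn model with triggering: drawing a ball returns it with rho
    extra copies (reinforcement); drawing a never-seen color adds
    nu + 1 balls of brand-new colors (the 'adjacent possible').
    Returns the time steps at which novelties were discovered."""
    rng = random.Random(seed)
    urn = [0]
    next_color = 1
    seen = set()
    discoveries = []
    for t in range(steps):
        ball = rng.choice(urn)
        urn.extend([ball] * rho)          # reinforce the drawn color
        if ball not in seen:              # novelty triggers expansion
            seen.add(ball)
            discoveries.append(t)
            urn.extend(range(next_color, next_color + nu + 1))
            next_color += nu + 1
    return discoveries
```

In the collaborative model, each explorer holds such an urn and can additionally draw opportunities from the urns of its network neighbors.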

19:00
Role of network topology in between-community beta diversity on river networks
PRESENTER: Richa Tripathi

ABSTRACT. Beta diversity, the similarity of fish species across river basins, shows a non-linear drop with topological distance on river networks. In this work, we investigate the pattern of this drop with network distances and the role of the underlying topology using optimal channel networks on which species evolve under the neutral biodiversity model. We observe that the steady-state beta diversity shows a phase transition-like behavior at a critical network distance. At this critical distance, the average degree over the nodes crosses the global average degree of the network. This study sheds light on the role of branching in dendritic networks in ecological community assembly rules.

17:15-19:15 Session Oral O3B: Networks in Finance & Economics
17:15
Green Sector Space: The evolution and capabilities spillover of economic green sectors in the United States
PRESENTER: Hanin Alhaddad

ABSTRACT. Countries' productive capabilities play a crucial role in their ability to effectively transition their economies towards becoming green. Current research does not address productive capabilities in green sectors; in particular, (a) the effect of green production capabilities on a country's green basket development, and (b) whether the productive capabilities in its green sectors spill over to affect each other and its overall green growth. In this research, we use non-parametric statistics with network science techniques to analyze the evolution of green sectors in the United States. The results of this paper provide recommendations that could benefit the United States' green economic growth, in addition to providing a methodology that can be used by countries' policy makers to build effective recommendations that can accelerate their green economic growth.

17:30
Relatedness, complexity, and growth: The relevance of the value-added approach in CEE countries
PRESENTER: Erik Braun

ABSTRACT. Recent studies have revealed that economic complexity correlates with income level, and the deviation from this correlation predicts future growth. We can quantify economic complexity by the export structure of the countries. If a country produces and exports more types of unique goods, it has a more complex economy. For this, countries must have special skills, capabilities, and knowledge which are the main driving factors of growth.

However, the production processes in the Czech Republic, Hungary, Poland, and Slovakia rely on foreign skills and knowledge through inputs from abroad more than average. Therefore, the potential for future growth of these countries can be more limited than expected by the economic complexity based on gross export. The aim of this study is to quantify the economic complexity with domestic value-added export and to show that the measure of export by domestic value-added content is a better predictor of future growth, especially in CEE countries.

In this study, we utilize the Inter-Country Input-Output data provided by the OECD in order to calculate sector and country relatedness and determine the complexity indices. The results show that the economic complexity of the Czech Republic, Hungary, Poland, and Slovakia is considerably lower if taking into account only the domestic value-added content of exports. The reasons for this are that (1) the comparative advantage of these countries has decreased in general, and (2) they have had a comparative advantage in sectors whose complexity has decreased. Moreover, we have measured the correlation between GDP per capita (PPP) and the country complexity indices in both cases. The correlation test shows that the index based on value-added export has a stronger correlation with income level (p=0.03, t-value=2.13). We have also estimated the predictive power of the complexity indices in terms of future growth. The regression analysis in this respect confirms that the fit of the index calculated on the basis of value-added export performs better. Finally, quantifying the deviation from the correlation shows that the Czech Republic, Slovakia, and Poland have a lower potential for future growth if the latter is predicted on the basis of complexity in value-added export.
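For context, the standard export-based complexity calculation that such studies build on (binarized revealed comparative advantage followed by the method of reflections) can be sketched as below. This is not the authors' value-added variant, and it assumes every country and product has at least one RCA ≥ 1 entry:

```python
import numpy as np

def complexity_index(X, iters=20):
    """Method-of-reflections sketch: X is a country-by-product export
    matrix; M is the binarized RCA matrix; iterate average neighbor
    degrees, starting from diversification and ubiquity."""
    X = np.asarray(X, dtype=float)
    # Revealed comparative advantage (Balassa index)
    rca = (X / X.sum(axis=1, keepdims=True)) / (X.sum(axis=0) / X.sum())
    M = (rca >= 1.0).astype(float)
    kc0, kp0 = M.sum(axis=1), M.sum(axis=0)   # diversification, ubiquity
    kc, kp = kc0, kp0
    for _ in range(iters):
        # Simultaneous update using the previous iteration's values
        kc, kp = (M @ kp) / kc0, (M.T @ kc) / kp0
    return kc
```

The study's variant would replace the gross export matrix X with its domestic value-added content before computing RCA.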

17:45
A network analysis of world trade structural changes (1996-2019)
PRESENTER: Lucia Tajoli

ABSTRACT. Global trade suffered a significant contraction in the value of trade flows as a result of the 2008 financial crisis, and again in 2020 with the pandemic crisis. These observations raise several questions: What is the structure of international trade? How has it changed due to these shocks? Exploiting the advantages of the network approach and using the most extensive coverage data in terms of time and geography, we explore the structure of international trade by presenting a comprehensive analysis of the World Trade Network (WTN) from several angles. Connectivity results suggest that countries' efforts towards multilateral trade relations have resulted in an increasingly dense network, highly reciprocal and clustered. However, the network is not yet fully connected. While trade connections are distributed homogeneously among countries, trade value is concentrated in a small set of countries, yielding a weighted core-periphery structure. We analyze the consequences of the 2008 financial crisis to infer the potential effects of the recent crisis. Although the shock did not affect the main overall connectivity trends, their tendencies were restrained after 2008. The financial downturn also marks a turning point in the clustering of the WTN, from two main groups, led by the United States and Germany, to three, led by the United States, China, and Germany. Revisions in preferential trade partners are visible from 2008 onward. Our study provides intuitive insight for policymakers in assessing the medium-term effects of a global shock.

18:00
Network structures of a centralised and a decentralized market. A direct comparison
PRESENTER: Sylvain Mignot

ABSTRACT. A fundamental assumption in economics is that rational individuals act in their own self-interest. One implication is that, when trading, buyers are supposed to seek the lowest price and sellers the highest one, and social interactions are not considered. It is now largely accepted, however, that social relationships affect the efficiency of a market structure, whether centralized or decentralized (Babus 2013; Opp 2016; Glode 2017).

The current study is the first to examine the network structures of a very specific market: the Boulogne-sur-Mer fish market. On this market, two market structures coexist, each used by the same buyers and sellers exchanging similar goods: a centralized submarket (auctions) and a decentralized one (an over-the-counter market). For each submarket we examine (1) the global network structure and (2) the local network structure, and (3) we identify the trader characteristics that best explain these structures. By comparing the results, we can contrast the roles of trust (bilateral market) and reputation (auction market) in the choice of trading partners.

Structural measures are used to characterize the network structures. Exponential random graph models are used to evaluate how trader characteristics explain purchasing patterns, and how the influence of these characteristics varies with the market mechanism.

We bring to light that, while the transaction links on the auction market reflect the economic constraints of the partners, the relationships on the bilateral market depend on something more. Clearly, the prices of bilateral transactions are the consequence of both economic and non-economic determinants. At first glance, the stable coexistence of two market structures looks like a paradox. Our results help to explain the distinctive characteristics and functioning of each submarket. This discussion contributes to the debate about the efficiency of market structures.

18:15
Territorial Development as an innovation driver: a complex network approach

ABSTRACT. Rankings are a well-established tool to evaluate the performance of actors in different sectors of the economy, and their use is increasing even in the context of the startup ecosystem, both on a regional and on a global scale. Although rankings meet the demand for measurability and comparability, they often provide an oversimplified picture of the status quo, one which overlooks the variability of the socio-economic conditions in which the quantified results are achieved. In this paper, we describe an approach based on constructing a network of world countries in which links are determined by mutual similarity in terms of development indicators. Through the instrument of community detection, we perform an unsupervised partition of the considered set of countries, aimed at interpreting their performance in the StartupBlink rankings. We consider both the global ranking and the specific ones (Quality, Quantity, Business). After verifying whether community membership is predictive of a country's success in the considered ranking, we rate country performances against the expectation based on community peers. We are thus able to identify cases in which performance is better than expected, providing a benchmark for countries in similar conditions, and cases in which performance is below expectation, highlighting the need to strengthen the innovation ecosystem.

18:30
Regional Trade Network of the EU27 for Medical Products in the Fight against the Pandemic: Policy Implications for Self Sufficiency in Critical Inputs
PRESENTER: Semanur Soyyigit

ABSTRACT. The world economy has become wholly interconnected, and states cannot remain unresponsive to events in any part of the world. The pandemic has shown that national healthcare systems are not resilient against such emergencies. The European Commission has also emphasized that the resilience of the healthcare system should be improved through policies, including trade policies. Accordingly, it is vital to make the supply chain more resilient and diversified, and efforts to build strategic reserves of critical medical equipment should be supported (European Commission, 2020). The reason for this is the severe difficulty European countries faced in accessing vital medical supplies such as face masks, ventilation systems, and personal protective equipment (PPE) at the beginning of the pandemic. We therefore examine the network structures of intra-regional and global trade in these products for both 2019 and 2020, and then compare the results from the two years. In this way, the response of countries to such an emergency is analyzed empirically and evaluated within the scope of the regionalization and reshoring debates.

18:45
COVID-19 impact on international trade

ABSTRACT. We analyze how the COVID-19 pandemic affected the trade of products between countries. To this end, using the United Nations Comtrade database, we perform a Google matrix analysis of the multiproduct World Trade Network (WTN) for the years 2018–2020, covering the emergence of COVID-19 as a global pandemic. The applied algorithms (PageRank, CheiRank and the reduced Google matrix) take into account the multiplicity of the WTN links, providing new insights into international trade compared to the usual import–export analysis. These complex network analysis algorithms establish new rankings and trade balances of countries and products, considering all countries on equal grounds, independently of their wealth, and every product on the basis of its relative exchanged volumes. In comparison with the pre-COVID-19 period, significant changes in these metrics occurred for the year 2020, highlighting a major rewiring of international trade flows induced by the COVID-19 pandemic crisis. We define a new PageRank–CheiRank product trade balance, either export- or import-oriented, which is significantly perturbed by the pandemic.

In particular, the reduced Google matrix computed for selected countries and products determines the most COVID-affected trade flows and provides a clear graphical network structure highlighting the rewiring of the WTN induced by the pandemic.

Our results, obtained from the UN Comtrade database, demonstrate a strong COVID-19 impact on international trade. We argue that this negative impact has multiple social and economic origins: the pandemic significantly reduced the movement of people and goods between countries, lowering the demand for petroleum and gas. We show that the PageRank–CheiRank trade balance in mineral fuels dropped for the UAE, Russia, Saudi Arabia, and Iran, and that only the USA and India (among the largest countries) improved their balance. The production of specific products was reduced due to illness, and hence the balance for machinery and manufactured articles turned negative for the USA and almost all European countries; only China was essentially unperturbed (almost no balance changes) for these groups of products. Indeed, most of these products were produced in Asia and were unavailable in Europe (e.g. not enough masks, not enough medicaments). The COVID-19 impact clearly showed that industrial production in the USA and EU countries should be increased in order to become more self-sufficient and independent, a fact that should be taken into consideration by policymakers.
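The PageRank and CheiRank duality described above can be sketched on a toy trade network. This is a minimal, standard-library-only illustration, not the authors' Google matrix implementation: the country names and trade volumes are hypothetical, and CheiRank is obtained simply by running PageRank on the network with all edges reversed.

```python
# Minimal weighted PageRank via power iteration on a toy trade network.
# An edge (u, v, w) stands for exports of volume w from country u to v.

def pagerank(edges, nodes, alpha=0.85, iters=200):
    out_weight = {u: 0.0 for u in nodes}
    for u, v, w in edges:
        out_weight[u] += w
    rank = {u: 1.0 / len(nodes) for u in nodes}
    for _ in range(iters):
        nxt = {u: (1 - alpha) / len(nodes) for u in nodes}
        for u, v, w in edges:
            # Each country passes its rank to partners in proportion to flow.
            nxt[v] += alpha * rank[u] * w / out_weight[u]
        # Dangling nodes (no exports) redistribute their rank uniformly.
        dangling = sum(rank[u] for u in nodes if out_weight[u] == 0)
        for u in nodes:
            nxt[u] += alpha * dangling / len(nodes)
        rank = nxt
    return rank

nodes = ["A", "B", "C"]  # hypothetical countries
edges = [("A", "B", 5.0), ("B", "A", 2.0), ("A", "C", 1.0),
         ("C", "B", 4.0), ("B", "C", 3.0)]

pr = pagerank(edges, nodes)                            # import-side importance
cr = pagerank([(v, u, w) for u, v, w in edges], nodes)  # CheiRank: reversed edges
# A PageRank-CheiRank balance in the spirit of the abstract:
# positive values indicate an export-dominated position.
balance = {c: cr[c] - pr[c] for c in nodes}
```

Because both rankings are probability distributions over the same countries, the balance values sum to zero by construction.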

19:00
Network Topologies of Corporate Organization Charts and Their Correlation with Corporate Performance
PRESENTER: Hiroki Sayama

ABSTRACT. The organization structure of corporations, which represents the formal pathways of corporate chains of command, has the potential to provide insight into the performance of corporate operations. However, this subject has remained unexplored in the fields of management science, social network analysis, and complex systems science, primarily because of the lack of readily available organization network datasets. Most information about corporate organization structures is published as a graphical organization chart, which may be used for a small number of manual case studies but is not suitable for large-scale quantitative statistical analysis.

To overcome the above gap in corporate organization research, we developed a new heuristic image-processing method to extract and reconstruct organization network data from a published organization chart. Our method analyzes a PDF file of a corporate organization chart and detects text labels, boxes, connecting lines, and other objects through multiple steps of heuristically implemented image processing. The detected components are reorganized into a NetworkX Graph object for visualization, validation, and further network analysis.

We applied the above method to the organization charts of all the listed firms in Japan shown in the ``Organization Chart/System Diagram Handbook'' published by Diamond, Inc., from 2008 to 2011. Out of the 10,008 organization chart PDF files, our method was able to reconstruct 4,606 organization networks (data acquisition success rate: 46%). Figure 1 shows an example of the original organization chart and the result of automated extraction of the organizational network.

For each reconstructed organization network, we measured network density, average clustering coefficient, and average distance of nodes from the CEO. We conducted a multivariate regression analysis using the calculated network diagnostics and other control variables as independent variables and the firm's ROA (return on assets) as the dependent variable characterizing its performance. The results (Table 1) showed statistically significant negative correlations between ROA and network density and between ROA and the average distance from the CEO, while the average clustering coefficient had no statistically significant correlation. These results imply that the more complex a firm's formal organization structure is, the less effective its decision making and operation may be.

17:15-19:15 Session Oral O3C: Network Analysis
17:15
Opening up echo chambers via optimal content recommendation

ABSTRACT. Online social platforms have become central in the political debate. In this context, the existence of echo chambers is a problem of primary relevance. These clusters of like-minded individuals tend to reinforce prior beliefs, elicit animosity towards others and aggravate the spread of misinformation. We observe this phenomenon on a Twitter dataset related to the 2017 French presidential elections and propose a method to tackle it via algorithmic recommendations. We use a budgeted quadratic program to find optimal recommendations that maximise the diversity of content users are exposed to, while still accounting for their preferences. The method relies on a theoretical model that can sufficiently describe how content flows through the platform. We show that the model provides good approximations of empirical measures and demonstrate the effectiveness of the optimisation algorithm at mitigating the echo chamber effect on the dataset.

17:30
An analysis of Bitcoin dust through authenticated queries
PRESENTER: Matteo Loporchio

ABSTRACT. Dust refers to amounts of cryptocurrency that are smaller than the fees required to spend them in a transaction. Due to its "economically irrational" nature, dust is often used to achieve some external side effect, rather than to exchange value. In this paper we study this phenomenon by conducting an analysis of dust creation and consumption in the Bitcoin blockchain. We do so by exploiting a new method that allows resource-constrained nodes to retrieve blockchain data by sending authenticated queries to possibly untrusted but more powerful nodes. We validate the method's effectiveness experimentally and then analyze the collected data. Results show that a large amount of dust can be traced back to on-chain betting services.

17:45
A stochastic approach for extracting community-based backbones

ABSTRACT. Large-scale dense networks are pervasive in fields such as communication, social analytics, architecture, biometrics, etc. Thus, building a compact version of a network that still allows its analysis is a matter of great importance. One of the main solutions for reducing the size of a network while maintaining its characteristics is backbone extraction. Two types of methods are distinguished in the literature: coarse-graining techniques compress the network by gathering and merging similar nodes, while filter-based methods discard edges and nodes according to some statistical properties. In this paper, we propose a filter-based approach grounded in the community structure of the network. The so-called "Acquaintance-Overlapping Backbone (AOB)" is a stochastic method that selects overlapping nodes and the most connected nodes of the network. Experimental results show that the AOB is more effective at preserving relevant information than several alternative methods.

18:00
The distance backbone of directed networks

ABSTRACT. In weighted graphs the shortest path between two nodes is often reached through an indirect path, leading to structural redundancies which play key roles in the dynamics and evolution of complex networks. We have previously developed a parameter-free, algebraically-principled methodology to uncover such redundancy and reveal the distance backbone of weighted graphs, which has been shown to be important in transmission dynamics, inference of important paths, and quantifying the robustness of networks. However, the method was developed for undirected graphs. Here we expand this methodology to weighted directed graphs and study the redundancy and robustness found in nine networks spanning social, biomedical, and technical systems. We found that, similarly to undirected graphs, directed graphs in general also contain a large amount of redundancy, as measured by the size of their (directed) distance backbone. Our methodology adds a tool to the principled sparsification of complex networks and the measurement of their robustness.
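As a rough sketch of the backbone idea (not the authors' algebraically-principled implementation), a directed edge is redundant when some indirect directed path is strictly shorter than the edge's weight; the metric edges that survive form the backbone. The graph and weights below are hypothetical.

```python
import heapq

def shortest_distances(adj, src):
    """Dijkstra over a weighted directed adjacency dict {u: [(v, w), ...]}."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def distance_backbone(edges):
    """Keep a directed edge only when no indirect path is strictly shorter
    than its weight (i.e. the edge is metric, not redundant)."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
    return [(u, v, w) for u, v, w in edges
            if shortest_distances(adj, u)[v] >= w]

# The direct edge A -> C (weight 10) is redundant: A -> B -> C has length 3.
edges = [("A", "B", 1.0), ("B", "C", 2.0), ("A", "C", 10.0)]
bb = distance_backbone(edges)
```

Running Dijkstra once per edge is wasteful for large graphs; it simply keeps the sketch short and self-contained.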

18:15
Uniformly Scattering Neighboring Nodes of an Ego-centric Network on a Spherical Surface for Better Network Visualization

ABSTRACT. Ego-centric networks are an important class of networks that represent a particular node's connections to its neighbors. This work aims to provide an efficient method for laying out an ego-centric network so that all neighboring nodes are scattered uniformly on the surface of the unit sphere. Such uniformity is not a simple space-filling that maximizes the Euclidean distance among nodes, but one that takes into account the existing edges among these nodes and avoids overlapping node clusters. Our proposed method is a three-step method that partitions the spherical surface according to a criterion on the edge-to-node ratio, then scatters the nodes on the respective subspaces according to the relationships between nodes and the modularity. For computational efficiency, particle swarm optimization is employed in all three steps to allocate the respective points. We show the connection between our space-filling of points on a spherical surface and the minimum energy design on a two-dimensional flat plane with a specific gradient. We provide a demonstration on an ego-centric network of 50 nodes, and distance statistics show the good performance of our method compared to four state-of-the-art methods based on self-organizing maps and force-driven approaches.

18:30
Delving into Individual Interactions in the IETF Time Evolving Social Graph
PRESENTER: Matthew Barnes

ABSTRACT. Tracking how individuals affect the structure of networks as they evolve in time is of great importance for understanding our individual impact on the hierarchies we inhabit. In a recent publication, we defined new tools for analysing time-evolving networks which begin to delve into this relationship. These tools consist of an equality measure and a hierarchical mobility taxonomy, which we used to compare 30 datasets of wide-ranging sizes and structural types. The networks are given "data types" based on where their data was gathered from, e.g. social, contact or co-occurrence.

We borrow the concept of the Gini coefficient from classical economics to measure the equality of node degree in a network over time. For hierarchical mobility, we developed a taxonomy of six measures, which correlate the degree of individual nodes and their neighbourhoods between two points in time.
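The degree-equality measure can be sketched as follows. This is the generic Gini computation applied, as the abstract suggests, to a network's degree sequence; the two ten-node snapshots are illustrative, not from the paper's data.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 = perfect equality,
    values near 1 = extreme concentration."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula based on the sorted (ascending) sequence.
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * cum) / (n * total) - (n + 1) / n

# Degree sequences of two ten-node snapshots: a star (one hub, nine leaves)
# is highly unequal, while a cycle (every node has degree 2) is perfectly equal.
star_degrees = [9] + [1] * 9
cycle_degrees = [2] * 10
g_star = gini(star_degrees)    # ~0.4
g_cycle = gini(cycle_degrees)  # 0.0
```

Tracking this value across hourly, daily, or monthly time slices gives the equality trajectory the abstract describes.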

Applying both equality and hierarchical mobility to time slices of many sizes, such as an hour, a day or a month, we can track the minutiae of the evolution dynamics of the social hierarchies within the IETF. For instance, there may be individuals that rise to prominence for a year, or maybe even an hour, and this can be charted, along with the effects it has on the rest of the network.

The Internet Engineering Task Force (IETF) is one of the largest bodies of computer scientists, who work together to build internet standards that are voluntary for the wider internet community to adopt. Throughout the more than forty years that it has been running, numerous individuals have contributed Requests for Comments (RFCs), with varying levels of influence on the direction of the organisation. Building on work done to produce a social interaction graph of the IETF, we can apply our new tools to situate the IETF within our corpus of datasets and show how "typical" it is.

Further to this, the members of the IETF are affiliated with many types of institution, ranging from academic to commercial. Determining the level of interdependence between the affiliation of an individual and the evolution of their hierarchical position through time is of interest. Most contributors do not stay in one institution over their career span; we can therefore track their movement and look at the relationship between institution and hierarchical position in the network.

18:45
Exploring topics in LDA models through Statistically Validated Networks: directed and undirected approaches

ABSTRACT. Probabilistic topic models are machine learning tools for processing and understanding large text document collections. Among the different models in the literature, Latent Dirichlet Allocation (LDA) [1] has become the benchmark of the topic modelling community. The key idea is to represent text documents as random mixtures over latent semantic structures called topics. Each topic follows a multinomial distribution over the vocabulary words. To interpret the result of a topic model, researchers usually select the top-n words with the highest probability given a topic (the essential words) and look for meaningful and interpretable semantic themes. This work proposes a new method for exploring topics in LDA models, using Statistically Validated Networks (SVNs). The main idea of the proposed method is to consider co-occurrence between essential words as a measure of association. Two different approaches, called undirected and directed, are proposed.
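The co-occurrence validation behind SVNs is commonly built on a hypergeometric test: for each pair of essential words, one asks whether the number of documents in which they co-occur is significantly larger than random overlap would produce, after a multiple-testing correction. The sketch below, on a hypothetical toy corpus, illustrates that construction and is an assumption about the general SVN recipe, not the authors' exact procedure.

```python
from itertools import combinations
from math import comb

def cooccurrence_pvalue(N, na, nb, nab):
    """P(X >= nab) when a word in na of N documents and a word in nb of N
    documents overlap purely at random (hypergeometric upper tail)."""
    return sum(comb(na, k) * comb(N - na, nb - k)
               for k in range(nab, min(na, nb) + 1)) / comb(N, nb)

# Toy corpus: each document reduced to its set of essential words.
docs = [{"bank", "loan"}] * 5 + [{"river", "boat"}] * 5 + [{"cash"}] * 10
N = len(docs)
vocab = sorted(set().union(*docs))
pairs = list(combinations(vocab, 2))
alpha = 0.05

validated = []
for a, b in pairs:
    na = sum(a in d for d in docs)
    nb = sum(b in d for d in docs)
    nab = sum(a in d and b in d for d in docs)
    p = cooccurrence_pvalue(N, na, nb, nab)
    if p < alpha / len(pairs):  # Bonferroni correction over all pairs
        validated.append((a, b))  # statistically validated edge
```

Only the pairs that always travel together ("bank"/"loan" and "boat"/"river") survive the correction; the directed variant mentioned in the abstract would additionally condition on which word's document set contains the other's.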

19:00
Optimal bond percolation in networks by a fast-decycling framework
PRESENTER: Xiao-Long Ren

ABSTRACT. Keeping physical distance and creating social bubbles are popular measures that have been implemented to prevent infection and slow the transmission of COVID-19. Such measures aim to reduce the risk of infection by decreasing interactions in social networks. Theoretically, this corresponds to the optimal bond percolation (OBP) problem in networks: finding the minimum set of edges whose removal or deactivation dismantles the network into isolated sub-components of size at most C. To solve the OBP problem, we propose a fast-decycling framework composed of three stages: (1) recursively remove influential edges from the 2-core of the network, (2) break large trees, and (3) reinsert unnecessarily removed edges through an explosive percolation process. The proposed approach performs better than existing OBP algorithms on real-world networks. Our results shed light on the design of more practical social distancing and social bubble policies.
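Stage (1) operates on the 2-core, the maximal subgraph in which every node has degree at least 2; since every cycle lives inside it, decycling only ever needs to act there. A minimal standard-library sketch of the 2-core peeling (the toy edge list is illustrative, not one of the paper's benchmarks):

```python
from collections import deque

def two_core(edges):
    """Repeatedly peel nodes of degree < 2; the surviving nodes form the
    2-core, which contains every cycle of the graph."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    queue = deque(u for u, nbrs in neighbors.items() if len(nbrs) < 2)
    removed = set()
    while queue:
        u = queue.popleft()
        if u in removed:
            continue
        removed.add(u)
        for v in neighbors[u]:
            if v not in removed:
                # Peeling u may push a neighbour below degree 2.
                neighbors[v].discard(u)
                if len(neighbors[v]) < 2:
                    queue.append(v)
    return {u for u in neighbors if u not in removed}

# A triangle with a pendant path: the path is peeled away, the cycle remains.
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5)]
core = two_core(edges)
```

Edges removed outside this core never break a cycle, which is why restricting stage (1) to the 2-core keeps the decycling fast.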