09:30-10:10 Session K0: Keynote
More-than-digital experiences in more-than-human worlds: bringing together creative methods with sociomaterialism theory

ABSTRACT. In this presentation, I will explain the possibilities of more-than-human theory, building on ‘old’ and ‘new’ materialisms for understanding people’s more-than-digital experiences. I will provide some examples from recent research projects of using creative research methods to identify the affective and multi-sensory dimensions of these more-than-human worlds as they relate to people’s use of digital technologies and their engagements with digital data.

10:30-12:00 Session T2-A: Societal Challenges
Location: Track A
Quantifying the Creator Economy: A Large-Scale Analysis of Patreon

ABSTRACT. Membership platforms allow creators to receive income from their followers, but the consumption characteristics of these emergent types of platforms remain poorly understood. We analyze transaction-level data to reveal consumption behaviour and creator dynamics as influenced by user breadth, activity level, and financial spending.

My Team Will Go On: Differentiating High and Low Viability Teams through Team Interaction

ABSTRACT. Understanding team viability — a team’s capacity for sustained and future success — is essential for building effective teams. In this study, we aggregate features drawn from the organizational behavior literature to train a viability classification model over a dataset of 669 10-minute text conversations of online teams.

How Many Russian Women Are Murdered by Partners and Relatives: Evidence from Text Analysis of Court Decisions

ABSTRACT. Using the texts of court decisions and machine learning algorithms, we estimated the proportion of family-related and intimate partner homicides among female homicides committed in Russia in 2011-2019.

On the Value of Wikipedia as a Gateway to the Web

ABSTRACT. Wikipedia acts as a gateway to the Web, generating 43M clicks/month to external websites. Official links in infoboxes have the highest click-through rate (CTR), 2.47% on average. The respective website owners would need to pay $7--13 million per month to obtain the same volume of traffic via sponsored search.

10:30-11:45 Session T2-B: Text Analysis and Applications
Location: Track B
Multi-task multi-lingual hate speech detection

ABSTRACT. Hate Speech has become a major content moderation issue for online social media platforms. In this work we utilize a multi-task and multi-lingual approach based on recently proposed Transformer Neural Networks to solve three hate speech and offensive content identification tasks for Indo-European languages. Details: https://github.com/socialmediaie/MTML_HateSpeech

Data filtering and classification for the identification of texts related to security in Bogotá Colombia

ABSTRACT. Perception of security (PoS) refers to the subjective evaluation of risk and the magnitude of its consequences. This paper explores the use of expert knowledge combined with classification algorithms to identify security-related texts from social networks, particularly from Twitter.

Developing a Behavioural Typology of Social Identities

ABSTRACT. In four studies, we illustrate the value of using linguistic style to study social identity similarity online. We show that identities that are perceived as similar show greater similarity in their linguistic style than identities perceived as different. We demonstrate the value of this approach in mapping identity evolution longitudinally.

10:30-12:00 Session T2-C: Social News
Location: Track C
Who’s advertising on Low-credibility News Sites?

ABSTRACT. We identify over 60K retailers advertising on both low-credibility and traditional news sites. We observe that high-profile retailers' ads are just as likely to appear on low-credibility news sites as low-profile retailers'. Finally, we highlight individual high-profile retailers that are being disproportionately promoted on low-credibility news sites.

Communicating Across Political Divides on Social Media

ABSTRACT. We model the relationship between the language that media outlets use in their promotional social media posts and the political diversity of their audience. We integrate our models in a web-application that helps journalists write more bridging content and partner with a media organization to test it using advertising experiments.

Political audience diversity and news reliability in algorithmic ranking

ABSTRACT. How can social media platforms promote reliable information? We propose using the political diversity of a website's audience as a signal. We show that websites with more extreme and less politically diverse audiences have lower journalistic standards. Incorporating audience diversity into recommendations increases their trustworthiness while keeping them relevant.

Sustainability of Stack Exchange Q&A communities: the role of social cohesion and trust

ABSTRACT. We study the evolution of active and closed Stack Exchange websites. Our results show that sustainability is driven not only by the number of active users and questions, but also by users’ trustworthiness and community inclusivity, as signalled by reputation and dense core-periphery interactions.

10:30-12:00 Session T2-D: Science Studies
Location: Track D
A labor advantage drives the greater productivity of faculty at elite universities

ABSTRACT. Faculty at prestigious institutions dominate scientific discourse. We combine employment, publication, and survey data for 97,478 tenure-track scientists at 275 PhD-granting institutions in the American university system to show that availability of funded graduate and postdoctoral labor at more prestigious departments contributes to the environmental effect of prestige on productivity.

New Directions in Science Emerge from Disconnection and Discord

ABSTRACT. We unpack the complex, temporally evolving relationship between citation impact and two emerging measures that capture innovation in science: novelty and disruption. We find that novel papers will exhibit disruptive impact over time, and demonstrate how they are much more likely than conventional papers to disrupt current literature.

How common is common sense?

ABSTRACT. Common sense is tautological, rhetorical, and paradoxical, and yet, common sense is treated as a fundamental source of reason. We empirically measure common sense by asking many people what they think about statements and aggregating the responses.

Coevolution of policy and science during the pandemic

ABSTRACT. We combine two large-scale databases that capture policy and science and their interactions, to examine the coevolution of policy and science during the COVID-19 pandemic, finding recent, high-quality science is being heard, but unevenly. [See our paper at https://science.sciencemag.org/content/sci/371/6525/128.full.pdf for more details]

10:30-12:00 Session T2-E: Network Studies
Location: Track E
The hidden constraints on worker mobility: how workplace skills determine a worker’s next move

ABSTRACT. Economic inequality challenges the career success of today’s workers. However, descriptions of job polarization as a divide between low-skill and high-skill labor are vague. Rather, specific skill requirements determine employability and mobility between labor markets. Advancing theory to empirical insights requires new methodology that embraces the complexity of workplace skills.

Sentiments and Information Diffusion on Social Media: A Curvilinear Relationship

ABSTRACT. How does the sentiment of content affect its diffusion on social media? Intense sentiment evokes attention and emotional arousal. Both have been shown to elicit social sharing. We argue and test that the true relationship is curvilinear, rather than strictly positive. We also explore moderating mechanisms.

People, Places, and Ties: Landscape of social places and their social network structures

ABSTRACT. In this study, we provide a systematic nationwide investigation of third places and their social networks by using Facebook pages. Our analysis reveals a large degree of 1) geographic heterogeneity in the distribution of the types of third places, and 2) topological heterogeneity of their social networks.

Behavioral Correlates of Performance and Satisfaction in Problem-Solving Teams

ABSTRACT. There is a lack of research that successfully applies quantitative methods to study the process of team decision-making. We attempt to quantify aspects of communication that teams and individuals exhibit, and model team processes with the intention to extend quantitative analysis methods for understanding effective team interaction strategies.

10:30-12:00 Session T2-F: Experiments
Location: Track F
A New Surrogate Metric to Measure Political Persuasion Online

ABSTRACT. We examine a novel corpus of >80 RCTs testing the persuasiveness of political ads on Facebook and find no relationship between survey persuasion measures and common proxies for effectiveness like click-through rate. However, we find emoji "reactions" are predictive of persuasion and use them to develop a surrogate metric.

Network Segregation and the Propagation of Misinformation

ABSTRACT. We argue that the segregation of online networks into ideologically homogeneous clusters structurally favors the spread of misinformation. We test this mechanism by seeding informational messages in experimental partisan social networks. In segregated networks, false messages systematically diffused more widely, with the greatest boost observed for the least plausible content.

Data confounds lead to performance overestimations in fake review detections

ABSTRACT. Data used for classification tasks in fake review detection may contain confounds in data-origin (a dataset consists of more than one source) and in product-ownership (product reviewers differ in owning and not owning the product). Using random allocation into experimental conditions, we find that these confounds lead to performance overestimations.

Jump bidding in online auctions: An in vivo experiment

ABSTRACT. This paper explores the effect of jump bidding on the final price in an in vivo experiment where we participate in online auctions as buyers to compare early jump bids with automated sequences of incremental bids as well as with a control condition in which we do not intervene.

10:30-12:00 Session T2-G: Public Health
Location: Track G
Assessing access inequality in American cities using amenity consumption patterns

ABSTRACT. Cities are places of opportunities but also of large social disparities. Access to physical and social resources has the potential to create opportunities for people. This study leverages high-resolution data of consumption patterns from SafeGraph to model the amenity-community bipartite networks across demographic groups in all American cities during 2019.

Subscribing to Sexual Health Content on YouTube: An Exploratory Analysis of Prevalent Topics among 2018-2019 YouTube Videos

ABSTRACT. Information seeking is a primary use of YouTube. This study used structural topic modeling to review video content about sexual health, yielding eight main topics, and their associations with use patterns (i.e., passive consumption and active participation). These findings bear significant implications for public health professions.

Estimation for DeGroot Opinion Diffusion Models with Small Datasets Using a Genetic Algorithm

ABSTRACT. We developed a novel genetic algorithm to estimate the parameters of a DeGroot opinion diffusion model using the small datasets available in public health applications. We present the results of a simulation study to assess algorithm performance under assumption violations and analysis of data from a social network intervention.
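For readers unfamiliar with the model named in this abstract, the standard DeGroot update can be sketched as follows: at each step, every agent replaces its opinion with a weighted average of all agents' current opinions, using a row-stochastic weight matrix. This is only the forward model, not the authors' genetic algorithm for estimating its parameters; the function names, the three-agent weight matrix, and the starting opinions are hypothetical values chosen for illustration.

```python
def degroot_step(weights, opinions):
    """One DeGroot update: each new opinion is a weighted average of current opinions."""
    return [sum(w * x for w, x in zip(row, opinions)) for row in weights]


def degroot_simulate(weights, opinions, steps):
    """Return the list of opinion vectors from step 0 through `steps`."""
    # Each row holds one agent's trust weights and must sum to 1.
    for row in weights:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of the weight matrix must sum to 1")
    history = [list(opinions)]
    for _ in range(steps):
        history.append(degroot_step(weights, history[-1]))
    return history


# Hypothetical three-agent network (illustrative values, not from the study).
W = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
x0 = [0.0, 0.5, 1.0]

trajectory = degroot_simulate(W, x0, steps=50)
print(trajectory[1])   # opinions after one update
print(trajectory[-1])  # opinions after 50 updates
```

In an estimation setting such as the one the abstract describes, the weight matrix would be the unknown, and observed opinions at several time points would be compared against trajectories like this one.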

What can Administrative Data tell us about Service Allocation to American Foster Youth?

ABSTRACT. Under the John H. Chafee Foster Care Independence Program (CFCIP), foster care agencies in the US are reimbursed for services that assist foster youth in achieving self-sufficiency. We study factors in administrative data that predict which youth receive services, and the limitations of using administrative data for this purpose.

14:30-16:30 Session K2: Keynotes
Measuring large-scale emotion aggregates through social media text

ABSTRACT. Social media data has allowed us to quantify and track emotional expression across scales and for long periods of time. This is possible thanks to the application of text analysis methods that can be validated at the level of individual social media posts, for example against sentiment annotations. However, we still lack evidence showing whether aggregates of emotional expressions over time match the self-reported emotional experiences of the members of a community. In this talk, I present a set of studies comparing the temporal evolution of large-scale aggregates of emotion on social media versus surveys of emotional experiences of social media users and representative panels. I will present a comparison of dictionary-based and supervised text analysis methods and show their potentials and limitations as macroscopes of emotional experiences. This helps us to understand the convergence of traditional and computational social science methods and how to quantify measurement error when studying social phenomena through digital traces.

Pushing research on user-centric information exposure forward: bringing tracking, survey and automated text classification together.

ABSTRACT. The abundance of choices in today’s information environment challenges traditional survey research for measuring exposure. In this presentation we will put forward suggestions on how we can improve our understanding of individuals’ media diets, i.e. the channels and contents chosen, in a digital world. Based on the limitations of existing user-centric tracking approaches, we introduce a new academic tracking solution. We then show how the combination of computational tracking and classical survey research allows us not only to connect the offline and online realm, but also to unravel the drivers and consequences of individuals’ information exposure. So far, however, this newly evolving strand of research is hampered by the fact that tracking data primarily capture the clicks and sources, without further looking at the content people are actually engaging with. We thus conclude this presentation by showing how automated text classification could enrich survey and tracking data and thereby lead to a paradigm shift in exposure research. Such a shift would allow us to better judge the degree to which societal phenomena like echo chambers or information avoidance truly exist.

Computational Social Science and the Right to Audit

ABSTRACT. Online platforms are taking action against researchers who study them, arguing that scraping, sock puppets, or even just writing down public information with a pen constitutes hacking, theft, trespassing, research misconduct, or a terms of service violation. Researchers have had their accounts blocked, received cease-and-desist letters, and faced lawsuits and de-funding. Using the lens of the recent US legal case Sandvig v. Barr, this talk will discuss techniques used to investigate online platforms under the banner of the social science audit study, reviewing the necessity of establishing norms and legal frameworks that protect investigations in the public interest.

17:00-18:30 Session P1: Poster Session
Location: gather.town
A Method to Analyze the Multiple Social Identities in Twitter Bios

ABSTRACT. https://github.com/kennyjoseph/twitter_personal_identifiers

Understanding Online Hacktivist Groups: A Case Study into the Anonymous Collective

ABSTRACT. https://ojs.aaai.org/index.php/ICWSM/article/view/7303

Influence Dynamics in Diet-Climate Discourse: The Case of Veganuary 2019 on Twitter

ABSTRACT. We use network analysis to map the Veganuary 2019 Twitter discourse, an important landmark in the climate-diet discourse. We evaluate levels of influence disintegration between supporters and antagonists of plant-based diets using projection analyses and thereby detect influence dynamics.

Ephemeral Astroturfing Attacks: The Case of Fake Twitter Trends

ABSTRACT. We define and describe ephemeral astroturfing attacks on Twitter trends, which employ deletions and compromised accounts and are responsible for 47% of the trends in the region analyzed as well as 20% of the global trends.

Quantifying the effect of social media networks on the success of entrepreneurs

ABSTRACT. Tweeting a lot is not good for entrepreneurs, at least.

The Structure of Toxic Conversations on Twitter

ABSTRACT. https://web.media.mit.edu/~msaveski/dtox-poster.pdf

Identifying individuals using topic patterns of Instagram photos

ABSTRACT. By combining computer vision and topic modeling approaches, we found that individuals' topic patterns of Instagram images could be reliably distinguished from others.

Adoption of Twitter’s New Length Limit: Is 280 the New 140?

ABSTRACT. In 2017, Twitter increased the maximum allowed tweet length to 280 characters, altering its signature feature. Studying the long-term effects of this change, we find that while the introduction eliminated the disproportion of tweets reaching 140 characters, a similar effect emerged around the 280-character limit after the switch.

Why it is important to consider negative ties when studying polarized debates on Twitter

ABSTRACT. We introduce an approach to study online discourse through signed network analysis, and apply this approach to the Dutch Twitter debate on ‘Black Pete’—an annual Dutch celebration involving wearing blackface.

Do we have to agree on everything? On the alignment of echo chambers

ABSTRACT. If the echo chamber effect is strong, then we would expect polarised communities to persist over multiple dimensions. Our results show that communities do not always strongly persist, which makes it necessary to understand the reasons that could make an echo chamber transient or persistent.

Comparable Analysis of News Diffusion between Mainstream and Alternative Media in Twitter

ABSTRACT. We analyzed how news from each media outlet in Japan spreads on Twitter using a Hawkes process. The results reveal characteristics of diffusion in Japanese news media; for example, some media with political bias have a fast diffusion convergence speed due to the echo chamber effect.

Linking COVID-19 perception with socioeconomic conditions using Twitter data

ABSTRACT. The COVID-19 outbreak has influenced our lives in unprecedented ways. In our study, we investigate topic dynamics of Twitter content sharing for the Republic of Turkey. We have analyzed 1.3 million tweets containing the keyword "korona". The analysis indicates that the lower the income, the more COVID-19 news-related content is shared.

Information in times of COVID19: traditional media vs. on-line social networks

ABSTRACT. We are interested in investigating the interplay between a traditional medium, the New York Times, and those users who follow its posts on Twitter.

Anti-Vax Strategy in the Reply Behavior on Social Media

ABSTRACT. We empirically analyzed the strategy of anti-vaxxers' reply behavior on Twitter. We found that anti-vaxxers replied more frequently to users in other clusters, and that the content of their replies was significantly toxic and emotional. Furthermore, the most-targeted users were so-called "decent" accounts with large numbers of followers.

Disparity and Dynamics of Social Distancing Behaviors in Japan: An Investigation of mobile phone mobility data

ABSTRACT. This study aims to assist in the design and implementation of public health policies by exploring how voluntary and policy-induced social distancing behaviors shift over time across demographic groups.

Opinion Polarization on COVID-19 Measures: Integrating Surveys and Social Media Data

ABSTRACT. We introduce a framework for a multi-perspective analysis of opinion polarization using surveys, social media, and integrated data. Testing it on the topic of COVID-19 prevention measures in the German-speaking DACH region, we find that vaccination is more polarizing than mask-wearing and contact tracing in surveys and social media.

Exploring Who is Responsible for the Spread of COVID-19 Misinformation on Twitter

ABSTRACT. In this study, we investigated the spread of misinformation by analyzing the authors, content and propagation of the infodemic on Twitter. Using data from over 92 professional fact-checking organizations united as the International Fact-Checking Network (IFCN) from January to July 2020, we analyzed 1,500 false and partially false tweets that spread misinformation.

Leveraging dynamic heterogenous networks to study transnational issue publics

ABSTRACT. Extended abstract and poster: https://sync.academiccloud.de/index.php/s/6yCph6IGWdfLMYO.

“Russian vaccine will be good for our politicians; I want Pfizer” – Changing narratives of vaccination in Hungary

ABSTRACT. In this study, we use online textual data and advanced text mining methods to analyze the anti-vaccine discourse in Hungary during the second and third waves of the Covid-19 pandemic. We show the temporal changes of these narratives and also present how different vaccines are mentioned in this discourse.

Using online searches and Social Media during pandemics to improve now-casting models

ABSTRACT. Online searches have been used to study different behaviours, including monitoring disease outbreaks. We focused on the two pandemics of the 21st century (2009-H1N1 flu and Covid-19) and collected (a) Google searches, (b) frequency of news media, and (c) number of actual infections to improve disease prediction.

Community in Times of Pandemic: The Evolution of the r/Covid19 Forum on Reddit

ABSTRACT. Our aim is to understand how both the individual and collective behaviour of participants on the Reddit forum, r/Covid19, have changed over this past year. To the best of our knowledge, this is one of the first studies to explore an online Covid-19 related forum using temporal network embeddings.

Analysis of COVID-19 Vaccination Willingness Based on the Initial Trust Model of ELM

ABSTRACT. The research is based on the elaboration likelihood model (ELM) to explore the influence of argument quality, information quality, reference group popularity, perceived information quality and perceived risk on COVID-19 vaccination willingness.

The influence of misinformation on U.S. COVID-19 vaccinations

ABSTRACT. Widespread uptake of COVID-19 vaccines is necessary to achieve herd immunity. However, surveys have found concerning numbers of U.S. adults hesitant or unwilling to be vaccinated. Online misinformation may play an important role in vaccine hesitancy, but we lack a clear picture of the extent to which it will impact vaccination uptake. In our work, we study how vaccination rates and vaccine hesitancy are associated with levels of online misinformation about vaccines shared by 1.6 million Twitter users geolocated at the U.S. state and county levels. We find a negative relationship between misinformation and vaccination uptake rates (Fig. 1A). Online misinformation is also correlated with vaccine hesitancy rates taken from survey data (Fig. 1B). Associations between vaccine outcomes and misinformation remain significant when accounting for political as well as demographic and socioeconomic factors. While vaccine hesitancy is strongly associated with Republican vote share, we observe that the effect of online misinformation on hesitancy is strongest across Democratic rather than Republican counties (Fig. 2). These results suggest that addressing online misinformation must be a key component of interventions aimed to maximize the effectiveness of vaccination campaigns.

A time series analysis of r/CoronavirusUK and r/CoronavirusUS subreddit volume and sentiment

ABSTRACT. We analyse social media posts and comments about Coronavirus on Reddit for two demographics: r/CoronavirusUK and r/CoronavirusUS. We apply time series methods to analyse the volume and sentiment of posts and comments as part of an exploratory analysis of how the pandemic has played out over social media.

Can Social Signals Predict Bitcoin Price Fluctuations?

ABSTRACT. Bitcoin is a cryptocurrency with large price fluctuations. In our analysis, we add social signals related to information search (Google Trends data) and sentiments from word of mouth (tweets) to test whether they have an effect on bitcoin price fluctuations. The tweet sentiments come from people who have many followers and actively tweet about bitcoin, called Bitcoin influencers. In addition, we analyse the retweet and reply network structures of influencers. Our analysis reveals that (i) Google Trends has a slight and insignificant impact on price volatility and (ii) the "anticipation" emotion of the tweets has the greatest impact, reaching 10% of the price change.

The Socioeconomic Mobility Gap: Disparities in the COVID-19 Pandemic

ABSTRACT. We investigate socioeconomic differences in mobility patterns during the COVID-19 pandemic, using aggregated location data from Google. We find that lower-income counties did not reduce their mobility as much as higher-income counties, and that the size of the socioeconomic mobility gap varies substantially across US states and types of locations.

Attention dynamics on the Chinese social media Sina Weibo during the COVID-19 pandemic

ABSTRACT. Understanding attention dynamics on social media during pandemics could help governments minimize the effects. We focus on how COVID-19 has influenced the attention dynamics on the biggest Chinese microblogging website Sina Weibo during the first four months of the pandemic. We study the real-time Hot Search List (HSL), which provides the ranking of the most popular 50 hashtags based on the amount of Sina Weibo searches. We show how the specific events, measures and developments during the epidemic affected the emergence of different kinds of hashtags and the ranking on the HSL. A significant increase of COVID-19 related hashtags started to occur on HSL around January 20, 2020. Then very rapidly a situation was reached where COVID-related hashtags occupied 30–70% of the HSL, however, with changing content. We analyzed how the hashtag topics changed during the investigated time span and conclude that there are three periods separated by February 12 and March 12. We further explore the dynamics of HSL by measuring the ranking dynamics and the lifetimes of hashtags on the list. This way we can obtain information about the decay of attention, which is important for decisions about the temporal placement of governmental measures to achieve permanent awareness. Furthermore, our observations indicate abnormally higher rank diversity in the top 15 ranks on HSL due to the COVID-19 related hashtags, revealing the possibility of algorithmic intervention from the platform provider.

Response to COVID-19 with Probabilistic Programming

ABSTRACT. This work provides an end-to-end pipeline to simulate the COVID-19 virus spread and the incurred loss of various non-pharmaceutical interventions.

Learning in the Pandemic: Unveiling Inequalities in College Students’ Academic Experience Through Large-Scale Behavioral Analytics

ABSTRACT. The COVID-19 pandemic has imposed substantial challenges on college students as they struggled to stay engaged and on track to meet their educational goals. This study uses behavior data from learning management systems (LMS) to systematically document college students’ academic experience and its inequality before and during the pandemic-induced remote instruction. Using such data generated by all the students enrolled between September 2019 and December 2020 (N=32,891) at an American institution and a comparative interrupted time series (CITS) design, this study finds that compared to before, overall engagement in LMS increased, discussion posts became shorter and more emotional, and assignment submissions received higher scores after the remote transition. As for inequality, URM and first-gen students had more interrupted study sessions, wrote longer and less analytical discussion posts, and submitted assignments later than their counterparts both before and after the pandemic impact; some of these gaps were also exacerbated by the remote transition. Such preliminary results serve not only to help inform post-pandemic institutional policies to make up for the learning loss, but also to inspire arrangements for online learning as a normal part of the future of higher education.