Introduction to Day Two
Keynote III
Panel
Poster 2
11:30 | QUEST: A Common Sense Approach to Annotating Q&A Content ABSTRACT. Long-form questions and answers on community question answering (CQA) websites and forums such as stackoverflow.com are a valuable resource. The questions asked on such platforms differ markedly from those in traditional question answering research challenges and datasets: they have multi-sentence elaborations, can vary from an embryonic curiosity to a fully fleshed-out problem, can contain multiple intents, and can seek advice and opinion beyond facts. Consequently, answers are also different: they are longer and more diverse. For complex and subjective questions, there usually is no authority or notion of correctness, and diverging answers can be helpful to the asker and the community. Motivated by a desire to better understand the quality of long-form questions and answers, we designed an annotation task to collect data about a range of commonsense properties of questions and answers, such as question interestingness and answer helpfulness. Just like the users of Q&A sites and forums, who are not always experts, the raters in our annotation task are asked to use their commonsense judgment. Our work contributes to the less-explored domain of collecting non-experts' subjective judgments by releasing QUEST, a dataset that contains about 687,000 annotations on 12,096 unique question-answer pairs from 30 different Q&A sites, each judged by 3 independent raters. We describe the dataset, iterations of task design, potential implications and usage, as well as limitations and challenges in crowdsourcing commonsense judgments.
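A minimal sketch of how per-rater judgments of this kind might be aggregated, assuming hypothetical record fields (the actual QUEST schema may differ): each question-answer pair carries several commonsense properties rated by 3 independent raters, and a simple mean collapses them into one score per property.

```python
from collections import defaultdict
from statistics import mean

# Illustrative records only; not the actual QUEST schema.
# Each record: (qa_pair_id, property, rater_id, rating on a 1-5 scale)
annotations = [
    ("qa_001", "answer_helpfulness", "r1", 4),
    ("qa_001", "answer_helpfulness", "r2", 5),
    ("qa_001", "answer_helpfulness", "r3", 4),
    ("qa_001", "question_interestingness", "r1", 2),
    ("qa_001", "question_interestingness", "r2", 3),
    ("qa_001", "question_interestingness", "r3", 2),
]

# Group the three raters' judgments per (pair, property) and average them.
grouped = defaultdict(list)
for pair_id, prop, rater, rating in annotations:
    grouped[(pair_id, prop)].append(rating)

aggregated = {key: mean(vals) for key, vals in grouped.items()}
print(aggregated)
# {('qa_001', 'answer_helpfulness'): 4.33..., ('qa_001', 'question_interestingness'): 2.33...}
```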
11:30 | Collective Intelligence in business and in public sphere: possible research methods for a comparative study ABSTRACT. Collective intelligence projects provide a framework for absorbing, filtering, summarizing, explaining and comparing knowledge and ideas, for creating and evaluating possible solutions to problems, and finally for taking decisions. One definition of collective intelligence describes it as the ability to solve problems that exceed the skills of a single person. If mutual cooperation is missing in a social structure, that structure has a limited ability to solve a certain class of problems: every individual looks for solutions on their own, so neither positive nor negative interaction exists. Collective intelligence, however, emerges when cooperation, competition and mutual observation lead to new, original solutions to problems or accelerate the process and increase the ability to solve complex problems (Szuba, 2001). In most cases, this kind of cooperation occurs in business projects, but similar initiatives concerning public affairs are equally important. So far, despite various efforts, there are no fully satisfactory comparative examinations of CI initiatives in these two spheres. Therefore, the principal goal of my presentation is to analyze research methods that may help capture differences in the behavior of CI participants in both fields.
11:30 | Repetition Doesn't Have To Be Boring: User Experience Design For Online Citizen Science Data Classification Applications ABSTRACT. Online citizen science applications for data classification, such as Galaxy Zoo and Penguin Watch, are developed to accommodate a broad spectrum of users from various backgrounds and with different levels of interest, involvement and expertise. Using these applications neither requires nor assumes any prior scientific knowledge or skills, so that they remain open and inviting to users regardless of their background. Volunteers of online citizen science applications form a multidimensional user group that can be separated into a variety of distinct user groups with different motivations and perceived outcomes from their participation. From a usability point of view, this raises the question of how to attract those groups of participants not necessarily concerned with the scientific aspects of the activities to tasks that are sometimes repetitive or concentrated on very specific micro-tasks. This paper analyses feedback from first-time users of various classification applications (namely Galaxy Zoo, Penguin Watch and Bat Detective from Zooniverse, and Gendered and Tech magazines from Crowdcrafting) and proposes a set of guidelines for the User Experience (UX) design of online citizen science data classification applications that accommodate a variety of user groups. The guidelines cover, among other things, information and content presentation, the placement of controls and help buttons, and the addition of various levels of difficulty, addressing functionality, usability, and look and feel through User Centered Design (UCD) methodologies.
11:30 | A Trading Market for Prices in Peer Production ABSTRACT. Open source software forms much of our digital infrastructure. However, it contains vulnerabilities that have been exploited, attracted public attention, and caused large financial damage. This paper proposes a solution to shortcomings in the current economics of open source software development. The main idea is to introduce price signals into the peer production of software. This is achieved through a trading market for futures contracts on the status of software issues. Users who value secure software gain the ability to predict outcomes and incentivize work, strengthening collaboration and information sharing in open source software development. The design of such a trading market is discussed and a prototype is introduced. The feasibility of the trading market design is validated in a proof-of-concept implementation and simulation.
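A minimal sketch of the kind of contract such a market could trade, under assumptions of my own (binary payout, fixed deadline); the paper's actual market design may differ. The contract pays out if the referenced issue is resolved by the deadline, so its traded price can be read as the market's probability estimate that the work gets done.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class IssueFuture:
    """Binary futures contract on the status of an open source issue.

    Illustrative design only, not the paper's contract specification:
    pays `payout` if the issue is resolved by `deadline`, else nothing.
    """
    issue_url: str
    deadline: date
    payout: float = 1.0

    def settle(self, resolved_on: Optional[date]) -> float:
        resolved = resolved_on is not None and resolved_on <= self.deadline
        return self.payout if resolved else 0.0

# A trading price of 0.30 for a 1.0-payout contract reads as a market-implied
# 30% chance that the issue is fixed in time; users who value the fix can buy,
# and developers who intend to do the work can sell and thus get paid for it.
contract = IssueFuture("https://example.org/project/issues/42", date(2019, 12, 31))
print(contract.settle(resolved_on=date(2019, 11, 2)))  # 1.0
print(contract.settle(resolved_on=None))               # 0.0
```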
11:30 | Artificial Swarms Outperform in Finding Social Optima ABSTRACT. Does Artificial Swarm Intelligence enable human groups to converge on optimal decisions at higher rates than traditional methods for aggregating group input? This study explores the issue rigorously and finds that "human swarms" can be significantly more effective at enabling networked populations to converge on social optima than plurality voting, Borda count rankings, and Condorcet pairwise voting. Across a test set of 100 questions, the traditional voting methods reached socially optimal solutions 60% of the time, while the artificial swarming systems converged on socially optimal solutions 82% of the time. This is a highly significant result (p=.001) and suggests that human swarming may be an effective path not only for amplifying the intelligence of human populations, but for enabling human groups with conflicting interests to find solutions that maximize their collective opinions, preferences, interests, and/or welfare.
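For readers unfamiliar with the baselines, a minimal sketch with made-up preference data of two of the traditional aggregation methods the swarm is compared against: plurality voting counts only first choices, while Borda count awards points by rank position.

```python
from collections import Counter

# Hypothetical ballots: each voter ranks options from most to least preferred.
ballots = [
    ["A", "B", "C"],
    ["A", "C", "B"],
    ["B", "C", "A"],
    ["C", "B", "A"],
    ["C", "B", "A"],
]

# Plurality: count first-choice votes only.
plurality = Counter(ballot[0] for ballot in ballots)

# Borda count: with m options, a candidate earns (m - 1 - rank) points per ballot.
borda = Counter()
m = len(ballots[0])
for ballot in ballots:
    for rank, option in enumerate(ballot):
        borda[option] += m - 1 - rank

# Even on this toy example the two rules disagree, which is why the choice
# of aggregation method matters when searching for a social optimum.
print(plurality.most_common())  # [('A', 2), ('C', 2), ('B', 1)] -> A and C tie
print(borda.most_common())      # [('C', 6), ('B', 5), ('A', 4)] -> C wins
```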
11:30 | Collective Intelligence Aspects of Cyber-Physical Social Systems: Results of a Systematic Mapping Study SPEAKER: Marta Sabou ABSTRACT. Cyber-physical systems (CPS) are systems that span the physical and cyber-world by linking objects and processes from these spaces. In a typical CPS, data is collected from the physical world via sensors, and computation resources from the cyber-space are used to integrate and analyze this data in order to decide on optimal feedback processes, which can be put in place by physical actuators. CPS have started to diffuse into many areas, including mission-critical public transportation, energy services, and industrial production and manufacturing processes. While CPS affect the lives of people who rely on them on a daily basis, they so far only interact with humans as passive consumers. The results of a recent study about adaptation in CPS revealed an emerging trend to add an additional "social" layer to the CPS architecture to address human and social factors. This trend shows the growing recognition of the importance of the social dimension of such CPS and of the need to evolve them into cyber-physical social systems (CPSS). CPSS consist not only of software and raw sensing and actuating hardware, but are fundamentally grounded in the behaviour of human actors, who both generate data and make informed decisions. As CPSS extend CPS with a social dimension, the question of the relation between CPSS and self-organizational, crowd-powered systems and Collective Intelligence (CI) systems naturally arises. What CI aspects do CPSS exhibit? Can we consider them as an emerging type of CI system, or should they rather be perceived as systems of systems that also include a CI system? To answer these and other questions, we have recently performed a systematic mapping study of CPSS. In this paper we report on the study and some of our initial findings.
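To make the architectural distinction concrete, a minimal sketch (all names hypothetical, not taken from the paper) of the sense-analyze-actuate loop of a CPS, extended with a social layer in which a human actor can override or confirm the automated decision, as the CPSS notion suggests.

```python
import random
from typing import Optional

def sense() -> float:
    """Physical layer: read a sensor value (simulated here as a temperature)."""
    return random.uniform(15.0, 30.0)

def analyze(temperature: float) -> str:
    """Cyber layer: integrate and analyze the data to decide on a feedback action."""
    return "cool" if temperature > 25.0 else "idle"

def social_layer(proposed_action: str, human_override: Optional[str]) -> str:
    """Social layer: a human actor may confirm or override the automated decision."""
    return human_override if human_override is not None else proposed_action

def actuate(action: str) -> None:
    """Physical layer: put the decision into effect via an actuator."""
    print(f"actuator -> {action}")

# One iteration of the feedback loop, with no human override in this run.
reading = sense()
actuate(social_layer(analyze(reading), human_override=None))
```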
11:30 | Towards Hybrid Human-Machine Translation Services ABSTRACT. Crowdsourcing has recently been used to automate complex tasks where computational systems alone fail. The literature includes several contributions concerning natural language processing, e.g., language translation [Zaidan and Callison-Burch 2011; Minder and Bernstein 2012a; 2012b], also in combination with active learning [Green et al. 2015] and interactive model training [Zacharias et al. 2018]. In this work, we investigate (1) whether a (paid) crowd, acquired from a multilingual website's community, is capable of translating coherent content from English into their mother tongue (we consider Arabic native speakers); and (2) in which cases state-of-the-art machine translation models can compete with human translations, so that automation can reduce task completion times and costs. The envisioned goal is a hybrid machine translation service that incrementally adapts machine translation models to new domains by employing human computation, making machine translation more competitive (see Figure 1). Recently proposed approaches for domain adaptation of neural machine translation systems include filtering of generic corpora based on sentence embeddings of in-domain samples [Wang et al. 2017], fine-tuning with mixed batches containing in-domain and out-of-domain samples [Chu et al. 2017], and different regularization methods [Barone et al. 2017]. As a first step towards this goal, we conduct an experiment using a simple two-staged human computation algorithm for translating a subset of the IWSLT parallel corpus, including English transcriptions of TED talks and reference translations in Arabic, with a specifically acquired crowd. We compare the output with the state-of-the-art machine translation system Google Translate as a baseline.
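A minimal sketch, under my own assumptions, of the routing idea behind such a hybrid service: if an automatic quality estimate for the machine translation clears a threshold, the MT output is kept; otherwise the segment goes to the crowd. The quality estimator and crowd call are placeholders, not the authors' actual pipeline.

```python
from typing import Callable

def hybrid_translate(
    source: str,
    machine_translate: Callable[[str], str],
    estimate_quality: Callable[[str, str], float],
    crowd_translate: Callable[[str], str],
    threshold: float = 0.8,
) -> str:
    """Route a segment to MT or to the crowd based on estimated quality.

    Placeholder logic: a full system would also feed accepted crowd
    translations back into the MT model for domain adaptation.
    """
    candidate = machine_translate(source)
    if estimate_quality(source, candidate) >= threshold:
        return candidate            # confident enough: keep the cheap MT output
    return crowd_translate(source)  # low confidence: pay for a human translation

# Toy usage with stub components.
result = hybrid_translate(
    "Hello world",
    machine_translate=lambda s: "مرحبا بالعالم",
    estimate_quality=lambda src, hyp: 0.65,   # pretend the estimator is unsure
    crowd_translate=lambda s: "أهلاً بالعالم",
)
print(result)
```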
11:30 | False Positive and Cross-relation Signals in Distant Supervision Data ABSTRACT. Distant supervision (DS) is a well-established method for relation extraction from text, based on the assumption that when a knowledge base contains a relation between a term pair, sentences that contain that pair are likely to express the relation. In this paper, we use the results of a crowdsourced relation extraction task to identify two problems with DS data quality: the widely varying degree of false positives across different relations, and the observed causal connection between relations that are not considered by the DS method. The crowdsourcing data aggregation is performed using ambiguity-aware CrowdTruth metrics, which are used to capture and interpret inter-annotator disagreement. We also present preliminary results of using the crowd to enhance DS training data for a relation classification model, without requiring the crowd to annotate the entire set.
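A minimal sketch, with toy data, of the distant supervision assumption the paper examines: any sentence mentioning a term pair that the knowledge base links with a relation is labeled as a training example for that relation, which is exactly how false positives enter the data. The data and names below are illustrative only.

```python
# Toy knowledge base: (subject, relation, object) triples.
kb = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "died_in", "Passy"),
]

sentences = [
    "Marie Curie was born in Warsaw in 1867.",
    "Marie Curie visited Warsaw many times later in life.",  # false positive for born_in
    "Marie Curie died in Passy in 1934.",
]

# Distant supervision: label every sentence containing both terms of a
# KB pair as expressing that pair's relation.
training_examples = [
    (sentence, relation)
    for subj, relation, obj in kb
    for sentence in sentences
    if subj in sentence and obj in sentence
]

for sentence, relation in training_examples:
    print(relation, "<-", sentence)
# born_in <- Marie Curie was born in Warsaw in 1867.
# born_in <- Marie Curie visited Warsaw many times later in life.   (noise)
# died_in <- Marie Curie died in Passy in 1934.
```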
Keynote IV