AAMAS 2017: 16TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS
PROGRAM FOR FRIDAY, MAY 12TH
Days:
previous day
all days

View: session overviewtalk overview

09:00-10:00 Session Plenary 5: Invited Talk (Ana Bazzan)
Location: Ballroom
09:00
Trying to Fix Traffic: Past, Present, and Some Future Trends

ABSTRACT. According to the INRIX Global Traffic Scorecard, in 2016, an average driver in Los Angeles has spent 100 hours (approximately 4 days) in traffic jams. The economic impact of this is huge. Besides, also issues related to environment is non-negligible.

Many naive solutions tend to ignore the fundamental rule of traffic: building new roads just makes people drive more, i.e., more supply creates more demand. Given the continuous growth of the world urban population and the increasing needs related to mobility of people and goods, the demand is likely to grown anyway.

Lucky, there are not only bad news here: people do adapt to congestion. Therefore, it is important to understand how this adaptive process occurs, and how to seize the opportunity to develop or enhance agent-based tools as well as reinforcement learning methods in order to contribute to the effort of fixing traffic as much as possible. Among many others, one example here is the design of protocols and mechanisms underlying new traffic-related apps. In this talk, I try to take a look at the present and future of this task. However, we should not throw away the past experience, despite new actors or players in this game, despite new paradigms that are starting to arise, and in spite of changes in preference by those actors (e.g., drop in car ownership among young people).

Thus, this talk reviews what we already know, as well as what we have already tried (for instance regarding the role of traffic control or regarding what can be learned from studying traffic assignment and route choice). Interesting questions here are: how can a microscopic, agent-based approach to modeling and simulation help decision-makers and traffic planners? How can new technologies, such as routing apps, microblogs, and context for prediction of patterns be incorporated to the past effort/agenda? How to most effectively use the internet of vehicles?

Finally, this talk summarizes some challenges and obstacles, such as the lack of testbeds, or even the lack of validation data.

10:00-10:30Coffee Break
10:30-12:30 Session 5A: Computational Social Choice 4
Location: Ballroom 1
10:30
Anyone But Them: The Complexity Challenge for A Resolute Election Controller

ABSTRACT. We study the voting problems where given is an election associated with a subset J of candidates, and the question is whether we can modify the election in a way so that none of the candidates in J wins the election. The modification operations include either adding some votes/candidates or deleting some votes/candidates. These problems are natural generalizations of destructive control problems where J is a singleton and capture many practical situations. We achieve a broad range of complexity results for a number of single-winner voting systems, involving voting rules which are compositions of commonly used voting correspondences, such as Borda, Maximin and Copeland, and three tie-breaking schemes, namely the fixed-order, random candidates and random votes. In particular, we achieve polynomial-time solvability results, NP-hardness results, fixed-parameter tractability results as well as XP results. In addition, we study other tie-breaking schemes and show that the complexity of the problems may depend on tie-breaking schemes.

10:50
The Complexity of Bribery and Control in Group Identification

ABSTRACT. The goal of this paper is to analyze the complexity of bribery and destructive control in the framework of group identification. Group identification applies to situations where a group of individuals try to determine who among them is socially qualified for a given task. We consider consent rules, the consensus-start-respecting rule, and the liberal-start-respecting rule.

11:10
Complexity Results for Manipulation, Bribery and Control of the Kemeny Judgment Aggregation Procedure

ABSTRACT. An important criterium for social choice methods is their resistance against various types of strategic behavior. Seminal results in the social choice literature indicate that absolute resistance is in many cases impossible. For this reason, it has often been argued that computational intractability could be used as an obstruction for strategic behavior for different procedures.

In this paper, we study the computational complexity of strategic behavior for the Kemeny procedure in the setting of judgment aggregation. In particular, we investigate problems related to (1) strategic manipulation, (2) bribery, and (3) control (by adding or deleting issues). We show that these problems are complete for the second level of the Polynomial Hierarchy. Our results hold for two different judgment aggregation frameworks and for different notions of preference over judgment sets. The hardness results that we establish hold up even under various restrictions, such as unidimensional alignment of the profile.

11:30
Distributed Monitoring of Election Winners [BEST PAPER NOMINEE]

ABSTRACT. We consider distributed elections, where there is a center and k sites. In such distributed elections, each voter has preferences over some set of candidates, and each voter is assigned to exactly one site such that each site is aware only of the voters assigned to it. The center is able to directly communicate with each of the sites. We are interested in designing communication-efficient protocols, allowing the center to maintain a candidate which, with arbitrary high probability, is guaranteed to be a winner, or at least close to being a winner. We consider various single-winner voting rules, such as variants of Approval voting and scoring rules, tournament-based voting rules, and several round-based voting rules. For these voting rules, we show that, using communication which is logarithmic in the number of voters, it is possible for the center to maintain such approximate winners. We complement our protocols with lower bounds. Our results have implications in various scenarios, such as aggregating customer preferences in online shopping websites or supermarket chains and collecting votes from different polling stations of political elections.

11:50
The Complexity of Control and Bribery in Majority Judgement

12:10
On the Complexity of Borda Control in Single-Peaked Elections

ABSTRACT. Recent research reveals that many NP-hard voting problems in general become polynomial-time solvable in single-peaked elections. In contrast to these results, we prove for the first time that constructive control by adding/deleting votes/candidates for Borda are NP-hard even in single-peaked elections. On the other side, we prove that constructive control by adding/deleting votes/candidates for Borda are polynomial-time solvable in single-dived elections, which are elections obtained from single-peaked elections by reversing the preferences of the voters. Finally, we study constructive control by adding/deleting votes/candidates for Borda in single-peaked elections with k-truncated votes, i.e., each voter ranks only her top k candidates, aiming at investigating how the values of k affect the complexity of these problems. For this purpose, we adopt three variants of Borda and show that many control problems are polynomial-time solvable if k is a constant.

10:30-12:30 Session 5B: Logic & Game Theory
Location: Ballroom 2
10:30
Analyzing Games with Ambiguous Player Types Using the MINthenMAX Decision Model

ABSTRACT. In many common interactive scenarios, participants lack information about other participants, and specifically about the preferences of other participants. In this work, we model an extreme case of incomplete information, which we term games with type ambiguity, where a participant lacks even information enabling him to form a belief on the preferences of others. Under type ambiguity, one cannot analyze the scenario using the commonly used Bayesian framework, and therefore one needs to model the participants using a different decision model.

To this end, we present the MINthenMAX decision model under ambiguity. This model is a refinement of Wald's MiniMax principle, which we show to be too coarse for games with type ambiguity. We characterize MINthenMAX as the finest refinement of the MiniMax principle that satisfies three properties we claim are necessary for games with type ambiguity. This prior-less approach we present here also follows the common practice in computer science of worst-case analysis.

Finally, we define and analyze the corresponding equilibrium concept, when all players follow MINthenMAX. We demonstrate this equilibrium by applying it to bilateral trade, which is a common economic scenario. We show that an equilibrium in pure strategies always exists, and we analyze the equilibria.

10:50
Strategic disclosure of opinions on a social network

ABSTRACT. This paper starts from a simple model of strategic reasoning in situations of social influence. Agents express binary views on a set of propositions, and iteratively update their views when their influencers show an opposite unanimous opinion. We empower agents with the ability to disclose or hide their opinions, in order to attain a predetermined goal. We study classical game-theoretic solution concepts in the resulting games, observing a non-trivial interplay between the individual goals and the structure of the underlying network. By making use of different logics for strategic reasoning, we show how apparently simple problems in strategic opinion diffusion already require a very complex logical machinery to be properly formalized.

11:10
Hiding Actions in Multi-Player Games

ABSTRACT. In the game-theoretic approach to reasoning about multi-agent systems, imperfect information plays a key role.  It requires that players act in accordance with the information available to them.  The complexity of deciding games that have imperfect information is generally worse than those that do not have imperfect information, and is easily undecidable. In many real-life scenarios, however, we just have to deal with very restricted forms of imperfect information and limited interactions among players. In these settings, the challenge is to come up with elementary decision procedures, as we do.

In this paper, we study multi-player concurrent games where (i) Player 0 ’s objective  is  to  reach  a  target W ,  and  (ii)  the  opponents are trying to stop this but have partial observation about Player 0 ’s actions.  We study the problem of deciding whether the opponents can prevent Player 0 to reach W , by beating every Player 0 ’s strategy.  We show, using an automata-theoretic approach that, assuming the opponents have the same partial observation and play under uniformity, the problem is in ExpTime.

11:30
Synthesizing Optimal Social Laws for Strategical Agents via Bayesian Mechanism Design

ABSTRACT. When rational behavior of the agents and private information are considered, the optimal social law synthesizing problem naturally evolves into a setting which can be handled by the framework algorithm mechanism design. We focus on the Bayesian case in this paper, that is, the probability distribution of each agent's cost is known. It is easy to see that in this case our problem closely relates to path/spanning-tree auctions and Myerson's optimal auction mechanism, but the optimization objective is new, that is, we focus on profit maximization instead of payment maximization. By studying on this problem: We further extend the logic-based framework of social law optimization problem to the strategic case, and show that it becomes a new problem of algorithm mechanism design; We found out a mechanism that is incentive compatible, individually rational and maximize the expected profit for all input cost profiles; However, we can show that this mechanism is computational intractable; So, we finally find out a tractable constant-factor approximation mechanism.

11:50
Attenuate Locally, Win Globally: An Attenuation-based Framework for Online Stochastic Matching with Timeouts

ABSTRACT. Online matching problems have garnered significant attention in recent years due to numerous applications. Many of them capture the uncertainty in the real world by including stochasticity in both the arrival process and the matching process. The \textit{Online Stochastic Matching with Timeouts} problem introduced by Bansal, Gupta, Li, Mestre, Nagarajan, and Rudra (\emph{Algorithmica}, 2012) models matching markets (e.g. E-Bay, Amazon). Buyers arrive from an independent and identically distributed (i.i.d.) known distribution on buyer profiles and can be shown a list of items. Each buyer has some probability of purchasing each item and a limit (timeout) on the number of items they can be shown.

Bansal {\it et al.}\ (\emph{Algorithmica}, 2012) gave a $0.12$-competitive algorithm which was improved by Adamczyk, Grandoni, and Mukherjee (\emph{ESA, 2015}) to $0.24$. We present an online attenuation framework that uses an algorithm for offline stochastic matching as a black box. We show that this framework combined with a black box adapted from Bansal \emph{et al.} (\emph{Algorithmica}, 2012) yields an online algorithm which nearly doubles the ratio to $0.46$. We also show that no algorithm can achieve a ratio better than $0.632$ using the common LP for this problem. This framework has a high potential for further improvements since new algorithms for offline stochastic matching can lead directly to improvements for the online problem.

Our online framework also has the potential for a variety of extensions. For example, we introduce a natural generalization: \textit{Online Stochastic Matching with Two-sided Timeouts} in which both online and offline vertices have timeouts. Our framework provides the first algorithm for this problem achieving a ratio of $0.31$. We accomplish this by proposing a new black box algorithm for offline stochastic matching on star graphs. Additionally, this new black box may be of independent interest since it improves the approximation ratio for the offline stochastic matching problem on star graphs from $0.5$ by Adamczyk \textit{et al.} (\emph{ESA 2015}) to $0.56$.

12:10
Multi-Channel Marketing with Budget Complementarities

ABSTRACT. Utility maximization under a budget constraint is a classical problem in economics and management science. It is commonly assumed that the utility is a “nice” known analytic function, for example, continuous and concave. In many domains, such as marketing, increased abundance of computational resources and data has enabled the development of sophisticated simulations to evaluate the impact of allocation a fixed budget among alternatives (e.g., marketing channels) on outcomes, such as demand. While simulations enable high resolution evaluation of alternative budget allocation strategies, they significantly complicate the associated budget optimization problem. In particular, simulation runs are time consuming, significantly limiting the space of options that can be explored. An important second challenge is the common presence of budget complementarities, where non-negligible budget increments are required for an appreciable marginal impact from a channel. This introduces a combinatorial structure on the decision space. We propose to address these challenges by first converting the problem into a multi-choice Knapsack optimization problem with unknown weights. We show that if weights (corresponding to marginal impact thresholds for each channel) are well approximated, we can achieve a solution within a factor of 2 of optimal, and this bound is tight. We then develop several parsimonious query algorithms for achieving this approximation in an online fashion. Experimental evaluation demonstrates the effectiveness of our approach.

10:30-12:30 Session 5C: Logics 3
Location: Ballroom 3
10:30
Fixpoint Approximation of Strategic Abilities under Imperfect Information

ABSTRACT. Model checking of strategic ability under imperfect information is known to be hard. In this paper, we propose translations of \ATLir formulae that are cheaper to verify than the original specification, and provide lower and upper bounds for its truth value. Most interestingly, the lower approximation is provided by a fixpoint expression using a nonstandard variant of the next-step ability operator. We show the correctness of the translations, establish their computational complexity, and validate the approach by experiments with a scalable scenario of Bridge play.

10:50
Decidability results for ATL* with imperfect information and perfect recall

ABSTRACT. Alternating-time Temporal Logic (ATL) is a central logic for multi-agent open systems. It has been extended in various ways, notably with strategy contexts (ATLsc) or imperfect information (ATLi). Concerning the latter, since its model-checking problem for agents with perfect recall is undecidable, studies have focused on agents with bounded or no memory. In this work, we establish two decidability results for perfect recall. The first one is a meta-theorem that allows the transfer of decidability results for classes of multiplayer games with imperfect information, such as games with hierarchical observation, to the model-checking problem for ATLi . The second one establishes the decidability of the model-checking problem for ATL with strategy context and imperfect information for hierarchical instances.

11:10
Agent-based Abstractions for Verifying Alternating-time Temporal Logic with Imperfect Information

ABSTRACT. We introduce a novel 3-valued abstraction technique for alternating-time temporal logic (ATL) in contexts of imperfect information. In this setting agents have access only to their local state, i.e., a (possibly proper) subset of the information contained in the system's global state. We introduce under- and over-approximations of an agent's strategic abilities in terms of $\must$- and $\may$-strategies, then provide a 3-valued semantics for ATL based on agents in interpreted systems (IS). Next, we define a relation of simulation between agents and prove that it preserves defined truth values of ATL formulas. Finally, we introduce a notion of abstraction on IS and show that it simulates the concrete interpreted system. Most importantly, this procedure is capable of extracting a finite abstraction from an infinite-state IS.

11:30
Verification of Multi-agent Systems with Imperfect Information and Public Actions

ABSTRACT. We analyse the verification problem for synchronous, perfect recall multi-agent systems with imperfect information against a specification language that includes strategic as well as epistemic operators. While the general problem is known to be undecidable we show that if the agents' actions are public then verification is decidable, and we establish that the computational complexity is 2exptime-complete. To illustrate the formal framework we consider two well-known epistemic and strategic puzzles with imperfect information and public actions: the muddy children puzzle and the classic game of battleships.

11:50
Game-Theoretic Semantics for ATL+ with Applications to Model Checking

ABSTRACT. We develop game-theoretic semantics (GTS) for the fragment ATL+ of the full Alternating-time Temporal Logic ATL*, essentially extending a recently introduced GTS for ATL. We first show that the new game-theoretic semantics is equivalent to the standard semantics of ATL+ (based on perfect recall strategies). We then provide an analysis, based on the new semantics, of the memory and time resources needed for model checking ATL+. Based on that, we establish that strategies that use only a very limited amount of memory suffice for ATL+. Furthermore, using the GTS we provide a new algorithm for model checking of ATL+ and identify a natural hierarchy of tractable fragments of ATL+ that extend ATL.

12:10
Bisimulations for Verifying Strategic Abilities with an Application to ThreeBallot

ABSTRACT. We propose a notion of alternating bisimulation for strategic abilities under imperfect information. The bisimulation preserves formulas of ATL for both the {\em objective} and {\em subjective} variants of the state-based semantics with imperfect information, which are commonly used in the modeling and verification of multi-agent systems. Furthermore, we apply the theoretical result to the verification of coercion-resistance in the three-ballot voting system, a voting protocol that does not use cryptography. In particular, we show that natural simplifications of an initial model of the protocol are in fact bisimulations of the original model, and therefore satisfy the same ATL properties, including coercion-resistance. These simplifications allow the model-checking tool MCMAS to terminate on models with a larger number of voters and candidates, compared with the initial model.

10:30-12:30 Session 5D: Social Networks
Location: Ballroom 4
10:30
Group Reasoning in Social Environments

ABSTRACT. While modelling group decision making scenarios, the existence of a central authority is often assumed which is in charge of amalgamating the preferences of a given set of agents with the aim of computing a socially desirable out- come, for instance, maximizing the utilitarian or the egalitarian social welfare. Departing from this classical perspective and inspired by the growing body of literature on opinion formation and diffusion, a setting for group decision making is studied where agents are selfishly interested and where each of them can adopt her own decision without a central coordination, hence possibly disagreeing with the decision taken by some of the other agents. In particular, it is assumed that agents belong to a social environment and that their preferences on the available alternatives can be influenced by the number of “neighbors” agreeing/disagreeing with them. The setting is formalized and studied by modelling agents’ reasoning capabilities in terms of weighted propositional logics and by focusing on Nash-stable solutions as the prototypical solution concept. A thoroughly computational complexity analysis is conducted, too: A number of intractability results are pointed out, and qualitative restrictions on the underlying environments are introduced based on which a precise picture of the tractability frontier is depicted.

10:50
Uncharted but not Uninfluenced: Influence Maximization with an Uncertain Network

ABSTRACT. This paper focuses on new challenges in influence maximization inspired by non-profits' use of social networks to effect behavioral change in their target populations. Specifically, our work is motivated by the problem of spreading messages about HIV prevention among homeless youth using their social network. We show how to compute solutions which are provably close to optimal when the parameters of the influence process are unknown. We then extend these results to a dynamic setting where information about the network is revealed at each stage. Simulation experiments using real world networks collected by the homeless shelter show the advantages of our approach. Our algorithm is being piloted by the homeless shelter, with preparations underway for a large-scale deployment.

11:10
Robustness in Discrete Preference Games

ABSTRACT. In a {\em discrete preference game}, each agent is equipped with an internal {\em belief} and declares her {\em preference} from a discrete set of alternatives. The payoff of an agent depends on whether the declared preference agrees with the belief of the agent and on the {\em coordination} with the preferences expressed by neighbors from the underlying social network. These games have been used to model the formation of opinions and the adoption of innovations in social networks.

Recently, researchers have obtained bounds on the Price of Anarchy and on the Price of Stability of {discrete preference games}, and they have studied to which extent the winning preference reached via best-response dynamics disagrees with the majority of beliefs. In this work, we would like to highlight the robustness of these results to variants.

Our starting point is the observation that bounds on the Price of Anarchy and Stability can be very sensitive on how the quality of equilibria is quantified. Instead, results about the disagreement between majority in equilibria and among beliefs are showed either formally or experimentally to hold even if we consider a less stringent class of deviations, such as {\em no-worse-response} deviations, or multiple players updating at the same time, or weighted neighbors.

11:30
Contrasting the Spread of Misinformation in Online Social Networks

ABSTRACT. In recent years the way people seek and share information has been revolutionized by the emergence of online social networks. Nowadays, popular online social sites such as Twitter, Facebook and Google+ are among the major news sources as well as the most effective channels for viral marketing. However, these networks also became the most effective channel for spreading misinformation, accidentally or maliciously. The widespread diffusion of inaccurate information or fake news can lead to undesirable and severe consequences, such as widespread panic, libelous campaigns and conspiracies. In order to guarantee the trustworthiness of online social networks it is a crucial challenge to find effective strategies to contrast misinformation.

In this paper we concentrate our attention on two aspects of the problem of misinformation diffusion in social networks: the identification of the misinformation sources and the contrast to the diffusion of the misinformation in the network. We consider a network where some nodes have already been infected from the misinformation. We first provide an heuristics to recognize the set of most probable sources of the infection. Then, given a set of misinformation sources, we provide an heuristics to place a few monitors in some network nodes in order to control information diffused by the suspected nodes and block misinformation they injected in the network before it reaches most of the network.

To verify the quality and efficiency of our suggested solutions, we conduct experiments on several networks. Empirical results indicate that our heuristics are among the most effective known in literature.

11:50
Limiting Concept Spread in Environments with Interacting Concepts

ABSTRACT. The propagation of concepts in a population of agents is a form of influence spread, which can be modelled as a cascade from an initial set of individuals. In real-world environments there may be many concepts spreading and interacting. Previous work does not consider utilising concept interactions to limit the spread of a concept. In this paper we present a method for limiting concept spread, in environments where concepts interact and do not block others from spreading. We define a model that allows for the interactions between any number of concepts to be represented and, using this model, develop a solution to the influence limitation problem, which aims to minimise the spread of a target concept through the use of a secondary inhibiting concept. We present a heuristic, called maximum probable gain, and compare its performance to established heuristics for manipulating influence spread in both simulated small-world networks and real-world networks.

12:10
On the Construction of Covert Networks

ABSTRACT. Centrality measures are widely used to identify leaders of covert networks. We study how a group of such leaders can avoid being detected by such measures. More concretely, we study the hardness of choosing a set of edges that the leaders can add to the network in order to decrease their ranking according to two fundamental centrality measures, namely degree, and closeness. We prove that this problem is NP-complete for each measure. After that, we study how the leaders can construct a network from scratch, designed specifically for them to hide in disguise. We identify a network structure that not only guarantees to hide the leaders to a certain extent, but also allows them to spread their influence across the network.

10:30-12:30 Session 5E: Human-Robot Interaction and Learning in Robotics
10:30
Temporal Models for Robot Classification of Human Interruptibility

ABSTRACT. Robots are increasingly being deployed in unstructured human environments in which a need will arise to approach and interrupt collocated humans. Most prior work on robot interruptions has focused on how to interrupt a person or on estimating a human's awareness of the robot. Our work makes three contributions to this research area. First, we introduce an ordinal scale of interruptibility that can be used to rate the interruptibility of a human. Second, we propose the use of Conditional Random Fields (CRFs) and their variants, Hidden CRFs, and Latent-Dynamic CRFs, for classifying interruptibility. Third, we introduce the use of object labels as a visual cue to the context of an interruption in order to improve interruptibility estimates. Our results show that Latent-Dynamic CRFs outperform all other models across all tested conditions, and that the inclusion of object labels as a cue to context improves interruptibility classification performance, yielding the best overall results.

10:50
A Tale of Two Architectures: A Dual-Citizenship Integration of Natural Language and the Cognitive Map

ABSTRACT. The many robot architectures developed over the past few decades enable a wide range of varying capabilities. Two such examples are the Vulcan architecture, which uses rich spatial representations to facilitate navigation capabilities in real-world, campus-like environments, and the DIARC architecture, which uses high-level cognitive representations to facilitate human-like tasking through natural language. In this work, we show how the integration of Vulcan and DIARC enables not only the capabilities of the two individ ual architectures, but new synergistic capabilities as well, as each architecture leverages the strengths of the other. This integration presents interesting challenges, as DIARC and Vulcan are implemented in distinct multi-agent system middlewares. Accordingly, a second major contribution of this paper is the Vulcan-ADE Development Environment(VADE): a novel multi-agent system framework comprised of both (1) software agents belonging to a single robot architecture and implemented in a single multi-agent system middleware, and (2) “Dual-Citizen” agents that belong to both robot architectures and that use elements of both multi-agent system middlewares. Finally, as one example application, we demonstrate the implementation of the new joint architecture and novel multi-agent system framework on a robotic wheelchair, and show how this integration advances the state-of-the-art for NL-enabled wheelchairs.

11:10
Multi-Robot Human Guidance: Human Experiments and Multiple Concurrent Requests

ABSTRACT. In the multi-robot human guidance problem, a centralized controller makes use of multiple robots to provide navigational assistance to a human in order to reach a goal location. Previous work used Markov Decision Processes (MDPs) to construct a formalization for this problem [13], and evaluated this framework in an abstract setting only, i.e. without experiments using high-fidelity simulators or real humans. Additionally, it was unable to handle multiple concurrent requests and did not consider buildings with multiple floors. The main contribution of this paper is the introduction of an extended MDP framework for the multi-robot human guidance problem, and its application using a realistic 3D simulation environment and a real multi-robot system. The MDP formulation presented in this paper includes support for planning for multiple guidance requests concurrently as well as requests that require a human to traverse multiple floors. We evaluate this system using real humans controlling simulated avatars, and provide a video demonstration of the system implemented on real robots.

11:30
Spoken Instruction-Based One-Shot Object and Action Learning in a Cognitive Robotic Architecture [BEST PAPER NOMINEE]

ABSTRACT. Learning new knowledge from a single instruction and being able to apply it right away is a highly desirable capability for artificial agents. In this paper, we provide the first demonstration of spoken instruction-based one-shot object and action learning in a cognitive robotic architecture. We specifically discuss the modifications to several architectural components required to enable such fast learning and demonstrate the new capabilities on two different fully autonomous robots.

11:50
Effect of Leader Placement on Robotic Swarm Control

12:10
Probabilistic Supervisory Control Theory (pSCT) Applied to Swarm Robotics

ABSTRACT. Swarm robotics studies large groups of robots that work together to accomplish common tasks. Much of the used source code is developed in an ad-hoc manner, meaning that the correctness of the controller is not always verifiable. In previous work, supervisory control theory (SCT) and associated design tools have been used to address this problem. Given a formal description of the swarm's agents capabilities and their desired behaviour, the control source code can be automatically generated. However, regular SCT cannot model probabilistic controllers (supervisors). In this paper, we propose a probabilistic supervisory control theory (pSCT) framework. It applies prior work on probabilistic generators in a way that allows controllers to be decomposed into multiple local modular supervisors. Local modular supervisors take advantage of the modularity of formal specifications to reduce the size required to store the control logic. To validate the pSCT framework, we model a distributed swarm robotic version of the graph colouring problem and automatically generate the control source code for the Kilobot swarm robotics platform. We report the results of systematic experiments with swarms of 25 and 100 physical robots.

10:30-12:30 Session 5F: Emergence
Location: Curitiba 1&2
10:30
Error Cascades in Collective Behavior - A Case Study of the Gradient Algorithm on 1000 Physical Agents

ABSTRACT. The gradients, or hop count, algorithm is inspired by natural phenomena such as the morphogen gradients present in multi-cellular development. It has several applications in multi-agent systems and sensor networks, serving as a basis for self-organized coordinate system formation, and finding shortest paths for message passing. It is a simple and well-understood algorithm in theory; however, we show here that in practice, it is highly sensitive to specific rare errors. We implement it on a system of 1000 physical agents (Kilobot robots) that communicate via a noisy wireless channel where message loss and corruption are inevitable, and there is natural asynchrony among the agents. We observe that under these conditions, spontaneous, short-lasting rare errors made by a single agent (e.g. due to message corruption) propagate both spatially and temporally, causing cascades that severely hinder the algorithm's scalability. We develop a mathematical model for temporal propagation and validate it with controlled experiments on 100 agents. This work shows how multi-agent algorithms that are believed to be simple and robust from theoretical insight may be highly challenging to implement on physical systems. Systematically understanding and quantifying their current limitations is a first step in the direction of improving their robustness for implementation.

10:50
Inverse Reinforcement Learning in Swarm Systems [BEST PAPER NOMINEE]

ABSTRACT. Inverse reinforcement learning (IRL) has become a useful tool for learning behavioral models from demonstration data. However, IRL remains mostly unexplored for multi-agent systems. Inspired by the collective swarming behavior of natural systems, we show how the principle of IRL can be extended to homogeneous large-scale problems. In particular, we make the following contributions to the field: 1) We introduce the swarMDP framework, a sub-class of decentralized partially observable Markov decision processes endowed with a swarm characterization. 2) Exploiting the inherent homogeneity of this framework, we reduce the resulting multi-agent IRL problem to a single-agent one by proving that the agent-specific value functions in this model coincide. 3) To solve the corresponding control problem, we propose a novel heterogeneous learning scheme that is particularly tailored to the swarm setting. Results on two example systems demonstrate that our framework is able to produce meaningful local reward models from which we can replicate the observed global system behavior.

11:10
Social manifestation of guilt leads to stable cooperation in multi-agent systems

ABSTRACT. We show analytically that the inclusion of the emotion of guilt, in the sense arising from actual harm done to others from inappropriate action or inaction, is worthwhile to incorporate in evolutionary game theory models of cooperation, for it can increase cooperation by correcting and inhibiting defection. The abstract study thereof profitably transpires to concrete considerations in the design of artificial multi-agent populations.

11:30
Commitment and Participation in Public Goods Games (JAAMAS Paper)
11:50
Understanding Norm Change: An Evolutionary Game-Theoretic Approach

ABSTRACT. Human societies around the world interact with each other by developing and maintaining social norms, and it is critically important to understand how such norms emerge and change. In this work, we define an evolutionary game-theoretic model to study how norms change in a society, based on the idea that different strength of norms in societies translate to different game-theoretic interaction structures and incentives. We use this model to study, both analytically and with extensive agent-based simulations, the evolutionary relationships among the need for coordination in a society (which is related to its norm strength), cultural inertia (whether or how quickly the population responds when faced with conditions that make a norm change desirable), and exploration rate (the willingness of agents to try out new strategies). Our results show that a high need for coordination leads to both high cultural inertia and a low exploration rate, and a low need for coordination leads to the opposite. This is the first work, to our knowledge, on understanding the evolutionary causal relationships among these factors.

12:10
Multi-Agent Flag Coordination Games

ABSTRACT. Many multi-agent coordination problems can be understood as autonomous local choices between a finite set of options, with each local choice undertaken simultaneously without explicit coordination between decision-makers, and with a shared goal of achieving a desired global state. Examples of such problems include classic consensus problems between nodes in a distributed computer network, coordination of robot bucket brigades, the adoption of competing technology standards, and consumer purchase decisions of network goods. We model such problems as a multi-round game between agents having flags of different colours to represent the finite choice options, and all agents seeking to achieve a global pattern of colours through a succession of local colour-selection choices.

We generalise and formalise the problem, proving results for the probabilities of achievement of common desired global patterns when these games are undertaken on bipartite graphs, extending known results for non-bipartite graphs. We also calculate probabilities for the game entering infinite cycles of non-convergence. In addition, we present a game-theoretic approach to the problem that has a mixed-strategy Nash equilibrium where two players can simultaneously flip the colour of one of the opponent's nodes in the bipartite graph before or during a flag-coordination game.

12:30-12:45Pick up Lunchboxes