Breadcrumb
The ALADDIN Game Theory Workshop 2009
Abstracts
|
Georgios Chalkiadakis (Southampton) Overlapping coalition formation In multiagent domains, agents form coalitions to perform tasks. The usual models of cooperative game theory assume that the desired outcome is either the grand coalition or a coalition structure that consists of disjoint coalitions (i.e., a partition of the set of agents). However, in practice an agent may be involved in executing more than one task, and distributing his resources between several (not necessarily disjoint) coalitions. To tackle such scenarios, we introduce a model for cooperative games with overlapping coalitions. We then focus on concepts of stability in this setting. In particular, we define and study a notion of the core, which is a generalization of the corresponding notion in the traditional models of cooperative game theory. Under some quite general conditions, we characterize the elements of core. As a corollary, we also show that any element of the core maximizes the social welfare. We then introduce a concept of balancedness for overlapping coalitional games, and use it to characterize coalition structures that can be extended to elements of the core. Furthermore, we generalize the notion of convexity to our setting, and show that under some natural assumptions convex games have a non-empty core. To the best of our knowledge, this is the first paper to provide a generic model for overlapping coalition formation, along with a theoretical treatment of stability in this setting. This is joint work with Edith Elkind, Evangelos Markakis and Nicholas R. Jennings (published in Proceedings of WINE-2008) |
| Avgostinos Filippopoulitis (Imperial) Adaptive on-line decisions with distributed myopic information The evacuation of a building is a challenging problem, since the evacuees most of the times do not know or do not follow the optimal evacuation route. Especially during an ongoing hazard present in the building, finding the best evacuation route becomes harder as the conditions along the paths change in the course of the evacuation procedure. We propose a distributed system that will compute the best evacuation routes in real-time, while a hazard is spreading inside the building. The system is composed of a network of decision nodes and sensor nodes, positioned in specific locations inside the building. The recommendations of the decision nodes are computed in a distributed manner, at each of the decision nodes, which then communicate them to evacuees or rescue personnel located in their vicinity. We evaluate our proposed system in various emergency scenarios, using a multiagent simulation platform for Building Evacuation. Our results indicate that the presence of the system improves the outcome of the evacuation with respect to the evacuation time and the injury level of the evacuees. |
|
Maike Kaufman (Oxford) Local decision making in co-operative agent systems In co-operative Multi-agent systems, agents will often communicate among each other to facilitate joint decision-making and coordinated behaviour. However, in many real-world applications, communication can not be guaranteed at all times and agents will encounter situations in which they are forced to chose an action locally. Much of the work that has been done on this in the context of POMDPs places a strong focus on ensuring overall agent co-ordination even under sparse communication. This is usually achieved by ignoring any information which is not available to all agents in the local decision-making. An alternative approach would be to incorporate all obtainable data in an agent's individual action choice. This would lead to a locally more informed decision-making which might however result in un-coordinated global behaviour. Here we will focus on two particular aspects of this trade-off. Firstly the problem of avoiding infinite regress in agents reasoning over others' action choices and secondly the possibility of finding measures that allow predicting which of the two presented approaches to local decision-making will yield higher expected reward for a given Multi-agent system. |
|
Rob McInerney (Oxford) Bayesian bandits Most approaches to multiple agent decision making, or single agent multi-action decisions, tacitly assume a stationary reward environment in which e-greedy approaches can asymptotically give valid information regarding action-value policies. We extend this approach by taking a Bayesian methodology to multi-armed bandits in which natural exploitation-exploration will eventually take place. Crucially we do not discard any prior information until all unknowns can be integrated out together. Prior uncertainty is automatically incorporated into the decision process, occasionally resulting in non-conventional decisions being made. This framework can be extended with ease to incorporate utility functions which combine expected future reward and penalise model ignorance as well as enabling multiple agents to interact in a game theoretic framework. |
|
Jason Marden (Caltech) Overcoming limitations of game-theoretic distributed control Game theory has recently received significant research attention as a tool for cooperative control of multi-agent systems. Utilizing game theory for cooperative control requires the following: (1) the interactions of a multi-agent distributed system are modeled as a non-cooperative game where the agents are designed as self-interested entities and (2) the agents are controlled using distributed learning algorithms that provide convergence to a stable operating point, e.g., Nash equilibrium. While there exists a large body of literature on distributed learning algorithms, unfortunately guidelines for how to design a "desirable" game are relatively unexplored. In this talk, we focus on the question of how to design agent objective functions. We demonstrate that the standard framework of non-cooperative game has inherent limitations with regard to designing agent objective functions. In particular, we prove that there does not exist a systematic method for constructing agent objective functions that are local, budget balanced and guarantee that the optimal control is a pure Nash equilibrium. However, we demonstrate that these limitations can be overcome by moving beyond the class of non-cooperative games and conditioning each player's objective function on additional information, i.e., a state. |
|
Maria Polukarov (Southampton) Games with congestion-averse utilities Congestion games - in which players strategically choose from a set of "resources" and derive utilities that depend on the congestion on each resource - are important in a wide range of applications. However, to date, such games have been constrained to use utility functions that are linear sums with respect to resources. To remove this restriction, we provide a significant generalisation to the case where a player's payoff can be given by any real-valued function over the set of possible congestion vectors. Under weak assumptions on the structure of player strategy spaces, we constructively prove the existence of a pure strategy equilibrium for the very wide class of these generalised games in which player utility functions are congestion-averse - i.e., monotonic, submodular and independent of irrelevant alternatives. Although, as we show, these games do not admit a generalised ordinal potential function (and hence the finite improvement property), any such game does possess a Nash equilibrium in pure strategies. A polynomial time algorithm for computing such an equilibrium is presented. Moreover, we show that if a player utility function does not satisfy any one of the congestion-averse conditions, then a pure strategy equilibrium need not exist, and in fact the determination of whether or not a game has a pure strategy Nash equilibrium is NP-complete. We further prove analogous results for our assumed properties of player strategy spaces, thus showing that current assumptions on strategy and utility structures in this model cannot be relaxed anymore. |
|
Michalis Smyrnakis (Bristol) Adaptive estimation in fictitious play Distributed optimization can be formulated as an n-player coordination game. One of the most common learning techniques in game theory is fictitious play and its variations. However fictitious play is founded on an implicit assumption that opponents' strategies are stationary. In this talk we present two variations of fictitious play, were players use a more realistic model of opponent strategy which is changing through the time. The first variation aims to predict opponents' strategy using a particle filter algorithm and the second one use Adapting Forgetting to weight the recently observed actions. We present the results we obtained when we compared the results of these two variations against geometric Fictitious Play in two different games: a three player climbing hill game and a vehicle target assignment game with twenty players and twenty targets. |
|
Sebastian Stein (Southampton) Mechanisms for the flexible procurement of services with uncertain durations Emerging service-oriented technologies allow software agents to automatically procure distributed services to complete complex tasks. However, in many application scenarios, service providers demand financial remuneration, execution times are uncertain and consumers have deadlines for their tasks. In this talk, we address these issues by developing a novel approach that dynamically procures multiple, redundant services over time, in order to ensure success by the deadline. Specifically, we first present an algorithm for finding optimal procurement solutions, as well as a heuristic algorithm that achieves over 99% of the optimal and is capable of handling thousands of providers. Using experiments, we show that these algorithms achieve an improvement of up to 130% over current strategies that procure only single services. In the second part of the talk, we consider settings where certain service parameters are not known to the consumer and where it may be beneficial for providers to lie, for example, about their costs or the quality of their service. To address these settings, we develop several novel mechanisms that incentivise providers to reveal these parameters truthfully. The mechanisms differ in their specific properties - some are efficient, but have high computational demands, while others remain tractable for large problem instances, but are generally not efficient (however, we show empirically that these can still achieve up to 95% efficiency). |
|
Adam Sykulski (Imperial College) Exploitation by exploration in 2-player repeated games with unknown rewards We study finite-length repeated games where players have no prior knowledge of the rewards received for each set of actions. Each player therefore has two distinct learning operations estimating its reward functions and adapting to the opponents strategy. We first show that if a player selects actions using the fictitious play rule, then the strategy does not necessarily converge to a Nash Equilibrium. Furthermore, for non-zero sum games, the expected reward to each player using this strategy can actually be greater than if the rewards were known a priori. The assumption of unknown rewards induces an exploration-exploitation trade-off to a players action selection. For this reason, we use ideas from multi-armed bandit problems and introduce an e-greedy action selection rule that encourages exploration of infrequently visited actions. We demonstrate that a player that uses such a strategy can exploit an opposing player that performs no exploration. |
| David Wolpert (NASA) From game theory to game engineering Joint work with Stefan Bieniawski (Stanford University), Juergen Jost (Max Planck Institute), Michael Harre (University of Sydney), James Bono (American University) and Nilesh Kulkarni (NASA) In this talk I present recent work on combining game theory, statistics, and control theory. This combination provides new techniques for predicting / controlling any system comprising humans, human groups (e.g., firms, tribes), and / or adaptive automated systems (e.g., reinforcement learning robots). As illustrations, I will focus on three projects:
|
