Colored Trails is a research test-bed for investigating decision-making in groups comprising people, computer agents, and a mix of these two.
Widespread use of the Internet has fundamentally changed the ways in which we use computers, not only as individuals, but also within organizations. Settings in which many people and many computer systems work together, despite being distributed both geographically and in time, are increasingly the norm. Examples of group activities in which computer systems participate, whether as autonomous agents or as proxies for individual people, include online auctions, hospital care-delivery systems, emergency response systems, military systems, and systems administration groups. These increasingly prevalent, heterogeneous group activities of computer systems and people--whether competitive, cooperative, or collaborative--frequently require decision-making on the part of autonomous-agent systems or the support of decision-making by people.
A wide variety of issues arise in the decision making that accompanies task-oriented group activities. How do groups decide who should perform which tasks in service of which goals? How are coalitions formed? How do agents coordinate their behavior? How do agents select an appropriate recipe for achieving a goal from among the set that might be applicable? How are tasks allocated among agents capable of performing them? How do agents reconcile their current intentions to perform actions in service of group goals with new opportunities that arise? How do agents avoid conflicts in their plans for achieving goals? How does altruistic or cooperative behavior arise from individual behaviors? What incentives can be provided to encourage cooperation, helpfulness, or other group-benefiting behaviors when agents are designed independently or serve different organizations?
Grosz and Kraus developed the game Colored Trails (CT) as a testbed for investigating the decision-making that arises in task settings, where the key interactions are among goals (individual and group), tasks required to accomplish those goals, and resources needed to perform the tasks. CT allows the modeling of all of these phenomena and exploration of their causes and ramifications. It provides the basis for development of a testbed that supports investigations of human decision-making and for computational strategies to be studied both in homogeneous groups comprising only people and in heterogeneous groups consisting of people and computer systems.
The proposed project would convert the existing code-base to make it robust and available for distribution to other researchers, extend the range of decision-making scenarios which could be modeled and tested using CT, and perform machine-learning research and testing of computer-based decision-making strategies in increasingly complex situations. The structure of CT provides the right basis for development of a decision-making testbed. The rules of CT are simple, abstracting from particular task domains, and thus enabling investigators to focus on understanding, modeling, and testing decision-making strategies rather than specifying and reasoning with domain knowledge. It is simpler to implement computer agents to play CT than games like Diplomacy. Nonetheless, CT has proven interesting for people to play. In its abstracting from "real-world" domains, CT is similar to the games developed in behavioral economics. However, CT abstracts less than typical economic games do. In particular, unlike behavioral economics games, it provides a clear analogue to task settings, and it provides situational contexts and interaction histories in which to make decisions. Thus, it takes into account the problems of framing effects (for example, the difference between describing a prisoner's dilemma game as "the Wall Street Game" and "the Community Game"), but does not reduce all decisions to choices between explicitly given utility values. As a result, some of the social factors that are known to influence human decision-making (e.g., inequality aversion, reciprocity) may be directly studied in CT, providing a better basis for the design of computational agents.
CT is parameterized in ways that allow for increasing complexity along a number of different dimensions that influence the performance of different approaches to decision making. It allows for specification of different reward structures, enabling examination of such trade-offs as the importance of the performance of others or the group as a whole to the outcome of an individual and the cost-benefit tradeoffs of collaboration-supporting actions. The game parameters may be set to vary environmental features such as task complexity, availability of and access to task-related information, and dependencies between agents. Although several testbeds and competitions have been developed to test strategies for automated agents operating in multi-agent settings, CT is the first testbed to be designed to investigate decision-making in heterogeneous groups of people and computer systems. It is thus novel in addressing the need to understand how computer agents should behave when they are participants in group activities that include people.
Each instance of CT includes a numeric scoring function, which can serve as a clear metric of evaluation of the performance of an agent playing the game, and therefore of the model that the agent uses and indirectly of the learning techniques that were used to derive the model. Our objective is, in any given experimental environment, to match the performance of humans participating in the identical environment. Our preliminary results for simple games show that it is possible not only to achieve such human-level performance but even to surpass it.
Colored Trails (CT) is played by two or more players on a rectangular board of colored squares. The rules are simple: Each player is given a starting position, a goal position on the board, and a set of chips in colors taken from the same palette as the squares. Players may advance toward their goals by moving to an adjacent board square. Such a move is allowed only if the player has a chip of the same color as the square, and the player must turn in the chip to carry out the move. Players may negotiate with their peers to exchange chips. Communication is controlled; players use a fixed but expressive messaging protocol.
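The movement rule above can be sketched in a few lines. This is a minimal illustration, not the CT implementation; the board representation, function names, and use of a chip multiset are all assumptions made for the example.

```python
from collections import Counter

def legal_move(pos, dest, board, chips):
    """A move is legal only if dest is orthogonally adjacent to pos and
    the player holds a chip matching the color of the destination square."""
    (r, c), (dr, dc) = pos, dest
    adjacent = abs(r - dr) + abs(c - dc) == 1  # one step, no diagonals
    return adjacent and chips[board[dr][dc]] > 0

def apply_move(pos, dest, board, chips):
    """Carry out a legal move, turning in the matching chip."""
    assert legal_move(pos, dest, board, chips)
    chips[board[dest[0]][dest[1]]] -= 1
    return dest

board = [["red", "blue"],
         ["blue", "green"]]
chips = Counter({"blue": 1})   # this player holds a single blue chip
pos = (0, 0)
pos = apply_move(pos, (0, 1), board, chips)  # spends the blue chip
```

Because a chip is consumed on every step, a player whose chips do not match any path to the goal must either trade or stop short, which is what makes negotiation central to the game.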
The scoring function, which determines the payoff to the individual players, is a parameter of CT game instances and may depend on a variety of factors. At its simplest, it may consist of a weighted sum of such components as: whether the individual reached the goal, the final distance of the agent from the goal, and the final number of chips the agent held. The scoring function may be varied to reflect different possible social policies and utility trade-offs, establishing a context in which to investigate the effects of different decision-making mechanisms.
For example, by varying the relative weights of individual and group good in the scoring function, collaborative behavior can be made more or less beneficial. Despite the simplicity of the rules, play of CT is able to model a broad range of aspects of task situations in which a group of agents perform actions; it allows for scenarios in which agents act as individuals, in teams, or both. Traversing a path through the board corresponds to performing a complex task whose constituents are the individual tasks represented by each square. Different colors represent different tasks. The existence of multiple paths to a goal corresponds to the availability of different "recipes" or methods for achieving goals. The possession of a chip of a particular color corresponds to having the skills and resources needed for a task, and being able to deploy them at the appropriate time. Not all players get chips of all colors, much as agents have different capabilities, availability, and resources. The exchange of chips corresponds to agents producing the intended effects of actions for each other; in different domain settings an exchange could be taken to correspond to agents providing resources to each other, doing tasks for them, or enabling them in some other way to do tasks they otherwise could not do. The game environment may be set to model different knowledge conditions as well. For example, varying the amount of the board an agent can "see" corresponds to varying information about task constituents or resource requirements, whereas varying the information players have about each other's chips corresponds to varying information agents have about the capabilities of others. Various requirements on player objectives, goal squares, and paths correspond to different types of group activities and collaborative tasks.
To distinguish cooperative settings from those in which agents act independently, the scoring function may have a significant factor in which an agent's reward depends on others' performance.
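A scoring function of the weighted-sum kind just described might look as follows. The components match those listed in the text; the particular weights, function names, and the `w_social` parameter for the cooperative factor are illustrative assumptions, not values from any CT study.

```python
def base_score(reached_goal, dist_to_goal, chips_left,
               w_goal=100, w_dist=-10, w_chip=5):
    """Weighted sum of the components mentioned above: goal attainment,
    final distance from the goal, and chips held at the end of the game.
    The weights are illustrative game parameters."""
    return (w_goal * int(reached_goal)
            + w_dist * dist_to_goal
            + w_chip * chips_left)

def total_score(own, others, w_social=0.0):
    """w_social = 0 models fully independent agents; raising it makes a
    player's payoff depend on the rest of the group's performance,
    modeling the cooperative settings discussed in the text."""
    return own + w_social * sum(others)
```

With `w_social = 0` an agent has no stake in others' outcomes; as it grows, giving away a chip that helps a teammate reach the goal can raise the giver's own payoff, which is exactly the individual-versus-group trade-off the parameterization is meant to expose.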
Mapping from task-environment decision problems to CT settings

CT allows the modeling of all the decision-making questions listed above. We provide some simple illustrative examples here, but all of these questions and many others can be explored within instances of the CT framework. Consider the question of joint goals among agents. This kind of problem can be modeled by having the scoring function include a component that is maximized when all players complete their tasks, that is, only if the joint goal is satisfied. Each individual may still get a "base score" that can include the usual components (final number of chips, distance to the goal). Or, consider the question of conflict avoidance. When players have a significant overlap in their chip needs on one set of paths, but not on others, with the requisite chips being a scarce resource, the game inherently provides incentive for avoiding the conflicted paths. The initial game states can be selected so as to exhibit such situations with greater or lesser frequency in order to explore behaviors under such conditions. Finally, consider the question of altruistic behavior. Evidence for such behavior is possible just when some player has a chip or chips that can aid another player. Altruistic behavior in such situations may be pure altruism if no benefit accrues to the player, or based on reciprocity if in a multi-shot game context, or supported by a scoring function that rewards such behavior.

Mapping standard economic game concepts to CT situations

Many standard concepts from economic games can be modeled. Game concepts such as the prisoners' dilemma, public goods, ultimatum games, dictator games, trust, gift exchange, and third-party punishment all may be exemplified in the CT framework. Moreover, they can be combined in new and affecting ways, while still being carefully controlled. Again, we provide a few representative examples.
Most simply, the prisoners' dilemma corresponds to a game in which each player needs a chip that the other has, agreements are not enforceable, no communication other than sending chips is available, and there is a large reward for getting to the goal. The ultimatum game (where a single agent makes a take-it-or-leave-it offer to which another agent responds) is another standard economic game with a natural CT instantiation. Here, we use a two-player CT instance in which both players have full information about the board and each other's chips. Players' scores are determined solely by their own individual outcomes and depend on the players' distance from their respective goals, as well as the chips that the players possess at the end of the game. Negotiation is restricted to a single round of offer and response. In fact, we have explored this game with human players, and describe our game-theoretically surprising results in a later section.
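The single-round negotiation protocol of the ultimatum instance can be sketched as below. This is a hypothetical harness for the sake of illustration: the `Offer` structure, function names, and the two placeholder strategies are assumptions, not the messaging protocol or the human-subject strategies described in the text.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    give: dict  # chips the proposer would hand over, e.g. {"blue": 1}
    take: dict  # chips requested in return

def play_ultimatum(propose, respond):
    """One take-it-or-leave-it round: the proposer's offer is final and
    the responder's accept/reject decision ends the negotiation."""
    offer = propose()
    accepted = respond(offer)
    return offer, accepted

# Placeholder strategies, not models of the human data mentioned above:
# a proposer asking for more than it gives, and a responder that accepts
# any offer sending it at least one chip.
offer, accepted = play_ultimatum(
    propose=lambda: Offer(give={"blue": 1}, take={"red": 2}),
    respond=lambda o: sum(o.give.values()) >= 1,
)
```

Restricting play to a single `propose`/`respond` exchange is what makes the CT instance an ultimatum game: there is no opportunity for counteroffers, so the responder's only leverage is the threat of rejection.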