Do we need a testbed in the COIN community and for what?
  COIN @ AAMAS
  Held with AAMAS, May 12, 2008, Estoril, Portugal
Examples of Testbeds and Contests
  Artificial Intelligence
    Turing Test
    RoboCup
  Agent Technology
    Trading Agent Competition
    Agent Contest (ProMAS)
  Trust and Reputation
    ART Testbed
Objectives
  Motivation & Challenge
    to find solutions for a complex problem
  Comparison & evaluation
    qualitative, quantitative
    models, languages, platforms
  Improvement of the quality of tools in the domain
What to compare?
  Languages and Models of
    Coordination
    Organisations
    Institutions
    Norms
  Platforms of COIN
    results from platforms can be extended to the underlying model (?)
    depends on how to compare
Which metrics?
  expressiveness
  constraints x autonomy
  evolution & adaptation
  scalability
  flexibility
  reliability
  reactiveness
  maturity
    #case studies
  requirements for the agent's cognitive capabilities
  cost
    #messages
  ...
How to compare?
  by developing a Testbed
    Dist. Vehicle Monitoring Testbed, ART Testbed
  by organising a Contest
    Agent Contest, ART Contest
  Based on the performance in a scenario
    Avoid the Turing Test problem (?)
    Considering just the “result” (e.g. ART Contest)
    Considering both the “result” and the “method” (e.g. Agent Contest)
Scenario – assumptions
  Open
    agents can enter and leave the system
  Any agent architecture
    no access to the “internals” of the agents
Scenario – desired properties
  Cooperation
    no single agent can achieve a good result on its own
  Explicit rules
    must be available so that agents can reason about them, and even violate them
  Require different dimensions
    structural, functional, dialogical, normative, ...
Task accomplishment scenario (TAS)
  The platform should help the collective accomplishment of some task
  Purposeful
    describe goals, activities, objectives, etc.
  Measurable
    objective ways to evaluate the performance of the platform
  Feasible
    a task that may be accomplished by software agents
Metrics for TAS
  time to accomplish
  allocation of goals to capable agents
  cost of coordination constraints
  detection of violations
  sanctions
    e.g. remove bad agents from the system
  scalability
    e.g. size and number of tasks
  ...
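As an illustration only, some of the TAS metrics listed above could be computed from an event log produced by the Evaluator. The sketch below is not part of the talk: the `Event` record, the event kinds, and the `tas_metrics` function are hypothetical names, and counting messages as the cost of coordination is an assumption borrowed from the "#messages" note under "Which metrics?".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One entry of a hypothetical evaluator log (assumed format)."""
    time: float   # seconds since the start of the run
    kind: str     # "message", "violation", "detection", "sanction" or "task_done"
    agent: str    # identifier of the agent involved


def tas_metrics(events, n_tasks):
    """Summarise one run of a task accomplishment scenario."""
    kinds = ("message", "violation", "detection", "sanction")
    counts = {k: sum(1 for e in events if e.kind == k) for k in kinds}
    done_times = [e.time for e in events if e.kind == "task_done"]
    # The run only counts as accomplished once every task has been reported done.
    finished = len(done_times) >= n_tasks
    violations = counts["violation"]
    return {
        "time_to_accomplish": max(done_times) if finished else None,
        "coordination_cost_messages": counts["message"],
        "violations": violations,
        "detection_rate": counts["detection"] / violations if violations else None,
        "sanctions": counts["sanction"],
    }


# Small made-up run: two tasks, one violation that is detected and sanctioned.
log = [
    Event(1.0, "message", "a1"),
    Event(2.0, "violation", "a2"),
    Event(2.5, "detection", "a2"),
    Event(3.0, "sanction", "a2"),
    Event(4.0, "message", "a3"),
    Event(5.0, "task_done", "a1"),
    Event(9.0, "task_done", "a3"),
]
metrics = tas_metrics(log, n_tasks=2)
print(metrics)
```

Metrics such as goal allocation or scalability would need more information than this log format carries (agent capabilities, repeated runs at different sizes), so they are left out of the sketch.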
Method I
  [Diagram: a group of agents, a Platform and an Evaluator. Labels in slide order: "given by the participant" (after the agents), "given by the contest" (before the Platform), "given by the contest" (after the Evaluator).]
Method II
  [Diagram: a group of agents, a Platform and an Evaluator. Labels in slide order: "given by the contest" (after the agents), "given by the participant" (after the Platform and Evaluator).]
Method III
  [Diagram: two groups of agents, a Platform and an Evaluator. Labels in slide order: "given by the contest" and "given by the participant" (after the agents), "given by the participant" (after the Platform and Evaluator).]
Discussion
  What?
    Language, Model, Platform
  Metrics?
  Testbed or Contest?
  Which scenario?
    TAS, TAC, ...
  Which Method (I, II, III)?
  Suggestions, new ideas?