Spotlight
Greedy Sequential Execution: Solving Homogeneous and Heterogeneous Cooperative Tasks with a Unified Framework
Shanqi Liu · Dong Xing · Pengjie Gu · Bo An · Yong Liu · Xinrun Wang
Effectively handling both homogeneous and heterogeneous tasks is crucial for the practical application of cooperative agents. However, existing solutions have not succeeded in addressing both types of tasks simultaneously. On one hand, value-decomposition-based approaches demonstrate superior performance in homogeneous tasks, but they tend to produce agents with similar policies, which is unsuitable for heterogeneous tasks. On the other hand, solutions based on personalized observation or assigned roles are well-suited for heterogeneous tasks, but they often involve a trade-off in which the agents' performance in homogeneous scenarios suffers due to the aggregation of distinct policies. An alternative approach is to adopt sequential execution policies, which offer a flexible form for learning both types of tasks. However, learning sequential execution policies poses challenges in credit assignment, and the lack of sufficient information about subsequently executed agents can lead to sub-optimal solutions. To tackle these issues, this paper proposes Greedy Sequential Execution (GSE) to learn the optimal policy that covers both scenarios. In GSE, we introduce an individual utility function into the framework of value decomposition to account for the complex interactions between agents. This function is capable of representing both the homogeneous and heterogeneous optimal policies. Furthermore, we use a greedy marginal contribution calculated by the utility function as the credit value of the sequential execution policy to address the credit assignment problem. We evaluate GSE in both homogeneous and heterogeneous scenarios. The results demonstrate that GSE achieves significant improvement in performance across multiple domains, especially in scenarios involving both homogeneous and heterogeneous tasks.
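To make the idea of sequential execution with a greedy credit more concrete, here is a minimal toy sketch. It is not the paper's algorithm or its utility function: the reward table `team_value`, the helper `optimistic_credit`, and the two-action setup are all hypothetical, chosen only to illustrate the general pattern. Each agent acts in turn, sees the actions already chosen, and scores each candidate action by the best team value that the remaining agents could still reach.

```python
from itertools import product

ACTIONS = (0, 1)  # hypothetical discrete action set shared by all agents


def team_value(joint, heterogeneous):
    """Hypothetical team reward for a complete joint action (assumption)."""
    if heterogeneous:
        # Heterogeneous toy task: agents must all pick distinct actions.
        return 1.0 if len(set(joint)) == len(joint) else 0.0
    # Homogeneous toy task: every agent must pick action 1.
    return 1.0 if all(a == 1 for a in joint) else 0.0


def optimistic_credit(prefix, action, n_agents, heterogeneous):
    """Best team value reachable if later agents respond optimally.

    A stand-in for a greedy credit signal, not the paper's definition.
    """
    n_remaining = n_agents - len(prefix) - 1
    return max(
        team_value(prefix + (action,) + rest, heterogeneous)
        for rest in product(ACTIONS, repeat=n_remaining)
    )


def greedy_sequential(n_agents, heterogeneous):
    """Agents act one after another, each maximizing its optimistic credit."""
    joint = ()
    for _ in range(n_agents):
        # Ties are broken in favor of the first action in ACTIONS.
        best = max(
            ACTIONS,
            key=lambda a: optimistic_credit(joint, a, n_agents, heterogeneous),
        )
        joint += (best,)
    return joint


if __name__ == "__main__":
    print(greedy_sequential(2, heterogeneous=True))
    print(greedy_sequential(2, heterogeneous=False))
```

In this toy setting the same procedure yields distinct actions on the heterogeneous task and identical actions on the homogeneous one, which is the flexibility the abstract attributes to sequential execution. The exhaustive search over later agents' actions is only feasible for tiny examples; a learned utility function would have to replace it in practice.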