Incentivizing Cooperation Among AI Agents
What It Is and Why It Matters

In this area, we would like to see proposals that address how cooperation can be incentivized among self-interested AI agents in mixed-motive settings. We expect such work to be important for finding approaches that lead to societally beneficial outcomes when advanced AI agents with conflicting goals are deployed in the real world.

Specific Work We Would Like to Fund
  • Theoretical and empirical work on peer (i.e. decentralised) incentivization (a toy illustration follows this list):
    • Development of realistic assumptions and models about methods of peer incentivization (e.g., monetary) and domains of application.
    • Understanding and building the infrastructure required for decentralised (third-party) norm enforcement.
    • Scalable and secure methods for inter-agent commitments and contracting.
    • Minimising inefficiencies from sanctions.
  • Scaling of methods to incentivize cooperation:
    • Scaling opponent-shaping and peer incentivization to more complex agents and environments (including LLM agents).
    • Approaches in automated/adaptive mechanism design (i.e. centralised forms of incentivization) that focus on scaling to very large numbers of agents and/or much more complex agents and environments (including LLM agents).
  • Conceptual and engineering work on designing infrastructure for interactions between agents that incentivizes cooperation (e.g., that supports the development and implementation of prosocial norms and commitments).
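To make the notion of peer incentivization concrete, the sketch below shows self-interested agents in a repeated Prisoner's Dilemma who can pay a personal cost to sanction a defecting peer. This is only a toy illustration, not a proposed method or an example drawn from any particular paper: the payoff matrix and the sanction cost/penalty values are assumptions chosen for exposition. It also shows the efficiency loss that sanctions introduce, the theme of the "minimising inefficiencies from sanctions" point above.

```python
# Toy illustration of peer (decentralised) incentivization in a repeated
# Prisoner's Dilemma. All numerical values are assumptions for exposition only.

# Row player's payoff given (own action, opponent's action); "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

SANCTION_COST = 1     # payoff the sanctioning agent gives up
SANCTION_PENALTY = 4  # payoff the sanctioned agent loses (assumed destroyed, not transferred)


def play_round(action_a, action_b, a_sanctions_defection, b_sanctions_defection):
    """Payoffs for agents A and B in one round, including any peer sanctions."""
    reward_a = PAYOFF[(action_a, action_b)]
    reward_b = PAYOFF[(action_b, action_a)]
    # Peer enforcement: an agent may punish a defecting opponent at a cost to itself.
    if a_sanctions_defection and action_b == "D":
        reward_a -= SANCTION_COST
        reward_b -= SANCTION_PENALTY
    if b_sanctions_defection and action_a == "D":
        reward_b -= SANCTION_COST
        reward_a -= SANCTION_PENALTY
    return reward_a, reward_b


if __name__ == "__main__":
    # Against a sanctioning peer, defecting yields 5 - 4 = 1, which is worse than
    # the 3 from mutual cooperation, so the threat of peer sanctions can sustain
    # cooperation between self-interested agents.
    print(play_round("C", "C", True, True))  # (3, 3): no sanction is triggered
    print(play_round("D", "C", True, True))  # (1, -1): defector punished, punisher pays the cost
    # Note the inefficiency: each applied sanction destroys
    # SANCTION_COST + SANCTION_PENALTY = 5 units of total welfare.
```

In richer settings, the research questions above are about making this kind of enforcement scalable and secure, and less wasteful than a crude destroy-payoff sanction.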

Key Considerations
  • For work in this area, we want to emphasise that the approaches need to be applicable to “self-interested” agents. We do not expect to fund work in which agents share rewards, or which otherwise assumes that cooperative outcomes can only be achieved by controlling how every agent is designed or trained. The point is to develop approaches that allow us to promote beneficial outcomes even when the individual agents are selfishly optimising for their own goals.
  • For methods of peer incentivization and opponent-shaping, we will be especially interested in techniques with the following properties (sketched more formally after this list):
    • Will (at least with high probability) converge to outcomes with high social welfare if adopted by everyone.
    • Result in an equilibrium if adopted by everyone.
    • Are robust to cases where not everyone adopts the algorithm/policy. 
  • For work on mechanism design and infrastructure, it is important to consider how the work could be relevant for significant real-world scenarios. If you can point to which (kind of) institution would play the role of the mechanism designer in such a real-world scenario, this would help to strengthen the proposal.
  • We are interested in research that applies to AI agents specifically. 
  • Much of the existing work in fields such as mechanism design is focused on narrow application areas such as auctions, which are not a priority for us.
  • Opponent-shaping and peer incentivization are potentially dual-use, meaning progress in these areas may have a negative overall impact on society. For such proposals, it is therefore especially important to consider downside risks. We will not fund work where we think the contribution to downside risks outweighs the positive impact of the work.
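The three properties listed above can be made precise in several ways. The sketch below is one illustrative formalisation; the notation (utilities u_i, learning/shaping rule A, welfare function W, and tolerance parameters) is our assumption rather than a definition from this call.

```latex
% Illustrative formalisation (notation and tolerances are assumptions).
% $n$ agents; agent $i$ has policy $\pi_i$ and utility $u_i$; $A$ denotes the
% proposed learning/shaping algorithm; $W(\pi) = \sum_i u_i(\pi)$ is social welfare.
\begin{enumerate}
  \item \textbf{High welfare under universal adoption:} if all agents run $A$,
        then with probability at least $1-\delta$ the joint policy converges to
        some $\pi^\ast$ with $W(\pi^\ast) \ge (1-\epsilon)\,\max_{\pi} W(\pi)$.
  \item \textbf{Equilibrium under universal adoption:} the limit point $\pi^\ast$
        is a Nash equilibrium, i.e.\ $u_i(\pi_i^\ast, \pi_{-i}^\ast) \ge
        u_i(\pi_i', \pi_{-i}^\ast)$ for every agent $i$ and every deviation $\pi_i'$.
  \item \textbf{Robustness to partial adoption:} if only a subset $S$ of agents
        runs $A$ while the others follow arbitrary (possibly adversarial)
        strategies, each adopter $i \in S$ still secures
        $u_i \ge \underline{v}_i - \epsilon$, where $\underline{v}_i$ is a
        safety-level benchmark such as agent $i$'s maximin value.
\end{enumerate}
```

This is only one way such guarantees could be stated; proposals are free to use different formalisations or benchmarks.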
