← Back to Overview
Understanding and Evaluating Cooperation-Relevant Capabilities (High Priority)
What It Is and Why It Matters

In this area, we would like to see proposals for work on definitions, metrics and methods for the evaluation of cooperation-relevant capabilities of AI systems. We believe that work on cooperation-relevant capabilities will be important to create a foundation for the development of AI systems with desirable, cooperative properties.

Specific Work We Would Like to Fund
  • Defining and measuring cooperation-relevant capabilities:
    • Theoretical work on the identification and definition of cooperation-relevant capabilities, including the development of rigorous arguments on whether those capabilities are desirable for AI systems (in general, or under specific conditions relevant to important real-world cases).
    • Empirical work on methods for evaluating or measuring such cooperation-relevant capabilities in frontier AI systems.
  • Investigating the causes of cooperation-relevant capabilities:
    • Theoretical work on how cooperation-relevant capabilities could arise in realistic AI systems, for example through training processes (including fine-tuning and in-context learning).
    • Empirical work on how cooperation-relevant capabilities arise in real systems. 
  • Theoretical and empirical work investigating the extent to which the differential development of beneficial cooperative capabilities is possible. Such work could include assessments of whether we should expect certain capabilities to affect cooperation in a net-positive or net-negative direction overall, in general, or in specific settings.
  • Theoretical and empirical work on how asymmetries in agents' capabilities and/or bounded rationality could affect cooperation.
Key Considerations
  • Note the distinction between cooperative capabilities (what the system can do) and cooperative propensities (what the system tends to do). 
  • For this area it is important to note that many capabilities are dual-use. Definitions and metrics that distinguish between desirable and undesirable (uses of) capabilities will be especially valuable as this can facilitate differential development of safe and beneficial systems. We are particularly likely to fund work on definitions or metrics that take this distinction into account.
  • For the study of how cooperation-relevant capabilities arise, we expect to only fund work that aims to draw general conclusions about causal relationships (e.g. between a feature of an agent’s training training and the agent’s development of a cooperation-relevant capability).
  • We strongly prefer applications that make use of state-of-the-art methods for capabilities evaluations. See the references below for pointers to methods for dangerous capabilities evaluation, for example.   
  • For empirical work, we are open to proposals either using foundation models or other approaches such as multi-agent reinforcement learning. However, please note the general guidelines on the importance of clarity about path to impact and justifying why the results should be expected to apply to complex settings with more advanced agents.
References
Priority Research Areas
Understanding and Evaluating Cooperation-Relevant Propensities (High Priority)
Understanding and Evaluating Cooperation-Relevant Propensities (High Priority)
Understanding and Evaluating Cooperation-Relevant Capabilities (High Priority)
Understanding and Evaluating Cooperation-Relevant Capabilities (High Priority)
Incentivizing Cooperation Among AI Agents
Incentivizing Cooperation Among AI Agents
AI for Facilitating Human Cooperation
AI for Facilitating Human Cooperation
Monitoring and Controlling Dynamic Networks of Agents and Emergent Properties
Monitoring and Controlling Dynamic Networks of Agents and Emergent Properties
Information Asymmetries and Transparency
Information Asymmetries and Transparency
Secondary Research Areas
No items found.