> Google DeepMind. 'Introducing the Frontier Safety Framework'. _Google DeepMind_, 17 May 2024, [https://deepmind.google/discover/blog/introducing-the-frontier-safety-framework/](https://deepmind.google/discover/blog/introducing-the-frontier-safety-framework/).
# Introducing the Frontier Safety Framework
## Overview
Google DeepMind's **Frontier Safety Framework (FSF)** is a set of protocols for proactively identifying future AI capabilities that could cause severe harm and putting in place mechanisms to detect and mitigate them. It has three key components:
1. ==**Identifying capabilities a model may have with potential for severe harm.**==
2. ==**Evaluating our frontier models periodically to detect when they reach these Critical Capability Levels.**==
3. ==**Applying a mitigation plan when a model passes our early warning evaluations.**==

## Core Components
1. **Critical Capability Levels (CCLs)**
    - ==**CCLs are capability thresholds at which, absent mitigations, models may pose heightened risk of severe harm.**==
    - Initial CCLs are defined in four high-risk domains:
        - **Autonomy:** Models capable of self-directed action and resource acquisition.
        - **Biosecurity:** Models that could enable the development or execution of biological threats.
        - **Cybersecurity:** Models that could automate or assist in cyberattacks.
        - **Machine Learning R&D:** Models that could accelerate AI research, potentially enabling rapid proliferation of advanced AI.
2. **Evaluation Protocols**
    - ==**Frontier models are periodically tested with "early warning evaluations" to assess how close they are to a CCL.**==
    - Evaluations are triggered by increases in "effective compute" (a measure integrating model size, data, and compute) or by significant fine-tuning progress.
    - The goal is to provide a safety buffer before a model reaches a CCL (see the illustrative sketch after this list).
3. **Mitigation Strategies**
    - ==**Security Mitigations:**== Prevent unauthorized access to or exfiltration of model weights, with escalating levels of security (from industry-standard controls to advanced confidential computing).
    - ==**Deployment Mitigations:**== Manage and restrict the use of critical capabilities in deployment, including safety fine-tuning, misuse detection, red-teaming, and, at the highest level, prevention of access to critical capabilities.
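
To make the trigger-and-buffer idea concrete, here is a minimal, hypothetical sketch in Python. The FSF blog post does not publish concrete thresholds or an algorithm; the function names, the 2x growth trigger, and the mitigation lists below are assumptions made for illustration, not DeepMind's actual process.

```python
# Hypothetical sketch only. The FSF does not specify concrete thresholds or an
# algorithm; names and numbers here are illustrative assumptions.

def should_run_early_warning_evals(
    current_effective_compute: float,
    compute_at_last_eval: float,
    growth_trigger: float = 2.0,  # illustrative trigger, not a published value
) -> bool:
    """Re-evaluate once effective compute has grown by a chosen factor since the
    last evaluation (a fuller version would also trigger on fine-tuning progress)."""
    return current_effective_compute >= growth_trigger * compute_at_last_eval


def mitigation_plan(passed_early_warning: bool) -> list[str]:
    """Sketch of the escalation described above: if a model passes early warning
    evaluations, apply security and deployment mitigations before further
    training or deployment."""
    if not passed_early_warning:
        return ["industry-standard security controls", "standard deployment review"]
    return [
        "heightened weight security (access controls, exfiltration prevention)",
        "deployment restrictions on the critical capability",
        "safety fine-tuning, misuse detection, and red-teaming",
    ]
```

In practice such checks would be coupled to domain-specific evaluation suites for each CCL; the sketch only shows the overall shape of the protocol.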
## Initial Critical Capability Levels (Examples)
- **Autonomy:** Model can autonomously acquire resources and replicate itself.
- **Biosecurity (Amateur/Expert Enablement):** Model can help non-experts or experts develop biological threats.
- **Cybersecurity (Autonomy/Enablement):** Model can automate cyberattacks or help amateurs conduct sophisticated attacks.
- **Machine Learning R&D:** Model can significantly accelerate AI research or automate the AI R&D pipeline.
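
Purely as an illustration of how these example thresholds might be tracked alongside evaluation results, they can be encoded as a small data structure. The keys and wording below are this note's assumptions, not an official DeepMind schema.

```python
# Illustrative encoding of the example CCLs above; keys and phrasing are
# assumptions for this note, not an official schema.
EXAMPLE_CCLS: dict[str, str] = {
    "autonomy": "Can autonomously acquire resources and replicate itself.",
    "biosecurity": "Can help non-experts or experts develop biological threats.",
    "cybersecurity": "Can automate cyberattacks or help amateurs conduct sophisticated attacks.",
    "ml_rnd": "Can significantly accelerate AI research or automate the AI R&D pipeline.",
}
```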
## Implementation and Future Work
- The initial framework is targeted for implementation by early 2025, before these risks are expected to materialize.
- The FSF is exploratory and will evolve as understanding of AI risks improves.
- Future updates will focus on:
    - More precise risk modeling and forecasting.
    - Advanced capability elicitation techniques.
    - Improved mitigation plans balancing safety and innovation.
    - Inclusion of additional risk domains (e.g., chemical, radiological, nuclear risks, misaligned AI).
    - Involving external authorities and independent experts in assessments and mitigation.
---
_Note: this page was at least partly written using generative AI._