AI Safety Index 2025: How Frontier AI Companies Score on Safety, Governance, and Existential Risk

📌 Key Takeaways

  • Anthropic leads with C+: The highest-scoring company earned just 2.64 out of 4.0, highlighting that even the best frontier AI developers fall short of adequate safety practices.
  • Existential safety failure: No company scored above a D in existential safety — the industry has no adequate plans for managing the risks of the AGI systems they claim to be building.
  • Governance divergence: Anthropic’s A- in governance contrasts sharply with most competitors’ D grades, showing that corporate structure choices (like Public Benefit Corporation) meaningfully impact safety outcomes.
  • Transparency gaps: Only OpenAI published a full whistleblowing policy, and only after media pressure. Most companies rely on NDAs that limit external scrutiny of safety concerns.
  • Chinese firms trail: Zhipu AI (F, 0.62) and DeepSeek (F, 0.37) scored lowest, though the report acknowledges different regulatory and cultural contexts affect transparency practices.

AI Safety Index 2025: What It Measures and Why It Matters

The Future of Life Institute (FLI) has released its Summer 2025 AI Safety Index, providing the most comprehensive independent assessment of frontier AI companies’ safety practices to date. As AI systems become increasingly powerful and pervasive, the question of whether the companies building them are doing so responsibly has moved from academic concern to urgent policy priority. This report delivers a data-driven answer — and the results are sobering.

The AI Safety Index evaluates seven frontier AI developers: Anthropic, OpenAI, Google DeepMind, xAI, Meta, Zhipu AI, and DeepSeek. These companies represent the vanguard of AI capability development, collectively responsible for the most powerful language models, reasoning systems, and multimodal AI architectures in existence. The assessment covers 33 specific indicators across six domains, graded by a six-person independent expert panel that includes Stuart Russell, Dylan Hadfield-Menell, and Jessica Newman.

The evidence window spans March to June 2025, drawing on public sources and a voluntary company survey (to which only OpenAI, Zhipu AI, and xAI responded). This combination of external observation and direct engagement provides a nuanced picture of each company’s safety posture — though the refusal of four companies to participate in the survey itself says something about industry commitment to transparency. For context on how AI governance frameworks are evolving globally, explore our interactive guide on the Chatham House AI Global Governance Framework.

Methodology: 33 Indicators Across Six Safety Domains

The AI Safety Index evaluates companies across six carefully defined domains, each capturing a distinct dimension of responsible AI development. Risk Assessment examines whether companies conduct internal and external testing, evaluate dangerous capabilities, test for human uplift scenarios (where AI helps users cause harm they couldn’t without it), and maintain bug bounty programmes. This domain measures the foundation of safety practice — you cannot manage risks you haven’t identified.

Current Harms evaluates performance on established safety benchmarks like HELM and TrustLLM, robustness testing, red-teaming practices, output watermarking (such as Google’s SynthID), and user privacy protections. This is where the rubber meets the road: how well do current AI systems actually behave in practice, and how effectively are companies monitoring and mitigating real-world harms?

Safety Frameworks assesses whether companies have systematic approaches to risk identification, analysis, treatment, and internal governance. The Index draws on SaferAI’s methodology to evaluate the maturity and comprehensiveness of these frameworks. Existential Safety — the domain where results are most alarming — examines AGI control strategies, monitoring systems, technical alignment research, and external safety support.

Governance and Accountability covers corporate structure and mandates, lobbying activities, whistleblowing policies, and safety reporting culture. Information Sharing evaluates transparency on system prompts, behaviour specifications, voluntary cooperation with international frameworks (such as G7 reporting), and incident notification practices. Together, these six domains create a 360-degree view of each company’s safety commitment.

Overall AI Safety Rankings: From Anthropic’s C+ to DeepSeek’s F

The headline finding is striking in its mediocrity: the best-performing frontier AI company earns a C+. Anthropic leads the field with 2.64 out of 4.0, followed by OpenAI at C (2.10), Google DeepMind at C- (1.76), xAI at D (1.23), Meta at D (1.06), Zhipu AI at F (0.62), and DeepSeek at F (0.37). To put this in perspective, the company building arguably the most safety-conscious AI systems in the world would earn a barely passing grade at most universities.
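
How the letter grades relate to the 0-4 numbers is easiest to see with a small worked example. The sketch below uses a standard US-style GPA mapping and an equal-weight average of the six domain grades reported for Anthropic elsewhere in this article; both choices are illustrative assumptions rather than the Index’s documented methodology, and the small gap versus the published 2.64 reflects the fact that the Index aggregates across all 33 indicators rather than at the domain level.

```python
# Illustrative only: a standard GPA-style letter-grade mapping and an
# equal-weight average over domain grades. This is NOT the Index's
# documented aggregation method.
GRADE_POINTS = {
    "A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7,
    "D+": 1.3, "D": 1.0, "D-": 0.7, "F": 0.0,
}

def overall_score(domain_grades: dict[str, str]) -> float:
    """Equal-weight average of per-domain letter grades on a 4.0 scale."""
    points = [GRADE_POINTS[grade] for grade in domain_grades.values()]
    return round(sum(points) / len(points), 2)

# Anthropic's domain grades as reported in this article:
anthropic = {
    "Risk Assessment": "C+",
    "Current Harms": "B-",
    "Safety Frameworks": "C",
    "Existential Safety": "D",
    "Governance & Accountability": "A-",
    "Information Sharing": "A-",
}

print(overall_score(anthropic))  # 2.57, close to but not exactly the published 2.64
```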

The distribution reveals a clear three-tier structure. The top tier (Anthropic, OpenAI, Google DeepMind) demonstrates meaningful engagement with safety across most domains, though with significant gaps. The middle tier (xAI, Meta) shows partial engagement with notable weaknesses. The bottom tier (Zhipu AI, DeepSeek) demonstrates minimal alignment with the Index’s safety criteria, though the report appropriately notes that different regulatory environments and cultural norms around corporate transparency may partially explain the gap.

Perhaps most concerning is the trajectory implied by these scores. AI capabilities are advancing at a historically unprecedented pace — with new frontier models demonstrating step-change improvements in reasoning, coding, and multi-step problem solving every few months. Yet safety practices, as measured by this Index, are improving incrementally at best. The gap between what these systems can do and how well we understand and control them is widening, not closing.

Risk Assessment and Dangerous Capability Testing

The Risk Assessment domain reveals a fundamental divide: only three of seven companies — Anthropic, OpenAI, and Google DeepMind — conduct substantive testing for dangerous capabilities tied to large-scale risks. This means four companies developing frontier AI systems do not systematically evaluate whether their models could enable catastrophic harm, such as assisting in the development of biological, chemical, or radiological weapons, or enabling unprecedented cyberattacks.

Anthropic leads this domain with a C+, demonstrating the most comprehensive approach to pre-deployment risk evaluation. Their Responsible Scaling Policy (RSP) establishes concrete capability thresholds that trigger additional safety measures, creating a systematic link between model capabilities and required protections. OpenAI follows with a C, having published detailed model cards and conducted external red-teaming for major releases, though with less systematic threshold-based triggering.
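
The core of this approach is the systematic link between measured capabilities and required protections. A minimal sketch of what threshold-based triggering can look like is shown below; the evaluation names, trigger scores, and safeguard lists are hypothetical placeholders, not a description of Anthropic’s actual RSP internals.

```python
# Hypothetical sketch of threshold-based triggering: evaluation scores are
# compared against pre-committed capability thresholds, and crossing one
# escalates the safeguards required before further scaling or deployment.
# All names and numbers are invented for illustration.
from dataclasses import dataclass

@dataclass
class CapabilityThreshold:
    name: str                     # a dangerous-capability evaluation
    trigger_score: float          # score at which extra safeguards kick in
    required_measures: list[str]  # safeguards required once triggered

THRESHOLDS = [
    CapabilityThreshold("bio_uplift_eval", 0.4,
                        ["enhanced red-teaming", "deployment review board"]),
    CapabilityThreshold("autonomous_replication_eval", 0.3,
                        ["weights-access restrictions", "pause pending external audit"]),
]

def required_safeguards(eval_scores: dict[str, float]) -> list[str]:
    """Return every safeguard triggered by the latest evaluation scores."""
    measures: list[str] = []
    for threshold in THRESHOLDS:
        if eval_scores.get(threshold.name, 0.0) >= threshold.trigger_score:
            measures.extend(threshold.required_measures)
    return measures

print(required_safeguards({"bio_uplift_eval": 0.55, "autonomous_replication_eval": 0.1}))
# ['enhanced red-teaming', 'deployment review board']
```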

Google DeepMind earns a C-, recognised for its Frontier Safety Framework but criticised for insufficient public disclosure of testing methodologies and results. The remaining companies — xAI (F), Meta (D), Zhipu AI (F), and DeepSeek (F) — demonstrate inadequate risk assessment practices. The report flags Meta’s open-weight releases as particularly concerning in this context: releasing model weights without adequate safety testing means the broader community inherits responsibility for identifying dangerous capabilities, often without the resources or expertise to do so effectively.

Current Harms: Benchmarks, Robustness, and AI Watermarking

In the Current Harms domain, which measures how well companies manage real-world negative impacts of deployed systems, the results are marginally better. OpenAI leads with a B grade, followed by Anthropic at B-, reflecting both companies’ investment in safety benchmarking, red-teaming programmes, and harm mitigation systems. Google DeepMind earns a C+, with particular recognition for SynthID — its watermarking technology for AI-generated content, described as the most advanced implementation among assessed companies.

Anthropic stands out for a distinctive policy choice: it is the only assessed company that does not use user interaction data for training by default. This privacy-first approach reduces certain categories of harm (data leakage, memorisation of personal information) and aligns with growing regulatory expectations around AI and data protection. The trade-off — potentially less data for improving model behaviour — appears to be one Anthropic is willing to make in exchange for user trust.

At the lower end, xAI and Meta earn D+ grades, while Zhipu AI receives a D and DeepSeek a D-. The report notes that the open-weight model releases by Meta and others create a specific category of harm risk: fine-tuning by downstream users can remove safety guardrails, potentially creating more dangerous versions of already-capable models. This tension between openness and safety remains one of the most contentious debates in the AI community, and the Index takes a cautious position, flagging open-weight releases as increasing risk without adequate safeguards.

AI Safety Frameworks and Governance Gaps

The Safety Frameworks domain evaluates whether companies have systematic, comprehensive approaches to identifying, analysing, and treating AI risks. Anthropic and OpenAI share the lead at C, while Google DeepMind, xAI, and Meta cluster at D+. Zhipu AI and DeepSeek both receive F grades, indicating no publicly documented safety frameworks meeting the Index’s criteria.

The report draws a crucial distinction between having a safety framework on paper and implementing it with genuine enforcement mechanisms. Several companies have published impressive-sounding safety policies, but the Index evaluates implementation evidence — such as documented cases where safety concerns led to deployment delays or capability restrictions. This implementation gap is where many companies’ stated commitments diverge from observed practice.

The Governance and Accountability domain shows the starkest divergence between companies. Anthropic earns an A- — the highest grade in any domain for any company — largely due to its Public Benefit Corporation structure, which legally embeds safety obligations into the company’s corporate mandate. OpenAI receives a C-, with the report noting the irony of its transformation from a nonprofit focused on safe AI to a company navigating complex governance transitions. For a deeper analysis of governance structures in AI, our interactive guide on Carnegie’s Global AI Governance Analysis offers complementary insights.

Existential AI Safety: Industry Unprepared for AGI Risks

The most alarming finding in the entire AI Safety Index is the Existential Safety domain, where no company scores above a D. Anthropic leads with D, while OpenAI receives an F — a grade that may surprise many, given OpenAI’s founding mission explicitly centred on beneficial AGI. Google DeepMind earns a D-, and all remaining companies receive F grades. The message is unambiguous: the companies most aggressively pursuing artificial general intelligence have no adequate plans for controlling it.

The report evaluates four key dimensions of existential safety: AGI control strategies (concrete plans for ensuring advanced systems remain aligned with human values and subject to human oversight), monitoring systems (ability to detect concerning capability developments or alignment failures), technical alignment research (investment in fundamental research on making AI systems reliably pursue intended objectives), and external support (contribution to the broader ecosystem of AI safety research and governance).

The finding that several companies simultaneously claim to be building AGI within years while failing to demonstrate any concrete control strategy is described by the report as a “profound institutional contradiction.” If these companies’ own timelines are to be believed, the existential safety gap is not an abstract long-term concern but an immediate operational risk. The Center for AI Safety has similarly warned that the pace of capability development is outstripping safety research by a widening margin.

OpenAI’s F grade in this domain specifically reflects what the report characterises as “safety-capacity concerns” — instances where commercial pressures appeared to override safety considerations in deployment decisions. The report notes that several high-profile departures of safety-focused researchers from OpenAI during the assessment period corroborate external concerns about the prioritisation of capability development over safety investment.

AI Transparency, Whistleblowing, and Information Sharing

Information sharing and transparency form a critical enabler of AI safety — external scrutiny can catch risks that internal processes miss. Anthropic leads with an A- in Information Sharing, followed by both OpenAI and Google DeepMind at B. xAI earns a C+, while Meta, Zhipu AI, and DeepSeek score D, D, and F respectively.

A particularly telling indicator is whistleblowing policy. The report finds that only OpenAI has published a full whistleblowing policy — and only after media investigations revealed that previous non-disparagement clauses in employee agreements could discourage safety-related disclosures. This reactive transparency, while better than silence, illustrates the broader pattern: safety improvements often follow public pressure rather than proactive commitment.

The report recommends that all frontier AI companies publish comprehensive whistleblowing policies, reduce the scope of NDAs that prevent safety researchers from sharing concerns externally, and establish clear incident notification procedures. The current state of affairs — where employees with safety concerns face legal and professional risks for raising them publicly — is described as fundamentally incompatible with responsible development of powerful AI systems.

System prompt transparency and behaviour specification disclosure are additional areas where companies diverge. OpenAI’s publication of its model specification and Anthropic’s detailed system prompt documentation set a standard that other companies have not matched. The report argues that understanding how AI systems are instructed to behave is a prerequisite for external safety evaluation, and opacity in this area signals either a lack of systematic specification or an unwillingness to be held accountable for stated behavioural objectives.

Company Profiles: Strengths and Weaknesses Across the Field

Anthropic (C+, 2.64) emerges as the clear safety leader, with particular strengths in governance (A-), information sharing (A-), and risk assessment (C+). Its Public Benefit Corporation structure, privacy-first data practices, and Responsible Scaling Policy set it apart. However, its D in existential safety shows that even the most safety-conscious company has significant gaps in AGI preparedness. The company’s focus on Constitutional AI and interpretability research positions it well for continued improvement, but execution must match ambition.

OpenAI (C, 2.10) presents a complex picture: strong on current harms mitigation (B) and information sharing (B), but struggling with existential safety (F) and governance (C-). The company’s willingness to engage with the survey and publish safety documentation demonstrates genuine transparency commitment, but the disconnect between its AGI mission and its existential safety score raises fundamental questions about strategic coherence. Our interactive analysis of European AI Governance and Power provides additional context on how regulatory frameworks are emerging to address these gaps.

Google DeepMind (C-, 1.76) sits in the middle, with solid safety infrastructure (SynthID watermarking, Frontier Safety Framework) but weaker governance (D) and transparency on testing methodologies. Its deep integration with Alphabet’s corporate structure brings both resources and constraints that affect its safety profile.

The bottom four companies tell a story of insufficient investment in safety relative to capability development. xAI (D, 1.23) and Meta (D, 1.06) have the resources to do significantly better but appear to prioritise development speed. Zhipu AI (F, 0.62) and DeepSeek (F, 0.37) face legitimate contextual challenges — different regulatory environments, different norms around corporate transparency — but the report maintains that the underlying risks do not respect geography: unsafe AI systems are dangerous wherever they are deployed.

AI Safety Recommendations and Policy Implications

The AI Safety Index concludes with a set of concrete recommendations that collectively describe what a responsible frontier AI development practice should look like. In the short term, all companies should publish full whistleblowing policies, expand independent pre-deployment testing with reduced NDA constraints, publish explicit risk-assessment methodologies linked to specific risks in model cards, and increase investment in technical safety research — particularly in interpretability, tamper-resistant safeguards, and scalable oversight.

The report takes a clear position on the open-weight debate: open releases of frontier model weights without tamper-proof safeguards increase risk, and companies should either develop effective safety mechanisms that survive fine-tuning or exercise greater caution in weight release decisions. This recommendation will be controversial in a community that values openness, but the report’s logic is straightforward — a safety mechanism that can be trivially removed provides an illusion of safety rather than genuine protection.

For longer-term governance, the report recommends establishing concrete trigger thresholds and enforcement mechanisms for conditional development pauses, strengthening corporate governance structures to legally prioritise safety (citing Anthropic’s Public Benefit Corporation as a model), and — most critically — creating regulatory floors that establish minimum safety standards for frontier AI development. The report explicitly states that voluntary pledges have proven insufficient and that government intervention is needed.

The overarching conclusion is both simple and alarming: AI capability is accelerating faster than safety and governance practices. The gap between what frontier AI systems can do and the industry’s ability to ensure they do so safely is widening. Without significant changes — both from companies and from regulators — this trajectory poses risks that the Index’s own grading system struggles to adequately capture. The question is no longer whether frontier AI companies need better safety practices, but whether the improvement will come fast enough to matter.

Frequently Asked Questions

What is the AI Safety Index and who publishes it?

The AI Safety Index is published by the Future of Life Institute (FLI) as an independent assessment of frontier AI companies’ safety practices. The Summer 2025 edition evaluates seven companies — Anthropic, OpenAI, Google DeepMind, xAI, Meta, Zhipu AI, and DeepSeek — across 33 indicators in six domains including risk assessment, current harms, safety frameworks, existential safety, governance, and information sharing.

Which AI company scored highest on the AI Safety Index 2025?

Anthropic scored highest with a C+ grade (2.64 out of 4.0), leading in risk assessment, governance and accountability (A-), and information sharing (A-). Anthropic was noted for not training on user data by default, strong alignment research, and its Public Benefit Corporation structure. OpenAI ranked second with a C grade (2.10), followed by Google DeepMind at C- (1.76).

Why did no AI company score above D in existential safety?

The FLI report found that no frontier AI company has adequate plans for managing existential risks from advanced AI systems. Despite several companies claiming to pursue AGI, none demonstrated concrete control strategies, meaningful conditional pause mechanisms, or comprehensive alignment research programmes proportionate to the risks they acknowledge. Anthropic scored highest in this domain at D, with all others receiving D- or F grades.

How does the AI Safety Index evaluate AI companies?

The Index uses 33 indicators across six domains: Risk Assessment (internal/external testing, dangerous capability evaluation), Current Harms (benchmarks, robustness, red-teaming, watermarking, privacy), Safety Frameworks (risk identification and treatment), Existential Safety (AGI control plans, alignment research), Governance and Accountability (corporate structure, whistleblowing), and Information Sharing (transparency, incident reporting). A six-person expert panel grades companies based on public sources and voluntary company surveys.

What are the main recommendations from the AI Safety Index?

Key recommendations include publishing full whistleblowing policies, expanding independent pre-deployment testing, increasing investment in interpretability and alignment research, improving transparency on system prompts and incident reporting, establishing concrete trigger thresholds for development pauses, strengthening governance structures to prioritise safety, and creating regulatory floors since voluntary pledges are insufficient. The report emphasises that AI capability is advancing faster than safety practices.
