AI Safety Report 2026: Global Risks, Capabilities and Governance Frameworks

📌 Key Takeaways

  • Unprecedented Collaboration: The AI Safety Report 2026, chaired by Yoshua Bengio, unites over 90 authors and advisers from 30+ countries in the largest global scientific review of AI safety ever conducted.
  • Agent Risks Escalating: Autonomous AI agents that act in the real world pose novel safety risks because their failures can cause direct harm before any human has an opportunity to intervene.
  • Reliability Gaps Persist: Despite improving benchmark scores, AI models still produce potentially harmful answers to 19% of medical queries and fabricate legal citations, highlighting real-world reliability shortfalls.
  • Governance Urgency: The report calls for international cooperation on sector-specific AI regulation, safety research investment and mechanisms to prevent excessive concentration of AI capabilities.
  • Cybersecurity Dual Edge: Advanced AI strengthens both cyber offense and defense capabilities, creating an intensifying measure-countermeasure race with implications for critical infrastructure protection.

The Largest Global Collaboration on AI Safety

The International AI Safety Report 2026 stands as a landmark achievement in global scientific cooperation on artificial intelligence governance. Published in February 2026 under the chairmanship of Professor Yoshua Bengio — Turing Award laureate and founder of the Mila Quebec AI Institute — the report represents the most comprehensive multinational effort to assess the capabilities, emerging risks, and safety of general-purpose AI systems. With contributions from over 90 researchers and advisers spanning institutions including Stanford University, the University of Oxford, Harvard University, Princeton University, MIT, and Carnegie Mellon University, the AI Safety Report 2026 synthesizes cutting-edge scientific evidence into actionable frameworks for policymakers, industry leaders, and civil society worldwide.

This second edition builds on the inaugural 2025 report with substantially expanded coverage of autonomous AI agents, multi-agent systems, and the rapidly evolving landscape of AI-related cybersecurity threats. The Expert Advisory Panel includes representatives nominated by more than 30 countries and international organizations — from Australia and Brazil to the United Kingdom and the United Nations — making it the broadest consultation process ever undertaken on AI safety. The report’s structure revolves around three central questions: what can general-purpose AI do today, what emerging risks does it pose, and how can those risks be mitigated? For organizations navigating the complexities of AI deployment, understanding these findings is essential for responsible adoption. Our analysis of AI economic growth and productivity provides complementary context on the benefits side of the equation.

AI Safety Report 2026: What General-Purpose AI Can Do Today

The report’s first major section delivers a rigorous assessment of current AI capabilities, documenting the extraordinary pace at which general-purpose AI systems have advanced. Leading frontier models now demonstrate performance that surpasses human baselines on standardized tests spanning mathematics, science, law, and medicine. These systems can generate sophisticated text, images, code, and audio, translate between languages with near-native fluency, and engage in multi-turn reasoning tasks that were considered out of reach just two years ago.

Crucially, the AI Safety Report 2026 identifies a new class of capabilities emerging from the integration of AI models with external tools and autonomous agency. Modern AI systems are no longer limited to generating static outputs for human review. They can browse the internet, execute code, interact with APIs, manage files, and orchestrate complex multi-step workflows — transforming from passive assistants into active agents that operate in the real world. This capability expansion represents a fundamental shift in the risk landscape, as the consequences of AI failures now extend far beyond incorrect text generation to include unauthorized actions, data breaches, and cascading system failures.
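
The shift from passive output to active agency is easiest to see in code. The sketch below is a minimal tool-calling agent loop, assuming hypothetical stand-ins for the planning call (`plan_next_action`) and the tool registry (`TOOLS`) rather than any vendor's API; only the control flow is the point: the model chooses an action, the action runs in the world, and the result feeds the next decision.

```python
# Minimal sketch of a tool-using agent loop. The tools and the
# planning policy are hypothetical stand-ins; only the control flow
# is the point: choose an action, execute it, observe, repeat.

def search_web(query: str) -> str:
    return f"search results for {query!r}"          # placeholder tool

def run_code(source: str) -> str:
    return "execution output"                       # placeholder tool

TOOLS = {"search_web": search_web, "run_code": run_code}

def plan_next_action(history: list) -> dict:
    # Stand-in policy: search once, then answer. A real agent would
    # ask a language model to choose the next action here.
    if not any(step["role"] == "tool" for step in history):
        return {"tool": "search_web", "args": {"query": history[0]["content"]}}
    return {"final": "answer grounded in " + history[-1]["content"]}

def agent_loop(task: str, max_steps: int = 10) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):                      # bounded loop: a basic safeguard
        action = plan_next_action(history)
        if "final" in action:                       # the agent decides to stop
            return action["final"]
        observation = TOOLS[action["tool"]](**action["args"])
        history.append({"role": "tool", "content": observation})
    return "stopped: step budget exhausted"         # fail closed rather than run forever

print(agent_loop("summarize today's security advisories"))
```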

The report also documents the emergence of reasoning AI systems that employ chain-of-thought processes, self-verification mechanisms, and structured logical frameworks. These models represent a significant advance over earlier generative systems by reducing — though not eliminating — hallucination errors and improving reliability in high-stakes domains. However, the authors caution that improved benchmark performance does not automatically translate to real-world safety, a theme that pervades the entire report. Research from the National Bureau of Economic Research reinforces the finding that AI capabilities are advancing faster than safety frameworks can adapt.

Reliability Risks and the Hallucination Challenge

One of the most detailed sections of the AI Safety Report 2026 examines reliability failures in general-purpose AI systems. The report catalogs a comprehensive taxonomy of documented reliability issues, ranging from hallucinated legal citations in court briefs and fabricated fare policies for bereaved airline passengers to inaccurate medical information and outdated factual claims. These failures are not merely theoretical — they have occurred in deployed systems and have caused real-world harm to individuals and organizations relying on AI-generated outputs.

The report presents data from the SimpleQA Verified benchmark showing that factual accuracy has improved substantially from mid-2023 through early 2026, with leading models achieving accuracy rates exceeding 60 percent on verified factual questions. However, the authors emphasize a critical gap between benchmark performance and real-world reliability. In one study cited by the report, AI models provided potentially harmful answers to 19 percent of medical questions posed — a failure rate that is unacceptable in clinical settings where incorrect information could lead to misdiagnosis, inappropriate treatment, or denial of care.

Basic reasoning failures represent another persistent category of reliability risk. Despite sophisticated performance on complex tasks, general-purpose AI models still fail at elementary mathematical calculations and struggle to infer basic causal relationships. The report characterizes these failures as fundamentally unpredictable — users cannot reliably anticipate when a model will produce correct versus incorrect outputs, making it difficult to establish appropriate levels of trust and verification. This unpredictability is especially concerning as AI systems are deployed in financial risk management and other high-stakes domains where errors carry significant consequences.

Autonomous AI Agents and Novel Safety Concerns

Perhaps the most consequential new analysis in the AI Safety Report 2026 concerns the risks posed by autonomous AI agents. Unlike traditional AI systems that produce text, images, or recommendations for humans to review and act upon, AI agents can independently initiate actions, influence other humans or AI systems, and dynamically shape real-world outcomes. This expanded scope of influence introduces entirely new categories of risk that existing safety frameworks were not designed to address.

The report documents several specific agent failure modes that have already been observed in deployed systems. Privacy breaches have occurred when AI agents exposed users’ private images by sending them to third-party tools without authorization. Short-term working memory failures have led agents to lose track of their objectives mid-task, producing incomplete or incorrect results. In multi-agent systems, miscoordination failures have emerged when individual agents optimized for their own incentives at the expense of collective welfare goals, resulting in resource conflicts and suboptimal outcomes.

The fundamental concern with AI agents, as articulated by the report, is that their failures can cause direct harm with no opportunity for human intervention. When a chatbot hallucinates a citation, a human can verify it before acting. When an autonomous agent executes an unauthorized financial transaction or deletes critical files, the damage may be done before any human becomes aware of the problem. This asymmetry between agent capability and human oversight represents what the report calls one of the most urgent challenges in AI safety, requiring new approaches to monitoring, containment, and fail-safe design. The White House Blueprint for an AI Bill of Rights provides a complementary policy framework for addressing these concerns in the United States.
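
One containment pattern consistent with the report's call for fail-safe design is to gate irreversible actions behind explicit human approval, so the agent fails closed rather than acting first. A minimal sketch, in which the list of irreversible actions and the approval channel are illustrative assumptions:

```python
# Sketch of a human-approval gate: irreversible actions fail closed
# unless a person explicitly allows them. The action list and the
# console-based approval channel are illustrative assumptions.
IRREVERSIBLE = {"transfer_funds", "delete_file", "send_email"}

def approved_by_human(action: str, args: dict) -> bool:
    reply = input(f"Agent requests {action}({args}). Allow? [y/N] ")
    return reply.strip().lower() == "y"

def execute(action: str, args: dict, tools: dict) -> str:
    if action in IRREVERSIBLE and not approved_by_human(action, args):
        return f"{action} blocked pending human review"   # fail closed
    return tools[action](**args)                          # reversible or approved
```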

Cybersecurity Threats from Advanced AI Systems

The AI Safety Report 2026 devotes significant attention to the cybersecurity implications of advanced AI, characterizing the landscape as an intensifying measure-countermeasure race. On the offensive side, AI systems can automate vulnerability discovery, generate sophisticated phishing campaigns, adapt malware to evade detection, and orchestrate multi-stage attacks at speeds that overwhelm human defenders. The report documents cases where AI-generated phishing messages achieved significantly higher click-through rates than human-crafted equivalents, and where AI tools accelerated the process of identifying exploitable software vulnerabilities.

On the defensive side, AI offers transformative capabilities for threat detection, incident response, and network monitoring. AI-powered security systems can analyze vast volumes of network traffic in real time, identify anomalous patterns that signal emerging attacks, and automate response actions at machine speed. The report notes that AI-enhanced cyber defenses could eventually shift the balance toward defenders by addressing the fundamental scale, speed, and effectiveness challenges that currently give attackers a structural advantage.
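
As a concrete illustration of the defensive side, the sketch below flags anomalous network flows with an isolation forest. The flow features (bytes sent, duration, port entropy) and the model choice are illustrative assumptions, not the report's methodology:

```python
# Sketch of AI-assisted anomaly detection on network-flow features.
# The features and the isolation-forest model are illustrative
# choices for demonstration purposes.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(seed=0)
# Simulated baseline traffic: [bytes_sent_kb, duration_s, port_entropy]
baseline = rng.normal(loc=[500.0, 1.0, 2.0], scale=[100.0, 0.3, 0.5], size=(1000, 3))
detector = IsolationForest(contamination=0.01, random_state=0).fit(baseline)

candidates = np.array([
    [480.0, 0.9, 2.1],       # resembles baseline traffic
    [90000.0, 30.0, 7.5],    # exfiltration-like outlier
])
print(detector.predict(candidates))   # 1 = normal, -1 = flagged for review
```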

However, the report cautions that this defensive advantage is not yet realized and may take years to materialize. In the near term, AI is likely to benefit cyber offense more than defense, particularly because attackers can leverage AI to automate the most labor-intensive phases of an attack while defenders must protect against an expanding attack surface. The report calls for urgent investment in AI-powered cyber defense capabilities, international cooperation on threat intelligence sharing, and research into the resilience of critical infrastructure systems against AI-enhanced attacks. Studies from the Cybersecurity and Infrastructure Security Agency echo these recommendations.

AI Misuse Risks: Disinformation and Biological Threats

The report examines two categories of intentional misuse that pose particular concern: the use of AI for disinformation campaigns and the potential for AI to lower barriers to biological threats. In the disinformation domain, the AI Safety Report 2026 documents how generative AI systems can produce synthetic text, images, audio, and video that are increasingly difficult to distinguish from authentic content. This capability enables the creation of disinformation at unprecedented scale and sophistication, threatening democratic processes, public health communication, and institutional trust.

The biological risk analysis represents one of the report’s most sensitive sections. The authors present evidence that AI systems can provide information that could assist individuals seeking to develop biological agents, even when specific safeguards are in place. While the report is careful to note that AI does not yet enable someone without relevant expertise to create a viable biological weapon, it identifies a concerning trajectory in which improving AI capabilities could progressively lower the knowledge barriers to biological threats. This finding has significant implications for biosecurity policy and for the development of AI safety guardrails.

The report recommends a multi-layered approach to mitigating misuse risks, combining technical safeguards such as output filtering and watermarking with regulatory measures including mandatory disclosure requirements for synthetic content and enhanced biosecurity protocols for AI developers. The authors emphasize that no single mitigation strategy is sufficient — effective risk management requires defense in depth across technical, institutional, and regulatory layers.
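
That defense-in-depth principle can be made concrete as a pipeline of independent checks, any one of which can withhold an output. In the sketch below, each check is a hypothetical placeholder for a real safeguard (content filtering, a learned classifier, provenance marking):

```python
# Sketch of "defense in depth" for model outputs: independent layers,
# any one of which can withhold a response. Each check is a toy
# placeholder for a real safeguard.
from typing import Callable

def keyword_filter(text: str) -> bool:
    return "synthesis protocol" not in text.lower()   # toy blocklist rule

def safety_classifier(text: str) -> bool:
    return True   # stand-in for a learned misuse classifier

def mark_provenance(text: str) -> str:
    return text + " [AI-generated]"   # stand-in for watermarking/disclosure

LAYERS: list[Callable[[str], bool]] = [keyword_filter, safety_classifier]

def release(text: str) -> str:
    if all(layer(text) for layer in LAYERS):   # every layer must approve
        return mark_provenance(text)
    return "[response withheld by safety pipeline]"

print(release("Here is a summary of current public health guidance."))
```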

Systemic Risks from AI Concentration and Market Power

A distinctive contribution of the AI Safety Report 2026 is its analysis of systemic risks arising from the concentration of AI capabilities among a small number of large technology companies. The enormous costs of training frontier AI models — measured in hundreds of millions to billions of dollars — create natural barriers to entry that have resulted in an increasingly oligopolistic market structure. This concentration raises concerns about single points of failure, limited diversity in AI approaches, and the potential for dominant players to shape safety norms in their own interests rather than the public interest.

The report identifies several pathways through which concentration creates systemic risk. First, when critical infrastructure and public services depend on AI systems from a handful of providers, a single vulnerability or failure mode can cascade across entire sectors. Second, concentrated development reduces the diversity of approaches to AI safety, increasing the likelihood that all leading systems share common failure modes. Third, dominant players may have incentives to prioritize capability advancement over safety investment, particularly in competitive environments where being first to market confers significant advantages.

To address these risks, the report recommends policies to promote competition and access in AI development, including support for open-source AI initiatives, shared compute infrastructure for researchers and smaller organizations, and regulatory requirements for interoperability and data portability. The authors draw parallels to historical precedents in telecommunications and financial services, where concentration created systemic vulnerabilities that ultimately required regulatory intervention to address.

International AI Governance Frameworks and Policy

The governance section of the AI Safety Report 2026 represents its most directly policy-relevant contribution. The report argues that effective AI governance requires international cooperation because AI systems are developed and deployed across borders, and their risks — from disinformation to cybersecurity to biological threats — are inherently transnational in nature. No single country can effectively regulate AI in isolation, and inconsistent regulatory frameworks create opportunities for regulatory arbitrage that undermine safety objectives.

The report advocates for sector-specific regulatory approaches that account for the diverse contexts in which AI is deployed, rather than broad horizontal regulation that may be either too restrictive for low-risk applications or insufficiently protective for high-risk ones. In healthcare, this means integrating AI safety requirements into existing medical device regulatory frameworks. In financial services, it means extending risk management requirements to cover AI-specific failure modes. In critical infrastructure, it means establishing resilience standards that account for AI-enhanced threats.

The authors also call for the establishment of national AI safety institutes in countries that do not yet have them, and for enhanced coordination among existing institutes. These organizations play a critical role in conducting independent evaluations of AI systems, developing safety testing methodologies, and providing technical expertise to regulators. The report cites the UK AI Security Institute and its international counterparts as models for this approach, while noting that significant gaps remain in global coverage and coordination. The OECD AI Principles provide an established framework for international alignment on these governance objectives.

AI Safety Research: Evaluation Methods and Benchmarks

The AI Safety Report 2026 provides an extensive review of current methods for evaluating AI system safety, identifying both progress and significant gaps. Benchmark-based evaluation has been the primary tool for assessing AI capabilities and reliability, with assessments such as SimpleQA, MMLU, and domain-specific medical and legal benchmarks providing standardized measures of model performance. The report documents substantial improvements across these benchmarks, with leading models achieving scores that would place them among human experts in many domains.
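
At its core, benchmark-based evaluation is a simple measurement loop: pose questions with known answers, compare the model's responses, and report an accuracy rate. The toy sketch below shows the mechanics; the two-item dataset and the `ask_model` stub are hypothetical, and real benchmarks use far larger datasets and more sophisticated grading than exact string match:

```python
# Minimal sketch of benchmark-style accuracy measurement on factual QA.
# The dataset and `ask_model` are toy stand-ins; real benchmarks such
# as SimpleQA use thousands of items and more robust grading.
dataset = [
    {"question": "In what year was the transistor invented?", "answer": "1947"},
    {"question": "What is the chemical symbol for tungsten?", "answer": "W"},
]

def ask_model(question: str) -> str:
    return "1947" if "transistor" in question else "Tu"   # toy model, one error

correct = sum(
    ask_model(item["question"]).strip() == item["answer"] for item in dataset
)
print(f"factual accuracy: {correct / len(dataset):.0%}")   # 50% on this toy set
```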

However, the report identifies a fundamental limitation of benchmark-based evaluation: benchmarks measure performance on known, structured tasks, while real-world safety failures often occur in novel, unstructured situations that benchmarks do not capture. The authors call for the development of more robust evaluation methodologies that assess AI systems in realistic deployment contexts, including adversarial testing, red-team exercises, and continuous monitoring of deployed systems. They also recommend investment in interpretability research that could enable deeper understanding of why AI systems fail, rather than simply measuring how often they fail.

The report highlights the particular challenge of evaluating autonomous AI agents, for which existing benchmarks are largely inadequate. Agent safety cannot be assessed by measuring performance on isolated tasks; it requires evaluating behavior across extended sequences of actions in complex, dynamic environments where the consequences of failures accumulate over time. The development of agent-specific evaluation frameworks is identified as one of the most urgent priorities for AI safety research.
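
In its simplest form, such a framework would grade whole action sequences rather than single answers. The sketch below checks a recorded trajectory against safety predicates; the trajectory format and the predicates themselves are illustrative assumptions:

```python
# Sketch of trajectory-level agent evaluation: grade a whole sequence
# of actions against safety predicates instead of one final answer.
# The trajectory format and predicates are illustrative assumptions.
trajectory = [
    {"action": "search_web", "args": {"query": "quarterly report"}},
    {"action": "read_file", "args": {"path": "reports/q3.xlsx"}},
    {"action": "send_email", "args": {"to": "unknown@example.com"}},
]

ALLOWED_ACTIONS = {"search_web", "read_file", "send_email"}

def no_unapproved_egress(step: dict) -> bool:
    return step["action"] != "send_email"    # data leaving the system needs review

def stayed_in_scope(step: dict) -> bool:
    return step["action"] in ALLOWED_ACTIONS

PREDICATES = [no_unapproved_egress, stayed_in_scope]

violations = [
    (index, predicate.__name__)
    for index, step in enumerate(trajectory)
    for predicate in PREDICATES
    if not predicate(step)
]
print(violations)   # [(2, 'no_unapproved_egress')]
```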

Building a Global AI Safety Infrastructure for the Future

The concluding section of the AI Safety Report 2026 articulates a vision for a global AI safety infrastructure that can keep pace with the rapid advancement of AI capabilities. The report emphasizes that the current pace of AI development — with new frontier models released every few months and capabilities improving on timescales measured in weeks — creates an unprecedented challenge for safety governance. Traditional regulatory processes that operate on timescales of years are fundamentally mismatched with the speed of technological change.

To address this mismatch, the report proposes several institutional innovations. First, it calls for the creation of rapid evaluation mechanisms that can assess new AI systems within weeks of their release, rather than months or years. Second, it recommends the establishment of international information-sharing protocols that enable safety-relevant findings to be communicated quickly across borders. Third, it advocates for the development of automated safety testing tools that can scale with the pace of AI development, reducing the burden on human evaluators.

The report concludes with a call for sustained investment in AI safety research, arguing that the current level of investment is inadequate given the magnitude of the risks and the pace of capability advancement. The authors estimate that safety research receives less than two percent of total AI research investment globally — a ratio they characterize as dangerously insufficient. Increasing this investment, broadening the talent pipeline for AI safety research, and creating career incentives that attract top researchers to safety work are identified as essential prerequisites for managing the risks of increasingly capable AI systems. The scale and ambition of the International AI Safety Report 2026 itself demonstrates that global cooperation on AI safety is possible — the question is whether it can be sustained and expanded at the pace that the technology demands.

Frequently Asked Questions

What is the International AI Safety Report 2026?

The International AI Safety Report 2026 is the second edition of a comprehensive scientific review of AI capabilities and risks, chaired by Yoshua Bengio with contributions from over 90 authors and advisers from more than 30 countries. It represents the largest global collaboration on AI safety to date and is intended to inform policymakers, researchers, industry and civil society about the state of general-purpose AI systems.

What are the main AI risks identified in the safety report?

The report identifies several categories of AI risk including reliability failures such as hallucinations and factual errors, novel risks from autonomous AI agents that can take real-world actions without human review, cybersecurity vulnerabilities, potential misuse for disinformation and biological threats, and systemic risks from concentrating AI capabilities among a small number of large technology companies.

How does the AI safety report address AI agent risks?

The report highlights that AI agents pose novel reliability risks because they can independently take actions that affect the real world, unlike systems that simply produce text or images for human review. Agent failures could directly cause harm with no opportunity for human intervention, including privacy breaches, resource mismanagement in multi-agent systems, and short-term working memory failures.

What governance recommendations does the AI safety report propose?

The report recommends international cooperation on AI safety standards, sector-specific regulatory approaches that balance innovation with protection, investment in AI safety research infrastructure, mechanisms to ensure broad access to AI tools beyond large technology companies, and frameworks for evaluating and mitigating risks from increasingly autonomous AI systems.

How has AI factual accuracy improved according to the report?

According to the report’s analysis of the SimpleQA Verified benchmark, general-purpose AI models have shown significant improvements in factual accuracy from mid-2023 through early 2026. However, real-world performance still reveals limitations that benchmarks miss, with studies showing models providing potentially harmful answers to 19 percent of medical questions posed.

Who contributed to the International AI Safety Report 2026?

The report was chaired by Professor Yoshua Bengio of Université de Montréal and Mila Quebec AI Institute. Contributors include researchers from Stanford, Oxford, Harvard, Princeton, MIT, Carnegie Mellon, and many other institutions. Senior advisers include Geoffrey Hinton, Stuart Russell, Daron Acemoglu, and Bernhard Schölkopf, with government representatives from over 30 countries on the Expert Advisory Panel.
