Agentic AI and Cybersecurity Survey: Challenges, Opportunities and Use Cases
Table of Contents
- The Next Phase of AI in Cybersecurity
- What Is Agentic AI? Architectures and Autonomy Levels
- Agentic AI for Defensive Cybersecurity Operations
- Offensive Capabilities: How Agentic AI Empowers Attackers
- System-Level Attack Surface and Vulnerability Taxonomy
- Layered Mitigations and Defensive Architectures
- Multi-Agent Governance and Information Flow Control
- Evaluation, Simulation and Red-Teaming Approaches
- Governance, Assurance and Policy Implications
- Research Agenda and the Path Forward
📌 Key Takeaways
- $146.5 Billion Market by 2034: Global AI-in-cybersecurity spending is projected to grow from $24.8B in 2024 to $146.5B by 2034, driven by a workforce shortage of nearly 4 million professionals.
- Dual-Use Paradigm Shift: The same agentic capabilities that automate SOC operations — planning, tool use, memory — equally empower attackers to compress multi-day intrusions into minutes.
- 25-Minute Ransomware Cycles: Agentic ransomware frameworks can complete full attack lifecycles in roughly 25 minutes, with mean exfiltration times dropping from 9 days to under 2 days.
- 75% Unsandboxed Execution Risk: Experiments show over 75% of malicious commands execute successfully without sandboxing, while container-based isolation blocks nearly all such attempts.
- Augmentation-First Adoption: The safest operational posture is human-on-the-loop augmentation with enforced escalation boundaries and immutable logging until robust assurance frameworks mature.
The Next Phase of AI in Cybersecurity
Agentic AI represents a fundamental departure from the prompt-driven generative models that have dominated headlines since 2023. Where traditional large language models respond to individual queries, agentic systems combine persistent memory, multi-step planning, tool orchestration, and autonomous decision-making to execute complex workflows over extended time horizons. In cybersecurity, this transition carries profound implications for both defenders and attackers.
The survey by Lazer, Aryal, Gupta, and Bertino at Purdue University provides one of the most comprehensive examinations of agentic AI’s role in cybersecurity to date. Published in January 2026, it systematically maps the architectures, defensive applications, offensive misuse vectors, system-level vulnerabilities, and governance challenges that define this rapidly evolving domain. The stakes are substantial: global AI-in-cybersecurity spending is projected to grow from $24.8 billion in 2024 to $146.5 billion by 2034, fueled by a persistent workforce shortage approaching 4 million cybersecurity professionals worldwide.
This analysis examines the paper’s core findings and their practical implications for security leaders, policy makers, and technology architects navigating the transition from reactive AI tools to persistent, goal-directed autonomous agents. The dual-use nature of these capabilities — simultaneously enhancing defensive postures while dramatically amplifying offensive potential — demands urgent attention from organizations at every scale.
What Is Agentic AI? Architectures and Autonomy Levels
At its core, agentic AI combines a foundation model with a planner-executor loop, tool APIs, and layered memory systems. The architecture follows a recurring cycle: plan, act, observe, and reflect. This enables agents to decompose complex objectives into subtasks, execute them through external tools (databases, APIs, security platforms), evaluate results, and adapt their strategies — all without returning to a human operator for each decision.
The survey identifies a five-level autonomy spectrum that provides a useful framework for understanding deployment maturity. Level 0 represents static inference — single-shot model predictions with no persistence. Level 1 introduces assistive capabilities where models provide recommendations but humans execute all actions. Level 2 adds tool-assisted automation where agents can invoke external functions under supervision. Level 3 represents semi-agentic operation with adaptive planning and limited autonomous action. Level 4 describes fully agentic systems with continuous learning, self-improvement, and autonomous decision-making across extended time horizons.
Memory architectures deserve particular attention. Agents employ short-term working memory for immediate task context, episodic memory for session-specific recall, and long-term vector stores for persistent knowledge accumulation. This layered memory approach creates unprecedented capabilities — but also unique attack surfaces. Memory poisoning, where adversaries inject misleading data into an agent’s long-term stores, represents a threat category that has no direct analog in traditional cybersecurity systems.
Architecturally, the paper distinguishes between single-agent and multi-agent systems. Multi-agent configurations use a coordinating agent that decomposes tasks and delegates to specialist subagents, each with scoped responsibilities and tools. This modular approach improves scalability and containment — if one specialist agent is compromised, the blast radius remains bounded — but introduces coordination overhead and new inter-agent communication risks.
Agentic AI for Defensive Cybersecurity Operations
The defensive applications of agentic AI span the entire breach lifecycle: pre-incident preparation, active incident response, and post-incident analysis. For security operations centers (SOCs) drowning in alert fatigue, agentic systems offer a transformative shift from manual triage to automated investigation workflows.
In the pre-incident phase, agentic AI enables autonomous vulnerability scanning, proactive threat hunting across telemetry sources, and automated cyber range generation for testing defensive postures. Agents can continuously monitor attack surfaces, correlate indicators across multiple data sources, and identify emerging threat patterns that would overwhelm human analysts operating at scale.
During active incidents, multi-agent architectures demonstrate their greatest value. A coordinator agent can simultaneously dispatch specialist subagents for network forensics, endpoint analysis, identity investigation, and containment action planning. Commercial implementations like Microsoft Security Copilot, Exabeam Copilot, and CrowdStrike Charlotte AI illustrate how the industry is already deploying these capabilities — primarily as augmentation tools that accelerate human analysts rather than fully autonomous responders.
The post-incident phase benefits from agents’ ability to conduct comprehensive root cause analysis, generate detailed incident reports, and recommend adaptive policy updates based on lessons learned. Because agents maintain persistent memory across investigations, they can identify patterns across incidents that siloed, session-based tools would miss. However, the paper emphasizes that irreversible actions — account revocations, network segmentation changes, system isolations — typically remain gated behind human escalation protocols even in the most advanced deployments.
Transform complex cybersecurity research into interactive video experiences your team will actually engage with.
Offensive Capabilities: How Agentic AI Empowers Attackers
The offensive implications of agentic AI are stark and demand immediate attention from defenders. The same properties that make agents valuable for SOC automation — planning, tool orchestration, persistent memory, and adaptive reasoning — materially empower attackers to compress multi-day intrusion campaigns into minutes or hours.
The survey cites alarming data from Unit 42’s analysis of agentic ransomware: automated ransomware frameworks can complete the full attack lifecycle — from initial access through lateral movement, privilege escalation, data exfiltration, and encryption — in roughly 25 minutes. Mean time to exfiltrate has dropped from approximately nine days in 2021 to under two days in 2024, with many incidents completing data theft within a single hour. These acceleration trends are expected to intensify as agentic capabilities mature.
Beyond ransomware, agentic systems enable scaled reconnaissance that maps organizational attack surfaces with unprecedented thoroughness. Agents can autonomously probe networks, identify vulnerable services, test credentials, and chain exploitation paths — all while adapting their approach based on observed defensive measures. The paper describes “ScamAgents” and similar demonstrations showing that agents can sustain multi-turn social engineering conversations, craft adaptive phishing campaigns, and generate convincing deepfake content that circumvents content-level guardrails.
The insider threat vector deserves special attention. Compromised agents operating under valid credentials within an organization’s security stack represent an especially dangerous scenario. Because these agents hold legitimate access to monitoring tools, log systems, and security controls, a subverted agent could effect stealthy data tampering or exfiltration while actively masking its activities from human supervisors and peer monitoring systems.
System-Level Attack Surface and Vulnerability Taxonomy
The survey proposes a four-layer security model — perception, reasoning, action, and memory — as an organizing framework for understanding agentic AI’s attack surface. Each layer presents distinct threat categories that require layer-specific defenses and cross-layer monitoring.
At the perception layer, prompt injection remains the most widely discussed vulnerability. Adversaries can embed malicious instructions within data sources that agents consume, redirecting planning logic, triggering unauthorized tool calls, or exfiltrating sensitive information through crafted responses. The paper distinguishes between direct prompt injection (targeting agent inputs) and indirect injection (poisoning external data sources that agents retrieve during operation).
The reasoning layer faces threats from model hallucination, adversarial reasoning manipulation, and goal drift. When agents make multi-step decisions based on intermediate reasoning, corrupting any single step can cascade through the entire decision chain. NIST’s adversarial ML taxonomy provides useful categorization, but the multi-step, persistent nature of agentic reasoning introduces compounding effects that single-inference threat models do not capture.
The action layer encompasses execution vulnerabilities — perhaps the most immediately dangerous category. The paper cites experiments by He et al. demonstrating that over 75% of malicious Bash commands generated by agents executed successfully in unsandboxed environments. Container-based sandboxing blocked nearly all such attempts in controlled tests, but many production deployments still lack adequate execution isolation. Agent-generated SQL queries, API calls, and system commands each present distinct injection and privilege escalation risks.
Memory vulnerabilities represent a novel attack surface unique to agentic systems. Long-term memory poisoning — where adversaries inject misleading data into an agent’s persistent knowledge store — can corrupt future decisions across sessions without triggering immediate alerts. Session hijacking, cross-agent data leakage through shared memory spaces, and fine-tuning poisoning through compromised training data pipelines round out this increasingly complex threat landscape.
The multi-agent dimension adds further complexity. The paper documents scenarios where compromised agents can establish covert communication channels with peer agents, enabling coordinated malicious behavior that evades single-agent monitoring. Steganographic data encoding, shared memory manipulation, and emergent collusive patterns all represent active research areas where defensive capabilities lag offensive innovation.
Layered Mitigations and Defensive Architectures
Addressing the multi-layered attack surface requires composed defenses that operate across all four layers simultaneously. The survey maps several defensive frameworks to specific threat categories, providing practical guidance for security architects.
At the perception layer, effective mitigations include strict input validation and content sanitization, retrieval poisoning detection for RAG-based agents, and prompt hardening techniques that limit an agent’s susceptibility to injection attacks. These controls reduce the likelihood of adversarial instructions reaching the reasoning layer but cannot eliminate the risk entirely — making downstream defenses essential.
Reasoning-layer defenses include sandboxed reasoning environments, debate and adjudication mechanisms where multiple agents challenge each other’s conclusions, and model-challenge architectures that introduce adversarial review before consequential decisions. The paper highlights that debate-based classification systems show measurable improvement in reducing hallucination and manipulation, though they introduce latency and computational overhead that must be balanced against operational speed requirements.
Action-layer controls center on the principle of least privilege: scoped API tokens, granular credential management, execution sandboxes for code generation, and intent-bound delegation tokens (conceptually similar to JWTs that bind an agent’s authority to specific action types). The paper advocates for cryptographically binding an agent’s permitted actions to its declared intent, creating an audit trail that links every external action to an authorized objective.
Memory-layer defenses encompass encryption at rest and in transit, data provenance tracking, tamper detection for persistent stores, and selective retention policies with auditability requirements. The survey references proposals for blockchain-backed immutable audit logs (BlockA2A) and information flow control labels (SAFEFLOW) that restrict how data propagates between agents — addressing the cross-agent contamination risk that flat memory architectures expose.
Turn this 36-page cybersecurity survey into an interactive experience your stakeholders will actually complete.
Multi-Agent Governance and Information Flow Control
Multi-agent systems amplify governance complexity beyond what single-agent frameworks address. When agents share memory, tokens, or tool access, systemic risks emerge that are not captured by validating each agent independently. The survey identifies several critical governance patterns that organizations deploying multi-agent security systems must implement.
Information flow control between agents requires explicit labeling and enforcement. The SAFEFLOW framework referenced in the survey assigns sensitivity labels to data objects and enforces propagation rules as data moves between agents. Without such controls, a specialist agent with access to classified threat intelligence could inadvertently share sensitive indicators with a reporting agent that posts to external dashboards — creating a data leakage path that no individual agent’s access controls would prevent.
Collusion detection represents a particularly challenging governance requirement. The paper describes graph anomaly detection approaches that monitor inter-agent communication patterns for coordinated behavior that deviates from expected workflows. Practical implementations include restricted communication channels between agents (preventing arbitrary peer-to-peer messaging), mandatory logging of all inter-agent data exchanges, and dynamic revocation mechanisms that can isolate misbehaving agents without disrupting the broader system.
The MAESTRO framework provides a comprehensive mapping from threats (embedding poisoning, collusion, model theft) to controls across models, data flows, orchestration layers, infrastructure, and governance processes. For production deployments, the survey recommends combining MAESTRO’s threat-control mapping with TRiSM (trust, risk, and security management) framing to ensure that governance scales with system complexity.
Evaluation, Simulation and Red-Teaming Approaches
Assuring the safety and effectiveness of agentic AI systems requires evaluation infrastructure that goes well beyond traditional penetration testing or model benchmarking. The survey catalogs several emerging approaches, each with distinct strengths and limitations.
Automated cyber range generation, exemplified by the ARCeR framework, creates synthetic network environments for agent testing at scale. These ranges allow defenders to evaluate agent behavior under diverse attack scenarios without risking production systems. The Cyberwheel framework adds higher fidelity by combining graph-based simulation with real infrastructure emulation, enabling assessment of agent performance closer to operational conditions.
CTF-based (Capture the Flag) competitions adapted for agentic systems — such as OWASP FinBot CTF and CSAW Agentic CTF — provide standardized evaluation benchmarks and adversarial testing frameworks. Red-teaming platforms like RedTeamLLM and PenTestGPT offer dedicated adversarial assessment tools designed specifically for agentic architectures.
However, the survey identifies critical gaps in existing evaluation methods. Most testbeds assume static defensive configurations and simplified adversary models, producing results that may not transfer to dynamic production environments. Key deficiencies include limited coverage of long-horizon agent adaptation (where behavior evolves over extended operational periods), inadequate modeling of human-in-the-loop delays that affect real incident response, and insufficient attention to emergent multi-agent dynamics that only manifest at scale.
The sim-to-real transfer problem — ensuring that agent performance in simulated environments predicts real-world behavior — remains a fundamental research gap. The paper recommends staged validation approaches: begin with synthetic ranges for early capability testing, progress to controlled emulation environments with realistic traffic and adversary behavior, and conduct gated live exercises before granting agents execution authority in production systems.
Governance, Assurance and Policy Implications
Existing governance models are fundamentally inadequate for agentic AI because they assume human-supervised, single-shot model interactions. Agentic systems are persistent, autonomous, and increasingly operate across organizational boundaries — requiring entirely new assurance frameworks.
The policy implications span multiple dimensions. Certification frameworks for agentic systems must move beyond static model evaluations to assess runtime behavior, containment guarantees, and failure modes under adversarial conditions. Mandatory logging and data provenance requirements must extend to all agent actions, inter-agent communications, and memory modifications — not merely model inputs and outputs.
Cross-jurisdictional considerations add further complexity. When agentic security systems operate across organizational boundaries — sharing threat intelligence, coordinating responses, or accessing shared infrastructure — questions of liability, data sovereignty, and accountability become acute. The paper calls for industry standards that bind an agent’s operational authority to declared intent, creating verifiable chains of accountability from high-level objectives through individual actions.
For practitioners, the survey recommends an augmentation-first posture: deploy agents as powerful assistants that accelerate human decision-making rather than as fully autonomous operators. This approach preserves human oversight for high-impact decisions while capturing the speed and scale benefits of agentic automation. Enforced escalation boundaries, immutable audit logs, and least-privilege credential scoping form the operational foundation. As assurance frameworks mature and evaluation methods improve, the boundary of autonomous authority can be cautiously expanded — but the paper explicitly warns against premature full autonomy given current gaps in containment guarantees and governance infrastructure.
Research Agenda and the Path Forward
The survey concludes with a research agenda that identifies the most pressing gaps between current capabilities and the assurances required for safe, scaled deployment of agentic AI in cybersecurity.
Secure coordination primitives for multi-agent systems top the priority list. Current agent communication protocols lack formal security guarantees, making them vulnerable to eavesdropping, manipulation, and collusion. Research into cryptographically secured inter-agent messaging, verifiable delegation chains, and provable containment properties for delegated actions would address foundational trust requirements.
Robust sim-to-real evaluation pipelines represent the second critical need. The paper argues that the cybersecurity community needs standardized benchmarks that capture adversarial adaptation, long-horizon behavior evolution, multi-agent emergent dynamics, and realistic human oversight constraints. Without such benchmarks, comparing defensive agentic systems across vendors or deployment configurations remains impractical.
Memory security stands as perhaps the most novel research direction. Detecting and preventing memory poisoning across persistent knowledge stores, ensuring data integrity across agent sessions, and maintaining provenance for knowledge accumulated over time all require new techniques that current machine learning security research has barely begun to address.
Formal assurance frameworks that link autonomy levels to required safeguards would provide the governance infrastructure that regulators, insurers, and enterprise risk managers need to make informed deployment decisions. The paper envisions a maturity model where increased autonomous authority requires proportionally stronger containment guarantees, evaluation evidence, and monitoring infrastructure — a “trust but verify” approach calibrated to the capabilities and risks of each deployment tier.
The trajectory is clear: agentic AI will fundamentally reshape the cybersecurity landscape within the next three to five years. Organizations that build governance, evaluation, and containment capabilities now will be positioned to capture the defensive advantages while managing the substantial risks. Those that deploy hastily without adequate safeguards may find that their autonomous defenders become their most dangerous attack surface. The research community’s challenge is to close the gap between capability and assurance before the consequences of that gap become irreversible.
Make cybersecurity research accessible to every stakeholder — from the C-suite to the SOC floor.
Frequently Asked Questions
What is agentic AI in cybersecurity?
Agentic AI in cybersecurity refers to autonomous AI systems that combine planning, tool use, memory, and multi-step reasoning to perform complex security tasks such as threat detection, incident response, and vulnerability assessment without continuous human supervision. Unlike single-shot generative AI, agentic systems persist across sessions and adapt their strategies based on evolving conditions.
How does agentic AI differ from traditional AI in security operations?
Traditional AI in security typically performs single-step classification or detection tasks. Agentic AI adds planning loops, long-term memory, tool orchestration, and autonomous decision-making, enabling multi-turn investigations, automated triage, and coordinated incident response across multiple systems simultaneously.
What are the main cybersecurity risks of agentic AI systems?
Key risks include prompt injection attacks that redirect agent behavior, memory poisoning that corrupts long-term knowledge stores, multi-agent collusion where compromised agents coordinate malicious actions, execution vulnerabilities from unsandboxed code generation, and supply chain attacks targeting agent tool integrations. Studies show over 75% of malicious commands execute successfully without proper sandboxing.
Can agentic AI be used for offensive cyber attacks?
Yes, agentic AI significantly amplifies offensive capabilities. Automated ransomware campaigns can complete full attack cycles in roughly 25 minutes, and mean exfiltration time has dropped from nine days to under two days. Agents can automate reconnaissance, craft adaptive phishing campaigns, and scale social engineering attacks beyond what human operators could achieve alone.
What governance frameworks exist for agentic AI in cybersecurity?
Current frameworks include MAESTRO for mapping threats across agent pipelines, TRiSM for trust and risk management, and layered defense models addressing perception, reasoning, action, and memory layers. Practical governance measures include mandatory sandboxing, immutable audit logs, intent-binding delegation tokens, least-privilege credential scoping, and human-on-the-loop escalation gates for high-impact actions.