Publication · Agent Framework Security

Agent Frameworks Need a Runtime Security Partner

LangGraph, CrewAI, PydanticAI, OpenAI Agents SDK, Semantic Kernel, LlamaIndex, Haystack, Smolagents and Atomic Agents are accelerating agent adoption. But frameworks are not complete security control planes. AgenticDome is designed to protect what happens at runtime.

AgenticDome Research · 2026 · Approx. 8 minute read

The framework layer is where agentic AI becomes real

Enterprise agent adoption is not happening through models alone. It is happening through frameworks: graph runtimes, crews, typed agents, RAG-native pipelines, tool-calling SDKs, workflow engines, and multi-agent orchestration layers.

LangGraph

Stateful graph execution, checkpoints, nodes, edges and reviewable topology.

Graph · Loop risk · Tool control

CrewAI

Role-based collaboration, crews, flows, delegation and persistent memory.

Crew · Confused deputy · Memory

PydanticAI

Typed agents, schema validation, tool approval and structured outputs.

Typed · Schema · Purpose gap

OpenAI Agents SDK

Agents, handoffs, guardrails, hosted tools, tracing and sandboxed execution.

Handoffs · Tracing · Tool misuse

LlamaIndex / Haystack

RAG-native workflows, query tools, document stores and vector retrieval.

RAG · Poisoning · Observability

These frameworks are critical because they turn language model reasoning into operational behavior. They decide how agents plan, call tools, store memory, delegate tasks, resume state, process outputs, and coordinate with other agents.

But this also means frameworks are where many of the most important security decisions now occur. The model may generate the next action, but the framework decides how that action enters the system.

The framework is not just developer infrastructure. It is the runtime path between agent reasoning and enterprise action.

Runtime security map

Frameworks orchestrate agents. AgenticDome secures the interaction path.

Framework-native strengths · Common runtime gaps · AgenticDome control layer

Graph / State Runtime

Topology, checkpoints, conditional edges, state transitions.

Role-Based Crews

Managers, workers, specialists, delegation and handoffs.

Typed Tool Calls

Schema-validated inputs and outputs with structured contracts.

RAG / Memory

Vector stores, long-term memory, retrieval and context reuse.

Code / Hosted Tools

Python execution, API calls, MCP tools and external actions.

Runtime Action Path

Inbound prompt screening

Detect jailbreaks, injection, PII and instruction override before execution starts.

A2A delegation control

Validate source, target, role, privilege gap and delegation authority.

Direct tool authorization

Check intent, tool arguments, target sensitivity and destructive action risk.

Output and memory sanitization

Detect exfiltration, RAG poisoning, memory poisoning and hidden instructions.

Trust-score revalidation

Recheck policy, role, session chain and fanout before downstream decisions.

AgenticDome is the runtime mesh that sits beside the framework.

Developers keep their chosen framework. Security teams gain a policy-aware control plane for actions, delegation, memory, tool execution and output reuse.

What different frameworks do well

Modern agent frameworks are not all the same. Each architecture family creates different strengths and different security gaps.

LangGraph and graph-based runtimes

LangGraph-style systems provide explicit state, nodes, edges, checkpoints, and reviewable control flow. This is valuable for determinism and auditability. Security teams can inspect topology, enforce recursion limits, and reason about where decisions occur.

The gap is that graph correctness does not automatically equal action safety. A graph can be well-formed while a node still calls a tool with unsafe arguments, accepts poisoned context, or passes compromised output downstream.
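
To make the distinction concrete, here is a minimal sketch of an argument-level check placed inside an otherwise well-formed graph node. It assumes a hypothetical `delete_file` tool and an illustrative list of sensitive paths. None of these names come from LangGraph or AgenticDome; the node is a plain function standing in for a real graph node.

```python
import posixpath
from dataclasses import dataclass

# Hypothetical policy data: these prefixes are illustrative assumptions.
SENSITIVE_PREFIXES = ("/etc", "/var/lib/secrets")


@dataclass(frozen=True)
class ToolCall:
    tool: str
    args: dict


def is_sensitive(path: str) -> bool:
    """True when the normalized path is, or sits under, a sensitive prefix."""
    normalized = posixpath.normpath(path)
    return any(
        normalized == prefix or normalized.startswith(prefix + "/")
        for prefix in SENSITIVE_PREFIXES
    )


def check_tool_call(call: ToolCall) -> tuple[bool, str]:
    """Return (allowed, reason) for a tool call a node is about to make."""
    if call.tool == "delete_file" and is_sensitive(str(call.args.get("path", ""))):
        return False, "destructive action on a sensitive path"
    return True, "ok"


def cleanup_node(state: dict) -> dict:
    """A structurally valid node that still checks its tool arguments."""
    call = ToolCall(tool="delete_file", args={"path": state["target_path"]})
    allowed, reason = check_tool_call(call)
    # Record the decision in state instead of executing the tool blindly.
    return {**state, "allowed": allowed, "reason": reason}
```

The graph topology is identical in the allowed and denied cases. Only the arguments differ, which is why the check has to run at the point of action rather than at graph-definition time.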

CrewAI, AutoGen and role-based collaboration

Role-based systems make multi-agent delegation natural. Agents with different goals and tools can collaborate, hand off work, and specialize.

The gap is confused-deputy risk. A lower-trust agent can influence a higher-privilege agent, and the system may treat inter-agent messages as trusted simply because they came from inside the workflow.
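
A delegation check for this case might look like the following sketch. The role names, privilege levels, and allow-list are hypothetical assumptions for illustration, not part of CrewAI, AutoGen, or AgenticDome. The point is only that a message from inside the workflow is not trusted by default.

```python
from dataclasses import dataclass

# Hypothetical privilege levels; a real deployment would load these from policy.
PRIVILEGE = {"researcher": 1, "analyst": 2, "finance_admin": 3}

# Hypothetical allow-list of (source role, target role) delegation edges.
ALLOWED_DELEGATIONS = {("analyst", "researcher"), ("finance_admin", "analyst")}


@dataclass(frozen=True)
class Delegation:
    source_role: str
    target_role: str
    task: str


def authorize_delegation(d: Delegation) -> tuple[bool, str]:
    """Allow a delegation only when the edge is explicitly authorized."""
    if d.source_role not in PRIVILEGE or d.target_role not in PRIVILEGE:
        return False, "unknown role"
    if (d.source_role, d.target_role) in ALLOWED_DELEGATIONS:
        return True, "ok"
    # Positive gap: a lower-privilege agent is trying to steer a higher one.
    gap = PRIVILEGE[d.target_role] - PRIVILEGE[d.source_role]
    if gap > 0:
        return False, f"upward privilege gap of {gap} without delegation authority"
    return False, "delegation edge not in allow-list"
```

A researcher agent asking a finance administrator agent to issue a refund is rejected here even though both agents belong to the same crew.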

PydanticAI, Atomic Agents and typed tool calls

Typed frameworks improve structure. They validate inputs and outputs, reduce malformed calls, and help developers define stronger contracts.

The gap is that schema validity is not the same as business legitimacy. A SQL command, refund request, API call, or memory write can be perfectly valid structurally and still violate policy.
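
The sketch below separates the two checks. It uses a standard-library dataclass as a stand-in for a typed framework's schema validation. The `RefundRequest` type, the role names, and the refund limits are hypothetical assumptions.

```python
from dataclasses import dataclass

# Hypothetical per-role refund limits, in cents.
REFUND_LIMIT_CENTS = {"support_agent": 5_000, "support_lead": 50_000}


@dataclass(frozen=True)
class RefundRequest:
    order_id: str
    amount_cents: int
    requested_by_role: str

    def __post_init__(self) -> None:
        # Structural validation only: the kind of check a schema layer provides.
        if not self.order_id:
            raise ValueError("order_id must be non-empty")
        if not isinstance(self.amount_cents, int) or self.amount_cents <= 0:
            raise ValueError("amount_cents must be a positive integer")


def is_legitimate(req: RefundRequest) -> tuple[bool, str]:
    """Business-policy check that runs after schema validation has passed."""
    limit = REFUND_LIMIT_CENTS.get(req.requested_by_role)
    if limit is None:
        return False, "role may not issue refunds"
    if req.amount_cents > limit:
        return False, "amount exceeds role limit"
    return True, "ok"
```

A request for 900,000 cents from a `support_agent` constructs without error, so it is schema-valid, yet `is_legitimate` rejects it.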

OpenAI Agents SDK and handoff-native runtimes

Agent SDKs with handoffs, guardrails, tracing, hosted tools, and sandboxed execution are moving toward production readiness.

The gap is that handoff authority, tool intent, cross-agent trust, and downstream output reuse still require policy context beyond a single guardrail check.

LlamaIndex, Haystack and RAG-native pipelines

RAG-native stacks connect agents to document stores, vector indexes, query engines, retrieval workflows, and enterprise knowledge.

The gap is retrieval trust. RAG output can be poisoned, stale, cross-tenant, or instruction-bearing. Once retrieved, it may influence an agent’s next action or memory write.
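
As a rough illustration, retrieved chunks can be screened before they reach the agent's context. The patterns below are a deliberately naive heuristic and the `Chunk` type is hypothetical. A production detector would need far more than a few regular expressions, and nothing here reflects a LlamaIndex, Haystack, or AgenticDome API.

```python
import re
from dataclasses import dataclass

# Naive, illustrative patterns for instruction-bearing text (an assumption).
INSTRUCTION_PATTERNS = [
    re.compile(
        r"ignore\s+(?:all\s+|any\s+|the\s+)?(?:previous|prior|above)\s+instructions",
        re.IGNORECASE,
    ),
    re.compile(r"\bsystem prompt\b", re.IGNORECASE),
]


@dataclass(frozen=True)
class Chunk:
    text: str
    tenant_id: str


def screen_chunks(chunks: list[Chunk], tenant_id: str):
    """Split retrieved chunks into accepted chunks and (chunk, reason) rejects."""
    accepted, rejected = [], []
    for chunk in chunks:
        if chunk.tenant_id != tenant_id:
            rejected.append((chunk, "cross-tenant"))
        elif any(p.search(chunk.text) for p in INSTRUCTION_PATTERNS):
            rejected.append((chunk, "instruction-bearing"))
        else:
            accepted.append(chunk)
    return accepted, rejected
```

Only the accepted chunks would be passed on to prompt construction or memory writes. The rejects are kept with a reason so they can be logged or reviewed.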

Smolagents and code-as-action systems

Code-first agents can be powerful because the agent writes executable logic. But that also raises the stakes. Sandboxing is mandatory, and output must be treated as potentially dangerous.

The gap is not only whether code is isolated. It is whether the generated code should have been attempted in the first place.
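
One way to ask that question before execution is a static pre-check on the generated source. The sketch below uses Python's `ast` module. The blocked module and call lists are illustrative assumptions, and a check like this is easy to bypass, so it complements sandboxing rather than replacing it.

```python
import ast

# Illustrative deny-lists (assumptions); real policy would be context-specific.
BLOCKED_MODULES = {"os", "subprocess", "shutil", "socket"}
BLOCKED_CALLS = {"eval", "exec", "__import__", "open"}


def precheck_generated_code(source: str) -> list[str]:
    """Return policy findings for generated code; an empty list means none."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    findings.append(f"import of {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                findings.append(f"import from {node.module}")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only direct calls by bare name are caught by this sketch.
            if node.func.id in BLOCKED_CALLS:
                findings.append(f"call to {node.func.id}()")
    return findings
```

Code that produces findings would be denied or routed for review before it ever reaches the sandbox.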

The common security gap

Across these frameworks, a consistent pattern emerges. Native framework controls are useful, but they are usually local to the framework’s abstraction: schema validation, callback hooks, HITL events, tracing, sandboxing, or tool registration.

What is missing is a cross-framework runtime decision layer that evaluates the full interaction: source, target, role, intent, tool, arguments, memory, output, trust score, and policy.
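
As a sketch of what such a decision layer could evaluate, the example below gathers several of those fields into one record and returns a verdict. The `Interaction` and `Policy` types, the thresholds, and the three-way verdict are hypothetical assumptions for illustration and do not describe AgenticDome's actual interface.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    REVIEW = "review"
    DENY = "deny"


@dataclass(frozen=True)
class Interaction:
    source: str
    target: str
    role: str
    intent: str
    tool: str
    arguments: dict = field(default_factory=dict)
    trust_score: float = 1.0


@dataclass(frozen=True)
class Policy:
    allowed_tools_by_role: dict
    min_trust_allow: float = 0.7   # hypothetical threshold
    min_trust_review: float = 0.4  # hypothetical threshold


def decide(interaction: Interaction, policy: Policy) -> Verdict:
    """Evaluate one interaction against policy, independent of framework."""
    allowed_tools = policy.allowed_tools_by_role.get(interaction.role, set())
    if interaction.tool not in allowed_tools:
        return Verdict.DENY
    if interaction.trust_score < policy.min_trust_review:
        return Verdict.DENY
    if interaction.trust_score < policy.min_trust_allow:
        return Verdict.REVIEW
    return Verdict.ALLOW
```

Because the record carries no framework-specific types, the same `decide` call could sit behind a graph node, a crew handoff, or a typed tool call.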

Capability map: framework native controls vs runtime enforcement gaps

Frameworks provide useful primitives. AgenticDome complements them by enforcing action integrity at runtime.

| Capability | Graph | Crew | Typed | RAG | AgenticDome |
| --- | --- | --- | --- | --- | --- |
| State / topology visibility | Native | Partial | Optional | Workflow | Uses as context |
| Delegation authority | Custom | Role only | Gap | Gap | Core control |
| Tool argument safety | Node ACL | Callback | Schema | Tool registration | Intent + scope |
| Memory / RAG poisoning | Checkpoint | Memory risk | Gap | Retrieval | Detect + block |
| Output sanitization before reuse | Gap | Gap | Typed output | Pipeline | Core control |
| Action integrity decision | Custom | Custom | Purpose gap | Purpose gap | Primary layer |

| Framework Family | Native Strength | Common Gap | AgenticDome Layer |
| --- | --- | --- | --- |
| Graph / state-machine | Explicit topology, checkpoints, deterministic flow | Unsafe tool arguments or poisoned node outputs | Tool authorization and output sanitization |
| Role-based crews | Delegation, specialist collaboration | Confused deputy and privilege mismatch | A2A delegation authorization and role validation |
| Typed agents | Schemas, validated tool calls | Valid structure but unsafe purpose | Intent, purpose, and action-integrity enforcement |
| RAG pipelines | Retrieval, knowledge access | Poisoned or instruction-bearing retrieved context | RAG output inspection and memory write validation |
| Code-as-action | Automation, flexible execution | Unexpected code execution or destructive behavior | Pre-execution policy check and sandbox-aware controls |

Why AgenticDome is a natural partner

AgenticDome is not trying to replace these frameworks. It is designed to protect them.

Developers should be able to choose LangGraph for stateful control, CrewAI for collaboration, PydanticAI for typed tools, LlamaIndex for RAG, or OpenAI Agents SDK for handoffs. But the enterprise still needs a runtime layer that asks whether the action should occur.

AgenticDome adds that runtime layer through inbound screening, delegation authorization, direct tool authorization, tool output sanitization, memory/RAG poisoning detection, and trust-score revalidation.

The core idea

Frameworks help agents work. AgenticDome helps ensure that what agents do remains aligned with policy, purpose, authority, and trust.

The conclusion

Agent frameworks are becoming the application servers of autonomous AI. They orchestrate reasoning, tools, memory, delegation, and execution.

That makes them powerful. It also makes them security-critical. AgenticDome’s role is to become the runtime security partner for these frameworks, not by slowing developers down but by giving enterprises a control plane for action integrity.

Frameworks make agents useful. Runtime controls make them deployable.
