1. Inbound Prompt Screening
Detect prompt injection, jailbreak attempts, instruction override, PII exposure, and system-prompt extraction before the manager agent sees the input.
A universal protected-agent sequence for securing manager-worker flows, tool execution, memory, outputs, and cross-agent delegation.
Verify whether a manager agent has authority to delegate work to a worker or specialist agent.
Revalidate intent, role, tool scope, arguments, and target sensitivity before the tool executes.
Inspect tool results, responses, memory writes, and downstream outputs before they become the next instruction.
Re-check session chain, delegated authority, trust score, and signed policy context at the specialist side.
Emit structured evidence for auditability, governance, investigation, and OWASP-aligned reporting.
Most production agentic workflows can be reduced to a common pattern: a user interacts with a manager agent, the manager delegates to a specialist or worker agent, the worker calls a tool or API, and the result returns into the workflow.
Each numbered checkpoint validates a different part of the interaction chain: prompt, delegation, tool execution, returned output, and final specialist-side decision authority.
Filter prompt injection, jailbreak attempts, PII, and instruction override at the edge, before the manager agent sees them.
Verify the manager has authority to delegate and bind the delegation to short-lived policy context.
Revalidate worker scope, intent, target sensitivity, and structured tool arguments before any tool runs.
Detect exfiltration, redact PII, strip hidden instructions, and inspect outputs before downstream reuse.
Re-check the manager’s delegated authority, signed policy, and trust context on the worker side at decision time.
Key principle: never trust an upstream agent’s word. Every control revalidates intent, identity, and scope against policy, even when the upstream agent is your own.
This manager-to-worker-to-tool sequence is powerful, and it is also where agentic risk concentrates. A prompt can manipulate the manager. A manager can over-delegate. A worker can call a tool outside the intended purpose. A tool can return poisoned output. A specialist can blindly trust an upstream agent.
Inbound screening remains important. It filters jailbreak attempts, instruction override, malicious user input, PII exposure, and system-prompt extraction before the manager agent processes the request.
But inbound screening cannot be the only control. Many agentic attacks arrive indirectly through documents, tickets, CRM records, emails, RAG results, or tool outputs. That means the system must continue evaluating risk after the first prompt.
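As a rough illustration of screening every piece of text that enters agent context, not only the first user prompt, the sketch below applies the same check to user input and to indirect sources such as CRM records. The function `screen_text`, the pattern tables, and the source labels are hypothetical names chosen for this example. They are not AgenticDome's API, and a short regex list is only a stand-in for real semantic detection.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Hypothetical patterns for illustration only. Real screening would rely on
# richer detection (semantic review, classifiers), not a short regex list.
INJECTION_PATTERNS = {
    "instruction_override": re.compile(
        r"ignore (all |any )?(previous|prior) instructions", re.I
    ),
    "system_prompt_extraction": re.compile(
        r"(reveal|show|print) (me )?(your|the) system prompt", re.I
    ),
}
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass
class ScreenResult:
    allowed: bool
    findings: List[str] = field(default_factory=list)


def screen_text(text: str, source: str = "user") -> ScreenResult:
    """Screen text from any source (user prompt, document, tool output)."""
    injection = [
        f"{source}:{name}"
        for name, pattern in INJECTION_PATTERNS.items()
        if pattern.search(text)
    ]
    pii = [
        f"{source}:pii:{name}"
        for name, pattern in PII_PATTERNS.items()
        if pattern.search(text)
    ]
    # Assumed policy: injection findings block the text; PII findings are
    # only recorded so a later step can redact them.
    return ScreenResult(allowed=not injection, findings=injection + pii)
```

Calling `screen_text(record_body, source="crm_record")` on retrieved content uses the same policy as the edge filter, which is the point of continuing to evaluate risk after the first prompt.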
Multi-agent systems introduce a confused-deputy problem. A lower-trust agent can ask a higher-privilege specialist to perform an action it could not perform directly.
AgenticDome evaluates the source agent, target agent, claimed role, actual role, privilege gap, task purpose, and delegation context. Delegation should not be accepted merely because it came from another agent in the workflow.
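A minimal sketch of such a delegation check might look like the following. The registry, role levels, allowed purposes, and the `authorize_delegation` function are all assumptions made for illustration. A real deployment would presumably resolve roles from an identity or policy store rather than from in-memory dictionaries.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical role levels and agent registry (illustrative values only).
ROLE_LEVEL = {"intake": 1, "manager": 2, "finance_specialist": 3}
AGENT_REGISTRY = {
    "agent-intake": "intake",
    "agent-mgr": "manager",
    "agent-fin": "finance_specialist",
}
# Assumed policy: task purposes each source role may delegate.
ALLOWED_PURPOSES = {
    "manager": {"invoice_lookup", "refund_review"},
    "intake": {"invoice_lookup"},
}
MAX_PRIVILEGE_GAP = 1  # largest allowed jump in privilege per delegation


@dataclass(frozen=True)
class DelegationRequest:
    source_agent: str
    target_agent: str
    claimed_role: str
    purpose: str


def authorize_delegation(req: DelegationRequest) -> Tuple[bool, str]:
    """Return (allowed, reason) for a delegation between two agents."""
    actual_role = AGENT_REGISTRY.get(req.source_agent)
    target_role = AGENT_REGISTRY.get(req.target_agent)
    if actual_role is None or target_role is None:
        return False, "unknown_agent"
    # Compare the claimed role with the registered one, never trust the claim.
    if req.claimed_role != actual_role:
        return False, "role_mismatch"
    # Confused-deputy guard: a low-privilege source may not reach far upward.
    gap = ROLE_LEVEL[target_role] - ROLE_LEVEL[actual_role]
    if gap > MAX_PRIVILEGE_GAP:
        return False, "privilege_gap"
    if req.purpose not in ALLOWED_PURPOSES.get(actual_role, set()):
        return False, "purpose_not_allowed"
    return True, "ok"
```

In this sketch an intake agent asking the finance specialist directly is rejected with `privilege_gap`, even though the request itself is well formed.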
Tool calls are where reasoning becomes action. This is the most important enforcement point for an Action Firewall.
AgenticDome inspects structured tool arguments, destructive commands, role context, trust score, business purpose, and tool scope before execution. A valid tool call can still be unsafe if the business objective is wrong.
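The idea that a syntactically valid tool call can still be unsafe can be sketched as a pre-execution check. Everything named here (`TOOL_POLICY`, `ToolCall`, `authorize_tool_call`, the tool names, and the thresholds) is a hypothetical illustration, not a documented AgenticDome interface.

```python
import re
from dataclasses import dataclass
from typing import Any, Dict, Tuple

# Hypothetical per-tool policy: allowed roles, allowed business purposes,
# and a minimum trust score. Values are illustrative assumptions.
TOOL_POLICY = {
    "sql_query": {
        "roles": {"analyst_worker"},
        "purposes": {"reporting"},
        "min_trust": 0.7,
    },
    "send_email": {
        "roles": {"support_worker"},
        "purposes": {"customer_reply"},
        "min_trust": 0.5,
    },
}
# Simplistic keyword check standing in for real destructive-command analysis.
DESTRUCTIVE_SQL = re.compile(r"\b(drop|delete|truncate|alter|update)\b", re.I)


@dataclass
class ToolCall:
    tool: str
    args: Dict[str, Any]
    role: str
    purpose: str
    trust_score: float


def authorize_tool_call(call: ToolCall) -> Tuple[bool, str]:
    """Revalidate scope, purpose, trust, and arguments before execution."""
    policy = TOOL_POLICY.get(call.tool)
    if policy is None:
        return False, "tool_not_registered"
    if call.role not in policy["roles"]:
        return False, "role_out_of_scope"
    # A valid call is still rejected when the business purpose is wrong.
    if call.purpose not in policy["purposes"]:
        return False, "purpose_mismatch"
    if call.trust_score < policy["min_trust"]:
        return False, "trust_too_low"
    query = str(call.args.get("query", ""))
    if call.tool == "sql_query" and DESTRUCTIVE_SQL.search(query):
        return False, "destructive_argument"
    return True, "ok"
```

The checks run in a fixed order so that the returned reason names the first policy violated, which keeps downstream evidence records unambiguous.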
Tool outputs are frequently reused as agent context. That means a malicious result can become the next prompt, a memory write, a downstream tool argument, or a cross-agent instruction.
AgenticDome inspects tool output for exfiltration, hidden forwarding, PII leakage, RAG poisoning, memory poisoning, unsafe scripts, and downstream execution risk. In agentic systems, output is not passive. It is part of the control loop.
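To make the point that output is part of the control loop more concrete, here is a small sketch of an output sanitizer that runs before a tool result is reused as context. The function name `sanitize_tool_output`, the finding labels, and the patterns are assumptions for this example. Real detection of exfiltration or poisoning would need considerably more than these regexes.

```python
import re
from typing import List, Tuple

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
HTML_COMMENT = re.compile(r"<!--.*?-->", re.S)
ZERO_WIDTH = re.compile(r"[\u200b\u200c\u200d\u2060\ufeff]")
# Hypothetical markers of embedded instructions or covert forwarding.
INSTRUCTION_LINE = re.compile(
    r"(ignore (all |any )?(previous|prior) instructions|forward .* to )", re.I
)


def sanitize_tool_output(text: str) -> Tuple[str, List[str]]:
    """Return (clean_text, findings) for a raw tool result."""
    findings: List[str] = []
    # Strip content a human reviewer would not see but a model would read.
    if HTML_COMMENT.search(text):
        findings.append("hidden_html_comment")
        text = HTML_COMMENT.sub("", text)
    if ZERO_WIDTH.search(text):
        findings.append("zero_width_chars")
        text = ZERO_WIDTH.sub("", text)
    # Drop lines that read as instructions rather than data.
    kept = []
    for line in text.splitlines():
        if INSTRUCTION_LINE.search(line):
            findings.append("embedded_instruction")
        else:
            kept.append(line)
    text = "\n".join(kept)
    # Redact PII last so the redaction covers everything that survives.
    if EMAIL.search(text):
        findings.append("pii_email")
        text = EMAIL.sub("[REDACTED_EMAIL]", text)
    return text, findings
```

Only the cleaned text would be passed on as the next prompt, memory write, or tool argument, while the findings list can feed the audit trail.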
The mature pattern is for worker agents and specialist agents to verify the authority of a delegated request at decision time. They should not rely only on the manager’s claim.
AgenticDome supports this direction through session-chain correlation, trust-score revalidation, role mismatch detection, and policy-context checks. The goal is no bare trust between agents.
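One way to picture "no bare trust" is a worker that verifies a signed, short-lived delegation context before acting. The sketch below uses an HMAC over a JSON context as a stand-in for whatever signing scheme a real system uses. The context fields (`session_chain`, `purpose`, `expires_at`), the shared demo key, and the function names are assumptions for illustration only.

```python
import hashlib
import hmac
import json
import time
from typing import Optional, Tuple

# Assumption: a shared demo key. A real system would load keys from a
# secrets manager and might use asymmetric signatures instead.
POLICY_KEY = b"demo-only-secret"
MIN_TRUST = 0.6  # illustrative threshold


def sign_context(context: dict, key: bytes = POLICY_KEY) -> str:
    """Sign a canonical JSON encoding of the delegation context."""
    payload = json.dumps(context, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_delegation(
    context: dict,
    signature: str,
    worker_id: str,
    current_trust: float,
    now: Optional[float] = None,
) -> Tuple[bool, str]:
    """Worker-side check performed at decision time."""
    now = time.time() if now is None else now
    # Any tampering with the context changes the expected signature.
    if not hmac.compare_digest(sign_context(context), signature):
        return False, "bad_signature"
    if now > context["expires_at"]:
        return False, "expired"
    # The session chain must end at the worker being asked to act.
    if context["session_chain"][-1] != worker_id:
        return False, "not_addressed_to_worker"
    # Trust is re-evaluated now, not copied from the manager's claim.
    if current_trust < MIN_TRUST:
        return False, "trust_too_low"
    return True, "ok"
```

Because the signature covers the purpose and expiry, a manager that later changes the task or replays an old delegation fails verification on the specialist side.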
| Protection Point | Risk Reduced | AgenticDome Focus |
|---|---|---|
| Inbound prompt screening | Prompt injection, sensitive disclosure, goal hijack | Prompt scanning, semantic review, PII detection, jailbreak detection |
| A2A delegation authorization | Privilege abuse, insecure inter-agent communication, rogue agents | Source/target validation, role mismatch detection, privilege gap enforcement |
| Direct tool authorization | Tool misuse, excessive agency, improper output handling | Tool arguments, trust score, role, intent, destructive action checks |
| Output sanitization | Exfiltration, memory poisoning, RAG poisoning, downstream compromise | Output validation, redaction, covert forwarding detection, poisoning detection |
| Decision verification | Identity spoofing, human-in-the-loop (HITL) abuse, resource exhaustion, cascading failures | Session-chain correlation, trust revalidation, fanout and loop controls |
Prompt filters, model guardrails, connector permissions, and platform-native policies all matter. But agentic workflows need a broader runtime architecture. The five protection points define where controls must fire when agents reason, delegate, call tools, process output, and act.
AgenticDome’s approach is to place enforcement at the interaction layer so enterprises can govern not just what agents know, but what agents do.