Introduction
In the previous article, How to Build a ReAct AI Agent for Cybersecurity Scanning with Python and LangGraph, I explained how to build a simple ReAct agent to scan a web target for vulnerabilities. But the scope of work for cybersecurity audits is bigger than just scanning. It includes:
- Scanning Stage – get information about possible vulnerabilities in the target.
- Attacking Stage – try to exploit vulnerabilities and prove our hypothesis from the scanning stage.
- Reporting Stage – create a comprehensive report so the company which requested the audit can apply fixes.
To build this I first tried a single graph, but then realized that this approach is not flexible and violates the Single Responsibility principle from SOLID.
That's why I built a pipeline of agents where each agent is responsible for only one thing and does it well.
Subscribe to my Substack to not miss my new articles 😊
Pipeline of Agents
Pipeline of Agents is an architectural pattern that chains specialized AI agents in a sequential workflow, where each agent processes the output from the previous agent and passes refined data to the next. Unlike monolithic agents that try to do everything, pipeline agents follow the single responsibility principle – each agent excels at one specific task.
Main characteristics of this pattern (a minimal sketch in plain Python follows the list):
- Each agent has a single, specialized responsibility.
- Sequential execution with data flow – output from Agent N becomes input for Agent N + 1.
- Composable and modular – you can swap agents or change the pipeline order.
- State isolation – agents don’t share internal state, only defined outputs.
- Failure handling – errors can be contained and handled per stage instead of cascading through the whole workflow.
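To make the idea concrete before the cybersecurity use case, here is a minimal, generic sketch of the pattern in plain Python (the PipelineState class and the stage functions are illustrative only, not the project's real code):

from dataclasses import dataclass

@dataclass
class PipelineState:
    target: str
    scan_summary: str | None = None
    attack_summary: str | None = None
    report: str | None = None

def run_scan_agent(state: PipelineState) -> PipelineState:
    # Stage 1: only scanning, nothing else.
    state.scan_summary = f"scan results for {state.target}"
    return state

def run_attack_agent(state: PipelineState) -> PipelineState:
    # Stage 2: consumes only the scan summary produced by the previous stage.
    state.attack_summary = f"attack results based on: {state.scan_summary}"
    return state

def run_summary_agent(state: PipelineState) -> PipelineState:
    # Stage 3: combines the outputs of both previous stages into a report.
    state.report = f"{state.scan_summary}\n{state.attack_summary}"
    return state

def run_pipeline(target: str) -> PipelineState:
    state = PipelineState(target=target)
    for stage in (run_scan_agent, run_attack_agent, run_summary_agent):
        state = stage(state)  # output of stage N becomes input of stage N + 1
    return state

Each stage can be developed and tested on its own, and the order of stages is defined in exactly one place.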
Why Use Pipeline of Agents?
Let's see with an example why we should use a Pipeline of Agents. In my original implementation of the Cyber Security AI Agent I didn't use it, and the system was very hard to develop and maintain. Here is the code which builds the graph:
def create_graph() -> CompiledStateGraph:
    llm = ChatOpenAI(model="gpt-4o", temperature=0.3)

    # tools
    attack_tools = [ffuf_directory_scan, curl_tool, flexible_http_tool]
    scan_tools = [ffuf_directory_scan]
    llm_with_attack_tools = llm.bind_tools(attack_tools, parallel_tool_calls=True)
    llm_with_scan_tools = llm.bind_tools(scan_tools, parallel_tool_calls=True)

    # nodes init
    process_tool_result_node = ProcessToolResultNode(llm=llm)
    generate_report_node = GenerateReportNode(llm=llm)
    scan_target_node = ScanTargetNode(llm_with_tools=llm_with_scan_tools)
    attack_target_node = AttackTargetNode(llm_with_tools=llm_with_attack_tools)

    # edges init
    scan_tools_router = ToolRouterEdge(
        origin_node="scan_target_node",
        tools_type="scan",
        end_node="attack_target_node",
        tools_node="scan_tools",
    )
    attack_tools_router = ToolRouterEdge(
        origin_node="attack_target_node",
        tools_type="attack",
        end_node="generate_report",
        tools_node="attack_tools",
    )

    # graph init
    builder = StateGraph(TargetScanState)

    # nodes
    builder.add_node("scan_target_node", scan_target_node)
    builder.add_node("attack_target_node", attack_target_node)
    builder.add_node("scan_tools", ToolNode(scan_tools))
    builder.add_node("attack_tools", ToolNode(attack_tools))
    builder.add_node(
        "process_scan_results", process_tool_result_node.process_tool_results
    )
    builder.add_node(
        "process_attack_results", process_tool_result_node.process_tool_results
    )
    builder.add_node("generate_report", generate_report_node.generate_report)

    # edges
    builder.add_edge(START, "scan_target_node")
    builder.add_conditional_edges("scan_target_node", scan_tools_router)
    builder.add_conditional_edges("attack_target_node", attack_tools_router)
    builder.add_edge("scan_tools", "process_scan_results")
    builder.add_edge("process_scan_results", "scan_target_node")
    builder.add_edge("attack_tools", "process_attack_results")
    builder.add_edge("process_attack_results", "attack_target_node")
    builder.add_edge("generate_report", END)

    # Add memory checkpointer for state persistence
    memory = MemorySaver()
    return builder.compile(checkpointer=memory)
Here I'm just creating a bunch of nodes with edges which conditionally jump from one node to another. There is no single responsibility, no specialization. This code is hard to develop and hard to test: I can't test only the "scanning" stage of my system, because the only way to test is to launch the whole workflow, and a bug in the "attacking" stage hides bugs in the "scanning" stage.
So LangGraph graphs should be small and simple, like microservices, to keep development efficient. That's why a Pipeline of Agents is a good solution in my case: it lets me split my big graph into:
- Scan Agent graph
- Attack Agent graph
I can develop and test these 2 graphs in isolation, and only after that build the whole pipeline for my Cyber Security Agent by combining the 2 smaller agents. For example, the scan stage can be run completely on its own, as shown below.
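A minimal sketch of such an isolated run of the Scan Agent graph (it reuses the state keys and RunnableConfig you will see later in the ScanAgentNode wrapper, so treat it as an illustration rather than a ready-made test):

from langchain_core.runnables.config import RunnableConfig
from scan_agent.graph import create_scan_graph
from agent_core.state import ReActUsage, Tools, ToolsUsage
from agent_core.tool import CURL_TOOL, FFUF_TOOL

async def run_scan_stage_only(target) -> dict:
    # Only the scan graph is executed here – no attack or reporting nodes involved,
    # so a bug in the attack stage cannot hide a bug in the scan stage.
    scan_graph = create_scan_graph()
    scan_state = {
        "target": target,
        "usage": ReActUsage(limit=10),
        "tools_usage": ToolsUsage(limits={FFUF_TOOL.name: 1, CURL_TOOL.name: 3}),
        "tools": Tools(tools=[FFUF_TOOL, CURL_TOOL]),
    }
    config = RunnableConfig(configurable={"thread_id": "scan-only-run"})
    return await scan_graph.ainvoke(scan_state, config)

# From a script: asyncio.run(run_scan_stage_only(target)); in a notebook: await run_scan_stage_only(target)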
High Level Design
- User sends input information about a target.
- Scan Agent scans a target and generates a scan summary.
- Attack Agent attacks a target and generates an attack summary.
- Summary Generation generates a final summary based on the scan and attack summaries.
Scan ReAct Agent
This agent uses the ReAct architecture with the following tools:
- ffuf – for enumeration of possible endpoints.
- curl – for quick testing of the enumeration output or performing custom testing.
I fully covered the ReAct pattern and this agent's implementation in the article How to Build a ReAct AI Agent for Cybersecurity Scanning with Python and LangGraph.
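The real tool implementations are covered there and on GitHub; as a rough illustration only, a curl-style tool for such a ReAct agent could be defined with LangChain's @tool decorator roughly like this (hypothetical sketch, not the project's actual tool):

import subprocess
from langchain_core.tools import tool

@tool
def curl_probe(url: str, method: str = "GET") -> str:
    """Send a single HTTP request to the target URL and return headers and body."""
    # -s silences progress output, -i includes response headers in stdout
    result = subprocess.run(
        ["curl", "-s", "-i", "-X", method, url],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout or result.stderr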
Attack ReAct Agent
The Attack Agent uses the same ReAct architecture but only the curl tool to exploit vulnerabilities, because I built it in the scope of a research project; other tools can easily be added if needed.
Summary Generation
The Summary Generation step combines the scan and attack summaries into a final cybersecurity report.
Implementation
To implement this system I used:
- LangGraph
- Python
Source code is available on GitHub.
LangGraph Graph
- User sends a URL.
- scan_agent_node executes the Scan Agent and performs web target scanning.
- scan_agent_node produces the scanning result as an output.
- attack_agent_node executes the Attack Agent with the output from scan_agent_node and performs the target attack.
- summary_node generates a summary based on the outputs from scan_agent_node and attack_agent_node.
Here is the code which builds this graph in Python (also available on GitHub):
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, START, StateGraph
from langgraph.graph.state import CompiledStateGraph
from cybersecurity_agent.node import ScanAgentNode, AttackAgentNode, CybersecuritySummaryNode
from cybersecurity_agent.state import CybersecurityAgentState

def create_cybersecurity_graph(
    scan_react_limit: int = 25,
    scan_ffuf_limit: int = 2,
    scan_curl_limit: int = 5,
    attack_react_limit: int = 25,
    attack_curl_limit: int = 10,
) -> CompiledStateGraph:
    llm = ChatOpenAI(model="gpt-4.1-2025-04-14", temperature=0.3)

    # Use parameterized wrapper nodes with configurable limits
    scan_agent_node = ScanAgentNode(
        react_usage_limit=scan_react_limit,
        ffuf_tool_limit=scan_ffuf_limit,
        curl_tool_limit=scan_curl_limit,
    )
    attack_agent_node = AttackAgentNode(
        react_usage_limit=attack_react_limit,
        curl_tool_limit=attack_curl_limit,
    )
    cybersecurity_summary_node = CybersecuritySummaryNode(llm=llm)

    # Build the graph
    builder = StateGraph(CybersecurityAgentState)

    # Add nodes that use compiled sub-graphs internally
    builder.add_node("scan_agent", scan_agent_node)
    builder.add_node("attack_agent", attack_agent_node)
    builder.add_node("cybersecurity_summary", cybersecurity_summary_node)

    # Define the workflow: scan -> attack -> summary
    builder.add_edge(START, "scan_agent")
    builder.add_edge("scan_agent", "attack_agent")
    builder.add_edge("attack_agent", "cybersecurity_summary")
    builder.add_edge("cybersecurity_summary", END)

    return builder.compile(checkpointer=MemorySaver())
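A minimal sketch of how this compiled pipeline can be invoked (the Target import path and constructor arguments are assumptions based on the fields used later in this article; see GitHub for the real model):

import asyncio
from langchain_core.runnables.config import RunnableConfig
from cybersecurity_agent.state import Target  # assumed import path for the Target model

async def run_assessment() -> None:
    graph = create_cybersecurity_graph()
    initial_state = {
        "target": Target(
            url="http://localhost:8080",  # hypothetical test target
            description="Deliberately vulnerable demo application",
            type="web application",
        )
    }
    # MemorySaver checkpointing requires a thread_id
    config = RunnableConfig(configurable={"thread_id": "assessment-1"})
    final_state = await graph.ainvoke(initial_state, config)
    print(final_state["cybersecurity_report"])

asyncio.run(run_assessment())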
Passing State for Child Graphs
All my agents are built with LangGraph and are graphs themselves. LangGraph makes it possible to embed an external graph inside my graph, but since I don't want to share the state of the parent graph with the child graphs, I don't embed them directly. Instead, I decided to build a node wrapper which converts the parent graph state into the child graph state and passes only the data required for the child graph execution.
This lets me hide information and work only with the minimal amount of data needed. This is my parent graph state (see on GitHub):
class CybersecurityAgentState(TypedDict):
    target: Target
    scan_summary: ScanAgentSummary | None
    attack_summary: AttackReportSummary | None
    cybersecurity_report: CybersecurityReport | None
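The Target model itself lives in the repository; judging by the fields the summary node reads later (url, description, type), its shape is roughly the following (hypothetical sketch, see GitHub for the real definition):

from pydantic import BaseModel

class Target(BaseModel):
    url: str          # base URL of the system under test
    description: str  # short description of the target provided by the user
    type: str         # kind of target, e.g. "web application"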
And this is my child graph state:
class AttackAgentState(ReActAgentState):
    scan_summary: ScanAgentSummary
    attack_summary: AttackReportSummary | None
To achieve my goal of hiding information I have built wrapper nodes for each agent.
Scan Agent Execution
The detailed implementation of the Scan Agent is described in How to Build a ReAct AI Agent for Cybersecurity Scanning with Python and LangGraph.
Wrapper node for Scan Agent (see on GitHub):
from langchain_core.runnables.config import RunnableConfig
from scan_agent.graph import create_scan_graph
from cybersecurity_agent.state import CybersecurityAgentState
from agent_core.state import ReActUsage, Tools, ToolsUsage
from agent_core.tool import CURL_TOOL, FFUF_TOOL

class ScanAgentNode:
    def __init__(
        self,
        react_usage_limit: int = 25,
        ffuf_tool_limit: int = 2,
        curl_tool_limit: int = 5,
    ):
        self.scan_graph = create_scan_graph()
        self.react_usage_limit = react_usage_limit
        self.ffuf_tool_limit = ffuf_tool_limit
        self.curl_tool_limit = curl_tool_limit

    async def __call__(self, state: CybersecurityAgentState) -> dict:
        scan_state = {
            "target": state["target"],
            "usage": ReActUsage(limit=self.react_usage_limit),
            "tools_usage": ToolsUsage(
                limits={
                    FFUF_TOOL.name: self.ffuf_tool_limit,
                    CURL_TOOL.name: self.curl_tool_limit,
                }
            ),
            "tools": Tools(tools=[FFUF_TOOL, CURL_TOOL]),
        }
        config = RunnableConfig(
            max_concurrency=10,
            recursion_limit=25,
            configurable={"thread_id": f"scan_{hash(str(state['target']))}"},
        )
        final_state = await self.scan_graph.ainvoke(scan_state, config)
        scan_summary = final_state.get("summary")
        return {"scan_summary": scan_summary}
This node just creates the state for the Scan Agent and executes it. The Scan Agent doesn't know anything about the parent state, or even that it is part of a bigger flow. On top of that, I'm free to define which tools and limits the Scan Agent is allowed to use, and the wrapper itself can be exercised in isolation, as shown below.
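Because the wrapper only takes the parent state in and returns a plain dict, an isolated check can stay very small; a minimal, hypothetical sketch:

from cybersecurity_agent.node import ScanAgentNode

async def check_scan_node(target) -> None:
    # Tight limits keep an isolated run short and cheap.
    node = ScanAgentNode(react_usage_limit=5, ffuf_tool_limit=1, curl_tool_limit=2)
    result = await node(
        {
            "target": target,
            "scan_summary": None,
            "attack_summary": None,
            "cybersecurity_report": None,
        }
    )
    assert "scan_summary" in result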
Attack Agent Execution
The Attack Agent node has code similar to the Scan Agent node (see on GitHub):
from langchain_core.runnables.config import RunnableConfig
from attack_agent.graph import create_attack_graph
from cybersecurity_agent.state import CybersecurityAgentState
from agent_core.state import ReActUsage, Tools, ToolsUsage
from agent_core.tool import CURL_TOOL

class AttackAgentNode:
    def __init__(
        self,
        react_usage_limit: int = 25,
        curl_tool_limit: int = 20,
    ):
        self.attack_graph = create_attack_graph()
        self.react_usage_limit = react_usage_limit
        self.curl_tool_limit = curl_tool_limit

    async def __call__(self, state: CybersecurityAgentState) -> dict:
        attack_state = {
            "target": state["target"],
            "scan_summary": state["scan_summary"],
            "usage": ReActUsage(limit=self.react_usage_limit),
            "tools_usage": ToolsUsage(
                limits={
                    CURL_TOOL.name: self.curl_tool_limit,
                }
            ),
            "tools": Tools(tools=[CURL_TOOL]),
        }
        config = RunnableConfig(
            max_concurrency=10,
            recursion_limit=25,
            configurable={"thread_id": f"attack_{hash(str(state['target']))}"},
        )
        final_state = await self.attack_graph.ainvoke(attack_state, config)
        attack_summary = final_state.get("attack_summary")
        return {"attack_summary": attack_summary}
Summary Generation
To generate the summary I decided to pass both the Scan and Attack Agent outputs, because the scan output may contain findings which weren't useful for the attack process and which the attack agent simply ignored. But in the summary I want to see all information from the cybersecurity assessment (see on GitHub):
import json

from langchain_core.language_models import BaseChatModel
from langchain_core.messages import SystemMessage
from cybersecurity_agent.state import CybersecurityAgentState
from cybersecurity_agent.state.cybersecurity_agent_state import CybersecurityReport

CYBERSECURITY_SUMMARY_PROMPT = "Omitted for simplicity. Full prompt available on GitHub."

class CybersecuritySummaryNode:
    def __init__(self, llm: BaseChatModel):
        self.structured_llm = llm.with_structured_output(CybersecurityReport)

    def __call__(self, state: CybersecurityAgentState) -> dict:
        target = state["target"]
        scan_summary = state["scan_summary"]
        attack_summary = state["attack_summary"]
        system_prompt = CYBERSECURITY_SUMMARY_PROMPT.format(
            target_url=target.url,
            target_description=target.description,
            target_type=target.type,
            scan_summary=scan_summary.model_dump_json() if scan_summary else "No reconnaissance data available",
            attack_summary=attack_summary.model_dump_json() if attack_summary else "No attack execution data available",
        )
        # Create a simple message to trigger the analysis
        user_message = "Please analyze the provided reconnaissance and exploitation data to create a comprehensive cybersecurity assessment report."
        prompt_messages = [
            SystemMessage(content=system_prompt),
            {"role": "user", "content": user_message},
        ]
        cybersecurity_report = self.structured_llm.invoke(prompt_messages)
        return {"cybersecurity_report": cybersecurity_report}
Testing
To perform my testing I generated a vulnerable application with Claude Code, which is available on GitHub. After that I simply executed my flow from a Jupyter notebook.
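In a notebook cell that boils down to awaiting the compiled graph directly and dumping the structured report; a rough sketch, assuming the graph factory and Target model from the earlier snippets and that CybersecurityReport is a Pydantic model, as the rest of the code suggests:

from langchain_core.runnables.config import RunnableConfig
from cybersecurity_agent.state import Target  # assumed import path

graph = create_cybersecurity_graph()
final_state = await graph.ainvoke(  # Jupyter allows top-level await in a cell
    {"target": Target(url="http://localhost:8080", description="Vulnerable demo app", type="web application")},
    RunnableConfig(configurable={"thread_id": "notebook-run"}),
)
# The report is a structured object, so it can be printed as readable JSON.
print(final_state["cybersecurity_report"].model_dump_json(indent=2))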
The result contains a lot of details useful both for the business owner who requested the security testing and for the security team.
Summary
In this article I explained how to build a pipeline of agents using LangGraph node wrappers in Python. As a result, the pipeline of agents provides a powerful system with strict control over agent execution.
All code from this article is available on GitHub.
The previous article from the Cyber Security AI Agent development cycle is How to Build a ReAct AI Agent for Cybersecurity Scanning with Python and LangGraph.
Subscribe to my Substack to not miss my new articles 😊