Multi-Agent Systems
MESHFLOW_MOCK=1 python3 hands_on/05_multi_agent_team.py
Lesson 10: Multi-Agent Systems
Lesson Goal
This lesson teaches how to design multi-agent systems in a beginner-friendly way. A multi-agent system uses more than one agent to solve a task. Each agent should have a clear role, limited tools, clear inputs, and explicit outputs.
The main lesson:
Multiple agents can improve work only when their responsibilities are clear.
Without structure, they create more confusion, cost, and risk.
1. What Is A Multi-Agent System?
A multi-agent system is an AI application in which two or more agents collaborate, debate, divide work, review each other's output, or hand off tasks.
Example:
researcher_agent -> writer_agent -> reviewer_agent -> approval_gate
Each agent uses an LLM, but each has a different responsibility.
2. When Multiple Agents Help
Use multiple agents when different responsibilities benefit from separation.
Good reasons:
- Research and writing require different behavior.
- One agent should critique another agent's output.
- Different tools should be isolated by role.
- Parallel work can save time.
- A specialist role improves quality.
- You need debate before a decision.
Weak reasons:
- "More agents sounds advanced."
- The same agent role is copied three times.
- Agents chat without producing artifacts.
- No one knows when the conversation should stop.
- Every agent has every tool.
Beginner rule: add a second agent only when you can explain what it does that the first agent should not do.
3. Common Multi-Agent Roles
| Role | Responsibility | Typical Tools |
|---|---|---|
| Planner | Breaks goal into steps | None or read-only project context |
| Researcher | Finds and summarizes evidence | Search, retrieval, file read |
| Analyst | Compares options and tradeoffs | Calculator, data query |
| Writer | Drafts final content | Usually no external tools |
| Reviewer | Checks quality and policy | Rubric, validator |
| Safety guard | Looks for risk | Policy checker |
| Executor | Performs approved actions | Restricted action tools |
Keep risky tools away from exploratory agents.
4. Coordination Patterns
Sequential Handoff
One agent produces an artifact for the next.
researcher -> writer -> reviewer
Best for beginner workflows because it is easy to inspect.
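The handoff above can be sketched in plain Python. This is a minimal illustration, not the lesson's actual implementation: each agent is stubbed as an ordinary function with no LLM call, and the function names, the `Artifacts` dictionary shape, and the artifact keys are assumptions chosen for the example.

```python
from typing import Callable, Dict, List

Artifacts = Dict[str, str]

def researcher(artifacts: Artifacts) -> Artifacts:
    # Stub: a real agent would call an LLM and a search tool here.
    return {**artifacts, "research_brief": f"Notes on: {artifacts['goal']}"}

def writer(artifacts: Artifacts) -> Artifacts:
    # The writer only reads the research brief, never the raw sources.
    return {**artifacts, "draft_answer": f"Draft based on [{artifacts['research_brief']}]"}

def reviewer(artifacts: Artifacts) -> Artifacts:
    # Stub check: the draft must cite the research notes.
    ok = "Notes on" in artifacts["draft_answer"]
    return {**artifacts, "review_findings": "pass" if ok else "fail"}

def run_pipeline(goal: str, steps: List[Callable[[Artifacts], Artifacts]]) -> Artifacts:
    artifacts: Artifacts = {"goal": goal}
    for step in steps:
        artifacts = step(artifacts)  # each step adds exactly one named artifact
    return artifacts

result = run_pipeline("vector databases", [researcher, writer, reviewer])
```

Because every step returns a new dictionary with one extra key, you can print `result` after any step and see exactly what was handed off.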
Parallel Specialists
Several agents work independently, then a merge step combines results.
technical_researcher
market_researcher
risk_researcher
-> synthesizer
Best when subtopics are independent.
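One possible way to run independent specialists side by side is a thread pool from the standard library. The sketch below assumes each specialist is a plain function (stubbed here, no LLM call) and that the merge step is a simple join; the function bodies and the `run_parallel` helper are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

def technical_researcher(goal: str) -> str:
    return f"technical notes for {goal}"

def market_researcher(goal: str) -> str:
    return f"market notes for {goal}"

def risk_researcher(goal: str) -> str:
    return f"risk notes for {goal}"

def synthesizer(notes: Dict[str, str]) -> str:
    # Merge step: sort by specialist name so the output order is deterministic.
    return "\n".join(f"{name}: {text}" for name, text in sorted(notes.items()))

def run_parallel(goal: str) -> str:
    specialists: Dict[str, Callable[[str], str]] = {
        "technical": technical_researcher,
        "market": market_researcher,
        "risk": risk_researcher,
    }
    with ThreadPoolExecutor(max_workers=len(specialists)) as pool:
        futures = {name: pool.submit(fn, goal) for name, fn in specialists.items()}
        # result() blocks until that specialist has finished.
        notes = {name: future.result() for name, future in futures.items()}
    return synthesizer(notes)

summary = run_parallel("a new API")
```

The specialists never see each other's output, which is what makes them safe to run in parallel.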
Debate
Agents argue different positions before a judge decides.
pro_agent -> con_agent -> judge_agent
Useful for decisions, but needs a strict stopping rule.
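A strict stopping rule for a debate can be as simple as a fixed round count. The following sketch assumes stubbed agents and a hypothetical `run_debate` helper; the judge's return shape is invented for illustration.

```python
from typing import Dict, List, Tuple

def pro_agent(topic: str, round_no: int) -> str:
    return f"pro argument {round_no} for {topic}"

def con_agent(topic: str, round_no: int) -> str:
    return f"con argument {round_no} against {topic}"

def judge_agent(transcript: List[Tuple[str, str]]) -> Dict[str, object]:
    # Stub: a real judge would weigh the arguments; this one only counts turns.
    return {"decision": "needs human review", "turns_considered": len(transcript)}

def run_debate(topic: str, max_rounds: int = 3) -> Dict[str, object]:
    transcript: List[Tuple[str, str]] = []
    # The loop bound is the stopping rule: the debate cannot exceed max_rounds.
    for round_no in range(1, max_rounds + 1):
        transcript.append(("pro", pro_agent(topic, round_no)))
        transcript.append(("con", con_agent(topic, round_no)))
    return judge_agent(transcript)

verdict = run_debate("adopt a manager-worker design", max_rounds=3)
```

The judge's verdict is the final artifact; the transcript itself is only input to it.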
Manager-Worker
A manager agent delegates tasks to worker agents.
manager -> researcher
manager -> writer
manager -> reviewer
Powerful, but harder to debug. Use after you understand sequential handoff.
Critic-Revision Loop
A critic reviews a draft, then a writer revises.
writer -> critic -> writer -> approval
Always set a maximum number of revisions.
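A bounded revision loop might look like the sketch below. The writer and critic are stubs (the critic here accepts any draft after the first), and `MAX_REVISIONS`, the version-string format, and the return dictionary are assumptions for the example.

```python
from typing import Dict, Optional, Tuple

MAX_REVISIONS = 2

def writer(previous: Optional[str], feedback: Optional[str]) -> str:
    # Stub: a real writer would use the feedback; this one only bumps a version.
    if previous is None:
        return "draft v1"
    version = int(previous.rsplit("v", 1)[1]) + 1
    return f"draft v{version}"

def critic(draft: str) -> Tuple[bool, str]:
    # Stub: rejects the first draft and accepts every later one.
    accepted = not draft.endswith("v1")
    return accepted, ("" if accepted else "add an example")

def revise_until_accepted(max_revisions: int = MAX_REVISIONS) -> Dict[str, object]:
    draft = writer(None, None)
    revisions = 0
    accepted, feedback = critic(draft)
    # Two exits: the critic accepts, or the revision budget runs out.
    while not accepted and revisions < max_revisions:
        draft = writer(draft, feedback)
        revisions += 1
        accepted, feedback = critic(draft)
    return {"draft": draft, "revisions": revisions, "accepted": accepted}

outcome = revise_until_accepted()
```

When the budget runs out with `accepted` still false, the workflow should escalate (for example to the approval step) rather than loop again.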
5. Context Boundaries
Not every agent needs the same context.
Researcher context:
- Goal.
- Research questions.
- Search tool instructions.
- Source limits.
Writer context:
- Goal.
- Evidence summary.
- Required style.
- Output format.
Reviewer context:
- Goal.
- Draft answer.
- Quality rubric.
- Policy requirements.
This separation reduces confusion and limits unnecessary exposure to data.
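The three context packages above can be expressed as a per-role field list applied to a shared state. This is a sketch under assumptions: the field names mirror the bullets above, and `CONTEXT_FIELDS` and `build_context` are hypothetical names, not part of any framework.

```python
from typing import Dict, List

CONTEXT_FIELDS: Dict[str, List[str]] = {
    "researcher": ["goal", "research_questions", "search_tool_instructions", "source_limits"],
    "writer": ["goal", "evidence_summary", "required_style", "output_format"],
    "reviewer": ["goal", "draft_answer", "quality_rubric", "policy_requirements"],
}

def build_context(role: str, state: Dict[str, str]) -> Dict[str, str]:
    """Return only the fields this role is allowed to see."""
    allowed = CONTEXT_FIELDS[role]
    missing = [key for key in allowed if key not in state]
    if missing:
        # Fail loudly instead of letting an agent run with partial context.
        raise KeyError(f"{role} context is missing: {missing}")
    return {key: state[key] for key in allowed}

# Shared state holds everything; each agent receives a filtered view.
state = {key: f"<{key}>" for fields in CONTEXT_FIELDS.values() for key in fields}
writer_context = build_context("writer", state)
```

The writer never receives the rubric or the policy text, and the reviewer never receives the search instructions.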
6. Memory Boundaries
Multi-agent memory needs clear rules.
Decide:
- Which agents can read memory?
- Which agents can write memory?
- Which memories are shared?
- Which memories are private to one agent?
- Which memories need human approval before storing?
Simple design:
shared project memory: approved facts and decisions
agent scratchpad: temporary, not stored
final memory write: only after approval
Do not store every agent message as long-term memory. Store useful approved summaries.
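The simple design above could be sketched as a small class: anyone may read, only named agents may write, and only approved summaries are stored. `ProjectMemory` and its method names are hypothetical, and approval is reduced to a boolean flag for illustration.

```python
from typing import Iterable, List

class ProjectMemory:
    """Shared memory: readable by all, writable by listed agents, approved content only."""

    def __init__(self, writers: Iterable[str]) -> None:
        self._writers = set(writers)
        self._facts: List[str] = []

    def read(self) -> List[str]:
        return list(self._facts)  # return a copy so readers cannot mutate the store

    def write(self, agent: str, summary: str, approved: bool) -> bool:
        if agent not in self._writers:
            raise PermissionError(f"{agent} may not write shared memory")
        if not approved:
            return False  # unapproved content is dropped, not stored
        self._facts.append(summary)
        return True

memory = ProjectMemory(writers=["reviewer"])
scratchpad = ["raw search result 1", "half-formed idea"]  # temporary, never persisted
memory.write("reviewer", "Decision: use sequential handoff", approved=True)
memory.write("reviewer", "Unverified claim", approved=False)
```

The scratchpad is an ordinary local list, so it disappears when the run ends; only the approved summary reaches shared memory.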
7. Tool Boundaries
Tools should match roles.
Example:
| Agent | Allowed Tools | Blocked Tools |
|---|---|---|
| Researcher | search, read docs | send email, publish |
| Writer | style guide lookup | database write |
| Reviewer | rubric scorer, policy checker | publish |
| Executor | publish action | unrestricted search |
This is least-privilege design for agents.
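One way to enforce the table above is an allowlist that is checked before every tool call. In this sketch the tool names are adapted from the table, the two tool implementations are stubs, and `ALLOWED_TOOLS`, `TOOLS`, and `call_tool` are hypothetical names.

```python
from typing import Any, Callable, Dict, Set

ALLOWED_TOOLS: Dict[str, Set[str]] = {
    "researcher": {"search", "read_docs"},
    "writer": {"style_guide_lookup"},
    "reviewer": {"rubric_scorer", "policy_checker"},
    "executor": {"publish"},
}

# Stub tool implementations; only two are defined for the example.
TOOLS: Dict[str, Callable[..., Any]] = {
    "search": lambda query: f"results for {query}",
    "publish": lambda text: f"published: {text}",
}

def call_tool(agent: str, tool: str, **kwargs: Any) -> Any:
    # Deny by default: unknown agents and unlisted tools are both rejected.
    if tool not in ALLOWED_TOOLS.get(agent, set()):
        raise PermissionError(f"{agent} is not allowed to use {tool}")
    return TOOLS[tool](**kwargs)

hits = call_tool("researcher", "search", query="agent safety")
```

Anything not explicitly allowed is blocked, so the "Blocked Tools" column does not need its own list.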
8. Artifacts In Multi-Agent Systems
Agents should communicate through artifacts, not only hidden chat.
Examples:
- research_brief
- technical_notes
- risk_notes
- draft_answer
- review_findings
- revision_plan
- approval_record
Artifacts make collaboration inspectable.
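A named artifact can be modeled as a small immutable record kept in a store that any step can inspect. The `Artifact` fields and the `ArtifactStore` class below are assumptions for illustration, not a prescribed format.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class Artifact:
    name: str      # e.g. "research_brief"
    producer: str  # which agent wrote it
    content: str

class ArtifactStore:
    def __init__(self) -> None:
        self._items: Dict[str, Artifact] = {}

    def put(self, artifact: Artifact) -> None:
        # An empty artifact usually means the agent talked but produced nothing.
        if not artifact.content.strip():
            raise ValueError(f"{artifact.name} is empty")
        self._items[artifact.name] = artifact

    def get(self, name: str) -> Artifact:
        return self._items[name]

    def names(self) -> List[str]:
        return sorted(self._items)

store = ArtifactStore()
store.put(Artifact("research_brief", "researcher", "Three sources agree on X."))
store.put(Artifact("draft_answer", "writer", "X is supported by three sources."))
```

Listing `store.names()` at the end of a run shows which handoffs actually happened.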
9. Stopping Rules
Multi-agent systems need strong stopping rules because conversations can drift.
Examples:
- Stop after each agent produces its required artifact.
- Stop after 3 debate rounds.
- Stop when reviewer score is at least 0.8.
- Stop when budget is reached.
- Stop when a human rejects the output.
- Stop when required evidence is missing.
No stop rule means no production-ready agent system.
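Several of the rules above can be combined into one check that runs after every turn. This is a sketch: the `RunState` fields, the default thresholds, and the reason strings are assumptions, and the order of the checks is a design choice (human rejection and missing evidence are tested first).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RunState:
    rounds: int = 0
    reviewer_score: float = 0.0
    spent_usd: float = 0.0
    human_rejected: bool = False
    evidence_missing: bool = False

def stop_reason(state: RunState,
                max_rounds: int = 3,
                min_score: float = 0.8,
                budget_usd: float = 1.00) -> Optional[str]:
    """Return why the run should stop, or None if it may continue."""
    if state.human_rejected:
        return "human_rejected"
    if state.evidence_missing:
        return "evidence_missing"
    if state.spent_usd >= budget_usd:
        return "budget_reached"
    if state.reviewer_score >= min_score:
        return "score_reached"
    if state.rounds >= max_rounds:
        return "max_rounds"
    return None

reason = stop_reason(RunState(rounds=3, reviewer_score=0.5))
```

Returning a reason string rather than a bare boolean means the trace can record why the system stopped.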
10. Safety And Governance
Add governance wherever agents can:
- Call tools.
- Use private data.
- Produce external messages.
- Make recommendations with real consequences.
- Trigger actions.
- Store memory.
- Spend money.
Useful controls:
- Tool schemas.
- Input validation.
- Role-specific permissions.
- Gates.
- Rate limits.
- Cost budgets.
- Trace logs.
- Human review.
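Two of these controls, a gate and a trace log, can be sketched together. The example assumes approval is supplied as a callable (in practice this might be a human review step), and `approval_gate`, `publish`, and the trace record shape are hypothetical.

```python
from typing import Callable, List, Optional

def approval_gate(artifact: str,
                  approve: Callable[[str], bool],
                  action: Callable[[str], str],
                  trace: List[dict]) -> Optional[str]:
    approved = approve(artifact)
    trace.append({"step": "approval_gate", "approved": approved})
    if not approved:
        return None  # denied: the action is never called
    result = action(artifact)
    trace.append({"step": "action", "result": result})
    return result

def publish(text: str) -> str:
    # Stub for a risky external action.
    return f"published {len(text)} chars"

trace: List[dict] = []
denied = approval_gate("draft", lambda artifact: False, publish, trace)
granted = approval_gate("draft", lambda artifact: True, publish, trace)
```

Both the denial and the approval leave a trace entry, so a reviewer can later see which actions were attempted and which actually ran.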
11. Hands-On: Multi-Agent Debate Example
Run:
python3 -m src.mini_meshflow run examples/07_multi_agent_debate.json
Then inspect:
- What each agent produces.
- Whether the debate has a clear final artifact.
- Whether the workflow has a stopping point.
- What you would add before using it in production.
12. Design Exercise
Design a multi-agent lesson-builder:
planner_agent
-> researcher_agent
-> writer_agent
-> reviewer_agent
-> approval_gate
-> final_lesson
For each agent, define:
- Role.
- Goal.
- Allowed tools.
- Input artifacts.
- Output artifact.
- Stop condition.
- Failure behavior.
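If it helps to write the design down in a checkable form, the seven fields above map naturally onto a small record type. The sketch below is one possible template; `AgentSpec`, `missing_fields`, and the example values (including the `lesson_plan` artifact name) are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AgentSpec:
    role: str
    goal: str
    allowed_tools: List[str] = field(default_factory=list)
    input_artifacts: List[str] = field(default_factory=list)
    output_artifact: str = ""
    stop_condition: str = ""
    failure_behavior: str = ""

    def missing_fields(self) -> List[str]:
        # Tools and inputs may legitimately be empty; the other fields may not.
        required = ["role", "goal", "output_artifact", "stop_condition", "failure_behavior"]
        return [name for name in required if not getattr(self, name)]

planner = AgentSpec(
    role="planner_agent",
    goal="Break the lesson topic into an outline",
    output_artifact="lesson_plan",
    stop_condition="Stop after one lesson_plan is produced",
    failure_behavior="Return an error artifact and halt the workflow",
)
```

Calling `missing_fields()` on each spec is a quick way to spot an agent whose stop condition or failure behavior was left undefined.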
13. Common Beginner Mistakes
Mistake 1: Too many agents.
Correction: Start with two or three roles.
Mistake 2: Every agent sees everything.
Correction: Give each agent only the context it needs.
Mistake 3: Every agent has every tool.
Correction: Match tools to roles.
Mistake 4: Agents talk but do not produce artifacts.
Correction: Require named outputs.
Mistake 5: No final authority.
Correction: Add a judge, reviewer, gate, or workflow rule.
Mistake 6: No stopping rule.
Correction: Add turn limits, artifact requirements, and budget limits.
14. Multi-Agent Design Checklist
Before building, answer:
- Why do we need more than one agent?
- What is each agent's role?
- What artifact does each agent produce?
- Which tools can each agent use?
- What context does each agent receive?
- What memory can each agent read or write?
- How do agents coordinate?
- What happens if agents disagree?
- What stops the system?
- What is logged in the trace?
- What requires human approval?
15. Summary
Multi-agent systems are powerful when they divide responsibility clearly. The safe beginner pattern is:
specialized roles + limited tools + explicit artifacts + gates + traces
Do not measure sophistication by the number of agents. Measure it by clarity, control, and quality.
Exercises
Exercise 1: Run The Debate
python3 -m src.mini_meshflow run examples/07_multi_agent_debate.json
Write down each agent-like step and the artifact it produces.
Exercise 2: Role Boundaries
Design three agents for a training-course builder:
- Researcher.
- Writer.
- Reviewer.
For each, list allowed tools and blocked tools.
Exercise 3: Context Boundary
Write the exact context each agent should receive. Keep each context package short and role-specific.
Exercise 4: Add A Gate
Choose one action that should require approval. Explain what artifact the gate checks and what happens if approval is denied.