
How to Design a Production-Grade CAMEL Multi-Agent System with Planning, Tool Use, Self-Consistency, and Critique-Driven Refinement

In this tutorial, we implement an advanced agentic AI system using the CAMEL framework, orchestrating several specialized agents to collaboratively solve a complex task. We design a structured multi-agent pipeline consisting of a planner, researcher, writer, critic, and rewriter, each with clearly defined responsibilities and schema-constrained outputs. We integrate tool use, self-consistency sampling, structured validation with Pydantic, and iterative critique-driven refinement to build a robust, research-backed technical brief generator. Through this process, we demonstrate how modern agent architectures combine planning, reasoning, external tool interaction, and autonomous quality control within a single coherent workflow.

import os, sys, re, json, subprocess
from typing import List, Dict, Any, Optional, Tuple


def _pip_install(pkgs: List[str]):
   subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", "-U"] + pkgs)


_pip_install(["camel-ai[web_tools]~=0.2", "pydantic>=2.7", "rich>=13.7"])


from pydantic import BaseModel, Field
from rich.console import Console
from rich.panel import Panel
from rich.table import Table


console = Console()


def _get_colab_secret(name: str) -> Optional[str]:
    try:
        from google.colab import userdata
        v = userdata.get(name)
        return v if v else None
    except Exception:
        return None


def ensure_openai_key():
    if os.getenv("OPENAI_API_KEY"):
        return
    v = _get_colab_secret("OPENAI_API_KEY")
    if v:
        os.environ["OPENAI_API_KEY"] = v
        return
    try:
        from getpass import getpass
        key = getpass("Enter OPENAI_API_KEY (input hidden): ").strip()
        if key:
            os.environ["OPENAI_API_KEY"] = key
    except Exception:
        pass


ensure_openai_key()
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set. Add it via Colab Secrets (OPENAI_API_KEY) or paste it when prompted.")

We set up the execution environment and install all required dependencies directly inside Colab. We securely configure the OpenAI API key using either Colab secrets or manual input. We also initialize the console utilities that allow us to render structured outputs cleanly during execution.
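The key-resolution logic above follows a layered lookup: environment variable first, then Colab secrets, then an interactive prompt. A minimal standalone sketch of that pattern (the function name `resolve_key` and the demo variable names are illustrative, not part of the tutorial's API):

```python
import os

def resolve_key(name, fallbacks=()):
    # Return the first non-empty value: the environment variable itself,
    # then each fallback callable in order. A fallback hit is cached back
    # into os.environ so later lookups find it immediately.
    v = os.getenv(name)
    if v:
        return v
    for fb in fallbacks:
        try:
            v = fb()
        except Exception:
            v = None
        if v:
            os.environ[name] = v
            return v
    return None

os.environ["DEMO_KEY"] = "sk-demo"
resolve_key("DEMO_KEY")                                      # env var wins
resolve_key("DEMO_MISSING_KEY", fallbacks=[lambda: "fb-123"])  # fallback fires
```

Each callable in `fallbacks` plays the role of `_get_colab_secret` or the `getpass` prompt, so the same helper covers both the secret store and interactive entry.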

from camel.models import ModelFactory
from camel.types import ModelPlatformType, ModelType
from camel.agents import ChatAgent
from camel.toolkits import SearchToolkit


def make_model(temperature: float = 0.2):
   return ModelFactory.create(
       model_platform=ModelPlatformType.OPENAI,
       model_type=ModelType.GPT_4O,
       model_config_dict={"temperature": float(temperature)},
   )


def strip_code_fences(s: str) -> str:
    s = s.strip()
    s = re.sub(r"^```(?:json)?\s*", "", s, flags=re.IGNORECASE)
    s = re.sub(r"\s*```$", "", s)
    return s.strip()


def extract_first_json_object(s: str) -> str:
    s2 = strip_code_fences(s)
    start = None
    stack = []
    for i, ch in enumerate(s2):
        if ch == "{":
            if start is None:
                start = i
            stack.append("{")
        elif ch == "}":
            if stack:
                stack.pop()
                if not stack and start is not None:
                    return s2[start:i+1]
    m = re.search(r"\{[\s\S]*\}", s2)
    if m:
        return m.group(0)
    return s2

We import the core CAMEL components and define the model factory used across all agents. We implement helper utilities to clean and extract JSON reliably from LLM responses. This ensures that our multi-agent pipeline remains structurally robust even when models return formatted text.
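To see why the brace-balancing scan matters, here is a quick standalone check of the extraction logic (restated in compact form so the snippet runs on its own): even when the model wraps its answer in prose and a fenced block, the first balanced `{...}` object is still recovered and parses as JSON.

```python
import json
import re

def strip_code_fences(s: str) -> str:
    # Remove a leading ```/```json fence and a trailing ``` fence, if present.
    s = s.strip()
    s = re.sub(r"^```(?:json)?\s*", "", s, flags=re.IGNORECASE)
    s = re.sub(r"\s*```$", "", s)
    return s.strip()

def extract_first_json_object(s: str) -> str:
    # Scan for the first balanced {...} block; ignore text around it.
    s2 = strip_code_fences(s)
    start, depth = None, 0
    for i, ch in enumerate(s2):
        if ch == "{":
            if start is None:
                start = i
            depth += 1
        elif ch == "}" and depth:
            depth -= 1
            if depth == 0:
                return s2[start:i + 1]
    return s2

raw = 'Here is the plan:\n```json\n{"goal": "demo", "tasks": []}\n```\nDone.'
obj = json.loads(extract_first_json_object(raw))
```

The same helper also handles nested objects, since the depth counter only closes the span when every opened brace has been matched.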

class PlanTask(BaseModel):
    id: str = Field(..., min_length=1)
    title: str = Field(..., min_length=1)
    objective: str = Field(..., min_length=1)
    deliverable: str = Field(..., min_length=1)
    tool_hints: List[str] = Field(default_factory=list)
    risks: List[str] = Field(default_factory=list)


class Plan(BaseModel):
    goal: str
    assumptions: List[str] = Field(default_factory=list)
    tasks: List[PlanTask]
    success_criteria: List[str] = Field(default_factory=list)


class EvidenceItem(BaseModel):
    query: str
    notes: str
    key_points: List[str] = Field(default_factory=list)


class Critique(BaseModel):
    score_0_to_10: float = Field(..., ge=0, le=10)
    strengths: List[str] = Field(default_factory=list)
    issues: List[str] = Field(default_factory=list)
    fix_plan: List[str] = Field(default_factory=list)


class RunConfig(BaseModel):
    goal: str
   max_tasks: int = 5
   max_searches_per_task: int = 2
   max_revision_rounds: int = 1
   self_consistency_samples: int = 2


DEFAULT_GOAL = "Create a concise, evidence-backed technical brief explaining CAMEL (the multi-agent framework), its core abstractions, and a practical recipe to build a tool-using multi-agent pipeline (planner/researcher/writer/critic) with safeguards."


cfg = RunConfig(goal=DEFAULT_GOAL)


search_tool = SearchToolkit().search_duckduckgo

We define all structured schemas using Pydantic for planning, evidence, critique, and runtime configuration. We formalize the agent communication protocol so that every step is validated and typed. This allows us to turn free-form LLM outputs into predictable, production-ready data structures.
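The practical payoff of the schemas is that malformed LLM output fails loudly instead of propagating. A small sketch, reusing the `PlanTask` definition from above, showing a valid payload parsing and an incomplete one raising `ValidationError`:

```python
from typing import List
from pydantic import BaseModel, Field, ValidationError

class PlanTask(BaseModel):
    id: str = Field(..., min_length=1)
    title: str = Field(..., min_length=1)
    objective: str = Field(..., min_length=1)
    deliverable: str = Field(..., min_length=1)
    tool_hints: List[str] = Field(default_factory=list)
    risks: List[str] = Field(default_factory=list)

# A complete task parses, and optional list fields default to [].
good = PlanTask.model_validate_json(
    '{"id": "T1", "title": "Survey docs", '
    '"objective": "Find core abstractions", "deliverable": "notes"}'
)

# A truncated response is rejected at the schema boundary.
try:
    PlanTask.model_validate_json('{"id": "T1"}')
    rejected = False
except ValidationError:
    rejected = True
```

This is the same check the pipeline relies on every time an agent's JSON is fed into `model_validate_json`.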

planner_system = (
    "You are a senior agent architect. Produce a compact, high-leverage plan for achieving the goal.\n"
    "Return ONLY valid JSON that matches this schema:\n"
    "{{\"goal\": \"...\", \"assumptions\": [\"...\"], \"tasks\": "
    "[{{\"id\": \"T1\", \"title\": \"...\", \"objective\": \"...\", \"deliverable\": \"...\", "
    "\"tool_hints\": [\"...\"], \"risks\": [\"...\"]}}], "
    "\"success_criteria\": [\"...\"]}}\n"
    "Constraints: tasks length <= {max_tasks}. Each task should be executable with web search + reasoning."
).format(max_tasks=cfg.max_tasks)


planner = ChatAgent(system_message=planner_system, model=make_model(0.1))


researcher = ChatAgent(
    system_message=(
        "You are a meticulous research agent. Use the web search tool when helpful.\n"
        "You must:\n"
        "- Search for authoritative sources (docs, official repos) first.\n"
        "- Write notes that are directly relevant to the task objective.\n"
        "- Return ONLY valid JSON:\n"
        "{\"query\": \"...\", \"notes\": \"...\", \"key_points\": [\"...\"]}\n"
        "Do not include markdown code fences."
    ),
    model=make_model(0.2),
    tools=[search_tool],
)


writer = ChatAgent(
    system_message=(
        "You are a technical writer agent. You will be given a goal, a plan, and evidence notes.\n"
        "Write a deliverable that is clear, actionable, and concise.\n"
        "Include:\n"
        "- A crisp overview\n"
        "- Key abstractions and how they connect\n"
        "- A practical implementation recipe\n"
        "- Minimal caveats/limitations\n"
        "Do NOT fabricate citations. If evidence is thin, state uncertainty.\n"
        "Return plain text only."
    ),
    model=make_model(0.3),
)


critic = ChatAgent(
    system_message=(
        "You are a strict reviewer. Evaluate the draft against the goal, correctness, and completeness.\n"
        "Return ONLY valid JSON:\n"
        "{\"score_0_to_10\": 0.0, \"strengths\": [\"...\"], \"issues\": [\"...\"], \"fix_plan\": [\"...\"]}\n"
        "Do not include markdown code fences."
    ),
    model=make_model(0.0),
)


rewriter = ChatAgent(
    system_message=(
        "You are a revising editor. Improve the draft based on the critique. Preserve factual accuracy.\n"
        "Return the improved draft as plain text only."
    ),
    model=make_model(0.25),
)

We construct the specialized agents: planner, researcher, writer, critic, and rewriter. We define their system roles carefully to enforce task boundaries and structured behavior. This establishes the modular multi-agent architecture that enables collaboration and iterative refinement.

def plan_goal(goal: str) -> Plan:
    resp = planner.step("GOAL:\n" + goal + "\n\nReturn JSON plan now.")
    raw = resp.msgs[0].content if hasattr(resp, "msgs") else resp.msg.content
    js = extract_first_json_object(raw)
    try:
        return Plan.model_validate_json(js)
    except Exception:
        return Plan.model_validate(json.loads(js))


def research_task(task: PlanTask, goal: str, k: int) -> EvidenceItem:
    prompt = (
        "GOAL:\n" + goal + "\n\nTASK:\n" + task.model_dump_json(indent=2) + "\n\n"
        f"Perform research. Use at most {k} web searches. First search official documentation or GitHub if relevant."
    )
    resp = researcher.step(prompt)
    raw = resp.msgs[0].content if hasattr(resp, "msgs") else resp.msg.content
    js = extract_first_json_object(raw)
    try:
        return EvidenceItem.model_validate_json(js)
    except Exception:
        return EvidenceItem.model_validate(json.loads(js))


def draft_with_self_consistency(goal: str, plan: Plan, evidence: List[Tuple[PlanTask, EvidenceItem]], n: int) -> str:
    packed_evidence = []
    for t, ev in evidence:
        packed_evidence.append({
            "task_id": t.id,
            "task_title": t.title,
            "objective": t.objective,
            "notes": ev.notes,
            "key_points": ev.key_points
        })
    payload = {
        "goal": goal,
        "assumptions": plan.assumptions,
        "tasks": [t.model_dump() for t in plan.tasks],
        "evidence": packed_evidence,
        "success_criteria": plan.success_criteria,
    }
    drafts = []
    for _ in range(max(1, n)):
        resp = writer.step("INPUT:\n" + json.dumps(payload, ensure_ascii=False, indent=2))
        txt = resp.msgs[0].content if hasattr(resp, "msgs") else resp.msg.content
        drafts.append(txt.strip())
    if len(drafts) == 1:
        return drafts[0]
    chooser = ChatAgent(
        system_message=(
            "You are a selector agent. Choose the best draft among the candidates for correctness, clarity, and actionability.\n"
            "Return ONLY the winning draft text, unchanged."
        ),
        model=make_model(0.0),
    )
    resp = chooser.step("GOAL:\n" + goal + "\n\nCANDIDATES:\n" + "\n\n---\n\n".join([f"[DRAFT {i+1}]\n{d}" for i, d in enumerate(drafts)]))
    return (resp.msgs[0].content if hasattr(resp, "msgs") else resp.msg.content).strip()

We implement the orchestration logic for planning, research, and self-consistent drafting. We aggregate structured evidence and generate multiple candidate drafts to improve robustness. We then select the best draft via an additional selection agent, simulating ensemble-style reasoning.
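The pipeline delegates draft selection to an LLM chooser, but the same ensemble idea can be approximated deterministically: pick the candidate whose content overlaps most with the rest of the pool, a crude consensus proxy. This is a hypothetical fallback sketch, not part of the tutorial's pipeline (`pick_most_representative` is an invented name):

```python
import re
from collections import Counter
from typing import List

def pick_most_representative(drafts: List[str]) -> str:
    # Tokenize each draft into a set of lowercase words.
    token_sets = [set(re.findall(r"[a-z]+", d.lower())) for d in drafts]
    # Count in how many drafts each word appears.
    vocab = Counter(t for s in token_sets for t in s)
    # Score each draft by the document frequency of its words:
    # the draft sharing the most vocabulary with its peers wins.
    scores = [sum(vocab[t] for t in s) for s in token_sets]
    return drafts[scores.index(max(scores))]

drafts = [
    "CAMEL coordinates planner and researcher agents.",
    "CAMEL coordinates planner, researcher, and writer agents with tools.",
    "Unrelated text about something else entirely.",
]
best = pick_most_representative(drafts)  # the second draft: widest overlap
```

Such a heuristic costs no extra model calls, which makes it a reasonable guard if the chooser agent fails or the budget is tight.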

def critique_text(goal: str, draft: str) -> Critique:
    resp = critic.step("GOAL:\n" + goal + "\n\nDRAFT:\n" + draft + "\n\nReturn critique JSON now.")
    raw = resp.msgs[0].content if hasattr(resp, "msgs") else resp.msg.content
    js = extract_first_json_object(raw)
    try:
        return Critique.model_validate_json(js)
    except Exception:
        return Critique.model_validate(json.loads(js))


def revise(goal: str, draft: str, critique: Critique) -> str:
    resp = rewriter.step(
        "GOAL:\n" + goal +
        "\n\nCRITIQUE:\n" + critique.model_dump_json(indent=2) +
        "\n\nDRAFT:\n" + draft +
        "\n\nRewrite now."
    )
    return (resp.msgs[0].content if hasattr(resp, "msgs") else resp.msg.content).strip()


def pretty_plan(plan: Plan):
    tab = Table(title="Agent Plan", show_lines=True)
    tab.add_column("ID", style="bold")
    tab.add_column("Title")
    tab.add_column("Objective")
    tab.add_column("Deliverable")
    for t in plan.tasks:
        tab.add_row(t.id, t.title, t.objective, t.deliverable)
    console.print(tab)


def run(cfg: RunConfig):
    console.print(Panel.fit("CAMEL Advanced Agentic Tutorial Runner", style="bold"))
    plan = plan_goal(cfg.goal)
    pretty_plan(plan)

    evidence = []
    for task in plan.tasks[: cfg.max_tasks]:
        ev = research_task(task, cfg.goal, cfg.max_searches_per_task)
        evidence.append((task, ev))

    console.print(Panel.fit("Drafting (self-consistency)", style="bold"))
    draft = draft_with_self_consistency(cfg.goal, plan, evidence, cfg.self_consistency_samples)

    for r in range(cfg.max_revision_rounds + 1):
        crit = critique_text(cfg.goal, draft)
        console.print(Panel.fit(f"Critique round {r+1} — score {crit.score_0_to_10:.1f}/10", style="bold"))
        if crit.strengths:
            console.print(Panel("Strengths:\n- " + "\n- ".join(crit.strengths), title="Strengths"))
        if crit.issues:
            console.print(Panel("Issues:\n- " + "\n- ".join(crit.issues), title="Issues"))
        if crit.fix_plan:
            console.print(Panel("Fix plan:\n- " + "\n- ".join(crit.fix_plan), title="Fix plan"))
        if crit.score_0_to_10 >= 8.5 or r >= cfg.max_revision_rounds:
            break
        draft = revise(cfg.goal, draft, crit)

    console.print(Panel.fit("FINAL DELIVERABLE", style="bold green"))
    console.print(draft)


run(cfg)

We implement the critique-and-revision loop to enforce quality control. We score the draft, identify weaknesses, and iteratively refine it as needed. Finally, we execute the full pipeline, producing a structured, research-backed deliverable through coordinated collaboration among agents.
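The loop's control flow, stripped of the agents, accepts a draft when the score clears the threshold or the round budget runs out, and revises otherwise. A self-contained sketch with stand-in critic and rewriter functions (`fake_critique`, `fake_revise`, and `refine` are hypothetical stubs; the score here just grows with draft length so the loop terminates predictably):

```python
def fake_critique(draft: str) -> dict:
    # Stand-in critic: score scales with draft length, capped at 10.
    return {"score": min(10.0, len(draft) / 10), "fix_plan": ["expand"]}

def fake_revise(draft: str, critique: dict) -> str:
    # Stand-in rewriter: each revision appends a marker.
    return draft + " (revised)"

def refine(draft: str, max_rounds: int = 3, accept_at: float = 8.5):
    # Critique, then either accept (score or budget) or revise and repeat.
    for r in range(max_rounds + 1):
        crit = fake_critique(draft)
        if crit["score"] >= accept_at or r >= max_rounds:
            return draft, crit["score"], r + 1
        draft = fake_revise(draft, crit)

final, score, rounds = refine("Short draft.")
# The 12-char draft never reaches 8.5, so all 3 revisions are spent:
# 4 critique rounds total, and the final text carries the revision marker.
```

Note that the loop runs `max_rounds + 1` critiques, matching the `run()` function above: the last round critiques but never revises.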

In conclusion, we built a production-style CAMEL-based multi-agent system that goes far beyond simple prompt chaining. We structured agent communication through validated schemas, incorporated web search tools for grounded reasoning, applied self-consistency to improve output reliability, and enforced quality using an internal critic loop. By combining these advanced ideas, we showed how we can construct scalable, modular, and reliable agentic pipelines suitable for real-world AI applications.




The post How to Design a Production-Grade CAMEL Multi-Agent System with Planning, Tool Use, Self-Consistency, and Critique-Driven Refinement appeared first on MarkTechPost.
