How to Build an Advanced Agentic AI System with Planning, Tool Calling, Memory, and Self-Critique Using OpenAI API

In this tutorial, we build an advanced agentic AI system using the OpenAI API and a hidden terminal prompt for the API key. We design the agent as a small pipeline of specialized roles: planner, tool-using executor, and critic, so that we can separate strategy, action, and quality control. We also integrate structured tools (calculator, mini knowledge-base search, JSON extraction, and file writing) so the agent can reliably compute, retrieve guidance, produce structured outputs, and save artifacts as deliverables.

!pip -q install -U openai


import os, json, re, math, hashlib
from dataclasses import dataclass, field
from typing import Any, Dict, List
from getpass import getpass
from openai import OpenAI


# Prompt for the key only when it is not already set in the environment.
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("Enter OPENAI_API_KEY (hidden): ").strip()


assert os.environ["OPENAI_API_KEY"], "OPENAI_API_KEY required"


# The client reads OPENAI_API_KEY from the environment.
client = OpenAI()
MODEL = "gpt-5.2"

We install the OpenAI SDK and import only what we need to keep the notebook lightweight and reproducible in Colab. We take the API key via getpass() so it stays hidden and never appears in the notebook output or code. We then create an OpenAI client and set the model string once so the rest of the system can reuse it consistently.
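The key-handling step can also be wrapped in a small helper, which makes it easier to reuse and to test without a real terminal. This is a minimal sketch, not part of the original notebook: the function name ensure_api_key and its env and prompt parameters are hypothetical and exist only so the prompt can be swapped out.

```python
import os
from getpass import getpass
from typing import Callable, MutableMapping


def ensure_api_key(
    env: MutableMapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass,
    name: str = "OPENAI_API_KEY",
) -> str:
    """Return the key from env, prompting (hidden) only when it is missing."""
    key = env.get(name, "").strip()
    if not key:
        # getpass does not echo what is typed, so the key stays out of notebook output.
        key = prompt(f"Enter {name} (hidden): ").strip()
        if not key:
            raise ValueError(f"{name} required")
        env[name] = key
    return key
```

Calling ensure_api_key() with no arguments behaves like the notebook cell above: it prompts once and then reuses the value stored in the environment.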

KB = [
   {"title": "Agent Protocol: Execution", "text": "Use tools only when necessary. Prefer short intermediate notes. Always verify numeric results."},
   {"title": "Policy: Output Quality", "text": "Final answers must include steps, checks, and deliverables. Emails must include subject and next steps."},
   {"title": "Playbook: Meeting Follow-up", "text": "Summarize decisions. List action items with owner and due date. Draft concise follow-up."},
]


def _safe_calc(expr: str):
    # Whitelist digits, arithmetic operators, parentheses, and whitespace.
    allowed = set("0123456789+-*/().% eE")
    if any(ch not in allowed for ch in expr): return {"ok": False, "error": "Invalid characters"}
    # Reject every letter and underscore (this also rejects scientific notation such as 1e3).
    if re.search(r"[A-Za-z_]", expr): return {"ok": False, "error": "Variables not allowed"}
    try:
        val = eval(expr, {"__builtins__": {}}, {"math": math})
        return {"ok": True, "expression": expr, "value": val}
    except Exception as e:
        return {"ok": False, "error": str(e)}


def _kb_search(query: str, k: int = 3):
    q = query.lower()
    scored = []
    for item in KB:
        hay = (item["title"] + " " + item["text"]).lower()
        # Score = number of distinct query tokens that occur in the entry.
        score = sum(1 for tok in set(re.findall(r"\w+", q)) if tok in hay)
        scored.append((score, item))
    scored.sort(key=lambda x: x[0], reverse=True)
    return {"ok": True, "results": [it for _, it in scored[:k]]}


def _extract_json(text: str):
    # Greedy match from the first "{" to the last "}" in the text.
    m = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if not m: return {"ok": False, "error": "No JSON found"}
    try:
        return {"ok": True, "json": json.loads(m.group(0))}
    except Exception as e:
        return {"ok": False, "error": str(e), "raw": m.group(0)[:1500]}


def _write_file(path: str, content: str):
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w", encoding="utf-8") as f: f.write(content)
    # A short content hash makes it easy to confirm what was written.
    sha = hashlib.sha256(content.encode()).hexdigest()[:16]
    return {"ok": True, "path": path, "sha16": sha, "bytes": len(content.encode("utf-8"))}

We define a small internal "knowledge base" to simulate playbooks or team documentation that the agent can consult via a tool call. We implement tools that return structured dictionaries to keep tool outputs machine-readable and robust. We include a safe calculator, a keyword-based KB search, a JSON extractor for structured parsing, and a file writer to save final deliverables as artifacts.
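The calculator above relies on a character whitelist in front of eval. One possible alternative, shown here only as a sketch and not as the notebook's method, is to parse the expression with Python's ast module and evaluate only the node types we explicitly allow. The names ast_calc and _eval_node are hypothetical; the return value follows the same structured-dictionary convention as the other tools.

```python
import ast
import operator

# Only these operators are evaluated; anything else (calls, names, **) is rejected.
_BIN_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _eval_node(node):
    # Numeric literals only (bool is excluded because type() must match exactly).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError(f"Unsupported syntax: {type(node).__name__}")


def ast_calc(expr: str):
    """Evaluate +, -, *, /, % on numeric literals and return a structured dict."""
    try:
        tree = ast.parse(expr, mode="eval")
        return {"ok": True, "expression": expr, "value": _eval_node(tree.body)}
    except Exception as e:
        return {"ok": False, "error": str(e)}
```

Because nothing is passed to eval, function calls and attribute access never run, and scientific notation such as 1e3 is accepted as an ordinary numeric literal.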

# Map tool names (as exposed to the model) to Python callables.
TOOLS = {
    "calc": lambda expression: _safe_calc(expression),
    "kb_search": lambda query, k=3: _kb_search(query, int(k)),
    "extract_json": lambda text: _extract_json(text),
    "write_file": lambda path, content: _write_file(path, content),
}


TOOL_SCHEMAS = [
    {"type": "function","function":{"name":"calc","description":"Safely compute a numeric expression.","parameters":{"type":"object","properties":{"expression":{"type":"string"}},"required":["expression"]}}},
    {"type": "function","function":{"name":"kb_search","description":"Search internal mini knowledge base.","parameters":{"type":"object","properties":{"query":{"type":"string"},"k":{"type":"integer","default":3}},"required":["query"]}}},
    {"type": "function","function":{"name":"extract_json","description":"Extract and parse first JSON object from text.","parameters":{"type":"object","properties":{"text":{"type":"string"}},"required":["text"]}}},
    {"type": "function","function":{"name":"write_file","description":"Write content to a file path.","parameters":{"type":"object","properties":{"path":{"type":"string"},"content":{"type":"string"}},"required":["path","content"]}}},
]


@dataclass
class AgentState:
    goal: str
    memory: List[str] = field(default_factory=list)
    trace: List[Dict[str, Any]] = field(default_factory=list)


def chat(messages, tools=None, tool_choice="auto", temperature=0.2):
    kwargs = dict(
        model=MODEL,
        messages=messages,
        temperature=temperature,
    )
    # Send tool_choice only together with tools.
    if tools is not None:
        kwargs["tools"] = tools
        kwargs["tool_choice"] = tool_choice
    return client.chat.completions.create(**kwargs)


def run_tool(name, args):
    fn = TOOLS.get(name)
    if not fn: return {"ok": False, "error": f"Unknown tool: {name}"}
    try:
        return fn(**args)
    except Exception as e:
        # Report failures as data so the model can react to them.
        return {"ok": False, "error": str(e), "args": args}

We register our Python tools in a mapping so we can call them by name during function calling. We declare tool schemas so the model can call tools with correct argument structures. We define AgentState to store the goal, memory, and tool-call trace, allowing us to inspect what happened and debug the agent's behavior. We implement a safe chat() wrapper that only includes tool_choice when tools are provided, which avoids the 400 error the API can return when tool_choice is sent without tools.
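Since the schemas already describe each tool's arguments, they can also be used to check model-supplied arguments before dispatch. The sketch below is an optional addition, not something the notebook does: check_args is a hypothetical helper, KB_SEARCH_SCHEMA is a standalone copy of the kb_search schema so the example runs on its own, and the type check covers only a few JSON Schema types.

```python
from typing import Any, Dict, List

# Standalone copy of the kb_search schema, in the same shape as the tool schemas above.
KB_SEARCH_SCHEMA = {
    "type": "function",
    "function": {
        "name": "kb_search",
        "description": "Search internal mini knowledge base.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}, "k": {"type": "integer", "default": 3}},
            "required": ["query"],
        },
    },
}

# Partial mapping from JSON Schema type names to Python types (an assumption of this sketch).
_PY_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def check_args(schema: Dict[str, Any], args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look usable."""
    params = schema["function"]["parameters"]
    props = params.get("properties", {})
    problems = [f"missing required argument: {k}" for k in params.get("required", []) if k not in args]
    problems += [f"unexpected argument: {k}" for k in args if k not in props]
    for k, v in args.items():
        expected = _PY_TYPES.get(props.get(k, {}).get("type"))
        if expected and not isinstance(v, expected):
            problems.append(f"wrong type for {k}: expected {props[k]['type']}")
    return problems
```

A dispatcher could call check_args first and return the problems as an error dictionary, which gives the model a more specific message than a Python TypeError.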

PLANNER_SYS = """You are a senior planner.
Return STRICT JSON with keys:
goal (string), steps (array of strings), tool_checkpoints (array of strings)."""


EXECUTOR_SYS = """You are a tool-using executor.
Use tools when needed. Keep intermediate notes short.
When done, return:
1) DRAFT output
2) Verification checklist"""


CRITIC_SYS = """You are a critic.
Given goal + draft, return:
- Issues (bullets)
- Fixes (bullets)
- Improved final answer (clean)"""


def plan(state: AgentState):
    r = chat(
        [{"role":"system","content":PLANNER_SYS},{"role":"user","content":state.goal}],
        tools=None,
        temperature=0.1,
    )
    txt = r.choices[0].message.content or ""
    parsed = _extract_json(txt)
    # Fall back to a trivial plan if the planner did not return parseable JSON.
    if not parsed.get("ok"):
        return {"goal": state.goal, "steps": ["Proceed directly (planner JSON parse failed)."], "tool_checkpoints": []}
    return parsed["json"]


def execute(state: AgentState, plan_obj: Dict[str, Any]):
    msgs = [
        {"role":"system","content":EXECUTOR_SYS},
        {"role":"user","content":f"GOAL:\n{state.goal}\n\nPLAN:\n{json.dumps(plan_obj, indent=2)}\n\nMEMORY:\n" + "\n".join(f"- {m}" for m in state.memory[-10:])}
    ]
    # Cap the number of model turns so a tool loop cannot run forever.
    for _ in range(12):
        r = chat(msgs, tools=TOOL_SCHEMAS, tool_choice="auto", temperature=0.2)
        msg = r.choices[0].message
        tool_calls = getattr(msg, "tool_calls", None)
        if tool_calls:
            msgs.append({"role":"assistant","content":msg.content or "", "tool_calls": tool_calls})
            for tc in tool_calls:
                name = tc.function.name
                args = json.loads(tc.function.arguments or "{}")
                out = run_tool(name, args)
                state.trace.append({"tool": name, "args": args, "out": out})
                # Each tool result is returned under the id of the call that requested it.
                msgs.append({"role":"tool","tool_call_id": tc.id, "content": json.dumps(out)})
            continue
        # No tool calls: the model has produced its draft.
        return msg.content or ""
    return "Executor stopped (iteration limit reached)."

We define three role prompts to separate the agent's responsibilities: the planner produces a structured plan, the executor performs the task and uses tools as needed, and the critic improves the final output. We implement plan() to request strict JSON, and execute() as a loop that makes model calls, detects tool calls, executes them in Python, and then feeds their outputs back to the model. This creates a genuine tool-using agent rather than a single-shot text generator.
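The shape of the executor loop is easier to see without the API in the way. The sketch below replays the same pattern offline, assuming a scripted stand-in for the model: tool_loop, scripted_model, and the add tool are all hypothetical, and the reply dictionaries are a simplified format of our own rather than the SDK's response objects.

```python
import json
from typing import Any, Callable, Dict, List


def tool_loop(model: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
              tools: Dict[str, Callable[..., Any]],
              messages: List[Dict[str, Any]],
              max_iters: int = 12) -> str:
    """Call the model, run any requested tools, feed results back, repeat."""
    for _ in range(max_iters):
        reply = model(messages)
        calls = reply.get("tool_calls") or []
        if not calls:
            return reply.get("content") or ""  # no tool request: this is the answer
        messages.append({"role": "assistant", "content": reply.get("content") or "", "tool_calls": calls})
        for call in calls:
            fn = tools.get(call["name"])
            out = fn(**call["arguments"]) if fn else {"ok": False, "error": f"Unknown tool: {call['name']}"}
            messages.append({"role": "tool", "tool_call_id": call["id"], "content": json.dumps(out)})
    return "Executor stopped (iteration limit reached)."


def scripted_model(messages):
    # Stand-in for the API: request one tool call, then answer using its result.
    tool_msgs = [m for m in messages if m["role"] == "tool"]
    if not tool_msgs:
        return {"content": None, "tool_calls": [{"id": "call_1", "name": "add", "arguments": {"a": 2, "b": 3}}]}
    return {"content": f"The sum is {json.loads(tool_msgs[-1]['content'])['value']}."}


history = [{"role": "user", "content": "What is 2 + 3?"}]
answer = tool_loop(scripted_model, {"add": lambda a, b: {"ok": True, "value": a + b}}, history)
print(answer)
```

After the run, history holds the user message, the assistant's tool request, and the tool result, which is the same message sequence the real executor builds before the model writes its draft.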

def critique(state: AgentState, draft: str):
    # The trace is truncated to 9000 characters to keep the prompt bounded.
    r = chat(
        [{"role":"system","content":CRITIC_SYS},{"role":"user","content":f"GOAL:\n{state.goal}\n\nDRAFT:\n{draft}\n\nTRACE:\n{json.dumps(state.trace, indent=2)[:9000]}"}],
        tools=None,
        temperature=0.2,
    )
    return r.choices[0].message.content or draft


def run_agent(goal: str):
    state = AgentState(goal=goal)
    state.memory.append("Use kb_search if you need internal guidance or formatting playbooks.")
    plan_obj = plan(state)
    draft = execute(state, plan_obj)
    final = critique(state, draft)
    return {"plan": plan_obj, "draft": draft, "final": final, "trace": state.trace}


demo_goal = """
From this transcript, produce:
A) concise meeting summary
B) action items as JSON array with fields: owner, action, due_date (or null)
C) follow-up email (subject + body)
D) Save output to /content/meeting_followup.md using write_file


Transcript:
- Decision: Ship v2 dashboard on March 15.
- Risk: Data latency might spike; Priya will run load tests.
- Amir will update the KPI definitions doc and share with finance.
- Next check-in: Tuesday. Owner: Nikhil.
"""


result = run_agent(demo_goal)
print(result["final"])

We implement critique() to review the draft and produce a polished final response, using the tool trace as additional evidence for debugging and accountability. We implement run_agent() to orchestrate the full loop: initialize state, plan, execute with tools, then critique and finalize. Finally, we provide a demo goal that forces a realistic deliverable: a summary, structured action items as JSON, a follow-up email, and saving output to a file via the write_file tool.
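Because the demo goal asks for action items as a JSON array with owner, action, and due_date fields, that part of the output can be checked mechanically. The following is a rough sketch of such a check: validate_action_items is a hypothetical helper, and it assumes the JSON array has already been isolated from the rest of the agent's answer.

```python
import json
from typing import Any, List

REQUIRED_FIELDS = ("owner", "action", "due_date")


def validate_action_items(raw: str) -> List[str]:
    """Return a list of problems found in a JSON array of action items."""
    try:
        items: Any = json.loads(raw)
    except json.JSONDecodeError as e:
        return [f"not valid JSON: {e.msg}"]
    if not isinstance(items, list):
        return ["top-level value must be a JSON array"]
    problems = []
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"item {i}: not an object")
            continue
        for name in REQUIRED_FIELDS:
            if name not in item:
                problems.append(f"item {i}: missing {name}")
        # due_date may be null, as the goal allows, but otherwise should be a string.
        if item.get("due_date") is not None and not isinstance(item["due_date"], str):
            problems.append(f"item {i}: due_date must be a string or null")
    return problems


# Items taken from the demo transcript; the transcript gives no due dates for them.
sample = json.dumps([
    {"owner": "Priya", "action": "Run load tests", "due_date": None},
    {"owner": "Amir", "action": "Update the KPI definitions doc and share with finance", "due_date": None},
])
print(validate_action_items(sample))
```

An empty list means the structure matches what the goal requested; any problems could be appended to the critic's input so the final answer gets corrected.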

In conclusion, we implemented a practical agentic architecture that cleanly separates planning, tool execution, and critique-based refinement. We connected the model to real tools with strict schemas, recorded a transparent tool trace for debugging, and produced artifacts by saving the final output to a file. With this structure, we can extend the agent to more production-grade workflows by adding tool retry policies, parallel sub-agents, richer memory (vector + symbolic), and evaluation harnesses to measure how well the agent plans, uses tools, and improves output over time.
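As one illustration of the tool retry policies mentioned above, a tool call can be wrapped in a retry loop with exponential backoff. This is only a sketch under our own assumptions: call_with_retry is a hypothetical helper, it treats any result without a truthy "ok" field as a failure, and the sleep parameter is injectable so the delays can be observed in tests.

```python
import time
from typing import Any, Callable, Dict


def call_with_retry(fn: Callable[..., Dict[str, Any]], args: Dict[str, Any],
                    attempts: int = 3, base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fn(**args) up to `attempts` times, doubling the delay after each failure."""
    last_error = "no attempts made"
    for attempt in range(attempts):
        try:
            out = fn(**args)
            if out.get("ok"):
                return out
            last_error = str(out.get("error"))
        except Exception as e:
            last_error = str(e)
        # Back off before the next try, but not after the final one.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return {"ok": False, "error": last_error, "attempts": attempts}
```

Retrying makes most sense for tools with transient failures, such as network or file-system access; a deterministic tool like the calculator would fail the same way on every attempt.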

The post How to Build an Advanced Agentic AI System with Planning, Tool Calling, Memory, and Self-Critique Using OpenAI API appeared first on MarkTechPost.
