How to Build Human-in-the-Loop Plan-and-Execute AI Agents with Explicit User Approval Using LangGraph and Streamlit

In this tutorial, we build a human-in-the-loop travel booking agent that treats the user as a teammate rather than a passive observer. We design the system so the agent first reasons openly by drafting a structured travel plan, then deliberately pauses before taking any action. We expose this proposed plan in a live interface where we can inspect, edit, or reject it, and only after explicit approval do we allow the agent to execute tools. By combining LangGraph interrupts with a Streamlit frontend, we create a workflow that makes agent reasoning visible, controllable, and trustworthy instead of opaque and autonomous.


!pip -q install -U langgraph openai streamlit pydantic
!npm -q install -g localtunnel


import os, getpass, textwrap, json, uuid, time
if not os.environ.get("OPENAI_API_KEY"):
   os.environ["OPENAI_API_KEY"] = getpass.getpass("OPENAI_API_KEY (hidden input): ")
os.environ.setdefault("OPENAI_MODEL", "gpt-4.1-mini")

We set up the execution environment by installing the libraries and utilities needed for agent orchestration and UI exposure. We collect the OpenAI API key at runtime through a hidden prompt so it is never hardcoded in the notebook. We also set the model name upfront through an environment variable, which keeps the rest of the pipeline clean and reproducible.


app_code = r'''
import os, json, uuid
import streamlit as st
from typing import TypedDict, List, Dict, Any, Optional
from pydantic import BaseModel, Field
from openai import OpenAI


from langgraph.graph import StateGraph, START, END
from langgraph.types import Command, interrupt
from langgraph.checkpoint.memory import InMemorySaver




def tool_search_flights(origin: str, destination: str, depart_date: str, return_date: str, budget_usd: int) -> Dict[str, Any]:
   options = [
       {"airline": "SkyJet", "route": f"{origin}->{destination}", "depart": depart_date, "return": return_date, "price_usd": int(budget_usd*0.55)},
       {"airline": "AeroBlue", "route": f"{origin}->{destination}", "depart": depart_date, "return": return_date, "price_usd": int(budget_usd*0.70)},
       {"airline": "Nimbus Air", "route": f"{origin}->{destination}", "depart": depart_date, "return": return_date, "price_usd": int(budget_usd*0.62)},
   ]
   options = sorted(options, key=lambda x: x["price_usd"])
   return {"tool": "search_flights", "top_options": options[:2]}


def tool_search_hotels(city: str, nights: int, budget_usd: int, preferences: List[str]) -> Dict[str, Any]:
   base = max(60, int(budget_usd / max(nights, 1)))
   picks = [
       {"name": "Central Boutique", "city": city, "nightly_usd": int(base*0.95), "notes": ["walkable", "great reviews"]},
       {"name": "Riverside Stay", "city": city, "nightly_usd": int(base*0.80), "notes": ["quiet", "good value"]},
       {"name": "Modern Loft Hotel", "city": city, "nightly_usd": int(base*1.10), "notes": ["new", "gym"]},
   ]
   if "luxury" in [p.lower() for p in preferences]:
       picks = sorted(picks, key=lambda x: -x["nightly_usd"])
   else:
       picks = sorted(picks, key=lambda x: x["nightly_usd"])
   return {"tool": "search_hotels", "top_options": picks[:2]}


def tool_build_day_by_day(city: str, days: int, vibe: str) -> Dict[str, Any]:
   blocks = []
   for d in range(1, days+1):
       blocks.append({
           "day": d,
           "morning": f"{city}: coffee + a must-see landmark",
           "afternoon": f"{city}: {vibe} activity + local lunch",
           "evening": f"{city}: sunset spot + dinner + optional night walk"
       })
   return {"tool": "draft_itinerary", "days": blocks}
'''

We begin the Streamlit application source, held in the app_code string, with its imports and three deterministic tool functions that simulate flight search, hotel search, and itinerary drafting. We design these stubs to mimic the shape of real-world API responses while still running fully in a Colab environment. Every tool returns a structured dictionary, so the results are easy to inspect and audit in the UI.
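Because the stubs are pure functions of their arguments, a quick sanity check can confirm that repeated calls return identical, price-sorted results. The sketch below uses a condensed copy of `tool_search_flights` from the listing above so that it runs on its own; the airport codes and dates are illustrative values, not part of the original app.

```python
from typing import Any, Dict

def tool_search_flights(origin: str, destination: str, depart_date: str,
                        return_date: str, budget_usd: int) -> Dict[str, Any]:
    # Condensed copy of the stub: prices are fixed fractions of the budget.
    fractions = {"SkyJet": 0.55, "AeroBlue": 0.70, "Nimbus Air": 0.62}
    options = [
        {"airline": name, "route": f"{origin}->{destination}",
         "depart": depart_date, "return": return_date,
         "price_usd": int(budget_usd * frac)}
        for name, frac in fractions.items()
    ]
    options.sort(key=lambda x: x["price_usd"])
    return {"tool": "search_flights", "top_options": options[:2]}

# Illustrative arguments (assumed values for this example).
args = dict(origin="DXB", destination="IST", depart_date="2025-04-10",
            return_date="2025-04-15", budget_usd=1800)

first = tool_search_flights(**args)
second = tool_search_flights(**args)
airlines = [o["airline"] for o in first["top_options"]]

# Same inputs give the same output, and the two cheapest options are kept.
print(first == second, airlines)
```

A check like this is cheap to keep around when the stubs are later swapped for real APIs, since it documents which properties the rest of the workflow relies on.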


app_code += r'''
class TravelPlan(BaseModel):
   trip_title: str = Field(..., description="Short human-friendly title")
   origin: str
   destination: str
   depart_date: str
   return_date: str
   travelers: int = 1
   budget_usd: int = 1500
   preferences: List[str] = Field(default_factory=list)
   vibe: str = "balanced"
   lodging_nights: int = 4
   daily_outline: List[Dict[str, Any]] = Field(default_factory=list)
   tool_calls: List[Dict[str, Any]] = Field(default_factory=list)


class State(TypedDict):
   user_request: str
   plan: Dict[str, Any]
   approval: Dict[str, Any]
   execution: Dict[str, Any]


def make_llm_plan(state: State) -> Dict[str, Any]:
   client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
   model = os.environ.get("OPENAI_MODEL", "gpt-4.1-mini")


   sys = (
       "You are a travel planning agent. "
       "Return a JSON travel plan that matches the provided schema. "
       "Be realistic, concise, and include a tool_calls list describing what you want executed "
       "(e.g., search_flights, search_hotels, draft_itinerary)."
   )


   schema = TravelPlan.model_json_schema()


   resp = client.responses.create(
       model=model,
       input=[
           {"role":"system","content": sys},
           {"role":"user","content": state["user_request"]},
           {"role":"user","content": f"Schema (JSON): {json.dumps(schema)}"}
       ],
   )


   text = resp.output_text.strip()
   start = text.find("{")
   end = text.rfind("}")
   if start == -1 or end < start:
       raise ValueError("Model did not return JSON. Try again or change model.")
   raw = text[start:end+1]
   plan_obj = json.loads(raw)


   plan = TravelPlan(**plan_obj).model_dump()


   if not plan.get("tool_calls"):
       plan["tool_calls"] = [
           {"name":"search_flights", "args":{"origin": plan["origin"], "destination": plan["destination"], "depart_date": plan["depart_date"], "return_date": plan["return_date"], "budget_usd": plan["budget_usd"]}},
           {"name":"search_hotels", "args":{"city": plan["destination"], "nights": plan["lodging_nights"], "budget_usd": int(plan["budget_usd"]*0.35), "preferences": plan["preferences"]}},
           {"name":"draft_itinerary", "args":{"city": plan["destination"], "days": max(2, plan["lodging_nights"]+1), "vibe": plan["vibe"]}},
       ]


   return {"plan": plan}


def wait_for_approval(state: State) -> Dict[str, Any]:
   payload = {
       "kind": "approval",
       "message": "Review/edit the plan. Approve to execute tools.",
       "plan": state["plan"],
   }
   decision = interrupt(payload)
   return {"approval": decision}


def execute_tools(state: State) -> Dict[str, Any]:
   approval = state.get("approval") or {}
   if not approval.get("approved"):
       return {"execution": {"status": "not_executed", "reason": "User rejected or did not approve."}}


   plan = approval.get("edited_plan") or state["plan"]
   tool_calls = plan.get("tool_calls", [])


   results = []
   for call in tool_calls:
       name = call.get("name")
       args = call.get("args", {})
       if name == "search_flights":
           results.append(tool_search_flights(**args))
       elif name == "search_hotels":
           results.append(tool_search_hotels(**args))
       elif name == "draft_itinerary":
           results.append(tool_build_day_by_day(**args))
       else:
           results.append({"tool": name, "error": "Unknown tool (blocked for safety).", "args": args})


   return {"execution": {"status": "executed", "tool_results": results, "final_plan": plan}}
'''

We formalize the agent's reasoning with a Pydantic schema, TravelPlan, and ask the model to return a JSON plan that matches it rather than free-form text. The schema is passed in the prompt rather than enforced by the API, so we extract the JSON from the reply and validate it with Pydantic before it enters the workflow. We also inject a default set of tool calls if the model omits them, which guarantees a complete execution path.
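The `find`/`rfind` slice in `make_llm_plan` works when the reply contains a single JSON object, but it can over-capture if the model adds a stray brace in trailing text. One alternative, sketched here with only the standard library, scans for the first position where `json.JSONDecoder.raw_decode` succeeds. The helper name `extract_first_json_object` and the sample reply are hypothetical and not part of the original app.

```python
import json
from typing import Any, Dict

def extract_first_json_object(text: str) -> Dict[str, Any]:
    """Return the first JSON object embedded in text, or raise ValueError."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and ignores
            # whatever follows it.
            obj, _end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        pos = text.find("{", pos + 1)
    raise ValueError("No JSON object found in model output.")

# Hypothetical model reply with extra braces after the JSON payload.
reply = ('Here is the plan: {"origin": "Dubai", "destination": "Istanbul"} '
         '(edit as needed {optional})')
plan_obj = extract_first_json_object(reply)
print(plan_obj)
```

The returned dictionary can then be validated the same way as in the listing above, for example with `TravelPlan(**plan_obj)`.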


app_code += r'''
def build_graph():
   builder = StateGraph(State)
   builder.add_node("plan", make_llm_plan)
   builder.add_node("approve", wait_for_approval)
   builder.add_node("execute", execute_tools)


   builder.add_edge(START, "plan")
   builder.add_edge("plan", "approve")
   builder.add_edge("approve", "execute")
   builder.add_edge("execute", END)


   memory = InMemorySaver()
   graph = builder.compile(checkpointer=memory)
   return graph


st.set_page_config(page_title="Plan → Approve → Execute Travel Agent", layout="wide")
st.title("Human-in-the-Loop Travel Booking Agent (Plan → Approve/Edit → Execute)")


with st.sidebar:
   st.header("Runtime")
   if st.button("New Session / Thread"):
       st.session_state.thread_id = str(uuid.uuid4())
       st.session_state.ran_once = False
       st.session_state.interrupt_payload = None
       st.session_state.last_execution = None


thread_id = st.session_state.get("thread_id") or str(uuid.uuid4())
st.session_state.thread_id = thread_id


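# Streamlit re-runs this script on every interaction. Build the graph once per
# session so the in-memory checkpointer still holds the paused run on resume.
if "graph" not in st.session_state:
   st.session_state.graph = build_graph()
graph = st.session_state.graph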
config = {"configurable": {"thread_id": thread_id}}


st.caption(f"Thread ID: {thread_id}")


req = st.text_area(
   "Describe your trip request",
   value=st.session_state.get("user_request", "Plan a 5-day trip from Dubai to Istanbul in April. Budget $1800. Prefer museums, street food, and a relaxed pace."),
   height=120
)
st.session_state.user_request = req


colA, colB = st.columns([1,1])
run_plan = colA.button("1) Generate Plan (LLM)")
resume_btn = colB.button("2) Resume After Approval")


if run_plan:
   st.session_state.ran_once = True
   st.session_state.interrupt_payload = None
   st.session_state.last_execution = None


   initial = {"user_request": req, "plan": {}, "approval": {}, "execution": {}}
   out = graph.invoke(initial, config=config)


   if "__interrupt__" in out and out["__interrupt__"]:
       st.session_state.interrupt_payload = out["__interrupt__"][0].value
   else:
       st.session_state.last_execution = out.get("execution")


payload = st.session_state.get("interrupt_payload")


if payload:
   st.subheader("Plan proposed by agent (editable)")
   plan = payload.get("plan", {})
   left, right = st.columns([1,1])


   with left:
       st.write("**Edit JSON (advanced):**")
       edited_text = st.text_area("Plan JSON", value=json.dumps(plan, indent=2), height=420)


   with right:
       st.write("**Quick actions:**")
       approved = st.radio("Decision", options=["Approve", "Reject"], index=0)
       st.write("Tip: If you edit JSON, keep it valid. You can also reject and re-run planning.")


   try:
       edited_plan = json.loads(edited_text)
       json_ok = True
   except Exception as e:
       json_ok = False
       st.error(f"Invalid JSON: {e}")


   if resume_btn:
       if not json_ok:
           st.stop()


       decision = {
           "approved": (approved == "Approve"),
           "edited_plan": edited_plan
       }
       out2 = graph.invoke(Command(resume=decision), config=config)
       st.session_state.interrupt_payload = None
       st.session_state.last_execution = out2.get("execution")


exec_result = st.session_state.get("last_execution")
if exec_result:
   st.subheader("Execution result")
   st.json(exec_result)
   if exec_result.get("status") == "executed":
       st.success("Tools executed only AFTER approval")
   else:
       st.warning("Not executed (rejected or not approved).")
'''

We construct the LangGraph workflow by separating planning, approval, and execution into distinct nodes. We deliberately interrupt the graph after planning so we can review and control the agent’s intent. We only allow tool execution to proceed when explicit human approval is provided.
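The pause-and-resume control flow can be illustrated without any framework. The sketch below is a library-free analogy built on a Python generator, not LangGraph's implementation: `yield` plays the role of `interrupt(payload)` and `send(decision)` plays the role of `Command(resume=decision)`. The function names and the stubbed plan are assumptions made for illustration.

```python
from typing import Any, Dict, Generator

Flow = Generator[Dict[str, Any], Dict[str, Any], Dict[str, Any]]

def plan_approve_execute(request: str) -> Flow:
    # Plan step (stubbed; the real app calls an LLM here).
    plan = {"request": request, "tool_calls": [{"name": "search_flights"}]}
    # Approval step: hand the plan to the caller and wait for a decision.
    decision = yield {"kind": "approval", "plan": plan}
    # Execute step: runs only if the caller approved.
    if not decision.get("approved"):
        return {"status": "not_executed"}
    final_plan = decision.get("edited_plan") or plan
    return {"status": "executed",
            "ran": [call["name"] for call in final_plan["tool_calls"]]}

def run(request: str, decision: Dict[str, Any]) -> Dict[str, Any]:
    flow = plan_approve_execute(request)
    payload = next(flow)          # runs up to the pause and returns the payload
    assert payload["kind"] == "approval"
    try:
        flow.send(decision)       # resume with the human decision
    except StopIteration as done:
        return done.value         # the generator's return value
    raise RuntimeError("flow paused again unexpectedly")

approved = run("Dubai to Istanbul", {"approved": True})
rejected = run("Dubai to Istanbul", {"approved": False})
print(approved["status"], rejected["status"])
```

One difference is worth keeping in mind: according to the LangGraph documentation, a node that calls `interrupt` is re-run from its beginning when the graph resumes, whereas a generator continues from the `yield`. Code placed before `interrupt` inside the same node should therefore be safe to repeat.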


import pathlib
pathlib.Path("app.py").write_text(app_code)


!streamlit run app.py --server.port 8501 --server.address 0.0.0.0 & sleep 2
!lt --port 8501

We connect the agent workflow to a live Streamlit interface that supports editing, approving, and rejecting plans. We keep a thread identifier in Streamlit's session state and pass it to LangGraph on every call, so a resume request is matched to the run that was paused. Finally, we write the app to app.py, launch it, and expose it through localtunnel, enabling real human-in-the-loop collaboration.
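Why the thread identifier matters can be shown with a toy checkpoint store. The `ToySaver` class below is a hypothetical stand-in written for this example, not LangGraph's `InMemorySaver`; it only illustrates that a paused run can be found again when both the thread ID and the saver object are the same at resume time.

```python
import uuid
from typing import Any, Dict, Optional

class ToySaver:
    """Hypothetical in-process checkpoint store keyed by thread ID."""

    def __init__(self) -> None:
        self._checkpoints: Dict[str, Dict[str, Any]] = {}

    def put(self, thread_id: str, state: Dict[str, Any]) -> None:
        # Store a shallow copy so later caller mutations do not leak in.
        self._checkpoints[thread_id] = dict(state)

    def get(self, thread_id: str) -> Optional[Dict[str, Any]]:
        return self._checkpoints.get(thread_id)

saver = ToySaver()
thread_a = str(uuid.uuid4())
thread_b = str(uuid.uuid4())

# Pause: store the pending plan under thread A.
saver.put(thread_a, {"plan": {"destination": "Istanbul"}, "awaiting": "approval"})

same_thread = saver.get(thread_a)        # found: same saver, same thread ID
other_thread = saver.get(thread_b)       # None: different thread ID
fresh_saver = ToySaver().get(thread_a)   # None: a new saver starts empty
print(same_thread is not None, other_thread, fresh_saver)
```

Since Streamlit re-runs the script on every interaction, an in-memory saver has to be kept alive across reruns for the resume step to find the paused run, for example by holding the compiled graph in `st.session_state` or behind `st.cache_resource`.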

In conclusion, we demonstrated how plan-and-execute agents become easier to control and audit when humans remain in the loop at the right moment. We showed that interrupts are not just a technical feature but a design primitive for building trust, accountability, and collaboration into agent systems. By separating planning from execution and inserting a clear approval boundary, we ensured that tools run only with human consent and context. This pattern extends beyond travel planning to any high-stakes automation, giving us agents that think with us rather than act for us.


The post How to Build Human-in-the-Loop Plan-and-Execute AI Agents with Explicit User Approval Using LangGraph and Streamlit appeared first on MarkTechPost.
