How to Build Human-in-the-Loop Plan-and-Execute AI Agents with Explicit User Approval Using LangGraph and Streamlit
In this tutorial, we build a human-in-the-loop travel booking agent that treats the user as a teammate rather than a passive observer. We design the system so the agent first reasons openly by drafting a structured travel plan, then deliberately pauses before taking any action. We expose this proposed plan in a live interface where we can inspect, edit, or reject it, and only after explicit approval do we allow the agent to execute tools. By combining LangGraph interrupts with a Streamlit frontend, we create a workflow that makes agent reasoning visible, controllable, and trustworthy instead of opaque and autonomous.
!pip -q install -U langgraph openai streamlit pydantic
!npm -q install -g localtunnel
import os, getpass, textwrap, json, uuid, time
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("OPENAI_API_KEY (hidden input): ")
os.environ.setdefault("OPENAI_MODEL", "gpt-4.1-mini")
We set up the execution environment by installing all required libraries and utilities needed for agent orchestration and UI exposure. We securely collect the OpenAI API key at runtime so it is never hardcoded or leaked in the notebook. We also configure the model selection upfront to keep the rest of the pipeline clean and reproducible.
app_code = r'''
import os, json, uuid
import streamlit as st
from typing import TypedDict, List, Dict, Any, Optional
from pydantic import BaseModel, Field
from openai import OpenAI
from langgraph.graph import StateGraph, START, END
from langgraph.types import Command, interrupt
from langgraph.checkpoint.memory import InMemorySaver
def tool_search_flights(origin: str, destination: str, depart_date: str, return_date: str, budget_usd: int) -> Dict[str, Any]:
    options = [
        {"airline": "SkyJet", "route": f"{origin}->{destination}", "depart": depart_date, "return": return_date, "price_usd": int(budget_usd*0.55)},
        {"airline": "AeroBlue", "route": f"{origin}->{destination}", "depart": depart_date, "return": return_date, "price_usd": int(budget_usd*0.70)},
        {"airline": "Nimbus Air", "route": f"{origin}->{destination}", "depart": depart_date, "return": return_date, "price_usd": int(budget_usd*0.62)},
    ]
    options = sorted(options, key=lambda x: x["price_usd"])
    return {"tool": "search_flights", "top_options": options[:2]}

def tool_search_hotels(city: str, nights: int, budget_usd: int, preferences: List[str]) -> Dict[str, Any]:
    base = max(60, int(budget_usd / max(nights, 1)))
    picks = [
        {"name": "Central Boutique", "city": city, "nightly_usd": int(base*0.95), "notes": ["walkable", "great reviews"]},
        {"name": "Riverside Stay", "city": city, "nightly_usd": int(base*0.80), "notes": ["quiet", "good value"]},
        {"name": "Modern Loft Hotel", "city": city, "nightly_usd": int(base*1.10), "notes": ["new", "gym"]},
    ]
    if "luxury" in [p.lower() for p in preferences]:
        picks = sorted(picks, key=lambda x: -x["nightly_usd"])
    else:
        picks = sorted(picks, key=lambda x: x["nightly_usd"])
    return {"tool": "search_hotels", "top_options": picks[:2]}

def tool_build_day_by_day(city: str, days: int, vibe: str) -> Dict[str, Any]:
    blocks = []
    for d in range(1, days+1):
        blocks.append({
            "day": d,
            "morning": f"{city}: coffee + a must-see landmark",
            "afternoon": f"{city}: {vibe} activity + local lunch",
            "evening": f"{city}: sunset spot + dinner + optional night walk"
        })
    return {"tool": "draft_itinerary", "days": blocks}
'''
We define the Streamlit application core and implement safe, deterministic tool functions that simulate flights, hotels, and itinerary generation. We design these tools to behave like real-world APIs while still running fully in a Colab environment. We return a structured dictionary from every tool so its results can be displayed and audited in the UI once execution completes.
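To make the "deterministic and structured" property concrete, here is a minimal sketch of how one could sanity-check a stub tool. The function `search_flights_stub` is a hypothetical, trimmed-down stand-in for `tool_search_flights` (it drops the date parameters), and the airport codes are illustrative inputs only.

```python
import json
from typing import Any, Dict


def search_flights_stub(origin: str, destination: str, budget_usd: int) -> Dict[str, Any]:
    """Hypothetical, simplified stand-in for tool_search_flights."""
    multipliers = {"SkyJet": 0.55, "AeroBlue": 0.70, "Nimbus Air": 0.62}
    options = [
        {"airline": name, "route": f"{origin}->{destination}", "price_usd": int(budget_usd * m)}
        for name, m in multipliers.items()
    ]
    # Cheapest first, then keep the top two, as the tutorial's tool does.
    options.sort(key=lambda o: o["price_usd"])
    return {"tool": "search_flights", "top_options": options[:2]}


first = search_flights_stub("DXB", "IST", 1800)
second = search_flights_stub("DXB", "IST", 1800)

# Deterministic: identical inputs give identical outputs.
assert first == second
# Structured: the result survives a JSON round trip, so it can be rendered as JSON in the UI.
assert json.loads(json.dumps(first)) == first

airlines = [option["airline"] for option in first["top_options"]]
print(airlines)
```

Checks like these are cheap to keep around when a stub is later swapped for a real API client, since the rest of the app only depends on the shape of the returned dictionary.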
app_code += r'''
class TravelPlan(BaseModel):
    trip_title: str = Field(..., description="Short human-friendly title")
    origin: str
    destination: str
    depart_date: str
    return_date: str
    travelers: int = 1
    budget_usd: int = 1500
    preferences: List[str] = Field(default_factory=list)
    vibe: str = "balanced"
    lodging_nights: int = 4
    daily_outline: List[Dict[str, Any]] = Field(default_factory=list)
    tool_calls: List[Dict[str, Any]] = Field(default_factory=list)

class State(TypedDict):
    user_request: str
    plan: Dict[str, Any]
    approval: Dict[str, Any]
    execution: Dict[str, Any]

def make_llm_plan(state: State) -> Dict[str, Any]:
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    model = os.environ.get("OPENAI_MODEL", "gpt-4.1-mini")
    sys = (
        "You are a travel planning agent. "
        "Return a JSON travel plan that matches the provided schema. "
        "Be realistic, concise, and include a tool_calls list describing what you want executed "
        "(e.g., search_flights, search_hotels, draft_itinerary)."
    )
    schema = TravelPlan.model_json_schema()
    resp = client.responses.create(
        model=model,
        input=[
            {"role":"system","content": sys},
            {"role":"user","content": state["user_request"]},
            {"role":"user","content": f"Schema (JSON): {json.dumps(schema)}"}
        ],
    )
    text = resp.output_text.strip()
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("Model did not return JSON. Try again or change model.")
    raw = text[start:end+1]
    plan_obj = json.loads(raw)
    plan = TravelPlan(**plan_obj).model_dump()
    if not plan.get("tool_calls"):
        plan["tool_calls"] = [
            {"name":"search_flights", "args":{"origin": plan["origin"], "destination": plan["destination"], "depart_date": plan["depart_date"], "return_date": plan["return_date"], "budget_usd": plan["budget_usd"]}},
            {"name":"search_hotels", "args":{"city": plan["destination"], "nights": plan["lodging_nights"], "budget_usd": int(plan["budget_usd"]*0.35), "preferences": plan["preferences"]}},
            {"name":"draft_itinerary", "args":{"city": plan["destination"], "days": max(2, plan["lodging_nights"]+1), "vibe": plan["vibe"]}},
        ]
    return {"plan": plan}

def wait_for_approval(state: State) -> Dict[str, Any]:
    payload = {
        "kind": "approval",
        "message": "Review/edit the plan. Approve to execute tools.",
        "plan": state["plan"],
    }
    decision = interrupt(payload)
    return {"approval": decision}

def execute_tools(state: State) -> Dict[str, Any]:
    approval = state.get("approval") or {}
    if not approval.get("approved"):
        return {"execution": {"status": "not_executed", "reason": "User rejected or did not approve."}}
    plan = approval.get("edited_plan") or state["plan"]
    tool_calls = plan.get("tool_calls", [])
    results = []
    for call in tool_calls:
        name = call.get("name")
        args = call.get("args", {})
        if name == "search_flights":
            results.append(tool_search_flights(**args))
        elif name == "search_hotels":
            results.append(tool_search_hotels(**args))
        elif name == "draft_itinerary":
            results.append(tool_build_day_by_day(**args))
        else:
            results.append({"tool": name, "error": "Unknown tool (blocked for safety).", "args": args})
    return {"execution": {"status": "executed", "tool_results": results, "final_plan": plan}}
'''
We formalize the agent’s reasoning with a Pydantic schema, passed to the model in the prompt, so that it outputs an explicit travel plan rather than free-form text. We generate the plan using the OpenAI model, extract the JSON object from its reply, and validate it against the schema before allowing it into the workflow. We also inject a default set of tool calls if the model omits them, to guarantee a complete execution path.
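The planning node locates the JSON by slicing from the first `{` to the last `}`, which can fail when the model adds trailing prose that itself contains braces. As a hedged alternative, the sketch below uses the standard library's `json.JSONDecoder.raw_decode` to pull out the first decodable object. The helper name `extract_first_json_object` and the sample reply are assumptions made for illustration and are not part of the tutorial's code.

```python
import json
from typing import Any, Dict


def extract_first_json_object(text: str) -> Dict[str, Any]:
    """Return the first decodable JSON object found in `text` (hypothetical helper)."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at `pos` and ignores what follows.
            obj, _end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # Not valid JSON at this brace; try the next one.
        pos = text.find("{", pos + 1)
    raise ValueError("No JSON object found in model output.")


# Illustrative reply with stray braces after the JSON payload.
reply = 'Here is the plan: {"origin": "Dubai", "destination": "Istanbul"} (see notes {draft}).'
plan = extract_first_json_object(reply)
print(plan)
```

The extracted dictionary would still go through `TravelPlan(**plan_obj)` for validation; this helper only changes how the raw JSON is located.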
app_code += r'''
def build_graph():
    builder = StateGraph(State)
    builder.add_node("plan", make_llm_plan)
    builder.add_node("approve", wait_for_approval)
    builder.add_node("execute", execute_tools)
    builder.add_edge(START, "plan")
    builder.add_edge("plan", "approve")
    builder.add_edge("approve", "execute")
    builder.add_edge("execute", END)
    memory = InMemorySaver()
    graph = builder.compile(checkpointer=memory)
    return graph

st.set_page_config(page_title="Plan → Approve → Execute Travel Agent", layout="wide")
st.title("Human-in-the-Loop Travel Booking Agent (Plan → Approve/Edit → Execute)")

with st.sidebar:
    st.header("Runtime")
    if st.button("New Session / Thread"):
        st.session_state.thread_id = str(uuid.uuid4())
        st.session_state.ran_once = False
        st.session_state.interrupt_payload = None
        st.session_state.last_execution = None

thread_id = st.session_state.get("thread_id") or str(uuid.uuid4())
st.session_state.thread_id = thread_id
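# Build the graph once per browser session. Streamlit reruns this script on every
# interaction, and a fresh InMemorySaver would lose the checkpoint needed to resume.
if "graph" not in st.session_state:
    st.session_state.graph = build_graph()
graph = st.session_state.graph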
config = {"configurable": {"thread_id": thread_id}}
st.caption(f"Thread ID: {thread_id}")

req = st.text_area(
    "Describe your trip request",
    value=st.session_state.get("user_request", "Plan a 5-day trip from Dubai to Istanbul in April. Budget $1800. Prefer museums, street food, and a relaxed pace."),
    height=120
)
st.session_state.user_request = req

colA, colB = st.columns([1,1])
run_plan = colA.button("1) Generate Plan (LLM)")
resume_btn = colB.button("2) Resume After Approval")

if run_plan:
    st.session_state.ran_once = True
    st.session_state.interrupt_payload = None
    st.session_state.last_execution = None
    initial = {"user_request": req, "plan": {}, "approval": {}, "execution": {}}
    out = graph.invoke(initial, config=config)
    if "__interrupt__" in out and out["__interrupt__"]:
        st.session_state.interrupt_payload = out["__interrupt__"][0].value
    else:
        st.session_state.last_execution = out.get("execution")

payload = st.session_state.get("interrupt_payload")
if payload:
    st.subheader("Plan proposed by agent (editable)")
    plan = payload.get("plan", {})
    left, right = st.columns([1,1])
    with left:
        st.write("**Edit JSON (advanced):**")
        edited_text = st.text_area("Plan JSON", value=json.dumps(plan, indent=2), height=420)
    with right:
        st.write("**Quick actions:**")
        approved = st.radio("Decision", options=["Approve", "Reject"], index=0)
        st.write("Tip: If you edit JSON, keep it valid. You can also reject and re-run planning.")
    try:
        edited_plan = json.loads(edited_text)
        json_ok = True
    except Exception as e:
        json_ok = False
        st.error(f"Invalid JSON: {e}")
    if resume_btn:
        if not json_ok:
            st.stop()
        decision = {
            "approved": (approved == "Approve"),
            "edited_plan": edited_plan
        }
        out2 = graph.invoke(Command(resume=decision), config=config)
        st.session_state.interrupt_payload = None
        st.session_state.last_execution = out2.get("execution")

exec_result = st.session_state.get("last_execution")
if exec_result:
    st.subheader("Execution result")
    st.json(exec_result)
    if exec_result.get("status") == "executed":
        st.success("Tools executed only AFTER approval")
    else:
        st.warning("Not executed (rejected or not approved).")
'''
We construct the LangGraph workflow by separating planning, approval, and execution into distinct nodes. We deliberately interrupt the graph after planning so we can review and control the agent’s intent. We only allow tool execution to proceed when explicit human approval is provided.
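Because the user can freely edit the plan JSON before approving, it can help to check the edited tool calls before resuming the graph, so that an unknown tool name or a missing argument is reported in the UI instead of raising a `TypeError` during execution. The following is a minimal sketch under stated assumptions: `check_tool_calls`, `REGISTRY`, and the stand-in `draft_itinerary` function are hypothetical names introduced for illustration, and the check relies only on the standard library's `inspect.signature`.

```python
import inspect
from typing import Any, Callable, Dict, List


def check_tool_calls(plan: Dict[str, Any], registry: Dict[str, Callable[..., Any]]) -> List[str]:
    """Return human-readable problems found in plan["tool_calls"]; an empty list means OK."""
    problems: List[str] = []
    for i, call in enumerate(plan.get("tool_calls", [])):
        name = call.get("name")
        args = call.get("args", {})
        func = registry.get(name)
        if func is None:
            problems.append(f"call {i}: unknown tool {name!r}")
            continue
        try:
            # bind() raises TypeError on missing or unexpected arguments
            # without actually calling the tool.
            inspect.signature(func).bind(**args)
        except TypeError as exc:
            problems.append(f"call {i}: bad arguments for {name!r}: {exc}")
    return problems


def draft_itinerary(city: str, days: int, vibe: str) -> Dict[str, Any]:
    """Stand-in tool with the same parameters as tool_build_day_by_day."""
    return {"tool": "draft_itinerary", "city": city, "days": days, "vibe": vibe}


# Hypothetical allowlist mapping tool names to callables.
REGISTRY = {"draft_itinerary": draft_itinerary}

good_plan = {"tool_calls": [
    {"name": "draft_itinerary", "args": {"city": "Istanbul", "days": 5, "vibe": "relaxed"}},
]}
bad_plan = {"tool_calls": [
    {"name": "draft_itinerary", "args": {"city": "Istanbul"}},  # missing days and vibe
    {"name": "book_flight", "args": {}},                        # not in the allowlist
]}

good_problems = check_tool_calls(good_plan, REGISTRY)
bad_problems = check_tool_calls(bad_plan, REGISTRY)
print(good_problems)
print(bad_problems)
```

In the Streamlit app, a check like this could run next to the JSON validity check, with any reported problems shown through `st.error` before the resume button is allowed to proceed.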
import pathlib
pathlib.Path("app.py").write_text(app_code)
!streamlit run app.py --server.port 8501 --server.address 0.0.0.0 & sleep 2
!lt --port 8501
We connect the agent workflow to a live Streamlit interface that supports editing, approval, and rejection of plans. We persist the state across runs using a thread identifier so the agent behaves consistently across interactions. We finally launch the app and make it publicly available, enabling real human-in-the-loop collaboration.
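The launch cell waits a fixed two seconds before opening the tunnel, which may be too short on a slow runtime. One possible alternative, sketched here with the standard library only, is to poll the port until the server accepts connections. The helper `wait_for_port` and its default timeout values are assumptions for illustration and are not part of the tutorial's code.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 30.0, interval_s: float = 0.25) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Connection refused or timed out; wait briefly and retry.
            time.sleep(interval_s)
    return False


# Example usage (assumes Streamlit was started on port 8501 as in the launch cell):
# if wait_for_port("127.0.0.1", 8501):
#     print("Streamlit is up; safe to start the tunnel.")
```

The function returns `False` instead of raising when the timeout elapses, so the calling cell can decide whether to retry or to show the server logs.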
In conclusion, we demonstrated how plan-and-execute agents become easier to trust and control when humans remain in the loop at the right moment. We showed that interrupts are not just a technical feature but a design primitive for building trust, accountability, and collaboration into agent systems. By separating planning from execution and inserting a clear approval boundary, we ensured that tools run only with human consent and context. This pattern extends beyond travel planning to any high-stakes automation, giving us agents that think with us rather than act for us.
The post How to Build Human-in-the-Loop Plan-and-Execute AI Agents with Explicit User Approval Using LangGraph and Streamlit appeared first on MarkTechPost.
