
Meet LiteLLM Agent Platform: A Kubernetes-Based, Self-Hosted Infrastructure Layer for Isolated Agent Sandboxes and Persistent Session Management in Production

Running AI agents in a local script is easy. Running them reliably in production, across teams, across restarts, with isolated environments per context, is a different problem entirely. BerriAI, the company behind the LiteLLM AI Gateway, is now open-sourcing a purpose-built answer to that problem: the LiteLLM Agent Platform. The platform is described as a simple, self-hosted infrastructure platform for running multiple agents in production.

What Problem Does it Solve?

It helps to understand what happens when you try to scale agents beyond a single process. Agents are stateful: they carry session history, tool-call results, and intermediate reasoning across turns. If the container running your agent crashes, restarts, or gets replaced during a deployment, that session state is gone unless something is explicitly managing it. At the same time, different teams often need different runtime environments, different tools, different secrets, and different access scopes, which means you cannot throw all agents into one shared container.

The platform manages two things: per-team and per-context sandboxes, and session continuity across pod restarts and upgrades. These two capabilities are the core infrastructure primitives the platform provides.

Architecture and Technical Stack

The platform is a standalone Next.js dashboard for LiteLLM v2 managed agents, covering session chat, agent CRUD, and live status. The codebase is primarily TypeScript (92.8%), with shell scripts for provisioning, a Dockerfile for containerization, and CSS for the dashboard UI.

The architecture separates concerns cleanly. A web process runs on port 3000 and serves the Next.js dashboard. A worker process handles async agent tasks. Postgres is the persistent backing store, and a schema migration runs as an init container on startup, so the database is always in the right state before the application boots.

For the sandbox layer, the isolated runtime environment where agents actually execute, sandboxes run on Kubernetes via the kubernetes-sigs/agent-sandbox CRD. Local development uses kind. If you are not already familiar with it, kind (Kubernetes in Docker) lets you spin up a full Kubernetes cluster locally using Docker containers as nodes, with no cloud provider needed. The agent-sandbox CRD (Custom Resource Definition) is a Kubernetes extension from kubernetes-sigs that the platform installs to manage the lifecycle of individual sandbox environments.

The platform also includes a harness system under harnesses/opencode, which contains the configuration for running coding agents, such as Claude Code or OpenAI Codex, inside isolated sandboxes with a vault proxy for credential management. The BerriAI team also maintains a separate litellm-agent-runtime repository, described as a coding-agent runtime that runs inside per-session VMs provisioned by a LiteLLM proxy; it is generic by design, with customization happening through harness configuration or a hydrate payload.

One practical detail worth noting is how environment variables are handled across sandbox containers. Anything in .env prefixed with CONTAINER_ENV_ is injected into every sandbox container with the prefix stripped. For example, CONTAINER_ENV_GITHUB_TOKEN=ghp_... means the container sees GITHUB_TOKEN=ghp_... This gives teams a clean way to pass secrets into sandboxed agent sessions without modifying container images.
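
The prefix-stripping rule is easy to reproduce in a few lines of shell. The snippet below is an illustrative sketch of the convention, not the platform's actual implementation; the file path and token values are made up.

```shell
# Illustrative sketch of the CONTAINER_ENV_ convention (not the platform's
# actual code). Write a demo .env, then print only the variables a sandbox
# container would see, with the prefix stripped.
cat > /tmp/demo.env <<'EOF'
CONTAINER_ENV_GITHUB_TOKEN=ghp_example123
CONTAINER_ENV_NPM_TOKEN=npm_example456
DATABASE_URL=postgres://localhost/app
EOF

# Only CONTAINER_ENV_* lines are forwarded; DATABASE_URL stays host-side.
grep '^CONTAINER_ENV_' /tmp/demo.env | sed 's/^CONTAINER_ENV_//'
```

Running this prints GITHUB_TOKEN=ghp_example123 and NPM_TOKEN=npm_example456; DATABASE_URL is not forwarded because it lacks the prefix.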

https://github.com/BerriAI/litellm-agent-platform

Getting Started

The prerequisites for local development are Docker Desktop, kind, kubectl, helm, and a LiteLLM gateway. No cloud credentials are required to get started locally. The quickstart is two commands:

bin/kind-up.sh
docker compose up

bin/kind-up.sh is idempotent: it provisions a kind cluster named agent-sbx, installs the agent-sandbox controller, and loads the harness image. docker compose up boots Postgres, runs the schema migration, and starts the web process on port 3000 along with the worker.

For production deployment, the recommended path is AWS EKS for the sandbox cluster and Render for the web and worker processes. bin/eks-up.sh provisions the EKS cluster, and a Render Blueprint provides a one-click deployment option.

Relationship to the LiteLLM Gateway

The Agent Platform is a layer on top of the existing LiteLLM ecosystem, not a replacement for it. LiteLLM's core is a Python SDK and Proxy Server, an AI Gateway that calls 100+ LLM APIs in OpenAI format, with cost tracking, guardrails, load balancing, and logging, supporting providers including Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, SageMaker, HuggingFace, vLLM, and NVIDIA NIM. The Agent Platform consumes a running LiteLLM gateway as a dependency and builds agent orchestration and session-management infrastructure on top of it. Model routing, cost tracking, and rate limiting stay in the gateway layer. Sandbox isolation, session continuity, and the management dashboard are handled by the Agent Platform.
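
For context on what "OpenAI format" means here, the sketch below builds a standard chat-completions payload and shows, commented out, how it would be sent to a LiteLLM gateway. The gateway address (localhost:4000), model alias, and API key variable are placeholder assumptions; substitute the values from your own deployment.

```shell
# Build an OpenAI-format chat request. The model alias and gateway address
# are placeholders for whatever your LiteLLM deployment configures.
cat > /tmp/chat-request.json <<'EOF'
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "Say hello in one word."}]
}
EOF

# Against a running gateway you would send it like this (uncomment to use):
# curl -s http://localhost:4000/v1/chat/completions \
#   -H "Authorization: Bearer $LITELLM_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d @/tmp/chat-request.json

# Sanity-check that the payload is well-formed JSON.
python3 -m json.tool /tmp/chat-request.json > /dev/null && echo "payload OK"
```

Because the gateway speaks the OpenAI schema regardless of the backing provider, the same payload works whether the alias maps to OpenAI, Anthropic, Bedrock, or any other supported provider.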

Marktechpost’s Visual Explainer

LiteLLM Agent Platform
Self-Hosted Agent Infrastructure Guide

Alpha


01 / 06
What is LiteLLM Agent Platform?
BerriAI open-sourced this platform on May 8, 2026. It is a self-hosted infrastructure layer for running multiple AI agents in production, built on top of the LiteLLM AI Gateway.
🧱
Self-Hosted
Runs entirely on your own infrastructure. No data leaves your environment. Suited for regulated industries and teams with data residency requirements.

🤖
Multi-Agent
Designed to run multiple agents in parallel, with full isolation between teams and contexts using per-session sandboxes.

🔁
Session Continuity
Agent sessions persist across pod restarts and upgrades, so stateful work is not lost when containers are replaced.

⚡
Open Source (MIT)
Fully open source under the MIT license. Repo: github.com/BerriAI/litellm-agent-platform. File issues and contribute directly.

Prerequisite Knowledge
This guide assumes familiarity with Docker, basic command-line usage, and a general understanding of what an AI agent is (a model that calls tools and runs multi-step tasks). Kubernetes experience helps but is not required to follow along.

02 / 06
Key Concepts to Know First
Before running the platform, understand these four building blocks. They appear throughout the setup and configuration.
  • A
    LiteLLM Gateway
    The underlying AI Gateway that the Agent Platform depends on. It routes requests to 100+ LLM providers (OpenAI, Anthropic, Bedrock, VertexAI, and so on) using a unified OpenAI-format API. The Agent Platform does not include the gateway; you must have one running separately and point the platform at it.

  • B
    Sandbox
    An isolated container environment where a single agent session executes. Each sandbox is independent, meaning one agent cannot access the filesystem, secrets, or state of another. Sandboxes are provisioned and torn down per session using the kubernetes-sigs/agent-sandbox CRD (Custom Resource Definition).

  • C
    Harness
    A configuration layer that defines how a particular kind of coding agent (such as Claude Code or OpenAI Codex) runs inside a sandbox. The platform ships with an opencode harness under harnesses/opencode/. The harness image is loaded into the kind cluster during setup.

  • D
    CRD (Custom Resource Definition)
    A Kubernetes extension that lets you define new resource types. The platform uses the kubernetes-sigs/agent-sandbox CRD to teach your Kubernetes cluster how to manage agent sandboxes as first-class resources, the same way it manages pods or deployments.

03 / 06
How the Platform Is Structured
The platform has four main components. Understanding how they connect helps when debugging or deploying to production.
Component | What It Does | Tech
web (:3000) | Next.js dashboard. Provides the UI for session chat, agent CRUD operations, and live status monitoring. | Next.js, TypeScript
worker | Background process that handles async agent tasks, decoupled from the web server. | TypeScript
postgres | Persistent backing store for session state, agent configs, and metadata. Schema migration runs automatically as an init container on startup. | PostgreSQL
sandbox cluster | Kubernetes cluster where individual agent sandboxes run, managed via the agent-sandbox CRD controller. Locally: kind. In production: AWS EKS. | Kubernetes (kind / EKS)
Separation of Concerns
The LiteLLM gateway handles model routing, cost tracking, rate limiting, and guardrails. The Agent Platform handles sandbox lifecycle, session management, and the management dashboard. They run as separate services, and the Agent Platform consumes the gateway as a dependency.

04 / 06
Prerequisites Before You Start
Install and verify these tools before running any setup commands. The quickstart will not work without all five.
  • 1
    Docker Desktop
    Required to build and run containers, and to power kind (which runs Kubernetes nodes as Docker containers). Download from docker.com/products/docker-desktop. Verify with:
    docker --version

  • 2
    kind (Kubernetes in Docker)
    Used to provision a local Kubernetes cluster for running sandboxes. Install via Homebrew on macOS (brew install kind) or from kind.sigs.k8s.io. Verify with:
    kind --version

  • 3
    kubectl
    The Kubernetes command-line tool. Used by the setup scripts to interact with the kind cluster. Install from kubernetes.io/docs/tasks/tools. Verify with:
    kubectl version --client

  • 4
    helm
    The Kubernetes package manager. Used to install the agent-sandbox controller into the kind cluster. Install from helm.sh/docs/intro/install. Verify with:
    helm version

  • 5
    A Running LiteLLM Gateway
    The Agent Platform requires a LiteLLM gateway URL to route model calls. If you do not have one running, start with the official LiteLLM quickstart at docs.litellm.ai. You will point the Agent Platform at this URL during configuration.
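
As a convenience, the four local CLI prerequisites can be checked in one pass. This loop is a hypothetical helper, not something shipped in the repo; the running gateway still has to be verified separately.

```shell
# Hypothetical helper (not from the repo): report which of the four local
# CLI prerequisites are already on PATH.
for tool in docker kind kubectl helm; do
  if command -v "$tool" > /dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: MISSING"
  fi
done
```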

05 / 06
Local Quickstart
Clone the repo and run two commands to get the full platform running locally. No cloud credentials needed for local development.
  • 1
    Clone the repository
    Pull the repo from GitHub:
    git clone https://github.com/BerriAI/litellm-agent-platform
    cd litellm-agent-platform

  • 2
    Configure your .env file
    Copy the example env file and fill in your LiteLLM gateway URL and any secrets:
    cp .env.example .env
    # Edit .env and set your LITELLM_GATEWAY_URL and other required values

  • 3
    Provision the local kind cluster
    This script is idempotent, meaning it is safe to run multiple times. It provisions a kind cluster named agent-sbx, installs the agent-sandbox controller via helm, and loads the harness image:
    bin/kind-up.sh

  • 4
    Start all services
    Boots Postgres, runs the schema migration as an init container, and starts the web server on port 3000 and the worker process:
    docker compose up

  • 5
    Open the dashboard
    Navigate to http://localhost:3000 in your browser. You should see the LiteLLM Agent Platform dashboard with options to create agents, open sessions, and monitor live status.

Passing Secrets into Sandboxes
Any variable in .env prefixed with CONTAINER_ENV_ is automatically injected into every sandbox container with the prefix stripped. Example: CONTAINER_ENV_GITHUB_TOKEN=ghp_… means the sandbox sees GITHUB_TOKEN=ghp_… This is the correct way to pass credentials into agent sessions.

06 / 06
Production Deployment
The recommended production setup separates the sandbox cluster (AWS EKS) from the web and worker processes (Render). The repo ships scripts and a Blueprint for both.
  • 1
    Provision the EKS sandbox cluster
    The bin/eks-up.sh script provisions an AWS EKS cluster configured to run agent sandboxes. This replaces kind as the sandbox backend. Requires AWS credentials in your environment:
    bin/eks-up.sh

  • 2
    Deploy web and worker to Render
    The repo includes a Render Blueprint under deploy/render/ that deploys the web and worker services to Render with one click. See deploy/render/README.md for the Blueprint URL and required environment variables.

  • 3
    Use the Developer API directly (optional)
    You can interact with the platform programmatically through its REST API using curl or any HTTP client. The full API reference, covering how to create an agent, open a session, send a message, and read the reply, is at src/server/DEVELOPER.md in the repo.
    # Example: create an agent session via curl
    curl -X POST http://localhost:3000/api/sessions \
      -H "Content-Type: application/json" \
      -d '{"agent_id": "your-agent-id"}'

Architecture Summary for Production
AWS EKS runs the sandbox cluster where agent sessions execute in isolation. Render hosts the Next.js web dashboard and the async worker. Postgres (managed or self-hosted) persists session state. The LiteLLM gateway runs separately and handles all model API routing. These four components communicate over the network and can be scaled independently.

The platform is currently in alpha public preview. File issues at github.com/BerriAI/litellm-agent-platform. Architecture details are at docs/k8s-backend.md in the repo.



Published by Marktechpost  |  AI/ML News and Research for Developers and Engineers

Key Takeaways

  • BerriAI open-sourced the LiteLLM Agent Platform, a self-hosted infrastructure layer for running multiple AI agents in production with per-team sandbox isolation and session continuity across pod restarts.
  • Sandboxes run on Kubernetes via the kubernetes-sigs/agent-sandbox CRD, locally with kind and in production with AWS EKS; no cloud credentials are needed to get started.
  • The platform sits on top of the existing LiteLLM Gateway, which handles model routing, cost tracking, and rate limiting across 100+ LLM providers in OpenAI format.
  • The quickstart is two commands: bin/kind-up.sh provisions the kind cluster and installs the sandbox controller; docker compose up boots Postgres, web (:3000), and the worker.
  • Released under the MIT license and currently in alpha public preview.



