GPT-5.5 is OpenAI’s most capable agentic AI model yet–at twice the API price

OpenAI launched GPT-5.5 on April 23 as what it calls “a new class of intelligence for real work and powering agents,” and the framing is deliberate. OpenAI says it is its most capable agentic AI model to date, built from the ground up to plan, use tools, check its own output, and work through tasks independently.

GPT-5.5 is the first retrained base model since GPT-4.5, co-designed with NVIDIA’s GB200 and GB300 NVL72 rack-scale systems. The company says the practical difference is that, with GPT-5.5, tasks that previously required multiple prompts and human ‘course-correction’ can now be handed off more completely. The model is rolling out to Plus, Pro, Business, and Enterprise users in ChatGPT and Codex. API access followed on April 24.

The benchmarks

OpenAI’s strongest performance claim is on Terminal-Bench 2.0, a benchmark that tests command-line workflows requiring planning and tool coordination in a sandboxed environment. GPT-5.5 scores 82.7%, against GPT-5.4’s 75.1% and Claude Opus 4.7’s 69.4%.

On SWE-Bench Pro, which evaluates GitHub issue resolution, GPT-5.5 reaches 58.6%, solving more issues in a single pass than earlier versions. OpenAI also introduced Expert-SWE, an internal benchmark where tasks carry a median estimated human completion time of 20 hours. GPT-5.5 scores 73.1%, up from GPT-5.4’s 68.5%.

In long-context reasoning, on MRCR v2 at one million tokens, a retrieval benchmark testing whether a model can find a specific answer buried in a large document, GPT-5.5 scores 74.0%, against GPT-5.4’s 36.6%.

However, on MCP Atlas, Scale AI’s Model Context Protocol tool-use benchmark, Claude Opus 4.7 leads at 79.1% and no score is recorded for GPT-5.5. OpenAI included that absence in its own benchmark table, which at least signals its confidence in the overall picture.

Token efficiency, pricing reality

API access is priced at US$5 per million input tokens and US$30 per million output tokens, exactly twice the rates for GPT-5.4. OpenAI’s defence is that GPT-5.5 completes the same Codex tasks with fewer tokens than GPT-5.4, making effective costs roughly 20% higher once its efficiency is factored in, a claim that independent testing lab Artificial Analysis validated.
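The pricing claim can be sanity-checked with a little arithmetic. The sketch below works backwards from the two figures quoted above (a 2x price and a roughly 20% higher effective cost) to the token reduction they imply. It assumes input and output token counts shrink by the same factor, which the source does not state; the function name is hypothetical and used only for illustration.

```python
# A minimal sketch: what token reduction is implied by "twice the price,
# roughly 20% higher effective cost"?
# Assumption (not from the source): input and output token usage shrink by
# the same factor, so one ratio describes the whole bill.

def implied_token_ratio(price_multiplier: float,
                        effective_cost_multiplier: float) -> float:
    """Return tokens_new / tokens_old implied by the two multipliers.

    effective_cost = price_multiplier * token_ratio, so
    token_ratio = effective_cost / price_multiplier.
    """
    if price_multiplier <= 0:
        raise ValueError("price_multiplier must be positive")
    return effective_cost_multiplier / price_multiplier


# Figures quoted in the article: 2x the per-token price, ~1.2x effective cost.
ratio = implied_token_ratio(price_multiplier=2.0,
                            effective_cost_multiplier=1.2)
print(f"Implied GPT-5.5 token usage relative to GPT-5.4: {ratio:.0%}")
print(f"Implied token reduction: {1 - ratio:.0%}")
```

Under that assumption, the quoted figures would correspond to GPT-5.5 using around 60% of the tokens GPT-5.4 needs for the same task. Workloads where the reduction is smaller would see an effective cost increase above 20%.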

GPT-5.5 Pro, available to Pro, Business, and Enterprise users, is priced at US$30 per million input tokens and US$180 per million output tokens. It applies more parallel test-time compute to harder problems and leads the list of publicly available models on BrowseComp, OpenAI’s agentic web-browsing benchmark, at 90.1%.

Token efficiency is worth stress-testing against actual workloads before committing to a model switch. At 10 million output tokens per month, GPT-5.5 standard costs US$300 against Claude Opus 4.7’s US$250, a 20% premium that only pays off if the model’s superior agentic performance means fewer task iterations and fewer retries, with the maths varying by use case.
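For teams who want to rerun the comparison above with their own volumes, here is a rough sketch. The GPT-5.5 output price is the one quoted in the article; the Claude Opus 4.7 output price of US$25 per million tokens is inferred from the US$250 figure and is an assumption, as are the function and variable names. Input-token costs are ignored to match the article's example.

```python
# A minimal sketch of the monthly output-cost comparison and its break-even.
# Input-token costs are deliberately left out, as in the article's example.

GPT_55_OUTPUT_PRICE = 30.0   # US$ per million output tokens (quoted above)
OPUS_47_OUTPUT_PRICE = 25.0  # US$ per million; inferred from US$250 / 10M


def monthly_output_cost(tokens_millions: float, price_per_million: float) -> float:
    """Monthly spend in US$ for a given output volume in millions of tokens."""
    return tokens_millions * price_per_million


volume = 10.0  # million output tokens per month
gpt_cost = monthly_output_cost(volume, GPT_55_OUTPUT_PRICE)
opus_cost = monthly_output_cost(volume, OPUS_47_OUTPUT_PRICE)
premium = (gpt_cost - opus_cost) / opus_cost

# Break-even: the share of the cheaper model's output tokens that the
# pricier model may use before it becomes the more expensive option.
break_even_ratio = OPUS_47_OUTPUT_PRICE / GPT_55_OUTPUT_PRICE

print(f"GPT-5.5: US${gpt_cost:.0f}, Claude Opus 4.7: US${opus_cost:.0f}")
print(f"Premium at equal volume: {premium:.0%}")
print(f"Break-even output-token ratio: {break_even_ratio:.1%}")
```

On these assumed prices, GPT-5.5 would need to finish the same work with about 17% fewer output tokens (through fewer iterations or retries) before the two bills match.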

In practice

OpenAI says more than 85% of its employees now use Codex weekly across departments, including engineering and marketing. In one example, the communications team used GPT-5.5 to process six months of speaking request data, with the model building a scoring and risk framework to help automate low-risk approvals.

Greg Brockman described the launch as “a real step forward towards the kind of computing that we expect in the future,” and chief scientist Jakub Pachocki noted the last two years of model progress had felt “surprisingly slow.”

OpenAI says GPT-5.5 matches GPT-5.4’s per-token latency in production serving while performing at a higher level of intelligence; larger, more capable models are often slower to serve, but that trade-off was avoided here.

Whether the benchmark leads translate into production gains for teams running real agentic pipelines is the question that will take the next few weeks to answer properly. The Terminal-Bench score is promising for unattended terminal agents and DevOps automation. The MCP Atlas gap is worth watching for anyone building heavily on tool-use orchestration.

See also: OpenAI brings GPT-5.5 to Codex for coding tasks

The post GPT-5.5 is OpenAI’s most capable agentic AI model yet–at twice the API price appeared first on AI News.