Cerebras & Core42 Deliver Record Performance for OpenAI GPT Model

Global-scale AI for real-time reasoning and agentic workloads, now at record speed.

Cerebras, the world’s fastest AI inference provider, and Core42, a G42 company specializing in sovereign cloud and AI infrastructure, announced the global availability of OpenAI’s gpt-oss-120B. The Core42 AI Cloud, via the Compass API, delivers Cerebras Inference at 3,000 tokens per second to power enterprise-scale agentic AI.

Cerebras has been a pioneer in supporting open-source models from OpenAI and Meta, consistently achieving the fastest inference speeds as verified by independent benchmarking firm Artificial Analysis. Together, Cerebras and Core42 bring these capabilities to enterprises and developers worldwide through a single platform.

“Together with Cerebras and Core42, we’re making our best and most usable open model available at unprecedented speed and scale,” said Trevor Cai, Head of Infrastructure, OpenAI. “This collaboration will give enterprises, researchers, and governments around the world the ability to build real-time reasoning applications with extraordinary efficiency.”

Powered by the Cerebras CS-3 and Wafer-Scale Engine (WSE), the collaboration sets a new benchmark for real-time reasoning, with ultra-low latency and radically lower cost per token than GPUs, and scales instantly from experimentation to full deployment.

“The latest chapter in our ongoing strategic partnership with Core42 now delivers the world’s most capable open-weight models directly into the hands of enterprises, researchers, and governments in the Middle East and around the globe for real-time, reasoning-capable applications,” said Andrew Feldman, CEO and co-founder of Cerebras. “Core42’s AI Cloud and Compass API make it seamless to tap into our inference performance, enabling a new generation of agentic workloads at the fastest speeds.”

OpenAI’s gpt-oss-120B brings unprecedented reasoning power, long-context understanding (128K tokens), and advanced real-time capabilities to the open-weight ecosystem. From semantic search and code execution to automation and decision intelligence, these models unlock next-generation enterprise AI use cases and deliver real-time reasoning at scale.

Cerebras’ Wafer-Scale Engine (WSE) technology, which powers the CS-3 system, combines with a memory-optimized architecture to deliver deterministic, ultra-low-latency performance at a radically lower cost per token than traditional GPU-based systems. The result: real-time inference for the largest AI models in the world, with the flexibility to scale instantly for both experimental workloads and production deployments.

For enterprises, the Core42 AI Cloud provides direct access to OpenAI’s most advanced open-weight model through a single API. Organizations can now build powerful agentic applications with the scalability and efficiency required for mission-critical workloads.
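As a rough illustration of what calling such a service could look like, here is a minimal sketch that assumes the Compass API exposes an OpenAI-style chat-completions interface. The base URL, model identifier, and key below are placeholders, not documented values; consult Core42’s API documentation for the real details.

```python
import json

# Assumed, illustrative values only -- not taken from Core42 documentation.
BASE_URL = "https://aicloud.core42.ai/v1/chat/completions"  # placeholder
API_KEY = "YOUR_COMPASS_API_KEY"                            # placeholder

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble an OpenAI-style chat-completions request payload."""
    return {
        "model": "gpt-oss-120b",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_request("Summarize the key risks in this contract.")
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the endpoint with the API key in an authorization header, as with any OpenAI-compatible service.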

“By running OpenAI gpt-oss on Cerebras hardware within Core42’s AI Cloud and Compass API, we’re setting a new benchmark for performance, flexibility, and compliance in AI,” said Kiril Evtimov, CEO of Core42 and Group CTO, G42. “This launch enables our customers to deliver new application capabilities by harnessing cutting-edge open-weight models at the fastest speeds globally with Cerebras Inference.”

  • Agentic AI at scale – Build powerful, reasoning-capable AI systems optimized for performance and cost.
  • Enterprise-scale performance – Run the fastest, most demanding workloads globally, enabling advanced automation and real-time experiences.
  • Industry-leading speed – Integrate gpt-oss-120B into workloads such as reasoning, knowledge retrieval, and long-context generation with ease and efficiency.

The price-performance leader.

Cerebras’ purpose-built AI infrastructure offers the lowest cost per token for OpenAI’s new models while setting the benchmark for both speed and accuracy. Cerebras and Core42 are offering OpenAI’s latest open models at the following pricing:

  • Throughput: 3,000 tokens per second
  • Enter: $0.25 per million tokens
  • Output: $0.69 per million tokens

By combining Cerebras’ unmatched inference throughput with Core42’s AI Cloud, enterprises and developers can now access global models at instant scale for real-time, reasoning-driven applications with industry-leading price-performance.

Available now on Core42 AI Cloud: https://aicloud.core42.ai

The post Cerebras & Core42 Deliver Record Performance for OpenAI GPT Model first appeared on AI-Tech Park.
