
DeepReinforce Releases Ornith-1.0: An Open-Source Coding Model Family That Learns Its Own RL Scaffolds


DeepReinforce has launched Ornith-1.0, an open-source model family built for agentic coding. The lineup spans four sizes, from a 9B dense model to a 397B mixture-of-experts flagship. Every checkpoint ships under the MIT license on Hugging Face. The models are post-trained on top of pretrained Gemma 4 and Qwen 3.5.

Most coding agents pair a model with a fixed, human-designed harness. Ornith-1.0 instead learns to write its own. The DeepReinforce research team reports state-of-the-art results among open models of comparable size.

TL;DR

  • Ornith-1.0 ships in 9B, 31B, 35B-MoE, and 397B-MoE sizes under MIT, built on Gemma 4 and Qwen 3.5.
  • The model learns its own scaffold during RL, jointly optimizing the harness and the solution.
  • Ornith-1.0-397B tops Claude Opus 4.7 on both headline benchmarks, but not Opus 4.8 or the larger GLM-5.2-744B.
  • Three layers (a fixed trust boundary, a deterministic monitor, and a frozen LLM judge) guard against reward hacking.

What is Ornith-1.0?

Ornith-1.0 is a set of reasoning models tuned for coding agents. The variants are 9B Dense, 31B Dense, 35B MoE, and 397B MoE. The 35B model is mixture-of-experts and activates roughly 3B parameters per token. FP8 and GGUF builds are also published for faster local serving.

Every model is a reasoning model. Replies open with a <think> block before the final answer. The serving recipes enable a reasoning parser, so that trace is returned in a separate reasoning_content field. The models also emit well-formed tool calls for agent loops.
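
To illustrate what a reasoning parser does with such a reply, here is a minimal Python sketch. It assumes the trace is wrapped in literal `<think>` and `</think>` tags at the start of the reply. The function name `split_reasoning` is hypothetical, and this is not the parser shipped in any serving recipe; it only mirrors the two-field split described above.

```python
import re

# Assumption: the reasoning trace is delimited by literal <think>...</think>
# tags at the very start of the reply. Real serving-side parsers may differ.
_THINK_RE = re.compile(r"\A\s*<think>(.*?)</think>\s*", re.DOTALL)


def split_reasoning(reply: str) -> dict:
    """Split a raw reply into a reasoning trace and a final answer.

    Hypothetical helper for illustration only. Returns a dict with the keys
    'reasoning_content' and 'content'.
    """
    match = _THINK_RE.match(reply)
    if match is None:
        # No leading think block: treat the whole reply as the answer.
        return {"reasoning_content": None, "content": reply.strip()}
    return {
        "reasoning_content": match.group(1).strip(),
        "content": reply[match.end():].strip(),
    }


if __name__ == "__main__":
    raw = "<think>The list may contain duplicates.</think>\nUse a set."
    print(split_reasoning(raw))
```

Keeping the trace in its own field lets an agent loop log or discard the reasoning without having to strip it from the text it acts on.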

Deployment is straightforward. The 9B model is about 19GB in bf16 and serves on a single 80GB GPU. Serving recipes target vLLM, SGLang, and Transformers. Every model exposes an OpenAI-compatible endpoint. Standard agent frameworks therefore work without code changes.
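
The memory figure can be sanity-checked with simple arithmetic: bf16 stores each parameter in 2 bytes. The sketch below uses the nominal 9B parameter count, which gives 18 GB of weights. The quoted figure of about 19GB presumably reflects an exact parameter count slightly above 9B, but that is an assumption. The helper name `weight_memory_gb` is hypothetical, and the estimate covers weights only, not the KV cache or activations.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float = 2.0) -> float:
    """Estimate weight memory in decimal gigabytes.

    bytes_per_param defaults to 2.0 because bf16 uses 16 bits per value.
    Weights only: KV cache and activations are not included.
    """
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    nominal_params = 9e9  # assumption: nominal "9B" count, not the exact figure
    weights_gb = weight_memory_gb(nominal_params)
    headroom_gb = 80 - weights_gb  # what an 80GB GPU has left for everything else
    print(f"weights: {weights_gb:.1f} GB, headroom: {headroom_gb:.1f} GB")
```

On these assumptions, most of an 80GB card remains free for the KV cache, which matters for the long contexts typical of agent loops.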
