Meta FAIR Released Code World Model (CWM): A 32-Billion-Parameter Open-Weights LLM, to Advance Research on Code Generation with World Models

Meta FAIR released Code World Model (CWM), a 32-billion-parameter dense decoder-only LLM that injects world modeling into code generation by training on execution traces and long-horizon agent–environment interactions, not just static source text.
What’s new: learning code by predicting execution
CWM mid-trains on two large families of observation–action trajectories: (1) Python interpreter traces that record local variable states after each executed line, and (2) agentic interactions within Dockerized repositories that capture edits, shell commands, and test feedback. This grounding is intended to teach semantics (how state evolves) rather than only syntax.
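To make the first trajectory type concrete, line-level traces of local variable state can be collected with Python's standard `sys.settrace` hook. The sketch below is illustrative only; CWM's actual trace schema and collection pipeline are defined in the paper, not reproduced here.

```python
import sys

def collect_trace(fn, *args):
    """Record (line number, locals snapshot) for each traced line of fn.

    Note: the 'line' event fires just before a line executes, so each
    snapshot reflects the state after the *previous* line ran. This is a
    minimal sketch of observation-action trace data, not CWM's format.
    """
    trace = []

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is fn.__code__:
            trace.append((frame.f_lineno, dict(frame.f_locals)))
        return tracer

    sys.settrace(tracer)
    try:
        result = fn(*args)
    finally:
        sys.settrace(None)
    return result, trace

def demo(x):
    y = x + 1
    z = y * 2
    return z

result, trace = collect_trace(demo, 3)
print(result)      # 8
print(len(trace))  # one entry per traced line of demo
```

Each trace entry pairs a line number with a dictionary of locals, which is the kind of state-evolution signal the article describes.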
To scale collection, the research team built executable repository images from thousands of GitHub projects and foraged multi-step trajectories via a software-engineering agent (“ForagerAgent”). The release reports ~3M trajectories across ~10k images and 3.15k repos, with mutate-fix and issue-fix variants.

Model and context window
CWM is a dense, decoder-only Transformer (no MoE) with 64 layers, GQA (48 query heads / 8 KV heads), SwiGLU, RMSNorm, and scaled RoPE. Attention alternates local 8k and global 131k sliding-window blocks, yielding an effective context of 131k tokens; training uses document-causal masking.
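The alternating window schedule can be sketched as a simple per-layer pattern, using the release's stated 3:1 local:global interleave over 64 layers with 8k local and 131k global windows. The exact layer ordering here is an illustrative assumption, not the released configuration.

```python
LOCAL_WINDOW = 8_192     # local sliding-window size (8k tokens)
GLOBAL_WINDOW = 131_072  # global sliding-window size (131k tokens)
NUM_LAYERS = 64

def window_schedule(num_layers=NUM_LAYERS,
                    pattern=(LOCAL_WINDOW,) * 3 + (GLOBAL_WINDOW,)):
    """Sliding-window size per layer under an assumed 3:1 interleave."""
    return [pattern[i % len(pattern)] for i in range(num_layers)]

schedule = window_schedule()
print(schedule[:4])                    # three local blocks, then one global
print(schedule.count(GLOBAL_WINDOW))   # 16 global layers out of 64
```

Under this pattern, one in four layers attends over the full 131k window, which is what lets the model reach 131k effective context while most layers stay cheap.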
Training recipe (pre → mid → post)
- General pretraining: 8T tokens (code-heavy) at 8k context.
- Mid-training: +5T tokens, long-context (131k) with Python execution traces, ForagerAgent data, PR-derived diffs, IR/compilers, Triton kernels, and Lean math.
- Post-training: 100B-token SFT for instruction following and reasoning, then multi-task RL (~172B tokens) across verifiable coding, math, and multi-turn SWE environments using a GRPO-style algorithm and a minimal toolset (bash/edit/create/submit).
- Quantized inference fits on a single 80 GB H100.
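The RL stage uses a GRPO-style algorithm. A core ingredient of GRPO is computing advantages relative to a group of sampled completions for the same prompt, rather than via a learned value function. The sketch below shows only that group-relative normalization step; CWM's exact objective, clipping, and hyperparameters are in the paper.

```python
def grpo_advantages(rewards, eps=1e-6):
    """Group-relative advantages: normalize each sampled completion's
    reward against its group's mean and standard deviation.

    A minimal sketch of the GRPO-style idea, not CWM's implementation.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Four rollouts of one prompt, scored by a verifiable reward (e.g. tests pass/fail):
advs = grpo_advantages([1.0, 0.0, 0.0, 1.0])
print(advs)  # passing rollouts get positive advantage, failing ones negative
```

Because the baseline is the group mean, a verifiable binary reward (tests passed or not) is enough to produce a useful learning signal without a critic network.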
Benchmarks
The research team cites the following pass@1 scores (test-time scaling noted where applicable):
- SWE-bench Verified: 65.8% (with test-time scaling).
- LiveCodeBench-v5: 68.6%; LCB-v6: 63.5%.
- Math-500: 96.6%; AIME-24: 76.0%; AIME-25: 68.2%.
- CruxEval-Output: 94.3%.
The research team positions CWM as competitive with similarly sized open-weights baselines, and even with larger or closed models on SWE-bench Verified.
For context on SWE-bench Verified’s task design and metrics, see the official benchmark resources.

Why world modeling matters for code
The release emphasizes two operational capabilities:
- Execution-trace prediction: given a function and a trace start, CWM predicts stack frames (locals) and the executed line at each step via a structured format, usable as a “neural debugger” for grounded reasoning without live execution.
- Agentic coding: multi-turn reasoning with tool use against real repos, verified by hidden tests and patch-similarity rewards; the setup trains the model to localize faults and generate end-to-end patches (git diff) rather than snippets.
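A reward that combines hidden-test outcomes with patch similarity could be sketched as below. The particular combination (a weighted sum, with `difflib` as the similarity measure) is an illustrative assumption; CWM's actual reward function is specified in the paper.

```python
import difflib

def patch_reward(candidate_diff, reference_diff,
                 tests_passed, tests_total, w_tests=0.8):
    """Blend hidden-test pass rate with textual similarity to a reference
    patch. Weighting and similarity metric are hypothetical, for
    illustration only.
    """
    test_score = tests_passed / tests_total if tests_total else 0.0
    sim = difflib.SequenceMatcher(None, candidate_diff, reference_diff).ratio()
    return w_tests * test_score + (1 - w_tests) * sim

# A candidate patch that passes all hidden tests and matches the reference:
r = patch_reward("-a+b\n+a - b\n", "-a+b\n+a - b\n",
                 tests_passed=3, tests_total=3)
print(r)
```

The similarity term gives partial credit when tests fail but the candidate edit is close to the reference fix, which helps shape learning on hard repository tasks.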
Some details worth noting
- Tokenizer: Llama-3 family with reserved control tokens; reserved IDs are used to demarcate trace and reasoning segments during SFT.
- Attention structure: the 3:1 local:global interleave is repeated across the depth; long-context training occurs at large token batch sizes to stabilize gradients.
- Compute scaling: learning-rate/batch-size schedules are derived from internal scaling-law sweeps tailored for long-context overheads.
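The reserved-token demarcation in the first bullet amounts to wrapping trace segments in sentinel tokens so the model can tell them apart from code and prose. The token strings below are hypothetical placeholders; the real reserved IDs come from the released Llama-3-family tokenizer, not from this sketch.

```python
# Hypothetical sentinel strings standing in for reserved control-token IDs.
TRACE_START = "<|reserved_trace_start|>"
TRACE_END = "<|reserved_trace_end|>"

def wrap_trace_segment(trace_text):
    """Demarcate a trace segment within an SFT training example."""
    return f"{TRACE_START}{trace_text}{TRACE_END}"

sample = wrap_trace_segment("line 2: {'x': 3, 'y': 4}")
print(sample)
```

Because the sentinels are reserved IDs that never occur in ordinary text, the model can learn segment boundaries without ambiguity.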
Summary
CWM is a practical step toward grounded code generation: Meta ties a 32B dense transformer to execution-trace learning and agentic, test-verified patching, releases intermediate and post-trained checkpoints, and gates usage under the FAIR Non-Commercial Research License, making it a useful platform for reproducible ablations on long-context, execution-aware coding without conflating research with production deployment.
Check out the Paper, GitHub Page, and Model on Hugging Face.
The post Meta FAIR Released Code World Model (CWM): A 32-Billion-Parameter Open-Weights LLM, to Advance Research on Code Generation with World Models appeared first on MarkTechPost.