Alibaba Qwen Team Just Released FP8 Builds of Qwen3-Next-80B-A3B (Instruct & Thinking), Bringing 80B/3B-Active Hybrid-MoE to Commodity GPUs
Alibaba’s Qwen team has just released FP8-quantized checkpoints for its new Qwen3-Next-80B-A3B models in two post-training variants, Instruct and Thinking, aimed at high-throughput inference with ultra-long context and MoE efficiency. The FP8 repos mirror the BF16 releases but package “fine-grained FP8” weights (block size 128) and deployment notes for sglang and vLLM nightly builds. Benchmarks within…
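To make the “fine-grained FP8” idea concrete, here is a minimal sketch of block-wise quantization in plain Python: each block of 128 weights gets its own scale, so a single outlier only costs precision inside its own block. This is not the Qwen team's implementation. The E4M3 format (maximum finite value 448), the 1-D block layout, and the helper names `round_to_e4m3_like`, `quantize_blockwise`, and `dequantize_blockwise` are all assumptions made for illustration, and the rounding helper only mimics a 3-bit mantissa (it ignores subnormals and exponent limits).

```python
import math

# Assumptions for illustration only (not taken from the article):
# - FP8 E4M3 format, whose largest finite value is 448
# - 1-D blocks of 128 weights, one scale per block
FP8_E4M3_MAX = 448.0
BLOCK_SIZE = 128


def round_to_e4m3_like(x):
    """Crude stand-in for FP8 rounding: keep 1 implicit + 3 explicit mantissa bits."""
    if x == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(x)      # x = mantissa * 2**exponent, 0.5 <= |mantissa| < 1
    mantissa = round(mantissa * 16) / 16    # snap the mantissa to a grid of 1/16
    return math.ldexp(mantissa, exponent)


def quantize_blockwise(weights, block_size=BLOCK_SIZE):
    """Return (quantized_blocks, scales) with one scale per block."""
    quantized_blocks, scales = [], []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        amax = max(abs(w) for w in block)
        # Map the block's largest magnitude onto the largest representable value.
        scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
        quantized_blocks.append([round_to_e4m3_like(w / scale) for w in block])
        scales.append(scale)
    return quantized_blocks, scales


def dequantize_blockwise(quantized_blocks, scales):
    """Multiply each block back by its scale to recover approximate weights."""
    return [q * s for block, s in zip(quantized_blocks, scales) for q in block]


if __name__ == "__main__":
    weights = [(i - 200) * 0.01 for i in range(300)]
    blocks, scales = quantize_blockwise(weights)
    restored = dequantize_blockwise(blocks, scales)
    worst = max(abs(r - w) for r, w in zip(restored, weights))
    print(f"{len(scales)} blocks, worst absolute error {worst:.5f}")
```

As a rough size check under the same assumptions, one byte per weight puts 80B parameters at about 80 GB before scales and any unquantized layers, versus about 160 GB at two bytes per weight in BF16.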
