Zyphra Releases Zamba2-VL: Hybrid Mamba2–Transformer Vision-Language Models That Cut Time-to-First-Token by About an Order of Magnitude
Zyphra has released Zamba2-VL, a family of open vision-language models. The release covers three sizes: 1.2B, 2.7B, and 7B parameters. Each model is built on the Zamba2 hybrid SSM–Transformer backbone. Vision-language models (VLMs) read images and text together. They answer questions about charts, documents, and photos. Most open VLMs use a dense Transformer because…
