IBM Released New Granite 4.0 Models with a Novel Hybrid Mamba-2/Transformer Architecture: Drastically Reducing Memory Use without Sacrificing Performance
IBM has just released Granite 4.0, an open-source LLM family that swaps monolithic Transformers for a hybrid Mamba-2/Transformer stack to cut serving memory while maintaining quality. Sizes span a 3B dense “Micro,” a 3B hybrid “H-Micro,” a 7B hybrid MoE “H-Tiny” (~1B active), and a 32B hybrid MoE “H-Small” (~9B active). The models are Apache-2.0,…