ZenFlow: A New DeepSpeed Extension Designed as a Stall-Free Offloading Engine for Large Language Model (LLM) Training
The DeepSpeed team has unveiled ZenFlow, a new offloading engine designed to overcome a major bottleneck in large language model (LLM) training: CPU-induced GPU stalls. While offloading optimizer states and gradients to CPU memory reduces GPU memory pressure, traditional frameworks like ZeRO-Offload and ZeRO-Infinity often leave expensive GPUs idle for most of each training step, waiting…
