NVIDIA AI Releases Gated DeltaNet-2: A Linear Attention Layer That Decouples Erase and Write in the Delta Rule
Linear attention replaces the unbounded KV cache of softmax attention with a fixed-size recurrent state. This cuts sequence mixing to linear time and decoding to constant memory. The hard part is not what to forget. It is how to edit a compressed memory without scrambling existing associations. NVIDIA has released Gated DeltaNet-2,…
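
To make the "editing a compressed memory" problem concrete, here is a minimal pure-Python sketch. It contrasts a purely additive linear attention update with the classic delta rule from the earlier literature. It is not the Gated DeltaNet-2 update, which this excerpt does not spell out. The function names, the 2x2 state, and the toy keys and values are all assumptions chosen for illustration.

```python
# Illustrative sketch only: classic delta rule vs. additive linear attention.
# This is NOT the Gated DeltaNet-2 formulation; names and sizes are hypothetical.

def outer(a, b):
    """Outer product a b^T as a list of rows."""
    return [[x * y for y in b] for x in a]

def matvec(S, k):
    """Matrix-vector product S k."""
    return [sum(s * x for s, x in zip(row, k)) for row in S]

def additive_update(S, k, v):
    """Additive linear attention write: S <- S + v k^T."""
    vk = outer(v, k)
    return [[s + d for s, d in zip(rs, rd)] for rs, rd in zip(S, vk)]

def delta_update(S, k, v, beta=1.0):
    """Classic delta rule write: S <- S + beta * (v - S k) k^T.

    The term S k is what the memory currently returns for key k, so the
    update removes the old association before writing the new one.
    """
    pred = matvec(S, k)
    err = [vi - pi for vi, pi in zip(v, pred)]
    return additive_update(S, k, [beta * e for e in err])

def read(S, q):
    """Read from memory with query q: o = S q."""
    return matvec(S, q)

# Two orthonormal keys; the state stays 2x2 no matter how many writes occur.
k1, k2 = [1.0, 0.0], [0.0, 1.0]
writes = [(k1, [1.0, 2.0]), (k2, [3.0, 4.0]), (k1, [5.0, 6.0])]  # k1 is rewritten

S_add = [[0.0, 0.0], [0.0, 0.0]]
S_delta = [[0.0, 0.0], [0.0, 0.0]]
for k, v in writes:
    S_add = additive_update(S_add, k, v)
    S_delta = delta_update(S_delta, k, v)

print(read(S_add, k1))    # old and new values for k1 are summed together
print(read(S_delta, k1))  # the old value for k1 was replaced
print(read(S_delta, k2))  # the association for k2 is left intact
```

In this toy setting the keys are orthonormal and beta is 1, so the delta rule overwrites one association exactly and leaves the other untouched. With correlated keys or a smaller beta the edit is only partial.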
