NVIDIA AI Releases Gated DeltaNet-2: A Linear Attention Layer That Decouples Erase and Write in the Delta Rule
Linear attention replaces the ever-growing KV cache of softmax attention with a fixed-size recurrent state. This reduces sequence mixing to linear time and decoding to constant memory. The hard part is not forgetting. It is editing a compressed memory without corrupting existing associations. NVIDIA has released Gated DeltaNet-2, a specific …
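As a rough illustration of what editing a fixed-size memory means, the sketch below applies the classic delta rule to a small state matrix in plain Python. It is not NVIDIA's Gated DeltaNet-2 layer, whose details are not given here; the function names `delta_rule_update` and `read`, the toy dimensions, and the choice of orthonormal keys with a step size of 1 are all assumptions made for the example.

```python
def read(S, q):
    """Query the memory: returns S @ q for a state S with d_v rows and d_k columns."""
    return [sum(s_ij * q_j for s_ij, q_j in zip(row, q)) for row in S]


def delta_rule_update(S, k, v, beta=1.0):
    """One classic delta-rule write: S <- S - beta * (S @ k - v) k^T.

    The state keeps the same shape no matter how many writes are applied.
    """
    pred = read(S, k)  # what the memory currently returns for key k
    return [
        # Correct the prediction error, but only along the direction of k.
        [s_ij - beta * (p_i - v_i) * k_j for s_ij, k_j in zip(row, k)]
        for row, p_i, v_i in zip(S, pred, v)
    ]


d_k, d_v = 3, 2                              # toy sizes (assumed)
S = [[0.0] * d_k for _ in range(d_v)]        # fixed-size state, starts empty

k1, k2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]    # orthonormal keys (assumed)
S = delta_rule_update(S, k1, [1.0, 2.0])     # store k1 -> [1, 2]
S = delta_rule_update(S, k2, [3.0, 4.0])     # store k2 -> [3, 4]
S = delta_rule_update(S, k1, [5.0, 6.0])     # overwrite the value stored for k1

print(read(S, k1))  # [5.0, 6.0]
print(read(S, k2))  # [3.0, 4.0]
```

With orthonormal keys and a step size of 1, the overwrite replaces the old value for `k1` exactly and leaves the association for `k2` untouched. Real keys are rarely orthogonal, so writes interfere with each other, which is the difficulty the paragraph above describes.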