China's DeepSeek has launched Native Sparse Attention (NSA), a hardware-aligned and natively trainable sparse attention mechanism designed for ultra-fast long-context training and inference. DeepSeek NSA combines a dynamic hierarchical sparse strategy with coarse-grained token compression and fine-grained token selection. The China-based AI company said NSA would speed up inference and reduce pre-training costs without compromising performance. DeepSeek NSA is also said to outperform Full Attention models on various benchmarks.

DeepSeek Launches NSA Mechanism for Faster Inference, Lower Training Costs
