Alibaba's Qwen has released a new reasoning model, QwQ-32B, with 32 billion parameters, positioned to rival DeepSeek-R1. The Chinese AI company said it investigated recipes for scaling RL (Reinforcement Learning) and achieved 'impressive' results based on Qwen2.5-32B. Alibaba Qwen said the RL training improved math and coding performance, and that continued scaling of Reinforcement Learning could help a medium-sized model achieve better performance against MoE models. The company summarised it as: "Qwen2.5-Plus + Thinking (QwQ) = QwQ-32B".
QwQ 32B Reasoning Model Released by Alibaba's Qwen
Today, we release QwQ-32B, our new reasoning model with only 32 billion parameters that rivals cutting-edge reasoning model, e.g., DeepSeek-R1.
Blog: https://t.co/zCgACNdodj
ModelScope: https://t.co/hcfOD8wSLa
Demo: https://t.co/DxWPzAg6g8
Qwen Chat:… pic.twitter.com/kfvbNgNucW
— Qwen (@Alibaba_Qwen) March 5, 2025
(SocialLY brings you all the latest breaking news, viral trends and information from the social media world, including Twitter (X), Instagram and YouTube. The above post is embedded directly from the user's social media account, and LatestLY Staff may not have modified or edited the content body. The views and facts appearing in the social media post do not reflect the opinions of LatestLY, and LatestLY does not assume any responsibility or liability for the same.)