- Alibaba’s Tongyi Lab unveils the Qwen Robot Suite
- Its first embodied-AI models are split into navigation (RobotNav), a video “world model” (RobotWorld), and manipulation (RobotManip)
- The move comes after Nvidia recently unveiled and published its own Cosmos 3 offerings
While much of its AI competition continues to focus on making LLMs faster and more capable, Alibaba might be looking to lead on another frontier altogether, with its LLM ambitions in tow: robots.
The company’s Tongyi Lab has unveiled the Qwen Robot Suite, which it describes as a family of models focused on “embodied AI,” a field centered on enabling machines to perceive space, reason, and act accordingly.
This comes on the heels of Nvidia’s own Cosmos 3, a frontier model for physical AI, and further bolsters Nvidia CEO Jensen Huang’s narrative that China’s developer ecosystem remains relatively unaffected by chip restrictions, even as focus in the West continues to shift to powering the many sprawling data centers being built in the US.
A competitor or a complement to Nvidia’s playbook?
The Qwen Robot Suite consists of three core models: Qwen-RobotManip, a generalizable vision-language-action model; Qwen-RobotNav, a scalable vision-language navigation model; and Qwen-RobotWorld, a video world model designed for embodied intelligence.
There is no denying that robotics is being treated as perhaps the most crucial frontier for AI, even as LLMs continue to advance, with both Google and Nvidia among the companies pouring billions into research on their respective Gemini Robotics and open-source Cosmos offerings.
Alibaba claims that the model, which is built on the lightweight Qwen3.5-4B rather than its Qwen 3.7 Max with over a trillion parameters, tops the RoboChallenge real-robot benchmark with a score of 59.83 and a 45% task success rate.
With other interested parties such as Tencent, Unitree,…


























