Alibaba’s model never trained as an agent — and improved agent performance across seven benchmarks
Article Summary
Alibaba's Qwen team released Qwen-AgentWorld on Tuesday, a pair of models trained not to act inside agent environments, but to predict what those environments return. The release covers seven domains under a single architecture: MCP, Search, Terminal, Software Engineering, Android, Web, and OS.

The release extends Alibaba's recent push into autonomous agents. Qwen3.7-Max, released in May, was built around a 35-hour autonomous execution capability.

The shift from acting to predicting targets a ceiling that teams training agents at scale run into directly. Real search engines surface whatever results exist, with no mechanism to inject controlled conditions. Live terminals do not allow injecting a low-disk-space condition on demand. Agent training is therefore bounded by what production environments will surface, with no systematic way to expose the edge cases agents will need to handle but rarely encounter in training.

The research team trained agents inside the resulting simulator and found performance gains that exceeded what training against real environments alone produced. In a separate test, using world model training as…
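To make the idea of a controllable simulated environment concrete, here is a minimal Python sketch. It is not Qwen-AgentWorld's code or API: `ToyTerminalSimulator`, `step`, and `inject_low_disk` are hypothetical names, and the hand-written rules stand in for what a learned model would predict. It only illustrates the point above, that a simulator can return a low-disk-space response on demand, which a live terminal cannot.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTerminalSimulator:
    """Hand-written stand-in for a learned environment simulator (hypothetical)."""

    free_disk_bytes: int = 10_000_000
    files: dict = field(default_factory=dict)

    def inject_low_disk(self, free_bytes: int = 0) -> None:
        # Controlled condition that a live terminal would not produce on demand.
        self.free_disk_bytes = free_bytes

    def step(self, action: str) -> str:
        """Return the observation the simulated terminal gives for an action."""
        parts = action.split(maxsplit=2)
        if len(parts) == 3 and parts[0] == "write":
            _, path, content = parts
            size = len(content.encode())
            if size > self.free_disk_bytes:
                return f"write: {path}: No space left on device"
            self.files[path] = content
            self.free_disk_bytes -= size
            return f"wrote {size} bytes to {path}"
        if len(parts) == 2 and parts[0] == "cat":
            path = parts[1]
            if path in self.files:
                return self.files[path]
            return f"cat: {path}: No such file or directory"
        return f"{parts[0] if parts else ''}: command not found"


sim = ToyTerminalSimulator()
normal = sim.step("write notes.txt hello")       # ordinary case
sim.inject_low_disk()                            # force the rare edge case
edge_case = sim.step("write log.txt more data")  # agent now sees the failure
print(normal)
print(edge_case)
```

An agent trained against such a simulator could be shown the failure path as often as needed, rather than waiting for a production environment to surface it.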
Full story on VentureBeat