Artificial Analysis Rankings Released: Qwen 3.7 Claims the Crown Among Domestic Models, Ranking Top Five Globally

Qwen3.7-Max is about to launch on Alibaba Cloud Bailian, offering API services to the public

On May 21, the third-party organization Artificial Analysis released its latest global large model rankings. Alibaba’s newly launched flagship model, Qwen3.7-Max, scored 56.6, surpassing all other domestic (Chinese) models, including Kimi-K2.6, DeepSeek-v4-Pro-Max, and GLM5.1. Its score is close to those of the strongest GPT, Claude, and Gemini models, placing it fifth worldwide and first among domestic models. According to reports, Qwen3.7-Max is about to go live on Alibaba Cloud Bailian and provide API services to external users.

Caption: Screenshot from the Artificial Analysis website showing Qwen3.7-Max ranking fifth worldwide and first among domestic models

Artificial Analysis is an independent platform for evaluating and analyzing large AI models. It runs multi-dimensional benchmark tests and performance assessments on models worldwide and combines the results into a systematic composite ranking of model intelligence. The ranking is widely regarded in the industry as one of the most influential and credible third-party large model lists. Alibaba’s Qwen models have repeatedly ranked among the top entries on Artificial Analysis, and Qwen3.6-Max-Preview, released one month ago, had already set the best score recorded for a domestic model.

Qwen has now broken its own record. In the latest Artificial Analysis overall large model ranking, released on the evening of May 20, Qwen3.7-Max scored 56.6, an improvement of 4.8 points over the previous flagship model. That puts it close to GPT-5.4 (xhigh), Gemini3.1 Pro Preview, and Claude-Opus4.7 (max), fifth among all models and first among domestic models.

According to reports, Qwen3.7-Max was designed specifically for agentic use, with major breakthroughs in core capabilities such as coding, agent tasks, and reasoning. The model works with a wide range of agent frameworks, including Claude Code, OpenClaw, Hermes Agent, and Qwen Code. Through autonomous coding and agent tool calls, it can independently complete complex long-horizon tasks spanning 35 hours and more than 1,000 tool invocations, delivering production-grade results on complex enterprise-level tasks.
