Qwen is an open-source, frontier-class AI platform developed by Alibaba Group’s dedicated AI research team. As the generative AI space became saturated with proprietary Western models priced for enterprise budgets, Qwen emerged as the world’s most downloaded open model family, the serious developer’s alternative to paying a premium for black-box intelligence. Built around a rapidly maturing lineup of dense and Mixture-of-Experts (MoE) models and released under the permissive Apache 2.0 License, Qwen is designed to deliver graduate-level reasoning, generate production-ready code, power long-horizon agentic workflows, and process massive multimodal inputs, all while keeping the cost of experimentation as close to zero as possible.

Key Features
- Truly Open Weights, No Catches: Every generation of Qwen (from Qwen3 through to Qwen3.6) ships open-weight under the Apache 2.0 License. That means enterprises can download the full model, self-host it inside their own infrastructure, fine-tune it on proprietary data, and deploy it commercially, all without a licensing agreement, a usage cap, or a phone call to a sales team. The weights are on Hugging Face and ModelScope, ready to pull today.
- Hybrid Thinking Mode: One of Qwen’s signature innovations is its switchable reasoning architecture. Rather than forcing users to choose between a fast “turbo” model and a slow “reasoning” model, Qwen3 and later generations support a unified `enable_thinking` parameter. Set it to `true` for deep chain-of-thought reasoning on complex maths, legal analysis, or multi-step code; set it to `false` for instant, conversational replies. One model, two speeds, no re-prompting strategy required.
- Preserved Thinking Across Turns: Qwen3.6 introduced a feature that no major Western lab has shipped at this scale: the ability to retain reasoning context across conversation turns. In practice, this means Qwen remembers why it made an earlier decision in a coding session, dramatically reducing redundant re-reasoning on iterative development tasks and making multi-turn agentic workflows far more coherent.
- Massive Sparse MoE Architecture: Qwen3.5’s flagship model sits at 397 billion total parameters with 17 billion active per forward pass, while Qwen3.6 offers a 35B-A3B variant (35 billion total, 3 billion active) built on a hybrid Gated DeltaNet plus sparse MoE architecture. The result is frontier-class output at a fraction of the compute cost. For latency-sensitive production workloads, active parameter count drives speed — and Qwen’s active counts are among the lowest of any frontier-quality model available today.
- Qwen Code – An Open Terminal Agent: Qwen ships its own open-source Claude Code equivalent: Qwen Code, a terminal-first AI coding agent that supports OpenAI, Anthropic and Gemini-compatible APIs interchangeably. It includes built-in Skills, SubAgents, and supports VS Code, Zed and JetBrains IDE integration. Crucially, both the agent framework and the underlying Qwen3-Coder model are open-source and shipped together. This means the agent and the model co-evolve with every release.
- 201-Language Multimodal Coverage: Qwen3.6 extends Qwen’s language support to 201 languages and dialects. Its vision-language variants accept text, image and video inputs natively, with up to 1 million tokens of context. Qwen currently dominates Chinese-language reasoning benchmarks while remaining highly competitive in English. This makes it the strongest bilingual option available in open weights.
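The hybrid thinking toggle described above can be sketched as a request payload for an OpenAI-compatible chat endpoint. This is a minimal illustration under assumptions, not official SDK code: the model name is a placeholder, and the exact placement of `enable_thinking` in the request body should be verified against Alibaba Cloud’s current API documentation.

```python
# Sketch: building chat requests that flip Qwen's thinking mode per call.
# Assumes the OpenAI-compatible endpoint accepts an `enable_thinking`
# field in the request body -- check the current DashScope docs.

def build_qwen_request(prompt: str, deep_reasoning: bool,
                       model: str = "qwen3.5-plus") -> dict:
    """Return a chat-completions payload with thinking mode toggled."""
    return {
        "model": model,  # placeholder model name
        "messages": [{"role": "user", "content": prompt}],
        # True -> slow chain-of-thought pass; False -> instant reply.
        "enable_thinking": deep_reasoning,
    }

# Quick factual question: skip the reasoning pass.
fast = build_qwen_request("What's the capital of France?", deep_reasoning=False)

# Gnarly multi-step task: let the model think first.
slow = build_qwen_request("Prove this loop invariant holds.", deep_reasoning=True)
```

The point of the design is that both payloads target the same model; only the flag changes, so no prompt rewriting or model switching is needed between turns.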
Company Background
Qwen was built by Alibaba Group, the Chinese e-commerce and cloud computing giant headquartered in Hangzhou. The project began under the name Tongyi Qianwen (通义千问, literally “A Thousand Questions”) and was first introduced in April 2023, shortly after ChatGPT triggered a global scramble to build competitive large language models.
Unlike the university spinouts and well-funded startups that dominate Western AI, Qwen was born inside a trillion-dollar technology conglomerate with its own cloud infrastructure (Alibaba Cloud), its own chip partnerships, and direct access to the commercial deployment channels needed to train on real-world usage data at enormous scale. Alibaba trained Qwen 2.5 on 18 trillion tokens, double the dataset of its prior generation, and has since iterated to Qwen3, Qwen3.5 (February 2026), and Qwen3.6 (April 2026) in rapid succession.
By early 2026, Qwen had become the most downloaded open-weight model family on Hugging Face, accumulated over 30 million monthly active users on the Qwen consumer app, and was deployed inside more than 90,000 enterprises through Alibaba Cloud and DingTalk. The Qwen3.5 lineup beat every open-weight competitor on graduate-level reasoning benchmarks, a class once reserved for closed models like GPT-4 and Claude, and Qwen3.6 subsequently hit 78.8% on SWE-bench Verified, placing it within striking distance of the best closed coding models in the world.
User Experience
- The Interface (Qwen Studio): For general consumers, Qwen offers a polished web and mobile chat interface at qwen.ai. The experience is clean and fast, with native support for image uploads, video understanding, document processing, web search, and an Artifacts feature for generating runnable code directly inside the UI. The interface feels more fully-featured out of the box than most competitors at its price point. For free-tier users, it’s zero cost.
- The “Dual-Mode” Experience: The most distinctive thing about using Qwen day-to-day is the thinking toggle. Power users will find themselves flipping between instant replies for straightforward questions and deep reasoning mode for the tasks that actually need it: debugging a gnarly algorithmic problem, drafting a structured financial analysis, or working through a multi-file refactor. The model’s instruction-following in non-thinking mode is snappy and precise. Its reasoning mode output is genuinely deliberate and well-structured.
- Technical Quirks: Global users should be aware of a few realities. Qwen’s models are trained by a Chinese lab and operate under corresponding content policies. Topics touching Chinese political sensitivities will hit hard refusals regardless of prompt engineering. Additionally, while English fluency is strong, outputs on very large text generation tasks can occasionally surface Chinese punctuation or formatting artifacts, particularly in non-thinking mode. For non-Chinese enterprises with strict data residency requirements, Alibaba Cloud’s EU (Frankfurt) and US (Virginia) deployment regions are available, though infrastructure sovereignty ultimately still runs through Alibaba’s cloud stack.
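One pragmatic mitigation for the punctuation artifacts mentioned above is a small post-processing pass that maps full-width CJK punctuation back to ASCII equivalents. A minimal sketch, with a deliberately small and illustrative mapping table (not an exhaustive or official normalizer):

```python
# Sketch: normalize stray full-width (CJK) punctuation in English output.
# The mapping is illustrative, not exhaustive -- extend it for your use case.
CJK_TO_ASCII = {
    "，": ", ",
    "。": ". ",
    "：": ": ",
    "；": "; ",
    "！": "! ",
    "？": "? ",
    "（": " (",
    "）": ") ",
}

def normalize_punctuation(text: str) -> str:
    """Replace full-width punctuation, then collapse doubled whitespace."""
    for cjk, ascii_equiv in CJK_TO_ASCII.items():
        text = text.replace(cjk, ascii_equiv)
    return " ".join(text.split())

print(normalize_punctuation("The result is 42。Next，we verify it？"))
# -> The result is 42. Next, we verify it?
```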
Cost
Qwen’s pricing strategy is built around aggressive open-source availability and developer-first API economics. Alibaba Cloud uses a tiered-by-request-size pricing model, rewarding developers who keep their input context efficient, while simultaneously offering highly competitive subscriptions to challenge Claude Code.
Free Tier
- Model Studio Free Quota – Available to new registered users in supported regions (like the Singapore endpoint).
  - Price: $0
  - Includes: 1 million input tokens and 1 million output tokens across flagship models, valid for 90 days after activation.
  - Use case: Initial experimentation, testing API integrations, and light hobbyist tasks.
- Self-Hosted (Open-Weight) – Open-source Qwen models (including Qwen2.5, Qwen3, and Qwen3.5 families) are downloadable from Hugging Face and ModelScope.
  - Price: $0 in licensing fees; standard compute costs apply.
  - Use case: Strict data sovereignty, regulatory compliance, fine-tuning on proprietary datasets, or offline agentic workflows.
Alibaba Cloud Coding Plan (Subscription Tiers)
This is Alibaba’s direct answer to the Claude Code subscription stack. Rather than just offering Qwen, the subscription provides a unified API key granting access to Qwen models alongside top-tier Chinese models like GLM-5, Kimi K2.5, and MiniMax M2.5.
- Lite Plan – Entry tier for hobbyists and side projects.
  - Price: ~$3 for the first month (renews at $5/month)
  - Includes: Up to 18,000 requests per month across the available model suite.
- Pro Plan – For active developers integrating AI into daily coding workflows.
  - Price: ~$15 for the first month (renews at $25/month)
  - Includes: A generous quota of 90,000 requests per month.
Pay-Per-Token API Pricing (DashScope)
For developers who prefer usage-based billing, DashScope charges based on the specific model and the size of the context window per request (larger inputs push you into higher pricing tiers).
- Qwen3.5-Flash: ~$0.10 input / $0.40 output per million tokens (Best for simple, low-latency tasks).
- Qwen3-Coder-Next: ~$0.30 input / $1.50 output per million tokens (Agentic coding model optimized for multi-turn tool interactions).
- Qwen3.5-Plus: ~$0.40 input / $2.40 output per million tokens (The balanced workhorse; price applies to standard contexts under 256K).
- Qwen3.6-Plus: ~$0.50 to $2.00 input / $3.00 to $6.00 output per million tokens (The latest native multimodal flagship; exact price depends heavily on the input tier).
- Qwen3-Max: ~$1.20 input / $6.00 output per million tokens (Top-tier model for complex reasoning; price applies to contexts under 32K).
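To make the per-token economics above concrete, a back-of-the-envelope bill for a single request can be sketched as follows. The rates are the approximate lowest-tier figures listed above; real billing also depends on which context-size tier each request lands in, so treat this strictly as an estimate.

```python
# Sketch: estimating a pay-per-token bill from the approximate rates above.
# Real DashScope billing varies by context-size tier; these are the
# flat lowest-tier figures, used here only for rough estimation.

RATES = {  # model: (input $/M tokens, output $/M tokens)
    "qwen3.5-flash": (0.10, 0.40),
    "qwen3-coder-next": (0.30, 1.50),
    "qwen3.5-plus": (0.40, 2.40),
    "qwen3-max": (1.20, 6.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the flat lowest-tier rate."""
    rate_in, rate_out = RATES[model]
    return (input_tokens * rate_in + output_tokens * rate_out) / 1_000_000

# Example: 50K tokens in, 4K tokens out on the balanced workhorse.
cost = estimate_cost("qwen3.5-plus", 50_000, 4_000)
print(f"${cost:.4f}")  # 0.02 input + 0.0096 output ≈ $0.0296
```

Note how heavily output tokens dominate: at these rates, the 4K-token reply costs nearly half as much as the entire 50K-token input, which is why keeping reasoning mode off for simple requests matters for cost.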
For full pricing details, see the Qwen pricing page.
In summary, Qwen is not just a regional AI player. It’s a globally competitive, multimodal and developer-obsessed ecosystem. By balancing state-of-the-art proprietary models with incredibly capable open-weight releases, Qwen provides a uniquely flexible lens through which to build with AI. It is the AI of choice for developers who need robust multilingual support, cutting-edge multimodal capabilities, and the freedom to shift seamlessly between highly scalable APIs and private, self-hosted infrastructure.