Alibaba's Qwen team has released Qwen3, the third generation of their open-weight model family, and it represents a significant leap forward. The flagship Qwen3-235B uses a Mixture-of-Experts architecture that activates only 22B parameters per forward pass, keeping inference costs low while matching frontier model quality on most benchmarks. The most interesting new feature is a switchable "thinking mode" — similar to the approach used in DeepSeek R1 — which lets the model toggle between fast instruct responses and slower, more careful chain-of-thought reasoning depending on the task. Developers can control this via a simple system prompt flag. The full suite includes models ranging from 0.6B to 235B, all released under the Apache 2.0 licence.