Skip to content

Conversation

wangxiyuan
Copy link
Collaborator

@wangxiyuan wangxiyuan commented Sep 14, 2025

What this PR does / why we need it?

Add Qwen3-next support.

Does this PR introduce any user-facing change?

Yes, users can use Qwen3 next.
Related doc: #2916 the tutorial will be ready in here

How was this patch tested?

Doc CI passed

Related: #2884

Copy link

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:‌‌

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests ‌to ensure it works and is not broken by other future PRs.
  • Write the commit message by fulfilling the PR description to help reviewer and future developers understand.

If CI fails, you can run linting and testing checks locally according Contributing and Testing.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for the new Qwen3-next model, which is a hybrid attention model. The changes are extensive, touching attention mechanisms, model runner logic, and adding new custom operators. While the implementation is comprehensive, I've identified several critical issues related to code duplication, performance bottlenecks in the new model's prefill implementation, and potential correctness issues due to hardcoded values. There are also some high-severity issues regarding dead code and missed performance optimizations. Addressing these points will significantly improve the robustness and performance of the new model support.

MengqingCao and others added 17 commits September 14, 2025 16:14
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Angazenn <supperccell@163.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Angazenn <supperccell@163.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Angazenn <supperccell@163.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Angazenn <supperccell@163.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Angazenn <supperccell@163.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Angazenn <supperccell@163.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Your Name <you@example.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: zzzzwwjj <1183291235@qq.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: zzzzwwjj <1183291235@qq.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@wangxiyuan wangxiyuan added ready read for review ready-for-test start test by label for PR labels Sep 14, 2025
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@vllm-project vllm-project deleted a comment from gemini-code-assist bot Sep 14, 2025
# print(f"self.layer_idx: {self.layer_idx}, 111 mixed_qkv_non_spec: {mixed_qkv_non_spec}")

# 2.1: process the mutli-query part
# if spec_sequence_masks is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spec_sequence_masks is not None is used to choose non-MTP branch. Since MTP for qwen3 next is not supported on npu now, maybe we can check it here.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@Yikun Yikun added ready-for-test start test by label for PR and removed ready-for-test start test by label for PR labels Sep 14, 2025
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@wangxiyuan wangxiyuan added ready-for-test start test by label for PR and removed ready-for-test start test by label for PR labels Sep 14, 2025
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@wangxiyuan wangxiyuan added ready-for-test start test by label for PR and removed ready-for-test start test by label for PR labels Sep 14, 2025
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@wangxiyuan wangxiyuan added ready-for-test start test by label for PR and removed ready-for-test start test by label for PR labels Sep 14, 2025
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@wangxiyuan wangxiyuan added ready-for-test start test by label for PR and removed ready-for-test start test by label for PR labels Sep 14, 2025
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@wangxiyuan wangxiyuan added ready-for-test start test by label for PR and removed ready-for-test start test by label for PR labels Sep 14, 2025
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@wangxiyuan wangxiyuan added ready-for-test start test by label for PR and removed ready-for-test start test by label for PR labels Sep 14, 2025
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@wangxiyuan wangxiyuan added ready-for-test start test by label for PR and removed ready-for-test start test by label for PR labels Sep 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accuracy-test enable all accuracy test for PR module:core module:ops module:tests ready read for review ready-for-test start test by label for PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants