Skip to content

Commit 1b14e1e

Browse files
committed
[Release] Add release note for v0.10.2
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
1 parent 382c29f commit 1b14e1e

File tree

7 files changed

+46
-9
lines changed

7 files changed

+46
-9
lines changed

.github/workflows/vllm_ascend_test_full.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ on:
2121
branches:
2222
- 'main'
2323
- '*-dev'
24-
types: [ labeled ]
24+
types: [ labeled, synchronize ]
2525

2626
# Bash shells do not use ~/.profile or ~/.bashrc so these shells need to be explicitly
2727
# declared as "shell: bash -el {0}" on steps that need to be properly activated.

README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,7 @@ Please use the following recommended versions to get started quickly:
5252

5353
| Version | Release type | Doc |
5454
|------------|--------------|--------------------------------------|
55-
|v0.10.1rc1|Latest release candidate|[QuickStart](https://vllm-ascend.readthedocs.io/en/latest/quick_start.html) and [Installation](https://vllm-ascend.readthedocs.io/en/latest/installation.html) for more details|
55+
|v0.10.2rc1|Latest release candidate|[QuickStart](https://vllm-ascend.readthedocs.io/en/latest/quick_start.html) and [Installation](https://vllm-ascend.readthedocs.io/en/latest/installation.html) for more details|
5656
|v0.9.1|Latest stable version|[QuickStart](https://vllm-ascend.readthedocs.io/en/v0.9.1-dev/quick_start.html) and [Installation](https://vllm-ascend.readthedocs.io/en/v0.9.1-dev/installation.html) for more details|
5757

5858
## Contributing

README.zh.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -53,7 +53,7 @@ vLLM 昇腾插件 (`vllm-ascend`) 是一个由社区维护的让vLLM在Ascend NP
5353

5454
| Version | Release type | Doc |
5555
|------------|--------------|--------------------------------------|
56-
|v0.10.1rc1| 最新RC版本 |请查看[快速开始](https://vllm-ascend.readthedocs.io/en/latest/quick_start.html)[安装指南](https://vllm-ascend.readthedocs.io/en/latest/installation.html)了解更多|
56+
|v0.10.2rc1| 最新RC版本 |请查看[快速开始](https://vllm-ascend.readthedocs.io/en/latest/quick_start.html)[安装指南](https://vllm-ascend.readthedocs.io/en/latest/installation.html)了解更多|
5757
|v0.9.1| 最新正式/稳定版本 |[快速开始](https://vllm-ascend.readthedocs.io/en/v0.9.1-dev/quick_start.html) and [安装指南](https://vllm-ascend.readthedocs.io/en/v0.9.1-dev/installation.html)了解更多|
5858

5959
## 贡献

docs/source/community/versioning_policy.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,7 @@ Following is the Release Compatibility Matrix for vLLM Ascend Plugin:
2222

2323
| vLLM Ascend | vLLM | Python | Stable CANN | PyTorch/torch_npu | MindIE Turbo |
2424
|-------------|--------------|------------------|-------------|--------------------|--------------|
25+
| v0.10.2rc1 | v0.10.2 | >= 3.9, < 3.12 | 8.2.RC1 | 2.7.1 / 2.7.1.dev20250724 | |
2526
| v0.10.1rc1 | v0.10.1/v0.10.1.1 | >= 3.9, < 3.12 | 8.2.RC1 | 2.7.1 / 2.7.1.dev20250724 | |
2627
| v0.10.0rc1 | v0.10.0 | >= 3.9, < 3.12 | 8.2.RC1 | 2.7.1 / 2.7.1.dev20250724 | |
2728
| v0.9.2rc1 | v0.9.2 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1.post1.dev20250619 | |
@@ -41,7 +42,8 @@ Following is the Release Compatibility Matrix for vLLM Ascend Plugin:
4142
### release window
4243

4344
| Date | Event |
44-
|------------|-------------------------------------------|
45+
|------------|-------------------------------------------|、
46+
| 2025.09.15| Release candidates, v0.10.2rc1 |
4547
| 2025.09.04 | Release candidates, v0.10.1rc1 |
4648
| 2025.09.03 | v0.9.1 Final release |
4749
| 2025.08.22 | Release candidates, v0.9.1rc3 |

docs/source/conf.py

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -65,15 +65,15 @@
6565
# the branch of vllm, used in vllm clone
6666
# - main branch: 'main'
6767
# - vX.Y.Z branch: 'vX.Y.Z'
68-
'vllm_version': 'v0.10.1.1',
68+
'vllm_version': 'v0.10.2',
6969
# the branch of vllm-ascend, used in vllm-ascend clone and image tag
7070
# - main branch: 'main'
7171
# - vX.Y.Z branch: latest vllm-ascend release tag
72-
'vllm_ascend_version': 'v0.10.1rc1',
72+
'vllm_ascend_version': 'v0.10.2rc1',
7373
# the newest release version of vllm-ascend and matched vLLM, used in pip install.
7474
# This value should be updated when cut down release.
75-
'pip_vllm_ascend_version': "0.10.1rc1",
76-
'pip_vllm_version': "0.10.1.1",
75+
'pip_vllm_ascend_version': "0.10.2rc1",
76+
'pip_vllm_version': "v0.10.2",
7777
# CANN image tag
7878
'cann_image_tag': "8.2.rc1-910b-ubuntu22.04-py3.11",
7979
# vllm version in ci

docs/source/faqs.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33
## Version Specific FAQs
44

55
- [[v0.9.1] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/2643)
6-
- [[v0.10.1rc1] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/2630)
6+
- [[v0.10.2rc1] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/2874)
77

88
## General FAQs
99

docs/source/user_guide/release_notes.md

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,40 @@
11
# Release note
22

3+
## v0.10.2rc1 - 2025.09.15
4+
5+
This is the 1st release candidate of v0.10.2 for vLLM Ascend. Please follow the [official doc](https://vllm-ascend.readthedocs.io/en/) to get started.
6+
7+
### Highlights
8+
9+
- Add support for Qwen3 Next. Please note that expert parallel and MTP feature doesn't work with this release. The server may crash in some case as well. We'll make it stable enough soon. Follow the [official guide](https://vllm-ascend.readthedocs.io/en/latest/tutorials/multi_npu_qwen3_next.html) to get start [#2917](https://github.com/vllm-project/vllm-ascend/pull/2917)
10+
- Add quantization support for aclgraph [#2841](https://github.com/vllm-project/vllm-ascend/pull/2841)
11+
12+
### Core
13+
14+
- Aclgraph now works with Ray backend. [#2589](https://github.com/vllm-project/vllm-ascend/pull/2589)
15+
- MTP now works with the token > 1. [#2708](https://github.com/vllm-project/vllm-ascend/pull/2708)
16+
- Qwen2.5 VL now works with quantization. [#2778](https://github.com/vllm-project/vllm-ascend/pull/2778)
17+
- Improved the performance with async scheduler enabled. [#2783](https://github.com/vllm-project/vllm-ascend/pull/2783)
18+
- Fixed the performance regression with non MLA model when use default scheduler. [#2894](https://github.com/vllm-project/vllm-ascend/pull/2894)
19+
20+
### Other
21+
- The performance of w8a8 quantization is improved. [#2275](https://github.com/vllm-project/vllm-ascend/pull/2275)
22+
- The performance of moe model is improved. [#2689](https://github.com/vllm-project/vllm-ascend/pull/2689) [#2842](https://github.com/vllm-project/vllm-ascend/pull/2842)
23+
- Fixed resources limit error when apply speculative decoding and aclgraph. [#2472](https://github.com/vllm-project/vllm-ascend/pull/2472)
24+
- Fixed the git config error in docker images. [#2746](https://github.com/vllm-project/vllm-ascend/pull/2746)
25+
- Fixed the sliding windows attention bug with prefill. [#2758](https://github.com/vllm-project/vllm-ascend/pull/2758)
26+
- The official doc for Prefill Decode Disaggregation with Qwen3 is added. [#2751](https://github.com/vllm-project/vllm-ascend/pull/2751)
27+
- `VLLM_ENABLE_FUSED_EXPERTS_ALLGATHER_EP` env works again. [#2740](https://github.com/vllm-project/vllm-ascend/pull/2740)
28+
- A new improvement for oproj in deepseek is added. Set `oproj_tensor_parallel_size` to enable this feature[#2167](https://github.com/vllm-project/vllm-ascend/pull/2167)
29+
- Fix a bug that deepseek with torchair doesn't work as expect when `graph_batch_sizes` is set. [#2760](https://github.com/vllm-project/vllm-ascend/pull/2760)
30+
- Avoid duplicate generation of sin_cos_cache in rope when kv_seqlen > 4k. [#2744](https://github.com/vllm-project/vllm-ascend/pull/2744)
31+
- The performance of Qwen3 dense model is improved with flashcomm_v1. Set `VLLM_ASCEND_ENABLE_DENSE_OPTIMIZE=1` and `VLLM_ASCEND_ENABLE_FLASHCOMM=1` to enable it. [#2779](https://github.com/vllm-project/vllm-ascend/pull/2779)
32+
- The performance of Qwen3 dense model is improved with prefetch feature. Set `VLLM_ASCEND_ENABLE_PREFETCH_MLP=1` to enable it. [#2816](https://github.com/vllm-project/vllm-ascend/pull/2816)
33+
- The performance of Qwen3 MoE model is improved with rope ops update. [#2571](https://github.com/vllm-project/vllm-ascend/pull/2571)
34+
- Fix the weight load error for RLHF case. [#2756](https://github.com/vllm-project/vllm-ascend/pull/2756)
35+
- Add warm_up_atb step to speed up the inference. [#2823](https://github.com/vllm-project/vllm-ascend/pull/2823)
36+
- Fixed the aclgraph steam error for moe model. [#2827](https://github.com/vllm-project/vllm-ascend/pull/2827)
37+
338
## v0.10.1rc1 - 2025.09.04
439

540
This is the 1st release candidate of v0.10.1 for vLLM Ascend. Please follow the [official doc](https://vllm-ascend.readthedocs.io/en/) to get started.

0 commit comments

Comments
 (0)