FastDeploy 2.0.3: error when deploying DeepSeek-R1-Distill-Qwen-32B #3237

@xiangvictory

Description

Environment:
4x A6000 GPUs
Python 3.10
Model: DeepSeek-R1-Distill-Qwen-32B (downloaded from AI Studio)

Command:
NCCL_DEBUG=INFO python -m fastdeploy.entrypoints.openai.api_server --model /workspace/models/DeepSeek-R1-Distill-Qwen-32B_baidu --port 8180 --metrics-port 8181 --engine-worker-queue-port 8182 --max-model-len 8192 --max-num-seqs 1 --tensor-parallel-size 4 --reasoning-parser qwen3


Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/engine/../worker/worker_process.py", line 730, in <module>
    run_worker_proc()
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/engine/../worker/worker_process.py", line 711, in run_worker_proc
    worker_proc.load_model()
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/engine/../worker/worker_process.py", line 409, in load_model
    self.worker.load_model()
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/worker/gpu_worker.py", line 160, in load_model
    self.model_runner.load_model()
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/worker/gpu_model_runner.py", line 725, in load_model
    self.model = get_model_from_loader(fd_config=self.fd_config)
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/model_loader.py", line 54, in get_model_from_loader
    model = model_loader.load_model(fd_config)
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/model_loader.py", line 115, in load_model
    state_dict = load_composite_checkpoint(
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/load_weight_utils.py", line 311, in load_composite_checkpoint
    state_dict = load_tp_checkpoint(model_path,
  File "/usr/local/lib/python3.10/dist-packages/paddleformers/transformers/model_utils.py", line 3260, in load_tp_checkpoint
    return load_sharded_checkpoint_as_one(folder, return_numpy=return_numpy)
  File "/usr/local/lib/python3.10/dist-packages/paddleformers/transformers/model_utils.py", line 3237, in load_sharded_checkpoint_as_one
    state_dict = loader(os.path.join(folder, shard_file))
  File "/usr/local/lib/python3.10/dist-packages/paddleformers/utils/safetensors.py", line 317, in fast_load_file
    with fast_safe_open(filename, framework="np") as f:
  File "/usr/local/lib/python3.10/dist-packages/paddleformers/utils/safetensors.py", line 289, in __init__
    self.base, self.tensors_decs, self.metadata = read_metadata(self.file)
  File "/usr/local/lib/python3.10/dist-packages/paddleformers/utils/safetensors.py", line 111, in read_metadata
    raise ValueError("SafeTensorError::MetadataIncompleteBuffer")
ValueError: SafeTensorError::MetadataIncompleteBuffer

The same error occurs whether I run on a single GPU or on multiple GPUs: ValueError: SafeTensorError::MetadataIncompleteBuffer
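Since the exception is raised while reading the header of a shard file, one thing worth ruling out is a truncated or incomplete download. That is only a guess and has not been confirmed as the cause here. The sketch below is a stand-alone diagnostic and not part of FastDeploy or paddleformers. It relies only on the published safetensors layout (an 8-byte little-endian header length, then a JSON header, then the tensor bytes), and the function names `check_safetensors` and `check_model_dir` are made up for illustration.

```python
import json
import struct
from pathlib import Path


def check_safetensors(path):
    """Return (ok, message) for one .safetensors file, using only the stdlib.

    Assumed layout (safetensors format): 8-byte little-endian header length N,
    then N bytes of UTF-8 JSON, then the raw tensor bytes.
    """
    path = Path(path)
    size = path.stat().st_size
    with path.open("rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return False, "file is shorter than the 8-byte length prefix"
        (header_len,) = struct.unpack("<Q", prefix)
        if 8 + header_len > size:
            return False, f"header claims {header_len} bytes, file has only {size}"
        raw = f.read(header_len)
    try:
        header = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        return False, f"header is not valid JSON: {exc}"
    if not isinstance(header, dict):
        return False, "header JSON is not an object"
    # The largest end offset tells how many tensor bytes should follow the header.
    data_end = max(
        (entry["data_offsets"][1]
         for name, entry in header.items()
         if name != "__metadata__"),
        default=0,
    )
    expected = 8 + header_len + data_end
    if size < expected:
        return False, f"tensor data truncated: expected {expected} bytes, found {size}"
    return True, "ok"


def check_model_dir(model_dir):
    """Check every *.safetensors shard in a directory.

    Returns a dict mapping file name to (ok, message).
    """
    shards = sorted(Path(model_dir).glob("*.safetensors"))
    return {p.name: check_safetensors(p) for p in shards}
```

Calling `check_model_dir("/workspace/models/DeepSeek-R1-Distill-Qwen-32B_baidu")` and printing the entries whose first element is `False` would show which shards, if any, fail this check. If every shard passes, the problem is probably elsewhere.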
