-
Notifications
You must be signed in to change notification settings - Fork 835
Description
Describe the bug
bash vllm_inference.sh --model_name_or_path ./Llama-2-7b-hf --dataset_path data/alpaca/test_conversation --output_dir data/inference_results
[2025-04-03 14:16:08,168] [INFO] [real_accelerator.py:239:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Consider install flash_attn for better performance.
Checking dataset keys: 100%|██████████| 1/1 [00:00<00:00, 2833.99it/s]
Downloading data files: 100%|██████████| 1/1 [00:00<00:00, 13066.37it/s]
Extracting data files: 100%|██████████| 1/1 [00:00<00:00, 1470.14it/s]
Generating train split: 252 examples [00:00, 59018.63 examples/s]
Map (num_proc=16): 0%| | 0/252 [00:00<?, ? examples/s]Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Map (num_proc=16): 6%|▋ | 16/252 [00:00<00:02, 83.50 examples/s]Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Map (num_proc=16): 38%|███▊ | 96/252 [00:00<00:00, 354.95 examples/s]Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Map (num_proc=16): 57%|█████▋ | 144/252 [00:00<00:00, 382.18 examples/s]Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Map (num_proc=16): 88%|████████▊ | 222/252 [00:00<00:00, 475.83 examples/s]Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Not a valid conversation for generation, since the conversation doesn't end up with an user message. Skip.
Map (num_proc=16): 100%|██████████| 252/252 [00:00<00:00, 371.52 examples/s]
/root/miniconda3/envs/netlab/lib/python3.10/site-packages/datasets/table.py:1421: FutureWarning: promote has been superseded by promote_options='default'.
table = cls._concat_blocks(blocks, axis=0)
INFO 04-03 14:16:12 llm_engine.py:223] Initializing an LLM engine (v0.6.1.post2) with config: model='./Llama-2-7b-hf', speculative_config=None, tokenizer='./Llama-2-7b-hf', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=4096, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=./Llama-2-7b-hf, use_v2_block_manager=False, num_scheduler_steps=1, enable_prefix_caching=False, use_async_output_proc=True)
INFO 04-03 14:16:13 model_runner.py:997] Starting to load model ./Llama-2-7b-hf...
Loading safetensors checkpoint shards: 0% Completed | 0/2 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 50% Completed | 1/2 [00:00<00:00, 1.30it/s]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:02<00:00, 1.59s/it]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:02<00:00, 1.47s/it]
INFO 04-03 14:16:16 model_runner.py:1008] Loading model weights took 12.5523 GB
INFO 04-03 14:16:17 gpu_executor.py:122] # GPU blocks: 4109, # CPU blocks: 512
INFO 04-03 14:16:19 model_runner.py:1311] Capturing the model for CUDA graphs. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI.
INFO 04-03 14:16:19 model_runner.py:1315] CUDA graphs can take additional 1~3 GiB memory per GPU. If you are running out of memory, consider decreasing gpu_memory_utilization
or enforcing eager mode. You can also reduce the max_num_seqs
as needed to decrease memory usage.
INFO 04-03 14:16:33 model_runner.py:1430] Graph capturing finished in 14 secs.
Processed prompts: 0it [00:00, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]