[Bug]: Issues with vLLM tool call functionality leading to abnormal requests #11284
Labels: bug
Your current environment
The output of `python collect_env.py`
Model Input Dumps
No response
🐛 Describe the bug
I am encountering issues while using the tool call capability of vLLM. Some requests behave abnormally, and the log shows the following error:
Additionally, my startup script is as follows:
```bash
-d vllm/vllm-openai:v0.6.3.post1 \
  --host 0.0.0.0 --port 30000 \
  --model /llm/models/Qwen2.5-32B-Instruct \
  --served-model-name qwen2.5-32b-instruct \
  --dtype auto \
  --tensor-parallel-size 2 \
  --gpu-memory-utilization 0.90 \
  --enable-prefix-caching \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```
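For reference, the requests that trigger the problem are ordinary tool-call requests sent through the OpenAI-compatible API. Below is a minimal client-side sketch of such a request; the `get_weather` tool definition is a hypothetical placeholder rather than my actual payload, and it assumes the server above is reachable at `http://localhost:30000/v1`:

```python
from openai import OpenAI

# Point the OpenAI client at the vLLM server started above.
client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

# "get_weather" is a made-up example tool, not the real one from my application.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="qwen2.5-32b-instruct",  # matches --served-model-name
    messages=[{"role": "user", "content": "What is the weather in Beijing?"}],
    tools=tools,
    tool_choice="auto",  # relies on --enable-auto-tool-choice + hermes parser
)
print(response.choices[0].message.tool_calls)
```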
Currently, I am using vLLM version: vllm/vllm-openai:v0.6.3.post1
Could you please provide guidance on how to resolve this issue? Any help would be greatly appreciated.
Thank you!