[Usage]: vLLM model service crashes when adding OpenAI-API-compatible model in dify, model id: Qwen/Qwen2-VL-7B-Instruct #11154
Comments
I'm encountering a similar issue when running the vLLM ARM container with Qwen2-VL-2B-Instruct.
I also tried the https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct model and am facing the same error on an x86 CPU inside a Podman container.
Sorry for missing this, can you post an example image that results in this error? cc @Isotr0py
```shell
podman run \
  -v $HF_HUB_CACHE/models--Qwen--Qwen2-VL-2B-Instruct:/cache/models--Qwen--Qwen2-VL-2B-Instruct \
  quay.io/rh-ee-astefani/vllm:cpu-1734105797 \
  --model=/cache/models--Qwen--Qwen2-VL-2B-Instruct/snapshots/47592516d3e709cd9c194715bc76902241c5edea
```
I don't have podman, can you just upload the image here? Edit: by image I mean the image being passed into the model, not the image of the container.
Oh yeah, I tried without an image first, and that was where the error was happening.
I think this should be solved in #11396, please try it out.
Seems that this is related to the
This should be fixed by #11434, please have a try :)
Your current environment
vLLM API server version 0.6.4.post2
docker vllm-cpu-env
model "Qwen/Qwen2-VL-7B-Instruct"
How would you like to use vllm
The vLLM model service crashes when adding an OpenAI-API-compatible model in Dify; model id: Qwen/Qwen2-VL-7B-Instruct.
Error message:

```
INFO 12-13 01:12:19 engine.py:267] Added request chatcmpl-093873b3235846bfb6500cd5807b39be.
ERROR 12-13 01:12:19 engine.py:135] RuntimeError("shape '[0, -1, 128]' is invalid for input of size 71680")
ERROR 12-13 01:12:19 engine.py:135] Traceback (most recent call last):
ERROR 12-13 01:12:19 engine.py:135]   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/multiprocessing/engine.py", line 133, in start
ERROR 12-13 01:12:19 engine.py:135]     self.run_engine_loop()
......
ERROR 12-13 01:12:19 engine.py:135]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
ERROR 12-13 01:12:19 engine.py:135]     return forward_call(*args, **kwargs)
ERROR 12-13 01:12:19 engine.py:135]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/rotary_embedding.py", line 825, in forward
ERROR 12-13 01:12:19 engine.py:135]     query = query.view(num_tokens, -1, self.head_size)
ERROR 12-13 01:12:19 engine.py:135] RuntimeError: shape '[0, -1, 128]' is invalid for input of size 71680
CRITICAL 12-13 01:12:19 launcher.py:99] MQLLMEngine is already dead, terminating server process
INFO:     192.168.0.200:37164 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO:     Shutting down
```
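For context, the `RuntimeError` in the log comes down to `query.view(num_tokens, -1, self.head_size)` being asked to infer the `-1` dimension while `num_tokens` is 0: `view` derives the missing dimension as `numel / (num_tokens * head_size)`, which is undefined when the known dimensions multiply to 0 but the tensor is non-empty. A minimal pure-Python sketch of that inference rule (the function and the token counts are illustrative; only 71680 and 128 come from the traceback):

```python
def infer_middle_dim(numel: int, num_tokens: int, head_size: int) -> int:
    """Mimic how view(num_tokens, -1, head_size) infers the -1 dimension."""
    known = num_tokens * head_size
    # The inferred dim must be a whole number; with num_tokens == 0 the
    # product of known dims is 0, so nothing can satisfy the reshape.
    if known == 0 or numel % known != 0:
        raise RuntimeError(
            f"shape '[{num_tokens}, -1, {head_size}]' is invalid "
            f"for input of size {numel}"
        )
    return numel // known

# 71680 elements reshape fine for a nonzero token count, e.g. 20 tokens
# with head_size 128 gives 28 heads in the middle dimension:
print(infer_middle_dim(71680, 20, 128))  # 28

# ...but num_tokens == 0 reproduces the error message from the log:
try:
    infer_middle_dim(71680, 0, 128)
except RuntimeError as e:
    print(e)  # shape '[0, -1, 128]' is invalid for input of size 71680
```

This is consistent with the comment above that the crash happened on a text-only request: the multimodal path apparently left the rotary-embedding forward with zero tokens for the query tensor.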