[Feature]: LoRA support for qwen2-vl Models #11255

Open · 1 task done
xlg-go opened this issue Dec 17, 2024 · 12 comments

Comments

@xlg-go

xlg-go commented Dec 17, 2024

🚀 The feature, motivation and pitch

I fine-tuned a Qwen2-VL-7B model using LLaMA-Factory, deployed it with AsyncLLMEngine, and loaded the LoRA adapter using lora_request. However, the inference results are significantly worse than with the merged model.

[image attachment]
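For reference, this is roughly what per-request LoRA loading looks like in vLLM — a minimal offline sketch (the deployment described above uses AsyncLLMEngine, the adapter path and prompt are placeholders, and the multimodal image input is omitted for brevity):

```python
# Minimal sketch of attaching a LoRA adapter per request in vLLM.
# The issue uses AsyncLLMEngine, but the LoRA plumbing is the same;
# the adapter path and prompt are placeholders, image input omitted.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="Qwen/Qwen2-VL-7B-Instruct",
    enable_lora=True,      # allow adapters to be attached at request time
    max_lora_rank=32,      # must be >= the adapter's r (32 in this thread)
)

lora = LoRARequest("ocr_lora", 1, "/path/to/lora_adapter")
outputs = llm.generate(
    ["Recognize the characters in the image."],
    SamplingParams(temperature=0, max_tokens=256),
    lora_request=lora,
)
print(outputs[0].outputs[0].text)
```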

It would be great if we could have support for LoRA on multimodal models, as our team wants to use multiple LoRAs and merging the LoRA adapters into the original model weights is not feasible for us. We are short on time for this project, and as far as I can tell no other framework supports LoRA in this way. We also need Outlines for structured generation, so vLLM (being the most user-friendly, stable, and mature framework) is our best bet right now. Can we get a timeline for when this will be supported? Also, are there any workarounds possible until this feature is officially supported?

Thank you for your work on this adaptation.

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@jeejeelee
Collaborator

vLLM supports LoRA for multimodal models, but it only supports adding the LoRA adapter to the language backbone. The quickest approach would be to add LoRA only to the language backbone and retrain.
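For anyone finding this later, "LoRA only on the language backbone" in PEFT terms looks roughly like the sketch below; the regex-style target_modules mirrors the adapter_config.json shared further down in this thread.

```python
# Sketch of a PEFT LoRA config that keeps the adapter off the vision tower:
# the (?!.*visual) negative lookahead skips every module whose name contains
# "visual", so only language-backbone projections receive LoRA weights.
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.15,
    task_type="CAUSAL_LM",
    target_modules=r"^(?!.*visual).*(?:o_proj|up_proj|v_proj|down_proj|k_proj|q_proj|gate_proj).*",
)
```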

@xlg-go
Author

xlg-go commented Dec 17, 2024

> vLLM supports LoRA for multimodal models, but it only supports adding the LoRA adapter to the language backbone. The quickest approach would be to add LoRA only to the language backbone and retrain.

The inference results with the LoRA adapter are poor. Is it because the LoRA wasn't added to the vision backbone?

@jeejeelee
Collaborator

Yep, could you share your LoRA configuration? I want to double-check it.

@xlg-go
Author

xlg-go commented Dec 17, 2024

> Yep, could you share your LoRA configuration? I want to double-check it.

Do you mean adapter_config.json?
Engine args: max_lora_rank=32, enable_lora=True

```json
{
  "alpha_pattern": {},
  "auto_mapping": null,
  "base_model_name_or_path": "./ms_cache/hub/Qwen/Qwen2-VL-7B-Instruct",
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layer_replication": null,
  "layers_pattern": null,
  "layers_to_transform": null,
  "loftq_config": {},
  "lora_alpha": 32,
  "lora_dropout": 0.15,
  "megatron_config": null,
  "megatron_core": "megatron.core",
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 32,
  "rank_pattern": {},
  "revision": null,
  "target_modules": "^(?!.*visual).*(?:o_proj|up_proj|v_proj|down_proj|k_proj|q_proj|gate_proj).*",
  "task_type": "CAUSAL_LM",
  "use_dora": false,
  "use_rslora": false
}
```
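The negative lookahead in that target_modules pattern is what keeps the adapter off the vision tower. A quick way to see which modules it selects (the module names below are illustrative Qwen2-VL paths, not read from the model):

```python
# Check which module names the adapter's target_modules regex selects.
# PEFT matches a string pattern against the full module name, so the
# (?!.*visual) lookahead excludes the vision tower entirely.
import re

pattern = r"^(?!.*visual).*(?:o_proj|up_proj|v_proj|down_proj|k_proj|q_proj|gate_proj).*"

examples = [
    "model.layers.0.self_attn.q_proj",  # language backbone
    "model.layers.0.mlp.down_proj",     # language backbone
    "visual.blocks.0.attn.qkv",         # vision tower
    "visual.merger.mlp.0",              # vision tower
]

for name in examples:
    hit = re.fullmatch(pattern, name) is not None
    print(f"{name:35s} -> {'LoRA applied' if hit else 'skipped'}")
```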

@jeejeelee
Collaborator

It looks like your LoRA was indeed only added to the LLM. Is there a big difference in your results?

@xlg-go
Author

xlg-go commented Dec 17, 2024

> It looks like your LoRA was indeed only added to the LLM. Is there a big difference in your results?

Yep, the inference results differ significantly! I want to use Qwen2-VL to perform OCR on images and recognize the characters.

So I guess it's because the LoRA wasn't added to the vision backbone. I might not need the language backbone that much; the vision backbone is the key for my case.

@jeejeelee
Collaborator

So your LoRA was actually added to the visual backbone, and when using vLLM for inference you found that the results differ significantly from the merged model?

@xlg-go
Author

xlg-go commented Dec 17, 2024

> So your LoRA was actually added to the visual backbone, and when using vLLM for inference you found that the results differ significantly from the merged model?

Yes!
Wait a moment!
Doesn't vLLM currently only support adding LoRA to the language backbone?

@xlg-go
Author

xlg-go commented Dec 18, 2024

@jeejeelee hi~ Do you have any thoughts on adapting the vision backbone as well?

@jeejeelee
Collaborator

> Doesn't vLLM currently only support adding LoRA to the language backbone?

Yes, currently vLLM only supports adding LoRA to the language backbone.
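One workaround until vision-backbone LoRA is supported: if an adapter does touch the vision tower, merge it into the base weights offline and serve the merged checkpoint as a plain model. A rough sketch with placeholder paths, assuming a PEFT-format adapter:

```python
# Workaround sketch: merge the LoRA adapter (including any vision-tower weights)
# into the base model offline, then serve the merged checkpoint with vLLM
# without enable_lora. Paths are placeholders.
import torch
from transformers import Qwen2VLForConditionalGeneration
from peft import PeftModel

base = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, "/path/to/lora_adapter")
merged = model.merge_and_unload()          # folds the LoRA deltas into the base weights
merged.save_pretrained("/path/to/merged_model")
# Also save/copy the tokenizer and processor files next to the merged weights.
```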

@jeejeelee
Collaborator

> @jeejeelee hi~ Do you have any thoughts on adapting the vision backbone as well?

No, we haven't. We did some LoRA experiments before and found that for VL models, adapting the vision backbone didn't show significant benefits, though this might be due to our limited experiments.

@xlg-go
Author

xlg-go commented Dec 18, 2024

I understand. Thanks for sharing your findings and the insights from your LoRA experiments. While it's disappointing that vision-backbone adaptation didn't yield significant benefits in your tests, it's valuable data nonetheless. Perhaps with further research and more extensive experiments, different approaches might prove fruitful in the future.
