vllm-project / vllm Public

Pull requests: vllm-project/vllm

422 Open 4,865 Closed

Pull requests list

[Frontend] improve hermes_tool_parser.py (labels: frontend)

#11444 opened Dec 24, 2024 by paulcx

[Misc] Move weights mapper

#11443 opened Dec 24, 2024 by jeejeelee • Draft

[Model] Support for fairseq2 Llama

#11442 opened Dec 24, 2024 by MartinGleize • Draft

[Model][LoRA]LoRA support added for MolmoForCausalLM

#11439 opened Dec 23, 2024 by ayylemao

[Misc]Suppress irrelevant exception stack trace information when CUDA… (labels: frontend, ready)

#11438 opened Dec 23, 2024 by shiquan1988

[Bugfix] Fix issues in CPU build Dockerfile. Fixes #9182 (labels: ci/build, ready)

#11435 opened Dec 23, 2024 by terrytangyuan

[Bugfix][Hardware][CPU] Fix CPU input_positions creation for text-only inputs with mrope (labels: ready)

#11434 opened Dec 23, 2024 by Isotr0py

fix: add missing bos_token to example templates

#11432 opened Dec 23, 2024 by toslunar

[Bugfix] Fix Qwen2-VL LoRA weight loading

#11430 opened Dec 23, 2024 by jeejeelee

Add TTFT to offline_inference_with_prefix.py

#11428 opened Dec 23, 2024 by xu-song

[WIP][VLM] Implement merged multimodal processor for Mllama

#11427 opened Dec 23, 2024 by Isotr0py • Draft

Zhn/fish e2e merge (labels: ci/build)

#11426 opened Dec 23, 2024 by niuzheng168 • Draft

Bump helm/kind-action from 1.10.0 to 1.11.0 (labels: ci/build, dependencies, github_actions)

#11424 opened Dec 23, 2024 by dependabot bot

[V1] add error handling

#11420 opened Dec 22, 2024 by Ajay-Satish-01

[WIP][Doc]Add documentation for using EAGLE in vLLM (labels: documentation)

#11417 opened Dec 22, 2024 by sroy745 • Draft

[Model][BugFix] Mamba/Jamba exceed mamba cache slots

#11414 opened Dec 22, 2024 by mzusman

[V1] Support Pixtral-HF on V1

#11409 opened Dec 22, 2024 by ywang96 • Draft

[V1] Optimize block table transfer from CPU to GPU (labels: ci/build)

#11401 opened Dec 22, 2024 by WoosukKwon • Draft

[VLM] Support caching in merged multi-modal processor (labels: documentation)

#11396 opened Dec 21, 2024 by DarkLight1337

[Bugfix] Fix available_kv_cache_memory

#11395 opened Dec 21, 2024 by wubai

[V1] Use FlashInfer Sampling Kernel for Top-P & Top-K Sampling

#11394 opened Dec 21, 2024 by WoosukKwon • Draft

Update Dockerfile.rocm (labels: ci/build)

#11387 opened Dec 20, 2024 by yx-lamini

[Core] Support global prefix caching

#11385 opened Dec 20, 2024 by lyppg

[Misc] Adding API Key to the benchmark

#11384 opened Dec 20, 2024 by bjb19

[Bugfix] Use .clone() for sampling params and deepcopy XGrammarLogitsProcessor

#11380 opened Dec 20, 2024 by tjohnson31415 • Draft
