[VLM] Support caching in merged multi-modal processor #11396
Open
DarkLight1337 wants to merge 75 commits into vllm-project:main from DarkLight1337:mm-fields
+1,273
−295
faa9b84
Refactor multi-modal processor to support caching
DarkLight1337 9711a15
Clean up
DarkLight1337 29e3fcd
Fix cached result being mutated
DarkLight1337 ab64e85
Rename
DarkLight1337 81215a2
Fix docs
DarkLight1337 cf52b3b
Fix a typo
DarkLight1337 a4a8eb9
Fix unhandled sampling rate in initialization
DarkLight1337 c48f7c5
format
DarkLight1337 b84ff42
Change the delimiter
DarkLight1337 c3f1bde
Fix extra dimension
DarkLight1337 32e5197
Update
DarkLight1337 7264d4e
Use the inner processor to enable fine-grained caching
DarkLight1337 02ea829
Make the cache optional
DarkLight1337 b981a9d
Fix invalid kwargs being passed to tokenizer
DarkLight1337 5dde7d0
Fix Phi3V prompt replacement
DarkLight1337 7339ab8
Refine
DarkLight1337 509411d
Enable fine-grained caching for audio models
DarkLight1337 c0454f5
Add fallback
DarkLight1337 d50ef03
Fix typo
DarkLight1337 81f7d61
Fix video processor for Qwen2-VL
DarkLight1337 13eede3
Merge branch 'main' into mm-processor-cache
DarkLight1337 affbc5c
Fix a bunch of type errors
DarkLight1337 b4ddfb1
Fix qwen2-vl
DarkLight1337 4b3db32
Fix
DarkLight1337 dafbc7f
Simplify Pixtral-HF
DarkLight1337 38aaff8
Cleanup
DarkLight1337 5fcb5d6
Fix Pixtral-HF
DarkLight1337 f86e148
Enable caching outside the processing loop
DarkLight1337 337f0d2
Make debugging easier
DarkLight1337 c01d38a
Update
DarkLight1337 84f02fb
Fix ultravox
DarkLight1337 9f417c2
Revert some unnecessary changes
DarkLight1337 00b765b
Merge branch 'main' into mm-fields
DarkLight1337 2ed431e
Add test and fix some issues
DarkLight1337 baaf551
Update
DarkLight1337 f5dbcb8
Fix
DarkLight1337 afd3f4f
Rework
DarkLight1337 6172450
Rename the test
DarkLight1337 416943d
Update count
DarkLight1337 86f2786
Rename
DarkLight1337 f5b6214
Some fixes
DarkLight1337 8a68e87
Cleanup
DarkLight1337 ab7e84b
Skip unspecified fields
DarkLight1337 9f2cdaa
Fix equality checking
DarkLight1337 d11e833
Consolidate common code
DarkLight1337 5fee280
Improve error message
DarkLight1337 6182fd6
Cleanup
DarkLight1337 e1214cf
Fix Pixtral-HF
DarkLight1337 c717bce
Fix missing mm_count key
DarkLight1337 023890e
Fix qwen2-vl
DarkLight1337 b5e5b8a
Fix Qwen2-VL
DarkLight1337 cf24a1b
Fix Qwen2-VL and Qwen2-Audio
DarkLight1337 73271e9
Debug Phi3V
DarkLight1337 e30deec
Consolidate common code
DarkLight1337 ea6f8b5
Try to fix Phi3V and Ultravox
DarkLight1337 10ae755
Remove benchmark
DarkLight1337 85c5e2c
Fix token mismatch in Phi3V and Ultravox
DarkLight1337 4873ff8
Update max image tokens
DarkLight1337 4dbb5a3
Strictly check the number of placeholder tokens
DarkLight1337 6dbae81
Fix doc failure
DarkLight1337 fb51c9b
Test and fix Mantis processor
DarkLight1337 91cbd63
Fix embedding inputs
DarkLight1337 6bee6ba
Update entrypoints tests
DarkLight1337 cfa2ce8
Merge branch 'main' into mm-fields
DarkLight1337 fa54292
Clean up
DarkLight1337 cbf79be
Avoid extra placeholder in phi3v
DarkLight1337 9cd38b1
Fix OOM
DarkLight1337 14dcdd5
Fix mantis processor
DarkLight1337 b8bd2d4
Merge branch 'main' into mm-fields
DarkLight1337 5045d93
Remove redundant code
DarkLight1337 4cac998
Still need Mantis repo for testing
DarkLight1337 e8afd10
Merge branch 'main' into mm-fields
DarkLight1337 93bba0a
Fix incorrect max image tokens (Updated in #11258)
DarkLight1337 ea9f888
Also cache by model ID
DarkLight1337 58747f6
Format
DarkLight1337
We now use HF's `LlavaProcessor` plus our own prompt replacements to replicate the logic of `MLlavaProcessor`, so users no longer have to install the Mantis package from its GitHub repository.
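For readers unfamiliar with the term, here is a minimal sketch of what a "prompt replacement" step could look like: each image placeholder token in the tokenized prompt is expanded into as many copies as the image has feature positions. This is an illustration only, not the code in this PR; `expand_placeholders`, the token ID `32000`, and the feature sizes are hypothetical names and values chosen for the example.

```python
from typing import List


def expand_placeholders(
    token_ids: List[int],
    placeholder_id: int,
    feature_sizes: List[int],
) -> List[int]:
    """Replace the i-th placeholder token with feature_sizes[i] copies of it.

    Hypothetical helper for illustration; it is not part of vLLM or HF.
    """
    expanded: List[int] = []
    sizes = iter(feature_sizes)
    for tok in token_ids:
        if tok == placeholder_id:
            try:
                n = next(sizes)
            except StopIteration:
                raise ValueError("more placeholders than multi-modal items") from None
            expanded.extend([placeholder_id] * n)
        else:
            expanded.append(tok)
    # Strict check: every multi-modal item must have matched a placeholder.
    if next(sizes, None) is not None:
        raise ValueError("more multi-modal items than placeholders")
    return expanded


# One image that occupies 3 feature positions (assumed value).
prompt = [1, 32000, 5, 6]
print(expand_placeholders(prompt, placeholder_id=32000, feature_sizes=[3]))
# [1, 32000, 32000, 32000, 5, 6]
```

The strict count check mirrors the intent of the "Strictly check the number of placeholder tokens" commit, though how the PR actually enforces it may differ.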