Commit
[Bugfix] Fix race condition that leads to tokens being returned in the wrong order
During startup of the API server, the setup function is called multiple times (every 5s). So the longer the startup time (generally the case for larger models), the more consumers are contending for the output. This can lead to a race condition in which the answer tokens are returned in the wrong order.

Introduced in: vllm-project#9973
References: vllm-project#10376 vllm-project#10589 vllm-project#10782

Signed-off-by: Jannis Schönleber <[email protected]>
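The failure mode described above can be illustrated with a small, self-contained asyncio sketch. This is not the code touched by this commit: `OutputHandler`, `setup`, `_consume`, and `run` are hypothetical names, the sleep stands in for uneven per-consumer processing time, and the guard shown is only one possible mitigation, not necessarily the approach taken here. It shows that when each repeated `setup` call starts another consumer on the same output queue, tokens can be delivered out of order, while a single consumer preserves order.

```python
import asyncio


class OutputHandler:
    """Hypothetical stand-in for a component that drains an output queue."""

    def __init__(self, guard: bool) -> None:
        self.guard = guard
        self.queue: asyncio.Queue = asyncio.Queue()
        self.received: list[int] = []
        self.tasks: list[asyncio.Task] = []

    def setup(self) -> None:
        # With the guard, repeated calls are no-ops, so only one consumer exists.
        if self.guard and self.tasks:
            return
        consumer_id = len(self.tasks)
        self.tasks.append(asyncio.create_task(self._consume(consumer_id)))

    async def _consume(self, consumer_id: int) -> None:
        while True:
            token = await self.queue.get()
            # Simulated uneven processing time: consumer 0 is slower than the rest.
            await asyncio.sleep(0.05 if consumer_id == 0 else 0)
            self.received.append(token)
            self.queue.task_done()


async def run(guard: bool, setup_calls: int = 3, n_tokens: int = 6) -> list[int]:
    handler = OutputHandler(guard)
    for _ in range(setup_calls):  # mimics setup being re-invoked during startup
        handler.setup()
    for token in range(n_tokens):
        handler.queue.put_nowait(token)
    await handler.queue.join()  # wait until every token has been processed
    for task in handler.tasks:
        task.cancel()
    await asyncio.gather(*handler.tasks, return_exceptions=True)
    return handler.received


unguarded = asyncio.run(run(guard=False))  # several consumers contend
guarded = asyncio.run(run(guard=True))     # exactly one consumer
print("unguarded:", unguarded)
print("guarded:  ", guarded)
```

In the unguarded run all tokens still arrive, but their order depends on which consumer finishes first; in the guarded run the single consumer appends them in queue order.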