[Bug]: Streaming w/ tool choice auto often truncates the final delta in the streamed arguments #10781
Closed
Labels
bug
Your current environment
The output of `python collect_env.py`
Model Input Dumps
No response
🐛 Describe the bug
The current streaming implementation when using "auto" tool choice has multiple issues. I have validated that these issues exist with both the Hermes and Mistral tool parsers and have prepared a PR that I'll be submitting shortly to fix these issues.
1. A delta that doesn't include the already-constructed delta is created, and the original delta is never submitted.
   For example, if the arguments are
   {"arguments": "{\"prompt\":\"Wicked Movie 2024\"}"}
   then, depending on how the model returns tokens, "2024" or even "vie 2024" may be dropped.
2. The Hermes parser has a similar issue: it returns no delta at all once it detects the tool end token, even though at that point it may still hold part of a delta that it has not yet sent.
3. The Mistral parser may match the wrong part of the initial token if the argument name is short enough to fit in a single delta. The result is a broken JSON object that is missing the name of the first argument and duplicates other parts of the string.
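To make the first two failure modes concrete, here is a minimal sketch of a streaming loop that loses the tail of the arguments. It is not vLLM's parser code: `stream_arguments`, the `(text, is_end)` chunk shape, and the `flush_on_end` switch are all hypothetical names chosen for illustration. The only assumption is that the last piece of argument text can arrive in the same step in which the tool end token is detected.

```python
def stream_arguments(chunks, flush_on_end):
    """Yield argument deltas for one tool call.

    `chunks` is a list of (text, is_end) pairs, where `is_end` marks the
    step in which the tool end token is detected. Hypothetical shape, not
    vLLM's real interface.
    """
    full = ""   # argument text produced by the model so far
    sent = 0    # number of characters already emitted to the client
    for text, is_end in chunks:
        full += text
        if is_end and not flush_on_end:
            # Buggy behaviour: stop as soon as the end token is seen,
            # silently dropping whatever is still unsent in full[sent:].
            return
        delta = full[sent:]
        if delta:
            sent = len(full)
            yield delta
        if is_end:
            return


# The final text arrives together with the end-of-tool-call signal.
chunks = [('{"prompt":"Wicked Mo', False), ('vie 2024"}', True)]

buggy = "".join(stream_arguments(chunks, flush_on_end=False))
fixed = "".join(stream_arguments(chunks, flush_on_end=True))
print(buggy)  # {"prompt":"Wicked Mo
print(fixed)  # {"prompt":"Wicked Movie 2024"}
```

In this sketch the fix is simply to emit the remaining unsent text before returning; the actual fix in the parsers may be structured differently.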
All of these depend to some extent on how the deltas come back from the model. Except in case 3, the returned JSON object looks valid, but the final argument's value may be truncated.
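One way to spot the truncation on the client side is to reassemble the streamed argument fragments and try to parse them. The sketch below assumes chunks in the OpenAI-compatible streaming shape (`choices[0].delta.tool_calls[i].function.arguments`), represented here as plain dicts. The helper names, the tool name `search`, and the sample stream are made up for illustration. Note that this check only catches cuts that leave invalid JSON; comparing against the non-streaming response for the same request is a stronger test.

```python
import json


def collect_tool_arguments(chunks):
    """Concatenate streamed `function.arguments` fragments per tool-call index."""
    parts = {}
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            for call in delta.get("tool_calls") or []:
                fragment = (call.get("function") or {}).get("arguments") or ""
                parts[call["index"]] = parts.get(call["index"], "") + fragment
    return parts


def is_complete_json(text):
    """Return True if `text` parses as JSON, False if it is cut off or malformed."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Hypothetical stream in which the final delta ("vie 2024\"}") never arrived.
truncated_stream = [
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"name": "search", "arguments": ""}}]}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '{"prompt":"Wicked Mo'}}]}}]},
]

arguments = collect_tool_arguments(truncated_stream)
for index, text in arguments.items():
    status = "ok" if is_complete_json(text) else "truncated or malformed"
    print(index, status, text)
```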
Before submitting a new issue...