[Bug] Streaming output error of tool calling has still not been resolved. #10589
I got the same problem: ERROR 11-23 01:34:01 hermes_tool_parser.py:338] Error trying to handle streaming tool call. Besides, when I print the tool_calls function.arguments in stream mode like this:

for chunk in create:
    try:
        print(chunk.choices[0].delta.tool_calls[0].function.arguments, end="")
    except Exception:
        pass

the output looks like: None{"args":"entity": ""\u5e7f\u4e1c\u641c\u4e00\u641c\u79d1\u6280\u6709\u9650\u516c\u53f8, "func": ""get_company_funding{"args": {"entity": "\u5e7f\u4e1c\u641c\u4e00\u641c\u79d1\u6280\u6709\u9650\u516c\u53f8"}, "func": "get_company_funding"}

This is very unfriendly to parse as JSON.
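One way to make those fragments easier to handle, independent of the parser bug, is to accumulate the per-index argument deltas and parse the JSON only once the stream has finished. A minimal sketch, assuming the streaming chunk objects of the OpenAI Python client pointed at the vLLM server (the helper name is not from this thread):

import json

def collect_tool_calls(stream):
    # `stream` is the iterator returned by
    # client.chat.completions.create(..., stream=True).
    calls = {}  # tool-call index -> {"name": str, "arguments": str}
    for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        for tc in delta.tool_calls or []:
            entry = calls.setdefault(tc.index, {"name": "", "arguments": ""})
            if tc.function and tc.function.name:
                entry["name"] = tc.function.name
            if tc.function and tc.function.arguments:
                entry["arguments"] += tc.function.arguments
    # Parse once at the end; this will still fail if the server emits
    # malformed fragments, which is exactly the symptom reported above.
    return [
        {"name": c["name"], "arguments": json.loads(c["arguments"])}
        for c in calls.values()
    ]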
cc @K-Mistele
Supplementary note: before using the new parser, the LLM displayed the tool calling arguments in the content of the response instead of in tool_calls. When I modify the code using the above method, the LLM directly returns the error. Hope this problem can be solved as soon as possible, thank you!
During the startup of the api server, the setup function is called multiple times (every 5s). So the longer the startup time (generally for larger models), the more consumers are contending for the output. This can then lead to a race condition where the order of the answer tokens is wrong. Introduced here: vllm-project#9973 References: vllm-project#10376 vllm-project#10589 vllm-project#10782 Signed-off-by: Jannis Schönleber <[email protected]>
@cedonley @joennlae I have read [Bugfix] Multiple fixes to tool streaming when using auto tool choice and [Bugfix] fix race condition that leads to wrong order of token returned, and installed the latest version of vllm using the instructions from here: …

The version is …, but the vllm server still has the same error: …

Did I do something wrong? Or is this bug not resolved yet?
The PR is not yet merged, but based on the error in the logs, I believe it may be resolved once the PR is merged, as the error raised is one of several that I resolved in my commits.
Thanks for your reply, I will wait for the PR to be merged!
This output doesn't look like the model is trying to generate a valid tool call; the structure is off. Possibly a chat template issue, or precision loss with the AWQ quant? Can you share your chat template and the tools you're passing to the model? It's hard to debug without these.
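For readers following along, the tools being asked about here would normally be an OpenAI-style function schema. A hypothetical reconstruction for the get_company_funding function that appears in the garbled output above; the reporter's actual schema was not shared and may differ:

# Hypothetical reconstruction of the tool definition; the real schema
# and parameter names used by the reporter may differ.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_company_funding",
            "description": "Query funding records for a company.",
            "parameters": {
                "type": "object",
                "properties": {
                    "entity": {
                        "type": "string",
                        "description": "Full legal name of the company.",
                    }
                },
                "required": ["entity"],
            },
        },
    }
]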
Hi @K-Mistele, they are not printing the full tool call, only the arguments. The issues (outlined in my PR mentioned earlier) are: …
Looking forward to this being merged; I'm also blocked by this. Thanks @cedonley!
FYI, it is likely that the version of vLLM you pulled from the nightly build did not contain these fixes, as both of the linked PRs are still open and have not been merged into main.
I think it makes the most sense to either test with those branches, or wait for them to be merged and then test with the nightly build, before trying to debug other issues here, since otherwise we could either be (a) debugging an issue that has already been solved, or (b) trying to debug a compound issue, which would be quite difficult.
@cedonley @joennlae I used the branch you provided to build vllm, but still got the same error. Error messages: …

In addition, vllm may still send back a response after an error occurs. I used the same code to retrieve results in both versions.

Retrieval results in …:

Retrieval results in the … branch:

Hope you can solve this problem, thank you!
Here is another related error. I originally only added the tool message. In order to let the LLM know the history of tool calls, I added an assistant message containing tool_calls:

[
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "幫我把音檔轉成逐字稿。檔名: http://example.com/audio.mp3"
},
{
"role": "assistant",
"tool_calls": [
{
"id": "chatcmpl-tool-cb9685ce207a4bcfa461eafea3d6e801",
"function": {
"arguments": "{\"AudioPath\": \"http://example.com/audio.mp3\", \"language\": \"zh\", \"function_one_name_called\": true}",
"name": "voice_to_text"
},
"type": "function"
}
]
},
{
"role": "tool",
"tool_call_id": "5097b499-6da1-4343-ae3c-4134b660e065",
"name": "voice_to_text",
"content": "Testing"
}
]

It works when …

Version: …
I've written a short script that tries to replicate your tool call. Note that you need to add a final "assistant" response to your example messages that shows the assistant's typical reply to the tool information. I start my server with the following arguments: …

The script makes a stream=False call followed by a separate stream=True call for the same request, and prints both outputs: …

As you can see above, I'm not seeing any issues when using my PR. If you don't get this output from the script, run the script after starting vLLM with debug logs enabled and provide the relevant debug lines.
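A rough sketch of what such a replication script could look like, assuming an OpenAI-compatible vLLM server on localhost:8000; the model name, the voice_to_text schema, and the final assistant/user turns are placeholders inferred from the messages above, not the actual script from this thread:

import json
from openai import OpenAI

# All of the following values are illustrative assumptions.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
MODEL = "Qwen/Qwen2.5-72B-Instruct-AWQ"  # placeholder model name

# Hypothetical reconstruction of the voice_to_text tool schema.
tools = [{
    "type": "function",
    "function": {
        "name": "voice_to_text",
        "description": "Transcribe an audio file into text.",
        "parameters": {
            "type": "object",
            "properties": {
                "AudioPath": {"type": "string"},
                "language": {"type": "string"},
            },
            "required": ["AudioPath"],
        },
    },
}]

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Please transcribe this audio file: http://example.com/audio.mp3"},
    {
        "role": "assistant",
        "tool_calls": [{
            "id": "call_0",
            "type": "function",
            "function": {
                "name": "voice_to_text",
                "arguments": json.dumps({"AudioPath": "http://example.com/audio.mp3", "language": "zh"}),
            },
        }],
    },
    {"role": "tool", "tool_call_id": "call_0", "name": "voice_to_text", "content": "Testing"},
    # The final assistant reply to the tool result, as recommended above.
    {"role": "assistant", "content": "The transcript is: Testing"},
    # A follow-up user turn so the model produces a fresh (tool-calling) response.
    {"role": "user", "content": "Now transcribe http://example.com/audio2.mp3 as well."},
]

# Non-streaming request first.
resp = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
print("stream=False:", resp.choices[0].message)

# Then the identical request with streaming enabled.
stream = client.chat.completions.create(model=MODEL, messages=messages, tools=tools, stream=True)
print("stream=True:", end=" ")
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="")
    for tc in delta.tool_calls or []:
        if tc.function and tc.function.arguments:
            print(tc.function.arguments, end="")
print()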
I used the hermes_tool_parser.py as tool-parser-plugin and registered the parser as hermes_patched, but still have the same problem. Already referred to #9874, #10395 and #10398.

Here is how I start the vllm service with the latest package: …

I also tried using the Docker images v0.6.3.post1, v0.6.4 and v0.6.4.post1.

Originally posted by @Sala8888 in #10398 (comment)