
[Bug]: Function calling with stream vs without stream, arguments=None when stream option is enabled #9693

Closed
1 task done
ankush13r opened this issue Oct 25, 2024 · 11 comments · Fixed by #10398
Labels
bug Something isn't working


@ankush13r

ankush13r commented Oct 25, 2024

Your current environment

Docker image: vllm/vllm-openai:v0.6.3

Parameters:
--enable-auto-tool-choice --tool-call-parser hermes

Model Input Dumps

No response

🐛 Describe the bug

I'm using vLLM in a Docker container as a REST API, specifically the /v1/chat/completions endpoint with the OpenAI client.

When I run chat completions without streaming, it returns tool_calls with the tool name and its arguments as expected. However, when I enable the streaming option, it only returns the tool name, with arguments set to None. I'm not sure why this is happening.

I've tried searching for related issues but haven't found anything helpful.
I've also tried stream_options={"include_usage": True}, and it gives the same output.

The model generates this output:

<tool_call>
{"arguments": {"n1": 2, "n2": 2}, "name": "sum"}
</tool_call>
chat_completion = client.chat.completions.create(
    model="tgi",
    messages=messages,
    stream=True,
    max_tokens=2000,
    temperature=0.3,
    tools=tools,
    tool_choice="auto",
)
chunks = []
for chunk in chat_completion:
    chunks.append(chunk)
    if chunk.choices[0].delta.tool_calls:
        print(chunk.choices[0].delta.tool_calls[0])
    else:
        print(chunk.choices[0].delta)


chat_completion = client.chat.completions.create(
    model="tgi",
    messages=messages,
    stream=False,
    max_tokens=2000,
    temperature=0.3,
    tools=tools,
    tool_choice="auto",
)
print(chat_completion.choices[0].message.tool_calls[0])

Output:

  • With stream:
ChoiceDelta(content='', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDeltaToolCall(index=0, id='chatcmpl-tool-ac7886c6cea04451b439d4e24b21ab7a', function=ChoiceDeltaToolCallFunction(arguments=None, name='sum'), type='function')
ChoiceDelta(content='', function_call=None, refusal=None, role=None, tool_calls=None)
  • Without stream:
ChatCompletionMessageToolCall(id='chatcmpl-tool-736bd066f6744f9985817df30c73aad3', function=Function(arguments='{"n1": 2, "n2": 2}', name='sum'), type='function')

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@ankush13r added the bug (Something isn't working) label on Oct 25, 2024
@DarkLight1337
Member

@K-Mistele can you take a look at this?

@ankush13r
Author

I’ve been debugging the issue on my own and think I've identified the solution. After testing the API, I noticed that it currently generates tool_calls where the function name and arguments are in separate yield statements, which is causing issues. Here’s an example of the current output:

Current Output:

[Choice(delta=ChoiceDelta(content='', function_call=None, refusal=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)]
[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments=None, name='sum'), type=None)]), finish_reason=None, index=0, logprobs=None)]
[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"n1": 2, "n2": 2}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)]
[Choice(delta=ChoiceDelta(content='', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason='tool_calls', index=0, logprobs=None, stop_reason=None)]

In this example, the function name is yielded separately from its arguments. However, for functionality like chatbot integration and API calls—where multiple frameworks expect the tool_call to be complete in a single field—it would be more efficient if both the name and arguments were generated in the same yield statement.

Expected Behavior: The API should generate tool_calls with the function name and arguments combined, so the function can be utilized directly without additional processing. Here’s an example of the ideal output:

[Choice(delta=ChoiceDelta(content='', function_call=None, refusal=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)]
[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"n1": 2, "n2": 2}', name='sum'), type=None)]), finish_reason=None, index=0, logprobs=None)]
[Choice(delta=ChoiceDelta(content='', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason='tool_calls', index=0, logprobs=None, stop_reason=None)]

@K-Mistele
Contributor

Hi @ankush13r! You are correct that the function name and function arguments are handled in separate yield statements. vLLM's OpenAI-compatible tool calling implementation follows OpenAI's standard for tool streaming, which works as follows.

Here's an example request you can make with Postman or something similar to illustrate what the streamed server-sent events will look like according to OpenAI's standard:

{
  "model": "gpt-4o",
   "messages": [
    {
      "role": "user",
      "content": "Can you tell me the weather in dallas in fahrenheit?"
    }

  ],
  "stream": true,
  "temperature": 0.7,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string",
              "description": "The city to find the weather for, e.g. 'San Francisco'"
            },
            "state": {
              "type": "string",
              "description": "the two-letter abbreviation for the state that the city is in, e.g. 'CA' which would mean 'California'"
            },
            "unit": {
              "type": "string",
              "description": "The unit to fetch the temperature in",
              "enum": [
                "celsius",
                "fahrenheit"
              ]
            }
          }
        }
      }
    }
  ]
}

Here is what this request generates from OpenAI using streaming:

Long list of Server-sent events from OpenAI
data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{"role":"assistant","content":null,"tool_calls":[{"index":0,"id":"call_671mMmDFiC5r38Myya1UQub8","type":"function","function":{"name":"get_current_weather","arguments":""}}],"refusal":null},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\""}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"city"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"\":\""}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"d"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"allas"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"\",\""}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"unit"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"\":\""}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"fahren"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"heit"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"\"}"}}]},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AMH4lG0yiD21BtZzSfsQtoMo7zDRi","object":"chat.completion.chunk","created":1729872219,"model":"gpt-4o-2024-08-06","system_fingerprint":"fp_90354628f2","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"tool_calls"}]}

data: [DONE]

There are a couple of important things to observe here:

  • The first SSE event specifies the message role (assistant), sets up the tool call array, and includes the name of the called tool.
  • Subsequent SSE events include an arguments field for the function that's being called. Each of these is a diff of the arguments. To construct the entire arguments string, you concatenate these diffs and then try to parse the complete string as JSON, handling validation and errors (since the result is not guaranteed to be valid JSON in either OpenAI's API or vLLM's); a sketch of this follows below.
    • NOTE that argument diffs in vLLM may be empty strings in some cases; however, this should not break processing, since empty strings do not affect concatenation.

This is the OpenAI standard for server-sent events for tool streaming, and it is the standard that vLLM follows. A function's name is always streamed before any argument deltas arrive, and argument deltas will never be streamed in the same event as the function's name. Multiple argument deltas will be received and must be concatenated; the entire arguments string should never arrive all at once.
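
For illustration, here is a minimal client-side sketch of that accumulation, assuming the OpenAI Python client pointed at a vLLM server, a single tool call, and messages/tools equivalent to the request body above (the base_url and model name are placeholders):

import json

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

stream = client.chat.completions.create(
    model="my-served-model",   # placeholder served model name
    messages=messages,         # same messages/tools as the request body above
    tools=tools,
    tool_choice="auto",
    stream=True,
)

tool_name = None
arguments_buf = ""  # concatenation target for the argument diffs

for chunk in stream:
    delta = chunk.choices[0].delta
    if not delta.tool_calls:
        continue
    call = delta.tool_calls[0]
    if call.function.name:        # the first tool-call event carries the name
        tool_name = call.function.name
    if call.function.arguments:   # later events carry argument diffs (possibly empty strings)
        arguments_buf += call.function.arguments

# only once the stream ends is the buffer (hopefully) a complete JSON object
try:
    arguments = json.loads(arguments_buf)
except json.JSONDecodeError:
    arguments = None  # not guaranteed to be valid JSON; handle as needed

print(tool_name, arguments)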

When you're receiving deltas from vLLM, are these (below) the only deltas that you are receiving before the stream ends, or are you receiving additional deltas with arguments diffs like shown above?

ChoiceDelta(content='', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDeltaToolCall(index=0, id='chatcmpl-tool-ac7886c6cea04451b439d4e24b21ab7a', function=ChoiceDeltaToolCallFunction(arguments=None, name='sum'), type='function')
ChoiceDelta(content='', function_call=None, refusal=None, role=None, tool_calls=None)

If these are the only deltas you receive, that probably indicates a bug, since you should receive argument deltas as well. If you do receive additional deltas, you just need to handle concatenating and parsing them as described above and in the docs example linked below.

Can you please share your entire vLLM start command and the entire request and all received deltas so that I can help you debug it?

You should be able to see an example of how this works, including delta processing for arguments, in this example from the vLLM docs.

I actually created this demo with hermes, so it should work for testing your purposes.

@ankush13r
Author

ankush13r commented Oct 26, 2024

Now I see that the arguments are being yielded separately. However, I found a bug in the Hermes parser during debugging, which causes it to return a response without arguments. Below is an example of the output received:

[Choice(delta=ChoiceDelta(content='', function_call=None, refusal=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)]
[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id='chatcmpl-tool-eb20e37d0a2b449694953e3647e13603', function=ChoiceDeltaToolCallFunction(arguments=None, name='sum'), type='function')]), finish_reason=None, index=0, logprobs=None)]
[Choice(delta=ChoiceDelta(content='', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason='tool_calls', index=0, logprobs=None, stop_reason=None)]
[]

Debug Findings:
After investigating, I found that the parser raises a ValueError ("substring not found") and an AttributeError ("'NoneType' object has no attribute 'get'") in the hermes_tool_parser.py file. Specifically, the function extract_tool_calls_streaming attempts to locate delta_text within cur_arguments_json, but fails if that substring isn't found. Here's the relevant debugging output:

ERROR 10-26 16:44:41 hermes_tool_parser.py:337] Error trying to handle streaming tool call: 'NoneType' object has no attribute 'get'
INFO 10-26 16:44:41 metrics.py:345] Avg prompt throughput: 34.1 tokens/s, Avg generation throughput: 0.2 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%.
ERROR 10-26 16:44:41 hermes_tool_parser.py:337] Error trying to handle streaming tool call: substring not found
ERROR 10-26 16:44:41 hermes_tool_parser.py:337] Error trying to handle streaming tool call: cannot access local variable 'tool_call_portion' where it is not associated with a value

Proposed Solution:

The fix that mitigates this bug is to add a check that delta_text exists within cur_arguments_json before finding its index, and to check that current_tool_call is not None. Here's the current and modified code:
Current:

function_name: Union[str, None] = current_tool_call.get("name")

cur_arguments = current_tool_call.get("arguments")

# get the location where previous args differ from current
args_delta_start_loc = cur_arguments_json.index(delta_text) \
                       + len(delta_text)

arguments_delta = cur_arguments_json[:args_delta_start_loc]

https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py#L227C51-L227C72
https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py#L265
https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py#L291

Updated Code:

# guard against current_tool_call being None
function_name: Union[str, None] = current_tool_call.get("name") if current_tool_call else None

cur_arguments = current_tool_call.get("arguments") if current_tool_call else None


# only look up the index when delta_text actually appears in cur_arguments_json
args_delta_start_loc = None
if delta_text in cur_arguments_json:
    args_delta_start_loc = cur_arguments_json.index(delta_text) \
                           + len(delta_text)

arguments_delta = cur_arguments_json[:args_delta_start_loc]

This fixes both bugs. However, it still produces responses with arguments='', name=None, as shown below, although I now get the correct arguments.

[Choice(delta=ChoiceDelta(content='', function_call=None, refusal=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)]
[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id='chatcmpl-tool-a35d521fb400478aa8d20371876adf0f', function=ChoiceDeltaToolCallFunction(arguments=None, name='sum'), type='function')]), finish_reason=None, index=0, logprobs=None)]
[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"n1": 2, "n2": 2}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)]
[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)]
[Choice(delta=ChoiceDelta(content='', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason='tool_calls', index=0, logprobs=None, stop_reason=None)]

To prevent empty responses, the solution is to check that the arguments diff is not an empty string before yielding, by adding an if condition (if diff:, since diff can be an empty string '').
https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py#L193
Here’s the adjusted code:

if diff:
    diff = json.dumps(diff).replace(
        self.streamed_args_for_tool[self.current_tool_id], "")

    if diff:  # diff can be an empty string ''
        logger.debug(
            "Finishing tool and found diff that had not "
            "been streamed yet: %s", diff)
        self.streamed_args_for_tool[self.current_tool_id] \
            += diff
        return DeltaMessage(tool_calls=[
            DeltaToolCall(index=self.current_tool_id,
                          function=DeltaFunctionCall(
                              arguments=diff).model_dump(
                                  exclude_none=True))
        ])
    else:
        return None

Let me know if you think this should fix the bug or if the issue lies with the model's response generation. I'm open to collaborating to resolve the bug and can make a pull request.

@K-Mistele
Contributor

Can you please share the request you're using (messages, tools, vLLM config) so that I can try to reproduce the issue? It's not impossible that there's a bug in the Hermes tool parser, but it has been used and tested pretty robustly, so I'm curious what's different about this case, and I'd like to be able to step through the streaming parsing.

@ankush13r
Author

ankush13r commented Oct 27, 2024

I'm sending you the configuration here. The model I'm using is our own, and we can't publish it yet since it's still in testing. However, I tried to reproduce the bug with a Hermes model: NousResearch/Hermes-2-Pro-Llama-3-8B.

vLLM config (I'm running with Singularity, but I believe Docker or running directly would have the same effect):

singularity run --nv  vllm-openai-latest.sif \
  --model NousResearch/Hermes-2-Pro-Llama-3-8B \
  --served-model-name Hermes-2-Pro-Llama-3-8B vllm \
  --host 0.0.0.0 --port 8080 --tensor-parallel-size 4 \
  --enable-auto-tool-choice --tool-call-parser hermes

Python OpenAI client:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1/",
    api_key="HF_TOKEN"
)

chat_completion = client.chat.completions.create(
    model="vllm",
    messages=[{
        "role": "user",
        "content": """Give me sum of 2 + 2. \n Tools.
Ignore the structure given you before for tool_calls and use the following:
For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
<tool_call>
{"arguments": <args-dict>, "name": <function-name>}
</tool_call>""",
    }],
    stream=True,
    tool_choice="auto",
    tools=[{
        "type": "function",
        "function": {
            "name": "sum",
            "description": "Given two number return the sum of it.",
            "parameters": {
                "properties": {"n1": {"type": "integer"}, "n2": {"type": "integer"}},
                "required": ["n1", "n2"],
                "type": "object",
            },
        },
    }],
)

chunks = []
for chunk in chat_completion:
    print(chunk.choices)

The error occurs when the text generated by the model follows this format, where the arguments appear first and the name comes at the end:

<tool_call>
{"arguments": <args-dict>, "name": <function-name>}
</tool_call>

Thanks

@K-Mistele
Contributor

Okay, I can see a couple of places where there could be a problem.

  1. If you're not using a Hermes/Qwen model, even if your model adheres to that tool call format, the tool parser may not work. This is because in the Hermes & Qwen models, <tool_call> and </tool_call> are actually tokens in the model's vocab, and the tool parser uses those tokens rather than their detokenized form to search for them (a quick way to check your own model's vocab is sketched after this list).
  2. If the order of name and arguments is backwards, the Hermes tool parsing is not guaranteed to work. This implementation of the parser assumes that name will be generated before arguments, due to the requirement to stream the function name before the arguments. This was not the original order that Hermes used, but after several conversations with Nous Research, they agreed to reverse the order of these fields to better support streaming, and because we felt it would result in better-quality argument generation due to the model's autoregressive properties. They updated the chat template (which you can find in the tokenizer_config.json here) to instruct the model to use the following pydantic JSON schema:
{"properties": {"name": {"title": "Name", "type": "string"}, "arguments": {"title": "Arguments", "type": "object"}}}

which is

{"name": <function-name>, "arguments": <args-dict>}

You can see that Qwen adopted this format as well in their tokenizer_config.json here
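
Coming back to point 1 for a moment, here is a quick way to check whether your model treats these tags as dedicated tokens. This is just a sketch assuming a Hugging Face tokenizer; the model name used is simply the one from this thread:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Hermes-2-Pro-Llama-3-8B")

# a dedicated special token stays in one piece; otherwise it is split into sub-tokens
print(tokenizer.tokenize("<tool_call>"))
# likewise, a dedicated token maps to a single real id (unk/None if it isn't in the vocab)
print(tokenizer.convert_tokens_to_ids("<tool_call>"))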

Therefore, name before arguments is the officially supported order for this tool parser.
I recommend switching the order in your chat template (or using one of the Hermes templates provided in the examples folder of this repository) to make your model's generation compatible, as we found that the model was very willing to take instructions about the order of the arguments even if that is not what it had seen during training.
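
For example, with the sum tool from this issue, a compatible generation would put the name first:

<tool_call>
{"name": "sum", "arguments": {"n1": 2, "n2": 2}}
</tool_call>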

Alternatively, if you positively need the arguments before the name, you should be able to copy the Hermes tool parser from vLLM's source, alter it for your needs, and load it at runtime without having to open a PR and wait for it to be merged (see writing a tool parser plugin in the docs).
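
As a very rough sketch of that plugin approach (treat the module path, decorator, and CLI flags here as illustrative assumptions; the tool parser plugin page in the docs has the authoritative interface):

# my_hermes_tool_parser.py -- start from a copy of vLLM's hermes_tool_parser.py
from vllm.entrypoints.openai.tool_parsers.abstract_tool_parser import (
    ToolParser, ToolParserManager)

# register the modified parser under its own name so it can be selected
# with --tool-call-parser my_hermes
@ToolParserManager.register_module(["my_hermes"])
class MyHermesToolParser(ToolParser):
    # ... the copied extract_tool_calls / extract_tool_calls_streaming logic,
    # adjusted to tolerate arguments-before-name ...
    pass

The server would then be started with something like --tool-parser-plugin /path/to/my_hermes_tool_parser.py --tool-call-parser my_hermes.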

Hopefully that helps!

@ankush13r
Author

Thank you;
I really appreciate it! I'll adjust the chat_template to include the name before the arguments, so we won’t need to create a new tool parser for now.

@ankush13r
Author

Hi @K-Mistele,

I updated my model's chat_template, and that resolved the previous error related to arguments. However, it didn't solve the 'NoneType' error:

ERROR 10-31 10:34:22 hermes_tool_parser.py:337] Error trying to handle streaming tool call: 'NoneType' object has no attribute 'get'

The error occurs in the following lines, where it attempts to call the get method on a None object:
https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py#L227
https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py#L265

A solution would be to check if current_tool_call is None before attempting to access its get method.
This error is also mentioned in issue #9874.

@K-Mistele
Contributor

Working on this now. I'm actually not sure what the cause of this issue is: not only does it not occur frequently for me, I can't reproduce it at all. I can understand logically how it might happen based on the source references you provided, but I've never seen it happen before. I wonder if this has to do with a tool call being generated slightly differently in some circumstances (e.g. extra whitespace where none was expected), resulting in this edge case being tripped.

If you could share your configuration, I'm happy to try to reproduce the issue on my end.

Otherwise, I think the best path forward will be for me to open a PR with the patched tool parser; then, to assess whether it fixes your issue, you can load the patched parser as a plugin at runtime (see tool parser plugins in the docs) instead of using the Hermes parser in the current version.

@K-Mistele
Contributor

Please check #9908 :)
