Fix JSON parsing from model #85

vutrung96 · 2024-11-13T17:00:33Z

Since we are not using strict mode, OpenAI can return invalid output: either a string that is not JSON at all, or valid JSON that does not conform to the JSON schema. In either case, we catch the failure and skip the response.
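A minimal sketch of the skip-on-failure behavior described above, not the code in this PR. The `Recipe` dataclass and `try_parse_response` are hypothetical names, and the dataclass is only a standard-library stand-in for a pydantic model so the example runs without extra dependencies.

```python
import json
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger(__name__)


@dataclass
class Recipe:
    # Hypothetical response schema, standing in for a pydantic model.
    title: str
    minutes: int

    def __post_init__(self) -> None:
        # Minimal type check; a real pydantic model validates far more.
        if not isinstance(self.title, str) or not isinstance(self.minutes, int):
            raise TypeError("field has the wrong type")


def try_parse_response(raw: str) -> Optional[Recipe]:
    """Return a Recipe, or None if the response should be skipped."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        logger.warning("Skipping response: model output is not valid JSON")
        return None
    try:
        return Recipe(**data)
    except TypeError:
        # Raised for missing or unexpected keys, non-object payloads, or wrong types.
        logger.warning("Skipping response: valid JSON that does not match the schema")
        return None
```

Returning `None` keeps the two failure modes out of the caller's control flow, while the log messages still say which one occurred.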

RyanMarten · 2024-11-13T17:12:58Z

https://rwilinski.ai/posts/benchmarking-llms-for-structured-json-generation/
Overall, strict-mode requests are slower, by about 50% (3s - 5s).

Cold start problem:

As Ted Sanders mentioned in this HN comment, using strict mode bears a significant cold start penalty which goes away in the subsequent runs.

The first request with each JSON schema will be slow, as we need to preprocess the JSON schema into a context-free grammar. If you don’t want that latency hit (e.g., you’re prototyping, or have a use case that uses variable one-off schemas), then you might prefer “strict”: false

How much slower is it? Here are my results:

| Model | Schema | avgFirstRequestTime | avgSecondRequestTime | coldStartPenalty |
| --- | --- | --- | --- | --- |
| gpt-4o-2024-08-06 | Wide JSON Schema | 20234.0549 | 5927.3556 | 241.37% |
| gpt-4o-mini | Wide JSON Schema | 21801.5501 | 5800.8192 | 275.84% |
| gpt-4o-2024-08-06 | Complex JSON Schema | 24089.9075 | 7100.4283 | 239.27% |
| gpt-4o-mini | Complex JSON Schema | 26665.4039 | 10270.7880 | 159.62% |
| gpt-4o-2024-08-06 | Super Complex JSON Schema | 60481.4465 | 11698.9430 | 416.98% |
| gpt-4o-mini | Super Complex JSON Schema | 66011.3763 | 13994.1616 | 371.71% |
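The coldStartPenalty column above appears to be the relative slowdown of the first request compared with the second. The sketch below reproduces it from the two timing columns; the function name is hypothetical and the formula is inferred from the reported numbers, not taken from the benchmark's code.

```python
def cold_start_penalty(first: float, second: float) -> float:
    """Relative slowdown of the first request vs. the second, in percent."""
    return (first - second) / second * 100.0


# gpt-4o-2024-08-06, Wide JSON Schema row from the table above
penalty = cold_start_penalty(20234.0549, 5927.3556)
print(f"{penalty:.2f}%")
```

The same formula matches the other rows checked, for example 26665.4039 and 10270.7880 give about 159.62%.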

For simple to moderately complex schemas, it is reasonable to go non-strict, based on the success rates for the complex schema (below).

| Method | Avg Time (ms) | Time Diff (ms) | Success Rate | Cost | Cost Diff |
| --- | --- | --- | --- | --- | --- |
| gpt-4o-2024-08-06-non-strict-tool | 4079.0854 | 0 | 100.0000% | 0.1680 | +0.1534 |
| gpt-4o-mini-non-strict-json | 5847.6183 | +1768.5329 | 100.0000% | 0.0175 | +0.0029 |
| gpt-4o-2024-08-06-strict-json | 5866.2200 | +1787.1346 | 100.0000% | 0.1528 | +0.1382 |
| gpt-4o-2024-08-06-non-strict-json | 6314.3933 | +2235.3079 | 100.0000% | 0.3026 | +0.2880 |
| gpt-4o-mini-strict-json | 7858.5114 | +3779.4260 | 100.0000% | 0.0146 | 0 |
| gpt-4o-mini-non-strict-tool | N/A | N/A | 0.0000% | N/A | N/A |
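The Time Diff and Cost Diff columns above look like differences from the best (lowest) value in each column. A small sketch under that assumption, with a hypothetical helper name, reproduces two of the reported time differences:

```python
from typing import Dict


def diffs_from_best(values: Dict[str, float]) -> Dict[str, float]:
    """Difference between each entry and the smallest entry."""
    best = min(values.values())
    return {name: value - best for name, value in values.items()}


# Average times (ms) for three of the methods in the table above
avg_time_ms = {
    "gpt-4o-2024-08-06-non-strict-tool": 4079.0854,
    "gpt-4o-mini-non-strict-json": 5847.6183,
    "gpt-4o-mini-strict-json": 7858.5114,
}
time_diffs = diffs_from_best(avg_time_ms)
```

Applied to the Cost column, the same helper would use gpt-4o-mini-strict-json (0.0146) as the baseline.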

We should expose a strict flag.
For now, default to non-strict so we can get the current generation jobs across the line. So far, all of our own usage has involved simple JSON structures.
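One way such a flag could be threaded through, sketched under assumptions: `build_response_format` is a hypothetical helper, not part of this PR, and it only builds the `response_format` payload for OpenAI structured outputs without making any API call.

```python
from typing import Any, Dict


def build_response_format(
    name: str, schema: Dict[str, Any], strict: bool = False
) -> Dict[str, Any]:
    """Build a json_schema response_format payload; non-strict by default."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "schema": schema,
            # strict=True trades a first-request latency hit for schema-conforming output.
            "strict": strict,
        },
    }


# Hypothetical schema for illustration only
schema = {"type": "object", "properties": {"title": {"type": "string"}}}
default_format = build_response_format("recipe", schema)
strict_format = build_response_format("recipe", schema, strict=True)
```

Defaulting to `strict=False` matches the proposal above; callers with complex schemas could opt in per request.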

In line with the author's suggestion:

Based on my findings, I recommend the following approaches:

For Simple JSON Structures:

- Prefer non-strict modes, especially tool-based methods for speed and cost-effectiveness
- Go with smaller mini model if you can (but don’t forget about potential failures, wrap in try/catch accordingly)

RyanMarten

Small change to the error message.

Just say the model successfully responded with a string that is JSON but doesn't match the schema

src/bespokelabs/curator/request_processor/base_request_processor.py

RyanMarten

LGTM!

RyanMarten and others added 2 commits November 13, 2024 08:50

fix parsing when json is invalid

783f874

use strict: True for structured output

ab4b9c8

vutrung96 changed the base branch from main to dev November 13, 2024 17:00

vutrung96 added 3 commits November 13, 2024 17:14

remove strict and fix a type error

e58a435

fix type error

904f520

move parse_response_message outside of the class

8246be1

RyanMarten self-requested a review November 13, 2024 17:22

vutrung96 added 3 commits November 13, 2024 17:26

fix typing

5c47dae

black

c97f3c7

also catch invalid pydantic schema

fce9b2d

RyanMarten mentioned this pull request Nov 13, 2024

Retry when structured output fails #86

Open

RyanMarten requested changes Nov 13, 2024


src/bespokelabs/curator/request_processor/base_request_processor.py (outdated, resolved)

add warning message to error for pydantic

c7a50f3

RyanMarten approved these changes Nov 13, 2024


RyanMarten merged commit e7bb89e into dev Nov 13, 2024

RyanMarten deleted the ryanm/invalid-json-from-model branch November 13, 2024 17:48
