
Correct and clean up JSON handling #4655

Merged

Conversation

erik-megarad
Contributor

@erik-megarad erik-megarad commented Jun 11, 2023

Background

We had a bunch of hacks to get around JSON errors. It turns out there were two issues:

1. The JSON coming from OpenAI is actually not JSON. It's a Python dict that has been stringified with str(dict).

2. We weren't correctly feeding the AI the JSON schema. The model now receives the same schema that we use to validate the response, which works much better.

This may also fix the "no command given" errors but I will need to test more.
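A minimal sketch of the second fix, embedding the validation schema verbatim in the prompt so the model targets exactly the format the validator checks. The schema and prompt text here are illustrative; the real schema (llm_response_format) lives in the repo and is more detailed.

```python
import json

# Illustrative schema; AutoGPT's real llm_response_format has more fields.
RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {
        "thoughts": {"type": "object"},
        "command": {"type": "object"},
    },
    "required": ["thoughts", "command"],
}

# The same schema that validates the response is rendered into the prompt,
# so the model and the validator agree on one format.
prompt = (
    "Respond strictly with JSON conforming to this schema:\n"
    + json.dumps(RESPONSE_SCHEMA, indent=2)
)
assert '"required"' in prompt
```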

Changes

  • Use ast.literal_eval to reverse the str() process that was done by OpenAI
  • Update the prompt to include the exact JSON schema
  • Remove a ton of now-useless code
  • Catch and log errors better
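The literal_eval round-trip described above can be sketched like this (the response dict is illustrative, not AutoGPT's actual response format):

```python
import ast

# The client handed back str(dict): Python repr syntax with single quotes
# and True/None, which json.loads rejects.
response = {"command": {"name": "list_files", "args": {"directory": "."}}}
raw = str(response)  # "{'command': {'name': 'list_files', ...}}", not valid JSON

# ast.literal_eval safely evaluates literal-only Python syntax,
# reversing what str() did.
parsed = ast.literal_eval(raw)
assert parsed == response
```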

Documentation

Test Plan

  • Tested a bunch locally
  • Multiple challenge runs
  • Added json validation tests

PR Quality Checklist

  • My pull request is atomic and focuses on a single change.
  • I have thoroughly tested my changes with multiple different prompts.
  • I have considered potential risks and mitigations for my changes.
  • I have documented my changes clearly and comprehensively.
  • I have not snuck in any "extra" small tweak changes.
  • I have run the following commands against my code to ensure it passes our linters:
    black .
    isort .
    mypy
    autoflake --remove-all-unused-imports --recursive --ignore-init-module-imports --ignore-pass-after-docstring autogpt tests --in-place

@vercel

vercel bot commented Jun 11, 2023

The latest updates on your projects.

1 Ignored Deployment
docs: ⬜️ Ignored (updated Jun 11, 2023 11:59pm)

@github-actions
Contributor

This PR exceeds the recommended size of 500 lines. Please make sure you are NOT addressing multiple issues with one PR.


@Auto-GPT-Bot
Contributor

You changed AutoGPT's behaviour. The cassettes have been updated and will be merged to the submodule when this Pull Request gets merged.

@codecov

codecov bot commented Jun 11, 2023

Codecov Report

Patch coverage: 76.92%; project coverage change: +0.66% 🎉

Comparison is base (7bf39cb) 69.94% compared to head (1eb57d5) 70.61%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4655      +/-   ##
==========================================
+ Coverage   69.94%   70.61%   +0.66%     
==========================================
  Files          72       70       -2     
  Lines        3590     3437     -153     
  Branches      569      547      -22     
==========================================
- Hits         2511     2427      -84     
+ Misses        895      842      -53     
+ Partials      184      168      -16     
Impacted Files Coverage Δ
autogpt/memory/message_history.py 86.86% <25.00%> (+1.15%) ⬆️
autogpt/agent/agent.py 58.82% <62.50%> (-0.82%) ⬇️
autogpt/json_utils/utilities.py 77.77% <87.50%> (+9.20%) ⬆️
autogpt/commands/execute_code.py 73.52% <100.00%> (ø)
autogpt/prompts/generator.py 89.74% <100.00%> (-0.51%) ⬇️
autogpt/prompts/prompt.py 46.80% <100.00%> (ø)

... and 3 files with indirect coverage changes


@waynehamadi
Contributor

waynehamadi commented Jun 11, 2023

@erik-megarad ast.literal_eval doesn't work with the JSON booleans true or false. Also, why use it at all?
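For reference, the mismatch is between JSON's lowercase boolean literals and Python's capitalized ones:

```python
import ast
import json

# ast.literal_eval accepts Python literals (True/False/None)...
assert ast.literal_eval("{'ok': True}") == {"ok": True}

# ...but JSON's lowercase true/false are not Python literals, so it raises.
failed = False
try:
    ast.literal_eval('{"ok": true}')
except ValueError:
    failed = True
assert failed

# json.loads, of course, handles them fine.
assert json.loads('{"ok": true}') == {"ok": True}
```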


@erik-megarad
Contributor Author

Oh, it doesn't, does it. However, booleans aren't allowed in our validation schema, so it should never come up.

I'm using ast.literal_eval because it works with the escaped version of the JSON coming in from the OpenAI client.

@erik-megarad
Contributor Author

Actually, I think I do need to change this: args is an object which could potentially contain a boolean.


@netlify

netlify bot commented Jun 12, 2023

Deploy Preview for auto-gpt-docs canceled.

Name Link
🔨 Latest commit 1eb57d5
🔍 Latest deploy log https://app.netlify.com/sites/auto-gpt-docs/deploys/64889b2f2c79660008595a35


@erik-megarad
Contributor Author

Update from Discord: using ast.literal_eval is correct because OpenAI serializes the response content with Python's str function (it's not JSON, apparently!). ast.literal_eval then reverses that process.

waynehamadi
waynehamadi previously approved these changes Jun 12, 2023

@waynehamadi
Contributor

waynehamadi commented Jun 13, 2023

@erik-megarad OK, I reran the benchmark for JSON errors and got 0 errors out of 50 runs, so this is good.

@waynehamadi waynehamadi merged commit 07d9b58 into Significant-Gravitas:master Jun 13, 2023
@Pwuts
Member

Pwuts commented Jun 13, 2023

Nice! Nit: I see _test_json_parser.py was left behind ;)

@Androbin
Contributor

Exciting news: We can finally get rid of this JSON cleanup mess entirely!

Developers can now describe functions to gpt-4-0613 and gpt-3.5-turbo-0613, and have the model intelligently choose to output a JSON object containing arguments to call those functions. This is a new way to more reliably connect GPT's capabilities with external tools and APIs.

These models have been fine-tuned to both detect when a function needs to be called (depending on the user’s input) and to respond with JSON that adheres to the function signature. Function calling allows developers to more reliably get structured data back from the model. For example, developers can:

https://openai.com/blog/function-calling-and-other-api-updates
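A minimal sketch of the flow the blog post describes, using a hypothetical command schema (AutoGPT's real command signatures differ, and the actual API call is omitted; only the response handling is shown):

```python
import json

# Hypothetical command schema in the function-calling format.
list_files_fn = {
    "name": "list_files",
    "description": "List files in a directory",
    "parameters": {
        "type": "object",
        "properties": {"directory": {"type": "string"}},
        "required": ["directory"],
    },
}

# Passing functions=[list_files_fn] to the chat completion call, the model
# can answer with a function_call whose arguments are a proper JSON string.
# Simulated response:
function_call = {"name": "list_files", "arguments": '{"directory": "."}'}

# The arguments are plain JSON, so no literal_eval workaround is needed.
args = json.loads(function_call["arguments"])
assert args == {"directory": "."}
```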

@Pwuts Pwuts mentioned this pull request Jul 14, 2023
jordankanter pushed a commit to jordankanter/Auto-GPT that referenced this pull request Nov 12, 2023
* Correct and clean up JSON handling

* Use ast for message history too

* Lint

* Add comments explaining why we use literal_eval

* Add descriptions to llm_response_format schema

* Parse responses in code blocks

* Be more careful when parsing in code blocks

* Lint