
Correct and clean up JSON handling #4655

Merged

Conversation

erik-megarad
Contributor

@erik-megarad erik-megarad commented Jun 11, 2023

Background

We had a bunch of hacks to get around JSON errors. It turns out there were two issues:

1. The JSON coming from OpenAI is actually not JSON. It's a Python dict that has been stringified with str(dict).

2. We weren't correctly feeding the AI the JSON schema. The model now receives the same schema that we use to validate the response, which works much better.

This may also fix the "no command given" errors but I will need to test more.
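A minimal sketch of the second fix, embedding the validation schema verbatim in the prompt so the model targets exactly the format the validator checks. The schema and prompt text here are illustrative; the real schema (llm_response_format) lives in the repo and is more detailed.

```python
import json

# Illustrative schema; AutoGPT's real llm_response_format has more fields.
RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {
        "thoughts": {"type": "object"},
        "command": {"type": "object"},
    },
    "required": ["thoughts", "command"],
}

# The same schema that validates the response is rendered into the prompt,
# so the model and the validator agree on one format.
prompt = (
    "Respond strictly with JSON conforming to this schema:\n"
    + json.dumps(RESPONSE_SCHEMA, indent=2)
)
assert '"required"' in prompt
```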

Changes

  • Use ast.literal_eval to reverse the str() process that was done by OpenAI
  • Update the prompt to include the exact JSON schema
  • Remove a ton of now-useless code
  • Catch and log errors better
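The literal_eval round-trip described above can be sketched like this (the response dict is illustrative, not AutoGPT's actual response format):

```python
import ast

# The client handed back str(dict): Python repr syntax with single quotes
# and True/None, which json.loads rejects.
response = {"command": {"name": "list_files", "args": {"directory": "."}}}
raw = str(response)  # "{'command': {'name': 'list_files', ...}}", not valid JSON

# ast.literal_eval safely evaluates literal-only Python syntax,
# reversing what str() did.
parsed = ast.literal_eval(raw)
assert parsed == response
```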

Documentation

Test Plan

  • Tested a bunch locally
  • Multiple challenge runs
  • Added json validation tests

PR Quality Checklist

  • My pull request is atomic and focuses on a single change.
  • I have thoroughly tested my changes with multiple different prompts.
  • I have considered potential risks and mitigations for my changes.
  • I have documented my changes clearly and comprehensively.
  • I have not snuck in any "extra" small tweak changes.
  • I have run the following commands against my code to ensure it passes our linters:
    black .
    isort .
    mypy
    autoflake --remove-all-unused-imports --recursive --ignore-init-module-imports --ignore-pass-after-docstring autogpt tests --in-place

@vercel

vercel bot commented Jun 11, 2023

The latest updates on your projects.

1 Ignored Deployment
docs: ⬜️ Ignored (updated Jun 11, 2023 11:59pm)

@github-actions
Contributor

This PR exceeds the recommended size of 500 lines. Please make sure you are NOT addressing multiple issues with one PR.


@Auto-GPT-Bot
Contributor

You changed AutoGPT's behaviour. The cassettes have been updated and will be merged to the submodule when this Pull Request gets merged.

@codecov

codecov bot commented Jun 11, 2023

Codecov Report

Patch coverage: 76.92%; project coverage change: +0.66% 🎉

Comparison is base (7bf39cb) 69.94% compared to head (1eb57d5) 70.61%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4655      +/-   ##
==========================================
+ Coverage   69.94%   70.61%   +0.66%     
==========================================
  Files          72       70       -2     
  Lines        3590     3437     -153     
  Branches      569      547      -22     
==========================================
- Hits         2511     2427      -84     
+ Misses        895      842      -53     
+ Partials      184      168      -16     
Impacted Files Coverage Δ
autogpt/memory/message_history.py 86.86% <25.00%> (+1.15%) ⬆️
autogpt/agent/agent.py 58.82% <62.50%> (-0.82%) ⬇️
autogpt/json_utils/utilities.py 77.77% <87.50%> (+9.20%) ⬆️
autogpt/commands/execute_code.py 73.52% <100.00%> (ø)
autogpt/prompts/generator.py 89.74% <100.00%> (-0.51%) ⬇️
autogpt/prompts/prompt.py 46.80% <100.00%> (ø)

... and 3 files with indirect coverage changes


@waynehamadi
Contributor

waynehamadi commented Jun 11, 2023

@erik-megarad ast.literal_eval doesn't work with the JSON booleans true or false. Also, why use it at all?
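For reference, the mismatch is between JSON's lowercase boolean literals and Python's capitalized ones:

```python
import ast
import json

# ast.literal_eval accepts Python literals (True/False/None)...
assert ast.literal_eval("{'ok': True}") == {"ok": True}

# ...but JSON's lowercase true/false are not Python literals, so it raises.
failed = False
try:
    ast.literal_eval('{"ok": true}')
except ValueError:
    failed = True
assert failed

# json.loads, of course, handles them fine.
assert json.loads('{"ok": true}') == {"ok": True}
```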


@erik-megarad
Contributor Author

Oh, it doesn't, does it. However, booleans aren't allowed in our validation schema, so it should never come up.

I'm using ast.literal_eval because it works with the escaped version of the JSON coming in from the OpenAI client.

@erik-megarad
Contributor Author

Actually, I think I do need to change this: args is an object which could potentially contain a boolean.


@netlify

netlify bot commented Jun 12, 2023

Deploy Preview for auto-gpt-docs canceled.

Name Link
🔨 Latest commit 1eb57d5
🔍 Latest deploy log https://app.netlify.com/sites/auto-gpt-docs/deploys/64889b2f2c79660008595a35


@erik-megarad
Contributor Author

Update from Discord: using ast.literal_eval is correct because OpenAI serializes the response content with Python's str function (it's not JSON, apparently!). ast.literal_eval then reverses that process.

waynehamadi
waynehamadi previously approved these changes Jun 12, 2023

@waynehamadi
Contributor

waynehamadi commented Jun 13, 2023

@erik-megarad OK, I reran the benchmark for JSON errors and got 0 errors out of 50 runs, so this is good.

@waynehamadi waynehamadi merged commit 07d9b58 into Significant-Gravitas:master Jun 13, 2023
@Pwuts
Member

Pwuts commented Jun 13, 2023

Nice! Nit: I see _test_json_parser.py was left behind ;)

@Androbin
Contributor

Exciting news: We can finally get rid of this JSON cleanup mess entirely!

Developers can now describe functions to gpt-4-0613 and gpt-3.5-turbo-0613, and have the model intelligently choose to output a JSON object containing arguments to call those functions. This is a new way to more reliably connect GPT's capabilities with external tools and APIs.

These models have been fine-tuned to both detect when a function needs to be called (depending on the user’s input) and to respond with JSON that adheres to the function signature. Function calling allows developers to more reliably get structured data back from the model. For example, developers can:

https://openai.com/blog/function-calling-and-other-api-updates
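A minimal sketch of the flow the blog post describes, using a hypothetical command schema (AutoGPT's real command signatures differ, and the actual API call is omitted; only the response handling is shown):

```python
import json

# Hypothetical command schema in the function-calling format.
list_files_fn = {
    "name": "list_files",
    "description": "List files in a directory",
    "parameters": {
        "type": "object",
        "properties": {"directory": {"type": "string"}},
        "required": ["directory"],
    },
}

# Passing functions=[list_files_fn] to the chat completion call, the model
# can answer with a function_call whose arguments are a proper JSON string.
# Simulated response:
function_call = {"name": "list_files", "arguments": '{"directory": "."}'}

# The arguments are plain JSON, so no literal_eval workaround is needed.
args = json.loads(function_call["arguments"])
assert args == {"directory": "."}
```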

@Pwuts Pwuts mentioned this pull request Jul 14, 2023
jordankanter pushed a commit to jordankanter/Auto-GPT that referenced this pull request Nov 12, 2023
* Correct and clean up JSON handling

* Use ast for message history too

* Lint

* Add comments explaining why we use literal_eval

* Add descriptions to llm_response_format schema

* Parse responses in code blocks

* Be more careful when parsing in code blocks

* Lint