Remove or make optional the ability of Auto-GPT to write Python "helper" scripts on the fly #3428

Closed
1 task done
Kenshiro-28 opened this issue Apr 27, 2023 · 31 comments

Comments

@Kenshiro-28

Kenshiro-28 commented Apr 27, 2023

Duplicates

  • I have searched the existing issues

Summary 💡

Auto-GPT should send the text it wants to process directly to the model and not create python scripts on the fly to make the task easier.

Examples 🌈

No response

Motivation 🔦

ChatGPT can process complex tasks by itself; it doesn't need the help of Python scripts. In many cases the scripts prevent it from completing the task and cause it to get stuck in a loop. If a task is large, it can be divided into smaller tasks.

@Kenshiro-28 changed the title: Remove Auto-GPT ability to write Python scripts. → Remove Auto-GPT ability to write Python scripts on the fly (Apr 27, 2023)
@ntindle
Member

ntindle commented Apr 27, 2023

Give examples of expected and unexpected behavior. I can’t do anything with this without more context about what its goals were and why it decided to do that. There are options other than fully removing a command.

@tuwid

tuwid commented Apr 27, 2023

I'm not sure what's required here either, although I've seen Auto-GPT generate "helper" scripts that might even be dumb, like create_directory or list_files. I'd close this as inconclusive and vague.

@Emasoft

Emasoft commented Apr 28, 2023

The ability to write Python scripts is needed to debug and fix Python source files automatically, as this script does: SebRinnebach autodebug

@Kenshiro-28
Author

Kenshiro-28 commented Apr 28, 2023

Give examples of expected and unexpected behavior. I can’t do anything with this without having more context of what its goals were and why it decided to do that. There’s options other than fully removing a command

When doing trivial tasks such as writing a story with specific details about what should appear in each chapter, or even basic stock market research, it can start writing "helper" Python scripts to do the task. In many cases it is unable to get them working, since 99% of people only have access to GPT-3; other people don't have Docker installed or would prefer that Auto-GPT not run scripts created on the fly. Whatever can be done directly in the ChatGPT web chat should be done directly; it makes no sense to write helper scripts for it. The expected behavior is that all information is fed directly to the model, without writing helper scripts.

@Kenshiro-28
Author

Not sure whats required here as well although I've seen autogpt generate "helper" scripts that might even be dumb like create_directory or list_files somehow. I'd close this as inconclusive and vague

Those are good examples too. These "helper" scripts keep even trivial tasks from being completed, and they are not required. They should be disabled by default.

@Kenshiro-28
Author

Kenshiro-28 commented Apr 28, 2023

The ability to write python script is needed to debug and fix python source code files automatically. Like this script does: SebRinnebach autodebug

That's outside the scope of 99% of people: they don't have access to a GPT-4 key and don't even want to autodebug code. It should be disabled by default, and people who want to autodebug code could enable it when they need it. It could be an option in the .env file.

@Emasoft

Emasoft commented Apr 28, 2023

That's outside the scope of 99% of people, they don't have access to gpt-4 key and don't even want to autodebug code. Should be disabled by default, and people that want to autodebug code could enable it when they need it. Could be an option in .env file

This is not true. Many people will just ask Auto-GPT to write programs for them. Handy scripts can be easily created by Auto-GPT. Also, internally, Auto-GPT already has 3 commands dedicated to programming. What it lacks is a command to autodebug its own generated code. The code is rarely perfect on the first try; usually it needs to be fixed, and autodebug does just that. See this proposal: #3445

@Kenshiro-28
Author

Kenshiro-28 commented Apr 28, 2023

That's outside the scope of 99% of people, they don't have access to gpt-4 key and don't even want to autodebug code. Should be disabled by default, and people that want to autodebug code could enable it when they need it. Could be an option in .env file

This is not true. Many people will just ask Auto-GPT to write programs for them. Handy scripts can be easily created by Auto-GPT. Also, internally, Auto-GPT already has 3 commands dedicated to programming. What it lacks is a command to autodebug its own generated code. The code is rarely perfect at the first try. Usually it needs to be fixed, and autodebug does just that. See this proposal: #3445

I think it's true because 99% of people don't have GPT-4 access, so that excludes advanced programming capabilities. And moreover, most people are not software developers, or don't want to use Auto-GPT for that. As the majority of people don't need these "helper" scripts and it destroys GPT-3 functionality for even non-programming tasks, this should be disabled by default. The minority of users that need it could enable it in .env

@Emasoft

Emasoft commented Apr 28, 2023

@Kenshiro-28 I think you have a pretty idealized picture of the Auto-GPT user base. As far as I know, almost 90% of the people using it are programmers. In my experience, the first thing they think to ask Auto-GPT is to create a website or a program for them. Also, many of the complex things that non-coders are going to ask for often need a custom Python script. Think about "Get me the last financial quarters results for the major IT companies and generate a graph comparing their performances to last year..". This cannot be done without generating one script to scrape the financial websites and another to generate a graph image from the collected data. Those two Python scripts need to be created by Auto-GPT and debugged using the autodebug-code command.

@Kenshiro-28
Author

Kenshiro-28 commented Apr 28, 2023

@Emasoft I think the community is aiming for a general-purpose AI, not a tool for developers. Moreover, it should be functional with GPT-3 or a local model less capable than GPT-4. Have you tried to use it in GPT-3 only mode? It derails even on trivial tasks, and many times this is caused by these "helper" scripts. I think the default configuration should be whatever is most useful for the majority of people, and this clearly includes a functional product in GPT-3 only mode. Even if it's not able to generate charts, it can read what experts say about a particular subject, and this is enough for many tasks. The minority of users who have a GPT-4 key and are willing to pay for it could enable the "coder mode" in .env

@Emasoft

Emasoft commented Apr 28, 2023

@Emasoft I think the community is aiming for a general-purpose AI, not a tool for developers.

This doesn't make any sense. Auto-GPT should be more powerful than ChatGPT, not less. I can ask ChatGPT to write code for me, so why can't Auto-GPT do the same?
Also, GPT3.5-Turbo is perfectly capable of coding; you don't need GPT-4. But GPT4.5-Turbo is coming soon anyway.
The point is that Auto-GPT should use recursion to do everything that GPT already does, but better. Recursion, multiple sub-tasks, iterative autocorrection: those are the features that Auto-GPT offers. You should not interfere with the objectives of the user in any way. The beauty of ChatGPT is in its human-like flexibility: you can ask it anything. Forbidding users from doing certain tasks is going to alienate them and make them feel that Auto-GPT is worse than the original GPT. This is wrong on so many levels. It destroys the aura of freedom and omnipotence that GPT brings with it. Let Auto-GPT be as flexible as the original GPT; only then will it be successful. Otherwise, a lot of people are going to drop it. And those who would leave are exactly the kind of people best suited to help with the project, I might add. By adopting a restrictive policy toward Auto-GPT's target user base, you are just shooting yourself in the foot.

@Kenshiro-28
Author

@Emasoft I'm not saying that coding or any other task should be forbidden in Auto-GPT; the subject of this conversation is the generation of Python scripts on the fly, just that. Of course you can feed any type of question to the model directly, just as you do with the web chat. And yes, Auto-GPT is better because it has recursion, task division and internet browsing, but you can have all of that with hardcoded rules in Auto-GPT. Creating "on the fly" scripts makes Auto-GPT derail even on trivial tasks, at least when using GPT-3 only. If you run a few tests in GPT-3 mode you will see it.

These "on the fly" scripts should be disabled by default, and enabled only by the very small subset of users that need them. 99% of users are using GPT-3 only, and they will probably remain the majority in the future, as GPT-4 usage is more expensive.

To be more concise: make these "on the fly" scripts optional, so users can decide if they want to use them or not. As 99% of users use GPT-3, it should be disabled by default.

@jeremyjs

I faced a similar issue with it trying to download a third-party package from the internet and agree with @Kenshiro-28 that this is an issue which can be solved by a .env flag

https://gist.github.com/jeremyjs/f73b12a07583ca3e932656940f4b2b14

It tried to install a third-party python financial analysis library and in doing so got stuck debugging why it didn't work, including trying to check my internet connection speed and invoking another gpt model to ask it for advice.

Some ideas to improve:

  • Have a flag or flags for the operator (i.e. me running Auto-GPT) to disable coding, with separate control over writing its own code vs. downloading third-party code, since I may want Auto-GPT to explore non-coding approaches to the goal. Downloading and running third-party code is a different class of security risk and worth an .env flag imo. A workaround could be to try including that constraint in the original prompt.
  • Improve its understanding of its own capabilities and/or its ability to download and run third-party code. For instance, it got confused about where on the filesystem to download and run things, how to run Docker, etc.
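
The flag idea above could look roughly like this. It is a minimal sketch, assuming two hypothetical `.env` variables, `ALLOW_CODE_WRITING` and `ALLOW_THIRD_PARTY_CODE`; neither name comes from Auto-GPT, and the command names in the mapping are illustrative only.

```python
import os

# Hypothetical flag names and command names, chosen for illustration only.
FLAG_TO_COMMANDS = {
    "ALLOW_CODE_WRITING": {"execute_python_file", "improve_code", "write_tests"},
    "ALLOW_THIRD_PARTY_CODE": {"clone_repository", "pip_install"},
}


def env_flag(name, default=False, environ=None):
    """Read a boolean flag from the environment (or a dict standing in for it)."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def blocked_commands(environ=None):
    """Return the commands the operator has not explicitly opted into."""
    blocked = set()
    for flag, commands in FLAG_TO_COMMANDS.items():
        # Flags default to off, so coding stays disabled unless enabled.
        if not env_flag(flag, default=False, environ=environ):
            blocked |= commands
    return blocked
```

Keeping the two flags separate reflects the point made above that downloading third-party code is a different class of risk from writing code locally.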

@ColbyLeeCode

Please do not remove the ability to write python scripts... A flag makes sense though. This is literally the only thing I have been using it for, and I seriously doubt I'm in the minority here.

@Kenshiro-28
Author

Please do not remove the ability to write python scripts... A flag makes sense though. This is literally the only thing I have been using it for, and I seriously doubt I'm in the minority here.

Just in case, this is about removing Python script creation "on the fly": when you request some non-Python task and Auto-GPT starts to write Python scripts to help itself do it. Removing this feature wouldn't remove the ability to write code in Python or another language if that's the task you are requesting. I agree, a flag in .env looks best for everyone.

@jaykayenn

This 'issue' is absurd. Asking to remove a core feature, and arguing against using GPT4 when that is literally the first line of the project description. Stop spamming other people's repos. Feel free to make your own.

@ColbyLeeCode

Please do not remove the ability to write python scripts... A flag makes sense though. This is literally the only thing I have been using it for, and I seriously doubt I'm in the minority here.

Just in case, this is about removing python script creation "on the fly": when you request some non-python task and Auto-GPT starts to write python scripts to help himself to do it. Removing this feature wouldn't remove the ability to write code in Python or other language if that's the task you are requesting. I agree, a flag in .env looks the best for everyone.

I'm curious if including, "Do not create or execute any python scripts" in the prompt would solve this. For example, I'm able to get the bot to avoid trying to clone a GitHub repo (mostly) by just requesting that it does not in my prompt, or asking it to review an 'instructions' file that has some 'avoid at all costs' list.

@Kenshiro-28
Author

I'm curious if including, "Do not create or execute any python scripts" in the prompt would solve this. For example, I'm able to get the bot to avoid trying to clone a GitHub repo (mostly) by just requesting that it does not in my prompt, or asking it to review an 'instructions' file that has some 'avoid at all costs' list.

I tried "don't write python scripts" but it didn't work.

@Kenshiro-28
Author

Kenshiro-28 commented Apr 28, 2023

This 'issue' is absurd. Asking to remove a core feature, and arguing against using GPT4 when that is literally the first line of the project description. Stop spamming other people's repos. Feel free to make your own.

Just in case you haven't noticed, there is a "GPT-3.5 ONLY Mode" in the project, which doesn't work well because of this issue.

@ColbyLeeCode

I'm curious if including, "Do not create or execute any python scripts" in the prompt would solve this. For example, I'm able to get the bot to avoid trying to clone a GitHub repo (mostly) by just requesting that it does not in my prompt, or asking it to review an 'instructions' file that has some 'avoid at all costs' list.

I tried "don't write python scripts" but didn't work

Curious what your prompts are because I just tried 5 times and it didn't seem to write/execute any python.

@Kenshiro-28
Author

Kenshiro-28 commented Apr 28, 2023

I tried "don't write python scripts" but didn't work

Curious what your prompts are because I just tried 5 times and it didn't seem to write/execute any python.

In GPT-3 mode, try some trivial task that is long but should be easy, like writing a guide about a particular subject. You can use this as a template; just adapt the text to whatever you want to research.

MAGI is: an AI designed to write guides.
Goal 1: write about the history of X. Read it and improve it until it's ok. It must have at least 500 words. Save it in "intro.txt".
Goal 2: write about the connection between X and Y. Read it and improve it until it's ok. It must have at least 500 words. Save it in "chapter1.txt".
Goal 3: write about the S1 school based in L1. Read it and improve it until it's ok. It must have at least 500 words. Save it in "chapter2.txt".
Goal 4: write about the S2 school based in L2. Read it and improve it until it's ok. It must have at least 500 words. Save it in "chapter3.txt".
Goal 5: read the previous files and generate the final version. Save it in "guide.txt".

@nacho00112

This is simple: he is saying that you are trying to treat GPT-3 and GPT-4 alike. GPT-3 is not as powerful as GPT-4, so of course they cannot have the same functionality. That is why he is saying that, if you support GPT-3, there should be an option to disable "scripts on the fly", and that it should be active by default. Since it does not seem to remove any functionality or conflict with anything, this request should be accepted.

@ntindle
Member

ntindle commented Apr 29, 2023

Known issue. Proposed resolution: disabling commands selectively by the user

@Kenshiro-28 changed the title: Remove Auto-GPT ability to write Python scripts on the fly → Remove / make optional Python "helper" scripts (Apr 29, 2023)
@Kenshiro-28 changed the title: Remove / make optional Python "helper" scripts → Remove or make optional the ability of Auto-GPT to write Python "helper" scripts on the fly (Apr 29, 2023)
@Emasoft

Emasoft commented May 1, 2023

MAGI is: an AI designed to write guides. (....) Goal 4: write about the S2 school based in L2. Read it and improve it until it's ok. It must have at least 500 words. Save it in "chapter3.txt". Goal 5: read the previous files and generate the final version. Save it in "guide.txt".

Everybody is taking the wrong approach here. To solve this issue you don't need to stop Auto-GPT from creating Python scripts on the fly, or add some alienating and self-limiting custom settings.
The reason GPT-3 or 4 tries to code Python scripts is that it has no choice! We didn't provide it with the basic commands to handle documents. So it can only resort to writing the functions by itself in Python!

What we need to do is to give Auto-GPT the RIGHT COMMANDS to solve these simple but common tasks:

  • Read a Document (type: txt, rtf, doc, epub, markdown, csv, tsv, json, excel, powerpoint, pdf, html, svg, css, xml, js, py...)
  • Write a New Document (type: txt, rtf, doc, epub, markdown, csv, tsv, json, excel, powerpoint, pdf, html, svg, xml, js, py...)
  • Append Text or Data to Existing Document
  • Edit/Modify/Improve/Summarize/Refactor a Document according to given Rules
  • Join Two or More Documents in the given Order
  • Split a Document in Two or More Documents at the given lines/chars/pages
  • Search inside existing Document
  • Index a Document for Similarity Search (or Embed a vector tokenization of it)
  • Extract/Scrape Document from URL
  • Download and Edit web assets from a website
  • Upload web assets to a website
  • Render markdown to HTML or PDF file
  • Save table to excel file or csv/tsv or json/yaml
  • etc.

Currently the commands available to the AI are the following:

    command_registry = CommandRegistry()
    command_registry.import_commands("autogpt.commands.analyze_code")
    command_registry.import_commands("autogpt.commands.audio_text")
    command_registry.import_commands("autogpt.commands.execute_code")
    command_registry.import_commands("autogpt.commands.file_operations")
    command_registry.import_commands("autogpt.commands.git_operations")
    command_registry.import_commands("autogpt.commands.google_search")
    command_registry.import_commands("autogpt.commands.image_gen")
    command_registry.import_commands("autogpt.commands.improve_code")
    command_registry.import_commands("autogpt.commands.twitter")
    command_registry.import_commands("autogpt.commands.web_selenium")
    command_registry.import_commands("autogpt.commands.write_tests")
    command_registry.import_commands("autogpt.app")

Those need n-shot training to let GPT learn how to use them, but that is another issue.
The issue here is that there are no commands to do what the prompt asks of the AI. How is the AI supposed to "read file" or "create file" or "edit file", etc.?

I propose adding these additional basic I/O commands to Auto-GPT:

  • autogpt.commands.read_document
  • autogpt.commands.write_new_document
  • autogpt.commands.change_document_metadata
  • autogpt.commands.duplicate_document
  • autogpt.commands.tag_document_with_keyword
  • autogpt.commands.read_documents_by_tag
  • autogpt.commands.rename_document
  • autogpt.commands.delete_document
  • autogpt.commands.archive_document
  • autogpt.commands.append_to_document
  • autogpt.commands.prepend_to_document
  • autogpt.commands.modify_original_document
  • autogpt.commands.modify_document_copy
  • autogpt.commands.join_documents
  • autogpt.commands.split_document
  • autogpt.commands.search_inside_document
  • autogpt.commands.embed_document
  • autogpt.commands.save_document_from_url
  • autogpt.commands.save_web_page_from_url
  • autogpt.commands.read_saved_web_page
  • autogpt.commands.upload_web_page
  • autogpt.commands.download_web_assets_from_url
  • autogpt.commands.create_ssh_keypair_for_remote_host
  • autogpt.commands.change_ssh_keypair_for_remote_host
  • autogpt.commands.upload_web_assets_to_url_via_ssh
  • autogpt.commands.scrape_data_from_url
  • autogpt.commands.render_markdown_to_html
  • autogpt.commands.render_markdown_to_pdf
  • autogpt.commands.render_markdown_to_docx
  • autogpt.commands.save_table_data_to_xlsx
  • autogpt.commands.save_table_data_to_csv
  • autogpt.commands.save_table_data_to_tsv
  • autogpt.commands.save_table_data_to_json
  • autogpt.commands.save_table_data_to_yaml
  • autogpt.commands.save_conversation_to_chatml
  • etc.

All those commands require arguments of course, and the AI must be instructed via n-shot training to use them correctly.
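
As a rough illustration of what a few of the proposed commands might look like, here is a minimal sketch of `read_document`, `append_to_document`, and `join_documents` as plain functions. The signatures, the `workspace` argument, and the path check are assumptions made for this example; the proposal above only names the commands, and this is not Auto-GPT's actual command API.

```python
from pathlib import Path


def _resolve(workspace, name):
    """Resolve `name` inside `workspace`, refusing paths that escape it."""
    root = Path(workspace).resolve()
    target = (root / name).resolve()
    if root != target and root not in target.parents:
        raise ValueError(f"{name!r} is outside the workspace")
    return target


def read_document(workspace, name):
    """Return the full text of a document in the workspace."""
    return _resolve(workspace, name).read_text(encoding="utf-8")


def append_to_document(workspace, name, text):
    """Append text to a document, creating it if it does not exist."""
    with _resolve(workspace, name).open("a", encoding="utf-8") as handle:
        handle.write(text)


def join_documents(workspace, names, output_name, separator="\n\n"):
    """Concatenate documents in the given order into a new document."""
    parts = [read_document(workspace, name) for name in names]
    _resolve(workspace, output_name).write_text(
        separator.join(parts), encoding="utf-8"
    )
```

With something like this, the final goal of the guide-writing prompt quoted earlier (read the chapter files and produce "guide.txt") would map onto a single `join_documents` call rather than a generated script.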

With those basic I/O commands ready, AutoGPT is not going to need to write Python code by itself anymore. After all, Auto-GPT commands are nothing different from GPT-4 plugins. This is how you solve this problem IMHO.

@Kenshiro-28
Author

@Emasoft note that it's not failing because it's asked to write files. Moreover, it can successfully write files if you ask for that as a simple task, like "save this text in a file X" or "read the email on file X". The problem is that with any trivial task that is a little long it starts to write nonsense scripts that make it derail.

@Boostrix
Contributor

Boostrix commented May 1, 2023

At least on a Linux/Unix (Mac) system, most of the tooling needed to do the majority of this via the CLI is already in place. Keep in mind that, with this number of options, we might even need pagination for the prompt generator some day.

In general, I suppose the solution here would be to let people explicitly enable or disable individual commands (or all of them), i.e. some sort of blacklist/whitelist approach. Usage scenarios may differ hugely, and since the LLM has to be presented with every option, optionally reducing the number of options isn't a bad idea. In fact, with the sub-agent feature in place, it might even make sense conceptually to restrict sub-agents to certain sets of commands for certain workflows/tasks.

Thus, I suppose that coming up with a mechanism to explicitly enable/disable certain commands (or plugins) would not be a bad idea, assuming that some agents may need to run in a more constrained environment than others (think sandbox).
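
A blacklist/whitelist mechanism like the one described above could be sketched as a filter over the command modules quoted earlier in this thread. The function name `select_command_modules` and its `allow`/`deny` parameters are hypothetical, not an existing Auto-GPT setting.

```python
# Module paths copied from the registry snippet quoted earlier in the thread.
ALL_COMMAND_MODULES = [
    "autogpt.commands.analyze_code",
    "autogpt.commands.audio_text",
    "autogpt.commands.execute_code",
    "autogpt.commands.file_operations",
    "autogpt.commands.git_operations",
    "autogpt.commands.google_search",
    "autogpt.commands.image_gen",
    "autogpt.commands.improve_code",
    "autogpt.commands.twitter",
    "autogpt.commands.web_selenium",
    "autogpt.commands.write_tests",
    "autogpt.app",
]


def select_command_modules(available, allow=None, deny=()):
    """Keep modules that pass the allowlist (if given) and are not denied.

    Modules are matched by the last component of their dotted path.
    """
    denied = set(deny)
    allowed = None if allow is None else set(allow)
    selected = []
    for module in available:
        name = module.rsplit(".", 1)[-1]
        if name in denied:
            continue
        if allowed is not None and name not in allowed:
            continue
        selected.append(module)
    return selected
```

A constrained sub-agent could then be built with, say, `allow={"file_operations"}`, and each selected module passed to `command_registry.import_commands` as in the snippet quoted earlier.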

The problem is that with any trivial task that is a little long it starts to write non-sense scripts that make it derail.

Derailing happens because of the enormous number of possibilities and options. If a sub-agent approach is used, where a task is handed down to a bunch of heavily constrained agents, the task is likely to become much more feasible (think genetic algorithms/programming).

Imagine the task being to generate a PDF file:

  • lower level: generate a file with a PDF header
      • next level: generate a binary file
          • next level: generate a file

all of a sudden, the whole task is much more constrained/feasible - because each agent could be trying to do "its thing" (including required research), and it could be observed/monitored by its parent agent (to be restarted as needed), but also do necessary research/experiments within its own workspace - only yielding control once it succeeds (at which point it stops existing in this scenario)
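
The nesting above can be sketched as a small task tree in which each level carries its own restricted command set, children run before their parent, and a failed task is retried a bounded number of times. `Task`, `run`, and the `worker` callback are hypothetical names for this illustration, and the command name is a placeholder; no sub-agent API from Auto-GPT is used.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    description: str
    allowed_commands: frozenset  # commands this sub-agent may use
    subtasks: list = field(default_factory=list)


def run(task, worker, max_attempts=3):
    """Run subtasks first, then the task itself, retrying on failure.

    `worker(task)` stands in for a constrained sub-agent and returns
    True on success. Returns False as soon as any level gives up.
    """
    for sub in task.subtasks:
        if not run(sub, worker, max_attempts):
            return False
    for _ in range(max_attempts):
        if worker(task):
            return True
    return False


# The PDF example from above, most constrained task innermost.
FILE_ONLY = frozenset({"write_to_file"})  # placeholder command name
pdf_task = Task("generate a PDF file", FILE_ONLY, [
    Task("generate a file with a PDF header", FILE_ONLY, [
        Task("generate a binary file", FILE_ONLY, [
            Task("generate a file", FILE_ONLY),
        ]),
    ]),
])
```

The retry loop plays the role of the parent restarting a child as needed; a real implementation would presumably also pass results back up the tree.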

PS: If/when Auto-GPT supports disabling options/commands, it could add a final "last resort" command, "need_help", to communicate that it's missing commands/options. At that point the human could be involved once again (#2460), or the parent agent could be informed accordingly via a message or exception.
Besides, at least from a security standpoint it would make perfect sense to only allow the execution of commands inside a chroot environment (at least when not using docker/virtualization) - if not even executing all of Auto-GPT using chroot
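
The "need_help" idea could be sketched as a dispatcher that raises a dedicated exception when the model asks for a command that is not enabled, so that a human or a parent agent can step in. `NeedHelp` and `dispatch` are hypothetical names for this illustration.

```python
class NeedHelp(Exception):
    """Raised when the agent requests a command that is not enabled."""


def dispatch(command, args, enabled):
    """Call an enabled command by name, or escalate via NeedHelp.

    `enabled` maps command names to callables; anything missing from
    it is treated as disabled.
    """
    handler = enabled.get(command)
    if handler is None:
        raise NeedHelp(f"command {command!r} is not available")
    return handler(**args)
```

A caller would catch `NeedHelp` at the point where it can either prompt the user or notify the parent agent.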

@ntindle
Member

ntindle commented May 2, 2023

@Emasoft see #3031

@samj

samj commented May 8, 2023

Expected behaviour: Ask it to do a simple task that does not require coding (e.g. generate content on cloud computing), and have it complete the task itself using pre-existing functionality.

Observed behaviour: AutoGPT starts trying to check out and analyse code, create repos for said content, etc.

Caveat: Scripting for its own internal purposes may be acceptable, but any "strait-jacket mode" (potentially the default) should be able to do everything it needs to do with included capabilities or plugins.

@samj

samj commented May 8, 2023

Note that I also saw it trying and failing to use git and/or write outside the workspace (despite having been provided GitHub keys), and it set about trying to create new SSH keys etc., which (a) should not have been necessary and (b) should have been pre-configured and possibly tested at the start of the run.

Suggestion: Workspaces could be GitHub repos by default, with subsequent runs building on existing content, with new content being checked in on the fly.

@Boostrix
Contributor

Boostrix commented May 8, 2023

Workspaces could be GitHub repos by default, with subsequent runs building on existing content, with new content being checked in on the fly.

I considered tinkering with that idea too over the last couple of days - it would make persistence much easier TBH.
For the time being there seems to be VERY little planning going on when tasks are tackled

@ntindle
Member

ntindle commented May 8, 2023

Fixed in #3667

@ntindle ntindle closed this as completed May 8, 2023
10 participants