Risk-avoiding continuous mode #789
Comments
It would obviously be necessary to test the responses' accuracy when evaluating risk and to tweak the prompt into something less improvised. Nevertheless, I believe this would be a good enough safeguard until a more complex system can be implemented.
It would also potentially double the cost of running AutoGPT.
Yes, it would increase costs, but it would not double them, since embeddings are not used here. Still, the idea is to have this as a third, optional mode.
I am almost done implementing this. GPT-4 does a very good job analyzing commands; GPT-3.5, however, is lacking at best. Will post feedback in the pull request.
An alternative might be to allow the user to whitelist certain commands like
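For illustration, a minimal sketch of what such a whitelist check could look like (the command names and helper shown here are hypothetical examples, not existing AutoGPT config):

```python
# Hypothetical: commands the user trusts to run without the extra moderation call.
COMMAND_WHITELIST = {"read_file", "list_files", "google_search"}


def needs_moderation(command_name: str) -> bool:
    """Skip the risk-evaluation call for explicitly whitelisted commands."""
    return command_name not in COMMAND_WHITELIST
```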
Related: #2701 (safeguards), but also #2987 (comment) (preparing shell commands prior to executing them by adding relevant context such as 1) availability of tools, 2) location/path, 3) version number).
There are also quite a few other discussions playing with the idea of self-moderation. Some of the ideas include using a separate agent to observe other agents and assess whether they violated some sort of compliance policy.
This issue was closed automatically because it has been stale for 10 days with no activity.
Duplicates
Summary 💡
When in risk-avoiding mode, one more GPT call should be made before running each command, asking it to moderate the would-be next command. Ideally, the call would return a value in a specific range that we can compare with a user-defined risk threshold. If the risk exceeds the threshold, pause execution and await human feedback.
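A rough sketch of how this could slot into the command loop, assuming a user-configurable threshold and the pre-1.0 `openai` Python client; `ask_user_for_feedback` and `execute_command` are placeholders standing in for AutoGPT's existing feedback and execution paths, not real APIs:

```python
# Hypothetical sketch of the risk-avoiding loop described above.
# RISK_THRESHOLD, ask_user_for_feedback and execute_command are placeholders,
# not existing AutoGPT names.
import openai

RISK_THRESHOLD = 0.5  # user-defined, e.g. via a CLI flag or .env setting


def evaluate_risk(command_name: str, arguments: dict) -> float:
    """One extra chat-completion call that rates the risk of the next command.

    Returns a score in [0.0, 1.0], where 1.0 means destructive or irreversible.
    """
    prompt = (
        "Rate the risk of executing the following command on a scale from 0 "
        "(harmless) to 1 (destructive or irreversible). Respond with the "
        f"number only.\nCommand: {command_name}\nArguments: {arguments}"
    )
    response = openai.ChatCompletion.create(  # pre-1.0 openai-python API
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    reply = response["choices"][0]["message"]["content"].strip()
    try:
        return min(max(float(reply), 0.0), 1.0)
    except ValueError:
        # If the score cannot be parsed, err on the side of caution.
        return 1.0


def run_next_command(command_name: str, arguments: dict):
    risk = evaluate_risk(command_name, arguments)
    if risk > RISK_THRESHOLD:
        # Pause and hand control back to the human, as in the existing manual mode.
        return ask_user_for_feedback(command_name, arguments, risk)
    return execute_command(command_name, arguments)
```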
Examples 🌈
The intermediate call could be prompted with something along the lines of the following.
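An illustrative draft (not the exact wording from the original issue; `{command_name}`, `{arguments}` and `{agent_reasoning}` are hypothetical template placeholders):

```python
# Illustrative moderation prompt (draft wording only).
MODERATION_PROMPT = """\
You are a safety moderator for an autonomous agent. The agent wants to run:

Command: {command_name}
Arguments: {arguments}
Stated reasoning: {agent_reasoning}

On a scale from 0.00 (completely safe and reversible) to 1.00 (destructive,
irreversible, or affecting systems outside the workspace), how risky is
executing this command? Respond with the number only.
"""
```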
Motivation 🔦
There is currently no way to properly use AutoGPT without either babysitting it or taking the risk of giving it free, unmonitored agency.
This feature would allow users to put more trust in AutoGPT and "let it loose" while relying on its own self-moderation capabilities.