reward system for autogpt #2446

Closed
1 task done
horazius opened this issue Apr 18, 2023 · 5 comments
Labels
potential plugin (This may fit better into our plugin system.), Stale

Comments

@horazius

Duplicates

  • I have searched the existing issues

Summary 💡

Introduce a reward system so that AutoGPT can learn from feedback. All actions should be visible in a history, possibly together with their parameters, so that the user can rate each one as positive ("p") or negative ("n"). The resulting evaluation file could then be shared among users as an extension, or eventually even fed back to OpenAI for training.
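A minimal sketch of what such a rated action history could look like. The `ActionRecord` and `ActionHistory` names, the JSON layout, and the `web_search` command are all assumptions made for illustration; none of them come from AutoGPT itself.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class ActionRecord:
    """One executed action plus an optional user rating (hypothetical schema)."""
    command: str
    arguments: dict
    rating: Optional[str] = None  # "p" (positive), "n" (negative), or None if unrated


class ActionHistory:
    """In-memory log of actions that can be rated and exported for sharing."""

    def __init__(self) -> None:
        self.records: List[ActionRecord] = []

    def log(self, command: str, arguments: dict) -> int:
        """Append an action and return its index in the history."""
        self.records.append(ActionRecord(command, dict(arguments)))
        return len(self.records) - 1

    def rate(self, index: int, rating: str) -> None:
        """Attach a "p" or "n" rating to a previously logged action."""
        if rating not in ("p", "n"):
            raise ValueError('rating must be "p" or "n"')
        self.records[index].rating = rating

    def to_json(self) -> str:
        """Serialize the history so it can be shared as an evaluation file."""
        return json.dumps([asdict(r) for r in self.records], indent=2)

    @classmethod
    def from_json(cls, text: str) -> "ActionHistory":
        """Rebuild a history from a shared evaluation file."""
        history = cls()
        history.records = [ActionRecord(**item) for item in json.loads(text)]
        return history
```

Usage would be along the lines of `i = history.log("web_search", {"query": "..."})` followed by `history.rate(i, "p")`, with `to_json()` producing the file that users exchange.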

Examples 🌈

No response

Motivation 🔦

To improve the software through crowd learning, which is the basic principle behind ChatGPT.

@Androbin
Contributor

Please note that GPT-4 can only do in-context learning, as the API does not currently support fine-tuning the model.
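Since only in-context learning is available, one way ratings could still influence the model is to place them in the prompt. The sketch below only builds the text; the function name, the wording of the prompt, and the `(description, rating)` input format are assumptions, and no API call is made.

```python
from typing import List, Tuple


def _last(items: List[str], n: int) -> List[str]:
    """Return the last n items, or an empty list when n is not positive."""
    return items[-n:] if n > 0 else []


def build_feedback_prompt(rated_actions: List[Tuple[str, str]],
                          max_examples: int = 5) -> str:
    """Turn rated actions into a text block that can be prepended to a prompt.

    rated_actions: (description, rating) pairs, where rating is "p" or "n".
    Only the most recent max_examples of each kind are kept, to limit
    how much of the context window the feedback consumes.
    """
    liked = _last([d for d, r in rated_actions if r == "p"], max_examples)
    disliked = _last([d for d, r in rated_actions if r == "n"], max_examples)

    lines: List[str] = []
    if liked:
        lines.append("Actions the user rated as helpful:")
        lines.extend(f"- {d}" for d in liked)
    if disliked:
        lines.append("Actions the user rated as unhelpful:")
        lines.extend(f"- {d}" for d in disliked)
    return "\n".join(lines)
```

The returned string would be included in the system or user message on the next request, so the model sees the feedback as examples rather than having its weights updated.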

@ntindle added the "potential plugin" label Apr 21, 2023
@Boostrix
Contributor

Boostrix commented May 9, 2023

This is more important than people may think: you need it for any sort of fitness function or training purpose. Regardless of whether the LLM supports this, the reward system could also be executed locally to self-optimize: #3868 (comment)

All actions/commands also have a certain cost associated with them.
So in general, all actions/commands need to expose their costs so that a local reward function can optimize for them.
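The idea of a local, cost-aware reward function could be sketched roughly as follows. The scoring rule (rating value minus a weighted cost), the `cost_weight` parameter, and the command names in the usage note are assumptions chosen for illustration, not something specified in this thread.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed mapping from user ratings to numeric values.
RATING_VALUE: Dict[str, float] = {"p": 1.0, "n": -1.0}


def reward(rating: str, cost: float, cost_weight: float = 0.1) -> float:
    """Score one action: the rating value minus a penalty proportional to its cost."""
    return RATING_VALUE[rating] - cost_weight * cost


def rank_commands(observations: Iterable[Tuple[str, str, float]],
                  cost_weight: float = 0.1) -> List[Tuple[str, float]]:
    """Rank commands by mean reward, best first.

    observations: (command, rating, cost) triples, where cost is whatever
    unit the commands expose (tokens, seconds, money, ...).
    """
    rewards = defaultdict(list)
    for command, rating, cost in observations:
        rewards[command].append(reward(rating, cost, cost_weight))
    averages = {cmd: sum(vals) / len(vals) for cmd, vals in rewards.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)
```

For example, given `[("web_search", "p", 2.0), ("web_search", "n", 2.0), ("read_file", "p", 0.5)]`, `read_file` ranks above `web_search` because it is both better rated on average and cheaper. A real agent would need a more careful estimate than a plain mean, but this shows why each command has to report its cost.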

@Androbin
Contributor

I'm all for on-policy reinforcement learning, see the paper
LETI: Learning to Generate from Textual Interactions
https://arxiv.org/abs/2305.10314

@github-actions
Contributor

github-actions bot commented Sep 6, 2023

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

@github-actions bot added the Stale label Sep 6, 2023
@github-actions
Contributor

This issue was closed automatically because it has been stale for 10 days with no activity.

@github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Sep 17, 2023
4 participants