Integrate Auto-GPT with Auto-GPT-Benchmarks #4987

waynehamadi · 2023-07-15T16:56:43Z

Background

We want the master branch to stay compatible with Auto-GPT-Benchmarks, so that we always have the most up-to-date score in the benchmark.

Changes

- regression_tests.json stores the regression tests.
- benchmarks.py connects Auto-GPT to the benchmark's interface.
- config.json defines the entry_path, the workspace location, and the time a challenge should run.
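As a rough illustration of the three settings config.json is described as carrying, here is a minimal sketch of how such a file could be parsed and validated. Only `entry_path` is named in this PR; the key names `workspace` and `time_limit_seconds`, the `BenchmarkConfig` class, and the example values are hypothetical and may not match the actual file or the way agbenchmark reads it.

```python
import json
from dataclasses import dataclass
from pathlib import Path

# Hypothetical key set: only "entry_path" is named in the PR description.
REQUIRED_KEYS = {"entry_path", "workspace", "time_limit_seconds"}


@dataclass(frozen=True)
class BenchmarkConfig:
    entry_path: str          # module the benchmark uses to start the agent
    workspace: Path          # hypothetical key: directory the agent works in
    time_limit_seconds: int  # hypothetical key: how long a challenge may run


def parse_benchmark_config(raw: str) -> BenchmarkConfig:
    """Parse a JSON string and fail early if a required key is absent."""
    data = json.loads(raw)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"config is missing keys: {sorted(missing)}")
    return BenchmarkConfig(
        entry_path=data["entry_path"],
        workspace=Path(data["workspace"]),
        time_limit_seconds=int(data["time_limit_seconds"]),
    )


# Example values are made up for illustration.
example = (
    '{"entry_path": "benchmarks", '
    '"workspace": "auto_gpt_workspace", '
    '"time_limit_seconds": 60}'
)
config = parse_benchmark_config(example)
```

Validating the keys up front means a malformed config fails before any challenge starts, rather than partway through a run.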

Documentation

Test Plan

PR Quality Checklist

My pull request is atomic and focuses on a single change.
I have thoroughly tested my changes with multiple different prompts.
I have considered potential risks and mitigations for my changes.
I have documented my changes clearly and comprehensively.
I have not snuck in any "extra" small tweaks or changes.

I have run the following commands against my code to ensure it passes our linters:

black .
isort .
mypy
autoflake --remove-all-unused-imports --recursive --ignore-init-module-imports --ignore-pass-after-docstring autogpt tests --in-place

netlify · 2023-07-15T16:56:56Z

✅ Deploy Preview for auto-gpt-docs canceled.

Name	Link
🔨 Latest commit	`e75b9d5`
🔍 Latest deploy log	https://app.netlify.com/sites/auto-gpt-docs/deploys/64bc3bd396545d000805389a

codecov · 2023-07-15T17:11:02Z

Codecov Report

Patch and project coverage have no change.

Comparison is base (e0bcde1) 51.00% compared to head (e75b9d5) 51.00%.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #4987   +/-   ##
=======================================
  Coverage   51.00%   51.00%           
=======================================
  Files         119      119           
  Lines        4968     4968           
  Branches      662      662           
=======================================
  Hits         2534     2534           
  Misses       2239     2239           
  Partials      195      195

☔ View full report in Codecov by Sentry.

github-actions · 2023-07-20T16:20:11Z

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

github-actions · 2023-07-22T20:24:14Z

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

waynehamadi marked this pull request as draft July 15, 2023 16:56

github-actions bot added the size/m label Jul 15, 2023

waynehamadi force-pushed the benchmark-integration branch from 1830f56 to edbfbdb Compare July 15, 2023 17:07

waynehamadi force-pushed the benchmark-integration branch 3 times, most recently from cf30bbd to 8f6e811 Compare July 15, 2023 17:34

waynehamadi marked this pull request as ready for review July 15, 2023 17:50

SilenNaihin approved these changes Jul 15, 2023

collijk previously approved these changes Jul 16, 2023

github-actions bot added the conflicts Automatically applied to PRs with merge conflicts label Jul 20, 2023

waynehamadi dismissed collijk’s stale review via 0978a8f July 22, 2023 20:14

waynehamadi force-pushed the benchmark-integration branch 3 times, most recently from 017ebdf to 0f9431a Compare July 22, 2023 20:19

waynehamadi and others added 11 commits July 22, 2023 13:20

WIP

f283898

Signed-off-by: Merwane Hamadi <[email protected]>

WIP

823cb7f

Signed-off-by: Merwane Hamadi <[email protected]>

Update config for benchmark changes (#4883)

990088e

Add Helicone

e2b46b3

Add reports, consolidate, update benchmark files (#4941)

5a487eb

* updating config
* add reports, consolidate, update benchmark files

Update benchmarks.py

1298d0d

Change entrypath and add __init__.py

7093d89

Remove Helicone integration because we now have proxy at the system level

d4ab6fd

Support more regression tests

eda7cd1

Fix Auto-GPT/benchmark integration

b49ebda

Signed-off-by: Merwane Hamadi <[email protected]>

Remove cutoff

07824fe

waynehamadi force-pushed the benchmark-integration branch from 0f9431a to 6e51216 Compare July 22, 2023 20:22

github-actions bot removed the conflicts Automatically applied to PRs with merge conflicts label Jul 22, 2023

waynehamadi force-pushed the benchmark-integration branch from 6e51216 to 02396d6 Compare July 22, 2023 20:25

Install agbenchmark and make continuous mode dynamic

e75b9d5

Signed-off-by: Merwane Hamadi <[email protected]>

waynehamadi force-pushed the benchmark-integration branch from 02396d6 to e75b9d5 Compare July 22, 2023 20:28

ntindle approved these changes Jul 22, 2023

waynehamadi merged commit 4ada7d1 into master Jul 22, 2023

waynehamadi deleted the benchmark-integration branch July 22, 2023 21:58
