AGBenchmark: Codebase clean-up #6650

Merged
merged 33 commits into from
Jan 2, 2024

@Pwuts (Member) commented Jan 1, 2024

Codebase is a huge mess. This should fix the worst of it.

Changes 🏗️

  • Deduplicate configuration loading logic
  • Fix type errors, linting errors, and clean up CLI validation in main.py
  • Lint and typefix app.py
  • Replace .agent_protocol_client by agent-protocol-client, clean up schema.py
  • Use pathlib in agent_interface.py and agent_api_interface.py
  • Fix path prefix stacking in AgentApi requests
  • Improve typing, response validation, and readability in app.py
  • Clean up logging and print statements
  • Remove unused server.py and agent_interface.py::run_agent
  • Clean up conftest.py
  • Clean up generate_test.py file
  • Fix and add type annotations in execute_sub_process.py
  • Simplify const determination in agent_interface.py
  • Register category markers to prevent warnings
  • Fix indentation in 4_revenue_retrieval_2/data.json
  • Update agent_api_interface.py
  • Improve and centralize pathfinding
  • Clean up and improve CLI
  • Move AgentBenchmarkConfig and related functions to config.py
  • Fix ReportManager init parameter types and use pathlib
  • Improve typing surrounding ChallengeData and clean up its implementation
  • Clean up generate_test.py, conftest.py and main.py
  • Merge AGBenchmarkPathManager into AgentBenchmarkConfig and reduce fragmented/global state
  • Configurable port for serve subcommand
  • Add config subcommand
  • Gracefully handle incompatible challenge spec files in app.py
  • Move run_benchmark entrypoint to main.py, use it in /reports endpoint
  • Remove unused /updates endpoint and all related code
  • Clean up and update docstrings on AgentBenchmarkConfig
  • Restore mechanism to select (optional) categories in agent benchmark config

PR Quality Scorecard ✨

  • Have you used the PR description template?   +2 pts
  • Is your pull request atomic, focusing on a single change?   +5 pts
  • Have you linked the GitHub issue(s) that this PR addresses?   +5 pts
  • Have you documented your changes clearly and comprehensively?   +5 pts
  • Have you changed or added a feature?   -4 pts
    • Have you added/updated corresponding documentation?   +4 pts
    • Have you added/updated corresponding integration tests?   +5 pts
  • Have you changed the behavior of AutoGPT?   -5 pts
    • Have you also run agbenchmark to verify that these changes do not regress performance?   +10 pts

Pwuts added 25 commits December 28, 2023 16:28
- Move the configuration loading logic to a separate `load_agbenchmark_config` function in the `agbenchmark/config.py` module.
- Replace the duplicate loading logic in `conftest.py`, `generate_test.py`, `ReportManager.py`, `reports.py`, and `__main__.py` with calls to the `load_agbenchmark_config` function.
…idation in __main__.py

- Fixed type errors and linting errors in `__main__.py`
- Improved the readability of CLI argument validation by introducing a separate function for it
- Rearranged and cleaned up import statements
- Fixed type errors caused by improper use of `psutil` objects
- Simplified a number of `os.path` usages by converting to `pathlib`
- Use `Task` and `TaskRequestBody` classes from `agent_protocol_client` instead of `.schema`
…ol-client`, clean up schema.py

- Remove `agbenchmark.agent_protocol_client` (an offline copy of `agent-protocol-client`).
   - Add `agent-protocol-client` as a dependency and change imports to `agent_protocol_client`.
- Fix type annotation on `agent_api_interface.py::upload_artifacts` (`ApiClient` -> `AgentApi`).
- Remove all unused types from schema.py (= most of them).
…lity in app.py

- Simplified response generation by leveraging type checking and conversion by FastAPI.
- Introduced use of `HTTPException` for error responses.
- Improved naming, formatting, and typing in `app.py::create_evaluation`.
- Updated the docstring on `app.py::create_agent_task`.
- Fixed return type annotations of `create_single_test` and `create_challenge` in generate_test.py.
- Added default values to optional attributes on models in report_types_v2.py.
- Removed unused imports in `generate_test.py`
- Introduced use of the `logging` library for unified logging and better readability.
- Converted most print statements to use `logger.debug`, `logger.warning`, and `logger.error`.
- Improved descriptiveness of log statements.
- Removed unnecessary print statements.
- Added log statements to unspecific and non-verbose `except` blocks.
- Added `--debug` flag, which sets the log level to `DEBUG` and enables a more comprehensive log format.
- Added `.utils.logging` module with `configure_logging` function to easily configure the logging library.
- Converted raw escape sequences in `.utils.challenge` to use `colorama`.
- Renamed `generate_test.py::generate_tests` to `load_challenges`.
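
A minimal sketch of what a `configure_logging` helper driven by a `--debug` flag could look like, using only the standard `logging` library. The two format strings are assumptions for illustration; the PR text does not show the actual formats used in `.utils.logging`.

```python
import logging

# Hypothetical format strings; the real ones are not shown in this PR text.
SIMPLE_LOG_FORMAT = "%(asctime)s %(levelname)s  %(message)s"
DEBUG_LOG_FORMAT = "%(asctime)s %(levelname)s %(filename)s:%(lineno)d  %(message)s"


def configure_logging(level: int = logging.INFO) -> None:
    """Configure the root logger; use a more detailed format at DEBUG level."""
    log_format = DEBUG_LOG_FORMAT if level == logging.DEBUG else SIMPLE_LOG_FORMAT
    # force=True replaces handlers installed by an earlier basicConfig call
    logging.basicConfig(level=level, format=log_format, force=True)


logger = logging.getLogger(__name__)

configure_logging(logging.DEBUG)  # roughly what a `--debug` flag might trigger
logger.debug("Loading challenges")
```

Modules then only need `logging.getLogger(__name__)` and never configure handlers themselves, which keeps the log format in one place.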
…run_agent

- Remove unused server.py file
- Remove unused run_agent function from agent_interface.py
- Fix and add type annotations
- Rewrite docstrings
- Disable or remove unused code
- Fix definition of arguments and their types in `pytest_addoption`
- Refactored the `create_single_test` function for clarity and readability
   - Removed unused variables
   - Made creation of `Challenge` subclasses more straightforward
   - Made bare `except` more specific
- Renamed `Challenge.setup_challenge` method to `run_challenge`
- Updated type hints and annotations
- Made minor code/readability improvements in `load_challenges`
- Added a helper function `_add_challenge_to_module` for attaching a Challenge class to the current module
- Simplify the logic that determines the value of `HELICONE_GRAPHQL_LOGS`
- Use the `pytest_configure` hook to register the known challenge categories as markers. Otherwise, Pytest will raise "unknown marker" warnings at runtime.
- Add type annotations to `copy_agent_artifacts_into_temp_folder` function
- Add note about broken endpoint in the `agent_protocol_client` library
- Remove unused variable in `run_api_agent` function
- Improve readability and resolve linting error
- Search path hierarchy for applicable `agbenchmark_config`, rather than assuming it's in the current folder.
- Create `agbenchmark.utils.path_manager` with `AGBenchmarkPathManager`, and export a `PATH_MANAGER` const.
- Replace path constants defined in __main__.py with usages of `PATH_MANAGER`.
- Updated commands, options, and their descriptions to be more intuitive and consistent
- Moved slow imports into the entrypoints that use them to speed up application startup
- Fixed type hints to match output types of Click options
- Hid deprecated `agbenchmark start` command
- Refactored code to improve readability and maintainability
- Moved main entrypoint into `run` subcommand
- Fixed `version` and `serve` subcommands
- Added `click-default-group` package to allow using `run` implicitly (for backwards compatibility)
- Renamed `--no_dep` to `--no-dep` for consistency
- Fixed string formatting issues in log statements
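
The marker registration mentioned above (the `pytest_configure` hook) might look roughly like the following `conftest.py` fragment. The category names are placeholders, not the benchmark's actual category list; `config.addinivalue_line("markers", ...)` is pytest's documented way to register a marker at runtime.

```python
# conftest.py-style sketch; the category names below are placeholders.
CHALLENGE_CATEGORIES = ["coding", "data", "general"]


def pytest_configure(config) -> None:
    """Register each challenge category as a known pytest marker."""
    for category in CHALLENGE_CATEGORIES:
        # "markers" is the ini option in which pytest keeps registered markers
        config.addinivalue_line("markers", f"{category}: challenge category")
```

With the markers registered, tests decorated with `@pytest.mark.<category>` no longer trigger "unknown marker" warnings.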
…ctions to config.py

- Move the `AgentBenchmarkConfig` class from `utils/data_types.py` to `config.py`.
- Extract the `calculate_info_test_path` function from `utils/data_types.py` and move it to `config.py` as a private helper function `_calculate_info_test_path`.
- Move `load_agent_benchmark_config()` to `AgentBenchmarkConfig.load()`.
- Changed simple getter methods on `AgentBenchmarkConfig` to calculated properties.
- Update all code references according to the changes mentioned above.
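
As an illustration of the `load()` classmethod and calculated properties described above, here is a stdlib-only sketch that also searches the path hierarchy for an `agbenchmark_config` folder. `BenchmarkConfigSketch`, `find_config_folder`, and `reports_folder` are hypothetical names; the real `AgentBenchmarkConfig` is a Pydantic model and may differ in every detail.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class BenchmarkConfigSketch:
    """Hypothetical stand-in for `AgentBenchmarkConfig`."""

    config_dir: Path

    @classmethod
    def find_config_folder(cls, start: Optional[Path] = None) -> Path:
        """Walk up from `start` until a folder containing `agbenchmark_config` is found."""
        current = (start or Path.cwd()).resolve()
        for folder in [current, *current.parents]:
            candidate = folder / "agbenchmark_config"
            if candidate.is_dir():
                return candidate
        raise FileNotFoundError(f"No 'agbenchmark_config' found at or above {current}")

    @classmethod
    def load(cls, start: Optional[Path] = None) -> "BenchmarkConfigSketch":
        """Build the config from the nearest `agbenchmark_config` folder."""
        return cls(config_dir=cls.find_config_folder(start))

    @property
    def reports_folder(self) -> Path:
        # Calculated property instead of a simple getter method
        return self.config_dir / "reports"
```

Callers would use `BenchmarkConfigSketch.load().reports_folder` rather than a module-level constant, which is the kind of global state this PR tries to reduce.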
…athlib

- Fix the type annotation of the `benchmark_start_time` parameter in `ReportManager.__init__`, which was mistyped as `str` instead of `datetime`.
- Change the type of the `filename` parameter in the `ReportManager.__init__` method from `str` to `Path`.
- Rename `self.filename` to `self.report_file` in `ReportManager`.
- Change the way the report file is created, opened and saved to use the `Path` object.
…an up its implementation

- Use `ChallengeData` objects instead of untyped `dict` in app.py, generate_test.py, reports.py.
- Remove unnecessary methods `serialize`, `get_data`, `get_json_from_path`, `deserialize` from `ChallengeData` class.
- Remove unused methods `challenge_from_datum` and `challenge_from_test_data` from `ChallengeData` class.
- Update function signatures and annotations of `create_challenge` and `generate_single_test` functions in generate_test.py.
- Add types to function signatures of `generate_single_call_report` and `finalize_reports` in reports.py.
- Remove unnecessary `challenge_data` parameter (in generate_test.py) and fixture (in conftest.py).
…n__.py

- Cleaned up generate_test.py and conftest.py
   - Consolidated challenge creation logic in the `Challenge` class itself, most notably the new `Challenge.from_challenge_spec` method.
   - Moved challenge selection logic from generate_test.py to the `pytest_collection_modifyitems` hook in conftest.py.
- Converted methods in the `Challenge` class to class methods where appropriate.
- Improved argument handling in the `run_benchmark` function in `__main__.py`.
…nchmarkConfig and reduce fragmented/global state

- Merge the functionality of `AGBenchmarkPathManager` into `AgentBenchmarkConfig` to consolidate the configuration management.
- Remove the `.path_manager` module containing `AGBenchmarkPathManager`.
- Pass the `AgentBenchmarkConfig` and its attributes through function arguments to reduce global state and improve code clarity.
- Added `--port` option to `serve` subcommand to allow for specifying the port to run the API on.
- If no `--port` option is provided, the port will default to the value specified in the `PORT` environment variable, or 8080 if not set.
- Added a new subcommand `config` to the AGBenchmark CLI, to display information about the present AGBenchmark config.
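
The port fallback order for `serve` described above (explicit `--port`, then the `PORT` environment variable, then 8080) can be sketched as a small helper. `resolve_port` is a hypothetical name; the actual CLI is built with Click and may express the same default differently.

```python
import os
from typing import Mapping, Optional

DEFAULT_PORT = 8080


def resolve_port(cli_port: Optional[int], env: Mapping[str, str] = os.environ) -> int:
    """Pick the port: explicit `--port`, then `PORT` from the environment, then 8080."""
    if cli_port is not None:
        return cli_port
    return int(env.get("PORT", DEFAULT_PORT))
```

For example, `resolve_port(None, {"PORT": "5000"})` gives `5000`, while `resolve_port(None, {})` falls back to `8080`.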
Contributor

github-actions bot commented Jan 1, 2024

This PR exceeds the recommended size of 500 lines. Please make sure you are NOT addressing multiple issues with one PR.

netlify bot commented Jan 1, 2024

Deploy Preview for auto-gpt-docs ready!

Name Link
🔨 Latest commit 2135019
🔍 Latest deploy log https://app.netlify.com/sites/auto-gpt-docs/deploys/6594502fa8296100085460d4
😎 Deploy Preview https://deploy-preview-6650--auto-gpt-docs.netlify.app

@Pwuts Pwuts added code quality ⬆️ PRs that improve code quality Classic Benchmark labels Jan 1, 2024
Pwuts added 5 commits January 2, 2024 16:49
…n app.py

- Added a check to skip deprecated challenges
- Added logging to allow debugging of the loading process
- Added handling of validation errors when parsing challenge spec files
- Added missing `spec_file` attribute to `ChallengeData`
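
A rough, stdlib-only sketch of loading challenge spec files while skipping deprecated or incompatible ones. The actual code validates specs against the `ChallengeData` model; here `REQUIRED_KEYS`, the `data.json` file name pattern, and the "deprecated" folder check are all assumptions made for illustration.

```python
import json
import logging
from pathlib import Path
from typing import Iterator

logger = logging.getLogger(__name__)

REQUIRED_KEYS = {"name", "category", "task"}  # hypothetical minimal schema


def load_challenge_specs(challenges_dir: Path) -> Iterator[dict]:
    """Yield parsed challenge specs, skipping deprecated or invalid files."""
    for spec_file in sorted(challenges_dir.rglob("data.json")):
        # Assumption: deprecated challenges live under a "deprecated" folder
        if "deprecated" in spec_file.relative_to(challenges_dir).parts:
            logger.debug("Skipping deprecated challenge spec %s", spec_file)
            continue
        try:
            spec = json.loads(spec_file.read_text())
            if not isinstance(spec, dict):
                raise ValueError("spec is not a JSON object")
            missing = REQUIRED_KEYS - spec.keys()
            if missing:
                raise ValueError(f"missing keys: {sorted(missing)}")
        except ValueError as e:  # json.JSONDecodeError subclasses ValueError
            logger.warning("Skipping incompatible spec %s: %s", spec_file, e)
            continue
        # Record where the spec came from, similar to a `spec_file` attribute
        spec["spec_file"] = str(spec_file)
        yield spec
```

One broken or outdated spec file then produces a warning in the log instead of aborting the whole load.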
…it in `/reports` endpoint

- Move `run_benchmark` and `validate_args` from __main__.py to main.py
- Replace agbenchmark subprocess in `app.py:run_single_test` with `run_benchmark`
- Move `get_unique_categories` from __main__.py to challenges/__init__.py
- Move `OPTIONAL_CATEGORIES` from __main__.py to challenge.py
- Reduce operations on updates.json (including `initialize_updates_file`) outside of API
…d code

- Remove `updates_json_file` attribute from `AgentBenchmarkConfig`
- Remove `get_updates` and `_initialize_updates_file` in app.py
- Remove `append_updates_file` and `create_update_json` functions in agent_api_interface.py
- Remove call to `append_updates_file` in challenge.py
…enchmarkConfig`

- Add and update docstrings
- Change base class from `BaseModel` to `BaseSettings`, allow extras for backwards compatibility
- Make naming of path attributes on `AgentBenchmarkConfig` more consistent
- Remove unused `agent_home_directory` attribute
- Remove unused `workspace` attribute

@Pwuts Pwuts force-pushed the benchmark/clean-up branch from a202a7a to f8a97f9 Compare January 2, 2024 16:52

@Pwuts (Member, Author) commented Jan 2, 2024

Blocked by this PR:

@jzanecook

> Blocked by this PR:

Should be good upstream now.

@Pwuts Pwuts marked this pull request as ready for review January 2, 2024 17:54
@Pwuts Pwuts requested a review from a team January 2, 2024 17:54
@Pwuts Pwuts requested a review from a team as a code owner January 2, 2024 17:54
@Pwuts Pwuts force-pushed the benchmark/clean-up branch from a72eac2 to 2135019 Compare January 2, 2024 18:04
@Pwuts Pwuts merged commit 25cc6ad into master Jan 2, 2024
12 of 13 checks passed
@Pwuts Pwuts deleted the benchmark/clean-up branch January 2, 2024 21:23
Projects
Status: Done
3 participants