Rename `evaluate_code` to `analyze_code` #1371

willcallender · 2023-04-14T17:39:14Z

Background

As seen in #101 and #286, GPT-3.5 misinterprets "evaluate" to mean "execute" rather than "analyze". As such, I changed the name of the function to "analyze" to make the purpose of the function more clear to the AI.

Changes

Renamed the evaluate_code command to analyze_code everywhere it appears, including the function names and the text as it's given to the AI.
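
To make the scope of the rename concrete, here is a minimal sketch of the two places a command name typically appears: the function/registry side and the text shown to the AI. The registry `COMMANDS`, the `dispatch` helper, the `PROMPT_LINE` string, and the placeholder body of `analyze_code` are all hypothetical illustrations, not the project's actual code.

```python
from typing import Callable, Dict


def analyze_code(code: str) -> str:
    """Return feedback on the code without running it (placeholder logic)."""
    line_count = len(code.splitlines())
    return f"Received {line_count} line(s) for review; nothing was executed."


# Hypothetical registry: command name as shown to the AI -> handler function.
COMMANDS: Dict[str, Callable[[str], str]] = {
    "analyze_code": analyze_code,  # previously registered as "evaluate_code"
}

# Hypothetical prompt line describing the command to the AI.
PROMPT_LINE = 'Analyze Code: "analyze_code", args: "code": "<full_code_string>"'


def dispatch(command_name: str, argument: str) -> str:
    """Look up a command by name and call it; unknown names are reported."""
    handler = COMMANDS.get(command_name)
    if handler is None:
        return f"Unknown command '{command_name}'"
    return handler(argument)
```

In this sketch, the name has to change in both the registry key and the prompt text; if only one were renamed, the AI would be told about a command the dispatcher no longer recognizes.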

Documentation

I tested with several simple prompts, namely hello world programs in various languages, and I haven't seen this error since making the change.

Test Plan

I used the same basic AI personality for these tests: Dev-GPT, an AI designed to autonomously develop, run, and test code. I gave it a single goal: "Write and run a simple hello world program in [language]." I tested Python and Rust. It didn't always succeed, but it never got confused by evaluate_code.

PR Quality Checklist

My pull request is atomic and focuses on a single change.
I have thoroughly tested my changes with multiple different prompts.
I have considered potential risks and mitigations for my changes.
I have documented my changes clearly and comprehensively.
I have not snuck in any "extra" small tweaks or changes.

From my own observations and others (ie #101 and #286) ChatGPT seems to think that `evaluate_code` will actually run code, rather than just provide feedback. Since changing the phrasing to `analyze_code` I haven't seen the AI make this mistake.

Handles Docker errors separately, and prints a potentially helpful message for users.

nponeccop · 2023-04-16T15:23:02Z

@willcallender There are conflicts now

github-actions · 2023-04-17T15:58:34Z

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

willcallender · 2023-04-17T18:12:32Z

> @willcallender There are conflicts now

Sorry, but where can I see the conflicts? Normally I'd use the resolve conflicts button but it's greyed out for me.

github-actions · 2023-04-17T18:48:06Z

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

willcallender · 2023-04-17T19:11:43Z

I've tested the changes and they still work as expected.

github-actions · 2023-04-17T22:34:45Z

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

Pwuts

Looks good to me

github-actions · 2023-04-19T00:13:09Z

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

ChatGPT is less confused by this phrasing

From my own observations and others (ie Significant-Gravitas#101 and Significant-Gravitas#286) ChatGPT seems to think that `evaluate_code` will actually run code, rather than just provide feedback. Since changing the phrasing to `analyze_code` I haven't seen the AI make this mistake.

Co-authored-by: Reinier van der Leer

willcallender added 3 commits April 14, 2023 12:52

ChatGPT is less confused by this phrasing

92d5609

From my own observations and others (ie #101 and #286) ChatGPT seems to think that `evaluate_code` will actually run code, rather than just provide feedback. Since changing the phrasing to `analyze_code` I haven't seen the AI make this mistake.

Merge branch 'Torantulino:master' into master

94befa1

Handle Docker errors specifically

c840e48

Handles Docker errors separately, and prints a potentially helpful message for users.
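
As a rough illustration of what handling Docker errors separately can look like, the sketch below catches a container-specific exception before the generic one and adds a hint for the user. `ContainerRuntimeError` and `run_with_docker_hint` are hypothetical stand-ins so the example stays self-contained; with the Docker SDK for Python, the analogous base exception would be `docker.errors.DockerException`. The message wording is an assumption, not the text from this commit.

```python
from typing import Callable


class ContainerRuntimeError(Exception):
    """Stand-in for a Docker-specific error (hypothetical, not an SDK class)."""


def run_with_docker_hint(run: Callable[[], str]) -> str:
    """Call `run` and turn failures into messages, treating Docker errors specially."""
    try:
        return run()
    except ContainerRuntimeError as err:
        # Docker-specific failure: add a hint that may help the user fix it.
        return (
            f"Error: could not run the code in Docker ({err}). "
            "Is Docker installed and running?"
        )
    except Exception as err:
        # Any other failure: report it without the Docker hint.
        return f"Error: {err}"
```

The specific `except` clause has to come first; otherwise the generic handler would swallow the Docker error and the hint would never be shown.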

Qoyyuum requested review from nponeccop and Torantulino April 16, 2023 09:37

nponeccop previously approved these changes Apr 16, 2023

github-actions bot added the conflicts label Apr 17, 2023

willcallender added 2 commits April 17, 2023 14:40

Merge remote-tracking branch 'upstream/master'

f96f44d

Rename

4fa30b1

willcallender dismissed nponeccop’s stale review via 4fa30b1 April 17, 2023 18:47

github-actions bot removed the conflicts label Apr 17, 2023

github-actions bot added the conflicts label Apr 17, 2023

Pwuts previously approved these changes Apr 18, 2023

Pwuts requested a review from BillSchumacher April 18, 2023 00:25

Pwuts linked an issue Apr 18, 2023 that may be closed by this pull request

AI seems to think that "evaluate code" runs it #101

Closed

Pwuts added the enhancement label Apr 18, 2023

nponeccop assigned BillSchumacher Apr 18, 2023

Pwuts dismissed their stale review via 9c73ea9 April 19, 2023 00:12

github-actions bot removed the conflicts label Apr 19, 2023

Merge branch 'master' into willcallender/master

7158a22

Pwuts approved these changes Apr 19, 2023

Pwuts merged commit 8532307 into Significant-Gravitas:master Apr 19, 2023
