Fix split file to handle edge case where overlap size > last chunk size #2062

Merged: 4 commits, Apr 17, 2023

Conversation

rocks6
Contributor

@rocks6 rocks6 commented Apr 17, 2023

Background

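Currently, split_file double-counts the last overlap characters of a document because of the way the chunks are consumed: when the final chunk is no longer than the overlap, its text has already been emitted at the end of the previous chunk.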

Changes

Add logic to the split_file function to handle this case.
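
To illustrate the edge case, here is a minimal sketch of an overlapping splitter with a guard for a trailing chunk that is no longer than the overlap. It is not the code merged in this PR: the name split_text, its parameters, and the default values are hypothetical, chosen only to make the behavior easy to follow.

```python
from typing import Iterator


def split_text(content: str, max_length: int = 10, overlap: int = 3) -> Iterator[str]:
    """Yield chunks of at most max_length chars; consecutive chunks share `overlap` chars.

    Hypothetical illustration only, not the project's split_file.
    """
    if not 0 <= overlap < max_length:
        raise ValueError("overlap must satisfy 0 <= overlap < max_length")

    step = max_length - overlap  # how far the window advances each iteration
    start = 0
    while start < len(content):
        chunk = content[start:start + max_length]
        # The previous chunk already covered content[start:start + overlap].
        # If what remains is no longer than the overlap, it was fully emitted
        # as the tail of the previous chunk, so yielding it would double count it.
        if start > 0 and len(chunk) <= overlap:
            break
        yield chunk
        start += step


if __name__ == "__main__":
    # 10 chars with max_length=10, overlap=3: one chunk, no duplicated "hij" tail.
    print(list(split_text("abcdefghij")))
    # 12 chars: the second chunk starts 3 chars before the end of the first.
    print(list(split_text("abcdefghijkl")))
```

Without the guard, the 10-character input would produce a second chunk containing only "hij", which is text the first chunk already ended with.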

Documentation

Commented

Test Plan

Tested with new file_operation unit tests, which will be included in a follow-up PR, and with a file operation prompt.

PR Quality Checklist

  • My pull request is atomic and focuses on a single change.
  • I have thoroughly tested my changes with multiple different prompts.
  • I have considered potential risks and mitigations for my changes.
  • I have documented my changes clearly and comprehensively.
  • I have not snuck in any "extra" small tweaks changes

@nponeccop added the B7 and bug (Something isn't working) labels Apr 17, 2023
@p-i-
Contributor

p-i- commented Apr 17, 2023

Does this invalidate #2088?

@p-i- p-i- merged commit 8637b8b into Significant-Gravitas:master Apr 17, 2023
@p-i- p-i- mentioned this pull request Apr 17, 2023
5 tasks
@socialmedialabs

socialmedialabs commented Apr 17, 2023

I can't confirm this is working. I just switched to the master branch and checked: Auto-GPT failed with the same error when I asked it to read a Python file from my hard drive:

    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 8191 tokens, however you requested 216094 tokens (216094 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.

@nponeccop nponeccop mentioned this pull request Apr 17, 2023
1 task
@rocks6
Contributor Author

rocks6 commented Apr 18, 2023

My PR addressed the last chunk of a split file being double-counted. The error you are seeing may be a separate issue.
