Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor/remove abstract singleton as voice base parent #4931

Conversation

collijk
Copy link
Contributor

@collijk collijk commented Jul 9, 2023

Background

This is one of several PRs aimed at removing global state from the system by removing globally invoked singletons (see #4227 #4737 #4901 #4900).

In the existing implementation we stand up a particular instance of a VoiceBase everytime we invoke the say_text function (which is invoked in multiple places) with branching statements based on configuration values.

This PR adds a TTS provider class that loads and holds onto the configured TTS service (thus removing a need for a singleton) and then handles the invocation logic in a single place. A new method is added to the logger class (which also serves as our de facto UI through the typewriter_log method) that allows the invocation of tts even when we're not doing the typewriter_log (needed to preserve backwards compatibility with current UI functionality, otherwise the system will say "speak" every time the assistant thoughts come back with something to say).

Changes

  • Wrap the say_text command in a class that holds on to the configured provider
  • Have all speech operate through the logger.typewriter_log or logger.say (since the logger is our current UI, unfortunately).
  • Remove AbstractSingleton as a base class for VoiceBase
  • Remove AbstractSingleton 🎉

Documentation

Test Plan

  • CI
  • Manual testing of classes and running with speech enabled

PR Quality Checklist

  • My pull request is atomic and focuses on a single change.
  • I have thoroughly tested my changes with multiple different prompts.
  • I have considered potential risks and mitigations for my changes.
  • I have documented my changes clearly and comprehensively.
  • I have not snuck in any "extra" small tweaks changes.
  • I have run the following commands against my code to ensure it passes our linters:
    black .
    isort .
    mypy
    autoflake --remove-all-unused-imports --recursive --ignore-init-module-imports --ignore-pass-after-docstring autogpt tests --in-place

@netlify
Copy link

netlify bot commented Jul 9, 2023

Deploy Preview for auto-gpt-docs ready!

Name Link
🔨 Latest commit f85bc32
🔍 Latest deploy log https://app.netlify.com/sites/auto-gpt-docs/deploys/64fb8d3c26fbfe0008862614
😎 Deploy Preview https://deploy-preview-4931--auto-gpt-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@collijk collijk marked this pull request as ready for review July 9, 2023 21:17
@github-actions github-actions bot added the size/l label Jul 9, 2023
@codecov
Copy link

codecov bot commented Jul 9, 2023

Codecov Report

Patch coverage: 60.46% and project coverage change: +0.62% 🎉

Comparison is base (78a620c) 50.73% compared to head (ddc2ab6) 51.35%.
Report is 1583 commits behind head on master.

❗ Current head ddc2ab6 differs from pull request most recent head f85bc32. Consider uploading reports for the commit f85bc32 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4931      +/-   ##
==========================================
+ Coverage   50.73%   51.35%   +0.62%     
==========================================
  Files         128      119       -9     
  Lines        5517     4899     -618     
  Branches      759      642     -117     
==========================================
- Hits         2799     2516     -283     
+ Misses       2507     2201     -306     
+ Partials      211      182      -29     
Files Changed Coverage Δ
autogpt/singleton.py 100.00% <ø> (ø)
autogpt/speech/say.py 56.41% <44.44%> (+18.91%) ⬆️
autogpt/logs/logger.py 81.65% <77.77%> (ø)
autogpt/agent/agent.py 58.51% <100.00%> (ø)
autogpt/logs/utils.py 79.41% <100.00%> (-20.59%) ⬇️
autogpt/speech/__init__.py 100.00% <100.00%> (ø)
autogpt/speech/base.py 78.26% <100.00%> (+24.09%) ⬆️

... and 84 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@github-actions github-actions bot added the conflicts Automatically applied to PRs with merge conflicts label Jul 13, 2023
@github-actions
Copy link
Contributor

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

@Pwuts
Copy link
Member

Pwuts commented Jul 17, 2023

@dayofthedave would you mind testing this, to make sure the speech function still fulfils its purpose?

@Pwuts Pwuts self-assigned this Jul 17, 2023
@Pwuts Pwuts added this to the v0.4.6 Release milestone Jul 17, 2023
@dayofthedave
Copy link

@dayofthedave would you mind testing this, to make sure the speech function still fulfils its purpose?

I'm not quite sure what function speech is expected to fulfill. For me personally, I use it to monitor the agent in continuous mode so I don't have to keep my eyes glued to the screen, but I can still know if it gets stuck or starts doing something I don't want. Would others agree that this is the purpose of speech, or is there something else I should be testing?

Speech seems to be somewhat broken on master, and I see the same behavior on this branch. I haven't worked with Auto-GPT in a while, so I'm not sure how it's expected to function now, but it seems to be locked in Google TTS mode. Perhaps I'm doing something wrong, but I can't get any other TTS engines to work anymore via the TEXT_TO_SPEECH_PROVIDER setting in .env. Auto-GPT always ignores this setting and uses Google TTS.

Question: Is multiple voice support still needed? I've been submitting PRs to enhance this feature, but it no longer appears to be a feature. At one point, Auto-GPT was spinning up helper agents all the time to complete certain tasks, and I had these helper agents speaking in different voices on MacOS, StreamElements and ElevenLabs. I haven't been able to trigger this helper agent behavior for quite a while now, however, so presumably it's no longer a thing, and there's no need to continue working on multiple voice support.

@Pwuts
Copy link
Member

Pwuts commented Jul 18, 2023

I'm not quite sure what function speech is expected to fulfill. For me personally, I use it to monitor the agent in continuous mode so I don't have to keep my eyes glued to the screen, but I can still know if it gets stuck or starts doing something I don't want. Would others agree that this is the purpose of speech, or is there something else I should be testing?

I'd say that's the most important use case, yes.

Speech seems to be somewhat broken on master, and I see the same behavior on this branch. I haven't worked with Auto-GPT in a while, so I'm not sure how it's expected to function now, but it seems to be locked in Google TTS mode. Perhaps I'm doing something wrong, but I can't get any other TTS engines to work anymore via the TEXT_TO_SPEECH_PROVIDER setting in .env. Auto-GPT always ignores this setting and uses Google TTS.

Thank you for bringing that to our attention. PR to fix it: #5005

Question: Is multiple voice support still needed?

Not strictly needed. But very nice to have? Definitely! :)

I've been submitting PRs to enhance this feature, but it no longer appears to be a feature. At one point, Auto-GPT was spinning up helper agents all the time to complete certain tasks, and I had these helper agents speaking in different voices on MacOS, StreamElements and ElevenLabs. I haven't been able to trigger this helper agent behavior for quite a while now, however, so presumably it's no longer a thing, and there's no need to continue working on multiple voice support.

The sub-agent implementation was removed because it didn't perform very well, and there were numerous problems with it. It will soon be replaced by a much better, more capable implementation.

@Pwuts Pwuts mentioned this pull request Jul 18, 2023
6 tasks
@lc0rp lc0rp modified the milestones: v0.4.6 Release, v0.4.7 Release Jul 22, 2023
@lc0rp lc0rp modified the milestones: v0.4.7 Release, v0.4.8 Aug 1, 2023
@dayofthedave
Copy link

dayofthedave commented Aug 5, 2023 via email

@Pwuts Pwuts self-requested a review as a code owner September 8, 2023 20:43
@github-actions github-actions bot removed the conflicts Automatically applied to PRs with merge conflicts label Sep 8, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Sep 8, 2023

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

@Pwuts

This comment was marked as off-topic.

@Pwuts Pwuts merged commit aef6b50 into Significant-Gravitas:master Sep 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants