Refactor/remove abstract singleton as voice base parent #4931
Conversation
✅ Deploy Preview for auto-gpt-docs ready!
Codecov Report
Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
## master #4931 +/- ##
==========================================
+ Coverage 50.73% 51.35% +0.62%
==========================================
Files 128 119 -9
Lines 5517 4899 -618
Branches 759 642 -117
==========================================
- Hits 2799 2516 -283
+ Misses 2507 2201 -306
+ Partials 211 182 -29
☔ View full report in Codecov by Sentry.
This pull request has conflicts with the base branch; please resolve them so we can evaluate the pull request.
@dayofthedave would you mind testing this, to make sure the speech function still fulfils its purpose?
I'm not quite sure what function speech is expected to fulfill. For me personally, I use it to monitor the agent in continuous mode so I don't have to keep my eyes glued to the screen, but I can still know if it gets stuck or starts doing something I don't want. Would others agree that this is the purpose of speech, or is there something else I should be testing?

Speech seems to be somewhat broken on master, and I see the same behavior on this branch. I haven't worked with Auto-GPT in a while, so I'm not sure how it's expected to function now, but it seems to be locked in Google TTS mode. Perhaps I'm doing something wrong, but I can't get any other TTS engines to work anymore via the TEXT_TO_SPEECH_PROVIDER setting in .env. Auto-GPT always ignores this setting and uses Google TTS.

Question: Is multiple voice support still needed? I've been submitting PRs to enhance this feature, but it no longer appears to be a feature. At one point, Auto-GPT was spinning up helper agents all the time to complete certain tasks, and I had these helper agents speaking in different voices on MacOS, StreamElements and ElevenLabs. I haven't been able to trigger this helper agent behavior for quite a while now, however, so presumably it's no longer a thing, and there's no need to continue working on multiple voice support.
> Would others agree that this is the purpose of speech, or is there something else I should be testing?

I'd say that's the most important use case, yes.

> I can't get any other TTS engines to work anymore via the TEXT_TO_SPEECH_PROVIDER setting in .env. Auto-GPT always ignores this setting and uses Google TTS.

Thank you for bringing that to our attention. PR to fix it: #5005

> Question: Is multiple voice support still needed?

Not strictly needed. But very nice to have? Definitely! :)

The sub-agent implementation was removed because it didn't perform very well, and there were numerous problems with it. It will soon be replaced by a much better, more capable implementation.
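(As an aside for readers following the TEXT_TO_SPEECH_PROVIDER discussion above, here is a minimal sketch of how such a setting could map to a TTS engine. The class names and accepted values below are assumptions based on the engines mentioned in this thread, not the actual Auto-GPT implementation.)

```python
import os


class GTTSVoice:
    """Stand-in for a Google TTS engine wrapper (illustrative only)."""

    def say(self, text: str) -> None:
        print(f"[gtts] {text}")


class ElevenLabsVoice:
    """Stand-in for an ElevenLabs engine wrapper (illustrative only)."""

    def say(self, text: str) -> None:
        print(f"[elevenlabs] {text}")


def get_voice_engine():
    """Pick a TTS engine based on the TEXT_TO_SPEECH_PROVIDER environment variable."""
    provider = os.getenv("TEXT_TO_SPEECH_PROVIDER", "gtts").lower()
    engines = {
        "gtts": GTTSVoice,
        "elevenlabs": ElevenLabsVoice,
    }
    # Unknown or missing values fall back to Google TTS.
    return engines.get(provider, GTTSVoice)()


engine = get_voice_engine()
engine.say("Hello from the configured TTS engine.")
```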
Sounds great! I'm hoping to get more time in the near future to spend on this project.

Also, on the multiple voice support: it sounds a bit silly, but it was really cool to hear different agents speaking in different voices. Really seemed to give them different personalities, and the interplay between them was fun to listen to. Can't wait for this feature to return!
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
Background
This is one of several PRs aimed at removing global state from the system by removing globally invoked singletons (see #4227 #4737 #4901 #4900).
In the existing implementation we stand up a particular instance of a `VoiceBase` every time we invoke the `say_text` function (which is invoked in multiple places), with branching statements based on configuration values.

This PR adds a TTS provider class that loads and holds on to the configured TTS service (thus removing the need for a singleton) and handles the invocation logic in a single place. A new method is added to the logger class (which also serves as our de facto UI through the `typewriter_log` method) that allows TTS to be invoked even when we're not doing the `typewriter_log` (needed to preserve backwards compatibility with current UI functionality; otherwise the system will say "speak" every time the assistant thoughts come back with something to say).
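To make the intended structure concrete, below is a minimal, self-contained sketch: a provider object constructed once with the configured engine, and a logger that exposes both `typewriter_log` and a new `say` method. The names `TextToSpeechProvider`, `GTTSVoice`, `speak_text` and `enabled` are illustrative assumptions, not the exact code in this PR.

```python
from typing import Protocol


class VoiceBase(Protocol):
    """Minimal interface each TTS engine is expected to expose."""

    def say(self, text: str) -> None:
        ...


class GTTSVoice:
    """Placeholder engine; the real code would wrap the configured TTS service."""

    def say(self, text: str) -> None:
        print(f"(speaking) {text}")


class TextToSpeechProvider:
    """Constructed once; owns the engine and all speech-invocation logic."""

    def __init__(self, voice: VoiceBase, enabled: bool = True) -> None:
        self._voice = voice
        self._enabled = enabled

    def say(self, text: str) -> None:
        if self._enabled:
            self._voice.say(text)


class Logger:
    """De facto UI: prints via typewriter_log, and can speak without printing via say."""

    def __init__(self, tts: TextToSpeechProvider) -> None:
        self._tts = tts

    def typewriter_log(self, title: str, content: str, speak_text: bool = False) -> None:
        if speak_text:
            self._tts.say(content)
        print(f"{title} {content}")

    def say(self, text: str) -> None:
        # Speak without writing to the console, preserving current UI behaviour.
        self._tts.say(text)


logger = Logger(TextToSpeechProvider(GTTSVoice()))
logger.typewriter_log("SPEAK:", "Starting on the next task.", speak_text=True)
logger.say("I am going to browse the web for recent news.")
```

The design point is that the engine is selected and constructed exactly once, and every call site speaks through the provider (or the logger) instead of re-deciding which engine to use.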
Changes
- Wraps the `say_text` command in a class that holds on to the configured provider
- Speech is now invoked via `logger.typewriter_log` or `logger.say` (since the logger is our current UI, unfortunately)
- Removes `AbstractSingleton` as a base class for `VoiceBase`
- Removes `AbstractSingleton` 🎉

Documentation
Test Plan
PR Quality Checklist