Speak to LetMeDoIt AI

Eliran Wong edited this page Feb 22, 2024 · 11 revisions

You can speak to LetMeDoIt AI in two ways (both English and non-English languages are supported):

  1. Use LetMeDoIt Built-in Key Binding
  2. Use Speech-to-text Engines Built-in with Your OS

Use LetMeDoIt Built-in Key Binding

Requirement

The Python package 'PyAudio' must be installed.

Before installing "PyAudio", macOS and Linux users need to install "portaudio".

  • On macOS

brew install portaudio

  • On Ubuntu / Debian-based Linux

sudo apt update && sudo apt install portaudio19-dev

  • On Fedora / CentOS

sudo dnf update && sudo dnf install portaudio-devel
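
The per-platform commands above can be collected into a small lookup helper. This is only a minimal sketch: the function name `portaudio_install_command`, the dictionary keys, and the distribution labels are illustrative assumptions, not part of LetMeDoIt.

```python
import platform
from typing import Optional

# Commands copied from the list above; the dictionary keys are illustrative labels.
PORTAUDIO_COMMANDS = {
    "macos": "brew install portaudio",
    "debian": "sudo apt update && sudo apt install portaudio19-dev",
    "fedora": "sudo dnf update && sudo dnf install portaudio-devel",
}


def portaudio_install_command(system: str, distro: str = "") -> Optional[str]:
    """Return the portaudio install command for a platform, or None.

    `system` is the value of platform.system(); `distro` is a lowercase
    distribution name such as "ubuntu" (hypothetical input from the caller).
    """
    if system == "Darwin":
        return PORTAUDIO_COMMANDS["macos"]
    if system == "Linux":
        if distro in ("ubuntu", "debian"):
            return PORTAUDIO_COMMANDS["debian"]
        if distro in ("fedora", "centos"):
            return PORTAUDIO_COMMANDS["fedora"]
    # Only macOS and Linux need this step per the text above; anything else
    # is treated as "no known command" in this sketch.
    return None


if __name__ == "__main__":
    print(portaudio_install_command(platform.system(), "ubuntu"))
```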

Install PyAudio

  1. Activate your python environment

.[your_letmedoit_environment_path]\Scripts\activate # Windows users

source [your_letmedoit_environment_path]/bin/activate # macOS / Linux users

  2. Install the Python package "PyAudio"

pip install PyAudio

Alternatively, and more easily, enter the following in the LetMeDoIt AI prompt, provided that the plugin 'install python package' is enabled:

Install python package "PyAudio"

For troubleshooting, read:

https://github.com/Uberi/speech_recognition#python

macOS users may also read https://github.com/eliranwong/letmedoit/issues/47

Remarks: The LetMeDoIt plugin '000_check_pyaudio' checks whether PyAudio is installed each time LetMeDoIt starts, and tries to install it if it is missing.
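
A check of this kind can be approximated with the standard library. The sketch below is not the actual code of the '000_check_pyaudio' plugin; the function names are hypothetical. It relies on the fact that PyAudio is imported under the module name `pyaudio`, and it also reports whether a virtual environment appears to be active.

```python
import importlib.util
import sys


def is_package_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def in_virtual_environment() -> bool:
    """Return True when Python runs inside a venv (prefix differs from base prefix)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    if not in_virtual_environment():
        print("Note: no virtual environment appears to be active.")
    if is_package_installed("pyaudio"):
        print("PyAudio is available.")
    else:
        print("PyAudio is missing; run: pip install PyAudio")
```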

Activate Voice Typing

Press the default built-in key binding 'Ctrl+S' for voice typing.

You can customize the key combo; see the "Custom Key Bindings" page of this wiki.

Configure Built-in Voice Typing

Enter a blank entry '' to launch the LetMeDoIt AI action menu.

[Screenshot: voice_typing_option]

Alternatively, press "Escape" + "S" to change the voice typing configuration.

Option: model for voice typing:

[Screenshot: change_voice_typing_config]

Option: language for voice typing:

[Screenshot: voice_typing_config_2]

Option: adjust ambient noise:

[Screenshot: voice_typing_config_3]

Option: audio notification when microphone is used:

[Screenshot: voice_typing_config_4]

Option: automatic entry when you finish speaking with microphone:

[Screenshot: automatic_audio_entry]

Troubleshooting: https://github.com/Uberi/speech_recognition#troubleshooting
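
To see how options such as language and ambient-noise adjustment map onto the underlying library, the sketch below records one utterance with the SpeechRecognition package linked above. It assumes that package and PyAudio are installed. The `VoiceTypingConfig` class and its field names are hypothetical stand-ins for the options listed above, not LetMeDoIt's actual configuration keys; Google's web recognizer is used only as an example engine, and the audio notification is simplified to a printed message.

```python
from dataclasses import dataclass


@dataclass
class VoiceTypingConfig:
    """Hypothetical container mirroring some of the options listed above."""
    language: str = "en-US"            # language for voice typing
    adjust_ambient_noise: bool = True  # calibrate against background noise first
    notify: bool = True                # signal when the microphone is in use


def listen_once(config: VoiceTypingConfig) -> str:
    """Record one utterance from the default microphone and return the transcript."""
    # Imported lazily so this module still loads when the package is absent.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        if config.adjust_ambient_noise:
            recognizer.adjust_for_ambient_noise(source)
        if config.notify:
            print("Listening ...")  # stand-in for an audio notification
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio, language=config.language)
    except sr.UnknownValueError:
        return ""  # the speech could not be understood
    except sr.RequestError as error:
        raise RuntimeError(f"speech service unavailable: {error}") from error
```

For instance, `listen_once(VoiceTypingConfig(language="fr-FR"))` would request a French transcription; a working microphone and an internet connection are needed for this particular engine.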

Use Speech-to-text Engines Built-in with Your OS

You can also speak to LetMeDoIt AI via the voice typing tools built into your device's OS:

How to use voice typing on Windows?

Windows has a built-in feature that lets you type by voice. To set it up:

  1. Open the "Settings" application on your Windows system. You can do this by pressing the Windows key and typing "Settings" in the search bar.

  2. In the Settings window, click on the "Ease of Access" category (named "Accessibility" on Windows 11).

  3. On the left side of the Ease of Access settings, you'll find an option called "Speech." Click on it.

  4. In the Speech settings, scroll down until you find the section called "Dictate text and control your device." Make sure the toggle switch for "Speech recognition" is turned ON.

  5. Once you've enabled speech recognition, you can start voice typing in any text field by pressing the Windows key + H. This will activate the speech recognition feature, and you can start speaking to dictate your text.

Windows also provides a list of voice commands that you can use to navigate your computer and perform various tasks. You can access this list by clicking on the "Learn how to use dictation and other voice commands" link in the Speech settings.

How can I use voice typing on macOS?

On macOS, voice typing is known as dictation. It allows you to speak instead of typing and have your words converted into text. To use dictation on macOS, follow these steps:

  1. Open System Preferences by clicking on the Apple menu in the top-left corner of the screen and selecting "System Preferences" ("System Settings" on macOS Ventura and later).

  2. In System Preferences, click on the "Keyboard" icon.

  3. In the Keyboard settings, navigate to the "Dictation" tab.

  4. In the Dictation tab, make sure that "Dictation" is enabled. You can choose to enable it either for "Enhanced Dictation" or "Standard Dictation." Enhanced Dictation allows you to use dictation offline, while Standard Dictation requires an internet connection but takes up less storage space.

  5. Once you have enabled dictation, you can choose a keyboard shortcut to activate it. By default, you can press the fn (function) key twice to start dictation. However, you can change this shortcut to your preference.

  6. Now, whenever you want to use voice typing, simply activate dictation using the keyboard shortcut you set. A small microphone icon will appear on the screen indicating that dictation is active.

  7. Start speaking, and macOS will convert your speech into text in real-time. You can dictate entire paragraphs or dictate specific commands, such as punctuation or formatting.

Dictation on macOS supports various languages, so make sure you have selected the desired language in the Dictation settings.

How can I use voice typing on Ubuntu?

To enable voice typing on Ubuntu, you'll need to follow these steps:

  1. Open the "Settings" application on your Ubuntu system. You can usually find it in the launcher or by searching for it in the overview.

  2. In the Settings window, look for the "Universal Access" category and click on it.

  3. In the Universal Access settings, you should see an option called "Typing Assistive Technologies." Click on it to open the typing settings.

  4. In the typing settings, you'll find an option called "Enable Screen Keyboard." Make sure this option is turned ON.

  5. Once you have enabled the screen keyboard, you can click on the microphone icon on the keyboard to start voice typing.

How can I use voice typing on Android Termux?

Simply use Google voice typing, which is built into Android.
