🕹️ Quick Start
Executive Summary
Voice2Machine has two superpowers: Dictation (Voice → Text) and Refinement (Text → Better Text).
This visual guide helps you understand the main workflows so you can be productive in minutes.
1. Dictation Flow (Voice → Text)
Ideal for: Writing emails, code, or quick messages without touching the keyboard.
- Focus: Click the text field where you want to write.
- Activate the shortcut (configurable; by default it runs v2m-toggle.sh). You'll hear a start sound.
- Speak clearly and naturally; there's no need to sound robotic.
- Press the shortcut again to stop. You'll hear an end sound.
- The text is pasted automatically into the active field (or stays in the clipboard if auto-paste is disabled).
flowchart LR
A((🎤 START)) -->|Record| B{Local Whisper}
B -->|Transcribe| C[Clipboard / Paste]
style A fill:#ff6b6b,stroke:#333,stroke-width:2px,color:white
style B fill:#feca57,stroke:#333,stroke-width:2px
style C fill:#48dbfb,stroke:#333,stroke-width:2px
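
The toggle behavior in this flow (first press starts recording, second press stops and delivers the text) can be sketched as a two-state machine. This is a minimal illustration in Python, not Voice2Machine's actual implementation: `DictationToggle`, `fake_transcribe`, and the `auto_paste` flag are hypothetical names, and the transcriber is a stand-in for the local Whisper step.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DictationToggle:
    """Hypothetical two-state toggle: idle -> recording -> idle."""

    transcribe: Callable[[List[bytes]], str]  # stand-in for local Whisper
    auto_paste: bool = True                   # assumed setting name
    recording: bool = False
    clipboard: str = ""
    pasted: List[str] = field(default_factory=list)
    _chunks: List[bytes] = field(default_factory=list)

    def feed(self, chunk: bytes) -> None:
        """Buffer audio only while a recording is in progress."""
        if self.recording:
            self._chunks.append(chunk)

    def press(self) -> str:
        """Handle one shortcut press and return the resulting state."""
        if not self.recording:
            self._chunks.clear()
            self.recording = True
            return "recording"
        self.recording = False
        text = self.transcribe(self._chunks)
        self.clipboard = text          # the text always lands in the clipboard
        if self.auto_paste:
            self.pasted.append(text)   # simulated paste into the active field
        return "idle"


def fake_transcribe(chunks: List[bytes]) -> str:
    """Placeholder transcriber: decodes the chunks instead of running a model."""
    return " ".join(c.decode() for c in chunks)


toggle = DictationToggle(transcribe=fake_transcribe, auto_paste=False)
toggle.feed(b"ignored")   # not recording yet, so this chunk is dropped
toggle.press()            # first press: start recording
toggle.feed(b"hello")
toggle.feed(b"world")
toggle.press()            # second press: stop and transcribe
print(toggle.clipboard)
```

With `auto_paste=False` the result only reaches the clipboard, which mirrors the fallback described in the last step of the list.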
2. Refinement Flow (Text → AI → Text)
Ideal for: Correcting grammar, translating, or giving professional formatting to a rough draft.
- Select and copy (Ctrl + C) the text you want to improve.
- Activate the AI shortcut (which runs v2m-llm.sh).
- Wait a few seconds while the AI processes the text 🧠.
- The improved text replaces your clipboard contents.
- Paste (Ctrl + V) the result.
flowchart LR
A[Original Text] -->|Copy| B((🧠 AI SHORTCUT))
B -->|Process| C{Local LLM / Gemini}
C -->|Improve| D[✨ Polished Text]
style A fill:#c8d6e5,stroke:#333,stroke-width:2px
style B fill:#5f27cd,stroke:#333,stroke-width:2px,color:white
style C fill:#feca57,stroke:#333,stroke-width:2px
style D fill:#1dd1a1,stroke:#333,stroke-width:2px
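
The refinement flow is essentially "read clipboard, transform, write clipboard". The sketch below illustrates that shape in Python under loud assumptions: `MemoryClipboard`, `BACKENDS`, `tidy`, and `refine_clipboard` are hypothetical names, the clipboard is simulated in memory, and `tidy` is a trivial placeholder where a real backend would call a local LLM or Gemini.

```python
from typing import Callable, Dict


class MemoryClipboard:
    """In-memory stand-in for the system clipboard."""

    def __init__(self, text: str = "") -> None:
        self._text = text

    def read(self) -> str:
        return self._text

    def write(self, text: str) -> None:
        self._text = text


def tidy(text: str) -> str:
    """Placeholder refiner: collapse whitespace and capitalize the first letter."""
    cleaned = " ".join(text.split())
    return cleaned[:1].upper() + cleaned[1:]


# Hypothetical backend registry; a real entry would call Ollama or Gemini.
BACKENDS: Dict[str, Callable[[str], str]] = {"local": tidy}


def refine_clipboard(clipboard: MemoryClipboard, backend: str = "local") -> str:
    """Read the clipboard, refine the text, and write the result back."""
    original = clipboard.read()
    if not original.strip():
        return original  # nothing to refine; leave the clipboard untouched
    improved = BACKENDS[backend](original)
    clipboard.write(improved)
    return improved


clip = MemoryClipboard("  this   is a rough    draft ")
refine_clipboard(clip)
print(clip.read())
```

Keeping the backend behind a plain callable means the local and cloud options can be swapped without touching the clipboard handling.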
💡 Pro Tips
!!! tip "Improve Your Accuracy"
    - Speak fluently: Whisper understands context from complete sentences better than isolated words.
    - Hardware: A noise-canceling microphone dramatically improves results.
    - Configuration: You can adjust the LLM "temperature" in settings to make it more creative or more literal.
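
To see why temperature trades "literal" for "creative", it helps to look at the general mechanism: a language model's raw scores are divided by the temperature before being turned into probabilities. The Python sketch below shows that standard calculation with made-up scores. How Voice2Machine passes the setting to its backend is not covered here, and `softmax_with_temperature` is an illustrative name.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert raw scores to probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At low temperature almost all probability mass goes to the top-scoring token (more literal output); at higher temperature the alternatives become more likely (more varied output).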
Privacy Guaranteed
Dictation is 100% local (it runs on your GPU). Refinement can run locally (Ollama) or in the cloud (Gemini); you choose which in settings.