Skip to content

REST API Reference

This section documents the Voice2Machine Daemon REST API (v0.2.0+).

Updated Architecture

Voice2Machine uses FastAPI for client-server communication, replacing the previous Unix Sockets IPC system. This allows testing endpoints directly with curl or any HTTP client.

General Information

Property Value
Base URL http://localhost:8765
Protocol HTTP/1.1 + WebSocket
Format JSON (UTF-8)
Interactive Docs http://localhost:8765/docs (Swagger UI)

REST Endpoints

POST /toggle

Recording toggle (start/stop). This is the main endpoint used by keyboard shortcuts.

bash curl -X POST http://localhost:8765/toggle | jq

json { "status": "recording", "message": "Recording started", "text": null }

json { "status": "idle", "message": "Transcription complete", "text": "The transcribed text appears here..." }


POST /start

Explicitly start recording. Useful when you need separate start/stop control.

bash curl -X POST http://localhost:8765/start | jq

json { "status": "recording", "message": "Recording started", "text": null }


POST /stop

Stop recording and transcribe captured audio.

bash curl -X POST http://localhost:8765/stop | jq

json { "status": "idle", "message": "Transcription complete", "text": "The transcribed text appears here..." }


POST /llm/process

Process text with LLM (cleanup, punctuation, formatting). Backend is selected based on config.toml.

bash curl -X POST http://localhost:8765/llm/process \ -H "Content-Type: application/json" \ -d '{"text": "hello how are you hope youre well"}' | jq

json { "text": "Hello, how are you? Hope you're well.", "backend": "gemini" }


POST /llm/translate

Translate text to another language using LLM.

bash curl -X POST http://localhost:8765/llm/translate \ -H "Content-Type: application/json" \ -d '{"text": "Good morning", "target_lang": "es"}' | jq

json { "text": "Buenos días", "backend": "gemini" }


GET /status

Returns current daemon state.

bash curl http://localhost:8765/status | jq

json { "state": "idle", "recording": false, "model_loaded": true }

Possible States:

State Description
idle Waiting for commands
recording Recording audio
processing Transcribing or processing with LLM

GET /health

Health check for systemd/monitoring scripts.

bash curl http://localhost:8765/health | jq

json { "status": "ok", "version": "0.2.0" }


WebSocket

WS /ws/events

Real-time event stream. Useful for showing provisional transcription while user speaks.

const ws = new WebSocket('ws://localhost:8765/ws/events');

    ws.onmessage = (event) => {
      const { event: eventType, data } = JSON.parse(event.data);
      console.log(`Event: ${eventType}`, data);
    };
    ```

=== "Connection (Python)"
```python
import asyncio
import websockets

    async def listen():
        async with websockets.connect('ws://localhost:8765/ws/events') as ws:
            async for message in ws:
                print(message)

    asyncio.run(listen())
    ```

**Emitted Events:**

| Event                  | Fields                           | Description                                 |
| ---------------------- | -------------------------------- | ------------------------------------------- |
| `transcription_update` | `text: str`, `final: bool`       | Transcription update (provisional or final) |
| `heartbeat`            | `timestamp: float`, `state: str` | Heartbeat to keep connection alive          |

---

## Data Models

### ToggleResponse

```python
class ToggleResponse(BaseModel):
    status: str      # 'recording' | 'idle'
    message: str     # Descriptive message
    text: str | None # Transcribed text (only on stop)

StatusResponse

class StatusResponse(BaseModel):
    state: str        # 'idle' | 'recording' | 'processing'
    recording: bool   # True if recording
    model_loaded: bool # True if Whisper is in VRAM

LLMResponse

class LLMResponse(BaseModel):
    text: str    # Processed/translated text
    backend: str # 'gemini' | 'ollama' | 'local'

Error Codes

HTTP Code Meaning
200 Successful operation
422 Validation error (invalid payload)
500 Internal server error

Debugging

Use the interactive documentation at http://localhost:8765/docs to test endpoints visually.