🇪🇸 Español: The changelog is available in Spanish at .github/locales/es/CHANGELOG.md.

title: Changelog
description: Change log for the Voice2Machine project.
ai_context: "Versions, Change History, SemVer"
depends_on: []
status: stable

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Zero-Copy Audio Engine: New ZeroCopyAudioRecorder in Rust using shared memory (/dev/shm) for true zero-copy transfers.
Hallucination Detection: Heuristic filters and quality parameters (no_speech, compression_ratio) in StreamingTranscriber to reduce erroneous Whisper outputs.
Performance Metrics: Inference latency tracking in logs for detailed diagnostics.
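
The hallucination filter described above can be pictured roughly as follows. This is a minimal sketch, not the project's StreamingTranscriber code: the helper names are hypothetical, and the thresholds (0.6 and 2.4) are assumptions borrowed from defaults commonly used with Whisper. The compression ratio is computed with zlib, so highly repetitive text scores high.

```python
import zlib

# Assumed thresholds (not taken from the project); tune for your own audio.
NO_SPEECH_THRESHOLD = 0.6
COMPRESSION_RATIO_THRESHOLD = 2.4


def compression_ratio(text: str) -> float:
    """Ratio of raw size to zlib-compressed size; repetitive text scores high."""
    raw = text.encode("utf-8")
    if not raw:
        return 0.0
    return len(raw) / len(zlib.compress(raw))


def looks_hallucinated(text: str, no_speech_prob: float) -> bool:
    """Hypothetical helper: return True when a segment should be discarded."""
    if no_speech_prob > NO_SPEECH_THRESHOLD:
        return True  # the model itself rates the segment as likely silence
    # Looping output ("thank you thank you ...") compresses unusually well.
    return compression_ratio(text) > COMPRESSION_RATIO_THRESHOLD
```

A segment is dropped when either signal fires, which keeps the check cheap enough to run on every provisional result.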

Advanced Whisper Config: Increased beam_size and best_of to 5 for higher transcription quality in the "large-v3-turbo" model.
VAD Optimization: Adjusted default threshold to 0.35 to reduce false positives from ambient noise and breathing.
Memory Management: Forced CUDA cache reset (torch.cuda.empty_cache()) when unloading models to effectively free VRAM.
Code Hygiene: Import refactoring and linting error fixes (ruff) throughout the backend codebase.
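
As an illustration of the memory-management change, the sketch below drops the model reference, runs the garbage collector, and then asks PyTorch to release its cached CUDA blocks. The ModelHolder class and its methods are hypothetical names for this example; only torch.cuda.is_available() and torch.cuda.empty_cache() are real PyTorch calls, and the import is guarded so the sketch also runs on machines without PyTorch.

```python
import gc
from typing import Callable, Optional

try:
    import torch
except ImportError:  # PyTorch is optional for this sketch
    torch = None


class ModelHolder:
    """Hypothetical owner of a loaded model (not the project's class)."""

    def __init__(self) -> None:
        self.model: Optional[object] = None

    def load(self, factory: Callable[[], object]) -> None:
        self.model = factory()

    def unload(self) -> None:
        # Drop the reference first so the tensors become collectable.
        self.model = None
        gc.collect()
        # empty_cache() only returns memory that is no longer referenced.
        if torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()
```

The order matters: calling empty_cache() while the model is still referenced frees nothing, because the allocator can only release blocks that no live tensor is using.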

Feature-Based Architecture: Total restructuring into self-contained modules in features/ (audio, llm, transcription).
Orchestration via Workflows: Introduction of RecordingWorkflow and LLMWorkflow to decouple business logic from the monolithic legacy Orchestrator.
Strict Protocols: Implementation of typing.Protocol for all internal services, allowing easy swapping of providers.
Modular API: Package structure in api/ with separate routes and schemas.
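
To show why typing.Protocol makes providers easy to swap, here is a small sketch. The names TranscriptionService, FakeTranscriber, and run_recording are assumptions made for the example and are not taken from the codebase. Any class with a matching transcribe method satisfies the protocol without inheriting from it.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class TranscriptionService(Protocol):
    """Hypothetical service contract: anything with a matching transcribe()."""

    def transcribe(self, audio: bytes) -> str: ...


class FakeTranscriber:
    """Stand-in provider; note that it does not subclass the protocol."""

    def transcribe(self, audio: bytes) -> str:
        return f"{len(audio)} bytes transcribed"


def run_recording(service: TranscriptionService, audio: bytes) -> str:
    # The workflow depends only on the protocol, not on a concrete provider.
    return service.transcribe(audio)
```

Because the check is structural, a test double or an alternative backend can be passed to the workflow without touching the workflow code.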

Elimination of Orchestrator: services/orchestrator.py has been decomposed and removed.
Infrastructure Refactoring: The infrastructure/ folder has been integrated into each corresponding feature.
Core and Domain: Simplified and moved to shared/ and local interfaces.

FastAPI REST API: New HTTP API replacing the Unix Sockets-based IPC system
WebSocket streaming: /ws/events endpoint for real-time provisional transcription
Swagger documentation: Interactive UI at /docs for testing endpoints
Orchestrator pattern: New coordination pattern that simplifies the workflow
Rust audio engine: Native v2m_engine extension for low-latency audio capture
MkDocs documentation system: Structured documentation with Material theme

Simplified architecture: From CQRS/CommandBus to a more direct Orchestrator pattern
Communication: From binary Unix Domain Sockets to standard HTTP REST
State model: Centralized management in DaemonState with lazy initialization
Updated README.md to reflect the new architecture
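
The lazy initialization mentioned for DaemonState could look something like the sketch below. The constructor argument and the transcriber attribute are assumptions made for illustration; the real class may be organized differently. The point is that an expensive resource is built on first access and reused afterwards.

```python
from functools import cached_property
from typing import Callable


class DaemonState:
    """Sketch of a central state object with lazily created resources."""

    def __init__(self, transcriber_factory: Callable[[], object]) -> None:
        # Store only the factory; nothing expensive happens at startup.
        self._transcriber_factory = transcriber_factory

    @cached_property
    def transcriber(self) -> object:
        # Runs once on first access; the result is cached on the instance.
        return self._transcriber_factory()
```

With this shape the daemon starts quickly, and the cost of loading a model is paid only when the first request actually needs it.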