Kira LOCAL-FIRST
Source-of-truth developer guide

Verified architecture, current implementation, and active-development boundaries.

This section was refreshed against Engram project memory and the current website copy. It separates shipped behavior from old marketing copy and future work, because developer documentation is only useful when it refuses to overpromise.

Audit snapshot

Claims marked Verified (green) were already aligned with project memory. Claims marked Corrected (amber) were fixed in this pass. Blue claims are active-development boundaries that developers should not treat as release guarantees.

Verified

OpenCohost is the orchestration chassis

Correct: the product value is coordinating Ollama, TTS, LiveAudio context, Agenda Mode, chat reaction, profiles, avatar/music presence, and resilience rather than being only a wrapper.

Corrected

The voice story is not just Qwen-TTS

Updated: the verified implementation includes Piper local synthesis, optional Edge-TTS light synthesis when privacy allows it, and active custom/Qwen voice work. The docs no longer promise a guaranteed Qwen-only stack.

Verified

Local-first does not mean compute-free

Correct: local mode avoids default cloud/API billing, but developers must still budget GPU/VRAM, RAM, electricity, model choice, game load, and setup time.

Verified

LiveAudio stays a separate bridge

Correct: voice listening, Silero VAD, Whisper transcription, subtitles, transcripts, and clean voice context belong to the LiveAudio bridge rather than an always-on OpenCohost microphone path.

Corrected

No fake waitlist or installer promise

The rendered FAQ no longer claims a public installer or an email waitlist as if either existed. Access remains in active development until a real release flow is implemented.

Composition root

app_shell.py wires the motor thread, health monitor, OBS client, SmartAggregator, Stream Admin, topic inbox bridge, TTS controls, and UI panels through typed protocols. Keep it orchestration-only; push logic into testable modules.

UIState observer

A framework-agnostic, thread-safe state container with typed properties. Observers dispatch on a daemon thread; UI callbacks must hop back to the Tk main loop before touching widgets.
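
The observer pattern described above can be sketched roughly as follows. This is a minimal illustration, not the project's UIState: the class name `ObservableState`, the key/value API (the real container uses typed properties), and the `flush()` helper are all assumptions made for the example.

```python
import queue
import threading
from typing import Any, Callable

Observer = Callable[[str, Any], None]


class ObservableState:
    """Hypothetical stand-in for UIState: locked values, off-thread notification."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._values: dict[str, Any] = {}
        self._observers: list[Observer] = []
        self._events: queue.Queue = queue.Queue()
        # Daemon thread: it must never keep the process alive at shutdown.
        worker = threading.Thread(target=self._dispatch, daemon=True)
        worker.start()

    def subscribe(self, callback: Observer) -> None:
        with self._lock:
            self._observers.append(callback)

    def get(self, key: str, default: Any = None) -> Any:
        with self._lock:
            return self._values.get(key, default)

    def set(self, key: str, value: Any) -> None:
        with self._lock:
            if key in self._values and self._values[key] == value:
                return  # unchanged values do not notify
            self._values[key] = value
        self._events.put((key, value))

    def _dispatch(self) -> None:
        while True:
            key, value = self._events.get()
            with self._lock:
                observers = list(self._observers)
            for callback in observers:
                try:
                    callback(key, value)  # runs on the dispatch thread, not the UI thread
                except Exception:
                    pass  # one failing observer must not stop dispatch
            self._events.task_done()

    def flush(self) -> None:
        """Block until every queued notification has been delivered."""
        self._events.join()
```

Because callbacks run on the dispatch thread, an observer that needs to touch a widget should only enqueue work for the Tk main loop rather than mutate the widget directly.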

Priority queue + accumulation

The motor thread prioritizes real-time inputs such as PTT, chat, and agenda work while compacting overflow into bounded consultations so local models are not buried by raw stream noise.
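
One way to picture the prioritize-then-compact behavior is the sketch below. The priority numbers, the `BoundedInbox` name, the eviction rule (drop the least urgent, most recent item), and the digest format are assumptions for illustration; the source only states that real-time inputs win and overflow is compacted into bounded consultations.

```python
import heapq
import itertools
from collections import deque
from dataclasses import dataclass, field

# Hypothetical priorities: a lower number is served first.
PRIORITY = {"ptt": 0, "agenda": 1, "chat": 2}


@dataclass(order=True)
class Item:
    priority: int
    seq: int
    source: str = field(compare=False)
    text: str = field(compare=False)


class BoundedInbox:
    def __init__(self, max_items: int = 3, digest_size: int = 5) -> None:
        self._heap: list[Item] = []
        self._seq = itertools.count()
        self._max = max_items
        # Only the most recent overflow lines survive, so the digest stays bounded.
        self._overflow: deque[str] = deque(maxlen=digest_size)

    def push(self, source: str, text: str) -> None:
        heapq.heappush(self._heap, Item(PRIORITY[source], next(self._seq), source, text))
        if len(self._heap) > self._max:
            worst = max(self._heap)  # least urgent, most recent
            self._heap.remove(worst)
            heapq.heapify(self._heap)
            self._overflow.append(worst.text)

    def pop(self) -> "Item | None":
        if self._heap:
            return heapq.heappop(self._heap)
        if self._overflow:
            digest = " | ".join(self._overflow)
            self._overflow.clear()
            # Overflow is handed to the model as a single compact consultation.
            return Item(PRIORITY["chat"], next(self._seq), "digest", digest)
        return None
```

With `max_items=2`, pushing two chat lines, one PTT input, and a third chat line yields the PTT input first, then the oldest chat line, then one digest item containing the two evicted lines.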

LLM tier switching

Manual Quality / Balanced / Fast slots map to Ollama model tags. Switching preserves conversation/profile state and rolls back to the last known good model on failure.
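
The switch-with-rollback idea might look like the sketch below. `TierSwitcher`, the injected `load_model` callable, and the model tags in the usage note are hypothetical; the real engine's loading API is not shown in this guide.

```python
from typing import Callable


class TierSwitcher:
    """Hypothetical sketch: tier slots map to model tags; failures roll back."""

    def __init__(self, tiers: dict[str, str], load_model: Callable[[str], None],
                 initial: str) -> None:
        self._tiers = tiers
        self._load = load_model          # assumed to raise if the model cannot load
        self.active_tier = initial       # last known good tier
        self.history: list[dict] = []    # conversation state is never reset by a switch

    @property
    def active_model(self) -> str:
        return self._tiers[self.active_tier]

    def switch(self, tier: str) -> bool:
        if tier not in self._tiers:
            raise KeyError(f"unknown tier: {tier}")
        previous = self.active_tier
        try:
            self._load(self._tiers[tier])
        except Exception:
            # Roll back: reload the last known good model and keep the old tier.
            self._load(self._tiers[previous])
            return False
        self.active_tier = tier
        return True
```

For example, with tiers `{"quality": "big:latest", "fast": "small:latest"}` (placeholder tags), a failed `switch("quality")` leaves `active_model` on the previous tag and `history` untouched.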

Privacy-aware voice path

The TTS path supports Piper local synthesis, persisted Piper speed presets, and a tts_local_only switch that blocks Edge-TTS before text can leave the machine.

Human-gated topic intake

The topic inbox lets agents propose ideas, but approval remains human-only. Read-time namespace validation, short SQLite timeouts, and rollback keep UI suggestions safe under load.

Avatar state bridge

A pub/sub bridge lets core modules signal avatar states without coupling to UI or OBS. OBSClient can subscribe and update image sources when state changes.
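
A minimal pub/sub sketch of that decoupling follows. The state names, the `AvatarStateBridge` API, and the `RecordingImageSink` class (a stand-in for the OBS side, which here only records which image it would show) are assumptions, not the project's actual interfaces.

```python
import threading
from typing import Callable

Subscriber = Callable[[str], None]


class AvatarStateBridge:
    """Hypothetical bridge: core code publishes states, subscribers react."""

    def __init__(self, initial: str = "idle") -> None:
        self._lock = threading.Lock()
        self._state = initial
        self._subscribers: list[Subscriber] = []

    def subscribe(self, callback: Subscriber) -> Callable[[], None]:
        with self._lock:
            self._subscribers.append(callback)

        def unsubscribe() -> None:
            with self._lock:
                if callback in self._subscribers:
                    self._subscribers.remove(callback)

        return unsubscribe

    def publish(self, state: str) -> None:
        with self._lock:
            if state == self._state:
                return  # only real transitions are broadcast
            self._state = state
            subscribers = list(self._subscribers)
        for callback in subscribers:  # called outside the lock
            callback(state)


class RecordingImageSink:
    """Stand-in for an OBS-facing subscriber; it only records the chosen image."""

    def __init__(self, images: dict[str, str]) -> None:
        self._images = images
        self.shown: list[str] = []

    def on_state(self, state: str) -> None:
        if state in self._images:
            self.shown.append(self._images[state])
```

The publisher never imports the sink, so core modules can signal `"talking"` or `"idle"` without knowing whether a UI, OBS, or nothing at all is listening.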

Degradation ladder

Agenda recovery and model switching prefer graceful degradation: retries, stale-prefetch discard, explicit pause states, and rollback instead of silent deadlocks during a live show.

Recent verified implementation notes

TTS local-only switch

tts_local_only.json is persisted under config. When ON, the motor routes light synthesis to Piper and server_qwen.py returns HTTP 400 before any Edge-TTS call can run.
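
The two gates described above (routing in the motor, refusal in the server) could be sketched like this. The function names, the `"light"` kind label, the fallback to Piper for other kinds, and the injected `edge_synthesize` callable are assumptions; only the ordering and the HTTP 400 response come from the text.

```python
from typing import Callable


def choose_engine(kind: str, local_only: bool) -> str:
    """Motor-side routing: the privacy gate is evaluated before any fallback."""
    if local_only:
        return "piper"
    if kind == "light":
        return "edge-tts"
    return "piper"  # assumption: other kinds stay local in this sketch


def handle_light_request(text: str, local_only: bool,
                         edge_synthesize: Callable[[str], bytes]) -> tuple[int, object]:
    """Server-side guard: refuse with 400 before the remote call can run."""
    if local_only:
        return 400, "edge-tts is disabled while tts_local_only is on"
    return 200, edge_synthesize(text)
```

Keeping the check as the first branch in both places means a routing bug in one layer still cannot send text off the machine.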

Piper speed presets

The UI exposes Rápida, Media, Calma, and Lenta (fast, medium, calm, slow) presets backed by length_scale values. Engine changes persist to tts_speed.json and rebuild Piper synthesis config under lock.
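
A rough sketch of preset persistence follows. The numeric length_scale values, the JSON schema, and the `SpeedSettings` class are illustrative assumptions; the guide only names the presets and the file. In Piper, a larger length_scale produces slower speech.

```python
import json
import threading
from pathlib import Path

# Illustrative values only; the real preset numbers are not documented here.
SPEED_PRESETS = {"Rápida": 0.85, "Media": 1.0, "Calma": 1.15, "Lenta": 1.3}
DEFAULT_PRESET = "Media"


class SpeedSettings:
    def __init__(self, path: Path) -> None:
        self._path = path
        self._lock = threading.Lock()
        self.synthesis_config = {"length_scale": SPEED_PRESETS[self._load()]}

    def _load(self) -> str:
        try:
            preset = json.loads(self._path.read_text(encoding="utf-8"))["preset"]
        except (OSError, ValueError, KeyError, TypeError):
            return DEFAULT_PRESET
        return preset if preset in SPEED_PRESETS else DEFAULT_PRESET

    def apply(self, preset: str) -> None:
        scale = SPEED_PRESETS[preset]  # KeyError for unknown presets
        with self._lock:
            # Rebuild the config and persist it under the same lock.
            self.synthesis_config = {"length_scale": scale}
            payload = json.dumps({"preset": preset, "length_scale": scale},
                                 ensure_ascii=False)
            tmp = self._path.with_name(self._path.name + ".tmp")
            tmp.write_text(payload, encoding="utf-8")
            tmp.replace(self._path)  # swap into place so readers never see a partial file
```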

Agenda double-close fix

The controller no longer prefetches a second kira-agenda-stop while a topic is already CLOSING, and stale prefetched actions are discarded instead of played later.
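
The guard can be pictured with the sketch below. The state enum, the generation counter used to detect stale prefetches, and all method names are assumptions; only the `kira-agenda-stop` action name and the CLOSING state come from the note above.

```python
from enum import Enum, auto


class TopicState(Enum):
    OPEN = auto()
    CLOSING = auto()


class AgendaController:
    """Hypothetical sketch of the double-close guard."""

    def __init__(self) -> None:
        self.state = TopicState.OPEN
        self.generation = 0        # bumped whenever the topic changes
        self._prefetched = None    # (generation, action) or None

    def prefetch_stop(self) -> bool:
        # Refuse a second stop while the topic is already closing.
        if self.state is not TopicState.OPEN or self._prefetched is not None:
            return False
        self._prefetched = (self.generation, "kira-agenda-stop")
        return True

    def begin_close(self) -> None:
        self.state = TopicState.CLOSING

    def next_topic(self) -> None:
        self.generation += 1
        self.state = TopicState.OPEN

    def take_prefetched(self):
        entry, self._prefetched = self._prefetched, None
        if entry is None:
            return None
        generation, action = entry
        # A prefetch made for an earlier topic is discarded, never played late.
        return action if generation == self.generation else None
```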

Topic inbox

Agents can propose topic candidates with the ti_ namespace. UI polling is fail-open, approval is human-only, and failed queue operations roll back approved rows.
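
The inbox rules above can be sketched with the standard `sqlite3` module. The table schema, status values, function names, and the 0.25 second timeout are assumptions for illustration; the `ti_` namespace, fail-open polling, human-only approval, and rollback come from the text.

```python
import sqlite3

NAMESPACE = "ti_"


def open_inbox(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path, timeout=0.25)  # short lock wait keeps the UI responsive
    conn.execute(
        "CREATE TABLE IF NOT EXISTS topics (id TEXT PRIMARY KEY, title TEXT, status TEXT)"
    )
    conn.commit()
    return conn


def propose(conn: sqlite3.Connection, topic_id: str, title: str) -> None:
    if not topic_id.startswith(NAMESPACE):
        raise ValueError("agent proposals must use the ti_ namespace")
    with conn:
        conn.execute("INSERT INTO topics VALUES (?, ?, 'proposed')", (topic_id, title))


def poll_for_ui(conn: sqlite3.Connection) -> list:
    try:
        rows = conn.execute(
            "SELECT id, title FROM topics WHERE status = 'proposed' ORDER BY id"
        ).fetchall()
    except sqlite3.Error:
        return []  # fail open: a busy database shows no suggestions, not an error
    # Read-time validation: foreign or legacy IDs never reach the UI.
    return [row for row in rows if row[0].startswith(NAMESPACE)]


def approve(conn: sqlite3.Connection, topic_id: str, enqueue, approved_by_human: bool) -> bool:
    if not approved_by_human:
        raise PermissionError("approval is human-only")
    try:
        with conn:  # commits on success, rolls back if enqueue raises
            conn.execute("UPDATE topics SET status = 'approved' WHERE id = ?", (topic_id,))
            enqueue(topic_id)
    except Exception:
        return False
    return True
```

Running the queue hand-off inside the same transaction as the status update is what lets a failed enqueue leave the row in its proposed state.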

Gotchas for contributors

Never proxy internal state from external metadata
A real agenda bug came from checking current_speech_source.startswith("kira-agenda") to infer controller state. The controller can emit actions with source="chat", so trust the controller state, not labels attached to an event.
Tk widgets are single-threaded
Any widget mutation must happen on the main loop. The UIState observer dispatch thread is separate, so callbacks must use schedule_ui_update() / after_idle before touching Tk widgets.
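
One common way to make that hop is a queue drained by the main loop, sketched below. This is a swapped-in technique, not the project's schedule_ui_update(): the `UiDispatcher` class and its method names are hypothetical, and `root` is any object with an `after(ms, callback)` method, which `tkinter.Tk` provides.

```python
import queue
from typing import Callable


class UiDispatcher:
    """Hypothetical helper: worker threads post callables, the main loop runs them."""

    def __init__(self, root, poll_ms: int = 30) -> None:
        self._root = root
        self._poll_ms = poll_ms
        self._pending: queue.Queue = queue.Queue()

    def post(self, fn: Callable[[], None]) -> None:
        """Safe from any thread: only enqueues, never touches a widget."""
        self._pending.put(fn)

    def drain(self) -> int:
        """Main thread only: run everything queued so far."""
        ran = 0
        while True:
            try:
                fn = self._pending.get_nowait()
            except queue.Empty:
                return ran
            fn()
            ran += 1

    def pump(self) -> None:
        """Main thread only: drain now, then reschedule on the Tk event loop."""
        self.drain()
        self._root.after(self._poll_ms, self.pump)


# Usage sketch (requires a display, so it is left as a comment):
#   root = tkinter.Tk(); label = tkinter.Label(root); label.pack()
#   ui = UiDispatcher(root); ui.pump()
#   # inside an observer running on the dispatch thread:
#   ui.post(lambda: label.config(text="updated"))
```
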
Reasoning token budget
Models such as qwen3 and gemma can spend part of the token budget on internal reasoning. A hard num_predict cap can yield empty or truncated answers; the engine removes the cap for those families.
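
As a small illustration of the cap removal, the sketch below builds an Ollama `options` dictionary and drops `num_predict` for reasoning families. The function name, the family strings, and the default temperature are assumptions; `num_predict` and `temperature` are standard Ollama options.

```python
# Families named in the note above; the exact matching rule is an assumption.
REASONING_FAMILIES = {"qwen3", "gemma"}


def build_options(family: str, num_predict=None, temperature: float = 0.7) -> dict:
    """Return generation options, omitting the hard cap for reasoning families."""
    options = {"temperature": temperature}
    if num_predict is not None and family not in REASONING_FAMILIES:
        options["num_predict"] = num_predict
    return options
```

With this shape, `build_options("qwen3", num_predict=256)` carries no cap, so internal reasoning cannot consume the whole budget before the visible answer starts.
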
Storage paths resolve at import time
STORAGE_PATHS is resolved when config/storage.py is imported. apply_storage_environment() runs before library initialization; changing storage.yaml mid-runtime will not move already-resolved paths.
Privacy gates must run before convenience fallbacks
The local-only switch must be checked before the Edge-TTS offline/light fallback. Reordering those branches can silently send text to Microsoft even when the user asked for local-only synthesis.
Persisted settings can pollute tests
Any motor command test that writes preferences such as tts_local_only or tts_speed must patch the save/load helper to a temporary path; otherwise it can mutate the user's real config.
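
A test along these lines might look like the sketch below. `Preferences`, `handle_command`, the command string, and the JSON schema are hypothetical stand-ins for the real helpers; the technique shown is patching the persistence target to a temporary path with `unittest.mock`.

```python
import json
import tempfile
import unittest
from pathlib import Path
from unittest import mock


class Preferences:
    """Hypothetical stand-in for the real save/load helper."""
    CONFIG_PATH = Path("config") / "tts_local_only.json"

    @classmethod
    def save_local_only(cls, enabled: bool) -> None:
        cls.CONFIG_PATH.parent.mkdir(parents=True, exist_ok=True)
        cls.CONFIG_PATH.write_text(json.dumps({"enabled": enabled}), encoding="utf-8")


def handle_command(command: str) -> None:
    """Hypothetical motor command handler that persists a preference."""
    if command == "tts_local_only:on":
        Preferences.save_local_only(True)


class LocalOnlyCommandTest(unittest.TestCase):
    def test_command_writes_only_to_temp_path(self) -> None:
        real_path = Preferences.CONFIG_PATH
        with tempfile.TemporaryDirectory() as tmp:
            fake_path = Path(tmp) / "tts_local_only.json"
            # Redirect persistence before the command runs.
            with mock.patch.object(Preferences, "CONFIG_PATH", fake_path):
                handle_command("tts_local_only:on")
                saved = json.loads(fake_path.read_text(encoding="utf-8"))
                self.assertEqual(saved, {"enabled": True})
        # The patch is undone on exit, so later tests see the real path again.
        self.assertEqual(Preferences.CONFIG_PATH, real_path)
```
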
Validate topic namespaces when reading, not only when routing
Hostile or legacy rows can already exist in SQLite. The topic inbox quarantines foreign IDs at read time so render and dismiss paths agree on what the UI owns.
app_shell.py has a hard line budget
The integration guard keeps app_shell.py below 3100 lines. New UI behavior should usually live in a small injected module, with app_shell only wiring it.
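
A line-budget guard of this kind can be very small. The sketch below assumes a helper named `check_line_budget` and a caller-supplied path; only the 3100-line limit comes from the text, and the real integration guard may work differently.

```python
from pathlib import Path

MAX_LINES = 3100  # the file must stay below this many lines


def count_lines(path) -> int:
    with Path(path).open(encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def check_line_budget(path, limit: int = MAX_LINES) -> int:
    """Raise AssertionError when the file reaches or exceeds the budget."""
    lines = count_lines(path)
    if lines >= limit:
        raise AssertionError(f"{path} has {lines} lines; it must stay below {limit}")
    return lines
```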

Extension points

OpenCohost does not have a formal plugin system yet, but developers can extend these surfaces deliberately:

  • Profiles (perfiles.json): system prompt text and use_system flag. Defaults live in config/default_profiles.json.
  • LLM tier slots (llm_tiers.json): quality, balanced, and fast model slots validated against runtime-installed Ollama models at startup.
  • TTS privacy and speed files: tts_local_only.json and tts_speed.json persist user voice policy and Piper length_scale choices.
  • Config YAMLs: storage.yaml, smart_aggregator.yaml, avatar.yaml, and stream_admin.yaml configure paths, chat shaping, OBS/avatar state, OAuth, and moderation.
  • Model catalog (config/settings.py): MODELS_CATALOG entries need display, desc, size_gb, and family metadata before appearing safely in the UI.
  • Protocol system (ui/protocols.py): MotorEventCallback, SmartAggregatorCallbacks, and StreamAdminCallbacks define typed callback contracts.
  • Topic inbox (opencohost/core/topic_inbox.py + ui/topic_inbox_bridge.py): add agent proposals without bypassing human approval.
  • Crash reporting (ui/crash_reporting.py): Python excepthook, threading excepthook, Tk callback hook, and faulthandler cover different failure classes.
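
For the crash reporting entry, the four hooks cover different failure classes and could be installed roughly as follows. The `install_crash_hooks` function and the `report(kind, text)` callback are hypothetical; `sys.excepthook`, `threading.excepthook`, Tk's `report_callback_exception`, and `faulthandler.enable` are the standard mechanisms.

```python
import faulthandler
import sys
import threading
import traceback


def install_crash_hooks(report, fault_file, tk_root=None) -> None:
    """Route each failure class to report(kind, text). Names here are illustrative."""

    def format_exc(exc_type, exc, tb) -> str:
        return "".join(traceback.format_exception(exc_type, exc, tb))

    def on_main_exception(exc_type, exc, tb) -> None:
        report("main", format_exc(exc_type, exc, tb))

    def on_thread_exception(args) -> None:
        name = args.thread.name if args.thread is not None else "unknown"
        report(f"thread:{name}", format_exc(args.exc_type, args.exc_value, args.exc_traceback))

    def on_tk_callback_exception(exc_type, exc, tb) -> None:
        report("tk", format_exc(exc_type, exc, tb))

    sys.excepthook = on_main_exception           # uncaught errors on the main thread
    threading.excepthook = on_thread_exception   # uncaught errors in worker threads
    if tk_root is not None:
        # Tk calls this attribute when a widget callback raises.
        tk_root.report_callback_exception = on_tk_callback_exception
    # Hard crashes (for example segfaults) bypass Python hooks; faulthandler dumps tracebacks.
    faulthandler.enable(file=fault_file)
```

The `fault_file` argument must be a real file object with a file descriptor and must stay open for as long as faulthandler is enabled.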