Initial commit

2026-04-24 19:18:15 +08:00
commit fbcbe08696
555 changed files with 96692 additions and 0 deletions

docs/.gitignore

@@ -0,0 +1,26 @@
# deps
/node_modules
# generated content
.source
# test & build
/coverage
/.next/
/out/
/build
*.tsbuildinfo
# misc
.DS_Store
*.pem
/.pnp
.pnp.js
npm-debug.log*
yarn-debug.log*
yarn-error.log*
# others
.env*.local
.vercel
next-env.d.ts

docs/PROJECT_STATUS.md

@@ -0,0 +1,634 @@
# Voicebox Project Status & Roadmap
> Last updated: 2026-04-18 | Current version: **v0.4.1** | 232 open issues | 12 open PRs
---
## Table of Contents
1. [Architecture Overview](#architecture-overview)
2. [Current State](#current-state)
3. [Open PRs — Triage & Analysis](#open-prs--triage--analysis)
4. [Open Issues — Categorized](#open-issues--categorized)
5. [Existing Plan Documents — Status](#existing-plan-documents--status)
6. [New Model Integration — Landscape](#new-model-integration--landscape)
7. [Architectural Bottlenecks](#architectural-bottlenecks)
8. [Recommended Priorities](#recommended-priorities)
---
## Architecture Overview
**Tauri shell (Rust)** hosts a **React frontend** (`app/`) that talks over HTTP on `localhost:17493` to a **FastAPI backend** (`backend/`).
The backend exposes:
- **`TTSBackend` Protocol** with seven concrete engine implementations:
- Qwen3-TTS (PyTorch or MLX depending on platform)
- Qwen CustomVoice (predefined speakers with instruct)
- LuxTTS (fast, CPU-friendly)
- Chatterbox Multilingual (23 languages)
- Chatterbox Turbo (English, paralinguistic tags)
- TADA (1B English, 3B multilingual via HumeAI)
- Kokoro 82M (pre-built voices, CPU realtime)
- **`STTBackend` Protocol** for Whisper (PyTorch or MLX-Whisper)
- **Profiles / History / Stories** services for persistence and timeline editing
### Key Files
| Layer | File | Purpose |
|-------|------|---------|
| Backend entry | `backend/main.py` | FastAPI app, all API routes (~2850 lines) |
| TTS protocol | `backend/backends/__init__.py:32-101` | `TTSBackend` Protocol definition |
| Model registry | `backend/backends/__init__.py:17-29,153-366` | `ModelConfig` dataclass + registry helpers |
| TTS factory | `backend/backends/__init__.py:382-426` | Thread-safe engine registry (double-checked locking) |
| PyTorch TTS | `backend/backends/pytorch_backend.py` | Qwen3-TTS via `qwen_tts` package |
| MLX TTS | `backend/backends/mlx_backend.py` | Qwen3-TTS via `mlx_audio.tts` |
| LuxTTS | `backend/backends/luxtts_backend.py` | LuxTTS — fast, CPU-friendly |
| Chatterbox MTL | `backend/backends/chatterbox_backend.py` | Chatterbox Multilingual — 23 languages |
| Chatterbox Turbo | `backend/backends/chatterbox_turbo_backend.py` | Chatterbox Turbo — English, paralinguistic tags |
| TADA | `backend/backends/hume_backend.py` | HumeAI TADA — 1B English + 3B Multilingual |
| Kokoro | `backend/backends/kokoro_backend.py` | Kokoro 82M — CPU realtime, pre-built voices |
| Qwen CustomVoice | `backend/backends/qwen_custom_voice_backend.py` | Qwen CustomVoice — predefined speakers with instruct |
| Platform detect | `backend/platform_detect.py` | Apple Silicon → MLX, else → PyTorch |
| API types | `backend/models.py` | Pydantic request/response models |
| HF progress | `backend/utils/hf_progress.py` | HFProgressTracker (tqdm patching for download progress) |
| Audio utils | `backend/utils/audio.py` | `trim_tts_output()`, normalize, load/save audio |
| Frontend API | `app/src/lib/api/client.ts` | Hand-written fetch wrapper |
| Frontend types | `app/src/lib/api/types.ts` | TypeScript API types |
| Engine selector | `app/src/components/Generation/EngineModelSelector.tsx` | Shared engine/model dropdown |
| Generation form | `app/src/components/Generation/GenerationForm.tsx` | TTS generation UI |
| Floating gen box | `app/src/components/Generation/FloatingGenerateBox.tsx` | Compact generation UI |
| Model manager | `app/src/components/ServerSettings/ModelManagement.tsx` | Model download/status/progress UI |
| GPU acceleration | `app/src/components/ServerSettings/GpuAcceleration.tsx` | CUDA backend swap UI |
| Gen form hook | `app/src/lib/hooks/useGenerationForm.ts` | Form validation + submission |
| Language constants | `app/src/lib/constants/languages.ts` | Per-engine language maps |
### How TTS Generation Works (Current Flow)
```
POST /generate
1. Look up voice profile from DB
2. Resolve engine from request (qwen | qwen_custom_voice | luxtts | chatterbox | chatterbox_turbo | tada | kokoro)
3. Get backend: get_tts_backend_for_engine(engine) # thread-safe singleton per engine
4. Check model cache → if missing, trigger background download, return HTTP 202
5. Load model (lazy): tts_backend.load_model(model_size)
6. Create voice prompt: profiles.create_voice_prompt_for_profile(engine=engine)
→ tts_backend.create_voice_prompt(audio_path, reference_text)
7. Generate: tts_backend.generate(text, voice_prompt, language, seed, instruct)
8. Post-process: trim_tts_output() for Chatterbox engines
9. Save WAV → data/generations/{id}.wav
10. Insert history record in SQLite
11. Return GenerationResponse
```
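The pipeline above can be condensed into plain Python. The helper names (`load_model`, `create_voice_prompt`, `generate`) come from the backend contract described earlier; the stub response type, ID scheme, and in-memory history list here are invented for illustration, not the real `main.py` code:

```python
from dataclasses import dataclass


@dataclass
class GenerationResponse:
    """Illustrative stand-in for the Pydantic response model in backend/models.py."""
    generation_id: str
    audio_path: str


def handle_generate(engine: str, text: str, profile: dict, language: str,
                    seed: int, backend, history: list) -> GenerationResponse:
    """Sketch of the POST /generate pipeline (steps 3-11); `backend` is any TTS engine object."""
    backend.load_model(profile.get("model_size", "1.7B"))            # 5. lazy model load
    prompt = backend.create_voice_prompt(profile["audio_path"],      # 6. per-engine voice prompt
                                         profile["reference_text"])
    audio = backend.generate(text, prompt, language, seed)           # 7. synthesis
    gen_id = f"{len(history) + 1:06d}"                               # toy ID scheme
    path = f"data/generations/{gen_id}.wav"                          # 9. save location
    history.append({"id": gen_id, "engine": engine, "path": path})   # 10. history record
    return GenerationResponse(generation_id=gen_id, audio_path=path) # 11. response
```

Steps 1-2 (profile lookup, engine resolution) and 4 (cache check / HTTP 202) happen before this function would be reached; step 8 (trim) only applies to the Chatterbox engines.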
---
## Current State
### What's Shipped (v0.4.x)
**New since v0.3.0:**
- Kokoro 82M TTS engine + voice profile type system (PR #325)
- Qwen CustomVoice preset engine — predefined speakers with instruct support (PR #328)
- Intel Arc (XPU) GPU support (PR #320)
- Blackwell GPU (sm_120) CUDA support (PR #401)
- Generation cancellation flow (PR #444)
- Frontend quality gates + TypeScript hardening (PR #418)
- macOS Intel (x86_64) PyTorch compatibility (PR #416)
- Frozen-binary import fixes for Kokoro / Chatterbox Multilingual / scipy / transformers (PR #438)
- Linux PipeWire/PulseAudio monitor detection (PR #457)
- Server survives GUI close on Windows (PR #402)
- GPU arch compatibility warning on startup (catches unsupported PyTorch builds)
- cpal Stream playback reliability (PR #405), clip-splitting stability (PR #403)
- torch.from_numpy crash with numpy 2.x in frozen binary (PR #361)
- Async CUDA download lock (PR #428), NUMBA_CACHE_DIR env var (PR #425)
- "Clear failed" history button (PR #412)
- External server GUI startup + data refresh (PR #319)
- Force offline mode for cached Qwen/Whisper models (PR #318)
- macOS 11 ScreenCaptureKit launch crash fix (PR #424)
**Core TTS (cumulative):**
- Qwen3-TTS voice cloning (1.7B and 0.6B models, MLX + PyTorch)
- Qwen CustomVoice (preset speakers, instruct)
- LuxTTS — fast, CPU-friendly English TTS (PR #254)
- Chatterbox Multilingual — 23 languages including Hebrew (PR #257)
- Chatterbox Turbo — paralinguistic tags, low latency English (PR #258)
- HumeAI TADA — 1B English + 3B Multilingual (PR #296)
- Kokoro 82M — CPU-realtime, 8 languages, Apache 2.0 (PR #325)
- Multi-engine architecture with thread-safe backend registry (PR #254)
- Chunked TTS generation — engine-agnostic, removes ~500 char limit (PR #266)
- Async generation queue (PR #269)
- Post-processing audio effects system (PR #271)
- Voice profile type system (preset vs cloned, engine compatibility gating)
- Centralized `ModelConfig` registry — no per-engine dispatch maps
- Shared `EngineModelSelector` component
**Infrastructure (cumulative):**
- CUDA backend swap via binary download (PR #252), cu128 upgrade (PR #316), Blackwell/sm_120 (PR #401)
- CUDA backend split into independently versioned server + libs archives (PR #298)
- Intel Arc XPU support (PR #320)
- Docker + web deployment (PR #161)
- Backend refactor: modular architecture, style guide, tooling (PR #285)
- Settings overhaul: routed sub-tabs, server logs, changelog, about page (PR #294)
- Windows support: CUDA detection, cross-platform justfile, server lifecycle (PR #272, #402)
- Linux audio capture via pactl monitor detection (PR #457)
- macOS Intel x86_64 compatibility (PR #416)
- Voice profiles with multi-sample support
- Stories editor (multi-track DAW timeline)
- Whisper transcription (base, small, medium, large, turbo variants)
- Model management UI with inline download progress + folder migration (PR #268)
- Download cancel/clear UI with error panel (PR #238)
- Generation history with caching and cancellation (PR #444)
- Streaming generation endpoint (MLX only)
- Audio player freeze fix + UX improvements (PR #293)
- CORS restriction to known local origins (PR #88)
### Abandoned / Backlogged Integrations
| Model | PR / Branch | Reason |
|-------|-------------|--------|
| **CosyVoice2/3** | PR #311 | Output quality too poor. Heavy deps, no PyPI, needed 5+ shims. PR should be closed. |
| **VoxCPM 1.5 / VoxCPM2** | `voicebox-new-models` research (2026-04-18) | **Backlogged.** See detailed analysis below. |
#### VoxCPM — Evaluation Notes (2026-04-18)
**Project:** [OpenBMB/VoxCPM](https://github.com/OpenBMB/VoxCPM) — tokenizer-free TTS, 2B params (VoxCPM2), end-to-end diffusion autoregressive architecture, 30 languages, 48 kHz output, Apache 2.0, `pip install voxcpm`.
**Why it looked interesting:**
- Clean PyPI install (`pip install voxcpm`)
- Apache 2.0 — commercially safe
- Voice cloning via `reference_wav_path` with optional `prompt_wav_path` + `prompt_text` for "ultimate" cloning
- Streaming API via `generate_streaming()`
- Zero-shot cloning + style control via parenthetical prefixes in text (`(slightly faster, cheerful tone)...`)
- Relatively high-quality output per demos
**Why we backlogged it:**
- **Effectively CUDA-only.** README states `CUDA ≥ 12.0` as hard requirement. Source code's `from_pretrained(device=None|"auto")` claims "preferring CUDA, then MPS, then CPU," but in practice:
- **MPS (Apple Silicon) broken upstream** — OpenBMB/VoxCPM issues #232 (`NotImplementedError: Output channels > 65536 not supported at the MPS device`) and #248 (`IndexError` on M3 Mac) are both open with no resolution.
- **CPU unsupported in the Python package** — issue #256 shows `voxcpm --device cpu` rejected with `unrecognized arguments`. The only CPU path is the third-party **VoxCPM.cpp** GGML engine, which is a separate ecosystem project, not `pip install voxcpm`.
- **macOS source install fails** — issue #233 open with no resolution.
- Would require CUDA-only gating in UI (new `requires_cuda` flag on `ModelConfig`, lock icon + "Requires NVIDIA GPU" in `ModelManagement.tsx` / `EngineModelSelector.tsx`) plus a hard error at `load_model()` as safety net. Doable but adds first-class platform gating that doesn't exist for any other engine today.
- Voicebox's user base skews Apple Silicon (MLX is a primary backend). Shipping a CUDA-only model sets a precedent worth a separate scoping discussion (see issues #419 engine sprawl, #420 platform tiers, PR #465).
**What would change the decision:**
- Upstream fixes MPS crashes (watch issues #232, #248).
- We define an "experimental / CUDA-only" engine tier as part of issue #419 / PR #465, and decide it's acceptable to ship engines that are hidden on non-NVIDIA platforms.
- VoxCPM.cpp matures into a viable CPU path we can wrap (currently separate project, C++/GGML, unclear ergonomics).
**Integration shape if we revive it:** Zero-shot cloning maps naturally to the Chatterbox-style backend (store `ref_audio` + `ref_text` paths in the voice prompt dict, process at generate time). Est. ~250 lines for `voxcpm_backend.py` + one `ModelConfig` entry + engine registration in `backends/__init__.py`. Frontend UI gating is the bigger lift.
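The backend-side safety net could be as small as a flag on `ModelConfig` plus a guard at load time. A sketch under stated assumptions — `requires_cuda` and `cuda_available` are hypothetical names (no such flag exists today), and the real `ModelConfig` has more fields than shown:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Trimmed-down illustration; the real dataclass lives in backend/backends/__init__.py."""
    engine: str
    repo_id: str
    requires_cuda: bool = False  # hypothetical flag; no current engine would set it


def check_platform(config: ModelConfig, cuda_available: bool) -> None:
    """Hard error at load_model() time so a CUDA-only engine never half-loads on MPS/CPU."""
    if config.requires_cuda and not cuda_available:
        raise RuntimeError(
            f"{config.engine} requires an NVIDIA GPU (CUDA); "
            "it should be hidden or locked in the UI on this platform."
        )


# Illustrative entry only — repo id is a guess, not a registry decision.
voxcpm = ModelConfig(engine="voxcpm", repo_id="OpenBMB/VoxCPM", requires_cuda=True)
check_platform(voxcpm, cuda_available=True)  # passes on CUDA hosts
```

The UI gating (lock icon, "Requires NVIDIA GPU" label) remains the larger share of the work; this guard is only the backend backstop.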
### What's In-Flight
| Feature | Branch/PR | Status |
|---------|-----------|--------|
| Platform support tiers | PR #465, issue #420 | Defining tier-1 (supported) vs tier-2 (community) platforms |
| Engine sprawl cleanup | issue #419 | First-class vs experimental TTS backends distinction |
| Frontend tech-debt burn-down | issue #421 | Biome + a11y debt before gating CI |
| Docker registry auto-publish | PR #463, issue #453 | ghcr.io image on tag push |
| New model research | `voicebox-new-models` branch | Evaluating Fish Speech, XTTS-v2, Pocket TTS, VibeVoice, Fish Audio S2, index-tts2 |
### TTS Engine Comparison
| Engine | Model Name | Profile Type | Languages | Size | Key Features | Instruct Support |
|--------|-----------|--------------|-----------|------|-------------|-----------------|
| Qwen3-TTS 1.7B | `qwen-tts-1.7B` | Cloned | 10 (zh, en, ja, ko, de, fr, ru, pt, es, it) | ~3.5 GB | Highest quality, voice cloning | None (Base model has no instruct path) |
| Qwen3-TTS 0.6B | `qwen-tts-0.6B` | Cloned | 10 | ~1.2 GB | Lighter, faster | None |
| Qwen CustomVoice 1.7B | `qwen-custom-voice-1.7B` | Preset | 10 | ~3.5 GB | Predefined speakers, instruct support | **Yes** |
| Qwen CustomVoice 0.6B | `qwen-custom-voice-0.6B` | Preset | 10 | ~1.2 GB | Predefined speakers, instruct support | **Yes** |
| LuxTTS | `luxtts` | Cloned | English | ~300 MB | CPU-friendly, 48 kHz, fast | None |
| Chatterbox | `chatterbox-tts` | Cloned | 23 (incl. Hebrew, Arabic, Hindi, etc.) | ~3.2 GB | Zero-shot cloning, multilingual | Partial — `exaggeration` float (0-1) |
| Chatterbox Turbo | `chatterbox-turbo` | Cloned | English | ~1.5 GB | Paralinguistic tags ([laugh], [cough]), 350M params, low latency | Partial — inline tags only |
| TADA 1B | `tada-1b` | Cloned | English | ~4 GB | HumeAI speech-language model, 700s+ coherent audio | None |
| TADA 3B Multilingual | `tada-3b-ml` | Cloned | 10 (en, ar, zh, de, es, fr, it, ja, pl, pt) | ~8 GB | Multilingual, text-acoustic dual alignment | None |
| Kokoro 82M | `kokoro` | Preset | 8 (en, es, fr, hi, it, pt, ja, zh) | ~350 MB | 82M params, CPU realtime, Apache 2.0, pre-built voices | None |
### Multi-Engine Architecture (Shipped)
- **Thread-safe backend registry** (`_tts_backends` dict + `_tts_backends_lock`) with double-checked locking
- **Per-engine backend instances** — each engine gets its own singleton, loaded lazily
- **Engine field on GenerationRequest** — frontend sends `engine: 'qwen' | 'qwen_custom_voice' | 'luxtts' | 'chatterbox' | 'chatterbox_turbo' | 'tada' | 'kokoro'`
- **Per-engine language filtering** — `ENGINE_LANGUAGES` map in frontend, backend regex accepts all languages
- **Per-engine voice prompts** — `create_voice_prompt_for_profile()` dispatches to the correct backend
- **Profile type system** — preset vs cloned profiles, UI grays out incompatible engines and auto-switches on selection
- **Trim post-processing** — `trim_tts_output()` for Chatterbox engines (cuts trailing silence/hallucination)
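The double-checked locking pattern behind the registry, sketched for reference — the real implementation is at `backend/backends/__init__.py:382-426`; the `_make_backend` factory here is a stand-in for the per-engine constructors:

```python
import threading

_tts_backends: dict[str, object] = {}
_tts_backends_lock = threading.Lock()


def _make_backend(engine: str) -> object:
    """Stand-in for the real per-engine constructor (e.g. instantiating KokoroBackend)."""
    return object()


def get_tts_backend_for_engine(engine: str) -> object:
    """Return the per-engine singleton, creating it at most once across threads."""
    backend = _tts_backends.get(engine)          # 1st check: lock-free fast path
    if backend is None:
        with _tts_backends_lock:
            backend = _tts_backends.get(engine)  # 2nd check: another thread may have won
            if backend is None:
                backend = _make_backend(engine)
                _tts_backends[engine] = backend
    return backend
```

The second check under the lock is what makes it safe: two threads can both fail the first check, but only one constructs the backend, and the loser picks up the winner's instance.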
### Known Limitations
- **HF XET progress**: Large files downloaded via `hf-xet` (HuggingFace's new transfer backend) report `n=0` in tqdm updates. Progress bars may appear stuck for large `.safetensors` files even though the download is proceeding. This is a known upstream limitation.
- **Chatterbox Turbo upstream token bug**: `from_pretrained()` passes `token=os.getenv("HF_TOKEN") or True` which fails without a stored HF token. Our backend works around this by calling `snapshot_download(token=None)` + `from_local()`.
- **chatterbox-tts must be installed with `--no-deps`**: it pins `numpy<1.26`, `torch==2.6.0`, `transformers==4.46.3` — all incompatible with our stack (Python 3.12, torch 2.10, transformers 4.57.3). Its sub-dependencies are listed explicitly in `requirements.txt`.
- **Instruct parameter partially shipped** (#224, #303): Qwen CustomVoice (PR #328) now provides real instruct support via predefined speakers. Other backends still silently drop the instruct field — the UI exposes the field broadly but most engines ignore it. The floating generate box was patched to restore instruct for CustomVoice (commit `106aec4`).
- **Streaming generation** only works for Qwen on MLX. Other engines use the non-streaming `/generate` endpoint.
- **dicta-onnx** (Hebrew diacritization) not included — an upstream Chatterbox bug: `Dicta()` requires a `model_path` argument but Chatterbox calls it with none. Hebrew works fine without it.
- **Blackwell (RTX 50-series) CUDA**: cu128 + sm_120 kernel support shipped (PR #401, #316), but users still report `cudaErrorNoKernelImageForDevice` (#417, #400, #396, #395, #390, #362) — likely a stale CUDA binary on upgraded installs. Needs a follow-up diagnostic / forced re-download path.
- **Long text 50k character limit** (#464, #365, #354): Still hit on GPU despite chunking (PR #266). Chunking reliability needs another pass.
- **ROCm on RDNA 3/4** (#469): `HSA_OVERRIDE_GFX_VERSION` is hardcoded and harms newer cards.
- **`flash-attn is not installed` warning on every platform (cosmetic, common user complaint)**: Our transformer-based engines (Chatterbox / Qwen) emit `Warning: flash-attn is not installed. Will only run the manual PyTorch version. Please install flash-attn for faster inference.` on every startup, on every platform — we don't pin `flash-attn` in requirements because installing it is fragile and version-sensitive. The fallback is PyTorch SDPA, which is near-FA2 throughput on Ampere+ and is what actually runs. **Per-platform reality:** (a) **macOS/Apple Silicon** — FlashAttention is CUDA-only, irrelevant here; MLX has its own attention kernels. (b) **Linux** — `pip install flash-attn --no-build-isolation` works but takes 20+ min to compile. (c) **Windows** — no official support (Dao-AILab README still says only "Might work"; source builds routinely fail on recent CUDA/MSVC, issues #1715, #1828, #2395). Windows users can install community prebuilt wheels from `kingbri1/flash-attention` or `bdashore3/flash-attention` (latest v2.8.3, Aug 2025; `win_amd64` wheels for CUDA 12.4/12.8, Torch 2.6-2.9, Python 3.10-3.13) matching their exact CUDA/Torch/Python, or use WSL2. **Native-Windows alternatives worth considering as a build-time swap:** SageAttention (thu-ml, Apache 2.0, claims 2-5× over FA2) and xformers (official Windows wheels). **Action for us:** the troubleshooting doc now covers it (see `docs/content/docs/overview/troubleshooting.mdx`), and we should optionally suppress the warning via `logging.getLogger(...).setLevel(ERROR)` at backend import, since the fallback is functionally fine.
- **WebAudio playback dies after audio-session interruption** (#41, plus an internal repro where the app is backgrounded long enough): WaveSurfer's `AudioContext` gets suspended by macOS — either because another app grabs the audio output, or because the WKWebView throttles when backgrounded. `play()` resolves and `timeupdate` can still fire, but no audio reaches the output. Only app restart fixes it. **Things already tried that didn't work:** (a) swapping WaveSurfer backend away from WebAudio — introduced more bugs, not an option; (b) remount hook on the player — doesn't help because a freshly-created `AudioContext` is born suspended and only resumes on a user gesture. PR #293 was a prior partial fix that doesn't cover this path. **Next thing to try** (not yet attempted — confirmed via grep of `AudioPlayer.tsx`): call `wavesurfer.getMediaElement().getGainNode().context.resume()` on the play button click (the click itself is a valid user gesture), plus a `visibilitychange` + `statechange` listener as belt-and-suspenders. The `ctx.resume()` pattern already exists in the codebase at `useStoryPlayback.ts:52` — just not wired into the main player.
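For the flash-attn warning above, the suppression is a two-liner at backend import. A sketch — the actual logger name is engine-dependent and deliberately left unresolved here (`NOISY_LOGGER_NAME` is a hypothetical placeholder); if the message is emitted via `warnings` rather than `logging`, `warnings.filterwarnings` is the equivalent knob:

```python
import logging

# Hypothetical placeholder — substitute the logger that actually emits the warning.
NOISY_LOGGER_NAME = "some.engine.module"


def quiet_flash_attn_warning() -> None:
    """Raise the noisy logger's threshold so WARNING-level chatter is dropped but errors survive."""
    logging.getLogger(NOISY_LOGGER_NAME).setLevel(logging.ERROR)


quiet_flash_attn_warning()
assert not logging.getLogger(NOISY_LOGGER_NAME).isEnabledFor(logging.WARNING)
assert logging.getLogger(NOISY_LOGGER_NAME).isEnabledFor(logging.ERROR)
```

Scoping the level to one named logger (rather than the root logger) keeps genuine errors from the same module visible.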
---
## Open PRs — Triage & Analysis
### Recently Merged (Since Last Update — 2026-03-18 → 2026-04-18)
| PR | Title | Merged |
|----|-------|--------|
| **#481** | fix(build): pin transformers in MLX requirements to prevent 5.x upgrade | 2026-04-19 |
| **#470** | fix(api-client): declare moved + errors on migrateModels response type | 2026-04-18 |
| **#457** | fix(linux): use pactl to detect PipeWire/PulseAudio monitor | 2026-04-18 |
| **#450** | docs: clarify paralinguistic tag support in quick start | 2026-04-18 |
| **#447** | fix: delete version rows and files in delete_generations_by_profile | 2026-04-18 |
| **#444** | Fix generation cancellation flow | 2026-04-18 |
| **#440** | fix(paths): strip legacy "data/" prefix when resolving stored paths | 2026-04-18 |
| **#439** | Fix migration dialog hanging when no models are present | 2026-04-18 |
| **#438** | fix(build): repair frozen-binary imports for kokoro/chatterbox-multilingual/scipy/transformers | 2026-04-18 |
| **#433** | fix: warn user when no models to migrate during storage change | 2026-04-18 |
| **#425** | Add NUMBA_CACHE_DIR environment variable | 2026-04-16 |
| **#424** | fix: avoid ScreenCaptureKit launch crash on macOS 11 | 2026-04-16 |
| **#418** | Frontend quality gates + TypeScript hardening | 2026-04-18 |
| **#416** | fix(deps): relax PyTorch requirement for macOS Intel (x86_64) | 2026-04-16 |
| **#412** | feat(history): add "Clear failed" button | 2026-04-16 |
| **#405** | fix: keep cpal Stream alive until playback completes | 2026-04-16 |
| **#403** | fix: prevent intermittent clip splitting failures | 2026-04-16 |
| **#402** | fix: reliably keep server alive after GUI close on Windows | 2026-04-16 |
| **#401** | feat: add Blackwell GPU (sm_120) CUDA support | 2026-04-16 |
| **#394** | fix(history): populate status/error/engine fields from DB row | 2026-04-16 |
| **#384** | Fix: Resolve ModuleNotFoundError in effects service | 2026-04-16 |
| **#361** | fix: torch.from_numpy crash with numpy 2.x in frozen binary | 2026-04-16 |
| **#345** | Fix: "Failed to Save" preset error by resolving backend import path | 2026-03-22 |
| **#344** | fix: include changelog in docker web build | 2026-03-27 |
| **#332** | Fix links in Get Started section of index.mdx | 2026-03-21 |
| **#328** | feat: add Qwen CustomVoice preset engine | 2026-03-27 |
| **#325** | feat: Kokoro 82M TTS engine + voice profile type system | 2026-03-20 |
| **#321** | fix: allows deletion of failed generations | 2026-03-19 |
| **#320** | feat: Intel Arc (XPU) GPU support | 2026-03-21 |
| **#319** | fix: GUI startup with external server + data refresh on server switch | 2026-03-27 |
| **#318** | fix: force offline mode when loading cached models (Qwen TTS & Whisper) | 2026-03-21 |
| **#316** | Upgrade CUDA backend from cu126 to cu128, fix GPU settings UI | 2026-03-18 |
### Currently Open (12 PRs)
| PR | Title | Status | Notes |
|----|-------|--------|-------|
| **#465** | docs: define tier-1 and tier-2 platform support targets | Community PR | Pairs with issue #420. Important for scoping. |
| **#463** | feat(actions): add docker-registry.yml for automatic ghcr.io publishing | Community PR | Pairs with issue #453. Low risk. |
| **#443** | fix: prevent infinite retry loop in offline mode (#434) | Community PR | Fixes reported bug. |
| **#430** | feat: add MiniMax TTS provider support | Community PR | Cloud TTS provider — new direction (external API). Superset of #331? |
| **#331** | feat: add MiniMax Cloud TTS as a built-in engine | Community PR | Likely superseded by #430. Dedupe. |
| **#311** | feat: add CosyVoice2/3 TTS engine | **Close** | Abandoned — output quality too poor. |
| **#253** | Enhance speech tokenizer with 48kHz version | Community PR | Qwen tokenizer upgrade. Still worth reviewing. |
| **#227** | fix: harden input validation & file safety | Community PR | Coupled to #225 (custom models). |
| **#225** | feat: custom HuggingFace voice model support | Community PR | Needs rework for multi-engine arch. |
| **#195** | feat: per-profile LoRA fine-tuning | Draft | Complex. 15 new endpoints. |
| **#154** | feat: Audiobook tab | Community PR | Chunked generation now shipped (#266). |
| **#91** | fix: CoreAudio device enumeration | Draft | macOS audio device handling. |
---
## Open Issues — Categorized
### GPU / Hardware Detection — still the top category
**RTX 50-series (Blackwell / sm_120) cluster — NEW:** #417, #400, #396, #395, #390, #362 all report `cudaErrorNoKernelImageForDevice` / "no kernel image available." sm_120 support shipped in PR #401 + cu128 in PR #316, but users on upgraded installs still hit it — likely stale CUDA binary. Needs a diagnostic that detects binary/GPU-arch mismatch and prompts re-download.
**AMD / ROCm — NEW:** #469 `HSA_OVERRIDE_GFX_VERSION` is hardcoded and breaks RDNA 3/4 cards. #313 DirectML on AMD Ryzen AI Max+ 395 not working.
**Intel Arc:** PR #320 shipped XPU support — may resolve #119.
**General GPU-not-detected (older):** #368, #310, #330, #324, #326, #355 (multi-GPU / eGPU).
**Fix path:** CUDA backend swap (PR #252) + cu128 (PR #316) + sm_120 (PR #401) + GPU-arch warning (`73170d0`) are all in. Remaining work is diagnostics + re-download prompts for users whose binary predates the kernel updates.
### Model Downloads
Still reported. Users get stuck downloads, can't resume, offline mode edge cases.
**Key issues:** #475 (macOS CustomVoice install error), #449 (infinite loading macOS), #445 (can't download CustomVoice), #462 (Qwen requires internet even when loaded — regression from #150), #434 (infinite retry loop offline — PR #443 open), #432 (storage location change hangs when empty — partly fixed by PR #439/#433), #348 (TADA 3B Multilingual download fails), #336 (TADA model not listed in app), #275 (`No module named 'chatterbox'` on download), #304 (whisper-base feature extractor load error), #287 (macOS ARM `check_model_inputs` ImportError on new version), #181, #180.
**Fix path:** PR #443 addresses infinite offline retry. CustomVoice-specific download failures (#475, #445) need triage — likely related to frozen-binary import fixes in PR #438. TADA cluster (#336, #348) and macOS ARM import regressions (#287, #275, #304) need a dedicated triage pass.
**Qwen 0.6B-downloads-1.7B reports:** **#485** (2026-04-19), **#423** (macOS M1), **#329**. Originally a stale-fallback bug: `mlx-community/Qwen3-TTS-12Hz-0.6B-Base-bf16` wasn't published when MLX support shipped, so the 0.6B slot was aliased to the 1.7B repo. The 0.6B bf16 conversion is live now and both `backend/backends/mlx_backend.py` and `backend/backends/__init__.py` point at their correct repos. Qwen CustomVoice is unaffected — it runs via PyTorch on all platforms, both sizes always have dedicated repos.
### Language Requests (ongoing)
Strong demand: Hungarian (#479), Indonesian (#458, #247), Thai (#455), Bangla (#454), Arabic (#379), Persian (#162), IndicF5 (#339 — Indian languages), Ukrainian (#109), Chinese UI (#392, #261).
**Fix path:** Chatterbox Multilingual (PR #257) covers Arabic, Danish, German, Greek, Finnish, Hebrew, Hindi, Dutch, Norwegian, Polish, Swedish, Swahili, Turkish. Still missing: Hungarian, Indonesian, Thai, Bangla, Ukrainian. Issue #411 offers a PR for UI i18n foundation.
### New Model Requests (growing)
| Issue | Model Requested |
|-------|----------------|
| #478 | CosyVoice3 (we tried & abandoned CosyVoice2/3 — see #311) |
| #407, #347 | RVC-style voice-to-voice / seed voice conversion (STS) |
| #385 | Fish Audio S2 |
| #380 | OmniVoice |
| #370 | index-tts2 |
| #364 | Voxtral-TTS |
| #335 | Faster-Qwen-TTS |
| #346 | Multi-model batch request |
| #381 | Microsoft MAI models |
| #339 | IndicF5 |
| #226 | GGUF support |
| #172 | VibeVoice |
| #138 | Export to ONNX/Piper format |
| #132 | LavaSR (transcription) |
| #147 | Facebook Omnilingual ASR |
| #338 | Default voices |
The multi-engine architecture makes integration straightforward — see [`content/docs/developer/tts-engines.mdx`](content/docs/developer/tts-engines.mdx). Platform-specific gating (e.g. VoxCPM CUDA-only) doesn't exist yet and would need design.
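For contributors, the integration surface is intentionally small: one backend module, one `ModelConfig` entry, one registration. A schematic sketch — the field names approximate the real registry, and the Fish Speech entry is purely illustrative (it is still only under evaluation on the research branch):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Illustrative subset of the real registry dataclass."""
    engine: str                           # engine key sent by the frontend
    model_name: str                       # user-facing name, as in the comparison table
    repo_id: str                          # HuggingFace repo to download from
    languages: tuple[str, ...] = ("en",)  # drives per-engine language filtering


MODEL_REGISTRY: dict[str, ModelConfig] = {}


def register_model(config: ModelConfig) -> None:
    """Adding an engine is one entry; no per-engine dispatch maps to touch."""
    MODEL_REGISTRY[config.engine] = config


# Hypothetical example entry — not a committed integration decision.
register_model(ModelConfig(engine="fish_speech", model_name="fish-speech",
                           repo_id="fishaudio/fish-speech-1.5"))
```

The centralized registry is what keeps engine count from multiplying code paths: download UI, language filtering, and the backend factory all key off the same entry.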
### Platform Scope & Quality Debt — NEW category
Awareness issues filed this cycle — ties into engine sprawl and platform tier work.
- **#419** — Engine sprawl: define first-class vs experimental TTS backends
- **#420** — Formalize tier-1 vs tier-2 platform support targets (PR #465 open)
- **#421** — Track & burn down frontend Biome + a11y debt before gating CI
- **#422** — Code-split web build (main bundle > 1 MB)
### Long-Form / Chunking
Still reported despite chunking + queue being merged.
**Key issues:** #464 (50k char limit on GPU despite 16 GB VRAM — v0.4.0), #365 (FR: >50k chars), #363 (smart chunking to prevent robotic artifacts), #354 (50k limit v0.3.0).
**Fix path:** Chunking (#266) and queue (#269) shipped. Remaining work is raising/removing the 50k guard and tuning chunk boundaries for prosody.
### Feature Requests (ongoing)
Notable:
- **#480** — Noise removal on uploaded recordings
- **#448** — API for non-Qwen models (external integrations)
- **#427** — Task status control
- **#407, #347** — Voice-to-voice / audio-to-audio conversion
- **#387** — Location of downloaded generated voices
- **#383** — Concatenate partial reference audio into generated audio
- **#382** — Lightning.ai support
- **#376** — Remote mode
- **#353** — Audio transcoding
- **#317** — Voice pitch control
- **#189** — "Auto" language option
- **#173** — Vocal intonation/inflection control
- **#165, #270** — Audiobook mode (PR #154 open)
- **#242** — Seed value pinning
- **#228** — Always use 0.6B option
- **#235** — Finetuned Qwen3-TTS tokenizer (PR #253 open)
- **#144** — Copy text to clipboard
### Housekeeping / Triage Needed
| Issue | Reason |
|-------|--------|
| **#431**, **#408** | Spam — Chinese "free Claude API" promos. Close. |
| **#398** ("Excelente") | Non-issue. Close. |
| **#357** | Informational — project featured in Awesome MLX. Close after acknowledgement. |
| **#374**, **#377** | Version-release questions, no bug. Close. |
| **#306** ("voice model"), **#389** ("New model"), **#473** ("New functionality") | Title-only issues, no content. Request details or close. |
| **#309** | Uninstall/cleanup question. Answer and close. |
| **#241** | "How to use in Colab" — support question, not a bug. |
| **#423** / **#485** / **#329** | Stale MLX fallback to 1.7B repo — fixed; 0.6B bf16 conversion now live on `mlx-community`, registry points at correct repo on both backends. |
| **#336** / **#348** | TADA download/registration cluster — triage together. |
| **#287** / **#275** / **#304** | macOS ARM import regressions on new version — likely one root cause. |
| **#292**, **#349** | Possibly already fixed by merged PRs (#321/#412 and #345). Verify + close. |
**~70 older issues (pre-#170) not individually categorized above.** Most are long-tail support questions or duplicates of problems now addressed by the multi-engine / model-registry work. A dedicated backlog-sweep pass is overdue.
### Bugs (ongoing)
| Category | Issues |
|----------|--------|
| Generation failures | #476, #467, #452, #459 (voice clone fetch error), #468 (tada-1b marked error), #437, #300, #301, #282 |
| Audio quality | #456 (clipping errors v0.4.0), #436 (emotion labels), #333 (pitch/echo), #307 (by-model breakdown), #340 (all generations say "www...") |
| Transcription | #371 (fails every time), #291 (extract transcription from generated audio) |
| Effects / presets | #349 ("Failed to save" when creating effects presets — possibly fixed by merged #345) |
| File ops | #477 (spacy_pkuseg dict missing on frozen Windows build), #472 (storage location change), #283 (allow longer files for voice creation + in-app trim), #350 (failed to add sample) |
| History | #292 (can't delete failed generations — possibly fixed by merged #321/#412) |
| Windows | #466 (install problem), #375 (WinError 5 access denied), #273 (port 8000 conflict), #201 (model doesn't stay loaded) |
| Linux | #471 (thread-safe PULSE_SOURCE), #413 (Arch build), #409 (Kubuntu build), #351, #341 |
| macOS | #441 (older macOS), #369 (malware flag), #334 (microphone permission), #287 (`check_model_inputs` ImportError — regression), #171 (ARM64 binary won't open) |
| Profile/UI | #360 (Kokoro profile hides others — partly addressed by auto-switch), #299 (drag-drop on Win11), #329 (size selector state bug), #393 (stuck loading screen after reinstall to new dir) |
| Integrations | #397 (SAMMI-bot 422 Unprocessable Entity) |
| Audio playback / session | **#41** (macOS: Voicebox goes silent after another app takes audio output; restart restores it) — see deep-dive below |
| Database | #174 (sqlite3 IntegrityError) |
---
## Existing Plan Documents — Status
| Document | Target Version | Status | Relevance |
|----------|---------------|--------|-----------|
| `TTS_PROVIDER_ARCHITECTURE.md` | v0.1.13 | **Partially superseded** by multi-engine arch + CUDA swap | Core concepts implemented differently than planned |
| `CUDA_BACKEND_SWAP.md` | — | **Shipped** (PR #252) | CUDA binary download + backend restart |
| `CUDA_BACKEND_SWAP_FINAL.md` | — | **Shipped** (PR #252) | Final implementation plan |
| `EXTERNAL_PROVIDERS.md` | v0.2.0 | **Not started** | Remote server support |
| `MLX_AUDIO.md` | — | **Shipped** | MLX backend is live |
| `DOCKER_DEPLOYMENT.md` | v0.2.0 | **Shipped** (PR #161) | Docker + web deployment |
| `OPENAI_SUPPORT.md` | v0.2.0 | **Not started** | OpenAI-compatible API layer |
| `PR33_CUDA_PROVIDER_REVIEW.md` | — | **Reference** | Analysis of the original provider approach |
---
## New Model Integration — Landscape
### Status Snapshot (2026-04-18)
| Model | Cloning | Speed | Sample Rate | Languages | VRAM | Instruct | Cross-platform? | Status |
|-------|---------|-------|-------------|-----------|------|----------|-----------------|--------|
| **Qwen3-TTS** | 10s zero-shot | Medium | 24 kHz | 10 | Medium | None | MLX + PyTorch | **Shipped** |
| **Qwen CustomVoice** | Preset speakers | Medium | 24 kHz | 10 | Medium | **Yes** | PyTorch | **Shipped** (PR #328) |
| **LuxTTS** | 3s zero-shot | 150x RT, CPU ok | 48 kHz | English | <1 GB | None | All | **Shipped** (PR #254) |
| **Chatterbox MTL** | 5s zero-shot | Medium | 24 kHz | 23 | Medium | Partial — `exaggeration` | CPU/CUDA | **Shipped** (PR #257) |
| **Chatterbox Turbo** | 5s zero-shot | Fast | 24 kHz | English | Low | Partial — inline tags | CPU/CUDA | **Shipped** (PR #258) |
| **HumeAI TADA 1B/3B** | Zero-shot | 5x faster than LLM-TTS | 24 kHz | EN (1B), 10 (3B) | Medium | Partial — prosody | PyTorch | **Shipped** (PR #296) |
| **Kokoro-82M** | Preset voices | CPU realtime | 24 kHz | 8 | Tiny (82M) | None | All | **Shipped** (PR #325) |
| ~~**CosyVoice2-0.5B**~~ | 3-10s zero-shot | Very fast | 24 kHz | Multilingual | Low | **Yes** | — | **Abandoned** (PR #311) — poor output quality |
| ~~**VoxCPM2**~~ | Zero-shot | ~0.15 RTF streaming | 48 kHz | 30 | Medium | Partial — parenthetical style | **CUDA-only in practice** | **Backlogged** (2026-04-18) — see notes above |
| **Fish Speech** | 10-30s few-shot | Real-time | 24-44 kHz | 50+ | Medium | **Yes** — word-level inline | All | Candidate — license TBD |
| **Fish Audio S2** | — | — | — | — | — | — | — | Candidate (#385) |
| **XTTS-v2** | 6s zero-shot | Mid-GPU | 24 kHz | 17+ | Medium | Partial — style transfer from ref | All | Candidate — CPML license likely blocker |
| **Pocket TTS** (Kyutai) | Zero-shot + streaming | >1x RT on CPU | — | English + several European (FR/DE/PT/IT/ES added by Feb 2026) | ~100M | None | CPU-first | Candidate — MIT |
| **MOSS-TTS-Nano** | Zero-shot | **Realtime on 4 CPU cores** | 48 kHz stereo | 20 | 0.1B | Partial — MOSS-VoiceGenerator companion does text-to-voice design | All (ONNX CPU path dropped 2026-04-17) | **Top candidate** — Apache 2.0, released 2026-04-13, streaming |
| **VibeVoice** (Microsoft) | — | — | — | Multi-speaker long-form (up to 90 min, 4 speakers) | 1.5B | — | — | Candidate (#172) — Stories-editor fit |
| **index-tts2** | — | — | — | — | — | — | — | Candidate (#370) |
| **Voxtral TTS** (Mistral) | Zero-shot (short clips) + 20 preset voices | Single-GPU | — | — | 4B (`Voxtral-4B-TTS-2603`) | Presets + cloning | CUDA (16 GB+ VRAM) | Candidate (#364) — frontier quality claim, open-weight |
| **Dia / Dia2** | — | — | — | — | — | — | — | Watch — emotion-forward, but "rough edges" / artifacts per April reviews |
| **IndicF5** | — | — | — | Indian languages | — | — | — | Candidate (#339) — fills Indic gap |
| **MiniMax Cloud TTS** | — | Cloud | — | — | N/A (API) | — | N/A | Community PR #430, #331 — new direction (external API) |
| **OmniVoice** | — | — | — | — | — | — | — | Candidate (#380) |
| **RVC voice conversion** | N/A (STS) | — | — | — | — | N/A | All | New modality, not TTS (#407, #347) |
**Watch list:** MioTTS-2.6B (fast LLM-based EN/JP, vLLM compatible), Oolel-Voices (Soynade Research, expressive modular control), Faster-Qwen-TTS (#335), Orpheus / Sesame CSM (on-device fine-tuning discussions), Fish Audio S2 Pro / Fish Speech V1.5 (benchmark leader but research/non-commercial license — same blocker as Fish Speech).
**Deep-research pass (2026-04-18):** MOSS-TTS-Nano identified as the freshest high-alignment candidate — verified via [OpenMOSS/MOSS-TTS](https://github.com/OpenMOSS/MOSS-TTS) README (0.1B params, Apache 2.0, 48 kHz stereo, 4-core CPU realtime, streaming, released 2026-04-13). Dedicated repo: [OpenMOSS/MOSS-TTS-Nano](https://github.com/OpenMOSS/MOSS-TTS-Nano). Voxtral TTS verified on HF as `mistralai/Voxtral-4B-TTS-2603`.
#### Active Evaluation Criteria (learned from this cycle)
1. **Cross-platform first.** MLX is a primary backend for our Apple Silicon user base. CUDA-only models require platform gating that doesn't exist yet — shipping one sets a precedent (see VoxCPM notes, issues #419/#420).
2. **PyPI + Apache/MIT licensing preferred.** Heavy deps, git-only installs, and `--no-deps` workarounds are expensive to maintain (Chatterbox taught us this).
3. **Output quality is non-negotiable.** CosyVoice was abandoned despite having the best instruct API of the candidates evaluated.
4. **Instruct support fills a real gap** (#173, #224, #303). Qwen CustomVoice partially addresses it with preset speakers; zero-shot clone-with-instruct is still unmet.
5. **Long-form + streaming are user-requested** (#363, #365, #464). Candidates with native streaming (Pocket TTS, Fish Speech) get extra weight.
### Adding a New Engine (Now Straightforward)
With the model config registry and shared `EngineModelSelector` component, adding a new TTS engine requires:
1. **Create `backend/backends/<engine>_backend.py`** — implement `TTSBackend` protocol (~200-300 lines)
2. **Register in `backend/backends/__init__.py`** — add `ModelConfig` entry + `TTS_ENGINES` entry + factory elif
3. **Update `backend/models.py`** — add engine name to regex
4. **Update frontend** — add to the engine union type, `EngineModelSelector` options, form schema, language map, and profile type gating (icons/labels touch ~9 files, per a grep for `kokoro`)
`main.py` requires **zero changes** — the registry handles all dispatch automatically.
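As a sketch of step 1 under assumed method names (the actual `TTSBackend` protocol in `backend/backends/` may differ), a new engine backend looks roughly like:

```python
from typing import Protocol


class TTSBackend(Protocol):
    """Illustrative stand-in for the real protocol in backend/backends/."""

    def load_model(self, model_id: str) -> None: ...
    def generate(self, text: str, voice_prompt: dict) -> bytes: ...
    def unload(self) -> None: ...


class NewEngineBackend:
    """Skeleton for backend/backends/<engine>_backend.py."""

    def __init__(self) -> None:
        self.model = None

    def load_model(self, model_id: str) -> None:
        # Real code would fetch weights from HF and move them to the device.
        self.model = f"loaded:{model_id}"

    def generate(self, text: str, voice_prompt: dict) -> bytes:
        if self.model is None:
            raise RuntimeError("model not loaded")
        # Real code would run inference and return encoded audio bytes.
        return b"RIFF-placeholder"

    def unload(self) -> None:
        self.model = None
```

Because the class satisfies the protocol structurally, no inheritance is needed; the registry only has to know how to construct it.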
**Platform gating doesn't exist yet.** If we add a CUDA-only model (e.g. VoxCPM), we need a new `requires_cuda` (or more generally `requires: list[device]`) flag on `ModelConfig`, plumbed through the `/models` API and surfaced in `ModelManagement.tsx` and `EngineModelSelector.tsx` as a lock icon + "Requires NVIDIA GPU" state. The backend should hard-error at `load_model()` as a safety net.
Total effort: **~1 day** for a well-documented, cross-platform model with a PyPI package; **~2 days** if platform gating is required. See [`content/docs/developer/tts-engines.mdx`](content/docs/developer/tts-engines.mdx) for the full guide.
---
## Architectural Bottlenecks
### ~~1. Single Backend Singleton~~ — RESOLVED
The singleton TTS backend was replaced with a thread-safe per-engine registry in PR #254. Multiple engines can now be loaded simultaneously.
### ~~2. `main.py` Dispatch Point Duplication~~ — RESOLVED
Previously, each engine required updates to 6+ hardcoded dispatch maps across `main.py` (~320 lines of if/elif chains). A model config registry in `backend/backends/__init__.py` now centralizes all model metadata (`ModelConfig` dataclass) with helper functions (`load_engine_model()`, `check_model_loaded()`, `engine_needs_trim()`, etc.). Adding a new engine requires zero changes to `main.py`.
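A minimal sketch of that registry pattern (field and helper names here are illustrative approximations of the real `ModelConfig` API, not its actual signature):

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ModelConfig:
    """Illustrative subset of a registry entry."""

    engine: str
    hf_repo: str
    display_name: str
    factory: Callable[[], object]
    needs_trim: bool = False


# One dict replaces the old if/elif chains scattered across main.py.
MODEL_REGISTRY: dict[str, ModelConfig] = {
    "kokoro": ModelConfig(
        engine="kokoro",
        hf_repo="hexgrad/Kokoro-82M",
        display_name="Kokoro 82M",
        factory=lambda: "kokoro backend placeholder",
        needs_trim=False,
    ),
}


def engine_needs_trim(model_id: str) -> bool:
    return MODEL_REGISTRY[model_id].needs_trim


def load_engine_model(model_id: str) -> object:
    # Endpoints call helpers like this; they never branch on engine names.
    return MODEL_REGISTRY[model_id].factory()
```

Adding an engine then means adding one dict entry; every endpoint that dispatches through the helpers picks it up for free.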
### ~~3. Model Config is Scattered~~ — RESOLVED
Model identifiers, HF repo IDs, display names, and engine metadata are now consolidated in the `ModelConfig` registry. Backend-aware branching (e.g. MLX vs PyTorch Qwen repo IDs) happens inside the registry. Frontend model options are centralized in `EngineModelSelector.tsx`.
### 4. Voice Prompt Cache Assumes PyTorch Tensors
`backend/utils/cache.py` uses `torch.save()` / `torch.load()`. LuxTTS, Chatterbox, and Kokoro backends work around this by storing reference audio paths (or preset voice IDs) instead of tensors in their voice prompt dicts. Not ideal but functional.
### 5. ~~Frontend Assumes Qwen Model Sizes~~ — RESOLVED
The generation form now uses a flat model dropdown with engine-based routing. Per-engine language filtering is in place. Model size is only sent for Qwen / Qwen CustomVoice.
### 6. No Platform Gating on Models — NEW
`ModelConfig` has no way to express hardware requirements. Every engine is shown to every user, regardless of whether it'll actually load. Users on non-CUDA platforms discover failure at load time (or not at all — some fall back silently to CPU and never complete). Blocks shipping CUDA-only engines (VoxCPM) and would improve the Intel Arc / ROCm / CPU-only UX today. See `ModelConfig` TODO: add `requires: list[Literal["cuda", "mps", "xpu", "cpu", "rocm"]]` or equivalent, plumb through `/models` API, render in `ModelManagement.tsx` + `EngineModelSelector.tsx`.
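Under the assumption that the flag is a plain device list (the final field name and semantics are still TBD), the gating check could look like:

```python
from dataclasses import dataclass, field
from typing import Literal

Device = Literal["cuda", "mps", "xpu", "cpu", "rocm"]


@dataclass
class ModelConfig:
    """Only the fields relevant to platform gating; illustrative."""

    engine: str
    requires: list[Device] = field(default_factory=lambda: ["cpu"])


def is_available(config: ModelConfig, detected: set[Device]) -> bool:
    """True if any device the engine supports is present on this machine."""
    return bool(set(config.requires) & detected)


def load_model(config: ModelConfig, detected: set[Device]) -> None:
    # Safety net: hard-error instead of silently falling back to CPU.
    if not is_available(config, detected):
        raise RuntimeError(f"{config.engine} requires one of {config.requires}")
```

The same `is_available()` result would be returned by `/models` so the frontend can render the lock state without re-deriving it.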
### 7. Engine Sprawl — NEW
Seven TTS engines shipped, more candidates queued. Issue #419 asks for a first-class vs experimental distinction. Related: issue #420 asks for formalized platform support tiers. Combined, these would let us ship more engines more confidently with clearer expectations for users.
---
## Recommended Priorities
### Tier 1 — Ship Now
| Priority | PR/Item | Impact | Effort |
|----------|---------|--------|--------|
| 1 | **RTX 50-series / Blackwell diagnostic** — detect stale CUDA binary vs GPU arch, prompt re-download (#417, #400, #396, #395, #390, #362) | Large cluster of user-blocking errors | Medium |
| 2 | **CustomVoice download failures** (#475, #445) | New engine blocked on MAC/Win — regression triage | Medium |
| 3 | **50k char limit on GPU** (#464) | Regression — chunking should handle this | Medium |
| 4 | Close PR #311 (CosyVoice) and dedupe #331/#430 (MiniMax) | Housekeeping | None |
| 5 | **PR #443** — infinite offline retry loop | Bug fix, reviewable | Low |
| 6 | **PR #465** — define tier-1 / tier-2 platforms | Unblocks engine-sprawl decision (#419) | Low |
| 7 | **PR #463** — docker registry auto-publish | Community PR, low risk | Low |
| 8 | **#253** — 48kHz speech tokenizer | Quality improvement for Qwen | Medium |
| 9 | **Kokoro profile UX** (#360) — partially addressed by auto-switch | Polish | Low |
### Tier 2 — Feature Work
| Priority | Item | Impact | Effort |
|----------|------|--------|--------|
| 1 | **Engine tier system** (#419) — first-class vs experimental, platform gating in `ModelConfig` | Unblocks CUDA-only engines (VoxCPM, etc.) and frontend polish | Medium |
| 2 | **Frontend tech-debt burn-down** (#421) + code-split (#422) | Before gating CI on Biome | Medium |
| 3 | **#154** — Audiobook tab | Long-form users. Chunking + queue shipped. | Medium |
| 4 | **UI i18n** (#411 PR offer, #392, #261) | Chinese UI + general localization | Medium |
| 5 | **#225** — Custom HuggingFace models | User-supplied models. Needs rework. | High |
| 6 | OpenAI-compatible API (plan doc exists) — see also #448 (API for non-Qwen) | Low effort once API is stable | Low |
| 7 | LoRA fine-tuning (PR #195) | Complex, needs rework for multi-engine | Very High |
| 8 | Streaming for non-MLX engines | Currently MLX-only | Medium |
| 9 | Voice-to-voice / RVC (#407, #347) | New modality — different arch shape | High |
### Tier 3 — Future Engines (cross-platform preferred)
| Priority | Item | Notes |
|----------|------|-------|
| 1 | **MOSS-TTS-Nano** | 0.1B, Apache 2.0, 4-core CPU realtime, 48 kHz stereo, streaming, 20 langs, released 2026-04-13. Best alignment with our criteria. Verify install ergonomics before committing. |
| 2 | **Pocket TTS** (Kyutai) | CPU-first 100M model. MIT. Fills streaming gap without CUDA dependency. Several European langs added by Feb 2026. |
| 3 | **IndicF5** | Fills Indian-language gap (#339). Closes many language-request issues. |
| 4 | **VibeVoice** (Microsoft, #172) | 1.5B, long-form multi-speaker (up to 90 min, 4 speakers). Strong Stories-editor fit. |
| 5 | **Voxtral TTS** (Mistral, #364) | 4B presets+cloning. Frontier quality claim, but 16 GB+ VRAM — would need the platform-tier work first. |
| 6 | **Fish Speech / Fish Audio S2** | 50+ langs, word-level instruct. **License clarification first.** (#385) |
| 7 | **XTTS-v2** | 17+ langs, mature pip. CPML likely kills commercial use — verify. |
| 8 | **index-tts2** (#370) | Unvetted. |
| — | ~~**VoxCPM2**~~ | **Backlogged** — CUDA-only upstream. Revisit when tier system ships or MPS bugs are fixed upstream. |
### ~~Previously Prioritized — Now Done~~
- ~~Kokoro 82M — finish integration~~ **Shipped** (PR #325)
- ~~Qwen CustomVoice~~ **Shipped** (PR #328)
- ~~Intel Arc (XPU) support~~ **Shipped** (PR #320)
- ~~Blackwell CUDA~~ **Shipped** (PR #401, follow-up work open)
- ~~Generation cancellation~~ **Shipped** (PR #444)
- ~~macOS Intel x86_64~~ **Shipped** (PR #416)
---
## Branch Inventory
| Branch | PR | Status | Notes |
|--------|-----|--------|-------|
| `voicebox-new-models` | — | **Active** | New model research (Fish Speech, Pocket TTS, VibeVoice, etc.); VoxCPM evaluated & backlogged |
| `fix/kokoro-pyinstaller-source-files` | — | Active | Kokoro frozen-build source bundling (parent of `voicebox-new-models`) |
| `feat/cosyvoice-engine` | #311 | Open — closing | CosyVoice2/3 — abandoned, poor quality |
| `feat/kokoro` | #325 | **Merged** | Kokoro 82M + voice profile type system |
| `feat/qwen-custom-voice` | #328 | **Merged** | Qwen CustomVoice preset engine |
| `feat/chatterbox-turbo` | #258 | **Merged** | Chatterbox Turbo + per-engine languages |
| `feat/chatterbox` | #257 | **Merged** | Chatterbox Multilingual |
| `feat/luxtts` | #254 | **Merged** | LuxTTS + multi-engine arch |
---
## Quick Reference: API Endpoints
<details>
<summary>All current endpoints</summary>
| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/health` | GET | Health check, model/GPU status |
| `/profiles` | POST, GET | Create/list voice profiles |
| `/profiles/{id}` | GET, PUT, DELETE | Profile CRUD |
| `/profiles/{id}/samples` | POST, GET | Add/list voice samples |
| `/profiles/{id}/avatar` | POST, GET, DELETE | Avatar management |
| `/profiles/{id}/export` | GET | Export profile as ZIP |
| `/profiles/import` | POST | Import profile from ZIP |
| `/generate` | POST | Generate speech (engine param selects TTS backend) |
| `/generate/stream` | POST | Stream speech (MLX only) |
| `/history` | GET | List generation history |
| `/history/{id}` | GET, DELETE | Get/delete generation |
| `/history/{id}/export` | GET | Export generation ZIP |
| `/history/{id}/export-audio` | GET | Export audio only |
| `/transcribe` | POST | Transcribe audio (Whisper) |
| `/models/status` | GET | All model statuses (Qwen, LuxTTS, Chatterbox, Chatterbox Turbo, TADA, Whisper) |
| `/models/download` | POST | Trigger model download |
| `/models/download/cancel` | POST | Cancel/dismiss download |
| `/models/{name}` | DELETE | Delete downloaded model |
| `/models/load` | POST | Load model into memory |
| `/models/unload` | POST | Unload model |
| `/models/progress/{name}` | GET | SSE download progress |
| `/tasks/active` | GET | Active downloads/generations (with inline progress) |
| `/stories` | POST, GET | Create/list stories |
| `/stories/{id}` | GET, PUT, DELETE | Story CRUD |
| `/stories/{id}/items` | POST, GET | Story items CRUD |
| `/stories/{id}/export` | GET | Export story audio |
| `/channels` | POST, GET | Audio channel CRUD |
| `/channels/{id}` | PUT, DELETE | Channel update/delete |
| `/cache/clear` | POST | Clear voice prompt cache |
| `/server/cuda/status` | GET | CUDA binary availability |
| `/server/cuda/download` | POST | Download CUDA binary |
| `/server/cuda/switch` | POST | Switch to CUDA backend |
</details>
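For orientation, a hedged client sketch for `POST /generate` (every payload field except `engine` is a guess; the authoritative request schema lives in `backend/main.py`):

```python
import json
from urllib import request

BASE = "http://localhost:17493"


def build_generate_request(text: str, engine: str, profile_id: str) -> request.Request:
    """Build a POST /generate request; field names other than 'engine' are illustrative."""
    body = json.dumps({"text": text, "engine": engine, "profile_id": profile_id})
    return request.Request(
        f"{BASE}/generate",
        data=body.encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending it (requires the backend to be running locally):
# with request.urlopen(build_generate_request("Hello!", "kokoro", "prof_123")) as resp:
#     audio = resp.read()
```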

docs/README.md (new file)

@@ -0,0 +1,41 @@
# fumadocs-ui-template
This is a Next.js application generated with
[Create Fumadocs](https://github.com/fuma-nama/fumadocs).
Run development server:
```bash
bun run dev
```
Open http://localhost:3000 with your browser to see the result.
## Explore
In the project, you can see:
- `lib/source.ts`: Code for content source adapter, [`loader()`](https://fumadocs.dev/docs/headless/source-api) provides the interface to access your content.
- `lib/layout.shared.tsx`: Shared options for layouts, optional but preferred to keep.
| Route | Description |
| ------------------------- | ------------------------------------------------------ |
| `app/(home)` | The route group for your landing page and other pages. |
| `app/docs` | The documentation layout and pages. |
| `app/api/search/route.ts` | The Route Handler for search. |
### Fumadocs MDX
A `source.config.ts` config file has been included, you can customise different options like frontmatter schema.
Read the [Introduction](https://fumadocs.dev/docs/mdx) for further details.
## Learn More
To learn more about Next.js and Fumadocs, take a look at the following
resources:
- [Next.js Documentation](https://nextjs.org/docs) - learn about Next.js
features and API.
- [Learn Next.js](https://nextjs.org/learn) - an interactive Next.js tutorial.
- [Fumadocs](https://fumadocs.dev) - learn about Fumadocs


@@ -0,0 +1,11 @@
import { DocsLayout } from 'fumadocs-ui/layouts/docs';
import { baseOptions } from '@/lib/layout.shared';
import { source } from '@/lib/source';
export default function Layout({ children }: LayoutProps<'/[[...slug]]'>) {
return (
<DocsLayout tree={source.pageTree} {...baseOptions()}>
{children}
</DocsLayout>
);
}


@@ -0,0 +1,74 @@
import { createRelativeLink } from 'fumadocs-ui/mdx';
import { DocsBody, DocsDescription, DocsPage, DocsTitle } from 'fumadocs-ui/page';
import type { Metadata } from 'next';
import { notFound } from 'next/navigation';
import { MarkdownCopyButton, ViewOptionsPopover } from '@/components/ai/page-actions';
import { APIPage } from '@/components/api-page';
import { getPageImage, source } from '@/lib/source';
import { getMDXComponents } from '@/mdx-components';
export default async function Page(props: PageProps<'/[[...slug]]'>) {
const params = await props.params;
const page = source.getPage(params.slug);
if (!page) notFound();
const MDX = page.data.body;
const markdownUrl = `${page.url}.mdx`;
const githubUrl = `https://github.com/jamiepine/voicebox/blob/main/docs/content/docs/${page.path}`;
return (
<DocsPage
toc={page.data.toc}
full={page.data.full}
editOnGithub={{
owner: 'jamiepine',
repo: 'voicebox',
sha: 'main',
path: `docs/content/docs/${page.path}`,
}}
lastUpdate={page.data.lastModified}
>
<DocsTitle>{page.data.title}</DocsTitle>
<DocsDescription className="mb-0">{page.data.description}</DocsDescription>
<div className="flex flex-row gap-2 items-center">
<MarkdownCopyButton markdownUrl={markdownUrl} />
<ViewOptionsPopover markdownUrl={markdownUrl} githubUrl={githubUrl} />
</div>
<div
role="separator"
style={{
height: '1px',
background: 'currentColor',
opacity: 0.15,
marginTop: '8px',
marginBottom: '24px',
}}
/>
<DocsBody>
<MDX
components={getMDXComponents({
a: createRelativeLink(source, page),
})}
/>
</DocsBody>
</DocsPage>
);
}
export async function generateStaticParams() {
return source.generateParams();
}
export async function generateMetadata(props: PageProps<'/[[...slug]]'>): Promise<Metadata> {
const params = await props.params;
const page = source.getPage(params.slug);
if (!page) notFound();
return {
title: page.data.title,
description: page.data.description,
openGraph: {
images: getPageImage(page).url,
},
};
}


@@ -0,0 +1,7 @@
import { source } from '@/lib/source';
import { createFromSource } from 'fumadocs-core/search/server';
export const { GET } = createFromSource(source, {
// https://docs.orama.com/docs/orama-js/supported-languages
language: 'english',
});

docs/app/global.css (new file)

@@ -0,0 +1,14 @@
@import "tailwindcss";
@import "fumadocs-ui/css/neutral.css";
@import "fumadocs-ui/css/preset.css";
@import "fumadocs-openapi/css/preset.css";
:root {
--color-fd-primary: hsl(43, 50%, 50%);
--color-fd-primary-foreground: hsl(222.2, 47.4%, 11.2%);
}
.dark {
--color-fd-primary: hsl(43, 50%, 45%);
--color-fd-primary-foreground: hsl(0, 0%, 95%);
}

docs/app/layout.tsx (new file)

@@ -0,0 +1,17 @@
import { RootProvider } from 'fumadocs-ui/provider/next';
import './global.css';
import { Inter } from 'next/font/google';
const inter = Inter({
subsets: ['latin'],
});
export default function Layout({ children }: LayoutProps<'/'>) {
return (
<html lang="en" className={inter.className} suppressHydrationWarning>
<body className="flex flex-col min-h-screen">
<RootProvider>{children}</RootProvider>
</body>
</html>
);
}


@@ -0,0 +1,10 @@
import { getLLMText, source } from '@/lib/source';
export const revalidate = false;
export async function GET() {
const scan = source.getPages().map(getLLMText);
const scanned = await Promise.all(scan);
return new Response(scanned.join('\n\n'));
}


@@ -0,0 +1,20 @@
import { notFound } from 'next/navigation';
import { getLLMText, source } from '@/lib/source';
export const revalidate = false;
export async function GET(_req: Request, { params }: RouteContext<'/llms.mdx/docs/[[...slug]]'>) {
const { slug } = await params;
const page = source.getPage(slug);
if (!page) notFound();
return new Response(await getLLMText(page), {
headers: {
'Content-Type': 'text/markdown',
},
});
}
export function generateStaticParams() {
return source.generateParams();
}


@@ -0,0 +1,27 @@
import { getPageImage, source } from '@/lib/source';
import { notFound } from 'next/navigation';
import { ImageResponse } from 'next/og';
import { generate as DefaultImage } from 'fumadocs-ui/og';
export const revalidate = false;
export async function GET(_req: Request, { params }: RouteContext<'/og/docs/[...slug]'>) {
const { slug } = await params;
const page = source.getPage(slug.slice(0, -1));
if (!page) notFound();
return new ImageResponse(
<DefaultImage title={page.data.title} description={page.data.description} site="My App" />,
{
width: 1200,
height: 630,
},
);
}
export function generateStaticParams() {
return source.getPages().map((page) => ({
lang: page.locale,
slug: getPageImage(page).segments,
}));
}

docs/bun.lock (new file)

@@ -0,0 +1,830 @@
{
"lockfileVersion": 1,
"configVersion": 1,
"workspaces": {
"": {
"name": "example-next-mdx",
"dependencies": {
"@radix-ui/react-popover": "^1.1.15",
"class-variance-authority": "^0.7.1",
"fumadocs-core": "^16.4.11",
"fumadocs-mdx": "13",
"fumadocs-openapi": "^10.2.7",
"fumadocs-ui": "^16.4.11",
"lucide-react": "^0.546.0",
"next": "^16.1.6",
"react": "^19.2.0",
"react-dom": "^19.2.0",
"shiki": "^3.22.0",
"tailwind-merge": "^3.5.0",
},
"devDependencies": {
"@tailwindcss/postcss": "^4.1.15",
"@types/mdx": "^2.0.13",
"@types/node": "^24.9.1",
"@types/react": "^19.2.2",
"@types/react-dom": "^19.2.2",
"postcss": "^8.5.6",
"tailwindcss": "^4.1.15",
"typescript": "^5.9.3",
},
},
},
"packages": {
"@alloc/quick-lru": ["@alloc/quick-lru@5.2.0", "", {}, "sha512-UrcABB+4bUrFABwbluTIBErXwvbsU/V7TZWfmbgJfbkwiBuziS9gxdODUyuiecfdGQ85jglMW6juS3+z5TsKLw=="],
"@emnapi/runtime": ["@emnapi/runtime@1.8.1", "", { "dependencies": { "tslib": "^2.4.0" } }, "sha512-mehfKSMWjjNol8659Z8KxEMrdSJDDot5SXMq00dM8BN4o+CLNXQ0xH2V7EchNHV4RmbZLmmPdEaXZc5H2FXmDg=="],
"@esbuild/aix-ppc64": ["@esbuild/aix-ppc64@0.25.12", "", { "os": "aix", "cpu": "ppc64" }, "sha512-Hhmwd6CInZ3dwpuGTF8fJG6yoWmsToE+vYgD4nytZVxcu1ulHpUQRAB1UJ8+N1Am3Mz4+xOByoQoSZf4D+CpkA=="],
"@esbuild/android-arm": ["@esbuild/android-arm@0.25.12", "", { "os": "android", "cpu": "arm" }, "sha512-VJ+sKvNA/GE7Ccacc9Cha7bpS8nyzVv0jdVgwNDaR4gDMC/2TTRc33Ip8qrNYUcpkOHUT5OZ0bUcNNVZQ9RLlg=="],
"@esbuild/android-arm64": ["@esbuild/android-arm64@0.25.12", "", { "os": "android", "cpu": "arm64" }, "sha512-6AAmLG7zwD1Z159jCKPvAxZd4y/VTO0VkprYy+3N2FtJ8+BQWFXU+OxARIwA46c5tdD9SsKGZ/1ocqBS/gAKHg=="],
"@esbuild/android-x64": ["@esbuild/android-x64@0.25.12", "", { "os": "android", "cpu": "x64" }, "sha512-5jbb+2hhDHx5phYR2By8GTWEzn6I9UqR11Kwf22iKbNpYrsmRB18aX/9ivc5cabcUiAT/wM+YIZ6SG9QO6a8kg=="],
"@esbuild/darwin-arm64": ["@esbuild/darwin-arm64@0.25.12", "", { "os": "darwin", "cpu": "arm64" }, "sha512-N3zl+lxHCifgIlcMUP5016ESkeQjLj/959RxxNYIthIg+CQHInujFuXeWbWMgnTo4cp5XVHqFPmpyu9J65C1Yg=="],
"@esbuild/darwin-x64": ["@esbuild/darwin-x64@0.25.12", "", { "os": "darwin", "cpu": "x64" }, "sha512-HQ9ka4Kx21qHXwtlTUVbKJOAnmG1ipXhdWTmNXiPzPfWKpXqASVcWdnf2bnL73wgjNrFXAa3yYvBSd9pzfEIpA=="],
"@esbuild/freebsd-arm64": ["@esbuild/freebsd-arm64@0.25.12", "", { "os": "freebsd", "cpu": "arm64" }, "sha512-gA0Bx759+7Jve03K1S0vkOu5Lg/85dou3EseOGUes8flVOGxbhDDh/iZaoek11Y8mtyKPGF3vP8XhnkDEAmzeg=="],
"@esbuild/freebsd-x64": ["@esbuild/freebsd-x64@0.25.12", "", { "os": "freebsd", "cpu": "x64" }, "sha512-TGbO26Yw2xsHzxtbVFGEXBFH0FRAP7gtcPE7P5yP7wGy7cXK2oO7RyOhL5NLiqTlBh47XhmIUXuGciXEqYFfBQ=="],
"@esbuild/linux-arm": ["@esbuild/linux-arm@0.25.12", "", { "os": "linux", "cpu": "arm" }, "sha512-lPDGyC1JPDou8kGcywY0YILzWlhhnRjdof3UlcoqYmS9El818LLfJJc3PXXgZHrHCAKs/Z2SeZtDJr5MrkxtOw=="],
"@esbuild/linux-arm64": ["@esbuild/linux-arm64@0.25.12", "", { "os": "linux", "cpu": "arm64" }, "sha512-8bwX7a8FghIgrupcxb4aUmYDLp8pX06rGh5HqDT7bB+8Rdells6mHvrFHHW2JAOPZUbnjUpKTLg6ECyzvas2AQ=="],
"@esbuild/linux-ia32": ["@esbuild/linux-ia32@0.25.12", "", { "os": "linux", "cpu": "ia32" }, "sha512-0y9KrdVnbMM2/vG8KfU0byhUN+EFCny9+8g202gYqSSVMonbsCfLjUO+rCci7pM0WBEtz+oK/PIwHkzxkyharA=="],
"@esbuild/linux-loong64": ["@esbuild/linux-loong64@0.25.12", "", { "os": "linux", "cpu": "none" }, "sha512-h///Lr5a9rib/v1GGqXVGzjL4TMvVTv+s1DPoxQdz7l/AYv6LDSxdIwzxkrPW438oUXiDtwM10o9PmwS/6Z0Ng=="],
"@esbuild/linux-mips64el": ["@esbuild/linux-mips64el@0.25.12", "", { "os": "linux", "cpu": "none" }, "sha512-iyRrM1Pzy9GFMDLsXn1iHUm18nhKnNMWscjmp4+hpafcZjrr2WbT//d20xaGljXDBYHqRcl8HnxbX6uaA/eGVw=="],
"@esbuild/linux-ppc64": ["@esbuild/linux-ppc64@0.25.12", "", { "os": "linux", "cpu": "ppc64" }, "sha512-9meM/lRXxMi5PSUqEXRCtVjEZBGwB7P/D4yT8UG/mwIdze2aV4Vo6U5gD3+RsoHXKkHCfSxZKzmDssVlRj1QQA=="],
"@esbuild/linux-riscv64": ["@esbuild/linux-riscv64@0.25.12", "", { "os": "linux", "cpu": "none" }, "sha512-Zr7KR4hgKUpWAwb1f3o5ygT04MzqVrGEGXGLnj15YQDJErYu/BGg+wmFlIDOdJp0PmB0lLvxFIOXZgFRrdjR0w=="],
"@esbuild/linux-s390x": ["@esbuild/linux-s390x@0.25.12", "", { "os": "linux", "cpu": "s390x" }, "sha512-MsKncOcgTNvdtiISc/jZs/Zf8d0cl/t3gYWX8J9ubBnVOwlk65UIEEvgBORTiljloIWnBzLs4qhzPkJcitIzIg=="],
"@esbuild/linux-x64": ["@esbuild/linux-x64@0.25.12", "", { "os": "linux", "cpu": "x64" }, "sha512-uqZMTLr/zR/ed4jIGnwSLkaHmPjOjJvnm6TVVitAa08SLS9Z0VM8wIRx7gWbJB5/J54YuIMInDquWyYvQLZkgw=="],
"@esbuild/netbsd-arm64": ["@esbuild/netbsd-arm64@0.25.12", "", { "os": "none", "cpu": "arm64" }, "sha512-xXwcTq4GhRM7J9A8Gv5boanHhRa/Q9KLVmcyXHCTaM4wKfIpWkdXiMog/KsnxzJ0A1+nD+zoecuzqPmCRyBGjg=="],
"@esbuild/netbsd-x64": ["@esbuild/netbsd-x64@0.25.12", "", { "os": "none", "cpu": "x64" }, "sha512-Ld5pTlzPy3YwGec4OuHh1aCVCRvOXdH8DgRjfDy/oumVovmuSzWfnSJg+VtakB9Cm0gxNO9BzWkj6mtO1FMXkQ=="],
"@esbuild/openbsd-arm64": ["@esbuild/openbsd-arm64@0.25.12", "", { "os": "openbsd", "cpu": "arm64" }, "sha512-fF96T6KsBo/pkQI950FARU9apGNTSlZGsv1jZBAlcLL1MLjLNIWPBkj5NlSz8aAzYKg+eNqknrUJ24QBybeR5A=="],
"@esbuild/openbsd-x64": ["@esbuild/openbsd-x64@0.25.12", "", { "os": "openbsd", "cpu": "x64" }, "sha512-MZyXUkZHjQxUvzK7rN8DJ3SRmrVrke8ZyRusHlP+kuwqTcfWLyqMOE3sScPPyeIXN/mDJIfGXvcMqCgYKekoQw=="],
"@esbuild/openharmony-arm64": ["@esbuild/openharmony-arm64@0.25.12", "", { "os": "none", "cpu": "arm64" }, "sha512-rm0YWsqUSRrjncSXGA7Zv78Nbnw4XL6/dzr20cyrQf7ZmRcsovpcRBdhD43Nuk3y7XIoW2OxMVvwuRvk9XdASg=="],
"@esbuild/sunos-x64": ["@esbuild/sunos-x64@0.25.12", "", { "os": "sunos", "cpu": "x64" }, "sha512-3wGSCDyuTHQUzt0nV7bocDy72r2lI33QL3gkDNGkod22EsYl04sMf0qLb8luNKTOmgF/eDEDP5BFNwoBKH441w=="],
"@esbuild/win32-arm64": ["@esbuild/win32-arm64@0.25.12", "", { "os": "win32", "cpu": "arm64" }, "sha512-rMmLrur64A7+DKlnSuwqUdRKyd3UE7oPJZmnljqEptesKM8wx9J8gx5u0+9Pq0fQQW8vqeKebwNXdfOyP+8Bsg=="],
"@esbuild/win32-ia32": ["@esbuild/win32-ia32@0.25.12", "", { "os": "win32", "cpu": "ia32" }, "sha512-HkqnmmBoCbCwxUKKNPBixiWDGCpQGVsrQfJoVGYLPT41XWF8lHuE5N6WhVia2n4o5QK5M4tYr21827fNhi4byQ=="],
"@esbuild/win32-x64": ["@esbuild/win32-x64@0.25.12", "", { "os": "win32", "cpu": "x64" }, "sha512-alJC0uCZpTFrSL0CCDjcgleBXPnCrEAhTBILpeAp7M/OFgoqtAetfBzX0xM00MUsVVPpVjlPuMbREqnZCXaTnA=="],
"@floating-ui/core": ["@floating-ui/core@1.7.4", "", { "dependencies": { "@floating-ui/utils": "^0.2.10" } }, "sha512-C3HlIdsBxszvm5McXlB8PeOEWfBhcGBTZGkGlWc2U0KFY5IwG5OQEuQ8rq52DZmcHDlPLd+YFBK+cZcytwIFWg=="],
"@floating-ui/dom": ["@floating-ui/dom@1.7.5", "", { "dependencies": { "@floating-ui/core": "^1.7.4", "@floating-ui/utils": "^0.2.10" } }, "sha512-N0bD2kIPInNHUHehXhMke1rBGs1dwqvC9O9KYMyyjK7iXt7GAhnro7UlcuYcGdS/yYOlq0MAVgrow8IbWJwyqg=="],
"@floating-ui/react-dom": ["@floating-ui/react-dom@2.1.7", "", { "dependencies": { "@floating-ui/dom": "^1.7.5" }, "peerDependencies": { "react": ">=16.8.0", "react-dom": ">=16.8.0" } }, "sha512-0tLRojf/1Go2JgEVm+3Frg9A3IW8bJgKgdO0BN5RkF//ufuz2joZM63Npau2ff3J6lUVYgDSNzNkR+aH3IVfjg=="],
"@floating-ui/utils": ["@floating-ui/utils@0.2.10", "", {}, "sha512-aGTxbpbg8/b5JfU1HXSrbH3wXZuLPJcNEcZQFMxLs3oSzgtVu6nFPkbbGGUvBcUjKV2YyB9Wxxabo+HEH9tcRQ=="],
"@formatjs/fast-memoize": ["@formatjs/fast-memoize@3.1.0", "", { "dependencies": { "tslib": "^2.8.1" } }, "sha512-b5mvSWCI+XVKiz5WhnBCY3RJ4ZwfjAidU0yVlKa3d3MSgKmH1hC3tBGEAtYyN5mqL7N0G5x0BOUYyO8CEupWgg=="],
"@formatjs/intl-localematcher": ["@formatjs/intl-localematcher@0.8.0", "", { "dependencies": { "@formatjs/fast-memoize": "3.1.0", "tslib": "^2.8.1" } }, "sha512-zgMYWdUlmEZpX2Io+v3LHrfq9xZ6khpQVf9UAw2xYWhGerGgI9XgH1HvL/A34jWiruUJpYlP5pk4g8nIcaDrXQ=="],
"@fumadocs/ui": ["@fumadocs/ui@16.4.11", "", { "dependencies": { "next-themes": "^0.4.6", "postcss-selector-parser": "^7.1.1", "tailwind-merge": "^3.4.0" }, "peerDependencies": { "@types/react": "*", "fumadocs-core": "16.4.11", "next": "16.x.x", "react": "^19.2.0", "react-dom": "^19.2.0", "tailwindcss": "^4.0.0" }, "optionalPeers": ["@types/react", "next", "tailwindcss"] }, "sha512-3APzHr4Rv5P9YQApTKCQW3cXika0dwHuOo8WxYz74y42nONRo/TMDtvoWaNhB145sBrW9N4j0/0xXfiGLihVRQ=="],
"@fumari/json-schema-to-typescript": ["@fumari/json-schema-to-typescript@2.0.0", "", { "dependencies": { "js-yaml": "^4.1.0" }, "peerDependencies": { "@apidevtools/json-schema-ref-parser": "14.x.x", "prettier": "3.x.x" }, "optionalPeers": ["@apidevtools/json-schema-ref-parser", "prettier"] }, "sha512-X0Wm3QJLj1Rtb1nY2exM6QwMXb9LGyIKLf35+n6xyltDDBLMECOC4R/zPaw3RwgFVmvRLSmLCd+ht4sKabgmNw=="],
"@fumari/stf": ["@fumari/stf@0.0.1", "", { "peerDependencies": { "@types/react": "*", "react": "^19.2.0", "react-dom": "^19.2.0" }, "optionalPeers": ["@types/react"] }, "sha512-Io3xlYr8xMPZtxWI5GwIRvWEMu1CsfbwXa09ACeXGjbY4QVreMiMjNCvN1YNLmETgG6Ru1S/+2B8qv80OIExyA=="],
"@img/colour": ["@img/colour@1.0.0", "", {}, "sha512-A5P/LfWGFSl6nsckYtjw9da+19jB8hkJ6ACTGcDfEJ0aE+l2n2El7dsVM7UVHZQ9s2lmYMWlrS21YLy2IR1LUw=="],
"@img/sharp-darwin-arm64": ["@img/sharp-darwin-arm64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-darwin-arm64": "1.2.4" }, "os": "darwin", "cpu": "arm64" }, "sha512-imtQ3WMJXbMY4fxb/Ndp6HBTNVtWCUI0WdobyheGf5+ad6xX8VIDO8u2xE4qc/fr08CKG/7dDseFtn6M6g/r3w=="],
"@img/sharp-darwin-x64": ["@img/sharp-darwin-x64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-darwin-x64": "1.2.4" }, "os": "darwin", "cpu": "x64" }, "sha512-YNEFAF/4KQ/PeW0N+r+aVVsoIY0/qxxikF2SWdp+NRkmMB7y9LBZAVqQ4yhGCm/H3H270OSykqmQMKLBhBJDEw=="],
"@img/sharp-libvips-darwin-arm64": ["@img/sharp-libvips-darwin-arm64@1.2.4", "", { "os": "darwin", "cpu": "arm64" }, "sha512-zqjjo7RatFfFoP0MkQ51jfuFZBnVE2pRiaydKJ1G/rHZvnsrHAOcQALIi9sA5co5xenQdTugCvtb1cuf78Vf4g=="],
"@img/sharp-libvips-darwin-x64": ["@img/sharp-libvips-darwin-x64@1.2.4", "", { "os": "darwin", "cpu": "x64" }, "sha512-1IOd5xfVhlGwX+zXv2N93k0yMONvUlANylbJw1eTah8K/Jtpi15KC+WSiaX/nBmbm2HxRM1gZ0nSdjSsrZbGKg=="],
"@img/sharp-libvips-linux-arm": ["@img/sharp-libvips-linux-arm@1.2.4", "", { "os": "linux", "cpu": "arm" }, "sha512-bFI7xcKFELdiNCVov8e44Ia4u2byA+l3XtsAj+Q8tfCwO6BQ8iDojYdvoPMqsKDkuoOo+X6HZA0s0q11ANMQ8A=="],
"@img/sharp-libvips-linux-arm64": ["@img/sharp-libvips-linux-arm64@1.2.4", "", { "os": "linux", "cpu": "arm64" }, "sha512-excjX8DfsIcJ10x1Kzr4RcWe1edC9PquDRRPx3YVCvQv+U5p7Yin2s32ftzikXojb1PIFc/9Mt28/y+iRklkrw=="],
"@img/sharp-libvips-linux-ppc64": ["@img/sharp-libvips-linux-ppc64@1.2.4", "", { "os": "linux", "cpu": "ppc64" }, "sha512-FMuvGijLDYG6lW+b/UvyilUWu5Ayu+3r2d1S8notiGCIyYU/76eig1UfMmkZ7vwgOrzKzlQbFSuQfgm7GYUPpA=="],
"@img/sharp-libvips-linux-riscv64": ["@img/sharp-libvips-linux-riscv64@1.2.4", "", { "os": "linux", "cpu": "none" }, "sha512-oVDbcR4zUC0ce82teubSm+x6ETixtKZBh/qbREIOcI3cULzDyb18Sr/Wcyx7NRQeQzOiHTNbZFF1UwPS2scyGA=="],
"@img/sharp-libvips-linux-s390x": ["@img/sharp-libvips-linux-s390x@1.2.4", "", { "os": "linux", "cpu": "s390x" }, "sha512-qmp9VrzgPgMoGZyPvrQHqk02uyjA0/QrTO26Tqk6l4ZV0MPWIW6LTkqOIov+J1yEu7MbFQaDpwdwJKhbJvuRxQ=="],
"@img/sharp-libvips-linux-x64": ["@img/sharp-libvips-linux-x64@1.2.4", "", { "os": "linux", "cpu": "x64" }, "sha512-tJxiiLsmHc9Ax1bz3oaOYBURTXGIRDODBqhveVHonrHJ9/+k89qbLl0bcJns+e4t4rvaNBxaEZsFtSfAdquPrw=="],
"@img/sharp-libvips-linuxmusl-arm64": ["@img/sharp-libvips-linuxmusl-arm64@1.2.4", "", { "os": "linux", "cpu": "arm64" }, "sha512-FVQHuwx1IIuNow9QAbYUzJ+En8KcVm9Lk5+uGUQJHaZmMECZmOlix9HnH7n1TRkXMS0pGxIJokIVB9SuqZGGXw=="],
"@img/sharp-libvips-linuxmusl-x64": ["@img/sharp-libvips-linuxmusl-x64@1.2.4", "", { "os": "linux", "cpu": "x64" }, "sha512-+LpyBk7L44ZIXwz/VYfglaX/okxezESc6UxDSoyo2Ks6Jxc4Y7sGjpgU9s4PMgqgjj1gZCylTieNamqA1MF7Dg=="],
"@img/sharp-linux-arm": ["@img/sharp-linux-arm@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linux-arm": "1.2.4" }, "os": "linux", "cpu": "arm" }, "sha512-9dLqsvwtg1uuXBGZKsxem9595+ujv0sJ6Vi8wcTANSFpwV/GONat5eCkzQo/1O6zRIkh0m/8+5BjrRr7jDUSZw=="],
"@img/sharp-linux-arm64": ["@img/sharp-linux-arm64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linux-arm64": "1.2.4" }, "os": "linux", "cpu": "arm64" }, "sha512-bKQzaJRY/bkPOXyKx5EVup7qkaojECG6NLYswgktOZjaXecSAeCWiZwwiFf3/Y+O1HrauiE3FVsGxFg8c24rZg=="],
"@img/sharp-linux-ppc64": ["@img/sharp-linux-ppc64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linux-ppc64": "1.2.4" }, "os": "linux", "cpu": "ppc64" }, "sha512-7zznwNaqW6YtsfrGGDA6BRkISKAAE1Jo0QdpNYXNMHu2+0dTrPflTLNkpc8l7MUP5M16ZJcUvysVWWrMefZquA=="],
"@img/sharp-linux-riscv64": ["@img/sharp-linux-riscv64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linux-riscv64": "1.2.4" }, "os": "linux", "cpu": "none" }, "sha512-51gJuLPTKa7piYPaVs8GmByo7/U7/7TZOq+cnXJIHZKavIRHAP77e3N2HEl3dgiqdD/w0yUfiJnII77PuDDFdw=="],
"@img/sharp-linux-s390x": ["@img/sharp-linux-s390x@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linux-s390x": "1.2.4" }, "os": "linux", "cpu": "s390x" }, "sha512-nQtCk0PdKfho3eC5MrbQoigJ2gd1CgddUMkabUj+rBevs8tZ2cULOx46E7oyX+04WGfABgIwmMC0VqieTiR4jg=="],
"@img/sharp-linux-x64": ["@img/sharp-linux-x64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linux-x64": "1.2.4" }, "os": "linux", "cpu": "x64" }, "sha512-MEzd8HPKxVxVenwAa+JRPwEC7QFjoPWuS5NZnBt6B3pu7EG2Ge0id1oLHZpPJdn3OQK+BQDiw9zStiHBTJQQQQ=="],
"@img/sharp-linuxmusl-arm64": ["@img/sharp-linuxmusl-arm64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linuxmusl-arm64": "1.2.4" }, "os": "linux", "cpu": "arm64" }, "sha512-fprJR6GtRsMt6Kyfq44IsChVZeGN97gTD331weR1ex1c1rypDEABN6Tm2xa1wE6lYb5DdEnk03NZPqA7Id21yg=="],
"@img/sharp-linuxmusl-x64": ["@img/sharp-linuxmusl-x64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linuxmusl-x64": "1.2.4" }, "os": "linux", "cpu": "x64" }, "sha512-Jg8wNT1MUzIvhBFxViqrEhWDGzqymo3sV7z7ZsaWbZNDLXRJZoRGrjulp60YYtV4wfY8VIKcWidjojlLcWrd8Q=="],
"@img/sharp-wasm32": ["@img/sharp-wasm32@0.34.5", "", { "dependencies": { "@emnapi/runtime": "^1.7.0" }, "cpu": "none" }, "sha512-OdWTEiVkY2PHwqkbBI8frFxQQFekHaSSkUIJkwzclWZe64O1X4UlUjqqqLaPbUpMOQk6FBu/HtlGXNblIs0huw=="],
"@img/sharp-win32-arm64": ["@img/sharp-win32-arm64@0.34.5", "", { "os": "win32", "cpu": "arm64" }, "sha512-WQ3AgWCWYSb2yt+IG8mnC6Jdk9Whs7O0gxphblsLvdhSpSTtmu69ZG1Gkb6NuvxsNACwiPV6cNSZNzt0KPsw7g=="],
"@img/sharp-win32-ia32": ["@img/sharp-win32-ia32@0.34.5", "", { "os": "win32", "cpu": "ia32" }, "sha512-FV9m/7NmeCmSHDD5j4+4pNI8Cp3aW+JvLoXcTUo0IqyjSfAZJ8dIUmijx1qaJsIiU+Hosw6xM5KijAWRJCSgNg=="],
"@img/sharp-win32-x64": ["@img/sharp-win32-x64@0.34.5", "", { "os": "win32", "cpu": "x64" }, "sha512-+29YMsqY2/9eFEiW93eqWnuLcWcufowXewwSNIT6UwZdUUCrM3oFjMWH/Z6/TMmb4hlFenmfAVbpWeup2jryCw=="],
"@jridgewell/gen-mapping": ["@jridgewell/gen-mapping@0.3.13", "", { "dependencies": { "@jridgewell/sourcemap-codec": "^1.5.0", "@jridgewell/trace-mapping": "^0.3.24" } }, "sha512-2kkt/7niJ6MgEPxF0bYdQ6etZaA+fQvDcLKckhy1yIQOzaoKjBBjSj63/aLVjYE3qhRt5dvM+uUyfCg6UKCBbA=="],
"@jridgewell/remapping": ["@jridgewell/remapping@2.3.5", "", { "dependencies": { "@jridgewell/gen-mapping": "^0.3.5", "@jridgewell/trace-mapping": "^0.3.24" } }, "sha512-LI9u/+laYG4Ds1TDKSJW2YPrIlcVYOwi2fUC6xB43lueCjgxV4lffOCZCtYFiH6TNOX+tQKXx97T4IKHbhyHEQ=="],
"@jridgewell/resolve-uri": ["@jridgewell/resolve-uri@3.1.2", "", {}, "sha512-bRISgCIjP20/tbWSPWMEi54QVPRZExkuD9lJL+UIxUKtwVJA8wW1Trb1jMs1RFXo1CBTNZ/5hpC9QvmKWdopKw=="],
"@jridgewell/sourcemap-codec": ["@jridgewell/sourcemap-codec@1.5.5", "", {}, "sha512-cYQ9310grqxueWbl+WuIUIaiUaDcj7WOq5fVhEljNVgRfOUhY9fy2zTvfoqWsnebh8Sl70VScFbICvJnLKB0Og=="],
"@jridgewell/trace-mapping": ["@jridgewell/trace-mapping@0.3.31", "", { "dependencies": { "@jridgewell/resolve-uri": "^3.1.0", "@jridgewell/sourcemap-codec": "^1.4.14" } }, "sha512-zzNR+SdQSDJzc8joaeP8QQoCQr8NuYx2dIIytl1QeBEZHJ9uW6hebsrYgbz8hJwUQao3TWCMtmfV8Nu1twOLAw=="],
"@mdx-js/mdx": ["@mdx-js/mdx@3.1.1", "", { "dependencies": { "@types/estree": "^1.0.0", "@types/estree-jsx": "^1.0.0", "@types/hast": "^3.0.0", "@types/mdx": "^2.0.0", "acorn": "^8.0.0", "collapse-white-space": "^2.0.0", "devlop": "^1.0.0", "estree-util-is-identifier-name": "^3.0.0", "estree-util-scope": "^1.0.0", "estree-walker": "^3.0.0", "hast-util-to-jsx-runtime": "^2.0.0", "markdown-extensions": "^2.0.0", "recma-build-jsx": "^1.0.0", "recma-jsx": "^1.0.0", "recma-stringify": "^1.0.0", "rehype-recma": "^1.0.0", "remark-mdx": "^3.0.0", "remark-parse": "^11.0.0", "remark-rehype": "^11.0.0", "source-map": "^0.7.0", "unified": "^11.0.0", "unist-util-position-from-estree": "^2.0.0", "unist-util-stringify-position": "^4.0.0", "unist-util-visit": "^5.0.0", "vfile": "^6.0.0" } }, "sha512-f6ZO2ifpwAQIpzGWaBQT2TXxPv6z3RBzQKpVftEWN78Vl/YweF1uwussDx8ECAXVtr3Rs89fKyG9YlzUs9DyGQ=="],
"@next/env": ["@next/env@16.1.6", "", {}, "sha512-N1ySLuZjnAtN3kFnwhAwPvZah8RJxKasD7x1f8shFqhncnWZn4JMfg37diLNuoHsLAlrDfM3g4mawVdtAG8XLQ=="],
"@next/swc-darwin-arm64": ["@next/swc-darwin-arm64@16.1.6", "", { "os": "darwin", "cpu": "arm64" }, "sha512-wTzYulosJr/6nFnqGW7FrG3jfUUlEf8UjGA0/pyypJl42ExdVgC6xJgcXQ+V8QFn6niSG2Pb8+MIG1mZr2vczw=="],
"@next/swc-darwin-x64": ["@next/swc-darwin-x64@16.1.6", "", { "os": "darwin", "cpu": "x64" }, "sha512-BLFPYPDO+MNJsiDWbeVzqvYd4NyuRrEYVB5k2N3JfWncuHAy2IVwMAOlVQDFjj+krkWzhY2apvmekMkfQR0CUQ=="],
"@next/swc-linux-arm64-gnu": ["@next/swc-linux-arm64-gnu@16.1.6", "", { "os": "linux", "cpu": "arm64" }, "sha512-OJYkCd5pj/QloBvoEcJ2XiMnlJkRv9idWA/j0ugSuA34gMT6f5b7vOiCQHVRpvStoZUknhl6/UxOXL4OwtdaBw=="],
"@next/swc-linux-arm64-musl": ["@next/swc-linux-arm64-musl@16.1.6", "", { "os": "linux", "cpu": "arm64" }, "sha512-S4J2v+8tT3NIO9u2q+S0G5KdvNDjXfAv06OhfOzNDaBn5rw84DGXWndOEB7d5/x852A20sW1M56vhC/tRVbccQ=="],
"@next/swc-linux-x64-gnu": ["@next/swc-linux-x64-gnu@16.1.6", "", { "os": "linux", "cpu": "x64" }, "sha512-2eEBDkFlMMNQnkTyPBhQOAyn2qMxyG2eE7GPH2WIDGEpEILcBPI/jdSv4t6xupSP+ot/jkfrCShLAa7+ZUPcJQ=="],
"@next/swc-linux-x64-musl": ["@next/swc-linux-x64-musl@16.1.6", "", { "os": "linux", "cpu": "x64" }, "sha512-oicJwRlyOoZXVlxmIMaTq7f8pN9QNbdes0q2FXfRsPhfCi8n8JmOZJm5oo1pwDaFbnnD421rVU409M3evFbIqg=="],
"@next/swc-win32-arm64-msvc": ["@next/swc-win32-arm64-msvc@16.1.6", "", { "os": "win32", "cpu": "arm64" }, "sha512-gQmm8izDTPgs+DCWH22kcDmuUp7NyiJgEl18bcr8irXA5N2m2O+JQIr6f3ct42GOs9c0h8QF3L5SzIxcYAAXXw=="],
"@next/swc-win32-x64-msvc": ["@next/swc-win32-x64-msvc@16.1.6", "", { "os": "win32", "cpu": "x64" }, "sha512-NRfO39AIrzBnixKbjuo2YiYhB6o9d8v/ymU9m/Xk8cyVk+k7XylniXkHwjs4s70wedVffc6bQNbufk5v0xEm0A=="],
"@orama/orama": ["@orama/orama@3.1.18", "", {}, "sha512-a61ljmRVVyG5MC/698C8/FfFDw5a8LOIvyOLW5fztgUXqUpc1jOfQzOitSCbge657OgXXThmY3Tk8fpiDb4UcA=="],
"@radix-ui/number": ["@radix-ui/number@1.1.1", "", {}, "sha512-MkKCwxlXTgz6CFoJx3pCwn07GKp36+aZyu/u2Ln2VrA5DcdyCZkASEDBTd8x5whTQQL5CiYf4prXKLcgQdv29g=="],
"@radix-ui/primitive": ["@radix-ui/primitive@1.1.3", "", {}, "sha512-JTF99U/6XIjCBo0wqkU5sK10glYe27MRRsfwoiq5zzOEZLHU3A3KCMa5X/azekYRCJ0HlwI0crAXS/5dEHTzDg=="],
"@radix-ui/react-accordion": ["@radix-ui/react-accordion@1.2.12", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-collapsible": "1.1.12", "@radix-ui/react-collection": "1.1.7", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-direction": "1.1.1", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-controllable-state": "1.2.2" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-T4nygeh9YE9dLRPhAHSeOZi7HBXo+0kYIPJXayZfvWOWA0+n3dESrZbjfDPUABkUNym6Hd+f2IR113To8D2GPA=="],
"@radix-ui/react-arrow": ["@radix-ui/react-arrow@1.1.7", "", { "dependencies": { "@radix-ui/react-primitive": "2.1.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-F+M1tLhO+mlQaOWspE8Wstg+z6PwxwRd8oQ8IXceWz92kfAmalTRf0EjrouQeo7QssEPfCn05B4Ihs1K9WQ/7w=="],
"@radix-ui/react-collapsible": ["@radix-ui/react-collapsible@1.1.12", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-presence": "1.1.5", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-controllable-state": "1.2.2", "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-Uu+mSh4agx2ib1uIGPP4/CKNULyajb3p92LsVXmH2EHVMTfZWpll88XJ0j4W0z3f8NK1eYl1+Mf/szHPmcHzyA=="],
"@radix-ui/react-collection": ["@radix-ui/react-collection@1.1.7", "", { "dependencies": { "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-slot": "1.2.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-Fh9rGN0MoI4ZFUNyfFVNU4y9LUz93u9/0K+yLgA2bwRojxM8JU1DyvvMBabnZPBgMWREAJvU2jjVzq+LrFUglw=="],
"@radix-ui/react-compose-refs": ["@radix-ui/react-compose-refs@1.1.2", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-z4eqJvfiNnFMHIIvXP3CY57y2WJs5g2v3X0zm9mEJkrkNv4rDxu+sg9Jh8EkXyeqBkB7SOcboo9dMVqhyrACIg=="],
"@radix-ui/react-context": ["@radix-ui/react-context@1.1.2", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-jCi/QKUM2r1Ju5a3J64TH2A5SpKAgh0LpknyqdQ4m6DCV0xJ2HG1xARRwNGPQfi1SLdLWZ1OJz6F4OMBBNiGJA=="],
"@radix-ui/react-dialog": ["@radix-ui/react-dialog@1.1.15", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-dismissable-layer": "1.1.11", "@radix-ui/react-focus-guards": "1.1.3", "@radix-ui/react-focus-scope": "1.1.7", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-portal": "1.1.9", "@radix-ui/react-presence": "1.1.5", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-slot": "1.2.3", "@radix-ui/react-use-controllable-state": "1.2.2", "aria-hidden": "^1.2.4", "react-remove-scroll": "^2.6.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-TCglVRtzlffRNxRMEyR36DGBLJpeusFcgMVD9PZEzAKnUs1lKCgX5u9BmC2Yg+LL9MgZDugFFs1Vl+Jp4t/PGw=="],
"@radix-ui/react-direction": ["@radix-ui/react-direction@1.1.1", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-1UEWRX6jnOA2y4H5WczZ44gOOjTEmlqv1uNW4GAJEO5+bauCBhv8snY65Iw5/VOS/ghKN9gr2KjnLKxrsvoMVw=="],
"@radix-ui/react-dismissable-layer": ["@radix-ui/react-dismissable-layer@1.1.11", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-callback-ref": "1.1.1", "@radix-ui/react-use-escape-keydown": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-Nqcp+t5cTB8BinFkZgXiMJniQH0PsUt2k51FUhbdfeKvc4ACcG2uQniY/8+h1Yv6Kza4Q7lD7PQV0z0oicE0Mg=="],
"@radix-ui/react-focus-guards": ["@radix-ui/react-focus-guards@1.1.3", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-0rFg/Rj2Q62NCm62jZw0QX7a3sz6QCQU0LpZdNrJX8byRGaGVTqbrW9jAoIAHyMQqsNpeZ81YgSizOt5WXq0Pw=="],
"@radix-ui/react-focus-scope": ["@radix-ui/react-focus-scope@1.1.7", "", { "dependencies": { "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-callback-ref": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-t2ODlkXBQyn7jkl6TNaw/MtVEVvIGelJDCG41Okq/KwUsJBwQ4XVZsHAVUkK4mBv3ewiAS3PGuUWuY2BoK4ZUw=="],
"@radix-ui/react-id": ["@radix-ui/react-id@1.1.1", "", { "dependencies": { "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-kGkGegYIdQsOb4XjsfM97rXsiHaBwco+hFI66oO4s9LU+PLAC5oJ7khdOVFxkhsmlbpUqDAvXw11CluXP+jkHg=="],
"@radix-ui/react-navigation-menu": ["@radix-ui/react-navigation-menu@1.2.14", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-collection": "1.1.7", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-direction": "1.1.1", "@radix-ui/react-dismissable-layer": "1.1.11", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-presence": "1.1.5", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-callback-ref": "1.1.1", "@radix-ui/react-use-controllable-state": "1.2.2", "@radix-ui/react-use-layout-effect": "1.1.1", "@radix-ui/react-use-previous": "1.1.1", "@radix-ui/react-visually-hidden": "1.2.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-YB9mTFQvCOAQMHU+C/jVl96WmuWeltyUEpRJJky51huhds5W2FQr1J8D/16sQlf0ozxkPK8uF3niQMdUwZPv5w=="],
"@radix-ui/react-popover": ["@radix-ui/react-popover@1.1.15", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-dismissable-layer": "1.1.11", "@radix-ui/react-focus-guards": "1.1.3", "@radix-ui/react-focus-scope": "1.1.7", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-popper": "1.2.8", "@radix-ui/react-portal": "1.1.9", "@radix-ui/react-presence": "1.1.5", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-slot": "1.2.3", "@radix-ui/react-use-controllable-state": "1.2.2", "aria-hidden": "^1.2.4", "react-remove-scroll": "^2.6.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-kr0X2+6Yy/vJzLYJUPCZEc8SfQcf+1COFoAqauJm74umQhta9M7lNJHP7QQS3vkvcGLQUbWpMzwrXYwrYztHKA=="],
"@radix-ui/react-popper": ["@radix-ui/react-popper@1.2.8", "", { "dependencies": { "@floating-ui/react-dom": "^2.0.0", "@radix-ui/react-arrow": "1.1.7", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-callback-ref": "1.1.1", "@radix-ui/react-use-layout-effect": "1.1.1", "@radix-ui/react-use-rect": "1.1.1", "@radix-ui/react-use-size": "1.1.1", "@radix-ui/rect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-0NJQ4LFFUuWkE7Oxf0htBKS6zLkkjBH+hM1uk7Ng705ReR8m/uelduy1DBo0PyBXPKVnBA6YBlU94MBGXrSBCw=="],
"@radix-ui/react-portal": ["@radix-ui/react-portal@1.1.9", "", { "dependencies": { "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-bpIxvq03if6UNwXZ+HTK71JLh4APvnXntDc6XOX8UVq4XQOVl7lwok0AvIl+b8zgCw3fSaVTZMpAPPagXbKmHQ=="],
"@radix-ui/react-presence": ["@radix-ui/react-presence@1.1.5", "", { "dependencies": { "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-/jfEwNDdQVBCNvjkGit4h6pMOzq8bHkopq458dPt2lMjx+eBQUohZNG9A7DtO/O5ukSbxuaNGXMjHicgwy6rQQ=="],
"@radix-ui/react-primitive": ["@radix-ui/react-primitive@2.1.3", "", { "dependencies": { "@radix-ui/react-slot": "1.2.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-m9gTwRkhy2lvCPe6QJp4d3G1TYEUHn/FzJUtq9MjH46an1wJU+GdoGC5VLof8RX8Ft/DlpshApkhswDLZzHIcQ=="],
"@radix-ui/react-roving-focus": ["@radix-ui/react-roving-focus@1.1.11", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-collection": "1.1.7", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-direction": "1.1.1", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-callback-ref": "1.1.1", "@radix-ui/react-use-controllable-state": "1.2.2" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-7A6S9jSgm/S+7MdtNDSb+IU859vQqJ/QAtcYQcfFC6W8RS4IxIZDldLR0xqCFZ6DCyrQLjLPsxtTNch5jVA4lA=="],
"@radix-ui/react-scroll-area": ["@radix-ui/react-scroll-area@1.2.10", "", { "dependencies": { "@radix-ui/number": "1.1.1", "@radix-ui/primitive": "1.1.3", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-direction": "1.1.1", "@radix-ui/react-presence": "1.1.5", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-callback-ref": "1.1.1", "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-tAXIa1g3sM5CGpVT0uIbUx/U3Gs5N8T52IICuCtObaos1S8fzsrPXG5WObkQN3S6NVl6wKgPhAIiBGbWnvc97A=="],
"@radix-ui/react-select": ["@radix-ui/react-select@2.2.6", "", { "dependencies": { "@radix-ui/number": "1.1.1", "@radix-ui/primitive": "1.1.3", "@radix-ui/react-collection": "1.1.7", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-direction": "1.1.1", "@radix-ui/react-dismissable-layer": "1.1.11", "@radix-ui/react-focus-guards": "1.1.3", "@radix-ui/react-focus-scope": "1.1.7", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-popper": "1.2.8", "@radix-ui/react-portal": "1.1.9", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-slot": "1.2.3", "@radix-ui/react-use-callback-ref": "1.1.1", "@radix-ui/react-use-controllable-state": "1.2.2", "@radix-ui/react-use-layout-effect": "1.1.1", "@radix-ui/react-use-previous": "1.1.1", "@radix-ui/react-visually-hidden": "1.2.3", "aria-hidden": "^1.2.4", "react-remove-scroll": "^2.6.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-I30RydO+bnn2PQztvo25tswPH+wFBjehVGtmagkU78yMdwTwVf12wnAOF+AeP8S2N8xD+5UPbGhkUfPyvT+mwQ=="],
"@radix-ui/react-slot": ["@radix-ui/react-slot@1.2.3", "", { "dependencies": { "@radix-ui/react-compose-refs": "1.1.2" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-aeNmHnBxbi2St0au6VBVC7JXFlhLlOnvIIlePNniyUNAClzmtAUEY8/pBiK3iHjufOlwA+c20/8jngo7xcrg8A=="],
"@radix-ui/react-tabs": ["@radix-ui/react-tabs@1.1.13", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-direction": "1.1.1", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-presence": "1.1.5", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-roving-focus": "1.1.11", "@radix-ui/react-use-controllable-state": "1.2.2" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-7xdcatg7/U+7+Udyoj2zodtI9H/IIopqo+YOIcZOq1nJwXWBZ9p8xiu5llXlekDbZkca79a/fozEYQXIA4sW6A=="],
"@radix-ui/react-use-callback-ref": ["@radix-ui/react-use-callback-ref@1.1.1", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-FkBMwD+qbGQeMu1cOHnuGB6x4yzPjho8ap5WtbEJ26umhgqVXbhekKUQO+hZEL1vU92a3wHwdp0HAcqAUF5iDg=="],
"@radix-ui/react-use-controllable-state": ["@radix-ui/react-use-controllable-state@1.2.2", "", { "dependencies": { "@radix-ui/react-use-effect-event": "0.0.2", "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-BjasUjixPFdS+NKkypcyyN5Pmg83Olst0+c6vGov0diwTEo6mgdqVR6hxcEgFuh4QrAs7Rc+9KuGJ9TVCj0Zzg=="],
"@radix-ui/react-use-effect-event": ["@radix-ui/react-use-effect-event@0.0.2", "", { "dependencies": { "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-Qp8WbZOBe+blgpuUT+lw2xheLP8q0oatc9UpmiemEICxGvFLYmHm9QowVZGHtJlGbS6A6yJ3iViad/2cVjnOiA=="],
"@radix-ui/react-use-escape-keydown": ["@radix-ui/react-use-escape-keydown@1.1.1", "", { "dependencies": { "@radix-ui/react-use-callback-ref": "1.1.1" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-Il0+boE7w/XebUHyBjroE+DbByORGR9KKmITzbR7MyQ4akpORYP/ZmbhAr0DG7RmmBqoOnZdy2QlvajJ2QA59g=="],
"@radix-ui/react-use-layout-effect": ["@radix-ui/react-use-layout-effect@1.1.1", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-RbJRS4UWQFkzHTTwVymMTUv8EqYhOp8dOOviLj2ugtTiXRaRQS7GLGxZTLL1jWhMeoSCf5zmcZkqTl9IiYfXcQ=="],
"@radix-ui/react-use-previous": ["@radix-ui/react-use-previous@1.1.1", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-2dHfToCj/pzca2Ck724OZ5L0EVrr3eHRNsG/b3xQJLA2hZpVCS99bLAX+hm1IHXDEnzU6by5z/5MIY794/a8NQ=="],
"@radix-ui/react-use-rect": ["@radix-ui/react-use-rect@1.1.1", "", { "dependencies": { "@radix-ui/rect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-QTYuDesS0VtuHNNvMh+CjlKJ4LJickCMUAqjlE3+j8w+RlRpwyX3apEQKGFzbZGdo7XNG1tXa+bQqIE7HIXT2w=="],
"@radix-ui/react-use-size": ["@radix-ui/react-use-size@1.1.1", "", { "dependencies": { "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-ewrXRDTAqAXlkl6t/fkXWNAhFX9I+CkKlw6zjEwk86RSPKwZr3xpBRso655aqYafwtnbpHLj6toFzmd6xdVptQ=="],
"@radix-ui/react-visually-hidden": ["@radix-ui/react-visually-hidden@1.2.3", "", { "dependencies": { "@radix-ui/react-primitive": "2.1.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-pzJq12tEaaIhqjbzpCuv/OypJY/BPavOofm+dbab+MHLajy277+1lLm6JFcGgF5eskJ6mquGirhXY2GD/8u8Ug=="],
"@radix-ui/rect": ["@radix-ui/rect@1.1.1", "", {}, "sha512-HPwpGIzkl28mWyZqG52jiqDJ12waP11Pa1lGoiyUkIEuMLBP0oeK/C89esbXrxsky5we7dfd8U58nm0SgAWpVw=="],
"@scalar/helpers": ["@scalar/helpers@0.2.10", "", {}, "sha512-VS32setBEAGY9JifuDZKHIq8SUCUWLEfL1V+h3s5V4wcmE8OZVkzaJemsMq/YAM9e7gb9ZbkvJLL4zzEvPSrVg=="],
"@scalar/json-magic": ["@scalar/json-magic@0.9.5", "", { "dependencies": { "@scalar/helpers": "0.2.10", "yaml": "^2.8.0" } }, "sha512-+IZngReH0P+ima7y9u/f5QJD60AdISG81ezhwEVrYhsp46PiJp7YyOd0z1YLiOgwV0jkPlPo74T/FVBcM2ejuw=="],
"@scalar/openapi-parser": ["@scalar/openapi-parser@0.24.5", "", { "dependencies": { "@scalar/json-magic": "0.9.4", "@scalar/openapi-types": "0.5.3", "@scalar/openapi-upgrader": "0.1.8", "ajv": "^8.17.1", "ajv-draft-04": "^1.0.0", "ajv-formats": "^3.0.1", "jsonpointer": "^5.0.1", "leven": "^4.0.0", "yaml": "^2.8.0" } }, "sha512-pTeKnmhVdSIfG3vysgDm6jsKc7Do1vXdy/4aqp7j8AEzXllf8RZjSgRSUhtvFYFQCr27fDZ117V3WPQUYtgmCw=="],
"@scalar/openapi-types": ["@scalar/openapi-types@0.5.3", "", { "dependencies": { "zod": "^4.1.11" } }, "sha512-m4n/Su3K01d15dmdWO1LlqecdSPKuNjuokrJLdiQ485kW/hRHbXW1QP6tJL75myhw/XhX5YhYAR+jrwnGjXiMw=="],
"@scalar/openapi-upgrader": ["@scalar/openapi-upgrader@0.1.8", "", { "dependencies": { "@scalar/openapi-types": "0.5.3" } }, "sha512-2xuYLLs0fBadLIk4I1ObjMiCnOyLPEMPf24A1HtHQvhKGDnGlvT63F2rU2Xw8lxCjgHnzveMPnOJEbwIy64RCg=="],
"@shikijs/core": ["@shikijs/core@3.22.0", "", { "dependencies": { "@shikijs/types": "3.22.0", "@shikijs/vscode-textmate": "^10.0.2", "@types/hast": "^3.0.4", "hast-util-to-html": "^9.0.5" } }, "sha512-iAlTtSDDbJiRpvgL5ugKEATDtHdUVkqgHDm/gbD2ZS9c88mx7G1zSYjjOxp5Qa0eaW0MAQosFRmJSk354PRoQA=="],
"@shikijs/engine-javascript": ["@shikijs/engine-javascript@3.22.0", "", { "dependencies": { "@shikijs/types": "3.22.0", "@shikijs/vscode-textmate": "^10.0.2", "oniguruma-to-es": "^4.3.4" } }, "sha512-jdKhfgW9CRtj3Tor0L7+yPwdG3CgP7W+ZEqSsojrMzCjD1e0IxIbwUMDDpYlVBlC08TACg4puwFGkZfLS+56Tw=="],
"@shikijs/engine-oniguruma": ["@shikijs/engine-oniguruma@3.22.0", "", { "dependencies": { "@shikijs/types": "3.22.0", "@shikijs/vscode-textmate": "^10.0.2" } }, "sha512-DyXsOG0vGtNtl7ygvabHd7Mt5EY8gCNqR9Y7Lpbbd/PbJvgWrqaKzH1JW6H6qFkuUa8aCxoiYVv8/YfFljiQxA=="],
"@shikijs/langs": ["@shikijs/langs@3.22.0", "", { "dependencies": { "@shikijs/types": "3.22.0" } }, "sha512-x/42TfhWmp6H00T6uwVrdTJGKgNdFbrEdhaDwSR5fd5zhQ1Q46bHq9EO61SCEWJR0HY7z2HNDMaBZp8JRmKiIA=="],
"@shikijs/rehype": ["@shikijs/rehype@3.22.0", "", { "dependencies": { "@shikijs/types": "3.22.0", "@types/hast": "^3.0.4", "hast-util-to-string": "^3.0.1", "shiki": "3.22.0", "unified": "^11.0.5", "unist-util-visit": "^5.1.0" } }, "sha512-69b2VPc6XBy/VmAJlpBU5By+bJSBdE2nvgRCZXav7zujbrjXuT0F60DIrjKuutjPqNufuizE+E8tIZr2Yn8Z+g=="],
"@shikijs/themes": ["@shikijs/themes@3.22.0", "", { "dependencies": { "@shikijs/types": "3.22.0" } }, "sha512-o+tlOKqsr6FE4+mYJG08tfCFDS+3CG20HbldXeVoyP+cYSUxDhrFf3GPjE60U55iOkkjbpY2uC3It/eeja35/g=="],
"@shikijs/transformers": ["@shikijs/transformers@3.22.0", "", { "dependencies": { "@shikijs/core": "3.22.0", "@shikijs/types": "3.22.0" } }, "sha512-E7eRV7mwDBjueLF6852n2oYeJYxBq3NSsDk+uyruYAXONv4U8holGmIrT+mPRJQ1J1SNOH6L8G19KRzmBawrFw=="],
"@shikijs/types": ["@shikijs/types@3.22.0", "", { "dependencies": { "@shikijs/vscode-textmate": "^10.0.2", "@types/hast": "^3.0.4" } }, "sha512-491iAekgKDBFE67z70Ok5a8KBMsQ2IJwOWw3us/7ffQkIBCyOQfm/aNwVMBUriP02QshIfgHCBSIYAl3u2eWjg=="],
"@shikijs/vscode-textmate": ["@shikijs/vscode-textmate@10.0.2", "", {}, "sha512-83yeghZ2xxin3Nj8z1NMd/NCuca+gsYXswywDy5bHvwlWL8tpTQmzGeUuHd9FC3E/SBEMvzJRwWEOz5gGes9Qg=="],
"@standard-schema/spec": ["@standard-schema/spec@1.1.0", "", {}, "sha512-l2aFy5jALhniG5HgqrD6jXLi/rUWrKvqN/qJx6yoJsgKhblVd+iqqU4RCXavm/jPityDo5TCvKMnpjKnOriy0w=="],
"@swc/helpers": ["@swc/helpers@0.5.15", "", { "dependencies": { "tslib": "^2.8.0" } }, "sha512-JQ5TuMi45Owi4/BIMAJBoSQoOJu12oOk/gADqlcUL9JEdHB8vyjUSsxqeNXnmXHjYKMi2WcYtezGEEhqUI/E2g=="],
"@tailwindcss/node": ["@tailwindcss/node@4.1.18", "", { "dependencies": { "@jridgewell/remapping": "^2.3.4", "enhanced-resolve": "^5.18.3", "jiti": "^2.6.1", "lightningcss": "1.30.2", "magic-string": "^0.30.21", "source-map-js": "^1.2.1", "tailwindcss": "4.1.18" } }, "sha512-DoR7U1P7iYhw16qJ49fgXUlry1t4CpXeErJHnQ44JgTSKMaZUdf17cfn5mHchfJ4KRBZRFA/Coo+MUF5+gOaCQ=="],
"@tailwindcss/oxide": ["@tailwindcss/oxide@4.1.18", "", { "optionalDependencies": { "@tailwindcss/oxide-android-arm64": "4.1.18", "@tailwindcss/oxide-darwin-arm64": "4.1.18", "@tailwindcss/oxide-darwin-x64": "4.1.18", "@tailwindcss/oxide-freebsd-x64": "4.1.18", "@tailwindcss/oxide-linux-arm-gnueabihf": "4.1.18", "@tailwindcss/oxide-linux-arm64-gnu": "4.1.18", "@tailwindcss/oxide-linux-arm64-musl": "4.1.18", "@tailwindcss/oxide-linux-x64-gnu": "4.1.18", "@tailwindcss/oxide-linux-x64-musl": "4.1.18", "@tailwindcss/oxide-wasm32-wasi": "4.1.18", "@tailwindcss/oxide-win32-arm64-msvc": "4.1.18", "@tailwindcss/oxide-win32-x64-msvc": "4.1.18" } }, "sha512-EgCR5tTS5bUSKQgzeMClT6iCY3ToqE1y+ZB0AKldj809QXk1Y+3jB0upOYZrn9aGIzPtUsP7sX4QQ4XtjBB95A=="],
"@tailwindcss/oxide-android-arm64": ["@tailwindcss/oxide-android-arm64@4.1.18", "", { "os": "android", "cpu": "arm64" }, "sha512-dJHz7+Ugr9U/diKJA0W6N/6/cjI+ZTAoxPf9Iz9BFRF2GzEX8IvXxFIi/dZBloVJX/MZGvRuFA9rqwdiIEZQ0Q=="],
"@tailwindcss/oxide-darwin-arm64": ["@tailwindcss/oxide-darwin-arm64@4.1.18", "", { "os": "darwin", "cpu": "arm64" }, "sha512-Gc2q4Qhs660bhjyBSKgq6BYvwDz4G+BuyJ5H1xfhmDR3D8HnHCmT/BSkvSL0vQLy/nkMLY20PQ2OoYMO15Jd0A=="],
"@tailwindcss/oxide-darwin-x64": ["@tailwindcss/oxide-darwin-x64@4.1.18", "", { "os": "darwin", "cpu": "x64" }, "sha512-FL5oxr2xQsFrc3X9o1fjHKBYBMD1QZNyc1Xzw/h5Qu4XnEBi3dZn96HcHm41c/euGV+GRiXFfh2hUCyKi/e+yw=="],
"@tailwindcss/oxide-freebsd-x64": ["@tailwindcss/oxide-freebsd-x64@4.1.18", "", { "os": "freebsd", "cpu": "x64" }, "sha512-Fj+RHgu5bDodmV1dM9yAxlfJwkkWvLiRjbhuO2LEtwtlYlBgiAT4x/j5wQr1tC3SANAgD+0YcmWVrj8R9trVMA=="],
"@tailwindcss/oxide-linux-arm-gnueabihf": ["@tailwindcss/oxide-linux-arm-gnueabihf@4.1.18", "", { "os": "linux", "cpu": "arm" }, "sha512-Fp+Wzk/Ws4dZn+LV2Nqx3IilnhH51YZoRaYHQsVq3RQvEl+71VGKFpkfHrLM/Li+kt5c0DJe/bHXK1eHgDmdiA=="],
"@tailwindcss/oxide-linux-arm64-gnu": ["@tailwindcss/oxide-linux-arm64-gnu@4.1.18", "", { "os": "linux", "cpu": "arm64" }, "sha512-S0n3jboLysNbh55Vrt7pk9wgpyTTPD0fdQeh7wQfMqLPM/Hrxi+dVsLsPrycQjGKEQk85Kgbx+6+QnYNiHalnw=="],
"@tailwindcss/oxide-linux-arm64-musl": ["@tailwindcss/oxide-linux-arm64-musl@4.1.18", "", { "os": "linux", "cpu": "arm64" }, "sha512-1px92582HkPQlaaCkdRcio71p8bc8i/ap5807tPRDK/uw953cauQBT8c5tVGkOwrHMfc2Yh6UuxaH4vtTjGvHg=="],
"@tailwindcss/oxide-linux-x64-gnu": ["@tailwindcss/oxide-linux-x64-gnu@4.1.18", "", { "os": "linux", "cpu": "x64" }, "sha512-v3gyT0ivkfBLoZGF9LyHmts0Isc8jHZyVcbzio6Wpzifg/+5ZJpDiRiUhDLkcr7f/r38SWNe7ucxmGW3j3Kb/g=="],
"@tailwindcss/oxide-linux-x64-musl": ["@tailwindcss/oxide-linux-x64-musl@4.1.18", "", { "os": "linux", "cpu": "x64" }, "sha512-bhJ2y2OQNlcRwwgOAGMY0xTFStt4/wyU6pvI6LSuZpRgKQwxTec0/3Scu91O8ir7qCR3AuepQKLU/kX99FouqQ=="],
"@tailwindcss/oxide-wasm32-wasi": ["@tailwindcss/oxide-wasm32-wasi@4.1.18", "", { "dependencies": { "@emnapi/core": "^1.7.1", "@emnapi/runtime": "^1.7.1", "@emnapi/wasi-threads": "^1.1.0", "@napi-rs/wasm-runtime": "^1.1.0", "@tybys/wasm-util": "^0.10.1", "tslib": "^2.4.0" }, "cpu": "none" }, "sha512-LffYTvPjODiP6PT16oNeUQJzNVyJl1cjIebq/rWWBF+3eDst5JGEFSc5cWxyRCJ0Mxl+KyIkqRxk1XPEs9x8TA=="],
"@tailwindcss/oxide-win32-arm64-msvc": ["@tailwindcss/oxide-win32-arm64-msvc@4.1.18", "", { "os": "win32", "cpu": "arm64" }, "sha512-HjSA7mr9HmC8fu6bdsZvZ+dhjyGCLdotjVOgLA2vEqxEBZaQo9YTX4kwgEvPCpRh8o4uWc4J/wEoFzhEmjvPbA=="],
"@tailwindcss/oxide-win32-x64-msvc": ["@tailwindcss/oxide-win32-x64-msvc@4.1.18", "", { "os": "win32", "cpu": "x64" }, "sha512-bJWbyYpUlqamC8dpR7pfjA0I7vdF6t5VpUGMWRkXVE3AXgIZjYUYAK7II1GNaxR8J1SSrSrppRar8G++JekE3Q=="],
"@tailwindcss/postcss": ["@tailwindcss/postcss@4.1.18", "", { "dependencies": { "@alloc/quick-lru": "^5.2.0", "@tailwindcss/node": "4.1.18", "@tailwindcss/oxide": "4.1.18", "postcss": "^8.4.41", "tailwindcss": "4.1.18" } }, "sha512-Ce0GFnzAOuPyfV5SxjXGn0CubwGcuDB0zcdaPuCSzAa/2vII24JTkH+I6jcbXLb1ctjZMZZI6OjDaLPJQL1S0g=="],
"@types/debug": ["@types/debug@4.1.12", "", { "dependencies": { "@types/ms": "*" } }, "sha512-vIChWdVG3LG1SMxEvI/AK+FWJthlrqlTu7fbrlywTkkaONwk/UAGaULXRlf8vkzFBLVm0zkMdCquhL5aOjhXPQ=="],
"@types/estree": ["@types/estree@1.0.8", "", {}, "sha512-dWHzHa2WqEXI/O1E9OjrocMTKJl2mSrEolh1Iomrv6U+JuNwaHXsXx9bLu5gG7BUWFIN0skIQJQ/L1rIex4X6w=="],
"@types/estree-jsx": ["@types/estree-jsx@1.0.5", "", { "dependencies": { "@types/estree": "*" } }, "sha512-52CcUVNFyfb1A2ALocQw/Dd1BQFNmSdkuC3BkZ6iqhdMfQz7JWOFRuJFloOzjk+6WijU56m9oKXFAXc7o3Towg=="],
"@types/hast": ["@types/hast@3.0.4", "", { "dependencies": { "@types/unist": "*" } }, "sha512-WPs+bbQw5aCj+x6laNGWLH3wviHtoCv/P3+otBhbOhJgG8qtpdAMlTCxLtsTWA7LH1Oh/bFCHsBn0TPS5m30EQ=="],
"@types/json-schema": ["@types/json-schema@7.0.15", "", {}, "sha512-5+fP8P8MFNC+AyZCDxrB2pkZFPGzqQWUzpSeuuVLvm8VMcorNYavBqoFcxK8bQz4Qsbn4oUEEem4wDLfcysGHA=="],
"@types/mdast": ["@types/mdast@4.0.4", "", { "dependencies": { "@types/unist": "*" } }, "sha512-kGaNbPh1k7AFzgpud/gMdvIm5xuECykRR+JnWKQno9TAXVa6WIVCGTPvYGekIDL4uwCZQSYbUxNBSb1aUo79oA=="],
"@types/mdx": ["@types/mdx@2.0.13", "", {}, "sha512-+OWZQfAYyio6YkJb3HLxDrvnx6SWWDbC0zVPfBRzUk0/nqoDyf6dNxQi3eArPe8rJ473nobTMQ/8Zk+LxJ+Yuw=="],
"@types/ms": ["@types/ms@2.1.0", "", {}, "sha512-GsCCIZDE/p3i96vtEqx+7dBUGXrc7zeSK3wwPHIaRThS+9OhWIXRqzs4d6k1SVU8g91DrNRWxWUGhp5KXQb2VA=="],
"@types/node": ["@types/node@24.10.9", "", { "dependencies": { "undici-types": "~7.16.0" } }, "sha512-ne4A0IpG3+2ETuREInjPNhUGis1SFjv1d5asp8MzEAGtOZeTeHVDOYqOgqfhvseqg/iXty2hjBf1zAOb7RNiNw=="],
"@types/react": ["@types/react@19.2.10", "", { "dependencies": { "csstype": "^3.2.2" } }, "sha512-WPigyYuGhgZ/cTPRXB2EwUw+XvsRA3GqHlsP4qteqrnnjDrApbS7MxcGr/hke5iUoeB7E/gQtrs9I37zAJ0Vjw=="],
"@types/react-dom": ["@types/react-dom@19.2.3", "", { "peerDependencies": { "@types/react": "^19.2.0" } }, "sha512-jp2L/eY6fn+KgVVQAOqYItbF0VY/YApe5Mz2F0aykSO8gx31bYCZyvSeYxCHKvzHG5eZjc+zyaS5BrBWya2+kQ=="],
"@types/unist": ["@types/unist@3.0.3", "", {}, "sha512-ko/gIFJRv177XgZsZcBwnqJN5x/Gien8qNOn0D5bQU/zAzVf9Zt3BlcUiLqhV9y4ARk0GbT3tnUiPNgnTXzc/Q=="],
"@ungap/structured-clone": ["@ungap/structured-clone@1.3.0", "", {}, "sha512-WmoN8qaIAo7WTYWbAZuG8PYEhn5fkz7dZrqTBZ7dtt//lL2Gwms1IcnQ5yHqjDfX8Ft5j4YzDM23f87zBfDe9g=="],
"acorn": ["acorn@8.15.0", "", { "bin": { "acorn": "bin/acorn" } }, "sha512-NZyJarBfL7nWwIq+FDL6Zp/yHEhePMNnnJ0y3qfieCrmNvYct8uvtiV41UvlSe6apAfk0fY1FbWx+NwfmpvtTg=="],
"acorn-jsx": ["acorn-jsx@5.3.2", "", { "peerDependencies": { "acorn": "^6.0.0 || ^7.0.0 || ^8.0.0" } }, "sha512-rq9s+JNhf0IChjtDXxllJ7g41oZk5SlXtp0LHwyA5cejwn7vKmKp4pPri6YEePv2PU65sAsegbXtIinmDFDXgQ=="],
"ajv": ["ajv@8.17.1", "", { "dependencies": { "fast-deep-equal": "^3.1.3", "fast-uri": "^3.0.1", "json-schema-traverse": "^1.0.0", "require-from-string": "^2.0.2" } }, "sha512-B/gBuNg5SiMTrPkC+A2+cW0RszwxYmn6VYxB/inlBStS5nx6xHIt/ehKRhIMhqusl7a8LjQoZnjCs5vhwxOQ1g=="],
"ajv-draft-04": ["ajv-draft-04@1.0.0", "", { "peerDependencies": { "ajv": "^8.5.0" }, "optionalPeers": ["ajv"] }, "sha512-mv00Te6nmYbRp5DCwclxtt7yV/joXJPGS7nM+97GdxvuttCOfgI3K4U25zboyeX0O+myI8ERluxQe5wljMmVIw=="],
"ajv-formats": ["ajv-formats@3.0.1", "", { "dependencies": { "ajv": "^8.0.0" } }, "sha512-8iUql50EUR+uUcdRQ3HDqa6EVyo3docL8g5WJ3FNcWmu62IbkGUue/pEyLBW8VGKKucTPgqeks4fIU1DA4yowQ=="],
"argparse": ["argparse@2.0.1", "", {}, "sha512-8+9WqebbFzpX9OR+Wa6O29asIogeRMzcGtAINdpMHHyAg10f05aSFVBbcEqGf/PXw1EjAZ+q2/bEBg3DvurK3Q=="],
"aria-hidden": ["aria-hidden@1.2.6", "", { "dependencies": { "tslib": "^2.0.0" } }, "sha512-ik3ZgC9dY/lYVVM++OISsaYDeg1tb0VtP5uL3ouh1koGOaUMDPpbFIei4JkFimWUFPn90sbMNMXQAIVOlnYKJA=="],
"astring": ["astring@1.9.0", "", { "bin": { "astring": "bin/astring" } }, "sha512-LElXdjswlqjWrPpJFg1Fx4wpkOCxj1TDHlSV4PlaRxHGWko024xICaa97ZkMfs6DRKlCguiAI+rbXv5GWwXIkg=="],
"bail": ["bail@2.0.2", "", {}, "sha512-0xO6mYd7JB2YesxDKplafRpsiOzPt9V02ddPCLbY1xYGPOX24NTyN50qnUxgCPcSoYMhKpAuBTjQoRZCAkUDRw=="],
"baseline-browser-mapping": ["baseline-browser-mapping@2.9.19", "", { "bin": { "baseline-browser-mapping": "dist/cli.js" } }, "sha512-ipDqC8FrAl/76p2SSWKSI+H9tFwm7vYqXQrItCuiVPt26Km0jS+NzSsBWAaBusvSbQcfJG+JitdMm+wZAgTYqg=="],
"caniuse-lite": ["caniuse-lite@1.0.30001766", "", {}, "sha512-4C0lfJ0/YPjJQHagaE9x2Elb69CIqEPZeG0anQt9SIvIoOH4a4uaRl73IavyO+0qZh6MDLH//DrXThEYKHkmYA=="],
"ccount": ["ccount@2.0.1", "", {}, "sha512-eyrF0jiFpY+3drT6383f1qhkbGsLSifNAjA61IUjZjmLCWjItY6LB9ft9YhoDgwfmclB2zhu51Lc7+95b8NRAg=="],
"character-entities": ["character-entities@2.0.2", "", {}, "sha512-shx7oQ0Awen/BRIdkjkvz54PnEEI/EjwXDSIZp86/KKdbafHh1Df/RYGBhn4hbe2+uKC9FnT5UCEdyPz3ai9hQ=="],
"character-entities-html4": ["character-entities-html4@2.1.0", "", {}, "sha512-1v7fgQRj6hnSwFpq1Eu0ynr/CDEw0rXo2B61qXrLNdHZmPKgb7fqS1a2JwF0rISo9q77jDI8VMEHoApn8qDoZA=="],
"character-entities-legacy": ["character-entities-legacy@3.0.0", "", {}, "sha512-RpPp0asT/6ufRm//AJVwpViZbGM/MkjQFxJccQRHmISF/22NBtsHqAWmL+/pmkPWoIUJdWyeVleTl1wydHATVQ=="],
"character-reference-invalid": ["character-reference-invalid@2.0.1", "", {}, "sha512-iBZ4F4wRbyORVsu0jPV7gXkOsGYjGHPmAyv+HiHG8gi5PtC9KI2j1+v8/tlibRvjoWX027ypmG/n0HtO5t7unw=="],
"chokidar": ["chokidar@4.0.3", "", { "dependencies": { "readdirp": "^4.0.1" } }, "sha512-Qgzu8kfBvo+cA4962jnP1KkS6Dop5NS6g7R5LFYJr4b8Ub94PPQXUksCw9PvXoeXPRRddRNC5C1JQUR2SMGtnA=="],
"class-variance-authority": ["class-variance-authority@0.7.1", "", { "dependencies": { "clsx": "^2.1.1" } }, "sha512-Ka+9Trutv7G8M6WT6SeiRWz792K5qEqIGEGzXKhAE6xOWAY6pPH8U+9IY3oCMv6kqTmLsv7Xh/2w2RigkePMsg=="],
"client-only": ["client-only@0.0.1", "", {}, "sha512-IV3Ou0jSMzZrd3pZ48nLkT9DA7Ag1pnPzaiQhpW7c3RbcqqzvzzVu+L8gfqMp/8IM2MQtSiqaCxrrcfu8I8rMA=="],
"clsx": ["clsx@2.1.1", "", {}, "sha512-eYm0QWBtUrBWZWG0d386OGAw16Z995PiOVo2B7bjWSbHedGl5e0ZWaq65kOGgUSNesEIDkB9ISbTg/JK9dhCZA=="],
"collapse-white-space": ["collapse-white-space@2.1.0", "", {}, "sha512-loKTxY1zCOuG4j9f6EPnuyyYkf58RnhhWTvRoZEokgB+WbdXehfjFviyOVYkqzEWz1Q5kRiZdBYS5SwxbQYwzw=="],
"comma-separated-tokens": ["comma-separated-tokens@2.0.3", "", {}, "sha512-Fu4hJdvzeylCfQPp9SGWidpzrMs7tTrlu6Vb8XGaRGck8QSNZJJp538Wrb60Lax4fPwR64ViY468OIUTbRlGZg=="],
"compute-scroll-into-view": ["compute-scroll-into-view@3.1.1", "", {}, "sha512-VRhuHOLoKYOy4UbilLbUzbYg93XLjv2PncJC50EuTWPA3gaja1UjBsUP/D/9/juV3vQFr6XBEzn9KCAHdUvOHw=="],
"cssesc": ["cssesc@3.0.0", "", { "bin": { "cssesc": "bin/cssesc" } }, "sha512-/Tb/JcjK111nNScGob5MNtsntNM1aCNUDipB/TkwZFhyDrrE47SOx/18wF2bbjgc3ZzCSKW1T5nt5EbFoAz/Vg=="],
"csstype": ["csstype@3.2.3", "", {}, "sha512-z1HGKcYy2xA8AGQfwrn0PAy+PB7X/GSj3UVJW9qKyn43xWa+gl5nXmU4qqLMRzWVLFC8KusUX8T/0kCiOYpAIQ=="],
"debug": ["debug@4.4.3", "", { "dependencies": { "ms": "^2.1.3" } }, "sha512-RGwwWnwQvkVfavKVt22FGLw+xYSdzARwm0ru6DhTVA3umU5hZc28V3kO4stgYryrTlLpuvgI9GiijltAjNbcqA=="],
"decode-named-character-reference": ["decode-named-character-reference@1.3.0", "", { "dependencies": { "character-entities": "^2.0.0" } }, "sha512-GtpQYB283KrPp6nRw50q3U9/VfOutZOe103qlN7BPP6Ad27xYnOIWv4lPzo8HCAL+mMZofJ9KEy30fq6MfaK6Q=="],
"dequal": ["dequal@2.0.3", "", {}, "sha512-0je+qPKHEMohvfRTCEo3CrPG6cAzAYgmzKyxRiYSSDkS6eGJdyVJm7WaYA5ECaAD9wLB2T4EEeymA5aFVcYXCA=="],
"detect-libc": ["detect-libc@2.1.2", "", {}, "sha512-Btj2BOOO83o3WyH59e8MgXsxEQVcarkUOpEYrubB0urwnN10yQ364rsiByU11nZlqWYZm05i/of7io4mzihBtQ=="],
"detect-node-es": ["detect-node-es@1.1.0", "", {}, "sha512-ypdmJU/TbBby2Dxibuv7ZLW3Bs1QEmM7nHjEANfohJLvE0XVujisn1qPJcZxg+qDucsr+bP6fLD1rPS3AhJ7EQ=="],
"devlop": ["devlop@1.1.0", "", { "dependencies": { "dequal": "^2.0.0" } }, "sha512-RWmIqhcFf1lRYBvNmr7qTNuyCt/7/ns2jbpp1+PalgE/rDQcBT0fioSMUpJ93irlUhC5hrg4cYqe6U+0ImW0rA=="],
"enhanced-resolve": ["enhanced-resolve@5.18.4", "", { "dependencies": { "graceful-fs": "^4.2.4", "tapable": "^2.2.0" } }, "sha512-LgQMM4WXU3QI+SYgEc2liRgznaD5ojbmY3sb8LxyguVkIg5FxdpTkvk72te2R38/TGKxH634oLxXRGY6d7AP+Q=="],
"esast-util-from-estree": ["esast-util-from-estree@2.0.0", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "devlop": "^1.0.0", "estree-util-visit": "^2.0.0", "unist-util-position-from-estree": "^2.0.0" } }, "sha512-4CyanoAudUSBAn5K13H4JhsMH6L9ZP7XbLVe/dKybkxMO7eDyLsT8UHl9TRNrU2Gr9nz+FovfSIjuXWJ81uVwQ=="],
"esast-util-from-js": ["esast-util-from-js@2.0.1", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "acorn": "^8.0.0", "esast-util-from-estree": "^2.0.0", "vfile-message": "^4.0.0" } }, "sha512-8Ja+rNJ0Lt56Pcf3TAmpBZjmx8ZcK5Ts4cAzIOjsjevg9oSXJnl6SUQ2EevU8tv3h6ZLWmoKL5H4fgWvdvfETw=="],
"esbuild": ["esbuild@0.25.12", "", { "optionalDependencies": { "@esbuild/aix-ppc64": "0.25.12", "@esbuild/android-arm": "0.25.12", "@esbuild/android-arm64": "0.25.12", "@esbuild/android-x64": "0.25.12", "@esbuild/darwin-arm64": "0.25.12", "@esbuild/darwin-x64": "0.25.12", "@esbuild/freebsd-arm64": "0.25.12", "@esbuild/freebsd-x64": "0.25.12", "@esbuild/linux-arm": "0.25.12", "@esbuild/linux-arm64": "0.25.12", "@esbuild/linux-ia32": "0.25.12", "@esbuild/linux-loong64": "0.25.12", "@esbuild/linux-mips64el": "0.25.12", "@esbuild/linux-ppc64": "0.25.12", "@esbuild/linux-riscv64": "0.25.12", "@esbuild/linux-s390x": "0.25.12", "@esbuild/linux-x64": "0.25.12", "@esbuild/netbsd-arm64": "0.25.12", "@esbuild/netbsd-x64": "0.25.12", "@esbuild/openbsd-arm64": "0.25.12", "@esbuild/openbsd-x64": "0.25.12", "@esbuild/openharmony-arm64": "0.25.12", "@esbuild/sunos-x64": "0.25.12", "@esbuild/win32-arm64": "0.25.12", "@esbuild/win32-ia32": "0.25.12", "@esbuild/win32-x64": "0.25.12" }, "bin": { "esbuild": "bin/esbuild" } }, "sha512-bbPBYYrtZbkt6Os6FiTLCTFxvq4tt3JKall1vRwshA3fdVztsLAatFaZobhkBC8/BrPetoa0oksYoKXoG4ryJg=="],
"escape-string-regexp": ["escape-string-regexp@5.0.0", "", {}, "sha512-/veY75JbMK4j1yjvuUxuVsiS/hr/4iHs9FTT6cgTexxdE0Ly/glccBAkloH/DofkjRbZU3bnoj38mOmhkZ0lHw=="],
"estree-util-attach-comments": ["estree-util-attach-comments@3.0.0", "", { "dependencies": { "@types/estree": "^1.0.0" } }, "sha512-cKUwm/HUcTDsYh/9FgnuFqpfquUbwIqwKM26BVCGDPVgvaCl/nDCCjUfiLlx6lsEZ3Z4RFxNbOQ60pkaEwFxGw=="],
"estree-util-build-jsx": ["estree-util-build-jsx@3.0.1", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "devlop": "^1.0.0", "estree-util-is-identifier-name": "^3.0.0", "estree-walker": "^3.0.0" } }, "sha512-8U5eiL6BTrPxp/CHbs2yMgP8ftMhR5ww1eIKoWRMlqvltHF8fZn5LRDvTKuxD3DUn+shRbLGqXemcP51oFCsGQ=="],
"estree-util-is-identifier-name": ["estree-util-is-identifier-name@3.0.0", "", {}, "sha512-hFtqIDZTIUZ9BXLb8y4pYGyk6+wekIivNVTcmvk8NoOh+VeRn5y6cEHzbURrWbfp1fIqdVipilzj+lfaadNZmg=="],
"estree-util-scope": ["estree-util-scope@1.0.0", "", { "dependencies": { "@types/estree": "^1.0.0", "devlop": "^1.0.0" } }, "sha512-2CAASclonf+JFWBNJPndcOpA8EMJwa0Q8LUFJEKqXLW6+qBvbFZuF5gItbQOs/umBUkjviCSDCbBwU2cXbmrhQ=="],
"estree-util-to-js": ["estree-util-to-js@2.0.0", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "astring": "^1.8.0", "source-map": "^0.7.0" } }, "sha512-WDF+xj5rRWmD5tj6bIqRi6CkLIXbbNQUcxQHzGysQzvHmdYG2G7p/Tf0J0gpxGgkeMZNTIjT/AoSvC9Xehcgdg=="],
"estree-util-value-to-estree": ["estree-util-value-to-estree@3.5.0", "", { "dependencies": { "@types/estree": "^1.0.0" } }, "sha512-aMV56R27Gv3QmfmF1MY12GWkGzzeAezAX+UplqHVASfjc9wNzI/X6hC0S9oxq61WT4aQesLGslWP9tKk6ghRZQ=="],
"estree-util-visit": ["estree-util-visit@2.0.0", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "@types/unist": "^3.0.0" } }, "sha512-m5KgiH85xAhhW8Wta0vShLcUvOsh3LLPI2YVwcbio1l7E09NTLL1EyMZFM1OyWowoH0skScNbhOPl4kcBgzTww=="],
"estree-walker": ["estree-walker@3.0.3", "", { "dependencies": { "@types/estree": "^1.0.0" } }, "sha512-7RUKfXgSMMkzt6ZuXmqapOurLGPPfgj6l9uRZ7lRGolvk0y2yocc35LdcxKC5PQZdn2DMqioAQ2NoWcrTKmm6g=="],
"extend": ["extend@3.0.2", "", {}, "sha512-fjquC59cD7CyW6urNXK0FBufkZcoiGG80wTuPujX590cB5Ttln20E2UB4S/WARVqhXffZl2LNgS+gQdPIIim/g=="],
"fast-deep-equal": ["fast-deep-equal@3.1.3", "", {}, "sha512-f3qQ9oQy9j2AhBe/H9VC91wLmKBCCU/gDOnKNAYG5hswO7BLKj09Hc5HYNz9cGI++xlpDCIgDaitVs03ATR84Q=="],
"fast-uri": ["fast-uri@3.1.0", "", {}, "sha512-iPeeDKJSWf4IEOasVVrknXpaBV0IApz/gp7S2bb7Z4Lljbl2MGJRqInZiUrQwV16cpzw/D3S5j5Julj/gT52AA=="],
"fast-xml-parser": ["fast-xml-parser@4.5.3", "", { "dependencies": { "strnum": "^1.1.1" }, "bin": { "fxparser": "src/cli/cli.js" } }, "sha512-RKihhV+SHsIUGXObeVy9AXiBbFwkVk7Syp8XgwN5U3JV416+Gwp/GO9i0JYKmikykgz/UHRrrV4ROuZEo/T0ig=="],
"fdir": ["fdir@6.5.0", "", { "peerDependencies": { "picomatch": "^3 || ^4" }, "optionalPeers": ["picomatch"] }, "sha512-tIbYtZbucOs0BRGqPJkshJUYdL+SDH7dVM8gjy+ERp3WAUjLEFJE+02kanyHtwjWOnwrKYBiwAmM0p4kLJAnXg=="],
"foreach": ["foreach@2.0.6", "", {}, "sha512-k6GAGDyqLe9JaebCsFCoudPPWfihKu8pylYXRlqP1J7ms39iPoTtk2fviNglIeQEwdh0bQeKJ01ZPyuyQvKzwg=="],
"fumadocs-core": ["fumadocs-core@16.4.11", "", { "dependencies": { "@formatjs/intl-localematcher": "^0.8.0", "@orama/orama": "^3.1.18", "@shikijs/rehype": "^3.21.0", "@shikijs/transformers": "^3.21.0", "estree-util-value-to-estree": "^3.5.0", "github-slugger": "^2.0.0", "hast-util-to-estree": "^3.1.3", "hast-util-to-jsx-runtime": "^2.3.6", "image-size": "^2.0.2", "negotiator": "^1.0.0", "npm-to-yarn": "^3.0.1", "path-to-regexp": "^8.3.0", "remark": "^15.0.1", "remark-gfm": "^4.0.1", "remark-rehype": "^11.1.2", "scroll-into-view-if-needed": "^3.1.0", "shiki": "^3.21.0", "tinyglobby": "^0.2.15", "unist-util-visit": "^5.1.0" }, "peerDependencies": { "@mixedbread/sdk": "^0.46.0", "@orama/core": "1.x.x", "@oramacloud/client": "2.x.x", "@tanstack/react-router": "1.x.x", "@types/react": "*", "algoliasearch": "5.x.x", "lucide-react": "*", "next": "16.x.x", "react": "^19.2.0", "react-dom": "^19.2.0", "react-router": "7.x.x", "waku": "^0.26.0 || ^0.27.0", "zod": "4.x.x" }, "optionalPeers": ["@mixedbread/sdk", "@orama/core", "@oramacloud/client", "@tanstack/react-router", "@types/react", "algoliasearch", "lucide-react", "next", "react", "react-dom", "react-router", "waku", "zod"] }, "sha512-ORjWgYetxDgyHZocuvEghfxt6tuEPWE+Km5KvwNKlXPxcNdBIiSVCED8WEMwiw1n/FZ/ys+W+BOe58ZXxhWg2A=="],
"fumadocs-mdx": ["fumadocs-mdx@13.0.8", "", { "dependencies": { "@mdx-js/mdx": "^3.1.1", "@standard-schema/spec": "^1.0.0", "chokidar": "^4.0.3", "esbuild": "^0.25.12", "estree-util-value-to-estree": "^3.5.0", "js-yaml": "^4.1.0", "lru-cache": "^11.2.2", "mdast-util-to-markdown": "^2.1.2", "picocolors": "^1.1.1", "picomatch": "^4.0.3", "remark-mdx": "^3.1.1", "tinyexec": "^1.0.2", "tinyglobby": "^0.2.15", "unified": "^11.0.5", "unist-util-remove-position": "^5.0.0", "unist-util-visit": "^5.0.0", "zod": "^4.1.12" }, "peerDependencies": { "@fumadocs/mdx-remote": "^1.4.0", "fumadocs-core": "^15.0.0 || ^16.0.0", "next": "^15.3.0 || ^16.0.0", "react": "*", "vite": "6.x.x || 7.x.x" }, "optionalPeers": ["@fumadocs/mdx-remote", "next", "react", "vite"], "bin": { "fumadocs-mdx": "dist/bin.js" } }, "sha512-UbUwH0iGvYbytnxhmfd7tWJKFK8L0mrbTAmrQYnpg6Wi/h8afNMJmbHBOzVcaEWJKeFipZ1CGDAsNA2fztwXNg=="],
"fumadocs-openapi": ["fumadocs-openapi@10.2.7", "", { "dependencies": { "@fumari/json-schema-to-typescript": "^2.0.0", "@fumari/stf": "^0.0.1", "@radix-ui/react-accordion": "^1.2.12", "@radix-ui/react-dialog": "^1.1.15", "@radix-ui/react-select": "^2.2.6", "@radix-ui/react-slot": "^1.2.4", "@scalar/json-magic": "^0.9.4", "@scalar/openapi-parser": "0.24.5", "ajv": "^8.17.1", "class-variance-authority": "^0.7.1", "github-slugger": "^2.0.0", "hast-util-to-jsx-runtime": "^2.3.6", "js-yaml": "^4.1.1", "lucide-react": "^0.563.0", "next-themes": "^0.4.6", "openapi-sampler": "^1.6.2", "react-hook-form": "^7.71.1", "remark": "^15.0.1", "remark-rehype": "^11.1.2", "tailwind-merge": "^3.4.0", "xml-js": "^1.6.11" }, "peerDependencies": { "@scalar/api-client-react": "*", "@types/react": "*", "fumadocs-core": "^16.2.0", "fumadocs-ui": "^16.2.0", "react": "^19.2.0", "react-dom": "^19.2.0" }, "optionalPeers": ["@scalar/api-client-react", "@types/react"] }, "sha512-V24iseZFHmUyPdVEH/nyR1205mltOamlHXvAGtJx9FteKj0li0Rf7o7EPkV9Mby202ReG2CIic1cR2oWa+i7Jg=="],
"fumadocs-ui": ["fumadocs-ui@16.4.11", "", { "dependencies": { "@fumadocs/ui": "16.4.11", "@radix-ui/react-accordion": "^1.2.12", "@radix-ui/react-collapsible": "^1.1.12", "@radix-ui/react-dialog": "^1.1.15", "@radix-ui/react-direction": "^1.1.1", "@radix-ui/react-navigation-menu": "^1.2.14", "@radix-ui/react-popover": "^1.1.15", "@radix-ui/react-presence": "^1.1.5", "@radix-ui/react-scroll-area": "^1.2.10", "@radix-ui/react-slot": "^1.2.4", "@radix-ui/react-tabs": "^1.1.13", "class-variance-authority": "^0.7.1", "lucide-react": "^0.563.0", "next-themes": "^0.4.6", "react-medium-image-zoom": "^5.4.0", "scroll-into-view-if-needed": "^3.1.0" }, "peerDependencies": { "@types/react": "*", "fumadocs-core": "16.4.11", "next": "16.x.x", "react": "^19.2.0", "react-dom": "^19.2.0", "tailwindcss": "^4.0.0" }, "optionalPeers": ["@types/react", "next", "tailwindcss"] }, "sha512-LFOzdnNFAFkOHzsUtCMi8cyal1pIZqygoQKSET0LO/C5JOk1YQKAZqiut1jf6pv6o0OKXacDk+MY7kfn61309A=="],
"get-nonce": ["get-nonce@1.0.1", "", {}, "sha512-FJhYRoDaiatfEkUK8HKlicmu/3SGFD51q3itKDGoSTysQJBnfOcxU5GxnhE1E6soB76MbT0MBtnKJuXyAx+96Q=="],
"github-slugger": ["github-slugger@2.0.0", "", {}, "sha512-IaOQ9puYtjrkq7Y0Ygl9KDZnrf/aiUJYUpVf89y8kyaxbRG7Y1SrX/jaumrv81vc61+kiMempujsM3Yw7w5qcw=="],
"graceful-fs": ["graceful-fs@4.2.11", "", {}, "sha512-RbJ5/jmFcNNCcDV5o9eTnBLJ/HszWV0P73bc+Ff4nS/rJj+YaS6IGyiOL0VoBYX+l1Wrl3k63h/KrH+nhJ0XvQ=="],
"hast-util-to-estree": ["hast-util-to-estree@3.1.3", "", { "dependencies": { "@types/estree": "^1.0.0", "@types/estree-jsx": "^1.0.0", "@types/hast": "^3.0.0", "comma-separated-tokens": "^2.0.0", "devlop": "^1.0.0", "estree-util-attach-comments": "^3.0.0", "estree-util-is-identifier-name": "^3.0.0", "hast-util-whitespace": "^3.0.0", "mdast-util-mdx-expression": "^2.0.0", "mdast-util-mdx-jsx": "^3.0.0", "mdast-util-mdxjs-esm": "^2.0.0", "property-information": "^7.0.0", "space-separated-tokens": "^2.0.0", "style-to-js": "^1.0.0", "unist-util-position": "^5.0.0", "zwitch": "^2.0.0" } }, "sha512-48+B/rJWAp0jamNbAAf9M7Uf//UVqAoMmgXhBdxTDJLGKY+LRnZ99qcG+Qjl5HfMpYNzS5v4EAwVEF34LeAj7w=="],
"hast-util-to-html": ["hast-util-to-html@9.0.5", "", { "dependencies": { "@types/hast": "^3.0.0", "@types/unist": "^3.0.0", "ccount": "^2.0.0", "comma-separated-tokens": "^2.0.0", "hast-util-whitespace": "^3.0.0", "html-void-elements": "^3.0.0", "mdast-util-to-hast": "^13.0.0", "property-information": "^7.0.0", "space-separated-tokens": "^2.0.0", "stringify-entities": "^4.0.0", "zwitch": "^2.0.4" } }, "sha512-OguPdidb+fbHQSU4Q4ZiLKnzWo8Wwsf5bZfbvu7//a9oTYoqD/fWpe96NuHkoS9h0ccGOTe0C4NGXdtS0iObOw=="],
"hast-util-to-jsx-runtime": ["hast-util-to-jsx-runtime@2.3.6", "", { "dependencies": { "@types/estree": "^1.0.0", "@types/hast": "^3.0.0", "@types/unist": "^3.0.0", "comma-separated-tokens": "^2.0.0", "devlop": "^1.0.0", "estree-util-is-identifier-name": "^3.0.0", "hast-util-whitespace": "^3.0.0", "mdast-util-mdx-expression": "^2.0.0", "mdast-util-mdx-jsx": "^3.0.0", "mdast-util-mdxjs-esm": "^2.0.0", "property-information": "^7.0.0", "space-separated-tokens": "^2.0.0", "style-to-js": "^1.0.0", "unist-util-position": "^5.0.0", "vfile-message": "^4.0.0" } }, "sha512-zl6s8LwNyo1P9uw+XJGvZtdFF1GdAkOg8ujOw+4Pyb76874fLps4ueHXDhXWdk6YHQ6OgUtinliG7RsYvCbbBg=="],
"hast-util-to-string": ["hast-util-to-string@3.0.1", "", { "dependencies": { "@types/hast": "^3.0.0" } }, "sha512-XelQVTDWvqcl3axRfI0xSeoVKzyIFPwsAGSLIsKdJKQMXDYJS4WYrBNF/8J7RdhIcFI2BOHgAifggsvsxp/3+A=="],
"hast-util-whitespace": ["hast-util-whitespace@3.0.0", "", { "dependencies": { "@types/hast": "^3.0.0" } }, "sha512-88JUN06ipLwsnv+dVn+OIYOvAuvBMy/Qoi6O7mQHxdPXpjy+Cd6xRkWwux7DKO+4sYILtLBRIKgsdpS2gQc7qw=="],
"html-void-elements": ["html-void-elements@3.0.0", "", {}, "sha512-bEqo66MRXsUGxWHV5IP0PUiAWwoEjba4VCzg0LjFJBpchPaTfyfCKTG6bc5F8ucKec3q5y6qOdGyYTSBEvhCrg=="],
"image-size": ["image-size@2.0.2", "", { "bin": { "image-size": "bin/image-size.js" } }, "sha512-IRqXKlaXwgSMAMtpNzZa1ZAe8m+Sa1770Dhk8VkSsP9LS+iHD62Zd8FQKs8fbPiagBE7BzoFX23cxFnwshpV6w=="],
"inline-style-parser": ["inline-style-parser@0.2.7", "", {}, "sha512-Nb2ctOyNR8DqQoR0OwRG95uNWIC0C1lCgf5Naz5H6Ji72KZ8OcFZLz2P5sNgwlyoJ8Yif11oMuYs5pBQa86csA=="],
"is-alphabetical": ["is-alphabetical@2.0.1", "", {}, "sha512-FWyyY60MeTNyeSRpkM2Iry0G9hpr7/9kD40mD/cGQEuilcZYS4okz8SN2Q6rLCJ8gbCt6fN+rC+6tMGS99LaxQ=="],
"is-alphanumerical": ["is-alphanumerical@2.0.1", "", { "dependencies": { "is-alphabetical": "^2.0.0", "is-decimal": "^2.0.0" } }, "sha512-hmbYhX/9MUMF5uh7tOXyK/n0ZvWpad5caBA17GsC6vyuCqaWliRG5K1qS9inmUhEMaOBIW7/whAnSwveW/LtZw=="],
"is-decimal": ["is-decimal@2.0.1", "", {}, "sha512-AAB9hiomQs5DXWcRB1rqsxGUstbRroFOPPVAomNk/3XHR5JyEZChOyTWe2oayKnsSsr/kcGqF+z6yuH6HHpN0A=="],
"is-hexadecimal": ["is-hexadecimal@2.0.1", "", {}, "sha512-DgZQp241c8oO6cA1SbTEWiXeoxV42vlcJxgH+B3hi1AiqqKruZR3ZGF8In3fj4+/y/7rHvlOZLZtgJ/4ttYGZg=="],
"is-plain-obj": ["is-plain-obj@4.1.0", "", {}, "sha512-+Pgi+vMuUNkJyExiMBt5IlFoMyKnr5zhJ4Uspz58WOhBF5QoIZkFyNHIbBAtHwzVAgk5RtndVNsDRN61/mmDqg=="],
"jiti": ["jiti@2.6.1", "", { "bin": { "jiti": "lib/jiti-cli.mjs" } }, "sha512-ekilCSN1jwRvIbgeg/57YFh8qQDNbwDb9xT/qu2DAHbFFZUicIl4ygVaAvzveMhMVr3LnpSKTNnwt8PoOfmKhQ=="],
"js-yaml": ["js-yaml@4.1.1", "", { "dependencies": { "argparse": "^2.0.1" }, "bin": { "js-yaml": "bin/js-yaml.js" } }, "sha512-qQKT4zQxXl8lLwBtHMWwaTcGfFOZviOJet3Oy/xmGk2gZH677CJM9EvtfdSkgWcATZhj/55JZ0rmy3myCT5lsA=="],
"json-pointer": ["json-pointer@0.6.2", "", { "dependencies": { "foreach": "^2.0.4" } }, "sha512-vLWcKbOaXlO+jvRy4qNd+TI1QUPZzfJj1tpJ3vAXDych5XJf93ftpUKe5pKCrzyIIwgBJcOcCVRUfqQP25afBw=="],
"json-schema-traverse": ["json-schema-traverse@1.0.0", "", {}, "sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug=="],
"jsonpointer": ["jsonpointer@5.0.1", "", {}, "sha512-p/nXbhSEcu3pZRdkW1OfJhpsVtW1gd4Wa1fnQc9YLiTfAjn0312eMKimbdIQzuZl9aa9xUGaRlP9T/CJE/ditQ=="],
"leven": ["leven@4.1.0", "", {}, "sha512-KZ9W9nWDT7rF7Dazg8xyLHGLrmpgq2nVNFUckhqdW3szVP6YhCpp/RAnpmVExA9JvrMynjwSLVrEj3AepHR6ew=="],
"lightningcss": ["lightningcss@1.30.2", "", { "dependencies": { "detect-libc": "^2.0.3" }, "optionalDependencies": { "lightningcss-android-arm64": "1.30.2", "lightningcss-darwin-arm64": "1.30.2", "lightningcss-darwin-x64": "1.30.2", "lightningcss-freebsd-x64": "1.30.2", "lightningcss-linux-arm-gnueabihf": "1.30.2", "lightningcss-linux-arm64-gnu": "1.30.2", "lightningcss-linux-arm64-musl": "1.30.2", "lightningcss-linux-x64-gnu": "1.30.2", "lightningcss-linux-x64-musl": "1.30.2", "lightningcss-win32-arm64-msvc": "1.30.2", "lightningcss-win32-x64-msvc": "1.30.2" } }, "sha512-utfs7Pr5uJyyvDETitgsaqSyjCb2qNRAtuqUeWIAKztsOYdcACf2KtARYXg2pSvhkt+9NfoaNY7fxjl6nuMjIQ=="],
"lightningcss-android-arm64": ["lightningcss-android-arm64@1.30.2", "", { "os": "android", "cpu": "arm64" }, "sha512-BH9sEdOCahSgmkVhBLeU7Hc9DWeZ1Eb6wNS6Da8igvUwAe0sqROHddIlvU06q3WyXVEOYDZ6ykBZQnjTbmo4+A=="],
"lightningcss-darwin-arm64": ["lightningcss-darwin-arm64@1.30.2", "", { "os": "darwin", "cpu": "arm64" }, "sha512-ylTcDJBN3Hp21TdhRT5zBOIi73P6/W0qwvlFEk22fkdXchtNTOU4Qc37SkzV+EKYxLouZ6M4LG9NfZ1qkhhBWA=="],
"lightningcss-darwin-x64": ["lightningcss-darwin-x64@1.30.2", "", { "os": "darwin", "cpu": "x64" }, "sha512-oBZgKchomuDYxr7ilwLcyms6BCyLn0z8J0+ZZmfpjwg9fRVZIR5/GMXd7r9RH94iDhld3UmSjBM6nXWM2TfZTQ=="],
"lightningcss-freebsd-x64": ["lightningcss-freebsd-x64@1.30.2", "", { "os": "freebsd", "cpu": "x64" }, "sha512-c2bH6xTrf4BDpK8MoGG4Bd6zAMZDAXS569UxCAGcA7IKbHNMlhGQ89eRmvpIUGfKWNVdbhSbkQaWhEoMGmGslA=="],
"lightningcss-linux-arm-gnueabihf": ["lightningcss-linux-arm-gnueabihf@1.30.2", "", { "os": "linux", "cpu": "arm" }, "sha512-eVdpxh4wYcm0PofJIZVuYuLiqBIakQ9uFZmipf6LF/HRj5Bgm0eb3qL/mr1smyXIS1twwOxNWndd8z0E374hiA=="],
"lightningcss-linux-arm64-gnu": ["lightningcss-linux-arm64-gnu@1.30.2", "", { "os": "linux", "cpu": "arm64" }, "sha512-UK65WJAbwIJbiBFXpxrbTNArtfuznvxAJw4Q2ZGlU8kPeDIWEX1dg3rn2veBVUylA2Ezg89ktszWbaQnxD/e3A=="],
"lightningcss-linux-arm64-musl": ["lightningcss-linux-arm64-musl@1.30.2", "", { "os": "linux", "cpu": "arm64" }, "sha512-5Vh9dGeblpTxWHpOx8iauV02popZDsCYMPIgiuw97OJ5uaDsL86cnqSFs5LZkG3ghHoX5isLgWzMs+eD1YzrnA=="],
"lightningcss-linux-x64-gnu": ["lightningcss-linux-x64-gnu@1.30.2", "", { "os": "linux", "cpu": "x64" }, "sha512-Cfd46gdmj1vQ+lR6VRTTadNHu6ALuw2pKR9lYq4FnhvgBc4zWY1EtZcAc6EffShbb1MFrIPfLDXD6Xprbnni4w=="],
"lightningcss-linux-x64-musl": ["lightningcss-linux-x64-musl@1.30.2", "", { "os": "linux", "cpu": "x64" }, "sha512-XJaLUUFXb6/QG2lGIW6aIk6jKdtjtcffUT0NKvIqhSBY3hh9Ch+1LCeH80dR9q9LBjG3ewbDjnumefsLsP6aiA=="],
"lightningcss-win32-arm64-msvc": ["lightningcss-win32-arm64-msvc@1.30.2", "", { "os": "win32", "cpu": "arm64" }, "sha512-FZn+vaj7zLv//D/192WFFVA0RgHawIcHqLX9xuWiQt7P0PtdFEVaxgF9rjM/IRYHQXNnk61/H/gb2Ei+kUQ4xQ=="],
"lightningcss-win32-x64-msvc": ["lightningcss-win32-x64-msvc@1.30.2", "", { "os": "win32", "cpu": "x64" }, "sha512-5g1yc73p+iAkid5phb4oVFMB45417DkRevRbt/El/gKXJk4jid+vPFF/AXbxn05Aky8PapwzZrdJShv5C0avjw=="],
"longest-streak": ["longest-streak@3.1.0", "", {}, "sha512-9Ri+o0JYgehTaVBBDoMqIl8GXtbWg711O3srftcHhZ0dqnETqLaoIK0x17fUw9rFSlK/0NlsKe0Ahhyl5pXE2g=="],
"lru-cache": ["lru-cache@11.2.5", "", {}, "sha512-vFrFJkWtJvJnD5hg+hJvVE8Lh/TcMzKnTgCWmtBipwI5yLX/iX+5UB2tfuyODF5E7k9xEzMdYgGqaSb1c0c5Yw=="],
"lucide-react": ["lucide-react@0.546.0", "", { "peerDependencies": { "react": "^16.5.1 || ^17.0.0 || ^18.0.0 || ^19.0.0" } }, "sha512-Z94u6fKT43lKeYHiVyvyR8fT7pwCzDu7RyMPpTvh054+xahSgj4HFQ+NmflvzdXsoAjYGdCguGaFKYuvq0ThCQ=="],
"magic-string": ["magic-string@0.30.21", "", { "dependencies": { "@jridgewell/sourcemap-codec": "^1.5.5" } }, "sha512-vd2F4YUyEXKGcLHoq+TEyCjxueSeHnFxyyjNp80yg0XV4vUhnDer/lvvlqM/arB5bXQN5K2/3oinyCRyx8T2CQ=="],
"markdown-extensions": ["markdown-extensions@2.0.0", "", {}, "sha512-o5vL7aDWatOTX8LzaS1WMoaoxIiLRQJuIKKe2wAw6IeULDHaqbiqiggmx+pKvZDb1Sj+pE46Sn1T7lCqfFtg1Q=="],
"markdown-table": ["markdown-table@3.0.4", "", {}, "sha512-wiYz4+JrLyb/DqW2hkFJxP7Vd7JuTDm77fvbM8VfEQdmSMqcImWeeRbHwZjBjIFki/VaMK2BhFi7oUUZeM5bqw=="],
"mdast-util-find-and-replace": ["mdast-util-find-and-replace@3.0.2", "", { "dependencies": { "@types/mdast": "^4.0.0", "escape-string-regexp": "^5.0.0", "unist-util-is": "^6.0.0", "unist-util-visit-parents": "^6.0.0" } }, "sha512-Tmd1Vg/m3Xz43afeNxDIhWRtFZgM2VLyaf4vSTYwudTyeuTneoL3qtWMA5jeLyz/O1vDJmmV4QuScFCA2tBPwg=="],
"mdast-util-from-markdown": ["mdast-util-from-markdown@2.0.2", "", { "dependencies": { "@types/mdast": "^4.0.0", "@types/unist": "^3.0.0", "decode-named-character-reference": "^1.0.0", "devlop": "^1.0.0", "mdast-util-to-string": "^4.0.0", "micromark": "^4.0.0", "micromark-util-decode-numeric-character-reference": "^2.0.0", "micromark-util-decode-string": "^2.0.0", "micromark-util-normalize-identifier": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0", "unist-util-stringify-position": "^4.0.0" } }, "sha512-uZhTV/8NBuw0WHkPTrCqDOl0zVe1BIng5ZtHoDk49ME1qqcjYmmLmOf0gELgcRMxN4w2iuIeVso5/6QymSrgmA=="],
"mdast-util-gfm": ["mdast-util-gfm@3.1.0", "", { "dependencies": { "mdast-util-from-markdown": "^2.0.0", "mdast-util-gfm-autolink-literal": "^2.0.0", "mdast-util-gfm-footnote": "^2.0.0", "mdast-util-gfm-strikethrough": "^2.0.0", "mdast-util-gfm-table": "^2.0.0", "mdast-util-gfm-task-list-item": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-0ulfdQOM3ysHhCJ1p06l0b0VKlhU0wuQs3thxZQagjcjPrlFRqY215uZGHHJan9GEAXd9MbfPjFJz+qMkVR6zQ=="],
"mdast-util-gfm-autolink-literal": ["mdast-util-gfm-autolink-literal@2.0.1", "", { "dependencies": { "@types/mdast": "^4.0.0", "ccount": "^2.0.0", "devlop": "^1.0.0", "mdast-util-find-and-replace": "^3.0.0", "micromark-util-character": "^2.0.0" } }, "sha512-5HVP2MKaP6L+G6YaxPNjuL0BPrq9orG3TsrZ9YXbA3vDw/ACI4MEsnoDpn6ZNm7GnZgtAcONJyPhOP8tNJQavQ=="],
"mdast-util-gfm-footnote": ["mdast-util-gfm-footnote@2.1.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "devlop": "^1.1.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0", "micromark-util-normalize-identifier": "^2.0.0" } }, "sha512-sqpDWlsHn7Ac9GNZQMeUzPQSMzR6Wv0WKRNvQRg0KqHh02fpTz69Qc1QSseNX29bhz1ROIyNyxExfawVKTm1GQ=="],
"mdast-util-gfm-strikethrough": ["mdast-util-gfm-strikethrough@2.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-mKKb915TF+OC5ptj5bJ7WFRPdYtuHv0yTRxK2tJvi+BDqbkiG7h7u/9SI89nRAYcmap2xHQL9D+QG/6wSrTtXg=="],
"mdast-util-gfm-table": ["mdast-util-gfm-table@2.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "devlop": "^1.0.0", "markdown-table": "^3.0.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-78UEvebzz/rJIxLvE7ZtDd/vIQ0RHv+3Mh5DR96p7cS7HsBhYIICDBCu8csTNWNO6tBWfqXPWekRuj2FNOGOZg=="],
"mdast-util-gfm-task-list-item": ["mdast-util-gfm-task-list-item@2.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "devlop": "^1.0.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-IrtvNvjxC1o06taBAVJznEnkiHxLFTzgonUdy8hzFVeDun0uTjxxrRGVaNFqkU1wJR3RBPEfsxmU6jDWPofrTQ=="],
"mdast-util-mdx": ["mdast-util-mdx@3.0.0", "", { "dependencies": { "mdast-util-from-markdown": "^2.0.0", "mdast-util-mdx-expression": "^2.0.0", "mdast-util-mdx-jsx": "^3.0.0", "mdast-util-mdxjs-esm": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-JfbYLAW7XnYTTbUsmpu0kdBUVe+yKVJZBItEjwyYJiDJuZ9w4eeaqks4HQO+R7objWgS2ymV60GYpI14Ug554w=="],
"mdast-util-mdx-expression": ["mdast-util-mdx-expression@2.0.1", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "@types/hast": "^3.0.0", "@types/mdast": "^4.0.0", "devlop": "^1.0.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-J6f+9hUp+ldTZqKRSg7Vw5V6MqjATc+3E4gf3CFNcuZNWD8XdyI6zQ8GqH7f8169MM6P7hMBRDVGnn7oHB9kXQ=="],
"mdast-util-mdx-jsx": ["mdast-util-mdx-jsx@3.2.0", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "@types/hast": "^3.0.0", "@types/mdast": "^4.0.0", "@types/unist": "^3.0.0", "ccount": "^2.0.0", "devlop": "^1.1.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0", "parse-entities": "^4.0.0", "stringify-entities": "^4.0.0", "unist-util-stringify-position": "^4.0.0", "vfile-message": "^4.0.0" } }, "sha512-lj/z8v0r6ZtsN/cGNNtemmmfoLAFZnjMbNyLzBafjzikOM+glrjNHPlf6lQDOTccj9n5b0PPihEBbhneMyGs1Q=="],
"mdast-util-mdxjs-esm": ["mdast-util-mdxjs-esm@2.0.1", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "@types/hast": "^3.0.0", "@types/mdast": "^4.0.0", "devlop": "^1.0.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-EcmOpxsZ96CvlP03NghtH1EsLtr0n9Tm4lPUJUBccV9RwUOneqSycg19n5HGzCf+10LozMRSObtVr3ee1WoHtg=="],
"mdast-util-phrasing": ["mdast-util-phrasing@4.1.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "unist-util-is": "^6.0.0" } }, "sha512-TqICwyvJJpBwvGAMZjj4J2n0X8QWp21b9l0o7eXyVJ25YNWYbJDVIyD1bZXE6WtV6RmKJVYmQAKWa0zWOABz2w=="],
"mdast-util-to-hast": ["mdast-util-to-hast@13.2.1", "", { "dependencies": { "@types/hast": "^3.0.0", "@types/mdast": "^4.0.0", "@ungap/structured-clone": "^1.0.0", "devlop": "^1.0.0", "micromark-util-sanitize-uri": "^2.0.0", "trim-lines": "^3.0.0", "unist-util-position": "^5.0.0", "unist-util-visit": "^5.0.0", "vfile": "^6.0.0" } }, "sha512-cctsq2wp5vTsLIcaymblUriiTcZd0CwWtCbLvrOzYCDZoWyMNV8sZ7krj09FSnsiJi3WVsHLM4k6Dq/yaPyCXA=="],
"mdast-util-to-markdown": ["mdast-util-to-markdown@2.1.2", "", { "dependencies": { "@types/mdast": "^4.0.0", "@types/unist": "^3.0.0", "longest-streak": "^3.0.0", "mdast-util-phrasing": "^4.0.0", "mdast-util-to-string": "^4.0.0", "micromark-util-classify-character": "^2.0.0", "micromark-util-decode-string": "^2.0.0", "unist-util-visit": "^5.0.0", "zwitch": "^2.0.0" } }, "sha512-xj68wMTvGXVOKonmog6LwyJKrYXZPvlwabaryTjLh9LuvovB/KAH+kvi8Gjj+7rJjsFi23nkUxRQv1KqSroMqA=="],
"mdast-util-to-string": ["mdast-util-to-string@4.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0" } }, "sha512-0H44vDimn51F0YwvxSJSm0eCDOJTRlmN0R1yBh4HLj9wiV1Dn0QoXGbvFAWj2hSItVTlCmBF1hqKlIyUBVFLPg=="],
"micromark": ["micromark@4.0.2", "", { "dependencies": { "@types/debug": "^4.0.0", "debug": "^4.0.0", "decode-named-character-reference": "^1.0.0", "devlop": "^1.0.0", "micromark-core-commonmark": "^2.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-chunked": "^2.0.0", "micromark-util-combine-extensions": "^2.0.0", "micromark-util-decode-numeric-character-reference": "^2.0.0", "micromark-util-encode": "^2.0.0", "micromark-util-normalize-identifier": "^2.0.0", "micromark-util-resolve-all": "^2.0.0", "micromark-util-sanitize-uri": "^2.0.0", "micromark-util-subtokenize": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-zpe98Q6kvavpCr1NPVSCMebCKfD7CA2NqZ+rykeNhONIJBpc1tFKt9hucLGwha3jNTNI8lHpctWJWoimVF4PfA=="],
"micromark-core-commonmark": ["micromark-core-commonmark@2.0.3", "", { "dependencies": { "decode-named-character-reference": "^1.0.0", "devlop": "^1.0.0", "micromark-factory-destination": "^2.0.0", "micromark-factory-label": "^2.0.0", "micromark-factory-space": "^2.0.0", "micromark-factory-title": "^2.0.0", "micromark-factory-whitespace": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-chunked": "^2.0.0", "micromark-util-classify-character": "^2.0.0", "micromark-util-html-tag-name": "^2.0.0", "micromark-util-normalize-identifier": "^2.0.0", "micromark-util-resolve-all": "^2.0.0", "micromark-util-subtokenize": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-RDBrHEMSxVFLg6xvnXmb1Ayr2WzLAWjeSATAoxwKYJV94TeNavgoIdA0a9ytzDSVzBy2YKFK+emCPOEibLeCrg=="],
"micromark-extension-gfm": ["micromark-extension-gfm@3.0.0", "", { "dependencies": { "micromark-extension-gfm-autolink-literal": "^2.0.0", "micromark-extension-gfm-footnote": "^2.0.0", "micromark-extension-gfm-strikethrough": "^2.0.0", "micromark-extension-gfm-table": "^2.0.0", "micromark-extension-gfm-tagfilter": "^2.0.0", "micromark-extension-gfm-task-list-item": "^2.0.0", "micromark-util-combine-extensions": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-vsKArQsicm7t0z2GugkCKtZehqUm31oeGBV/KVSorWSy8ZlNAv7ytjFhvaryUiCUJYqs+NoE6AFhpQvBTM6Q4w=="],
"micromark-extension-gfm-autolink-literal": ["micromark-extension-gfm-autolink-literal@2.1.0", "", { "dependencies": { "micromark-util-character": "^2.0.0", "micromark-util-sanitize-uri": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-oOg7knzhicgQ3t4QCjCWgTmfNhvQbDDnJeVu9v81r7NltNCVmhPy1fJRX27pISafdjL+SVc4d3l48Gb6pbRypw=="],
"micromark-extension-gfm-footnote": ["micromark-extension-gfm-footnote@2.1.0", "", { "dependencies": { "devlop": "^1.0.0", "micromark-core-commonmark": "^2.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-normalize-identifier": "^2.0.0", "micromark-util-sanitize-uri": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-/yPhxI1ntnDNsiHtzLKYnE3vf9JZ6cAisqVDauhp4CEHxlb4uoOTxOCJ+9s51bIB8U1N1FJ1RXOKTIlD5B/gqw=="],
"micromark-extension-gfm-strikethrough": ["micromark-extension-gfm-strikethrough@2.1.0", "", { "dependencies": { "devlop": "^1.0.0", "micromark-util-chunked": "^2.0.0", "micromark-util-classify-character": "^2.0.0", "micromark-util-resolve-all": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-ADVjpOOkjz1hhkZLlBiYA9cR2Anf8F4HqZUO6e5eDcPQd0Txw5fxLzzxnEkSkfnD0wziSGiv7sYhk/ktvbf1uw=="],
"micromark-extension-gfm-table": ["micromark-extension-gfm-table@2.1.1", "", { "dependencies": { "devlop": "^1.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-t2OU/dXXioARrC6yWfJ4hqB7rct14e8f7m0cbI5hUmDyyIlwv5vEtooptH8INkbLzOatzKuVbQmAYcbWoyz6Dg=="],
"micromark-extension-gfm-tagfilter": ["micromark-extension-gfm-tagfilter@2.0.0", "", { "dependencies": { "micromark-util-types": "^2.0.0" } }, "sha512-xHlTOmuCSotIA8TW1mDIM6X2O1SiX5P9IuDtqGonFhEK0qgRI4yeC6vMxEV2dgyr2TiD+2PQ10o+cOhdVAcwfg=="],
"micromark-extension-gfm-task-list-item": ["micromark-extension-gfm-task-list-item@2.1.0", "", { "dependencies": { "devlop": "^1.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-qIBZhqxqI6fjLDYFTBIa4eivDMnP+OZqsNwmQ3xNLE4Cxwc+zfQEfbs6tzAo2Hjq+bh6q5F+Z8/cksrLFYWQQw=="],
"micromark-extension-mdx-expression": ["micromark-extension-mdx-expression@3.0.1", "", { "dependencies": { "@types/estree": "^1.0.0", "devlop": "^1.0.0", "micromark-factory-mdx-expression": "^2.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-events-to-acorn": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-dD/ADLJ1AeMvSAKBwO22zG22N4ybhe7kFIZ3LsDI0GlsNr2A3KYxb0LdC1u5rj4Nw+CHKY0RVdnHX8vj8ejm4Q=="],
"micromark-extension-mdx-jsx": ["micromark-extension-mdx-jsx@3.0.2", "", { "dependencies": { "@types/estree": "^1.0.0", "devlop": "^1.0.0", "estree-util-is-identifier-name": "^3.0.0", "micromark-factory-mdx-expression": "^2.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-events-to-acorn": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0", "vfile-message": "^4.0.0" } }, "sha512-e5+q1DjMh62LZAJOnDraSSbDMvGJ8x3cbjygy2qFEi7HCeUT4BDKCvMozPozcD6WmOt6sVvYDNBKhFSz3kjOVQ=="],
"micromark-extension-mdx-md": ["micromark-extension-mdx-md@2.0.0", "", { "dependencies": { "micromark-util-types": "^2.0.0" } }, "sha512-EpAiszsB3blw4Rpba7xTOUptcFeBFi+6PY8VnJ2hhimH+vCQDirWgsMpz7w1XcZE7LVrSAUGb9VJpG9ghlYvYQ=="],
"micromark-extension-mdxjs": ["micromark-extension-mdxjs@3.0.0", "", { "dependencies": { "acorn": "^8.0.0", "acorn-jsx": "^5.0.0", "micromark-extension-mdx-expression": "^3.0.0", "micromark-extension-mdx-jsx": "^3.0.0", "micromark-extension-mdx-md": "^2.0.0", "micromark-extension-mdxjs-esm": "^3.0.0", "micromark-util-combine-extensions": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-A873fJfhnJ2siZyUrJ31l34Uqwy4xIFmvPY1oj+Ean5PHcPBYzEsvqvWGaWcfEIr11O5Dlw3p2y0tZWpKHDejQ=="],
"micromark-extension-mdxjs-esm": ["micromark-extension-mdxjs-esm@3.0.0", "", { "dependencies": { "@types/estree": "^1.0.0", "devlop": "^1.0.0", "micromark-core-commonmark": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-events-to-acorn": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0", "unist-util-position-from-estree": "^2.0.0", "vfile-message": "^4.0.0" } }, "sha512-DJFl4ZqkErRpq/dAPyeWp15tGrcrrJho1hKK5uBS70BCtfrIFg81sqcTVu3Ta+KD1Tk5vAtBNElWxtAa+m8K9A=="],
"micromark-factory-destination": ["micromark-factory-destination@2.0.1", "", { "dependencies": { "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-Xe6rDdJlkmbFRExpTOmRj9N3MaWmbAgdpSrBQvCFqhezUn4AHqJHbaEnfbVYYiexVSs//tqOdY/DxhjdCiJnIA=="],
"micromark-factory-label": ["micromark-factory-label@2.0.1", "", { "dependencies": { "devlop": "^1.0.0", "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-VFMekyQExqIW7xIChcXn4ok29YE3rnuyveW3wZQWWqF4Nv9Wk5rgJ99KzPvHjkmPXF93FXIbBp6YdW3t71/7Vg=="],
"micromark-factory-mdx-expression": ["micromark-factory-mdx-expression@2.0.3", "", { "dependencies": { "@types/estree": "^1.0.0", "devlop": "^1.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-events-to-acorn": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0", "unist-util-position-from-estree": "^2.0.0", "vfile-message": "^4.0.0" } }, "sha512-kQnEtA3vzucU2BkrIa8/VaSAsP+EJ3CKOvhMuJgOEGg9KDC6OAY6nSnNDVRiVNRqj7Y4SlSzcStaH/5jge8JdQ=="],
"micromark-factory-space": ["micromark-factory-space@2.0.1", "", { "dependencies": { "micromark-util-character": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-zRkxjtBxxLd2Sc0d+fbnEunsTj46SWXgXciZmHq0kDYGnck/ZSGj9/wULTV95uoeYiK5hRXP2mJ98Uo4cq/LQg=="],
"micromark-factory-title": ["micromark-factory-title@2.0.1", "", { "dependencies": { "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-5bZ+3CjhAd9eChYTHsjy6TGxpOFSKgKKJPJxr293jTbfry2KDoWkhBb6TcPVB4NmzaPhMs1Frm9AZH7OD4Cjzw=="],
"micromark-factory-whitespace": ["micromark-factory-whitespace@2.0.1", "", { "dependencies": { "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-Ob0nuZ3PKt/n0hORHyvoD9uZhr+Za8sFoP+OnMcnWK5lngSzALgQYKMr9RJVOWLqQYuyn6ulqGWSXdwf6F80lQ=="],
"micromark-util-character": ["micromark-util-character@2.1.1", "", { "dependencies": { "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-wv8tdUTJ3thSFFFJKtpYKOYiGP2+v96Hvk4Tu8KpCAsTMs6yi+nVmGh1syvSCsaxz45J6Jbw+9DD6g97+NV67Q=="],
"micromark-util-chunked": ["micromark-util-chunked@2.0.1", "", { "dependencies": { "micromark-util-symbol": "^2.0.0" } }, "sha512-QUNFEOPELfmvv+4xiNg2sRYeS/P84pTW0TCgP5zc9FpXetHY0ab7SxKyAQCNCc1eK0459uoLI1y5oO5Vc1dbhA=="],
"micromark-util-classify-character": ["micromark-util-classify-character@2.0.1", "", { "dependencies": { "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-K0kHzM6afW/MbeWYWLjoHQv1sgg2Q9EccHEDzSkxiP/EaagNzCm7T/WMKZ3rjMbvIpvBiZgwR3dKMygtA4mG1Q=="],
"micromark-util-combine-extensions": ["micromark-util-combine-extensions@2.0.1", "", { "dependencies": { "micromark-util-chunked": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-OnAnH8Ujmy59JcyZw8JSbK9cGpdVY44NKgSM7E9Eh7DiLS2E9RNQf0dONaGDzEG9yjEl5hcqeIsj4hfRkLH/Bg=="],
"micromark-util-decode-numeric-character-reference": ["micromark-util-decode-numeric-character-reference@2.0.2", "", { "dependencies": { "micromark-util-symbol": "^2.0.0" } }, "sha512-ccUbYk6CwVdkmCQMyr64dXz42EfHGkPQlBj5p7YVGzq8I7CtjXZJrubAYezf7Rp+bjPseiROqe7G6foFd+lEuw=="],
"micromark-util-decode-string": ["micromark-util-decode-string@2.0.1", "", { "dependencies": { "decode-named-character-reference": "^1.0.0", "micromark-util-character": "^2.0.0", "micromark-util-decode-numeric-character-reference": "^2.0.0", "micromark-util-symbol": "^2.0.0" } }, "sha512-nDV/77Fj6eH1ynwscYTOsbK7rR//Uj0bZXBwJZRfaLEJ1iGBR6kIfNmlNqaqJf649EP0F3NWNdeJi03elllNUQ=="],
"micromark-util-encode": ["micromark-util-encode@2.0.1", "", {}, "sha512-c3cVx2y4KqUnwopcO9b/SCdo2O67LwJJ/UyqGfbigahfegL9myoEFoDYZgkT7f36T0bLrM9hZTAaAyH+PCAXjw=="],
"micromark-util-events-to-acorn": ["micromark-util-events-to-acorn@2.0.3", "", { "dependencies": { "@types/estree": "^1.0.0", "@types/unist": "^3.0.0", "devlop": "^1.0.0", "estree-util-visit": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0", "vfile-message": "^4.0.0" } }, "sha512-jmsiEIiZ1n7X1Rr5k8wVExBQCg5jy4UXVADItHmNk1zkwEVhBuIUKRu3fqv+hs4nxLISi2DQGlqIOGiFxgbfHg=="],
"micromark-util-html-tag-name": ["micromark-util-html-tag-name@2.0.1", "", {}, "sha512-2cNEiYDhCWKI+Gs9T0Tiysk136SnR13hhO8yW6BGNyhOC4qYFnwF1nKfD3HFAIXA5c45RrIG1ub11GiXeYd1xA=="],
"micromark-util-normalize-identifier": ["micromark-util-normalize-identifier@2.0.1", "", { "dependencies": { "micromark-util-symbol": "^2.0.0" } }, "sha512-sxPqmo70LyARJs0w2UclACPUUEqltCkJ6PhKdMIDuJ3gSf/Q+/GIe3WKl0Ijb/GyH9lOpUkRAO2wp0GVkLvS9Q=="],
"micromark-util-resolve-all": ["micromark-util-resolve-all@2.0.1", "", { "dependencies": { "micromark-util-types": "^2.0.0" } }, "sha512-VdQyxFWFT2/FGJgwQnJYbe1jjQoNTS4RjglmSjTUlpUMa95Htx9NHeYW4rGDJzbjvCsl9eLjMQwGeElsqmzcHg=="],
"micromark-util-sanitize-uri": ["micromark-util-sanitize-uri@2.0.1", "", { "dependencies": { "micromark-util-character": "^2.0.0", "micromark-util-encode": "^2.0.0", "micromark-util-symbol": "^2.0.0" } }, "sha512-9N9IomZ/YuGGZZmQec1MbgxtlgougxTodVwDzzEouPKo3qFWvymFHWcnDi2vzV1ff6kas9ucW+o3yzJK9YB1AQ=="],
"micromark-util-subtokenize": ["micromark-util-subtokenize@2.1.0", "", { "dependencies": { "devlop": "^1.0.0", "micromark-util-chunked": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-XQLu552iSctvnEcgXw6+Sx75GflAPNED1qx7eBJ+wydBb2KCbRZe+NwvIEEMM83uml1+2WSXpBAcp9IUCgCYWA=="],
"micromark-util-symbol": ["micromark-util-symbol@2.0.1", "", {}, "sha512-vs5t8Apaud9N28kgCrRUdEed4UJ+wWNvicHLPxCa9ENlYuAY31M0ETy5y1vA33YoNPDFTghEbnh6efaE8h4x0Q=="],
"micromark-util-types": ["micromark-util-types@2.0.2", "", {}, "sha512-Yw0ECSpJoViF1qTU4DC6NwtC4aWGt1EkzaQB8KPPyCRR8z9TWeV0HbEFGTO+ZY1wB22zmxnJqhPyTpOVCpeHTA=="],
"ms": ["ms@2.1.3", "", {}, "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA=="],
"nanoid": ["nanoid@3.3.11", "", { "bin": { "nanoid": "bin/nanoid.cjs" } }, "sha512-N8SpfPUnUp1bK+PMYW8qSWdl9U+wwNWI4QKxOYDy9JAro3WMX7p2OeVRF9v+347pnakNevPmiHhNmZ2HbFA76w=="],
"negotiator": ["negotiator@1.0.0", "", {}, "sha512-8Ofs/AUQh8MaEcrlq5xOX0CQ9ypTF5dl78mjlMNfOK08fzpgTHQRQPBxcPlEtIw0yRpws+Zo/3r+5WRby7u3Gg=="],
"next": ["next@16.1.6", "", { "dependencies": { "@next/env": "16.1.6", "@swc/helpers": "0.5.15", "baseline-browser-mapping": "^2.8.3", "caniuse-lite": "^1.0.30001579", "postcss": "8.4.31", "styled-jsx": "5.1.6" }, "optionalDependencies": { "@next/swc-darwin-arm64": "16.1.6", "@next/swc-darwin-x64": "16.1.6", "@next/swc-linux-arm64-gnu": "16.1.6", "@next/swc-linux-arm64-musl": "16.1.6", "@next/swc-linux-x64-gnu": "16.1.6", "@next/swc-linux-x64-musl": "16.1.6", "@next/swc-win32-arm64-msvc": "16.1.6", "@next/swc-win32-x64-msvc": "16.1.6", "sharp": "^0.34.4" }, "peerDependencies": { "@opentelemetry/api": "^1.1.0", "@playwright/test": "^1.51.1", "babel-plugin-react-compiler": "*", "react": "^18.2.0 || 19.0.0-rc-de68d2f4-20241204 || ^19.0.0", "react-dom": "^18.2.0 || 19.0.0-rc-de68d2f4-20241204 || ^19.0.0", "sass": "^1.3.0" }, "optionalPeers": ["@opentelemetry/api", "@playwright/test", "babel-plugin-react-compiler", "sass"], "bin": { "next": "dist/bin/next" } }, "sha512-hkyRkcu5x/41KoqnROkfTm2pZVbKxvbZRuNvKXLRXxs3VfyO0WhY50TQS40EuKO9SW3rBj/sF3WbVwDACeMZyw=="],
"next-themes": ["next-themes@0.4.6", "", { "peerDependencies": { "react": "^16.8 || ^17 || ^18 || ^19 || ^19.0.0-rc", "react-dom": "^16.8 || ^17 || ^18 || ^19 || ^19.0.0-rc" } }, "sha512-pZvgD5L0IEvX5/9GWyHMf3m8BKiVQwsCMHfoFosXtXBMnaS0ZnIJ9ST4b4NqLVKDEm8QBxoNNGNaBv2JNF6XNA=="],
"npm-to-yarn": ["npm-to-yarn@3.0.1", "", {}, "sha512-tt6PvKu4WyzPwWUzy/hvPFqn+uwXO0K1ZHka8az3NnrhWJDmSqI8ncWq0fkL0k/lmmi5tAC11FXwXuh0rFbt1A=="],
"oniguruma-parser": ["oniguruma-parser@0.12.1", "", {}, "sha512-8Unqkvk1RYc6yq2WBYRj4hdnsAxVze8i7iPfQr8e4uSP3tRv0rpZcbGUDvxfQQcdwHt/e9PrMvGCsa8OqG9X3w=="],
"oniguruma-to-es": ["oniguruma-to-es@4.3.4", "", { "dependencies": { "oniguruma-parser": "^0.12.1", "regex": "^6.0.1", "regex-recursion": "^6.0.2" } }, "sha512-3VhUGN3w2eYxnTzHn+ikMI+fp/96KoRSVK9/kMTcFqj1NRDh2IhQCKvYxDnWePKRXY/AqH+Fuiyb7VHSzBjHfA=="],
"openapi-sampler": ["openapi-sampler@1.6.2", "", { "dependencies": { "@types/json-schema": "^7.0.7", "fast-xml-parser": "^4.5.0", "json-pointer": "0.6.2" } }, "sha512-NyKGiFKfSWAZr4srD/5WDhInOWDhfml32h/FKUqLpEwKJt0kG0LGUU0MdyNkKrVGuJnw6DuPWq/sHCwAMpiRxg=="],
"parse-entities": ["parse-entities@4.0.2", "", { "dependencies": { "@types/unist": "^2.0.0", "character-entities-legacy": "^3.0.0", "character-reference-invalid": "^2.0.0", "decode-named-character-reference": "^1.0.0", "is-alphanumerical": "^2.0.0", "is-decimal": "^2.0.0", "is-hexadecimal": "^2.0.0" } }, "sha512-GG2AQYWoLgL877gQIKeRPGO1xF9+eG1ujIb5soS5gPvLQ1y2o8FL90w2QWNdf9I361Mpp7726c+lj3U0qK1uGw=="],
"path-to-regexp": ["path-to-regexp@8.3.0", "", {}, "sha512-7jdwVIRtsP8MYpdXSwOS0YdD0Du+qOoF/AEPIt88PcCFrZCzx41oxku1jD88hZBwbNUIEfpqvuhjFaMAqMTWnA=="],
"picocolors": ["picocolors@1.1.1", "", {}, "sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA=="],
"picomatch": ["picomatch@4.0.3", "", {}, "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q=="],
"postcss": ["postcss@8.5.6", "", { "dependencies": { "nanoid": "^3.3.11", "picocolors": "^1.1.1", "source-map-js": "^1.2.1" } }, "sha512-3Ybi1tAuwAP9s0r1UQ2J4n5Y0G05bJkpUIO0/bI9MhwmD70S5aTWbXGBwxHrelT+XM1k6dM0pk+SwNkpTRN7Pg=="],
"postcss-selector-parser": ["postcss-selector-parser@7.1.1", "", { "dependencies": { "cssesc": "^3.0.0", "util-deprecate": "^1.0.2" } }, "sha512-orRsuYpJVw8LdAwqqLykBj9ecS5/cRHlI5+nvTo8LcCKmzDmqVORXtOIYEEQuL9D4BxtA1lm5isAqzQZCoQ6Eg=="],
"property-information": ["property-information@7.1.0", "", {}, "sha512-TwEZ+X+yCJmYfL7TPUOcvBZ4QfoT5YenQiJuX//0th53DE6w0xxLEtfK3iyryQFddXuvkIk51EEgrJQ0WJkOmQ=="],
"react": ["react@19.2.4", "", {}, "sha512-9nfp2hYpCwOjAN+8TZFGhtWEwgvWHXqESH8qT89AT/lWklpLON22Lc8pEtnpsZz7VmawabSU0gCjnj8aC0euHQ=="],
"react-dom": ["react-dom@19.2.4", "", { "dependencies": { "scheduler": "^0.27.0" }, "peerDependencies": { "react": "^19.2.4" } }, "sha512-AXJdLo8kgMbimY95O2aKQqsz2iWi9jMgKJhRBAxECE4IFxfcazB2LmzloIoibJI3C12IlY20+KFaLv+71bUJeQ=="],
"react-hook-form": ["react-hook-form@7.71.1", "", { "peerDependencies": { "react": "^16.8.0 || ^17 || ^18 || ^19" } }, "sha512-9SUJKCGKo8HUSsCO+y0CtqkqI5nNuaDqTxyqPsZPqIwudpj4rCrAz/jZV+jn57bx5gtZKOh3neQu94DXMc+w5w=="],
"react-medium-image-zoom": ["react-medium-image-zoom@5.4.0", "", { "peerDependencies": { "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0", "react-dom": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0" } }, "sha512-BsE+EnFVQzFIlyuuQrZ9iTwyKpKkqdFZV1ImEQN573QPqGrIUuNni7aF+sZwDcxlsuOMayCr6oO/PZR/yJnbRg=="],
"react-remove-scroll": ["react-remove-scroll@2.7.2", "", { "dependencies": { "react-remove-scroll-bar": "^2.3.7", "react-style-singleton": "^2.2.3", "tslib": "^2.1.0", "use-callback-ref": "^1.3.3", "use-sidecar": "^1.1.3" }, "peerDependencies": { "@types/react": "*", "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-Iqb9NjCCTt6Hf+vOdNIZGdTiH1QSqr27H/Ek9sv/a97gfueI/5h1s3yRi1nngzMUaOOToin5dI1dXKdXiF+u0Q=="],
"react-remove-scroll-bar": ["react-remove-scroll-bar@2.3.8", "", { "dependencies": { "react-style-singleton": "^2.2.2", "tslib": "^2.0.0" }, "peerDependencies": { "@types/react": "*", "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0" }, "optionalPeers": ["@types/react"] }, "sha512-9r+yi9+mgU33AKcj6IbT9oRCO78WriSj6t/cF8DWBZJ9aOGPOTEDvdUDz1FwKim7QXWwmHqtdHnRJfhAxEG46Q=="],
"react-style-singleton": ["react-style-singleton@2.2.3", "", { "dependencies": { "get-nonce": "^1.0.0", "tslib": "^2.0.0" }, "peerDependencies": { "@types/react": "*", "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-b6jSvxvVnyptAiLjbkWLE/lOnR4lfTtDAl+eUC7RZy+QQWc6wRzIV2CE6xBuMmDxc2qIihtDCZD5NPOFl7fRBQ=="],
"readdirp": ["readdirp@4.1.2", "", {}, "sha512-GDhwkLfywWL2s6vEjyhri+eXmfH6j1L7JE27WhqLeYzoh/A3DBaYGEj2H/HFZCn/kMfim73FXxEJTw06WtxQwg=="],
"recma-build-jsx": ["recma-build-jsx@1.0.0", "", { "dependencies": { "@types/estree": "^1.0.0", "estree-util-build-jsx": "^3.0.0", "vfile": "^6.0.0" } }, "sha512-8GtdyqaBcDfva+GUKDr3nev3VpKAhup1+RvkMvUxURHpW7QyIvk9F5wz7Vzo06CEMSilw6uArgRqhpiUcWp8ew=="],
"recma-jsx": ["recma-jsx@1.0.1", "", { "dependencies": { "acorn-jsx": "^5.0.0", "estree-util-to-js": "^2.0.0", "recma-parse": "^1.0.0", "recma-stringify": "^1.0.0", "unified": "^11.0.0" }, "peerDependencies": { "acorn": "^6.0.0 || ^7.0.0 || ^8.0.0" } }, "sha512-huSIy7VU2Z5OLv6oFLosQGGDqPqdO1iq6bWNAdhzMxSJP7RAso4fCZ1cKu8j9YHCZf3TPrq4dw3okhrylgcd7w=="],
"recma-parse": ["recma-parse@1.0.0", "", { "dependencies": { "@types/estree": "^1.0.0", "esast-util-from-js": "^2.0.0", "unified": "^11.0.0", "vfile": "^6.0.0" } }, "sha512-OYLsIGBB5Y5wjnSnQW6t3Xg7q3fQ7FWbw/vcXtORTnyaSFscOtABg+7Pnz6YZ6c27fG1/aN8CjfwoUEUIdwqWQ=="],
"recma-stringify": ["recma-stringify@1.0.0", "", { "dependencies": { "@types/estree": "^1.0.0", "estree-util-to-js": "^2.0.0", "unified": "^11.0.0", "vfile": "^6.0.0" } }, "sha512-cjwII1MdIIVloKvC9ErQ+OgAtwHBmcZ0Bg4ciz78FtbT8In39aAYbaA7zvxQ61xVMSPE8WxhLwLbhif4Js2C+g=="],
"regex": ["regex@6.1.0", "", { "dependencies": { "regex-utilities": "^2.3.0" } }, "sha512-6VwtthbV4o/7+OaAF9I5L5V3llLEsoPyq9P1JVXkedTP33c7MfCG0/5NOPcSJn0TzXcG9YUrR0gQSWioew3LDg=="],
"regex-recursion": ["regex-recursion@6.0.2", "", { "dependencies": { "regex-utilities": "^2.3.0" } }, "sha512-0YCaSCq2VRIebiaUviZNs0cBz1kg5kVS2UKUfNIx8YVs1cN3AV7NTctO5FOKBA+UT2BPJIWZauYHPqJODG50cg=="],
"regex-utilities": ["regex-utilities@2.3.0", "", {}, "sha512-8VhliFJAWRaUiVvREIiW2NXXTmHs4vMNnSzuJVhscgmGav3g9VDxLrQndI3dZZVVdp0ZO/5v0xmX516/7M9cng=="],
"rehype-recma": ["rehype-recma@1.0.0", "", { "dependencies": { "@types/estree": "^1.0.0", "@types/hast": "^3.0.0", "hast-util-to-estree": "^3.0.0" } }, "sha512-lqA4rGUf1JmacCNWWZx0Wv1dHqMwxzsDWYMTowuplHF3xH0N/MmrZ/G3BDZnzAkRmxDadujCjaKM2hqYdCBOGw=="],
"remark": ["remark@15.0.1", "", { "dependencies": { "@types/mdast": "^4.0.0", "remark-parse": "^11.0.0", "remark-stringify": "^11.0.0", "unified": "^11.0.0" } }, "sha512-Eht5w30ruCXgFmxVUSlNWQ9iiimq07URKeFS3hNc8cUWy1llX4KDWfyEDZRycMc+znsN9Ux5/tJ/BFdgdOwA3A=="],
"remark-gfm": ["remark-gfm@4.0.1", "", { "dependencies": { "@types/mdast": "^4.0.0", "mdast-util-gfm": "^3.0.0", "micromark-extension-gfm": "^3.0.0", "remark-parse": "^11.0.0", "remark-stringify": "^11.0.0", "unified": "^11.0.0" } }, "sha512-1quofZ2RQ9EWdeN34S79+KExV1764+wCUGop5CPL1WGdD0ocPpu91lzPGbwWMECpEpd42kJGQwzRfyov9j4yNg=="],
"remark-mdx": ["remark-mdx@3.1.1", "", { "dependencies": { "mdast-util-mdx": "^3.0.0", "micromark-extension-mdxjs": "^3.0.0" } }, "sha512-Pjj2IYlUY3+D8x00UJsIOg5BEvfMyeI+2uLPn9VO9Wg4MEtN/VTIq2NEJQfde9PnX15KgtHyl9S0BcTnWrIuWg=="],
"remark-parse": ["remark-parse@11.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "mdast-util-from-markdown": "^2.0.0", "micromark-util-types": "^2.0.0", "unified": "^11.0.0" } }, "sha512-FCxlKLNGknS5ba/1lmpYijMUzX2esxW5xQqjWxw2eHFfS2MSdaHVINFmhjo+qN1WhZhNimq0dZATN9pH0IDrpA=="],
"remark-rehype": ["remark-rehype@11.1.2", "", { "dependencies": { "@types/hast": "^3.0.0", "@types/mdast": "^4.0.0", "mdast-util-to-hast": "^13.0.0", "unified": "^11.0.0", "vfile": "^6.0.0" } }, "sha512-Dh7l57ianaEoIpzbp0PC9UKAdCSVklD8E5Rpw7ETfbTl3FqcOOgq5q2LVDhgGCkaBv7p24JXikPdvhhmHvKMsw=="],
"remark-stringify": ["remark-stringify@11.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "mdast-util-to-markdown": "^2.0.0", "unified": "^11.0.0" } }, "sha512-1OSmLd3awB/t8qdoEOMazZkNsfVTeY4fTsgzcQFdXNq8ToTN4ZGwrMnlda4K6smTFKD+GRV6O48i6Z4iKgPPpw=="],
"require-from-string": ["require-from-string@2.0.2", "", {}, "sha512-Xf0nWe6RseziFMu+Ap9biiUbmplq6S9/p+7w7YXP/JBHhrUDDUhwa+vANyubuqfZWTveU//DYVGsDG7RKL/vEw=="],
"sax": ["sax@1.4.4", "", {}, "sha512-1n3r/tGXO6b6VXMdFT54SHzT9ytu9yr7TaELowdYpMqY/Ao7EnlQGmAQ1+RatX7Tkkdm6hONI2owqNx2aZj5Sw=="],
"scheduler": ["scheduler@0.27.0", "", {}, "sha512-eNv+WrVbKu1f3vbYJT/xtiF5syA5HPIMtf9IgY/nKg0sWqzAUEvqY/xm7OcZc/qafLx/iO9FgOmeSAp4v5ti/Q=="],
"scroll-into-view-if-needed": ["scroll-into-view-if-needed@3.1.0", "", { "dependencies": { "compute-scroll-into-view": "^3.0.2" } }, "sha512-49oNpRjWRvnU8NyGVmUaYG4jtTkNonFZI86MmGRDqBphEK2EXT9gdEUoQPZhuBM8yWHxCWbobltqYO5M4XrUvQ=="],
"semver": ["semver@7.7.3", "", { "bin": { "semver": "bin/semver.js" } }, "sha512-SdsKMrI9TdgjdweUSR9MweHA4EJ8YxHn8DFaDisvhVlUOe4BF1tLD7GAj0lIqWVl+dPb/rExr0Btby5loQm20Q=="],
"sharp": ["sharp@0.34.5", "", { "dependencies": { "@img/colour": "^1.0.0", "detect-libc": "^2.1.2", "semver": "^7.7.3" }, "optionalDependencies": { "@img/sharp-darwin-arm64": "0.34.5", "@img/sharp-darwin-x64": "0.34.5", "@img/sharp-libvips-darwin-arm64": "1.2.4", "@img/sharp-libvips-darwin-x64": "1.2.4", "@img/sharp-libvips-linux-arm": "1.2.4", "@img/sharp-libvips-linux-arm64": "1.2.4", "@img/sharp-libvips-linux-ppc64": "1.2.4", "@img/sharp-libvips-linux-riscv64": "1.2.4", "@img/sharp-libvips-linux-s390x": "1.2.4", "@img/sharp-libvips-linux-x64": "1.2.4", "@img/sharp-libvips-linuxmusl-arm64": "1.2.4", "@img/sharp-libvips-linuxmusl-x64": "1.2.4", "@img/sharp-linux-arm": "0.34.5", "@img/sharp-linux-arm64": "0.34.5", "@img/sharp-linux-ppc64": "0.34.5", "@img/sharp-linux-riscv64": "0.34.5", "@img/sharp-linux-s390x": "0.34.5", "@img/sharp-linux-x64": "0.34.5", "@img/sharp-linuxmusl-arm64": "0.34.5", "@img/sharp-linuxmusl-x64": "0.34.5", "@img/sharp-wasm32": "0.34.5", "@img/sharp-win32-arm64": "0.34.5", "@img/sharp-win32-ia32": "0.34.5", "@img/sharp-win32-x64": "0.34.5" } }, "sha512-Ou9I5Ft9WNcCbXrU9cMgPBcCK8LiwLqcbywW3t4oDV37n1pzpuNLsYiAV8eODnjbtQlSDwZ2cUEeQz4E54Hltg=="],
"shiki": ["shiki@3.22.0", "", { "dependencies": { "@shikijs/core": "3.22.0", "@shikijs/engine-javascript": "3.22.0", "@shikijs/engine-oniguruma": "3.22.0", "@shikijs/langs": "3.22.0", "@shikijs/themes": "3.22.0", "@shikijs/types": "3.22.0", "@shikijs/vscode-textmate": "^10.0.2", "@types/hast": "^3.0.4" } }, "sha512-LBnhsoYEe0Eou4e1VgJACes+O6S6QC0w71fCSp5Oya79inkwkm15gQ1UF6VtQ8j/taMDh79hAB49WUk8ALQW3g=="],
"source-map": ["source-map@0.7.6", "", {}, "sha512-i5uvt8C3ikiWeNZSVZNWcfZPItFQOsYTUAOkcUPGd8DqDy1uOUikjt5dG+uRlwyvR108Fb9DOd4GvXfT0N2/uQ=="],
"source-map-js": ["source-map-js@1.2.1", "", {}, "sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA=="],
"space-separated-tokens": ["space-separated-tokens@2.0.2", "", {}, "sha512-PEGlAwrG8yXGXRjW32fGbg66JAlOAwbObuqVoJpv/mRgoWDQfgH1wDPvtzWyUSNAXBGSk8h755YDbbcEy3SH2Q=="],
"stringify-entities": ["stringify-entities@4.0.4", "", { "dependencies": { "character-entities-html4": "^2.0.0", "character-entities-legacy": "^3.0.0" } }, "sha512-IwfBptatlO+QCJUo19AqvrPNqlVMpW9YEL2LIVY+Rpv2qsjCGxaDLNRgeGsQWJhfItebuJhsGSLjaBbNSQ+ieg=="],
"strnum": ["strnum@1.1.2", "", {}, "sha512-vrN+B7DBIoTTZjnPNewwhx6cBA/H+IS7rfW68n7XxC1y7uoiGQBxaKzqucGUgavX15dJgiGztLJ8vxuEzwqBdA=="],
"style-to-js": ["style-to-js@1.1.21", "", { "dependencies": { "style-to-object": "1.0.14" } }, "sha512-RjQetxJrrUJLQPHbLku6U/ocGtzyjbJMP9lCNK7Ag0CNh690nSH8woqWH9u16nMjYBAok+i7JO1NP2pOy8IsPQ=="],
"style-to-object": ["style-to-object@1.0.14", "", { "dependencies": { "inline-style-parser": "0.2.7" } }, "sha512-LIN7rULI0jBscWQYaSswptyderlarFkjQ+t79nzty8tcIAceVomEVlLzH5VP4Cmsv6MtKhs7qaAiwlcp+Mgaxw=="],
"styled-jsx": ["styled-jsx@5.1.6", "", { "dependencies": { "client-only": "0.0.1" }, "peerDependencies": { "react": ">= 16.8.0 || 17.x.x || ^18.0.0-0 || ^19.0.0-0" } }, "sha512-qSVyDTeMotdvQYoHWLNGwRFJHC+i+ZvdBRYosOFgC+Wg1vx4frN2/RG/NA7SYqqvKNLf39P2LSRA2pu6n0XYZA=="],
"tailwind-merge": ["tailwind-merge@3.5.0", "", {}, "sha512-I8K9wewnVDkL1NTGoqWmVEIlUcB9gFriAEkXkfCjX5ib8ezGxtR3xD7iZIxrfArjEsH7F1CHD4RFUtxefdqV/A=="],
"tailwindcss": ["tailwindcss@4.1.18", "", {}, "sha512-4+Z+0yiYyEtUVCScyfHCxOYP06L5Ne+JiHhY2IjR2KWMIWhJOYZKLSGZaP5HkZ8+bY0cxfzwDE5uOmzFXyIwxw=="],
"tapable": ["tapable@2.3.0", "", {}, "sha512-g9ljZiwki/LfxmQADO3dEY1CbpmXT5Hm2fJ+QaGKwSXUylMybePR7/67YW7jOrrvjEgL1Fmz5kzyAjWVWLlucg=="],
"tinyexec": ["tinyexec@1.0.2", "", {}, "sha512-W/KYk+NFhkmsYpuHq5JykngiOCnxeVL8v8dFnqxSD8qEEdRfXk1SDM6JzNqcERbcGYj9tMrDQBYV9cjgnunFIg=="],
"tinyglobby": ["tinyglobby@0.2.15", "", { "dependencies": { "fdir": "^6.5.0", "picomatch": "^4.0.3" } }, "sha512-j2Zq4NyQYG5XMST4cbs02Ak8iJUdxRM0XI5QyxXuZOzKOINmWurp3smXu3y5wDcJrptwpSjgXHzIQxR0omXljQ=="],
"trim-lines": ["trim-lines@3.0.1", "", {}, "sha512-kRj8B+YHZCc9kQYdWfJB2/oUl9rA99qbowYYBtr4ui4mZyAQ2JpvVBd/6U2YloATfqBhBTSMhTpgBHtU0Mf3Rg=="],
"trough": ["trough@2.2.0", "", {}, "sha512-tmMpK00BjZiUyVyvrBK7knerNgmgvcV/KLVyuma/SC+TQN167GrMRciANTz09+k3zW8L8t60jWO1GpfkZdjTaw=="],
"tslib": ["tslib@2.8.1", "", {}, "sha512-oJFu94HQb+KVduSUQL7wnpmqnfmLsOA/nAh6b6EH0wCEoK0/mPeXU6c3wKDV83MkOuHPRHtSXKKU99IBazS/2w=="],
"typescript": ["typescript@5.9.3", "", { "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" } }, "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw=="],
"undici-types": ["undici-types@7.16.0", "", {}, "sha512-Zz+aZWSj8LE6zoxD+xrjh4VfkIG8Ya6LvYkZqtUQGJPZjYl53ypCaUwWqo7eI0x66KBGeRo+mlBEkMSeSZ38Nw=="],
"unified": ["unified@11.0.5", "", { "dependencies": { "@types/unist": "^3.0.0", "bail": "^2.0.0", "devlop": "^1.0.0", "extend": "^3.0.0", "is-plain-obj": "^4.0.0", "trough": "^2.0.0", "vfile": "^6.0.0" } }, "sha512-xKvGhPWw3k84Qjh8bI3ZeJjqnyadK+GEFtazSfZv/rKeTkTjOJho6mFqh2SM96iIcZokxiOpg78GazTSg8+KHA=="],
"unist-util-is": ["unist-util-is@6.0.1", "", { "dependencies": { "@types/unist": "^3.0.0" } }, "sha512-LsiILbtBETkDz8I9p1dQ0uyRUWuaQzd/cuEeS1hoRSyW5E5XGmTzlwY1OrNzzakGowI9Dr/I8HVaw4hTtnxy8g=="],
"unist-util-position": ["unist-util-position@5.0.0", "", { "dependencies": { "@types/unist": "^3.0.0" } }, "sha512-fucsC7HjXvkB5R3kTCO7kUjRdrS0BJt3M/FPxmHMBOm8JQi2BsHAHFsy27E0EolP8rp0NzXsJ+jNPyDWvOJZPA=="],
"unist-util-position-from-estree": ["unist-util-position-from-estree@2.0.0", "", { "dependencies": { "@types/unist": "^3.0.0" } }, "sha512-KaFVRjoqLyF6YXCbVLNad/eS4+OfPQQn2yOd7zF/h5T/CSL2v8NpN6a5TPvtbXthAGw5nG+PuTtq+DdIZr+cRQ=="],
"unist-util-remove-position": ["unist-util-remove-position@5.0.0", "", { "dependencies": { "@types/unist": "^3.0.0", "unist-util-visit": "^5.0.0" } }, "sha512-Hp5Kh3wLxv0PHj9m2yZhhLt58KzPtEYKQQ4yxfYFEO7EvHwzyDYnduhHnY1mDxoqr7VUwVuHXk9RXKIiYS1N8Q=="],
"unist-util-stringify-position": ["unist-util-stringify-position@4.0.0", "", { "dependencies": { "@types/unist": "^3.0.0" } }, "sha512-0ASV06AAoKCDkS2+xw5RXJywruurpbC4JZSm7nr7MOt1ojAzvyyaO+UxZf18j8FCF6kmzCZKcAgN/yu2gm2XgQ=="],
"unist-util-visit": ["unist-util-visit@5.1.0", "", { "dependencies": { "@types/unist": "^3.0.0", "unist-util-is": "^6.0.0", "unist-util-visit-parents": "^6.0.0" } }, "sha512-m+vIdyeCOpdr/QeQCu2EzxX/ohgS8KbnPDgFni4dQsfSCtpz8UqDyY5GjRru8PDKuYn7Fq19j1CQ+nJSsGKOzg=="],
"unist-util-visit-parents": ["unist-util-visit-parents@6.0.2", "", { "dependencies": { "@types/unist": "^3.0.0", "unist-util-is": "^6.0.0" } }, "sha512-goh1s1TBrqSqukSc8wrjwWhL0hiJxgA8m4kFxGlQ+8FYQ3C/m11FcTs4YYem7V664AhHVvgoQLk890Ssdsr2IQ=="],
"use-callback-ref": ["use-callback-ref@1.3.3", "", { "dependencies": { "tslib": "^2.0.0" }, "peerDependencies": { "@types/react": "*", "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-jQL3lRnocaFtu3V00JToYz/4QkNWswxijDaCVNZRiRTO3HQDLsdu1ZtmIUvV4yPp+rvWm5j0y0TG/S61cuijTg=="],
"use-sidecar": ["use-sidecar@1.1.3", "", { "dependencies": { "detect-node-es": "^1.1.0", "tslib": "^2.0.0" }, "peerDependencies": { "@types/react": "*", "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-Fedw0aZvkhynoPYlA5WXrMCAMm+nSWdZt6lzJQ7Ok8S6Q+VsHmHpRWndVRJ8Be0ZbkfPc5LRYH+5XrzXcEeLRQ=="],
"util-deprecate": ["util-deprecate@1.0.2", "", {}, "sha512-EPD5q1uXyFxJpCrLnCc1nHnq3gOa6DZBocAIiI2TaSCA7VCJ1UJDMagCzIkXNsUYfD1daK//LTEQ8xiIbrHtcw=="],
"vfile": ["vfile@6.0.3", "", { "dependencies": { "@types/unist": "^3.0.0", "vfile-message": "^4.0.0" } }, "sha512-KzIbH/9tXat2u30jf+smMwFCsno4wHVdNmzFyL+T/L3UGqqk6JKfVqOFOZEpZSHADH1k40ab6NUIXZq422ov3Q=="],
"vfile-message": ["vfile-message@4.0.3", "", { "dependencies": { "@types/unist": "^3.0.0", "unist-util-stringify-position": "^4.0.0" } }, "sha512-QTHzsGd1EhbZs4AsQ20JX1rC3cOlt/IWJruk893DfLRr57lcnOeMaWG4K0JrRta4mIJZKth2Au3mM3u03/JWKw=="],
"xml-js": ["xml-js@1.6.11", "", { "dependencies": { "sax": "^1.2.4" }, "bin": { "xml-js": "./bin/cli.js" } }, "sha512-7rVi2KMfwfWFl+GpPg6m80IVMWXLRjO+PxTq7V2CDhoGak0wzYzFgUY2m4XJ47OGdXd8eLE8EmwfAmdjw7lC1g=="],
"yaml": ["yaml@2.8.2", "", { "bin": { "yaml": "bin.mjs" } }, "sha512-mplynKqc1C2hTVYxd0PU2xQAc22TI1vShAYGksCCfxbn/dFwnHTNi1bvYsBTkhdUNtGIf5xNOg938rrSSYvS9A=="],
"zod": ["zod@4.3.6", "", {}, "sha512-rftlrkhHZOcjDwkGlnUtZZkvaPHCsDATp4pGpuOOMDaTdDDXF91wuVDJoWoPsKX/3YPQ5fHuF3STjcYyKr+Qhg=="],
"zwitch": ["zwitch@2.0.4", "", {}, "sha512-bXE4cR/kVZhKZX/RjPEflHaKVhUVl85noU3v6b8apfQEc1x4A+zBxjZ4lN8LqGd6WZ3dl98pY4o717VFmoPp+A=="],
"@fumadocs/ui/tailwind-merge": ["tailwind-merge@3.4.0", "", {}, "sha512-uSaO4gnW+b3Y2aWoWfFpX62vn2sR3skfhbjsEnaBI81WD1wBLlHZe5sWf0AqjksNdYTbGBEd0UasQMT3SNV15g=="],
"@scalar/openapi-parser/@scalar/json-magic": ["@scalar/json-magic@0.9.4", "", { "dependencies": { "@scalar/helpers": "0.2.9", "yaml": "^2.8.0" } }, "sha512-PyfyWrH4ZkW0TM1ColiiHj4NRF8hUM61H0UzAkHLhRNnKFxi6hI+oqNrwqPnyk93hrpkpTRHC7Fl5T0BRwuzVg=="],
"@tailwindcss/oxide-wasm32-wasi/@emnapi/core": ["@emnapi/core@1.8.1", "", { "dependencies": { "@emnapi/wasi-threads": "1.1.0", "tslib": "^2.4.0" }, "bundled": true }, "sha512-AvT9QFpxK0Zd8J0jopedNm+w/2fIzvtPKPjqyw9jwvBaReTTqPBk9Hixaz7KbjimP+QNz605/XnjFcDAL2pqBg=="],
"@tailwindcss/oxide-wasm32-wasi/@emnapi/runtime": ["@emnapi/runtime@1.8.1", "", { "dependencies": { "tslib": "^2.4.0" }, "bundled": true }, "sha512-mehfKSMWjjNol8659Z8KxEMrdSJDDot5SXMq00dM8BN4o+CLNXQ0xH2V7EchNHV4RmbZLmmPdEaXZc5H2FXmDg=="],
"@tailwindcss/oxide-wasm32-wasi/@emnapi/wasi-threads": ["@emnapi/wasi-threads@1.1.0", "", { "dependencies": { "tslib": "^2.4.0" }, "bundled": true }, "sha512-WI0DdZ8xFSbgMjR1sFsKABJ/C5OnRrjT06JXbZKexJGrDuPTzZdDYfFlsgcCXCyf+suG5QU2e/y1Wo2V/OapLQ=="],
"@tailwindcss/oxide-wasm32-wasi/@napi-rs/wasm-runtime": ["@napi-rs/wasm-runtime@1.1.1", "", { "dependencies": { "@emnapi/core": "^1.7.1", "@emnapi/runtime": "^1.7.1", "@tybys/wasm-util": "^0.10.1" }, "bundled": true }, "sha512-p64ah1M1ld8xjWv3qbvFwHiFVWrq1yFvV4f7w+mzaqiR4IlSgkqhcRdHwsGgomwzBH51sRY4NEowLxnaBjcW/A=="],
"@tailwindcss/oxide-wasm32-wasi/@tybys/wasm-util": ["@tybys/wasm-util@0.10.1", "", { "dependencies": { "tslib": "^2.4.0" }, "bundled": true }, "sha512-9tTaPJLSiejZKx+Bmog4uSubteqTvFrVrURwkmHixBo0G4seD0zUxp98E1DzUBJxLQ3NPwXrGKDiVjwx/DpPsg=="],
"@tailwindcss/oxide-wasm32-wasi/tslib": ["tslib@2.8.1", "", { "bundled": true }, "sha512-oJFu94HQb+KVduSUQL7wnpmqnfmLsOA/nAh6b6EH0wCEoK0/mPeXU6c3wKDV83MkOuHPRHtSXKKU99IBazS/2w=="],
"fumadocs-openapi/@radix-ui/react-slot": ["@radix-ui/react-slot@1.2.4", "", { "dependencies": { "@radix-ui/react-compose-refs": "1.1.2" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-Jl+bCv8HxKnlTLVrcDE8zTMJ09R9/ukw4qBs/oZClOfoQk/cOTbDn+NceXfV7j09YPVQUryJPHurafcSg6EVKA=="],
"fumadocs-openapi/lucide-react": ["lucide-react@0.563.0", "", { "peerDependencies": { "react": "^16.5.1 || ^17.0.0 || ^18.0.0 || ^19.0.0" } }, "sha512-8dXPB2GI4dI8jV4MgUDGBeLdGk8ekfqVZ0BdLcrRzocGgG75ltNEmWS+gE7uokKF/0oSUuczNDT+g9hFJ23FkA=="],
"fumadocs-openapi/tailwind-merge": ["tailwind-merge@3.4.0", "", {}, "sha512-uSaO4gnW+b3Y2aWoWfFpX62vn2sR3skfhbjsEnaBI81WD1wBLlHZe5sWf0AqjksNdYTbGBEd0UasQMT3SNV15g=="],
"fumadocs-ui/@radix-ui/react-slot": ["@radix-ui/react-slot@1.2.4", "", { "dependencies": { "@radix-ui/react-compose-refs": "1.1.2" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-Jl+bCv8HxKnlTLVrcDE8zTMJ09R9/ukw4qBs/oZClOfoQk/cOTbDn+NceXfV7j09YPVQUryJPHurafcSg6EVKA=="],
"fumadocs-ui/lucide-react": ["lucide-react@0.563.0", "", { "peerDependencies": { "react": "^16.5.1 || ^17.0.0 || ^18.0.0 || ^19.0.0" } }, "sha512-8dXPB2GI4dI8jV4MgUDGBeLdGk8ekfqVZ0BdLcrRzocGgG75ltNEmWS+gE7uokKF/0oSUuczNDT+g9hFJ23FkA=="],
"next/postcss": ["postcss@8.4.31", "", { "dependencies": { "nanoid": "^3.3.6", "picocolors": "^1.0.0", "source-map-js": "^1.0.2" } }, "sha512-PS08Iboia9mts/2ygV3eLpY5ghnUcfLV/EXTOW1E2qYxJKGGBUtNjN76FYHnMs36RmARn41bC0AZmn+rR0OVpQ=="],
"parse-entities/@types/unist": ["@types/unist@2.0.11", "", {}, "sha512-CmBKiL6NNo/OqgmMn95Fk9Whlp2mtvIv+KNpQKN2F4SjvrEesubTRWGYSg+BnWZOnlCaSTU1sMpsBOzgbYhnsA=="],
"@scalar/openapi-parser/@scalar/json-magic/@scalar/helpers": ["@scalar/helpers@0.2.9", "", {}, "sha512-Y4ffJF0yELdwZ0BKgonqn3SumIgRn1WKyYCVHD+TDM7qRFChdGRypyt20+efHs26fmJeyBAIIv2laICj5uimiw=="],
}
}

13
docs/cli.json Normal file
View File

@@ -0,0 +1,13 @@
{
  "$schema": "node_modules/@fumadocs/cli/dist/schema/default.json",
  "aliases": {
    "uiDir": "./components/ui",
    "componentsDir": "./components",
    "blockDir": "./components",
    "cssDir": "./styles",
    "libDir": "./lib"
  },
  "baseDir": "",
  "uiLibrary": "radix-ui",
  "commands": {}
}

View File

@@ -0,0 +1,247 @@
'use client';
import { type ComponentProps, useMemo, useState } from 'react';
import { Check, ChevronDown, Copy, ExternalLinkIcon, TextIcon } from 'lucide-react';
import { cn } from '../../lib/cn';
import { useCopyButton } from 'fumadocs-ui/utils/use-copy-button';
import { Popover, PopoverTrigger, PopoverContent } from '../ui/popover';
import { buttonVariants } from '../ui/button';
const cache = new Map<string, Promise<string>>();
export function MarkdownCopyButton({
markdownUrl,
...props
}: ComponentProps<'button'> & {
/**
* A URL to fetch the raw Markdown/MDX content of page
*/
markdownUrl: string;
}) {
const [isLoading, setLoading] = useState(false);
const [checked, onClick] = useCopyButton(async () => {
const cached = cache.get(markdownUrl);
if (cached) return navigator.clipboard.writeText(await cached);
setLoading(true);
try {
const promise = fetch(markdownUrl).then((res) => res.text());
cache.set(markdownUrl, promise);
await navigator.clipboard.write([
new ClipboardItem({
'text/plain': promise,
}),
]);
} finally {
setLoading(false);
}
});
return (
<button
disabled={isLoading}
onClick={onClick}
{...props}
className={cn(
buttonVariants({
variant: 'secondary',
size: 'sm',
className: 'gap-2 [&_svg]:size-3.5 [&_svg]:text-fd-muted-foreground',
}),
props.className,
)}
>
{checked ? <Check /> : <Copy />}
Copy Markdown
</button>
);
}
export function ViewOptionsPopover({
markdownUrl,
githubUrl,
...props
}: ComponentProps<typeof PopoverTrigger> & {
/**
* A URL to the raw Markdown/MDX content of page
*/
markdownUrl: string;
/**
* Source file URL on GitHub
*/
githubUrl: string;
}) {
const items = useMemo(() => {
const pageUrl = typeof window !== 'undefined' ? window.location.href : 'loading';
const q = `Read ${pageUrl}, I want to ask questions about it.`;
return [
{
title: 'Open in GitHub',
href: githubUrl,
icon: (
<svg fill="currentColor" role="img" viewBox="0 0 24 24">
<title>GitHub</title>
<path d="M12 .297c-6.63 0-12 5.373-12 12 0 5.303 3.438 9.8 8.205 11.385.6.113.82-.258.82-.577 0-.285-.01-1.04-.015-2.04-3.338.724-4.042-1.61-4.042-1.61C4.422 18.07 3.633 17.7 3.633 17.7c-1.087-.744.084-.729.084-.729 1.205.084 1.838 1.236 1.838 1.236 1.07 1.835 2.809 1.305 3.495.998.108-.776.417-1.305.76-1.605-2.665-.3-5.466-1.332-5.466-5.93 0-1.31.465-2.38 1.235-3.22-.135-.303-.54-1.523.105-3.176 0 0 1.005-.322 3.3 1.23.96-.267 1.98-.399 3-.405 1.02.006 2.04.138 3 .405 2.28-1.552 3.285-1.23 3.285-1.23.645 1.653.24 2.873.12 3.176.765.84 1.23 1.91 1.23 3.22 0 4.61-2.805 5.625-5.475 5.92.42.36.81 1.096.81 2.22 0 1.606-.015 2.896-.015 3.286 0 .315.21.69.825.57C20.565 22.092 24 17.592 24 12.297c0-6.627-5.373-12-12-12" />
</svg>
),
},
{
title: 'View as Markdown',
href: markdownUrl,
icon: <TextIcon />,
},
{
title: 'Open in Scira AI',
href: `https://scira.ai/?${new URLSearchParams({
q,
})}`,
icon: (
<svg
width="910"
height="934"
viewBox="0 0 910 934"
fill="none"
xmlns="http://www.w3.org/2000/svg"
>
<title>Scira AI</title>
<path
d="M647.664 197.775C569.13 189.049 525.5 145.419 516.774 66.8849C508.048 145.419 464.418 189.049 385.884 197.775C464.418 206.501 508.048 250.131 516.774 328.665C525.5 250.131 569.13 206.501 647.664 197.775Z"
fill="currentColor"
stroke="currentColor"
strokeWidth="8"
strokeLinejoin="round"
/>
<path
d="M516.774 304.217C510.299 275.491 498.208 252.087 480.335 234.214C462.462 216.341 439.058 204.251 410.333 197.775C439.059 191.3 462.462 179.209 480.335 161.336C498.208 143.463 510.299 120.06 516.774 91.334C523.25 120.059 535.34 143.463 553.213 161.336C571.086 179.209 594.49 191.3 623.216 197.775C594.49 204.251 571.086 216.341 553.213 234.214C535.34 252.087 523.25 275.491 516.774 304.217Z"
fill="currentColor"
stroke="currentColor"
strokeWidth="8"
strokeLinejoin="round"
/>
<path
d="M857.5 508.116C763.259 497.644 710.903 445.288 700.432 351.047C689.961 445.288 637.605 497.644 543.364 508.116C637.605 518.587 689.961 570.943 700.432 665.184C710.903 570.943 763.259 518.587 857.5 508.116Z"
stroke="currentColor"
strokeWidth="20"
strokeLinejoin="round"
/>
<path
d="M700.432 615.957C691.848 589.05 678.575 566.357 660.383 548.165C642.191 529.973 619.499 516.7 592.593 508.116C619.499 499.533 642.191 486.258 660.383 468.066C678.575 449.874 691.848 427.181 700.432 400.274C709.015 427.181 722.289 449.874 740.481 468.066C758.673 486.258 781.365 499.533 808.271 508.116C781.365 516.7 758.673 529.973 740.481 548.165C722.289 566.357 709.015 589.05 700.432 615.957Z"
stroke="currentColor"
strokeWidth="20"
strokeLinejoin="round"
/>
<path
d="M889.949 121.237C831.049 114.692 798.326 81.9698 791.782 23.0692C785.237 81.9698 752.515 114.692 693.614 121.237C752.515 127.781 785.237 160.504 791.782 219.404C798.326 160.504 831.049 127.781 889.949 121.237Z"
fill="currentColor"
stroke="currentColor"
strokeWidth="8"
strokeLinejoin="round"
/>
<path
d="M791.782 196.795C786.697 176.937 777.869 160.567 765.16 147.858C752.452 135.15 736.082 126.322 716.226 121.237C736.082 116.152 752.452 107.324 765.16 94.6152C777.869 81.9065 786.697 65.5368 791.782 45.6797C796.867 65.5367 805.695 81.9066 818.403 94.6152C831.112 107.324 847.481 116.152 867.338 121.237C847.481 126.322 831.112 135.15 818.403 147.858C805.694 160.567 796.867 176.937 791.782 196.795Z"
fill="currentColor"
stroke="currentColor"
strokeWidth="8"
strokeLinejoin="round"
/>
<path
d="M760.632 764.337C720.719 814.616 669.835 855.1 611.872 882.692C553.91 910.285 490.404 924.255 426.213 923.533C362.022 922.812 298.846 907.419 241.518 878.531C184.19 849.643 134.228 808.026 95.4548 756.863C56.6815 705.7 30.1238 646.346 17.8129 583.343C5.50207 520.339 7.76433 455.354 24.4266 393.359C41.089 331.364 71.7099 274.001 113.947 225.658C156.184 177.315 208.919 139.273 268.117 114.442"
stroke="currentColor"
strokeWidth="30"
strokeLinecap="round"
strokeLinejoin="round"
/>
</svg>
),
},
{
title: 'Open in ChatGPT',
href: `https://chatgpt.com/?${new URLSearchParams({
hints: 'search',
q,
})}`,
icon: (
<svg
role="img"
viewBox="0 0 24 24"
fill="currentColor"
xmlns="http://www.w3.org/2000/svg"
>
<title>OpenAI</title>
<path d="M22.2819 9.8211a5.9847 5.9847 0 0 0-.5157-4.9108 6.0462 6.0462 0 0 0-6.5098-2.9A6.0651 6.0651 0 0 0 4.9807 4.1818a5.9847 5.9847 0 0 0-3.9977 2.9 6.0462 6.0462 0 0 0 .7427 7.0966 5.98 5.98 0 0 0 .511 4.9107 6.051 6.051 0 0 0 6.5146 2.9001A5.9847 5.9847 0 0 0 13.2599 24a6.0557 6.0557 0 0 0 5.7718-4.2058 5.9894 5.9894 0 0 0 3.9977-2.9001 6.0557 6.0557 0 0 0-.7475-7.0729zm-9.022 12.6081a4.4755 4.4755 0 0 1-2.8764-1.0408l.1419-.0804 4.7783-2.7582a.7948.7948 0 0 0 .3927-.6813v-6.7369l2.02 1.1686a.071.071 0 0 1 .038.052v5.5826a4.504 4.504 0 0 1-4.4945 4.4944zm-9.6607-4.1254a4.4708 4.4708 0 0 1-.5346-3.0137l.142.0852 4.783 2.7582a.7712.7712 0 0 0 .7806 0l5.8428-3.3685v2.3324a.0804.0804 0 0 1-.0332.0615L9.74 19.9502a4.4992 4.4992 0 0 1-6.1408-1.6464zM2.3408 7.8956a4.485 4.485 0 0 1 2.3655-1.9728V11.6a.7664.7664 0 0 0 .3879.6765l5.8144 3.3543-2.0201 1.1685a.0757.0757 0 0 1-.071 0l-4.8303-2.7865A4.504 4.504 0 0 1 2.3408 7.872zm16.5963 3.8558L13.1038 8.364 15.1192 7.2a.0757.0757 0 0 1 .071 0l4.8303 2.7913a4.4944 4.4944 0 0 1-.6765 8.1042v-5.6772a.79.79 0 0 0-.407-.667zm2.0107-3.0231l-.142-.0852-4.7735-2.7818a.7759.7759 0 0 0-.7854 0L9.409 9.2297V6.8974a.0662.0662 0 0 1 .0284-.0615l4.8303-2.7866a4.4992 4.4992 0 0 1 6.6802 4.66zM8.3065 12.863l-2.02-1.1638a.0804.0804 0 0 1-.038-.0567V6.0742a4.4992 4.4992 0 0 1 7.3757-3.4537l-.142.0805L8.704 5.459a.7948.7948 0 0 0-.3927.6813zm1.0976-2.3654l2.602-1.4998 2.6069 1.4998v2.9994l-2.5974 1.4997-2.6067-1.4997Z" />
</svg>
),
},
{
title: 'Open in Claude',
href: `https://claude.ai/new?${new URLSearchParams({
q,
})}`,
icon: (
<svg
fill="currentColor"
role="img"
viewBox="0 0 24 24"
xmlns="http://www.w3.org/2000/svg"
>
<title>Anthropic</title>
<path d="M17.3041 3.541h-3.6718l6.696 16.918H24Zm-10.6082 0L0 20.459h3.7442l1.3693-3.5527h7.0052l1.3693 3.5528h3.7442L10.5363 3.5409Zm-.3712 10.2232 2.2914-5.9456 2.2914 5.9456Z" />
</svg>
),
},
{
title: 'Open in Cursor',
icon: (
<svg
fill="currentColor"
role="img"
viewBox="0 0 24 24"
xmlns="http://www.w3.org/2000/svg"
>
<title>Cursor</title>
<path d="M11.503.131 1.891 5.678a.84.84 0 0 0-.42.726v11.188c0 .3.162.575.42.724l9.609 5.55a1 1 0 0 0 .998 0l9.61-5.55a.84.84 0 0 0 .42-.724V6.404a.84.84 0 0 0-.42-.726L12.497.131a1.01 1.01 0 0 0-.996 0M2.657 6.338h18.55c.263 0 .43.287.297.515L12.23 22.918c-.062.107-.229.064-.229-.06V12.335a.59.59 0 0 0-.295-.51l-9.11-5.257c-.109-.063-.064-.23.061-.23" />
</svg>
),
href: `https://cursor.com/link/prompt?${new URLSearchParams({
text: q,
})}`,
},
];
}, [githubUrl, markdownUrl]);
return (
<Popover>
<PopoverTrigger
{...props}
className={cn(
buttonVariants({
variant: 'secondary',
size: 'sm',
}),
'gap-2 data-[state=open]:bg-fd-accent data-[state=open]:text-fd-accent-foreground',
props.className,
)}
>
Open
<ChevronDown className="size-3.5 text-fd-muted-foreground" />
</PopoverTrigger>
<PopoverContent className="flex flex-col">
{items.map((item) => (
<a
key={item.href}
href={item.href}
rel="noreferrer noopener"
target="_blank"
className="text-sm p-2 rounded-lg inline-flex items-center gap-2 hover:text-fd-accent-foreground hover:bg-fd-accent [&_svg]:size-4"
>
{item.icon}
{item.title}
<ExternalLinkIcon className="text-fd-muted-foreground size-3.5 ms-auto" />
</a>
))}
</PopoverContent>
</Popover>
);
}

View File

@@ -0,0 +1,6 @@
'use client';
import { defineClientConfig } from 'fumadocs-openapi/ui/client';

export default defineClientConfig({
  // Client-side configuration for API playground
});

View File

@@ -0,0 +1,7 @@
import { openapi } from '@/lib/openapi';
import { createAPIPage } from 'fumadocs-openapi/ui';
import client from './api-page.client';

export const APIPage = createAPIPage(openapi, {
  client,
});

View File

@@ -0,0 +1,29 @@
import { cva, type VariantProps } from 'class-variance-authority';

const variants = {
  primary:
    'bg-fd-primary text-fd-primary-foreground hover:bg-fd-primary/80 disabled:bg-fd-secondary disabled:text-fd-secondary-foreground',
  outline: 'border hover:bg-fd-accent hover:text-fd-accent-foreground',
  ghost: 'hover:bg-fd-accent hover:text-fd-accent-foreground',
  secondary:
    'border bg-fd-secondary text-fd-secondary-foreground hover:bg-fd-accent hover:text-fd-accent-foreground',
} as const;

export const buttonVariants = cva(
  'inline-flex items-center justify-center rounded-md p-2 text-sm font-medium transition-colors duration-100 disabled:pointer-events-none disabled:opacity-50 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-fd-ring',
  {
    variants: {
      variant: variants,
      // fumadocs uses `color` instead of `variant`
      color: variants,
      size: {
        sm: 'gap-1 px-2 py-1.5 text-xs',
        icon: 'p-1.5 [&_svg]:size-5',
        'icon-sm': 'p-1.5 [&_svg]:size-4.5',
        'icon-xs': 'p-1 [&_svg]:size-4',
      },
    },
  },
);

export type ButtonProps = VariantProps<typeof buttonVariants>;

View File

@@ -0,0 +1,32 @@
'use client';
import * as PopoverPrimitive from '@radix-ui/react-popover';
import * as React from 'react';
import { cn } from '../../lib/cn';

const Popover = PopoverPrimitive.Root;
const PopoverTrigger = PopoverPrimitive.Trigger;

const PopoverContent = React.forwardRef<
  React.ComponentRef<typeof PopoverPrimitive.Content>,
  React.ComponentPropsWithoutRef<typeof PopoverPrimitive.Content>
>(({ className, align = 'center', sideOffset = 4, ...props }, ref) => (
  <PopoverPrimitive.Portal>
    <PopoverPrimitive.Content
      ref={ref}
      align={align}
      sideOffset={sideOffset}
      side="bottom"
      className={cn(
        'z-50 origin-(--radix-popover-content-transform-origin) overflow-y-auto max-h-(--radix-popover-content-available-height) min-w-[240px] max-w-[98vw] rounded-xl border bg-fd-popover/60 backdrop-blur-lg p-2 text-sm text-fd-popover-foreground shadow-lg focus-visible:outline-none data-[state=closed]:animate-fd-popover-out data-[state=open]:animate-fd-popover-in',
        className,
      )}
      {...props}
    />
  </PopoverPrimitive.Portal>
));
PopoverContent.displayName = PopoverPrimitive.Content.displayName;

const PopoverClose = PopoverPrimitive.PopoverClose;

export { Popover, PopoverTrigger, PopoverContent, PopoverClose };

View File

@@ -0,0 +1,38 @@
---
title: "Documentation README"
description: "Voicebox documentation development guide"
---
This directory contains the documentation for Voicebox, built with [Fumadocs](https://fumadocs.dev).
## Development
### Running Locally
From the `docs/` directory:
```bash
bun install
bun run dev
```
The docs will be available at `http://localhost:3000`.
### Structure
- `content/docs/overview/` — user-facing guides (installation, quick start, feature walkthroughs)
- `content/docs/developer/` — architecture, backend internals, and contributor guides
- `content/docs/api-reference/` — auto-generated from the backend's OpenAPI schema
- `content/docs/index.mdx` — landing page
- `public/` — static assets (images, screenshots, videos)
### Writing Docs
- Use `.mdx` files for all documentation pages
- Navigation is generated from `content/docs/meta.json` files
- Fumadocs components available: `Callout`, `Cards` / `Card`, `Tabs` / `Tab`, `Steps` / `Step`, `Accordion` / `AccordionGroup`, `Files` / `Folder` / `File`
- API reference pages under `api-reference/` are regenerated from the backend's OpenAPI schema — don't edit them by hand
## Deployment
Docs are automatically deployed when changes land on `main`.

View File

@@ -0,0 +1,16 @@
---
title: Health
description: Health check endpoint.
full: true
_openapi:
  method: GET
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Health check endpoint.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/health","method":"get"}]} />

View File

@@ -0,0 +1,4 @@
{
  "title": "General",
  "pages": ["root__get", "health_health_get"]
}

View File

@@ -0,0 +1,16 @@
---
title: Root
description: Root endpoint.
full: true
_openapi:
  method: GET
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Root endpoint.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/","method":"get"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Generate Speech
description: Generate speech from text using a voice profile.
full: true
_openapi:
  method: POST
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Generate speech from text using a voice profile.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/generate","method":"post"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Get Audio
description: Serve generated audio file.
full: true
_openapi:
  method: GET
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Serve generated audio file.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/audio/{generation_id}","method":"get"}]} />

View File

@@ -0,0 +1,8 @@
{
  "title": "Generation",
  "pages": [
    "generate_speech_generate_post",
    "transcribe_audio_transcribe_post",
    "get_audio_audio__generation_id__get"
  ]
}

View File

@@ -0,0 +1,16 @@
---
title: Transcribe Audio
description: Transcribe audio file to text.
full: true
_openapi:
  method: POST
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Transcribe audio file to text.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/transcribe","method":"post"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Delete Generation
description: Delete a generation.
full: true
_openapi:
  method: DELETE
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Delete a generation.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/history/{generation_id}","method":"delete"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Get Generation
description: Get a generation by ID.
full: true
_openapi:
  method: GET
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Get a generation by ID.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/history/{generation_id}","method":"get"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Get Stats
description: Get generation statistics.
full: true
_openapi:
  method: GET
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Get generation statistics.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/history/stats","method":"get"}]} />

View File

@@ -0,0 +1,16 @@
---
title: List History
description: List generation history with optional filters.
full: true
_openapi:
  method: GET
  toc: []
  structuredData:
    headings: []
    contents:
      - content: List generation history with optional filters.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/history","method":"get"}]} />

View File

@@ -0,0 +1,9 @@
{
  "title": "History",
  "pages": [
    "list_history_history_get",
    "get_generation_history__generation_id__get",
    "delete_generation_history__generation_id__delete",
    "get_stats_history_stats_get"
  ]
}

View File

@@ -0,0 +1,5 @@
{
  "title": "API Reference",
  "defaultOpen": true,
  "pages": ["general", "profiles", "generation", "history", "models"]
}

View File

@@ -0,0 +1,16 @@
---
title: Get Model Progress
description: Get model download progress via Server-Sent Events.
full: true
_openapi:
  method: GET
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Get model download progress via Server-Sent Events.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/models/progress/{model_name}","method":"get"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Get Model Status
description: Get status of all available models.
full: true
_openapi:
  method: GET
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Get status of all available models.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/models/status","method":"get"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Load Model
description: Manually load TTS model.
full: true
_openapi:
  method: POST
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Manually load TTS model.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/models/load","method":"post"}]} />

View File

@@ -0,0 +1,10 @@
{
  "title": "Models",
  "pages": [
    "get_model_status_models_status_get",
    "load_model_models_load_post",
    "unload_model_models_unload_post",
    "trigger_model_download_models_download_post",
    "get_model_progress_models_progress__model_name__get"
  ]
}

View File

@@ -0,0 +1,16 @@
---
title: Trigger Model Download
description: Trigger download of a specific model.
full: true
_openapi:
  method: POST
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Trigger download of a specific model.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/models/download","method":"post"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Unload Model
description: Unload TTS model to free memory.
full: true
_openapi:
  method: POST
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Unload TTS model to free memory.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/models/unload","method":"post"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Add Profile Sample
description: Add a sample to a voice profile.
full: true
_openapi:
  method: POST
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Add a sample to a voice profile.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles/{profile_id}/samples","method":"post"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Create Profile
description: Create a new voice profile.
full: true
_openapi:
  method: POST
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Create a new voice profile.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles","method":"post"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Delete Profile
description: Delete a voice profile.
full: true
_openapi:
  method: DELETE
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Delete a voice profile.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles/{profile_id}","method":"delete"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Delete Profile Sample
description: Delete a profile sample.
full: true
_openapi:
  method: DELETE
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Delete a profile sample.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles/samples/{sample_id}","method":"delete"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Get Profile
description: Get a voice profile by ID.
full: true
_openapi:
  method: GET
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Get a voice profile by ID.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles/{profile_id}","method":"get"}]} />

View File

@@ -0,0 +1,16 @@
---
title: Get Profile Samples
description: Get all samples for a profile.
full: true
_openapi:
  method: GET
  toc: []
  structuredData:
    headings: []
    contents:
      - content: Get all samples for a profile.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles/{profile_id}/samples","method":"get"}]} />

View File

@@ -0,0 +1,16 @@
---
title: List Profiles
description: List all voice profiles.
full: true
_openapi:
  method: GET
  toc: []
  structuredData:
    headings: []
    contents:
      - content: List all voice profiles.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles","method":"get"}]} />

View File

@@ -0,0 +1,13 @@
{
  "title": "Profiles",
  "pages": [
    "list_profiles_profiles_get",
    "create_profile_profiles_post",
    "get_profile_profiles__profile_id__get",
    "update_profile_profiles__profile_id__put",
    "delete_profile_profiles__profile_id__delete",
    "get_profile_samples_profiles__profile_id__samples_get",
    "add_profile_sample_profiles__profile_id__samples_post",
    "delete_profile_sample_profiles_samples__sample_id__delete"
  ]
}

View File

@@ -0,0 +1,16 @@
---
title: Update Profile
description: Update a voice profile.
full: true
_openapi:
method: PUT
toc: []
structuredData:
headings: []
contents:
- content: Update a voice profile.
---
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles/{profile_id}","method":"put"}]} />

View File

@@ -0,0 +1,286 @@
---
title: "Architecture"
description: "Understanding Voicebox's technical architecture"
---
## System Overview

Voicebox uses a client-server architecture with a React frontend and Python backend. The desktop app is built with Tauri and contains two main layers:

**Frontend Layer:** A React application that handles the UI components, state management with Zustand, and data fetching with React Query (TanStack Query).

**Backend Layer:** A Python FastAPI server that hosts the REST API, runs a pluggable registry of TTS and STT engines, manages the SQLite database, and handles audio processing.

These two layers communicate via HTTP on `localhost:17493`, with the frontend making API requests to the backend. In production the backend is compiled with PyInstaller and launched as a Tauri sidecar; in development it's run manually via `uvicorn`.
## Frontend Architecture
### Tech Stack
- **Framework**: React 18 with TypeScript
- **State Management**: Zustand stores
- **Data Fetching**: React Query (TanStack Query)
- **Styling**: Tailwind CSS
- **Audio**: WaveSurfer.js
- **Desktop**: Tauri (Rust)
### Component Structure
<Files>
<Folder name="app/src" defaultOpen>
<Folder name="components">
<File name="Profiles/" />
<File name="Generation/" />
<File name="Stories/" />
<File name="ServerSettings/" />
</Folder>
<Folder name="lib">
<File name="api/" />
<File name="constants/" />
<File name="hooks/" />
<File name="utils/" />
</Folder>
<Folder name="stores" />
</Folder>
</Files>
## Backend Architecture
### Tech Stack
- **Framework**: FastAPI (Python 3.11+)
- **TTS Engines**: Qwen3-TTS, Qwen CustomVoice, LuxTTS, Chatterbox, Chatterbox Turbo, TADA, Kokoro
- **Transcription**: Whisper (PyTorch or MLX-Whisper)
- **Inference Backends**: MLX (Apple Silicon), PyTorch (CUDA / ROCm / XPU / DirectML / CPU)
- **Database**: SQLite via SQLAlchemy
- **Audio**: librosa, soundfile, Pedalboard
### Layout
<Files>
<Folder name="backend" defaultOpen>
<File name="app.py" />
<File name="main.py" />
<File name="config.py" />
<File name="models.py" />
<File name="server.py" />
<File name="build_binary.py" />
<Folder name="routes">
<File name="profiles.py" />
<File name="generate.py" />
<File name="history.py" />
<File name="models.py" />
<File name="channels.py" />
</Folder>
<Folder name="services">
<File name="generation.py" />
<File name="task_queue.py" />
<File name="profiles.py" />
<File name="channels.py" />
</Folder>
<Folder name="backends">
<File name="__init__.py" />
<File name="base.py" />
<File name="pytorch_backend.py" />
<File name="mlx_backend.py" />
<File name="qwen_custom_voice_backend.py" />
<File name="luxtts_backend.py" />
<File name="chatterbox_backend.py" />
<File name="chatterbox_turbo_backend.py" />
<File name="hume_backend.py" />
<File name="kokoro_backend.py" />
</Folder>
<Folder name="database">
<File name="models.py" />
<File name="session.py" />
</Folder>
<Folder name="utils">
<File name="audio.py" />
<File name="effects.py" />
</Folder>
</Folder>
</Files>
### Request Flow
An HTTP request enters a **route handler**, which validates input and delegates to a **service** function. The service calls into the appropriate **engine backend** via the registry, which runs the actual inference. Audio post-processing runs through **utils** (trim, resample, effects).
Route handlers are intentionally thin — they validate input, delegate to a service function, and format the response. All business logic lives in `services/`.
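The split can be sketched with plain functions (illustrative only: the real handlers use FastAPI decorators and Pydantic models, and `generate_speech` here is a hypothetical stand-in for the service layer):

```python
# Sketch of the route → service split; names and paths are illustrative.

def generate_route(payload: dict) -> dict:
    """Thin route handler: validate input, delegate, shape the response."""
    text = payload.get("text", "").strip()
    if not text:
        raise ValueError("text is required")  # validation only
    audio_path = generate_speech(text, payload.get("engine", "qwen"))
    return {"status": "ok", "audio_path": audio_path}  # response shaping only


def generate_speech(text: str, engine: str) -> str:
    """Service layer: all business logic lives here."""
    # In the real service this calls the engine backend via the registry,
    # post-processes audio, and persists metadata; here we just return a path.
    return f"/generations/{engine}/{abs(hash(text)) % 10_000}.wav"
```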
### Multi-Engine Registry
The backend is designed so that adding a new TTS engine only requires touching the `backends/` directory and the central registry. There is no per-engine branching in routes or services.
- **`TTSBackend` Protocol** (`backends/__init__.py`) — defines the contract every engine implements: `load_model`, `create_voice_prompt`, `combine_voice_prompts`, `generate`, `unload_model`, `is_loaded`, `_get_model_path`.
- **`ModelConfig` dataclass** — central metadata record for each model variant: `model_name`, `display_name`, `engine`, `hf_repo_id`, `size_mb`, `needs_trim`, `languages`, `supports_instruct`, etc.
- **`TTS_ENGINES` dict** — maps engine name (`"qwen"`, `"kokoro"`, etc.) to display name.
- **`get_tts_backend_for_engine(engine)`** — thread-safe factory that lazily instantiates and caches the backend for an engine using double-checked locking.
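A minimal sketch of the factory's double-checked locking (the real registry in `backends/__init__.py` differs in detail; the `factories` argument is an illustrative stand-in for the engine class table):

```python
import threading
from typing import Protocol


class TTSBackend(Protocol):
    """Abbreviated contract; the real Protocol also declares generate, unload_model, etc."""
    def load_model(self, model_name: str) -> None: ...
    def is_loaded(self) -> bool: ...


_backends: dict = {}           # one instance per engine, per process
_lock = threading.Lock()


def get_tts_backend_for_engine(engine: str, factories: dict):
    """Lazily instantiate and cache the backend for an engine."""
    backend = _backends.get(engine)        # fast path: no lock once cached
    if backend is None:
        with _lock:
            backend = _backends.get(engine)  # re-check under the lock
            if backend is None:
                backend = factories[engine]()  # first caller constructs it
                _backends[engine] = backend
    return backend
```

The second lookup under the lock is what makes this safe: two threads can both see `None` on the fast path, but only one constructs the backend.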
Shipped engines:
| Engine key | Display name | Profile type |
|------------|--------------|--------------|
| `qwen` | Qwen TTS | Cloned |
| `qwen_custom_voice` | Qwen CustomVoice | Preset |
| `luxtts` | LuxTTS | Cloned |
| `chatterbox` | Chatterbox TTS | Cloned |
| `chatterbox_turbo` | Chatterbox Turbo | Cloned |
| `tada` | TADA | Cloned |
| `kokoro` | Kokoro | Preset |
See [TTS Engines](/developer/tts-engines) for the full contract and integration phases, and [PROJECT_STATUS.md](https://github.com/jamiepine/voicebox/blob/main/docs/PROJECT_STATUS.md) for candidates under evaluation.
### Key Modules
- **`app.py`** — FastAPI app factory, CORS, lifecycle events
- **`main.py`** — Entry point (imports app, runs uvicorn)
- **`server.py`** — Tauri sidecar launcher, parent-pid watchdog, frozen-build environment setup
- **`services/generation.py`** — Single function handling all generation modes (generate, retry, regenerate)
- **`services/task_queue.py`** — Serial generation queue for GPU inference
- **`backends/__init__.py`** — Protocol definitions, `ModelConfig` registry, and engine factory
- **`backends/base.py`** — Shared utilities across all engine implementations (device selection, progress tracking, output trimming)
### Inference Backend Selection
The server detects the best inference backend at startup and uses it for all engines that support it:
| Platform | Backend | Acceleration |
|----------|---------|--------------|
| macOS (Apple Silicon) | MLX | Metal / Neural Engine |
| Windows / Linux (NVIDIA) | PyTorch | CUDA (cu128) |
| Linux (AMD) | PyTorch | ROCm |
| Windows / Linux (Intel Arc) | PyTorch | XPU (IPEX) |
| Windows (other GPU) | PyTorch | DirectML |
| Any | PyTorch | CPU fallback |
See [GPU Acceleration](/overview/gpu-acceleration) for platform-specific notes and manual overrides.
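The PyTorch side of the selection order can be sketched roughly like this (an assumption-laden sketch: in reality MLX is chosen earlier by a platform check, DirectML detection is omitted, and the function name is illustrative):

```python
def pick_device() -> str:
    """Pick the best available PyTorch device, mirroring the table above."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch at all → CPU fallback
    if torch.cuda.is_available():
        return "cuda"  # CUDA and ROCm builds both report through torch.cuda
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"   # Intel Arc via IPEX
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"   # Apple GPU when running plain PyTorch
    return "cpu"
```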
### Data Model
Core tables (see `backend/database/models.py`):
- **`profiles`** — Voice profiles with `voice_type` discriminator (`cloned` | `preset` | `designed`), `preset_engine`, `preset_voice_id`, and `default_engine`.
- **`profile_samples`** — Reference audio clips + transcripts for cloned profiles. Empty for preset profiles.
- **`generations`** — Generated audio with text, engine, model, language, seed, and duration.
- **`generation_versions`** — Processed variants of a generation with different effects chains applied.
- **`audio_channels`** + **`channel_device_mappings`** + **`profile_channel_mappings`** — Multi-output routing.
See [Voice Profiles](/developer/voice-profiles) and [Effects Pipeline](/developer/effects-pipeline) for details.
## Desktop App (Tauri)
### Rust Backend
<Files>
<Folder name="tauri/src-tauri" defaultOpen>
<File name="Cargo.toml" />
<File name="tauri.conf.json" />
<File name="src/" />
<Folder name="binaries" />
</Folder>
</Files>
### Responsibilities
- Launch Python backend as sidecar process
- Native file dialogs
- System tray integration
- Auto-updates (Tauri updater + custom CUDA backend swap)
- Parent-PID watchdog so the backend exits if the app crashes
## Build Process
### Development
```bash
just dev # Starts backend + Tauri app
just dev-web # Starts backend + web app (no Tauri)
just dev-backend # Backend only
just dev-frontend # Tauri app only (backend must be running)
```
### Production
```bash
just build # CPU server binary + Tauri installer
just build-local # CPU + CUDA binaries + Tauri installer (Windows)
just build-server # Server binary only
just build-tauri # Tauri app only
```
See [Building](/developer/building) for what PyInstaller does and how the CUDA binary is split and packaged separately.
## Data Flow
### Generation Flow
1. **User Input** — text entered in a React component, engine + profile selected
2. **State Update** — Zustand generation form store records the request
3. **API Request** — React Query mutation hits `POST /generate`
4. **Route** — `routes/generate.py` validates input, dispatches to `services/generation.py`
5. **Voice Prompt** — the service creates or retrieves a cached voice prompt via the engine's backend
6. **Queue** — `services/task_queue.py` serializes generation to avoid GPU contention
7. **Inference** — the engine backend runs `generate()` and returns audio + sample rate
8. **Post-process** — optional trim (for engines that need it), effects chain applied per generation version
9. **Storage** — audio written to the generations directory, metadata saved to SQLite
10. **Response** — backend returns the generation record; frontend updates React Query cache and plays audio
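Step 5's caching can be sketched as a content-derived key (illustrative; the real key derivation is engine-specific):

```python
import hashlib


def voice_prompt_cache_key(audio_bytes: bytes, reference_text: str, engine: str) -> str:
    """Cache key derived from reference audio content + transcript + engine."""
    h = hashlib.sha256()
    h.update(audio_bytes)
    h.update(reference_text.encode("utf-8"))
    return f"{engine}:{h.hexdigest()[:16]}"
```

Because the key is derived from content rather than file paths, re-importing the same sample under a different name still hits the cache.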
## Performance Considerations
### Frontend
- **Code splitting** — lazy-load routes
- **Memoization** — `React.memo` for heavy components
- **Virtual scrolling** — for large lists
- **Debouncing** — search and input handling
### Backend
- **Async I/O** — all I/O is async; inference runs in `asyncio.to_thread`
- **Serial task queue** — avoids multiple engines fighting for the GPU
- **Voice prompt caching** — engine-specific, keyed by audio hash + reference text
- **Model pinning** — only one model per engine loaded at a time; switching unloads the previous one
- **Per-engine backend cache** — engines are only instantiated once per process
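The serial queue could look roughly like this (a simplified sketch, not the actual `services/task_queue.py`):

```python
import asyncio


class SerialTaskQueue:
    """A single worker drains the queue, so only one job touches the GPU at a time."""

    def __init__(self) -> None:
        self._queue = asyncio.Queue()
        self._worker = None  # lazily-started background task

    async def submit(self, fn, *args):
        """Enqueue a blocking function; resolves when its turn completes."""
        fut = asyncio.get_running_loop().create_future()
        await self._queue.put((fn, args, fut))
        if self._worker is None:
            self._worker = asyncio.create_task(self._run())
        return await fut

    async def _run(self):
        while True:
            fn, args, fut = await self._queue.get()
            try:
                # Inference runs in a thread so the event loop stays responsive
                fut.set_result(await asyncio.to_thread(fn, *args))
            except Exception as exc:
                fut.set_exception(exc)
```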
## Security
### Current
- Local-only by default (bound to `127.0.0.1:17493`)
- No authentication (localhost trust)
- File system sandboxing via Tauri
### Planned
- API key authentication for remote mode
- User accounts
- Rate limiting
- HTTPS support
## Deployment Modes
### Local Mode
- Backend runs as sidecar
- All data stays on device
- No network required
### Remote Mode
- Backend on a separate machine (Docker or bare host)
- Frontend (desktop or web) connects over HTTP
- See [Remote Mode](/overview/remote-mode) and [Docker](/overview/docker)
## Next Steps
<Cards>
<Card title="Development Setup" href="/developer/setup">
Set up your dev environment
</Card>
<Card title="TTS Engines" href="/developer/tts-engines">
How to add a new engine
</Card>
<Card title="Contributing" href="/developer/contributing">
Contribute to Voicebox
</Card>
</Cards>


@@ -0,0 +1,310 @@
---
title: "Audio Channels"
description: "How audio output routing works in Voicebox"
---
## Overview
Audio channels allow routing voice output to different audio devices. This is useful for multi-output setups where different voices should play through different speakers or applications.
## Architecture
**Channel:** A named audio bus that can be assigned to output devices.
**Device Mapping:** Links channels to OS audio device identifiers.
**Profile Mapping:** Links voice profiles to channels (many-to-many).
## Data Model
### AudioChannel Table
```python
class AudioChannel(Base):
__tablename__ = "audio_channels"
id = Column(String, primary_key=True)
name = Column(String, nullable=False)
is_default = Column(Boolean, default=False)
created_at = Column(DateTime)
```
### ChannelDeviceMapping Table
```python
class ChannelDeviceMapping(Base):
__tablename__ = "channel_device_mappings"
id = Column(String, primary_key=True)
channel_id = Column(String, ForeignKey("audio_channels.id"))
device_id = Column(String) # OS device identifier
```
### ProfileChannelMapping Table
```python
class ProfileChannelMapping(Base):
__tablename__ = "profile_channel_mappings"
profile_id = Column(String, ForeignKey("profiles.id"), primary_key=True)
channel_id = Column(String, ForeignKey("audio_channels.id"), primary_key=True)
```
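The two mapping tables compose like a join: profile → channels → devices. In plain data (dicts standing in for rows, not the real SQLAlchemy query), resolving a profile's output devices looks like:

```python
def devices_for_profile(profile_id, profile_channels, channel_devices):
    """Resolve a profile's output devices through its channels.

    profile_channels / channel_devices are plain-dict stand-ins for rows of
    profile_channel_mappings and channel_device_mappings.
    """
    channel_ids = {
        pc["channel_id"] for pc in profile_channels
        if pc["profile_id"] == profile_id
    }
    return sorted({
        cd["device_id"] for cd in channel_devices
        if cd["channel_id"] in channel_ids
    })
```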
## Default Channel
A default channel is created on database initialization:
```python
def init_db():
    # Create default channel if it doesn't exist
    default_channel = db.query(AudioChannel).filter(
        AudioChannel.is_default == True
    ).first()
    if not default_channel:
        default_channel = AudioChannel(
            id=str(uuid.uuid4()),
            name="Default",
            is_default=True
        )
        db.add(default_channel)
        # First run only: assign all existing profiles to the new default channel
        profiles = db.query(VoiceProfile).all()
        for profile in profiles:
            mapping = ProfileChannelMapping(
                profile_id=profile.id,
                channel_id=default_channel.id
            )
            db.add(mapping)
        db.commit()
```
## Core Operations
### Creating a Channel
```python
async def create_channel(
    data: AudioChannelCreate,
    db: Session,
) -> AudioChannelResponse:
    # Check name uniqueness
    existing = db.query(DBAudioChannel).filter_by(name=data.name).first()
    if existing:
        raise ValueError(f"Channel with name '{data.name}' already exists")
    # Create channel
    channel = DBAudioChannel(
        id=str(uuid.uuid4()),
        name=data.name,
        is_default=False,
    )
    db.add(channel)
    # Add device mappings
    for device_id in data.device_ids:
        mapping = DBChannelDeviceMapping(
            id=str(uuid.uuid4()),
            channel_id=channel.id,
            device_id=device_id,
        )
        db.add(mapping)
    db.commit()
    return AudioChannelResponse(
        id=channel.id,
        name=channel.name,
        is_default=channel.is_default,
        device_ids=list(data.device_ids),
        created_at=channel.created_at,
    )
```
### Updating a Channel
```python
async def update_channel(
    channel_id: str,
    data: AudioChannelUpdate,
    db: Session,
) -> AudioChannelResponse:
    channel = db.query(DBAudioChannel).filter_by(id=channel_id).first()
    if not channel:
        raise ValueError(f"Channel {channel_id} not found")
    # Cannot modify default channel
    if channel.is_default:
        raise ValueError("Cannot modify the default channel")
    # Update name
    if data.name is not None:
        channel.name = data.name
    # Update device mappings
    if data.device_ids is not None:
        # Delete existing
        db.query(DBChannelDeviceMapping).filter_by(channel_id=channel_id).delete()
        # Add new
        for device_id in data.device_ids:
            mapping = DBChannelDeviceMapping(
                id=str(uuid.uuid4()),
                channel_id=channel.id,
                device_id=device_id,
            )
            db.add(mapping)
    db.commit()
    device_ids = [
        m.device_id
        for m in db.query(DBChannelDeviceMapping).filter_by(channel_id=channel_id)
    ]
    return AudioChannelResponse(
        id=channel.id,
        name=channel.name,
        is_default=channel.is_default,
        device_ids=device_ids,
        created_at=channel.created_at,
    )
```
### Deleting a Channel
```python
async def delete_channel(channel_id: str, db: Session) -> bool:
    channel = db.query(DBAudioChannel).filter_by(id=channel_id).first()
    if not channel:
        return False
    # Cannot delete default channel
    if channel.is_default:
        raise ValueError("Cannot delete the default channel")
    # Delete device mappings
    db.query(DBChannelDeviceMapping).filter_by(channel_id=channel_id).delete()
    # Delete profile-channel mappings
    db.query(DBProfileChannelMapping).filter_by(channel_id=channel_id).delete()
    # Delete channel
    db.delete(channel)
    db.commit()
    return True
```
## Voice Assignment
### Assigning Voices to Channel
```python
async def set_channel_voices(
channel_id: str,
data: ChannelVoiceAssignment,
db: Session,
) -> None:
# Verify channel exists
channel = db.query(DBAudioChannel).filter_by(id=channel_id).first()
if not channel:
raise ValueError(f"Channel {channel_id} not found")
# Verify all profiles exist
for profile_id in data.profile_ids:
profile = db.query(DBVoiceProfile).filter_by(id=profile_id).first()
if not profile:
raise ValueError(f"Profile {profile_id} not found")
# Delete existing mappings
db.query(DBProfileChannelMapping).filter_by(channel_id=channel_id).delete()
# Add new mappings
for profile_id in data.profile_ids:
mapping = DBProfileChannelMapping(
profile_id=profile_id,
channel_id=channel_id,
)
db.add(mapping)
db.commit()
```
### Assigning Channels to Voice
```python
async def set_profile_channels(
profile_id: str,
data: ProfileChannelAssignment,
db: Session,
) -> None:
# Verify profile exists
profile = db.query(DBVoiceProfile).filter_by(id=profile_id).first()
if not profile:
raise ValueError(f"Profile {profile_id} not found")
# Delete existing mappings
db.query(DBProfileChannelMapping).filter_by(profile_id=profile_id).delete()
# Add new mappings
for channel_id in data.channel_ids:
mapping = DBProfileChannelMapping(
profile_id=profile_id,
channel_id=channel_id,
)
db.add(mapping)
db.commit()
```
## API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/channels` | List all channels |
| POST | `/channels` | Create a channel |
| GET | `/channels/{id}` | Get channel by ID |
| PUT | `/channels/{id}` | Update channel |
| DELETE | `/channels/{id}` | Delete channel |
| GET | `/channels/{id}/voices` | Get assigned voices |
| PUT | `/channels/{id}/voices` | Set assigned voices |
| GET | `/profiles/{id}/channels` | Get profile's channels |
| PUT | `/profiles/{id}/channels` | Set profile's channels |
## Request/Response Schemas
### AudioChannelCreate
```json
{
"name": "Speakers",
"device_ids": ["device_uuid_1", "device_uuid_2"]
}
```
### AudioChannelResponse
```json
{
"id": "channel_uuid",
"name": "Speakers",
"is_default": false,
"device_ids": ["device_uuid_1", "device_uuid_2"],
"created_at": "2024-01-15T10:30:00Z"
}
```
### ChannelVoiceAssignment
```json
{
"profile_ids": ["profile_1", "profile_2"]
}
```
## Use Cases
### Multi-Output Setup
**Scenario:** Stream with different voice characters
1. Create "Stream" channel → OBS virtual audio
2. Create "Monitor" channel → Headphones
3. Assign "Narrator" profile → Both channels
4. Assign "Character 1" profile → Stream only
### Virtual Audio Cables
Common device IDs for virtual audio:
- VB-Audio Virtual Cable
- BlackHole (macOS)
- Soundflower (macOS)
## Frontend Integration
The frontend needs to:
1. **Enumerate devices** using the browser's MediaDevices API (`enumerateDevices()`) or Tauri
2. **Display channel list** with device assignments
3. **Allow profile assignment** via drag/drop or dropdown
4. **Route playback** to correct device based on profile's channel
## Limitations
- Device IDs are OS-specific
- Hot-plugging may invalidate device IDs
- Default channel cannot be modified/deleted
- Frontend handles actual audio routing (backend just stores config)


@@ -0,0 +1,218 @@
---
title: "Auto-Updater"
description: "How Voicebox automatic updates work"
---
## Overview
Voicebox uses Tauri's built-in auto-updater to deliver signed updates to users. The system verifies updates cryptographically before installation.
## How It Works
When Voicebox launches (in production Tauri builds only), it checks GitHub Releases for a `latest.json` manifest. If a newer version is available:
1. **Notification** - An update banner appears at the top of the app
2. **Download** - User clicks "Install Now" to download the update package
3. **Verification** - The downloaded package is cryptographically verified using the public key embedded in `tauri.conf.json`
4. **Installation** - After verification, the update is installed
5. **Restart** - The app restarts automatically with the new version
Users can also check for updates manually via **Settings → Check for Updates**.
## Configuration
The updater is configured in `tauri/src-tauri/tauri.conf.json`:
```json
{
"plugins": {
"updater": {
"active": true,
"dialog": false,
"endpoints": [
"https://github.com/jamiepine/voicebox/releases/latest/download/latest.json"
],
"pubkey": "PASTE_PUBLIC_KEY_CONTENT_HERE"
}
}
}
```
**Key settings:**
- `endpoints` - URL to the `latest.json` manifest (checked on app startup)
- `pubkey` - Public key for verifying update signatures
- `dialog` - Set to `false` (we use custom UI instead of Tauri's built-in dialog)
## Release Manifest
The `latest.json` file defines available updates per platform:
```json
{
"version": "0.2.0",
"notes": "Bug fixes and improvements",
"pub_date": "2026-01-25T12:00:00Z",
"platforms": {
"darwin-aarch64": {
"signature": "base64_encoded_signature",
"url": "https://github.com/jamiepine/voicebox/releases/download/v0.2.0/voicebox_0.2.0_aarch64.app.tar.gz"
},
"darwin-x86_64": {
"signature": "base64_encoded_signature",
"url": "https://github.com/jamiepine/voicebox/releases/download/v0.2.0/voicebox_0.2.0_x64.app.tar.gz"
},
"linux-x86_64": {
"signature": "base64_encoded_signature",
"url": "https://github.com/jamiepine/voicebox/releases/download/v0.2.0/voicebox_0.2.0_amd64.AppImage"
},
"windows-x86_64": {
"signature": "base64_encoded_signature",
"url": "https://github.com/jamiepine/voicebox/releases/download/v0.2.0/voicebox_0.2.0_x64_en-US.msi"
}
}
}
```
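The "is there an update" check reduces to a version comparison plus a platform lookup. A sketch of what Tauri does internally (`pick_update` is illustrative, not a real API):

```python
def pick_update(manifest: dict, current_version: str, platform_key: str):
    """Return the platform entry if the manifest advertises a newer version, else None."""
    def ver(v: str):
        # "v0.2.0" or "0.2.0" → (0, 2, 0) for tuple comparison
        return tuple(int(part) for part in v.lstrip("v").split("."))

    if ver(manifest["version"]) <= ver(current_version):
        return None  # "No update available"
    return manifest["platforms"].get(platform_key)
```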
## Signing
Updates must be cryptographically signed to be accepted. The signing process:
1. **Generate keys** (one-time setup):
```bash
bun tauri signer generate -w ~/.tauri/voicebox.key
```
This creates:
- Private key: `~/.tauri/voicebox.key` (stored in GitHub Secrets, never committed)
- Public key: `~/.tauri/voicebox.key.pub` (pasted into `tauri.conf.json`)
2. **Build with signing** (GitHub Actions handles this):
- Set `TAURI_SIGNING_PRIVATE_KEY` environment variable
- Tauri signs the update package during build
- Generates `.sig` signature file alongside the installer
3. **Verification** - The updater verifies the package's signature with the public key before installing
## GitHub Actions Workflow
The release workflow (`.github/workflows/release.yml`) automatically:
- Builds signed releases for macOS, Windows, and Linux
- Creates the `latest.json` manifest with signatures
- Uploads everything to the GitHub Release
Triggered by pushing a git tag:
```bash
git tag v0.2.0 && git push --tags
```
## Environment Variables
GitHub Actions needs these secrets set:
- `TAURI_SIGNING_PRIVATE_KEY` - Content of `~/.tauri/voicebox.key`
- `TAURI_SIGNING_PRIVATE_KEY_PASSWORD` - Password for the key (if set)
## Security
<Callout type="warn">
**Critical:** Never commit the private key. Store it only in GitHub Secrets. The public key in `tauri.conf.json` is safe to commit and distribute.
</Callout>
- Updates are cryptographically signed using Ed25519
- HTTP endpoints are blocked (HTTPS only)
- Signature verification happens before installation
- Failed verification aborts the update
## Troubleshooting
### "Invalid signature" error
- Public key in `tauri.conf.json` doesn't match the private key used to sign
- Signature file wasn't uploaded to the release
### "No update available" when one exists
- `latest.json` version isn't higher than current version
- Wrong endpoint URL in configuration
- Manifest hasn't propagated to GitHub's CDN yet
### Update check fails in dev mode
The updater only works in production Tauri builds. It doesn't run during `just dev` or web mode.
### Build fails with signing error
- GitHub Secrets aren't set correctly
- Private key file is missing or corrupted
- Key format is wrong (should start with `dW50cnVzdGVkIGNvbW1lbnQ6`)
## CUDA Backend Updates
The CUDA-enabled backend is distributed separately from the main app because bundling CUDA would bloat the installer by several gigabytes for users who don't have an NVIDIA GPU. Unlike the Tauri auto-updater, the CUDA backend uses a custom download system built into the Python server.
**Size comparison (approximate):**
- Standard CPU bundle (in the installer): ~200–400 MB
- CUDA server core: ~945 MB (versioned with each Voicebox release)
- CUDA libs (NVIDIA runtime DLLs): ~1.7 GB (versioned independently, cached across upgrades)
### Two-archive split
Since v0.4, the CUDA binary is packaged as **two archives** instead of one:
- **Server core** (`voicebox-server-cuda.tar.gz`) — the Python server + PyTorch code, changes every release.
- **CUDA libs** (`cuda-libs-cu128-v1.tar.gz`) — the heavy NVIDIA CUDA/cuDNN DLLs, only re-downloaded when the CUDA toolkit major version changes.
This means most Voicebox upgrades only re-download the ~945 MB server core, not the full ~2.5 GB bundle.
### Download Process
When a user clicks "Install CUDA backend" in Settings → GPU:
1. **Server-core archive** — Downloaded from GitHub Releases and extracted.
2. **CUDA libs archive** — Downloaded separately (or reused if the installed version still matches).
3. **Verification** — SHA-256 checksum verification for integrity.
4. **Placement** — Extracted into `{data_dir}/backends/cuda/`.
5. **Restart** — The Voicebox server restarts and swaps in the CUDA backend.
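Step 3's checksum verification amounts to streaming the archive through SHA-256 and comparing digests (a sketch; the helper name is illustrative):

```python
import hashlib
from pathlib import Path


def verify_sha256(archive: Path, expected_hex: str) -> bool:
    """Stream the archive through SHA-256 and compare against the .sha256 value."""
    h = hashlib.sha256()
    with archive.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):  # 1 MiB chunks
            h.update(chunk)
    return h.hexdigest() == expected_hex.strip().lower()
```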
### Auto-Update on Startup
On startup, the backend compares the installed CUDA server-core version with the current app version. If they differ, the core archive is pulled in the background. If the libs version pinned by the new release also differs (rare — e.g. on a cu126 → cu128 bump), the user is prompted to confirm the larger download.
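The startup decision reduces to two string comparisons (a sketch; the function and field names are illustrative, not the real implementation):

```python
def plan_cuda_update(installed_core: str, app_version: str,
                     installed_libs: str, pinned_libs: str) -> dict:
    """Decide what to download on startup, per the comparison described above."""
    return {
        # Core archive: pulled in the background whenever versions diverge (~945 MB)
        "fetch_core": installed_core != app_version,
        # Libs archive: rare, larger, so the user confirms first (~1.7 GB)
        "prompt_libs": installed_libs != pinned_libs,
    }
```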
### Storage Location
Downloaded CUDA binaries live in the app's data directory:
```
{data_dir}/backends/cuda/
voicebox-server-cuda.exe # Windows
voicebox-server-cuda # macOS/Linux
<NVIDIA CUDA runtime DLLs>
```
### API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/backend/cuda-status` | GET | Check if the CUDA backend is available/active and which versions are installed |
| `/backend/download-cuda` | POST | Trigger server-core + libs download |
| `/backend/cuda-progress` | GET | SSE stream of download progress |
| `/backend/cuda` | DELETE | Remove the downloaded CUDA backend |
### Progress Tracking
Downloads report progress via Server-Sent Events (SSE):
```
GET /backend/cuda-progress
event: progress
data: {"current": 52428800, "total": 945000000, "filename": "voicebox-server-cuda.tar.gz", "status": "downloading"}
```
The frontend subscribes to this endpoint to show real-time progress, including which archive (server core vs libs) is currently downloading.
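Parsing one of these event blocks is straightforward (a sketch; the derived `percent` field is added here for illustration and is not part of the payload):

```python
import json


def parse_sse_progress(block: str):
    """Parse one SSE event block into the progress payload shown above."""
    data_lines = [
        line[len("data:"):].lstrip()
        for line in block.splitlines()
        if line.startswith("data:")
    ]
    if not data_lines:
        return None
    payload = json.loads("\n".join(data_lines))
    payload["percent"] = round(100 * payload["current"] / payload["total"], 1)
    return payload
```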
### Release Artifacts
For each CUDA-capable release, these files are uploaded to GitHub:
- `voicebox-server-cuda.tar.gz` — server-core archive
- `voicebox-server-cuda.tar.gz.sha256` — checksum
- `cuda-libs-cu128-v1.tar.gz` — CUDA runtime libs (only when the libs version bumps)
- `cuda-libs-cu128-v1.tar.gz.sha256` — checksum


@@ -0,0 +1,192 @@
---
title: "Building"
description: "How Voicebox is built for production"
---
## Overview
Voicebox uses a two-stage build process:
1. **Python Server Binary** — PyInstaller bundles the FastAPI backend into a standalone executable
2. **Tauri Desktop App** — Bundles the React frontend, Rust wrapper, and Python server as a sidecar
## Build Commands
```bash
just build # Build everything (server + Tauri)
just build-server # Build Python server binary only
just build-tauri # Build Tauri app only
```
## Server Binary Build
### Build Script
`scripts/build-server.sh` orchestrates the build:
```bash
# Determine platform (e.g., x86_64-apple-darwin)
PLATFORM=$(rustc --print host-tuple)
# Run PyInstaller via build_binary.py
cd backend
python build_binary.py
# Copy to Tauri's binaries directory
cp dist/voicebox-server ../tauri/src-tauri/binaries/voicebox-server-${PLATFORM}
```
### PyInstaller Configuration
`backend/build_binary.py` contains the PyInstaller configuration:
**Entry Point:** Uses `server.py` (not `main.py`) for Tauri sidecar support
**Key Options:**
- `--onefile` — Single executable
- `--hidden-import` — Explicitly import modules PyInstaller can't detect
- `--collect-all` — Bundle data files and native libraries for packages like `mlx`, `zipvoice`
- `--exclude-module` — Strip NVIDIA packages from CPU builds
**Platform-Specific Logic:**
```python
# Apple Silicon — include MLX backend
if is_apple_silicon() and not cuda:
args.extend([
"--hidden-import", "mlx",
"--collect-all", "mlx", # Bundles .dylib and .metallib files
])
# CUDA builds — include torch.cuda
if cuda:
args.extend(["--hidden-import", "torch.cuda"])
# CPU builds — exclude NVIDIA packages to save ~3GB
else:
for pkg in ["nvidia", "nvidia.cublas", "nvidia.cudnn", ...]:
args.extend(["--exclude-module", pkg])
```
**Environment Variable:**
```bash
export QWEN_TTS_PATH=~/path/to/Qwen3-TTS # Use local Qwen3-TTS source
```
### CUDA Binary
The CUDA-enabled server is built separately due to size (~2.43 GB vs ~410 MB CPU version):
```bash
cd backend
python build_binary.py --cuda
```
The resulting binary is too large for GitHub Releases, so it's split into parts for distribution (see Auto-Updater docs for the download mechanism).
## Tauri App Build
Tauri bundles everything together:
```bash
cd tauri
bun run tauri build
```
**What happens:**
1. Vite builds the React frontend
2. Rust compiles the Tauri wrapper
3. Sidecar binary is copied from `src-tauri/binaries/`
4. Platform-specific installer created (DMG, MSI, AppImage)
**Output locations:**
<Files>
<Folder name="tauri/src-tauri/target/release/bundle" defaultOpen>
<File name="dmg/" />
<File name="msi/" />
<File name="nsis/" />
<File name="appimage/" />
</Folder>
</Files>
### Sidecar Configuration
The server binary is declared as an external binary in `tauri.conf.json`:
```json
{
"tauri": {
"bundle": {
"externalBin": ["binaries/voicebox-server"]
}
}
}
```
Tauri looks for `voicebox-server-${PLATFORM}` in `src-tauri/binaries/` and bundles it.
## GitHub Actions Release
`.github/workflows/release.yml` automates the full build:
### Matrix Strategy
| Platform | Target | Backend | Notes |
|----------|--------|---------|-------|
| macos-latest | aarch64-apple-darwin | MLX | Apple Silicon native |
| macos-15-intel | x86_64-apple-darwin | PyTorch | Intel Macs |
| windows-latest | x86_64-pc-windows-msvc | PyTorch | Windows with CUDA optional |
### Build Steps
1. **Setup** — Python, Rust, Bun, dependencies
2. **Build Server** — `build-server.sh` (Unix) or `build_binary.py` (Windows)
3. **Build Tauri** — `tauri-action` with signing keys
4. **Upload** — Release artifacts and `latest.json`
### Code Signing
**macOS:**
- Apple Developer certificate imported from secrets
- Notarization via App Store Connect API
**Windows:**
- Tauri handles signing via `TAURI_SIGNING_PRIVATE_KEY`
### CUDA Binary (Separate Job)
The `build-cuda-windows` job runs separately:
1. Install PyTorch with CUDA 12.8
2. Build with `build_binary.py --cuda` (produces `--onedir` output)
3. Package with `scripts/package_cuda.py` into two archives:
- `voicebox-server-cuda.tar.gz` — server core (~945 MB)
- `cuda-libs-cu128-v1.tar.gz` — NVIDIA runtime libraries (~1.7 GB, cached independently)
4. Upload archives as release artifacts
This binary is downloaded on-demand by users who enable CUDA in settings. The CUDA libs archive is only re-downloaded when the CUDA toolkit version changes, not on every app update.
## Troubleshooting
<AccordionGroup>
<Accordion title="Binary not found in dist/">
PyInstaller failed to create the output. Check:
- Python venv is activated
- All dependencies installed: `pip install -r requirements.txt`
- PyInstaller installed: `pip install pyinstaller`
</Accordion>
<Accordion title="MLX/Metal libraries missing in bundle">
macOS Apple Silicon builds need `--collect-all mlx` to include `.dylib` and `.metallib` files, not just `--collect-data`.
</Accordion>
<Accordion title="CUDA DLLs bloating CPU build">
If you build the CPU version while a CUDA-enabled torch is installed locally, the script auto-detects this, swaps in the CPU torch temporarily, then restores the CUDA torch afterwards.
</Accordion>
<Accordion title="Tauri can't find sidecar">
Ensure binary exists at `tauri/src-tauri/binaries/voicebox-server-${PLATFORM}` before running Tauri build.
</Accordion>
</AccordionGroup>


@@ -0,0 +1,339 @@
---
title: "Contributing"
description: "How to contribute to Voicebox"
---
Thank you for your interest in contributing to Voicebox! This guide will help you get started.
## Code of Conduct
- Be respectful and inclusive
- Welcome newcomers and help them learn
- Focus on constructive feedback
- Respect different viewpoints and experiences
## Getting Started
Before you start contributing, make sure you have:
1. **Read the documentation** to understand how Voicebox works
2. **Set up your development environment** — see [Development Setup](/developer/setup)
3. **Explored the codebase** to understand the project structure
4. **Checked [`docs/PROJECT_STATUS.md`](https://github.com/jamiepine/voicebox/blob/main/docs/PROJECT_STATUS.md)** — the living engineering roadmap that tracks prioritized tasks (Tier 1 → 3), architectural bottlenecks, and candidate TTS engines under evaluation (including why some are backlogged)
5. **Checked existing issues** to see if someone else is working on something similar
## Ways to Contribute
<Cards>
<Card title="Report Bugs">
Found a bug? Open an issue with reproduction steps
</Card>
<Card title="Request Features">
Have an idea? Start a discussion or open an issue
</Card>
<Card title="Improve Docs">
Fix typos, add examples, or clarify instructions
</Card>
<Card title="Write Code">
Fix bugs, add features, or optimize performance
</Card>
</Cards>
## Development Workflow
### 1. Fork & Clone
```bash
# Fork the repository on GitHub
# Then clone your fork
git clone https://github.com/YOUR_USERNAME/voicebox.git
cd voicebox
```
### 2. Create a Branch
Use descriptive branch names:
```bash
# For features
git checkout -b feature/voice-effects
# For bug fixes
git checkout -b fix/audio-playback-issue
# For documentation
git checkout -b docs/api-examples
```
### 3. Make Your Changes
Follow these guidelines:
<AccordionGroup>
<Accordion title="Code Style">
**TypeScript/React:**
- Use TypeScript strict mode
- Prefer functional components with hooks
- Use named exports
- Format with Biome (runs automatically)
**Python:**
- Follow PEP 8
- Use type hints
- Use async/await for I/O
- Document functions with docstrings
**Rust:**
- Follow Rust conventions
- Use meaningful names
- Handle errors explicitly
- Run `rustfmt`
</Accordion>
<Accordion title="Commit Messages">
Write clear, descriptive commit messages:
```bash
# Good
git commit -m "Add voice profile export feature"
git commit -m "Fix audio playback stopping after 30 seconds"
# Avoid
git commit -m "Update code"
git commit -m "Fix bug"
```
Format:
- Use imperative mood ("Add feature" not "Added feature")
- Keep first line under 50 characters
- Add detailed description if needed
</Accordion>
<Accordion title="Testing">
- Test your changes manually in the app
- Ensure backend API endpoints work
- Check for TypeScript/Python errors
- Verify UI components render correctly
- Add automated tests when possible
</Accordion>
</AccordionGroup>
### 4. Push & Create PR
```bash
# Push your branch
git push origin feature/your-feature-name
# Then create a pull request on GitHub
```
## Pull Request Guidelines
When creating a pull request:
<Steps>
<Step title="Use a Clear Title">
Examples:
- "Add voice profile export functionality"
- "Fix audio playback stopping after 30 seconds"
- "Improve generation speed with caching"
</Step>
<Step title="Provide Description">
Include:
- What changes you made
- Why you made them
- How to test them
- Screenshots (for UI changes)
- Reference related issues
</Step>
<Step title="Update Documentation">
- Update relevant docs if behavior changes
- Add API documentation for new endpoints
- Update README if needed
</Step>
<Step title="Check the Checklist">
- [ ] Code follows style guidelines
- [ ] Documentation updated
- [ ] Changes tested
- [ ] No breaking changes (or documented)
- [ ] CHANGELOG.md updated
</Step>
</Steps>
## Project Structure
<Files>
<Folder name="voicebox" defaultOpen>
<Folder name="app/src">
<File name="components/" />
<File name="lib/" />
<File name="hooks/" />
<File name="stores/" />
</Folder>
<Folder name="backend">
<File name="app.py" />
<File name="main.py" />
<File name="server.py" />
<File name="models.py" />
<Folder name="routes" />
<Folder name="services" />
<Folder name="backends" />
<Folder name="database" />
<Folder name="utils" />
</Folder>
<Folder name="tauri">
<File name="src-tauri/" />
</Folder>
<File name="web/" />
<File name="landing/" />
<File name="scripts/" />
</Folder>
</Files>
## Areas for Contribution
### Bug Fixes
- Check [existing issues](https://github.com/jamiepine/voicebox/issues) for bugs
- Test your fix thoroughly
- Add regression tests if possible
### New Features
- Check [`docs/PROJECT_STATUS.md`](https://github.com/jamiepine/voicebox/blob/main/docs/PROJECT_STATUS.md) and the [roadmap](https://github.com/jamiepine/voicebox#roadmap) before proposing work — the status doc lists prioritized tasks (Tier 1 → 3), known architectural bottlenecks, and candidate TTS engines already under evaluation (including why some have been backlogged)
- Discuss major features in an issue first
- Keep features focused and well-scoped
- Adding a new TTS engine? See [TTS Engines](/developer/tts-engines) for the phased workflow
### Documentation
- Improve clarity and fix typos
- Add code examples
- Create tutorials or guides
- Document API endpoints
### UI/UX Improvements
- Improve accessibility
- Enhance visual design
- Optimize performance
- Add animations/transitions
### Infrastructure
- Improve build process
- Add CI/CD improvements
- Optimize bundle size
- Add testing infrastructure
## API Development
When adding new API endpoints:
<Steps>
<Step title="Add Route">
In `backend/main.py`:
```python
@app.post("/api/new-endpoint")
async def new_endpoint(data: RequestModel) -> ResponseModel:
"""Endpoint description."""
# Implementation
return response
```
</Step>
<Step title="Create Models">
In `backend/models.py`:
```python
class RequestModel(BaseModel):
    field: str

class ResponseModel(BaseModel):
    result: str
```
</Step>
<Step title="Regenerate Client">
```bash
just generate-api
```
This updates the TypeScript client with type-safe bindings.
</Step>
<Step title="Update Docs">
The API documentation is automatically generated from the OpenAPI schema. Ensure your endpoint has proper docstrings and type hints, then regenerate the docs:
```bash
just generate-api
```
</Step>
</Steps>
## Testing
Currently testing is primarily manual. When adding tests:
**Backend:**
```bash
cd backend
pytest
```
**Frontend:**
```bash
bun run test
```
**E2E (future):**
```bash
bun run test:e2e
```
## Release Process
Releases are managed by maintainers using `bumpversion`:
```bash
# Bump version (patch, minor, or major)
bumpversion patch
# Push with tags
git push && git push --tags
```
GitHub Actions automatically builds and publishes releases when tags are pushed.
## Community
- **GitHub Issues:** Bug reports and feature requests
- **GitHub Discussions:** General questions and ideas
- **Discord:** Real-time chat (coming soon)
## Recognition
Contributors are recognized in:
- [CHANGELOG.md](https://github.com/jamiepine/voicebox/blob/main/CHANGELOG.md)
- GitHub contributor list
- Release notes
## License
By contributing, you agree that your contributions will be licensed under the MIT License.
## Questions?
If you have questions:
1. Check the [documentation](/overview/introduction)
2. Read [`docs/PROJECT_STATUS.md`](https://github.com/jamiepine/voicebox/blob/main/docs/PROJECT_STATUS.md) for current engineering priorities
3. Search [existing issues](https://github.com/jamiepine/voicebox/issues)
4. Open a new issue or discussion
5. See [CONTRIBUTING.md](https://github.com/jamiepine/voicebox/blob/main/CONTRIBUTING.md) in the repo
Thank you for contributing to Voicebox! 🎉
---
title: "Effects Pipeline"
description: "Audio post-processing effects and generation versioning"
---
The effects pipeline provides professional-grade DSP audio processing using Spotify's Pedalboard library. Each generation can have multiple versions with different effect chains applied.
## Overview
**Key concepts:**
- **Effects Chain** — JSON-serializable list of effect configurations applied sequentially
- **Generation Version** — A processed variant of a generation with its own audio file and effects chain
- **Effect Preset** — Saved effects chain configuration (built-in or user-created)
- **Clean Version** — The original unprocessed generation audio
**Flow:**
1. TTS Generation creates clean audio
2. Effects Chain processes the audio
3. Processed Version is saved as a new generation version
Each generation maintains a clean version (original) plus any number of processed versions with different effect chains applied.
## Effect Types
The following effect types are available, each with configurable parameters:
### Chorus / Flanger
Modulated delay effect. A short `centre_delay_ms` produces a flanger; a longer one produces a chorus.
**Parameters:**
- `rate_hz`: LFO speed in Hz (range: 0.01 to 20, default: 1.0)
- `depth`: Modulation depth (range: 0.0 to 1.0, default: 0.5)
- `feedback`: Feedback amount (range: 0.0 to 0.95, default: 0.0)
- `centre_delay_ms`: Centre delay in milliseconds (range: 0.5 to 50, default: 7.0)
- `mix`: Wet/dry mix (range: 0.0 to 1.0, default: 0.5)
### Reverb
Room reverb effect.
**Parameters:**
- `room_size`: Room size (range: 0.0 to 1.0, default: 0.5)
- `damping`: High-frequency damping (range: 0.0 to 1.0, default: 0.5)
- `wet_level`: Wet level (range: 0.0 to 1.0, default: 0.33)
- `dry_level`: Dry level (range: 0.0 to 1.0, default: 0.4)
- `width`: Stereo width (range: 0.0 to 1.0, default: 1.0)
### Delay
Echo / delay line.
**Parameters:**
- `delay_seconds`: Delay time in seconds (range: 0.01 to 2.0, default: 0.3)
- `feedback`: Feedback amount (range: 0.0 to 0.95, default: 0.3)
- `mix`: Wet/dry mix (range: 0.0 to 1.0, default: 0.3)
### Compressor
Dynamic range compression for consistent loudness.
**Parameters:**
- `threshold_db`: Threshold in dB (range: -60 to 0, default: -20.0)
- `ratio`: Compression ratio (range: 1.0 to 20.0, default: 4.0)
- `attack_ms`: Attack time in ms (range: 0.1 to 100, default: 10.0)
- `release_ms`: Release time in ms (range: 10 to 1000, default: 100.0)
### Gain
Volume adjustment in decibels.
**Parameters:**
- `gain_db`: Gain in dB (range: -40 to 40, default: 0.0)
### High-Pass Filter
Removes frequencies below the cutoff.
**Parameters:**
- `cutoff_frequency_hz`: Cutoff frequency in Hz (range: 20 to 8000, default: 80.0)
### Low-Pass Filter
Removes frequencies above the cutoff.
**Parameters:**
- `cutoff_frequency_hz`: Cutoff frequency in Hz (range: 200 to 20000, default: 8000.0)
### Pitch Shift
Shift pitch up or down by semitones.
**Parameters:**
- `semitones`: Semitones to shift (range: -12 to 12, default: 0.0)
## Generation Versions
Each generation starts with a clean version (no effects). Users can create processed versions by applying effect chains.
**Version properties:**
- `id` — Unique version identifier
- `label` — User-defined name (e.g., "robotic", "with reverb")
- `audio_path` — Path to the processed audio file
- `effects_chain` — JSON array of effect configurations
- `source_version_id` — Which version this was derived from
- `is_default` — Whether this is the default audio for the generation
**File storage:**
<Files>
<Folder name="data/generations" defaultOpen>
<File name="{generation_id}.wav" />
<File name="{generation_id}_{version_id}.wav" />
</Folder>
</Files>
**Default version behavior:**
- One version per generation is marked as default
- The generation's audio_path always points to the default version's audio
- Deleting the default version automatically promotes another version
## Effect Presets
Presets are saved effects chains that can be reused across generations.
**Built-in presets:**
- **Robotic**: Metallic robotic voice using chorus (flanger-style)
- **Radio**: Thin AM-radio voice with band-pass filtering and light compression
- **Echo Chamber**: Spacious reverb with trailing echo
- **Deep Voice**: Lower pitch with added warmth using pitch shift and compression
**User presets:**
- Created via the effects UI
- Stored in the database (SQLite)
- Used to quickly apply favorite effect combinations

Built-in presets cannot be modified or deleted.
## API Endpoints
### Effects Management
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/effects/available` | GET | List all effect types with parameter definitions |
| `/effects/presets` | GET | List all presets (built-in + user) |
| `/effects/presets` | POST | Create a new user preset |
| `/effects/presets/:id` | GET | Get a specific preset |
| `/effects/presets/:id` | PUT | Update a user preset |
| `/effects/presets/:id` | DELETE | Delete a user preset |
| `/effects/preview/:generation_id` | POST | Preview effects on a generation (returns audio stream) |
### Generation Versions
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/generations/:id/versions` | GET | List all versions for a generation |
| `/generations/:id/versions/apply-effects` | POST | Apply effects chain, create new version |
| `/generations/:id/versions/:version_id/set-default` | PUT | Set a version as default |
| `/generations/:id/versions/:version_id` | DELETE | Delete a version |
### Request Body: Apply Effects
Request body for applying effects:
- `effects_chain`: Array of effect objects
- `label`: Version label (e.g., "with reverb")
- `set_as_default`: Whether to set as default
- `source_version_id`: Source version ID (optional)
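Concretely, a request body following those fields might look like this (the effect `type`/`params` shape and parameter names are assumptions drawn from the Effect Types section above):

```python
import json

# Hypothetical apply-effects payload; field names follow the list above.
payload = {
    "effects_chain": [
        {"type": "reverb", "params": {"room_size": 0.7, "wet_level": 0.4}},
        {"type": "gain", "params": {"gain_db": -3.0}},
    ],
    "label": "with reverb",
    "set_as_default": False,
    "source_version_id": None,
}

print(json.dumps(payload, indent=2))
```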
## Implementation
### Backend Architecture
**Files:**
| File | Purpose |
|------|---------|
| `backend/utils/effects.py` | Effect registry, validation, and audio processing |
| `backend/services/versions.py` | Generation version CRUD operations |
| `backend/services/effects.py` | Effect preset CRUD operations |
| `backend/routes/effects.py` | API endpoints for effects and versions |
**Effect Registry:**
The `EFFECT_REGISTRY` dict in `backend/utils/effects.py` defines all available effects with their parameters, defaults, and ranges.
**Validation:**
Effects chains are validated before application:
- Each effect type must exist in the registry
- Parameters must be numbers within min/max bounds
- Unknown parameters are rejected
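A minimal sketch of that validation, assuming a registry shape of `{effect_type: {"params": {name: {"min", "max"}}}}` (the real `EFFECT_REGISTRY` carries more metadata):

```python
# Toy registry — the real one lives in backend/utils/effects.py.
REGISTRY = {
    "gain": {"params": {"gain_db": {"min": -40, "max": 40}}},
}

def validate_chain(chain):
    """Return a list of validation errors (empty list means valid)."""
    errors = []
    for effect in chain:
        spec = REGISTRY.get(effect.get("type"))
        if spec is None:
            # Effect type must exist in the registry
            errors.append(f"unknown effect: {effect.get('type')}")
            continue
        for name, value in effect.get("params", {}).items():
            param = spec["params"].get(name)
            if param is None:
                # Unknown parameters are rejected
                errors.append(f"unknown parameter: {name}")
            elif not isinstance(value, (int, float)) or not (param["min"] <= value <= param["max"]):
                # Parameters must be numbers within min/max bounds
                errors.append(f"{name} out of range")
    return errors
```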
**Audio Processing:**
Uses Spotify's Pedalboard library:
```python
from pedalboard import Pedalboard
# Build pedalboard from chain
board = build_pedalboard(effects_chain)
# Apply to audio (async via thread)
processed = await asyncio.to_thread(lambda: board(audio, sample_rate))
```
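`build_pedalboard` itself isn't shown in this doc; structurally it maps each chain entry to a plugin class and instantiates it with the entry's parameters. A dependency-free sketch of that mapping (the registry and constructor arguments here are stand-ins, not the real implementation):

```python
def build_board(effects_chain, effect_classes, board_cls):
    """Instantiate one plugin per chain entry and wrap them in a board.
    In Voicebox, effect_classes would map type names to Pedalboard classes
    (e.g. Reverb, Gain) and board_cls would be pedalboard.Pedalboard."""
    plugins = [
        effect_classes[effect["type"]](**effect.get("params", {}))
        for effect in effects_chain
    ]
    return board_cls(plugins)
```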
### Frontend Integration
**Key components:**
| Component | Location |
|-----------|----------|
| Effects chain editor | app/src/components/Effects/ |
| Version selector | Generation detail view |
| Preset manager | Effects panel |
| Live preview | Preview button (streams processed audio) |
**State management:**
- Effects chains are stored as JSON arrays
- Live preview fetches processed audio without saving
- Applied effects create new versions via POST endpoint
## Adding New Effects
To add a new effect type:
1. **Add to registry** (`backend/utils/effects.py`):
   - Add an entry to `EFFECT_REGISTRY` with `cls`, `label`, `description`, and `params`
   - Import the effect class from Pedalboard
2. **Update frontend types** if needed

The new effect automatically appears in `/effects/available` and the chain editor UI.
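As a sketch, a registry entry for a hypothetical bitcrush effect might look like this (the key names follow the description above; the exact metadata shape and parameter bounds are assumptions, not taken from the Voicebox source):

```python
# Hypothetical EFFECT_REGISTRY entry — "bitcrush" is illustrative.
new_entry = {
    "bitcrush": {
        "cls": "pedalboard.Bitcrush",  # the real registry stores the imported class
        "label": "Bitcrush",
        "description": "Lo-fi bit-depth reduction.",
        "params": {
            "bit_depth": {"default": 8, "min": 1, "max": 16},
        },
    }
}
```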
## Best Practices
**Effect ordering matters.** Process effects in this order for best results:
1. Pitch shift (if needed)
2. High/low-pass filters
3. Chorus/flanger (time-based)
4. Reverb/delay (spatial)
5. Compressor
6. Gain (final level adjustment)
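Expressed as a chain following that order (the effect type keys and parameter values are illustrative assumptions based on this page):

```python
# One chain per the recommended ordering above.
ordered_chain = [
    {"type": "pitch_shift", "params": {"semitones": -2}},
    {"type": "highpass", "params": {"cutoff_frequency_hz": 80}},
    {"type": "chorus", "params": {"rate_hz": 1.0, "mix": 0.3}},
    {"type": "reverb", "params": {"room_size": 0.6}},
    {"type": "compressor", "params": {"threshold_db": -20, "ratio": 4.0}},
    {"type": "gain", "params": {"gain_db": -1.0}},
]
```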
**CPU usage:**
- Effects run on the CPU when a chain is previewed or applied
- Pitch shift and reverb are the most CPU-intensive
- Consider previewing complex chains before applying
**Storage:**
- Each version creates a new audio file
- Clean version always exists (can be reverted to)
- Processed versions can be deleted to save space
---
title: "Generation History"
description: "How generation history tracking works in Voicebox"
---
## Overview
The history module tracks all generated audio, providing a searchable record of past generations. Each generation stores the text, settings, and a reference to the audio file.
## Data Model
### Generation Table
```python
class Generation(Base):
__tablename__ = "generations"
id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))
profile_id = Column(String, ForeignKey("profiles.id"), nullable=False)
text = Column(Text, nullable=False)
language = Column(String, default="en")
audio_path = Column(String, nullable=True)
duration = Column(Float, nullable=True)
seed = Column(Integer)
instruct = Column(Text)
engine = Column(String, default="qwen")
model_size = Column(String, nullable=True)
status = Column(String, default="completed") # pending | completed | failed
error = Column(Text, nullable=True)
is_favorited = Column(Boolean, default=False)
created_at = Column(DateTime, default=datetime.utcnow)
```
Each generation can also have multiple **generation versions** — processed variants with different effects chains applied. The original (`clean`) version plus any number of processed versions live in a separate `generation_versions` table. See [Effects Pipeline](/developer/effects-pipeline).
## File Storage
Generated audio is stored in:
<Files>
<Folder name="data" defaultOpen>
<Folder name="generations">
<File name="{generation_id}.wav" />
</Folder>
</Folder>
</Files>
## Core Functions
### Creating a Generation Record
After TTS generates audio, a history entry is created:
```python
async def create_generation(
profile_id: str,
text: str,
language: str,
audio_path: str,
duration: float,
seed: Optional[int],
db: Session,
instruct: Optional[str] = None,
) -> GenerationResponse:
db_generation = DBGeneration(
id=str(uuid.uuid4()),
profile_id=profile_id,
text=text,
language=language,
audio_path=audio_path,
duration=duration,
seed=seed,
instruct=instruct,
created_at=datetime.utcnow(),
)
db.add(db_generation)
db.commit()
return GenerationResponse.model_validate(db_generation)
```
### Listing Generations
Supports filtering and pagination:
```python
async def list_generations(
query: HistoryQuery,
db: Session,
) -> HistoryListResponse:
# Build query with profile name join
q = db.query(
DBGeneration,
DBVoiceProfile.name.label('profile_name')
).join(
DBVoiceProfile,
DBGeneration.profile_id == DBVoiceProfile.id
)
# Apply filters
if query.profile_id:
q = q.filter(DBGeneration.profile_id == query.profile_id)
if query.search:
q = q.filter(DBGeneration.text.like(f"%{query.search}%"))
    # Order and paginate, then execute the query
    total = q.count()
    q = q.order_by(DBGeneration.created_at.desc())
    results = q.offset(query.offset).limit(query.limit).all()
    return HistoryListResponse(items=results, total=total)
```
### Getting Statistics
Aggregate statistics for the dashboard:
```python
async def get_generation_stats(db: Session) -> dict:
total = db.query(func.count(DBGeneration.id)).scalar()
total_duration = db.query(func.sum(DBGeneration.duration)).scalar()
by_profile = db.query(
DBGeneration.profile_id,
func.count(DBGeneration.id).label('count')
).group_by(DBGeneration.profile_id).all()
return {
"total_generations": total,
"total_duration_seconds": total_duration,
"generations_by_profile": {
profile_id: count for profile_id, count in by_profile
},
}
```
## Deletion
Deleting a generation removes both the database record and audio file:
```python
async def delete_generation(generation_id: str, db: Session) -> bool:
generation = db.query(DBGeneration).filter_by(id=generation_id).first()
if not generation:
return False
    # Delete audio file (audio_path may be null for failed generations)
    if generation.audio_path:
        audio_path = Path(generation.audio_path)
        if audio_path.exists():
            audio_path.unlink()
# Delete database record
db.delete(generation)
db.commit()
return True
```
### Cascade Delete
When deleting a profile, all its generations are also deleted:
```python
async def delete_generations_by_profile(profile_id: str, db: Session) -> int:
generations = db.query(DBGeneration).filter_by(profile_id=profile_id).all()
for generation in generations:
Path(generation.audio_path).unlink(missing_ok=True)
db.delete(generation)
db.commit()
return len(generations)
```
## Export/Import
### Exporting a Generation
Generations can be exported as ZIP archives:
<Files>
<Folder name="generation_export.zip" defaultOpen>
<File name="generation.json" />
<File name="audio.wav" />
</Folder>
</Files>
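The archive layout above can be sketched as follows (the helper name and metadata shape are assumptions; the real export lives in the backend services):

```python
import io
import json
import zipfile

def export_generation_zip(metadata: dict, audio_bytes: bytes) -> bytes:
    """Bundle generation metadata and audio into the ZIP layout shown above."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("generation.json", json.dumps(metadata))
        zf.writestr("audio.wav", audio_bytes)
    return buf.getvalue()
```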
### Importing a Generation
The import process:
1. Extract ZIP archive
2. Validate metadata and audio
3. Create new generation ID
4. Copy audio to generations directory
5. Create database record
## API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/history` | List generations with filters |
| GET | `/history/stats` | Get aggregate statistics |
| GET | `/history/{id}` | Get generation by ID |
| DELETE | `/history/{id}` | Delete generation |
| GET | `/history/{id}/export` | Export as ZIP |
| GET | `/history/{id}/export-audio` | Export audio only |
| POST | `/history/import` | Import from ZIP |
### Query Parameters
```
GET /history?profile_id=uuid&search=hello&limit=50&offset=0
```
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `profile_id` | string | null | Filter by profile |
| `search` | string | null | Search in text |
| `limit` | int | 50 | Results per page |
| `offset` | int | 0 | Pagination offset |
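Building that query string programmatically (base URL per the backend's default port; purely illustrative):

```python
from urllib.parse import urlencode

# Assemble the /history query from the parameters in the table above.
params = {"profile_id": "uuid", "search": "hello", "limit": 50, "offset": 0}
url = "http://localhost:17493/history?" + urlencode(params)
```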
### Response Schema
```json
{
"items": [
{
"id": "uuid",
"profile_id": "uuid",
"profile_name": "My Voice",
"text": "Hello world",
"language": "en",
"audio_path": "/path/to/audio.wav",
"duration": 1.5,
"seed": 42,
"instruct": null,
"engine": "qwen",
"model_size": "1.7B",
"status": "completed",
"error": null,
"is_favorited": false,
"created_at": "2026-04-18T10:30:00Z"
}
],
"total": 150
}
```
## Usage in Stories
Generations can be added to stories for multi-voice narratives. The story system references generations by ID:
```python
class StoryItem(Base):
    # Abridged — other columns omitted here
    generation_id = Column(String, ForeignKey("generations.id"))
```
This allows the same generation to be reused across multiple stories without duplicating audio files.
## Storage Considerations
### Disk Usage
Each generation creates a WAV file. For a 10-second clip at 24kHz:
- ~480KB per file (mono, 16-bit)
### Cleanup Strategy
Consider implementing:
- Automatic cleanup of old generations
- Storage quota per profile
- Compression for archival
{
"title": "Developer",
"defaultOpen": true,
"pages": [
"setup",
"architecture",
"contributing",
"building",
"autoupdater",
"voice-profiles",
"tts-generation",
"tts-engines",
"effects-pipeline",
"history",
"stories",
"transcription",
"audio-channels",
"model-management"
]
}
---
title: "Model Management"
description: "How model downloading, loading, and status tracking works across all engines"
---
## Overview
Voicebox manages two categories of models:
**TTS Models** — Seven engines covering zero-shot cloning and preset voices. Each engine may have one or more size variants.
**ASR Models** — Whisper for transcription. Five sizes, plus MLX-Whisper on Apple Silicon for ~8× faster transcription.
Every model is described by a `ModelConfig` entry in `backend/backends/__init__.py`. Models are downloaded from HuggingFace Hub on first use and cached in the platform-standard HF cache.
## Available TTS Models
| Model | Engine | HuggingFace Repo | Size | VRAM | Languages |
|-------|--------|------------------|------|------|-----------|
| **Qwen TTS 1.7B** | `qwen` | `Qwen/Qwen3-TTS-12Hz-1.7B-Base` | 3.5 GB | ~6 GB | 10 |
| **Qwen TTS 0.6B** | `qwen` | `Qwen/Qwen3-TTS-12Hz-0.6B-Base` | 1.2 GB | ~2 GB | 10 |
| **Qwen CustomVoice 1.7B** | `qwen_custom_voice` | `Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice` | 3.5 GB | ~6 GB | 10 |
| **Qwen CustomVoice 0.6B** | `qwen_custom_voice` | `Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice` | 1.2 GB | ~2 GB | 10 |
| **LuxTTS** | `luxtts` | `YatharthS/LuxTTS` | 300 MB | ~1 GB | English |
| **Chatterbox Multilingual** | `chatterbox` | `ResembleAI/chatterbox` | 3.2 GB | ~3 GB | 23 |
| **Chatterbox Turbo** | `chatterbox_turbo` | `ResembleAI/chatterbox-turbo` | 1.5 GB | ~1.5 GB | English |
| **TADA 1B** | `tada` | `HumeAI/tada-1b` | 4 GB | ~4 GB | English |
| **TADA 3B Multilingual** | `tada` | `HumeAI/tada-3b-ml` | 8 GB | ~8 GB | 10 |
| **Kokoro 82M** | `kokoro` | `hexgrad/Kokoro-82M` | 350 MB | ~150 MB | 8 |
On Apple Silicon, Qwen TTS uses MLX-optimized repos from `mlx-community` instead of the PyTorch repos. The backend picks automatically via `get_backend_type()`.
## Available Whisper Models
| Model | HuggingFace Repo | Size |
|-------|------------------|------|
| **Whisper Base** | `openai/whisper-base` | ~300 MB |
| **Whisper Small** | `openai/whisper-small` | ~500 MB |
| **Whisper Medium** | `openai/whisper-medium` | ~1.5 GB |
| **Whisper Large** | `openai/whisper-large-v3` | ~3 GB |
| **Whisper Turbo** | `openai/whisper-large-v3-turbo` | ~1.5 GB |
On Apple Silicon, MLX-Whisper is preferred automatically — see [Transcription](/developer/transcription).
## Model Storage
Models live in the platform HuggingFace cache:
| Platform | Path |
|----------|------|
| macOS | `~/.cache/huggingface/hub/` |
| Linux | `~/.cache/huggingface/hub/` |
| Windows | `%USERPROFILE%\.cache\huggingface\hub\` |
| Docker | `/home/voicebox/.cache/huggingface/hub` (volume-mounted) |
Set `VOICEBOX_MODELS_DIR` to override.
## Progress Tracking
Downloads stream progress to the frontend via Server-Sent Events. The progress pipeline has three pieces:
**`ProgressManager`** (`backend/utils/progress.py`) — in-memory map of `model_name → {current, total, filename, status}`.
**`HFProgressTracker`** — context manager that intercepts HuggingFace Hub downloads to emit byte-level progress. Needed because `huggingface_hub` silently disables tqdm in frozen PyInstaller builds.
**SSE endpoint** — `GET /models/progress/{model_name}` streams updates until `status` is `complete` or `error`.
```javascript
// Frontend
const eventSource = new EventSource(`/models/progress/${modelName}`);
eventSource.onmessage = (event) => {
  const { current, total, status } = JSON.parse(event.data);
  updateProgressBar(current / total);
  if (status === "complete") eventSource.close();
};
```
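On the backend side, the SSE wire format is just `data: <json>` frames separated by blank lines. A sketch of the event formatting the endpoint streams (the real handler reads live snapshots from `ProgressManager`; this generator takes them as a plain iterable):

```python
import json

def sse_events(snapshots):
    """Yield SSE frames for progress snapshots, stopping after a terminal one."""
    for snap in snapshots:
        yield f"data: {json.dumps(snap)}\n\n"
        if snap.get("status") in ("complete", "error"):
            break
```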
## Model Status
`GET /models/status` returns every registered model's current state:
```json
{
"models": [
{
"model_name": "qwen-tts-1.7B",
"display_name": "Qwen TTS 1.7B",
"engine": "qwen",
"downloaded": true,
"size_mb": 3500,
"loaded": true
},
...
]
}
```
The handler iterates `get_all_model_configs()` and calls `check_model_loaded(config)` for each entry, so new engines appear automatically once they're registered in `ModelConfig`.
## Manual Model Operations
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/models/status` | Status of every registered model |
| POST | `/models/load` | Load a TTS model into memory |
| POST | `/models/unload` | Unload a TTS model from memory |
| POST | `/models/download` | Trigger a background download |
| GET | `/models/progress/{name}` | Stream download progress (SSE) |
| DELETE | `/models/{name}` | Delete a downloaded model from cache |
### Load
```http
POST /models/load
{
"model_name": "qwen-tts-1.7B"
}
```
The route looks up the config, dispatches to `get_model_load_func(config)`, and returns once the model is ready.
### Unload
```http
POST /models/unload
{
"model_name": "chatterbox-tts"
}
```
Calls `unload_model_by_config(config)`, which routes to the right backend's `unload_model()` and frees GPU memory.
### Download
```http
POST /models/download
{
"model_name": "kokoro"
}
```
Fires off an async download task. Progress is available via the SSE endpoint. Download is triggered automatically on first generation, so this is only needed for pre-warming.
## Preset Voice Seeding
For engines that use preset voices (Kokoro, Qwen CustomVoice), the backend auto-creates a voice profile per preset voice after the model is downloaded. This is driven by `seed_preset_profiles(engine)` in `backend/services/profiles.py`, called from the models route once download completes.
Preset profiles have:
- `voice_type = "preset"`
- `preset_engine` = engine name (`"kokoro"`, `"qwen_custom_voice"`)
- `preset_voice_id` = engine-specific voice ID (`"am_adam"`, `"f000001"`, etc.)
- No `profile_samples` rows — no audio to store
See [Voice Profiles](/developer/voice-profiles) for the schema.
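A sketch of what seeding produces (the voice IDs and dict shape are assumptions; the real `seed_preset_profiles()` in `backend/services/profiles.py` writes rows to the database):

```python
# Illustrative voice list — the real IDs ship with each engine.
KOKORO_VOICES = ["am_adam", "af_bella"]

def seed_preset_profiles(engine: str, voices):
    """Build one preset profile dict per voice, mirroring the fields above."""
    return [
        {
            "name": voice_id,
            "voice_type": "preset",
            "preset_engine": engine,
            "preset_voice_id": voice_id,
        }
        for voice_id in voices
    ]
```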
## Adding a New Model
To add a new size variant of an existing engine, just add another `ModelConfig`:
```python
ModelConfig(
model_name="qwen-tts-3B",
display_name="Qwen TTS 3B",
engine="qwen",
hf_repo_id="Qwen/Qwen3-TTS-12Hz-3B-Base",
model_size="3B",
size_mb=7000,
languages=["zh", "en", ...],
),
```
The frontend picks it up via `/models/status`; download/load flow works without further changes.
Adding a whole new engine is a bigger lift — see [TTS Engines](/developer/tts-engines) for the full phased workflow.
## Error Handling
| Error | Cause | Fix |
|-------|-------|-----|
| Download failed | Network / HF rate limit | Retry |
| OOM on load | Not enough VRAM | Use a smaller variant, unload other engines |
| Model not found | Corrupt cache | Re-download via `/models/download` |
| Stuck progress bar in frozen build | `huggingface_hub` tqdm silenced | `HFProgressTracker` force-enables the internal counter |
| GPU architecture unsupported | PyTorch wheel doesn't target your GPU | See [GPU Acceleration](/overview/gpu-acceleration) |
## Next Steps
<Cards>
<Card title="TTS Generation" href="/developer/tts-generation">
How generation flows through the registry
</Card>
<Card title="TTS Engines" href="/developer/tts-engines">
Add a new engine end-to-end
</Card>
<Card title="Transcription" href="/developer/transcription">
Whisper and MLX-Whisper integration
</Card>
</Cards>
---
title: "Development Setup"
description: "Set up your local development environment for Voicebox"
---
## Quick Setup (Recommended)
Get started in two commands:
```bash
# Clone and enter the repository
git clone https://github.com/jamiepine/voicebox.git
cd voicebox
# Setup everything (Python venv, JS deps, dev sidecar)
just setup
# Start development (backend + desktop app)
just dev
```
The `just dev` command automatically starts the Python backend (if not already running) and launches the Tauri desktop app.
## Prerequisites
Ensure you have these installed:
<Cards>
<Card title="Bun" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="m7.5 4.27 9 5.15"/><path d="M21 8a2 2 0 0 0-1-1.73l-7-4a2 2 0 0 0-2 0l-7 4A2 2 0 0 0 3 8v8a2 2 0 0 0 1 1.73l7 4a2 2 0 0 0 2 0l7-4A2 2 0 0 0 21 16Z"/><path d="m3.3 7 8.7 5 8.7-5"/><path d="M12 22V12"/></svg>}>
[Download Bun](https://bun.sh)
```bash
curl -fsSL https://bun.sh/install | bash
```
</Card>
<Card title="Python 3.11+" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M12 2L2 7l10 5 10-5-10-5z"/><path d="M2 17l10 5 10-5"/><path d="M2 12l10 5 10-5"/></svg>}>
[Download Python](https://python.org)
```bash
python --version
```
</Card>
<Card title="Rust" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="m6 9 6 6 6-6"/></svg>}>
[Install Rust](https://rustup.rs)
```bash
rustc --version
```
</Card>
<Card title="Just" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M4 7V4h3"/><path d="M7 4h14v6h-2V6H7V4Z"/><path d="M4 10v10h16V10H4Z"/></svg>}>
[Install Just](https://github.com/casey/just)
```bash
brew install just # macOS
cargo install just # Linux/Windows
```
</Card>
</Cards>
<Callout type="info">
`just` works on macOS, Linux, and Windows.
</Callout>
## Just Commands
Run `just --list` to see all available commands. Highlights:
### Setup
| Command | Description |
|---------|-------------|
| `just setup` | Full setup (Python venv + JS deps + dev sidecar). Detects Apple Silicon for MLX and NVIDIA/Intel Arc on Windows for accelerated PyTorch. |
| `just setup-python` | Python venv + dependencies only |
| `just setup-js` | `bun install` only |
### Development
| Command | Description |
|---------|-------------|
| `just dev` | Start backend + Tauri desktop app (reuses a running backend if one exists) |
| `just dev-web` | Start backend + web app (no Tauri/Rust build) |
| `just dev-backend` | Backend only |
| `just dev-frontend` | Tauri app only (backend must already be running) |
| `just kill` | Stop all dev processes |
### Build
| Command | Description |
|---------|-------------|
| `just build` | CPU server binary + Tauri installer |
| `just build-local` | **Windows:** CPU + CUDA server binaries + Tauri installer |
| `just build-server` | CPU server binary only |
| `just build-server-cuda` | **Windows:** CUDA server binary only, placed in `%APPDATA%/sh.voicebox.app/backends/cuda` for local testing |
| `just build-tauri` | Tauri app only |
| `just build-web` | Web app only |
### Quality
| Command | Description |
|---------|-------------|
| `just check` | Lint + format + typecheck (Biome + ruff) |
| `just fix` | Auto-fix lint + format issues |
| `just lint` / `just format` | Lint or format only |
| `just test` | Run Python tests (pytest) |
| `just test-models` | End-to-end generation against every TTS engine using the frozen binary |
### Database
| Command | Description |
|---------|-------------|
| `just db-init` | Initialize SQLite database |
| `just db-reset` | Delete and reinitialize the database |
### Utilities
| Command | Description |
|---------|-------------|
| `just generate-api` | Generate TypeScript API client from the backend's OpenAPI schema |
| `just docs` | Open `http://localhost:17493/docs` in your browser |
| `just logs` | Tail backend logs |
| `just clean` | Remove build artifacts |
| `just clean-python` | Remove the Python venv + `__pycache__` |
| `just clean-all` | Nuclear clean (includes all `node_modules`) |
## Project Structure
<Files>
  <Folder name="voicebox" defaultOpen>
    <Folder name="app">
      <Folder name="src">
        <File name="components/" />
        <File name="lib/" />
        <File name="hooks/" />
      </Folder>
    </Folder>
    <Folder name="backend">
      <File name="app.py" />
      <File name="main.py" />
      <File name="config.py" />
      <File name="models.py" />
      <File name="server.py" />
      <Folder name="routes">
        <File name="..." />
      </Folder>
      <Folder name="services">
        <File name="..." />
      </Folder>
      <Folder name="backends">
        <File name="..." />
      </Folder>
      <Folder name="database">
        <File name="..." />
      </Folder>
      <Folder name="utils">
        <File name="..." />
      </Folder>
    </Folder>
    <Folder name="tauri">
      <Folder name="src-tauri" />
    </Folder>
    <Folder name="web" />
    <Folder name="scripts" />
  </Folder>
</Files>
### Request Flow
HTTP request → **routes/** (validate input) → **services/** (business logic) → **backends/** (TTS/STT inference) → **utils/** (audio processing)
### Key Modules
- **app.py** — FastAPI app factory, CORS, lifecycle events
- **main.py** — Entry point (imports app, runs uvicorn)
- **server.py** — Tauri sidecar launcher, parent-pid watchdog
- **services/generation.py** — Single function handling all generation modes
- **backends/** — TTS/STT engine implementations (MLX, PyTorch, etc.)
## Model Downloads
Models are automatically downloaded from HuggingFace Hub on first use, with live progress streamed to the UI:
- **Whisper** (transcription) — auto-downloads on first transcription
- **TTS engines** — auto-download on first generation. Sizes range from 82M parameters (Kokoro, ~350 MB) to 3B (TADA, ~8 GB)
See [Model Management](/developer/model-management) for the full list.
<Callout type="warn">
First-time usage will be slower due to model downloads, but subsequent runs will use cached models.
</Callout>
## Generate OpenAPI Client
After starting the backend server, generate the TypeScript API client:
```bash
just generate-api
```
This downloads the OpenAPI schema and generates the TypeScript client in `app/src/lib/api/`.
## Manual Setup (Advanced)
If you prefer not to use Just, follow these manual steps:
### 1. Install JavaScript Dependencies
```bash
bun install
```
This installs dependencies for:
- `app/` - Shared React frontend
- `tauri/` - Tauri desktop wrapper
- `web/` - Web deployment wrapper
### 2. Set Up Python Backend
```bash
cd backend
# Create virtual environment
python -m venv venv
# Activate virtual environment
source venv/bin/activate # macOS/Linux
# or
venv\Scripts\activate # Windows
# Install Python dependencies
pip install -r requirements.txt
# Apple Silicon: install MLX dependencies
pip install -r requirements-mlx.txt
# Chatterbox pins numpy<1.26 / torch==2.6 which break on Python 3.12+
pip install --no-deps chatterbox-tts
# HumeAI TADA pins torch>=2.7,<2.8 which conflicts with our torch>=2.10
pip install --no-deps hume-tada
# Install Qwen3-TTS from source
pip install git+https://github.com/QwenLM/Qwen3-TTS.git
# PyInstaller and linting tools
pip install pyinstaller ruff pytest pytest-asyncio
```
### 3. Start Development
Start the backend:
```bash
cd backend
source venv/bin/activate
uvicorn main:app --reload --port 17493
```
In a new terminal, start the desktop app:
```bash
cd tauri
bun run tauri dev
```
## Next Steps
<Cards>
<Card title="Architecture" href="/developer/architecture">
Understand the system architecture
</Card>
<Card title="Contributing" href="/developer/contributing">
Read the contribution guidelines
</Card>
<Card title="Building" href="/developer/building">
Learn how to build production releases
</Card>
<Card title="TTS Engines" href="/developer/tts-engines">
Add a new TTS engine end-to-end
</Card>
</Cards>
## Troubleshooting
<AccordionGroup>
<Accordion title="Backend won't start">
- Check Python version (must be 3.11+)
- Ensure virtual environment is activated: `source backend/venv/bin/activate`
- Verify all dependencies are installed: `pip install -r requirements.txt`
- Check if port 17493 is available
</Accordion>
<Accordion title="Tauri build fails">
- Ensure Rust is installed: `rustc --version`
- Clean the build: `cd tauri/src-tauri && cargo clean`
- Try rebuilding: `just dev`
</Accordion>
<Accordion title="OpenAPI client generation fails">
- Ensure backend is running: `curl http://localhost:17493/openapi.json`
- Check network connectivity
- Verify the backend is accessible at localhost:17493
</Accordion>
</AccordionGroup>
See the full [Troubleshooting Guide](/overview/troubleshooting) for more issues and solutions.

---
title: "Stories & Timeline"
description: "How the multi-voice timeline editor works in Voicebox"
---
## Overview
Stories allow users to arrange multiple voice generations on a timeline to create multi-voice narratives. The system supports tracks, trimming, splitting, and audio mixing.
## Architecture
**Story:** A container that holds story items with metadata.
**Story Item:** Links a generation to a story with timeline position, track, and trim data.
**Export:** Combines all items into a single mixed audio file.
## Data Model
### Story Table
```python
class Story(Base):
    __tablename__ = "stories"

    id = Column(String, primary_key=True)
    name = Column(String, nullable=False)
    description = Column(Text)
    created_at = Column(DateTime)
    updated_at = Column(DateTime)
```
### StoryItem Table
```python
class StoryItem(Base):
    __tablename__ = "story_items"

    id = Column(String, primary_key=True)
    story_id = Column(String, ForeignKey("stories.id"))
    generation_id = Column(String, ForeignKey("generations.id"))
    start_time_ms = Column(Integer, default=0)  # Timeline position
    track = Column(Integer, default=0)          # Track number
    trim_start_ms = Column(Integer, default=0)  # Trim from start
    trim_end_ms = Column(Integer, default=0)    # Trim from end
    created_at = Column(DateTime)
```
## Timeline Concepts
### Start Time
`start_time_ms` is the absolute position on the timeline where an item begins playing. Items on the same track cannot overlap; items on different tracks can.
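The no-overlap rule for a single track reduces to a half-open interval intersection test. A minimal sketch (names are illustrative, not from the Voicebox codebase):

```python
def same_track_overlap(a_start_ms: int, a_len_ms: int, b_start_ms: int, b_len_ms: int) -> bool:
    """True if two items' [start, start + length) windows intersect."""
    return a_start_ms < b_start_ms + b_len_ms and b_start_ms < a_start_ms + a_len_ms
```

Items that merely touch end-to-start (one ends exactly where the next begins) do not count as overlapping.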
### Tracks
A `track` is an integer (0-indexed) that identifies the horizontal row an item sits on. Audio on separate tracks plays concurrently, so tracks are the primary way to layer multiple voices or sound effects.
### Trimming
`trim_start_ms` and `trim_end_ms` hide the leading/trailing portions of the source generation without modifying the underlying audio file. The effective playback length is `generation.duration * 1000 - trim_start_ms - trim_end_ms`. Trimming is non-destructive — the same generation can be trimmed differently in different stories.
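The trim arithmetic above as a tiny helper (illustrative, not the actual service code):

```python
def effective_length_ms(duration_s: float, trim_start_ms: int, trim_end_ms: int) -> int:
    """Playback length of a trimmed story item: duration minus both trims."""
    return int(duration_s * 1000) - trim_start_ms - trim_end_ms

# A 2.5 s generation trimmed by 200 ms at the start and 100 ms at the end
# plays for 2200 ms.
```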
## Core Operations
### Adding Items
When adding a generation to a story:
```python
async def add_item_to_story(
    story_id: str,
    data: StoryItemCreate,
    db: Session,
) -> StoryItemDetail:
    # Calculate start time if not provided
    if data.start_time_ms is None:
        existing_items = get_items_with_durations(story_id, db)
        if existing_items:
            # Place after the end of all existing items
            max_end_time_ms = max(
                item.start_time_ms + int(gen.duration * 1000)
                for item, gen in existing_items
            )
            start_time_ms = max_end_time_ms + 200  # 200 ms gap
        else:
            start_time_ms = 0  # First item starts at the beginning
    else:
        start_time_ms = data.start_time_ms

    # Create the item
    item = DBStoryItem(
        id=str(uuid.uuid4()),
        story_id=story_id,
        generation_id=data.generation_id,
        start_time_ms=start_time_ms,
        track=data.track or 0,
    )
    db.add(item)
    db.commit()
```
### Moving Items
Update position and/or track:
```python
async def move_story_item(
    story_id: str,
    item_id: str,
    data: StoryItemMove,
    db: Session,
) -> StoryItemDetail:
    item = get_item(story_id, item_id, db)
    item.start_time_ms = data.start_time_ms
    item.track = data.track
    db.commit()
```
### Trimming Items
Non-destructive trimming:
```python
async def trim_story_item(
    story_id: str,
    item_id: str,
    data: StoryItemTrim,
    db: Session,
) -> StoryItemDetail:
    item = get_item(story_id, item_id, db)
    generation = get_generation(item.generation_id, db)

    # Validate that the trim leaves some audio to play
    max_duration_ms = int(generation.duration * 1000)
    if data.trim_start_ms + data.trim_end_ms >= max_duration_ms:
        raise ValueError("Trim exceeds the generation's duration")

    item.trim_start_ms = data.trim_start_ms
    item.trim_end_ms = data.trim_end_ms
    db.commit()
```
### Splitting Items
Split one item into two at a specific time:
```python
async def split_story_item(
    story_id: str,
    item_id: str,
    data: StoryItemSplit,
    db: Session,
) -> List[StoryItemDetail]:
    item = get_item(story_id, item_id, db)
    generation = get_generation(item.generation_id, db)

    # Calculate the split point in the original (untrimmed) audio
    current_trim_start = item.trim_start_ms
    current_trim_end = item.trim_end_ms
    original_duration_ms = int(generation.duration * 1000)
    absolute_split_ms = current_trim_start + data.split_time_ms

    # Update original: trim from end so it stops at the split point
    item.trim_end_ms = original_duration_ms - absolute_split_ms

    # Create new item: trim from start so it begins at the split point
    new_item = DBStoryItem(
        id=str(uuid.uuid4()),
        story_id=story_id,
        generation_id=item.generation_id,  # Same generation
        start_time_ms=item.start_time_ms + data.split_time_ms,
        track=item.track,
        trim_start_ms=absolute_split_ms,
        trim_end_ms=current_trim_end,
    )
    db.add(new_item)
    db.commit()
    return [item, new_item]
```
### Duplicating Items
Create a copy with all properties:
```python
async def duplicate_story_item(
    story_id: str,
    item_id: str,
    db: Session,
) -> StoryItemDetail:
    original = get_item(story_id, item_id, db)
    generation = get_generation(original.generation_id, db)

    # Calculate effective duration for positioning
    effective_duration_ms = (
        int(generation.duration * 1000)
        - original.trim_start_ms
        - original.trim_end_ms
    )

    # Place copy after original with a 200 ms gap
    new_item = DBStoryItem(
        id=str(uuid.uuid4()),
        story_id=story_id,
        generation_id=original.generation_id,
        start_time_ms=original.start_time_ms + effective_duration_ms + 200,
        track=original.track,
        trim_start_ms=original.trim_start_ms,
        trim_end_ms=original.trim_end_ms,
    )
    db.add(new_item)
    db.commit()
```
## Audio Export
### Mixing Algorithm
The export function mixes all items into a single audio file:
```python
async def export_story_audio(story_id: str, db: Session) -> bytes:
    items = get_all_items_with_generations(story_id, db)
    # Decode each item's audio; every dict holds 'audio', 'start_time_ms',
    # 'duration_ms', 'trim_start_ms', 'trim_end_ms' (helper name illustrative)
    audio_data = load_items_audio(items)

    # Calculate total duration
    max_end_time_ms = max(
        data['start_time_ms'] + data['duration_ms']
        for data in audio_data
    )

    # Create output buffer (sample_rate: the common rate items are resampled to)
    total_samples = int((max_end_time_ms / 1000.0) * sample_rate)
    final_audio = np.zeros(total_samples, dtype=np.float32)

    # Mix each item at its position
    for data in audio_data:
        audio = data['audio']
        start_sample = int((data['start_time_ms'] / 1000.0) * sample_rate)

        # Apply trim (milliseconds -> samples)
        trim_start_sample = int((data['trim_start_ms'] / 1000.0) * sample_rate)
        trim_end_sample = int((data['trim_end_ms'] / 1000.0) * sample_rate)
        trimmed_audio = audio[trim_start_sample:len(audio) - trim_end_sample]

        # Add to buffer (overlapping items sum together)
        final_audio[start_sample:start_sample + len(trimmed_audio)] += trimmed_audio

    # Normalize to prevent clipping
    max_val = np.abs(final_audio).max()
    if max_val > 1.0:
        final_audio = final_audio / max_val

    return audio_to_bytes(final_audio, sample_rate)
```
## API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/stories` | List all stories |
| POST | `/stories` | Create a story |
| GET | `/stories/{id}` | Get story with items |
| PUT | `/stories/{id}` | Update story metadata |
| DELETE | `/stories/{id}` | Delete story |
| POST | `/stories/{id}/items` | Add item to story |
| DELETE | `/stories/{id}/items/{item_id}` | Remove item |
| PUT | `/stories/{id}/items/{item_id}/move` | Move item |
| PUT | `/stories/{id}/items/{item_id}/trim` | Trim item |
| POST | `/stories/{id}/items/{item_id}/split` | Split item |
| POST | `/stories/{id}/items/{item_id}/duplicate` | Duplicate item |
| PUT | `/stories/{id}/items/times` | Batch update times |
| PUT | `/stories/{id}/items/reorder` | Reorder items |
| GET | `/stories/{id}/export-audio` | Export mixed audio |
## Response Schemas
### StoryItemDetail
```json
{
  "id": "item_uuid",
  "story_id": "story_uuid",
  "generation_id": "generation_uuid",
  "start_time_ms": 1500,
  "track": 0,
  "trim_start_ms": 200,
  "trim_end_ms": 100,
  "profile_id": "profile_uuid",
  "profile_name": "Narrator",
  "text": "Hello world",
  "audio_path": "/path/to/audio.wav",
  "duration": 2.5,
  "created_at": "2024-01-15T10:30:00Z"
}
```
## Frontend Integration
The timeline UI needs to:
1. **Fetch story** with all items
2. **Render waveforms** for each item
3. **Handle drag/drop** to move items
4. **Handle edge drag** for trimming
5. **Sync playhead** across all tracks
6. **Export** when user clicks download

---
title: "Transcription"
description: "How Whisper-based audio transcription works in Voicebox"
---
## Overview
Voicebox uses OpenAI's Whisper for automatic speech recognition (ASR). Transcription powers two flows:
1. **Reference-text auto-fill** — when a user records or uploads a voice sample, the backend transcribes it and populates the `reference_text` field so cloning can use it.
2. **On-demand transcription** — a user-facing `/transcribe` endpoint for arbitrary audio.
On Apple Silicon, the transcription path runs through **MLX-Whisper** (from `mlx-audio`) for ~8× faster inference than PyTorch. Everywhere else it runs through PyTorch's `transformers` Whisper.
## Architecture
Transcription is wired through the same backend abstraction as TTS. The `STTBackend` protocol lives in `backend/backends/__init__.py`:
```python
@runtime_checkable
class STTBackend(Protocol):
    async def load_model(self, model_size: str) -> None: ...
    async def transcribe(
        self,
        audio_path: str,
        language: Optional[str] = None,
        model_size: Optional[str] = None,
    ) -> str: ...
    def unload_model(self) -> None: ...
    def is_loaded(self) -> bool: ...
```
Two implementations ship today:
- **`MLXSTTBackend`** (`backends/mlx_backend.py`) — uses `mlx_audio.stt.load()`. Default on Apple Silicon.
- **`PyTorchSTTBackend`** (`backends/pytorch_backend.py`) — uses `transformers.WhisperForConditionalGeneration`. Default everywhere else.
`get_stt_backend()` picks the right one based on `get_backend_type()`. `backend/services/transcribe.py` is a thin wrapper that delegates to the backend.
## Model Sizes
Five Whisper variants are registered in `ModelConfig`:
| Model | HuggingFace Repo | Size | Notes |
|-------|------------------|------|-------|
| **Base** | `openai/whisper-base` | ~300 MB | Default; fast, decent quality |
| **Small** | `openai/whisper-small` | ~500 MB | Better quality, still fast |
| **Medium** | `openai/whisper-medium` | ~1.5 GB | High quality |
| **Large** | `openai/whisper-large-v3` | ~3 GB | Best quality, slow on CPU |
| **Turbo** | `openai/whisper-large-v3-turbo` | ~1.5 GB | Large-tier quality, ~5× faster than Large |
The `tiny` model is **not** exposed — the quality gap to `base` wasn't worth the download.
`Turbo` + MLX-Whisper on Apple Silicon dropped user-facing transcription latency from ~20s to ~2-3s in v0.1.10.
## Language Hints
Whisper can auto-detect language, but providing a hint improves accuracy on short clips:
```python
text = await backend.transcribe(audio_path, language="en")
```
Accepted language codes are the standard Whisper set (99+ languages). The frontend typically passes the profile's language if available, or lets Whisper detect otherwise.
## Model Loading
Both backends are lazy: the model is loaded on first use and cached in memory. Switching sizes unloads the previous model.
On MLX, the model is loaded via `mlx_audio.stt.load(hf_repo)`. On PyTorch, via:
```python
WhisperProcessor.from_pretrained(hf_repo)
WhisperForConditionalGeneration.from_pretrained(hf_repo).to(device)
```
Both load paths use `model_load_progress()` from `backends/base.py` so the frontend sees live download progress on the first use.
## Audio Preprocessing
Whisper expects mono 16 kHz audio. The audio utility in `backend/utils/audio.py` handles resampling and format conversion transparently:
- **Formats:** WAV, MP3, FLAC, OGG, M4A (via soundfile / librosa)
- **Target:** mono, 16 kHz, float32
Files longer than Whisper's 30-second window are handled by the underlying library's chunking logic — no explicit splitting in Voicebox code.
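The preprocessing contract is easy to express directly. A self-contained sketch (the real code in `backend/utils/audio.py` uses librosa/soundfile; the naive linear-interpolation resample here is for illustration only):

```python
import numpy as np

def to_whisper_input(audio: np.ndarray, sr: int, target_sr: int = 16000) -> np.ndarray:
    """Downmix to mono and resample to 16 kHz float32."""
    if audio.ndim == 2:  # (samples, channels) -> mono
        audio = audio.mean(axis=1)
    audio = audio.astype(np.float32)
    if sr != target_sr:
        # Naive linear-interpolation resample, for illustration only
        duration = len(audio) / sr
        n_out = int(duration * target_sr)
        x_old = np.linspace(0.0, duration, num=len(audio), endpoint=False)
        x_new = np.linspace(0.0, duration, num=n_out, endpoint=False)
        audio = np.interp(x_new, x_old, audio).astype(np.float32)
    return audio
```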
## API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/transcribe` | Transcribe an uploaded audio file |
### Request
Multipart form data:
```
POST /transcribe
Content-Type: multipart/form-data
file: <audio_file>
language: en # optional
model_size: base # optional (default: "base")
```
### Response
```json
{
  "text": "Hello, this is a test transcription.",
  "duration": 3.5
}
```
## Use Cases
### Reference Text for Voice Cloning
Adding a voice sample triggers transcription automatically:
1. User uploads or records audio.
2. The backend writes the audio file and calls `/transcribe` internally (or the frontend calls it separately).
3. The returned text becomes `reference_text` on the new `profile_samples` row.
4. Cloning engines that need reference text (Chatterbox, TADA, etc.) read it from there.
### Quality Tips
- Provide a language hint for short clips (under 5 seconds) — auto-detection is unreliable with so little audio.
- Use Turbo or Large for noisy audio — Base can hallucinate on hard inputs.
- Prefer clean audio; transcription errors become reference-text errors, which become cloning errors.
## Memory Management
`unload_model()` drops the model reference and clears the CUDA cache if applicable. `/models/unload` wires this up for manual control.
A singleton per backend is returned by `get_stt_backend()` — multiple callers share one Whisper instance.
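The singleton is a plain lazy module-global. A stand-alone sketch of the pattern (the real factory also consults `get_backend_type()` to choose MLX vs PyTorch; the class here is a stand-in):

```python
from typing import Optional

class _WhisperBackend:
    """Stand-in for MLXSTTBackend / PyTorchSTTBackend (illustrative only)."""
    def __init__(self) -> None:
        self.model = None
    def is_loaded(self) -> bool:
        return self.model is not None

_stt_backend: Optional[_WhisperBackend] = None

def get_stt_backend() -> _WhisperBackend:
    # Lazily create one shared instance; every caller gets the same object
    global _stt_backend
    if _stt_backend is None:
        _stt_backend = _WhisperBackend()
    return _stt_backend
```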
## Error Handling
| Error | Cause | Solution |
|-------|-------|----------|
| Model not found | First run + network failure | Retry; check connectivity |
| OOM on load | Large model on low-VRAM GPU | Switch to Small or Turbo |
| Empty result | No speech in audio | Confirm input has voice; check trim |
| Wrong language | Auto-detect misfired | Pass `language` hint |
## Next Steps
<Cards>
<Card title="Model Management" href="/developer/model-management">
Download / load / unload any model
</Card>
<Card title="Voice Profiles" href="/developer/voice-profiles">
How reference text is stored alongside samples
</Card>
<Card title="GPU Acceleration" href="/overview/gpu-acceleration">
Platform-specific acceleration including MLX-Whisper
</Card>
</Cards>

---
title: "TTS Engines"
description: "How to add new text-to-speech engines to Voicebox"
---
> **For humans:** This doc is optimized for AI agents to implement new TTS engines autonomously. It's structured as a phased workflow with explicit gates and a checklist so an agent can do the full integration — dependency research, backend, frontend, bundling — and hand you a draft release or prod build to test locally. It's also a useful reference if you're doing it yourself.
Adding an engine touches ~10 files across 4 layers. The backend protocol work is straightforward — the real time sink is dependency hell, upstream library bugs, and PyInstaller bundling.
**Do not start writing code until you complete Phase 0.** The v0.2.3 release was three patch releases of PyInstaller fixes because dependency research was skipped. Every issue — `inspect.getsource()` failures, missing native data files, metadata lookups, dtype mismatches — was discoverable by reading the model library's source code before integration began.
## Architecture Overview
The backend is split into layers:
| Layer | Purpose | Files Touched |
|-------|---------|---------------|
| `routes/` | Thin HTTP handlers | None (auto-dispatch) |
| `services/` | Business logic | None (auto-dispatch) |
| `backends/` | Engine implementations | `your_engine_backend.py` |
| `utils/` | Shared utilities | As needed |
New engines only need to touch `backends/` and `models.py` on the backend side — the route and service layers use a model config registry that handles dispatch automatically.
## Phase 0: Dependency Research
**This phase is mandatory.** Clone the model library and its key dependencies into a temporary directory and inspect them before writing any integration code. The goal is to produce a dependency audit that identifies every PyInstaller-incompatible pattern, every native data file, and every upstream bug you'll need to work around.
### 0.1 Clone and Inspect the Model Library
```bash
# Create a throwaway workspace
mkdir /tmp/engine-research && cd /tmp/engine-research
# Clone the model library
git clone https://github.com/org/model-library.git
cd model-library
```
**Read these files first, in order:**
1. **`setup.py` / `setup.cfg` / `pyproject.toml`** — Check pinned dependency versions. If the library pins `torch==2.6.0` or `numpy<1.26`, you'll need `--no-deps` installation and manual sub-dependency listing (this is what happened with `chatterbox-tts`).
2. **`__init__.py` and the main model class** — Trace the import chain. Look for:
- `from_pretrained()` — does it call `huggingface_hub` internally? Does it pass `token=True` (which crashes without a stored HF token)?
- `from_local()` — does it exist? You may need manual `snapshot_download()` + `from_local()` to bypass download bugs.
- Device handling — does it default to CUDA? Does it support MPS? Many libraries crash on MPS with unsupported operators.
3. **All `import` statements** — Recursively trace what the library imports. You're looking for:
- `inspect.getsource()` anywhere in the chain (search all `.py` files)
- `typeguard` / `@typechecked` decorators (these call `inspect.getsource()` at import time)
- `importlib.metadata.version()` or `pkg_resources.get_distribution()` (need `--copy-metadata`)
- `lazy_loader` (needs `--collect-all` to bundle `.pyi` stubs)
### 0.2 Scan for PyInstaller-Incompatible Patterns
Run these searches against the cloned library **and** its transitive dependencies:
```bash
# inspect.getsource — will crash in frozen binary without --collect-all
grep -r "inspect.getsource\|getsource(" .
# typeguard / @typechecked — calls inspect.getsource at import time
grep -r "@typechecked\|from typeguard" .
# importlib.metadata — needs --copy-metadata
grep -r "importlib.metadata\|pkg_resources.get_distribution\|pkg_resources.require" .
# Data files loaded at runtime — need --collect-all or --collect-data
grep -r "Path(__file__).parent\|os.path.dirname(__file__)\|resources_path\|pkg_resources.resource_filename" .
# Native library paths — may need env var override in frozen builds
grep -r "/usr/share\|/usr/lib\|/usr/local\|espeak\|phonemize" .
# torch.load without map_location — will crash on CPU-only builds
grep -r "torch.load(" . | grep -v "map_location"
# HuggingFace token bugs
grep -r 'token=True\|token=os.getenv' .
# Float64/Float32 assumptions — librosa returns float64, many models assume float32
grep -r "torch.from_numpy\|\.double()\|float64" .
# @torch.jit.script — calls inspect.getsource(), crashes in frozen builds
grep -r "@torch.jit.script\|torch.jit.script" .
# torchaudio.load — requires torchcodec in torchaudio 2.10+, use soundfile.read() instead
grep -r "torchaudio.load\|torchaudio.save" .
# Gated HuggingFace repos — models that hardcode gated repos as tokenizer/config sources
grep -r "from_pretrained\|tokenizer_name\|AutoTokenizer" . | grep -i "llama\|meta-llama\|gated"
```
### 0.3 Install and Trace in a Throwaway Venv
```bash
# Create isolated venv
python -m venv /tmp/engine-venv
source /tmp/engine-venv/bin/activate
# Install the package (try normally first)
pip install model-package
# Check if it conflicts with our stack
pip install model-package torch==2.10 transformers==4.57.3 "numpy>=1.26"
# If this fails, you need --no-deps:
pip install --no-deps model-package
# Get the full dependency tree
pip show model-package # Check Requires: field
pip show -f model-package # List all installed files (look for data files)
# Check for non-PyPI dependencies
pip install model-package 2>&1 | grep -i "no matching distribution"
```
### 0.4 Test Model Loading on CPU
Before writing any integration code, verify the model works on CPU in a plain Python script:
```python
import torch
# Force CPU to catch map_location bugs early
model = ModelClass.from_pretrained("org/model", device="cpu")
# Test with a float32 audio array (not float64)
import numpy as np
audio = np.random.randn(16000).astype(np.float32)
output = model.generate("Hello world", audio)
print(f"Output shape: {output.shape}, dtype: {output.dtype}, sample rate: {model.sample_rate}")
```
If this crashes, you've found a bug you'll need to monkey-patch. Common ones:
- `RuntimeError: expected scalar type Float but found Double` → needs float32 cast
- `RuntimeError: map_location` → needs `torch.load` patch
- `RuntimeError: Unsupported operator aten::...` → needs MPS skip
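For the `map_location` case, the usual workaround is a wrapper applied before the model library is imported. A hedged sketch (the helper name is ours, not from the codebase):

```python
import functools

def force_cpu_map_location(load_fn):
    """Wrap a torch.load-style callable so map_location defaults to 'cpu'."""
    @functools.wraps(load_fn)
    def wrapper(*args, **kwargs):
        kwargs.setdefault("map_location", "cpu")
        return load_fn(*args, **kwargs)
    return wrapper

# In the backend, before importing the model library:
# torch.load = force_cpu_map_location(torch.load)
```

An explicit `map_location` passed by the caller still wins, since `setdefault` only fills in a missing value.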
### 0.5 Produce a Dependency Audit
Before proceeding to Phase 1, write down:
1. **PyPI vs non-PyPI deps** — which packages need `--find-links`, `git+https://`, or `--no-deps`?
2. **PyInstaller directives needed** — which packages need `--collect-all`, `--copy-metadata`, `--hidden-import`?
3. **Runtime data files** — which packages ship data files (YAML, pretrained weights, phoneme tables, shader libraries) that must be bundled?
4. **Native library paths** — which packages look for data at system paths that won't exist in a frozen binary?
5. **Monkey-patches needed** — `torch.load` map_location, float64→float32 casts, MPS skip, HF token bypass, etc.
6. **Sample rate** — what does the engine output? (24kHz, 44.1kHz, 48kHz)
7. **Model download method** — `from_pretrained()` with library-managed download, or manual `snapshot_download()` + `from_local()`?
This audit becomes your implementation plan for Phases 1, 4, and 5.
## Phase 1: Backend Implementation
### 1.1 Create the Backend File
Create `backend/backends/<engine>_backend.py` (~200-300 lines) implementing the `TTSBackend` protocol:
```python
class YourBackend:
    """Must satisfy the TTSBackend protocol."""

    async def load_model(self, model_size: str = "default") -> None: ...
    async def create_voice_prompt(self, audio_path: str, reference_text: str, use_cache: bool = True) -> tuple[dict, bool]: ...
    async def combine_voice_prompts(self, audio_paths: list[str], ref_texts: list[str]) -> tuple[np.ndarray, str]: ...
    async def generate(self, text: str, voice_prompt: dict, language: str = "en", seed: int | None = None, instruct: str | None = None) -> tuple[np.ndarray, int]: ...
    def unload_model(self) -> None: ...
    def is_loaded(self) -> bool: ...
    def _get_model_path(self, model_size: str) -> str: ...
```
**Key decisions per engine:**
| Decision | Options | Examples |
|----------|---------|---------|
| **Voice prompt storage** | Pre-computed tensors vs deferred file paths | Qwen stores tensor dicts; Chatterbox stores paths |
| **Caching** | Use voice prompt cache or skip it | LuxTTS caches with prefix; Chatterbox skips caching |
| **Device selection** | CUDA / MPS / CPU | Chatterbox forces CPU on macOS (MPS bugs) |
| **Model download** | Library handles it vs manual `snapshot_download` | Turbo uses manual download to bypass `token=True` bug |
| **Sample rate** | Engine-specific | LuxTTS outputs 48kHz, everything else is 24kHz |
### 1.2 Voice Prompt Patterns
**Pattern A: Pre-computed tensors** (Qwen, LuxTTS)
```python
encoded = model.encode_prompt(audio_path)
return encoded, False # (prompt_dict, was_cached)
```
**Pattern B: Deferred file paths** (Chatterbox, MLX)
```python
return {"ref_audio": audio_path, "ref_text": reference_text}, False
```
**Pattern C: Hybrid** (possible for new engines)
```python
embedding = model.extract_speaker(audio_path)
return {"embedding": embedding, "ref_audio": audio_path}, False
```
If caching, prefix your cache keys:
```python
cache_key = "yourengine_" + get_cache_key(audio_path, reference_text)
```
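One plausible shape for such a key is a content hash over the audio bytes plus the reference text, so the cache invalidates when either changes (sketch only; the real `get_cache_key` lives in the backend and may differ):

```python
import hashlib

def get_cache_key(audio_path: str, reference_text: str) -> str:
    """Content-addressed cache key over the audio bytes and reference text."""
    h = hashlib.sha256()
    with open(audio_path, "rb") as f:
        h.update(f.read())
    h.update(reference_text.encode("utf-8"))
    return h.hexdigest()[:16]
```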
### 1.3 Register the Engine
In `backend/backends/__init__.py`:
**Add a `ModelConfig` entry:**
```python
ModelConfig(
    model_name="your-engine",
    display_name="Your Engine",
    engine="your_engine",
    hf_repo_id="org/model-repo",
    size_mb=3200,
    needs_trim=False,  # set True if output needs trim_tts_output()
    languages=["en", "fr", "de"],
),
```
**Add to `TTS_ENGINES` dict:**
```python
TTS_ENGINES = {
    ...
    "your_engine": "Your Engine",
}
```
**Add factory branch:**
```python
elif engine == "your_engine":
from .your_backend import YourBackend
backend = YourBackend()
```
### 1.4 Update Request Models
In `backend/models.py`:
- Add engine name to `GenerationRequest.engine` regex pattern
- Add any new language codes to the language regex
## Phase 2: Route and Service Integration
With the model config registry, route and service layers have **zero per-engine dispatch points**. All endpoints use registry helpers like `get_model_config()`, `load_engine_model()`, `engine_needs_trim()`, `check_model_loaded()`, etc.
**You don't need to touch any route or service files** unless your engine needs custom behavior in the generate pipeline.
### Post-Processing
If your model produces trailing silence, set `needs_trim=True` on your `ModelConfig`. The generation service applies `trim_tts_output()` automatically.
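The idea behind `trim_tts_output()` can be shown in a few lines. A minimal, illustrative version (the real implementation may use a different threshold or smoothing):

```python
import numpy as np

def trim_trailing_silence(audio: np.ndarray, threshold: float = 1e-3) -> np.ndarray:
    """Drop samples after the last one whose amplitude exceeds the threshold."""
    loud = np.flatnonzero(np.abs(audio) > threshold)
    return audio[: loud[-1] + 1] if loud.size else audio
```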
## Phase 3: Frontend Integration
### 3.1 TypeScript Types
In `app/src/lib/api/types.ts`:
- Add to the `engine` union type on `GenerationRequest`
### 3.2 Language Maps
In `app/src/lib/constants/languages.ts`:
- Add entry to `ENGINE_LANGUAGES` record
- Add any new language codes to `ALL_LANGUAGES` if needed
### 3.3 Engine/Model Selector
In `app/src/components/Generation/EngineModelSelector.tsx`:
- Add entry to `ENGINE_OPTIONS` and `ENGINE_DESCRIPTIONS`
- Add to `ENGLISH_ONLY_ENGINES` if applicable
### 3.4 Form Hook
In `app/src/lib/hooks/useGenerationForm.ts`:
- Add to Zod schema enum for `engine`
- Add engine-to-model-name mapping
- Update payload construction for engine-specific fields
**Watch out for model naming inconsistencies.** The HuggingFace repo name, the model size label, and the API model name don't always follow predictable patterns. For example, TADA's 3B model is named `tada-3b-ml` (not `tada-3b`), because it's a multilingual variant. Always check the actual repo names and build the frontend model name mapping from those, not from assumptions like `{engine}-{size}`.
### 3.5 Model Management
In `app/src/components/ServerSettings/ModelManagement.tsx`:
- Add description to `MODEL_DESCRIPTIONS` record
- Add model name to `voiceModels` filter condition
### 3.6 Non-Cloning Engines (Preset Voices)
If your engine uses **pre-built voices** instead of zero-shot cloning from reference audio (e.g. Kokoro), additional integration is needed:
**Backend:**
- In `kokoro_backend.py` (or your engine), define a `VOICES` list of `(voice_id, display_name, gender, language)` tuples
- `create_voice_prompt()` should return `{"voice_type": "preset", "preset_engine": "<engine>", "preset_voice_id": "<id>"}`
- `generate()` should read `voice_prompt.get("preset_voice_id")` to select the voice
- Add a `seed_preset_profiles("<engine>")` call in `backend/routes/models.py` after model download completes
- The `seed_preset_profiles()` function in `backend/services/profiles.py` creates DB profiles with `voice_type="preset"`
**Frontend:**
- The `EngineModelSelector` filters options based on `selectedProfile.voice_type`:
- `"cloned"` profiles → only cloning engines shown (Kokoro hidden)
- `"preset"` profiles → only the preset's engine shown
- Profile cards show the engine name as a badge for preset profiles
- When a preset profile is selected, the engine auto-switches
**Profile schema fields for presets:**
- `voice_type: "preset"` (vs `"cloned"` for traditional profiles)
- `preset_engine: "<engine>"` — which engine owns this voice
- `preset_voice_id: "<id>"` — the engine-specific voice identifier
**For future "designed" voices** (text description instead of audio, e.g. Qwen CustomVoice):
- Use `voice_type: "designed"` with `design_prompt` field
- `create_voice_prompt_for_profile()` already returns the design prompt for this type
## Phase 4: Dependencies
Use the dependency audit from Phase 0 to drive this phase. You should already know what packages are needed, which conflict, and which require special installation.
### 4.1 Python Dependencies
Add to `backend/requirements.txt`. There are three installation patterns, depending on what Phase 0 revealed:
**Normal PyPI packages:**
```
some-model-package>=1.0.0
```
**Pinned dependency conflicts (`--no-deps`)** — If the model package pins old versions of torch/numpy/transformers, install with `--no-deps` and list sub-dependencies manually. This is the pattern used for `chatterbox-tts`:
```bash
# In justfile / CI setup:
pip install --no-deps chatterbox-tts
# In requirements.txt — list each actual sub-dependency:
conformer>=0.3.2
diffusers>=0.31.0
omegaconf>=2.3.0
resemble-perth>=0.0.2
s3tokenizer>=0.1.6
```
To identify sub-deps: `pip show chatterbox-tts` → `Requires:` field, then cross-reference against existing `requirements.txt` to avoid duplicates.
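If you prefer to script that cross-reference, the same `Requires:` information is available from the standard library. A small helper sketch (not part of the repo):

```python
from importlib.metadata import PackageNotFoundError, requires


def declared_deps(package: str) -> list[str]:
    """Dependency specifiers an installed package declares (the same list
    `pip show <package>` prints under Requires:)."""
    try:
        # requires() returns None when the package declares no dependencies
        return requires(package) or []
    except PackageNotFoundError:
        return []
```

Feed the output into a diff against `requirements.txt` to see which sub-dependencies are already covered.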
**Non-PyPI packages** — Some libraries only exist on GitHub or require custom indexes:
```
# Git-only packages (no PyPI release)
linacodec @ git+https://github.com/ysharma3501/LinaCodec.git
Zipvoice @ git+https://github.com/ysharma3501/LuxTTS.git
# Custom package indexes (C extensions with platform-specific wheels)
--find-links https://k2-fsa.github.io/icefall/piper_phonemize.html
piper-phonemize>=1.2.0
```
### 4.2 Dependency Conflict Resolution
Check for conflicts with the existing stack before adding anything:
```bash
# Our current stack pins (approximate):
# Python 3.12+, torch>=2.10, transformers>=4.57, numpy>=1.26
# Test compatibility
pip install model-package torch==2.10 transformers==4.57.3 numpy>=1.26
# If it fails, check what the package pins:
pip show model-package | grep Requires
# Look at setup.py/pyproject.toml for version constraints
```
**Known incompatible patterns in the wild:**
- `torch==2.6.0` — many older packages pin this
- `numpy<1.26` — conflicts with Python 3.12+
- `transformers==4.46.3` — many packages pin old transformers
- `onnxruntime` pinned versions — often conflict with torch
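A quick way to catch a clash before committing to an install is to compare what's already in the venv against the stack pins. A hedged sketch (the prefixes in the example call mirror the approximate pins listed above; the helper is illustrative, not project tooling):

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(expected_prefixes: dict[str, str]) -> list[str]:
    """Flag installed packages whose version doesn't start with the
    expected major.minor prefix."""
    problems = []
    for pkg, prefix in expected_prefixes.items():
        try:
            installed = version(pkg)
        except PackageNotFoundError:
            problems.append(f"{pkg}: not installed")
            continue
        if not installed.startswith(prefix):
            problems.append(f"{pkg}: found {installed}, expected {prefix}*")
    return problems


# e.g. check_pins({"torch": "2.10", "transformers": "4.57", "numpy": "1.26"})
```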
### 4.3 Update Installation Scripts
Dependencies must be added in multiple places:
| File | What to add |
|------|------------|
| `backend/requirements.txt` | Package and version constraint |
| `justfile` | `--no-deps` install line if needed (in `setup-python` and `setup-python-release` targets) |
| `.github/workflows/release.yml` | Same `--no-deps` line in CI build steps |
| `Dockerfile` | Same install commands for Docker builds |
## Phase 5: PyInstaller Bundling (`build_binary.py`)
This is where most of the pain lives. **The v0.2.3 release was entirely dedicated to fixing bundling issues** — every new engine that shipped in v0.2.1 (LuxTTS, Chatterbox, Chatterbox Turbo) worked in dev but failed in production builds. Don't skip this phase.
### 5.1 Register Your Engine in `build_binary.py`
Every new engine needs entries in `backend/build_binary.py`. This file drives PyInstaller and is the single most common source of "works in dev, breaks in prod" bugs. You need to decide which PyInstaller directives your engine's dependencies require:
| Directive | What It Does | When You Need It |
|-----------|-------------|-----------------|
| `--hidden-import <module>` | Includes a module PyInstaller can't detect via static analysis | Dynamic imports, lazy imports, plugin architectures |
| `--collect-all <package>` | Bundles source `.py` files, data files, AND native libraries | Packages that call `inspect.getsource()` at import time (e.g. `inflect` via `typeguard`'s `@typechecked`), or that ship pretrained model files (e.g. `perth` ships `.pth.tar` + `hparams.yaml`) |
| `--collect-data <package>` | Bundles only data files (not source or native libs) | Packages with YAML configs, vocab files, etc. |
| `--collect-submodules <package>` | Bundles all submodules | Packages with deep module trees that PyInstaller misses |
| `--copy-metadata <package>` | Copies `importlib.metadata` info | Packages that call `importlib.metadata.version()` or `pkg_resources.get_distribution()` at runtime. Already required for: `requests`, `transformers`, `huggingface-hub`, `tokenizers`, `safetensors`, `tqdm` |
**Example: adding hidden imports and collect-all for a new engine:**
```python
# In build_binary.py, inside the args list:
"--hidden-import",
"backend.backends.your_engine_backend",
"--hidden-import",
"your_engine_package",
"--hidden-import",
"your_engine_package.inference",
"--collect-all",
"some_dependency_that_uses_inspect_getsource",
"--copy-metadata",
"some_dependency_that_checks_its_own_version",
```
### 5.2 Lessons from v0.2.3 — Real Failures and Their Fixes
These are actual production failures from shipping new engines. Every one of these passed `python -m uvicorn` in dev:
| Engine | Failure | Root Cause | Fix |
|--------|---------|-----------|-----|
| LuxTTS | `"could not get source code"` on import | `inflect` uses `typeguard`'s `@typechecked` which calls `inspect.getsource()` — needs `.py` source files, not just bytecode | `--collect-all inflect` |
| LuxTTS | `espeak-ng-data` not found | `piper_phonemize` C library looks for data at `/usr/share/espeak-ng-data/` which doesn't exist in the bundle | `--collect-all piper_phonemize` + set `ESPEAK_DATA_PATH` env var at runtime (see 5.3) |
| LuxTTS | `inspect.getsource` error in Vocos codec | `linacodec` and `zipvoice` use source introspection | `--collect-all linacodec` + `--collect-all zipvoice` |
| Chatterbox | `FileNotFoundError` for watermark model | `perth` ships pretrained model files (`hparams.yaml`, `.pth.tar`) that PyInstaller doesn't bundle by default | `--collect-all perth` |
| All engines | `importlib.metadata` failures | Frozen binary doesn't include package metadata for `huggingface-hub`, `transformers`, etc. | `--copy-metadata` for each affected package |
| All engines | Download progress bars stuck at 0% | `huggingface_hub` silently disables tqdm progress bars based on logger level in frozen builds — our progress tracker never receives byte updates | Force-enable tqdm's internal counter in `HFProgressTracker` |
| TADA | `inspect.getsource` error in DAC's `Snake1d` | `@torch.jit.script` calls `inspect.getsource()` which fails without `.py` source files | Wrote a lightweight shim (`dac_shim.py`) reimplementing `Snake1d` without `@torch.jit.script`, registered fake `dac.*` modules in `sys.modules` |
| All engines | `NameError: name 'obj' is not defined` on macOS | Python 3.12.0 has a [CPython bug](https://github.com/pyinstaller/pyinstaller/issues/7992) that corrupts bytecode when PyInstaller rewrites code objects | Upgrade to Python 3.12.13+ |
| All engines | `resource_tracker` subprocess crash | `multiprocessing` in frozen binaries needs `freeze_support()` called before anything else | Added to `server.py` entry point |
### 5.3 Runtime Frozen-Build Handling (`server.py`)
Some fixes can't live in `build_binary.py` — they need runtime detection. The entry point `backend/server.py` handles these before any heavy imports:
```python
# 1. freeze_support() — MUST be called before any multiprocessing use
import multiprocessing
multiprocessing.freeze_support()
# 2. Native data paths — redirect C libraries to bundled data
if getattr(sys, 'frozen', False):
_meipass = getattr(sys, '_MEIPASS', os.path.dirname(sys.executable))
_espeak_data = os.path.join(_meipass, 'piper_phonemize', 'espeak-ng-data')
if os.path.isdir(_espeak_data):
os.environ.setdefault('ESPEAK_DATA_PATH', _espeak_data)
# 3. stdout/stderr safety — PyInstaller --noconsole on Windows sets these to None
if not _is_writable(sys.stdout):
sys.stdout = open(os.devnull, 'w')
```
If your engine's dependencies include native libraries that look for data at system paths (like espeak-ng does), you'll need to add a similar `os.environ.setdefault()` block here.
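Generalized as a helper, that pattern looks like this (a sketch only; the data subdirectory and environment variable names depend entirely on your engine's C library):

```python
import os
import sys


def set_bundled_data_env(package_subdir: str, env_var: str) -> bool:
    """Point a native library's data-path env var at the bundled copy
    when running as a frozen binary. Returns True if the var was set."""
    if not getattr(sys, "frozen", False):
        return False  # dev mode: system paths work normally
    meipass = getattr(sys, "_MEIPASS", os.path.dirname(sys.executable))
    data_dir = os.path.join(meipass, package_subdir)
    if os.path.isdir(data_dir):
        # setdefault so a user-provided override still wins
        os.environ.setdefault(env_var, data_dir)
        return True
    return False
```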
### 5.4 CUDA vs CPU Build Branching
`build_binary.py` produces two different binaries:
- **`voicebox-server`** (CPU) — excludes all `nvidia.*` packages to avoid bundling ~3 GB of CUDA DLLs
- **`voicebox-server-cuda`** — includes `torch.cuda` and `torch.backends.cudnn`
On Windows, if the build environment has CUDA torch installed but you're building the CPU binary, the script temporarily swaps to CPU-only torch and restores CUDA torch afterward. This prevents PyInstaller from accidentally bundling CUDA libraries into the CPU build.
New engine imports go in the **common section** (not the CUDA or MLX conditional blocks) unless your engine has platform-specific dependencies.
### 5.5 MLX Conditional Inclusion
Apple Silicon builds conditionally include MLX hidden imports and `--collect-all mlx` / `--collect-all mlx_audio`. If your engine has an MLX-specific backend variant, add its imports inside the `if is_apple_silicon() and not cuda:` block.
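The branching in 5.4 and 5.5 can be condensed into one sketch. The directive names are real PyInstaller flags; the module lists are illustrative placeholders, and `build_binary.py` remains the authoritative source:

```python
def pyinstaller_args(cuda: bool, apple_silicon: bool) -> list[str]:
    """Sketch of the CPU/CUDA/MLX branching described above."""
    args = [
        # Common section: new engines normally register here
        "--hidden-import", "backend.backends.your_engine_backend",
    ]
    if cuda:
        args += ["--hidden-import", "torch.cuda",
                 "--hidden-import", "torch.backends.cudnn"]
    else:
        # CPU build: keep ~3 GB of CUDA DLLs out of the bundle
        args += ["--exclude-module", "nvidia"]
    if apple_silicon and not cuda:
        args += ["--collect-all", "mlx", "--collect-all", "mlx_audio"]
    return args
```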
### 5.6 Testing Frozen Builds
You can't skip this. Models that work in `python -m uvicorn` will break in the PyInstaller binary. Getting the v0.2.1 engines working in production took **three consecutive releases** (v0.2.1 → v0.2.2 → v0.2.3).
1. Build: `just build`
2. Launch the binary directly (not via `python -m`)
3. Test the **full chain**: download → load → generate → progress tracking
4. Check stderr for the actual error (logs go to stderr for Tauri sidecar capture)
5. Fix, rebuild, repeat
**Common gotcha:** testing only generation with a pre-cached model from your dev install. Always test with a clean model cache to verify downloads work too.
## Phase 6: Common Upstream Workarounds
### torch.load device mismatch
```python
import torch

_original_torch_load = torch.load

def _patched_torch_load(*args, **kwargs):
    # Default to CPU mapping so CUDA-saved checkpoints load on CPU/MPS machines
    kwargs.setdefault("map_location", "cpu")
    return _original_torch_load(*args, **kwargs)

torch.load = _patched_torch_load
```
### Float64/Float32 dtype mismatch
```python
original_fn = SomeClass.some_method
def patched_fn(self, *args, **kwargs):
result = original_fn(self, *args, **kwargs)
return result.float()
SomeClass.some_method = patched_fn
```
### HuggingFace token bug
```python
from huggingface_hub import snapshot_download
local_path = snapshot_download(repo_id=REPO, token=None)
model = ModelClass.from_local(local_path, device=device)
```
### MPS tensor issues
Skip MPS entirely if operators aren't supported:
```python
def _get_device(self):
if torch.cuda.is_available():
return "cuda"
return "cpu" # Skip MPS
```
### Gated HuggingFace repos as hardcoded config sources
Some models hardcode a gated HuggingFace repo as their tokenizer or config source (e.g., TADA hardcodes `"meta-llama/Llama-3.2-1B"` in both its `AlignerConfig` and `TadaConfig`). This silently fails without HF authentication.
**Fix:** Download from an ungated mirror and patch the config objects directly:
```python
# Download tokenizer from ungated mirror
UNGATED_TOKENIZER = "unsloth/Llama-3.2-1B"
tokenizer_path = snapshot_download(UNGATED_TOKENIZER, token=None)
# Patch the model config to use the local path instead of the gated repo
config = ModelConfig.from_pretrained(model_path)
config.tokenizer_name = tokenizer_path
model = ModelClass.from_pretrained(model_path, config=config)
```
**Do NOT monkey-patch `AutoTokenizer.from_pretrained`** — it's a classmethod, and replacing it corrupts the descriptor, which breaks other engines that use different tokenizers (e.g., Qwen uses a Qwen tokenizer via `AutoTokenizer`). Always patch at the config level, not the class method level.
### `torchaudio.load()` requires `torchcodec` in 2.10+
As of `torchaudio>=2.10`, `torchaudio.load()` requires the `torchcodec` package for audio I/O. If your engine or backend code uses `torchaudio.load()`, replace it with `soundfile`:
```python
# Before (breaks without torchcodec):
import torchaudio
waveform, sr = torchaudio.load("audio.wav")
# After:
import soundfile as sf
import torch
data, sr = sf.read("audio.wav", dtype="float32")  # (n,) mono or (n, channels) stereo
waveform = torch.from_numpy(data).unsqueeze(0)  # mono; transpose stereo to (channels, n) first
```
Note: `torchaudio.functional.resample()` and other pure-PyTorch math functions work fine without `torchcodec` — only the I/O functions are affected.
### `@torch.jit.script` breaks in frozen builds
`torch.jit.script` calls `inspect.getsource()` to parse the decorated function's source code. In a PyInstaller binary, `.py` source files aren't available, so this crashes at import time.
**Fix:** Remove or avoid `@torch.jit.script` decorators. If the decorated function comes from an upstream dependency, write a shim that reimplements the function without the decorator (see "Toxic dependency chains" below).
### Toxic dependency chains — the shim pattern
Sometimes a model library depends on a package with a massive, hostile transitive dependency tree, but only uses a tiny piece of it. When the dependency chain is unbuildable or would pull in dozens of unwanted packages, the right move is to write a lightweight shim.
**Example:** TADA depends on `descript-audio-codec` (DAC), which pulls in `descript-audiotools` → `onnx`, `tensorboard`, `protobuf`, `matplotlib`, `pystoi`, etc. The `onnx` package fails to build from source on macOS. But TADA only uses `Snake1d` from DAC — a 7-line PyTorch module.
**Solution:** Create a shim at `backend/utils/dac_shim.py` that registers fake modules in `sys.modules`:
```python
import sys
import types
import torch
from torch import nn
def snake(x, alpha):
"""Snake activation — reimplemented without @torch.jit.script."""
return x + (1.0 / (alpha + 1e-9)) * torch.sin(alpha * x).pow(2)
class Snake1d(nn.Module):
def __init__(self, channels):
super().__init__()
self.alpha = nn.Parameter(torch.ones(1, channels, 1))
def forward(self, x):
return snake(x, self.alpha)
# Register fake dac.* modules so "from dac.nn.layers import Snake1d" works
_nn = types.ModuleType("dac.nn")
_layers = types.ModuleType("dac.nn.layers")
_layers.Snake1d = Snake1d
_nn.layers = _layers
for name, mod in [("dac", types.ModuleType("dac")),
("dac.nn", _nn), ("dac.nn.layers", _layers)]:
sys.modules[name] = mod
```
**Key rules for shims:**
- Import the shim **before** importing the model library (so it finds the fake modules first)
- Do NOT use `@torch.jit.script` in the shim (see above)
- Only reimplement what the model actually uses — check the import chain carefully
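The import-order rule is the whole trick: Python consults `sys.modules` before searching installed packages. A toy demonstration (the module name is made up):

```python
import sys
import types

# Register the fake module FIRST...
fake = types.ModuleType("hypothetical_heavy_dep")
fake.answer = 42
sys.modules["hypothetical_heavy_dep"] = fake

# ...so any later import resolves to the shim instead of triggering a
# lookup for a real (and possibly unbuildable) package
import hypothetical_heavy_dep  # noqa: E402

assert hypothetical_heavy_dep.answer == 42
```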
## Candidate Engines
The [`docs/PROJECT_STATUS.md`](https://github.com/jamiepine/voicebox/blob/main/docs/PROJECT_STATUS.md) file is the canonical, living list of candidates under evaluation — including why some have been backlogged (e.g. VoxCPM, which is effectively CUDA-only upstream).
At a glance, current top candidates:
| Model | Tier | Size | Cross-platform? | Key Features |
|-------|------|------|-----------------|--------------|
| **MOSS-TTS-Nano** | 1 | 0.1 B | Yes (CPU realtime) | 48 kHz stereo, Apache 2.0, released 2026-04-13 |
| **Voxtral TTS** | 2 | 4 B | Likely | `mistralai/Voxtral-4B-TTS-2603` — presets + cloning |
| **VibeVoice** | 2 | ~500 M | Yes | Podcast-style multi-speaker dialogue |
| **Dia2** | 3 | TBD | TBD | Successor to the original Dia |
| **Fish Audio S2 Pro** | 3 | Medium | Yes | Word-level control via inline text |
**Backlogged:**
- **VoxCPM** (2B, Apache 2.0) — CUDA ≥12 required upstream; MPS broken in issues #232/#248; CPU path rejected by maintainers (#256). Keep watching for a PR that relaxes the device requirement.
Update `PROJECT_STATUS.md` when you pick one up or mark one as shipped/backlogged.
## Implementation Checklist
Use this as a gate between phases. Do not proceed to the next phase until every item in the current phase is checked.
### Phase 0: Dependency Research
- [ ] Cloned model library source into a temp directory
- [ ] Read `setup.py` / `pyproject.toml` — noted pinned dependency versions
- [ ] Traced all imports from the model class through to leaf dependencies
- [ ] Searched for `inspect.getsource`, `@typechecked`, `typeguard` in the full dependency tree
- [ ] Searched for `importlib.metadata`, `pkg_resources.get_distribution` in the dependency tree
- [ ] Searched for `Path(__file__).parent`, `os.path.dirname(__file__)`, hardcoded system paths
- [ ] Searched for `torch.load` calls missing `map_location`
- [ ] Searched for `torch.from_numpy` without `.float()` cast
- [ ] Searched for `token=True` or `token=os.getenv("HF_TOKEN")` in HuggingFace calls
- [ ] Searched for `@torch.jit.script` / `torch.jit.script` (crashes in frozen builds)
- [ ] Searched for `torchaudio.load` / `torchaudio.save` (requires `torchcodec` in 2.10+)
- [ ] Searched for hardcoded gated HuggingFace repo names (e.g., `meta-llama/*`)
- [ ] Evaluated whether any dependency is used minimally enough to shim instead of install
- [ ] Tested model loading and generation on CPU in a throwaway venv
- [ ] Tested with a clean HuggingFace cache (no pre-downloaded models)
- [ ] Produced a written dependency audit documenting all findings
### Phase 1: Backend Implementation
- [ ] Created `backend/backends/<engine>_backend.py` implementing `TTSBackend` protocol
- [ ] Chose voice prompt pattern (pre-computed tensors vs deferred file paths)
- [ ] Implemented all monkey-patches identified in Phase 0
- [ ] Used `get_torch_device()` from `backends/base.py` for device selection
- [ ] Used `model_load_progress()` from `backends/base.py` for download/load tracking
- [ ] Tested: model downloads correctly
- [ ] Tested: model loads on CPU
- [ ] Tested: generation produces valid audio
- [ ] Tested: voice cloning from reference audio works
- [ ] Registered `ModelConfig` in `backends/__init__.py`
- [ ] Added to `TTS_ENGINES` dict
- [ ] Added factory branch in `get_tts_backend_for_engine()`
- [ ] Updated engine regex in `backend/models.py`
### Phases 2-3: Route, Service, and Frontend
- [ ] Confirmed zero changes needed in routes/services (or documented why custom behavior is needed)
- [ ] Added engine to TypeScript union type in `app/src/lib/api/types.ts`
- [ ] Added language map entry in `app/src/lib/constants/languages.ts`
- [ ] Added to `ENGINE_OPTIONS` and `ENGINE_DESCRIPTIONS` in `EngineModelSelector.tsx`
- [ ] Added to Zod schema and model-name mapping in `useGenerationForm.ts`
- [ ] Added description in `ModelManagement.tsx`
### Phase 4: Dependencies
- [ ] Added packages to `backend/requirements.txt`
- [ ] If `--no-deps` needed: listed sub-dependencies explicitly
- [ ] If git-only packages: added `@ git+https://...` entries
- [ ] If custom index needed: added `--find-links` line
- [ ] Updated `justfile` setup targets
- [ ] Updated `.github/workflows/release.yml` build steps
- [ ] Updated `Dockerfile` if applicable
- [ ] Verified `pip install` succeeds in a clean venv with existing requirements
### Phase 5: PyInstaller Bundling
- [ ] Added `--hidden-import` entries in `build_binary.py` for:
- [ ] `backend.backends.<engine>_backend`
- [ ] The model package and its key submodules
- [ ] Added `--collect-all` for any packages that:
- [ ] Use `inspect.getsource()` / `@typechecked`
- [ ] Ship pretrained model data files (`.pth.tar`, `.yaml`, etc.)
- [ ] Ship native data files (phoneme tables, shader libraries, etc.)
- [ ] Added `--copy-metadata` for any packages that use `importlib.metadata`
- [ ] If engine has native data paths: added `os.environ.setdefault()` in `server.py`
- [ ] Built frozen binary with `just build`
- [ ] Tested in frozen binary with **clean model cache** (not pre-cached from dev):
- [ ] Model download works with real-time progress
- [ ] Model loading works
- [ ] Generation produces valid audio
- [ ] No errors in stderr logs
### Phase 6: Final Verification
- [ ] Engine works in dev mode (`just dev`)
- [ ] Engine works in frozen binary (`just build` → run binary directly)
- [ ] Tested on target platform (macOS for MLX, Windows/Linux for CUDA)
- [ ] No regressions in existing engines
---
title: "TTS Generation"
description: "How text-to-speech generation works across Voicebox's multi-engine backend"
---
## Overview
Voicebox ships seven TTS engines — Qwen3-TTS, Qwen CustomVoice, LuxTTS, Chatterbox Multilingual, Chatterbox Turbo, TADA, and Kokoro — behind a single `TTSBackend` Protocol. All of them expose the same async interface so the routes and services don't need per-engine branching.
This page covers how generation flows through that abstraction. For the step-by-step guide to adding a new engine, see [TTS Engines](/developer/tts-engines).
## The `TTSBackend` Protocol
Every engine implements the same contract (defined in `backend/backends/__init__.py`):
```python
@runtime_checkable
class TTSBackend(Protocol):
async def load_model(self, model_size: str) -> None: ...
async def create_voice_prompt(
self, audio_path: str, reference_text: str, use_cache: bool = True
) -> Tuple[dict, bool]: ...
async def combine_voice_prompts(
self, audio_paths: List[str], reference_texts: List[str]
) -> Tuple[np.ndarray, str]: ...
async def generate(
self,
text: str,
voice_prompt: dict,
language: str = "en",
seed: Optional[int] = None,
instruct: Optional[str] = None,
) -> Tuple[np.ndarray, int]: ...
def unload_model(self) -> None: ...
def is_loaded(self) -> bool: ...
```
## The `ModelConfig` Registry
Each downloadable model variant is described by a `ModelConfig` dataclass:
```python
@dataclass
class ModelConfig:
model_name: str # "luxtts", "qwen-tts-1.7B", "kokoro"
display_name: str # "LuxTTS (Fast, CPU-friendly)"
engine: str # "luxtts", "qwen", "kokoro"
hf_repo_id: str # "YatharthS/LuxTTS"
model_size: str = "default"
size_mb: int = 0
needs_trim: bool = False
supports_instruct: bool = False
languages: list[str] = field(default_factory=lambda: ["en"])
```
Registry helpers in `backends/__init__.py` replace what used to be per-engine `if/elif` chains:
- `get_all_model_configs()` — every TTS + STT variant
- `get_tts_model_configs()` — only TTS variants
- `get_model_config(model_name)` — lookup by name
- `engine_needs_trim(engine)` — whether output should run through `trim_tts_output()`
- `load_engine_model(engine, model_size)` — downloads + loads, handles engines with multiple sizes
- `get_tts_backend_for_engine(engine)` — thread-safe backend factory with double-checked locking
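The double-checked locking in `get_tts_backend_for_engine()` keeps backend creation thread-safe without paying for the lock on every call. A self-contained sketch of the pattern (the stub class stands in for real backends):

```python
import threading

_backends: dict[str, object] = {}
_lock = threading.Lock()


class _StubBackend:
    """Illustrative stand-in for a real engine backend."""
    def __init__(self, engine: str):
        self.engine = engine


def get_tts_backend_for_engine(engine: str) -> _StubBackend:
    # First check without the lock: the common, already-created case
    backend = _backends.get(engine)
    if backend is None:
        with _lock:
            # Second check under the lock: another thread may have won the race
            backend = _backends.get(engine)
            if backend is None:
                backend = _StubBackend(engine)
                _backends[engine] = backend
    return backend
```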
The `TTS_ENGINES` dict is the canonical list of shipped engine names:
```python
TTS_ENGINES = {
"qwen": "Qwen TTS",
"qwen_custom_voice": "Qwen CustomVoice",
"luxtts": "LuxTTS",
"chatterbox": "Chatterbox TTS",
"chatterbox_turbo": "Chatterbox Turbo",
"tada": "TADA",
"kokoro": "Kokoro",
}
```
## Voice Prompt Patterns
Each engine chooses how to represent a voice in the prompt dict returned from `create_voice_prompt()`. Three patterns are in use today:
**Pattern A — Pre-computed tensors** (Qwen3-TTS, LuxTTS)
```python
encoded = model.encode_prompt(audio_path)
return encoded, False # (prompt_dict, was_cached)
```
**Pattern B — Deferred file paths** (Chatterbox, Chatterbox Turbo, TADA)
```python
return {"ref_audio": audio_path, "ref_text": reference_text}, False
```
**Pattern C — Preset voice pointer** (Kokoro, Qwen CustomVoice)
```python
return {
"voice_type": "preset",
"preset_engine": "kokoro",
"preset_voice_id": "am_adam",
}, False
```
Pattern C is the shape used for profiles where `voice_type == "preset"` — there's no cloning step; the engine looks up a baked-in voice by ID.
Engines that cache voice prompts prefix their cache keys to avoid collisions:
```python
cache_key = f"{engine}_{hash((audio_path, reference_text))}"
```
## Device Selection
Engines pick their device through `get_torch_device()` in `backends/base.py`, which layers:
1. `VOICEBOX_FORCE_CPU` environment override
2. CUDA (if compiled and available)
3. XPU (Intel Arc via IPEX)
4. MPS (Apple Silicon) — **only for engines that support it**; some (Chatterbox, older Qwen paths) skip MPS and fall back to CPU due to upstream operator gaps
5. CPU
Qwen TTS uses MLX directly on Apple Silicon instead of going through PyTorch — see `mlx_backend.py`.
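The priority order can be sketched as a pure function. Here the capability checks are parameters so the layering is visible; the real helper in `backends/base.py` queries torch directly:

```python
import os


def pick_device(cuda: bool, xpu: bool, mps: bool, allow_mps: bool = True) -> str:
    """Sketch of the device-selection layering described above."""
    if os.environ.get("VOICEBOX_FORCE_CPU"):
        return "cpu"  # explicit override beats everything
    if cuda:
        return "cuda"
    if xpu:
        return "xpu"
    if allow_mps and mps:
        return "mps"  # engines with upstream operator gaps pass allow_mps=False
    return "cpu"
```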
## Generation Flow
The request path from frontend to audio file:
1. **Request** — `POST /generate` with `GenerationRequest`:
```json
{
"profile_id": "uuid",
"text": "...",
"language": "en",
"seed": 42,
"model_size": "1.7B",
"instruct": "warm, slightly amused",
"engine": "qwen",
"max_chunk_chars": 800
}
```
The `engine` field is validated against the regex `^(qwen|qwen_custom_voice|luxtts|chatterbox|chatterbox_turbo|tada|kokoro)$`.
2. **Route** — `routes/generate.py` validates input and delegates.
3. **Service** — `services/generation.py` fetches the profile, resolves the engine backend via `get_tts_backend_for_engine(engine)`, and ensures the model is loaded (downloading it on first use with live progress).
4. **Voice prompt** — the service calls `create_voice_prompt()` (or the preset equivalent). For cloned profiles with multiple samples, it calls `combine_voice_prompts()` first to merge reference audio.
5. **Queue** — the request is serialized through `services/task_queue.py` to avoid multiple generations fighting for the GPU.
6. **Inference** — the engine's `generate()` returns `(audio_array, sample_rate)`.
7. **Post-process** — if `engine_needs_trim(engine)` is True, `trim_tts_output()` strips trailing silence. Effects chains (if any) are applied per generation version, not the clean version.
8. **Persist** — audio is written to the generations directory, a row is inserted into the `generations` table, and the response includes the generation metadata.
## Chunking for Long Text
Text longer than `max_chunk_chars` (default 800, range 100-5000) is split at sentence boundaries, generated in sequence, and crossfaded together. The chunking behavior is engine-agnostic — it lives in the service layer, not in individual backends.
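A minimal sketch of the splitting step (greedy sentence packing; the real implementation also crossfades the chunk boundaries back together):

```python
import re


def chunk_text(text: str, max_chunk_chars: int = 800) -> list[str]:
    """Split text at sentence boundaries into chunks no longer than
    max_chunk_chars (except for a single oversized sentence)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chunk_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```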
## Instruct Mode
Two engines support natural-language delivery control via the `instruct` kwarg:
- **Qwen CustomVoice** — `supports_instruct=True`, fully wired to the model's instruct head.
- **Qwen Base** — silently drops the instruct text (`supports_instruct=False`). The frontend hides the instruct input for Base profiles.
```python
# Good instruct prompts:
"warm and conversational, slight smile"
"whisper, intimate and close"
"authoritative, broadcast quality"
```
Other engines ignore `instruct` entirely.
## Memory Management
Models are loaded lazily on first use and kept in memory. Switching between model sizes (e.g. Qwen 1.7B ↔ 0.6B) unloads the previous model before loading the new one to avoid OOM:
```python
def unload_model(self):
if self.model is not None:
del self.model
self.model = None
if torch.cuda.is_available():
torch.cuda.empty_cache()
```
The model management API (`/models/load`, `/models/unload`) lets users free VRAM manually — see [Model Management](/developer/model-management).
## API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/generate` | Generate speech from text |
| GET | `/audio/{generation_id}` | Serve generated audio file |
### Response schema
```json
{
"id": "generation_uuid",
"profile_id": "profile_uuid",
"text": "...",
"language": "en",
"audio_path": "/path/to/audio.wav",
"duration": 3.5,
"seed": 42,
"engine": "qwen",
"model_size": "1.7B",
"instruct": "...",
"created_at": "2026-04-18T10:30:00Z"
}
```
## Performance Considerations
- **CUDA** is the fastest backend for every PyTorch-based engine. Apple Silicon MLX is competitive with CUDA for Qwen TTS specifically.
- **Serial queue** — only one generation runs at a time per process; concurrent requests are queued.
- **Voice prompt caching** saves ~1-2s on repeated generations from the same profile.
- **Model pinning** — the first load is slow (download + load), subsequent generations reuse the cached model in memory.
### Per-engine VRAM (approximate, on CUDA)
| Engine | VRAM |
|--------|------|
| Kokoro | ~150 MB |
| LuxTTS | ~1 GB |
| Chatterbox Turbo | ~1.5 GB |
| Qwen 0.6B / Qwen CustomVoice 0.6B | ~2 GB |
| Chatterbox Multilingual | ~3 GB |
| TADA 1B | ~4 GB |
| Qwen 1.7B / Qwen CustomVoice 1.7B | ~6 GB |
| TADA 3B | ~8 GB |
## Next Steps
<Cards>
<Card title="TTS Engines" href="/developer/tts-engines">
Add a new engine — full phased workflow
</Card>
<Card title="Model Management" href="/developer/model-management">
Downloading, loading, and unloading models
</Card>
<Card title="Voice Profiles" href="/developer/voice-profiles">
Cloned vs preset profile schema
</Card>
</Cards>
---
title: "Voice Profiles"
description: "How voice profile management works in Voicebox"
---
## Overview
Voice profiles are the unit of "a saved voice" in Voicebox. As of 0.4 they support two flavors backed by the same `profiles` table:
- **Cloned profiles** — store one or more reference audio samples; the cloning engine generates a voice embedding at use time
- **Preset profiles** — store no audio; just a pointer to an engine-specific pre-built voice (e.g. Kokoro's `am_adam`, Qwen CustomVoice's `Ryan`)
The schema also reserves a third type, `designed`, for future text-described voices; no shipped engine uses it yet.
## Architecture
The voice profile system consists of three main components:
**Database Layer:** SQLite tables store profile metadata, sample references (cloned), and engine + voice ID (preset).
**File Storage:** Audio samples are stored on disk in a structured directory format. Preset profiles have no on-disk audio.
**Profile Module:** `backend/services/profiles.py` provides the business logic for CRUD operations and dispatches to the appropriate engine based on `voice_type`.
## Data Model
### VoiceProfile Table
```python
class VoiceProfile(Base):
__tablename__ = "profiles"
id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))
name = Column(String, unique=True, nullable=False)
description = Column(Text)
language = Column(String, default="en")
avatar_path = Column(String, nullable=True)
effects_chain = Column(Text, nullable=True)
# Voice type system — added v0.3.x
voice_type = Column(String, default="cloned") # "cloned" | "preset" | "designed"
preset_engine = Column(String, nullable=True) # e.g. "kokoro" — only for preset
preset_voice_id = Column(String, nullable=True) # e.g. "am_adam" — only for preset
design_prompt = Column(Text, nullable=True) # text description — only for designed (reserved)
default_engine = Column(String, nullable=True) # auto-selected engine, locked for preset
created_at = Column(DateTime, default=datetime.utcnow)
updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
```
The `voice_type` column discriminates the three flavors:
| `voice_type` | `preset_engine` | `preset_voice_id` | Samples in `profile_samples` |
| ------------ | --------------- | ----------------- | ---------------------------- |
| `cloned` | NULL | NULL | Required (≥1 row) |
| `preset` | engine name | voice ID string | None |
| `designed` | NULL | NULL | None (uses `design_prompt`) |
The `default_engine` column is set automatically when the profile is created. For preset profiles it's locked to the source engine: selecting a different engine at generation time greys the profile out, and the UI auto-switches back to the locked engine when the user clicks a greyed-out card (see the floating generate box and profile grid).
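The table's invariants can be written as a single predicate. This is a sketch; where exactly the codebase enforces these constraints (Pydantic schemas vs the service layer) is an assumption on my part:

```python
def profile_fields_valid(voice_type: str, preset_engine, preset_voice_id,
                         sample_count: int, design_prompt) -> bool:
    """Encode the column invariants from the table above."""
    if voice_type == "cloned":
        # Preset columns NULL, at least one sample row
        return preset_engine is None and preset_voice_id is None and sample_count >= 1
    if voice_type == "preset":
        # Engine + voice ID set, no sample rows
        return preset_engine is not None and preset_voice_id is not None and sample_count == 0
    if voice_type == "designed":
        # Text description only, no sample rows
        return design_prompt is not None and sample_count == 0
    return False
```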
### ProfileSample Table
```python
class ProfileSample(Base):
__tablename__ = "profile_samples"
id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))
profile_id = Column(String, ForeignKey("profiles.id"))
audio_path = Column(String, nullable=False)
reference_text = Column(Text, nullable=False)
```
Only populated for cloned profiles. Preset and designed profiles have zero rows in this table.
## File Structure
Profiles are stored in the data directory:
<Files>
<Folder name="data" defaultOpen>
<Folder name="profiles">
<Folder name="{profile_id}">
<File name="{sample_id_1}.wav" />
<File name="{sample_id_2}.wav" />
</Folder>
</Folder>
</Folder>
</Files>
## Core Functions
### Creating a Profile
```python
async def create_profile(data: VoiceProfileCreate, db: Session) -> VoiceProfileResponse:
# 1. Create database record
db_profile = DBVoiceProfile(
id=str(uuid.uuid4()),
name=data.name,
description=data.description,
language=data.language,
)
db.add(db_profile)
db.commit()
# 2. Create profile directory
profile_dir = profiles_dir / db_profile.id
profile_dir.mkdir(parents=True, exist_ok=True)
return VoiceProfileResponse.model_validate(db_profile)
```
### Adding Samples
When a sample is added, the audio is validated and copied to the profile directory:
```python
async def add_profile_sample(
profile_id: str,
audio_path: str,
reference_text: str,
db: Session,
) -> ProfileSampleResponse:
# 1. Validate audio (duration, format, quality)
is_valid, error_msg = validate_reference_audio(audio_path)
if not is_valid:
raise ValueError(f"Invalid reference audio: {error_msg}")
    # 2. Copy to profile directory
    sample_id = str(uuid.uuid4())
    profile_dir = profiles_dir / profile_id
    dest_path = profile_dir / f"{sample_id}.wav"
audio, sr = load_audio(audio_path)
save_audio(audio, str(dest_path), sr)
# 3. Create database record
db_sample = DBProfileSample(
id=sample_id,
profile_id=profile_id,
audio_path=str(dest_path),
reference_text=reference_text,
)
    db.add(db_sample)
    db.commit()
    return ProfileSampleResponse.model_validate(db_sample)
```
### Voice Prompt Creation
When generating speech, samples are combined into a voice prompt:
```python
async def create_voice_prompt_for_profile(
profile_id: str,
db: Session,
) -> dict:
samples = db.query(DBProfileSample).filter_by(profile_id=profile_id).all()
    if len(samples) == 1:
        # Single sample - use its audio and transcript directly
        voice_prompt, _ = await tts_model.create_voice_prompt(
            samples[0].audio_path,
            samples[0].reference_text,
        )
    else:
        # Multiple samples - combine into one prompt file first
        combined_audio_path, combined_text = await tts_model.combine_voice_prompts(
            [s.audio_path for s in samples],
            [s.reference_text for s in samples],
        )
        voice_prompt, _ = await tts_model.create_voice_prompt(
            combined_audio_path,
            combined_text,
        )
return voice_prompt
```
## Audio Validation
Reference audio is validated before being accepted:
- **Duration:** 3-30 seconds recommended
- **Format:** WAV, MP3, FLAC, OGG, M4A supported
- **Sample Rate:** Engine-specific — the audio utility resamples to whatever the active engine expects (Whisper uses 16 kHz, most TTS engines use 24 kHz, LuxTTS outputs 48 kHz). Resampling happens on the fly; the stored sample retains its original rate.
- **Channels:** Converted to mono if stereo
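A minimal sketch of the duration check, restricted to WAV via the standard library (the real validator also decodes MP3/FLAC/OGG/M4A and applies quality heuristics; the function name is illustrative):

```python
import wave

def validate_wav(path: str, min_s: float = 3.0, max_s: float = 30.0):
    """Return (is_valid, error_msg) for a WAV reference sample."""
    with wave.open(path, "rb") as w:
        duration = w.getnframes() / w.getframerate()
    if duration < min_s:
        return False, f"too short ({duration:.1f}s)"
    if duration > max_s:
        return False, f"too long ({duration:.1f}s)"
    return True, None
```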
## Export/Import
Profiles can be exported as ZIP archives for sharing:
<Files>
<Folder name="profile_export.zip" defaultOpen>
<File name="profile.json" />
<Folder name="samples">
<File name="sample_1.wav" />
<File name="sample_1.json" />
</Folder>
</Folder>
</Files>
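Assembling that archive layout is straightforward with `zipfile`. A sketch (the exact `profile.json` schema and sidecar JSON fields are assumptions, not the shipped format):

```python
import json
import zipfile
from pathlib import Path

def export_profile(profile: dict, samples: list[dict], out_path: str) -> None:
    """Write a profile export ZIP: profile.json plus samples/ entries."""
    with zipfile.ZipFile(out_path, "w") as zf:
        # Top-level metadata for the profile itself
        zf.writestr("profile.json", json.dumps(profile))
        for s in samples:
            name = Path(s["audio_path"]).name
            # Audio file plus a JSON sidecar carrying its transcript
            zf.write(s["audio_path"], f"samples/{name}")
            meta = {"reference_text": s["reference_text"]}
            zf.writestr(f"samples/{Path(name).stem}.json", json.dumps(meta))
```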
## API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/profiles` | List all profiles |
| POST | `/profiles` | Create a profile |
| GET | `/profiles/{id}` | Get profile by ID |
| PUT | `/profiles/{id}` | Update profile |
| DELETE | `/profiles/{id}` | Delete profile |
| GET | `/profiles/{id}/samples` | Get profile samples |
| POST | `/profiles/{id}/samples` | Add sample to profile |
| PUT | `/profiles/samples/{id}` | Update sample text |
| DELETE | `/profiles/samples/{id}` | Delete sample |
| GET | `/profiles/{id}/export` | Export as ZIP |
| POST | `/profiles/import` | Import from ZIP |
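A typical session over these endpoints, assuming the server is running locally on the default port (the JSON field names and multipart form keys are assumptions based on the models above, not verified against the API):

```shell
# Create a profile
curl -s -X POST http://localhost:17493/profiles \
  -H "Content-Type: application/json" \
  -d '{"name": "Narrator", "language": "en"}'

# Attach a reference sample to it (substitute the returned profile id)
curl -s -X POST http://localhost:17493/profiles/<profile_id>/samples \
  -F "audio=@sample.wav" \
  -F "reference_text=Exact transcript of the sample."

# Export the profile as a ZIP
curl -s -o narrator.zip http://localhost:17493/profiles/<profile_id>/export
```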
## Best Practices
### Sample Quality
- Use clean audio with minimal background noise
- Ensure the reference text exactly matches what is spoken
- Multiple samples (3-5) improve voice cloning quality
### Language Matching
- Set the profile language to match the reference audio
- Supported languages: en, zh, ja, ko, de, fr, ru, pt, es, it
### Naming Conventions
- Use descriptive names that identify the voice
- Avoid special characters that may cause filesystem issues

---
title: "Voicebox Documentation"
description: "Voicebox is a local-first voice cloning studio -- a free and open-source alternative to ElevenLabs."
---
Voicebox is a **local-first voice cloning studio** -- a free and open-source alternative to ElevenLabs. Clone voices from a few seconds of audio, generate speech in 23 languages across 7 TTS engines, apply post-processing effects, and compose multi-voice projects with a timeline editor.
![Voicebox App Screenshot](/images/app-screenshot-1.webp)
- **Complete privacy** -- models and voice data stay on your machine
- **7 TTS engines** -- Qwen3-TTS, Qwen CustomVoice, LuxTTS, Chatterbox Multilingual, Chatterbox Turbo, HumeAI TADA, and Kokoro
- **Cloning and preset voices** -- zero-shot cloning from a reference sample, or 50+ curated preset voices via Kokoro and Qwen CustomVoice
- **23 languages** -- from English to Arabic, Japanese, Hindi, Swahili, and more
- **Post-processing effects** -- pitch shift, reverb, delay, chorus, compression, and filters
- **Expressive speech** -- paralinguistic tags like `[laugh]`, `[sigh]`, `[gasp]` via Chatterbox Turbo; natural-language delivery control via Qwen CustomVoice
- **Unlimited length** -- auto-chunking with crossfade for scripts, articles, and chapters
- **Stories editor** -- multi-track timeline for conversations, podcasts, and narratives
- **API-first** -- REST API for integrating voice synthesis into your own projects
- **Native performance** -- built with Tauri (Rust), not Electron
- **Runs everywhere** -- macOS (MLX/Metal), Windows (CUDA), Linux, AMD ROCm, Intel Arc, Docker
## Download
| Platform | Download |
|----------|----------|
| macOS (Apple Silicon) | [Download DMG](https://voicebox.sh/download/mac-arm) |
| macOS (Intel) | [Download DMG](https://voicebox.sh/download/mac-intel) |
| Windows | [Download MSI](https://voicebox.sh/download/windows) |
| Docker | `docker compose up` |
[View all releases](https://github.com/jamiepine/voicebox/releases/latest)
## Get Started
- [Installation](/overview/installation) -- download and install Voicebox
- [Quick Start](/overview/quick-start) -- get up and running in 5 minutes
- [API Reference](/api-reference) -- integrate voice synthesis into your apps

{
"title": "Voicebox Documentation",
"pages": ["overview", "api-reference", "developer"]
}

---
title: "Building Stories"
description: "Create multi-voice narratives with the Stories Editor"
---
## Getting Started
The Stories Editor is perfect for creating podcasts, audiobooks, and multi-speaker content.
<Steps>
<Step title="Create Story">
**Stories** → **+ New Story**
</Step>
<Step title="Add Tracks">
Create tracks for each speaker
</Step>
<Step title="Add Clips">
Generate or drag audio to tracks
</Step>
<Step title="Arrange">
Position and trim clips on timeline
</Step>
<Step title="Export">
Render final audio
</Step>
</Steps>
## Use Cases
- Multi-host podcasts
- Audiobook narration with character voices
- Game dialogue scenes
- Educational content with multiple speakers
## Coming Soon
Full timeline editor documentation will be added as features are finalized.

---
title: "Creating Voice Profiles"
description: "How to create voice profiles, both cloning-based and preset-based"
---
## Overview
A **voice profile** is a saved voice you can reuse across generations, stories, and the API. As of 0.4, Voicebox profiles come in two flavors that map to two different ways of getting a voice:
| Profile type | What it stores | Use when… |
| -------------- | ---------------------------------------------------- | -------------------------------------------------------- |
| **Cloned** | One or more reference audio samples + a voice embedding | You want to replicate a specific person's voice |
| **Preset** | A reference to a pre-built voice in a specific engine | You want a curated, production-ready voice with no audio prep |
Both types live in the same Profiles tab and behave the same way at generation time — pick the type that matches your goal and follow the workflow below.
<Callout type="info">
Not sure which to use? Cloning reproduces a *specific* person's voice but needs clean reference audio. Presets give you polished voices instantly, but you choose from a fixed catalog rather than a particular speaker.
</Callout>
## Workflow A — Cloned Profiles
Use this when you want to replicate a specific person's voice from a recording.
<Steps>
<Step title="Prepare Audio">
10-30 seconds of clear speech, minimal background noise. See [Voice Cloning](/overview/voice-cloning) for the engine catalog.
</Step>
<Step title="Create Profile">
**Profiles** → **+ New Profile** → choose a cloning engine (Qwen3-TTS, Chatterbox Multilingual, Chatterbox Turbo, LuxTTS, or TADA)
</Step>
<Step title="Upload or Record Sample">
Drag in an audio file, or record directly with the in-app recorder
</Step>
<Step title="Generate to Test">
Use the profile to generate a test phrase. If quality is poor, add more samples
</Step>
</Steps>
### Audio Requirements (Cloning Only)
<Cards>
<Card title="Duration">
**10-30 seconds**
Too short: Poor quality
Too long: Unnecessary
</Card>
<Card title="Clarity">
**Clear speech**
No background noise
No music or overlapping voices
</Card>
<Card title="Quality">
**High fidelity**
44.1 kHz or 48 kHz sample rate
Minimal compression
</Card>
<Card title="Content">
**Natural speech**
Conversational tone
Complete sentences
</Card>
</Cards>
### File Formats
Supported formats:
- **WAV** (recommended) — Lossless quality
- **MP3** — Acceptable, minimal compression
- **M4A** — Acceptable
- **FLAC** — Lossless alternative
<Callout type="info">
Use WAV for best results. Avoid heavily compressed formats.
</Callout>
### Recording Tips
<AccordionGroup>
<Accordion title="Quiet Space">
- Record in a quiet room
- Turn off fans, AC, appliances
- Close windows to reduce outside noise
- Use soft furnishings to reduce echo
</Accordion>
<Accordion title="Microphone Placement">
- 6-12 inches from mouth
- Slight angle to reduce plosives (p, b, t)
- Use a pop filter if available
- Maintain consistent distance
</Accordion>
<Accordion title="Recording Settings">
- 44.1 kHz or 48 kHz sample rate
- 16-bit or 24-bit depth
- Mono is fine (stereo will be converted)
- Avoid automatic gain control
</Accordion>
</AccordionGroup>
### Speaking Style
- **Natural pace** — Don't rush or speak too slowly
- **Clear articulation** — Pronounce words clearly
- **Consistent volume** — Maintain steady loudness
- **Normal tone** — Speak as you normally would
- **Complete sentences** — Avoid fragments or "ums"
### Multiple Samples
Adding multiple samples can significantly improve quality:
<Cards>
<Card title="Robustness">
Model learns a more complete representation
</Card>
<Card title="Versatility">
Handles different speaking styles better
</Card>
<Card title="Quality">
Reduces artifacts and improves naturalness
</Card>
<Card title="Consistency">
More reliable across different texts
</Card>
</Cards>
Consider adding samples with:
1. **Different tones** — casual, formal, excited, calm
2. **Different content** — narratives, questions, statements
3. **Different recording conditions** — studio quality, room acoustics
<Callout type="warn">
All samples should be from the **same speaker**. Mixing voices will produce poor results.
</Callout>
### Processing Existing Audio
If you have existing audio (podcasts, videos, etc.):
<Steps>
<Step title="Find Clean Speech">
Look for segments with just the target speaker, no background music, minimal noise
</Step>
<Step title="Use Audio Editor">
Tools like Audacity or Adobe Audition: cut clean 10-30s segments, remove silence at start/end, normalize volume
</Step>
<Step title="Export as WAV">
Save as high-quality WAV file
</Step>
</Steps>
For light background noise, use Audacity's noise reduction (gentle settings — over-processing introduces artifacts).
### Testing & Iteration
After creating a cloned profile:
<Steps>
<Step title="Generate Test">
Try a simple phrase: `"Hello, this is a test of my voice profile."`
</Step>
<Step title="Evaluate Quality">
Listen for natural tone, clear pronunciation, proper prosody, lack of artifacts
</Step>
<Step title="Iterate">
If quality is poor: add more samples, try different source audio, check sample quality
</Step>
</Steps>
#### Common Issues
<AccordionGroup>
<Accordion title="Robotic Voice">
**Cause**: Poor quality samples or too short
**Fix**: Use longer, higher-quality samples
</Accordion>
<Accordion title="Wrong Tone">
**Cause**: Sample tone doesn't match desired output
**Fix**: Record samples in the style you want to generate
</Accordion>
<Accordion title="Artifacts/Glitches">
**Cause**: Background noise or audio issues in samples
**Fix**: Clean up samples or re-record in quieter environment
</Accordion>
</AccordionGroup>
## Workflow B — Preset Profiles
Use this when you want a ready-made voice without recording anything. Available engines: **Kokoro 82M** (50 voices) and **Qwen CustomVoice** (9 voices). See [Preset Voices](/overview/preset-voices) for the full catalog.
<Steps>
<Step title="Create Profile">
**Profiles** → **+ New Profile** → choose **Kokoro** or **Qwen CustomVoice** as the engine
</Step>
<Step title="Pick a Voice">
The engine's voice catalog appears. Click any voice to preview it
</Step>
<Step title="Name and Save">
Give the profile a name. No audio sample required
</Step>
<Step title="Generate">
The profile is ready immediately — use it in the floating generate box or Generate page
</Step>
</Steps>
<Callout type="info">
Preset profiles are **locked to their source engine**. Switching to a different engine in the floating generate box greys out the profile, since the voice only exists in that engine. Clicking a greyed profile auto-switches the engine back.
</Callout>
### Qwen CustomVoice + Instruct
Preset voices in Qwen CustomVoice support **delivery instructions** — natural-language style control over tone, pace, and emotion. The floating generate box shows a slider icon next to the generate button when a Qwen CustomVoice profile is selected; click it to reveal the instruct textarea.
See [Preset Voices → Using Instruct Mode](/overview/preset-voices#using-instruct-mode) for examples.
## Advanced Tips
### Celebrity / Character Voices (Cloning)
For cloning public figures or characters:
1. **Legal considerations** — Ensure you have rights or it's clearly fair use
2. **Source quality** — Find high-quality interview audio or clean clips
3. **Consistency** — Use clips where they speak similarly
4. **Multiple samples** — Very important for recognizable voices
### Accent & Dialect (Cloning)
Cloning models preserve accent and dialect:
- British English samples generate British English output
- Southern accent samples produce Southern accent output
- Regional pronunciations are maintained
### Emotion Transfer (Cloning)
The emotional tone of samples affects generation:
- Energetic samples → energetic output
- Calm samples → calm output
- Mix samples for a more versatile profile
For Qwen CustomVoice presets, use the **instruct** field instead of relying on sample emotion — that's exactly what it controls.
## Managing Profiles
### Organization
- **Descriptive names** — "John Smith - Professional Narrator"
- **Add descriptions** — Note recording conditions, use cases, or which preset voice
- **Language tags** — Mark the primary language
- **Archive unused** — Keep profile list manageable
### Export / Import
- **Export** profiles to share or backup
- **Import** from colleagues or teammates
- **Cloned profiles** export with their voice embeddings (not the original audio)
- **Preset profiles** export as engine + voice ID metadata only — the importer must have that engine's model installed
## Next Steps
<Cards>
<Card title="Voice Cloning" href="/overview/voice-cloning">
Engine catalog and best practices for cloning
</Card>
<Card title="Preset Voices" href="/overview/preset-voices">
Full catalog of Kokoro and Qwen CustomVoice voices
</Card>
<Card title="Generate Speech" href="/overview/generating-speech">
Use your profile to generate speech
</Card>
<Card title="Build Stories" href="/overview/building-stories">
Create multi-voice narratives
</Card>
</Cards>

---
title: "Docker Deployment"
description: "Run Voicebox as a headless server with a web UI using Docker"
---
## Overview
Voicebox can run as a Docker container with a full web UI -- no desktop app required. This is ideal for headless servers, shared GPU machines, or self-hosted deployments.
## Quick Start
```bash
git clone https://github.com/jamiepine/voicebox.git
cd voicebox
docker compose up
```
Open [http://localhost:17493](http://localhost:17493) in your browser. The full Voicebox UI is served directly from the backend.
<Callout type="info">
The first build takes a few minutes (compiling the frontend, installing Python dependencies). Subsequent starts are fast thanks to Docker layer caching.
</Callout>
## How It Works
The Docker image uses a 3-stage build:
1. **Frontend** -- builds the React SPA with Bun and Vite
2. **Backend** -- installs Python dependencies and TTS model packages
3. **Runtime** -- combines both into a minimal image running the FastAPI server
The backend serves the web UI automatically when the built frontend is present. All API routes work exactly as they do in the desktop app.
## Configuration
### docker-compose.yml
The default `docker-compose.yml` binds to localhost only, mounts persistent volumes for data and model cache, and sets sensible resource limits:
```yaml
services:
voicebox:
build: .
container_name: voicebox
restart: unless-stopped
ports:
- "127.0.0.1:17493:17493"
volumes:
- ./output:/app/data/generations
- voicebox-data:/app/data
- huggingface-cache:/home/voicebox/.cache/huggingface
environment:
- LOG_LEVEL=info
deploy:
resources:
limits:
cpus: '4'
memory: 8G
```
### Exposing to Your Network
By default the container only listens on `127.0.0.1`. To allow other machines on your network to connect, change the port binding:
```yaml
ports:
- "0.0.0.0:17493:17493"
```
<Callout type="warn">
The API has no built-in authentication. Only expose to trusted networks, or put a reverse proxy with auth in front of it.
</Callout>
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `LOG_LEVEL` | `info` | Logging verbosity (`debug`, `info`, `warning`, `error`) |
| `VOICEBOX_MODELS_DIR` | (HuggingFace cache) | Custom path for model storage |
| `VOICEBOX_CORS_ORIGINS` | (local origins) | Additional CORS origins, comma-separated |
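For example, raising log verbosity and relocating model storage in `docker-compose.yml` (the host path is illustrative):

```yaml
services:
  voicebox:
    environment:
      - LOG_LEVEL=debug
      - VOICEBOX_MODELS_DIR=/models
    volumes:
      - /mnt/shared/models:/models
```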
### Resource Limits
The default compose file limits the container to 4 CPUs and 8GB RAM. Adjust these based on your hardware:
```yaml
deploy:
resources:
limits:
cpus: '8'
memory: 16G
```
<Callout type="info">
TTS model inference is memory-intensive. 8GB is the minimum for running a single engine. 16GB+ is recommended if you want multiple engines loaded simultaneously.
</Callout>
## Volumes
| Volume | Container Path | Purpose |
|--------|---------------|---------|
| `./output` | `/app/data/generations` | Generated audio files (bind-mount, easy access from host) |
| `voicebox-data` | `/app/data` | Profiles, database, cache |
| `huggingface-cache` | `/home/voicebox/.cache/huggingface` | Downloaded models (persists across rebuilds) |
The `huggingface-cache` volume is important -- without it, models would be re-downloaded every time the container is rebuilt.
## GPU Acceleration
### NVIDIA GPU (CUDA)
To use your NVIDIA GPU inside the container, install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) and add GPU access to your compose file:
```yaml
services:
voicebox:
build: .
# ... existing config ...
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
```
### AMD GPU (ROCm)
For AMD GPUs, use the ROCm runtime:
```yaml
services:
voicebox:
build: .
# ... existing config ...
devices:
- /dev/kfd
- /dev/dri
group_add:
- video
```
### CPU Only
The default configuration runs on CPU. This works fine but generation will be slower. LuxTTS is the fastest engine on CPU (150x realtime).
## Security
The Docker image follows security best practices:
- **Non-root user** -- the server runs as `voicebox`, not `root`
- **Localhost binding** -- only accessible from the host machine by default
- **Health checks** -- automatic restart if the server hangs (`/health` endpoint polled every 30s)
- **CORS restricted** -- only local origins allowed by default
### Running Behind a Reverse Proxy
For production deployments, put Voicebox behind nginx or Caddy with TLS and authentication:
```nginx
server {
listen 443 ssl;
server_name voicebox.example.com;
ssl_certificate /etc/ssl/certs/voicebox.pem;
ssl_certificate_key /etc/ssl/private/voicebox.key;
auth_basic "Voicebox";
auth_basic_user_file /etc/nginx/.htpasswd;
location / {
proxy_pass http://127.0.0.1:17493;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}
```
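To create the `.htpasswd` file referenced above without installing Apache tools, the hash can be generated with `openssl` (credentials here are placeholders):

```shell
# Generate an apr1 (htpasswd-compatible) entry for user "alice"
printf 'alice:%s\n' "$(openssl passwd -apr1 's3cret')" > .htpasswd
# Then move it into place, e.g.: sudo mv .htpasswd /etc/nginx/.htpasswd
```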
## Troubleshooting
### Container starts but UI shows JSON
If you see `{"message": "voicebox API", ...}` instead of the web UI, the frontend build may have failed during the Docker build. Check the build logs:
```bash
docker compose build --no-cache
```
Look for errors in the "Build frontend" stage.
### Models downloading on every restart
Make sure the `huggingface-cache` volume is configured. Without it, the model cache is lost when the container stops:
```yaml
volumes:
- huggingface-cache:/home/voicebox/.cache/huggingface
```
### Out of memory
TTS models are large. If the container is killed by the OOM killer, increase the memory limit:
```yaml
deploy:
resources:
limits:
memory: 16G
```
### Port already in use
```bash
# Check what's using port 17493
lsof -i :17493
```
Or map a different host port in `docker-compose.yml`:
```yaml
ports:
  - "127.0.0.1:8080:17493"
```
## Prebuilt Images (Coming Soon)
We plan to publish prebuilt Docker images to GitHub Container Registry so you won't need to build locally:
```bash
# Not available yet — coming in a future release
docker run -p 17493:17493 ghcr.io/jamiepine/voicebox:latest
```
The CPU image will be ~3-4 GB (Python + PyTorch + TTS packages). A separate CUDA tag (~6-8 GB) will be available for NVIDIA GPU users. This is normal for ML containers.
For now, use `docker compose up` to build from source as described above.
## Connecting the Desktop App
You can also use the desktop app as a frontend for a Docker-hosted backend. In the desktop app, go to **Settings -> Server**, enable **Remote Mode**, and enter `http://<server-ip>:17493`.
See the [Remote Mode guide](/overview/remote-mode) for details.

---
title: "Generating Speech"
description: "Generate high-quality speech from text"
---
## Basic Generation
<Steps>
<Step title="Select Profile">
Choose a voice profile from the dropdown
</Step>
<Step title="Enter Text">
Type or paste your text
</Step>
<Step title="Generate">
Click **Generate** and wait a few seconds
</Step>
<Step title="Play & Export">
Preview and download the result
</Step>
</Steps>
## Text Formatting Tips
The way you format text affects the output quality.
### Punctuation
Use proper punctuation for natural pauses:
```
Good: "Hello! How are you today? I'm doing great."
Bad: "Hello how are you today Im doing great"
```
### Emphasis
Use formatting to suggest emphasis:
```
- ALL CAPS for louder/emphasized: "That was AMAZING!"
- Italics for subtle emphasis: "I *really* enjoyed that"
- Bold for strong emphasis: "This is **very** important"
```
<Callout type="info">
The model interprets these hints but results may vary.
</Callout>
## Advanced Features
### Batch Generation
For long-form content, split into smaller chunks for better control and faster processing.
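The idea behind chunking can be sketched as a greedy sentence packer (a simplification; the actual auto-chunker also picks crossfade points between chunks):

```python
import re

def chunk_text(text: str, max_chars: int = 400) -> list[str]:
    """Split text into sentence-aligned chunks of at most max_chars."""
    # Split after sentence-ending punctuation followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks
```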
### Voice Caching
Voicebox caches voice prompts for faster re-generation with the same profile.
## Coming Soon
- Real-time streaming
- Word-level timing control
- Emotion and style controls
- SSML support

---
title: "Generation History"
description: "Track and manage all your generated audio"
---
## Overview
Voicebox keeps a complete history of all generated audio, making it easy to find, reuse, and manage your creations.
## Features
<Cards>
<Card title="Full History" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><circle cx="12" cy="12" r="10"/><polyline points="12 6 12 12 16 14"/></svg>}>
Every generation is automatically saved
</Card>
<Card title="Search & Filter" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><circle cx="11" cy="11" r="8"/><path d="m21 21-4.3-4.3"/></svg>}>
Find by text, voice, or date
</Card>
<Card title="Re-generate" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M3 12a9 9 0 0 1 9-9 9.75 9.75 0 0 1 6.74 2.74L21 8"/><path d="M21 3v5h-5"/><path d="M21 12a9 9 0 0 1-9 9 9.75 9.75 0 0 1-6.74-2.74L3 16"/><path d="M8 16H3v5"/></svg>}>
Regenerate any past generation with one click
</Card>
<Card title="Export" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4"/><polyline points="7 10 12 15 17 10"/><line x1="12" y1="15" x2="12" y2="3"/></svg>}>
Download individual or batch exports
</Card>
</Cards>
## Viewing History
Navigate to the **History** tab to see all your generations.
Each entry shows:
- Generated text
- Voice profile used
- Timestamp
- Audio duration
- Language
## Actions
### Play
Click any generation to play it immediately.
### Re-generate
Regenerate with the same settings or modify the text/voice.
### Download
Export as WAV, MP3, or M4A.
### Delete
Remove unwanted generations to free up space.
### Add to Story
Drag generations to the Stories Editor timeline.
## Search & Filter
<Tabs items={["By Text", "By Voice", "By Date"]}>
<Tab value="By Text">
Search for specific text content
```
"Hello world"
```
</Tab>
<Tab value="By Voice">
Filter by voice profile
```
Select from dropdown
```
</Tab>
<Tab value="By Date">
Filter by date range
```
Last 7 days, Last 30 days, Custom range
```
</Tab>
</Tabs>
## Storage
History is stored locally:
- **macOS**: `~/Library/Application Support/sh.voicebox.app/data/`
- **Windows**: `%APPDATA%/sh.voicebox.app/data/`
- **Linux**: `~/.config/sh.voicebox.app/data/`
<Callout type="warn">
Deleting the data directory will remove all history. Export important files first.
</Callout>

---
title: "GPU Acceleration"
description: "How Voicebox uses your GPU — auto-detection, manual setup, troubleshooting"
---
## Overview
Voicebox auto-detects available accelerators on first launch and picks the fastest backend it can use. For most people this just works — open the app and you're already on the right backend.
This page is for the cases where it doesn't:
- You have a GPU but Voicebox is running on CPU
- You upgraded GPUs (especially to RTX 50-series / Blackwell) and generation broke
- You want to switch backends manually (e.g. force MLX over PyTorch on Apple Silicon)
- You see `[UNSUPPORTED - see logs]` next to your GPU in Settings
## Backend Matrix
| Platform | Auto-selected backend | Notes |
| --------------------------- | ------------------------- | ---------------------------------------------------- |
| **macOS Apple Silicon** | MLX (Metal) | 4-5x faster than PyTorch via Apple Neural Engine |
| **macOS Intel** | PyTorch CPU | No GPU acceleration available; PyTorch ≥ 2.2 only |
| **Windows + NVIDIA** | PyTorch CUDA (cu128) | Auto-downloads the CUDA backend binary on first use |
| **Windows + Intel Arc** | PyTorch XPU (IPEX) | New in 0.4 — works with Arc A-series and B-series |
| **Windows generic GPU** | DirectML | Universal Windows GPU support; slower than CUDA |
| **Linux + NVIDIA** | PyTorch CUDA (cu128) | Same auto-download flow as Windows |
| **Linux + AMD** | PyTorch ROCm | Auto-configures `HSA_OVERRIDE_GFX_VERSION` |
| **Linux + Intel Arc** | PyTorch XPU (IPEX) | |
| **Any (no GPU)** | PyTorch CPU | Works everywhere; expect 5-50x slower than GPU |
The detected backend is shown in Settings → GPU. Logs at startup also print the chosen backend and the device name.
## Apple Silicon — MLX vs PyTorch
On M-series Macs, Voicebox ships an MLX-optimized backend that uses the Apple Neural Engine. It's **4-5x faster** than the PyTorch (CPU/Metal) path for supported engines.
| Engine | MLX support | Notes |
| -------------------- | ----------- | ------------------------------------------- |
| Qwen3-TTS | ✅ Native | Uses MLX exclusively when available |
| Chatterbox / Turbo | PyTorch MPS | Falls back to Metal via PyTorch |
| LuxTTS | PyTorch MPS | |
| TADA | PyTorch MPS | |
| Kokoro | PyTorch MPS | Requires `PYTORCH_ENABLE_MPS_FALLBACK=1` |
| Qwen CustomVoice | PyTorch MPS | |
| Whisper (transcribe) | ✅ Native | MLX-Whisper is the default on Apple Silicon |
The Whisper Turbo + MLX combo dropped transcription latency from ~20s to ~2-3s on M-series chips (see CHANGELOG entry for v0.1.10).
## Windows / Linux + NVIDIA — The CUDA Backend Swap
Voicebox doesn't bundle CUDA into the main installer (it would balloon downloads to multi-gigabyte territory for users who don't have an NVIDIA GPU). Instead, when you first need it, the app downloads a separate **CUDA backend binary** that contains the PyTorch + CUDA runtime.
<Steps>
<Step title="Open Settings → GPU">
If an NVIDIA GPU is detected, you'll see "Install CUDA backend" in the GPU panel
</Step>
<Step title="Click Install">
The app downloads two archives separately:
- **Server core** (~200-400 MB) — versioned with each Voicebox release
- **CUDA libs** (~4 GB) — the heavy PyTorch + CUDA DLLs, versioned independently
</Step>
<Step title="Restart">
Voicebox restarts to swap in the CUDA backend
</Step>
</Steps>
<Callout type="info">
The split-archive design (added in v0.4) means most Voicebox upgrades only redownload the small server-core archive. The 4 GB libs archive is only refreshed when the underlying CUDA toolkit or torch major version changes.
</Callout>
### Auto-update
When a new Voicebox release ships, the GPU panel checks if the bundled server-core matches the installed CUDA version. If only the core changed (typical), it pulls the new core in the background. If the libs version changed (rare — only happens on cu126 → cu128 type bumps), you'll be prompted to confirm the larger download.
## RTX 50-series / Blackwell
Voicebox 0.4 added explicit RTX 50-series support:
- CUDA toolkit upgraded to **cu128** (previous releases used cu126 which lacks Blackwell kernels)
- Build pinned with `TORCH_CUDA_ARCH_LIST=...12.0+PTX` for forward-compatibility
If you're on an RTX 5070 / 5080 / 5090 and you see "no kernel image is available" errors:
1. Make sure you're on Voicebox **≥ 0.4.0** (Settings → About)
2. Reinstall the CUDA backend (Settings → GPU → Reinstall CUDA backend) — older installs may have stale cu126 libs
3. If errors persist, see the GPU compatibility warnings section below
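To confirm which CUDA architectures your installed PyTorch build actually targets, run this inside the backend's Python environment (Blackwell support requires `sm_120` in the list):

```shell
python -c "import torch; print(torch.cuda.get_arch_list())" \
  || echo "PyTorch not importable in this environment"
```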
## Intel Arc (XPU)
New in 0.4. Works with both Arc A-series (Alchemist: A380, A580, A750, A770) and B-series (Battlemage).
### Setup
Voicebox auto-detects Arc GPUs and routes through Intel's PyTorch XPU backend (powered by IPEX — Intel Extension for PyTorch). No extra installation step beyond the standard Voicebox install.
Verify it's working:
- Settings → GPU should show **XPU** followed by your Arc model name (e.g. `XPU (Intel Arc A770)`)
- Startup logs print `Backend: PYTORCH` and `GPU: XPU (Intel Arc ...)`
### Engines on XPU
All PyTorch-based engines work on XPU. Performance is generally between CPU and CUDA — expect ~2-3x speedup over CPU for the larger models.
## DirectML
The fallback for Windows users with non-NVIDIA, non-Intel-Arc GPUs (older AMD discrete, integrated GPUs, etc.). Slower than CUDA and XPU but provides some acceleration over CPU.
Auto-selected when no other GPU backend is available.
## AMD ROCm (Linux)
ROCm provides PyTorch GPU acceleration on AMD discrete GPUs. Voicebox auto-configures `HSA_OVERRIDE_GFX_VERSION` for common cards that need the override.
### Verifying
```bash
# In a terminal
echo $HSA_OVERRIDE_GFX_VERSION
# Should show e.g. 10.3.0 for RX 6000 series
```
If detection fails, set the variable manually before launching Voicebox:
```bash
export HSA_OVERRIDE_GFX_VERSION=10.3.0
voicebox
```
Common values:
- `10.3.0` — RX 6000 series (RDNA 2)
- `11.0.0` — RX 7000 series (RDNA 3)
- `9.0.0` — Older Vega cards
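The auto-configuration amounts to a lookup like the following sketch. The function name and the name-based matching are illustrative only; Voicebox's real detection inspects the GPU device rather than a model-name string:

```bash
# Sketch of the override lookup (illustrative; not Voicebox's actual detection)
pick_hsa_override() {
  case "$1" in
    rx6*)  echo "10.3.0" ;;  # RDNA 2
    rx7*)  echo "11.0.0" ;;  # RDNA 3
    vega*) echo "9.0.0"  ;;  # Vega
    *)     echo ""       ;;  # unknown: leave unset
  esac
}
export HSA_OVERRIDE_GFX_VERSION="$(pick_hsa_override rx6800)"
echo "$HSA_OVERRIDE_GFX_VERSION"   # 10.3.0
```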
## GPU Compatibility Warnings
Voicebox 0.4 added a runtime check that compares your GPU's compute capability against the architectures the bundled PyTorch was compiled for. If they don't match, you'll see:
- A startup log line: `WARNING: GPU COMPATIBILITY: <your GPU> is not supported by this PyTorch build...`
- The GPU label in Settings shows `[UNSUPPORTED - see logs]`
- The `/health` API returns a populated `gpu_compatibility_warning` field
### What to do
The most common trigger is a brand-new GPU architecture that pre-built PyTorch wheels don't yet cover natively. In order of preference:
1. **Update Voicebox** — newer releases ship newer PyTorch with broader arch support
2. **Reinstall the CUDA backend** — Settings → GPU → Reinstall CUDA backend
3. **For bleeding-edge GPUs (newer than current Blackwell):** install PyTorch nightly manually:
```bash
pip install torch --index-url https://download.pytorch.org/whl/nightly/cu128 --force-reinstall
```
Then point Voicebox at that environment via [Remote Mode](/overview/remote-mode) until stable PyTorch catches up.
4. **Fall back to CPU** temporarily — set `VOICEBOX_FORCE_CPU=1` before launching
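The runtime check behind these warnings is essentially set membership: the GPU's compute architecture must appear in the list the bundled PyTorch was compiled for. A sketch of the idea (the arch values below are examples, not what Voicebox actually ships):

```bash
# Illustrative mismatch check; both values are examples, not the real build's
GPU_ARCH="sm_120"                 # e.g. compute capability 12.0 (Blackwell)
BUILD_ARCHS="sm_80 sm_86 sm_90"   # what this hypothetical PyTorch build targets
case " $BUILD_ARCHS " in
  *" $GPU_ARCH "*) echo "GPU supported by this build" ;;
  *) echo "WARNING: GPU COMPATIBILITY: $GPU_ARCH is not supported by this PyTorch build" ;;
esac
```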
## CPU-Only Fallback
When no GPU is available (or you've forced it off), Voicebox runs the PyTorch CPU backend. Expect:
- 5-50x slower generation depending on engine and text length
- Heavy CPU usage during generation
- Some engines work better than others on CPU:
- **Kokoro 82M** — runs at realtime on modern CPUs
- **LuxTTS** — exceeds 150x realtime on CPU
- **Chatterbox Turbo (350M)** — usable but slow
- Larger models (Qwen 1.7B, Chatterbox Multilingual, TADA 3B) — painful
For CPU-bound use cases, prefer the smaller, lighter engines.
## Verifying Your Setup
Three places to check that the right backend is being used:
<Steps>
<Step title="Settings → GPU">
Shows the detected backend, GPU model, and VRAM (when applicable). Look for the `[UNSUPPORTED - see logs]` suffix
</Step>
<Step title="Settings → Logs">
The "Server logs" tab shows the startup banner with `Backend: <type>` and `GPU: <name>`
</Step>
<Step title="Health endpoint">
`curl http://localhost:17493/health` returns a JSON payload with `backend_type`, `backend_variant`, and `gpu_compatibility_warning` (when applicable)
</Step>
</Steps>
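For scripted checks, the same fields can be pulled out of the health payload. The JSON below is a sample of the shape described above standing in for live output:

```bash
# Normally you'd pipe the live endpoint:
#   curl -s http://localhost:17493/health | python3 -c '...'
# A sample payload stands in for the server here
echo '{"backend_type": "PYTORCH", "backend_variant": "cuda", "gpu_compatibility_warning": null}' |
python3 -c '
import json, sys
health = json.load(sys.stdin)
warning = health.get("gpu_compatibility_warning")
if warning:
    print("compatibility warning:", warning)
else:
    print("backend:", health["backend_type"], "/", health["backend_variant"])
'
```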
## Troubleshooting
<AccordionGroup>
<Accordion title="Settings shows CPU instead of my GPU">
- On NVIDIA: install the CUDA backend (Settings → GPU)
- On Intel Arc: confirm IPEX detection in startup logs; restart the app after a driver update
- On AMD Linux: check `HSA_OVERRIDE_GFX_VERSION` is set
</Accordion>
<Accordion title="'no kernel image is available' / 'CUDA error'">
Almost always means the bundled PyTorch doesn't have kernels for your GPU's compute capability.
1. Update to Voicebox ≥ 0.4.0 (Blackwell support added there)
2. Reinstall the CUDA backend
3. If still broken, install PyTorch nightly via Remote Mode
</Accordion>
<Accordion title="Out of memory (CUDA)">
- Switch to a smaller model size (e.g. Qwen3 0.6B instead of 1.7B)
- Use Settings → Models to unload other engines you're not using
- `low_cpu_mem_usage` is already enabled for CPU loads; on CUDA, the engine's `device_map` handles offload automatically
- Close other GPU applications
</Accordion>
<Accordion title="MPS fallback errors on macOS">
Some operations don't have a Metal implementation. Voicebox sets `PYTORCH_ENABLE_MPS_FALLBACK=1` for engines that need it (notably Kokoro), but if you launch from a custom env, set it manually:
```bash
export PYTORCH_ENABLE_MPS_FALLBACK=1
```
</Accordion>
<Accordion title="Generation works but is slow on my GPU">
- Check Settings → GPU shows your GPU (not CPU)
- Check VRAM usage — you may be paging to system memory
- Try a smaller model
- For NVIDIA: confirm cu128 is installed (Settings → GPU → version)
</Accordion>
</AccordionGroup>
## Next Steps
<Cards>
<Card title="Remote Mode" href="/overview/remote-mode">
Run the backend on a different machine with a stronger GPU
</Card>
<Card title="Model Management" href="/developer/model-management">
Unload models to free GPU memory
</Card>
<Card title="Troubleshooting" href="/overview/troubleshooting">
General troubleshooting beyond GPU
</Card>
</Cards>
---
title: "Installation"
description: "Download and install Voicebox on macOS, Windows, or Linux"
---
## Download
Voicebox is available for macOS and Windows, with Linux builds coming soon.
<Cards>
<Card title="macOS" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M12 2c-1.5 0-2.8.4-3.9 1.1A5.5 5.5 0 0 0 4 2.5C2.5 2.5 1 4 1 6c0 3.5 2.5 6 5 7.5C5 16 4 18 4 20c0 1.5.5 2.5 1.5 3C6.5 23.5 8 24 9.5 24c2 0 3.5-.5 5-2 1.5 1.5 3 2 5 2 1.5 0 3-.5 4-1 1-.5 1.5-1.5 1.5-3 0-2-1-4-2-6.5 2.5-1.5 5-4 5-7.5 0-2-1.5-3.5-3-3.5-.9 0-2.1.4-3.1 1.1A6.5 6.5 0 0 0 12 2Z"/></svg>}>
Download for Apple Silicon or Intel Macs
</Card>
<Card title="Windows" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><rect x="2" y="3" width="20" height="14" rx="2" ry="2"/><line x1="8" y1="21" x2="16" y2="21"/><line x1="12" y1="17" x2="12" y2="21"/></svg>}>
Download MSI installer or Setup executable
</Card>
</Cards>
### macOS
<Tabs items={["Apple Silicon", "Intel"]}>
<Tab value="Apple Silicon">
Download: [voicebox_aarch64.app.tar.gz](https://github.com/jamiepine/voicebox/releases/latest/download/voicebox_aarch64.app.tar.gz)
```bash
# Extract the archive
tar -xzf voicebox_aarch64.app.tar.gz
# Move to Applications
mv Voicebox.app /Applications/
```
</Tab>
<Tab value="Intel">
Download: [voicebox_x64.app.tar.gz](https://github.com/jamiepine/voicebox/releases/latest/download/voicebox_x64.app.tar.gz)
```bash
# Extract the archive
tar -xzf voicebox_x64.app.tar.gz
# Move to Applications
mv Voicebox.app /Applications/
```
</Tab>
</Tabs>
### Windows
<Tabs items={["MSI Installer", "Setup Executable"]}>
<Tab value="MSI Installer">
Download: [voicebox_x64_en-US.msi](https://github.com/jamiepine/voicebox/releases/latest/download/voicebox_x64_en-US.msi)
Double-click the MSI file and follow the installation wizard.
</Tab>
<Tab value="Setup Executable">
Download: [voicebox_x64-setup.exe](https://github.com/jamiepine/voicebox/releases/latest/download/voicebox_x64-setup.exe)
Run the executable and follow the installation wizard.
</Tab>
</Tabs>
### Linux
<Callout type="info">
Linux builds are coming soon. Currently blocked by GitHub runner disk space limitations.
</Callout>
## First Launch
When you launch Voicebox for the first time:
1. **Model Download** — The TTS engine you generate with first will download its model automatically. Sizes range from ~350 MB (Kokoro) to ~8 GB (TADA 3B). Most users start with Qwen 1.7B (~3.5 GB).
2. **Data Directory** — Voice profiles and generated audio are stored in:
- macOS: `~/Library/Application Support/sh.voicebox.app/`
- Windows: `%APPDATA%/sh.voicebox.app/`
- Linux: `~/.config/sh.voicebox.app/`
3. **Backend Server** — The bundled Python server starts automatically
<Callout type="info">
First generation will be slower due to model downloads. Subsequent runs use cached models.
</Callout>
## System Requirements
### Minimum
- **OS:** macOS 11+, Windows 10+, or Linux
- **RAM:** 8GB
- **Storage:** 5GB free space (for models and data)
- **CPU:** Modern multi-core processor
### Recommended
- **RAM:** 16GB+
- **GPU:** CUDA-capable NVIDIA GPU (for faster generation)
- **Storage:** 10GB+ free space
<Callout type="info">
CPU inference is supported but significantly slower than GPU. A CUDA-capable GPU is highly recommended for real-time workflows.
</Callout>
## Verification
After installation, verify everything works:
1. Launch Voicebox
2. Check the server status indicator in the bottom-left corner (should be green)
3. Navigate to **Profiles** and create a test profile
4. Generate a short audio clip to verify the TTS engine works
<Callout type="success">
If you see a green status indicator and can generate audio, you're all set!
</Callout>
## Next Steps
<Card title="Quick Start Guide" href="/overview/quick-start">
Create your first voice profile and generate speech
</Card>
---
title: "Introduction"
description: "Voicebox is a local-first voice cloning studio -- a free and open-source alternative to ElevenLabs."
---
## What is Voicebox?
Voicebox is a **local-first voice cloning studio** -- a free and open-source alternative to ElevenLabs. Clone voices from a few seconds of audio or pick from 50+ preset voices, generate speech in 23 languages across 7 TTS engines, apply post-processing effects, and compose multi-voice projects with a timeline editor.
- **Complete privacy** -- models and voice data stay on your machine
- **7 TTS engines** -- Qwen3-TTS, Qwen CustomVoice, LuxTTS, Chatterbox Multilingual, Chatterbox Turbo, HumeAI TADA, and Kokoro
- **Cloning and preset voices** -- zero-shot cloning from a reference sample, or curated preset voices via Kokoro (50 voices) and Qwen CustomVoice (9 voices)
- **23 languages** -- from English to Arabic, Japanese, Hindi, Swahili, and more
- **Post-processing effects** -- pitch shift, reverb, delay, chorus, compression, and filters
- **Expressive speech** -- paralinguistic tags like `[laugh]`, `[sigh]`, `[gasp]` via Chatterbox Turbo; natural-language delivery control via Qwen CustomVoice
- **Unlimited length** -- auto-chunking with crossfade for scripts, articles, and chapters
- **Stories editor** -- multi-track timeline for conversations, podcasts, and narratives
- **API-first** -- REST API for integrating voice synthesis into your own projects
- **Native performance** -- built with Tauri (Rust), not Electron
- **Runs everywhere** -- macOS (MLX/Metal), Windows (CUDA), Linux, AMD ROCm, Intel Arc, Docker
## TTS Engines
Seven engines with different strengths, switchable per-generation:
| Engine | Profile Type | Languages | Strengths |
|--------|--------------|-----------|-----------|
| **Qwen3-TTS** (0.6B / 1.7B) | Cloned | 10 | High-quality multilingual cloning |
| **Qwen CustomVoice** (0.6B / 1.7B) | Preset (9 voices) | 10 | Natural-language delivery control (tone, emotion, pace) |
| **LuxTTS** | Cloned | English | Lightweight (~1GB VRAM), 48kHz output, 150x realtime on CPU |
| **Chatterbox Multilingual** | Cloned | 23 | Broadest language coverage |
| **Chatterbox Turbo** | Cloned | English | Fast 350M model with paralinguistic emotion/sound tags |
| **TADA** (1B / 3B) | Cloned | 10 | HumeAI speech-language model -- 700s+ coherent audio |
| **Kokoro** | Preset (50 voices) | 9 | 82M parameters, CPU realtime, lowest VRAM of any engine |
## GPU Support
| Platform | Backend | Notes |
|----------|---------|-------|
| macOS (Apple Silicon) | MLX (Metal) | 4-5x faster generation via the Metal GPU |
| Windows / Linux (NVIDIA) | PyTorch (CUDA) | Auto-downloads CUDA binary from within the app |
| Linux (AMD) | PyTorch (ROCm) | Auto-configures HSA_OVERRIDE_GFX_VERSION |
| Windows (any GPU) | DirectML | Universal Windows GPU support |
| Intel Arc | IPEX/XPU | Intel discrete GPU acceleration |
| Any | CPU | Works everywhere, just slower |
## Use Cases
- **Game development** -- generate dynamic dialogue for characters
- **Content creation** -- produce podcasts and video voiceovers
- **Accessibility** -- build text-to-speech tools for users who need them
- **Voice assistants** -- create custom voice interfaces
- **Production pipelines** -- automate voiceover workflows via the REST API
## Tech Stack
| Layer | Technology |
|-------|------------|
| Desktop App | Tauri (Rust) |
| Frontend | React, TypeScript, Tailwind CSS |
| State | Zustand, React Query |
| Backend | FastAPI (Python) |
| TTS Engines | Qwen3-TTS, Qwen CustomVoice, LuxTTS, Chatterbox, Chatterbox Turbo, TADA, Kokoro |
| Effects | Pedalboard (Spotify) |
| Transcription | Whisper / Whisper Turbo (PyTorch or MLX) |
| Inference | MLX (Apple Silicon) / PyTorch (CUDA/ROCm/XPU/CPU) |
| Database | SQLite |
| Audio | WaveSurfer.js, librosa |
{
"title": "Overview",
"defaultOpen": true,
"pages": [
"introduction",
"installation",
"docker",
"quick-start",
"gpu-acceleration",
"voice-cloning",
"preset-voices",
"stories-editor",
"recording-transcription",
"generation-history",
"remote-mode",
"creating-voice-profiles",
"generating-speech",
"building-stories",
"troubleshooting"
]
}
---
title: "Preset Voices"
description: "Use built-in, ready-made voices without recording audio samples"
---
## Overview
Some Voicebox engines ship with a curated set of pre-built voices. Instead of cloning from your own audio sample, you pick a voice from a fixed catalog and the model speaks in that voice. No recording, no upload, no per-voice training required.
Two engines in 0.4 ship preset voices:
| Engine | Voices | Languages | Strengths |
| --------------------- | ----------------------- | --------- | ------------------------------------------------------- |
| **Kokoro 82M** | 50 | 9 | Tiny model, CPU-friendly, lowest VRAM of any engine |
| **Qwen CustomVoice** | 9 (premium curated) | 4 | Natural-language style control over tone, emotion, pace |
<Callout type="info">
Looking to clone a specific person's voice instead? See [Voice Cloning](/overview/voice-cloning).
</Callout>
## When to Use Preset Voices
<Cards>
<Card title="No reference audio">
You don't have (or don't want to provide) a recording of the target voice
</Card>
<Card title="Production reliability">
Curated voices have predictable quality across any text input
</Card>
<Card title="Speed">
Skip the audio cleanup, sample preparation, and quality iteration loop
</Card>
<Card title="Lightweight setup">
Kokoro runs at CPU realtime with ~150 MB on disk — no GPU needed
</Card>
</Cards>
## Creating a Preset-Voice Profile
<Steps>
<Step title="Open Profiles → New Profile">
Same entry point as cloning profiles
</Step>
<Step title="Choose the engine">
Select **Kokoro** or **Qwen CustomVoice** from the engine dropdown
</Step>
<Step title="Pick a preset voice">
The voice catalog for the chosen engine appears — preview each by clicking it
</Step>
<Step title="Name and save">
Give the profile a name. No audio sample needed — just save
</Step>
<Step title="Generate">
Use the profile like any other in the floating generate box or the Generate page
</Step>
</Steps>
<Callout type="info">
Preset profiles are locked to their source engine — switching engines won't work since the voice exists only for that model. The profile grid greys out preset profiles when you switch to a different engine, and clicking one auto-switches the engine back to the right one.
</Callout>
## Kokoro 82M — 50 Voices Across 9 Languages
Kokoro is the smallest engine in Voicebox at 82M parameters. It runs at CPU realtime with negligible VRAM, making it the best option for lightweight local inference. Voices are pre-built style vectors trained into the model — there's no concept of cloning here.
**Repository:** [`hexgrad/Kokoro-82M`](https://huggingface.co/hexgrad/Kokoro-82M) · Apache 2.0 licensed
### American English
| Female | Male |
| ------- | ------- |
| Alloy | Adam |
| Aoede | Echo |
| Bella | Eric |
| Heart | Fenrir |
| Jessica | Liam |
| Kore | Michael |
| Nicole | Onyx |
| Nova | Puck |
| River | Santa |
| Sarah | |
| Sky | |
### British English
| Female | Male |
| -------- | ------ |
| Alice | Daniel |
| Emma | Fable |
| Isabella | George |
| Lily | Lewis |
### Other Languages
| Language | Voices |
| ----------------- | ------------------------------------------- |
| Spanish (`es`) | Dora (f), Alex (m), Santa (m) |
| French (`fr`) | Siwis (f) |
| Hindi (`hi`) | Alpha (f), Beta (f), Omega (m), Psi (m) |
| Italian (`it`) | Sara (f), Nicola (m) |
| Japanese (`ja`) | Alpha (f), Gongitsune (f), Nezumi (f), Tebukuro (f), Kumo (m) |
| Portuguese (`pt`) | Dora (f), Alex (m), Santa (m) |
| Chinese (`zh`) | Xiaobei (f), Xiaoni (f), Xiaoxiao (f), Xiaoyi (f) |
### Kokoro at a Glance
| Property | Value |
| --------------- | -------------------------------------------- |
| Parameters | 82M |
| Sample rate | 24 kHz |
| VRAM | ~150 MB (negligible on CPU) |
| Speed | Realtime on CPU, faster on GPU |
| Instruct | Not supported (preset voice carries the style) |
| License | Apache 2.0 |
## Qwen CustomVoice — 9 Premium Voices with Instruct Control
Qwen CustomVoice ships with 9 curated speakers and supports **natural-language style control** — you tell the model how to deliver the line ("speak slowly with warmth", "authoritative and clear") and it adapts tone, emotion, and pace.
Two model sizes:
- **1.7B** — full quality, recommended default
- **0.6B** — lighter, faster, lower-end hardware
**Repository:** [`Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice`](https://huggingface.co/Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice) (and 0.6B variant) · by Alibaba
### Voice Catalog
| Speaker | Gender | Language | Description |
| --------- | ------ | -------- | ------------------------------------------------------------ |
| Vivian | female | Chinese | Bright, slightly edgy young female voice |
| Serena | female | Chinese | Warm, gentle young female voice |
| Uncle Fu | male | Chinese | Seasoned male voice with a low, mellow timbre |
| Dylan | male | Chinese | Youthful Beijing male voice with a clear, natural timbre |
| Eric | male | Chinese | Lively Chengdu male voice with a slightly husky brightness |
| Ryan | male | English | Dynamic male voice with strong rhythmic drive (default) |
| Aiden | male | English | Sunny American male voice with a clear midrange |
| Ono Anna | female | Japanese | Playful Japanese female voice with a light, nimble timbre |
| Sohee | female | Korean | Warm Korean female voice with rich emotion |
### Using Instruct Mode
In the floating generate box, switch to a Qwen CustomVoice profile and click the **delivery instructions** toggle (slider icon, left of the generate button). A second textarea appears below the main text:
- Main text → what you want the voice to say
- Instruct text → how you want it delivered
Examples of effective instruct prompts:
```
Speak slowly with emphasis, like reading bedtime stories
Warm and friendly, conversational tone
Professional and authoritative, broadcast quality
Whisper, intimate and close
Excited and energetic, like sports commentary
```
The full Generate page also surfaces the instruct field as a separate input.
### Qwen CustomVoice at a Glance
| Property | Value |
| --------------- | -------------------------------------------------- |
| Parameters | 1.7B / 0.6B |
| Languages       | Chinese, English, Japanese, Korean voices (model supports 10 languages) |
| Voices | 9 curated preset speakers |
| VRAM | ~3.5 GB (1.7B), ~1.2 GB (0.6B) |
| Instruct | Yes — natural-language style control |
| Cloning | No — paired Base Qwen3-TTS engine handles cloning |
## Cloning vs Preset — Quick Decision
| You want… | Use |
| -------------------------------------------------- | ----------------------------------------- |
| To replicate a specific person's voice | [Voice Cloning](/overview/voice-cloning) |
| Production-ready voices with no audio prep | Kokoro or Qwen CustomVoice |
| The smallest possible footprint (CPU-only) | Kokoro |
| Fine control over delivery (tone, pace, emotion) | Qwen CustomVoice |
| The broadest language coverage | [Voice Cloning](/overview/voice-cloning) via Chatterbox Multilingual (23 langs) |
## Limitations
<Callout type="warn">
Preset voices are fixed — you can't fine-tune or modify the underlying voice. If you want a specific voice that isn't in the catalog, use a cloning engine and provide a reference sample.
</Callout>
- Preset-voice profiles can't be exported as audio for use in other Voicebox installations, only as profile metadata pointing to the same engine + voice ID
- The Kokoro voice catalog is set by the upstream model — new voices appear only when hexgrad publishes new model releases
- Qwen CustomVoice's 9 speakers are part of the model checkpoint — same constraint
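Exported preset-profile metadata is accordingly tiny. Conceptually it looks something like this (field names and values are hypothetical, not Voicebox's actual schema):

```json
{
  "name": "Narrator",
  "engine": "kokoro",
  "voice_id": "af_heart",
  "language": "en"
}
```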
## Next Steps
<Cards>
<Card title="Voice Cloning" href="/overview/voice-cloning">
Clone a specific voice from your own audio
</Card>
<Card title="Generate Speech" href="/overview/generating-speech">
Use a profile to generate audio
</Card>
<Card title="Build Stories" href="/overview/building-stories">
Compose multi-voice narratives
</Card>
</Cards>
---
title: "Quick Start"
description: "Get started with Voicebox in 5 minutes"
---
This guide will walk you through creating your first voice profile and generating speech.
## Prerequisites
Make sure you have [installed Voicebox](/overview/installation) and launched the app.
## Step 1: Create a Voice Profile
Voice profiles are the foundation of Voicebox. Each profile contains voice samples that the AI uses to clone the voice.
<Steps>
<Step title="Navigate to Profiles">
Click the **Profiles** tab in the sidebar
</Step>
<Step title="Create New Profile">
Click the **+ New Profile** button
Fill in the details:
- **Name:** A descriptive name (e.g., "John Smith")
- **Language:** Select the primary language
- **Description:** Optional notes about the voice
</Step>
<Step title="Add Voice Sample">
You have two options:
**Option A: Upload Audio**
- Click **Upload Sample**
- Select an audio file (WAV, MP3, or M4A)
- Ideal length: 10-30 seconds of clear speech
**Option B: Record Live**
- Click **Record Sample**
- Speak clearly for 10-30 seconds
- Click stop when finished
</Step>
<Step title="Save Profile">
Click **Create Profile** to save
</Step>
</Steps>
<Callout type="info">
For best results, use clean audio with minimal background noise and consistent speaking tone.
</Callout>
## Step 2: Generate Speech
Now let's use your new voice profile to generate speech.
<Steps>
<Step title="Go to Generation">
Click the **Generate** tab in the sidebar
</Step>
<Step title="Select Voice Profile">
Choose your newly created profile from the dropdown
</Step>
<Step title="Enter Text">
Type or paste the text you want to generate:
```
Hello! This is my first voice generation with Voicebox.
```
<Callout type="info">
Paralinguistic tags like `[laugh]`, `[sigh]`, and `[gasp]` only work with
**Chatterbox Turbo**. Qwen3-TTS, LuxTTS, Chatterbox Multilingual, and
HumeAI TADA will read those tags literally instead of turning them into
expressive sounds.
</Callout>
To insert supported tags, select **Chatterbox Turbo** and type `/` in the
text input to open the tag inserter.
</Step>
<Step title="Generate">
Click **Generate** and wait a few seconds
<Callout type="info">
First generation may take longer due to model initialization. Subsequent generations will be faster.
</Callout>
</Step>
<Step title="Play & Download">
- Click **Play** to preview the audio
- Click **Download** to save the audio file
- The generation is also saved to your **History**
</Step>
</Steps>
## Step 3: Build a Story (Optional)
The Stories Editor lets you create multi-voice narratives with a timeline-based interface.
<Steps>
<Step title="Create New Story">
Navigate to **Stories** and click **+ New Story**
</Step>
<Step title="Add Voice Tracks">
Click **+ Add Track** to create tracks for different speakers
</Step>
<Step title="Add Audio Clips">
- Drag generated audio from your History
- Or generate new clips directly in the timeline
- Arrange clips on the timeline
</Step>
<Step title="Edit & Export">
- Trim clips by dragging edges
- Adjust timing and spacing
- Click **Export** to render the final audio
</Step>
</Steps>
## What's Next?
<Cards>
<Card title="Voice Cloning Guide" href="/overview/creating-voice-profiles">
Learn advanced techniques for high-quality voice cloning
</Card>
<Card title="API Integration" href="/api-reference">
Integrate Voicebox into your own applications
</Card>
<Card title="Stories Editor" href="/overview/stories-editor">
Master the multi-track timeline editor
</Card>
<Card title="Remote Mode" href="/overview/remote-mode">
Connect to a GPU server for faster generation
</Card>
</Cards>
## Tips for Success
<AccordionGroup>
<Accordion title="Getting the Best Voice Quality">
- Use 10-30 seconds of clear, consistent speech
- Avoid background noise and echo
- Multiple samples from the same speaker improve quality
- Match the speaking style you want to generate
</Accordion>
<Accordion title="Improving Generation Speed">
- Use a CUDA-capable GPU for 5-10x faster generation
- Enable voice prompt caching for repeated generations
- Consider running the backend on a remote GPU server
</Accordion>
<Accordion title="Troubleshooting Common Issues">
- **Server won't start:** Check if port 17493 is available
- **Poor audio quality:** Try adding more voice samples
- **Slow generation:** Verify GPU acceleration is enabled
- See the full [Troubleshooting Guide](/overview/troubleshooting) for more
</Accordion>
</AccordionGroup>
---
title: "Recording & Transcription"
description: "Record audio and transcribe speech with Whisper"
---
## Recording
Voicebox includes built-in recording capabilities for creating voice samples and capturing audio.
### Features
- **Microphone input** - Record from any audio input device
- **System audio capture** - Record desktop audio (macOS/Windows)
- **Waveform visualization** - See audio levels in real-time
- **Multiple formats** - Export as WAV, MP3, or M4A
### How to Record
<Steps>
<Step title="Select Input">
Choose your microphone or system audio
</Step>
<Step title="Start Recording">
Click the record button and speak clearly
</Step>
<Step title="Stop & Save">
Click stop when finished
</Step>
<Step title="Use or Export">
Use as voice sample or export to file
</Step>
</Steps>
## Transcription
Automatic speech-to-text powered by OpenAI's Whisper model.
### Features
- **High accuracy** - Industry-leading speech recognition
- **Multiple languages** - Supports 50+ languages
- **Automatic detection** - Language auto-detection
- **Timestamps** - Word-level timing information
### How to Transcribe
<Steps>
<Step title="Select Audio">
Choose a recording or upload an audio file
</Step>
<Step title="Choose Language">
Select language or use auto-detect
</Step>
<Step title="Transcribe">
Click transcribe and wait for processing
</Step>
<Step title="Review & Export">
Review text and export as needed
</Step>
</Steps>
<Callout type="info">
Transcription is useful for creating voice samples from existing audio or generating subtitles.
</Callout>
---
title: "Remote Mode"
description: "Connect to a GPU server for faster generation"
---
## Overview
Remote Mode allows you to run the Voicebox backend on a separate machine (like a GPU server) while using the desktop app on your local machine.
## Use Cases
- **No local GPU** - Use a cloud GPU or remote workstation
- **Faster generation** - Leverage powerful remote hardware
- **Shared infrastructure** - Multiple users connect to one server
- **Laptop workflows** - Keep your laptop cool and battery-efficient
## Architecture
In Remote Mode, the Voicebox desktop app (running on your local machine) communicates with the backend server (running on a remote machine) via HTTP. The local app provides only the user interface, while the remote server handles all the heavy processing including the TTS models, API endpoints, and audio generation.
## Setting Up Remote Mode
### On the Server
<Steps>
<Step title="Install Dependencies">
```bash
# Clone the repo
git clone https://github.com/jamiepine/voicebox.git
cd voicebox/backend
# Install Python dependencies
pip install -r requirements.txt
# Engines with incompatible transitive pins — install with --no-deps
pip install --no-deps chatterbox-tts
pip install --no-deps hume-tada
# Qwen3-TTS from source
pip install git+https://github.com/QwenLM/Qwen3-TTS.git
```
Or just run `just setup` from the repo root, which handles all of this.
</Step>
<Step title="Start the Server">
```bash
# Allow external connections
uvicorn main:app --host 0.0.0.0 --port 17493
```
<Callout type="warn">
This exposes the server to your network. Use a firewall or VPN for security.
</Callout>
</Step>
<Step title="Open Firewall">
```bash
# Ubuntu/Debian
sudo ufw allow 17493
# Or use your cloud provider's firewall settings
```
</Step>
</Steps>
### On the Client
<Steps>
<Step title="Open Settings">
In Voicebox, go to **Settings → Server**
</Step>
<Step title="Enable Remote Mode">
Toggle **Use Remote Server**
</Step>
<Step title="Enter Server URL">
```
http://<server-ip>:17493
```
Replace `<server-ip>` with your server's IP address
</Step>
<Step title="Test Connection">
Click **Test Connection** to verify
</Step>
</Steps>
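The **Test Connection** step can also be reproduced from a terminal on the client machine (the address below is an example placeholder; substitute your server's IP):

```bash
# Reachability check against the remote backend
SERVER="http://192.0.2.10:17493"
if curl -s --max-time 5 "$SERVER/health" >/dev/null; then
  echo "reachable"
else
  echo "unreachable"
fi
```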
## Cloud Deployment
### AWS EC2
1. Launch a GPU instance (e.g., g4dn.xlarge)
2. Install dependencies (see "On the Server" above)
3. Start the server with `--host 0.0.0.0`
### Vast.ai
1. Rent a GPU instance
2. SSH in and clone the repo
3. Start the server
### RunPod
1. Deploy a pod with CUDA support
2. Install the Voicebox backend
3. Expose port 17493
## Security Considerations
<Callout type="warn">
The API currently has no authentication. Only use on trusted networks or with a VPN.
</Callout>
**Best Practices:**
- Use a VPN (WireGuard, Tailscale) instead of exposing to the internet
- Run behind a reverse proxy with authentication (nginx + basic auth)
- Use HTTPS with SSL certificates
- Firewall rules to limit access to specific IPs
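For the reverse-proxy option, a minimal sketch assuming nginx with an htpasswd file already created at `/etc/nginx/.htpasswd` (the hostname and certificate paths are placeholders):

```nginx
server {
    listen 443 ssl;
    server_name voicebox.example.com;

    ssl_certificate     /etc/ssl/certs/voicebox.pem;
    ssl_certificate_key /etc/ssl/private/voicebox.key;

    location / {
        auth_basic           "Voicebox";
        auth_basic_user_file /etc/nginx/.htpasswd;
        proxy_pass           http://127.0.0.1:17493;
        proxy_set_header     Host $host;
    }
}
```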
## Performance
Expected performance on various GPUs:
| GPU | Generation Speed |
|-----|------------------|
| RTX 4090 | ~2-3s per 10 words |
| RTX 3090 | ~3-4s per 10 words |
| RTX 3060 | ~5-7s per 10 words |
| CPU (12-core) | ~20-30s per 10 words |
<Callout type="info">
A GPU with 8GB+ VRAM is recommended for best performance.
</Callout>
## Troubleshooting
See the [Troubleshooting Guide](/overview/troubleshooting) for common issues.
---
title: "Stories Editor"
description: "Create multi-voice narratives with a timeline-based editor"
---
## Overview
The Stories Editor is a DAW-like timeline interface for creating multi-voice narratives, podcasts, and conversations.
## Features
<Cards>
<Card title="Multi-Track Timeline">
Arrange multiple voice tracks in parallel
</Card>
<Card title="Inline Editing">
Trim and split clips directly in the timeline
</Card>
<Card title="Auto-Playback">
Preview with synchronized playhead
</Card>
<Card title="Voice Mixing">
Build conversations with multiple speakers
</Card>
</Cards>
## Creating a Story
<Steps>
<Step title="Create New Story">
Navigate to **Stories** and click **+ New Story**
</Step>
<Step title="Add Tracks">
Create separate tracks for each voice/speaker
</Step>
<Step title="Add Clips">
- Drag from generation history
- Generate new clips inline
- Upload audio files
</Step>
<Step title="Arrange & Edit">
- Position clips on timeline
- Trim clip edges
- Adjust spacing and timing
</Step>
<Step title="Export">
Render the final mixed audio
</Step>
</Steps>
## Use Cases
- **Podcasts**: Multi-host conversations
- **Audiobooks**: Narrator + character voices
- **Game Dialogue**: Character interactions
- **Video Voiceovers**: Multiple speakers
- **Audio Drama**: Full voice casts
## Coming Soon
- Word-level editing
- Crossfades and transitions
- Audio effects (reverb, EQ)
- Real-time collaboration
---
title: "Troubleshooting"
description: "Common issues and solutions for Voicebox"
---
This guide covers common issues you might encounter when using or developing Voicebox, along with solutions.
## Installation Issues
### macOS: "App is damaged and can't be opened"
This occurs because the app isn't signed with an Apple Developer certificate.
**Solution:**
```bash
# Remove the quarantine attribute
xattr -cr /Applications/Voicebox.app
```
### Windows: SmartScreen Warning
Windows SmartScreen may warn that the app is unrecognized.
**Solution:**
- Click "More info"
- Click "Run anyway"
<Callout type="info">
This is expected for unsigned applications. We're working on code signing for future releases.
</Callout>
### Linux: AppImage Won't Run
**Solution:**
```bash
chmod +x voicebox-*.AppImage
./voicebox-*.AppImage
```
## Server Issues
### Backend Server Won't Start
**Symptoms:**
- Red status indicator in bottom-left corner
- "Failed to connect to server" error
**Solutions:**
<AccordionGroup>
<Accordion title="Port Already in Use">
Check if port 17493 is already in use:
```bash
# macOS/Linux
lsof -i :17493
# Windows
powershell -Command "Get-NetTCPConnection -LocalPort 17493 -State Listen"
```
Kill the process using the port:
```bash
# macOS/Linux
kill -9 <PID>
# Windows
taskkill /PID <PID> /F
```
</Accordion>
<Accordion title="Permission Issues">
The server binary might not have execute permissions:
```bash
# macOS/Linux
chmod +x ~/Library/Application\ Support/sh.voicebox.app/backend/voicebox-server
```
</Accordion>
<Accordion title="Check Logs">
View server logs for errors:
**macOS:**
```bash
tail -f ~/Library/Application\ Support/sh.voicebox.app/logs/server.log
```
**Windows:**
```bash
type %APPDATA%\sh.voicebox.app\logs\server.log
```
</Accordion>
</AccordionGroup>
### `flash-attn is not installed` Warning in Server Logs
**Symptoms:**
```
Warning: flash-attn is not installed. Will only run the manual PyTorch version.
Please install flash-attn for faster inference.
```
**This is harmless.** The warning is emitted by our transformer-based engines (Chatterbox / Qwen) on every startup. FlashAttention is an optional acceleration library — when it's not present, PyTorch's built-in scaled-dot-product attention (SDPA) runs instead, which achieves near-FA2 throughput on modern GPUs. Generation works normally.
**Why it shows up on every platform:**
- **Windows:** `flash-attn` has no official Windows support. The upstream project (Dao-AILab/flash-attention) still only says it *might* work, and source builds typically fail on recent CUDA/MSVC combinations.
- **macOS (Apple Silicon):** FlashAttention is CUDA-only and doesn't apply here at all. MLX has its own optimized attention kernels.
- **Linux:** It's not pinned in our requirements because installing it is fragile and version-sensitive; users who want it install it themselves.
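If you want to confirm which attention path your install will take, checking whether the package is importable is enough (a stdlib-only sketch; the fallback to SDPA is automatic either way):

```python
import importlib.util

# flash-attn present -> transformer engines can use FlashAttention kernels;
# absent -> PyTorch falls back to built-in SDPA automatically.
has_flash_attn = importlib.util.find_spec("flash_attn") is not None
print("flash-attn installed" if has_flash_attn else "using PyTorch SDPA fallback")
```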
**Solutions (all optional):**
<AccordionGroup>
<Accordion title="Ignore it (recommended)">
PyTorch SDPA is what actually runs the model, and on Ampere/Ada/Hopper GPUs it's within a few percent of FA2 for our workloads. You won't notice a meaningful speed difference.
</Accordion>
<Accordion title="Install flash-attn on Linux">
```bash
pip install flash-attn --no-build-isolation
```
Requires a matching CUDA toolkit. Build can take 20+ minutes.
</Accordion>
<Accordion title="Install flash-attn on Windows (community wheels)">
Official builds don't exist, but community maintainers publish prebuilt wheels:
- [kingbri1/flash-attention releases](https://github.com/kingbri1/flash-attention/releases)
- [bdashore3/flash-attention releases](https://github.com/bdashore3/flash-attention/releases)
Pick the wheel matching your exact CUDA + PyTorch + Python combination. Example:
```bash
pip install https://github.com/kingbri1/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu128torch2.8.0cxx11abiFALSE-cp312-cp312-win_amd64.whl
```
Alternatively, run Voicebox's backend inside WSL2 and use the standard Linux wheels.
</Accordion>
</AccordionGroup>
### Connection Timeout
**Symptoms:**
- Long loading times
- "Connection timeout" errors
**Solution:**
- Restart the app
- Check your firewall settings
- Ensure localhost is accessible
## Generation Issues
### First Generation is Very Slow
**Symptoms:**
- First generation takes 2-5 minutes
- Progress indicator stuck at "Loading model..."
**Explanation:**
This is expected behavior. The first generation downloads the selected TTS engine's model and initializes it. Sizes range from 350 MB (Kokoro) to 8 GB (TADA 3B).
**Solution:**
- Wait for the initial download to complete (progress is shown in Settings → Models)
- Subsequent generations reuse the cached model and are much faster
- Check your internet connection
- For low-bandwidth setups, start with Kokoro (~350 MB) or LuxTTS (~300 MB)
### Poor Voice Quality
**Symptoms:**
- Robotic or unnatural voice
- Missing emotion or prosody
- Pronunciation errors
**Solutions:**
<Steps>
<Step title="Improve Voice Samples">
- Use 10-30 seconds of clear audio
- Avoid background noise
- Ensure consistent speaking tone
- Add multiple samples from the same speaker
</Step>
<Step title="Match Speaking Style">
The generated voice will mimic the tone and style of your samples. If your sample is monotone, the generation will be too.
</Step>
<Step title="Adjust Text Formatting">
- Use proper punctuation
- Add commas for natural pauses
- Capitalize proper nouns
</Step>
</Steps>
### Generation Fails with "Out of Memory"
**Symptoms:**
- Generation crashes
- "CUDA out of memory" or "RuntimeError: out of memory"
**Solutions:**
<AccordionGroup>
<Accordion title="Free GPU Memory">
Close other GPU-intensive applications:
- Games
- Video editors
- Multiple browser tabs with WebGL
Then restart Voicebox.
</Accordion>
<Accordion title="Use CPU Mode">
If your GPU doesn't have enough VRAM (need 6GB+), use CPU mode:
Settings → Generation → Use CPU instead of GPU
<Callout type="warn">
CPU generation is 5-10x slower but uses system RAM instead of VRAM.
</Callout>
</Accordion>
<Accordion title="Reduce Batch Size">
For long text, split it into smaller chunks instead of generating all at once.
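If you're scripting against the API, a simple sentence-boundary chunker keeps each request small. This is an illustrative sketch, not part of Voicebox; tune `max_chars` to your VRAM:

```python
import re


def chunk_text(text: str, max_chars: int = 400) -> list[str]:
    """Split text at sentence boundaries so each chunk stays under max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Generate each chunk separately, then concatenate the audio (the Stories editor can do the stitching for you).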
</Accordion>
</AccordionGroup>
### MLX "Failed to load the default metallib" (Apple Silicon)
**Symptoms:**
- Generation fails with "library not found" or "metallib" errors
- Server logs reference missing Metal shader libraries
**Solutions:**
<AccordionGroup>
<Accordion title="Rebuild the Server Binary">
```bash
just build-server
```
The build script bundles MLX Metal shader libraries on Apple Silicon automatically.
</Accordion>
<Accordion title="Reinstall MLX Dependencies">
```bash
pip install -r backend/requirements-mlx.txt
```
</Accordion>
<Accordion title="Verify Backend Detection">
Check Settings → Server Status. Should show **Backend: MLX** on Apple Silicon. If it shows **Backend: PYTORCH**, MLX isn't installed correctly.
</Accordion>
</AccordionGroup>
## Audio Issues
### No Audio Playback
**Symptoms:**
- Generated audio won't play
- Playback button doesn't respond
**Solutions:**
- Check system audio settings
- Ensure audio output device is connected
- Try exporting and playing in a media player
### Crackling or Distorted Audio
**Symptoms:**
- Audio has static or distortion
- Clipping sounds
**Solutions:**
- Check if your input samples have distortion
- Reduce playback volume
- Re-generate with cleaner voice samples
## Development Issues
### Backend Won't Start in Dev Mode
**Symptoms:**
- `just dev-backend` or `just dev` fails
- Import errors or module not found
**Solutions:**
<AccordionGroup>
<Accordion title="Python Version">
Ensure Python 3.11 or higher:
```bash
python --version
```
If not, install Python 3.11+ and recreate the virtual environment.
</Accordion>
<Accordion title="Virtual Environment">
Ensure venv is activated:
```bash
# macOS/Linux
source backend/venv/bin/activate
# Windows
backend\venv\Scripts\activate
```
You should see `(venv)` in your prompt.
</Accordion>
<Accordion title="Dependencies">
Reinstall dependencies — easiest via `just`:
```bash
just setup
```
Or manually:
```bash
cd backend
pip install -r requirements.txt
pip install --no-deps chatterbox-tts
pip install --no-deps hume-tada
pip install git+https://github.com/QwenLM/Qwen3-TTS.git
```
</Accordion>
</AccordionGroup>
### Tauri Build Fails
**Symptoms:**
- `bun run tauri build` fails
- Rust compilation errors
**Solutions:**
```bash
# Clean build artifacts
cd tauri/src-tauri
cargo clean
# Update Rust
rustup update
# Try building again
cd ../..
bun run tauri build
```
### OpenAPI Client Generation Fails
**Symptoms:**
- `./scripts/generate-api.sh` fails
- "Failed to fetch schema" error
**Solutions:**
<Steps>
<Step title="Ensure Backend is Running">
```bash
curl http://localhost:17493/openapi.json
```
Should return JSON. If not, start the backend.
</Step>
<Step title="Check Port">
Ensure nothing else is using port 17493
</Step>
<Step title="Regenerate Manually">
```bash
cd backend
source venv/bin/activate
uvicorn main:app --reload --port 17493
# In another terminal
./scripts/generate-api.sh
```
</Step>
</Steps>
## Database Issues
### "Database is locked" Error
**Symptoms:**
- Profile or generation operations fail
- SQLite lock errors
**Solutions:**
- Close all Voicebox instances
- Delete the lock files:
```bash
# macOS
rm ~/Library/Application\ Support/sh.voicebox.app/data/voicebox.db-shm
rm ~/Library/Application\ Support/sh.voicebox.app/data/voicebox.db-wal
# Windows
del %APPDATA%\sh.voicebox.app\data\voicebox.db-shm
del %APPDATA%\sh.voicebox.app\data\voicebox.db-wal
```
### Corrupted Database
**Symptoms:**
- App crashes on launch
- Data missing or corrupted
**Solutions:**
<Callout type="warn">
This will delete all your voice profiles and generation history. Export important profiles first if possible.
</Callout>
```bash
# macOS
rm ~/Library/Application\ Support/sh.voicebox.app/data/voicebox.db
# Windows
del %APPDATA%\sh.voicebox.app\data\voicebox.db
```
Restart the app to create a fresh database.
## Model Issues
### Model Download Fails
**Symptoms:**
- "Failed to download model" error
- Stuck at "Downloading..."
**Solutions:**
- Check your internet connection
- Check HuggingFace Hub status
- Try using a VPN if HuggingFace is blocked in your region
- Manually download via the HuggingFace CLI and place in the cache directory:
```bash
pip install huggingface_hub
huggingface-cli download Qwen/Qwen3-TTS-12Hz-1.7B-Base
```
### Wrong Model Version
**Symptoms:**
- Generation quality suddenly degraded
- Different voice output
**Solutions:**
Clear the model cache and re-download. Replace the `Qwen*` glob with the engine org prefix for other engines (`ResembleAI*` for Chatterbox, `HumeAI*` for TADA, `hexgrad*` for Kokoro, etc.) or use `DELETE /models/{name}` via the API.
```bash
# macOS / Linux
rm -rf ~/.cache/huggingface/hub/models--Qwen*
# Windows (cmd's rmdir doesn't expand wildcards; use PowerShell)
powershell -Command "Remove-Item -Recurse -Force $env:USERPROFILE\.cache\huggingface\hub\models--Qwen*"
```
## Performance Issues
### Slow Generation on GPU
**Symptoms:**
- Generation slower than expected
- GPU not being utilized
**Solutions:**
<AccordionGroup>
<Accordion title="Verify CUDA Installation">
```bash
nvidia-smi
```
Should show your GPU. If not, install CUDA drivers.
</Accordion>
<Accordion title="Check GPU Selection">
If you have multiple GPUs, ensure Voicebox is using the right one.
Settings → Generation → GPU Device
</Accordion>
<Accordion title="Update GPU Drivers">
Outdated drivers can cause performance issues. Update to the latest NVIDIA drivers.
</Accordion>
<Accordion title="Apple Silicon: Confirm MLX Backend">
Check Settings → Server Status. Should show **Backend: MLX** on Apple Silicon — MLX is 45× faster than PyTorch here. If it shows **Backend: PYTORCH**, reinstall MLX:
```bash
pip install -r backend/requirements-mlx.txt
```
GPU availability should read "Metal (Apple Silicon via MLX)".
</Accordion>
</AccordionGroup>
### High Memory Usage
**Symptoms:**
- App uses excessive RAM
- System becomes sluggish
**Solutions:**
- Close unused voice profiles
- Clear generation history
- Restart the app periodically
## Update Issues
### "Update Check Failed"
**Solutions:**
- Confirm your internet connection — updates are fetched from GitHub releases.
- Ensure `github.com` is accessible and not blocked by a firewall or proxy.
- As a fallback, download the latest release from GitHub and install manually.
### "Invalid Signature" Error
**Solutions:**
- Re-download the installer — the signature may have been corrupted in transit.
- Verify the `.sig` file matches the installer; if it doesn't, file an issue.
## Remote Mode Issues
### Can't Connect to Remote Server
**Symptoms:**
- "Connection refused" error
- Remote server not found
**Solutions:**
<Steps>
<Step title="Check Server Status">
Ensure the remote server is running:
```bash
curl http://<server-ip>:17493/health
```
</Step>
<Step title="Check Firewall">
Ensure port 17493 is open on the remote server:
```bash
# Allow port on Ubuntu/Debian
sudo ufw allow 17493
```
</Step>
<Step title="Verify Network">
- Ensure both machines are on the same network (for local servers)
- Use IP address instead of hostname
- Try pinging the server: `ping <server-ip>`
</Step>
</Steps>
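As a first triage step from the client machine, a plain TCP check tells you whether the problem is network reachability or the server itself (a stdlib-only sketch):

```python
import socket


def can_reach(host: str, port: int = 17493, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(can_reach("127.0.0.1"))
```

If this returns `True` but the app still can't connect, the port is open and the issue is likely in the server; try `curl http://<server-ip>:17493/health` next. If it returns `False`, check the firewall and network first.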
## Still Having Issues?
If you're still experiencing problems:
1. **Check GitHub Issues:** [github.com/jamiepine/voicebox/issues](https://github.com/jamiepine/voicebox/issues)
2. **Open a New Issue:** Provide:
- Operating system and version
- Voicebox version
- Steps to reproduce
- Error messages or logs
3. **Join Discord:** [discord.gg/voicebox](https://discord.gg/voicebox) (coming soon)
## Diagnostic Information
When reporting issues, include this information:
```bash
# Voicebox version
# Check Help → About in the app
# Operating system
uname -a # macOS/Linux
systeminfo # Windows
# Python version (for dev issues)
python --version
# GPU info (if generation issues)
nvidia-smi # NVIDIA GPUs
```
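The same details can be gathered cross-platform with a short Python script (a convenience sketch, not part of Voicebox):

```python
import platform
import sys

# Collect the OS and Python details requested above in one place.
info = {
    "os": f"{platform.system()} {platform.release()} ({platform.machine()})",
    "python": sys.version.split()[0],
}
for key, value in info.items():
    print(f"{key}: {value}")
```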

View File

@@ -0,0 +1,118 @@
---
title: "Voice Cloning"
description: "Clone any voice from a few seconds of reference audio"
---
## Overview
Voicebox can replicate a specific person's voice from a short audio sample — known as **zero-shot voice cloning**. You provide 10-30 seconds of clear speech, the model extracts a voice embedding, and from then on you can generate any text in that voice.
Five engines in 0.4 support cloning:
| Engine | Languages | Strengths |
| --------------------------- | --------- | -------------------------------------------------------------------------- |
| **Qwen3-TTS** (0.6B / 1.7B) | 10 | High-quality multilingual, supports delivery instructions on the same kwarg |
| **Chatterbox Multilingual** | 23 | Broadest language coverage — Arabic, Hindi, Swahili, Hebrew, more |
| **Chatterbox Turbo** | English | Fast 350M model with paralinguistic emotion tags (`[laugh]`, `[sigh]`) |
| **LuxTTS** | English | Lightweight (~1 GB VRAM), 48 kHz output, 150x realtime on CPU |
| **TADA** (1B / 3B) | 10 | Speech-language model with 700s+ coherent long-form generation |
<Callout type="info">
Don't want to record audio? Use a curated voice from Kokoro or Qwen CustomVoice instead — see [Preset Voices](/overview/preset-voices).
</Callout>
## How It Works
<Steps>
<Step title="Upload or Record Sample">
Provide 10-30 seconds of clear speech from the target voice
</Step>
<Step title="Engine Analysis">
The selected engine analyzes vocal characteristics, tone, and speaking patterns
</Step>
<Step title="Voice Profile Created">
A voice embedding is generated and stored with your profile
</Step>
<Step title="Generate Speech">
Use the profile to generate any text in the cloned voice
</Step>
</Steps>
## Choosing an Engine for Cloning
Different engines suit different use cases. The profile grid greys out unsupported engines so you can switch easily.
| If you want… | Pick |
| -------------------------------------------------- | --------------------- |
| Best overall quality on a few common languages | **Qwen3-TTS 1.7B** |
| Faster generation, slightly lower quality | **Qwen3-TTS 0.6B** |
| Languages outside Qwen's 10 (Arabic, Hindi, etc.) | **Chatterbox Multilingual** |
| Expressive English with `[laugh]` `[sigh]` tags | **Chatterbox Turbo** |
| CPU-only or GPU-light setup, English | **LuxTTS** |
| Long-form generation (audiobooks, full chapters) | **TADA 3B** |
## Best Practices
### Sample Quality
<Cards>
<Card title="Do">
- Use 10-30 seconds of audio
- Clear, consistent speaking
- Minimal background noise
- Natural speaking pace
</Card>
<Card title="Don't">
- Very short clips (< 5 seconds)
- Heavy background noise
- Music or overlapping voices
- Heavily processed audio
</Card>
</Cards>
### Multiple Samples
Adding multiple samples from the same speaker can improve quality:
- Different speaking styles (casual, formal)
- Different emotions (happy, serious)
- Different recording conditions
<Callout type="info">
The model will learn a more robust representation from diverse samples. This is especially helpful for distinctive voices the model might otherwise smooth over.
</Callout>
## Supported Languages by Engine
- **Qwen3-TTS** — English, Chinese, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian (10)
- **Chatterbox Multilingual** — Arabic, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Norwegian, Polish, Portuguese, Russian, Spanish, Swahili, Swedish, Turkish (23)
- **Chatterbox Turbo** — English
- **LuxTTS** — English
- **TADA 3B** — 10 languages; **TADA 1B** — English only
For complete language tables and engine-specific notes, see the [TTS Engines developer guide](/developer/tts-engines).
## Limitations
<Callout type="warn">
Voice cloning should only be used with consent. Ensure you have permission to clone someone's voice. See the project's [SECURITY.md](https://github.com/jamiepine/voicebox/blob/main/SECURITY.md) and your local laws on synthetic voice content.
</Callout>
- Quality depends on sample clarity — noisy samples produce noisy clones
- Works best with consistent speaking tone within a sample
- May struggle with extreme accents or speech impediments
- Background noise reduces quality and can introduce artifacts
## Next Steps
<Cards>
<Card title="Creating Voice Profiles" href="/overview/creating-voice-profiles">
Step-by-step guide to creating profiles
</Card>
<Card title="Preset Voices" href="/overview/preset-voices">
Use built-in voices instead of cloning
</Card>
<Card title="Generating Speech" href="/overview/generating-speech">
Use a profile to generate audio
</Card>
</Cards>

1
docs/lib/cn.ts Normal file
View File

@@ -0,0 +1 @@
export { twMerge as cn } from 'tailwind-merge';

View File

@@ -0,0 +1,9 @@
import type { BaseLayoutProps } from 'fumadocs-ui/layouts/shared';
export function baseOptions(): BaseLayoutProps {
return {
nav: {
title: 'Voicebox',
},
};
}

5
docs/lib/openapi.ts Normal file
View File

@@ -0,0 +1,5 @@
import { createOpenAPI } from 'fumadocs-openapi/server';
export const openapi = createOpenAPI({
input: ['./openapi.json'],
});

27
docs/lib/source.ts Normal file
View File

@@ -0,0 +1,27 @@
import { type InferPageType, loader } from 'fumadocs-core/source';
import { lucideIconsPlugin } from 'fumadocs-core/source/lucide-icons';
import { docs } from '@/.source';
// See https://fumadocs.dev/docs/headless/source-api for more info
export const source = loader({
baseUrl: '/',
source: docs.toFumadocsSource(),
plugins: [lucideIconsPlugin()],
});
export function getPageImage(page: InferPageType<typeof source>) {
const segments = [...page.slugs, 'image.png'];
return {
segments,
url: `/og/docs/${segments.join('/')}`,
};
}
export async function getLLMText(page: InferPageType<typeof source>) {
const processed = await page.data.getText('processed');
return `# ${page.data.title}
${processed}`;
}

53
docs/mdx-components.tsx Normal file
View File

@@ -0,0 +1,53 @@
import { Callout } from 'fumadocs-ui/components/callout';
import { Card, Cards } from 'fumadocs-ui/components/card';
import { File, Files, Folder } from 'fumadocs-ui/components/files';
import { Step, Steps } from 'fumadocs-ui/components/steps';
import { Tab, Tabs } from 'fumadocs-ui/components/tabs';
import defaultMdxComponents from 'fumadocs-ui/mdx';
import type { MDXComponents } from 'mdx/types';
import type { ReactNode } from 'react';
import { APIPage } from '@/components/api-page';
// Simple accordion using native HTML details/summary
function AccordionGroup({ children }: { children: ReactNode }) {
return <div className="my-6 space-y-2">{children}</div>;
}
function Accordion({ title, children }: { title: string; children: ReactNode }) {
return (
<details className="group border rounded-lg p-4">
<summary className="cursor-pointer font-semibold list-none">
<span className="group-open:rotate-90 transition-transform inline-block mr-2">▶</span>
{title}
</summary>
<div className="mt-4 pl-6">{children}</div>
</details>
);
}
export function getMDXComponents(components?: MDXComponents): MDXComponents {
return {
...defaultMdxComponents,
// Layout components
Card,
Cards,
// Files
Files,
Folder,
File,
// Callouts
Callout,
// Tabs
Tabs,
Tab,
// Steps
Steps,
Step,
// Accordion (native HTML-based)
AccordionGroup,
Accordion,
// OpenAPI component
APIPage,
...components,
};
}

25
docs/next.config.mjs Normal file
View File

@@ -0,0 +1,25 @@
import { createMDX } from 'fumadocs-mdx/next';
const withMDX = createMDX();
/** @type {import('next').NextConfig} */
const config = {
reactStrictMode: true,
async rewrites() {
return [
{
source: '/docs/:path*.mdx',
destination: '/llms.mdx/docs/:path*',
},
];
},
webpack: (config) => {
config.experiments = {
...config.experiments,
topLevelAwait: true,
};
return config;
},
};
export default withMDX(config);

1441
docs/openapi.json Normal file

File diff suppressed because it is too large

35
docs/package.json Normal file
View File

@@ -0,0 +1,35 @@
{
"name": "example-next-mdx",
"version": "0.0.0",
"private": true,
"scripts": {
"build": "fumadocs-mdx && next build",
"dev": "fumadocs-mdx && next dev",
"start": "next start",
"postinstall": "fumadocs-mdx"
},
"dependencies": {
"@radix-ui/react-popover": "^1.1.15",
"class-variance-authority": "^0.7.1",
"fumadocs-core": "^16.4.11",
"fumadocs-mdx": "13",
"fumadocs-openapi": "^10.2.7",
"fumadocs-ui": "^16.4.11",
"lucide-react": "^0.546.0",
"next": "^16.1.6",
"react": "^19.2.0",
"react-dom": "^19.2.0",
"shiki": "^3.22.0",
"tailwind-merge": "^3.5.0"
},
"devDependencies": {
"@tailwindcss/postcss": "^4.1.15",
"@types/mdx": "^2.0.13",
"@types/node": "^24.9.1",
"@types/react": "^19.2.2",
"@types/react-dom": "^19.2.2",
"postcss": "^8.5.6",
"tailwindcss": "^4.1.15",
"typescript": "^5.9.3"
}
}

View File

@@ -0,0 +1,758 @@
# Docker Deployment Guide
**Status:** In Development for v0.2.0
**Requested By:** Reddit community ([thread](https://reddit.com/r/LocalLLaMA/...))
## Overview
Docker support makes Voicebox easier to deploy, especially for:
- **Consistent Environments**: Same setup across dev/staging/prod
- **GPU Passthrough**: Easy NVIDIA/AMD GPU access
- **Server Deployments**: Run on headless Linux servers
- **Multi-User Setups**: Isolate instances per user/team
- **Cloud Platforms**: Deploy to AWS, GCP, Azure, DigitalOcean
## Quick Start
### Using Pre-Built Images (Recommended)
```bash
# CPU-only version
docker run -p 8000:8000 -v voicebox-data:/app/data \
ghcr.io/jamiepine/voicebox:latest
# NVIDIA GPU version
docker run --gpus all -p 8000:8000 -v voicebox-data:/app/data \
ghcr.io/jamiepine/voicebox:latest-cuda
# AMD GPU version (experimental)
docker run --device=/dev/kfd --device=/dev/dri -p 8000:8000 \
-v voicebox-data:/app/data \
ghcr.io/jamiepine/voicebox:latest-rocm
```
Then open: `http://localhost:8000`
### Using Docker Compose (Easiest)
Create `docker-compose.yml`:
```yaml
version: '3.8'
services:
voicebox:
image: ghcr.io/jamiepine/voicebox:latest-cuda
ports:
- "8000:8000"
volumes:
- voicebox-data:/app/data
- huggingface-cache:/root/.cache/huggingface
environment:
- GPU_MEMORY_FRACTION=0.8 # Use 80% of GPU memory
- TTS_MODE=local
- WHISPER_MODE=local
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
volumes:
voicebox-data:
huggingface-cache:
```
Run:
```bash
docker compose up -d
```
## Building From Source
### Basic Dockerfile
```dockerfile
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
git \
build-essential \
ffmpeg \
&& rm -rf /var/lib/apt/lists/*
# Copy application
COPY backend/ /app/backend/
COPY backend/requirements.txt /app/
# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install --no-cache-dir git+https://github.com/QwenLM/Qwen3-TTS.git
# Create data directory
RUN mkdir -p /app/data
# Expose port
EXPOSE 8000
# Run server
CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
Build and run:
```bash
docker build -t voicebox .
docker run -p 8000:8000 -v $(pwd)/data:/app/data voicebox
```
### Multi-Stage Build (Optimized)
Smaller image size by separating build and runtime:
```dockerfile
# Dockerfile.optimized
# Stage 1: Build dependencies
FROM python:3.11-slim AS builder
WORKDIR /build
RUN apt-get update && apt-get install -y \
git build-essential && \
rm -rf /var/lib/apt/lists/*
COPY backend/requirements.txt .
RUN pip install --no-cache-dir --target=/build/packages \
-r requirements.txt
RUN pip install --no-cache-dir --target=/build/packages \
git+https://github.com/QwenLM/Qwen3-TTS.git
# Stage 2: Runtime
FROM python:3.11-slim
WORKDIR /app
# Install only runtime dependencies
RUN apt-get update && apt-get install -y \
ffmpeg \
&& rm -rf /var/lib/apt/lists/*
# Copy installed packages from builder
COPY --from=builder /build/packages /usr/local/lib/python3.11/site-packages/
# Copy application code
COPY backend/ /app/backend/
# Create data directory
RUN mkdir -p /app/data
EXPOSE 8000
CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
Build:
```bash
docker build -f Dockerfile.optimized -t voicebox:slim .
```
## GPU Support
### NVIDIA GPUs (CUDA)
**Dockerfile:**
```dockerfile
FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04
# Install Python
RUN apt-get update && apt-get install -y \
python3.11 python3-pip git ffmpeg && \
rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Install PyTorch with CUDA support
COPY backend/requirements.txt .
RUN pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# Install other dependencies
RUN pip3 install -r requirements.txt
RUN pip3 install git+https://github.com/QwenLM/Qwen3-TTS.git
COPY backend/ /app/backend/
EXPOSE 8000
CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
**Run with GPU:**
```bash
docker run --gpus all -p 8000:8000 \
-v voicebox-data:/app/data \
voicebox:cuda
```
**Docker Compose with GPU:**
```yaml
services:
voicebox:
image: voicebox:cuda
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
```
### AMD GPUs (ROCm) - Experimental
**Dockerfile:**
```dockerfile
FROM rocm/dev-ubuntu-22.04:6.0
# Install Python
RUN apt-get update && apt-get install -y \
python3.11 python3-pip git ffmpeg && \
rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Install PyTorch with ROCm support
COPY backend/requirements.txt .
RUN pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0
# Install other dependencies
RUN pip3 install -r requirements.txt
RUN pip3 install git+https://github.com/QwenLM/Qwen3-TTS.git
# Set ROCm environment variables
ENV HSA_OVERRIDE_GFX_VERSION=10.3.0
ENV ROCM_PATH=/opt/rocm
COPY backend/ /app/backend/
EXPOSE 8000
CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
**Run with AMD GPU:**
```bash
docker run --device=/dev/kfd --device=/dev/dri \
--group-add video --ipc=host --cap-add=SYS_PTRACE \
--security-opt seccomp=unconfined \
-p 8000:8000 -v voicebox-data:/app/data \
voicebox:rocm
```
**Note:** ROCm support varies by GPU model. Works best on Linux. See [AMD ROCm docs](https://rocm.docs.amd.com) for compatibility.
## Volume Mounts
### Essential Volumes
```bash
# voicebox-data: profiles, generations, history
# huggingface-cache: downloaded models
docker run -v voicebox-data:/app/data \
  -v huggingface-cache:/root/.cache/huggingface \
  -p 8000:8000 voicebox
```
### Development Volume Mounts
For development with hot-reload:
```bash
# Mount backend source for live code changes
docker run -v $(pwd)/backend:/app/backend \
  -v voicebox-data:/app/data \
  -e RELOAD=true \
  -p 8000:8000 voicebox
```
### Custom Model Storage
Use external model directory:
```bash
docker run -v /path/to/models:/models \
-e MODELS_DIR=/models \
-v voicebox-data:/app/data \
-p 8000:8000 voicebox
```
## Environment Variables
Configure Voicebox via environment variables:
```bash
docker run -e TTS_MODE=local \
-e WHISPER_MODE=openai-api \
-e OPENAI_API_KEY=sk-... \
-e GPU_MEMORY_FRACTION=0.8 \
-e LOG_LEVEL=info \
-p 8000:8000 voicebox
```
### Available Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `TTS_MODE` | `local` | TTS provider: `local`, `remote` |
| `TTS_REMOTE_URL` | - | URL for remote TTS server |
| `WHISPER_MODE` | `local` | Whisper provider: `local`, `openai-api`, `remote` |
| `WHISPER_REMOTE_URL` | - | URL for remote Whisper server |
| `OPENAI_API_KEY` | - | OpenAI API key (if using OpenAI Whisper) |
| `GPU_MEMORY_FRACTION` | `0.9` | Fraction of GPU memory to use (0.0-1.0) |
| `DATA_DIR` | `/app/data` | Directory for profiles/generations |
| `MODELS_DIR` | `/app/models` | Directory for local models |
| `LOG_LEVEL` | `info` | Logging level: `debug`, `info`, `warning`, `error` |
| `RELOAD` | `false` | Enable hot-reload for development |
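A sketch of how a Python entrypoint might consume these variables; the names and defaults mirror the table above, but the actual backend loader may differ:

```python
import os

# Read configuration from the environment, falling back to the documented defaults.
config = {
    "tts_mode": os.environ.get("TTS_MODE", "local"),
    "whisper_mode": os.environ.get("WHISPER_MODE", "local"),
    "gpu_memory_fraction": float(os.environ.get("GPU_MEMORY_FRACTION", "0.9")),
    "data_dir": os.environ.get("DATA_DIR", "/app/data"),
    "models_dir": os.environ.get("MODELS_DIR", "/app/models"),
    "log_level": os.environ.get("LOG_LEVEL", "info"),
}
print(config)
```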
## Complete Docker Compose Examples
### Production Deployment
```yaml
# docker-compose.prod.yml
version: '3.8'
services:
voicebox:
image: ghcr.io/jamiepine/voicebox:latest-cuda
container_name: voicebox
restart: unless-stopped
ports:
- "8000:8000"
volumes:
- voicebox-data:/app/data
- huggingface-cache:/root/.cache/huggingface
environment:
- TTS_MODE=local
- WHISPER_MODE=local
- GPU_MEMORY_FRACTION=0.8
- LOG_LEVEL=info
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
volumes:
voicebox-data:
driver: local
huggingface-cache:
driver: local
```
Run:
```bash
docker compose -f docker-compose.prod.yml up -d
```
### Development Setup
```yaml
# docker-compose.dev.yml
version: '3.8'
services:
voicebox:
build:
context: .
dockerfile: Dockerfile
ports:
- "8000:8000"
volumes:
- ./backend:/app/backend:ro
- voicebox-data:/app/data
- huggingface-cache:/root/.cache/huggingface
environment:
- RELOAD=true
- LOG_LEVEL=debug
- TTS_MODE=local
command: uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload
volumes:
voicebox-data:
huggingface-cache:
```
### Multi-Service Stack
Full stack with reverse proxy and monitoring:
```yaml
# docker-compose.stack.yml
version: '3.8'
services:
# Main Voicebox app
voicebox:
image: ghcr.io/jamiepine/voicebox:latest-cuda
restart: unless-stopped
volumes:
- voicebox-data:/app/data
- huggingface-cache:/root/.cache/huggingface
environment:
- TTS_MODE=local
- WHISPER_MODE=local
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
# Nginx reverse proxy
nginx:
image: nginx:alpine
ports:
- "80:80"
- "443:443"
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf:ro
- ./ssl:/etc/nginx/ssl:ro
depends_on:
- voicebox
# Prometheus monitoring (optional)
prometheus:
image: prom/prometheus
ports:
- "9090:9090"
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- prometheus-data:/prometheus
volumes:
voicebox-data:
huggingface-cache:
prometheus-data:
```
## Cloud Deployment
### AWS EC2
1. **Launch GPU Instance** (g4dn.xlarge or p3.2xlarge)
2. **Install Docker + nvidia-docker:**
```bash
# Ubuntu 22.04
sudo apt-get update && sudo apt-get install -y docker.io
sudo systemctl enable --now docker
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
```
3. **Deploy:**
```bash
docker run --gpus all -d -p 80:8000 \
-v voicebox-data:/app/data \
--restart unless-stopped \
ghcr.io/jamiepine/voicebox:latest-cuda
```
### DigitalOcean
Use GPU Droplet + Docker:
```bash
# Create droplet via CLI
doctl compute droplet create voicebox \
--size gpu-h100x1-80gb \
--image ubuntu-22-04-x64 \
--region nyc3
# SSH and deploy
ssh root@<droplet-ip>
curl -fsSL https://get.docker.com -o get-docker.sh
sh get-docker.sh
docker run --gpus all -d -p 80:8000 ghcr.io/jamiepine/voicebox:latest-cuda
```
### Google Cloud Run (CPU-only)
```bash
# Build and push
docker build -t gcr.io/your-project/voicebox .
docker push gcr.io/your-project/voicebox
# Deploy to Cloud Run
gcloud run deploy voicebox \
--image gcr.io/your-project/voicebox \
--platform managed \
--region us-central1 \
--memory 4Gi \
--cpu 2 \
--port 8000
```
### Fly.io
Create `fly.toml`:
```toml
app = "voicebox"
[build]
image = "ghcr.io/jamiepine/voicebox:latest"
[[services]]
http_checks = []
internal_port = 8000
protocol = "tcp"
[[services.ports]]
port = 80
handlers = ["http"]
[[services.ports]]
port = 443
handlers = ["tls", "http"]
[mounts]
source = "voicebox_data"
destination = "/app/data"
```
Deploy:
```bash
fly launch
fly deploy
```
## Troubleshooting
### GPU Not Detected
**Check NVIDIA Docker:**
```bash
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```
If this fails, reinstall nvidia-docker2.
**Check AMD ROCm:**
```bash
docker run --rm --device=/dev/kfd --device=/dev/dri rocm/dev-ubuntu-22.04:6.0 rocminfo
```
### Permission Errors
If the container can't write to mounted volumes, run it with your host UID/GID:
```bash
# Fix permissions
docker run --user $(id -u):$(id -g) -v $(pwd)/data:/app/data voicebox
```
### Out of Memory
Reduce GPU memory usage:
```bash
docker run -e GPU_MEMORY_FRACTION=0.5 voicebox
```
Or use CPU-only:
```bash
docker run -e DEVICE=cpu voicebox
```
### Model Download Fails
Ensure HuggingFace cache is writable:
```bash
docker run -v huggingface-cache:/root/.cache/huggingface voicebox
```
Or use host cache:
```bash
docker run -v ~/.cache/huggingface:/root/.cache/huggingface voicebox
```
### Port Already in Use
Change host port:
```bash
docker run -p 8080:8000 voicebox # Use port 8080 instead
```
## Security Best Practices
### 1. Don't Run as Root
Create non-root user in Dockerfile:
```dockerfile
RUN useradd -m -u 1000 voicebox
USER voicebox
```
### 2. Use Secrets for API Keys
Don't put API keys in docker-compose.yml:
```bash
# Use Docker secrets
echo "sk-your-key" | docker secret create openai_key -
docker service create \
--secret openai_key \
-e OPENAI_API_KEY_FILE=/run/secrets/openai_key \
voicebox
```
### 3. Network Isolation
Use internal networks for multi-container setups:
```yaml
services:
voicebox:
networks:
- internal
nginx:
networks:
- internal
- external
ports:
- "80:80"
networks:
internal:
internal: true
external:
```
### 4. Resource Limits
Prevent resource exhaustion:
```yaml
services:
voicebox:
deploy:
resources:
limits:
cpus: '4'
memory: 8G
reservations:
cpus: '2'
memory: 4G
```
## Performance Tuning
### GPU Memory Management
```bash
# Use 80% of GPU (default 90%)
docker run -e GPU_MEMORY_FRACTION=0.8 voicebox
# Let the PyTorch CUDA allocator grow on demand instead of pre-reserving (helps avoid fragmentation OOM)
docker run -e PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True voicebox
```
### Model Caching
Pre-download models to volume:
```bash
# Download models first
docker run --rm -v huggingface-cache:/root/.cache/huggingface \
voicebox python -c "
from transformers import WhisperProcessor, WhisperForConditionalGeneration
WhisperProcessor.from_pretrained('openai/whisper-base')
WhisperForConditionalGeneration.from_pretrained('openai/whisper-base')
"
# Then run normally
docker run -v huggingface-cache:/root/.cache/huggingface voicebox
```
### Multi-Worker Setup
Use uvicorn workers for better throughput (note that each worker loads its own copy of the models, so memory use scales with worker count):
```dockerfile
CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
```
## Monitoring
### Health Checks
Built-in health endpoint:
```bash
curl http://localhost:8000/health
```
Docker health check:
```yaml
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
interval: 30s
timeout: 10s
retries: 3
```
### Prometheus Metrics
Add metrics exporter:
```python
# backend/main.py
from prometheus_fastapi_instrumentator import Instrumentator
Instrumentator().instrument(app).expose(app)
```
Then scrape `/metrics` with Prometheus.
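A minimal `prometheus.yml` scrape job for the stack above (the `voicebox:8000` target assumes the compose service name from the multi-service example; adjust the host and port to your deployment):

```yaml
scrape_configs:
  - job_name: voicebox
    scrape_interval: 15s
    metrics_path: /metrics
    static_configs:
      - targets: ["voicebox:8000"]
```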
### Logs
View container logs:
```bash
docker logs -f voicebox
# Or with compose
docker compose logs -f voicebox
```
## Next Steps
- [ ] Publish official images to GitHub Container Registry
- [ ] Add Kubernetes Helm charts
- [ ] Create Docker Desktop extension
- [ ] Add automated vulnerability scanning
- [ ] Support ARM64 builds for Raspberry Pi / Apple Silicon
## Contributing
Help improve Docker support:
1. Test on different platforms (AMD GPU, ARM64, etc.)
2. Submit Dockerfile optimizations
3. Share deployment configurations
4. Report issues: [GitHub Issues](https://github.com/jamiepine/voicebox/issues)
## Resources
- [Docker Documentation](https://docs.docker.com)
- [NVIDIA Container Toolkit](https://github.com/NVIDIA/nvidia-docker)
- [AMD ROCm Docker](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html)
- [Docker Compose Reference](https://docs.docker.com/compose/compose-file/)

View File

@@ -0,0 +1,235 @@
# OpenAI API Compatibility
**Status:** Planned for v0.2.0
**Issue:** [#10 OpenAI API compatibility](https://github.com/jamiepine/voicebox/issues/10)
## Overview
This feature exposes OpenAI-compatible endpoints from Voicebox, allowing any tool, library, or application that speaks the OpenAI Audio API to use Voicebox as a drop-in local replacement.
```mermaid
flowchart LR
subgraph clients [External Clients]
SDK[OpenAI SDK]
Curl[curl / HTTP]
Apps[Third-party Apps]
end
subgraph voicebox [Voicebox Server]
OpenAI["/v1/audio/* endpoints"]
TTS[TTSModel]
Whisper[WhisperModel]
Profiles[Voice Profiles]
end
SDK --> OpenAI
Curl --> OpenAI
Apps --> OpenAI
OpenAI --> TTS
OpenAI --> Whisper
OpenAI --> Profiles
```
## Use Cases
- **OpenAI SDK users**: `openai.audio.speech.create()` works with Voicebox
- **LLM frameworks**: LangChain, AutoGen, etc. can use Voicebox for TTS
- **Shell scripts**: `curl` commands copy-pasted from OpenAI docs work
- **Existing integrations**: Any tool expecting OpenAI's API works without code changes
## Endpoints to Implement
### 1. `POST /v1/audio/speech` (TTS)
OpenAI spec: https://platform.openai.com/docs/api-reference/audio/createSpeech
**Request:**
```json
{
"model": "tts-1",
"input": "Hello world!",
"voice": "alloy",
"response_format": "mp3",
"speed": 1.0
}
```
**Response:** Audio file (mp3, wav, opus, aac, flac, pcm)
**Voice Mapping Strategy:**
- `voice` parameter maps to Voicebox profile names (case-insensitive)
- If no match, use a configurable default profile
- Support special syntax: `voice: "profile:uuid"` for explicit profile ID
### 2. `POST /v1/audio/transcriptions` (Whisper)
OpenAI spec: https://platform.openai.com/docs/api-reference/audio/createTranscription
**Request:** (multipart/form-data)
- `file`: Audio file
- `model`: "whisper-1"
- `language`: Optional language hint
- `response_format`: json, text, srt, verbose_json, vtt
**Response:**
```json
{
"text": "Hello world!"
}
```
## Implementation Details
### New File: `backend/openai_compat.py`
Create a dedicated module with an APIRouter for OpenAI-compatible endpoints:
```python
from typing import Literal, Optional

from fastapi import APIRouter, Depends, File, Form, HTTPException, UploadFile
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from sqlalchemy.orm import Session

# `get_db` is the app's existing DB session dependency, imported from the backend's database module
router = APIRouter(prefix="/v1/audio", tags=["OpenAI Compatible"])
class SpeechRequest(BaseModel):
model: str = "tts-1"
input: str
voice: str = "alloy"
response_format: Literal["mp3", "wav", "opus", "aac", "flac", "pcm"] = "mp3"
speed: float = 1.0
@router.post("/speech")
async def create_speech(request: SpeechRequest, db: Session = Depends(get_db)):
# 1. Map voice name to profile
# 2. Generate audio using existing TTSModel
# 3. Convert to requested format
# 4. Return audio stream
...
@router.post("/transcriptions")
async def create_transcription(
file: UploadFile = File(...),
model: str = Form("whisper-1"),
language: Optional[str] = Form(None),
response_format: str = Form("json"),
):
# 1. Save uploaded file
# 2. Transcribe using existing WhisperModel
# 3. Return in requested format
...
```
### Voice Profile Resolution
Add helper in [backend/profiles.py](backend/profiles.py):
```python
async def resolve_voice_for_openai(voice: str, db: Session) -> Optional[VoiceProfile]:
"""
Resolve OpenAI voice parameter to a Voicebox profile.
Priority:
1. Exact profile name match (case-insensitive)
2. Profile ID match (if voice starts with "profile:")
3. Default profile from config
4. First available profile
"""
...
```
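The priority order above can be sketched as plain Python over an in-memory profile list (the `id`/`name` dicts are an illustrative stand-in for the real `VoiceProfile` rows; the actual helper would query the database):

```python
def resolve_voice(voice, profiles, default_id=None):
    """Resolve an OpenAI `voice` string against known profiles.

    `profiles` is a list of dicts with `id` and `name` keys
    (illustrative stand-in for VoiceProfile rows).
    """
    # 1. Exact profile name match (case-insensitive)
    for p in profiles:
        if p["name"].lower() == voice.lower():
            return p
    # 2. Explicit profile ID via the "profile:<uuid>" syntax
    if voice.startswith("profile:"):
        wanted = voice.split(":", 1)[1]
        for p in profiles:
            if p["id"] == wanted:
                return p
    # 3. Configured default profile
    if default_id is not None:
        for p in profiles:
            if p["id"] == default_id:
                return p
    # 4. First available profile (or None if there are none)
    return profiles[0] if profiles else None
```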
### Audio Format Conversion
Add conversion utilities in [backend/utils/audio.py](backend/utils/audio.py):
```python
def convert_audio_format(
audio: np.ndarray,
sample_rate: int,
target_format: str, # mp3, wav, opus, aac, flac, pcm
) -> bytes:
"""Convert audio to target format using ffmpeg or pydub."""
...
```
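The `wav` and `pcm` cases can be handled with the standard library alone; a minimal sketch, assuming mono float samples in [-1, 1] (a plain list here, rather than the `np.ndarray` in the signature above). Compressed targets (mp3, opus, aac, flac) would still go through ffmpeg or pydub:

```python
import io
import struct
import wave

def convert_audio_format(samples, sample_rate, target_format):
    """Convert mono float samples in [-1, 1] to 'wav' or 'pcm' bytes."""
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    pcm = b"".join(struct.pack("<h", int(s * 32767)) for s in clipped)
    if target_format == "pcm":
        return pcm  # raw 16-bit little-endian PCM
    if target_format == "wav":
        buf = io.BytesIO()
        with wave.open(buf, "wb") as wf:
            wf.setnchannels(1)
            wf.setsampwidth(2)  # 16-bit samples
            wf.setframerate(sample_rate)
            wf.writeframes(pcm)
        return buf.getvalue()
    raise ValueError(f"{target_format!r} requires an ffmpeg/pydub backend")
```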
### Configuration
Add to [backend/config.py](backend/config.py):
```python
# OpenAI API Compatibility
OPENAI_COMPAT_ENABLED = True
OPENAI_COMPAT_DEFAULT_VOICE = None # Profile ID or name for default voice
OPENAI_COMPAT_REQUIRE_AUTH = False # Require API key validation
OPENAI_COMPAT_API_KEY = None # If set, validate against this
```
### Integration with main.py
In [backend/main.py](backend/main.py), include the router:
```python
from . import openai_compat
# Add OpenAI-compatible routes
if config.OPENAI_COMPAT_ENABLED:
app.include_router(openai_compat.router)
```
## Streaming Support (Future Enhancement)
Initial implementation returns complete audio. Streaming can be added later:
```python
@router.post("/speech")
async def create_speech(request: SpeechRequest):
    # assumes SpeechRequest gains a `stream: bool = False` field
    if request.stream:
        return StreamingResponse(
            generate_audio_chunks(request),
            media_type=f"audio/{request.response_format}",
        )
    ...
```
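A minimal sketch of the hypothetical `generate_audio_chunks` helper, here reduced to slicing pre-encoded bytes (a real implementation would yield encoder output as it is produced):

```python
import asyncio

async def generate_audio_chunks(audio_bytes, chunk_size=4):
    """Yield fixed-size slices of already-encoded audio bytes."""
    for i in range(0, len(audio_bytes), chunk_size):
        yield audio_bytes[i:i + chunk_size]
        await asyncio.sleep(0)  # cooperative yield to the event loop

async def _demo():
    return [chunk async for chunk in generate_audio_chunks(b"abcdefghij")]

chunks = asyncio.run(_demo())
# chunks == [b"abcd", b"efgh", b"ij"]
```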
## Testing
Example usage after implementation:
```bash
# TTS with curl
curl http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "input": "Hello!", "voice": "MyProfile"}' \
  --output speech.mp3

# Transcription
curl http://localhost:8000/v1/audio/transcriptions \
  -F file=@audio.mp3 \
  -F model="whisper-1"
```

```python
# With the OpenAI Python SDK
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
response = client.audio.speech.create(
    model="tts-1",
    voice="MyProfile",
    input="Hello world!",
)
response.stream_to_file("output.mp3")
```
## Security Considerations
- Optional API key validation (for shared deployments)
- Rate limiting on endpoints
- Input length limits (same as existing `/generate` endpoint)
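The optional API key validation could be as small as a header comparison; a hypothetical helper (`configured_key` would come from `OPENAI_COMPAT_API_KEY`):

```python
import hmac

def check_api_key(auth_header, configured_key):
    """Return True if the request may proceed.

    `auth_header` is the raw Authorization header value (or None);
    `configured_key` is OPENAI_COMPAT_API_KEY (None disables the check).
    """
    if configured_key is None:
        return True  # auth disabled
    if not auth_header or not auth_header.startswith("Bearer "):
        return False
    # constant-time comparison to avoid timing leaks
    return hmac.compare_digest(auth_header[len("Bearer "):], configured_key)
```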
## Dependencies
- `pydub` or `ffmpeg-python` for audio format conversion (mp3, opus, etc.)
- No changes to existing TTS/Whisper model code

5
docs/postcss.config.mjs Normal file
View File

@@ -0,0 +1,5 @@
export default {
plugins: {
'@tailwindcss/postcss': {},
},
};

Binary file not shown (image added, 134 KiB)

Binary file not shown (image added, 129 KiB)

Binary file not shown (image added, 108 KiB)

Binary file not shown (image added, 10 KiB)

Binary file not shown (image added, 10 KiB)

View File

@@ -0,0 +1,10 @@
import { generateFiles } from 'fumadocs-openapi';
import { openapi } from '../lib/openapi';
await generateFiles({
input: openapi,
output: 'content/docs/api-reference',
groupBy: 'tag',
});
console.log('✓ OpenAPI documentation generated in content/docs/api-reference/');

22
docs/source.config.ts Normal file
View File

@@ -0,0 +1,22 @@
import { defineConfig, defineDocs, frontmatterSchema, metaSchema } from 'fumadocs-mdx/config';
// You can customise Zod schemas for frontmatter and `meta.json` here
// see https://fumadocs.dev/docs/mdx/collections
export const docs = defineDocs({
dir: 'content/docs',
docs: {
schema: frontmatterSchema,
postprocess: {
includeProcessedMarkdown: true,
},
},
meta: {
schema: metaSchema,
},
});
export default defineConfig({
mdxOptions: {
// MDX options
},
});

36
docs/tsconfig.json Normal file
View File

@@ -0,0 +1,36 @@
{
"compilerOptions": {
"baseUrl": ".",
"target": "ESNext",
"lib": ["dom", "dom.iterable", "esnext"],
"allowJs": true,
"skipLibCheck": true,
"strict": true,
"forceConsistentCasingInFileNames": true,
"noEmit": true,
"esModuleInterop": true,
"module": "esnext",
"moduleResolution": "bundler",
"resolveJsonModule": true,
"isolatedModules": true,
"jsx": "react-jsx",
"incremental": true,
"paths": {
"@/*": ["./*"],
"@/.source": [".source"]
},
"plugins": [
{
"name": "next"
}
]
},
"include": [
"next-env.d.ts",
"**/*.ts",
"**/*.tsx",
".next/types/**/*.ts",
".next/dev/types/**/*.ts"
],
"exclude": ["node_modules"]
}