Initial commit
This commit is contained in:
docs/.gitignore (vendored, new file, 26 lines)
# deps
/node_modules

# generated content
.source

# test & build
/coverage
/.next/
/out/
/build
*.tsbuildinfo

# misc
.DS_Store
*.pem
/.pnp
.pnp.js
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# others
.env*.local
.vercel
next-env.d.ts
docs/PROJECT_STATUS.md (new file, 634 lines)
# Voicebox Project Status & Roadmap

> Last updated: 2026-04-18 | Current version: **v0.4.1** | 232 open issues | 12 open PRs

---

## Table of Contents

1. [Architecture Overview](#architecture-overview)
2. [Current State](#current-state)
3. [Open PRs — Triage & Analysis](#open-prs--triage--analysis)
4. [Open Issues — Categorized](#open-issues--categorized)
5. [Existing Plan Documents — Status](#existing-plan-documents--status)
6. [New Model Integration — Landscape](#new-model-integration--landscape)
7. [Architectural Bottlenecks](#architectural-bottlenecks)
8. [Recommended Priorities](#recommended-priorities)

---
## Architecture Overview

**Tauri shell (Rust)** hosts a **React frontend** (`app/`) that talks over HTTP on `localhost:17493` to a **FastAPI backend** (`backend/`).

The backend exposes:

- **`TTSBackend` Protocol** with seven concrete engine implementations:
  - Qwen3-TTS (PyTorch or MLX depending on platform)
  - Qwen CustomVoice (predefined speakers with instruct)
  - LuxTTS (fast, CPU-friendly)
  - Chatterbox Multilingual (23 languages)
  - Chatterbox Turbo (English, paralinguistic tags)
  - TADA (1B English, 3B multilingual via HumeAI)
  - Kokoro 82M (pre-built voices, CPU realtime)
- **`STTBackend` Protocol** for Whisper (PyTorch or MLX-Whisper)
- **Profiles / History / Stories** services for persistence and timeline editing
### Key Files

| Layer | File | Purpose |
|-------|------|---------|
| Backend entry | `backend/main.py` | FastAPI app, all API routes (~2850 lines) |
| TTS protocol | `backend/backends/__init__.py:32-101` | `TTSBackend` Protocol definition |
| Model registry | `backend/backends/__init__.py:17-29,153-366` | `ModelConfig` dataclass + registry helpers |
| TTS factory | `backend/backends/__init__.py:382-426` | Thread-safe engine registry (double-checked locking) |
| PyTorch TTS | `backend/backends/pytorch_backend.py` | Qwen3-TTS via `qwen_tts` package |
| MLX TTS | `backend/backends/mlx_backend.py` | Qwen3-TTS via `mlx_audio.tts` |
| LuxTTS | `backend/backends/luxtts_backend.py` | LuxTTS — fast, CPU-friendly |
| Chatterbox MTL | `backend/backends/chatterbox_backend.py` | Chatterbox Multilingual — 23 languages |
| Chatterbox Turbo | `backend/backends/chatterbox_turbo_backend.py` | Chatterbox Turbo — English, paralinguistic tags |
| TADA | `backend/backends/hume_backend.py` | HumeAI TADA — 1B English + 3B Multilingual |
| Kokoro | `backend/backends/kokoro_backend.py` | Kokoro 82M — CPU realtime, pre-built voices |
| Qwen CustomVoice | `backend/backends/qwen_custom_voice_backend.py` | Qwen CustomVoice — predefined speakers with instruct |
| Platform detect | `backend/platform_detect.py` | Apple Silicon → MLX, else → PyTorch |
| API types | `backend/models.py` | Pydantic request/response models |
| HF progress | `backend/utils/hf_progress.py` | `HFProgressTracker` (tqdm patching for download progress) |
| Audio utils | `backend/utils/audio.py` | `trim_tts_output()`, normalize, load/save audio |
| Frontend API | `app/src/lib/api/client.ts` | Hand-written fetch wrapper |
| Frontend types | `app/src/lib/api/types.ts` | TypeScript API types |
| Engine selector | `app/src/components/Generation/EngineModelSelector.tsx` | Shared engine/model dropdown |
| Generation form | `app/src/components/Generation/GenerationForm.tsx` | TTS generation UI |
| Floating gen box | `app/src/components/Generation/FloatingGenerateBox.tsx` | Compact generation UI |
| Model manager | `app/src/components/ServerSettings/ModelManagement.tsx` | Model download/status/progress UI |
| GPU acceleration | `app/src/components/ServerSettings/GpuAcceleration.tsx` | CUDA backend swap UI |
| Gen form hook | `app/src/lib/hooks/useGenerationForm.ts` | Form validation + submission |
| Language constants | `app/src/lib/constants/languages.ts` | Per-engine language maps |

### How TTS Generation Works (Current Flow)

```
POST /generate
  1. Look up voice profile from DB
  2. Resolve engine from request (qwen | qwen_custom_voice | luxtts | chatterbox | chatterbox_turbo | tada | kokoro)
  3. Get backend: get_tts_backend_for_engine(engine)   # thread-safe singleton per engine
  4. Check model cache → if missing, trigger background download, return HTTP 202
  5. Load model (lazy): tts_backend.load_model(model_size)
  6. Create voice prompt: profiles.create_voice_prompt_for_profile(engine=engine)
     → tts_backend.create_voice_prompt(audio_path, reference_text)
  7. Generate: tts_backend.generate(text, voice_prompt, language, seed, instruct)
  8. Post-process: trim_tts_output() for Chatterbox engines
  9. Save WAV → data/generations/{id}.wav
  10. Insert history record in SQLite
  11. Return GenerationResponse
```
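The flow above can be condensed into a runnable toy. Everything here is a stand-in for illustration — the real handler lives in `backend/main.py`, is async, and includes the caching, chunking, trimming, and history steps that are elided:

```python
from dataclasses import dataclass


@dataclass
class GenerationRequest:
    """Toy request; the real Pydantic model lives in backend/models.py."""
    text: str
    engine: str


class ToyBackend:
    """Stands in for a concrete TTSBackend implementation."""

    def load_model(self, model_size: str) -> None:
        self.model_size = model_size          # step 5: lazy load

    def create_voice_prompt(self, audio_path: str, reference_text: str) -> dict:
        return {"ref_audio": audio_path, "ref_text": reference_text}

    def generate(self, text: str, voice_prompt: dict) -> bytes:
        return f"wav:{text}".encode()         # real code returns audio samples


_backends = {"qwen": ToyBackend()}


def get_tts_backend_for_engine(engine: str) -> ToyBackend:
    return _backends[engine]                  # step 3: per-engine singleton


def handle_generate(req: GenerationRequest) -> bytes:
    backend = get_tts_backend_for_engine(req.engine)
    backend.load_model("1.7B")                                        # step 5
    prompt = backend.create_voice_prompt("ref.wav", "reference text")  # step 6
    return backend.generate(req.text, prompt)                          # step 7
```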
---

## Current State

### What's Shipped (v0.4.x)

**New since v0.3.0:**

- Kokoro 82M TTS engine + voice profile type system (PR #325)
- Qwen CustomVoice preset engine — predefined speakers with instruct support (PR #328)
- Intel Arc (XPU) GPU support (PR #320)
- Blackwell GPU (sm_120) CUDA support (PR #401)
- Generation cancellation flow (PR #444)
- Frontend quality gates + TypeScript hardening (PR #418)
- macOS Intel (x86_64) PyTorch compatibility (PR #416)
- Frozen-binary import fixes for Kokoro / Chatterbox Multilingual / scipy / transformers (PR #438)
- Linux PipeWire/PulseAudio monitor detection (PR #457)
- Server survives GUI close on Windows (PR #402)
- GPU arch compatibility warning on startup (catches unsupported PyTorch builds)
- cpal Stream playback reliability (PR #405), clip-splitting stability (PR #403)
- torch.from_numpy crash with numpy 2.x in frozen binary (PR #361)
- Async CUDA download lock (PR #428), NUMBA_CACHE_DIR env var (PR #425)
- "Clear failed" history button (PR #412)
- External server GUI startup + data refresh (PR #319)
- Force offline mode for cached Qwen/Whisper models (PR #318)
- macOS 11 ScreenCaptureKit launch crash fix (PR #424)

**Core TTS (cumulative):**

- Qwen3-TTS voice cloning (1.7B and 0.6B models, MLX + PyTorch)
- Qwen CustomVoice (preset speakers, instruct)
- LuxTTS — fast, CPU-friendly English TTS (PR #254)
- Chatterbox Multilingual — 23 languages including Hebrew (PR #257)
- Chatterbox Turbo — paralinguistic tags, low latency English (PR #258)
- HumeAI TADA — 1B English + 3B Multilingual (PR #296)
- Kokoro 82M — CPU-realtime, 8 languages, Apache 2.0 (PR #325)
- Multi-engine architecture with thread-safe backend registry (PR #254)
- Chunked TTS generation — engine-agnostic, removes ~500 char limit (PR #266)
- Async generation queue (PR #269)
- Post-processing audio effects system (PR #271)
- Voice profile type system (preset vs cloned, engine compatibility gating)
- Centralized `ModelConfig` registry — no per-engine dispatch maps
- Shared `EngineModelSelector` component

**Infrastructure (cumulative):**

- CUDA backend swap via binary download (PR #252), cu128 upgrade (PR #316), Blackwell/sm_120 (PR #401)
- CUDA backend split into independently versioned server + libs archives (PR #298)
- Intel Arc XPU support (PR #320)
- Docker + web deployment (PR #161)
- Backend refactor: modular architecture, style guide, tooling (PR #285)
- Settings overhaul: routed sub-tabs, server logs, changelog, about page (PR #294)
- Windows support: CUDA detection, cross-platform justfile, server lifecycle (PR #272, #402)
- Linux audio capture via pactl monitor detection (PR #457)
- macOS Intel x86_64 compatibility (PR #416)
- Voice profiles with multi-sample support
- Stories editor (multi-track DAW timeline)
- Whisper transcription (base, small, medium, large, turbo variants)
- Model management UI with inline download progress + folder migration (PR #268)
- Download cancel/clear UI with error panel (PR #238)
- Generation history with caching and cancellation (PR #444)
- Streaming generation endpoint (MLX only)
- Audio player freeze fix + UX improvements (PR #293)
- CORS restriction to known local origins (PR #88)

### Abandoned / Backlogged Integrations

| Model | PR / Branch | Reason |
|-------|-------------|--------|
| **CosyVoice2/3** | PR #311 | Output quality too poor. Heavy deps, no PyPI, needed 5+ shims. PR should be closed. |
| **VoxCPM 1.5 / VoxCPM2** | `voicebox-new-models` research (2026-04-18) | **Backlogged.** See detailed analysis below. |

#### VoxCPM — Evaluation Notes (2026-04-18)

**Project:** [OpenBMB/VoxCPM](https://github.com/OpenBMB/VoxCPM) — tokenizer-free TTS, 2B params (VoxCPM2), end-to-end diffusion autoregressive architecture, 30 languages, 48 kHz output, Apache 2.0, `pip install voxcpm`.

**Why it looked interesting:**

- Clean PyPI install (`pip install voxcpm`)
- Apache 2.0 — commercially safe
- Voice cloning via `reference_wav_path` with optional `prompt_wav_path` + `prompt_text` for "ultimate" cloning
- Streaming API via `generate_streaming()`
- Zero-shot cloning + style control via parenthetical prefixes in text (`(slightly faster, cheerful tone)...`)
- Relatively high-quality output per demos

**Why we backlogged it:**

- **Effectively CUDA-only.** The README states `CUDA ≥ 12.0` as a hard requirement. The source's `from_pretrained(device=None|"auto")` claims "preferring CUDA, then MPS, then CPU," but in practice:
  - **MPS (Apple Silicon) broken upstream** — OpenBMB/VoxCPM issues #232 (`NotImplementedError: Output channels > 65536 not supported at the MPS device`) and #248 (`IndexError` on M3 Mac) are both open with no resolution.
  - **CPU unsupported in the Python package** — issue #256 shows `voxcpm --device cpu` rejected with `unrecognized arguments`. The only CPU path is the third-party **VoxCPM.cpp** GGML engine, which is a separate ecosystem project, not `pip install voxcpm`.
  - **macOS source install fails** — issue #233 open with no resolution.
- Would require CUDA-only gating in the UI (a new `requires_cuda` flag on `ModelConfig`, lock icon + "Requires NVIDIA GPU" in `ModelManagement.tsx` / `EngineModelSelector.tsx`) plus a hard error at `load_model()` as a safety net. Doable, but it adds first-class platform gating that doesn't exist for any other engine today.
- Voicebox's user base skews Apple Silicon (MLX is a primary backend). Shipping a CUDA-only model sets a precedent worth a separate scoping discussion (see issues #419 engine sprawl, #420 platform tiers, PR #465).
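The gating described above could look roughly like this. Hypothetical sketch: `requires_cuda` does not exist on the real `ModelConfig` today, and the field and filter names are assumptions:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Simplified stand-in for the real registry dataclass."""
    engine: str
    model_name: str
    requires_cuda: bool = False   # proposed flag — not in the real dataclass yet


def selectable_models(configs: list[ModelConfig], cuda_available: bool) -> list[ModelConfig]:
    """UI-side gating: drop (or lock) CUDA-only entries on non-NVIDIA hosts."""
    return [c for c in configs if cuda_available or not c.requires_cuda]
```

The backend safety net would be the mirror check inside `load_model()`: raise a hard error when `requires_cuda` is set and no CUDA device is present.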
**What would change the decision:**

- Upstream fixes MPS crashes (watch issues #232, #248).
- We define an "experimental / CUDA-only" engine tier as part of issue #419 / PR #465, and decide it's acceptable to ship engines that are hidden on non-NVIDIA platforms.
- VoxCPM.cpp matures into a viable CPU path we can wrap (currently a separate project, C++/GGML, unclear ergonomics).

**Integration shape if we revive it:** Zero-shot cloning maps naturally to the Chatterbox-style backend (store `ref_audio` + `ref_text` paths in the voice prompt dict, process at generate time). Est. ~250 lines for `voxcpm_backend.py` + one `ModelConfig` entry + engine registration in `backends/__init__.py`. Frontend UI gating is the bigger lift.
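The deferred Chatterbox-style prompt shape mentioned above is just a dict of paths resolved at generate time. A minimal sketch (the dict keys are assumptions based on the description, not verified against the real backends):

```python
def create_voice_prompt(audio_path: str, reference_text: str) -> dict:
    # Deferred prompt: record the reference paths now; load and process the
    # audio only when generate() actually runs (Chatterbox-style).
    return {"ref_audio": audio_path, "ref_text": reference_text}
```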
### What's In-Flight

| Feature | Branch/PR | Status |
|---------|-----------|--------|
| Platform support tiers | PR #465, issue #420 | Defining tier-1 (supported) vs tier-2 (community) platforms |
| Engine sprawl cleanup | issue #419 | First-class vs experimental TTS backends distinction |
| Frontend tech-debt burn-down | issue #421 | Biome + a11y debt before gating CI |
| Docker registry auto-publish | PR #463, issue #453 | ghcr.io image on tag push |
| New model research | `voicebox-new-models` branch | Evaluating Fish Speech, XTTS-v2, Pocket TTS, VibeVoice, Fish Audio S2, index-tts2 |

### TTS Engine Comparison

| Engine | Model Name | Profile Type | Languages | Size | Key Features | Instruct Support |
|--------|-----------|--------------|-----------|------|-------------|-----------------|
| Qwen3-TTS 1.7B | `qwen-tts-1.7B` | Cloned | 10 (zh, en, ja, ko, de, fr, ru, pt, es, it) | ~3.5 GB | Highest quality, voice cloning | None (Base model has no instruct path) |
| Qwen3-TTS 0.6B | `qwen-tts-0.6B` | Cloned | 10 | ~1.2 GB | Lighter, faster | None |
| Qwen CustomVoice 1.7B | `qwen-custom-voice-1.7B` | Preset | 10 | ~3.5 GB | Predefined speakers, instruct support | **Yes** |
| Qwen CustomVoice 0.6B | `qwen-custom-voice-0.6B` | Preset | 10 | ~1.2 GB | Predefined speakers, instruct support | **Yes** |
| LuxTTS | `luxtts` | Cloned | English | ~300 MB | CPU-friendly, 48 kHz, fast | None |
| Chatterbox | `chatterbox-tts` | Cloned | 23 (incl. Hebrew, Arabic, Hindi, etc.) | ~3.2 GB | Zero-shot cloning, multilingual | Partial — `exaggeration` float (0-1) |
| Chatterbox Turbo | `chatterbox-turbo` | Cloned | English | ~1.5 GB | Paralinguistic tags ([laugh], [cough]), 350M params, low latency | Partial — inline tags only |
| TADA 1B | `tada-1b` | Cloned | English | ~4 GB | HumeAI speech-language model, 700s+ coherent audio | None |
| TADA 3B Multilingual | `tada-3b-ml` | Cloned | 10 (en, ar, zh, de, es, fr, it, ja, pl, pt) | ~8 GB | Multilingual, text-acoustic dual alignment | None |
| Kokoro 82M | `kokoro` | Preset | 8 (en, es, fr, hi, it, pt, ja, zh) | ~350 MB | 82M params, CPU realtime, Apache 2.0, pre-built voices | None |

### Multi-Engine Architecture (Shipped)

- **Thread-safe backend registry** (`_tts_backends` dict + `_tts_backends_lock`) with double-checked locking
- **Per-engine backend instances** — each engine gets its own singleton, loaded lazily
- **Engine field on GenerationRequest** — frontend sends `engine: 'qwen' | 'qwen_custom_voice' | 'luxtts' | 'chatterbox' | 'chatterbox_turbo' | 'tada' | 'kokoro'`
- **Per-engine language filtering** — `ENGINE_LANGUAGES` map in frontend, backend regex accepts all languages
- **Per-engine voice prompts** — `create_voice_prompt_for_profile()` dispatches to the correct backend
- **Profile type system** — preset vs cloned profiles, UI grays out incompatible engines and auto-switches on selection
- **Trim post-processing** — `trim_tts_output()` for Chatterbox engines (cuts trailing silence/hallucination)
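The registry's double-checked locking pattern, in minimal form. This is a sketch: the real factory in `backends/__init__.py:382-426` constructs the engine's actual backend class where this uses a placeholder:

```python
import threading

_tts_backends: dict[str, object] = {}
_tts_backends_lock = threading.Lock()


def get_tts_backend_for_engine(engine: str) -> object:
    backend = _tts_backends.get(engine)          # first check: no lock (fast path)
    if backend is None:
        with _tts_backends_lock:
            backend = _tts_backends.get(engine)  # second check: under the lock
            if backend is None:
                # Real code constructs the engine's backend class here; the
                # re-check above ensures two racing threads can't both do so.
                backend = object()
                _tts_backends[engine] = backend
    return backend
```

The second check is what makes it "double-checked": a thread that lost the race to the lock finds the instance already created and returns it instead of building a duplicate.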
### Known Limitations

- **HF XET progress**: Large files downloaded via `hf-xet` (HuggingFace's new transfer backend) report `n=0` in tqdm updates. Progress bars may appear stuck for large `.safetensors` files even though the download is proceeding. This is a known upstream limitation.
- **Chatterbox Turbo upstream token bug**: `from_pretrained()` passes `token=os.getenv("HF_TOKEN") or True`, which fails without a stored HF token. Our backend works around this by calling `snapshot_download(token=None)` + `from_local()`.
- **chatterbox-tts must install with `--no-deps`**: It pins `numpy<1.26`, `torch==2.6.0`, `transformers==4.46.3` — all incompatible with our stack (Python 3.12, torch 2.10, transformers 4.57.3). Sub-deps are listed explicitly in `requirements.txt`.
- **Instruct parameter partially shipped** (#224, #303): Qwen CustomVoice (PR #328) now provides real instruct support via predefined speakers. Other backends still silently drop the instruct field — the UI exposes it broadly but most engines ignore it. The floating generate box was patched to restore instruct for CustomVoice (commit `106aec4`).
- **Streaming generation** only works for Qwen on MLX. Other engines use the non-streaming `/generate` endpoint.
- **dicta-onnx** (Hebrew diacritization) not included — an upstream Chatterbox bug requires a `model_path` arg but calls `Dicta()` with none. Hebrew works fine without it.
- **Blackwell (RTX 50-series) CUDA**: cu128 + sm_120 kernel support shipped (PR #401, #316), but users still report `cudaErrorNoKernelImageForDevice` (#417, #400, #396, #395, #390, #362) — likely a stale CUDA binary on upgraded installs. Needs a follow-up diagnostic / forced re-download path.
- **Long-text 50k character limit** (#464, #365, #354): Still hit on GPU despite chunking (PR #266). Chunking reliability needs another pass.
- **ROCm on RDNA 3/4** (#469): `HSA_OVERRIDE_GFX_VERSION` is hardcoded and harms newer cards.
- **`flash-attn is not installed` warning on every platform** (cosmetic, but a common user complaint): Our transformer-based engines (Chatterbox / Qwen) emit `Warning: flash-attn is not installed. Will only run the manual PyTorch version. Please install flash-attn for faster inference.` on every startup, on every platform. We don't pin `flash-attn` in requirements because installing it is fragile and version-sensitive. The fallback is PyTorch SDPA, which is near-FA2 throughput on Ampere+ and is what actually runs. Per-platform reality:
  - **macOS/Apple Silicon** — FlashAttention is CUDA-only, so it is irrelevant here; MLX has its own attention kernels.
  - **Linux** — `pip install flash-attn --no-build-isolation` works but takes 20+ minutes to compile.
  - **Windows** — no official support (the Dao-AILab README still says only "Might work"; source builds routinely fail on recent CUDA/MSVC — issues #1715, #1828, #2395). Windows users can install community prebuilt wheels from `kingbri1/flash-attention` or `bdashore3/flash-attention` (latest v2.8.3, Aug 2025; `win_amd64` wheels for CUDA 12.4/12.8, Torch 2.6–2.9, Python 3.10–3.13) matching their exact CUDA/Torch/Python combination, or use WSL2. Native-Windows alternatives worth considering as a build-time swap: SageAttention (thu-ml, Apache 2.0, claims 2–5× over FA2) and xformers (official Windows wheels).
  - **Action for us:** the troubleshooting doc now covers it (see `docs/content/docs/overview/troubleshooting.mdx`), and we should optionally suppress the warning via `logging.getLogger(...).setLevel(ERROR)` at backend import, since the fallback is functionally fine.
- **WebAudio playback dies after an audio-session interruption** (#41, plus an internal repro where the app is backgrounded long enough): WaveSurfer's `AudioContext` gets suspended by macOS — either because another app grabs the audio output, or because the WKWebView throttles when backgrounded. `play()` resolves and `timeupdate` can still fire, but no audio reaches the output; only an app restart fixes it.
  - **Already tried, didn't work:** (a) swapping the WaveSurfer backend away from WebAudio — introduced more bugs, not an option; (b) a remount hook on the player — doesn't help because a freshly created `AudioContext` is born suspended and only resumes on a user gesture. PR #293 was a prior partial fix that doesn't cover this path.
  - **Next thing to try** (not yet attempted — confirmed via grep of `AudioPlayer.tsx`): call `wavesurfer.getMediaElement().getGainNode().context.resume()` on the play-button click (the click itself is a valid user gesture), plus `visibilitychange` and `statechange` listeners as belt-and-suspenders. The `ctx.resume()` pattern already exists in the codebase at `useStoryPlayback.ts:52` — it's just not wired into the main player.
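The flash-attn warning suppression floated above could be as small as the following. The logger names here are guesses — whichever module actually emits the warning would need to be identified first, and if the message comes from a bare `print` rather than `logging`, this won't catch it:

```python
import logging

# Hypothetical suppression at backend import time. Raising these loggers to
# ERROR drops the WARNING-level "flash-attn is not installed" message while
# keeping real errors visible. Logger names are assumptions.
for name in ("transformers", "chatterbox", "qwen_tts"):
    logging.getLogger(name).setLevel(logging.ERROR)
```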
---

## Open PRs — Triage & Analysis

### Recently Merged (Since Last Update — 2026-03-18 → 2026-04-18)

| PR | Title | Merged |
|----|-------|--------|
| **#481** | fix(build): pin transformers in MLX requirements to prevent 5.x upgrade | 2026-04-19 |
| **#470** | fix(api-client): declare moved + errors on migrateModels response type | 2026-04-18 |
| **#457** | fix(linux): use pactl to detect PipeWire/PulseAudio monitor | 2026-04-18 |
| **#450** | docs: clarify paralinguistic tag support in quick start | 2026-04-18 |
| **#447** | fix: delete version rows and files in delete_generations_by_profile | 2026-04-18 |
| **#444** | Fix generation cancellation flow | 2026-04-18 |
| **#440** | fix(paths): strip legacy "data/" prefix when resolving stored paths | 2026-04-18 |
| **#439** | Fix migration dialog hanging when no models are present | 2026-04-18 |
| **#438** | fix(build): repair frozen-binary imports for kokoro/chatterbox-multilingual/scipy/transformers | 2026-04-18 |
| **#433** | fix: warn user when no models to migrate during storage change | 2026-04-18 |
| **#425** | Add NUMBA_CACHE_DIR environment variable | 2026-04-16 |
| **#424** | fix: avoid ScreenCaptureKit launch crash on macOS 11 | 2026-04-16 |
| **#418** | Frontend quality gates + TypeScript hardening | 2026-04-18 |
| **#416** | fix(deps): relax PyTorch requirement for macOS Intel (x86_64) | 2026-04-16 |
| **#412** | feat(history): add "Clear failed" button | 2026-04-16 |
| **#405** | fix: keep cpal Stream alive until playback completes | 2026-04-16 |
| **#403** | fix: prevent intermittent clip splitting failures | 2026-04-16 |
| **#402** | fix: reliably keep server alive after GUI close on Windows | 2026-04-16 |
| **#401** | feat: add Blackwell GPU (sm_120) CUDA support | 2026-04-16 |
| **#394** | fix(history): populate status/error/engine fields from DB row | 2026-04-16 |
| **#384** | Fix: Resolve ModuleNotFoundError in effects service | 2026-04-16 |
| **#361** | fix: torch.from_numpy crash with numpy 2.x in frozen binary | 2026-04-16 |
| **#345** | Fix: "Failed to Save" preset error by resolving backend import path | 2026-03-22 |
| **#344** | fix: include changelog in docker web build | 2026-03-27 |
| **#332** | Fix links in Get Started section of index.mdx | 2026-03-21 |
| **#328** | feat: add Qwen CustomVoice preset engine | 2026-03-27 |
| **#325** | feat: Kokoro 82M TTS engine + voice profile type system | 2026-03-20 |
| **#321** | fix: allows deletion of failed generations | 2026-03-19 |
| **#320** | feat: Intel Arc (XPU) GPU support | 2026-03-21 |
| **#319** | fix: GUI startup with external server + data refresh on server switch | 2026-03-27 |
| **#318** | fix: force offline mode when loading cached models (Qwen TTS & Whisper) | 2026-03-21 |
| **#316** | Upgrade CUDA backend from cu126 to cu128, fix GPU settings UI | 2026-03-18 |

### Currently Open (12 PRs)

| PR | Title | Status | Notes |
|----|-------|--------|-------|
| **#465** | docs: define tier-1 and tier-2 platform support targets | Community PR | Pairs with issue #420. Important for scoping. |
| **#463** | feat(actions): add docker-registry.yml for automatic ghcr.io publishing | Community PR | Pairs with issue #453. Low risk. |
| **#443** | fix: prevent infinite retry loop in offline mode (#434) | Community PR | Fixes reported bug. |
| **#430** | feat: add MiniMax TTS provider support | Community PR | Cloud TTS provider — new direction (external API). Superset of #331? |
| **#331** | feat: add MiniMax Cloud TTS as a built-in engine | Community PR | Likely superseded by #430. Dedupe. |
| **#311** | feat: add CosyVoice2/3 TTS engine | **Close** | Abandoned — output quality too poor. |
| **#253** | Enhance speech tokenizer with 48kHz version | Community PR | Qwen tokenizer upgrade. Still worth reviewing. |
| **#227** | fix: harden input validation & file safety | Community PR | Coupled to #225 (custom models). |
| **#225** | feat: custom HuggingFace voice model support | Community PR | Needs rework for multi-engine arch. |
| **#195** | feat: per-profile LoRA fine-tuning | Draft | Complex. 15 new endpoints. |
| **#154** | feat: Audiobook tab | Community PR | Chunked generation now shipped (#266). |
| **#91** | fix: CoreAudio device enumeration | Draft | macOS audio device handling. |

---

## Open Issues — Categorized

### GPU / Hardware Detection — still the top category

**RTX 50-series (Blackwell / sm_120) cluster — NEW:** #417, #400, #396, #395, #390, #362 all report `cudaErrorNoKernelImageForDevice` / "no kernel image available." sm_120 support shipped in PR #401 + cu128 in PR #316, but users on upgraded installs still hit it — likely a stale CUDA binary. Needs a diagnostic that detects binary/GPU-arch mismatch and prompts re-download.

**AMD / ROCm — NEW:** #469 `HSA_OVERRIDE_GFX_VERSION` is hardcoded and breaks RDNA 3/4 cards. #313 DirectML on AMD Ryzen AI Max+ 395 not working.

**Intel Arc:** PR #320 shipped XPU support — may resolve #119.

**General GPU-not-detected (older):** #368, #310, #330, #324, #326, #355 (multi-GPU / eGPU).

**Fix path:** CUDA backend swap (PR #252) + cu128 (PR #316) + sm_120 (PR #401) + GPU-arch warning (`73170d0`) are all in. Remaining work is diagnostics + re-download prompts for users whose binary predates the kernel updates.
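The mismatch detection reduces to comparing the GPU's compute capability against the arch list the binary was compiled with (in real code those come from `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`). Sketched as a pure function so it's testable without a GPU:

```python
def binary_supports_gpu(capability: tuple[int, int], arch_list: list[str]) -> bool:
    """True if the compiled CUDA kernel list covers the detected GPU arch.

    capability: (major, minor) compute capability, e.g. (12, 0) for Blackwell.
    arch_list:  arches the binary ships kernels for, e.g. ["sm_90", "sm_120"].
    """
    sm = f"sm_{capability[0]}{capability[1]}"
    return sm in arch_list
```

A negative result at startup would be the trigger for the "your CUDA binary is stale, re-download it" prompt.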
### Model Downloads

Still reported. Users get stuck downloads, can't resume, and hit offline-mode edge cases.

**Key issues:** #475 (Mac CustomVoice install error), #449 (infinite loading macOS), #445 (can't download CustomVoice), #462 (Qwen requires internet even when loaded — regression from #150), #434 (infinite retry loop offline — PR #443 open), #432 (storage location change hangs when empty — partly fixed by PR #439/#433), #348 (TADA 3B Multilingual download fails), #336 (TADA model not listed in app), #275 (`No module named 'chatterbox'` on download), #304 (whisper-base feature extractor load error), #287 (macOS ARM `check_model_inputs` ImportError on new version), #181, #180.

**Fix path:** PR #443 addresses the infinite offline retry. CustomVoice-specific download failures (#475, #445) need triage — likely related to the frozen-binary import fixes in PR #438. The TADA cluster (#336, #348) and the macOS ARM import regressions (#287, #275, #304) need a dedicated triage pass.

**Qwen 0.6B-downloads-1.7B reports:** **#485** (2026-04-19), **#423** (macOS M1), **#329**. Originally a stale-fallback bug: `mlx-community/Qwen3-TTS-12Hz-0.6B-Base-bf16` wasn't published when MLX support shipped, so the 0.6B slot was aliased to the 1.7B repo. The 0.6B bf16 conversion is live now, and both `backend/backends/mlx_backend.py` and `backend/backends/__init__.py` point at their correct repos. Qwen CustomVoice is unaffected — it runs via PyTorch on all platforms, and both sizes have always had dedicated repos.

### Language Requests (ongoing)

Strong demand: Hungarian (#479), Indonesian (#458, #247), Thai (#455), Bangla (#454), Arabic (#379), Persian (#162), IndicF5 (#339 — Indian languages), Ukrainian (#109), Chinese UI (#392, #261).

**Fix path:** Chatterbox Multilingual (PR #257) covers Arabic, Danish, German, Greek, Finnish, Hebrew, Hindi, Dutch, Norwegian, Polish, Swedish, Swahili, Turkish. Still missing: Hungarian, Indonesian, Thai, Bangla, Ukrainian. Issue #411 offers a PR for a UI i18n foundation.

### New Model Requests (growing)

| Issue | Model Requested |
|-------|----------------|
| #478 | CosyVoice3 (we tried & abandoned CosyVoice2/3 — see #311) |
| #407, #347 | RVC-style voice-to-voice / seed voice conversion (STS) |
| #385 | Fish Audio S2 |
| #380 | OmniVoice |
| #370 | index-tts2 |
| #364 | Voxtral-TTS |
| #335 | Faster-Qwen-TTS |
| #346 | Multi-model batch request |
| #381 | Microsoft MAI models |
| #339 | IndicF5 |
| #226 | GGUF support |
| #172 | VibeVoice |
| #138 | Export to ONNX/Piper format |
| #132 | LavaSR (transcription) |
| #147 | Facebook Omnilingual ASR |
| #338 | Default voices |

The multi-engine architecture makes integration straightforward — see [`content/docs/developer/tts-engines.mdx`](content/docs/developer/tts-engines.mdx). Platform-specific gating (e.g. VoxCPM CUDA-only) doesn't exist yet and would need design.

### Platform Scope & Quality Debt — NEW category

Awareness issues filed this cycle — ties into the engine sprawl and platform tier work.

- **#419** — Engine sprawl: define first-class vs experimental TTS backends
- **#420** — Formalize tier-1 vs tier-2 platform support targets (PR #465 open)
- **#421** — Track & burn down frontend Biome + a11y debt before gating CI
- **#422** — Code-split web build (main bundle > 1 MB)

### Long-Form / Chunking

Still reported despite chunking + queue being merged.

**Key issues:** #464 (50k char limit on GPU despite 16 GB VRAM — v0.4.0), #365 (FR: >50k chars), #363 (smart chunking to prevent robotic artifacts), #354 (50k limit v0.3.0).

**Fix path:** Chunking (#266) and queue (#269) shipped. Remaining work is raising/removing the 50k guard and tuning chunk boundaries for prosody.
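A prosody-friendlier boundary pass would split on sentence ends rather than raw character counts. An illustrative sketch only — this is not the shipped chunker from PR #266:

```python
import re


def chunk_text(text: str, max_chars: int = 500) -> list[str]:
    """Greedy sentence-boundary chunking: never split mid-sentence. A single
    sentence longer than max_chars is left whole here for simplicity."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Start a new chunk when adding this sentence would overflow.
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Keeping sentence boundaries intact is what the "robotic artifacts" report (#363) is really asking for: chunk seams that land mid-clause are where prosody audibly resets.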

### Feature Requests (ongoing)

Notable:

- **#480** — Noise removal on uploaded recordings
- **#448** — API for non-Qwen models (external integrations)
- **#427** — Task status control
- **#407, #347** — Voice-to-voice / audio-to-audio conversion
- **#387** — Location of downloaded generated voices
- **#383** — Concatenate partial reference audio into generated audio
- **#382** — Lightning.ai support
- **#376** — Remote mode
- **#353** — Audio transcoding
- **#317** — Voice pitch control
- **#189** — "Auto" language option
- **#173** — Vocal intonation/inflection control
- **#165, #270** — Audiobook mode (PR #154 open)
- **#242** — Seed value pinning
- **#228** — Always use 0.6B option
- **#235** — Finetuned Qwen3-TTS tokenizer (PR #253 open)
- **#144** — Copy text to clipboard

### Housekeeping / Triage Needed

| Issue | Reason |
|-------|--------|
| **#431**, **#408** | Spam — Chinese "free Claude API" promos. Close. |
| **#398** ("Excelente") | Non-issue. Close. |
| **#357** | Informational — project featured in Awesome MLX. Close after acknowledgement. |
| **#374**, **#377** | Version-release questions, no bug. Close. |
| **#306** ("voice model"), **#389** ("New model"), **#473** ("New functionality") | Title-only issues, no content. Request details or close. |
| **#309** | Uninstall/cleanup question. Answer and close. |
| **#241** | "How to use in Colab" — support question, not a bug. |
| **#423** / **#485** / **#329** | Stale MLX fallback to 1.7B repo — fixed; 0.6B bf16 conversion now live on `mlx-community`, registry points at correct repo on both backends. |
| **#336** / **#348** | TADA download/registration cluster — triage together. |
| **#287** / **#275** / **#304** | macOS ARM import regressions on new version — likely one root cause. |
| **#292**, **#349** | Possibly already fixed by merged PRs (#321/#412 and #345). Verify + close. |

**~70 older issues (pre-#170) not individually categorized above.** Most are long-tail support questions or duplicates of problems now addressed by the multi-engine / model-registry work. A dedicated backlog-sweep pass is overdue.

### Bugs (ongoing)

| Category | Issues |
|----------|--------|
| Generation failures | #476, #467, #452, #459 (voice clone fetch error), #468 (tada-1b marked error), #437, #300, #301, #282 |
| Audio quality | #456 (clipping errors v0.4.0), #436 (emotion labels), #333 (pitch/echo), #307 (by-model breakdown), #340 (all generations say "www...") |
| Transcription | #371 (fails every time), #291 (extract transcription from generated audio) |
| Effects / presets | #349 ("Failed to save" when creating effects presets — possibly fixed by merged #345) |
| File ops | #477 (spacy_pkuseg dict missing on frozen Windows build), #472 (storage location change), #283 (allow longer files for voice creation + in-app trim), #350 (failed to add sample) |
| History | #292 (can't delete failed generations — possibly fixed by merged #321/#412) |
| Windows | #466 (install problem), #375 (WinError 5 access denied), #273 (port 8000 conflict), #201 (model doesn't stay loaded) |
| Linux | #471 (thread-safe PULSE_SOURCE), #413 (Arch build), #409 (Kubuntu build), #351, #341 |
| macOS | #441 (older macOS), #369 (malware flag), #334 (microphone permission), #287 (`check_model_inputs` ImportError — regression), #171 (ARM64 binary won't open) |
| Profile/UI | #360 (Kokoro profile hides others — partly addressed by auto-switch), #299 (drag-drop on Win11), #329 (size selector state bug), #393 (stuck loading screen after reinstall to new dir) |
| Integrations | #397 (SAMMI-bot 422 Unprocessable Entity) |
| Audio playback / session | **#41** (macOS: Voicebox goes silent after another app takes audio output; restart restores it) — see deep-dive below |
| Database | #174 (sqlite3 IntegrityError) |

---

## Existing Plan Documents — Status

| Document | Target Version | Status | Relevance |
|----------|---------------|--------|-----------|
| `TTS_PROVIDER_ARCHITECTURE.md` | v0.1.13 | **Partially superseded** by multi-engine arch + CUDA swap | Core concepts implemented differently than planned |
| `CUDA_BACKEND_SWAP.md` | — | **Shipped** (PR #252) | CUDA binary download + backend restart |
| `CUDA_BACKEND_SWAP_FINAL.md` | — | **Shipped** (PR #252) | Final implementation plan |
| `EXTERNAL_PROVIDERS.md` | v0.2.0 | **Not started** | Remote server support |
| `MLX_AUDIO.md` | — | **Shipped** | MLX backend is live |
| `DOCKER_DEPLOYMENT.md` | v0.2.0 | **Shipped** (PR #161) | Docker + web deployment |
| `OPENAI_SUPPORT.md` | v0.2.0 | **Not started** | OpenAI-compatible API layer |
| `PR33_CUDA_PROVIDER_REVIEW.md` | — | **Reference** | Analysis of the original provider approach |

---

## New Model Integration — Landscape

### Status Snapshot (2026-04-18)

| Model | Cloning | Speed | Sample Rate | Languages | VRAM | Instruct | Cross-platform? | Status |
|-------|---------|-------|-------------|-----------|------|----------|-----------------|--------|
| **Qwen3-TTS** | 10s zero-shot | Medium | 24 kHz | 10 | Medium | None | MLX + PyTorch | **Shipped** |
| **Qwen CustomVoice** | Preset speakers | Medium | 24 kHz | 10 | Medium | **Yes** | PyTorch | **Shipped** (PR #328) |
| **LuxTTS** | 3s zero-shot | 150x RT, CPU ok | 48 kHz | English | <1 GB | None | All | **Shipped** (PR #254) |
| **Chatterbox MTL** | 5s zero-shot | Medium | 24 kHz | 23 | Medium | Partial — `exaggeration` | CPU/CUDA | **Shipped** (PR #257) |
| **Chatterbox Turbo** | 5s zero-shot | Fast | 24 kHz | English | Low | Partial — inline tags | CPU/CUDA | **Shipped** (PR #258) |
| **HumeAI TADA 1B/3B** | Zero-shot | 5x faster than LLM-TTS | 24 kHz | EN (1B), 10 (3B) | Medium | Partial — prosody | PyTorch | **Shipped** (PR #296) |
| **Kokoro-82M** | Preset voices | CPU realtime | 24 kHz | 8 | Tiny (82M) | None | All | **Shipped** (PR #325) |
| ~~**CosyVoice2-0.5B**~~ | 3-10s zero-shot | Very fast | 24 kHz | Multilingual | Low | **Yes** | — | **Abandoned** (PR #311) — poor output quality |
| ~~**VoxCPM2**~~ | Zero-shot | ~0.15 RTF streaming | 48 kHz | 30 | Medium | Partial — parenthetical style | **CUDA-only in practice** | **Backlogged** (2026-04-18) — see notes above |
| **Fish Speech** | 10-30s few-shot | Real-time | 24-44 kHz | 50+ | Medium | **Yes** — word-level inline | All | Candidate — license TBD |
| **Fish Audio S2** | — | — | — | — | — | — | — | Candidate (#385) |
| **XTTS-v2** | 6s zero-shot | Mid-GPU | 24 kHz | 17+ | Medium | Partial — style transfer from ref | All | Candidate — CPML license likely blocker |
| **Pocket TTS** (Kyutai) | Zero-shot + streaming | >1x RT on CPU | — | English + several European (FR/DE/PT/IT/ES added by Feb 2026) | ~100M | None | CPU-first | Candidate — MIT |
| **MOSS-TTS-Nano** | Zero-shot | **Realtime on 4 CPU cores** | 48 kHz stereo | 20 | 0.1B | Partial — MOSS-VoiceGenerator companion does text-to-voice design | All (ONNX CPU path dropped 2026-04-17) | **Top candidate** — Apache 2.0, released 2026-04-13, streaming |
| **VibeVoice** (Microsoft) | — | — | — | Multi-speaker long-form (up to 90 min, 4 speakers) | 1.5B | — | — | Candidate (#172) — Stories-editor fit |
| **index-tts2** | — | — | — | — | — | — | — | Candidate (#370) |
| **Voxtral TTS** (Mistral) | Zero-shot (short clips) + 20 preset voices | Single-GPU | — | — | 4B (`Voxtral-4B-TTS-2603`) | Presets + cloning | CUDA (16 GB+ VRAM) | Candidate (#364) — frontier quality claim, open-weight |
| **Dia / Dia2** | — | — | — | — | — | — | — | Watch — emotion-forward, but "rough edges" / artifacts per April reviews |
| **IndicF5** | — | — | — | Indian languages | — | — | — | Candidate (#339) — fills Indic gap |
| **MiniMax Cloud TTS** | — | Cloud | — | — | N/A (API) | — | N/A | Community PR #430, #331 — new direction (external API) |
| **OmniVoice** | — | — | — | — | — | — | — | Candidate (#380) |
| **RVC voice conversion** | N/A (STS) | — | — | — | — | N/A | All | New modality, not TTS (#407, #347) |

**Watch list:** MioTTS-2.6B (fast LLM-based EN/JP, vLLM compatible), Oolel-Voices (Soynade Research, expressive modular control), Faster-Qwen-TTS (#335), Orpheus / Sesame CSM (on-device fine-tuning discussions), Fish Audio S2 Pro / Fish Speech V1.5 (benchmark leader but research/non-commercial license — same blocker as Fish Speech).

**Deep-research pass (2026-04-18):** MOSS-TTS-Nano identified as the freshest high-alignment candidate — verified via [OpenMOSS/MOSS-TTS](https://github.com/OpenMOSS/MOSS-TTS) README (0.1B params, Apache 2.0, 48 kHz stereo, 4-core CPU realtime, streaming, released 2026-04-13). Dedicated repo: [OpenMOSS/MOSS-TTS-Nano](https://github.com/OpenMOSS/MOSS-TTS-Nano). Voxtral TTS verified on HF as `mistralai/Voxtral-4B-TTS-2603`.

#### Active Evaluation Criteria (learned from cycle)

1. **Cross-platform first.** MLX is a primary backend for our Apple Silicon user base. CUDA-only models require platform gating that doesn't exist yet — shipping one sets a precedent (see VoxCPM notes, issues #419/#420).
2. **PyPI + Apache/MIT licensing preferred.** Heavy deps, git-only installs, and `--no-deps` workarounds are expensive to maintain (Chatterbox taught us this).
3. **Output quality is non-negotiable.** CosyVoice was abandoned despite the best instruct API.
4. **Instruct support fills a real gap** (#173, #224, #303). Qwen CustomVoice partially addresses it with preset speakers; zero-shot clone-with-instruct is still unmet.
5. **Long-form + streaming are user-requested** (#363, #365, #464). Candidates with native streaming (Pocket TTS, Fish Speech) get extra weight.

### Adding a New Engine (Now Straightforward)

With the model config registry and shared `EngineModelSelector` component, adding a new TTS engine requires:

1. **Create `backend/backends/<engine>_backend.py`** — implement `TTSBackend` protocol (~200-300 lines)
2. **Register in `backend/backends/__init__.py`** — add `ModelConfig` entry + `TTS_ENGINES` entry + factory elif
3. **Update `backend/models.py`** — add engine name to regex
4. **Update frontend** — add to engine union type, `EngineModelSelector` options, form schema, language map, profile type gating (icons/labels ~9 files per grep of `kokoro`)

`main.py` requires **zero changes** — the registry handles all dispatch automatically.

**Platform gating doesn't exist yet.** If we add a CUDA-only model (e.g. VoxCPM), we need a new `requires_cuda` (or more generally `requires: list[device]`) flag on `ModelConfig`, plumbed through `/models` API and surfaced in `ModelManagement.tsx` and `EngineModelSelector.tsx` as a lock icon + "Requires NVIDIA GPU" state. Backend should hard-error at `load_model()` as a safety net.

Total effort: **~1 day** for a well-documented model with a PyPI package, cross-platform. **~2 days** if platform gating is required. See [`content/docs/developer/tts-engines.mdx`](content/docs/developer/tts-engines.mdx) for the full guide.
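As a rough shape for step 1, a new engine only has to satisfy the protocol surface. This sketch is hypothetical; the real `TTSBackend` protocol lives in `backend/backends/` and its method names and signatures may differ:

```python
from typing import Protocol

class TTSBackend(Protocol):
    """Illustrative stand-in for the real protocol in backend/backends/."""
    def load_model(self, model_name: str) -> None: ...
    def is_loaded(self) -> bool: ...
    def generate(self, text: str, voice_prompt: dict) -> bytes: ...

class NewEngineBackend:
    """Skeleton of a hypothetical new engine implementing the protocol."""

    def __init__(self) -> None:
        self._model: object | None = None

    def load_model(self, model_name: str) -> None:
        # Real code would download weights from the HF repo in ModelConfig
        # and load them onto the selected device.
        self._model = object()

    def is_loaded(self) -> bool:
        return self._model is not None

    def generate(self, text: str, voice_prompt: dict) -> bytes:
        if not self.is_loaded():
            raise RuntimeError("model not loaded")
        # Real code: run inference and return encoded audio bytes.
        return b"RIFF..."
```

Because it is structural typing, the class never imports or subclasses anything from the registry; steps 2-4 are pure wiring.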

---

## Architectural Bottlenecks

### ~~1. Single Backend Singleton~~ — RESOLVED

The singleton TTS backend was replaced with a thread-safe per-engine registry in PR #254. Multiple engines can now be loaded simultaneously.
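The per-engine registry pattern is simple enough to sketch (illustrative only, not the actual PR #254 code):

```python
import threading
from typing import Callable

# One instance per engine name, created lazily under a lock so that
# concurrent requests for the same engine share a single instance.
_instances: dict[str, object] = {}
_lock = threading.Lock()

def get_backend(engine: str, factory: Callable[[], object]) -> object:
    with _lock:
        if engine not in _instances:
            _instances[engine] = factory()
        return _instances[engine]
```

The lock only guards instance creation; each backend remains responsible for its own internal thread safety during generation.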

### ~~2. `main.py` Dispatch Point Duplication~~ — RESOLVED

Previously, each engine required updates to 6+ hardcoded dispatch maps across `main.py` (~320 lines of if/elif chains). A model config registry in `backend/backends/__init__.py` now centralizes all model metadata (`ModelConfig` dataclass) with helper functions (`load_engine_model()`, `check_model_loaded()`, `engine_needs_trim()`, etc.). Adding a new engine requires zero changes to `main.py`.
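An illustrative reduction of that registry pattern (field and helper names follow the text above, but the exact dataclass fields and signatures are assumptions):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class ModelConfig:
    name: str         # model identifier used by the API
    engine: str       # which TTSBackend implementation serves it
    hf_repo: str      # HuggingFace repo to download from
    needs_trim: bool = False

MODEL_REGISTRY: dict[str, ModelConfig] = {}
ENGINE_FACTORIES: dict[str, Callable[[], object]] = {}

def register(config: ModelConfig, factory: Callable[[], object]) -> None:
    MODEL_REGISTRY[config.name] = config
    ENGINE_FACTORIES.setdefault(config.engine, factory)

def engine_needs_trim(model_name: str) -> bool:
    # Dispatch code reads the registry instead of an if/elif chain,
    # which is why main.py never changes when an engine is added.
    return MODEL_REGISTRY[model_name].needs_trim
```

Each new engine then amounts to one `register(...)` call plus a backend module.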

### ~~3. Model Config is Scattered~~ — RESOLVED

Model identifiers, HF repo IDs, display names, and engine metadata are now consolidated in the `ModelConfig` registry. Backend-aware branching (e.g. MLX vs PyTorch Qwen repo IDs) happens inside the registry. Frontend model options are centralized in `EngineModelSelector.tsx`.

### 4. Voice Prompt Cache Assumes PyTorch Tensors

`backend/utils/cache.py` uses `torch.save()` / `torch.load()`. LuxTTS, Chatterbox, and Kokoro backends work around this by storing reference audio paths (or preset voice IDs) instead of tensors in their voice prompt dicts. Not ideal but functional.

### 5. ~~Frontend Assumes Qwen Model Sizes~~ — RESOLVED

The generation form now uses a flat model dropdown with engine-based routing. Per-engine language filtering is in place. Model size is only sent for Qwen / Qwen CustomVoice.

### 6. No Platform Gating on Models — NEW

`ModelConfig` has no way to express hardware requirements. Every engine is shown to every user, regardless of whether it'll actually load. Users on non-CUDA platforms discover failure at load time (or not at all — some fall back silently to CPU and never complete). Blocks shipping CUDA-only engines (VoxCPM) and would improve the Intel Arc / ROCm / CPU-only UX today. See `ModelConfig` TODO: add `requires: list[Literal["cuda", "mps", "xpu", "cpu", "rocm"]]` or equivalent, plumb through `/models` API, render in `ModelManagement.tsx` + `EngineModelSelector.tsx`.
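The proposed flag could look roughly like this (names follow the TODO above; the exact API is an assumption, since none of this is implemented yet):

```python
from dataclasses import dataclass
from typing import Literal

Device = Literal["cuda", "mps", "xpu", "cpu", "rocm"]

@dataclass(frozen=True)
class ModelConfig:
    name: str
    engine: str
    requires: tuple[Device, ...] = ("cpu",)  # devices the engine can run on

def is_available(config: ModelConfig, detected: set[Device]) -> bool:
    """True if any supported device is present. The /models API would
    surface this so the frontend can render a lock icon instead of
    letting the load fail (or silently fall back to CPU)."""
    return bool(set(config.requires) & detected)

def load_model(config: ModelConfig, detected: set[Device]) -> None:
    # Backend-side safety net: hard-error before attempting a load
    # that cannot succeed on this machine.
    if not is_available(config, detected):
        raise RuntimeError(f"{config.name} requires one of {config.requires}")
```

A tuple of acceptable devices (rather than a single `requires_cuda` bool) covers mixed cases like "CUDA or XPU but not MPS" without another schema change later.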

### 7. Engine Sprawl — NEW

Seven TTS engines shipped, more candidates queued. Issue #419 asks for a first-class vs experimental distinction. Related: issue #420 asks for formalized platform support tiers. Combined, these would let us ship more engines more confidently with clearer expectations for users.

---

## Recommended Priorities

### Tier 1 — Ship Now

| Priority | PR/Item | Impact | Effort |
|----------|---------|--------|--------|
| 1 | **RTX 50-series / Blackwell diagnostic** — detect stale CUDA binary vs GPU arch, prompt re-download (#417, #400, #396, #395, #390, #362) | Large cluster of user-blocking errors | Medium |
| 2 | **CustomVoice download failures** (#475, #445) | New engine blocked on macOS/Windows — regression triage | Medium |
| 3 | **50k char limit on GPU** (#464) | Regression — chunking should handle this | Medium |
| 4 | Close PR #311 (CosyVoice) and dedupe #331/#430 (MiniMax) | Housekeeping | None |
| 5 | **PR #443** — infinite offline retry loop | Bug fix, reviewable | Low |
| 6 | **PR #465** — define tier-1 / tier-2 platforms | Unblocks engine-sprawl decision (#419) | Low |
| 7 | **PR #463** — docker registry auto-publish | Community PR, low risk | Low |
| 8 | **#253** — 48kHz speech tokenizer | Quality improvement for Qwen | Medium |
| 9 | **Kokoro profile UX** (#360) — partially addressed by auto-switch | Polish | Low |

### Tier 2 — Feature Work

| Priority | Item | Impact | Effort |
|----------|------|--------|--------|
| 1 | **Engine tier system** (#419) — first-class vs experimental, platform gating in `ModelConfig` | Unblocks CUDA-only engines (VoxCPM, etc.) and frontend polish | Medium |
| 2 | **Frontend tech-debt burn-down** (#421) + code-split (#422) | Before gating CI on Biome | Medium |
| 3 | **#154** — Audiobook tab | Long-form users. Chunking + queue shipped. | Medium |
| 4 | **UI i18n** (#411 PR offer, #392, #261) | Chinese UI + general localization | Medium |
| 5 | **#225** — Custom HuggingFace models | User-supplied models. Needs rework. | High |
| 6 | OpenAI-compatible API (plan doc exists) — see also #448 (API for non-Qwen) | Low effort once API is stable | Low |
| 7 | LoRA fine-tuning (PR #195) | Complex, needs rework for multi-engine | Very High |
| 8 | Streaming for non-MLX engines | Currently MLX-only | Medium |
| 9 | Voice-to-voice / RVC (#407, #347) | New modality — different arch shape | High |

### Tier 3 — Future Engines (cross-platform preferred)

| Priority | Item | Notes |
|----------|------|-------|
| 1 | **MOSS-TTS-Nano** | 0.1B, Apache 2.0, 4-core CPU realtime, 48 kHz stereo, streaming, 20 langs, released 2026-04-13. Best alignment with our criteria. Verify install ergonomics before committing. |
| 2 | **Pocket TTS** (Kyutai) | CPU-first 100M model. MIT. Fills streaming gap without CUDA dependency. Several European langs added by Feb 2026. |
| 3 | **IndicF5** | Fills Indian-language gap (#339). Closes many language-request issues. |
| 4 | **VibeVoice** (Microsoft, #172) | 1.5B, long-form multi-speaker (up to 90 min, 4 speakers). Strong Stories-editor fit. |
| 5 | **Voxtral TTS** (Mistral, #364) | 4B presets+cloning. Frontier quality claim, but 16 GB+ VRAM — would need the platform-tier work first. |
| 6 | **Fish Speech / Fish Audio S2** | 50+ langs, word-level instruct. **License clarification first.** (#385) |
| 7 | **XTTS-v2** | 17+ langs, mature pip. CPML likely kills commercial use — verify. |
| 8 | **index-tts2** (#370) | Unvetted. |
| — | ~~**VoxCPM2**~~ | **Backlogged** — CUDA-only upstream. Revisit when tier system ships or MPS bugs are fixed upstream. |

### ~~Previously Prioritized — Now Done~~

- ~~Kokoro 82M — finish integration~~ **Shipped** (PR #325)
- ~~Qwen CustomVoice~~ **Shipped** (PR #328)
- ~~Intel Arc (XPU) support~~ **Shipped** (PR #320)
- ~~Blackwell CUDA~~ **Shipped** (PR #401, follow-up work open)
- ~~Generation cancellation~~ **Shipped** (PR #444)
- ~~macOS Intel x86_64~~ **Shipped** (PR #416)

---

## Branch Inventory

| Branch | PR | Status | Notes |
|--------|-----|--------|-------|
| `voicebox-new-models` | — | **Active** | New model research (Fish Speech, Pocket TTS, VibeVoice, etc.); VoxCPM evaluated & backlogged |
| `fix/kokoro-pyinstaller-source-files` | — | Active | Kokoro frozen-build source bundling (parent of `voicebox-new-models`) |
| `feat/cosyvoice-engine` | #311 | Open — closing | CosyVoice2/3 — abandoned, poor quality |
| `feat/kokoro` | #325 | **Merged** | Kokoro 82M + voice profile type system |
| `feat/qwen-custom-voice` | #328 | **Merged** | Qwen CustomVoice preset engine |
| `feat/chatterbox-turbo` | #258 | **Merged** | Chatterbox Turbo + per-engine languages |
| `feat/chatterbox` | #257 | **Merged** | Chatterbox Multilingual |
| `feat/luxtts` | #254 | **Merged** | LuxTTS + multi-engine arch |

---

## Quick Reference: API Endpoints

<details>
<summary>All current endpoints</summary>

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/health` | GET | Health check, model/GPU status |
| `/profiles` | POST, GET | Create/list voice profiles |
| `/profiles/{id}` | GET, PUT, DELETE | Profile CRUD |
| `/profiles/{id}/samples` | POST, GET | Add/list voice samples |
| `/profiles/{id}/avatar` | POST, GET, DELETE | Avatar management |
| `/profiles/{id}/export` | GET | Export profile as ZIP |
| `/profiles/import` | POST | Import profile from ZIP |
| `/generate` | POST | Generate speech (engine param selects TTS backend) |
| `/generate/stream` | POST | Stream speech (MLX only) |
| `/history` | GET | List generation history |
| `/history/{id}` | GET, DELETE | Get/delete generation |
| `/history/{id}/export` | GET | Export generation ZIP |
| `/history/{id}/export-audio` | GET | Export audio only |
| `/transcribe` | POST | Transcribe audio (Whisper) |
| `/models/status` | GET | All model statuses (Qwen, LuxTTS, Chatterbox, Chatterbox Turbo, TADA, Whisper) |
| `/models/download` | POST | Trigger model download |
| `/models/download/cancel` | POST | Cancel/dismiss download |
| `/models/{name}` | DELETE | Delete downloaded model |
| `/models/load` | POST | Load model into memory |
| `/models/unload` | POST | Unload model |
| `/models/progress/{name}` | GET | SSE download progress |
| `/tasks/active` | GET | Active downloads/generations (with inline progress) |
| `/stories` | POST, GET | Create/list stories |
| `/stories/{id}` | GET, PUT, DELETE | Story CRUD |
| `/stories/{id}/items` | POST, GET | Story items CRUD |
| `/stories/{id}/export` | GET | Export story audio |
| `/channels` | POST, GET | Audio channel CRUD |
| `/channels/{id}` | PUT, DELETE | Channel update/delete |
| `/cache/clear` | POST | Clear voice prompt cache |
| `/server/cuda/status` | GET | CUDA binary availability |
| `/server/cuda/download` | POST | Download CUDA binary |
| `/server/cuda/switch` | POST | Switch to CUDA backend |

</details>
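For external integrations (#448, #397), a generation call is a plain HTTP POST to the local backend. The payload field names below are assumptions inferred from the endpoint table, not a documented schema; check the FastAPI models before relying on them:

```python
import json
import urllib.request

BASE = "http://localhost:17493"

def build_generate_request(text: str, engine: str, profile_id: str) -> urllib.request.Request:
    # Hypothetical payload shape: the real /generate schema may use
    # different field names.
    payload = {"text": text, "engine": engine, "profile_id": profile_id}
    return urllib.request.Request(
        f"{BASE}/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_generate_request("Hello there.", "luxtts", "profile-123")
# urllib.request.urlopen(req) would then return the generated audio;
# sending a malformed payload is the kind of thing behind the 422 in #397.
```

An OpenAI-compatible layer (see `OPENAI_SUPPORT.md`) would sit on top of exactly this surface.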
41
docs/README.md
Normal file
@@ -0,0 +1,41 @@
# fumadocs-ui-template

This is a Next.js application generated with
[Create Fumadocs](https://github.com/fuma-nama/fumadocs).

Run development server:

```bash
bun run dev
```

Open http://localhost:3000 with your browser to see the result.

## Explore

In the project, you can see:

- `lib/source.ts`: Code for content source adapter, [`loader()`](https://fumadocs.dev/docs/headless/source-api) provides the interface to access your content.
- `lib/layout.shared.tsx`: Shared options for layouts, optional but preferred to keep.

| Route | Description |
| ------------------------- | ------------------------------------------------------ |
| `app/(home)` | The route group for your landing page and other pages. |
| `app/docs` | The documentation layout and pages. |
| `app/api/search/route.ts` | The Route Handler for search. |

### Fumadocs MDX

A `source.config.ts` config file has been included, you can customise different options like frontmatter schema.

Read the [Introduction](https://fumadocs.dev/docs/mdx) for further details.

## Learn More

To learn more about Next.js and Fumadocs, take a look at the following
resources:

- [Next.js Documentation](https://nextjs.org/docs) - learn about Next.js
features and API.
- [Learn Next.js](https://nextjs.org/learn) - an interactive Next.js tutorial.
- [Fumadocs](https://fumadocs.dev) - learn about Fumadocs
11
docs/app/[[...slug]]/layout.tsx
Normal file
@@ -0,0 +1,11 @@
import { DocsLayout } from 'fumadocs-ui/layouts/docs';
import { baseOptions } from '@/lib/layout.shared';
import { source } from '@/lib/source';

export default function Layout({ children }: LayoutProps<'/[[...slug]]'>) {
  return (
    <DocsLayout tree={source.pageTree} {...baseOptions()}>
      {children}
    </DocsLayout>
  );
}
74
docs/app/[[...slug]]/page.tsx
Normal file
@@ -0,0 +1,74 @@
import { createRelativeLink } from 'fumadocs-ui/mdx';
import { DocsBody, DocsDescription, DocsPage, DocsTitle } from 'fumadocs-ui/page';
import type { Metadata } from 'next';
import { notFound } from 'next/navigation';
import { MarkdownCopyButton, ViewOptionsPopover } from '@/components/ai/page-actions';
import { APIPage } from '@/components/api-page';
import { getPageImage, source } from '@/lib/source';
import { getMDXComponents } from '@/mdx-components';

export default async function Page(props: PageProps<'/[[...slug]]'>) {
  const params = await props.params;
  const page = source.getPage(params.slug);
  if (!page) notFound();

  const MDX = page.data.body;
  const markdownUrl = `${page.url}.mdx`;
  const githubUrl = `https://github.com/jamiepine/voicebox/blob/main/docs/content/docs/${page.path}`;

  return (
    <DocsPage
      toc={page.data.toc}
      full={page.data.full}
      editOnGithub={{
        owner: 'jamiepine',
        repo: 'voicebox',
        sha: 'main',
        path: `docs/content/docs/${page.path}`,
      }}
      lastUpdate={page.data.lastModified}
    >
      <DocsTitle>{page.data.title}</DocsTitle>
      <DocsDescription className="mb-0">{page.data.description}</DocsDescription>
      <div className="flex flex-row gap-2 items-center">
        <MarkdownCopyButton markdownUrl={markdownUrl} />
        <ViewOptionsPopover markdownUrl={markdownUrl} githubUrl={githubUrl} />
      </div>
      <div
        role="separator"
        style={{
          height: '1px',
          background: 'currentColor',
          opacity: 0.15,
          marginTop: '8px',
          marginBottom: '24px',
        }}
      />
      <DocsBody>
        <MDX
          components={getMDXComponents({
            a: createRelativeLink(source, page),
          })}
        />
      </DocsBody>
    </DocsPage>
  );
}

export async function generateStaticParams() {
  return source.generateParams();
}

export async function generateMetadata(props: PageProps<'/[[...slug]]'>): Promise<Metadata> {
  const params = await props.params;
  const page = source.getPage(params.slug);
  if (!page) notFound();

  return {
    title: page.data.title,
    description: page.data.description,
    openGraph: {
      images: getPageImage(page).url,
    },
  };
}
7
docs/app/api/search/route.ts
Normal file
@@ -0,0 +1,7 @@
import { source } from '@/lib/source';
import { createFromSource } from 'fumadocs-core/search/server';

export const { GET } = createFromSource(source, {
  // https://docs.orama.com/docs/orama-js/supported-languages
  language: 'english',
});
14
docs/app/global.css
Normal file
@@ -0,0 +1,14 @@
@import "tailwindcss";
@import "fumadocs-ui/css/neutral.css";
@import "fumadocs-ui/css/preset.css";
@import "fumadocs-openapi/css/preset.css";

:root {
  --color-fd-primary: hsl(43, 50%, 50%);
  --color-fd-primary-foreground: hsl(222.2, 47.4%, 11.2%);
}

.dark {
  --color-fd-primary: hsl(43, 50%, 45%);
  --color-fd-primary-foreground: hsl(0, 0%, 95%);
}
17
docs/app/layout.tsx
Normal file
@@ -0,0 +1,17 @@
import { RootProvider } from 'fumadocs-ui/provider/next';
import './global.css';
import { Inter } from 'next/font/google';

const inter = Inter({
  subsets: ['latin'],
});

export default function Layout({ children }: LayoutProps<'/'>) {
  return (
    <html lang="en" className={inter.className} suppressHydrationWarning>
      <body className="flex flex-col min-h-screen">
        <RootProvider>{children}</RootProvider>
      </body>
    </html>
  );
}
10
docs/app/llms-full.txt/route.ts
Normal file
@@ -0,0 +1,10 @@
import { getLLMText, source } from '@/lib/source';

export const revalidate = false;

export async function GET() {
  const scan = source.getPages().map(getLLMText);
  const scanned = await Promise.all(scan);

  return new Response(scanned.join('\n\n'));
}
20
docs/app/llms.mdx/docs/[[...slug]]/route.ts
Normal file
@@ -0,0 +1,20 @@
import { notFound } from 'next/navigation';
import { getLLMText, source } from '@/lib/source';

export const revalidate = false;

export async function GET(_req: Request, { params }: RouteContext<'/llms.mdx/docs/[[...slug]]'>) {
  const { slug } = await params;
  const page = source.getPage(slug);
  if (!page) notFound();

  return new Response(await getLLMText(page), {
    headers: {
      'Content-Type': 'text/markdown',
    },
  });
}

export function generateStaticParams() {
  return source.generateParams();
}
27
docs/app/og/docs/[...slug]/route.tsx
Normal file
@@ -0,0 +1,27 @@
import { getPageImage, source } from '@/lib/source';
import { notFound } from 'next/navigation';
import { ImageResponse } from 'next/og';
import { generate as DefaultImage } from 'fumadocs-ui/og';

export const revalidate = false;

export async function GET(_req: Request, { params }: RouteContext<'/og/docs/[...slug]'>) {
  const { slug } = await params;
  const page = source.getPage(slug.slice(0, -1));
  if (!page) notFound();

  return new ImageResponse(
    <DefaultImage title={page.data.title} description={page.data.description} site="My App" />,
    {
      width: 1200,
      height: 630,
    },
  );
}

export function generateStaticParams() {
  return source.getPages().map((page) => ({
    lang: page.locale,
    slug: getPageImage(page).segments,
  }));
}
830
docs/bun.lock
Normal file
@@ -0,0 +1,830 @@
|
||||
{
|
||||
"lockfileVersion": 1,
|
||||
"configVersion": 1,
|
||||
"workspaces": {
|
||||
"": {
|
||||
"name": "example-next-mdx",
|
||||
"dependencies": {
|
||||
"@radix-ui/react-popover": "^1.1.15",
|
||||
"class-variance-authority": "^0.7.1",
|
||||
"fumadocs-core": "^16.4.11",
|
||||
"fumadocs-mdx": "13",
|
||||
"fumadocs-openapi": "^10.2.7",
|
||||
"fumadocs-ui": "^16.4.11",
|
||||
"lucide-react": "^0.546.0",
|
||||
"next": "^16.1.6",
|
||||
"react": "^19.2.0",
|
||||
"react-dom": "^19.2.0",
|
||||
"shiki": "^3.22.0",
|
||||
"tailwind-merge": "^3.5.0",
|
||||
},
|
||||
"devDependencies": {
|
||||
"@tailwindcss/postcss": "^4.1.15",
|
||||
"@types/mdx": "^2.0.13",
|
||||
"@types/node": "^24.9.1",
|
||||
"@types/react": "^19.2.2",
|
||||
"@types/react-dom": "^19.2.2",
|
||||
"postcss": "^8.5.6",
|
||||
"tailwindcss": "^4.1.15",
|
||||
"typescript": "^5.9.3",
|
||||
},
|
||||
},
|
||||
},
|
||||
"packages": {
|
||||
"@alloc/quick-lru": ["@alloc/quick-lru@5.2.0", "", {}, "sha512-UrcABB+4bUrFABwbluTIBErXwvbsU/V7TZWfmbgJfbkwiBuziS9gxdODUyuiecfdGQ85jglMW6juS3+z5TsKLw=="],
|
||||
|
||||
"@emnapi/runtime": ["@emnapi/runtime@1.8.1", "", { "dependencies": { "tslib": "^2.4.0" } }, "sha512-mehfKSMWjjNol8659Z8KxEMrdSJDDot5SXMq00dM8BN4o+CLNXQ0xH2V7EchNHV4RmbZLmmPdEaXZc5H2FXmDg=="],
|
||||
|
||||
"@esbuild/aix-ppc64": ["@esbuild/aix-ppc64@0.25.12", "", { "os": "aix", "cpu": "ppc64" }, "sha512-Hhmwd6CInZ3dwpuGTF8fJG6yoWmsToE+vYgD4nytZVxcu1ulHpUQRAB1UJ8+N1Am3Mz4+xOByoQoSZf4D+CpkA=="],
|
||||
|
||||
"@esbuild/android-arm": ["@esbuild/android-arm@0.25.12", "", { "os": "android", "cpu": "arm" }, "sha512-VJ+sKvNA/GE7Ccacc9Cha7bpS8nyzVv0jdVgwNDaR4gDMC/2TTRc33Ip8qrNYUcpkOHUT5OZ0bUcNNVZQ9RLlg=="],
|
||||
|
||||
"@esbuild/android-arm64": ["@esbuild/android-arm64@0.25.12", "", { "os": "android", "cpu": "arm64" }, "sha512-6AAmLG7zwD1Z159jCKPvAxZd4y/VTO0VkprYy+3N2FtJ8+BQWFXU+OxARIwA46c5tdD9SsKGZ/1ocqBS/gAKHg=="],
|
||||
|
||||
"@esbuild/android-x64": ["@esbuild/android-x64@0.25.12", "", { "os": "android", "cpu": "x64" }, "sha512-5jbb+2hhDHx5phYR2By8GTWEzn6I9UqR11Kwf22iKbNpYrsmRB18aX/9ivc5cabcUiAT/wM+YIZ6SG9QO6a8kg=="],
|
||||
|
||||
"@esbuild/darwin-arm64": ["@esbuild/darwin-arm64@0.25.12", "", { "os": "darwin", "cpu": "arm64" }, "sha512-N3zl+lxHCifgIlcMUP5016ESkeQjLj/959RxxNYIthIg+CQHInujFuXeWbWMgnTo4cp5XVHqFPmpyu9J65C1Yg=="],
|
||||
|
||||
"@esbuild/darwin-x64": ["@esbuild/darwin-x64@0.25.12", "", { "os": "darwin", "cpu": "x64" }, "sha512-HQ9ka4Kx21qHXwtlTUVbKJOAnmG1ipXhdWTmNXiPzPfWKpXqASVcWdnf2bnL73wgjNrFXAa3yYvBSd9pzfEIpA=="],
|
||||
|
||||
"@esbuild/freebsd-arm64": ["@esbuild/freebsd-arm64@0.25.12", "", { "os": "freebsd", "cpu": "arm64" }, "sha512-gA0Bx759+7Jve03K1S0vkOu5Lg/85dou3EseOGUes8flVOGxbhDDh/iZaoek11Y8mtyKPGF3vP8XhnkDEAmzeg=="],
|
||||
|
||||
"@esbuild/freebsd-x64": ["@esbuild/freebsd-x64@0.25.12", "", { "os": "freebsd", "cpu": "x64" }, "sha512-TGbO26Yw2xsHzxtbVFGEXBFH0FRAP7gtcPE7P5yP7wGy7cXK2oO7RyOhL5NLiqTlBh47XhmIUXuGciXEqYFfBQ=="],
"@esbuild/linux-arm": ["@esbuild/linux-arm@0.25.12", "", { "os": "linux", "cpu": "arm" }, "sha512-lPDGyC1JPDou8kGcywY0YILzWlhhnRjdof3UlcoqYmS9El818LLfJJc3PXXgZHrHCAKs/Z2SeZtDJr5MrkxtOw=="],
"@esbuild/linux-arm64": ["@esbuild/linux-arm64@0.25.12", "", { "os": "linux", "cpu": "arm64" }, "sha512-8bwX7a8FghIgrupcxb4aUmYDLp8pX06rGh5HqDT7bB+8Rdells6mHvrFHHW2JAOPZUbnjUpKTLg6ECyzvas2AQ=="],
"@esbuild/linux-ia32": ["@esbuild/linux-ia32@0.25.12", "", { "os": "linux", "cpu": "ia32" }, "sha512-0y9KrdVnbMM2/vG8KfU0byhUN+EFCny9+8g202gYqSSVMonbsCfLjUO+rCci7pM0WBEtz+oK/PIwHkzxkyharA=="],
"@esbuild/linux-loong64": ["@esbuild/linux-loong64@0.25.12", "", { "os": "linux", "cpu": "none" }, "sha512-h///Lr5a9rib/v1GGqXVGzjL4TMvVTv+s1DPoxQdz7l/AYv6LDSxdIwzxkrPW438oUXiDtwM10o9PmwS/6Z0Ng=="],
"@esbuild/linux-mips64el": ["@esbuild/linux-mips64el@0.25.12", "", { "os": "linux", "cpu": "none" }, "sha512-iyRrM1Pzy9GFMDLsXn1iHUm18nhKnNMWscjmp4+hpafcZjrr2WbT//d20xaGljXDBYHqRcl8HnxbX6uaA/eGVw=="],
"@esbuild/linux-ppc64": ["@esbuild/linux-ppc64@0.25.12", "", { "os": "linux", "cpu": "ppc64" }, "sha512-9meM/lRXxMi5PSUqEXRCtVjEZBGwB7P/D4yT8UG/mwIdze2aV4Vo6U5gD3+RsoHXKkHCfSxZKzmDssVlRj1QQA=="],
"@esbuild/linux-riscv64": ["@esbuild/linux-riscv64@0.25.12", "", { "os": "linux", "cpu": "none" }, "sha512-Zr7KR4hgKUpWAwb1f3o5ygT04MzqVrGEGXGLnj15YQDJErYu/BGg+wmFlIDOdJp0PmB0lLvxFIOXZgFRrdjR0w=="],
"@esbuild/linux-s390x": ["@esbuild/linux-s390x@0.25.12", "", { "os": "linux", "cpu": "s390x" }, "sha512-MsKncOcgTNvdtiISc/jZs/Zf8d0cl/t3gYWX8J9ubBnVOwlk65UIEEvgBORTiljloIWnBzLs4qhzPkJcitIzIg=="],
"@esbuild/linux-x64": ["@esbuild/linux-x64@0.25.12", "", { "os": "linux", "cpu": "x64" }, "sha512-uqZMTLr/zR/ed4jIGnwSLkaHmPjOjJvnm6TVVitAa08SLS9Z0VM8wIRx7gWbJB5/J54YuIMInDquWyYvQLZkgw=="],
"@esbuild/netbsd-arm64": ["@esbuild/netbsd-arm64@0.25.12", "", { "os": "none", "cpu": "arm64" }, "sha512-xXwcTq4GhRM7J9A8Gv5boanHhRa/Q9KLVmcyXHCTaM4wKfIpWkdXiMog/KsnxzJ0A1+nD+zoecuzqPmCRyBGjg=="],
"@esbuild/netbsd-x64": ["@esbuild/netbsd-x64@0.25.12", "", { "os": "none", "cpu": "x64" }, "sha512-Ld5pTlzPy3YwGec4OuHh1aCVCRvOXdH8DgRjfDy/oumVovmuSzWfnSJg+VtakB9Cm0gxNO9BzWkj6mtO1FMXkQ=="],
"@esbuild/openbsd-arm64": ["@esbuild/openbsd-arm64@0.25.12", "", { "os": "openbsd", "cpu": "arm64" }, "sha512-fF96T6KsBo/pkQI950FARU9apGNTSlZGsv1jZBAlcLL1MLjLNIWPBkj5NlSz8aAzYKg+eNqknrUJ24QBybeR5A=="],
"@esbuild/openbsd-x64": ["@esbuild/openbsd-x64@0.25.12", "", { "os": "openbsd", "cpu": "x64" }, "sha512-MZyXUkZHjQxUvzK7rN8DJ3SRmrVrke8ZyRusHlP+kuwqTcfWLyqMOE3sScPPyeIXN/mDJIfGXvcMqCgYKekoQw=="],
"@esbuild/openharmony-arm64": ["@esbuild/openharmony-arm64@0.25.12", "", { "os": "none", "cpu": "arm64" }, "sha512-rm0YWsqUSRrjncSXGA7Zv78Nbnw4XL6/dzr20cyrQf7ZmRcsovpcRBdhD43Nuk3y7XIoW2OxMVvwuRvk9XdASg=="],
"@esbuild/sunos-x64": ["@esbuild/sunos-x64@0.25.12", "", { "os": "sunos", "cpu": "x64" }, "sha512-3wGSCDyuTHQUzt0nV7bocDy72r2lI33QL3gkDNGkod22EsYl04sMf0qLb8luNKTOmgF/eDEDP5BFNwoBKH441w=="],
"@esbuild/win32-arm64": ["@esbuild/win32-arm64@0.25.12", "", { "os": "win32", "cpu": "arm64" }, "sha512-rMmLrur64A7+DKlnSuwqUdRKyd3UE7oPJZmnljqEptesKM8wx9J8gx5u0+9Pq0fQQW8vqeKebwNXdfOyP+8Bsg=="],
"@esbuild/win32-ia32": ["@esbuild/win32-ia32@0.25.12", "", { "os": "win32", "cpu": "ia32" }, "sha512-HkqnmmBoCbCwxUKKNPBixiWDGCpQGVsrQfJoVGYLPT41XWF8lHuE5N6WhVia2n4o5QK5M4tYr21827fNhi4byQ=="],
"@esbuild/win32-x64": ["@esbuild/win32-x64@0.25.12", "", { "os": "win32", "cpu": "x64" }, "sha512-alJC0uCZpTFrSL0CCDjcgleBXPnCrEAhTBILpeAp7M/OFgoqtAetfBzX0xM00MUsVVPpVjlPuMbREqnZCXaTnA=="],
"@floating-ui/core": ["@floating-ui/core@1.7.4", "", { "dependencies": { "@floating-ui/utils": "^0.2.10" } }, "sha512-C3HlIdsBxszvm5McXlB8PeOEWfBhcGBTZGkGlWc2U0KFY5IwG5OQEuQ8rq52DZmcHDlPLd+YFBK+cZcytwIFWg=="],
"@floating-ui/dom": ["@floating-ui/dom@1.7.5", "", { "dependencies": { "@floating-ui/core": "^1.7.4", "@floating-ui/utils": "^0.2.10" } }, "sha512-N0bD2kIPInNHUHehXhMke1rBGs1dwqvC9O9KYMyyjK7iXt7GAhnro7UlcuYcGdS/yYOlq0MAVgrow8IbWJwyqg=="],
"@floating-ui/react-dom": ["@floating-ui/react-dom@2.1.7", "", { "dependencies": { "@floating-ui/dom": "^1.7.5" }, "peerDependencies": { "react": ">=16.8.0", "react-dom": ">=16.8.0" } }, "sha512-0tLRojf/1Go2JgEVm+3Frg9A3IW8bJgKgdO0BN5RkF//ufuz2joZM63Npau2ff3J6lUVYgDSNzNkR+aH3IVfjg=="],
"@floating-ui/utils": ["@floating-ui/utils@0.2.10", "", {}, "sha512-aGTxbpbg8/b5JfU1HXSrbH3wXZuLPJcNEcZQFMxLs3oSzgtVu6nFPkbbGGUvBcUjKV2YyB9Wxxabo+HEH9tcRQ=="],
"@formatjs/fast-memoize": ["@formatjs/fast-memoize@3.1.0", "", { "dependencies": { "tslib": "^2.8.1" } }, "sha512-b5mvSWCI+XVKiz5WhnBCY3RJ4ZwfjAidU0yVlKa3d3MSgKmH1hC3tBGEAtYyN5mqL7N0G5x0BOUYyO8CEupWgg=="],
"@formatjs/intl-localematcher": ["@formatjs/intl-localematcher@0.8.0", "", { "dependencies": { "@formatjs/fast-memoize": "3.1.0", "tslib": "^2.8.1" } }, "sha512-zgMYWdUlmEZpX2Io+v3LHrfq9xZ6khpQVf9UAw2xYWhGerGgI9XgH1HvL/A34jWiruUJpYlP5pk4g8nIcaDrXQ=="],
"@fumadocs/ui": ["@fumadocs/ui@16.4.11", "", { "dependencies": { "next-themes": "^0.4.6", "postcss-selector-parser": "^7.1.1", "tailwind-merge": "^3.4.0" }, "peerDependencies": { "@types/react": "*", "fumadocs-core": "16.4.11", "next": "16.x.x", "react": "^19.2.0", "react-dom": "^19.2.0", "tailwindcss": "^4.0.0" }, "optionalPeers": ["@types/react", "next", "tailwindcss"] }, "sha512-3APzHr4Rv5P9YQApTKCQW3cXika0dwHuOo8WxYz74y42nONRo/TMDtvoWaNhB145sBrW9N4j0/0xXfiGLihVRQ=="],
"@fumari/json-schema-to-typescript": ["@fumari/json-schema-to-typescript@2.0.0", "", { "dependencies": { "js-yaml": "^4.1.0" }, "peerDependencies": { "@apidevtools/json-schema-ref-parser": "14.x.x", "prettier": "3.x.x" }, "optionalPeers": ["@apidevtools/json-schema-ref-parser", "prettier"] }, "sha512-X0Wm3QJLj1Rtb1nY2exM6QwMXb9LGyIKLf35+n6xyltDDBLMECOC4R/zPaw3RwgFVmvRLSmLCd+ht4sKabgmNw=="],
"@fumari/stf": ["@fumari/stf@0.0.1", "", { "peerDependencies": { "@types/react": "*", "react": "^19.2.0", "react-dom": "^19.2.0" }, "optionalPeers": ["@types/react"] }, "sha512-Io3xlYr8xMPZtxWI5GwIRvWEMu1CsfbwXa09ACeXGjbY4QVreMiMjNCvN1YNLmETgG6Ru1S/+2B8qv80OIExyA=="],
"@img/colour": ["@img/colour@1.0.0", "", {}, "sha512-A5P/LfWGFSl6nsckYtjw9da+19jB8hkJ6ACTGcDfEJ0aE+l2n2El7dsVM7UVHZQ9s2lmYMWlrS21YLy2IR1LUw=="],
"@img/sharp-darwin-arm64": ["@img/sharp-darwin-arm64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-darwin-arm64": "1.2.4" }, "os": "darwin", "cpu": "arm64" }, "sha512-imtQ3WMJXbMY4fxb/Ndp6HBTNVtWCUI0WdobyheGf5+ad6xX8VIDO8u2xE4qc/fr08CKG/7dDseFtn6M6g/r3w=="],
"@img/sharp-darwin-x64": ["@img/sharp-darwin-x64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-darwin-x64": "1.2.4" }, "os": "darwin", "cpu": "x64" }, "sha512-YNEFAF/4KQ/PeW0N+r+aVVsoIY0/qxxikF2SWdp+NRkmMB7y9LBZAVqQ4yhGCm/H3H270OSykqmQMKLBhBJDEw=="],
"@img/sharp-libvips-darwin-arm64": ["@img/sharp-libvips-darwin-arm64@1.2.4", "", { "os": "darwin", "cpu": "arm64" }, "sha512-zqjjo7RatFfFoP0MkQ51jfuFZBnVE2pRiaydKJ1G/rHZvnsrHAOcQALIi9sA5co5xenQdTugCvtb1cuf78Vf4g=="],
"@img/sharp-libvips-darwin-x64": ["@img/sharp-libvips-darwin-x64@1.2.4", "", { "os": "darwin", "cpu": "x64" }, "sha512-1IOd5xfVhlGwX+zXv2N93k0yMONvUlANylbJw1eTah8K/Jtpi15KC+WSiaX/nBmbm2HxRM1gZ0nSdjSsrZbGKg=="],
"@img/sharp-libvips-linux-arm": ["@img/sharp-libvips-linux-arm@1.2.4", "", { "os": "linux", "cpu": "arm" }, "sha512-bFI7xcKFELdiNCVov8e44Ia4u2byA+l3XtsAj+Q8tfCwO6BQ8iDojYdvoPMqsKDkuoOo+X6HZA0s0q11ANMQ8A=="],
"@img/sharp-libvips-linux-arm64": ["@img/sharp-libvips-linux-arm64@1.2.4", "", { "os": "linux", "cpu": "arm64" }, "sha512-excjX8DfsIcJ10x1Kzr4RcWe1edC9PquDRRPx3YVCvQv+U5p7Yin2s32ftzikXojb1PIFc/9Mt28/y+iRklkrw=="],
"@img/sharp-libvips-linux-ppc64": ["@img/sharp-libvips-linux-ppc64@1.2.4", "", { "os": "linux", "cpu": "ppc64" }, "sha512-FMuvGijLDYG6lW+b/UvyilUWu5Ayu+3r2d1S8notiGCIyYU/76eig1UfMmkZ7vwgOrzKzlQbFSuQfgm7GYUPpA=="],
"@img/sharp-libvips-linux-riscv64": ["@img/sharp-libvips-linux-riscv64@1.2.4", "", { "os": "linux", "cpu": "none" }, "sha512-oVDbcR4zUC0ce82teubSm+x6ETixtKZBh/qbREIOcI3cULzDyb18Sr/Wcyx7NRQeQzOiHTNbZFF1UwPS2scyGA=="],
"@img/sharp-libvips-linux-s390x": ["@img/sharp-libvips-linux-s390x@1.2.4", "", { "os": "linux", "cpu": "s390x" }, "sha512-qmp9VrzgPgMoGZyPvrQHqk02uyjA0/QrTO26Tqk6l4ZV0MPWIW6LTkqOIov+J1yEu7MbFQaDpwdwJKhbJvuRxQ=="],
"@img/sharp-libvips-linux-x64": ["@img/sharp-libvips-linux-x64@1.2.4", "", { "os": "linux", "cpu": "x64" }, "sha512-tJxiiLsmHc9Ax1bz3oaOYBURTXGIRDODBqhveVHonrHJ9/+k89qbLl0bcJns+e4t4rvaNBxaEZsFtSfAdquPrw=="],
"@img/sharp-libvips-linuxmusl-arm64": ["@img/sharp-libvips-linuxmusl-arm64@1.2.4", "", { "os": "linux", "cpu": "arm64" }, "sha512-FVQHuwx1IIuNow9QAbYUzJ+En8KcVm9Lk5+uGUQJHaZmMECZmOlix9HnH7n1TRkXMS0pGxIJokIVB9SuqZGGXw=="],
"@img/sharp-libvips-linuxmusl-x64": ["@img/sharp-libvips-linuxmusl-x64@1.2.4", "", { "os": "linux", "cpu": "x64" }, "sha512-+LpyBk7L44ZIXwz/VYfglaX/okxezESc6UxDSoyo2Ks6Jxc4Y7sGjpgU9s4PMgqgjj1gZCylTieNamqA1MF7Dg=="],
"@img/sharp-linux-arm": ["@img/sharp-linux-arm@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linux-arm": "1.2.4" }, "os": "linux", "cpu": "arm" }, "sha512-9dLqsvwtg1uuXBGZKsxem9595+ujv0sJ6Vi8wcTANSFpwV/GONat5eCkzQo/1O6zRIkh0m/8+5BjrRr7jDUSZw=="],
"@img/sharp-linux-arm64": ["@img/sharp-linux-arm64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linux-arm64": "1.2.4" }, "os": "linux", "cpu": "arm64" }, "sha512-bKQzaJRY/bkPOXyKx5EVup7qkaojECG6NLYswgktOZjaXecSAeCWiZwwiFf3/Y+O1HrauiE3FVsGxFg8c24rZg=="],
"@img/sharp-linux-ppc64": ["@img/sharp-linux-ppc64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linux-ppc64": "1.2.4" }, "os": "linux", "cpu": "ppc64" }, "sha512-7zznwNaqW6YtsfrGGDA6BRkISKAAE1Jo0QdpNYXNMHu2+0dTrPflTLNkpc8l7MUP5M16ZJcUvysVWWrMefZquA=="],
"@img/sharp-linux-riscv64": ["@img/sharp-linux-riscv64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linux-riscv64": "1.2.4" }, "os": "linux", "cpu": "none" }, "sha512-51gJuLPTKa7piYPaVs8GmByo7/U7/7TZOq+cnXJIHZKavIRHAP77e3N2HEl3dgiqdD/w0yUfiJnII77PuDDFdw=="],
"@img/sharp-linux-s390x": ["@img/sharp-linux-s390x@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linux-s390x": "1.2.4" }, "os": "linux", "cpu": "s390x" }, "sha512-nQtCk0PdKfho3eC5MrbQoigJ2gd1CgddUMkabUj+rBevs8tZ2cULOx46E7oyX+04WGfABgIwmMC0VqieTiR4jg=="],
"@img/sharp-linux-x64": ["@img/sharp-linux-x64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linux-x64": "1.2.4" }, "os": "linux", "cpu": "x64" }, "sha512-MEzd8HPKxVxVenwAa+JRPwEC7QFjoPWuS5NZnBt6B3pu7EG2Ge0id1oLHZpPJdn3OQK+BQDiw9zStiHBTJQQQQ=="],
"@img/sharp-linuxmusl-arm64": ["@img/sharp-linuxmusl-arm64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linuxmusl-arm64": "1.2.4" }, "os": "linux", "cpu": "arm64" }, "sha512-fprJR6GtRsMt6Kyfq44IsChVZeGN97gTD331weR1ex1c1rypDEABN6Tm2xa1wE6lYb5DdEnk03NZPqA7Id21yg=="],
"@img/sharp-linuxmusl-x64": ["@img/sharp-linuxmusl-x64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-linuxmusl-x64": "1.2.4" }, "os": "linux", "cpu": "x64" }, "sha512-Jg8wNT1MUzIvhBFxViqrEhWDGzqymo3sV7z7ZsaWbZNDLXRJZoRGrjulp60YYtV4wfY8VIKcWidjojlLcWrd8Q=="],
"@img/sharp-wasm32": ["@img/sharp-wasm32@0.34.5", "", { "dependencies": { "@emnapi/runtime": "^1.7.0" }, "cpu": "none" }, "sha512-OdWTEiVkY2PHwqkbBI8frFxQQFekHaSSkUIJkwzclWZe64O1X4UlUjqqqLaPbUpMOQk6FBu/HtlGXNblIs0huw=="],
"@img/sharp-win32-arm64": ["@img/sharp-win32-arm64@0.34.5", "", { "os": "win32", "cpu": "arm64" }, "sha512-WQ3AgWCWYSb2yt+IG8mnC6Jdk9Whs7O0gxphblsLvdhSpSTtmu69ZG1Gkb6NuvxsNACwiPV6cNSZNzt0KPsw7g=="],
"@img/sharp-win32-ia32": ["@img/sharp-win32-ia32@0.34.5", "", { "os": "win32", "cpu": "ia32" }, "sha512-FV9m/7NmeCmSHDD5j4+4pNI8Cp3aW+JvLoXcTUo0IqyjSfAZJ8dIUmijx1qaJsIiU+Hosw6xM5KijAWRJCSgNg=="],
"@img/sharp-win32-x64": ["@img/sharp-win32-x64@0.34.5", "", { "os": "win32", "cpu": "x64" }, "sha512-+29YMsqY2/9eFEiW93eqWnuLcWcufowXewwSNIT6UwZdUUCrM3oFjMWH/Z6/TMmb4hlFenmfAVbpWeup2jryCw=="],
"@jridgewell/gen-mapping": ["@jridgewell/gen-mapping@0.3.13", "", { "dependencies": { "@jridgewell/sourcemap-codec": "^1.5.0", "@jridgewell/trace-mapping": "^0.3.24" } }, "sha512-2kkt/7niJ6MgEPxF0bYdQ6etZaA+fQvDcLKckhy1yIQOzaoKjBBjSj63/aLVjYE3qhRt5dvM+uUyfCg6UKCBbA=="],
"@jridgewell/remapping": ["@jridgewell/remapping@2.3.5", "", { "dependencies": { "@jridgewell/gen-mapping": "^0.3.5", "@jridgewell/trace-mapping": "^0.3.24" } }, "sha512-LI9u/+laYG4Ds1TDKSJW2YPrIlcVYOwi2fUC6xB43lueCjgxV4lffOCZCtYFiH6TNOX+tQKXx97T4IKHbhyHEQ=="],
"@jridgewell/resolve-uri": ["@jridgewell/resolve-uri@3.1.2", "", {}, "sha512-bRISgCIjP20/tbWSPWMEi54QVPRZExkuD9lJL+UIxUKtwVJA8wW1Trb1jMs1RFXo1CBTNZ/5hpC9QvmKWdopKw=="],
"@jridgewell/sourcemap-codec": ["@jridgewell/sourcemap-codec@1.5.5", "", {}, "sha512-cYQ9310grqxueWbl+WuIUIaiUaDcj7WOq5fVhEljNVgRfOUhY9fy2zTvfoqWsnebh8Sl70VScFbICvJnLKB0Og=="],
"@jridgewell/trace-mapping": ["@jridgewell/trace-mapping@0.3.31", "", { "dependencies": { "@jridgewell/resolve-uri": "^3.1.0", "@jridgewell/sourcemap-codec": "^1.4.14" } }, "sha512-zzNR+SdQSDJzc8joaeP8QQoCQr8NuYx2dIIytl1QeBEZHJ9uW6hebsrYgbz8hJwUQao3TWCMtmfV8Nu1twOLAw=="],
"@mdx-js/mdx": ["@mdx-js/mdx@3.1.1", "", { "dependencies": { "@types/estree": "^1.0.0", "@types/estree-jsx": "^1.0.0", "@types/hast": "^3.0.0", "@types/mdx": "^2.0.0", "acorn": "^8.0.0", "collapse-white-space": "^2.0.0", "devlop": "^1.0.0", "estree-util-is-identifier-name": "^3.0.0", "estree-util-scope": "^1.0.0", "estree-walker": "^3.0.0", "hast-util-to-jsx-runtime": "^2.0.0", "markdown-extensions": "^2.0.0", "recma-build-jsx": "^1.0.0", "recma-jsx": "^1.0.0", "recma-stringify": "^1.0.0", "rehype-recma": "^1.0.0", "remark-mdx": "^3.0.0", "remark-parse": "^11.0.0", "remark-rehype": "^11.0.0", "source-map": "^0.7.0", "unified": "^11.0.0", "unist-util-position-from-estree": "^2.0.0", "unist-util-stringify-position": "^4.0.0", "unist-util-visit": "^5.0.0", "vfile": "^6.0.0" } }, "sha512-f6ZO2ifpwAQIpzGWaBQT2TXxPv6z3RBzQKpVftEWN78Vl/YweF1uwussDx8ECAXVtr3Rs89fKyG9YlzUs9DyGQ=="],
"@next/env": ["@next/env@16.1.6", "", {}, "sha512-N1ySLuZjnAtN3kFnwhAwPvZah8RJxKasD7x1f8shFqhncnWZn4JMfg37diLNuoHsLAlrDfM3g4mawVdtAG8XLQ=="],
"@next/swc-darwin-arm64": ["@next/swc-darwin-arm64@16.1.6", "", { "os": "darwin", "cpu": "arm64" }, "sha512-wTzYulosJr/6nFnqGW7FrG3jfUUlEf8UjGA0/pyypJl42ExdVgC6xJgcXQ+V8QFn6niSG2Pb8+MIG1mZr2vczw=="],
"@next/swc-darwin-x64": ["@next/swc-darwin-x64@16.1.6", "", { "os": "darwin", "cpu": "x64" }, "sha512-BLFPYPDO+MNJsiDWbeVzqvYd4NyuRrEYVB5k2N3JfWncuHAy2IVwMAOlVQDFjj+krkWzhY2apvmekMkfQR0CUQ=="],
"@next/swc-linux-arm64-gnu": ["@next/swc-linux-arm64-gnu@16.1.6", "", { "os": "linux", "cpu": "arm64" }, "sha512-OJYkCd5pj/QloBvoEcJ2XiMnlJkRv9idWA/j0ugSuA34gMT6f5b7vOiCQHVRpvStoZUknhl6/UxOXL4OwtdaBw=="],
"@next/swc-linux-arm64-musl": ["@next/swc-linux-arm64-musl@16.1.6", "", { "os": "linux", "cpu": "arm64" }, "sha512-S4J2v+8tT3NIO9u2q+S0G5KdvNDjXfAv06OhfOzNDaBn5rw84DGXWndOEB7d5/x852A20sW1M56vhC/tRVbccQ=="],
"@next/swc-linux-x64-gnu": ["@next/swc-linux-x64-gnu@16.1.6", "", { "os": "linux", "cpu": "x64" }, "sha512-2eEBDkFlMMNQnkTyPBhQOAyn2qMxyG2eE7GPH2WIDGEpEILcBPI/jdSv4t6xupSP+ot/jkfrCShLAa7+ZUPcJQ=="],
"@next/swc-linux-x64-musl": ["@next/swc-linux-x64-musl@16.1.6", "", { "os": "linux", "cpu": "x64" }, "sha512-oicJwRlyOoZXVlxmIMaTq7f8pN9QNbdes0q2FXfRsPhfCi8n8JmOZJm5oo1pwDaFbnnD421rVU409M3evFbIqg=="],
"@next/swc-win32-arm64-msvc": ["@next/swc-win32-arm64-msvc@16.1.6", "", { "os": "win32", "cpu": "arm64" }, "sha512-gQmm8izDTPgs+DCWH22kcDmuUp7NyiJgEl18bcr8irXA5N2m2O+JQIr6f3ct42GOs9c0h8QF3L5SzIxcYAAXXw=="],
"@next/swc-win32-x64-msvc": ["@next/swc-win32-x64-msvc@16.1.6", "", { "os": "win32", "cpu": "x64" }, "sha512-NRfO39AIrzBnixKbjuo2YiYhB6o9d8v/ymU9m/Xk8cyVk+k7XylniXkHwjs4s70wedVffc6bQNbufk5v0xEm0A=="],
"@orama/orama": ["@orama/orama@3.1.18", "", {}, "sha512-a61ljmRVVyG5MC/698C8/FfFDw5a8LOIvyOLW5fztgUXqUpc1jOfQzOitSCbge657OgXXThmY3Tk8fpiDb4UcA=="],
"@radix-ui/number": ["@radix-ui/number@1.1.1", "", {}, "sha512-MkKCwxlXTgz6CFoJx3pCwn07GKp36+aZyu/u2Ln2VrA5DcdyCZkASEDBTd8x5whTQQL5CiYf4prXKLcgQdv29g=="],
"@radix-ui/primitive": ["@radix-ui/primitive@1.1.3", "", {}, "sha512-JTF99U/6XIjCBo0wqkU5sK10glYe27MRRsfwoiq5zzOEZLHU3A3KCMa5X/azekYRCJ0HlwI0crAXS/5dEHTzDg=="],
"@radix-ui/react-accordion": ["@radix-ui/react-accordion@1.2.12", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-collapsible": "1.1.12", "@radix-ui/react-collection": "1.1.7", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-direction": "1.1.1", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-controllable-state": "1.2.2" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-T4nygeh9YE9dLRPhAHSeOZi7HBXo+0kYIPJXayZfvWOWA0+n3dESrZbjfDPUABkUNym6Hd+f2IR113To8D2GPA=="],
"@radix-ui/react-arrow": ["@radix-ui/react-arrow@1.1.7", "", { "dependencies": { "@radix-ui/react-primitive": "2.1.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-F+M1tLhO+mlQaOWspE8Wstg+z6PwxwRd8oQ8IXceWz92kfAmalTRf0EjrouQeo7QssEPfCn05B4Ihs1K9WQ/7w=="],
"@radix-ui/react-collapsible": ["@radix-ui/react-collapsible@1.1.12", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-presence": "1.1.5", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-controllable-state": "1.2.2", "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-Uu+mSh4agx2ib1uIGPP4/CKNULyajb3p92LsVXmH2EHVMTfZWpll88XJ0j4W0z3f8NK1eYl1+Mf/szHPmcHzyA=="],
"@radix-ui/react-collection": ["@radix-ui/react-collection@1.1.7", "", { "dependencies": { "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-slot": "1.2.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-Fh9rGN0MoI4ZFUNyfFVNU4y9LUz93u9/0K+yLgA2bwRojxM8JU1DyvvMBabnZPBgMWREAJvU2jjVzq+LrFUglw=="],
"@radix-ui/react-compose-refs": ["@radix-ui/react-compose-refs@1.1.2", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-z4eqJvfiNnFMHIIvXP3CY57y2WJs5g2v3X0zm9mEJkrkNv4rDxu+sg9Jh8EkXyeqBkB7SOcboo9dMVqhyrACIg=="],
"@radix-ui/react-context": ["@radix-ui/react-context@1.1.2", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-jCi/QKUM2r1Ju5a3J64TH2A5SpKAgh0LpknyqdQ4m6DCV0xJ2HG1xARRwNGPQfi1SLdLWZ1OJz6F4OMBBNiGJA=="],
"@radix-ui/react-dialog": ["@radix-ui/react-dialog@1.1.15", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-dismissable-layer": "1.1.11", "@radix-ui/react-focus-guards": "1.1.3", "@radix-ui/react-focus-scope": "1.1.7", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-portal": "1.1.9", "@radix-ui/react-presence": "1.1.5", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-slot": "1.2.3", "@radix-ui/react-use-controllable-state": "1.2.2", "aria-hidden": "^1.2.4", "react-remove-scroll": "^2.6.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-TCglVRtzlffRNxRMEyR36DGBLJpeusFcgMVD9PZEzAKnUs1lKCgX5u9BmC2Yg+LL9MgZDugFFs1Vl+Jp4t/PGw=="],
"@radix-ui/react-direction": ["@radix-ui/react-direction@1.1.1", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-1UEWRX6jnOA2y4H5WczZ44gOOjTEmlqv1uNW4GAJEO5+bauCBhv8snY65Iw5/VOS/ghKN9gr2KjnLKxrsvoMVw=="],
"@radix-ui/react-dismissable-layer": ["@radix-ui/react-dismissable-layer@1.1.11", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-callback-ref": "1.1.1", "@radix-ui/react-use-escape-keydown": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-Nqcp+t5cTB8BinFkZgXiMJniQH0PsUt2k51FUhbdfeKvc4ACcG2uQniY/8+h1Yv6Kza4Q7lD7PQV0z0oicE0Mg=="],
"@radix-ui/react-focus-guards": ["@radix-ui/react-focus-guards@1.1.3", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-0rFg/Rj2Q62NCm62jZw0QX7a3sz6QCQU0LpZdNrJX8byRGaGVTqbrW9jAoIAHyMQqsNpeZ81YgSizOt5WXq0Pw=="],
"@radix-ui/react-focus-scope": ["@radix-ui/react-focus-scope@1.1.7", "", { "dependencies": { "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-callback-ref": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-t2ODlkXBQyn7jkl6TNaw/MtVEVvIGelJDCG41Okq/KwUsJBwQ4XVZsHAVUkK4mBv3ewiAS3PGuUWuY2BoK4ZUw=="],
"@radix-ui/react-id": ["@radix-ui/react-id@1.1.1", "", { "dependencies": { "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-kGkGegYIdQsOb4XjsfM97rXsiHaBwco+hFI66oO4s9LU+PLAC5oJ7khdOVFxkhsmlbpUqDAvXw11CluXP+jkHg=="],
"@radix-ui/react-navigation-menu": ["@radix-ui/react-navigation-menu@1.2.14", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-collection": "1.1.7", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-direction": "1.1.1", "@radix-ui/react-dismissable-layer": "1.1.11", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-presence": "1.1.5", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-callback-ref": "1.1.1", "@radix-ui/react-use-controllable-state": "1.2.2", "@radix-ui/react-use-layout-effect": "1.1.1", "@radix-ui/react-use-previous": "1.1.1", "@radix-ui/react-visually-hidden": "1.2.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-YB9mTFQvCOAQMHU+C/jVl96WmuWeltyUEpRJJky51huhds5W2FQr1J8D/16sQlf0ozxkPK8uF3niQMdUwZPv5w=="],
"@radix-ui/react-popover": ["@radix-ui/react-popover@1.1.15", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-dismissable-layer": "1.1.11", "@radix-ui/react-focus-guards": "1.1.3", "@radix-ui/react-focus-scope": "1.1.7", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-popper": "1.2.8", "@radix-ui/react-portal": "1.1.9", "@radix-ui/react-presence": "1.1.5", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-slot": "1.2.3", "@radix-ui/react-use-controllable-state": "1.2.2", "aria-hidden": "^1.2.4", "react-remove-scroll": "^2.6.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-kr0X2+6Yy/vJzLYJUPCZEc8SfQcf+1COFoAqauJm74umQhta9M7lNJHP7QQS3vkvcGLQUbWpMzwrXYwrYztHKA=="],
"@radix-ui/react-popper": ["@radix-ui/react-popper@1.2.8", "", { "dependencies": { "@floating-ui/react-dom": "^2.0.0", "@radix-ui/react-arrow": "1.1.7", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-callback-ref": "1.1.1", "@radix-ui/react-use-layout-effect": "1.1.1", "@radix-ui/react-use-rect": "1.1.1", "@radix-ui/react-use-size": "1.1.1", "@radix-ui/rect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-0NJQ4LFFUuWkE7Oxf0htBKS6zLkkjBH+hM1uk7Ng705ReR8m/uelduy1DBo0PyBXPKVnBA6YBlU94MBGXrSBCw=="],
"@radix-ui/react-portal": ["@radix-ui/react-portal@1.1.9", "", { "dependencies": { "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-bpIxvq03if6UNwXZ+HTK71JLh4APvnXntDc6XOX8UVq4XQOVl7lwok0AvIl+b8zgCw3fSaVTZMpAPPagXbKmHQ=="],
"@radix-ui/react-presence": ["@radix-ui/react-presence@1.1.5", "", { "dependencies": { "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-/jfEwNDdQVBCNvjkGit4h6pMOzq8bHkopq458dPt2lMjx+eBQUohZNG9A7DtO/O5ukSbxuaNGXMjHicgwy6rQQ=="],
"@radix-ui/react-primitive": ["@radix-ui/react-primitive@2.1.3", "", { "dependencies": { "@radix-ui/react-slot": "1.2.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-m9gTwRkhy2lvCPe6QJp4d3G1TYEUHn/FzJUtq9MjH46an1wJU+GdoGC5VLof8RX8Ft/DlpshApkhswDLZzHIcQ=="],
"@radix-ui/react-roving-focus": ["@radix-ui/react-roving-focus@1.1.11", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-collection": "1.1.7", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-direction": "1.1.1", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-callback-ref": "1.1.1", "@radix-ui/react-use-controllable-state": "1.2.2" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-7A6S9jSgm/S+7MdtNDSb+IU859vQqJ/QAtcYQcfFC6W8RS4IxIZDldLR0xqCFZ6DCyrQLjLPsxtTNch5jVA4lA=="],
"@radix-ui/react-scroll-area": ["@radix-ui/react-scroll-area@1.2.10", "", { "dependencies": { "@radix-ui/number": "1.1.1", "@radix-ui/primitive": "1.1.3", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-direction": "1.1.1", "@radix-ui/react-presence": "1.1.5", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-use-callback-ref": "1.1.1", "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-tAXIa1g3sM5CGpVT0uIbUx/U3Gs5N8T52IICuCtObaos1S8fzsrPXG5WObkQN3S6NVl6wKgPhAIiBGbWnvc97A=="],
"@radix-ui/react-select": ["@radix-ui/react-select@2.2.6", "", { "dependencies": { "@radix-ui/number": "1.1.1", "@radix-ui/primitive": "1.1.3", "@radix-ui/react-collection": "1.1.7", "@radix-ui/react-compose-refs": "1.1.2", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-direction": "1.1.1", "@radix-ui/react-dismissable-layer": "1.1.11", "@radix-ui/react-focus-guards": "1.1.3", "@radix-ui/react-focus-scope": "1.1.7", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-popper": "1.2.8", "@radix-ui/react-portal": "1.1.9", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-slot": "1.2.3", "@radix-ui/react-use-callback-ref": "1.1.1", "@radix-ui/react-use-controllable-state": "1.2.2", "@radix-ui/react-use-layout-effect": "1.1.1", "@radix-ui/react-use-previous": "1.1.1", "@radix-ui/react-visually-hidden": "1.2.3", "aria-hidden": "^1.2.4", "react-remove-scroll": "^2.6.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-I30RydO+bnn2PQztvo25tswPH+wFBjehVGtmagkU78yMdwTwVf12wnAOF+AeP8S2N8xD+5UPbGhkUfPyvT+mwQ=="],
"@radix-ui/react-slot": ["@radix-ui/react-slot@1.2.3", "", { "dependencies": { "@radix-ui/react-compose-refs": "1.1.2" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-aeNmHnBxbi2St0au6VBVC7JXFlhLlOnvIIlePNniyUNAClzmtAUEY8/pBiK3iHjufOlwA+c20/8jngo7xcrg8A=="],
"@radix-ui/react-tabs": ["@radix-ui/react-tabs@1.1.13", "", { "dependencies": { "@radix-ui/primitive": "1.1.3", "@radix-ui/react-context": "1.1.2", "@radix-ui/react-direction": "1.1.1", "@radix-ui/react-id": "1.1.1", "@radix-ui/react-presence": "1.1.5", "@radix-ui/react-primitive": "2.1.3", "@radix-ui/react-roving-focus": "1.1.11", "@radix-ui/react-use-controllable-state": "1.2.2" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-7xdcatg7/U+7+Udyoj2zodtI9H/IIopqo+YOIcZOq1nJwXWBZ9p8xiu5llXlekDbZkca79a/fozEYQXIA4sW6A=="],
"@radix-ui/react-use-callback-ref": ["@radix-ui/react-use-callback-ref@1.1.1", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-FkBMwD+qbGQeMu1cOHnuGB6x4yzPjho8ap5WtbEJ26umhgqVXbhekKUQO+hZEL1vU92a3wHwdp0HAcqAUF5iDg=="],
"@radix-ui/react-use-controllable-state": ["@radix-ui/react-use-controllable-state@1.2.2", "", { "dependencies": { "@radix-ui/react-use-effect-event": "0.0.2", "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-BjasUjixPFdS+NKkypcyyN5Pmg83Olst0+c6vGov0diwTEo6mgdqVR6hxcEgFuh4QrAs7Rc+9KuGJ9TVCj0Zzg=="],
"@radix-ui/react-use-effect-event": ["@radix-ui/react-use-effect-event@0.0.2", "", { "dependencies": { "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-Qp8WbZOBe+blgpuUT+lw2xheLP8q0oatc9UpmiemEICxGvFLYmHm9QowVZGHtJlGbS6A6yJ3iViad/2cVjnOiA=="],
"@radix-ui/react-use-escape-keydown": ["@radix-ui/react-use-escape-keydown@1.1.1", "", { "dependencies": { "@radix-ui/react-use-callback-ref": "1.1.1" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-Il0+boE7w/XebUHyBjroE+DbByORGR9KKmITzbR7MyQ4akpORYP/ZmbhAr0DG7RmmBqoOnZdy2QlvajJ2QA59g=="],
"@radix-ui/react-use-layout-effect": ["@radix-ui/react-use-layout-effect@1.1.1", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-RbJRS4UWQFkzHTTwVymMTUv8EqYhOp8dOOviLj2ugtTiXRaRQS7GLGxZTLL1jWhMeoSCf5zmcZkqTl9IiYfXcQ=="],
"@radix-ui/react-use-previous": ["@radix-ui/react-use-previous@1.1.1", "", { "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-2dHfToCj/pzca2Ck724OZ5L0EVrr3eHRNsG/b3xQJLA2hZpVCS99bLAX+hm1IHXDEnzU6by5z/5MIY794/a8NQ=="],
"@radix-ui/react-use-rect": ["@radix-ui/react-use-rect@1.1.1", "", { "dependencies": { "@radix-ui/rect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-QTYuDesS0VtuHNNvMh+CjlKJ4LJickCMUAqjlE3+j8w+RlRpwyX3apEQKGFzbZGdo7XNG1tXa+bQqIE7HIXT2w=="],
"@radix-ui/react-use-size": ["@radix-ui/react-use-size@1.1.1", "", { "dependencies": { "@radix-ui/react-use-layout-effect": "1.1.1" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-ewrXRDTAqAXlkl6t/fkXWNAhFX9I+CkKlw6zjEwk86RSPKwZr3xpBRso655aqYafwtnbpHLj6toFzmd6xdVptQ=="],
"@radix-ui/react-visually-hidden": ["@radix-ui/react-visually-hidden@1.2.3", "", { "dependencies": { "@radix-ui/react-primitive": "2.1.3" }, "peerDependencies": { "@types/react": "*", "@types/react-dom": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc", "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react", "@types/react-dom"] }, "sha512-pzJq12tEaaIhqjbzpCuv/OypJY/BPavOofm+dbab+MHLajy277+1lLm6JFcGgF5eskJ6mquGirhXY2GD/8u8Ug=="],
"@radix-ui/rect": ["@radix-ui/rect@1.1.1", "", {}, "sha512-HPwpGIzkl28mWyZqG52jiqDJ12waP11Pa1lGoiyUkIEuMLBP0oeK/C89esbXrxsky5we7dfd8U58nm0SgAWpVw=="],
"@scalar/helpers": ["@scalar/helpers@0.2.10", "", {}, "sha512-VS32setBEAGY9JifuDZKHIq8SUCUWLEfL1V+h3s5V4wcmE8OZVkzaJemsMq/YAM9e7gb9ZbkvJLL4zzEvPSrVg=="],
"@scalar/json-magic": ["@scalar/json-magic@0.9.5", "", { "dependencies": { "@scalar/helpers": "0.2.10", "yaml": "^2.8.0" } }, "sha512-+IZngReH0P+ima7y9u/f5QJD60AdISG81ezhwEVrYhsp46PiJp7YyOd0z1YLiOgwV0jkPlPo74T/FVBcM2ejuw=="],
"@scalar/openapi-parser": ["@scalar/openapi-parser@0.24.5", "", { "dependencies": { "@scalar/json-magic": "0.9.4", "@scalar/openapi-types": "0.5.3", "@scalar/openapi-upgrader": "0.1.8", "ajv": "^8.17.1", "ajv-draft-04": "^1.0.0", "ajv-formats": "^3.0.1", "jsonpointer": "^5.0.1", "leven": "^4.0.0", "yaml": "^2.8.0" } }, "sha512-pTeKnmhVdSIfG3vysgDm6jsKc7Do1vXdy/4aqp7j8AEzXllf8RZjSgRSUhtvFYFQCr27fDZ117V3WPQUYtgmCw=="],
"@scalar/openapi-types": ["@scalar/openapi-types@0.5.3", "", { "dependencies": { "zod": "^4.1.11" } }, "sha512-m4n/Su3K01d15dmdWO1LlqecdSPKuNjuokrJLdiQ485kW/hRHbXW1QP6tJL75myhw/XhX5YhYAR+jrwnGjXiMw=="],
"@scalar/openapi-upgrader": ["@scalar/openapi-upgrader@0.1.8", "", { "dependencies": { "@scalar/openapi-types": "0.5.3" } }, "sha512-2xuYLLs0fBadLIk4I1ObjMiCnOyLPEMPf24A1HtHQvhKGDnGlvT63F2rU2Xw8lxCjgHnzveMPnOJEbwIy64RCg=="],
"@shikijs/core": ["@shikijs/core@3.22.0", "", { "dependencies": { "@shikijs/types": "3.22.0", "@shikijs/vscode-textmate": "^10.0.2", "@types/hast": "^3.0.4", "hast-util-to-html": "^9.0.5" } }, "sha512-iAlTtSDDbJiRpvgL5ugKEATDtHdUVkqgHDm/gbD2ZS9c88mx7G1zSYjjOxp5Qa0eaW0MAQosFRmJSk354PRoQA=="],
"@shikijs/engine-javascript": ["@shikijs/engine-javascript@3.22.0", "", { "dependencies": { "@shikijs/types": "3.22.0", "@shikijs/vscode-textmate": "^10.0.2", "oniguruma-to-es": "^4.3.4" } }, "sha512-jdKhfgW9CRtj3Tor0L7+yPwdG3CgP7W+ZEqSsojrMzCjD1e0IxIbwUMDDpYlVBlC08TACg4puwFGkZfLS+56Tw=="],
"@shikijs/engine-oniguruma": ["@shikijs/engine-oniguruma@3.22.0", "", { "dependencies": { "@shikijs/types": "3.22.0", "@shikijs/vscode-textmate": "^10.0.2" } }, "sha512-DyXsOG0vGtNtl7ygvabHd7Mt5EY8gCNqR9Y7Lpbbd/PbJvgWrqaKzH1JW6H6qFkuUa8aCxoiYVv8/YfFljiQxA=="],
"@shikijs/langs": ["@shikijs/langs@3.22.0", "", { "dependencies": { "@shikijs/types": "3.22.0" } }, "sha512-x/42TfhWmp6H00T6uwVrdTJGKgNdFbrEdhaDwSR5fd5zhQ1Q46bHq9EO61SCEWJR0HY7z2HNDMaBZp8JRmKiIA=="],
"@shikijs/rehype": ["@shikijs/rehype@3.22.0", "", { "dependencies": { "@shikijs/types": "3.22.0", "@types/hast": "^3.0.4", "hast-util-to-string": "^3.0.1", "shiki": "3.22.0", "unified": "^11.0.5", "unist-util-visit": "^5.1.0" } }, "sha512-69b2VPc6XBy/VmAJlpBU5By+bJSBdE2nvgRCZXav7zujbrjXuT0F60DIrjKuutjPqNufuizE+E8tIZr2Yn8Z+g=="],
"@shikijs/themes": ["@shikijs/themes@3.22.0", "", { "dependencies": { "@shikijs/types": "3.22.0" } }, "sha512-o+tlOKqsr6FE4+mYJG08tfCFDS+3CG20HbldXeVoyP+cYSUxDhrFf3GPjE60U55iOkkjbpY2uC3It/eeja35/g=="],
"@shikijs/transformers": ["@shikijs/transformers@3.22.0", "", { "dependencies": { "@shikijs/core": "3.22.0", "@shikijs/types": "3.22.0" } }, "sha512-E7eRV7mwDBjueLF6852n2oYeJYxBq3NSsDk+uyruYAXONv4U8holGmIrT+mPRJQ1J1SNOH6L8G19KRzmBawrFw=="],
"@shikijs/types": ["@shikijs/types@3.22.0", "", { "dependencies": { "@shikijs/vscode-textmate": "^10.0.2", "@types/hast": "^3.0.4" } }, "sha512-491iAekgKDBFE67z70Ok5a8KBMsQ2IJwOWw3us/7ffQkIBCyOQfm/aNwVMBUriP02QshIfgHCBSIYAl3u2eWjg=="],
"@shikijs/vscode-textmate": ["@shikijs/vscode-textmate@10.0.2", "", {}, "sha512-83yeghZ2xxin3Nj8z1NMd/NCuca+gsYXswywDy5bHvwlWL8tpTQmzGeUuHd9FC3E/SBEMvzJRwWEOz5gGes9Qg=="],
"@standard-schema/spec": ["@standard-schema/spec@1.1.0", "", {}, "sha512-l2aFy5jALhniG5HgqrD6jXLi/rUWrKvqN/qJx6yoJsgKhblVd+iqqU4RCXavm/jPityDo5TCvKMnpjKnOriy0w=="],
"@swc/helpers": ["@swc/helpers@0.5.15", "", { "dependencies": { "tslib": "^2.8.0" } }, "sha512-JQ5TuMi45Owi4/BIMAJBoSQoOJu12oOk/gADqlcUL9JEdHB8vyjUSsxqeNXnmXHjYKMi2WcYtezGEEhqUI/E2g=="],
"@tailwindcss/node": ["@tailwindcss/node@4.1.18", "", { "dependencies": { "@jridgewell/remapping": "^2.3.4", "enhanced-resolve": "^5.18.3", "jiti": "^2.6.1", "lightningcss": "1.30.2", "magic-string": "^0.30.21", "source-map-js": "^1.2.1", "tailwindcss": "4.1.18" } }, "sha512-DoR7U1P7iYhw16qJ49fgXUlry1t4CpXeErJHnQ44JgTSKMaZUdf17cfn5mHchfJ4KRBZRFA/Coo+MUF5+gOaCQ=="],
"@tailwindcss/oxide": ["@tailwindcss/oxide@4.1.18", "", { "optionalDependencies": { "@tailwindcss/oxide-android-arm64": "4.1.18", "@tailwindcss/oxide-darwin-arm64": "4.1.18", "@tailwindcss/oxide-darwin-x64": "4.1.18", "@tailwindcss/oxide-freebsd-x64": "4.1.18", "@tailwindcss/oxide-linux-arm-gnueabihf": "4.1.18", "@tailwindcss/oxide-linux-arm64-gnu": "4.1.18", "@tailwindcss/oxide-linux-arm64-musl": "4.1.18", "@tailwindcss/oxide-linux-x64-gnu": "4.1.18", "@tailwindcss/oxide-linux-x64-musl": "4.1.18", "@tailwindcss/oxide-wasm32-wasi": "4.1.18", "@tailwindcss/oxide-win32-arm64-msvc": "4.1.18", "@tailwindcss/oxide-win32-x64-msvc": "4.1.18" } }, "sha512-EgCR5tTS5bUSKQgzeMClT6iCY3ToqE1y+ZB0AKldj809QXk1Y+3jB0upOYZrn9aGIzPtUsP7sX4QQ4XtjBB95A=="],
"@tailwindcss/oxide-android-arm64": ["@tailwindcss/oxide-android-arm64@4.1.18", "", { "os": "android", "cpu": "arm64" }, "sha512-dJHz7+Ugr9U/diKJA0W6N/6/cjI+ZTAoxPf9Iz9BFRF2GzEX8IvXxFIi/dZBloVJX/MZGvRuFA9rqwdiIEZQ0Q=="],
"@tailwindcss/oxide-darwin-arm64": ["@tailwindcss/oxide-darwin-arm64@4.1.18", "", { "os": "darwin", "cpu": "arm64" }, "sha512-Gc2q4Qhs660bhjyBSKgq6BYvwDz4G+BuyJ5H1xfhmDR3D8HnHCmT/BSkvSL0vQLy/nkMLY20PQ2OoYMO15Jd0A=="],
"@tailwindcss/oxide-darwin-x64": ["@tailwindcss/oxide-darwin-x64@4.1.18", "", { "os": "darwin", "cpu": "x64" }, "sha512-FL5oxr2xQsFrc3X9o1fjHKBYBMD1QZNyc1Xzw/h5Qu4XnEBi3dZn96HcHm41c/euGV+GRiXFfh2hUCyKi/e+yw=="],
"@tailwindcss/oxide-freebsd-x64": ["@tailwindcss/oxide-freebsd-x64@4.1.18", "", { "os": "freebsd", "cpu": "x64" }, "sha512-Fj+RHgu5bDodmV1dM9yAxlfJwkkWvLiRjbhuO2LEtwtlYlBgiAT4x/j5wQr1tC3SANAgD+0YcmWVrj8R9trVMA=="],
"@tailwindcss/oxide-linux-arm-gnueabihf": ["@tailwindcss/oxide-linux-arm-gnueabihf@4.1.18", "", { "os": "linux", "cpu": "arm" }, "sha512-Fp+Wzk/Ws4dZn+LV2Nqx3IilnhH51YZoRaYHQsVq3RQvEl+71VGKFpkfHrLM/Li+kt5c0DJe/bHXK1eHgDmdiA=="],
"@tailwindcss/oxide-linux-arm64-gnu": ["@tailwindcss/oxide-linux-arm64-gnu@4.1.18", "", { "os": "linux", "cpu": "arm64" }, "sha512-S0n3jboLysNbh55Vrt7pk9wgpyTTPD0fdQeh7wQfMqLPM/Hrxi+dVsLsPrycQjGKEQk85Kgbx+6+QnYNiHalnw=="],
"@tailwindcss/oxide-linux-arm64-musl": ["@tailwindcss/oxide-linux-arm64-musl@4.1.18", "", { "os": "linux", "cpu": "arm64" }, "sha512-1px92582HkPQlaaCkdRcio71p8bc8i/ap5807tPRDK/uw953cauQBT8c5tVGkOwrHMfc2Yh6UuxaH4vtTjGvHg=="],
"@tailwindcss/oxide-linux-x64-gnu": ["@tailwindcss/oxide-linux-x64-gnu@4.1.18", "", { "os": "linux", "cpu": "x64" }, "sha512-v3gyT0ivkfBLoZGF9LyHmts0Isc8jHZyVcbzio6Wpzifg/+5ZJpDiRiUhDLkcr7f/r38SWNe7ucxmGW3j3Kb/g=="],
"@tailwindcss/oxide-linux-x64-musl": ["@tailwindcss/oxide-linux-x64-musl@4.1.18", "", { "os": "linux", "cpu": "x64" }, "sha512-bhJ2y2OQNlcRwwgOAGMY0xTFStt4/wyU6pvI6LSuZpRgKQwxTec0/3Scu91O8ir7qCR3AuepQKLU/kX99FouqQ=="],
"@tailwindcss/oxide-wasm32-wasi": ["@tailwindcss/oxide-wasm32-wasi@4.1.18", "", { "dependencies": { "@emnapi/core": "^1.7.1", "@emnapi/runtime": "^1.7.1", "@emnapi/wasi-threads": "^1.1.0", "@napi-rs/wasm-runtime": "^1.1.0", "@tybys/wasm-util": "^0.10.1", "tslib": "^2.4.0" }, "cpu": "none" }, "sha512-LffYTvPjODiP6PT16oNeUQJzNVyJl1cjIebq/rWWBF+3eDst5JGEFSc5cWxyRCJ0Mxl+KyIkqRxk1XPEs9x8TA=="],
"@tailwindcss/oxide-win32-arm64-msvc": ["@tailwindcss/oxide-win32-arm64-msvc@4.1.18", "", { "os": "win32", "cpu": "arm64" }, "sha512-HjSA7mr9HmC8fu6bdsZvZ+dhjyGCLdotjVOgLA2vEqxEBZaQo9YTX4kwgEvPCpRh8o4uWc4J/wEoFzhEmjvPbA=="],
"@tailwindcss/oxide-win32-x64-msvc": ["@tailwindcss/oxide-win32-x64-msvc@4.1.18", "", { "os": "win32", "cpu": "x64" }, "sha512-bJWbyYpUlqamC8dpR7pfjA0I7vdF6t5VpUGMWRkXVE3AXgIZjYUYAK7II1GNaxR8J1SSrSrppRar8G++JekE3Q=="],
"@tailwindcss/postcss": ["@tailwindcss/postcss@4.1.18", "", { "dependencies": { "@alloc/quick-lru": "^5.2.0", "@tailwindcss/node": "4.1.18", "@tailwindcss/oxide": "4.1.18", "postcss": "^8.4.41", "tailwindcss": "4.1.18" } }, "sha512-Ce0GFnzAOuPyfV5SxjXGn0CubwGcuDB0zcdaPuCSzAa/2vII24JTkH+I6jcbXLb1ctjZMZZI6OjDaLPJQL1S0g=="],
"@types/debug": ["@types/debug@4.1.12", "", { "dependencies": { "@types/ms": "*" } }, "sha512-vIChWdVG3LG1SMxEvI/AK+FWJthlrqlTu7fbrlywTkkaONwk/UAGaULXRlf8vkzFBLVm0zkMdCquhL5aOjhXPQ=="],
"@types/estree": ["@types/estree@1.0.8", "", {}, "sha512-dWHzHa2WqEXI/O1E9OjrocMTKJl2mSrEolh1Iomrv6U+JuNwaHXsXx9bLu5gG7BUWFIN0skIQJQ/L1rIex4X6w=="],
"@types/estree-jsx": ["@types/estree-jsx@1.0.5", "", { "dependencies": { "@types/estree": "*" } }, "sha512-52CcUVNFyfb1A2ALocQw/Dd1BQFNmSdkuC3BkZ6iqhdMfQz7JWOFRuJFloOzjk+6WijU56m9oKXFAXc7o3Towg=="],
"@types/hast": ["@types/hast@3.0.4", "", { "dependencies": { "@types/unist": "*" } }, "sha512-WPs+bbQw5aCj+x6laNGWLH3wviHtoCv/P3+otBhbOhJgG8qtpdAMlTCxLtsTWA7LH1Oh/bFCHsBn0TPS5m30EQ=="],
"@types/json-schema": ["@types/json-schema@7.0.15", "", {}, "sha512-5+fP8P8MFNC+AyZCDxrB2pkZFPGzqQWUzpSeuuVLvm8VMcorNYavBqoFcxK8bQz4Qsbn4oUEEem4wDLfcysGHA=="],
"@types/mdast": ["@types/mdast@4.0.4", "", { "dependencies": { "@types/unist": "*" } }, "sha512-kGaNbPh1k7AFzgpud/gMdvIm5xuECykRR+JnWKQno9TAXVa6WIVCGTPvYGekIDL4uwCZQSYbUxNBSb1aUo79oA=="],
"@types/mdx": ["@types/mdx@2.0.13", "", {}, "sha512-+OWZQfAYyio6YkJb3HLxDrvnx6SWWDbC0zVPfBRzUk0/nqoDyf6dNxQi3eArPe8rJ473nobTMQ/8Zk+LxJ+Yuw=="],
"@types/ms": ["@types/ms@2.1.0", "", {}, "sha512-GsCCIZDE/p3i96vtEqx+7dBUGXrc7zeSK3wwPHIaRThS+9OhWIXRqzs4d6k1SVU8g91DrNRWxWUGhp5KXQb2VA=="],
"@types/node": ["@types/node@24.10.9", "", { "dependencies": { "undici-types": "~7.16.0" } }, "sha512-ne4A0IpG3+2ETuREInjPNhUGis1SFjv1d5asp8MzEAGtOZeTeHVDOYqOgqfhvseqg/iXty2hjBf1zAOb7RNiNw=="],
"@types/react": ["@types/react@19.2.10", "", { "dependencies": { "csstype": "^3.2.2" } }, "sha512-WPigyYuGhgZ/cTPRXB2EwUw+XvsRA3GqHlsP4qteqrnnjDrApbS7MxcGr/hke5iUoeB7E/gQtrs9I37zAJ0Vjw=="],
"@types/react-dom": ["@types/react-dom@19.2.3", "", { "peerDependencies": { "@types/react": "^19.2.0" } }, "sha512-jp2L/eY6fn+KgVVQAOqYItbF0VY/YApe5Mz2F0aykSO8gx31bYCZyvSeYxCHKvzHG5eZjc+zyaS5BrBWya2+kQ=="],
"@types/unist": ["@types/unist@3.0.3", "", {}, "sha512-ko/gIFJRv177XgZsZcBwnqJN5x/Gien8qNOn0D5bQU/zAzVf9Zt3BlcUiLqhV9y4ARk0GbT3tnUiPNgnTXzc/Q=="],
"@ungap/structured-clone": ["@ungap/structured-clone@1.3.0", "", {}, "sha512-WmoN8qaIAo7WTYWbAZuG8PYEhn5fkz7dZrqTBZ7dtt//lL2Gwms1IcnQ5yHqjDfX8Ft5j4YzDM23f87zBfDe9g=="],
"acorn": ["acorn@8.15.0", "", { "bin": { "acorn": "bin/acorn" } }, "sha512-NZyJarBfL7nWwIq+FDL6Zp/yHEhePMNnnJ0y3qfieCrmNvYct8uvtiV41UvlSe6apAfk0fY1FbWx+NwfmpvtTg=="],
"acorn-jsx": ["acorn-jsx@5.3.2", "", { "peerDependencies": { "acorn": "^6.0.0 || ^7.0.0 || ^8.0.0" } }, "sha512-rq9s+JNhf0IChjtDXxllJ7g41oZk5SlXtp0LHwyA5cejwn7vKmKp4pPri6YEePv2PU65sAsegbXtIinmDFDXgQ=="],
"ajv": ["ajv@8.17.1", "", { "dependencies": { "fast-deep-equal": "^3.1.3", "fast-uri": "^3.0.1", "json-schema-traverse": "^1.0.0", "require-from-string": "^2.0.2" } }, "sha512-B/gBuNg5SiMTrPkC+A2+cW0RszwxYmn6VYxB/inlBStS5nx6xHIt/ehKRhIMhqusl7a8LjQoZnjCs5vhwxOQ1g=="],
"ajv-draft-04": ["ajv-draft-04@1.0.0", "", { "peerDependencies": { "ajv": "^8.5.0" }, "optionalPeers": ["ajv"] }, "sha512-mv00Te6nmYbRp5DCwclxtt7yV/joXJPGS7nM+97GdxvuttCOfgI3K4U25zboyeX0O+myI8ERluxQe5wljMmVIw=="],
"ajv-formats": ["ajv-formats@3.0.1", "", { "dependencies": { "ajv": "^8.0.0" } }, "sha512-8iUql50EUR+uUcdRQ3HDqa6EVyo3docL8g5WJ3FNcWmu62IbkGUue/pEyLBW8VGKKucTPgqeks4fIU1DA4yowQ=="],
"argparse": ["argparse@2.0.1", "", {}, "sha512-8+9WqebbFzpX9OR+Wa6O29asIogeRMzcGtAINdpMHHyAg10f05aSFVBbcEqGf/PXw1EjAZ+q2/bEBg3DvurK3Q=="],
"aria-hidden": ["aria-hidden@1.2.6", "", { "dependencies": { "tslib": "^2.0.0" } }, "sha512-ik3ZgC9dY/lYVVM++OISsaYDeg1tb0VtP5uL3ouh1koGOaUMDPpbFIei4JkFimWUFPn90sbMNMXQAIVOlnYKJA=="],
"astring": ["astring@1.9.0", "", { "bin": { "astring": "bin/astring" } }, "sha512-LElXdjswlqjWrPpJFg1Fx4wpkOCxj1TDHlSV4PlaRxHGWko024xICaa97ZkMfs6DRKlCguiAI+rbXv5GWwXIkg=="],
"bail": ["bail@2.0.2", "", {}, "sha512-0xO6mYd7JB2YesxDKplafRpsiOzPt9V02ddPCLbY1xYGPOX24NTyN50qnUxgCPcSoYMhKpAuBTjQoRZCAkUDRw=="],
"baseline-browser-mapping": ["baseline-browser-mapping@2.9.19", "", { "bin": { "baseline-browser-mapping": "dist/cli.js" } }, "sha512-ipDqC8FrAl/76p2SSWKSI+H9tFwm7vYqXQrItCuiVPt26Km0jS+NzSsBWAaBusvSbQcfJG+JitdMm+wZAgTYqg=="],
"caniuse-lite": ["caniuse-lite@1.0.30001766", "", {}, "sha512-4C0lfJ0/YPjJQHagaE9x2Elb69CIqEPZeG0anQt9SIvIoOH4a4uaRl73IavyO+0qZh6MDLH//DrXThEYKHkmYA=="],
"ccount": ["ccount@2.0.1", "", {}, "sha512-eyrF0jiFpY+3drT6383f1qhkbGsLSifNAjA61IUjZjmLCWjItY6LB9ft9YhoDgwfmclB2zhu51Lc7+95b8NRAg=="],
"character-entities": ["character-entities@2.0.2", "", {}, "sha512-shx7oQ0Awen/BRIdkjkvz54PnEEI/EjwXDSIZp86/KKdbafHh1Df/RYGBhn4hbe2+uKC9FnT5UCEdyPz3ai9hQ=="],
"character-entities-html4": ["character-entities-html4@2.1.0", "", {}, "sha512-1v7fgQRj6hnSwFpq1Eu0ynr/CDEw0rXo2B61qXrLNdHZmPKgb7fqS1a2JwF0rISo9q77jDI8VMEHoApn8qDoZA=="],
"character-entities-legacy": ["character-entities-legacy@3.0.0", "", {}, "sha512-RpPp0asT/6ufRm//AJVwpViZbGM/MkjQFxJccQRHmISF/22NBtsHqAWmL+/pmkPWoIUJdWyeVleTl1wydHATVQ=="],
"character-reference-invalid": ["character-reference-invalid@2.0.1", "", {}, "sha512-iBZ4F4wRbyORVsu0jPV7gXkOsGYjGHPmAyv+HiHG8gi5PtC9KI2j1+v8/tlibRvjoWX027ypmG/n0HtO5t7unw=="],
"chokidar": ["chokidar@4.0.3", "", { "dependencies": { "readdirp": "^4.0.1" } }, "sha512-Qgzu8kfBvo+cA4962jnP1KkS6Dop5NS6g7R5LFYJr4b8Ub94PPQXUksCw9PvXoeXPRRddRNC5C1JQUR2SMGtnA=="],
"class-variance-authority": ["class-variance-authority@0.7.1", "", { "dependencies": { "clsx": "^2.1.1" } }, "sha512-Ka+9Trutv7G8M6WT6SeiRWz792K5qEqIGEGzXKhAE6xOWAY6pPH8U+9IY3oCMv6kqTmLsv7Xh/2w2RigkePMsg=="],
"client-only": ["client-only@0.0.1", "", {}, "sha512-IV3Ou0jSMzZrd3pZ48nLkT9DA7Ag1pnPzaiQhpW7c3RbcqqzvzzVu+L8gfqMp/8IM2MQtSiqaCxrrcfu8I8rMA=="],
"clsx": ["clsx@2.1.1", "", {}, "sha512-eYm0QWBtUrBWZWG0d386OGAw16Z995PiOVo2B7bjWSbHedGl5e0ZWaq65kOGgUSNesEIDkB9ISbTg/JK9dhCZA=="],
"collapse-white-space": ["collapse-white-space@2.1.0", "", {}, "sha512-loKTxY1zCOuG4j9f6EPnuyyYkf58RnhhWTvRoZEokgB+WbdXehfjFviyOVYkqzEWz1Q5kRiZdBYS5SwxbQYwzw=="],
"comma-separated-tokens": ["comma-separated-tokens@2.0.3", "", {}, "sha512-Fu4hJdvzeylCfQPp9SGWidpzrMs7tTrlu6Vb8XGaRGck8QSNZJJp538Wrb60Lax4fPwR64ViY468OIUTbRlGZg=="],
"compute-scroll-into-view": ["compute-scroll-into-view@3.1.1", "", {}, "sha512-VRhuHOLoKYOy4UbilLbUzbYg93XLjv2PncJC50EuTWPA3gaja1UjBsUP/D/9/juV3vQFr6XBEzn9KCAHdUvOHw=="],
"cssesc": ["cssesc@3.0.0", "", { "bin": { "cssesc": "bin/cssesc" } }, "sha512-/Tb/JcjK111nNScGob5MNtsntNM1aCNUDipB/TkwZFhyDrrE47SOx/18wF2bbjgc3ZzCSKW1T5nt5EbFoAz/Vg=="],
"csstype": ["csstype@3.2.3", "", {}, "sha512-z1HGKcYy2xA8AGQfwrn0PAy+PB7X/GSj3UVJW9qKyn43xWa+gl5nXmU4qqLMRzWVLFC8KusUX8T/0kCiOYpAIQ=="],
"debug": ["debug@4.4.3", "", { "dependencies": { "ms": "^2.1.3" } }, "sha512-RGwwWnwQvkVfavKVt22FGLw+xYSdzARwm0ru6DhTVA3umU5hZc28V3kO4stgYryrTlLpuvgI9GiijltAjNbcqA=="],
"decode-named-character-reference": ["decode-named-character-reference@1.3.0", "", { "dependencies": { "character-entities": "^2.0.0" } }, "sha512-GtpQYB283KrPp6nRw50q3U9/VfOutZOe103qlN7BPP6Ad27xYnOIWv4lPzo8HCAL+mMZofJ9KEy30fq6MfaK6Q=="],
"dequal": ["dequal@2.0.3", "", {}, "sha512-0je+qPKHEMohvfRTCEo3CrPG6cAzAYgmzKyxRiYSSDkS6eGJdyVJm7WaYA5ECaAD9wLB2T4EEeymA5aFVcYXCA=="],
"detect-libc": ["detect-libc@2.1.2", "", {}, "sha512-Btj2BOOO83o3WyH59e8MgXsxEQVcarkUOpEYrubB0urwnN10yQ364rsiByU11nZlqWYZm05i/of7io4mzihBtQ=="],
"detect-node-es": ["detect-node-es@1.1.0", "", {}, "sha512-ypdmJU/TbBby2Dxibuv7ZLW3Bs1QEmM7nHjEANfohJLvE0XVujisn1qPJcZxg+qDucsr+bP6fLD1rPS3AhJ7EQ=="],
"devlop": ["devlop@1.1.0", "", { "dependencies": { "dequal": "^2.0.0" } }, "sha512-RWmIqhcFf1lRYBvNmr7qTNuyCt/7/ns2jbpp1+PalgE/rDQcBT0fioSMUpJ93irlUhC5hrg4cYqe6U+0ImW0rA=="],
"enhanced-resolve": ["enhanced-resolve@5.18.4", "", { "dependencies": { "graceful-fs": "^4.2.4", "tapable": "^2.2.0" } }, "sha512-LgQMM4WXU3QI+SYgEc2liRgznaD5ojbmY3sb8LxyguVkIg5FxdpTkvk72te2R38/TGKxH634oLxXRGY6d7AP+Q=="],
"esast-util-from-estree": ["esast-util-from-estree@2.0.0", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "devlop": "^1.0.0", "estree-util-visit": "^2.0.0", "unist-util-position-from-estree": "^2.0.0" } }, "sha512-4CyanoAudUSBAn5K13H4JhsMH6L9ZP7XbLVe/dKybkxMO7eDyLsT8UHl9TRNrU2Gr9nz+FovfSIjuXWJ81uVwQ=="],
"esast-util-from-js": ["esast-util-from-js@2.0.1", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "acorn": "^8.0.0", "esast-util-from-estree": "^2.0.0", "vfile-message": "^4.0.0" } }, "sha512-8Ja+rNJ0Lt56Pcf3TAmpBZjmx8ZcK5Ts4cAzIOjsjevg9oSXJnl6SUQ2EevU8tv3h6ZLWmoKL5H4fgWvdvfETw=="],
"esbuild": ["esbuild@0.25.12", "", { "optionalDependencies": { "@esbuild/aix-ppc64": "0.25.12", "@esbuild/android-arm": "0.25.12", "@esbuild/android-arm64": "0.25.12", "@esbuild/android-x64": "0.25.12", "@esbuild/darwin-arm64": "0.25.12", "@esbuild/darwin-x64": "0.25.12", "@esbuild/freebsd-arm64": "0.25.12", "@esbuild/freebsd-x64": "0.25.12", "@esbuild/linux-arm": "0.25.12", "@esbuild/linux-arm64": "0.25.12", "@esbuild/linux-ia32": "0.25.12", "@esbuild/linux-loong64": "0.25.12", "@esbuild/linux-mips64el": "0.25.12", "@esbuild/linux-ppc64": "0.25.12", "@esbuild/linux-riscv64": "0.25.12", "@esbuild/linux-s390x": "0.25.12", "@esbuild/linux-x64": "0.25.12", "@esbuild/netbsd-arm64": "0.25.12", "@esbuild/netbsd-x64": "0.25.12", "@esbuild/openbsd-arm64": "0.25.12", "@esbuild/openbsd-x64": "0.25.12", "@esbuild/openharmony-arm64": "0.25.12", "@esbuild/sunos-x64": "0.25.12", "@esbuild/win32-arm64": "0.25.12", "@esbuild/win32-ia32": "0.25.12", "@esbuild/win32-x64": "0.25.12" }, "bin": { "esbuild": "bin/esbuild" } }, "sha512-bbPBYYrtZbkt6Os6FiTLCTFxvq4tt3JKall1vRwshA3fdVztsLAatFaZobhkBC8/BrPetoa0oksYoKXoG4ryJg=="],
"escape-string-regexp": ["escape-string-regexp@5.0.0", "", {}, "sha512-/veY75JbMK4j1yjvuUxuVsiS/hr/4iHs9FTT6cgTexxdE0Ly/glccBAkloH/DofkjRbZU3bnoj38mOmhkZ0lHw=="],
"estree-util-attach-comments": ["estree-util-attach-comments@3.0.0", "", { "dependencies": { "@types/estree": "^1.0.0" } }, "sha512-cKUwm/HUcTDsYh/9FgnuFqpfquUbwIqwKM26BVCGDPVgvaCl/nDCCjUfiLlx6lsEZ3Z4RFxNbOQ60pkaEwFxGw=="],
"estree-util-build-jsx": ["estree-util-build-jsx@3.0.1", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "devlop": "^1.0.0", "estree-util-is-identifier-name": "^3.0.0", "estree-walker": "^3.0.0" } }, "sha512-8U5eiL6BTrPxp/CHbs2yMgP8ftMhR5ww1eIKoWRMlqvltHF8fZn5LRDvTKuxD3DUn+shRbLGqXemcP51oFCsGQ=="],
"estree-util-is-identifier-name": ["estree-util-is-identifier-name@3.0.0", "", {}, "sha512-hFtqIDZTIUZ9BXLb8y4pYGyk6+wekIivNVTcmvk8NoOh+VeRn5y6cEHzbURrWbfp1fIqdVipilzj+lfaadNZmg=="],
"estree-util-scope": ["estree-util-scope@1.0.0", "", { "dependencies": { "@types/estree": "^1.0.0", "devlop": "^1.0.0" } }, "sha512-2CAASclonf+JFWBNJPndcOpA8EMJwa0Q8LUFJEKqXLW6+qBvbFZuF5gItbQOs/umBUkjviCSDCbBwU2cXbmrhQ=="],
"estree-util-to-js": ["estree-util-to-js@2.0.0", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "astring": "^1.8.0", "source-map": "^0.7.0" } }, "sha512-WDF+xj5rRWmD5tj6bIqRi6CkLIXbbNQUcxQHzGysQzvHmdYG2G7p/Tf0J0gpxGgkeMZNTIjT/AoSvC9Xehcgdg=="],
"estree-util-value-to-estree": ["estree-util-value-to-estree@3.5.0", "", { "dependencies": { "@types/estree": "^1.0.0" } }, "sha512-aMV56R27Gv3QmfmF1MY12GWkGzzeAezAX+UplqHVASfjc9wNzI/X6hC0S9oxq61WT4aQesLGslWP9tKk6ghRZQ=="],
"estree-util-visit": ["estree-util-visit@2.0.0", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "@types/unist": "^3.0.0" } }, "sha512-m5KgiH85xAhhW8Wta0vShLcUvOsh3LLPI2YVwcbio1l7E09NTLL1EyMZFM1OyWowoH0skScNbhOPl4kcBgzTww=="],
"estree-walker": ["estree-walker@3.0.3", "", { "dependencies": { "@types/estree": "^1.0.0" } }, "sha512-7RUKfXgSMMkzt6ZuXmqapOurLGPPfgj6l9uRZ7lRGolvk0y2yocc35LdcxKC5PQZdn2DMqioAQ2NoWcrTKmm6g=="],
"extend": ["extend@3.0.2", "", {}, "sha512-fjquC59cD7CyW6urNXK0FBufkZcoiGG80wTuPujX590cB5Ttln20E2UB4S/WARVqhXffZl2LNgS+gQdPIIim/g=="],
"fast-deep-equal": ["fast-deep-equal@3.1.3", "", {}, "sha512-f3qQ9oQy9j2AhBe/H9VC91wLmKBCCU/gDOnKNAYG5hswO7BLKj09Hc5HYNz9cGI++xlpDCIgDaitVs03ATR84Q=="],
"fast-uri": ["fast-uri@3.1.0", "", {}, "sha512-iPeeDKJSWf4IEOasVVrknXpaBV0IApz/gp7S2bb7Z4Lljbl2MGJRqInZiUrQwV16cpzw/D3S5j5Julj/gT52AA=="],
"fast-xml-parser": ["fast-xml-parser@4.5.3", "", { "dependencies": { "strnum": "^1.1.1" }, "bin": { "fxparser": "src/cli/cli.js" } }, "sha512-RKihhV+SHsIUGXObeVy9AXiBbFwkVk7Syp8XgwN5U3JV416+Gwp/GO9i0JYKmikykgz/UHRrrV4ROuZEo/T0ig=="],
"fdir": ["fdir@6.5.0", "", { "peerDependencies": { "picomatch": "^3 || ^4" }, "optionalPeers": ["picomatch"] }, "sha512-tIbYtZbucOs0BRGqPJkshJUYdL+SDH7dVM8gjy+ERp3WAUjLEFJE+02kanyHtwjWOnwrKYBiwAmM0p4kLJAnXg=="],
"foreach": ["foreach@2.0.6", "", {}, "sha512-k6GAGDyqLe9JaebCsFCoudPPWfihKu8pylYXRlqP1J7ms39iPoTtk2fviNglIeQEwdh0bQeKJ01ZPyuyQvKzwg=="],
"fumadocs-core": ["fumadocs-core@16.4.11", "", { "dependencies": { "@formatjs/intl-localematcher": "^0.8.0", "@orama/orama": "^3.1.18", "@shikijs/rehype": "^3.21.0", "@shikijs/transformers": "^3.21.0", "estree-util-value-to-estree": "^3.5.0", "github-slugger": "^2.0.0", "hast-util-to-estree": "^3.1.3", "hast-util-to-jsx-runtime": "^2.3.6", "image-size": "^2.0.2", "negotiator": "^1.0.0", "npm-to-yarn": "^3.0.1", "path-to-regexp": "^8.3.0", "remark": "^15.0.1", "remark-gfm": "^4.0.1", "remark-rehype": "^11.1.2", "scroll-into-view-if-needed": "^3.1.0", "shiki": "^3.21.0", "tinyglobby": "^0.2.15", "unist-util-visit": "^5.1.0" }, "peerDependencies": { "@mixedbread/sdk": "^0.46.0", "@orama/core": "1.x.x", "@oramacloud/client": "2.x.x", "@tanstack/react-router": "1.x.x", "@types/react": "*", "algoliasearch": "5.x.x", "lucide-react": "*", "next": "16.x.x", "react": "^19.2.0", "react-dom": "^19.2.0", "react-router": "7.x.x", "waku": "^0.26.0 || ^0.27.0", "zod": "4.x.x" }, "optionalPeers": ["@mixedbread/sdk", "@orama/core", "@oramacloud/client", "@tanstack/react-router", "@types/react", "algoliasearch", "lucide-react", "next", "react", "react-dom", "react-router", "waku", "zod"] }, "sha512-ORjWgYetxDgyHZocuvEghfxt6tuEPWE+Km5KvwNKlXPxcNdBIiSVCED8WEMwiw1n/FZ/ys+W+BOe58ZXxhWg2A=="],
"fumadocs-mdx": ["fumadocs-mdx@13.0.8", "", { "dependencies": { "@mdx-js/mdx": "^3.1.1", "@standard-schema/spec": "^1.0.0", "chokidar": "^4.0.3", "esbuild": "^0.25.12", "estree-util-value-to-estree": "^3.5.0", "js-yaml": "^4.1.0", "lru-cache": "^11.2.2", "mdast-util-to-markdown": "^2.1.2", "picocolors": "^1.1.1", "picomatch": "^4.0.3", "remark-mdx": "^3.1.1", "tinyexec": "^1.0.2", "tinyglobby": "^0.2.15", "unified": "^11.0.5", "unist-util-remove-position": "^5.0.0", "unist-util-visit": "^5.0.0", "zod": "^4.1.12" }, "peerDependencies": { "@fumadocs/mdx-remote": "^1.4.0", "fumadocs-core": "^15.0.0 || ^16.0.0", "next": "^15.3.0 || ^16.0.0", "react": "*", "vite": "6.x.x || 7.x.x" }, "optionalPeers": ["@fumadocs/mdx-remote", "next", "react", "vite"], "bin": { "fumadocs-mdx": "dist/bin.js" } }, "sha512-UbUwH0iGvYbytnxhmfd7tWJKFK8L0mrbTAmrQYnpg6Wi/h8afNMJmbHBOzVcaEWJKeFipZ1CGDAsNA2fztwXNg=="],
"fumadocs-openapi": ["fumadocs-openapi@10.2.7", "", { "dependencies": { "@fumari/json-schema-to-typescript": "^2.0.0", "@fumari/stf": "^0.0.1", "@radix-ui/react-accordion": "^1.2.12", "@radix-ui/react-dialog": "^1.1.15", "@radix-ui/react-select": "^2.2.6", "@radix-ui/react-slot": "^1.2.4", "@scalar/json-magic": "^0.9.4", "@scalar/openapi-parser": "0.24.5", "ajv": "^8.17.1", "class-variance-authority": "^0.7.1", "github-slugger": "^2.0.0", "hast-util-to-jsx-runtime": "^2.3.6", "js-yaml": "^4.1.1", "lucide-react": "^0.563.0", "next-themes": "^0.4.6", "openapi-sampler": "^1.6.2", "react-hook-form": "^7.71.1", "remark": "^15.0.1", "remark-rehype": "^11.1.2", "tailwind-merge": "^3.4.0", "xml-js": "^1.6.11" }, "peerDependencies": { "@scalar/api-client-react": "*", "@types/react": "*", "fumadocs-core": "^16.2.0", "fumadocs-ui": "^16.2.0", "react": "^19.2.0", "react-dom": "^19.2.0" }, "optionalPeers": ["@scalar/api-client-react", "@types/react"] }, "sha512-V24iseZFHmUyPdVEH/nyR1205mltOamlHXvAGtJx9FteKj0li0Rf7o7EPkV9Mby202ReG2CIic1cR2oWa+i7Jg=="],
"fumadocs-ui": ["fumadocs-ui@16.4.11", "", { "dependencies": { "@fumadocs/ui": "16.4.11", "@radix-ui/react-accordion": "^1.2.12", "@radix-ui/react-collapsible": "^1.1.12", "@radix-ui/react-dialog": "^1.1.15", "@radix-ui/react-direction": "^1.1.1", "@radix-ui/react-navigation-menu": "^1.2.14", "@radix-ui/react-popover": "^1.1.15", "@radix-ui/react-presence": "^1.1.5", "@radix-ui/react-scroll-area": "^1.2.10", "@radix-ui/react-slot": "^1.2.4", "@radix-ui/react-tabs": "^1.1.13", "class-variance-authority": "^0.7.1", "lucide-react": "^0.563.0", "next-themes": "^0.4.6", "react-medium-image-zoom": "^5.4.0", "scroll-into-view-if-needed": "^3.1.0" }, "peerDependencies": { "@types/react": "*", "fumadocs-core": "16.4.11", "next": "16.x.x", "react": "^19.2.0", "react-dom": "^19.2.0", "tailwindcss": "^4.0.0" }, "optionalPeers": ["@types/react", "next", "tailwindcss"] }, "sha512-LFOzdnNFAFkOHzsUtCMi8cyal1pIZqygoQKSET0LO/C5JOk1YQKAZqiut1jf6pv6o0OKXacDk+MY7kfn61309A=="],
"get-nonce": ["get-nonce@1.0.1", "", {}, "sha512-FJhYRoDaiatfEkUK8HKlicmu/3SGFD51q3itKDGoSTysQJBnfOcxU5GxnhE1E6soB76MbT0MBtnKJuXyAx+96Q=="],
"github-slugger": ["github-slugger@2.0.0", "", {}, "sha512-IaOQ9puYtjrkq7Y0Ygl9KDZnrf/aiUJYUpVf89y8kyaxbRG7Y1SrX/jaumrv81vc61+kiMempujsM3Yw7w5qcw=="],
"graceful-fs": ["graceful-fs@4.2.11", "", {}, "sha512-RbJ5/jmFcNNCcDV5o9eTnBLJ/HszWV0P73bc+Ff4nS/rJj+YaS6IGyiOL0VoBYX+l1Wrl3k63h/KrH+nhJ0XvQ=="],
"hast-util-to-estree": ["hast-util-to-estree@3.1.3", "", { "dependencies": { "@types/estree": "^1.0.0", "@types/estree-jsx": "^1.0.0", "@types/hast": "^3.0.0", "comma-separated-tokens": "^2.0.0", "devlop": "^1.0.0", "estree-util-attach-comments": "^3.0.0", "estree-util-is-identifier-name": "^3.0.0", "hast-util-whitespace": "^3.0.0", "mdast-util-mdx-expression": "^2.0.0", "mdast-util-mdx-jsx": "^3.0.0", "mdast-util-mdxjs-esm": "^2.0.0", "property-information": "^7.0.0", "space-separated-tokens": "^2.0.0", "style-to-js": "^1.0.0", "unist-util-position": "^5.0.0", "zwitch": "^2.0.0" } }, "sha512-48+B/rJWAp0jamNbAAf9M7Uf//UVqAoMmgXhBdxTDJLGKY+LRnZ99qcG+Qjl5HfMpYNzS5v4EAwVEF34LeAj7w=="],
"hast-util-to-html": ["hast-util-to-html@9.0.5", "", { "dependencies": { "@types/hast": "^3.0.0", "@types/unist": "^3.0.0", "ccount": "^2.0.0", "comma-separated-tokens": "^2.0.0", "hast-util-whitespace": "^3.0.0", "html-void-elements": "^3.0.0", "mdast-util-to-hast": "^13.0.0", "property-information": "^7.0.0", "space-separated-tokens": "^2.0.0", "stringify-entities": "^4.0.0", "zwitch": "^2.0.4" } }, "sha512-OguPdidb+fbHQSU4Q4ZiLKnzWo8Wwsf5bZfbvu7//a9oTYoqD/fWpe96NuHkoS9h0ccGOTe0C4NGXdtS0iObOw=="],
"hast-util-to-jsx-runtime": ["hast-util-to-jsx-runtime@2.3.6", "", { "dependencies": { "@types/estree": "^1.0.0", "@types/hast": "^3.0.0", "@types/unist": "^3.0.0", "comma-separated-tokens": "^2.0.0", "devlop": "^1.0.0", "estree-util-is-identifier-name": "^3.0.0", "hast-util-whitespace": "^3.0.0", "mdast-util-mdx-expression": "^2.0.0", "mdast-util-mdx-jsx": "^3.0.0", "mdast-util-mdxjs-esm": "^2.0.0", "property-information": "^7.0.0", "space-separated-tokens": "^2.0.0", "style-to-js": "^1.0.0", "unist-util-position": "^5.0.0", "vfile-message": "^4.0.0" } }, "sha512-zl6s8LwNyo1P9uw+XJGvZtdFF1GdAkOg8ujOw+4Pyb76874fLps4ueHXDhXWdk6YHQ6OgUtinliG7RsYvCbbBg=="],
"hast-util-to-string": ["hast-util-to-string@3.0.1", "", { "dependencies": { "@types/hast": "^3.0.0" } }, "sha512-XelQVTDWvqcl3axRfI0xSeoVKzyIFPwsAGSLIsKdJKQMXDYJS4WYrBNF/8J7RdhIcFI2BOHgAifggsvsxp/3+A=="],
"hast-util-whitespace": ["hast-util-whitespace@3.0.0", "", { "dependencies": { "@types/hast": "^3.0.0" } }, "sha512-88JUN06ipLwsnv+dVn+OIYOvAuvBMy/Qoi6O7mQHxdPXpjy+Cd6xRkWwux7DKO+4sYILtLBRIKgsdpS2gQc7qw=="],
"html-void-elements": ["html-void-elements@3.0.0", "", {}, "sha512-bEqo66MRXsUGxWHV5IP0PUiAWwoEjba4VCzg0LjFJBpchPaTfyfCKTG6bc5F8ucKec3q5y6qOdGyYTSBEvhCrg=="],
"image-size": ["image-size@2.0.2", "", { "bin": { "image-size": "bin/image-size.js" } }, "sha512-IRqXKlaXwgSMAMtpNzZa1ZAe8m+Sa1770Dhk8VkSsP9LS+iHD62Zd8FQKs8fbPiagBE7BzoFX23cxFnwshpV6w=="],
"inline-style-parser": ["inline-style-parser@0.2.7", "", {}, "sha512-Nb2ctOyNR8DqQoR0OwRG95uNWIC0C1lCgf5Naz5H6Ji72KZ8OcFZLz2P5sNgwlyoJ8Yif11oMuYs5pBQa86csA=="],
"is-alphabetical": ["is-alphabetical@2.0.1", "", {}, "sha512-FWyyY60MeTNyeSRpkM2Iry0G9hpr7/9kD40mD/cGQEuilcZYS4okz8SN2Q6rLCJ8gbCt6fN+rC+6tMGS99LaxQ=="],
"is-alphanumerical": ["is-alphanumerical@2.0.1", "", { "dependencies": { "is-alphabetical": "^2.0.0", "is-decimal": "^2.0.0" } }, "sha512-hmbYhX/9MUMF5uh7tOXyK/n0ZvWpad5caBA17GsC6vyuCqaWliRG5K1qS9inmUhEMaOBIW7/whAnSwveW/LtZw=="],
"is-decimal": ["is-decimal@2.0.1", "", {}, "sha512-AAB9hiomQs5DXWcRB1rqsxGUstbRroFOPPVAomNk/3XHR5JyEZChOyTWe2oayKnsSsr/kcGqF+z6yuH6HHpN0A=="],
"is-hexadecimal": ["is-hexadecimal@2.0.1", "", {}, "sha512-DgZQp241c8oO6cA1SbTEWiXeoxV42vlcJxgH+B3hi1AiqqKruZR3ZGF8In3fj4+/y/7rHvlOZLZtgJ/4ttYGZg=="],
"is-plain-obj": ["is-plain-obj@4.1.0", "", {}, "sha512-+Pgi+vMuUNkJyExiMBt5IlFoMyKnr5zhJ4Uspz58WOhBF5QoIZkFyNHIbBAtHwzVAgk5RtndVNsDRN61/mmDqg=="],
"jiti": ["jiti@2.6.1", "", { "bin": { "jiti": "lib/jiti-cli.mjs" } }, "sha512-ekilCSN1jwRvIbgeg/57YFh8qQDNbwDb9xT/qu2DAHbFFZUicIl4ygVaAvzveMhMVr3LnpSKTNnwt8PoOfmKhQ=="],
"js-yaml": ["js-yaml@4.1.1", "", { "dependencies": { "argparse": "^2.0.1" }, "bin": { "js-yaml": "bin/js-yaml.js" } }, "sha512-qQKT4zQxXl8lLwBtHMWwaTcGfFOZviOJet3Oy/xmGk2gZH677CJM9EvtfdSkgWcATZhj/55JZ0rmy3myCT5lsA=="],
"json-pointer": ["json-pointer@0.6.2", "", { "dependencies": { "foreach": "^2.0.4" } }, "sha512-vLWcKbOaXlO+jvRy4qNd+TI1QUPZzfJj1tpJ3vAXDych5XJf93ftpUKe5pKCrzyIIwgBJcOcCVRUfqQP25afBw=="],
"json-schema-traverse": ["json-schema-traverse@1.0.0", "", {}, "sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug=="],
"jsonpointer": ["jsonpointer@5.0.1", "", {}, "sha512-p/nXbhSEcu3pZRdkW1OfJhpsVtW1gd4Wa1fnQc9YLiTfAjn0312eMKimbdIQzuZl9aa9xUGaRlP9T/CJE/ditQ=="],
"leven": ["leven@4.1.0", "", {}, "sha512-KZ9W9nWDT7rF7Dazg8xyLHGLrmpgq2nVNFUckhqdW3szVP6YhCpp/RAnpmVExA9JvrMynjwSLVrEj3AepHR6ew=="],
"lightningcss": ["lightningcss@1.30.2", "", { "dependencies": { "detect-libc": "^2.0.3" }, "optionalDependencies": { "lightningcss-android-arm64": "1.30.2", "lightningcss-darwin-arm64": "1.30.2", "lightningcss-darwin-x64": "1.30.2", "lightningcss-freebsd-x64": "1.30.2", "lightningcss-linux-arm-gnueabihf": "1.30.2", "lightningcss-linux-arm64-gnu": "1.30.2", "lightningcss-linux-arm64-musl": "1.30.2", "lightningcss-linux-x64-gnu": "1.30.2", "lightningcss-linux-x64-musl": "1.30.2", "lightningcss-win32-arm64-msvc": "1.30.2", "lightningcss-win32-x64-msvc": "1.30.2" } }, "sha512-utfs7Pr5uJyyvDETitgsaqSyjCb2qNRAtuqUeWIAKztsOYdcACf2KtARYXg2pSvhkt+9NfoaNY7fxjl6nuMjIQ=="],
"lightningcss-android-arm64": ["lightningcss-android-arm64@1.30.2", "", { "os": "android", "cpu": "arm64" }, "sha512-BH9sEdOCahSgmkVhBLeU7Hc9DWeZ1Eb6wNS6Da8igvUwAe0sqROHddIlvU06q3WyXVEOYDZ6ykBZQnjTbmo4+A=="],
"lightningcss-darwin-arm64": ["lightningcss-darwin-arm64@1.30.2", "", { "os": "darwin", "cpu": "arm64" }, "sha512-ylTcDJBN3Hp21TdhRT5zBOIi73P6/W0qwvlFEk22fkdXchtNTOU4Qc37SkzV+EKYxLouZ6M4LG9NfZ1qkhhBWA=="],
"lightningcss-darwin-x64": ["lightningcss-darwin-x64@1.30.2", "", { "os": "darwin", "cpu": "x64" }, "sha512-oBZgKchomuDYxr7ilwLcyms6BCyLn0z8J0+ZZmfpjwg9fRVZIR5/GMXd7r9RH94iDhld3UmSjBM6nXWM2TfZTQ=="],
"lightningcss-freebsd-x64": ["lightningcss-freebsd-x64@1.30.2", "", { "os": "freebsd", "cpu": "x64" }, "sha512-c2bH6xTrf4BDpK8MoGG4Bd6zAMZDAXS569UxCAGcA7IKbHNMlhGQ89eRmvpIUGfKWNVdbhSbkQaWhEoMGmGslA=="],
"lightningcss-linux-arm-gnueabihf": ["lightningcss-linux-arm-gnueabihf@1.30.2", "", { "os": "linux", "cpu": "arm" }, "sha512-eVdpxh4wYcm0PofJIZVuYuLiqBIakQ9uFZmipf6LF/HRj5Bgm0eb3qL/mr1smyXIS1twwOxNWndd8z0E374hiA=="],
"lightningcss-linux-arm64-gnu": ["lightningcss-linux-arm64-gnu@1.30.2", "", { "os": "linux", "cpu": "arm64" }, "sha512-UK65WJAbwIJbiBFXpxrbTNArtfuznvxAJw4Q2ZGlU8kPeDIWEX1dg3rn2veBVUylA2Ezg89ktszWbaQnxD/e3A=="],
"lightningcss-linux-arm64-musl": ["lightningcss-linux-arm64-musl@1.30.2", "", { "os": "linux", "cpu": "arm64" }, "sha512-5Vh9dGeblpTxWHpOx8iauV02popZDsCYMPIgiuw97OJ5uaDsL86cnqSFs5LZkG3ghHoX5isLgWzMs+eD1YzrnA=="],
"lightningcss-linux-x64-gnu": ["lightningcss-linux-x64-gnu@1.30.2", "", { "os": "linux", "cpu": "x64" }, "sha512-Cfd46gdmj1vQ+lR6VRTTadNHu6ALuw2pKR9lYq4FnhvgBc4zWY1EtZcAc6EffShbb1MFrIPfLDXD6Xprbnni4w=="],
"lightningcss-linux-x64-musl": ["lightningcss-linux-x64-musl@1.30.2", "", { "os": "linux", "cpu": "x64" }, "sha512-XJaLUUFXb6/QG2lGIW6aIk6jKdtjtcffUT0NKvIqhSBY3hh9Ch+1LCeH80dR9q9LBjG3ewbDjnumefsLsP6aiA=="],
"lightningcss-win32-arm64-msvc": ["lightningcss-win32-arm64-msvc@1.30.2", "", { "os": "win32", "cpu": "arm64" }, "sha512-FZn+vaj7zLv//D/192WFFVA0RgHawIcHqLX9xuWiQt7P0PtdFEVaxgF9rjM/IRYHQXNnk61/H/gb2Ei+kUQ4xQ=="],
"lightningcss-win32-x64-msvc": ["lightningcss-win32-x64-msvc@1.30.2", "", { "os": "win32", "cpu": "x64" }, "sha512-5g1yc73p+iAkid5phb4oVFMB45417DkRevRbt/El/gKXJk4jid+vPFF/AXbxn05Aky8PapwzZrdJShv5C0avjw=="],
"longest-streak": ["longest-streak@3.1.0", "", {}, "sha512-9Ri+o0JYgehTaVBBDoMqIl8GXtbWg711O3srftcHhZ0dqnETqLaoIK0x17fUw9rFSlK/0NlsKe0Ahhyl5pXE2g=="],
"lru-cache": ["lru-cache@11.2.5", "", {}, "sha512-vFrFJkWtJvJnD5hg+hJvVE8Lh/TcMzKnTgCWmtBipwI5yLX/iX+5UB2tfuyODF5E7k9xEzMdYgGqaSb1c0c5Yw=="],
"lucide-react": ["lucide-react@0.546.0", "", { "peerDependencies": { "react": "^16.5.1 || ^17.0.0 || ^18.0.0 || ^19.0.0" } }, "sha512-Z94u6fKT43lKeYHiVyvyR8fT7pwCzDu7RyMPpTvh054+xahSgj4HFQ+NmflvzdXsoAjYGdCguGaFKYuvq0ThCQ=="],
"magic-string": ["magic-string@0.30.21", "", { "dependencies": { "@jridgewell/sourcemap-codec": "^1.5.5" } }, "sha512-vd2F4YUyEXKGcLHoq+TEyCjxueSeHnFxyyjNp80yg0XV4vUhnDer/lvvlqM/arB5bXQN5K2/3oinyCRyx8T2CQ=="],
"markdown-extensions": ["markdown-extensions@2.0.0", "", {}, "sha512-o5vL7aDWatOTX8LzaS1WMoaoxIiLRQJuIKKe2wAw6IeULDHaqbiqiggmx+pKvZDb1Sj+pE46Sn1T7lCqfFtg1Q=="],
"markdown-table": ["markdown-table@3.0.4", "", {}, "sha512-wiYz4+JrLyb/DqW2hkFJxP7Vd7JuTDm77fvbM8VfEQdmSMqcImWeeRbHwZjBjIFki/VaMK2BhFi7oUUZeM5bqw=="],
"mdast-util-find-and-replace": ["mdast-util-find-and-replace@3.0.2", "", { "dependencies": { "@types/mdast": "^4.0.0", "escape-string-regexp": "^5.0.0", "unist-util-is": "^6.0.0", "unist-util-visit-parents": "^6.0.0" } }, "sha512-Tmd1Vg/m3Xz43afeNxDIhWRtFZgM2VLyaf4vSTYwudTyeuTneoL3qtWMA5jeLyz/O1vDJmmV4QuScFCA2tBPwg=="],
"mdast-util-from-markdown": ["mdast-util-from-markdown@2.0.2", "", { "dependencies": { "@types/mdast": "^4.0.0", "@types/unist": "^3.0.0", "decode-named-character-reference": "^1.0.0", "devlop": "^1.0.0", "mdast-util-to-string": "^4.0.0", "micromark": "^4.0.0", "micromark-util-decode-numeric-character-reference": "^2.0.0", "micromark-util-decode-string": "^2.0.0", "micromark-util-normalize-identifier": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0", "unist-util-stringify-position": "^4.0.0" } }, "sha512-uZhTV/8NBuw0WHkPTrCqDOl0zVe1BIng5ZtHoDk49ME1qqcjYmmLmOf0gELgcRMxN4w2iuIeVso5/6QymSrgmA=="],
"mdast-util-gfm": ["mdast-util-gfm@3.1.0", "", { "dependencies": { "mdast-util-from-markdown": "^2.0.0", "mdast-util-gfm-autolink-literal": "^2.0.0", "mdast-util-gfm-footnote": "^2.0.0", "mdast-util-gfm-strikethrough": "^2.0.0", "mdast-util-gfm-table": "^2.0.0", "mdast-util-gfm-task-list-item": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-0ulfdQOM3ysHhCJ1p06l0b0VKlhU0wuQs3thxZQagjcjPrlFRqY215uZGHHJan9GEAXd9MbfPjFJz+qMkVR6zQ=="],
"mdast-util-gfm-autolink-literal": ["mdast-util-gfm-autolink-literal@2.0.1", "", { "dependencies": { "@types/mdast": "^4.0.0", "ccount": "^2.0.0", "devlop": "^1.0.0", "mdast-util-find-and-replace": "^3.0.0", "micromark-util-character": "^2.0.0" } }, "sha512-5HVP2MKaP6L+G6YaxPNjuL0BPrq9orG3TsrZ9YXbA3vDw/ACI4MEsnoDpn6ZNm7GnZgtAcONJyPhOP8tNJQavQ=="],
"mdast-util-gfm-footnote": ["mdast-util-gfm-footnote@2.1.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "devlop": "^1.1.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0", "micromark-util-normalize-identifier": "^2.0.0" } }, "sha512-sqpDWlsHn7Ac9GNZQMeUzPQSMzR6Wv0WKRNvQRg0KqHh02fpTz69Qc1QSseNX29bhz1ROIyNyxExfawVKTm1GQ=="],
"mdast-util-gfm-strikethrough": ["mdast-util-gfm-strikethrough@2.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-mKKb915TF+OC5ptj5bJ7WFRPdYtuHv0yTRxK2tJvi+BDqbkiG7h7u/9SI89nRAYcmap2xHQL9D+QG/6wSrTtXg=="],
"mdast-util-gfm-table": ["mdast-util-gfm-table@2.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "devlop": "^1.0.0", "markdown-table": "^3.0.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-78UEvebzz/rJIxLvE7ZtDd/vIQ0RHv+3Mh5DR96p7cS7HsBhYIICDBCu8csTNWNO6tBWfqXPWekRuj2FNOGOZg=="],
"mdast-util-gfm-task-list-item": ["mdast-util-gfm-task-list-item@2.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "devlop": "^1.0.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-IrtvNvjxC1o06taBAVJznEnkiHxLFTzgonUdy8hzFVeDun0uTjxxrRGVaNFqkU1wJR3RBPEfsxmU6jDWPofrTQ=="],
"mdast-util-mdx": ["mdast-util-mdx@3.0.0", "", { "dependencies": { "mdast-util-from-markdown": "^2.0.0", "mdast-util-mdx-expression": "^2.0.0", "mdast-util-mdx-jsx": "^3.0.0", "mdast-util-mdxjs-esm": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-JfbYLAW7XnYTTbUsmpu0kdBUVe+yKVJZBItEjwyYJiDJuZ9w4eeaqks4HQO+R7objWgS2ymV60GYpI14Ug554w=="],
"mdast-util-mdx-expression": ["mdast-util-mdx-expression@2.0.1", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "@types/hast": "^3.0.0", "@types/mdast": "^4.0.0", "devlop": "^1.0.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-J6f+9hUp+ldTZqKRSg7Vw5V6MqjATc+3E4gf3CFNcuZNWD8XdyI6zQ8GqH7f8169MM6P7hMBRDVGnn7oHB9kXQ=="],
"mdast-util-mdx-jsx": ["mdast-util-mdx-jsx@3.2.0", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "@types/hast": "^3.0.0", "@types/mdast": "^4.0.0", "@types/unist": "^3.0.0", "ccount": "^2.0.0", "devlop": "^1.1.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0", "parse-entities": "^4.0.0", "stringify-entities": "^4.0.0", "unist-util-stringify-position": "^4.0.0", "vfile-message": "^4.0.0" } }, "sha512-lj/z8v0r6ZtsN/cGNNtemmmfoLAFZnjMbNyLzBafjzikOM+glrjNHPlf6lQDOTccj9n5b0PPihEBbhneMyGs1Q=="],
"mdast-util-mdxjs-esm": ["mdast-util-mdxjs-esm@2.0.1", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "@types/hast": "^3.0.0", "@types/mdast": "^4.0.0", "devlop": "^1.0.0", "mdast-util-from-markdown": "^2.0.0", "mdast-util-to-markdown": "^2.0.0" } }, "sha512-EcmOpxsZ96CvlP03NghtH1EsLtr0n9Tm4lPUJUBccV9RwUOneqSycg19n5HGzCf+10LozMRSObtVr3ee1WoHtg=="],
"mdast-util-phrasing": ["mdast-util-phrasing@4.1.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "unist-util-is": "^6.0.0" } }, "sha512-TqICwyvJJpBwvGAMZjj4J2n0X8QWp21b9l0o7eXyVJ25YNWYbJDVIyD1bZXE6WtV6RmKJVYmQAKWa0zWOABz2w=="],
"mdast-util-to-hast": ["mdast-util-to-hast@13.2.1", "", { "dependencies": { "@types/hast": "^3.0.0", "@types/mdast": "^4.0.0", "@ungap/structured-clone": "^1.0.0", "devlop": "^1.0.0", "micromark-util-sanitize-uri": "^2.0.0", "trim-lines": "^3.0.0", "unist-util-position": "^5.0.0", "unist-util-visit": "^5.0.0", "vfile": "^6.0.0" } }, "sha512-cctsq2wp5vTsLIcaymblUriiTcZd0CwWtCbLvrOzYCDZoWyMNV8sZ7krj09FSnsiJi3WVsHLM4k6Dq/yaPyCXA=="],
"mdast-util-to-markdown": ["mdast-util-to-markdown@2.1.2", "", { "dependencies": { "@types/mdast": "^4.0.0", "@types/unist": "^3.0.0", "longest-streak": "^3.0.0", "mdast-util-phrasing": "^4.0.0", "mdast-util-to-string": "^4.0.0", "micromark-util-classify-character": "^2.0.0", "micromark-util-decode-string": "^2.0.0", "unist-util-visit": "^5.0.0", "zwitch": "^2.0.0" } }, "sha512-xj68wMTvGXVOKonmog6LwyJKrYXZPvlwabaryTjLh9LuvovB/KAH+kvi8Gjj+7rJjsFi23nkUxRQv1KqSroMqA=="],
"mdast-util-to-string": ["mdast-util-to-string@4.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0" } }, "sha512-0H44vDimn51F0YwvxSJSm0eCDOJTRlmN0R1yBh4HLj9wiV1Dn0QoXGbvFAWj2hSItVTlCmBF1hqKlIyUBVFLPg=="],
"micromark": ["micromark@4.0.2", "", { "dependencies": { "@types/debug": "^4.0.0", "debug": "^4.0.0", "decode-named-character-reference": "^1.0.0", "devlop": "^1.0.0", "micromark-core-commonmark": "^2.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-chunked": "^2.0.0", "micromark-util-combine-extensions": "^2.0.0", "micromark-util-decode-numeric-character-reference": "^2.0.0", "micromark-util-encode": "^2.0.0", "micromark-util-normalize-identifier": "^2.0.0", "micromark-util-resolve-all": "^2.0.0", "micromark-util-sanitize-uri": "^2.0.0", "micromark-util-subtokenize": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-zpe98Q6kvavpCr1NPVSCMebCKfD7CA2NqZ+rykeNhONIJBpc1tFKt9hucLGwha3jNTNI8lHpctWJWoimVF4PfA=="],
"micromark-core-commonmark": ["micromark-core-commonmark@2.0.3", "", { "dependencies": { "decode-named-character-reference": "^1.0.0", "devlop": "^1.0.0", "micromark-factory-destination": "^2.0.0", "micromark-factory-label": "^2.0.0", "micromark-factory-space": "^2.0.0", "micromark-factory-title": "^2.0.0", "micromark-factory-whitespace": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-chunked": "^2.0.0", "micromark-util-classify-character": "^2.0.0", "micromark-util-html-tag-name": "^2.0.0", "micromark-util-normalize-identifier": "^2.0.0", "micromark-util-resolve-all": "^2.0.0", "micromark-util-subtokenize": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-RDBrHEMSxVFLg6xvnXmb1Ayr2WzLAWjeSATAoxwKYJV94TeNavgoIdA0a9ytzDSVzBy2YKFK+emCPOEibLeCrg=="],
"micromark-extension-gfm": ["micromark-extension-gfm@3.0.0", "", { "dependencies": { "micromark-extension-gfm-autolink-literal": "^2.0.0", "micromark-extension-gfm-footnote": "^2.0.0", "micromark-extension-gfm-strikethrough": "^2.0.0", "micromark-extension-gfm-table": "^2.0.0", "micromark-extension-gfm-tagfilter": "^2.0.0", "micromark-extension-gfm-task-list-item": "^2.0.0", "micromark-util-combine-extensions": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-vsKArQsicm7t0z2GugkCKtZehqUm31oeGBV/KVSorWSy8ZlNAv7ytjFhvaryUiCUJYqs+NoE6AFhpQvBTM6Q4w=="],
"micromark-extension-gfm-autolink-literal": ["micromark-extension-gfm-autolink-literal@2.1.0", "", { "dependencies": { "micromark-util-character": "^2.0.0", "micromark-util-sanitize-uri": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-oOg7knzhicgQ3t4QCjCWgTmfNhvQbDDnJeVu9v81r7NltNCVmhPy1fJRX27pISafdjL+SVc4d3l48Gb6pbRypw=="],
"micromark-extension-gfm-footnote": ["micromark-extension-gfm-footnote@2.1.0", "", { "dependencies": { "devlop": "^1.0.0", "micromark-core-commonmark": "^2.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-normalize-identifier": "^2.0.0", "micromark-util-sanitize-uri": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-/yPhxI1ntnDNsiHtzLKYnE3vf9JZ6cAisqVDauhp4CEHxlb4uoOTxOCJ+9s51bIB8U1N1FJ1RXOKTIlD5B/gqw=="],
"micromark-extension-gfm-strikethrough": ["micromark-extension-gfm-strikethrough@2.1.0", "", { "dependencies": { "devlop": "^1.0.0", "micromark-util-chunked": "^2.0.0", "micromark-util-classify-character": "^2.0.0", "micromark-util-resolve-all": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-ADVjpOOkjz1hhkZLlBiYA9cR2Anf8F4HqZUO6e5eDcPQd0Txw5fxLzzxnEkSkfnD0wziSGiv7sYhk/ktvbf1uw=="],
"micromark-extension-gfm-table": ["micromark-extension-gfm-table@2.1.1", "", { "dependencies": { "devlop": "^1.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-t2OU/dXXioARrC6yWfJ4hqB7rct14e8f7m0cbI5hUmDyyIlwv5vEtooptH8INkbLzOatzKuVbQmAYcbWoyz6Dg=="],
"micromark-extension-gfm-tagfilter": ["micromark-extension-gfm-tagfilter@2.0.0", "", { "dependencies": { "micromark-util-types": "^2.0.0" } }, "sha512-xHlTOmuCSotIA8TW1mDIM6X2O1SiX5P9IuDtqGonFhEK0qgRI4yeC6vMxEV2dgyr2TiD+2PQ10o+cOhdVAcwfg=="],
"micromark-extension-gfm-task-list-item": ["micromark-extension-gfm-task-list-item@2.1.0", "", { "dependencies": { "devlop": "^1.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-qIBZhqxqI6fjLDYFTBIa4eivDMnP+OZqsNwmQ3xNLE4Cxwc+zfQEfbs6tzAo2Hjq+bh6q5F+Z8/cksrLFYWQQw=="],
"micromark-extension-mdx-expression": ["micromark-extension-mdx-expression@3.0.1", "", { "dependencies": { "@types/estree": "^1.0.0", "devlop": "^1.0.0", "micromark-factory-mdx-expression": "^2.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-events-to-acorn": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-dD/ADLJ1AeMvSAKBwO22zG22N4ybhe7kFIZ3LsDI0GlsNr2A3KYxb0LdC1u5rj4Nw+CHKY0RVdnHX8vj8ejm4Q=="],
"micromark-extension-mdx-jsx": ["micromark-extension-mdx-jsx@3.0.2", "", { "dependencies": { "@types/estree": "^1.0.0", "devlop": "^1.0.0", "estree-util-is-identifier-name": "^3.0.0", "micromark-factory-mdx-expression": "^2.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-events-to-acorn": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0", "vfile-message": "^4.0.0" } }, "sha512-e5+q1DjMh62LZAJOnDraSSbDMvGJ8x3cbjygy2qFEi7HCeUT4BDKCvMozPozcD6WmOt6sVvYDNBKhFSz3kjOVQ=="],
"micromark-extension-mdx-md": ["micromark-extension-mdx-md@2.0.0", "", { "dependencies": { "micromark-util-types": "^2.0.0" } }, "sha512-EpAiszsB3blw4Rpba7xTOUptcFeBFi+6PY8VnJ2hhimH+vCQDirWgsMpz7w1XcZE7LVrSAUGb9VJpG9ghlYvYQ=="],
"micromark-extension-mdxjs": ["micromark-extension-mdxjs@3.0.0", "", { "dependencies": { "acorn": "^8.0.0", "acorn-jsx": "^5.0.0", "micromark-extension-mdx-expression": "^3.0.0", "micromark-extension-mdx-jsx": "^3.0.0", "micromark-extension-mdx-md": "^2.0.0", "micromark-extension-mdxjs-esm": "^3.0.0", "micromark-util-combine-extensions": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-A873fJfhnJ2siZyUrJ31l34Uqwy4xIFmvPY1oj+Ean5PHcPBYzEsvqvWGaWcfEIr11O5Dlw3p2y0tZWpKHDejQ=="],
"micromark-extension-mdxjs-esm": ["micromark-extension-mdxjs-esm@3.0.0", "", { "dependencies": { "@types/estree": "^1.0.0", "devlop": "^1.0.0", "micromark-core-commonmark": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-events-to-acorn": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0", "unist-util-position-from-estree": "^2.0.0", "vfile-message": "^4.0.0" } }, "sha512-DJFl4ZqkErRpq/dAPyeWp15tGrcrrJho1hKK5uBS70BCtfrIFg81sqcTVu3Ta+KD1Tk5vAtBNElWxtAa+m8K9A=="],
"micromark-factory-destination": ["micromark-factory-destination@2.0.1", "", { "dependencies": { "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-Xe6rDdJlkmbFRExpTOmRj9N3MaWmbAgdpSrBQvCFqhezUn4AHqJHbaEnfbVYYiexVSs//tqOdY/DxhjdCiJnIA=="],
"micromark-factory-label": ["micromark-factory-label@2.0.1", "", { "dependencies": { "devlop": "^1.0.0", "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-VFMekyQExqIW7xIChcXn4ok29YE3rnuyveW3wZQWWqF4Nv9Wk5rgJ99KzPvHjkmPXF93FXIbBp6YdW3t71/7Vg=="],
"micromark-factory-mdx-expression": ["micromark-factory-mdx-expression@2.0.3", "", { "dependencies": { "@types/estree": "^1.0.0", "devlop": "^1.0.0", "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-events-to-acorn": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0", "unist-util-position-from-estree": "^2.0.0", "vfile-message": "^4.0.0" } }, "sha512-kQnEtA3vzucU2BkrIa8/VaSAsP+EJ3CKOvhMuJgOEGg9KDC6OAY6nSnNDVRiVNRqj7Y4SlSzcStaH/5jge8JdQ=="],
"micromark-factory-space": ["micromark-factory-space@2.0.1", "", { "dependencies": { "micromark-util-character": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-zRkxjtBxxLd2Sc0d+fbnEunsTj46SWXgXciZmHq0kDYGnck/ZSGj9/wULTV95uoeYiK5hRXP2mJ98Uo4cq/LQg=="],
"micromark-factory-title": ["micromark-factory-title@2.0.1", "", { "dependencies": { "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-5bZ+3CjhAd9eChYTHsjy6TGxpOFSKgKKJPJxr293jTbfry2KDoWkhBb6TcPVB4NmzaPhMs1Frm9AZH7OD4Cjzw=="],
"micromark-factory-whitespace": ["micromark-factory-whitespace@2.0.1", "", { "dependencies": { "micromark-factory-space": "^2.0.0", "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-Ob0nuZ3PKt/n0hORHyvoD9uZhr+Za8sFoP+OnMcnWK5lngSzALgQYKMr9RJVOWLqQYuyn6ulqGWSXdwf6F80lQ=="],
"micromark-util-character": ["micromark-util-character@2.1.1", "", { "dependencies": { "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-wv8tdUTJ3thSFFFJKtpYKOYiGP2+v96Hvk4Tu8KpCAsTMs6yi+nVmGh1syvSCsaxz45J6Jbw+9DD6g97+NV67Q=="],
"micromark-util-chunked": ["micromark-util-chunked@2.0.1", "", { "dependencies": { "micromark-util-symbol": "^2.0.0" } }, "sha512-QUNFEOPELfmvv+4xiNg2sRYeS/P84pTW0TCgP5zc9FpXetHY0ab7SxKyAQCNCc1eK0459uoLI1y5oO5Vc1dbhA=="],
"micromark-util-classify-character": ["micromark-util-classify-character@2.0.1", "", { "dependencies": { "micromark-util-character": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-K0kHzM6afW/MbeWYWLjoHQv1sgg2Q9EccHEDzSkxiP/EaagNzCm7T/WMKZ3rjMbvIpvBiZgwR3dKMygtA4mG1Q=="],
"micromark-util-combine-extensions": ["micromark-util-combine-extensions@2.0.1", "", { "dependencies": { "micromark-util-chunked": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-OnAnH8Ujmy59JcyZw8JSbK9cGpdVY44NKgSM7E9Eh7DiLS2E9RNQf0dONaGDzEG9yjEl5hcqeIsj4hfRkLH/Bg=="],
"micromark-util-decode-numeric-character-reference": ["micromark-util-decode-numeric-character-reference@2.0.2", "", { "dependencies": { "micromark-util-symbol": "^2.0.0" } }, "sha512-ccUbYk6CwVdkmCQMyr64dXz42EfHGkPQlBj5p7YVGzq8I7CtjXZJrubAYezf7Rp+bjPseiROqe7G6foFd+lEuw=="],
"micromark-util-decode-string": ["micromark-util-decode-string@2.0.1", "", { "dependencies": { "decode-named-character-reference": "^1.0.0", "micromark-util-character": "^2.0.0", "micromark-util-decode-numeric-character-reference": "^2.0.0", "micromark-util-symbol": "^2.0.0" } }, "sha512-nDV/77Fj6eH1ynwscYTOsbK7rR//Uj0bZXBwJZRfaLEJ1iGBR6kIfNmlNqaqJf649EP0F3NWNdeJi03elllNUQ=="],
"micromark-util-encode": ["micromark-util-encode@2.0.1", "", {}, "sha512-c3cVx2y4KqUnwopcO9b/SCdo2O67LwJJ/UyqGfbigahfegL9myoEFoDYZgkT7f36T0bLrM9hZTAaAyH+PCAXjw=="],
"micromark-util-events-to-acorn": ["micromark-util-events-to-acorn@2.0.3", "", { "dependencies": { "@types/estree": "^1.0.0", "@types/unist": "^3.0.0", "devlop": "^1.0.0", "estree-util-visit": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0", "vfile-message": "^4.0.0" } }, "sha512-jmsiEIiZ1n7X1Rr5k8wVExBQCg5jy4UXVADItHmNk1zkwEVhBuIUKRu3fqv+hs4nxLISi2DQGlqIOGiFxgbfHg=="],
"micromark-util-html-tag-name": ["micromark-util-html-tag-name@2.0.1", "", {}, "sha512-2cNEiYDhCWKI+Gs9T0Tiysk136SnR13hhO8yW6BGNyhOC4qYFnwF1nKfD3HFAIXA5c45RrIG1ub11GiXeYd1xA=="],
"micromark-util-normalize-identifier": ["micromark-util-normalize-identifier@2.0.1", "", { "dependencies": { "micromark-util-symbol": "^2.0.0" } }, "sha512-sxPqmo70LyARJs0w2UclACPUUEqltCkJ6PhKdMIDuJ3gSf/Q+/GIe3WKl0Ijb/GyH9lOpUkRAO2wp0GVkLvS9Q=="],
"micromark-util-resolve-all": ["micromark-util-resolve-all@2.0.1", "", { "dependencies": { "micromark-util-types": "^2.0.0" } }, "sha512-VdQyxFWFT2/FGJgwQnJYbe1jjQoNTS4RjglmSjTUlpUMa95Htx9NHeYW4rGDJzbjvCsl9eLjMQwGeElsqmzcHg=="],
"micromark-util-sanitize-uri": ["micromark-util-sanitize-uri@2.0.1", "", { "dependencies": { "micromark-util-character": "^2.0.0", "micromark-util-encode": "^2.0.0", "micromark-util-symbol": "^2.0.0" } }, "sha512-9N9IomZ/YuGGZZmQec1MbgxtlgougxTodVwDzzEouPKo3qFWvymFHWcnDi2vzV1ff6kas9ucW+o3yzJK9YB1AQ=="],
"micromark-util-subtokenize": ["micromark-util-subtokenize@2.1.0", "", { "dependencies": { "devlop": "^1.0.0", "micromark-util-chunked": "^2.0.0", "micromark-util-symbol": "^2.0.0", "micromark-util-types": "^2.0.0" } }, "sha512-XQLu552iSctvnEcgXw6+Sx75GflAPNED1qx7eBJ+wydBb2KCbRZe+NwvIEEMM83uml1+2WSXpBAcp9IUCgCYWA=="],
"micromark-util-symbol": ["micromark-util-symbol@2.0.1", "", {}, "sha512-vs5t8Apaud9N28kgCrRUdEed4UJ+wWNvicHLPxCa9ENlYuAY31M0ETy5y1vA33YoNPDFTghEbnh6efaE8h4x0Q=="],
"micromark-util-types": ["micromark-util-types@2.0.2", "", {}, "sha512-Yw0ECSpJoViF1qTU4DC6NwtC4aWGt1EkzaQB8KPPyCRR8z9TWeV0HbEFGTO+ZY1wB22zmxnJqhPyTpOVCpeHTA=="],
"ms": ["ms@2.1.3", "", {}, "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA=="],
"nanoid": ["nanoid@3.3.11", "", { "bin": { "nanoid": "bin/nanoid.cjs" } }, "sha512-N8SpfPUnUp1bK+PMYW8qSWdl9U+wwNWI4QKxOYDy9JAro3WMX7p2OeVRF9v+347pnakNevPmiHhNmZ2HbFA76w=="],
"negotiator": ["negotiator@1.0.0", "", {}, "sha512-8Ofs/AUQh8MaEcrlq5xOX0CQ9ypTF5dl78mjlMNfOK08fzpgTHQRQPBxcPlEtIw0yRpws+Zo/3r+5WRby7u3Gg=="],
"next": ["next@16.1.6", "", { "dependencies": { "@next/env": "16.1.6", "@swc/helpers": "0.5.15", "baseline-browser-mapping": "^2.8.3", "caniuse-lite": "^1.0.30001579", "postcss": "8.4.31", "styled-jsx": "5.1.6" }, "optionalDependencies": { "@next/swc-darwin-arm64": "16.1.6", "@next/swc-darwin-x64": "16.1.6", "@next/swc-linux-arm64-gnu": "16.1.6", "@next/swc-linux-arm64-musl": "16.1.6", "@next/swc-linux-x64-gnu": "16.1.6", "@next/swc-linux-x64-musl": "16.1.6", "@next/swc-win32-arm64-msvc": "16.1.6", "@next/swc-win32-x64-msvc": "16.1.6", "sharp": "^0.34.4" }, "peerDependencies": { "@opentelemetry/api": "^1.1.0", "@playwright/test": "^1.51.1", "babel-plugin-react-compiler": "*", "react": "^18.2.0 || 19.0.0-rc-de68d2f4-20241204 || ^19.0.0", "react-dom": "^18.2.0 || 19.0.0-rc-de68d2f4-20241204 || ^19.0.0", "sass": "^1.3.0" }, "optionalPeers": ["@opentelemetry/api", "@playwright/test", "babel-plugin-react-compiler", "sass"], "bin": { "next": "dist/bin/next" } }, "sha512-hkyRkcu5x/41KoqnROkfTm2pZVbKxvbZRuNvKXLRXxs3VfyO0WhY50TQS40EuKO9SW3rBj/sF3WbVwDACeMZyw=="],
"next-themes": ["next-themes@0.4.6", "", { "peerDependencies": { "react": "^16.8 || ^17 || ^18 || ^19 || ^19.0.0-rc", "react-dom": "^16.8 || ^17 || ^18 || ^19 || ^19.0.0-rc" } }, "sha512-pZvgD5L0IEvX5/9GWyHMf3m8BKiVQwsCMHfoFosXtXBMnaS0ZnIJ9ST4b4NqLVKDEm8QBxoNNGNaBv2JNF6XNA=="],
"npm-to-yarn": ["npm-to-yarn@3.0.1", "", {}, "sha512-tt6PvKu4WyzPwWUzy/hvPFqn+uwXO0K1ZHka8az3NnrhWJDmSqI8ncWq0fkL0k/lmmi5tAC11FXwXuh0rFbt1A=="],
"oniguruma-parser": ["oniguruma-parser@0.12.1", "", {}, "sha512-8Unqkvk1RYc6yq2WBYRj4hdnsAxVze8i7iPfQr8e4uSP3tRv0rpZcbGUDvxfQQcdwHt/e9PrMvGCsa8OqG9X3w=="],
"oniguruma-to-es": ["oniguruma-to-es@4.3.4", "", { "dependencies": { "oniguruma-parser": "^0.12.1", "regex": "^6.0.1", "regex-recursion": "^6.0.2" } }, "sha512-3VhUGN3w2eYxnTzHn+ikMI+fp/96KoRSVK9/kMTcFqj1NRDh2IhQCKvYxDnWePKRXY/AqH+Fuiyb7VHSzBjHfA=="],
"openapi-sampler": ["openapi-sampler@1.6.2", "", { "dependencies": { "@types/json-schema": "^7.0.7", "fast-xml-parser": "^4.5.0", "json-pointer": "0.6.2" } }, "sha512-NyKGiFKfSWAZr4srD/5WDhInOWDhfml32h/FKUqLpEwKJt0kG0LGUU0MdyNkKrVGuJnw6DuPWq/sHCwAMpiRxg=="],
"parse-entities": ["parse-entities@4.0.2", "", { "dependencies": { "@types/unist": "^2.0.0", "character-entities-legacy": "^3.0.0", "character-reference-invalid": "^2.0.0", "decode-named-character-reference": "^1.0.0", "is-alphanumerical": "^2.0.0", "is-decimal": "^2.0.0", "is-hexadecimal": "^2.0.0" } }, "sha512-GG2AQYWoLgL877gQIKeRPGO1xF9+eG1ujIb5soS5gPvLQ1y2o8FL90w2QWNdf9I361Mpp7726c+lj3U0qK1uGw=="],
"path-to-regexp": ["path-to-regexp@8.3.0", "", {}, "sha512-7jdwVIRtsP8MYpdXSwOS0YdD0Du+qOoF/AEPIt88PcCFrZCzx41oxku1jD88hZBwbNUIEfpqvuhjFaMAqMTWnA=="],
"picocolors": ["picocolors@1.1.1", "", {}, "sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA=="],
"picomatch": ["picomatch@4.0.3", "", {}, "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q=="],
"postcss": ["postcss@8.5.6", "", { "dependencies": { "nanoid": "^3.3.11", "picocolors": "^1.1.1", "source-map-js": "^1.2.1" } }, "sha512-3Ybi1tAuwAP9s0r1UQ2J4n5Y0G05bJkpUIO0/bI9MhwmD70S5aTWbXGBwxHrelT+XM1k6dM0pk+SwNkpTRN7Pg=="],
"postcss-selector-parser": ["postcss-selector-parser@7.1.1", "", { "dependencies": { "cssesc": "^3.0.0", "util-deprecate": "^1.0.2" } }, "sha512-orRsuYpJVw8LdAwqqLykBj9ecS5/cRHlI5+nvTo8LcCKmzDmqVORXtOIYEEQuL9D4BxtA1lm5isAqzQZCoQ6Eg=="],
"property-information": ["property-information@7.1.0", "", {}, "sha512-TwEZ+X+yCJmYfL7TPUOcvBZ4QfoT5YenQiJuX//0th53DE6w0xxLEtfK3iyryQFddXuvkIk51EEgrJQ0WJkOmQ=="],
"react": ["react@19.2.4", "", {}, "sha512-9nfp2hYpCwOjAN+8TZFGhtWEwgvWHXqESH8qT89AT/lWklpLON22Lc8pEtnpsZz7VmawabSU0gCjnj8aC0euHQ=="],
"react-dom": ["react-dom@19.2.4", "", { "dependencies": { "scheduler": "^0.27.0" }, "peerDependencies": { "react": "^19.2.4" } }, "sha512-AXJdLo8kgMbimY95O2aKQqsz2iWi9jMgKJhRBAxECE4IFxfcazB2LmzloIoibJI3C12IlY20+KFaLv+71bUJeQ=="],
"react-hook-form": ["react-hook-form@7.71.1", "", { "peerDependencies": { "react": "^16.8.0 || ^17 || ^18 || ^19" } }, "sha512-9SUJKCGKo8HUSsCO+y0CtqkqI5nNuaDqTxyqPsZPqIwudpj4rCrAz/jZV+jn57bx5gtZKOh3neQu94DXMc+w5w=="],
"react-medium-image-zoom": ["react-medium-image-zoom@5.4.0", "", { "peerDependencies": { "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0", "react-dom": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0" } }, "sha512-BsE+EnFVQzFIlyuuQrZ9iTwyKpKkqdFZV1ImEQN573QPqGrIUuNni7aF+sZwDcxlsuOMayCr6oO/PZR/yJnbRg=="],
"react-remove-scroll": ["react-remove-scroll@2.7.2", "", { "dependencies": { "react-remove-scroll-bar": "^2.3.7", "react-style-singleton": "^2.2.3", "tslib": "^2.1.0", "use-callback-ref": "^1.3.3", "use-sidecar": "^1.1.3" }, "peerDependencies": { "@types/react": "*", "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-Iqb9NjCCTt6Hf+vOdNIZGdTiH1QSqr27H/Ek9sv/a97gfueI/5h1s3yRi1nngzMUaOOToin5dI1dXKdXiF+u0Q=="],
|
||||
|
||||
"react-remove-scroll-bar": ["react-remove-scroll-bar@2.3.8", "", { "dependencies": { "react-style-singleton": "^2.2.2", "tslib": "^2.0.0" }, "peerDependencies": { "@types/react": "*", "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0" }, "optionalPeers": ["@types/react"] }, "sha512-9r+yi9+mgU33AKcj6IbT9oRCO78WriSj6t/cF8DWBZJ9aOGPOTEDvdUDz1FwKim7QXWwmHqtdHnRJfhAxEG46Q=="],
|
||||
|
||||
"react-style-singleton": ["react-style-singleton@2.2.3", "", { "dependencies": { "get-nonce": "^1.0.0", "tslib": "^2.0.0" }, "peerDependencies": { "@types/react": "*", "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-b6jSvxvVnyptAiLjbkWLE/lOnR4lfTtDAl+eUC7RZy+QQWc6wRzIV2CE6xBuMmDxc2qIihtDCZD5NPOFl7fRBQ=="],
|
||||
|
||||
"readdirp": ["readdirp@4.1.2", "", {}, "sha512-GDhwkLfywWL2s6vEjyhri+eXmfH6j1L7JE27WhqLeYzoh/A3DBaYGEj2H/HFZCn/kMfim73FXxEJTw06WtxQwg=="],
|
||||
|
||||
"recma-build-jsx": ["recma-build-jsx@1.0.0", "", { "dependencies": { "@types/estree": "^1.0.0", "estree-util-build-jsx": "^3.0.0", "vfile": "^6.0.0" } }, "sha512-8GtdyqaBcDfva+GUKDr3nev3VpKAhup1+RvkMvUxURHpW7QyIvk9F5wz7Vzo06CEMSilw6uArgRqhpiUcWp8ew=="],
|
||||
|
||||
"recma-jsx": ["recma-jsx@1.0.1", "", { "dependencies": { "acorn-jsx": "^5.0.0", "estree-util-to-js": "^2.0.0", "recma-parse": "^1.0.0", "recma-stringify": "^1.0.0", "unified": "^11.0.0" }, "peerDependencies": { "acorn": "^6.0.0 || ^7.0.0 || ^8.0.0" } }, "sha512-huSIy7VU2Z5OLv6oFLosQGGDqPqdO1iq6bWNAdhzMxSJP7RAso4fCZ1cKu8j9YHCZf3TPrq4dw3okhrylgcd7w=="],
|
||||
|
||||
"recma-parse": ["recma-parse@1.0.0", "", { "dependencies": { "@types/estree": "^1.0.0", "esast-util-from-js": "^2.0.0", "unified": "^11.0.0", "vfile": "^6.0.0" } }, "sha512-OYLsIGBB5Y5wjnSnQW6t3Xg7q3fQ7FWbw/vcXtORTnyaSFscOtABg+7Pnz6YZ6c27fG1/aN8CjfwoUEUIdwqWQ=="],
|
||||
|
||||
"recma-stringify": ["recma-stringify@1.0.0", "", { "dependencies": { "@types/estree": "^1.0.0", "estree-util-to-js": "^2.0.0", "unified": "^11.0.0", "vfile": "^6.0.0" } }, "sha512-cjwII1MdIIVloKvC9ErQ+OgAtwHBmcZ0Bg4ciz78FtbT8In39aAYbaA7zvxQ61xVMSPE8WxhLwLbhif4Js2C+g=="],
|
||||
|
||||
"regex": ["regex@6.1.0", "", { "dependencies": { "regex-utilities": "^2.3.0" } }, "sha512-6VwtthbV4o/7+OaAF9I5L5V3llLEsoPyq9P1JVXkedTP33c7MfCG0/5NOPcSJn0TzXcG9YUrR0gQSWioew3LDg=="],
|
||||
|
||||
"regex-recursion": ["regex-recursion@6.0.2", "", { "dependencies": { "regex-utilities": "^2.3.0" } }, "sha512-0YCaSCq2VRIebiaUviZNs0cBz1kg5kVS2UKUfNIx8YVs1cN3AV7NTctO5FOKBA+UT2BPJIWZauYHPqJODG50cg=="],
|
||||
|
||||
"regex-utilities": ["regex-utilities@2.3.0", "", {}, "sha512-8VhliFJAWRaUiVvREIiW2NXXTmHs4vMNnSzuJVhscgmGav3g9VDxLrQndI3dZZVVdp0ZO/5v0xmX516/7M9cng=="],
|
||||
|
||||
"rehype-recma": ["rehype-recma@1.0.0", "", { "dependencies": { "@types/estree": "^1.0.0", "@types/hast": "^3.0.0", "hast-util-to-estree": "^3.0.0" } }, "sha512-lqA4rGUf1JmacCNWWZx0Wv1dHqMwxzsDWYMTowuplHF3xH0N/MmrZ/G3BDZnzAkRmxDadujCjaKM2hqYdCBOGw=="],
|
||||
|
||||
"remark": ["remark@15.0.1", "", { "dependencies": { "@types/mdast": "^4.0.0", "remark-parse": "^11.0.0", "remark-stringify": "^11.0.0", "unified": "^11.0.0" } }, "sha512-Eht5w30ruCXgFmxVUSlNWQ9iiimq07URKeFS3hNc8cUWy1llX4KDWfyEDZRycMc+znsN9Ux5/tJ/BFdgdOwA3A=="],
|
||||
|
||||
"remark-gfm": ["remark-gfm@4.0.1", "", { "dependencies": { "@types/mdast": "^4.0.0", "mdast-util-gfm": "^3.0.0", "micromark-extension-gfm": "^3.0.0", "remark-parse": "^11.0.0", "remark-stringify": "^11.0.0", "unified": "^11.0.0" } }, "sha512-1quofZ2RQ9EWdeN34S79+KExV1764+wCUGop5CPL1WGdD0ocPpu91lzPGbwWMECpEpd42kJGQwzRfyov9j4yNg=="],
|
||||
|
||||
"remark-mdx": ["remark-mdx@3.1.1", "", { "dependencies": { "mdast-util-mdx": "^3.0.0", "micromark-extension-mdxjs": "^3.0.0" } }, "sha512-Pjj2IYlUY3+D8x00UJsIOg5BEvfMyeI+2uLPn9VO9Wg4MEtN/VTIq2NEJQfde9PnX15KgtHyl9S0BcTnWrIuWg=="],
|
||||
|
||||
"remark-parse": ["remark-parse@11.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "mdast-util-from-markdown": "^2.0.0", "micromark-util-types": "^2.0.0", "unified": "^11.0.0" } }, "sha512-FCxlKLNGknS5ba/1lmpYijMUzX2esxW5xQqjWxw2eHFfS2MSdaHVINFmhjo+qN1WhZhNimq0dZATN9pH0IDrpA=="],
|
||||
|
||||
"remark-rehype": ["remark-rehype@11.1.2", "", { "dependencies": { "@types/hast": "^3.0.0", "@types/mdast": "^4.0.0", "mdast-util-to-hast": "^13.0.0", "unified": "^11.0.0", "vfile": "^6.0.0" } }, "sha512-Dh7l57ianaEoIpzbp0PC9UKAdCSVklD8E5Rpw7ETfbTl3FqcOOgq5q2LVDhgGCkaBv7p24JXikPdvhhmHvKMsw=="],
|
||||
|
||||
"remark-stringify": ["remark-stringify@11.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "mdast-util-to-markdown": "^2.0.0", "unified": "^11.0.0" } }, "sha512-1OSmLd3awB/t8qdoEOMazZkNsfVTeY4fTsgzcQFdXNq8ToTN4ZGwrMnlda4K6smTFKD+GRV6O48i6Z4iKgPPpw=="],
|
||||
|
||||
"require-from-string": ["require-from-string@2.0.2", "", {}, "sha512-Xf0nWe6RseziFMu+Ap9biiUbmplq6S9/p+7w7YXP/JBHhrUDDUhwa+vANyubuqfZWTveU//DYVGsDG7RKL/vEw=="],
|
||||
|
||||
"sax": ["sax@1.4.4", "", {}, "sha512-1n3r/tGXO6b6VXMdFT54SHzT9ytu9yr7TaELowdYpMqY/Ao7EnlQGmAQ1+RatX7Tkkdm6hONI2owqNx2aZj5Sw=="],
|
||||
|
||||
"scheduler": ["scheduler@0.27.0", "", {}, "sha512-eNv+WrVbKu1f3vbYJT/xtiF5syA5HPIMtf9IgY/nKg0sWqzAUEvqY/xm7OcZc/qafLx/iO9FgOmeSAp4v5ti/Q=="],
|
||||
|
||||
"scroll-into-view-if-needed": ["scroll-into-view-if-needed@3.1.0", "", { "dependencies": { "compute-scroll-into-view": "^3.0.2" } }, "sha512-49oNpRjWRvnU8NyGVmUaYG4jtTkNonFZI86MmGRDqBphEK2EXT9gdEUoQPZhuBM8yWHxCWbobltqYO5M4XrUvQ=="],
|
||||
|
||||
"semver": ["semver@7.7.3", "", { "bin": { "semver": "bin/semver.js" } }, "sha512-SdsKMrI9TdgjdweUSR9MweHA4EJ8YxHn8DFaDisvhVlUOe4BF1tLD7GAj0lIqWVl+dPb/rExr0Btby5loQm20Q=="],
|
||||
|
||||
"sharp": ["sharp@0.34.5", "", { "dependencies": { "@img/colour": "^1.0.0", "detect-libc": "^2.1.2", "semver": "^7.7.3" }, "optionalDependencies": { "@img/sharp-darwin-arm64": "0.34.5", "@img/sharp-darwin-x64": "0.34.5", "@img/sharp-libvips-darwin-arm64": "1.2.4", "@img/sharp-libvips-darwin-x64": "1.2.4", "@img/sharp-libvips-linux-arm": "1.2.4", "@img/sharp-libvips-linux-arm64": "1.2.4", "@img/sharp-libvips-linux-ppc64": "1.2.4", "@img/sharp-libvips-linux-riscv64": "1.2.4", "@img/sharp-libvips-linux-s390x": "1.2.4", "@img/sharp-libvips-linux-x64": "1.2.4", "@img/sharp-libvips-linuxmusl-arm64": "1.2.4", "@img/sharp-libvips-linuxmusl-x64": "1.2.4", "@img/sharp-linux-arm": "0.34.5", "@img/sharp-linux-arm64": "0.34.5", "@img/sharp-linux-ppc64": "0.34.5", "@img/sharp-linux-riscv64": "0.34.5", "@img/sharp-linux-s390x": "0.34.5", "@img/sharp-linux-x64": "0.34.5", "@img/sharp-linuxmusl-arm64": "0.34.5", "@img/sharp-linuxmusl-x64": "0.34.5", "@img/sharp-wasm32": "0.34.5", "@img/sharp-win32-arm64": "0.34.5", "@img/sharp-win32-ia32": "0.34.5", "@img/sharp-win32-x64": "0.34.5" } }, "sha512-Ou9I5Ft9WNcCbXrU9cMgPBcCK8LiwLqcbywW3t4oDV37n1pzpuNLsYiAV8eODnjbtQlSDwZ2cUEeQz4E54Hltg=="],
|
||||
|
||||
"shiki": ["shiki@3.22.0", "", { "dependencies": { "@shikijs/core": "3.22.0", "@shikijs/engine-javascript": "3.22.0", "@shikijs/engine-oniguruma": "3.22.0", "@shikijs/langs": "3.22.0", "@shikijs/themes": "3.22.0", "@shikijs/types": "3.22.0", "@shikijs/vscode-textmate": "^10.0.2", "@types/hast": "^3.0.4" } }, "sha512-LBnhsoYEe0Eou4e1VgJACes+O6S6QC0w71fCSp5Oya79inkwkm15gQ1UF6VtQ8j/taMDh79hAB49WUk8ALQW3g=="],
|
||||
|
||||
"source-map": ["source-map@0.7.6", "", {}, "sha512-i5uvt8C3ikiWeNZSVZNWcfZPItFQOsYTUAOkcUPGd8DqDy1uOUikjt5dG+uRlwyvR108Fb9DOd4GvXfT0N2/uQ=="],
|
||||
|
||||
"source-map-js": ["source-map-js@1.2.1", "", {}, "sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA=="],
|
||||
|
||||
"space-separated-tokens": ["space-separated-tokens@2.0.2", "", {}, "sha512-PEGlAwrG8yXGXRjW32fGbg66JAlOAwbObuqVoJpv/mRgoWDQfgH1wDPvtzWyUSNAXBGSk8h755YDbbcEy3SH2Q=="],
|
||||
|
||||
"stringify-entities": ["stringify-entities@4.0.4", "", { "dependencies": { "character-entities-html4": "^2.0.0", "character-entities-legacy": "^3.0.0" } }, "sha512-IwfBptatlO+QCJUo19AqvrPNqlVMpW9YEL2LIVY+Rpv2qsjCGxaDLNRgeGsQWJhfItebuJhsGSLjaBbNSQ+ieg=="],
|
||||
|
||||
"strnum": ["strnum@1.1.2", "", {}, "sha512-vrN+B7DBIoTTZjnPNewwhx6cBA/H+IS7rfW68n7XxC1y7uoiGQBxaKzqucGUgavX15dJgiGztLJ8vxuEzwqBdA=="],
|
||||
|
||||
"style-to-js": ["style-to-js@1.1.21", "", { "dependencies": { "style-to-object": "1.0.14" } }, "sha512-RjQetxJrrUJLQPHbLku6U/ocGtzyjbJMP9lCNK7Ag0CNh690nSH8woqWH9u16nMjYBAok+i7JO1NP2pOy8IsPQ=="],
|
||||
|
||||
"style-to-object": ["style-to-object@1.0.14", "", { "dependencies": { "inline-style-parser": "0.2.7" } }, "sha512-LIN7rULI0jBscWQYaSswptyderlarFkjQ+t79nzty8tcIAceVomEVlLzH5VP4Cmsv6MtKhs7qaAiwlcp+Mgaxw=="],
|
||||
|
||||
"styled-jsx": ["styled-jsx@5.1.6", "", { "dependencies": { "client-only": "0.0.1" }, "peerDependencies": { "react": ">= 16.8.0 || 17.x.x || ^18.0.0-0 || ^19.0.0-0" } }, "sha512-qSVyDTeMotdvQYoHWLNGwRFJHC+i+ZvdBRYosOFgC+Wg1vx4frN2/RG/NA7SYqqvKNLf39P2LSRA2pu6n0XYZA=="],
|
||||
|
||||
"tailwind-merge": ["tailwind-merge@3.5.0", "", {}, "sha512-I8K9wewnVDkL1NTGoqWmVEIlUcB9gFriAEkXkfCjX5ib8ezGxtR3xD7iZIxrfArjEsH7F1CHD4RFUtxefdqV/A=="],
|
||||
|
||||
"tailwindcss": ["tailwindcss@4.1.18", "", {}, "sha512-4+Z+0yiYyEtUVCScyfHCxOYP06L5Ne+JiHhY2IjR2KWMIWhJOYZKLSGZaP5HkZ8+bY0cxfzwDE5uOmzFXyIwxw=="],
|
||||
|
||||
"tapable": ["tapable@2.3.0", "", {}, "sha512-g9ljZiwki/LfxmQADO3dEY1CbpmXT5Hm2fJ+QaGKwSXUylMybePR7/67YW7jOrrvjEgL1Fmz5kzyAjWVWLlucg=="],
|
||||
|
||||
"tinyexec": ["tinyexec@1.0.2", "", {}, "sha512-W/KYk+NFhkmsYpuHq5JykngiOCnxeVL8v8dFnqxSD8qEEdRfXk1SDM6JzNqcERbcGYj9tMrDQBYV9cjgnunFIg=="],
|
||||
|
||||
"tinyglobby": ["tinyglobby@0.2.15", "", { "dependencies": { "fdir": "^6.5.0", "picomatch": "^4.0.3" } }, "sha512-j2Zq4NyQYG5XMST4cbs02Ak8iJUdxRM0XI5QyxXuZOzKOINmWurp3smXu3y5wDcJrptwpSjgXHzIQxR0omXljQ=="],
|
||||
|
||||
"trim-lines": ["trim-lines@3.0.1", "", {}, "sha512-kRj8B+YHZCc9kQYdWfJB2/oUl9rA99qbowYYBtr4ui4mZyAQ2JpvVBd/6U2YloATfqBhBTSMhTpgBHtU0Mf3Rg=="],
|
||||
|
||||
"trough": ["trough@2.2.0", "", {}, "sha512-tmMpK00BjZiUyVyvrBK7knerNgmgvcV/KLVyuma/SC+TQN167GrMRciANTz09+k3zW8L8t60jWO1GpfkZdjTaw=="],
|
||||
|
||||
"tslib": ["tslib@2.8.1", "", {}, "sha512-oJFu94HQb+KVduSUQL7wnpmqnfmLsOA/nAh6b6EH0wCEoK0/mPeXU6c3wKDV83MkOuHPRHtSXKKU99IBazS/2w=="],
|
||||
|
||||
"typescript": ["typescript@5.9.3", "", { "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" } }, "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw=="],
|
||||
|
||||
"undici-types": ["undici-types@7.16.0", "", {}, "sha512-Zz+aZWSj8LE6zoxD+xrjh4VfkIG8Ya6LvYkZqtUQGJPZjYl53ypCaUwWqo7eI0x66KBGeRo+mlBEkMSeSZ38Nw=="],
|
||||
|
||||
"unified": ["unified@11.0.5", "", { "dependencies": { "@types/unist": "^3.0.0", "bail": "^2.0.0", "devlop": "^1.0.0", "extend": "^3.0.0", "is-plain-obj": "^4.0.0", "trough": "^2.0.0", "vfile": "^6.0.0" } }, "sha512-xKvGhPWw3k84Qjh8bI3ZeJjqnyadK+GEFtazSfZv/rKeTkTjOJho6mFqh2SM96iIcZokxiOpg78GazTSg8+KHA=="],
|
||||
|
||||
"unist-util-is": ["unist-util-is@6.0.1", "", { "dependencies": { "@types/unist": "^3.0.0" } }, "sha512-LsiILbtBETkDz8I9p1dQ0uyRUWuaQzd/cuEeS1hoRSyW5E5XGmTzlwY1OrNzzakGowI9Dr/I8HVaw4hTtnxy8g=="],
|
||||
|
||||
"unist-util-position": ["unist-util-position@5.0.0", "", { "dependencies": { "@types/unist": "^3.0.0" } }, "sha512-fucsC7HjXvkB5R3kTCO7kUjRdrS0BJt3M/FPxmHMBOm8JQi2BsHAHFsy27E0EolP8rp0NzXsJ+jNPyDWvOJZPA=="],
|
||||
|
||||
"unist-util-position-from-estree": ["unist-util-position-from-estree@2.0.0", "", { "dependencies": { "@types/unist": "^3.0.0" } }, "sha512-KaFVRjoqLyF6YXCbVLNad/eS4+OfPQQn2yOd7zF/h5T/CSL2v8NpN6a5TPvtbXthAGw5nG+PuTtq+DdIZr+cRQ=="],
|
||||
|
||||
"unist-util-remove-position": ["unist-util-remove-position@5.0.0", "", { "dependencies": { "@types/unist": "^3.0.0", "unist-util-visit": "^5.0.0" } }, "sha512-Hp5Kh3wLxv0PHj9m2yZhhLt58KzPtEYKQQ4yxfYFEO7EvHwzyDYnduhHnY1mDxoqr7VUwVuHXk9RXKIiYS1N8Q=="],
|
||||
|
||||
"unist-util-stringify-position": ["unist-util-stringify-position@4.0.0", "", { "dependencies": { "@types/unist": "^3.0.0" } }, "sha512-0ASV06AAoKCDkS2+xw5RXJywruurpbC4JZSm7nr7MOt1ojAzvyyaO+UxZf18j8FCF6kmzCZKcAgN/yu2gm2XgQ=="],
|
||||
|
||||
"unist-util-visit": ["unist-util-visit@5.1.0", "", { "dependencies": { "@types/unist": "^3.0.0", "unist-util-is": "^6.0.0", "unist-util-visit-parents": "^6.0.0" } }, "sha512-m+vIdyeCOpdr/QeQCu2EzxX/ohgS8KbnPDgFni4dQsfSCtpz8UqDyY5GjRru8PDKuYn7Fq19j1CQ+nJSsGKOzg=="],
|
||||
|
||||
"unist-util-visit-parents": ["unist-util-visit-parents@6.0.2", "", { "dependencies": { "@types/unist": "^3.0.0", "unist-util-is": "^6.0.0" } }, "sha512-goh1s1TBrqSqukSc8wrjwWhL0hiJxgA8m4kFxGlQ+8FYQ3C/m11FcTs4YYem7V664AhHVvgoQLk890Ssdsr2IQ=="],
|
||||
|
||||
"use-callback-ref": ["use-callback-ref@1.3.3", "", { "dependencies": { "tslib": "^2.0.0" }, "peerDependencies": { "@types/react": "*", "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-jQL3lRnocaFtu3V00JToYz/4QkNWswxijDaCVNZRiRTO3HQDLsdu1ZtmIUvV4yPp+rvWm5j0y0TG/S61cuijTg=="],
|
||||
|
||||
"use-sidecar": ["use-sidecar@1.1.3", "", { "dependencies": { "detect-node-es": "^1.1.0", "tslib": "^2.0.0" }, "peerDependencies": { "@types/react": "*", "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-Fedw0aZvkhynoPYlA5WXrMCAMm+nSWdZt6lzJQ7Ok8S6Q+VsHmHpRWndVRJ8Be0ZbkfPc5LRYH+5XrzXcEeLRQ=="],
|
||||
|
||||
"util-deprecate": ["util-deprecate@1.0.2", "", {}, "sha512-EPD5q1uXyFxJpCrLnCc1nHnq3gOa6DZBocAIiI2TaSCA7VCJ1UJDMagCzIkXNsUYfD1daK//LTEQ8xiIbrHtcw=="],
|
||||
|
||||
"vfile": ["vfile@6.0.3", "", { "dependencies": { "@types/unist": "^3.0.0", "vfile-message": "^4.0.0" } }, "sha512-KzIbH/9tXat2u30jf+smMwFCsno4wHVdNmzFyL+T/L3UGqqk6JKfVqOFOZEpZSHADH1k40ab6NUIXZq422ov3Q=="],
|
||||
|
||||
"vfile-message": ["vfile-message@4.0.3", "", { "dependencies": { "@types/unist": "^3.0.0", "unist-util-stringify-position": "^4.0.0" } }, "sha512-QTHzsGd1EhbZs4AsQ20JX1rC3cOlt/IWJruk893DfLRr57lcnOeMaWG4K0JrRta4mIJZKth2Au3mM3u03/JWKw=="],
|
||||
|
||||
"xml-js": ["xml-js@1.6.11", "", { "dependencies": { "sax": "^1.2.4" }, "bin": { "xml-js": "./bin/cli.js" } }, "sha512-7rVi2KMfwfWFl+GpPg6m80IVMWXLRjO+PxTq7V2CDhoGak0wzYzFgUY2m4XJ47OGdXd8eLE8EmwfAmdjw7lC1g=="],
|
||||
|
||||
"yaml": ["yaml@2.8.2", "", { "bin": { "yaml": "bin.mjs" } }, "sha512-mplynKqc1C2hTVYxd0PU2xQAc22TI1vShAYGksCCfxbn/dFwnHTNi1bvYsBTkhdUNtGIf5xNOg938rrSSYvS9A=="],
|
||||
|
||||
"zod": ["zod@4.3.6", "", {}, "sha512-rftlrkhHZOcjDwkGlnUtZZkvaPHCsDATp4pGpuOOMDaTdDDXF91wuVDJoWoPsKX/3YPQ5fHuF3STjcYyKr+Qhg=="],
|
||||
|
||||
"zwitch": ["zwitch@2.0.4", "", {}, "sha512-bXE4cR/kVZhKZX/RjPEflHaKVhUVl85noU3v6b8apfQEc1x4A+zBxjZ4lN8LqGd6WZ3dl98pY4o717VFmoPp+A=="],
|
||||
|
||||
"@fumadocs/ui/tailwind-merge": ["tailwind-merge@3.4.0", "", {}, "sha512-uSaO4gnW+b3Y2aWoWfFpX62vn2sR3skfhbjsEnaBI81WD1wBLlHZe5sWf0AqjksNdYTbGBEd0UasQMT3SNV15g=="],
|
||||
|
||||
"@scalar/openapi-parser/@scalar/json-magic": ["@scalar/json-magic@0.9.4", "", { "dependencies": { "@scalar/helpers": "0.2.9", "yaml": "^2.8.0" } }, "sha512-PyfyWrH4ZkW0TM1ColiiHj4NRF8hUM61H0UzAkHLhRNnKFxi6hI+oqNrwqPnyk93hrpkpTRHC7Fl5T0BRwuzVg=="],
|
||||
|
||||
"@tailwindcss/oxide-wasm32-wasi/@emnapi/core": ["@emnapi/core@1.8.1", "", { "dependencies": { "@emnapi/wasi-threads": "1.1.0", "tslib": "^2.4.0" }, "bundled": true }, "sha512-AvT9QFpxK0Zd8J0jopedNm+w/2fIzvtPKPjqyw9jwvBaReTTqPBk9Hixaz7KbjimP+QNz605/XnjFcDAL2pqBg=="],
|
||||
|
||||
"@tailwindcss/oxide-wasm32-wasi/@emnapi/runtime": ["@emnapi/runtime@1.8.1", "", { "dependencies": { "tslib": "^2.4.0" }, "bundled": true }, "sha512-mehfKSMWjjNol8659Z8KxEMrdSJDDot5SXMq00dM8BN4o+CLNXQ0xH2V7EchNHV4RmbZLmmPdEaXZc5H2FXmDg=="],
|
||||
|
||||
"@tailwindcss/oxide-wasm32-wasi/@emnapi/wasi-threads": ["@emnapi/wasi-threads@1.1.0", "", { "dependencies": { "tslib": "^2.4.0" }, "bundled": true }, "sha512-WI0DdZ8xFSbgMjR1sFsKABJ/C5OnRrjT06JXbZKexJGrDuPTzZdDYfFlsgcCXCyf+suG5QU2e/y1Wo2V/OapLQ=="],
|
||||
|
||||
"@tailwindcss/oxide-wasm32-wasi/@napi-rs/wasm-runtime": ["@napi-rs/wasm-runtime@1.1.1", "", { "dependencies": { "@emnapi/core": "^1.7.1", "@emnapi/runtime": "^1.7.1", "@tybys/wasm-util": "^0.10.1" }, "bundled": true }, "sha512-p64ah1M1ld8xjWv3qbvFwHiFVWrq1yFvV4f7w+mzaqiR4IlSgkqhcRdHwsGgomwzBH51sRY4NEowLxnaBjcW/A=="],
|
||||
|
||||
"@tailwindcss/oxide-wasm32-wasi/@tybys/wasm-util": ["@tybys/wasm-util@0.10.1", "", { "dependencies": { "tslib": "^2.4.0" }, "bundled": true }, "sha512-9tTaPJLSiejZKx+Bmog4uSubteqTvFrVrURwkmHixBo0G4seD0zUxp98E1DzUBJxLQ3NPwXrGKDiVjwx/DpPsg=="],
|
||||
|
||||
"@tailwindcss/oxide-wasm32-wasi/tslib": ["tslib@2.8.1", "", { "bundled": true }, "sha512-oJFu94HQb+KVduSUQL7wnpmqnfmLsOA/nAh6b6EH0wCEoK0/mPeXU6c3wKDV83MkOuHPRHtSXKKU99IBazS/2w=="],
|
||||
|
||||
"fumadocs-openapi/@radix-ui/react-slot": ["@radix-ui/react-slot@1.2.4", "", { "dependencies": { "@radix-ui/react-compose-refs": "1.1.2" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-Jl+bCv8HxKnlTLVrcDE8zTMJ09R9/ukw4qBs/oZClOfoQk/cOTbDn+NceXfV7j09YPVQUryJPHurafcSg6EVKA=="],
|
||||
|
||||
"fumadocs-openapi/lucide-react": ["lucide-react@0.563.0", "", { "peerDependencies": { "react": "^16.5.1 || ^17.0.0 || ^18.0.0 || ^19.0.0" } }, "sha512-8dXPB2GI4dI8jV4MgUDGBeLdGk8ekfqVZ0BdLcrRzocGgG75ltNEmWS+gE7uokKF/0oSUuczNDT+g9hFJ23FkA=="],
|
||||
|
||||
"fumadocs-openapi/tailwind-merge": ["tailwind-merge@3.4.0", "", {}, "sha512-uSaO4gnW+b3Y2aWoWfFpX62vn2sR3skfhbjsEnaBI81WD1wBLlHZe5sWf0AqjksNdYTbGBEd0UasQMT3SNV15g=="],
|
||||
|
||||
"fumadocs-ui/@radix-ui/react-slot": ["@radix-ui/react-slot@1.2.4", "", { "dependencies": { "@radix-ui/react-compose-refs": "1.1.2" }, "peerDependencies": { "@types/react": "*", "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc" }, "optionalPeers": ["@types/react"] }, "sha512-Jl+bCv8HxKnlTLVrcDE8zTMJ09R9/ukw4qBs/oZClOfoQk/cOTbDn+NceXfV7j09YPVQUryJPHurafcSg6EVKA=="],
|
||||
|
||||
"fumadocs-ui/lucide-react": ["lucide-react@0.563.0", "", { "peerDependencies": { "react": "^16.5.1 || ^17.0.0 || ^18.0.0 || ^19.0.0" } }, "sha512-8dXPB2GI4dI8jV4MgUDGBeLdGk8ekfqVZ0BdLcrRzocGgG75ltNEmWS+gE7uokKF/0oSUuczNDT+g9hFJ23FkA=="],
|
||||
|
||||
"next/postcss": ["postcss@8.4.31", "", { "dependencies": { "nanoid": "^3.3.6", "picocolors": "^1.0.0", "source-map-js": "^1.0.2" } }, "sha512-PS08Iboia9mts/2ygV3eLpY5ghnUcfLV/EXTOW1E2qYxJKGGBUtNjN76FYHnMs36RmARn41bC0AZmn+rR0OVpQ=="],
|
||||
|
||||
"parse-entities/@types/unist": ["@types/unist@2.0.11", "", {}, "sha512-CmBKiL6NNo/OqgmMn95Fk9Whlp2mtvIv+KNpQKN2F4SjvrEesubTRWGYSg+BnWZOnlCaSTU1sMpsBOzgbYhnsA=="],
|
||||
|
||||
"@scalar/openapi-parser/@scalar/json-magic/@scalar/helpers": ["@scalar/helpers@0.2.9", "", {}, "sha512-Y4ffJF0yELdwZ0BKgonqn3SumIgRn1WKyYCVHD+TDM7qRFChdGRypyt20+efHs26fmJeyBAIIv2laICj5uimiw=="],
|
||||
}
|
||||
}
|
||||
13
docs/cli.json
Normal file
@@ -0,0 +1,13 @@
{
  "$schema": "node_modules/@fumadocs/cli/dist/schema/default.json",
  "aliases": {
    "uiDir": "./components/ui",
    "componentsDir": "./components",
    "blockDir": "./components",
    "cssDir": "./styles",
    "libDir": "./lib"
  },
  "baseDir": "",
  "uiLibrary": "radix-ui",
  "commands": {}
}
247
docs/components/ai/page-actions.tsx
Normal file
@@ -0,0 +1,247 @@
'use client';
import { type ComponentProps, useMemo, useState } from 'react';
import { Check, ChevronDown, Copy, ExternalLinkIcon, TextIcon } from 'lucide-react';
import { cn } from '../../lib/cn';
import { useCopyButton } from 'fumadocs-ui/utils/use-copy-button';
import { Popover, PopoverTrigger, PopoverContent } from '../ui/popover';
import { buttonVariants } from '../ui/button';

const cache = new Map<string, Promise<string>>();

export function MarkdownCopyButton({
  markdownUrl,
  ...props
}: ComponentProps<'button'> & {
  /**
   * A URL to fetch the raw Markdown/MDX content of page
   */
  markdownUrl: string;
}) {
  const [isLoading, setLoading] = useState(false);
  const [checked, onClick] = useCopyButton(async () => {
    const cached = cache.get(markdownUrl);
    if (cached) return navigator.clipboard.writeText(await cached);

    setLoading(true);

    try {
      const promise = fetch(markdownUrl).then((res) => res.text());
      cache.set(markdownUrl, promise);
      await navigator.clipboard.write([
        new ClipboardItem({
          'text/plain': promise,
        }),
      ]);
    } finally {
      setLoading(false);
    }
  });

  return (
    <button
      disabled={isLoading}
      onClick={onClick}
      {...props}
      className={cn(
        buttonVariants({
          variant: 'secondary',
          size: 'sm',
          className: 'gap-2 [&_svg]:size-3.5 [&_svg]:text-fd-muted-foreground',
        }),
        props.className,
      )}
    >
      {checked ? <Check /> : <Copy />}
      Copy Markdown
    </button>
  );
}

export function ViewOptionsPopover({
  markdownUrl,
  githubUrl,
  ...props
}: ComponentProps<typeof PopoverTrigger> & {
  /**
   * A URL to the raw Markdown/MDX content of page
   */
  markdownUrl: string;

  /**
   * Source file URL on GitHub
   */
  githubUrl: string;
}) {
  const items = useMemo(() => {
    const pageUrl = typeof window !== 'undefined' ? window.location.href : 'loading';
    const q = `Read ${pageUrl}, I want to ask questions about it.`;

    return [
      {
        title: 'Open in GitHub',
        href: githubUrl,
        icon: (
          <svg fill="currentColor" role="img" viewBox="0 0 24 24">
            <title>GitHub</title>
            <path d="M12 .297c-6.63 0-12 5.373-12 12 0 5.303 3.438 9.8 8.205 11.385.6.113.82-.258.82-.577 0-.285-.01-1.04-.015-2.04-3.338.724-4.042-1.61-4.042-1.61C4.422 18.07 3.633 17.7 3.633 17.7c-1.087-.744.084-.729.084-.729 1.205.084 1.838 1.236 1.838 1.236 1.07 1.835 2.809 1.305 3.495.998.108-.776.417-1.305.76-1.605-2.665-.3-5.466-1.332-5.466-5.93 0-1.31.465-2.38 1.235-3.22-.135-.303-.54-1.523.105-3.176 0 0 1.005-.322 3.3 1.23.96-.267 1.98-.399 3-.405 1.02.006 2.04.138 3 .405 2.28-1.552 3.285-1.23 3.285-1.23.645 1.653.24 2.873.12 3.176.765.84 1.23 1.91 1.23 3.22 0 4.61-2.805 5.625-5.475 5.92.42.36.81 1.096.81 2.22 0 1.606-.015 2.896-.015 3.286 0 .315.21.69.825.57C20.565 22.092 24 17.592 24 12.297c0-6.627-5.373-12-12-12" />
          </svg>
        ),
      },
      {
        title: 'View as Markdown',
        href: markdownUrl,
        icon: <TextIcon />,
      },
      {
        title: 'Open in Scira AI',
        href: `https://scira.ai/?${new URLSearchParams({
          q,
        })}`,
        icon: (
          <svg
            width="910"
            height="934"
            viewBox="0 0 910 934"
            fill="none"
            xmlns="http://www.w3.org/2000/svg"
          >
            <title>Scira AI</title>
            <path
              d="M647.664 197.775C569.13 189.049 525.5 145.419 516.774 66.8849C508.048 145.419 464.418 189.049 385.884 197.775C464.418 206.501 508.048 250.131 516.774 328.665C525.5 250.131 569.13 206.501 647.664 197.775Z"
              fill="currentColor"
              stroke="currentColor"
              strokeWidth="8"
              strokeLinejoin="round"
            />
            <path
              d="M516.774 304.217C510.299 275.491 498.208 252.087 480.335 234.214C462.462 216.341 439.058 204.251 410.333 197.775C439.059 191.3 462.462 179.209 480.335 161.336C498.208 143.463 510.299 120.06 516.774 91.334C523.25 120.059 535.34 143.463 553.213 161.336C571.086 179.209 594.49 191.3 623.216 197.775C594.49 204.251 571.086 216.341 553.213 234.214C535.34 252.087 523.25 275.491 516.774 304.217Z"
              fill="currentColor"
              stroke="currentColor"
              strokeWidth="8"
              strokeLinejoin="round"
            />
            <path
              d="M857.5 508.116C763.259 497.644 710.903 445.288 700.432 351.047C689.961 445.288 637.605 497.644 543.364 508.116C637.605 518.587 689.961 570.943 700.432 665.184C710.903 570.943 763.259 518.587 857.5 508.116Z"
              stroke="currentColor"
              strokeWidth="20"
              strokeLinejoin="round"
            />
            <path
              d="M700.432 615.957C691.848 589.05 678.575 566.357 660.383 548.165C642.191 529.973 619.499 516.7 592.593 508.116C619.499 499.533 642.191 486.258 660.383 468.066C678.575 449.874 691.848 427.181 700.432 400.274C709.015 427.181 722.289 449.874 740.481 468.066C758.673 486.258 781.365 499.533 808.271 508.116C781.365 516.7 758.673 529.973 740.481 548.165C722.289 566.357 709.015 589.05 700.432 615.957Z"
              stroke="currentColor"
              strokeWidth="20"
              strokeLinejoin="round"
            />
            <path
              d="M889.949 121.237C831.049 114.692 798.326 81.9698 791.782 23.0692C785.237 81.9698 752.515 114.692 693.614 121.237C752.515 127.781 785.237 160.504 791.782 219.404C798.326 160.504 831.049 127.781 889.949 121.237Z"
              fill="currentColor"
              stroke="currentColor"
              strokeWidth="8"
              strokeLinejoin="round"
            />
            <path
              d="M791.782 196.795C786.697 176.937 777.869 160.567 765.16 147.858C752.452 135.15 736.082 126.322 716.226 121.237C736.082 116.152 752.452 107.324 765.16 94.6152C777.869 81.9065 786.697 65.5368 791.782 45.6797C796.867 65.5367 805.695 81.9066 818.403 94.6152C831.112 107.324 847.481 116.152 867.338 121.237C847.481 126.322 831.112 135.15 818.403 147.858C805.694 160.567 796.867 176.937 791.782 196.795Z"
              fill="currentColor"
              stroke="currentColor"
              strokeWidth="8"
              strokeLinejoin="round"
            />
            <path
              d="M760.632 764.337C720.719 814.616 669.835 855.1 611.872 882.692C553.91 910.285 490.404 924.255 426.213 923.533C362.022 922.812 298.846 907.419 241.518 878.531C184.19 849.643 134.228 808.026 95.4548 756.863C56.6815 705.7 30.1238 646.346 17.8129 583.343C5.50207 520.339 7.76433 455.354 24.4266 393.359C41.089 331.364 71.7099 274.001 113.947 225.658C156.184 177.315 208.919 139.273 268.117 114.442"
              stroke="currentColor"
              strokeWidth="30"
              strokeLinecap="round"
              strokeLinejoin="round"
            />
          </svg>
        ),
      },
      {
        title: 'Open in ChatGPT',
        href: `https://chatgpt.com/?${new URLSearchParams({
          hints: 'search',
          q,
        })}`,
        icon: (
          <svg
            role="img"
            viewBox="0 0 24 24"
            fill="currentColor"
            xmlns="http://www.w3.org/2000/svg"
          >
            <title>OpenAI</title>
            <path d="M22.2819 9.8211a5.9847 5.9847 0 0 0-.5157-4.9108 6.0462 6.0462 0 0 0-6.5098-2.9A6.0651 6.0651 0 0 0 4.9807 4.1818a5.9847 5.9847 0 0 0-3.9977 2.9 6.0462 6.0462 0 0 0 .7427 7.0966 5.98 5.98 0 0 0 .511 4.9107 6.051 6.051 0 0 0 6.5146 2.9001A5.9847 5.9847 0 0 0 13.2599 24a6.0557 6.0557 0 0 0 5.7718-4.2058 5.9894 5.9894 0 0 0 3.9977-2.9001 6.0557 6.0557 0 0 0-.7475-7.0729zm-9.022 12.6081a4.4755 4.4755 0 0 1-2.8764-1.0408l.1419-.0804 4.7783-2.7582a.7948.7948 0 0 0 .3927-.6813v-6.7369l2.02 1.1686a.071.071 0 0 1 .038.052v5.5826a4.504 4.504 0 0 1-4.4945 4.4944zm-9.6607-4.1254a4.4708 4.4708 0 0 1-.5346-3.0137l.142.0852 4.783 2.7582a.7712.7712 0 0 0 .7806 0l5.8428-3.3685v2.3324a.0804.0804 0 0 1-.0332.0615L9.74 19.9502a4.4992 4.4992 0 0 1-6.1408-1.6464zM2.3408 7.8956a4.485 4.485 0 0 1 2.3655-1.9728V11.6a.7664.7664 0 0 0 .3879.6765l5.8144 3.3543-2.0201 1.1685a.0757.0757 0 0 1-.071 0l-4.8303-2.7865A4.504 4.504 0 0 1 2.3408 7.872zm16.5963 3.8558L13.1038 8.364 15.1192 7.2a.0757.0757 0 0 1 .071 0l4.8303 2.7913a4.4944 4.4944 0 0 1-.6765 8.1042v-5.6772a.79.79 0 0 0-.407-.667zm2.0107-3.0231l-.142-.0852-4.7735-2.7818a.7759.7759 0 0 0-.7854 0L9.409 9.2297V6.8974a.0662.0662 0 0 1 .0284-.0615l4.8303-2.7866a4.4992 4.4992 0 0 1 6.6802 4.66zM8.3065 12.863l-2.02-1.1638a.0804.0804 0 0 1-.038-.0567V6.0742a4.4992 4.4992 0 0 1 7.3757-3.4537l-.142.0805L8.704 5.459a.7948.7948 0 0 0-.3927.6813zm1.0976-2.3654l2.602-1.4998 2.6069 1.4998v2.9994l-2.5974 1.4997-2.6067-1.4997Z" />
          </svg>
        ),
      },
      {
        title: 'Open in Claude',
        href: `https://claude.ai/new?${new URLSearchParams({
          q,
        })}`,
        icon: (
          <svg
            fill="currentColor"
            role="img"
            viewBox="0 0 24 24"
            xmlns="http://www.w3.org/2000/svg"
          >
            <title>Anthropic</title>
            <path d="M17.3041 3.541h-3.6718l6.696 16.918H24Zm-10.6082 0L0 20.459h3.7442l1.3693-3.5527h7.0052l1.3693 3.5528h3.7442L10.5363 3.5409Zm-.3712 10.2232 2.2914-5.9456 2.2914 5.9456Z" />
          </svg>
        ),
      },
      {
        title: 'Open in Cursor',
        icon: (
          <svg
            fill="currentColor"
            role="img"
            viewBox="0 0 24 24"
            xmlns="http://www.w3.org/2000/svg"
          >
            <title>Cursor</title>
            <path d="M11.503.131 1.891 5.678a.84.84 0 0 0-.42.726v11.188c0 .3.162.575.42.724l9.609 5.55a1 1 0 0 0 .998 0l9.61-5.55a.84.84 0 0 0 .42-.724V6.404a.84.84 0 0 0-.42-.726L12.497.131a1.01 1.01 0 0 0-.996 0M2.657 6.338h18.55c.263 0 .43.287.297.515L12.23 22.918c-.062.107-.229.064-.229-.06V12.335a.59.59 0 0 0-.295-.51l-9.11-5.257c-.109-.063-.064-.23.061-.23" />
          </svg>
        ),
        href: `https://cursor.com/link/prompt?${new URLSearchParams({
          text: q,
        })}`,
      },
    ];
  }, [githubUrl, markdownUrl]);

  return (
    <Popover>
      <PopoverTrigger
        {...props}
        className={cn(
          buttonVariants({
            variant: 'secondary',
            size: 'sm',
          }),
          'gap-2 data-[state=open]:bg-fd-accent data-[state=open]:text-fd-accent-foreground',
          props.className,
        )}
      >
        Open
        <ChevronDown className="size-3.5 text-fd-muted-foreground" />
      </PopoverTrigger>
      <PopoverContent className="flex flex-col">
        {items.map((item) => (
          <a
            key={item.href}
            href={item.href}
            rel="noreferrer noopener"
            target="_blank"
            className="text-sm p-2 rounded-lg inline-flex items-center gap-2 hover:text-fd-accent-foreground hover:bg-fd-accent [&_svg]:size-4"
          >
            {item.icon}
            {item.title}
            <ExternalLinkIcon className="text-fd-muted-foreground size-3.5 ms-auto" />
          </a>
        ))}
      </PopoverContent>
    </Popover>
  );
}
6
docs/components/api-page.client.tsx
Normal file
@@ -0,0 +1,6 @@
'use client';
import { defineClientConfig } from 'fumadocs-openapi/ui/client';

export default defineClientConfig({
  // Client-side configuration for API playground
});
7
docs/components/api-page.tsx
Normal file
@@ -0,0 +1,7 @@
import { openapi } from '@/lib/openapi';
import { createAPIPage } from 'fumadocs-openapi/ui';
import client from './api-page.client';

export const APIPage = createAPIPage(openapi, {
  client,
});
29
docs/components/ui/button.tsx
Normal file
29
docs/components/ui/button.tsx
Normal file
@@ -0,0 +1,29 @@
|
||||
import { cva, type VariantProps } from 'class-variance-authority';
|
||||
|
||||
const variants = {
|
||||
primary:
|
||||
'bg-fd-primary text-fd-primary-foreground hover:bg-fd-primary/80 disabled:bg-fd-secondary disabled:text-fd-secondary-foreground',
|
||||
outline: 'border hover:bg-fd-accent hover:text-fd-accent-foreground',
|
||||
ghost: 'hover:bg-fd-accent hover:text-fd-accent-foreground',
|
||||
secondary:
|
||||
'border bg-fd-secondary text-fd-secondary-foreground hover:bg-fd-accent hover:text-fd-accent-foreground',
|
||||
} as const;
|
||||
|
||||
export const buttonVariants = cva(
|
||||
'inline-flex items-center justify-center rounded-md p-2 text-sm font-medium transition-colors duration-100 disabled:pointer-events-none disabled:opacity-50 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-fd-ring',
|
||||
{
|
||||
variants: {
|
||||
variant: variants,
|
||||
// fumadocs uses `color` instead of `variant`
|
||||
color: variants,
|
||||
size: {
|
||||
sm: 'gap-1 px-2 py-1.5 text-xs',
|
||||
icon: 'p-1.5 [&_svg]:size-5',
|
||||
'icon-sm': 'p-1.5 [&_svg]:size-4.5',
|
||||
'icon-xs': 'p-1 [&_svg]:size-4',
|
||||
},
|
||||
},
|
||||
},
|
||||
);
|
||||
|
||||
export type ButtonProps = VariantProps<typeof buttonVariants>;
|
||||
32
docs/components/ui/popover.tsx
Normal file
@@ -0,0 +1,32 @@
|
||||
'use client';
|
||||
import * as PopoverPrimitive from '@radix-ui/react-popover';
|
||||
import * as React from 'react';
|
||||
import { cn } from '../../lib/cn';
|
||||
|
||||
const Popover = PopoverPrimitive.Root;
|
||||
|
||||
const PopoverTrigger = PopoverPrimitive.Trigger;
|
||||
|
||||
const PopoverContent = React.forwardRef<
|
||||
React.ComponentRef<typeof PopoverPrimitive.Content>,
|
||||
React.ComponentPropsWithoutRef<typeof PopoverPrimitive.Content>
|
||||
>(({ className, align = 'center', sideOffset = 4, ...props }, ref) => (
|
||||
<PopoverPrimitive.Portal>
|
||||
<PopoverPrimitive.Content
|
||||
ref={ref}
|
||||
align={align}
|
||||
sideOffset={sideOffset}
|
||||
side="bottom"
|
||||
className={cn(
|
||||
'z-50 origin-(--radix-popover-content-transform-origin) overflow-y-auto max-h-(--radix-popover-content-available-height) min-w-[240px] max-w-[98vw] rounded-xl border bg-fd-popover/60 backdrop-blur-lg p-2 text-sm text-fd-popover-foreground shadow-lg focus-visible:outline-none data-[state=closed]:animate-fd-popover-out data-[state=open]:animate-fd-popover-in',
|
||||
className,
|
||||
)}
|
||||
{...props}
|
||||
/>
|
||||
</PopoverPrimitive.Portal>
|
||||
));
|
||||
PopoverContent.displayName = PopoverPrimitive.Content.displayName;
|
||||
|
||||
const PopoverClose = PopoverPrimitive.PopoverClose;
|
||||
|
||||
export { Popover, PopoverTrigger, PopoverContent, PopoverClose };
|
||||
38
docs/content/docs/README.md
Normal file
@@ -0,0 +1,38 @@
|
||||
---
|
||||
title: "Documentation README"
|
||||
description: "Voicebox documentation development guide"
|
||||
---
|
||||
|
||||
This directory contains the documentation for Voicebox, built with [Fumadocs](https://fumadocs.dev).
|
||||
|
||||
## Development
|
||||
|
||||
### Running Locally
|
||||
|
||||
From the `docs/` directory:
|
||||
|
||||
```bash
|
||||
bun install
|
||||
bun run dev
|
||||
```
|
||||
|
||||
The docs will be available at `http://localhost:3000`.
|
||||
|
||||
### Structure
|
||||
|
||||
- `content/docs/overview/` — user-facing guides (installation, quick start, feature walkthroughs)
|
||||
- `content/docs/developer/` — architecture, backend internals, and contributor guides
|
||||
- `content/docs/api-reference/` — auto-generated from the backend's OpenAPI schema
|
||||
- `content/docs/index.mdx` — landing page
|
||||
- `public/` — static assets (images, screenshots, videos)
|
||||
|
||||
### Writing Docs
|
||||
|
||||
- Use `.mdx` files for all documentation pages
|
||||
- Navigation is generated from `content/docs/meta.json` files
|
||||
- Fumadocs components available: `Callout`, `Cards` / `Card`, `Tabs` / `Tab`, `Steps` / `Step`, `Accordion` / `AccordionGroup`, `Files` / `Folder` / `File`
|
||||
- API reference pages under `api-reference/` are regenerated from the backend's OpenAPI schema — don't edit them by hand
|
||||
|
||||
## Deployment
|
||||
|
||||
Docs are automatically deployed when changes land on `main`.
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Health
|
||||
description: Health check endpoint.
|
||||
full: true
|
||||
_openapi:
|
||||
method: GET
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Health check endpoint.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/health","method":"get"}]} />
|
||||
4
docs/content/docs/api-reference/general/meta.json
Normal file
@@ -0,0 +1,4 @@
|
||||
{
|
||||
"title": "General",
|
||||
"pages": ["root__get", "health_health_get"]
|
||||
}
|
||||
16
docs/content/docs/api-reference/general/root__get.mdx
Normal file
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Root
|
||||
description: Root endpoint.
|
||||
full: true
|
||||
_openapi:
|
||||
method: GET
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Root endpoint.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/","method":"get"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Generate Speech
|
||||
description: Generate speech from text using a voice profile.
|
||||
full: true
|
||||
_openapi:
|
||||
method: POST
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Generate speech from text using a voice profile.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/generate","method":"post"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Get Audio
|
||||
description: Serve generated audio file.
|
||||
full: true
|
||||
_openapi:
|
||||
method: GET
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Serve generated audio file.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/audio/{generation_id}","method":"get"}]} />
|
||||
8
docs/content/docs/api-reference/generation/meta.json
Normal file
@@ -0,0 +1,8 @@
|
||||
{
|
||||
"title": "Generation",
|
||||
"pages": [
|
||||
"generate_speech_generate_post",
|
||||
"transcribe_audio_transcribe_post",
|
||||
"get_audio_audio__generation_id__get"
|
||||
]
|
||||
}
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Transcribe Audio
|
||||
description: Transcribe audio file to text.
|
||||
full: true
|
||||
_openapi:
|
||||
method: POST
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Transcribe audio file to text.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/transcribe","method":"post"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Delete Generation
|
||||
description: Delete a generation.
|
||||
full: true
|
||||
_openapi:
|
||||
method: DELETE
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Delete a generation.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/history/{generation_id}","method":"delete"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Get Generation
|
||||
description: Get a generation by ID.
|
||||
full: true
|
||||
_openapi:
|
||||
method: GET
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Get a generation by ID.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/history/{generation_id}","method":"get"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Get Stats
|
||||
description: Get generation statistics.
|
||||
full: true
|
||||
_openapi:
|
||||
method: GET
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Get generation statistics.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/history/stats","method":"get"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: List History
|
||||
description: List generation history with optional filters.
|
||||
full: true
|
||||
_openapi:
|
||||
method: GET
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: List generation history with optional filters.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/history","method":"get"}]} />
|
||||
9
docs/content/docs/api-reference/history/meta.json
Normal file
@@ -0,0 +1,9 @@
|
||||
{
|
||||
"title": "History",
|
||||
"pages": [
|
||||
"list_history_history_get",
|
||||
"get_generation_history__generation_id__get",
|
||||
"delete_generation_history__generation_id__delete",
|
||||
"get_stats_history_stats_get"
|
||||
]
|
||||
}
|
||||
5
docs/content/docs/api-reference/meta.json
Normal file
@@ -0,0 +1,5 @@
|
||||
{
|
||||
"title": "API Reference",
|
||||
"defaultOpen": true,
|
||||
"pages": ["general", "profiles", "generation", "history", "models"]
|
||||
}
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Get Model Progress
|
||||
description: Get model download progress via Server-Sent Events.
|
||||
full: true
|
||||
_openapi:
|
||||
method: GET
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Get model download progress via Server-Sent Events.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/models/progress/{model_name}","method":"get"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Get Model Status
|
||||
description: Get status of all available models.
|
||||
full: true
|
||||
_openapi:
|
||||
method: GET
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Get status of all available models.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/models/status","method":"get"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Load Model
|
||||
description: Manually load TTS model.
|
||||
full: true
|
||||
_openapi:
|
||||
method: POST
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Manually load TTS model.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/models/load","method":"post"}]} />
|
||||
10
docs/content/docs/api-reference/models/meta.json
Normal file
@@ -0,0 +1,10 @@
|
||||
{
|
||||
"title": "Models",
|
||||
"pages": [
|
||||
"get_model_status_models_status_get",
|
||||
"load_model_models_load_post",
|
||||
"unload_model_models_unload_post",
|
||||
"trigger_model_download_models_download_post",
|
||||
"get_model_progress_models_progress__model_name__get"
|
||||
]
|
||||
}
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Trigger Model Download
|
||||
description: Trigger download of a specific model.
|
||||
full: true
|
||||
_openapi:
|
||||
method: POST
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Trigger download of a specific model.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/models/download","method":"post"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Unload Model
|
||||
description: Unload TTS model to free memory.
|
||||
full: true
|
||||
_openapi:
|
||||
method: POST
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Unload TTS model to free memory.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/models/unload","method":"post"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Add Profile Sample
|
||||
description: Add a sample to a voice profile.
|
||||
full: true
|
||||
_openapi:
|
||||
method: POST
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Add a sample to a voice profile.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles/{profile_id}/samples","method":"post"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Create Profile
|
||||
description: Create a new voice profile.
|
||||
full: true
|
||||
_openapi:
|
||||
method: POST
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Create a new voice profile.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles","method":"post"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Delete Profile
|
||||
description: Delete a voice profile.
|
||||
full: true
|
||||
_openapi:
|
||||
method: DELETE
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Delete a voice profile.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles/{profile_id}","method":"delete"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Delete Profile Sample
|
||||
description: Delete a profile sample.
|
||||
full: true
|
||||
_openapi:
|
||||
method: DELETE
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Delete a profile sample.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles/samples/{sample_id}","method":"delete"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Get Profile
|
||||
description: Get a voice profile by ID.
|
||||
full: true
|
||||
_openapi:
|
||||
method: GET
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Get a voice profile by ID.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles/{profile_id}","method":"get"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Get Profile Samples
|
||||
description: Get all samples for a profile.
|
||||
full: true
|
||||
_openapi:
|
||||
method: GET
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Get all samples for a profile.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles/{profile_id}/samples","method":"get"}]} />
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: List Profiles
|
||||
description: List all voice profiles.
|
||||
full: true
|
||||
_openapi:
|
||||
method: GET
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: List all voice profiles.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles","method":"get"}]} />
|
||||
13
docs/content/docs/api-reference/profiles/meta.json
Normal file
@@ -0,0 +1,13 @@
|
||||
{
|
||||
"title": "Profiles",
|
||||
"pages": [
|
||||
"list_profiles_profiles_get",
|
||||
"create_profile_profiles_post",
|
||||
"get_profile_profiles__profile_id__get",
|
||||
"update_profile_profiles__profile_id__put",
|
||||
"delete_profile_profiles__profile_id__delete",
|
||||
"get_profile_samples_profiles__profile_id__samples_get",
|
||||
"add_profile_sample_profiles__profile_id__samples_post",
|
||||
"delete_profile_sample_profiles_samples__sample_id__delete"
|
||||
]
|
||||
}
|
||||
@@ -0,0 +1,16 @@
|
||||
---
|
||||
title: Update Profile
|
||||
description: Update a voice profile.
|
||||
full: true
|
||||
_openapi:
|
||||
method: PUT
|
||||
toc: []
|
||||
structuredData:
|
||||
headings: []
|
||||
contents:
|
||||
- content: Update a voice profile.
|
||||
---
|
||||
|
||||
{/* This file was generated by Fumadocs. Do not edit this file directly. Any changes should be made by running the generation command again. */}
|
||||
|
||||
<APIPage document={"./openapi.json"} operations={[{"path":"/profiles/{profile_id}","method":"put"}]} />
|
||||
286
docs/content/docs/developer/architecture.mdx
Normal file
@@ -0,0 +1,286 @@
|
||||
---
|
||||
title: "Architecture"
|
||||
description: "Understanding Voicebox's technical architecture"
|
||||
---
|
||||
|
||||
## System Overview
|
||||
|
||||
Voicebox uses a client-server architecture with a React frontend and Python backend. The desktop app is built with Tauri and contains two main layers:
|
||||
|
||||
**Frontend Layer:** A React application that handles the UI components, state management with Zustand, and data fetching with React Query (TanStack Query).
|
||||
|
||||
**Backend Layer:** A Python FastAPI server that hosts the REST API, runs a pluggable registry of TTS and STT engines, manages the SQLite database, and handles audio processing.
|
||||
|
||||
These two layers communicate via HTTP on `localhost:17493`, with the frontend making API requests to the backend. In production the backend is compiled with PyInstaller and launched as a Tauri sidecar; in development it's run manually via `uvicorn`.
|
||||
|
||||
## Frontend Architecture
|
||||
|
||||
### Tech Stack
|
||||
|
||||
- **Framework**: React 18 with TypeScript
|
||||
- **State Management**: Zustand stores
|
||||
- **Data Fetching**: React Query (TanStack Query)
|
||||
- **Styling**: Tailwind CSS
|
||||
- **Audio**: WaveSurfer.js
|
||||
- **Desktop**: Tauri (Rust)
|
||||
|
||||
### Component Structure
|
||||
|
||||
<Files>
|
||||
<Folder name="app/src" defaultOpen>
|
||||
<Folder name="components">
|
||||
<File name="Profiles/" />
|
||||
<File name="Generation/" />
|
||||
<File name="Stories/" />
|
||||
<File name="ServerSettings/" />
|
||||
</Folder>
|
||||
<Folder name="lib">
|
||||
<File name="api/" />
|
||||
<File name="constants/" />
|
||||
<File name="hooks/" />
|
||||
<File name="utils/" />
|
||||
</Folder>
|
||||
<Folder name="stores" />
|
||||
</Folder>
|
||||
</Files>
|
||||
|
||||
## Backend Architecture
|
||||
|
||||
### Tech Stack
|
||||
|
||||
- **Framework**: FastAPI (Python 3.11+)
|
||||
- **TTS Engines**: Qwen3-TTS, Qwen CustomVoice, LuxTTS, Chatterbox, Chatterbox Turbo, TADA, Kokoro
|
||||
- **Transcription**: Whisper (PyTorch or MLX-Whisper)
|
||||
- **Inference Backends**: MLX (Apple Silicon), PyTorch (CUDA / ROCm / XPU / DirectML / CPU)
|
||||
- **Database**: SQLite via SQLAlchemy
|
||||
- **Audio**: librosa, soundfile, Pedalboard
|
||||
|
||||
### Layout
|
||||
|
||||
<Files>
|
||||
<Folder name="backend" defaultOpen>
|
||||
<File name="app.py" />
|
||||
<File name="main.py" />
|
||||
<File name="config.py" />
|
||||
<File name="models.py" />
|
||||
<File name="server.py" />
|
||||
<File name="build_binary.py" />
|
||||
<Folder name="routes">
|
||||
<File name="profiles.py" />
|
||||
<File name="generate.py" />
|
||||
<File name="history.py" />
|
||||
<File name="models.py" />
|
||||
<File name="channels.py" />
|
||||
</Folder>
|
||||
<Folder name="services">
|
||||
<File name="generation.py" />
|
||||
<File name="task_queue.py" />
|
||||
<File name="profiles.py" />
|
||||
<File name="channels.py" />
|
||||
</Folder>
|
||||
<Folder name="backends">
|
||||
<File name="__init__.py" />
|
||||
<File name="base.py" />
|
||||
<File name="pytorch_backend.py" />
|
||||
<File name="mlx_backend.py" />
|
||||
<File name="qwen_custom_voice_backend.py" />
|
||||
<File name="luxtts_backend.py" />
|
||||
<File name="chatterbox_backend.py" />
|
||||
<File name="chatterbox_turbo_backend.py" />
|
||||
<File name="hume_backend.py" />
|
||||
<File name="kokoro_backend.py" />
|
||||
</Folder>
|
||||
<Folder name="database">
|
||||
<File name="models.py" />
|
||||
<File name="session.py" />
|
||||
</Folder>
|
||||
<Folder name="utils">
|
||||
<File name="audio.py" />
|
||||
<File name="effects.py" />
|
||||
</Folder>
|
||||
</Folder>
|
||||
</Files>
|
||||
|
||||
### Request Flow
|
||||
|
||||
An HTTP request enters a **route handler**, which validates input and delegates to a **service** function. The service calls into the appropriate **engine backend** via the registry, which runs the actual inference. Audio post-processing runs through **utils** (trim, resample, effects).
|
||||
|
||||
Route handlers are intentionally thin — they validate input, delegate to a service function, and format the response. All business logic lives in `services/`.
|
||||
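The route/service split above can be sketched in plain Python. All names here (`GenerateRequest`, `generate_speech_service`, `generate_route`) are illustrative stand-ins, not the actual functions in `routes/generate.py`:

```python
from dataclasses import dataclass


@dataclass
class GenerateRequest:
    text: str
    profile_id: str


def generate_speech_service(req: GenerateRequest) -> dict:
    # Business logic lives in the service layer: validation of
    # semantics, prompt lookup, queueing, inference dispatch.
    if not req.text.strip():
        raise ValueError("text must not be empty")
    return {"profile_id": req.profile_id, "status": "queued"}


def generate_route(req: GenerateRequest) -> dict:
    # The route handler stays thin: delegate, then format the response.
    result = generate_speech_service(req)
    return {"ok": True, **result}
```

The payoff of keeping handlers thin is that the service function can be unit-tested and reused (e.g. by retry and regenerate flows) without going through HTTP.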
|
||||
### Multi-Engine Registry
|
||||
|
||||
The backend is designed so that adding a new TTS engine only requires touching the `backends/` directory and the central registry. There is no per-engine branching in routes or services.
|
||||
|
||||
- **`TTSBackend` Protocol** (`backends/__init__.py`) — defines the contract every engine implements: `load_model`, `create_voice_prompt`, `combine_voice_prompts`, `generate`, `unload_model`, `is_loaded`, `_get_model_path`.
|
||||
- **`ModelConfig` dataclass** — central metadata record for each model variant: `model_name`, `display_name`, `engine`, `hf_repo_id`, `size_mb`, `needs_trim`, `languages`, `supports_instruct`, etc.
|
||||
- **`TTS_ENGINES` dict** — maps engine name (`"qwen"`, `"kokoro"`, etc.) to display name.
|
||||
- **`get_tts_backend_for_engine(engine)`** — thread-safe factory that lazily instantiates and caches the backend for an engine using double-checked locking.
|
||||
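A minimal sketch of the Protocol-plus-factory pattern described above. The `_DummyEngine` class and `_ENGINE_CLASSES` map are illustrative stand-ins for the real registry; only the shape of the contract and the double-checked locking come from the description:

```python
import threading
from typing import Protocol


class TTSBackend(Protocol):
    """Subset of the contract every engine implements."""

    def load_model(self, model_name: str) -> None: ...
    def is_loaded(self) -> bool: ...


class _DummyEngine:
    # Stand-in for a real engine implementation (illustration only).
    def __init__(self) -> None:
        self._loaded = False

    def load_model(self, model_name: str) -> None:
        self._loaded = True

    def is_loaded(self) -> bool:
        return self._loaded


_ENGINE_CLASSES = {"dummy": _DummyEngine}  # the real map lives in the registry
_backends: dict[str, TTSBackend] = {}
_lock = threading.Lock()


def get_tts_backend_for_engine(engine: str) -> TTSBackend:
    # Double-checked locking: a lock-free fast path for the common case,
    # then a re-check under the lock before instantiating.
    backend = _backends.get(engine)
    if backend is None:
        with _lock:
            backend = _backends.get(engine)
            if backend is None:
                backend = _ENGINE_CLASSES[engine]()
                _backends[engine] = backend
    return backend
```

Because the cache is per-engine, repeated calls return the same instance, which is what lets voice-prompt caches and loaded models persist across requests.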
|
||||
Shipped engines:
|
||||
|
||||
| Engine key | Display name | Profile type |
|
||||
|------------|--------------|--------------|
|
||||
| `qwen` | Qwen TTS | Cloned |
|
||||
| `qwen_custom_voice` | Qwen CustomVoice | Preset |
|
||||
| `luxtts` | LuxTTS | Cloned |
|
||||
| `chatterbox` | Chatterbox TTS | Cloned |
|
||||
| `chatterbox_turbo` | Chatterbox Turbo | Cloned |
|
||||
| `tada` | TADA | Cloned |
|
||||
| `kokoro` | Kokoro | Preset |
|
||||
|
||||
See [TTS Engines](/developer/tts-engines) for the full contract and integration phases, and [PROJECT_STATUS.md](https://github.com/jamiepine/voicebox/blob/main/docs/PROJECT_STATUS.md) for candidates under evaluation.
|
||||
|
||||
### Key Modules
|
||||
|
||||
- **`app.py`** — FastAPI app factory, CORS, lifecycle events
|
||||
- **`main.py`** — Entry point (imports app, runs uvicorn)
|
||||
- **`server.py`** — Tauri sidecar launcher, parent-pid watchdog, frozen-build environment setup
|
||||
- **`services/generation.py`** — Single function handling all generation modes (generate, retry, regenerate)
|
||||
- **`services/task_queue.py`** — Serial generation queue for GPU inference
|
||||
- **`backends/__init__.py`** — Protocol definitions, `ModelConfig` registry, and engine factory
|
||||
- **`backends/base.py`** — Shared utilities across all engine implementations (device selection, progress tracking, output trimming)
|
||||
|
||||
### Inference Backend Selection
|
||||
|
||||
The server detects the best inference backend at startup and uses it for all engines that support it:
|
||||
|
||||
| Platform | Backend | Acceleration |
|
||||
|----------|---------|--------------|
|
||||
| macOS (Apple Silicon) | MLX | Metal / Neural Engine |
|
||||
| Windows / Linux (NVIDIA) | PyTorch | CUDA (cu128) |
|
||||
| Linux (AMD) | PyTorch | ROCm |
|
||||
| Windows / Linux (Intel Arc) | PyTorch | XPU (IPEX) |
|
||||
| Windows (other GPU) | PyTorch | DirectML |
|
||||
| Any | PyTorch | CPU fallback |
|
||||
|
||||
See [GPU Acceleration](/overview/gpu-acceleration) for platform-specific notes and manual overrides.
|
||||
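A simplified sketch of the detection order in the table above. The real startup code probes more backends (ROCm, XPU, DirectML) and exposes manual overrides; this only illustrates the "best available first, CPU last" cascade:

```python
import importlib.util
import platform


def select_inference_backend() -> tuple[str, str]:
    """Return (backend, acceleration) following the priority table."""

    def has(module: str) -> bool:
        return importlib.util.find_spec(module) is not None

    # Apple Silicon with MLX installed wins outright.
    if platform.system() == "Darwin" and platform.machine() == "arm64" and has("mlx"):
        return ("mlx", "metal")
    # Otherwise prefer PyTorch with CUDA if a GPU is visible.
    if has("torch"):
        import torch

        if torch.cuda.is_available():
            return ("pytorch", "cuda")
    # Everything falls back to PyTorch on CPU.
    return ("pytorch", "cpu")
```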
|
||||
### Data Model
|
||||
|
||||
Core tables (see `backend/database/models.py`):
|
||||
|
||||
- **`profiles`** — Voice profiles with `voice_type` discriminator (`cloned` | `preset` | `designed`), `preset_engine`, `preset_voice_id`, and `default_engine`.
|
||||
- **`profile_samples`** — Reference audio clips + transcripts for cloned profiles. Empty for preset profiles.
|
||||
- **`generations`** — Generated audio with text, engine, model, language, seed, and duration.
|
||||
- **`generation_versions`** — Processed variants of a generation with different effects chains applied.
|
||||
- **`audio_channels`** + **`channel_device_mappings`** + **`profile_channel_mappings`** — Multi-output routing.
|
||||
|
||||
See [Voice Profiles](/developer/voice-profiles) and [Effects Pipeline](/developer/effects-pipeline) for details.
|
||||
|
||||
## Desktop App (Tauri)
|
||||
|
||||
### Rust Backend
|
||||
|
||||
<Files>
|
||||
<Folder name="tauri/src-tauri" defaultOpen>
|
||||
<File name="Cargo.toml" />
|
||||
<File name="tauri.conf.json" />
|
||||
<File name="src/" />
|
||||
<Folder name="binaries" />
|
||||
</Folder>
|
||||
</Files>
|
||||
|
||||
### Responsibilities
|
||||
|
||||
- Launch Python backend as sidecar process
|
||||
- Native file dialogs
|
||||
- System tray integration
|
||||
- Auto-updates (Tauri updater + custom CUDA backend swap)
|
||||
- Parent-PID watchdog so the backend exits if the app crashes
|
||||
|
||||
## Build Process
|
||||
|
||||
### Development
|
||||
|
||||
```bash
|
||||
just dev # Starts backend + Tauri app
|
||||
just dev-web # Starts backend + web app (no Tauri)
|
||||
just dev-backend # Backend only
|
||||
just dev-frontend # Tauri app only (backend must be running)
|
||||
```
|
||||
|
||||
### Production
|
||||
|
||||
```bash
|
||||
just build # CPU server binary + Tauri installer
|
||||
just build-local # CPU + CUDA binaries + Tauri installer (Windows)
|
||||
just build-server # Server binary only
|
||||
just build-tauri # Tauri app only
|
||||
```
|
||||
|
||||
See [Building](/developer/building) for what PyInstaller does and how the CUDA binary is split and packaged separately.
|
||||
|
||||
## Data Flow
|
||||
|
||||
### Generation Flow
|
||||
|
||||
1. **User Input** — text entered in a React component, engine + profile selected
|
||||
2. **State Update** — Zustand generation form store records the request
|
||||
3. **API Request** — React Query mutation hits `POST /generate`
|
||||
4. **Route** — `routes/generate.py` validates input, dispatches to `services/generation.py`
|
||||
5. **Voice Prompt** — the service creates or retrieves a cached voice prompt via the engine's backend
|
||||
6. **Queue** — `services/task_queue.py` serializes generation to avoid GPU contention
|
||||
7. **Inference** — the engine backend runs `generate()` and returns audio + sample rate
|
||||
8. **Post-process** — optional trim (for engines that need it), effects chain applied per generation version
|
||||
9. **Storage** — audio written to the generations directory, metadata saved to SQLite
|
||||
10. **Response** — backend returns the generation record; frontend updates React Query cache and plays audio
|
||||
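Step 6 above (serializing inference) can be sketched with an `asyncio.Lock` plus `asyncio.to_thread`; the real `services/task_queue.py` may differ in structure, but the idea is the same: one blocking inference job at a time, off the event loop:

```python
import asyncio


class SerialTaskQueue:
    """One generation runs at a time so engines don't contend for the GPU."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()

    async def run(self, fn, *args):
        async with self._lock:
            # Blocking inference is pushed off the event loop.
            return await asyncio.to_thread(fn, *args)


async def main():
    q = SerialTaskQueue()
    order = []

    def job(n):
        order.append(n)  # records actual execution order
        return n * 2

    results = await asyncio.gather(q.run(job, 1), q.run(job, 2))
    return results, order


results, order = asyncio.run(main())
```

Even though both jobs are submitted concurrently, the lock guarantees they execute strictly in submission order, while other HTTP requests stay responsive.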
|
||||
## Performance Considerations
|
||||
|
||||
### Frontend
|
||||
|
||||
- **Code splitting** — lazy-load routes
|
||||
- **Memoization** — `React.memo` for heavy components
|
||||
- **Virtual scrolling** — for large lists
|
||||
- **Debouncing** — search and input handling
|
||||
|
||||
### Backend
|
||||
|
||||
- **Async I/O** — all I/O is async; inference runs in `asyncio.to_thread`
|
||||
- **Serial task queue** — avoids multiple engines fighting for the GPU
|
||||
- **Voice prompt caching** — engine-specific, keyed by audio hash + reference text
|
||||
- **Model pinning** — only one model per engine loaded at a time; switching unloads the previous one
|
||||
- **Per-engine backend cache** — engines are only instantiated once per process
|
||||
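The voice-prompt cache key described above ("audio hash + reference text") might look like this sketch; the hash construction is an assumption about one reasonable implementation, not the actual code:

```python
import hashlib


def voice_prompt_cache_key(audio_bytes: bytes, reference_text: str) -> str:
    """Derive a stable cache key from reference audio and its transcript."""
    h = hashlib.sha256()
    # Length-prefix the audio so (audio, text) pairs can't collide by
    # shifting bytes between the two fields.
    h.update(len(audio_bytes).to_bytes(8, "big"))
    h.update(audio_bytes)
    h.update(reference_text.encode("utf-8"))
    return h.hexdigest()
```

Identical sample-plus-transcript pairs then reuse the already-computed prompt instead of re-encoding the reference audio on every generation.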
|
||||
## Security
|
||||
|
||||
### Current
|
||||
|
||||
- Local-only by default (bound to `127.0.0.1:17493`)
|
||||
- No authentication (localhost trust)
|
||||
- File system sandboxing via Tauri
|
||||
|
||||
### Planned
|
||||
|
||||
- API key authentication for remote mode
|
||||
- User accounts
|
||||
- Rate limiting
|
||||
- HTTPS support
|
||||
|
||||
## Deployment Modes
|
||||
|
||||
### Local Mode
|
||||
|
||||
- Backend runs as sidecar
|
||||
- All data stays on device
|
||||
- No network required
|
||||
|
||||
### Remote Mode
|
||||
|
||||
- Backend on a separate machine (Docker or bare host)
|
||||
- Frontend (desktop or web) connects over HTTP
|
||||
- See [Remote Mode](/overview/remote-mode) and [Docker](/overview/docker)
|
||||
|
||||
## Next Steps
|
||||
|
||||
<Cards>
|
||||
<Card title="Development Setup" href="/developer/setup">
|
||||
Set up your dev environment
|
||||
</Card>
|
||||
<Card title="TTS Engines" href="/developer/tts-engines">
|
||||
How to add a new engine
|
||||
</Card>
|
||||
<Card title="Contributing" href="/developer/contributing">
|
||||
Contribute to Voicebox
|
||||
</Card>
|
||||
</Cards>
|
||||
310
docs/content/docs/developer/audio-channels.mdx
Normal file
@@ -0,0 +1,310 @@
---
title: "Audio Channels"
description: "How audio output routing works in Voicebox"
---

## Overview

Audio channels allow routing voice output to different audio devices. This is useful for multi-output setups where different voices should play through different speakers or applications.

## Architecture

**Channel:** A named audio bus that can be assigned to output devices.

**Device Mapping:** Links channels to OS audio device identifiers.

**Profile Mapping:** Links voice profiles to channels (many-to-many).

## Data Model

### AudioChannel Table

```python
class AudioChannel(Base):
    __tablename__ = "audio_channels"

    id = Column(String, primary_key=True)
    name = Column(String, nullable=False)
    is_default = Column(Boolean, default=False)
    created_at = Column(DateTime)
```

### ChannelDeviceMapping Table

```python
class ChannelDeviceMapping(Base):
    __tablename__ = "channel_device_mappings"

    id = Column(String, primary_key=True)
    channel_id = Column(String, ForeignKey("audio_channels.id"))
    device_id = Column(String)  # OS device identifier
```

### ProfileChannelMapping Table

```python
class ProfileChannelMapping(Base):
    __tablename__ = "profile_channel_mappings"

    profile_id = Column(String, ForeignKey("profiles.id"), primary_key=True)
    channel_id = Column(String, ForeignKey("audio_channels.id"), primary_key=True)
```

## Default Channel

A default channel is created on database initialization:

```python
def init_db():
    # Create default channel if it doesn't exist
    default_channel = db.query(AudioChannel).filter(
        AudioChannel.is_default == True
    ).first()

    if not default_channel:
        default_channel = AudioChannel(
            id=str(uuid.uuid4()),
            name="Default",
            is_default=True
        )
        db.add(default_channel)

        # On first init only, assign all existing profiles to the default
        # channel (doing this on every startup would create duplicate mappings)
        profiles = db.query(VoiceProfile).all()
        for profile in profiles:
            mapping = ProfileChannelMapping(
                profile_id=profile.id,
                channel_id=default_channel.id
            )
            db.add(mapping)

    db.commit()
```

## Core Operations

### Creating a Channel

```python
async def create_channel(
    data: AudioChannelCreate,
    db: Session,
) -> AudioChannelResponse:
    # Check name uniqueness
    existing = db.query(DBAudioChannel).filter_by(name=data.name).first()
    if existing:
        raise ValueError(f"Channel with name '{data.name}' already exists")

    # Create channel
    channel = DBAudioChannel(
        id=str(uuid.uuid4()),
        name=data.name,
        is_default=False,
    )
    db.add(channel)

    # Add device mappings
    for device_id in data.device_ids:
        mapping = DBChannelDeviceMapping(
            id=str(uuid.uuid4()),
            channel_id=channel.id,
            device_id=device_id,
        )
        db.add(mapping)

    db.commit()
```

### Updating a Channel

```python
async def update_channel(
    channel_id: str,
    data: AudioChannelUpdate,
    db: Session,
) -> AudioChannelResponse:
    channel = db.query(DBAudioChannel).filter_by(id=channel_id).first()
    if not channel:
        raise ValueError(f"Channel {channel_id} not found")

    # Cannot modify default channel
    if channel.is_default:
        raise ValueError("Cannot modify the default channel")

    # Update name
    if data.name is not None:
        channel.name = data.name

    # Update device mappings
    if data.device_ids is not None:
        # Delete existing
        db.query(DBChannelDeviceMapping).filter_by(channel_id=channel_id).delete()

        # Add new
        for device_id in data.device_ids:
            mapping = DBChannelDeviceMapping(
                id=str(uuid.uuid4()),
                channel_id=channel.id,
                device_id=device_id,
            )
            db.add(mapping)

    db.commit()
```

### Deleting a Channel

```python
async def delete_channel(channel_id: str, db: Session) -> bool:
    channel = db.query(DBAudioChannel).filter_by(id=channel_id).first()
    if not channel:
        raise ValueError(f"Channel {channel_id} not found")

    # Cannot delete default channel
    if channel.is_default:
        raise ValueError("Cannot delete the default channel")

    # Delete device mappings
    db.query(DBChannelDeviceMapping).filter_by(channel_id=channel_id).delete()

    # Delete profile-channel mappings
    db.query(DBProfileChannelMapping).filter_by(channel_id=channel_id).delete()

    # Delete channel
    db.delete(channel)
    db.commit()
    return True
```

## Voice Assignment

### Assigning Voices to Channel

```python
async def set_channel_voices(
    channel_id: str,
    data: ChannelVoiceAssignment,
    db: Session,
) -> None:
    # Verify channel exists
    channel = db.query(DBAudioChannel).filter_by(id=channel_id).first()
    if not channel:
        raise ValueError(f"Channel {channel_id} not found")

    # Verify all profiles exist
    for profile_id in data.profile_ids:
        profile = db.query(DBVoiceProfile).filter_by(id=profile_id).first()
        if not profile:
            raise ValueError(f"Profile {profile_id} not found")

    # Delete existing mappings
    db.query(DBProfileChannelMapping).filter_by(channel_id=channel_id).delete()

    # Add new mappings
    for profile_id in data.profile_ids:
        mapping = DBProfileChannelMapping(
            profile_id=profile_id,
            channel_id=channel_id,
        )
        db.add(mapping)

    db.commit()
```

### Assigning Channels to Voice

```python
async def set_profile_channels(
    profile_id: str,
    data: ProfileChannelAssignment,
    db: Session,
) -> None:
    # Verify profile exists
    profile = db.query(DBVoiceProfile).filter_by(id=profile_id).first()
    if not profile:
        raise ValueError(f"Profile {profile_id} not found")

    # Delete existing mappings
    db.query(DBProfileChannelMapping).filter_by(profile_id=profile_id).delete()

    # Add new mappings
    for channel_id in data.channel_ids:
        mapping = DBProfileChannelMapping(
            profile_id=profile_id,
            channel_id=channel_id,
        )
        db.add(mapping)

    db.commit()
```

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/channels` | List all channels |
| POST | `/channels` | Create a channel |
| GET | `/channels/{id}` | Get channel by ID |
| PUT | `/channels/{id}` | Update channel |
| DELETE | `/channels/{id}` | Delete channel |
| GET | `/channels/{id}/voices` | Get assigned voices |
| PUT | `/channels/{id}/voices` | Set assigned voices |
| GET | `/profiles/{id}/channels` | Get profile's channels |
| PUT | `/profiles/{id}/channels` | Set profile's channels |

## Request/Response Schemas

### AudioChannelCreate

```json
{
  "name": "Speakers",
  "device_ids": ["device_uuid_1", "device_uuid_2"]
}
```

### AudioChannelResponse

```json
{
  "id": "channel_uuid",
  "name": "Speakers",
  "is_default": false,
  "device_ids": ["device_uuid_1", "device_uuid_2"],
  "created_at": "2024-01-15T10:30:00Z"
}
```

### ChannelVoiceAssignment

```json
{
  "profile_ids": ["profile_1", "profile_2"]
}
```

## Use Cases

### Multi-Output Setup

**Scenario:** Stream with different voice characters

1. Create "Stream" channel → OBS virtual audio
2. Create "Monitor" channel → Headphones
3. Assign "Narrator" profile → Both channels
4. Assign "Character 1" profile → Stream only
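
The scenario above boils down to resolving the two many-to-many mappings. A minimal, self-contained sketch (the channel/device IDs here are hypothetical, and the in-memory dicts stand in for the database tables):

```python
# Hypothetical data mirroring ProfileChannelMapping and ChannelDeviceMapping.
profile_channels = {
    "narrator": ["stream", "monitor"],   # Narrator plays on both channels
    "character_1": ["stream"],           # Character 1 plays on Stream only
}
channel_devices = {
    "stream": ["obs_virtual_audio"],
    "monitor": ["headphones"],
}

def devices_for_profile(profile_id: str) -> list[str]:
    """Union of device IDs across every channel the profile is assigned to."""
    devices: list[str] = []
    for channel_id in profile_channels.get(profile_id, []):
        for device_id in channel_devices.get(channel_id, []):
            if device_id not in devices:
                devices.append(device_id)
    return devices

print(devices_for_profile("narrator"))     # ['obs_virtual_audio', 'headphones']
print(devices_for_profile("character_1"))  # ['obs_virtual_audio']
```

The frontend performs an equivalent lookup when deciding which output device(s) a generated clip should play on.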

### Virtual Audio Cables

Common device IDs for virtual audio:

- VB-Audio Virtual Cable
- BlackHole (macOS)
- Soundflower (macOS)

## Frontend Integration

The frontend needs to:

1. **Enumerate devices** using Web Audio API or Tauri
2. **Display channel list** with device assignments
3. **Allow profile assignment** via drag/drop or dropdown
4. **Route playback** to correct device based on profile's channel

## Limitations

- Device IDs are OS-specific
- Hot-plugging may invalidate device IDs
- Default channel cannot be modified/deleted
- Frontend handles actual audio routing (backend just stores config)
218
docs/content/docs/developer/autoupdater.mdx
Normal file
@@ -0,0 +1,218 @@
---
title: "Auto-Updater"
description: "How Voicebox automatic updates work"
---

## Overview

Voicebox uses Tauri's built-in auto-updater to deliver signed updates to users. The system verifies updates cryptographically before installation.

## How It Works

When Voicebox launches (in production Tauri builds only), it checks GitHub Releases for a `latest.json` manifest. If a newer version is available:

1. **Notification** - An update banner appears at the top of the app
2. **Download** - User clicks "Install Now" to download the update package
3. **Verification** - The downloaded package is cryptographically verified using the public key embedded in `tauri.conf.json`
4. **Installation** - After verification, the update is installed
5. **Restart** - The app restarts automatically with the new version

Users can also check for updates manually via **Settings → Check for Updates**.
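
At its core, "a newer version is available" is a semantic-version comparison between the running app and the manifest. Tauri implements this internally; the sketch below just illustrates the comparison (the helper names are ours, not Tauri's):

```python
def parse_version(v: str) -> tuple[int, ...]:
    """Turn '0.4.1' or 'v0.4.1' into a comparable tuple like (0, 4, 1)."""
    return tuple(int(part) for part in v.lstrip("v").split("."))

def update_available(current: str, manifest: str) -> bool:
    # Tuple comparison handles multi-digit components correctly
    # (string comparison would say "0.10.0" < "0.4.1").
    return parse_version(manifest) > parse_version(current)

print(update_available("0.4.1", "0.4.2"))   # True
print(update_available("0.4.1", "0.4.1"))   # False
print(update_available("0.4.1", "0.10.0"))  # True
```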

## Configuration

The updater is configured in `tauri/src-tauri/tauri.conf.json`:

```json
{
  "plugins": {
    "updater": {
      "active": true,
      "dialog": false,
      "endpoints": [
        "https://github.com/jamiepine/voicebox/releases/latest/download/latest.json"
      ],
      "pubkey": "PASTE_PUBLIC_KEY_CONTENT_HERE"
    }
  }
}
```

**Key settings:**

- `endpoints` - URL to the `latest.json` manifest (checked on app startup)
- `pubkey` - Public key for verifying update signatures
- `dialog` - Set to `false` (we use custom UI instead of Tauri's built-in dialog)

## Release Manifest

The `latest.json` file defines available updates per platform:

```json
{
  "version": "0.2.0",
  "notes": "Bug fixes and improvements",
  "pub_date": "2026-01-25T12:00:00Z",
  "platforms": {
    "darwin-aarch64": {
      "signature": "base64_encoded_signature",
      "url": "https://github.com/jamiepine/voicebox/releases/download/v0.2.0/voicebox_0.2.0_aarch64.app.tar.gz"
    },
    "darwin-x86_64": {
      "signature": "base64_encoded_signature",
      "url": "https://github.com/jamiepine/voicebox/releases/download/v0.2.0/voicebox_0.2.0_x64.app.tar.gz"
    },
    "linux-x86_64": {
      "signature": "base64_encoded_signature",
      "url": "https://github.com/jamiepine/voicebox/releases/download/v0.2.0/voicebox_0.2.0_amd64.AppImage"
    },
    "windows-x86_64": {
      "signature": "base64_encoded_signature",
      "url": "https://github.com/jamiepine/voicebox/releases/download/v0.2.0/voicebox_0.2.0_x64_en-US.msi"
    }
  }
}
```
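
The updater selects the entry under `platforms` that matches the running OS/architecture pair. A minimal illustration of that lookup (the URLs are placeholders, and the selection helper is ours, not Tauri's actual implementation):

```python
import json

# A trimmed-down manifest with the same shape as latest.json above.
manifest = json.loads("""
{
  "version": "0.2.0",
  "platforms": {
    "darwin-aarch64": {"url": "https://example.invalid/voicebox_aarch64.app.tar.gz"},
    "windows-x86_64": {"url": "https://example.invalid/voicebox_x64_en-US.msi"}
  }
}
""")

def download_url(manifest: dict, platform_key: str) -> str:
    """Return the update URL for a platform key like 'darwin-aarch64'."""
    try:
        return manifest["platforms"][platform_key]["url"]
    except KeyError:
        raise ValueError(f"No update published for platform {platform_key!r}")

print(download_url(manifest, "darwin-aarch64"))
```

If no matching platform entry exists, the updater simply reports that no update is available for that machine.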

## Signing

Updates must be cryptographically signed to be accepted. The signing process:

1. **Generate keys** (one-time setup):

   ```bash
   bun tauri signer generate -w ~/.tauri/voicebox.key
   ```

   This creates:
   - Private key: `~/.tauri/voicebox.key` (stored in GitHub Secrets, never committed)
   - Public key: `~/.tauri/voicebox.key.pub` (pasted into `tauri.conf.json`)

2. **Build with signing** (GitHub Actions handles this):
   - Set `TAURI_SIGNING_PRIVATE_KEY` environment variable
   - Tauri signs the update package during build
   - Generates `.sig` signature file alongside the installer

3. **Verification** - The updater compares the signature against the public key before installing

## GitHub Actions Workflow

The release workflow (`.github/workflows/release.yml`) automatically:

- Builds signed releases for macOS, Windows, and Linux
- Creates the `latest.json` manifest with signatures
- Uploads everything to the GitHub Release

Triggered by pushing a git tag:

```bash
git tag v0.2.0 && git push --tags
```

## Environment Variables

GitHub Actions needs these secrets set:

- `TAURI_SIGNING_PRIVATE_KEY` - Content of `~/.tauri/voicebox.key`
- `TAURI_SIGNING_PRIVATE_KEY_PASSWORD` - Password for the key (if set)

## Security

<Callout type="warn">
**Critical:** Never commit the private key. Store it only in GitHub Secrets. The public key in `tauri.conf.json` is safe to commit and distribute.
</Callout>

- Updates are cryptographically signed using Ed25519
- HTTP endpoints are blocked (HTTPS only)
- Signature verification happens before installation
- Failed verification aborts the update

## Troubleshooting

### "Invalid signature" error

- Public key in `tauri.conf.json` doesn't match the private key used to sign
- Signature file wasn't uploaded to the release

### "No update available" when one exists

- `latest.json` version isn't higher than current version
- Wrong endpoint URL in configuration
- Manifest hasn't propagated to GitHub's CDN yet

### Update check fails in dev mode

The updater only works in production Tauri builds. It doesn't run during `just dev` or web mode.

### Build fails with signing error

- GitHub Secrets aren't set correctly
- Private key file is missing or corrupted
- Key format is wrong (should start with `dW50cnVzdGVkIGNvbW1lbnQ6`)

## CUDA Backend Updates

The CUDA-enabled backend is distributed separately from the main app because bundling CUDA would bloat the installer by several gigabytes for users who don't have an NVIDIA GPU. Unlike the Tauri auto-updater, the CUDA backend uses a custom download system built into the Python server.

**Size comparison (approximate):**

- Standard CPU bundle (in the installer): ~200–400 MB
- CUDA server core: ~945 MB (versioned with each Voicebox release)
- CUDA libs (NVIDIA runtime DLLs): ~1.7 GB (versioned independently, cached across upgrades)

### Two-archive split

Since v0.4, the CUDA binary is packaged as **two archives** instead of one:

- **Server core** (`voicebox-server-cuda.tar.gz`) — the Python server + PyTorch code, changes every release.
- **CUDA libs** (`cuda-libs-cu128-v1.tar.gz`) — the heavy NVIDIA CUDA/cuDNN DLLs, only re-downloaded when the CUDA toolkit major version changes.

This means most Voicebox upgrades only re-download the ~945 MB server core, not the full ~2.5 GB bundle.

### Download Process

When a user clicks "Install CUDA backend" in Settings → GPU:

1. **Server-core archive** — Downloaded from GitHub Releases and extracted.
2. **CUDA libs archive** — Downloaded separately (or reused if the installed version still matches).
3. **Verification** — SHA-256 checksum verification for integrity.
4. **Placement** — Extracted into `{data_dir}/backends/cuda/`.
5. **Restart** — The Voicebox server restarts and swaps in the CUDA backend.
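
Step 3's integrity check is a standard streaming SHA-256 comparison against the companion `.sha256` file. A self-contained sketch (the helper is illustrative, not the backend's actual code):

```python
import hashlib
import tempfile
from pathlib import Path

def verify_sha256(archive: Path, expected_hex: str) -> bool:
    """Hash the file in 1 MiB chunks (archives are huge) and compare digests."""
    h = hashlib.sha256()
    with archive.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest() == expected_hex.strip().lower()

# Demo with a throwaway file standing in for a downloaded archive:
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"fake archive bytes")
    path = Path(f.name)

print(verify_sha256(path, hashlib.sha256(b"fake archive bytes").hexdigest()))  # True
path.unlink()
```

A failed check aborts the install and the partial download is discarded.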

### Auto-Update on Startup

On startup, the backend compares the installed CUDA server-core version with the current app version. If they differ, the core archive is pulled in the background. If the libs version pinned by the new release also differs (rare — e.g. on a cu126 → cu128 bump), the user is prompted to confirm the larger download.

### Storage Location

Downloaded CUDA binaries live in the app's data directory:

```
{data_dir}/backends/cuda/
  voicebox-server-cuda.exe   # Windows
  voicebox-server-cuda       # macOS/Linux
  <NVIDIA CUDA runtime DLLs>
```

### API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/backend/cuda-status` | GET | Check if the CUDA backend is available/active and which versions are installed |
| `/backend/download-cuda` | POST | Trigger server-core + libs download |
| `/backend/cuda-progress` | GET | SSE stream of download progress |
| `/backend/cuda` | DELETE | Remove the downloaded CUDA backend |

### Progress Tracking

Downloads report progress via Server-Sent Events (SSE):

```
GET /backend/cuda-progress

event: progress
data: {"current": 52428800, "total": 945000000, "filename": "voicebox-server-cuda.tar.gz", "status": "downloading"}
```

The frontend subscribes to this endpoint to show real-time progress, including which archive (server core vs libs) is currently downloading.
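
Turning one such `data:` line into a progress percentage is a small amount of parsing. A sketch in Python (the event payload shape matches the example above; the helper itself is illustrative, not the frontend's actual code):

```python
import json

def parse_progress(sse_data_line: str) -> tuple[str, float]:
    """Parse an SSE 'data:' line into (filename, percent complete)."""
    payload = json.loads(sse_data_line.removeprefix("data:").strip())
    percent = 100.0 * payload["current"] / payload["total"]
    return payload["filename"], percent

line = 'data: {"current": 52428800, "total": 945000000, "filename": "voicebox-server-cuda.tar.gz", "status": "downloading"}'
name, pct = parse_progress(line)
print(f"{name}: {pct:.1f}%")  # voicebox-server-cuda.tar.gz: 5.5%
```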

### Release Artifacts

For each CUDA-capable release, these files are uploaded to GitHub:

- `voicebox-server-cuda.tar.gz` — server-core archive
- `voicebox-server-cuda.tar.gz.sha256` — checksum
- `cuda-libs-cu128-v1.tar.gz` — CUDA runtime libs (only when the libs version bumps)
- `cuda-libs-cu128-v1.tar.gz.sha256` — checksum
192
docs/content/docs/developer/building.mdx
Normal file
@@ -0,0 +1,192 @@
---
title: "Building"
description: "How Voicebox is built for production"
---

## Overview

Voicebox uses a two-stage build process:

1. **Python Server Binary** — PyInstaller bundles the FastAPI backend into a standalone executable
2. **Tauri Desktop App** — Bundles the React frontend, Rust wrapper, and Python server as a sidecar

## Build Commands

```bash
just build         # Build everything (server + Tauri)
just build-server  # Build Python server binary only
just build-tauri   # Build Tauri app only
```

## Server Binary Build

### Build Script

`scripts/build-server.sh` orchestrates the build:

```bash
# Determine platform (e.g., x86_64-apple-darwin)
PLATFORM=$(rustc --print host-tuple)

# Run PyInstaller via build_binary.py
cd backend
python build_binary.py

# Copy to Tauri's binaries directory
cp dist/voicebox-server ../tauri/src-tauri/binaries/voicebox-server-${PLATFORM}
```

### PyInstaller Configuration

`backend/build_binary.py` contains the PyInstaller configuration:

**Entry Point:** Uses `server.py` (not `main.py`) for Tauri sidecar support

**Key Options:**

- `--onefile` — Single executable
- `--hidden-import` — Explicitly import modules PyInstaller can't detect
- `--collect-all` — Bundle data files and native libraries for packages like `mlx`, `zipvoice`
- `--exclude-module` — Strip NVIDIA packages from CPU builds

**Platform-Specific Logic:**

```python
# Apple Silicon — include MLX backend
if is_apple_silicon() and not cuda:
    args.extend([
        "--hidden-import", "mlx",
        "--collect-all", "mlx",  # Bundles .dylib and .metallib files
    ])

# CUDA builds — include torch.cuda
if cuda:
    args.extend(["--hidden-import", "torch.cuda"])

# CPU builds — exclude NVIDIA packages to save ~3GB
else:
    for pkg in ["nvidia", "nvidia.cublas", "nvidia.cudnn", ...]:
        args.extend(["--exclude-module", pkg])
```

**Environment Variable:**

```bash
export QWEN_TTS_PATH=~/path/to/Qwen3-TTS  # Use local Qwen3-TTS source
```

### CUDA Binary

The CUDA-enabled server is built separately due to size (~2.43 GB vs ~410 MB CPU version):

```bash
cd backend
python build_binary.py --cuda
```

The resulting binary is too large for GitHub Releases, so it's split into parts for distribution (see Auto-Updater docs for the download mechanism).

## Tauri App Build

Tauri bundles everything together:

```bash
cd tauri
bun run tauri build
```

**What happens:**

1. Vite builds the React frontend
2. Rust compiles the Tauri wrapper
3. Sidecar binary is copied from `src-tauri/binaries/`
4. Platform-specific installer created (DMG, MSI, AppImage)

**Output locations:**

<Files>
  <Folder name="tauri/src-tauri/target/release/bundle" defaultOpen>
    <File name="dmg/" />
    <File name="msi/" />
    <File name="nsis/" />
    <File name="appimage/" />
  </Folder>
</Files>

### Sidecar Configuration

The server binary is declared as an external binary in `tauri.conf.json`:

```json
{
  "tauri": {
    "bundle": {
      "externalBin": ["binaries/voicebox-server"]
    }
  }
}
```

Tauri looks for `voicebox-server-${PLATFORM}` in `src-tauri/binaries/` and bundles it.
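
The naming convention can be sketched as a small helper — the base name from `externalBin` plus the Rust host triple, with `.exe` appended on Windows (this mirrors how Tauri resolves sidecar filenames; the function itself is ours, for illustration):

```python
def sidecar_name(base: str, host_triple: str) -> str:
    """Build the expected sidecar filename for a given Rust host triple."""
    name = f"{base}-{host_triple}"
    if "windows" in host_triple:
        name += ".exe"  # Windows executables need the extension
    return name

print(sidecar_name("voicebox-server", "aarch64-apple-darwin"))
# voicebox-server-aarch64-apple-darwin
print(sidecar_name("voicebox-server", "x86_64-pc-windows-msvc"))
# voicebox-server-x86_64-pc-windows-msvc.exe
```

If the file with the expected name is missing, the Tauri build fails (see Troubleshooting below).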

## GitHub Actions Release

`.github/workflows/release.yml` automates the full build:

### Matrix Strategy

| Platform | Target | Backend | Notes |
|----------|--------|---------|-------|
| macos-latest | aarch64-apple-darwin | MLX | Apple Silicon native |
| macos-15-intel | x86_64-apple-darwin | PyTorch | Intel Macs |
| windows-latest | x86_64-pc-windows-msvc | PyTorch | Windows, with CUDA optional |

### Build Steps

1. **Setup** — Python, Rust, Bun, dependencies
2. **Build Server** — `build-server.sh` (Unix) or `build_binary.py` (Windows)
3. **Build Tauri** — `tauri-action` with signing keys
4. **Upload** — Release artifacts and `latest.json`

### Code Signing

**macOS:**

- Apple Developer certificate imported from secrets
- Notarization via App Store Connect API

**Windows:**

- Tauri handles signing via `TAURI_SIGNING_PRIVATE_KEY`

### CUDA Binary (Separate Job)

The `build-cuda-windows` job runs separately:

1. Install PyTorch with CUDA 12.8
2. Build with `build_binary.py --cuda` (produces `--onedir` output)
3. Package with `scripts/package_cuda.py` into two archives:
   - `voicebox-server-cuda.tar.gz` — server core (~945 MB)
   - `cuda-libs-cu128-v1.tar.gz` — NVIDIA runtime libraries (~1.7 GB, cached independently)
4. Upload archives as release artifacts

This binary is downloaded on-demand by users who enable CUDA in settings. The CUDA libs archive is only re-downloaded when the CUDA toolkit version changes, not on every app update.

## Troubleshooting

<AccordionGroup>
  <Accordion title="Binary not found in dist/">
    PyInstaller failed to create the output. Check:
    - Python venv is activated
    - All dependencies installed: `pip install -r requirements.txt`
    - PyInstaller installed: `pip install pyinstaller`
  </Accordion>

  <Accordion title="MLX/Metal libraries missing in bundle">
    macOS Apple Silicon builds need `--collect-all mlx` to include `.dylib` and `.metallib` files, not just `--collect-data`.
  </Accordion>

  <Accordion title="CUDA DLLs bloating CPU build">
    If you're building the CPU version but CUDA torch is installed locally, the script auto-detects this, swaps in CPU torch temporarily, then restores CUDA torch afterward.
  </Accordion>

  <Accordion title="Tauri can't find sidecar">
    Ensure the binary exists at `tauri/src-tauri/binaries/voicebox-server-${PLATFORM}` before running the Tauri build.
  </Accordion>
</AccordionGroup>
339
docs/content/docs/developer/contributing.mdx
Normal file
@@ -0,0 +1,339 @@
---
title: "Contributing"
description: "How to contribute to Voicebox"
---

Thank you for your interest in contributing to Voicebox! This guide will help you get started.

## Code of Conduct

- Be respectful and inclusive
- Welcome newcomers and help them learn
- Focus on constructive feedback
- Respect different viewpoints and experiences

## Getting Started

Before you start contributing, make sure you have:

1. **Read the documentation** to understand how Voicebox works
2. **Set up your development environment** — see [Development Setup](/developer/setup)
3. **Explored the codebase** to understand the project structure
4. **Checked [`docs/PROJECT_STATUS.md`](https://github.com/jamiepine/voicebox/blob/main/docs/PROJECT_STATUS.md)** — the living engineering roadmap that tracks prioritized tasks (Tier 1 → 3), architectural bottlenecks, and candidate TTS engines under evaluation (including why some are backlogged)
5. **Checked existing issues** to see if someone else is working on something similar

## Ways to Contribute

<Cards>
  <Card title="Report Bugs">
    Found a bug? Open an issue with reproduction steps
  </Card>
  <Card title="Request Features">
    Have an idea? Start a discussion or open an issue
  </Card>
  <Card title="Improve Docs">
    Fix typos, add examples, or clarify instructions
  </Card>
  <Card title="Write Code">
    Fix bugs, add features, or optimize performance
  </Card>
</Cards>

## Development Workflow

### 1. Fork & Clone

```bash
# Fork the repository on GitHub
# Then clone your fork
git clone https://github.com/YOUR_USERNAME/voicebox.git
cd voicebox
```

### 2. Create a Branch

Use descriptive branch names:

```bash
# For features
git checkout -b feature/voice-effects

# For bug fixes
git checkout -b fix/audio-playback-issue

# For documentation
git checkout -b docs/api-examples
```

### 3. Make Your Changes

Follow these guidelines:

<AccordionGroup>
  <Accordion title="Code Style">
    **TypeScript/React:**
    - Use TypeScript strict mode
    - Prefer functional components with hooks
    - Use named exports
    - Format with Biome (runs automatically)

    **Python:**
    - Follow PEP 8
    - Use type hints
    - Use async/await for I/O
    - Document functions with docstrings

    **Rust:**
    - Follow Rust conventions
    - Use meaningful names
    - Handle errors explicitly
    - Run `rustfmt`
  </Accordion>

  <Accordion title="Commit Messages">
    Write clear, descriptive commit messages:

    ```bash
    # Good
    git commit -m "Add voice profile export feature"
    git commit -m "Fix audio playback stopping after 30 seconds"

    # Avoid
    git commit -m "Update code"
    git commit -m "Fix bug"
    ```

    Format:
    - Use imperative mood ("Add feature" not "Added feature")
    - Keep first line under 50 characters
    - Add detailed description if needed
  </Accordion>

  <Accordion title="Testing">
    - Test your changes manually in the app
    - Ensure backend API endpoints work
    - Check for TypeScript/Python errors
    - Verify UI components render correctly
    - Add automated tests when possible
  </Accordion>
</AccordionGroup>

### 4. Push & Create PR

```bash
# Push your branch
git push origin feature/your-feature-name

# Then create a pull request on GitHub
```

## Pull Request Guidelines

When creating a pull request:

<Steps>
  <Step title="Use a Clear Title">
    Examples:
    - "Add voice profile export functionality"
    - "Fix audio playback stopping after 30 seconds"
    - "Improve generation speed with caching"
  </Step>

  <Step title="Provide Description">
    Include:
    - What changes you made
    - Why you made them
    - How to test them
    - Screenshots (for UI changes)
    - Reference related issues
  </Step>

  <Step title="Update Documentation">
    - Update relevant docs if behavior changes
    - Add API documentation for new endpoints
    - Update README if needed
  </Step>

  <Step title="Check the Checklist">
    - [ ] Code follows style guidelines
    - [ ] Documentation updated
    - [ ] Changes tested
    - [ ] No breaking changes (or documented)
    - [ ] CHANGELOG.md updated
  </Step>
</Steps>

## Project Structure

<Files>
  <Folder name="voicebox" defaultOpen>
    <Folder name="app/src">
      <File name="components/" />
      <File name="lib/" />
      <File name="hooks/" />
      <File name="stores/" />
    </Folder>
    <Folder name="backend">
      <File name="app.py" />
      <File name="main.py" />
      <File name="server.py" />
      <File name="models.py" />
      <Folder name="routes" />
      <Folder name="services" />
      <Folder name="backends" />
      <Folder name="database" />
      <Folder name="utils" />
    </Folder>
    <Folder name="tauri">
      <File name="src-tauri/" />
    </Folder>
    <File name="web/" />
    <File name="landing/" />
    <File name="scripts/" />
  </Folder>
</Files>

## Areas for Contribution

### Bug Fixes

- Check [existing issues](https://github.com/jamiepine/voicebox/issues) for bugs
- Test your fix thoroughly
- Add regression tests if possible

### New Features

- Check [`docs/PROJECT_STATUS.md`](https://github.com/jamiepine/voicebox/blob/main/docs/PROJECT_STATUS.md) and the [roadmap](https://github.com/jamiepine/voicebox#roadmap) before proposing work — the status doc lists prioritized tasks (Tier 1 → 3), known architectural bottlenecks, and candidate TTS engines already under evaluation (including why some have been backlogged)
- Discuss major features in an issue first
- Keep features focused and well-scoped
- Adding a new TTS engine? See [TTS Engines](/developer/tts-engines) for the phased workflow

### Documentation

- Improve clarity and fix typos
- Add code examples
- Create tutorials or guides
- Document API endpoints

### UI/UX Improvements

- Improve accessibility
- Enhance visual design
- Optimize performance
- Add animations/transitions

### Infrastructure

- Improve build process
- Add CI/CD improvements
- Optimize bundle size
- Add testing infrastructure
## API Development

When adding new API endpoints:

<Steps>
<Step title="Add Route">
In `backend/main.py`:

```python
@app.post("/api/new-endpoint")
async def new_endpoint(data: RequestModel) -> ResponseModel:
    """Endpoint description."""
    # Implementation
    return response
```
</Step>

<Step title="Create Models">
In `backend/models.py`:

```python
class RequestModel(BaseModel):
    field: str


class ResponseModel(BaseModel):
    result: str
```
</Step>

<Step title="Regenerate Client">
```bash
just generate-api
```

This updates the TypeScript client with type-safe bindings.
</Step>

<Step title="Update Docs">
The API documentation is generated automatically from the OpenAPI schema. Ensure your endpoint has proper docstrings and type hints, then regenerate the docs:

```bash
just generate-api
```
</Step>
</Steps>

## Testing

Testing is currently primarily manual. When adding tests:

**Backend:**
```bash
cd backend
pytest
```

**Frontend:**
```bash
bun run test
```

**E2E (future):**
```bash
bun run test:e2e
```

## Release Process

Releases are managed by maintainers using `bumpversion`:

```bash
# Bump version (patch, minor, or major)
bumpversion patch

# Push with tags
git push && git push --tags
```

GitHub Actions automatically builds and publishes releases when tags are pushed.

## Community

- **GitHub Issues:** Bug reports and feature requests
- **GitHub Discussions:** General questions and ideas
- **Discord:** Real-time chat (coming soon)

## Recognition

Contributors are recognized in:

- [CHANGELOG.md](https://github.com/jamiepine/voicebox/blob/main/CHANGELOG.md)
- GitHub contributor list
- Release notes

## License

By contributing, you agree that your contributions will be licensed under the MIT License.

## Questions?

If you have questions:

1. Check the [documentation](/overview/introduction)
2. Read [`docs/PROJECT_STATUS.md`](https://github.com/jamiepine/voicebox/blob/main/docs/PROJECT_STATUS.md) for current engineering priorities
3. Search [existing issues](https://github.com/jamiepine/voicebox/issues)
4. Open a new issue or discussion
5. See [CONTRIBUTING.md](https://github.com/jamiepine/voicebox/blob/main/CONTRIBUTING.md) in the repo

Thank you for contributing to Voicebox! 🎉

257
docs/content/docs/developer/effects-pipeline.mdx
Normal file
@@ -0,0 +1,257 @@
---
title: "Effects Pipeline"
description: "Audio post-processing effects and generation versioning"
---

The effects pipeline provides professional-grade DSP audio processing using Spotify's Pedalboard library. Each generation can have multiple versions with different effect chains applied.

## Overview

**Key concepts:**

- **Effects Chain** — JSON-serializable list of effect configurations applied sequentially
- **Generation Version** — a processed variant of a generation with its own audio file and effects chain
- **Effect Preset** — a saved effects chain configuration (built-in or user-created)
- **Clean Version** — the original, unprocessed generation audio

**Flow:**

1. TTS generation creates clean audio
2. The effects chain processes the audio
3. The processed result is saved as a new generation version

Each generation maintains a clean version (the original) plus any number of processed versions with different effect chains applied.

## Effect Types

The following effect types are available, each with configurable parameters:

### Chorus / Flanger

Modulated delay effect. A short `centre_delay_ms` gives a flanger; a longer one gives a chorus.

**Parameters:**

- `rate_hz` — LFO speed in Hz (range: 0.01 to 20, default: 1.0)
- `depth` — modulation depth (range: 0.0 to 1.0, default: 0.5)
- `feedback` — feedback amount (range: 0.0 to 0.95, default: 0.0)
- `centre_delay_ms` — centre delay in milliseconds (range: 0.5 to 50, default: 7.0)
- `mix` — wet/dry mix (range: 0.0 to 1.0, default: 0.5)

### Reverb

Room reverb effect.

**Parameters:**

- `room_size` — room size (range: 0.0 to 1.0, default: 0.5)
- `damping` — high-frequency damping (range: 0.0 to 1.0, default: 0.5)
- `wet_level` — wet level (range: 0.0 to 1.0, default: 0.33)
- `dry_level` — dry level (range: 0.0 to 1.0, default: 0.4)
- `width` — stereo width (range: 0.0 to 1.0, default: 1.0)

### Delay

Echo / delay line.

**Parameters:**

- `delay_seconds` — delay time in seconds (range: 0.01 to 2.0, default: 0.3)
- `feedback` — feedback amount (range: 0.0 to 0.95, default: 0.3)
- `mix` — wet/dry mix (range: 0.0 to 1.0, default: 0.3)

### Compressor

Dynamic range compression for consistent loudness.

**Parameters:**

- `threshold_db` — threshold in dB (range: -60 to 0, default: -20.0)
- `ratio` — compression ratio (range: 1.0 to 20.0, default: 4.0)
- `attack_ms` — attack time in ms (range: 0.1 to 100, default: 10.0)
- `release_ms` — release time in ms (range: 10 to 1000, default: 100.0)

### Gain

Volume adjustment in decibels.

**Parameters:**

- `gain_db` — gain in dB (range: -40 to 40, default: 0.0)

### High-Pass Filter

Removes frequencies below the cutoff.

**Parameters:**

- `cutoff_frequency_hz` — cutoff frequency in Hz (range: 20 to 8000, default: 80.0)

### Low-Pass Filter

Removes frequencies above the cutoff.

**Parameters:**

- `cutoff_frequency_hz` — cutoff frequency in Hz (range: 200 to 20000, default: 8000.0)

### Pitch Shift

Shifts pitch up or down by semitones.

**Parameters:**

- `semitones` — semitones to shift (range: -12 to 12, default: 0.0)

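A chain is simply an ordered list of these effect configurations. As an illustrative sketch of the serialized form (the `type` and `params` field names are assumptions for illustration, not confirmed against the registry):

```json
[
  { "type": "pitch_shift", "params": { "semitones": -2.0 } },
  { "type": "reverb", "params": { "room_size": 0.7, "wet_level": 0.4 } },
  { "type": "gain", "params": { "gain_db": 3.0 } }
]
```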
## Generation Versions

Each generation starts with a clean version (no effects). Users can create processed versions by applying effect chains.

**Version properties:**

- `id` — unique version identifier
- `label` — user-defined name (e.g., "robotic", "with reverb")
- `audio_path` — path to the processed audio file
- `effects_chain` — JSON array of effect configurations
- `source_version_id` — which version this was derived from
- `is_default` — whether this is the default audio for the generation

**File storage:**

<Files>
<Folder name="data/generations" defaultOpen>
<File name="{generation_id}.wav" />
<File name="{generation_id}_{version_id}.wav" />
</Folder>
</Files>

**Default version behavior:**

- Exactly one version per generation is marked as default
- The generation's `audio_path` always points to the default version's audio
- Deleting the default version automatically promotes another version

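The default-promotion rule can be sketched in plain Python. This is a simplified stand-in using a hypothetical `Version` record; the real logic lives in `backend/services/versions.py` and operates on database rows:

```python
from dataclasses import dataclass


@dataclass
class Version:
    id: str
    is_default: bool


def delete_version(versions: list[Version], version_id: str) -> list[Version]:
    """Remove a version; if it was the default, promote a surviving one."""
    deleted = next(v for v in versions if v.id == version_id)
    remaining = [v for v in versions if v.id != version_id]
    if deleted.is_default and remaining:
        # Promote the first surviving version so exactly one default remains
        remaining[0].is_default = True
    return remaining
```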
## Effect Presets

Presets are saved effects chains that can be reused across generations.

**Built-in presets:**

- **Robotic**: Metallic robotic voice using chorus (flanger-style)
- **Radio**: Thin AM-radio voice with band-pass filtering and light compression
- **Echo Chamber**: Spacious reverb with a trailing echo
- **Deep Voice**: Lower pitch with added warmth using pitch shift and compression

**User presets:**

- Created via the effects UI
- Stored in the database (SQLite)
- Built-in presets cannot be modified or deleted
- Used to quickly apply favorite effect combinations

## API Endpoints

### Effects Management

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/effects/available` | GET | List all effect types with parameter definitions |
| `/effects/presets` | GET | List all presets (built-in + user) |
| `/effects/presets` | POST | Create a new user preset |
| `/effects/presets/:id` | GET | Get a specific preset |
| `/effects/presets/:id` | PUT | Update a user preset |
| `/effects/presets/:id` | DELETE | Delete a user preset |
| `/effects/preview/:generation_id` | POST | Preview effects on a generation (returns an audio stream) |

### Generation Versions

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/generations/:id/versions` | GET | List all versions for a generation |
| `/generations/:id/versions/apply-effects` | POST | Apply an effects chain and create a new version |
| `/generations/:id/versions/:version_id/set-default` | PUT | Set a version as default |
| `/generations/:id/versions/:version_id` | DELETE | Delete a version |

### Request Body: Apply Effects

Request body for applying effects:

- `effects_chain` — array of effect objects
- `label` — version label (e.g., "with reverb")
- `set_as_default` — whether to set the new version as default
- `source_version_id` — source version ID (optional)

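Putting those fields together, a request might look like this (the top-level field names come from the list above; the shape of each effect object is an illustrative assumption):

```json
{
  "effects_chain": [
    { "type": "reverb", "params": { "room_size": 0.7 } }
  ],
  "label": "with reverb",
  "set_as_default": false,
  "source_version_id": null
}
```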
## Implementation

### Backend Architecture

**Files:**

| File | Purpose |
|------|---------|
| `backend/utils/effects.py` | Effect registry, validation, and audio processing |
| `backend/services/versions.py` | Generation version CRUD operations |
| `backend/services/effects.py` | Effect preset CRUD operations |
| `backend/routes/effects.py` | API endpoints for effects and versions |

**Effect Registry:**

The `EFFECT_REGISTRY` dict in `utils/effects.py` defines all available effects with their parameters, defaults, and ranges.

**Validation:**

Effects chains are validated before application:

- Each effect type must exist in the registry
- Parameters must be numbers within min/max bounds
- Unknown parameters are rejected

**Audio Processing:**

Uses Spotify's Pedalboard library:

```python
from pedalboard import Pedalboard

# Build a pedalboard from the effects chain
board = build_pedalboard(effects_chain)

# Apply to audio (async via thread so the event loop isn't blocked)
processed = await asyncio.to_thread(lambda: board(audio, sample_rate))
```

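The validation rules above can be sketched with a plain dict as a simplified stand-in for the real `EFFECT_REGISTRY` (the parameter bounds shown are taken from the effect tables earlier in this page; the function name is hypothetical):

```python
# Simplified stand-in for EFFECT_REGISTRY: effect type -> {param: (min, max)}
REGISTRY = {
    "gain": {"gain_db": (-40, 40)},
    "delay": {"delay_seconds": (0.01, 2.0), "feedback": (0.0, 0.95), "mix": (0.0, 1.0)},
}


def validate_chain(chain: list[dict]) -> list[str]:
    """Return a list of validation errors (empty if the chain is valid)."""
    errors = []
    for effect in chain:
        params = REGISTRY.get(effect["type"])
        if params is None:
            # Unknown effect types are rejected outright
            errors.append(f"unknown effect: {effect['type']}")
            continue
        for name, value in effect.get("params", {}).items():
            if name not in params:
                errors.append(f"{effect['type']}: unknown parameter {name}")
            elif not isinstance(value, (int, float)) or not (params[name][0] <= value <= params[name][1]):
                errors.append(f"{effect['type']}: {name} out of range")
    return errors
```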
### Frontend Integration

**Key components:**

| Component | Location |
|-----------|----------|
| Effects chain editor | `app/src/components/Effects/` |
| Version selector | Generation detail view |
| Preset manager | Effects panel |
| Live preview | Preview button (streams processed audio) |

**State management:**

- Effects chains are stored as JSON arrays
- Live preview fetches processed audio without saving
- Applying effects creates a new version via the POST endpoint

## Adding New Effects

To add a new effect type:

1. **Add to the registry** (`backend/utils/effects.py`):
   - Add an entry to `EFFECT_REGISTRY` with `cls`, `label`, `description`, and `params`
   - Import the effect class from Pedalboard
2. **Update frontend types** if needed

The new effect automatically appears in `/effects/available` and the chain editor UI.

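As a sketch, a new entry might look like this. The `cls`, `label`, `description`, and `params` keys come from the step above; the choice of Pedalboard's `Distortion` effect and the exact bounds shown are illustrative assumptions:

```python
from pedalboard import Distortion

EFFECT_REGISTRY["distortion"] = {
    "cls": Distortion,
    "label": "Distortion",
    "description": "Waveshaping distortion",
    "params": {
        # drive_db is Distortion's main parameter; bounds are illustrative
        "drive_db": {"default": 25.0, "min": 0.0, "max": 60.0},
    },
}
```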
## Best Practices

**Effect ordering matters.** Process effects in this order for best results:

1. Pitch shift (if needed)
2. High/low-pass filters
3. Chorus/flanger (time-based)
4. Reverb/delay (spatial)
5. Compressor
6. Gain (final level adjustment)

**CPU usage:**

- Effects are applied in real time during generation
- Pitch shift and reverb are the most CPU-intensive effects
- Consider previewing complex chains before applying them

**Storage:**

- Each version creates a new audio file
- The clean version always exists (and can be reverted to)
- Processed versions can be deleted to save space

275
docs/content/docs/developer/history.mdx
Normal file
@@ -0,0 +1,275 @@
---
title: "Generation History"
description: "How generation history tracking works in Voicebox"
---

## Overview

The history module tracks all generated audio, providing a searchable record of past generations. Each generation stores the text, settings, and a reference to the audio file.

## Data Model

### Generation Table

```python
class Generation(Base):
    __tablename__ = "generations"

    id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))
    profile_id = Column(String, ForeignKey("profiles.id"), nullable=False)
    text = Column(Text, nullable=False)
    language = Column(String, default="en")
    audio_path = Column(String, nullable=True)
    duration = Column(Float, nullable=True)
    seed = Column(Integer)
    instruct = Column(Text)
    engine = Column(String, default="qwen")
    model_size = Column(String, nullable=True)
    status = Column(String, default="completed")  # pending | completed | failed
    error = Column(Text, nullable=True)
    is_favorited = Column(Boolean, default=False)
    created_at = Column(DateTime, default=datetime.utcnow)
```

Each generation can also have multiple **generation versions** — processed variants with different effects chains applied. The original (`clean`) version plus any number of processed versions live in a separate `generation_versions` table. See [Effects Pipeline](/developer/effects-pipeline).

## File Storage

Generated audio is stored in:

<Files>
<Folder name="data" defaultOpen>
<Folder name="generations">
<File name="{generation_id}.wav" />
</Folder>
</Folder>
</Files>

## Core Functions

### Creating a Generation Record

After TTS generates audio, a history entry is created:

```python
async def create_generation(
    profile_id: str,
    text: str,
    language: str,
    audio_path: str,
    duration: float,
    seed: Optional[int],
    db: Session,
    instruct: Optional[str] = None,
) -> GenerationResponse:
    db_generation = DBGeneration(
        id=str(uuid.uuid4()),
        profile_id=profile_id,
        text=text,
        language=language,
        audio_path=audio_path,
        duration=duration,
        seed=seed,
        instruct=instruct,
        created_at=datetime.utcnow(),
    )

    db.add(db_generation)
    db.commit()

    return GenerationResponse.model_validate(db_generation)
```

### Listing Generations

Supports filtering and pagination:

```python
async def list_generations(
    query: HistoryQuery,
    db: Session,
) -> HistoryListResponse:
    # Build query with profile name join
    q = db.query(
        DBGeneration,
        DBVoiceProfile.name.label('profile_name')
    ).join(
        DBVoiceProfile,
        DBGeneration.profile_id == DBVoiceProfile.id
    )

    # Apply filters
    if query.profile_id:
        q = q.filter(DBGeneration.profile_id == query.profile_id)

    if query.search:
        q = q.filter(DBGeneration.text.like(f"%{query.search}%"))

    # Order and paginate
    total = q.count()
    q = q.order_by(DBGeneration.created_at.desc())
    q = q.offset(query.offset).limit(query.limit)

    # Execute the query; rows are (generation, profile_name) tuples
    results = q.all()

    return HistoryListResponse(items=results, total=total)
```

### Getting Statistics

Aggregate statistics for the dashboard:

```python
async def get_generation_stats(db: Session) -> dict:
    total = db.query(func.count(DBGeneration.id)).scalar()
    total_duration = db.query(func.sum(DBGeneration.duration)).scalar()

    by_profile = db.query(
        DBGeneration.profile_id,
        func.count(DBGeneration.id).label('count')
    ).group_by(DBGeneration.profile_id).all()

    return {
        "total_generations": total,
        "total_duration_seconds": total_duration,
        "generations_by_profile": {
            profile_id: count for profile_id, count in by_profile
        },
    }
```

## Deletion

Deleting a generation removes both the database record and the audio file:

```python
async def delete_generation(generation_id: str, db: Session) -> bool:
    generation = db.query(DBGeneration).filter_by(id=generation_id).first()
    if not generation:
        return False

    # Delete the audio file
    audio_path = Path(generation.audio_path)
    if audio_path.exists():
        audio_path.unlink()

    # Delete the database record
    db.delete(generation)
    db.commit()

    return True
```

### Cascade Delete

When deleting a profile, all of its generations are also deleted:

```python
async def delete_generations_by_profile(profile_id: str, db: Session) -> int:
    generations = db.query(DBGeneration).filter_by(profile_id=profile_id).all()

    for generation in generations:
        Path(generation.audio_path).unlink(missing_ok=True)
        db.delete(generation)

    db.commit()
    return len(generations)
```

## Export/Import

### Exporting a Generation

Generations can be exported as ZIP archives:

<Files>
<Folder name="generation_export.zip" defaultOpen>
<File name="generation.json" />
<File name="audio.wav" />
</Folder>
</Files>

### Importing a Generation

The import process:

1. Extract the ZIP archive
2. Validate the metadata and audio
3. Create a new generation ID
4. Copy the audio to the generations directory
5. Create the database record

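The steps above can be sketched with the standard library. This is a simplified stand-in for the real import service (the function name is hypothetical); the archive member names match the export layout shown above:

```python
import json
import shutil
import uuid
import zipfile
from pathlib import Path


def import_generation(zip_path: Path, generations_dir: Path) -> dict:
    """Extract a generation export, assign a fresh ID, and copy the audio in."""
    with zipfile.ZipFile(zip_path) as zf:
        # Steps 1-2: extract and validate the metadata
        meta = json.loads(zf.read("generation.json"))
        if "text" not in meta:
            raise ValueError("invalid export: missing text")
        # Step 3: never reuse the exporter's ID
        new_id = str(uuid.uuid4())
        # Step 4: copy the audio into the generations directory
        dest = generations_dir / f"{new_id}.wav"
        with zf.open("audio.wav") as src, open(dest, "wb") as out:
            shutil.copyfileobj(src, out)
    # Step 5: the caller persists this as the new database record
    meta["id"], meta["audio_path"] = new_id, str(dest)
    return meta
```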
## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/history` | List generations with filters |
| GET | `/history/stats` | Get aggregate statistics |
| GET | `/history/{id}` | Get a generation by ID |
| DELETE | `/history/{id}` | Delete a generation |
| GET | `/history/{id}/export` | Export as ZIP |
| GET | `/history/{id}/export-audio` | Export audio only |
| POST | `/history/import` | Import from ZIP |

### Query Parameters

```
GET /history?profile_id=uuid&search=hello&limit=50&offset=0
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `profile_id` | string | null | Filter by profile |
| `search` | string | null | Search in text |
| `limit` | int | 50 | Results per page |
| `offset` | int | 0 | Pagination offset |

### Response Schema

```json
{
  "items": [
    {
      "id": "uuid",
      "profile_id": "uuid",
      "profile_name": "My Voice",
      "text": "Hello world",
      "language": "en",
      "audio_path": "/path/to/audio.wav",
      "duration": 1.5,
      "seed": 42,
      "instruct": null,
      "engine": "qwen",
      "model_size": "1.7B",
      "status": "completed",
      "error": null,
      "is_favorited": false,
      "created_at": "2026-04-18T10:30:00Z"
    }
  ],
  "total": 150
}
```

## Usage in Stories

Generations can be added to stories for multi-voice narratives. The story system references generations by ID:

```python
class StoryItem(Base):
    generation_id = Column(String, ForeignKey("generations.id"))
```

This allows the same generation to be reused across multiple stories without duplicating audio files.

## Storage Considerations

### Disk Usage

Each generation creates a WAV file. For a 10-second mono, 16-bit clip at 24 kHz:

- ~480 KB per file (24,000 samples/s × 2 bytes × 10 s)

### Cleanup Strategy

Consider implementing:

- Automatic cleanup of old generations
- A storage quota per profile
- Compression for archival

20
docs/content/docs/developer/meta.json
Normal file
@@ -0,0 +1,20 @@
{
  "title": "Developer",
  "defaultOpen": true,
  "pages": [
    "setup",
    "architecture",
    "contributing",
    "building",
    "autoupdater",
    "voice-profiles",
    "tts-generation",
    "tts-engines",
    "effects-pipeline",
    "history",
    "stories",
    "transcription",
    "audio-channels",
    "model-management"
  ]
}
199
docs/content/docs/developer/model-management.mdx
Normal file
@@ -0,0 +1,199 @@
---
title: "Model Management"
description: "How model downloading, loading, and status tracking works across all engines"
---

## Overview

Voicebox manages two categories of models:

**TTS Models** — seven engines covering zero-shot cloning and preset voices. Each engine may have one or more size variants.

**ASR Models** — Whisper for transcription. Five sizes, plus MLX-Whisper on Apple Silicon for roughly 8× faster transcription.

Every model is described by a `ModelConfig` entry in `backend/backends/__init__.py`. Models are downloaded from HuggingFace Hub on first use and cached in the platform-standard HF cache.

## Available TTS Models

| Model | Engine | HuggingFace Repo | Size | VRAM | Languages |
|-------|--------|------------------|------|------|-----------|
| **Qwen TTS 1.7B** | `qwen` | `Qwen/Qwen3-TTS-12Hz-1.7B-Base` | 3.5 GB | ~6 GB | 10 |
| **Qwen TTS 0.6B** | `qwen` | `Qwen/Qwen3-TTS-12Hz-0.6B-Base` | 1.2 GB | ~2 GB | 10 |
| **Qwen CustomVoice 1.7B** | `qwen_custom_voice` | `Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice` | 3.5 GB | ~6 GB | 10 |
| **Qwen CustomVoice 0.6B** | `qwen_custom_voice` | `Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice` | 1.2 GB | ~2 GB | 10 |
| **LuxTTS** | `luxtts` | `YatharthS/LuxTTS` | 300 MB | ~1 GB | English |
| **Chatterbox Multilingual** | `chatterbox` | `ResembleAI/chatterbox` | 3.2 GB | ~3 GB | 23 |
| **Chatterbox Turbo** | `chatterbox_turbo` | `ResembleAI/chatterbox-turbo` | 1.5 GB | ~1.5 GB | English |
| **TADA 1B** | `tada` | `HumeAI/tada-1b` | 4 GB | ~4 GB | English |
| **TADA 3B Multilingual** | `tada` | `HumeAI/tada-3b-ml` | 8 GB | ~8 GB | 10 |
| **Kokoro 82M** | `kokoro` | `hexgrad/Kokoro-82M` | 350 MB | ~150 MB | 8 |

On Apple Silicon, Qwen TTS uses MLX-optimized repos from `mlx-community` instead of the PyTorch repos. The backend picks automatically via `get_backend_type()`.

## Available Whisper Models

| Model | HuggingFace Repo | Size |
|-------|------------------|------|
| **Whisper Base** | `openai/whisper-base` | ~300 MB |
| **Whisper Small** | `openai/whisper-small` | ~500 MB |
| **Whisper Medium** | `openai/whisper-medium` | ~1.5 GB |
| **Whisper Large** | `openai/whisper-large-v3` | ~3 GB |
| **Whisper Turbo** | `openai/whisper-large-v3-turbo` | ~1.5 GB |

On Apple Silicon, MLX-Whisper is preferred automatically — see [Transcription](/developer/transcription).

## Model Storage

Models live in the platform HuggingFace cache:

| Platform | Path |
|----------|------|
| macOS | `~/.cache/huggingface/hub/` |
| Linux | `~/.cache/huggingface/hub/` |
| Windows | `%USERPROFILE%\.cache\huggingface\hub\` |
| Docker | `/home/voicebox/.cache/huggingface/hub` (volume-mounted) |

Set `VOICEBOX_MODELS_DIR` to override the location.

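The override can be resolved with a small helper. This is a sketch (the helper name is hypothetical; the real lookup lives in the backend), assuming only the default cache path and the documented `VOICEBOX_MODELS_DIR` variable:

```python
import os
from pathlib import Path


def models_dir() -> Path:
    """Resolve the model cache directory, honoring VOICEBOX_MODELS_DIR."""
    override = os.environ.get("VOICEBOX_MODELS_DIR")
    if override:
        return Path(override)
    # Fall back to the platform-standard HuggingFace hub cache
    return Path.home() / ".cache" / "huggingface" / "hub"
```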
## Progress Tracking

Downloads stream progress to the frontend via Server-Sent Events. The progress pipeline has three pieces:

**`ProgressManager`** (`backend/utils/progress.py`) — an in-memory map of `model_name → {current, total, filename, status}`.

**`HFProgressTracker`** — a context manager that intercepts HuggingFace Hub downloads to emit byte-level progress. Needed because `huggingface_hub` silently disables tqdm in frozen PyInstaller builds.

**SSE endpoint** — `GET /models/progress/{model_name}` streams updates until `status` is `complete` or `error`.

```typescript
// Frontend: subscribe to download progress over SSE
const eventSource = new EventSource(`/models/progress/${modelName}`);
eventSource.onmessage = (event) => {
  const { current, total, status } = JSON.parse(event.data);
  updateProgressBar(current / total);
  if (status === "complete") eventSource.close();
};
```

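On the server side, an SSE stream is just text events. A minimal sketch of the formatting and stop condition (the helper names are hypothetical; the real endpoint is a FastAPI streaming route):

```python
import json


def sse_event(progress: dict) -> str:
    """Format one progress update as a Server-Sent Events message."""
    return f"data: {json.dumps(progress)}\n\n"


def stream_progress(updates):
    """Yield SSE messages until the download completes or errors."""
    for update in updates:
        yield sse_event(update)
        if update.get("status") in ("complete", "error"):
            # Terminal status: close the stream, mirroring the endpoint's behavior
            return
```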
## Model Status

`GET /models/status` returns the current state of every registered model:

```json
{
  "models": [
    {
      "model_name": "qwen-tts-1.7B",
      "display_name": "Qwen TTS 1.7B",
      "engine": "qwen",
      "downloaded": true,
      "size_mb": 3500,
      "loaded": true
    },
    ...
  ]
}
```

The handler iterates `get_all_model_configs()` and calls `check_model_loaded(config)` for each entry, so new engines appear automatically once they're registered via `ModelConfig`.

## Manual Model Operations

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/models/status` | Status of every registered model |
| POST | `/models/load` | Load a TTS model into memory |
| POST | `/models/unload` | Unload a TTS model from memory |
| POST | `/models/download` | Trigger a background download |
| GET | `/models/progress/{name}` | Stream download progress (SSE) |
| DELETE | `/models/{name}` | Delete a downloaded model from the cache |

### Load

```http
POST /models/load
{
  "model_name": "qwen-tts-1.7B"
}
```

The route looks up the config, dispatches to `get_model_load_func(config)`, and returns once the model is ready.

### Unload

```http
POST /models/unload
{
  "model_name": "chatterbox-tts"
}
```

Calls `unload_model_by_config(config)`, which routes to the right backend's `unload_model()` and frees GPU memory.

### Download

```http
POST /models/download
{
  "model_name": "kokoro"
}
```

Fires off an async download task; progress is available via the SSE endpoint. Downloads are triggered automatically on first generation, so this endpoint is only needed for pre-warming.

## Preset Voice Seeding

For engines that use preset voices (Kokoro, Qwen CustomVoice), the backend auto-creates a voice profile per preset voice after the model is downloaded. This is driven by `seed_preset_profiles(engine)` in `backend/services/profiles.py`, called from the models route once the download completes.

Preset profiles have:

- `voice_type = "preset"`
- `preset_engine` — the engine name (`"kokoro"`, `"qwen_custom_voice"`)
- `preset_voice_id` — the engine-specific voice ID (`"am_adam"`, `"f000001"`, etc.)
- No `profile_samples` rows — there is no audio to store

See [Voice Profiles](/developer/voice-profiles) for the schema.

## Adding a New Model

To add a new size variant of an existing engine, just add another `ModelConfig`:

```python
ModelConfig(
    model_name="qwen-tts-3B",
    display_name="Qwen TTS 3B",
    engine="qwen",
    hf_repo_id="Qwen/Qwen3-TTS-12Hz-3B-Base",
    model_size="3B",
    size_mb=7000,
    languages=["zh", "en", ...],
),
```

The frontend picks it up via `/models/status`; the download/load flow works without further changes.

Adding a whole new engine is a bigger lift — see [TTS Engines](/developer/tts-engines) for the full phased workflow.

## Error Handling

| Error | Cause | Fix |
|-------|-------|-----|
| Download failed | Network issue / HF rate limit | Retry |
| OOM on load | Not enough VRAM | Use a smaller variant, unload other engines |
| Model not found | Corrupt cache | Re-download via `/models/download` |
| Stuck progress bar in frozen build | `huggingface_hub` tqdm silenced | `HFProgressTracker` force-enables the internal counter |
| GPU architecture unsupported | PyTorch wheel doesn't target your GPU | See [GPU Acceleration](/overview/gpu-acceleration) |

## Next Steps

<Cards>
<Card title="TTS Generation" href="/developer/tts-generation">
How generation flows through the registry
</Card>
<Card title="TTS Engines" href="/developer/tts-engines">
Add a new engine end-to-end
</Card>
<Card title="Transcription" href="/developer/transcription">
Whisper and MLX-Whisper integration
</Card>
</Cards>

299
docs/content/docs/developer/setup.mdx
Normal file
@@ -0,0 +1,299 @@
---
title: "Development Setup"
description: "Set up your local development environment for Voicebox"
---

## Quick Setup (Recommended)

Get started in two commands:

```bash
# Clone and enter the repository
git clone https://github.com/jamiepine/voicebox.git
cd voicebox

# Set up everything (Python venv, JS deps, dev sidecar)
just setup

# Start development (backend + desktop app)
just dev
```

The `just dev` command automatically starts the Python backend (if it isn't already running) and launches the Tauri desktop app.

## Prerequisites

Ensure you have these installed:

<Cards>
<Card title="Bun" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="m7.5 4.27 9 5.15"/><path d="M21 8a2 2 0 0 0-1-1.73l-7-4a2 2 0 0 0-2 0l-7 4A2 2 0 0 0 3 8v8a2 2 0 0 0 1 1.73l7 4a2 2 0 0 0 2 0l7-4A2 2 0 0 0 21 16Z"/><path d="m3.3 7 8.7 5 8.7-5"/><path d="M12 22V12"/></svg>}>
[Download Bun](https://bun.sh)
```bash
curl -fsSL https://bun.sh/install | bash
```
</Card>
<Card title="Python 3.11+" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M12 2L2 7l10 5 10-5-10-5z"/><path d="M2 17l10 5 10-5"/><path d="M2 12l10 5 10-5"/></svg>}>
[Download Python](https://python.org)
```bash
python --version
```
</Card>
<Card title="Rust" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="m6 9 6 6 6-6"/></svg>}>
[Install Rust](https://rustup.rs)
```bash
rustc --version
```
</Card>
<Card title="Just" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M4 7V4h3"/><path d="M7 4h14v6h-2V6H7V4Z"/><path d="M4 10v10h16V10H4Z"/></svg>}>
[Install Just](https://github.com/casey/just)
```bash
brew install just    # macOS
cargo install just   # Linux/Windows
```
</Card>
</Cards>

<Callout type="info">
`just` works on macOS, Linux, and Windows.
</Callout>

## Just Commands

Run `just --list` to see all available commands. Highlights:

### Setup

| Command | Description |
|---------|-------------|
| `just setup` | Full setup (Python venv + JS deps + dev sidecar). Detects Apple Silicon for MLX and NVIDIA/Intel Arc on Windows for accelerated PyTorch. |
| `just setup-python` | Python venv + dependencies only |
| `just setup-js` | `bun install` only |

### Development

| Command | Description |
|---------|-------------|
| `just dev` | Start backend + Tauri desktop app (reuses a running backend if one exists) |
| `just dev-web` | Start backend + web app (no Tauri/Rust build) |
| `just dev-backend` | Backend only |
| `just dev-frontend` | Tauri app only (backend must already be running) |
| `just kill` | Stop all dev processes |

### Build

| Command | Description |
|---------|-------------|
| `just build` | CPU server binary + Tauri installer |
| `just build-local` | **Windows:** CPU + CUDA server binaries + Tauri installer |
| `just build-server` | CPU server binary only |
| `just build-server-cuda` | **Windows:** CUDA server binary only, placed in `%APPDATA%/sh.voicebox.app/backends/cuda` for local testing |
| `just build-tauri` | Tauri app only |
| `just build-web` | Web app only |

### Quality

| Command | Description |
|---------|-------------|
| `just check` | Lint + format + typecheck (Biome + ruff) |
| `just fix` | Auto-fix lint + format issues |
| `just lint` / `just format` | Lint or format only |
| `just test` | Run Python tests (pytest) |
| `just test-models` | End-to-end generation against every TTS engine using the frozen binary |

### Database

| Command | Description |
|---------|-------------|
| `just db-init` | Initialize SQLite database |
| `just db-reset` | Delete and reinitialize the database |

### Utilities

| Command | Description |
|---------|-------------|
| `just generate-api` | Generate TypeScript API client from the backend's OpenAPI schema |
| `just docs` | Open `http://localhost:17493/docs` in your browser |
| `just logs` | Tail backend logs |
| `just clean` | Remove build artifacts |
| `just clean-python` | Remove the Python venv + `__pycache__` |
| `just clean-all` | Nuclear clean (includes all `node_modules`) |
## Project Structure

<Files>
<Folder name="voicebox" defaultOpen>
<Folder name="app">
<Folder name="src">
<File name="components/" />
<File name="lib/" />
<File name="hooks/" />
</Folder>
</Folder>
<Folder name="backend">
<File name="app.py" />
<File name="main.py" />
<File name="config.py" />
<File name="models.py" />
<File name="server.py" />
<Folder name="routes">
<File name="..." />
</Folder>
<Folder name="services">
<File name="..." />
</Folder>
<Folder name="backends">
<File name="..." />
</Folder>
<Folder name="database">
<File name="..." />
</Folder>
<Folder name="utils">
<File name="..." />
</Folder>
</Folder>
<Folder name="tauri">
<Folder name="src-tauri" />
</Folder>
<Folder name="web" />
<Folder name="scripts" />
</Folder>
</Files>

### Request Flow

HTTP request → **routes/** (validate input) → **services/** (business logic) → **backends/** (TTS/STT inference) → **utils/** (audio processing)
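As a toy illustration of that layering (all function names below are invented for the sketch, not the real Voicebox modules):

```python
# Minimal sketch of the route → service → backend → utils flow.
# Every name here is illustrative, not actual Voicebox code.

def validate_request(payload: dict) -> dict:
    # routes/: reject bad input before any work happens
    if not payload.get("text"):
        raise ValueError("text is required")
    return payload

def run_inference(text: str) -> list[float]:
    # backends/: stand-in for a TTS engine producing raw samples
    return [0.0] * len(text)

def postprocess(samples: list[float]) -> bytes:
    # utils/: audio post-processing (here: trivial byte packing)
    return bytes(len(samples))

def generate(payload: dict) -> bytes:
    # services/: business logic tying the layers together
    data = validate_request(payload)
    samples = run_inference(data["text"])
    return postprocess(samples)
```

Each layer only calls downward, which is what lets new engines plug in at the `backends/` level without touching routes or services.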
### Key Modules

- **app.py** — FastAPI app factory, CORS, lifecycle events
- **main.py** — Entry point (imports app, runs uvicorn)
- **server.py** — Tauri sidecar launcher, parent-pid watchdog
- **services/generation.py** — Single function handling all generation modes
- **backends/** — TTS/STT engine implementations (MLX, PyTorch, etc.)

## Model Downloads

Models are automatically downloaded from HuggingFace Hub on first use, with live progress streamed to the UI:

- **Whisper** (transcription) — auto-downloads on first transcription
- **TTS engines** — auto-download on first generation. Sizes range from 82M parameters (Kokoro, ~350 MB) to 3B parameters (TADA, ~8 GB)

See [Model Management](/developer/model-management) for the full list.

<Callout type="warn">
First-time usage is slower due to model downloads; subsequent runs use the cached models.
</Callout>
## Generate OpenAPI Client

After starting the backend server, generate the TypeScript API client:

```bash
just generate-api
```

This downloads the OpenAPI schema and generates the TypeScript client in `app/src/lib/api/`.

## Manual Setup (Advanced)

If you prefer not to use Just, follow these manual steps:
### 1. Install JavaScript Dependencies

```bash
bun install
```

This installs dependencies for:
- `app/` - Shared React frontend
- `tauri/` - Tauri desktop wrapper
- `web/` - Web deployment wrapper

### 2. Set Up Python Backend

```bash
cd backend

# Create virtual environment
python -m venv venv

# Activate virtual environment
source venv/bin/activate  # macOS/Linux
# or
venv\Scripts\activate     # Windows

# Install Python dependencies
pip install -r requirements.txt

# Apple Silicon: install MLX dependencies
pip install -r requirements-mlx.txt

# Chatterbox pins numpy<1.26 / torch==2.6, which break on Python 3.12+
pip install --no-deps chatterbox-tts

# HumeAI TADA pins torch>=2.7,<2.8, which conflicts with our torch>=2.1
pip install --no-deps hume-tada

# Install Qwen3-TTS from source
pip install git+https://github.com/QwenLM/Qwen3-TTS.git

# PyInstaller and linting tools
pip install pyinstaller ruff pytest pytest-asyncio
```
### 3. Start Development

Start the backend:

```bash
cd backend
source venv/bin/activate
uvicorn main:app --reload --port 17493
```

In a new terminal, start the desktop app:

```bash
cd tauri
bun run tauri dev
```
## Next Steps

<Cards>
<Card title="Architecture" href="/developer/architecture">
Understand the system architecture
</Card>
<Card title="Contributing" href="/developer/contributing">
Read the contribution guidelines
</Card>
<Card title="Building" href="/developer/building">
Learn how to build production releases
</Card>
<Card title="TTS Engines" href="/developer/tts-engines">
Add a new TTS engine end-to-end
</Card>
</Cards>
## Troubleshooting

<AccordionGroup>
<Accordion title="Backend won't start">
- Check the Python version (must be 3.11+)
- Ensure the virtual environment is activated: `source backend/venv/bin/activate`
- Verify all dependencies are installed: `pip install -r requirements.txt`
- Check that port 17493 is available
</Accordion>

<Accordion title="Tauri build fails">
- Ensure Rust is installed: `rustc --version`
- Clean the build: `cd tauri/src-tauri && cargo clean`
- Try rebuilding: `just dev`
</Accordion>

<Accordion title="OpenAPI client generation fails">
- Ensure the backend is running: `curl http://localhost:17493/openapi.json`
- Check network connectivity
- Verify the backend is accessible at localhost:17493
</Accordion>
</AccordionGroup>

See the full [Troubleshooting Guide](/overview/troubleshooting) for more issues and solutions.
301
docs/content/docs/developer/stories.mdx
Normal file
@@ -0,0 +1,301 @@
---
title: "Stories & Timeline"
description: "How the multi-voice timeline editor works in Voicebox"
---

## Overview

Stories allow users to arrange multiple voice generations on a timeline to create multi-voice narratives. The system supports tracks, trimming, splitting, and audio mixing.

## Architecture

**Story:** A container that holds story items with metadata.

**Story Item:** Links a generation to a story with timeline position, track, and trim data.

**Export:** Combines all items into a single mixed audio file.
## Data Model

### Story Table

```python
from sqlalchemy import Column, DateTime, String, Text

class Story(Base):
    __tablename__ = "stories"

    id = Column(String, primary_key=True)
    name = Column(String, nullable=False)
    description = Column(Text)
    created_at = Column(DateTime)
    updated_at = Column(DateTime)
```

### StoryItem Table

```python
from sqlalchemy import Column, DateTime, ForeignKey, Integer, String

class StoryItem(Base):
    __tablename__ = "story_items"

    id = Column(String, primary_key=True)
    story_id = Column(String, ForeignKey("stories.id"))
    generation_id = Column(String, ForeignKey("generations.id"))
    start_time_ms = Column(Integer, default=0)  # Timeline position
    track = Column(Integer, default=0)          # Track number
    trim_start_ms = Column(Integer, default=0)  # Trim from start
    trim_end_ms = Column(Integer, default=0)    # Trim from end
    created_at = Column(DateTime)
```
## Timeline Concepts

### Start Time

`start_time_ms` is the absolute position on the timeline where an item begins playing. Items on the same track cannot overlap; items on different tracks can.

### Tracks

A `track` is an integer (0-indexed) that identifies the horizontal row an item sits on. Audio on separate tracks plays concurrently, so tracks are the primary way to layer multiple voices or sound effects.

### Trimming

`trim_start_ms` and `trim_end_ms` hide the leading/trailing portions of the source generation without modifying the underlying audio file. The effective playback length is `generation.duration * 1000 - trim_start_ms - trim_end_ms`. Trimming is non-destructive — the same generation can be trimmed differently in different stories.
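A quick worked example of that formula (standalone sketch, not Voicebox code):

```python
# Effective playback length of a trimmed story item, per the formula above.
def effective_duration_ms(duration_s: float, trim_start_ms: int, trim_end_ms: int) -> int:
    return int(duration_s * 1000) - trim_start_ms - trim_end_ms

# A 2.5 s generation trimmed by 200 ms at the start and 100 ms at the end
print(effective_duration_ms(2.5, 200, 100))  # 2200
```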
## Core Operations

### Adding Items

When adding a generation to a story:

```python
async def add_item_to_story(
    story_id: str,
    data: StoryItemCreate,
    db: Session,
) -> StoryItemDetail:
    # Use the caller's position if given, otherwise append to the end
    start_time_ms = data.start_time_ms
    if start_time_ms is None:
        existing_items = get_items_with_durations(story_id, db)
        if existing_items:
            # Find the end of all existing items
            max_end_time_ms = max(
                item.start_time_ms + int(gen.duration * 1000)
                for item, gen in existing_items
            )
            start_time_ms = max_end_time_ms + 200  # 200ms gap
        else:
            start_time_ms = 0

    # Create the item
    item = DBStoryItem(
        id=str(uuid.uuid4()),
        story_id=story_id,
        generation_id=data.generation_id,
        start_time_ms=start_time_ms,
        track=data.track or 0,
    )
    db.add(item)
    db.commit()
```
### Moving Items

Update position and/or track:

```python
async def move_story_item(
    story_id: str,
    item_id: str,
    data: StoryItemMove,
    db: Session,
) -> StoryItemDetail:
    item = get_item(story_id, item_id, db)

    item.start_time_ms = data.start_time_ms
    item.track = data.track

    db.commit()
```
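Moving an item can violate the same-track no-overlap rule from the Timeline Concepts section. A hypothetical validation helper a move endpoint could run first (names invented for the sketch):

```python
def overlaps(a_start: int, a_len: int, b_start: int, b_len: int) -> bool:
    # Half-open intervals [start, start + len) overlap iff each starts
    # before the other ends.
    return a_start < b_start + b_len and b_start < a_start + a_len

def can_place(items, track: int, start_ms: int, length_ms: int) -> bool:
    # items: (track, start_ms, length_ms) tuples already on the timeline.
    # Only items on the same track can conflict; other tracks may overlap.
    return all(
        t != track or not overlaps(start_ms, length_ms, s, l)
        for t, s, l in items
    )
```

Lengths here are the *effective* (trimmed) durations, since that is what actually occupies the timeline.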
### Trimming Items

Non-destructive trimming:

```python
async def trim_story_item(
    story_id: str,
    item_id: str,
    data: StoryItemTrim,
    db: Session,
) -> StoryItemDetail:
    item = get_item(story_id, item_id, db)
    generation = get_generation(item.generation_id, db)

    # Validate trim doesn't exceed duration
    max_duration_ms = int(generation.duration * 1000)
    if data.trim_start_ms + data.trim_end_ms >= max_duration_ms:
        raise ValueError("Trim exceeds generation duration")

    item.trim_start_ms = data.trim_start_ms
    item.trim_end_ms = data.trim_end_ms

    db.commit()
```
### Splitting Items

Split one item into two at a specific time:

```python
async def split_story_item(
    story_id: str,
    item_id: str,
    data: StoryItemSplit,
    db: Session,
) -> List[StoryItemDetail]:
    item = get_item(story_id, item_id, db)
    generation = get_generation(item.generation_id, db)

    # Calculate split point (split_time_ms is relative to the trimmed audio)
    current_trim_start = item.trim_start_ms
    current_trim_end = item.trim_end_ms
    original_duration_ms = int(generation.duration * 1000)
    absolute_split_ms = current_trim_start + data.split_time_ms

    # Update original: trim from end
    item.trim_end_ms = original_duration_ms - absolute_split_ms

    # Create new item: trim from start
    new_item = DBStoryItem(
        id=str(uuid.uuid4()),
        story_id=story_id,
        generation_id=item.generation_id,  # Same generation
        start_time_ms=item.start_time_ms + data.split_time_ms,
        track=item.track,
        trim_start_ms=absolute_split_ms,
        trim_end_ms=current_trim_end,
    )

    db.add(new_item)
    db.commit()

    return [item, new_item]
```
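A useful sanity check on that arithmetic: the two resulting segments' effective lengths must sum to the original item's effective length. A standalone recreation of the trim math (field semantics as in the snippet above):

```python
def split_trims(duration_ms: int, trim_start: int, trim_end: int, split_at: int):
    # split_at is relative to the trimmed (visible) audio, as in the endpoint
    absolute_split = trim_start + split_at
    first = (trim_start, duration_ms - absolute_split)  # original item's new trims
    second = (absolute_split, trim_end)                 # new item's trims
    return first, second

def effective(duration_ms: int, trim_start: int, trim_end: int) -> int:
    return duration_ms - trim_start - trim_end

# 5 s generation, trimmed 200 ms / 300 ms, split 1.5 s into the visible audio:
first, second = split_trims(5000, 200, 300, 1500)
# first plays 1500 ms, second plays 3000 ms — together the original 4500 ms
```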
### Duplicating Items

Create a copy with all properties:

```python
async def duplicate_story_item(
    story_id: str,
    item_id: str,
    db: Session,
) -> StoryItemDetail:
    original = get_item(story_id, item_id, db)
    generation = get_generation(original.generation_id, db)

    # Calculate effective duration for positioning
    effective_duration_ms = (
        int(generation.duration * 1000)
        - original.trim_start_ms
        - original.trim_end_ms
    )

    # Place copy after original with 200ms gap
    new_item = DBStoryItem(
        id=str(uuid.uuid4()),
        story_id=story_id,
        generation_id=original.generation_id,
        start_time_ms=original.start_time_ms + effective_duration_ms + 200,
        track=original.track,
        trim_start_ms=original.trim_start_ms,
        trim_end_ms=original.trim_end_ms,
    )

    db.add(new_item)
    db.commit()
```
## Audio Export

### Mixing Algorithm

The export function mixes all items into a single audio file:

```python
async def export_story_audio(story_id: str, db: Session) -> bytes:
    items = get_all_items_with_generations(story_id, db)

    # Load each item's audio, apply its trim, and record its timeline data.
    # Assumes all generations share one sample rate (resample first if not).
    audio_data = []
    for item, generation in items:
        audio, sample_rate = load_audio(generation.audio_path)
        trim_start = int((item.trim_start_ms / 1000.0) * sample_rate)
        trim_end = int((item.trim_end_ms / 1000.0) * sample_rate)
        trimmed = audio[trim_start:len(audio) - trim_end] if trim_end else audio[trim_start:]
        audio_data.append({
            'start_time_ms': item.start_time_ms,
            'duration_ms': int(len(trimmed) / sample_rate * 1000),
            'audio': trimmed,
        })

    # Calculate total duration
    max_end_time_ms = max(
        data['start_time_ms'] + data['duration_ms']
        for data in audio_data
    )

    # Create output buffer
    total_samples = int((max_end_time_ms / 1000.0) * sample_rate)
    final_audio = np.zeros(total_samples, dtype=np.float32)

    # Mix each trimmed item at its position (overlapping items sum together)
    for data in audio_data:
        start_sample = int((data['start_time_ms'] / 1000.0) * sample_rate)
        final_audio[start_sample:start_sample + len(data['audio'])] += data['audio']

    # Normalize to prevent clipping
    max_val = np.abs(final_audio).max()
    if max_val > 1.0:
        final_audio = final_audio / max_val

    return audio_to_bytes(final_audio, sample_rate)
```
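The additive mix-and-normalize step can be seen in isolation with NumPy (sample values invented for illustration):

```python
import numpy as np

final = np.zeros(8, dtype=np.float32)
a = np.full(4, 0.8, dtype=np.float32)  # item starting at sample 0
b = np.full(4, 0.6, dtype=np.float32)  # item starting at sample 2 (overlaps a)

# Overlapping items sum together
final[0:4] += a
final[2:6] += b

# The overlap sums to 1.4, which would clip; peak-normalize it away
peak = np.abs(final).max()
if peak > 1.0:
    final = final / peak
```

Peak normalization rescales everything uniformly, so the relative loudness of the tracks is preserved while the hottest sample lands exactly at full scale.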
## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/stories` | List all stories |
| POST | `/stories` | Create a story |
| GET | `/stories/{id}` | Get story with items |
| PUT | `/stories/{id}` | Update story metadata |
| DELETE | `/stories/{id}` | Delete story |
| POST | `/stories/{id}/items` | Add item to story |
| DELETE | `/stories/{id}/items/{item_id}` | Remove item |
| PUT | `/stories/{id}/items/{item_id}/move` | Move item |
| PUT | `/stories/{id}/items/{item_id}/trim` | Trim item |
| POST | `/stories/{id}/items/{item_id}/split` | Split item |
| POST | `/stories/{id}/items/{item_id}/duplicate` | Duplicate item |
| PUT | `/stories/{id}/items/times` | Batch update times |
| PUT | `/stories/{id}/items/reorder` | Reorder items |
| GET | `/stories/{id}/export-audio` | Export mixed audio |
## Response Schemas

### StoryItemDetail

```json
{
  "id": "item_uuid",
  "story_id": "story_uuid",
  "generation_id": "generation_uuid",
  "start_time_ms": 1500,
  "track": 0,
  "trim_start_ms": 200,
  "trim_end_ms": 100,
  "profile_id": "profile_uuid",
  "profile_name": "Narrator",
  "text": "Hello world",
  "audio_path": "/path/to/audio.wav",
  "duration": 2.5,
  "created_at": "2024-01-15T10:30:00Z"
}
```

## Frontend Integration

The timeline UI needs to:

1. **Fetch story** with all items
2. **Render waveforms** for each item
3. **Handle drag/drop** to move items
4. **Handle edge drag** for trimming
5. **Sync playhead** across all tracks
6. **Export** when the user clicks download
160
docs/content/docs/developer/transcription.mdx
Normal file
@@ -0,0 +1,160 @@
---
title: "Transcription"
description: "How Whisper-based audio transcription works in Voicebox"
---

## Overview

Voicebox uses OpenAI's Whisper for automatic speech recognition (ASR). Transcription powers two flows:

1. **Reference-text auto-fill** — when a user records or uploads a voice sample, the backend transcribes it and populates the `reference_text` field so cloning can use it.
2. **On-demand transcription** — a user-facing `/transcribe` endpoint for arbitrary audio.

On Apple Silicon, the transcription path runs through **MLX-Whisper** (from `mlx-audio`) for ~8× faster inference than PyTorch. Everywhere else it runs through PyTorch's `transformers` Whisper.

## Architecture

Transcription is wired through the same backend abstraction as TTS. The `STTBackend` protocol lives in `backend/backends/__init__.py`:

```python
@runtime_checkable
class STTBackend(Protocol):
    async def load_model(self, model_size: str) -> None: ...
    async def transcribe(
        self,
        audio_path: str,
        language: Optional[str] = None,
        model_size: Optional[str] = None,
    ) -> str: ...
    def unload_model(self) -> None: ...
    def is_loaded(self) -> bool: ...
```

Two implementations ship today:

- **`MLXSTTBackend`** (`backends/mlx_backend.py`) — uses `mlx_audio.stt.load()`. Default on Apple Silicon.
- **`PyTorchSTTBackend`** (`backends/pytorch_backend.py`) — uses `transformers.WhisperForConditionalGeneration`. Default everywhere else.

`get_stt_backend()` picks the right one based on `get_backend_type()`. `backend/services/transcribe.py` is a thin wrapper that delegates to the backend.
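Because the protocol is `runtime_checkable`, any class with matching methods conforms — no inheritance needed. A stub backend (hypothetical, e.g. for tests or as a skeleton for a new engine; the protocol is repeated so the snippet is self-contained):

```python
from typing import Optional, Protocol, runtime_checkable

@runtime_checkable
class STTBackend(Protocol):
    async def load_model(self, model_size: str) -> None: ...
    async def transcribe(
        self,
        audio_path: str,
        language: Optional[str] = None,
        model_size: Optional[str] = None,
    ) -> str: ...
    def unload_model(self) -> None: ...
    def is_loaded(self) -> bool: ...

class FakeSTTBackend:
    """Stub engine: satisfies the protocol without loading anything."""

    def __init__(self) -> None:
        self._loaded = False

    async def load_model(self, model_size: str) -> None:
        self._loaded = True

    async def transcribe(self, audio_path, language=None, model_size=None) -> str:
        return f"(transcript of {audio_path})"

    def unload_model(self) -> None:
        self._loaded = False

    def is_loaded(self) -> bool:
        return self._loaded
```

Note that `isinstance(obj, STTBackend)` only checks that the methods exist, not their signatures, so it is a smoke test rather than a full conformance check.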
## Model Sizes

Five Whisper variants are registered in `ModelConfig`:

| Model | HuggingFace Repo | Size | Notes |
|-------|------------------|------|-------|
| **Base** | `openai/whisper-base` | ~300 MB | Default; fast, decent quality |
| **Small** | `openai/whisper-small` | ~500 MB | Better quality, still fast |
| **Medium** | `openai/whisper-medium` | ~1.5 GB | High quality |
| **Large** | `openai/whisper-large-v3` | ~3 GB | Best quality, slow on CPU |
| **Turbo** | `openai/whisper-large-v3-turbo` | ~1.5 GB | Large-tier quality, ~5× faster than Large |

The `tiny` model is **not** exposed — the quality gap to `base` wasn't worth the download.

`Turbo` + MLX-Whisper on Apple Silicon dropped user-facing transcription latency from ~20s to ~2-3s in v0.1.10.
## Language Hints

Whisper can auto-detect language, but providing a hint improves accuracy on short clips:

```python
text = await backend.transcribe(audio_path, language="en")
```

Accepted language codes are the standard Whisper set (99+ languages). The frontend typically passes the profile's language if available, or lets Whisper detect otherwise.
## Model Loading

Both backends are lazy: the model is loaded on first use and cached in memory. Switching sizes unloads the previous model.

On MLX, the model is loaded via `mlx_audio.stt.load(hf_repo)`. On PyTorch, via:

```python
WhisperProcessor.from_pretrained(hf_repo)
WhisperForConditionalGeneration.from_pretrained(hf_repo).to(device)
```

Both load paths use `model_load_progress()` from `backends/base.py` so the frontend sees live download progress on first use.

## Audio Preprocessing

Whisper expects mono 16 kHz audio. The audio utility in `backend/utils/audio.py` handles resampling and format conversion transparently:

- **Formats:** WAV, MP3, FLAC, OGG, M4A (via soundfile / librosa)
- **Target:** mono, 16 kHz, float32

Files longer than Whisper's 30-second window are handled by the underlying library's chunking logic — no explicit splitting in Voicebox code.
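A minimal sketch of that conversion (illustrative only — the real helper in `backend/utils/audio.py` uses soundfile/librosa; this naive version resamples by linear interpolation):

```python
import numpy as np

TARGET_SR = 16_000

def to_whisper_input(audio: np.ndarray, sr: int) -> np.ndarray:
    # Stereo (samples, channels) → mono by averaging channels
    if audio.ndim == 2:
        audio = audio.mean(axis=1)
    # Naive linear-interpolation resample to 16 kHz
    if sr != TARGET_SR:
        n_out = int(len(audio) * TARGET_SR / sr)
        x_old = np.linspace(0.0, 1.0, num=len(audio), endpoint=False)
        x_new = np.linspace(0.0, 1.0, num=n_out, endpoint=False)
        audio = np.interp(x_new, x_old, audio)
    # np.interp returns float64; Whisper wants float32
    return audio.astype(np.float32)
```

A production pipeline would use a proper polyphase or sinc resampler (e.g. `librosa.resample`) instead of `np.interp`, but the shape/dtype contract is the same.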
## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/transcribe` | Transcribe an uploaded audio file |

### Request

Multipart form data:

```
POST /transcribe
Content-Type: multipart/form-data

file: <audio_file>
language: en      # optional
model_size: base  # optional (default: "base")
```

### Response

```json
{
  "text": "Hello, this is a test transcription.",
  "duration": 3.5
}
```
## Use Cases

### Reference Text for Voice Cloning

Adding a voice sample triggers transcription automatically:

1. User uploads or records audio.
2. The backend writes the audio file and calls `/transcribe` internally (or the frontend calls it separately).
3. The returned text becomes `reference_text` on the new `profile_samples` row.
4. Cloning engines that need reference text (Chatterbox, TADA, etc.) read it from there.

### Quality Tips

- Provide a language hint for short clips (under 5 seconds) — auto-detection is unreliable on so little audio.
- Use Turbo or Large for noisy audio — Base can hallucinate on hard inputs.
- Prefer clean audio; transcription errors become reference-text errors, which become cloning errors.
## Memory Management

`unload_model()` drops the model reference and clears the CUDA cache if applicable. `/models/unload` wires this up for manual control.

`get_stt_backend()` returns a singleton per backend — multiple callers share one Whisper instance.

## Error Handling

| Error | Cause | Solution |
|-------|-------|----------|
| Model not found | First run + network failure | Retry; check connectivity |
| OOM on load | Large model on low-VRAM GPU | Switch to Small or Turbo |
| Empty result | No speech in audio | Confirm the input has voice; check trim |
| Wrong language | Auto-detect misfired | Pass a `language` hint |
## Next Steps

<Cards>
<Card title="Model Management" href="/developer/model-management">
Download / load / unload any model
</Card>
<Card title="Voice Profiles" href="/developer/voice-profiles">
How reference text is stored alongside samples
</Card>
<Card title="GPU Acceleration" href="/overview/gpu-acceleration">
Platform-specific acceleration including MLX-Whisper
</Card>
</Cards>
703
docs/content/docs/developer/tts-engines.mdx
Normal file
@@ -0,0 +1,703 @@
---
title: "TTS Engines"
description: "How to add new text-to-speech engines to Voicebox"
---

> **For humans:** This doc is optimized for AI agents to implement new TTS engines autonomously. It's structured as a phased workflow with explicit gates and a checklist so an agent can do the full integration — dependency research, backend, frontend, bundling — and hand you a draft release or prod build to test locally. It's also a useful reference if you're doing it yourself.

Adding an engine touches ~10 files across 4 layers. The backend protocol work is straightforward — the real time sink is dependency hell, upstream library bugs, and PyInstaller bundling.

**Do not start writing code until you complete Phase 0.** The v0.2.3 release took three patch releases of PyInstaller fixes because dependency research was skipped. Every issue — `inspect.getsource()` failures, missing native data files, metadata lookups, dtype mismatches — was discoverable by reading the model library's source code before integration began.

## Architecture Overview

The backend is split into layers:

| Layer | Purpose | Files Touched |
|-------|---------|---------------|
| `routes/` | Thin HTTP handlers | None (auto-dispatch) |
| `services/` | Business logic | None (auto-dispatch) |
| `backends/` | Engine implementations | `your_engine_backend.py` |
| `utils/` | Shared utilities | As needed |

New engines only need to touch `backends/` and `models.py` on the backend side — the route and service layers use a model config registry that handles dispatch automatically.
## Phase 0: Dependency Research

**This phase is mandatory.** Clone the model library and its key dependencies into a temporary directory and inspect them before writing any integration code. The goal is to produce a dependency audit that identifies every PyInstaller-incompatible pattern, every native data file, and every upstream bug you'll need to work around.

### 0.1 Clone and Inspect the Model Library

```bash
# Create a throwaway workspace
mkdir /tmp/engine-research && cd /tmp/engine-research

# Clone the model library
git clone https://github.com/org/model-library.git
cd model-library
```

**Read these files first, in order:**

1. **`setup.py` / `setup.cfg` / `pyproject.toml`** — Check pinned dependency versions. If the library pins `torch==2.6.0` or `numpy<1.26`, you'll need `--no-deps` installation and manual sub-dependency listing (this is what happened with `chatterbox-tts`).

2. **`__init__.py` and the main model class** — Trace the import chain. Look for:
   - `from_pretrained()` — does it call `huggingface_hub` internally? Does it pass `token=True` (which crashes without a stored HF token)?
   - `from_local()` — does it exist? You may need manual `snapshot_download()` + `from_local()` to bypass download bugs.
   - Device handling — does it default to CUDA? Does it support MPS? Many libraries crash on MPS with unsupported operators.

3. **All `import` statements** — Recursively trace what the library imports. You're looking for:
   - `inspect.getsource()` anywhere in the chain (search all `.py` files)
   - `typeguard` / `@typechecked` decorators (these call `inspect.getsource()` at import time)
   - `importlib.metadata.version()` or `pkg_resources.get_distribution()` (need `--copy-metadata`)
   - `lazy_loader` (needs `--collect-all` to bundle `.pyi` stubs)
### 0.2 Scan for PyInstaller-Incompatible Patterns
|
||||
|
||||
Run these searches against the cloned library **and** its transitive dependencies:
|
||||
|
||||
```bash
|
||||
# inspect.getsource — will crash in frozen binary without --collect-all
|
||||
grep -r "inspect.getsource\|getsource(" .
|
||||
|
||||
# typeguard / @typechecked — calls inspect.getsource at import time
|
||||
grep -r "@typechecked\|from typeguard" .
|
||||
|
||||
# importlib.metadata — needs --copy-metadata
|
||||
grep -r "importlib.metadata\|pkg_resources.get_distribution\|pkg_resources.require" .
|
||||
|
||||
# Data files loaded at runtime — need --collect-all or --collect-data
|
||||
grep -r "Path(__file__).parent\|os.path.dirname(__file__)\|resources_path\|pkg_resources.resource_filename" .
|
||||
|
||||
# Native library paths — may need env var override in frozen builds
|
||||
grep -r "/usr/share\|/usr/lib\|/usr/local\|espeak\|phonemize" .
|
||||
|
||||
# torch.load without map_location — will crash on CPU-only builds
|
||||
grep -r "torch.load(" . | grep -v "map_location"
|
||||
|
||||
# HuggingFace token bugs
|
||||
grep -r 'token=True\|token=os.getenv' .
|
||||
|
||||
# Float64/Float32 assumptions — librosa returns float64, many models assume float32
|
||||
grep -r "torch.from_numpy\|\.double()\|float64" .
|
||||
|
||||
# @torch.jit.script — calls inspect.getsource(), crashes in frozen builds
|
||||
grep -r "@torch.jit.script\|torch.jit.script" .
|
||||
|
||||
# torchaudio.load — requires torchcodec in torchaudio 2.10+, use soundfile.read() instead
|
||||
grep -r "torchaudio.load\|torchaudio.save" .
|
||||
|
||||
# Gated HuggingFace repos — models that hardcode gated repos as tokenizer/config sources
|
||||
grep -r "from_pretrained\|tokenizer_name\|AutoTokenizer" . | grep -i "llama\|meta-llama\|gated"
|
||||
```

### 0.3 Install and Trace in a Throwaway Venv

```bash
# Create an isolated venv
python -m venv /tmp/engine-venv
source /tmp/engine-venv/bin/activate

# Install the package (try normally first)
pip install model-package

# Check whether it conflicts with our stack (quote specs containing >=
# so the shell doesn't treat them as redirects)
pip install model-package torch==2.10 transformers==4.57.3 "numpy>=1.26"
# If this fails, you need --no-deps:
pip install --no-deps model-package

# Get the full dependency tree
pip show model-package     # Check the Requires: field
pip show -f model-package  # List all installed files (look for data files)

# Check for non-PyPI dependencies
pip install model-package 2>&1 | grep -i "no matching distribution"
```

### 0.4 Test Model Loading on CPU

Before writing any integration code, verify the model works on CPU in a plain Python script:

```python
import torch

# Force CPU to catch map_location bugs early
model = ModelClass.from_pretrained("org/model", device="cpu")

# Test with a float32 audio array (not float64)
import numpy as np
audio = np.random.randn(16000).astype(np.float32)
output = model.generate("Hello world", audio)
print(f"Output shape: {output.shape}, dtype: {output.dtype}, sample rate: {model.sample_rate}")
```

If this crashes, you've found a bug you'll need to monkey-patch. Common ones:

- `RuntimeError: expected scalar type Float but found Double` → needs float32 cast
- `RuntimeError: map_location` → needs `torch.load` patch
- `RuntimeError: Unsupported operator aten::...` → needs MPS skip

### 0.5 Produce a Dependency Audit

Before proceeding to Phase 1, write down:

1. **PyPI vs non-PyPI deps** — which packages need `--find-links`, `git+https://`, or `--no-deps`?
2. **PyInstaller directives needed** — which packages need `--collect-all`, `--copy-metadata`, `--hidden-import`?
3. **Runtime data files** — which packages ship data files (YAML, pretrained weights, phoneme tables, shader libraries) that must be bundled?
4. **Native library paths** — which packages look for data at system paths that won't exist in a frozen binary?
5. **Monkey-patches needed** — `torch.load` map_location, float64→float32 casts, MPS skip, HF token bypass, etc.
6. **Sample rate** — what does the engine output? (24kHz, 44.1kHz, 48kHz)
7. **Model download method** — `from_pretrained()` with library-managed download, or manual `snapshot_download()` + `from_local()`?

This audit becomes your implementation plan for Phases 1, 4, and 5.
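
If it helps, the audit can be kept as structured records rather than prose; every field name below is illustrative, not an existing project convention:

```python
from dataclasses import dataclass, field

@dataclass
class DepAudit:
    """One audit row per dependency (field names are illustrative)."""
    package: str
    source: str = "pypi"             # "pypi" | "git" | "find-links"
    needs_no_deps: bool = False      # pins that conflict with our stack
    pyinstaller: list[str] = field(default_factory=list)   # directives needed
    monkey_patches: list[str] = field(default_factory=list)
    notes: str = ""

audit = [
    DepAudit("inflect", pyinstaller=["--collect-all"],
             notes="typeguard @typechecked calls inspect.getsource at import"),
    DepAudit("chatterbox-tts", needs_no_deps=True,
             notes="pins old torch/transformers; list sub-deps manually"),
]
```

A table like this maps one-to-one onto the Phase 4 and Phase 5 checklists, which makes it easy to verify nothing from the research phase got dropped.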

## Phase 1: Backend Implementation

### 1.1 Create the Backend File

Create `backend/backends/<engine>_backend.py` (~200-300 lines) implementing the `TTSBackend` protocol:

```python
class YourBackend:
    """Must satisfy the TTSBackend protocol."""

    async def load_model(self, model_size: str = "default") -> None: ...
    async def create_voice_prompt(self, audio_path: str, reference_text: str, use_cache: bool = True) -> tuple[dict, bool]: ...
    async def combine_voice_prompts(self, audio_paths: list[str], ref_texts: list[str]) -> tuple[np.ndarray, str]: ...
    async def generate(self, text: str, voice_prompt: dict, language: str = "en", seed: int | None = None, instruct: str | None = None) -> tuple[np.ndarray, int]: ...
    def unload_model(self) -> None: ...
    def is_loaded(self) -> bool: ...
    def _get_model_path(self, model_size: str) -> str: ...
```

**Key decisions per engine:**

| Decision | Options | Examples |
|----------|---------|----------|
| **Voice prompt storage** | Pre-computed tensors vs deferred file paths | Qwen stores tensor dicts; Chatterbox stores paths |
| **Caching** | Use voice prompt cache or skip it | LuxTTS caches with prefix; Chatterbox skips caching |
| **Device selection** | CUDA / MPS / CPU | Chatterbox forces CPU on macOS (MPS bugs) |
| **Model download** | Library handles it vs manual `snapshot_download` | Turbo uses manual download to bypass `token=True` bug |
| **Sample rate** | Engine-specific | LuxTTS outputs 48kHz, everything else is 24kHz |

### 1.2 Voice Prompt Patterns

**Pattern A: Pre-computed tensors** (Qwen, LuxTTS)

```python
encoded = model.encode_prompt(audio_path)
return encoded, False  # (prompt_dict, was_cached)
```

**Pattern B: Deferred file paths** (Chatterbox, MLX)

```python
return {"ref_audio": audio_path, "ref_text": reference_text}, False
```

**Pattern C: Hybrid** (possible for new engines)

```python
embedding = model.extract_speaker(audio_path)
return {"embedding": embedding, "ref_audio": audio_path}, False
```

If caching, prefix your cache keys:

```python
cache_key = "yourengine_" + get_cache_key(audio_path, reference_text)
```

### 1.3 Register the Engine

In `backend/backends/__init__.py`:

**Add a `ModelConfig` entry:**

```python
ModelConfig(
    model_name="your-engine",
    display_name="Your Engine",
    engine="your_engine",
    hf_repo_id="org/model-repo",
    size_mb=3200,
    needs_trim=False,  # set True if output needs trim_tts_output()
    languages=["en", "fr", "de"],
),
```

**Add to `TTS_ENGINES` dict:**

```python
TTS_ENGINES = {
    ...
    "your_engine": "Your Engine",
}
```

**Add factory branch:**

```python
elif engine == "your_engine":
    from .your_backend import YourBackend
    backend = YourBackend()
```

### 1.4 Update Request Models

In `backend/models.py`:

- Add the engine name to the `GenerationRequest.engine` regex pattern
- Add any new language codes to the language regex

## Phase 2: Route and Service Integration

With the model config registry, route and service layers have **zero per-engine dispatch points**. All endpoints use registry helpers like `get_model_config()`, `load_engine_model()`, `engine_needs_trim()`, `check_model_loaded()`, etc.

**You don't need to touch any route or service files** unless your engine needs custom behavior in the generate pipeline.

### Post-Processing

If your model produces trailing silence, set `needs_trim=True` on your `ModelConfig`. The generation service applies `trim_tts_output()` automatically.
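
Conceptually, the flag becomes a single branch in the generation service. `trim_tts_output()` is a real helper in the codebase, but this reimplementation and the `postprocess()` wrapper are assumptions for illustration only:

```python
import numpy as np

def trim_tts_output(audio: np.ndarray, sr: int, threshold: float = 1e-3) -> np.ndarray:
    """Illustrative trailing-silence trim — not the project's actual implementation."""
    nonzero = np.flatnonzero(np.abs(audio) > threshold)
    return audio[: nonzero[-1] + 1] if nonzero.size else audio

def postprocess(audio: np.ndarray, sr: int, needs_trim: bool) -> np.ndarray:
    """Sketch of how the service might branch on the ModelConfig flag."""
    return trim_tts_output(audio, sr) if needs_trim else audio
```

Because the branch lives in the shared service, engines never trim their own output — they just declare the flag.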

## Phase 3: Frontend Integration

### 3.1 TypeScript Types

In `app/src/lib/api/types.ts`:

- Add to the `engine` union type on `GenerationRequest`

### 3.2 Language Maps

In `app/src/lib/constants/languages.ts`:

- Add entry to `ENGINE_LANGUAGES` record
- Add any new language codes to `ALL_LANGUAGES` if needed

### 3.3 Engine/Model Selector

In `app/src/components/Generation/EngineModelSelector.tsx`:

- Add entry to `ENGINE_OPTIONS` and `ENGINE_DESCRIPTIONS`
- Add to `ENGLISH_ONLY_ENGINES` if applicable

### 3.4 Form Hook

In `app/src/lib/hooks/useGenerationForm.ts`:

- Add to Zod schema enum for `engine`
- Add engine-to-model-name mapping
- Update payload construction for engine-specific fields

**Watch out for model naming inconsistencies.** The HuggingFace repo name, the model size label, and the API model name don't always follow predictable patterns. For example, TADA's 3B model is named `tada-3b-ml` (not `tada-3b`) because it's a multilingual variant. Always check the actual repo names and build the frontend model name mapping from those, not from assumptions like `{engine}-{size}`.

### 3.5 Model Management

In `app/src/components/ServerSettings/ModelManagement.tsx`:

- Add description to `MODEL_DESCRIPTIONS` record
- Add model name to `voiceModels` filter condition

### 3.6 Non-Cloning Engines (Preset Voices)

If your engine uses **pre-built voices** instead of zero-shot cloning from reference audio (e.g. Kokoro), additional integration is needed:

**Backend:**

- In `kokoro_backend.py` (or your engine), define a `VOICES` list of `(voice_id, display_name, gender, language)` tuples
- `create_voice_prompt()` should return `{"voice_type": "preset", "preset_engine": "<engine>", "preset_voice_id": "<id>"}`
- `generate()` should read `voice_prompt.get("preset_voice_id")` to select the voice
- Add a `seed_preset_profiles("<engine>")` call in `backend/routes/models.py` after model download completes
- The `seed_preset_profiles()` function in `backend/services/profiles.py` creates DB profiles with `voice_type="preset"`
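
The backend bullets above fit together roughly like this — the voice IDs below are illustrative placeholders, not the real catalog:

```python
# Sketch of a preset-voice backend; voice IDs/names are made up for illustration.
VOICES = [
    ("af_sky", "Sky", "female", "en"),
    ("am_onyx", "Onyx", "male", "en"),
]

class PresetEngineBackend:
    """Minimal sketch of the preset-voice half of the TTSBackend protocol."""

    async def create_voice_prompt(self, audio_path, reference_text, use_cache=True):
        # Preset engines ignore reference audio; the "prompt" just names a voice.
        return {
            "voice_type": "preset",
            "preset_engine": "kokoro",
            "preset_voice_id": VOICES[0][0],
        }, False

    async def generate(self, text, voice_prompt, language="en", seed=None, instruct=None):
        voice_id = voice_prompt.get("preset_voice_id", VOICES[0][0])
        raise NotImplementedError(f"engine inference for voice {voice_id} goes here")
```

`seed_preset_profiles()` then turns each `VOICES` entry into a DB profile, so the frontend sees presets like any other profile.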

**Frontend:**

- The `EngineModelSelector` filters options based on `selectedProfile.voice_type`:
  - `"cloned"` profiles → only cloning engines shown (Kokoro hidden)
  - `"preset"` profiles → only the preset's engine shown
- Profile cards show the engine name as a badge for preset profiles
- When a preset profile is selected, the engine auto-switches

**Profile schema fields for presets:**

- `voice_type: "preset"` (vs `"cloned"` for traditional profiles)
- `preset_engine: "<engine>"` — which engine owns this voice
- `preset_voice_id: "<id>"` — the engine-specific voice identifier

**For future "designed" voices** (text description instead of audio, e.g. Qwen CustomVoice):

- Use `voice_type: "designed"` with `design_prompt` field
- `create_voice_prompt_for_profile()` already returns the design prompt for this type

## Phase 4: Dependencies

Use the dependency audit from Phase 0 to drive this phase. You should already know what packages are needed, which conflict, and which require special installation.

### 4.1 Python Dependencies

Add to `backend/requirements.txt`. There are three installation patterns, depending on what Phase 0 revealed:

**Normal PyPI packages:**

```
some-model-package>=1.0.0
```

**Pinned dependency conflicts (`--no-deps`)** — If the model package pins old versions of torch/numpy/transformers, install with `--no-deps` and list sub-dependencies manually. This is the pattern used for `chatterbox-tts`:

```bash
# In justfile / CI setup:
pip install --no-deps chatterbox-tts

# In requirements.txt — list each actual sub-dependency:
conformer>=0.3.2
diffusers>=0.31.0
omegaconf>=2.3.0
resemble-perth>=0.0.2
s3tokenizer>=0.1.6
```

To identify sub-deps: `pip show chatterbox-tts` → `Requires:` field, then cross-reference against existing `requirements.txt` to avoid duplicates.

**Non-PyPI packages** — Some libraries only exist on GitHub or require custom indexes:

```
# Git-only packages (no PyPI release)
linacodec @ git+https://github.com/ysharma3501/LinaCodec.git
Zipvoice @ git+https://github.com/ysharma3501/LuxTTS.git

# Custom package indexes (C extensions with platform-specific wheels)
--find-links https://k2-fsa.github.io/icefall/piper_phonemize.html
piper-phonemize>=1.2.0
```

### 4.2 Dependency Conflict Resolution

Check for conflicts with the existing stack before adding anything:

```bash
# Our current stack pins (approximate):
# Python 3.12+, torch>=2.10, transformers>=4.57, numpy>=1.26

# Test compatibility (quote specs containing >= so the shell doesn't redirect)
pip install model-package torch==2.10 transformers==4.57.3 "numpy>=1.26"

# If it fails, check what the package pins:
pip show model-package | grep Requires
# Also look at setup.py / pyproject.toml for version constraints
```

**Known incompatible patterns in the wild:**

- `torch==2.6.0` — many older packages pin this
- `numpy<1.26` — conflicts with Python 3.12+
- `transformers==4.46.3` — many packages pin old transformers
- `onnxruntime` pinned versions — often conflict with torch

### 4.3 Update Installation Scripts

Dependencies must be added in multiple places:

| File | What to add |
|------|-------------|
| `backend/requirements.txt` | Package and version constraint |
| `justfile` | `--no-deps` install line if needed (in `setup-python` and `setup-python-release` targets) |
| `.github/workflows/release.yml` | Same `--no-deps` line in CI build steps |
| `Dockerfile` | Same install commands for Docker builds |

## Phase 5: PyInstaller Bundling (`build_binary.py`)

This is where most of the pain lives. **The v0.2.3 release was entirely dedicated to fixing bundling issues** — every new engine that shipped in v0.2.1 (LuxTTS, Chatterbox, Chatterbox Turbo) worked in dev but failed in production builds. Don't skip this phase.

### 5.1 Register Your Engine in `build_binary.py`

Every new engine needs entries in `backend/build_binary.py`. This file drives PyInstaller and is the single most common source of "works in dev, breaks in prod" bugs. You need to decide which PyInstaller directives your engine's dependencies require:

| Directive | What It Does | When You Need It |
|-----------|--------------|------------------|
| `--hidden-import <module>` | Includes a module PyInstaller can't detect via static analysis | Dynamic imports, lazy imports, plugin architectures |
| `--collect-all <package>` | Bundles source `.py` files, data files, AND native libraries | Packages that call `inspect.getsource()` at import time (e.g. `inflect` via `typeguard`'s `@typechecked`), or that ship pretrained model files (e.g. `perth` ships `.pth.tar` + `hparams.yaml`) |
| `--collect-data <package>` | Bundles only data files (not source or native libs) | Packages with YAML configs, vocab files, etc. |
| `--collect-submodules <package>` | Bundles all submodules | Packages with deep module trees that PyInstaller misses |
| `--copy-metadata <package>` | Copies `importlib.metadata` info | Packages that call `importlib.metadata.version()` or `pkg_resources.get_distribution()` at runtime. Already required for: `requests`, `transformers`, `huggingface-hub`, `tokenizers`, `safetensors`, `tqdm` |

**Example: adding hidden imports and collect-all for a new engine:**

```python
# In build_binary.py, inside the args list:
"--hidden-import",
"backend.backends.your_engine_backend",
"--hidden-import",
"your_engine_package",
"--hidden-import",
"your_engine_package.inference",
"--collect-all",
"some_dependency_that_uses_inspect_getsource",
"--copy-metadata",
"some_dependency_that_checks_its_own_version",
```

### 5.2 Lessons from v0.2.3 — Real Failures and Their Fixes

These are actual production failures from shipping new engines. Every one of these passed `python -m uvicorn` in dev:

| Engine | Failure | Root Cause | Fix |
|--------|---------|------------|-----|
| LuxTTS | `"could not get source code"` on import | `inflect` uses `typeguard`'s `@typechecked` which calls `inspect.getsource()` — needs `.py` source files, not just bytecode | `--collect-all inflect` |
| LuxTTS | `espeak-ng-data` not found | `piper_phonemize` C library looks for data at `/usr/share/espeak-ng-data/` which doesn't exist in the bundle | `--collect-all piper_phonemize` + set `ESPEAK_DATA_PATH` env var at runtime (see 5.3) |
| LuxTTS | `inspect.getsource` error in Vocos codec | `linacodec` and `zipvoice` use source introspection | `--collect-all linacodec` + `--collect-all zipvoice` |
| Chatterbox | `FileNotFoundError` for watermark model | `perth` ships pretrained model files (`hparams.yaml`, `.pth.tar`) that PyInstaller doesn't bundle by default | `--collect-all perth` |
| All engines | `importlib.metadata` failures | Frozen binary doesn't include package metadata for `huggingface-hub`, `transformers`, etc. | `--copy-metadata` for each affected package |
| All engines | Download progress bars stuck at 0% | `huggingface_hub` silently disables tqdm progress bars based on logger level in frozen builds — our progress tracker never receives byte updates | Force-enable tqdm's internal counter in `HFProgressTracker` |
| TADA | `inspect.getsource` error in DAC's `Snake1d` | `@torch.jit.script` calls `inspect.getsource()` which fails without `.py` source files | Wrote a lightweight shim (`dac_shim.py`) reimplementing `Snake1d` without `@torch.jit.script`, registered fake `dac.*` modules in `sys.modules` |
| All engines | `NameError: name 'obj' is not defined` on macOS | Python 3.12.0 has a [CPython bug](https://github.com/pyinstaller/pyinstaller/issues/7992) that corrupts bytecode when PyInstaller rewrites code objects | Upgrade to Python 3.12.13+ |
| All engines | `resource_tracker` subprocess crash | `multiprocessing` in frozen binaries needs `freeze_support()` called before anything else | Added to `server.py` entry point |

### 5.3 Runtime Frozen-Build Handling (`server.py`)

Some fixes can't live in `build_binary.py` — they need runtime detection. The entry point `backend/server.py` handles these before any heavy imports:

```python
# 1. freeze_support() — MUST be called before any multiprocessing use
import multiprocessing
multiprocessing.freeze_support()

# 2. Native data paths — redirect C libraries to bundled data
if getattr(sys, 'frozen', False):
    _meipass = getattr(sys, '_MEIPASS', os.path.dirname(sys.executable))
    _espeak_data = os.path.join(_meipass, 'piper_phonemize', 'espeak-ng-data')
    if os.path.isdir(_espeak_data):
        os.environ.setdefault('ESPEAK_DATA_PATH', _espeak_data)

# 3. stdout/stderr safety — PyInstaller --noconsole on Windows sets these to None
if not _is_writable(sys.stdout):
    sys.stdout = open(os.devnull, 'w')
```

If your engine's dependencies include native libraries that look for data at system paths (like espeak-ng does), you'll need to add a similar `os.environ.setdefault()` block here.

### 5.4 CUDA vs CPU Build Branching

`build_binary.py` produces two different binaries:

- **`voicebox-server`** (CPU) — excludes all `nvidia.*` packages to avoid bundling ~3 GB of CUDA DLLs
- **`voicebox-server-cuda`** — includes `torch.cuda` and `torch.backends.cudnn`

On Windows, if the build environment has CUDA torch installed but you're building the CPU binary, the script temporarily swaps to CPU-only torch and restores CUDA torch afterward. This prevents PyInstaller from accidentally bundling CUDA libraries into the CPU build.

New engine imports go in the **common section** (not the CUDA or MLX conditional blocks) unless your engine has platform-specific dependencies.
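
A minimal sketch of that divergence — the directives are real PyInstaller flags, but the exact module lists here are illustrative; check the actual `build_binary.py` before copying anything:

```python
def pyinstaller_args(cuda: bool) -> list[str]:
    """Sketch of how the two binaries might diverge; the real file has many more flags."""
    args = ["--name", "voicebox-server-cuda" if cuda else "voicebox-server"]
    if cuda:
        args += ["--hidden-import", "torch.cuda",
                 "--hidden-import", "torch.backends.cudnn"]
    else:
        # Keep ~3 GB of CUDA DLLs out of the CPU build (module names illustrative)
        for mod in ("nvidia", "nvidia.cublas", "nvidia.cudnn"):
            args += ["--exclude-module", mod]
    return args
```

Excluding the `nvidia.*` wheels is what keeps the CPU artifact small; the CUDA binary instead force-includes the torch CUDA backends.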

### 5.5 MLX Conditional Inclusion

Apple Silicon builds conditionally include MLX hidden imports and `--collect-all mlx` / `--collect-all mlx_audio`. If your engine has an MLX-specific backend variant, add its imports inside the `if is_apple_silicon() and not cuda:` block.

### 5.6 Testing Frozen Builds

You can't skip this. Models that work in `python -m uvicorn` will break in the PyInstaller binary. Getting all engines working in production took three patch releases (v0.2.1 → v0.2.2 → v0.2.3).

1. Build: `just build`
2. Launch the binary directly (not via `python -m`)
3. Test the **full chain**: download → load → generate → progress tracking
4. Check stderr for the actual error (logs go to stderr for Tauri sidecar capture)
5. Fix, rebuild, repeat

**Common gotcha:** testing only generation with a pre-cached model from your dev install. Always test with a clean model cache to verify downloads work too.

## Phase 6: Common Upstream Workarounds

### torch.load device mismatch

```python
_original_torch_load = torch.load

def _patched_torch_load(*args, **kwargs):
    # Default to CPU so checkpoints saved on CUDA machines load everywhere
    kwargs.setdefault("map_location", "cpu")
    return _original_torch_load(*args, **kwargs)

torch.load = _patched_torch_load
```

### Float64/Float32 dtype mismatch

```python
original_fn = SomeClass.some_method

def patched_fn(self, *args, **kwargs):
    result = original_fn(self, *args, **kwargs)
    return result.float()  # cast float64 output down to float32

SomeClass.some_method = patched_fn
```

### HuggingFace token bug

```python
from huggingface_hub import snapshot_download

# Passing token=None explicitly bypasses the library's buggy token=True default
local_path = snapshot_download(repo_id=REPO, token=None)
model = ModelClass.from_local(local_path, device=device)
```

### MPS tensor issues

Skip MPS entirely if operators aren't supported:

```python
def _get_device(self):
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"  # Skip MPS
```

### Gated HuggingFace repos as hardcoded config sources

Some models hardcode a gated HuggingFace repo as their tokenizer or config source (e.g., TADA hardcodes `"meta-llama/Llama-3.2-1B"` in both its `AlignerConfig` and `TadaConfig`). This silently fails without HF authentication.

**Fix:** Download from an ungated mirror and patch the config objects directly:

```python
# Download tokenizer from ungated mirror
UNGATED_TOKENIZER = "unsloth/Llama-3.2-1B"
tokenizer_path = snapshot_download(UNGATED_TOKENIZER, token=None)

# Patch the model config to use the local path instead of the gated repo
config = ModelConfig.from_pretrained(model_path)
config.tokenizer_name = tokenizer_path
model = ModelClass.from_pretrained(model_path, config=config)
```

**Do NOT monkey-patch `AutoTokenizer.from_pretrained`** — it's a classmethod, and replacing it corrupts the descriptor, which breaks other engines that use different tokenizers (e.g., Qwen uses a Qwen tokenizer via `AutoTokenizer`). Always patch at the config level, not the class-method level.

### `torchaudio.load()` requires `torchcodec` in 2.10+

As of `torchaudio>=2.10`, `torchaudio.load()` requires the `torchcodec` package for audio I/O. If your engine or backend code uses `torchaudio.load()`, replace it with `soundfile`:

```python
# Before (breaks without torchcodec):
import torchaudio
waveform, sr = torchaudio.load("audio.wav")

# After:
import soundfile as sf
import torch
data, sr = sf.read("audio.wav", dtype="float32")
# Mono: (frames,) → (1, frames). For multi-channel files sf.read returns
# (frames, channels), so transpose first: torch.from_numpy(data.T)
waveform = torch.from_numpy(data).unsqueeze(0)
```

Note: `torchaudio.functional.resample()` and other pure-PyTorch math functions work fine without `torchcodec` — only the I/O functions are affected.

### `@torch.jit.script` breaks in frozen builds

`torch.jit.script` calls `inspect.getsource()` to parse the decorated function's source code. In a PyInstaller binary, `.py` source files aren't available, so this crashes at import time.

**Fix:** Remove or avoid `@torch.jit.script` decorators. If the decorated function comes from an upstream dependency, write a shim that reimplements the function without the decorator (see "Toxic dependency chains" below).

### Toxic dependency chains — the shim pattern

Sometimes a model library depends on a package with a massive, hostile transitive dependency tree, but only uses a tiny piece of it. When the dependency chain is unbuildable or would pull in dozens of unwanted packages, the right move is to write a lightweight shim.

**Example:** TADA depends on `descript-audio-codec` (DAC), which pulls in `descript-audiotools` → `onnx`, `tensorboard`, `protobuf`, `matplotlib`, `pystoi`, etc. The `onnx` package fails to build from source on macOS. But TADA only uses `Snake1d` from DAC — a 7-line PyTorch module.

**Solution:** Create a shim at `backend/utils/dac_shim.py` that registers fake modules in `sys.modules`:

```python
import sys
import types

import torch
from torch import nn


def snake(x, alpha):
    """Snake activation — reimplemented without @torch.jit.script."""
    return x + (1.0 / (alpha + 1e-9)) * torch.sin(alpha * x).pow(2)


class Snake1d(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1, channels, 1))

    def forward(self, x):
        return snake(x, self.alpha)


# Register fake dac.* modules so "from dac.nn.layers import Snake1d" works
_nn = types.ModuleType("dac.nn")
_layers = types.ModuleType("dac.nn.layers")
_layers.Snake1d = Snake1d
_nn.layers = _layers

for name, mod in [("dac", types.ModuleType("dac")),
                  ("dac.nn", _nn), ("dac.nn.layers", _layers)]:
    sys.modules[name] = mod
```

**Key rules for shims:**

- Import the shim **before** importing the model library (so it finds the fake modules first)
- Do NOT use `@torch.jit.script` in the shim (see above)
- Only reimplement what the model actually uses — check the import chain carefully

## Candidate Engines

The [`docs/PROJECT_STATUS.md`](https://github.com/jamiepine/voicebox/blob/main/docs/PROJECT_STATUS.md) file is the canonical, living list of candidates under evaluation — including why some have been backlogged (e.g. VoxCPM, which is effectively CUDA-only upstream).

At a glance, current top candidates:

| Model | Tier | Size | Cross-platform? | Key Features |
|-------|------|------|-----------------|--------------|
| **MOSS-TTS-Nano** | 1 | 0.1 B | Yes (CPU realtime) | 48 kHz stereo, Apache 2.0, released 2026-04-13 |
| **Voxtral TTS** | 2 | 4 B | Likely | `mistralai/Voxtral-4B-TTS-2603` — presets + cloning |
| **VibeVoice** | 2 | ~500 M | Yes | Podcast-style multi-speaker dialogue |
| **Dia2** | 3 | TBD | TBD | Successor to the original Dia |
| **Fish Audio S2 Pro** | 3 | Medium | Yes | Word-level control via inline text |

**Backlogged:**

- **VoxCPM** (2B, Apache 2.0) — CUDA ≥12 required upstream; MPS broken in issues #232/#248; CPU path rejected by maintainers (#256). Keep watching for a PR that relaxes the device requirement.

Update `PROJECT_STATUS.md` when you pick one up or mark one as shipped/backlogged.

## Implementation Checklist

Use this as a gate between phases. Do not proceed to the next phase until every item in the current phase is checked.

### Phase 0: Dependency Research

- [ ] Cloned model library source into a temp directory
- [ ] Read `setup.py` / `pyproject.toml` — noted pinned dependency versions
- [ ] Traced all imports from the model class through to leaf dependencies
- [ ] Searched for `inspect.getsource`, `@typechecked`, `typeguard` in the full dependency tree
- [ ] Searched for `importlib.metadata`, `pkg_resources.get_distribution` in the dependency tree
- [ ] Searched for `Path(__file__).parent`, `os.path.dirname(__file__)`, hardcoded system paths
- [ ] Searched for `torch.load` calls missing `map_location`
- [ ] Searched for `torch.from_numpy` without `.float()` cast
- [ ] Searched for `token=True` or `token=os.getenv("HF_TOKEN")` in HuggingFace calls
- [ ] Searched for `@torch.jit.script` / `torch.jit.script` (crashes in frozen builds)
- [ ] Searched for `torchaudio.load` / `torchaudio.save` (requires `torchcodec` in 2.10+)
- [ ] Searched for hardcoded gated HuggingFace repo names (e.g., `meta-llama/*`)
- [ ] Evaluated whether any dependency is used minimally enough to shim instead of install
- [ ] Tested model loading and generation on CPU in a throwaway venv
- [ ] Tested with a clean HuggingFace cache (no pre-downloaded models)
- [ ] Produced a written dependency audit documenting all findings

### Phase 1: Backend Implementation

- [ ] Created `backend/backends/<engine>_backend.py` implementing the `TTSBackend` protocol
- [ ] Chose voice prompt pattern (pre-computed tensors vs deferred file paths)
- [ ] Implemented all monkey-patches identified in Phase 0
- [ ] Used `get_torch_device()` from `backends/base.py` for device selection
- [ ] Used `model_load_progress()` from `backends/base.py` for download/load tracking
- [ ] Tested: model downloads correctly
- [ ] Tested: model loads on CPU
- [ ] Tested: generation produces valid audio
- [ ] Tested: voice cloning from reference audio works
- [ ] Registered `ModelConfig` in `backends/__init__.py`
- [ ] Added to `TTS_ENGINES` dict
- [ ] Added factory branch in `get_tts_backend_for_engine()`
- [ ] Updated engine regex in `backend/models.py`

### Phase 2–3: Route, Service, and Frontend

- [ ] Confirmed zero changes needed in routes/services (or documented why custom behavior is needed)
- [ ] Added engine to TypeScript union type in `app/src/lib/api/types.ts`
- [ ] Added language map entry in `app/src/lib/constants/languages.ts`
- [ ] Added to `ENGINE_OPTIONS` and `ENGINE_DESCRIPTIONS` in `EngineModelSelector.tsx`
- [ ] Added to Zod schema and model-name mapping in `useGenerationForm.ts`
- [ ] Added description in `ModelManagement.tsx`

### Phase 4: Dependencies

- [ ] Added packages to `backend/requirements.txt`
- [ ] If `--no-deps` needed: listed sub-dependencies explicitly
- [ ] If git-only packages: added `@ git+https://...` entries
- [ ] If custom index needed: added `--find-links` line
- [ ] Updated `justfile` setup targets
- [ ] Updated `.github/workflows/release.yml` build steps
- [ ] Updated `Dockerfile` if applicable
- [ ] Verified `pip install` succeeds in a clean venv with existing requirements

### Phase 5: PyInstaller Bundling

- [ ] Added `--hidden-import` entries in `build_binary.py` for:
  - [ ] `backend.backends.<engine>_backend`
  - [ ] The model package and its key submodules
- [ ] Added `--collect-all` for any packages that:
  - [ ] Use `inspect.getsource()` / `@typechecked`
  - [ ] Ship pretrained model data files (`.pth.tar`, `.yaml`, etc.)
  - [ ] Ship native data files (phoneme tables, shader libraries, etc.)
- [ ] Added `--copy-metadata` for any packages that use `importlib.metadata`
- [ ] If engine has native data paths: added `os.environ.setdefault()` in `server.py`
- [ ] Built frozen binary with `just build`
- [ ] Tested in frozen binary with **clean model cache** (not pre-cached from dev):
  - [ ] Model download works with real-time progress
  - [ ] Model loading works
  - [ ] Generation produces valid audio
  - [ ] No errors in stderr logs

### Phase 6: Final Verification

- [ ] Engine works in dev mode (`just dev`)
- [ ] Engine works in frozen binary (`just build` → run binary directly)
- [ ] Tested on target platform (macOS for MLX, Windows/Linux for CUDA)
- [ ] No regressions in existing engines


**File:** `docs/content/docs/developer/tts-generation.mdx` (new, 251 lines)

---
title: "TTS Generation"
description: "How text-to-speech generation works across Voicebox's multi-engine backend"
---

## Overview

Voicebox ships seven TTS engines — Qwen3-TTS, Qwen CustomVoice, LuxTTS, Chatterbox Multilingual, Chatterbox Turbo, TADA, and Kokoro — behind a single `TTSBackend` Protocol. All of them expose the same async interface, so the routes and services don't need per-engine branching.

This page covers how generation flows through that abstraction. For the step-by-step guide to adding a new engine, see [TTS Engines](/developer/tts-engines).

## The `TTSBackend` Protocol

Every engine implements the same contract (defined in `backend/backends/__init__.py`):

```python
@runtime_checkable
class TTSBackend(Protocol):
    async def load_model(self, model_size: str) -> None: ...
    async def create_voice_prompt(
        self, audio_path: str, reference_text: str, use_cache: bool = True
    ) -> Tuple[dict, bool]: ...
    async def combine_voice_prompts(
        self, audio_paths: List[str], reference_texts: List[str]
    ) -> Tuple[np.ndarray, str]: ...
    async def generate(
        self,
        text: str,
        voice_prompt: dict,
        language: str = "en",
        seed: Optional[int] = None,
        instruct: Optional[str] = None,
    ) -> Tuple[np.ndarray, int]: ...
    def unload_model(self) -> None: ...
    def is_loaded(self) -> bool: ...
```
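Because the Protocol is decorated with `@runtime_checkable`, `isinstance()` can verify that an object provides the required methods (presence only, not signatures). A minimal sketch with a hypothetical do-nothing backend and a trimmed-down copy of the Protocol:

```python
from typing import Protocol, runtime_checkable

# Trimmed stand-in for the real Protocol: enough to show the structural check.
@runtime_checkable
class TTSBackend(Protocol):
    async def load_model(self, model_size: str) -> None: ...
    def unload_model(self) -> None: ...
    def is_loaded(self) -> bool: ...

class NullBackend:
    """Hypothetical engine: no inheritance, satisfies the Protocol structurally."""
    def __init__(self) -> None:
        self._loaded = False

    async def load_model(self, model_size: str) -> None:
        self._loaded = True

    def unload_model(self) -> None:
        self._loaded = False

    def is_loaded(self) -> bool:
        return self._loaded

print(isinstance(NullBackend(), TTSBackend))  # True (the methods are present)
```

Note that `isinstance` against a runtime-checkable Protocol only checks that the attributes exist; it does not validate parameter lists or return types.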

## The `ModelConfig` Registry

Each downloadable model variant is described by a `ModelConfig` dataclass:

```python
@dataclass
class ModelConfig:
    model_name: str           # "luxtts", "qwen-tts-1.7B", "kokoro"
    display_name: str         # "LuxTTS (Fast, CPU-friendly)"
    engine: str               # "luxtts", "qwen", "kokoro"
    hf_repo_id: str           # "YatharthS/LuxTTS"
    model_size: str = "default"
    size_mb: int = 0
    needs_trim: bool = False
    supports_instruct: bool = False
    languages: list[str] = field(default_factory=lambda: ["en"])
```

Registry helpers in `backends/__init__.py` replace what used to be per-engine `if/elif` chains:

- `get_all_model_configs()` — every TTS + STT variant
- `get_tts_model_configs()` — only TTS variants
- `get_model_config(model_name)` — lookup by name
- `engine_needs_trim(engine)` — whether output should run through `trim_tts_output()`
- `load_engine_model(engine, model_size)` — downloads + loads, handles engines with multiple sizes
- `get_tts_backend_for_engine(engine)` — thread-safe backend factory with double-checked locking
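The double-checked-locking pattern used by `get_tts_backend_for_engine()` can be sketched as follows (the real factory constructs actual backend objects; `make_backend` here is a placeholder):

```python
import threading

_backends: dict[str, object] = {}
_lock = threading.Lock()

def make_backend(engine: str) -> object:
    """Placeholder for the real per-engine constructor."""
    return {"engine": engine}

def get_tts_backend_for_engine(engine: str) -> object:
    backend = _backends.get(engine)      # first check: no lock on the hot path
    if backend is None:
        with _lock:                      # second check: only one thread constructs
            backend = _backends.get(engine)
            if backend is None:
                backend = make_backend(engine)
                _backends[engine] = backend
    return backend
```

The first unlocked lookup keeps repeated calls cheap; the second lookup inside the lock guarantees that two threads racing on a cold cache don't both construct (and leak) a backend.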

The `TTS_ENGINES` dict is the canonical list of shipped engine names:

```python
TTS_ENGINES = {
    "qwen": "Qwen TTS",
    "qwen_custom_voice": "Qwen CustomVoice",
    "luxtts": "LuxTTS",
    "chatterbox": "Chatterbox TTS",
    "chatterbox_turbo": "Chatterbox Turbo",
    "tada": "TADA",
    "kokoro": "Kokoro",
}
```

## Voice Prompt Patterns

Each engine chooses how to represent a voice in the prompt dict returned from `create_voice_prompt()`. Three patterns are in use today:

**Pattern A — Pre-computed tensors** (Qwen3-TTS, LuxTTS)

```python
encoded = model.encode_prompt(audio_path)
return encoded, False  # (prompt_dict, was_cached)
```

**Pattern B — Deferred file paths** (Chatterbox, Chatterbox Turbo, TADA)

```python
return {"ref_audio": audio_path, "ref_text": reference_text}, False
```

**Pattern C — Preset voice pointer** (Kokoro, Qwen CustomVoice)

```python
return {
    "voice_type": "preset",
    "preset_engine": "kokoro",
    "preset_voice_id": "am_adam",
}, False
```

Pattern C is the shape used for profiles where `voice_type == "preset"` — there's no cloning step; the engine looks up a baked-in voice by ID.

Engines that cache voice prompts prefix their cache keys with the engine name to avoid collisions:

```python
cache_key = f"{engine}_{hash((audio_path, reference_text))}"
```
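One caveat for anyone persisting such a cache to disk: Python's builtin `hash()` of strings is salted per process, so a key built with it is only valid in memory. A sketch of a process-stable alternative (a hypothetical helper, not the shipped implementation):

```python
import hashlib

def voice_prompt_cache_key(engine: str, audio_path: str, reference_text: str) -> str:
    """Engine-prefixed key that is stable across processes and restarts."""
    payload = f"{audio_path}\x00{reference_text}".encode("utf-8")
    return f"{engine}_{hashlib.sha256(payload).hexdigest()[:16]}"

key = voice_prompt_cache_key("luxtts", "/tmp/ref.wav", "Hello there.")
```

The NUL separator keeps `("ab", "c")` and `("a", "bc")` from colliding; 16 hex characters of SHA-256 are plenty for a per-user cache.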

## Device Selection

Engines pick their device through `get_torch_device()` in `backends/base.py`, which checks, in order:

1. `VOICEBOX_FORCE_CPU` environment override
2. CUDA (if compiled and available)
3. XPU (Intel Arc via IPEX)
4. MPS (Apple Silicon) — **only for engines that support it**; some (Chatterbox, older Qwen paths) skip MPS and fall back to CPU due to upstream operator gaps
5. CPU

Qwen TTS uses MLX directly on Apple Silicon instead of going through PyTorch — see `mlx_backend.py`.
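The precedence above can be sketched in pure Python. The real helper queries torch at runtime; here the capability probes are passed in as flags so the ordering is visible on its own:

```python
import os

def get_torch_device(cuda: bool, xpu: bool, mps: bool, engine_supports_mps: bool) -> str:
    """Illustrative precedence only; the shipped helper probes torch directly."""
    if os.environ.get("VOICEBOX_FORCE_CPU"):
        return "cpu"                     # explicit user override wins
    if cuda:
        return "cuda"
    if xpu:
        return "xpu"
    if mps and engine_supports_mps:
        return "mps"                     # skipped for engines with MPS operator gaps
    return "cpu"
```

Note the per-engine gate on MPS: an engine that opts out falls through to CPU even on Apple Silicon hardware.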

## Generation Flow

The request path from frontend to audio file:

1. **Request** — `POST /generate` with `GenerationRequest`:

   ```json
   {
     "profile_id": "uuid",
     "text": "...",
     "language": "en",
     "seed": 42,
     "model_size": "1.7B",
     "instruct": "warm, slightly amused",
     "engine": "qwen",
     "max_chunk_chars": 800
   }
   ```

   The `engine` field is validated against the regex `^(qwen|qwen_custom_voice|luxtts|chatterbox|chatterbox_turbo|tada|kokoro)$`.

2. **Route** — `routes/generate.py` validates input and delegates.

3. **Service** — `services/generation.py` fetches the profile, resolves the engine backend via `get_tts_backend_for_engine(engine)`, and ensures the model is loaded (downloading it on first use with live progress).

4. **Voice prompt** — the service calls `create_voice_prompt()` (or the preset equivalent). For cloned profiles with multiple samples, it calls `combine_voice_prompts()` first to merge the reference audio.

5. **Queue** — the request is serialized through `services/task_queue.py` to avoid multiple generations fighting for the GPU.

6. **Inference** — the engine's `generate()` returns `(audio_array, sample_rate)`.

7. **Post-process** — if `engine_needs_trim(engine)` is True, `trim_tts_output()` strips trailing silence. Effects chains (if any) are applied per generation version; the clean version is kept untouched.

8. **Persist** — audio is written to the generations directory, a row is inserted into the `generations` table, and the response includes the generation metadata.

## Chunking for Long Text

Text longer than `max_chunk_chars` (default 800, range 100–5000) is split at sentence boundaries, generated in sequence, and crossfaded together. The chunking behavior is engine-agnostic — it lives in the service layer, not in individual backends.
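The splitting half of that behavior can be sketched as a greedy sentence packer (a minimal sketch of the service-layer idea; the shipped splitter and the crossfade step are not shown):

```python
import re

def split_text(text: str, max_chunk_chars: int = 800) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chunk_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chunk_chars:
            chunks.append(current)       # flush the full chunk
            current = sentence           # start a new one at a sentence boundary
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk is then generated independently and the audio segments are crossfaded, so boundaries land between sentences rather than mid-word.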

## Instruct Mode

Two engines accept the `instruct` kwarg for natural-language delivery control, but only one honors it:

- **Qwen CustomVoice** — `supports_instruct=True`, fully wired to the model's instruct head.
- **Qwen Base** — silently drops the instruct text (`supports_instruct=False`). The frontend hides the instruct input for Base profiles.

```python
# Good instruct prompts:
"warm and conversational, slight smile"
"whisper, intimate and close"
"authoritative, broadcast quality"
```

Other engines ignore `instruct` entirely.

## Memory Management

Models are loaded lazily on first use and kept in memory. Switching between model sizes (e.g. Qwen 1.7B ↔ 0.6B) unloads the previous model before loading the new one to avoid OOM:

```python
def unload_model(self):
    if self.model is not None:
        del self.model
        self.model = None
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
```

The model management API (`/models/load`, `/models/unload`) lets users free VRAM manually — see [Model Management](/developer/model-management).

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/generate` | Generate speech from text |
| GET | `/audio/{generation_id}` | Serve generated audio file |

### Response schema

```json
{
  "id": "generation_uuid",
  "profile_id": "profile_uuid",
  "text": "...",
  "language": "en",
  "audio_path": "/path/to/audio.wav",
  "duration": 3.5,
  "seed": 42,
  "engine": "qwen",
  "model_size": "1.7B",
  "instruct": "...",
  "created_at": "2026-04-18T10:30:00Z"
}
```
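For scripting against the API, the request can be assembled with nothing but the standard library. A sketch (the profile ID is a placeholder, and the `urlopen` call is commented out so the snippet is inert without a running backend on `localhost:17493`):

```python
import json
from urllib import request

payload = {
    "profile_id": "00000000-0000-0000-0000-000000000000",  # placeholder UUID
    "text": "Hello from Voicebox.",
    "language": "en",
    "engine": "qwen",
}
req = request.Request(
    "http://localhost:17493/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# response = request.urlopen(req)  # returns the generation metadata,
# including the "id" used to fetch audio from /audio/{generation_id}
```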

## Performance Considerations

- **CUDA** is the fastest backend for every PyTorch-based engine. Apple Silicon MLX is competitive with CUDA for Qwen TTS specifically.
- **Serial queue** — only one generation runs at a time per process; concurrent requests are queued.
- **Voice prompt caching** saves ~1-2s on repeated generations from the same profile.
- **Model pinning** — the first load is slow (download + load); subsequent generations reuse the cached model in memory.

### Per-engine VRAM (approximate, on CUDA)

| Engine | VRAM |
|--------|------|
| Kokoro | ~150 MB |
| LuxTTS | ~1 GB |
| Chatterbox Turbo | ~1.5 GB |
| Qwen 0.6B / Qwen CustomVoice 0.6B | ~2 GB |
| Chatterbox Multilingual | ~3 GB |
| TADA 1B | ~4 GB |
| Qwen 1.7B / Qwen CustomVoice 1.7B | ~6 GB |
| TADA 3B | ~8 GB |

## Next Steps

<Cards>
  <Card title="TTS Engines" href="/developer/tts-engines">
    Add a new engine — full phased workflow
  </Card>
  <Card title="Model Management" href="/developer/model-management">
    Downloading, loading, and unloading models
  </Card>
  <Card title="Voice Profiles" href="/developer/voice-profiles">
    Cloned vs preset profile schema
  </Card>
</Cards>

**File:** `docs/content/docs/developer/voice-profiles.mdx` (new, 232 lines)

---
title: "Voice Profiles"
description: "How voice profile management works in Voicebox"
---

## Overview

Voice profiles are the unit of "a saved voice" in Voicebox. As of 0.4 they support two flavors backed by the same `profiles` table:

- **Cloned profiles** — store one or more reference audio samples; the cloning engine generates a voice embedding at use time
- **Preset profiles** — store no audio, just a pointer to an engine-specific pre-built voice (e.g. Kokoro's `am_adam`, Qwen CustomVoice's `Ryan`)

The schema also reserves a third type, `designed`, for future text-described voices; it is not currently used by any shipped engine.

## Architecture

The voice profile system consists of three main components:

**Database Layer:** SQLite tables store profile metadata, sample references (cloned), and engine + voice ID (preset).

**File Storage:** Audio samples are stored on disk in a structured directory format. Preset profiles have no on-disk audio.

**Profile Module:** `backend/services/profiles.py` provides the business logic for CRUD operations and dispatches to the appropriate engine based on `voice_type`.

## Data Model

### VoiceProfile Table

```python
class VoiceProfile(Base):
    __tablename__ = "profiles"

    id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))
    name = Column(String, unique=True, nullable=False)
    description = Column(Text)
    language = Column(String, default="en")
    avatar_path = Column(String, nullable=True)
    effects_chain = Column(Text, nullable=True)

    # Voice type system — added v0.3.x
    voice_type = Column(String, default="cloned")    # "cloned" | "preset" | "designed"
    preset_engine = Column(String, nullable=True)    # e.g. "kokoro" — only for preset
    preset_voice_id = Column(String, nullable=True)  # e.g. "am_adam" — only for preset
    design_prompt = Column(Text, nullable=True)      # text description — only for designed (reserved)
    default_engine = Column(String, nullable=True)   # auto-selected engine, locked for preset

    created_at = Column(DateTime, default=datetime.utcnow)
    updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
```

The `voice_type` column discriminates the three flavors:

| `voice_type` | `preset_engine` | `preset_voice_id` | Samples in `profile_samples` |
| ------------ | --------------- | ----------------- | ---------------------------- |
| `cloned` | NULL | NULL | Required (≥1 row) |
| `preset` | engine name | voice ID string | None |
| `designed` | NULL | NULL | None (uses `design_prompt`) |

The `default_engine` column is set automatically when the profile is created. For preset profiles it is locked to the source engine: switching engines at generation time greys out the profile, and the UI auto-switches the engine back when the user clicks a greyed-out card (see the floating generate box and profile grid).

### ProfileSample Table

```python
class ProfileSample(Base):
    __tablename__ = "profile_samples"

    id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))
    profile_id = Column(String, ForeignKey("profiles.id"))
    audio_path = Column(String, nullable=False)
    reference_text = Column(Text, nullable=False)
```

This table is only populated for cloned profiles. Preset and designed profiles have zero rows in it.

## File Structure

Profiles are stored in the data directory:

<Files>
  <Folder name="data" defaultOpen>
    <Folder name="profiles">
      <Folder name="{profile_id}">
        <File name="{sample_id_1}.wav" />
        <File name="{sample_id_2}.wav" />
      </Folder>
    </Folder>
  </Folder>
</Files>

## Core Functions

### Creating a Profile

```python
async def create_profile(data: VoiceProfileCreate, db: Session) -> VoiceProfileResponse:
    # 1. Create database record
    db_profile = DBVoiceProfile(
        id=str(uuid.uuid4()),
        name=data.name,
        description=data.description,
        language=data.language,
    )
    db.add(db_profile)
    db.commit()

    # 2. Create profile directory
    profile_dir = profiles_dir / db_profile.id
    profile_dir.mkdir(parents=True, exist_ok=True)

    return VoiceProfileResponse.model_validate(db_profile)
```

### Adding Samples

When a sample is added, the audio is validated and copied to the profile directory:

```python
async def add_profile_sample(
    profile_id: str,
    audio_path: str,
    reference_text: str,
    db: Session,
) -> ProfileSampleResponse:
    # 1. Validate audio (duration, format, quality)
    is_valid, error_msg = validate_reference_audio(audio_path)
    if not is_valid:
        raise ValueError(f"Invalid reference audio: {error_msg}")

    # 2. Copy to profile directory (re-encoded as WAV)
    sample_id = str(uuid.uuid4())
    dest_path = profiles_dir / profile_id / f"{sample_id}.wav"
    audio, sr = load_audio(audio_path)
    save_audio(audio, str(dest_path), sr)

    # 3. Create database record
    db_sample = DBProfileSample(
        id=sample_id,
        profile_id=profile_id,
        audio_path=str(dest_path),
        reference_text=reference_text,
    )
    db.add(db_sample)
    db.commit()
```

### Voice Prompt Creation

When generating speech, samples are combined into a voice prompt:

```python
async def create_voice_prompt_for_profile(
    profile_id: str,
    db: Session,
) -> dict:
    samples = db.query(DBProfileSample).filter_by(profile_id=profile_id).all()

    if len(samples) == 1:
        # Single sample - use directly
        voice_prompt, _ = await tts_model.create_voice_prompt(
            samples[0].audio_path,
            samples[0].reference_text,
        )
    else:
        # Multiple samples - merge the audio, then encode the merged file
        combined_audio, combined_text = await tts_model.combine_voice_prompts(
            [s.audio_path for s in samples],
            [s.reference_text for s in samples],
        )
        # Write the merged audio to a WAV so it can be encoded like any sample
        _, sr = load_audio(samples[0].audio_path)
        combined_audio_path = str(profiles_dir / profile_id / "combined.wav")
        save_audio(combined_audio, combined_audio_path, sr)
        voice_prompt, _ = await tts_model.create_voice_prompt(
            combined_audio_path,
            combined_text,
        )

    return voice_prompt
```

## Audio Validation

Reference audio is validated before being accepted:

- **Duration:** 3-30 seconds recommended
- **Format:** WAV, MP3, FLAC, OGG, M4A supported
- **Sample Rate:** Engine-specific — the audio utility resamples to whatever the active engine expects (Whisper uses 16 kHz, most TTS engines use 24 kHz, LuxTTS outputs 48 kHz). Resampling happens on the fly; the stored sample retains its original rate.
- **Channels:** Converted to mono if stereo
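The `(is_valid, error_msg)` contract used by `validate_reference_audio()` elsewhere in these docs might look roughly like this. A hypothetical sketch that checks only the metadata-level rules above (it treats the recommended duration range as a hard check for illustration; the real validator decodes the file):

```python
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}

def validate_reference_audio_meta(path: str, duration_s: float) -> tuple[bool, str]:
    """Return (is_valid, error_msg); duration would come from the audio decoder."""
    ext = path[path.rfind("."):].lower() if "." in path else ""
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"unsupported format: {ext or 'none'}"
    if not 3 <= duration_s <= 30:
        return False, f"duration {duration_s:.1f}s outside the recommended 3-30s range"
    return True, ""
```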

## Export/Import

Profiles can be exported as ZIP archives for sharing:

<Files>
  <Folder name="profile_export.zip" defaultOpen>
    <File name="profile.json" />
    <Folder name="samples">
      <File name="sample_1.wav" />
      <File name="sample_1.json" />
    </Folder>
  </Folder>
</Files>
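Building that archive layout takes only the standard library. A minimal sketch, assuming the sample audio and metadata have already been read into memory (the real exporter streams the files from `data/profiles/{profile_id}/`):

```python
import io
import json
import zipfile

def export_profile(profile: dict, samples: list[tuple[str, bytes, dict]]) -> bytes:
    """Build the profile_export.zip layout in memory.

    `samples` holds (name, wav_bytes, metadata) tuples, a simplified stand-in
    for reading the real sample files off disk.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("profile.json", json.dumps(profile))
        for name, wav_bytes, meta in samples:
            zf.writestr(f"samples/{name}.wav", wav_bytes)
            zf.writestr(f"samples/{name}.json", json.dumps(meta))
    return buf.getvalue()
```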

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/profiles` | List all profiles |
| POST | `/profiles` | Create a profile |
| GET | `/profiles/{id}` | Get profile by ID |
| PUT | `/profiles/{id}` | Update profile |
| DELETE | `/profiles/{id}` | Delete profile |
| GET | `/profiles/{id}/samples` | Get profile samples |
| POST | `/profiles/{id}/samples` | Add sample to profile |
| PUT | `/profiles/samples/{id}` | Update sample text |
| DELETE | `/profiles/samples/{id}` | Delete sample |
| GET | `/profiles/{id}/export` | Export as ZIP |
| POST | `/profiles/import` | Import from ZIP |

## Best Practices

### Sample Quality

- Use clean audio with minimal background noise
- Ensure the reference text exactly matches what is spoken
- Multiple samples (3-5) improve voice cloning quality

### Language Matching

- Set the profile language to match the reference audio
- Supported languages: en, zh, ja, ko, de, fr, ru, pt, es, it

### Naming Conventions

- Use descriptive names that identify the voice
- Avoid special characters that may cause filesystem issues

**File:** `docs/content/docs/index.mdx` (new, 37 lines)

---
title: "Voicebox Documentation"
description: "Voicebox is a local-first voice cloning studio -- a free and open-source alternative to ElevenLabs."
---

Voicebox is a **local-first voice cloning studio** -- a free and open-source alternative to ElevenLabs. Clone voices from a few seconds of audio, generate speech in 23 languages across 7 TTS engines, apply post-processing effects, and compose multi-voice projects with a timeline editor.



- **Complete privacy** -- models and voice data stay on your machine
- **7 TTS engines** -- Qwen3-TTS, Qwen CustomVoice, LuxTTS, Chatterbox Multilingual, Chatterbox Turbo, HumeAI TADA, and Kokoro
- **Cloning and preset voices** -- zero-shot cloning from a reference sample, or 50+ curated preset voices via Kokoro and Qwen CustomVoice
- **23 languages** -- from English to Arabic, Japanese, Hindi, Swahili, and more
- **Post-processing effects** -- pitch shift, reverb, delay, chorus, compression, and filters
- **Expressive speech** -- paralinguistic tags like `[laugh]`, `[sigh]`, `[gasp]` via Chatterbox Turbo; natural-language delivery control via Qwen CustomVoice
- **Unlimited length** -- auto-chunking with crossfade for scripts, articles, and chapters
- **Stories editor** -- multi-track timeline for conversations, podcasts, and narratives
- **API-first** -- REST API for integrating voice synthesis into your own projects
- **Native performance** -- built with Tauri (Rust), not Electron
- **Runs everywhere** -- macOS (MLX/Metal), Windows (CUDA), Linux, AMD ROCm, Intel Arc, Docker

## Download

| Platform | Download |
|----------|----------|
| macOS (Apple Silicon) | [Download DMG](https://voicebox.sh/download/mac-arm) |
| macOS (Intel) | [Download DMG](https://voicebox.sh/download/mac-intel) |
| Windows | [Download MSI](https://voicebox.sh/download/windows) |
| Docker | `docker compose up` |

[View all releases](https://github.com/jamiepine/voicebox/releases/latest)

## Get Started

- [Installation](/overview/installation) -- download and install Voicebox
- [Quick Start](/overview/quick-start) -- get up and running in 5 minutes
- [API Reference](/api-reference) -- integrate voice synthesis into your apps

**File:** `docs/content/docs/meta.json` (new, 4 lines)

{
  "title": "Voicebox Documentation",
  "pages": ["overview", "api-reference", "developer"]
}

**File:** `docs/content/docs/overview/building-stories.mdx` (new, 37 lines)

---
title: "Building Stories"
description: "Create multi-voice narratives with the Stories Editor"
---

## Getting Started

The Stories Editor is perfect for creating podcasts, audiobooks, and multi-speaker content.

<Steps>
  <Step title="Create Story">
    **Stories** → **+ New Story**
  </Step>
  <Step title="Add Tracks">
    Create tracks for each speaker
  </Step>
  <Step title="Add Clips">
    Generate or drag audio to tracks
  </Step>
  <Step title="Arrange">
    Position and trim clips on the timeline
  </Step>
  <Step title="Export">
    Render the final audio
  </Step>
</Steps>

## Use Cases

- Multi-host podcasts
- Audiobook narration with character voices
- Game dialogue scenes
- Educational content with multiple speakers

## Coming Soon

Full timeline editor documentation will be added as features are finalized.

**File:** `docs/content/docs/overview/creating-voice-profiles.mdx` (new, 288 lines)

---
|
||||
title: "Creating Voice Profiles"
|
||||
description: "How to create voice profiles, both cloning-based and preset-based"
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
A **voice profile** is a saved voice you can reuse across generations, stories, and the API. As of 0.4, Voicebox profiles come in two flavors that map to two different ways of getting a voice:
|
||||
|
||||
| Profile type | What it stores | Use when… |
|
||||
| -------------- | ---------------------------------------------------- | -------------------------------------------------------- |
|
||||
| **Cloned** | One or more reference audio samples + a voice embedding | You want to replicate a specific person's voice |
|
||||
| **Preset** | A reference to a pre-built voice in a specific engine | You want a curated, production-ready voice with no audio prep |
|
||||
|
||||
Both types live in the same Profiles tab and behave the same way at generation time — pick the type that matches your goal and follow the workflow below.
|
||||
|
||||
<Callout type="info">
|
||||
Not sure which to use? Cloning gives you a *specific* voice but needs clean audio. Preset gives you *good* voices instantly but you don't get to choose who they sound like.
|
||||
</Callout>
|
||||
|
||||
## Workflow A — Cloned Profiles
|
||||
|
||||
Use this when you want to replicate a specific person's voice from a recording.
|
||||
|
||||
<Steps>
|
||||
<Step title="Prepare Audio">
|
||||
10-30 seconds of clear speech, minimal background noise. See [Voice Cloning](/overview/voice-cloning) for the engine catalog.
|
||||
</Step>
|
||||
<Step title="Create Profile">
|
||||
**Profiles** → **+ New Profile** → choose a cloning engine (Qwen3-TTS, Chatterbox Multilingual, Chatterbox Turbo, LuxTTS, or TADA)
|
||||
</Step>
|
||||
<Step title="Upload or Record Sample">
|
||||
Drag in an audio file, or record directly with the in-app recorder
|
||||
</Step>
|
||||
<Step title="Generate to Test">
|
||||
Use the profile to generate a test phrase. If quality is poor, add more samples
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
### Audio Requirements (Cloning Only)
|
||||
|
||||
<Cards>
|
||||
<Card title="Duration">
|
||||
**10-30 seconds**
|
||||
|
||||
Too short: Poor quality
|
||||
Too long: Unnecessary
|
||||
</Card>
|
||||
<Card title="Clarity">
|
||||
**Clear speech**
|
||||
|
||||
No background noise
|
||||
No music or overlapping voices
|
||||
</Card>
|
||||
<Card title="Quality">
|
||||
**High fidelity**
|
||||
|
||||
44.1 kHz or 48 kHz sample rate
|
||||
Minimal compression
|
||||
</Card>
|
||||
<Card title="Content">
|
||||
**Natural speech**
|
||||
|
||||
Conversational tone
|
||||
Complete sentences
|
||||
</Card>
|
||||
</Cards>
|
||||
|
||||
### File Formats
|
||||
|
||||
Supported formats:
|
||||
- **WAV** (recommended) — Lossless quality
|
||||
- **MP3** — Acceptable, minimal compression
|
||||
- **M4A** — Acceptable
|
||||
- **FLAC** — Lossless alternative
|
||||
|
||||
<Callout type="info">
|
||||
Use WAV for best results. Avoid heavily compressed formats.
|
||||
</Callout>
|
||||
|
||||
### Recording Tips
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Quiet Space">
|
||||
- Record in a quiet room
|
||||
- Turn off fans, AC, appliances
|
||||
- Close windows to reduce outside noise
|
||||
- Use soft furnishings to reduce echo
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Microphone Placement">
|
||||
- 6-12 inches from mouth
|
||||
- Slight angle to reduce plosives (p, b, t)
|
||||
- Use a pop filter if available
|
||||
- Maintain consistent distance
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Recording Settings">
|
||||
- 44.1 kHz or 48 kHz sample rate
|
||||
- 16-bit or 24-bit depth
|
||||
- Mono is fine (stereo will be converted)
|
||||
- Avoid automatic gain control
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
### Speaking Style
|
||||
|
||||
- **Natural pace** — Don't rush or speak too slowly
|
||||
- **Clear articulation** — Pronounce words clearly
|
||||
- **Consistent volume** — Maintain steady loudness
|
||||
- **Normal tone** — Speak as you normally would
|
||||
- **Complete sentences** — Avoid fragments or "ums"
|
||||
|
||||
### Multiple Samples
|
||||
|
||||
Adding multiple samples can significantly improve quality:
|
||||
|
||||
<Cards>
|
||||
<Card title="Robustness">
|
||||
Model learns a more complete representation
|
||||
</Card>
|
||||
<Card title="Versatility">
|
||||
Handles different speaking styles better
|
||||
</Card>
|
||||
<Card title="Quality">
|
||||
Reduces artifacts and improves naturalness
|
||||
</Card>
|
||||
<Card title="Consistency">
|
||||
More reliable across different texts
|
||||
</Card>
|
||||
</Cards>
|
||||
|
||||
Consider adding samples with:
|
||||
|
||||
1. **Different tones** — casual, formal, excited, calm
|
||||
2. **Different content** — narratives, questions, statements
|
||||
3. **Different recording conditions** — studio quality, room acoustics
|
||||
|
||||
<Callout type="warn">
|
||||
All samples should be from the **same speaker**. Mixing voices will produce poor results.
|
||||
</Callout>
|
||||
|
||||
### Processing Existing Audio
|
||||
|
||||
If you have existing audio (podcasts, videos, etc.):
|
||||
|
||||
<Steps>
|
||||
<Step title="Find Clean Speech">
|
||||
Look for segments with just the target speaker, no background music, minimal noise
|
||||
</Step>
|
||||
<Step title="Use Audio Editor">
|
||||
Tools like Audacity or Adobe Audition: cut clean 10-30s segments, remove silence at start/end, normalize volume
|
||||
</Step>
|
||||
<Step title="Export as WAV">
|
||||
Save as high-quality WAV file
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
For light background noise, use Audacity's noise reduction (gentle settings — over-processing introduces artifacts).
|
### Testing & Iteration

After creating a cloned profile:

<Steps>
<Step title="Generate Test">
Try a simple phrase: `"Hello, this is a test of my voice profile."`
</Step>
<Step title="Evaluate Quality">
Listen for natural tone, clear pronunciation, proper prosody, and an absence of artifacts
</Step>
<Step title="Iterate">
If quality is poor: add more samples, try different source audio, check sample quality
</Step>
</Steps>

#### Common Issues

<AccordionGroup>
<Accordion title="Robotic Voice">
**Cause**: Samples are low quality or too short

**Fix**: Use longer, higher-quality samples
</Accordion>

<Accordion title="Wrong Tone">
**Cause**: Sample tone doesn't match the desired output

**Fix**: Record samples in the style you want to generate
</Accordion>

<Accordion title="Artifacts/Glitches">
**Cause**: Background noise or audio issues in samples

**Fix**: Clean up samples or re-record in a quieter environment
</Accordion>
</AccordionGroup>

## Workflow B — Preset Profiles

Use this when you want a ready-made voice without recording anything. Available engines: **Kokoro 82M** (50 voices) and **Qwen CustomVoice** (9 voices). See [Preset Voices](/overview/preset-voices) for the full catalog.

<Steps>
<Step title="Create Profile">
**Profiles** → **+ New Profile** → choose **Kokoro** or **Qwen CustomVoice** as the engine
</Step>
<Step title="Pick a Voice">
The engine's voice catalog appears. Click any voice to preview it
</Step>
<Step title="Name and Save">
Give the profile a name. No audio sample required
</Step>
<Step title="Generate">
The profile is ready immediately — use it in the floating generate box or the Generate page
</Step>
</Steps>

<Callout type="info">
Preset profiles are **locked to their source engine**. Switching to a different engine in the floating generate box greys out the profile, since the voice only exists in that engine. Clicking a greyed-out profile switches the engine back automatically.
</Callout>

### Qwen CustomVoice + Instruct

Preset voices in Qwen CustomVoice support **delivery instructions** — natural-language style control over tone, pace, and emotion. The floating generate box shows a slider icon next to the generate button when a Qwen CustomVoice profile is selected; click it to reveal the instruct textarea.

See [Preset Voices → Using Instruct Mode](/overview/preset-voices#using-instruct-mode) for examples.

## Advanced Tips

### Celebrity / Character Voices (Cloning)

For cloning public figures or characters:

1. **Legal considerations** — Make sure you have the rights, or that your use clearly qualifies as fair use
2. **Source quality** — Find high-quality interview audio or clean clips
3. **Consistency** — Use clips where the speaker sounds similar throughout
4. **Multiple samples** — Especially important for recognizable voices

### Accent & Dialect (Cloning)

Cloning models preserve accent and dialect:

- British English samples generate British English output
- Southern-accent samples produce Southern-accent output
- Regional pronunciations are maintained

### Emotion Transfer (Cloning)

The emotional tone of your samples carries over into generation:

- Energetic samples → energetic output
- Calm samples → calm output
- Mix samples for a more versatile profile

For Qwen CustomVoice presets, use the **instruct** field instead of relying on sample emotion — that's exactly what it controls.

## Managing Profiles

### Organization

- **Descriptive names** — "John Smith - Professional Narrator"
- **Add descriptions** — Note recording conditions, use cases, or which preset voice
- **Language tags** — Mark the primary language
- **Archive unused** — Keep the profile list manageable

### Export / Import

- **Export** profiles to share or back up
- **Import** profiles from colleagues or teammates
- **Cloned profiles** export with their voice embeddings (not the original audio)
- **Preset profiles** export as engine + voice ID metadata only — the importer must have that engine's model installed

## Next Steps

<Cards>
<Card title="Voice Cloning" href="/overview/voice-cloning">
Engine catalog and best practices for cloning
</Card>
<Card title="Preset Voices" href="/overview/preset-voices">
Full catalog of Kokoro and Qwen CustomVoice voices
</Card>
<Card title="Generate Speech" href="/overview/generating-speech">
Use your profile to generate speech
</Card>
<Card title="Build Stories" href="/overview/building-stories">
Create multi-voice narratives
</Card>
</Cards>
240
docs/content/docs/overview/docker.mdx
Normal file
@@ -0,0 +1,240 @@
---
title: "Docker Deployment"
description: "Run Voicebox as a headless server with a web UI using Docker"
---

## Overview

Voicebox can run as a Docker container with a full web UI -- no desktop app required. This is ideal for headless servers, shared GPU machines, or self-hosted deployments.

## Quick Start

```bash
git clone https://github.com/jamiepine/voicebox.git
cd voicebox
docker compose up
```

Open [http://localhost:17493](http://localhost:17493) in your browser. The full Voicebox UI is served directly from the backend.

<Callout type="info">
The first build takes a few minutes (compiling the frontend, installing Python dependencies). Subsequent starts are fast thanks to Docker layer caching.
</Callout>

## How It Works

The Docker image uses a three-stage build:

1. **Frontend** -- builds the React SPA with Bun and Vite
2. **Backend** -- installs Python dependencies and TTS model packages
3. **Runtime** -- combines both into a minimal image running the FastAPI server

The backend serves the web UI automatically when the built frontend is present. All API routes work exactly as they do in the desktop app.

## Configuration

### docker-compose.yml

The default `docker-compose.yml` binds to localhost only, mounts persistent volumes for data and the model cache, and sets sensible resource limits:

```yaml
services:
  voicebox:
    build: .
    container_name: voicebox
    restart: unless-stopped
    ports:
      - "127.0.0.1:17493:17493"
    volumes:
      - ./output:/app/data/generations
      - voicebox-data:/app/data
      - huggingface-cache:/home/voicebox/.cache/huggingface
    environment:
      - LOG_LEVEL=info
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 8G
```

### Exposing to Your Network

By default the container only listens on `127.0.0.1`. To allow other machines on your network to connect, change the port binding:

```yaml
ports:
  - "0.0.0.0:17493:17493"
```

<Callout type="warn">
The API has no built-in authentication. Only expose it to trusted networks, or put a reverse proxy with auth in front of it.
</Callout>

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `LOG_LEVEL` | `info` | Logging verbosity (`debug`, `info`, `warning`, `error`) |
| `VOICEBOX_MODELS_DIR` | (HuggingFace cache) | Custom path for model storage |
| `VOICEBOX_CORS_ORIGINS` | (local origins) | Additional CORS origins, comma-separated |
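These go in the `environment:` block of your compose file. A sketch with illustrative values (the custom models path assumes you also mount a volume there):

```yaml
environment:
  - LOG_LEVEL=debug
  # assumes a volume is mounted at /models
  - VOICEBOX_MODELS_DIR=/models
  - VOICEBOX_CORS_ORIGINS=http://192.168.1.50:5173
```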
### Resource Limits

The default compose file limits the container to 4 CPUs and 8 GB of RAM. Adjust these based on your hardware:

```yaml
deploy:
  resources:
    limits:
      cpus: '8'
      memory: 16G
```

<Callout type="info">
TTS model inference is memory-intensive. 8 GB is the minimum for running a single engine; 16 GB+ is recommended if you want multiple engines loaded simultaneously.
</Callout>

## Volumes

| Volume | Container Path | Purpose |
|--------|---------------|---------|
| `./output` | `/app/data/generations` | Generated audio files (bind mount, easy access from the host) |
| `voicebox-data` | `/app/data` | Profiles, database, cache |
| `huggingface-cache` | `/home/voicebox/.cache/huggingface` | Downloaded models (persists across rebuilds) |

The `huggingface-cache` volume is important -- without it, models would be re-downloaded every time the container is rebuilt.

## GPU Acceleration

### NVIDIA GPU (CUDA)

To use your NVIDIA GPU inside the container, install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) and add GPU access to your compose file:

```yaml
services:
  voicebox:
    build: .
    # ... existing config ...
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

### AMD GPU (ROCm)

For AMD GPUs, use the ROCm runtime:

```yaml
services:
  voicebox:
    build: .
    # ... existing config ...
    devices:
      - /dev/kfd
      - /dev/dri
    group_add:
      - video
```

### CPU Only

The default configuration runs on CPU. This works fine, but generation will be slower. LuxTTS is the fastest engine on CPU (150x realtime).

## Security

The Docker image follows security best practices:

- **Non-root user** -- the server runs as `voicebox`, not `root`
- **Localhost binding** -- only accessible from the host machine by default
- **Health checks** -- automatic restart if the server hangs (the `/health` endpoint is polled every 30s)
- **CORS restricted** -- only local origins are allowed by default

### Running Behind a Reverse Proxy

For production deployments, put Voicebox behind nginx or Caddy with TLS and authentication:

```nginx
server {
    listen 443 ssl;
    server_name voicebox.example.com;

    ssl_certificate /etc/ssl/certs/voicebox.pem;
    ssl_certificate_key /etc/ssl/private/voicebox.key;

    auth_basic "Voicebox";
    auth_basic_user_file /etc/nginx/.htpasswd;

    location / {
        proxy_pass http://127.0.0.1:17493;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```
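If you prefer Caddy, the equivalent Caddyfile is shorter, since Caddy provisions TLS automatically for public domains. A sketch (the password hash is a placeholder; generate your own with `caddy hash-password`):

```caddyfile
voicebox.example.com {
    basic_auth {
        # replace with your own bcrypt hash from `caddy hash-password`
        admin $2a$14$...
    }
    reverse_proxy 127.0.0.1:17493
}
```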
## Troubleshooting

### Container starts but UI shows JSON

If you see `{"message": "voicebox API", ...}` instead of the web UI, the frontend build may have failed during the Docker build. Rebuild and check the logs:

```bash
docker compose build --no-cache
```

Look for errors in the "Build frontend" stage.

### Models downloading on every restart

Make sure the `huggingface-cache` volume is configured. Without it, the model cache is lost when the container stops:

```yaml
volumes:
  - huggingface-cache:/home/voicebox/.cache/huggingface
```

### Out of memory

TTS models are large. If the container is killed by the OOM killer, increase the memory limit:

```yaml
deploy:
  resources:
    limits:
      memory: 16G
```

### Port already in use

```bash
# Check what's using port 17493
lsof -i :17493
```

Or map a different host port in `docker-compose.yml`:

```yaml
ports:
  - "127.0.0.1:8080:17493"
```

## Prebuilt Images (Coming Soon)

We plan to publish prebuilt Docker images to GitHub Container Registry so you won't need to build locally:

```bash
# Not available yet — coming in a future release
docker run -p 17493:17493 ghcr.io/jamiepine/voicebox:latest
```

The CPU image will be ~3-4 GB (Python + PyTorch + TTS packages). A separate CUDA tag (~6-8 GB) will be available for NVIDIA GPU users. This is normal for ML containers.

For now, use `docker compose up` to build from source as described above.

## Connecting the Desktop App

You can also use the desktop app as a frontend for a Docker-hosted backend. In the desktop app, go to **Settings -> Server**, enable **Remote Mode**, and enter `http://<server-ip>:17493`.

See the [Remote Mode guide](/overview/remote-mode) for details.
65
docs/content/docs/overview/generating-speech.mdx
Normal file
@@ -0,0 +1,65 @@
---
title: "Generating Speech"
description: "Generate high-quality speech from text"
---

## Basic Generation

<Steps>
<Step title="Select Profile">
Choose a voice profile from the dropdown
</Step>
<Step title="Enter Text">
Type or paste your text
</Step>
<Step title="Generate">
Click **Generate** and wait a few seconds
</Step>
<Step title="Play & Export">
Preview and download the result
</Step>
</Steps>

## Text Formatting Tips

The way you format text affects the output quality.

### Punctuation

Use proper punctuation for natural pauses:

```
Good: "Hello! How are you today? I'm doing great."
Bad: "Hello how are you today Im doing great"
```

### Emphasis

Use formatting to suggest emphasis:

```
- ALL CAPS for louder/emphasized: "That was AMAZING!"
- Italics for subtle emphasis: "I *really* enjoyed that"
- Bold for strong emphasis: "This is **very** important"
```

<Callout type="info">
The model interprets these hints, but results may vary.
</Callout>

## Advanced Features

### Batch Generation

For long-form content, split the text into smaller chunks for better control and faster processing.
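A sentence-aligned chunker is easy to sketch. This is an illustrative helper, not a Voicebox API; it splits on sentence-ending punctuation and packs sentences greedily up to a character budget:

```python
import re

def chunk_text(text: str, max_chars: int = 400) -> list[str]:
    """Split text into sentence-aligned chunks of at most max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Start a new chunk when adding this sentence would overflow.
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Splitting at sentence boundaries matters: chunks that end mid-sentence tend to produce unnatural prosody at the joins.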
### Voice Caching

Voicebox caches voice prompts for faster re-generation with the same profile.

## Coming Soon

- Real-time streaming
- Word-level timing control
- Emotion and style controls
- SSML support
88
docs/content/docs/overview/generation-history.mdx
Normal file
@@ -0,0 +1,88 @@
---
title: "Generation History"
description: "Track and manage all your generated audio"
---

## Overview

Voicebox keeps a complete history of all generated audio, making it easy to find, reuse, and manage your creations.

## Features

<Cards>
<Card title="Full History" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><circle cx="12" cy="12" r="10"/><polyline points="12 6 12 12 16 14"/></svg>}>
Every generation is automatically saved
</Card>
<Card title="Search & Filter" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><circle cx="11" cy="11" r="8"/><path d="m21 21-4.3-4.3"/></svg>}>
Find by text, voice, or date
</Card>
<Card title="Re-generate" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M3 12a9 9 0 0 1 9-9 9.75 9.75 0 0 1 6.74 2.74L21 8"/><path d="M21 3v5h-5"/><path d="M21 12a9 9 0 0 1-9 9 9.75 9.75 0 0 1-6.74-2.74L3 16"/><path d="M8 16H3v5"/></svg>}>
Regenerate any past generation with one click
</Card>
<Card title="Export" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4"/><polyline points="7 10 12 15 17 10"/><line x1="12" y1="15" x2="12" y2="3"/></svg>}>
Download individual or batch exports
</Card>
</Cards>

## Viewing History

Navigate to the **History** tab to see all your generations.

Each entry shows:

- Generated text
- Voice profile used
- Timestamp
- Audio duration
- Language

## Actions

### Play
Click any generation to play it immediately.

### Re-generate
Regenerate with the same settings, or modify the text or voice first.

### Download
Export as WAV, MP3, or M4A.

### Delete
Remove unwanted generations to free up space.

### Add to Story
Drag generations onto the Stories Editor timeline.

## Search & Filter

<Tabs items={["By Text", "By Voice", "By Date"]}>
<Tab value="By Text">
Search for specific text content, e.g. `"Hello world"`
</Tab>
<Tab value="By Voice">
Filter by voice profile using the dropdown
</Tab>
<Tab value="By Date">
Filter by date range: last 7 days, last 30 days, or a custom range
</Tab>
</Tabs>

## Storage

History is stored locally:

- **macOS**: `~/Library/Application Support/sh.voicebox.app/data/`
- **Windows**: `%APPDATA%/sh.voicebox.app/data/`
- **Linux**: `~/.config/sh.voicebox.app/data/`

<Callout type="warn">
Deleting the data directory will remove all history. Export important files first.
</Callout>
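Because history lives in plain files under the data directory, you can also inspect it from a script. A minimal sketch; the directory path and a flat `.wav` layout are assumptions based on the list above:

```python
from pathlib import Path

def recent_generations(data_dir: str, limit: int = 5) -> list[str]:
    """Return the newest .wav files under a Voicebox data directory."""
    wavs = sorted(
        Path(data_dir).rglob("*.wav"),
        key=lambda p: p.stat().st_mtime,  # newest first
        reverse=True,
    )
    return [p.name for p in wavs[:limit]]
```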
236
docs/content/docs/overview/gpu-acceleration.mdx
Normal file
@@ -0,0 +1,236 @@
---
title: "GPU Acceleration"
description: "How Voicebox uses your GPU — auto-detection, manual setup, troubleshooting"
---

## Overview

Voicebox auto-detects available accelerators on first launch and picks the fastest backend it can use. For most people this just works — open the app and you're already on the right backend.

This page is for the cases where it doesn't:

- You have a GPU but Voicebox is running on CPU
- You upgraded GPUs (especially to RTX 50-series / Blackwell) and generation broke
- You want to switch backends manually (e.g. force MLX over PyTorch on Apple Silicon)
- You see `[UNSUPPORTED - see logs]` next to your GPU in Settings

## Backend Matrix

| Platform | Auto-selected backend | Notes |
| --- | --- | --- |
| **macOS Apple Silicon** | MLX (Metal) | 4-5x faster than PyTorch via the Apple Neural Engine |
| **macOS Intel** | PyTorch CPU | No GPU acceleration available; PyTorch ≥ 2.2 only |
| **Windows + NVIDIA** | PyTorch CUDA (cu128) | Auto-downloads the CUDA backend binary on first use |
| **Windows + Intel Arc** | PyTorch XPU (IPEX) | New in 0.4 — works with Arc A-series and B-series |
| **Windows generic GPU** | DirectML | Universal Windows GPU support; slower than CUDA |
| **Linux + NVIDIA** | PyTorch CUDA (cu128) | Same auto-download flow as Windows |
| **Linux + AMD** | PyTorch ROCm | Auto-configures `HSA_OVERRIDE_GFX_VERSION` |
| **Linux + Intel Arc** | PyTorch XPU (IPEX) | |
| **Any (no GPU)** | PyTorch CPU | Works everywhere; expect 5-50x slower than GPU |

The detected backend is shown in Settings → GPU. Startup logs also print the chosen backend and the device name.
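The selection order in the table amounts to a simple priority walk. A hedged sketch of the idea, not Voicebox's actual detection code; `caps` stands in for whatever hardware probing returns:

```python
def pick_backend(caps: set[str]) -> str:
    """Return the fastest available backend, falling back to CPU."""
    priority = ("mlx", "cuda", "xpu", "rocm", "directml", "mps", "cpu")
    for backend in priority:
        # CPU is always available, so the loop always terminates there.
        if backend in caps or backend == "cpu":
            return backend
    return "cpu"
```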
## Apple Silicon — MLX vs PyTorch

On M-series Macs, Voicebox ships an MLX-optimized backend that uses the Apple Neural Engine. It's **4-5x faster** than the PyTorch (CPU/Metal) path for supported engines.

| Engine | MLX support | Notes |
| --- | --- | --- |
| Qwen3-TTS | ✅ Native | Uses MLX exclusively when available |
| Chatterbox / Turbo | PyTorch MPS | Falls back to Metal via PyTorch |
| LuxTTS | PyTorch MPS | |
| TADA | PyTorch MPS | |
| Kokoro | PyTorch MPS | Requires `PYTORCH_ENABLE_MPS_FALLBACK=1` |
| Qwen CustomVoice | PyTorch MPS | |
| Whisper (transcribe) | ✅ Native | MLX-Whisper is the default on Apple Silicon |

The Whisper Turbo + MLX combo dropped transcription latency from ~20s to ~2-3s on M-series chips (see the CHANGELOG entry for v0.1.10).

## Windows / Linux + NVIDIA — The CUDA Backend Swap

Voicebox doesn't bundle CUDA into the main installer (it would balloon downloads into multi-gigabyte territory for users without an NVIDIA GPU). Instead, when you first need it, the app downloads a separate **CUDA backend binary** containing the PyTorch + CUDA runtime.

<Steps>
<Step title="Open Settings → GPU">
If an NVIDIA GPU is detected, you'll see "Install CUDA backend" in the GPU panel
</Step>
<Step title="Click Install">
The app downloads two archives separately:
- **Server core** (~200-400 MB) — versioned with each Voicebox release
- **CUDA libs** (~4 GB) — the heavy PyTorch + CUDA DLLs, versioned independently
</Step>
<Step title="Restart">
Voicebox restarts to swap in the CUDA backend
</Step>
</Steps>

<Callout type="info">
The split-archive design (added in v0.4) means most Voicebox upgrades only redownload the small server-core archive. The 4 GB libs archive is only refreshed when the underlying CUDA toolkit or torch major version changes.
</Callout>

### Auto-update

When a new Voicebox release ships, the GPU panel checks whether the bundled server core matches the installed CUDA version. If only the core changed (typical), it pulls the new core in the background. If the libs version changed (rare — only on cu126 → cu128 type bumps), you'll be prompted to confirm the larger download.

## RTX 50-series / Blackwell

Voicebox 0.4 added explicit RTX 50-series support:

- CUDA toolkit upgraded to **cu128** (previous releases used cu126, which lacks Blackwell kernels)
- Build pinned with `TORCH_CUDA_ARCH_LIST=...12.0+PTX` for forward compatibility

If you're on an RTX 5070 / 5080 / 5090 and you see "no kernel image is available" errors:

1. Make sure you're on Voicebox **≥ 0.4.0** (Settings → About)
2. Reinstall the CUDA backend (Settings → GPU → Reinstall CUDA backend) — older installs may have stale cu126 libs
3. If errors persist, see the GPU compatibility warnings section below

## Intel Arc (XPU)

New in 0.4. Works with both Arc A-series (Alchemist: A380, A580, A750, A770) and B-series (Battlemage) cards.

### Setup

Voicebox auto-detects Arc GPUs and routes through Intel's PyTorch XPU backend (powered by IPEX, the Intel Extension for PyTorch). No installation step is needed beyond the standard Voicebox install.

Verify it's working:

- Settings → GPU should show **XPU** followed by your Arc model name (e.g. `XPU (Intel Arc A770)`)
- Startup logs print `Backend: PYTORCH` and `GPU: XPU (Intel Arc ...)`

### Engines on XPU

All PyTorch-based engines work on XPU. Performance generally falls between CPU and CUDA — expect a ~2-3x speedup over CPU for the larger models.

## DirectML

The fallback for Windows users with non-NVIDIA, non-Intel-Arc GPUs (older AMD discrete cards, integrated GPUs, etc.). Slower than CUDA and XPU, but provides some acceleration over CPU.

Auto-selected when no other GPU backend is available.

## AMD ROCm (Linux)

ROCm provides PyTorch GPU acceleration on AMD discrete GPUs. Voicebox auto-configures `HSA_OVERRIDE_GFX_VERSION` for common cards that need the override.

### Verifying

```bash
# In a terminal
echo $HSA_OVERRIDE_GFX_VERSION
# Should show e.g. 10.3.0 for the RX 6000 series
```

If detection fails, set the variable manually before launching Voicebox:

```bash
export HSA_OVERRIDE_GFX_VERSION=10.3.0
voicebox
```

Common values:

- `10.3.0` — RX 6000 series (RDNA 2)
- `11.0.0` — RX 7000 series (RDNA 3)
- `9.0.0` — older Vega cards
## GPU Compatibility Warnings

Voicebox 0.4 added a runtime check that compares your GPU's compute capability against the architectures the bundled PyTorch was compiled for. If they don't match, you'll see:

- A startup log line: `WARNING: GPU COMPATIBILITY: <your GPU> is not supported by this PyTorch build...`
- The GPU label in Settings shows `[UNSUPPORTED - see logs]`
- The `/health` API returns a populated `gpu_compatibility_warning` field

### What to do

The most common trigger is a brand-new GPU architecture that pre-built PyTorch wheels don't yet cover natively. In order of preference:

1. **Update Voicebox** — newer releases ship newer PyTorch with broader arch support
2. **Reinstall the CUDA backend** — Settings → GPU → Reinstall CUDA backend
3. **For bleeding-edge GPUs (newer than current Blackwell):** install PyTorch nightly manually:
   ```bash
   pip install torch --index-url https://download.pytorch.org/whl/nightly/cu128 --force-reinstall
   ```
   Then point Voicebox at that environment via [Remote Mode](/overview/remote-mode) until stable PyTorch catches up.
4. **Fall back to CPU** temporarily — set `VOICEBOX_FORCE_CPU=1` before launching

## CPU-Only Fallback

When no GPU is available (or you've forced it off), Voicebox runs the PyTorch CPU backend. Expect:

- 5-50x slower generation depending on the engine and text length
- Heavy CPU usage during generation
- Some engines work better than others on CPU:
  - **Kokoro 82M** — runs at realtime on modern CPUs
  - **LuxTTS** — exceeds 150x realtime on CPU
  - **Chatterbox Turbo (350M)** — usable but slow
  - Larger models (Qwen 1.7B, Chatterbox Multilingual, TADA 3B) — painfully slow

For CPU-bound use cases, prefer the smaller, lighter engines.

## Verifying Your Setup

Three places to check that the right backend is being used:

<Steps>
<Step title="Settings → GPU">
Shows the detected backend, GPU model, and VRAM (when applicable). Look for the `[UNSUPPORTED - see logs]` suffix
</Step>
<Step title="Settings → Logs">
The "Server logs" tab shows the startup banner with `Backend: <type>` and `GPU: <name>`
</Step>
<Step title="Health endpoint">
`curl http://localhost:17493/health` returns a JSON payload with `backend_type`, `backend_variant`, and `gpu_compatibility_warning` (when applicable)
</Step>
</Steps>

## Troubleshooting

<AccordionGroup>
<Accordion title="Settings shows CPU instead of my GPU">
- On NVIDIA: install the CUDA backend (Settings → GPU)
- On Intel Arc: confirm IPEX detection in the startup logs; restart the app after a driver update
- On AMD Linux: check that `HSA_OVERRIDE_GFX_VERSION` is set
</Accordion>

<Accordion title="'no kernel image is available' / 'CUDA error'">
Almost always means the bundled PyTorch doesn't have kernels for your GPU's compute capability.

1. Update to Voicebox ≥ 0.4.0 (Blackwell support was added there)
2. Reinstall the CUDA backend
3. If still broken, install PyTorch nightly and use Remote Mode
</Accordion>

<Accordion title="Out of memory (CUDA)">
- Switch to a smaller model size (e.g. Qwen3 0.6B instead of 1.7B)
- Use Settings → Models to unload engines you're not using
- `low_cpu_mem_usage` is already enabled on CPU; on CUDA, the engine's `device_map` handles offload automatically
- Close other GPU applications
</Accordion>

<Accordion title="MPS fallback errors on macOS">
Some operations don't have a Metal implementation. Voicebox sets `PYTORCH_ENABLE_MPS_FALLBACK=1` for engines that need it (notably Kokoro), but if you launch from a custom environment, set it manually:
```bash
export PYTORCH_ENABLE_MPS_FALLBACK=1
```
</Accordion>

<Accordion title="Generation works but is slow on my GPU">
- Check that Settings → GPU shows your GPU (not CPU)
- Check VRAM usage — you may be paging to system memory
- Try a smaller model
- For NVIDIA: confirm cu128 is installed (Settings → GPU → version)
</Accordion>
</AccordionGroup>

## Next Steps

<Cards>
<Card title="Remote Mode" href="/overview/remote-mode">
Run the backend on a different machine with a stronger GPU
</Card>
<Card title="Model Management" href="/developer/model-management">
Unload models to free GPU memory
</Card>
<Card title="Troubleshooting" href="/overview/troubleshooting">
General troubleshooting beyond GPU
</Card>
</Cards>
119
docs/content/docs/overview/installation.mdx
Normal file
@@ -0,0 +1,119 @@
|
||||
---
|
||||
title: "Installation"
|
||||
description: "Download and install Voicebox on macOS, Windows, or Linux"
|
||||
---
|
||||
|
||||
## Download
|
||||
|
||||
Voicebox is available for macOS and Windows, with Linux builds coming soon.
|
||||
|
||||
<Cards>
|
||||
<Card title="macOS" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M12 2c-1.5 0-2.8.4-3.9 1.1A5.5 5.5 0 0 0 4 2.5C2.5 2.5 1 4 1 6c0 3.5 2.5 6 5 7.5C5 16 4 18 4 20c0 1.5.5 2.5 1.5 3C6.5 23.5 8 24 9.5 24c2 0 3.5-.5 5-2 1.5 1.5 3 2 5 2 1.5 0 3-.5 4-1 1-.5 1.5-1.5 1.5-3 0-2-1-4-2-6.5 2.5-1.5 5-4 5-7.5 0-2-1.5-3.5-3-3.5-.9 0-2.1.4-3.1 1.1A6.5 6.5 0 0 0 12 2Z"/></svg>}>
|
||||
Download for Apple Silicon or Intel Macs
|
||||
</Card>
|
||||
<Card title="Windows" icon={<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><rect x="2" y="3" width="20" height="14" rx="2" ry="2"/><line x1="8" y1="21" x2="16" y2="21"/><line x1="12" y1="17" x2="12" y2="21"/></svg>}>
|
||||
Download MSI installer or Setup executable
|
||||
</Card>
|
||||
</Cards>
|
||||
|
||||
### macOS
|
||||
|
||||
<Tabs items={["Apple Silicon", "Intel"]}>
|
||||
<Tab value="Apple Silicon">
|
||||
Download: [voicebox_aarch64.app.tar.gz](https://github.com/jamiepine/voicebox/releases/latest/download/voicebox_aarch64.app.tar.gz)
|
||||
|
||||
```bash
|
||||
# Extract the archive
|
||||
tar -xzf voicebox_aarch64.app.tar.gz
|
||||
|
||||
# Move to Applications
|
||||
mv Voicebox.app /Applications/
|
||||
```
|
||||
</Tab>
|
||||
<Tab value="Intel">
|
||||
Download: [voicebox_x64.app.tar.gz](https://github.com/jamiepine/voicebox/releases/latest/download/voicebox_x64.app.tar.gz)
|
||||
|
||||
```bash
|
||||
# Extract the archive
|
||||
tar -xzf voicebox_x64.app.tar.gz
|
||||
|
||||
# Move to Applications
|
||||
mv Voicebox.app /Applications/
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
### Windows

<Tabs items={["MSI Installer", "Setup Executable"]}>
<Tab value="MSI Installer">
Download: [voicebox_x64_en-US.msi](https://github.com/jamiepine/voicebox/releases/latest/download/voicebox_x64_en-US.msi)

Double-click the MSI file and follow the installation wizard.
</Tab>
<Tab value="Setup Executable">
Download: [voicebox_x64-setup.exe](https://github.com/jamiepine/voicebox/releases/latest/download/voicebox_x64-setup.exe)

Run the executable and follow the installation wizard.
</Tab>
</Tabs>

### Linux

<Callout type="info">
Linux builds are coming soon. Currently blocked by GitHub runner disk space limitations.
</Callout>

## First Launch

When you launch Voicebox for the first time:

1. **Model Download** — The TTS engine you generate with first downloads its model automatically. Sizes range from ~350 MB (Kokoro) to ~8 GB (TADA 3B). Most users start with Qwen 1.7B (~3.5 GB).
2. **Data Directory** — Voice profiles and generated audio are stored in:
   - macOS: `~/Library/Application Support/sh.voicebox.app/`
   - Windows: `%APPDATA%/sh.voicebox.app/`
   - Linux: `~/.config/sh.voicebox.app/`
3. **Backend Server** — The bundled Python server starts automatically.

<Callout type="info">
The first generation is slower because models are downloaded. Subsequent runs use cached models.
</Callout>

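If you script against the data directory, it can be resolved per platform. The sketch below is a hypothetical helper that mirrors the documented paths; Voicebox resolves this internally and this function is not part of its API.

```python
import os
import sys
from pathlib import Path


def voicebox_data_dir() -> Path:
    """Return the per-platform Voicebox data directory.

    Mirrors the documented paths; a hypothetical helper, not Voicebox API.
    """
    if sys.platform == "darwin":
        return Path.home() / "Library" / "Application Support" / "sh.voicebox.app"
    if sys.platform.startswith("win"):
        return Path(os.environ["APPDATA"]) / "sh.voicebox.app"
    # Linux and other Unix-likes
    return Path.home() / ".config" / "sh.voicebox.app"
```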
## System Requirements

### Minimum

- **OS:** macOS 11+, Windows 10+, or Linux
- **RAM:** 8GB
- **Storage:** 5GB free space (for models and data; larger engines such as TADA 3B need more)
- **CPU:** Modern multi-core processor

### Recommended

- **RAM:** 16GB+
- **GPU:** CUDA-capable NVIDIA GPU (for faster generation)
- **Storage:** 10GB+ free space

<Callout type="info">
CPU inference is supported but significantly slower than GPU. A CUDA-capable GPU is highly recommended for real-time workflows.
</Callout>

## Verification

After installation, verify everything works:

1. Launch Voicebox
2. Check the server status indicator in the bottom-left corner (it should be green)
3. Navigate to **Profiles** and create a test profile
4. Generate a short audio clip to verify the TTS engine works

<Callout type="success">
If you see a green status indicator and can generate audio, you're all set!
</Callout>

## Next Steps

<Card title="Quick Start Guide" href="/overview/quick-start">
Create your first voice profile and generate speech
</Card>
68
docs/content/docs/overview/introduction.mdx
Normal file
@@ -0,0 +1,68 @@
---
title: "Introduction"
description: "Voicebox is a local-first voice cloning studio -- a free and open-source alternative to ElevenLabs."
---

## What is Voicebox?

Voicebox is a **local-first voice cloning studio** -- a free and open-source alternative to ElevenLabs. Clone voices from a few seconds of audio or pick from 50+ preset voices, generate speech in 23 languages across 7 TTS engines, apply post-processing effects, and compose multi-voice projects with a timeline editor.

- **Complete privacy** -- models and voice data stay on your machine
- **7 TTS engines** -- Qwen3-TTS, Qwen CustomVoice, LuxTTS, Chatterbox Multilingual, Chatterbox Turbo, HumeAI TADA, and Kokoro
- **Cloning and preset voices** -- zero-shot cloning from a reference sample, or curated preset voices via Kokoro (50 voices) and Qwen CustomVoice (9 voices)
- **23 languages** -- from English to Arabic, Japanese, Hindi, Swahili, and more
- **Post-processing effects** -- pitch shift, reverb, delay, chorus, compression, and filters
- **Expressive speech** -- paralinguistic tags like `[laugh]`, `[sigh]`, `[gasp]` via Chatterbox Turbo; natural-language delivery control via Qwen CustomVoice
- **Unlimited length** -- auto-chunking with crossfade for scripts, articles, and chapters
- **Stories editor** -- multi-track timeline for conversations, podcasts, and narratives
- **API-first** -- REST API for integrating voice synthesis into your own projects
- **Native performance** -- built with Tauri (Rust), not Electron
- **Runs everywhere** -- macOS (MLX/Metal), Windows (CUDA), Linux, AMD ROCm, Intel Arc, Docker

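The auto-chunking idea behind unlimited-length generation can be sketched in a few lines: split long text at sentence boundaries into chunks under a size limit, synthesize each, then crossfade the audio at the seams. This is a minimal illustration of the concept, not Voicebox's actual implementation (chunk size and splitting rules are assumptions):

```python
import re


def chunk_text(text: str, max_chars: int = 300) -> list[str]:
    """Split text into sentence-aligned chunks of at most max_chars each."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Current chunk is full -- flush it and start a new one.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent to the engine separately and the resulting audio joined with a short crossfade.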
## TTS Engines

Seven engines with different strengths, switchable per-generation:

| Engine | Profile Type | Languages | Strengths |
|--------|--------------|-----------|-----------|
| **Qwen3-TTS** (0.6B / 1.7B) | Cloned | 10 | High-quality multilingual cloning |
| **Qwen CustomVoice** (0.6B / 1.7B) | Preset (9 voices) | 10 | Natural-language delivery control (tone, emotion, pace) |
| **LuxTTS** | Cloned | English | Lightweight (~1GB VRAM), 48kHz output, 150x realtime on CPU |
| **Chatterbox Multilingual** | Cloned | 23 | Broadest language coverage |
| **Chatterbox Turbo** | Cloned | English | Fast 350M model with paralinguistic emotion/sound tags |
| **TADA** (1B / 3B) | Cloned | 10 | HumeAI speech-language model -- 700s+ coherent audio |
| **Kokoro** | Preset (50 voices) | 9 | 82M parameters, CPU realtime, lowest VRAM of any engine |

## GPU Support

| Platform | Backend | Notes |
|----------|---------|-------|
| macOS (Apple Silicon) | MLX (Metal) | 4-5x faster than CPU via the Metal GPU |
| Windows / Linux (NVIDIA) | PyTorch (CUDA) | Auto-downloads CUDA binary from within the app |
| Linux (AMD) | PyTorch (ROCm) | Auto-configures HSA_OVERRIDE_GFX_VERSION |
| Windows (any GPU) | DirectML | Universal Windows GPU support |
| Intel Arc | IPEX/XPU | Intel discrete GPU acceleration |
| Any | CPU | Works everywhere, just slower |

## Use Cases

- **Game development** -- generate dynamic dialogue for characters
- **Content creation** -- produce podcasts and video voiceovers
- **Accessibility** -- build text-to-speech tools for users who need them
- **Voice assistants** -- create custom voice interfaces
- **Production pipelines** -- automate voiceover workflows via the REST API

|

| Layer | Technology |
|-------|------------|
| Desktop App | Tauri (Rust) |
| Frontend | React, TypeScript, Tailwind CSS |
| State | Zustand, React Query |
| Backend | FastAPI (Python) |
| TTS Engines | Qwen3-TTS, Qwen CustomVoice, LuxTTS, Chatterbox, Chatterbox Turbo, TADA, Kokoro |
| Effects | Pedalboard (Spotify) |
| Transcription | Whisper / Whisper Turbo (PyTorch or MLX) |
| Inference | MLX (Apple Silicon) / PyTorch (CUDA/ROCm/XPU/CPU) |
| Database | SQLite |
| Audio | WaveSurfer.js, librosa |
21
docs/content/docs/overview/meta.json
Normal file
@@ -0,0 +1,21 @@
{
  "title": "Overview",
  "defaultOpen": true,
  "pages": [
    "introduction",
    "installation",
    "docker",
    "quick-start",
    "gpu-acceleration",
    "voice-cloning",
    "preset-voices",
    "stories-editor",
    "recording-transcription",
    "generation-history",
    "remote-mode",
    "creating-voice-profiles",
    "generating-speech",
    "building-stories",
    "troubleshooting"
  ]
}
202
docs/content/docs/overview/preset-voices.mdx
Normal file
@@ -0,0 +1,202 @@
---
title: "Preset Voices"
description: "Use built-in, ready-made voices without recording audio samples"
---

## Overview

Some Voicebox engines ship with a curated set of pre-built voices. Instead of cloning from your own audio sample, you pick a voice from a fixed catalog and the model speaks in that voice. No recording, no upload, no per-voice training required.

Two engines in 0.4 ship preset voices:

| Engine | Voices | Languages | Strengths |
| -------------------- | ------------------- | --------- | ------------------------------------------------------- |
| **Kokoro 82M** | 50 | 9 | Tiny model, CPU-friendly, lowest VRAM of any engine |
| **Qwen CustomVoice** | 9 (premium curated) | 4 | Natural-language style control over tone, emotion, pace |

<Callout type="info">
Looking to clone a specific person's voice instead? See [Voice Cloning](/overview/voice-cloning).
</Callout>

## When to Use Preset Voices

<Cards>
<Card title="No reference audio">
You don't have (or don't want to provide) a recording of the target voice
</Card>
<Card title="Production reliability">
Curated voices have predictable quality across any text input
</Card>
<Card title="Speed">
Skip the audio cleanup, sample preparation, and quality iteration loop
</Card>
<Card title="Lightweight setup">
Kokoro runs at CPU realtime with ~150 MB on disk — no GPU needed
</Card>
</Cards>

## Creating a Preset-Voice Profile

<Steps>
<Step title="Open Profiles → New Profile">
Same entry point as cloning profiles
</Step>
<Step title="Choose the engine">
Select **Kokoro** or **Qwen CustomVoice** from the engine dropdown
</Step>
<Step title="Pick a preset voice">
The voice catalog for the chosen engine appears — preview each voice by clicking it
</Step>
<Step title="Name and save">
Give the profile a name. No audio sample needed — just save
</Step>
<Step title="Generate">
Use the profile like any other in the floating generate box or the Generate page
</Step>
</Steps>

<Callout type="info">
Preset profiles are locked to their source engine — switching engines won't work, since the voice exists only in that model. The profile grid greys out preset profiles when you switch to a different engine, and clicking one switches the engine back automatically.
</Callout>

## Kokoro 82M — 50 Voices Across 9 Languages

Kokoro is the smallest engine in Voicebox at 82M parameters. It runs at CPU realtime with negligible VRAM, making it the best option for lightweight local inference. Voices are pre-built style vectors trained into the model — there's no concept of cloning here.

**Repository:** [`hexgrad/Kokoro-82M`](https://huggingface.co/hexgrad/Kokoro-82M) · Apache 2.0 licensed

### American English

| Female | Male |
| ------- | ------- |
| Alloy | Adam |
| Aoede | Echo |
| Bella | Eric |
| Heart | Fenrir |
| Jessica | Liam |
| Kore | Michael |
| Nicole | Onyx |
| Nova | Puck |
| River | Santa |
| Sarah | |
| Sky | |

### British English

| Female | Male |
| -------- | ------ |
| Alice | Daniel |
| Emma | Fable |
| Isabella | George |
| Lily | Lewis |

### Other Languages

| Language | Voices |
| ----------------- | ------------------------------------------------------------- |
| Spanish (`es`) | Dora (f), Alex (m), Santa (m) |
| French (`fr`) | Siwis (f) |
| Hindi (`hi`) | Alpha (f), Beta (f), Omega (m), Psi (m) |
| Italian (`it`) | Sara (f), Nicola (m) |
| Japanese (`ja`) | Alpha (f), Gongitsune (f), Nezumi (f), Tebukuro (f), Kumo (m) |
| Portuguese (`pt`) | Dora (f), Alex (m), Santa (m) |
| Chinese (`zh`) | Xiaobei (f), Xiaoni (f), Xiaoxiao (f), Xiaoyi (f) |

### Kokoro at a Glance

| Property | Value |
| ----------- | ---------------------------------------------- |
| Parameters | 82M |
| Sample rate | 24 kHz |
| VRAM | ~150 MB (negligible on CPU) |
| Speed | Realtime on CPU, faster on GPU |
| Instruct | Not supported (preset voice carries the style) |
| License | Apache 2.0 |

## Qwen CustomVoice — 9 Premium Voices with Instruct Control

Qwen CustomVoice ships with 9 curated speakers and supports **natural-language style control** — you tell the model how to deliver the line ("speak slowly with warmth", "authoritative and clear") and it adapts tone, emotion, and pace.

Two model sizes:
- **1.7B** — full quality, recommended default
- **0.6B** — lighter and faster, for lower-end hardware

**Repository:** [`Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice`](https://huggingface.co/Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice) (and 0.6B variant) · by Alibaba

### Voice Catalog

| Speaker | Gender | Language | Description |
| -------- | ------ | -------- | ---------------------------------------------------------- |
| Vivian | female | Chinese | Bright, slightly edgy young female voice |
| Serena | female | Chinese | Warm, gentle young female voice |
| Uncle Fu | male | Chinese | Seasoned male voice with a low, mellow timbre |
| Dylan | male | Chinese | Youthful Beijing male voice with a clear, natural timbre |
| Eric | male | Chinese | Lively Chengdu male voice with a slightly husky brightness |
| Ryan | male | English | Dynamic male voice with strong rhythmic drive (default) |
| Aiden | male | English | Sunny American male voice with a clear midrange |
| Ono Anna | female | Japanese | Playful Japanese female voice with a light, nimble timbre |
| Sohee | female | Korean | Warm Korean female voice with rich emotion |

### Using Instruct Mode

In the floating generate box, switch to a Qwen CustomVoice profile and click the **delivery instructions** toggle (the slider icon to the left of the generate button). A second textarea appears below the main text:

- Main text → what you want the voice to say
- Instruct text → how you want it delivered

Examples of effective instruct prompts:

```
Speak slowly with emphasis, like reading bedtime stories
Warm and friendly, conversational tone
Professional and authoritative, broadcast quality
Whisper, intimate and close
Excited and energetic, like sports commentary
```

The full Generate page also surfaces the instruct field as a separate input.

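When calling this from a script rather than the UI, the main text and the instruct text travel as separate fields. The sketch below builds such a payload; the field names (`text`, `instruct`, `speaker`, `engine`) and the default speaker are illustrative assumptions, not the documented Voicebox API schema.

```python
def build_instruct_payload(text: str, instruct: str, speaker: str = "Ryan") -> dict:
    """Pair main text with an optional delivery instruction.

    Field names are assumptions for illustration, not the real schema.
    """
    if not text.strip():
        raise ValueError("main text is required")
    payload = {"text": text, "speaker": speaker, "engine": "qwen-customvoice"}
    if instruct.strip():
        # Only attach the instruction when it is non-empty.
        payload["instruct"] = instruct.strip()
    return payload


payload = build_instruct_payload(
    "Once upon a time...",
    "Speak slowly with emphasis, like reading bedtime stories",
)
```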
### Qwen CustomVoice at a Glance

| Property | Value |
| ---------- | --------------------------------------------------------------- |
| Parameters | 1.7B / 0.6B |
| Languages | Chinese, English, Japanese, Korean speakers (model supports 10) |
| Voices | 9 curated preset speakers |
| VRAM | ~3.5 GB (1.7B), ~1.2 GB (0.6B) |
| Instruct | Yes — natural-language style control |
| Cloning | No — the paired base Qwen3-TTS engine handles cloning |

## Cloning vs Preset — Quick Decision

| You want… | Use |
| ------------------------------------------------ | ------------------------------------------------------------------------------ |
| To replicate a specific person's voice | [Voice Cloning](/overview/voice-cloning) |
| Production-ready voices with no audio prep | Kokoro or Qwen CustomVoice |
| The smallest possible footprint (CPU-only) | Kokoro |
| Fine control over delivery (tone, pace, emotion) | Qwen CustomVoice |
| The broadest language coverage | [Voice Cloning](/overview/voice-cloning) via Chatterbox Multilingual (23 langs) |

## Limitations

<Callout type="warn">
Preset voices are fixed — you can't fine-tune or modify the underlying voice. If you need a voice that isn't in the catalog, use a cloning engine and provide a reference sample.
</Callout>

- Preset-voice profiles export only as metadata (engine + voice ID), not as audio, so another Voicebox installation needs the same engine to use them
- The Kokoro voice catalog is set by the upstream model — new voices appear only when hexgrad publishes new model releases
- Qwen CustomVoice's 9 speakers are part of the model checkpoint — same constraint

## Next Steps

<Cards>
<Card title="Voice Cloning" href="/overview/voice-cloning">
Clone a specific voice from your own audio
</Card>
<Card title="Generate Speech" href="/overview/generating-speech">
Use a profile to generate audio
</Card>
<Card title="Build Stories" href="/overview/building-stories">
Compose multi-voice narratives
</Card>
</Cards>
164
docs/content/docs/overview/quick-start.mdx
Normal file
@@ -0,0 +1,164 @@
---
title: "Quick Start"
description: "Get started with Voicebox in 5 minutes"
---

This guide walks you through creating your first voice profile and generating speech.

## Prerequisites

Make sure you have [installed Voicebox](/overview/installation) and launched the app.

## Step 1: Create a Voice Profile

Voice profiles are the foundation of Voicebox. Each profile contains voice samples that the AI uses to clone the voice.

<Steps>
<Step title="Navigate to Profiles">
Click the **Profiles** tab in the sidebar
</Step>

<Step title="Create New Profile">
Click the **+ New Profile** button

Fill in the details:
- **Name:** A descriptive name (e.g., "John Smith")
- **Language:** Select the primary language
- **Description:** Optional notes about the voice
</Step>

<Step title="Add Voice Sample">
You have two options:

**Option A: Upload Audio**
- Click **Upload Sample**
- Select an audio file (WAV, MP3, or M4A)
- Ideal length: 10-30 seconds of clear speech

**Option B: Record Live**
- Click **Record Sample**
- Speak clearly for 10-30 seconds
- Click stop when finished
</Step>

<Step title="Save Profile">
Click **Create Profile** to save
</Step>
</Steps>

<Callout type="info">
For best results, use clean audio with minimal background noise and a consistent speaking tone.
</Callout>

## Step 2: Generate Speech

Now let's use your new voice profile to generate speech.

<Steps>
<Step title="Go to Generation">
Click the **Generate** tab in the sidebar
</Step>

<Step title="Select Voice Profile">
Choose your newly created profile from the dropdown
</Step>

<Step title="Enter Text">
Type or paste the text you want to generate:

```
Hello! This is my first voice generation with Voicebox.
```

<Callout type="info">
Paralinguistic tags like `[laugh]`, `[sigh]`, and `[gasp]` only work with **Chatterbox Turbo**. Qwen3-TTS, LuxTTS, Chatterbox Multilingual, and HumeAI TADA will read those tags literally instead of turning them into expressive sounds.
</Callout>

To insert supported tags, select **Chatterbox Turbo** and type `/` in the text input to open the tag inserter.
</Step>

<Step title="Generate">
Click **Generate** and wait a few seconds

<Callout type="info">
The first generation may take longer due to model initialization. Subsequent generations will be faster.
</Callout>
</Step>

<Step title="Play & Download">
- Click **Play** to preview the audio
- Click **Download** to save the audio file
- The generation is also saved to your **History**
</Step>
</Steps>

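If you script generations for engines that don't support paralinguistic tags, you may want to strip the tags first so they aren't read aloud. A minimal sketch (the tag list here is an illustrative subset, not Voicebox's canonical list):

```python
import re

PARALINGUISTIC_TAGS = {"laugh", "sigh", "gasp"}  # illustrative subset


def strip_tags(text: str) -> str:
    """Remove known [tag] markers; leave unknown bracketed text alone."""
    def repl(match: re.Match) -> str:
        return "" if match.group(1).lower() in PARALINGUISTIC_TAGS else match.group(0)

    return re.sub(r"\[(\w+)\]", repl, text).replace("  ", " ").strip()


print(strip_tags("Well, [laugh] that was fun!"))  # prints: Well, that was fun!
```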
## Step 3: Build a Story (Optional)

The Stories Editor lets you create multi-voice narratives with a timeline-based interface.

<Steps>
<Step title="Create New Story">
Navigate to **Stories** and click **+ New Story**
</Step>

<Step title="Add Voice Tracks">
Click **+ Add Track** to create tracks for different speakers
</Step>

<Step title="Add Audio Clips">
- Drag generated audio from your History
- Or generate new clips directly in the timeline
- Arrange clips on the timeline
</Step>

<Step title="Edit & Export">
- Trim clips by dragging edges
- Adjust timing and spacing
- Click **Export** to render the final audio
</Step>
</Steps>

## What's Next?

<Cards>
<Card title="Voice Cloning Guide" href="/overview/creating-voice-profiles">
Learn advanced techniques for high-quality voice cloning
</Card>
<Card title="API Integration" href="/api-reference">
Integrate Voicebox into your own applications
</Card>
<Card title="Stories Editor" href="/overview/stories-editor">
Master the multi-track timeline editor
</Card>
<Card title="Remote Mode" href="/overview/remote-mode">
Connect to a GPU server for faster generation
</Card>
</Cards>

## Tips for Success

<AccordionGroup>
<Accordion title="Getting the Best Voice Quality">
- Use 10-30 seconds of clear, consistent speech
- Avoid background noise and echo
- Multiple samples from the same speaker improve quality
- Match the speaking style you want to generate
</Accordion>

<Accordion title="Improving Generation Speed">
- Use a CUDA-capable GPU for 5-10x faster generation
- Enable voice prompt caching for repeated generations
- Consider running the backend on a remote GPU server
</Accordion>

<Accordion title="Troubleshooting Common Issues">
- **Server won't start:** Check if port 17493 is available
- **Poor audio quality:** Try adding more voice samples
- **Slow generation:** Verify GPU acceleration is enabled
- See the full [Troubleshooting Guide](/overview/troubleshooting) for more
</Accordion>
</AccordionGroup>
64
docs/content/docs/overview/recording-transcription.mdx
Normal file
@@ -0,0 +1,64 @@
---
title: "Recording & Transcription"
description: "Record audio and transcribe speech with Whisper"
---

## Recording

Voicebox includes built-in recording capabilities for creating voice samples and capturing audio.

### Features

- **Microphone input** - Record from any audio input device
- **System audio capture** - Record desktop audio (macOS/Windows)
- **Waveform visualization** - See audio levels in real-time
- **Multiple formats** - Export as WAV, MP3, or M4A

### How to Record

<Steps>
<Step title="Select Input">
Choose your microphone or system audio
</Step>
<Step title="Start Recording">
Click the record button and speak clearly
</Step>
<Step title="Stop & Save">
Click stop when finished
</Step>
<Step title="Use or Export">
Use as a voice sample or export to a file
</Step>
</Steps>

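The level meter shown while recording boils down to a root-mean-square calculation over recent samples, expressed in dBFS. A minimal sketch of that math (not Voicebox's actual implementation):

```python
import math


def rms_dbfs(samples: list[float]) -> float:
    """RMS level of float samples (range -1..1) in dBFS.

    0 dBFS is full scale; silence returns -inf.
    """
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


print(round(rms_dbfs([0.5, -0.5, 0.5, -0.5]), 1))  # square wave at 0.5 -> -6.0
```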
## Transcription

Automatic speech-to-text powered by OpenAI's Whisper model.

### Features

- **High accuracy** - Industry-leading speech recognition
- **Multiple languages** - Supports 50+ languages
- **Automatic detection** - Language auto-detection
- **Timestamps** - Word-level timing information

### How to Transcribe

<Steps>
<Step title="Select Audio">
Choose a recording or upload an audio file
</Step>
<Step title="Choose Language">
Select a language or use auto-detect
</Step>
<Step title="Transcribe">
Click transcribe and wait for processing
</Step>
<Step title="Review & Export">
Review the text and export as needed
</Step>
</Steps>

<Callout type="info">
Transcription is useful for creating voice samples from existing audio or generating subtitles.
</Callout>
146
docs/content/docs/overview/remote-mode.mdx
Normal file
@@ -0,0 +1,146 @@
---
title: "Remote Mode"
description: "Connect to a GPU server for faster generation"
---

## Overview

Remote Mode lets you run the Voicebox backend on a separate machine (such as a GPU server) while using the desktop app on your local machine.

## Use Cases

- **No local GPU** - Use a cloud GPU or remote workstation
- **Faster generation** - Leverage powerful remote hardware
- **Shared infrastructure** - Multiple users connect to one server
- **Laptop workflows** - Keep your laptop cool and battery-efficient

## Architecture

In Remote Mode, the Voicebox desktop app on your local machine talks to the backend server on a remote machine over HTTP. The local app provides only the user interface; the remote server handles all heavy processing, including the TTS models, API endpoints, and audio generation.

## Setting Up Remote Mode

### On the Server

<Steps>
<Step title="Install Dependencies">

```bash
# Clone the repo
git clone https://github.com/jamiepine/voicebox.git
cd voicebox/backend

# Install Python dependencies
pip install -r requirements.txt

# Engines with incompatible transitive pins — install with --no-deps
pip install --no-deps chatterbox-tts
pip install --no-deps hume-tada

# Qwen3-TTS from source
pip install git+https://github.com/QwenLM/Qwen3-TTS.git
```

Or just run `just setup` from the repo root, which handles all of this.
</Step>

<Step title="Start the Server">

```bash
# Allow external connections
uvicorn main:app --host 0.0.0.0 --port 17493
```

<Callout type="warn">
This exposes the server to your network. Use a firewall or VPN for security.
</Callout>
</Step>

<Step title="Open Firewall">

```bash
# Ubuntu/Debian
sudo ufw allow 17493

# Or use your cloud provider's firewall settings
```
</Step>
</Steps>

### On the Client

<Steps>
<Step title="Open Settings">
In Voicebox, go to **Settings → Server**
</Step>

<Step title="Enable Remote Mode">
Toggle **Use Remote Server**
</Step>

<Step title="Enter Server URL">

```
http://<server-ip>:17493
```

Replace `<server-ip>` with your server's IP address
</Step>

<Step title="Test Connection">
Click **Test Connection** to verify
</Step>
</Steps>

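Before pointing the app at a remote server, you can sanity-check from the client machine that the port is reachable over TCP. This is a hypothetical helper for debugging connectivity, not part of Voicebox itself:

```python
import socket


def server_reachable(host: str, port: int = 17493, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unroutable.
        return False
```

For example, `server_reachable("192.168.1.50")` tells you whether anything is listening on the Voicebox port before you debug the app itself.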
## Cloud Deployment

### AWS EC2

```bash
# Launch a GPU instance (e.g., g4dn.xlarge)
# Install dependencies
# Start server with --host 0.0.0.0
```

### Vast.ai

```bash
# Rent a GPU instance
# SSH in and clone repo
# Start server
```

### RunPod

```bash
# Deploy a pod with CUDA support
# Install Voicebox backend
# Expose port 17493
```

## Security Considerations

<Callout type="warn">
The API currently has no authentication. Only use it on trusted networks or behind a VPN.
</Callout>

**Best Practices:**
- Use a VPN (WireGuard, Tailscale) instead of exposing the server to the internet
- Run behind a reverse proxy with authentication (nginx + basic auth)
- Use HTTPS with SSL certificates
- Add firewall rules to limit access to specific IPs

## Performance

Expected performance on various GPUs:

| GPU | Generation Speed |
|-----|------------------|
| RTX 4090 | ~2-3s per 10 words |
| RTX 3090 | ~3-4s per 10 words |
| RTX 3060 | ~5-7s per 10 words |
| CPU (12-core) | ~20-30s per 10 words |

<Callout type="info">
A GPU with 8GB+ VRAM is recommended for best performance.
</Callout>

## Troubleshooting

See the [Troubleshooting Guide](/overview/troubleshooting) for common issues.
64
docs/content/docs/overview/stories-editor.mdx
Normal file
@@ -0,0 +1,64 @@
---
title: "Stories Editor"
description: "Create multi-voice narratives with a timeline-based editor"
---

## Overview

The Stories Editor is a DAW-like timeline interface for creating multi-voice narratives, podcasts, and conversations.

## Features

<Cards>
<Card title="Multi-Track Timeline">
Arrange multiple voice tracks in parallel
</Card>
<Card title="Inline Editing">
Trim and split clips directly in the timeline
</Card>
<Card title="Auto-Playback">
Preview with a synchronized playhead
</Card>
<Card title="Voice Mixing">
Build conversations with multiple speakers
</Card>
</Cards>

## Creating a Story

<Steps>
<Step title="Create New Story">
Navigate to **Stories** and click **+ New Story**
</Step>
<Step title="Add Tracks">
Create separate tracks for each voice/speaker
</Step>
<Step title="Add Clips">
- Drag from generation history
- Generate new clips inline
- Upload audio files
</Step>
<Step title="Arrange & Edit">
- Position clips on the timeline
- Trim clip edges
- Adjust spacing and timing
</Step>
<Step title="Export">
Render the final mixed audio
</Step>
</Steps>

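Conceptually, exporting a timeline sums every clip into one output buffer at its start offset, with silence filling the gaps. A minimal sketch of that mixdown (real exports also handle per-track gain, clamping, and normalization; not Voicebox's actual implementation):

```python
def mix_clips(clips: list[tuple[int, list[float]]]) -> list[float]:
    """Mix (start_sample, samples) clips into one output buffer.

    Overlapping clips sum; gaps stay silent (0.0).
    """
    total = max((start + len(samples) for start, samples in clips), default=0)
    out = [0.0] * total
    for start, samples in clips:
        for i, value in enumerate(samples):
            out[start + i] += value
    return out


print(mix_clips([(0, [0.5, 0.5]), (1, [0.25, 0.25])]))  # -> [0.5, 0.75, 0.25]
```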
## Use Cases

- **Podcasts**: Multi-host conversations
- **Audiobooks**: Narrator + character voices
- **Game Dialogue**: Character interactions
- **Video Voiceovers**: Multiple speakers
- **Audio Drama**: Full voice casts

## Coming Soon

- Word-level editing
- Crossfades and transitions
- Audio effects (reverb, EQ)
- Real-time collaboration
596
docs/content/docs/overview/troubleshooting.mdx
Normal file
@@ -0,0 +1,596 @@
---
title: "Troubleshooting"
description: "Common issues and solutions for Voicebox"
---

This guide covers common issues you might encounter when using or developing Voicebox, along with solutions.

## Installation Issues

### macOS: "App is damaged and can't be opened"

This occurs because the app isn't signed with an Apple Developer certificate.

**Solution:**
```bash
# Remove the quarantine attribute
xattr -cr /Applications/Voicebox.app
```

### Windows: SmartScreen Warning

Windows SmartScreen may warn that the app is unrecognized.

**Solution:**
- Click "More info"
- Click "Run anyway"

<Callout type="info">
This is expected for unsigned applications. We're working on code signing for future releases.
</Callout>

### Linux: AppImage Won't Run
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
chmod +x voicebox-*.AppImage
|
||||
./voicebox-*.AppImage
|
||||
```
|
||||
|
||||
## Server Issues
|
||||
|
||||
### Backend Server Won't Start
|
||||
|
||||
**Symptoms:**
|
||||
- Red status indicator in bottom-left corner
|
||||
- "Failed to connect to server" error
|
||||
|
||||
**Solutions:**
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Port Already in Use">
|
||||
Check if port 17493 is already in use:
|
||||
|
||||
```bash
|
||||
# macOS/Linux
|
||||
lsof -i :17493
|
||||
|
||||
# Windows
|
||||
powershell -Command "Get-NetTCPConnection -LocalPort 17493 -State Listen"
|
||||
```
|
||||
|
||||
Kill the process using the port:
|
||||
```bash
|
||||
# macOS/Linux
|
||||
kill -9 <PID>
|
||||
|
||||
# Windows
|
||||
taskkill /PID <PID> /F
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Permission Issues">
|
||||
The server binary might not have execute permissions:
|
||||
|
||||
```bash
|
||||
# macOS/Linux
|
||||
chmod +x ~/Library/Application\ Support/sh.voicebox.app/backend/voicebox-server
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Check Logs">
|
||||
View server logs for errors:
|
||||
|
||||
**macOS:**
|
||||
```bash
|
||||
tail -f ~/Library/Application\ Support/sh.voicebox.app/logs/server.log
|
||||
```
|
||||
|
||||
**Windows:**
|
||||
```bash
|
||||
type %APPDATA%\sh.voicebox.app\logs\server.log
|
||||
```
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
### `flash-attn is not installed` Warning in Server Logs
|
||||
|
||||
**Symptoms:**
|
||||
```
|
||||
Warning: flash-attn is not installed. Will only run the manual PyTorch version.
|
||||
Please install flash-attn for faster inference.
|
||||
```
|
||||
|
||||
**This is harmless.** The warning is emitted by our transformer-based engines (Chatterbox / Qwen) on every startup. FlashAttention is an optional acceleration library; when it's not present, PyTorch's built-in scaled-dot-product attention (SDPA) runs instead, which achieves near-FlashAttention-2 throughput on modern GPUs. Generation works normally.
|
||||
|
||||
**Why it shows up on every platform:**
|
||||
- **Windows:** `flash-attn` has no official Windows support. The upstream project (Dao-AILab/flash-attention) still only says it *might* work, and source builds typically fail on recent CUDA/MSVC combinations.
|
||||
- **macOS (Apple Silicon):** FlashAttention is CUDA-only and doesn't apply here at all. MLX has its own optimized attention kernels.
|
||||
- **Linux:** It's not pinned in our requirements because installing it is fragile and version-sensitive; users who want it install it themselves.
|
||||
|
||||
**Solutions (all optional):**
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Ignore it (recommended)">
|
||||
PyTorch SDPA is what actually runs the model, and on Ampere/Ada/Hopper GPUs it's within a few percent of FA2 for our workloads. You won't notice a meaningful speed difference.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Install flash-attn on Linux">
|
||||
```bash
|
||||
pip install flash-attn --no-build-isolation
|
||||
```
|
||||
|
||||
Requires a matching CUDA toolkit. Build can take 20+ minutes.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Install flash-attn on Windows (community wheels)">
|
||||
Official builds don't exist, but community maintainers publish prebuilt wheels:
|
||||
|
||||
- [kingbri1/flash-attention releases](https://github.com/kingbri1/flash-attention/releases)
|
||||
- [bdashore3/flash-attention releases](https://github.com/bdashore3/flash-attention/releases)
|
||||
|
||||
Pick the wheel matching your exact CUDA + PyTorch + Python combination. Example:
|
||||
|
||||
```bash
|
||||
pip install https://github.com/kingbri1/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu128torch2.8.0cxx11abiFALSE-cp312-cp312-win_amd64.whl
|
||||
```
|
||||
|
||||
Alternatively, run Voicebox's backend inside WSL2 and use the standard Linux wheels.
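Reading the right tag combination off your environment by hand is error-prone. This small script (an illustration; it only reports versions, it doesn't pick a wheel for you) prints the values that appear in the wheel filename:

```python
import sys

def wheel_tags() -> dict[str, str]:
    """Report the Python / PyTorch / CUDA versions a flash-attn wheel must match."""
    tags = {"python": f"cp{sys.version_info.major}{sys.version_info.minor}"}
    try:
        import torch  # present in the Voicebox backend environment
        tags["torch"] = torch.__version__.split("+")[0]
        tags["cuda"] = torch.version.cuda or "cpu-only build"
    except ImportError:
        tags["torch"] = tags["cuda"] = "torch not installed"
    return tags

if __name__ == "__main__":
    # e.g. {'python': 'cp312', 'torch': '2.8.0', 'cuda': '12.8'}
    print(wheel_tags())
```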
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
### Connection Timeout
|
||||
|
||||
**Symptoms:**
|
||||
- Long loading times
|
||||
- "Connection timeout" errors
|
||||
|
||||
**Solution:**
|
||||
- Restart the app
|
||||
- Check your firewall settings
|
||||
- Ensure localhost is accessible
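To distinguish a firewall problem from a server that never started, check whether anything is listening on the backend port at all. A minimal sketch using only the standard library (17493 is the backend's default port, per this guide):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    if port_open("127.0.0.1", 17493):
        print("Backend port is reachable; the issue is likely inside the app.")
    else:
        print("Nothing is listening on 17493; see 'Backend Server Won't Start' above.")
```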
|
||||
|
||||
## Generation Issues
|
||||
|
||||
### First Generation is Very Slow
|
||||
|
||||
**Symptoms:**
|
||||
- First generation takes 2-5 minutes
|
||||
- Progress indicator stuck at "Loading model..."
|
||||
|
||||
**Explanation:**
|
||||
This is expected behavior. The first generation downloads the selected TTS engine's model and initializes it. Sizes range from 350 MB (Kokoro) to 8 GB (TADA 3B).
|
||||
|
||||
**Solution:**
|
||||
- Wait for the initial download to complete (progress is shown in Settings → Models)
|
||||
- Subsequent generations reuse the cached model and are much faster
|
||||
- Check your internet connection
|
||||
- For low-bandwidth setups, start with Kokoro (~350 MB) or LuxTTS (~300 MB)
|
||||
|
||||
### Poor Voice Quality
|
||||
|
||||
**Symptoms:**
|
||||
- Robotic or unnatural voice
|
||||
- Missing emotion or prosody
|
||||
- Pronunciation errors
|
||||
|
||||
**Solutions:**
|
||||
|
||||
<Steps>
|
||||
<Step title="Improve Voice Samples">
|
||||
- Use 10-30 seconds of clear audio
|
||||
- Avoid background noise
|
||||
- Ensure consistent speaking tone
|
||||
- Add multiple samples from the same speaker
|
||||
</Step>
|
||||
|
||||
<Step title="Match Speaking Style">
|
||||
The generated voice will mimic the tone and style of your samples. If your sample is monotone, the generation will be too.
|
||||
</Step>
|
||||
|
||||
<Step title="Adjust Text Formatting">
|
||||
- Use proper punctuation
|
||||
- Add commas for natural pauses
|
||||
- Capitalize proper nouns
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
### Generation Fails with "Out of Memory"
|
||||
|
||||
**Symptoms:**
|
||||
- Generation crashes
|
||||
- "CUDA out of memory" or "RuntimeError: out of memory"
|
||||
|
||||
**Solutions:**
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Free GPU Memory">
|
||||
Close other GPU-intensive applications:
|
||||
- Games
|
||||
- Video editors
|
||||
- Multiple browser tabs with WebGL
|
||||
|
||||
Then restart Voicebox.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Use CPU Mode">
|
||||
If your GPU doesn't have enough VRAM (6 GB or more is recommended), use CPU mode:
|
||||
|
||||
Settings → Generation → Use CPU instead of GPU
|
||||
|
||||
<Callout type="warn">
|
||||
CPU generation is 5-10x slower but uses system RAM instead of VRAM.
|
||||
</Callout>
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Reduce Batch Size">
|
||||
For long text, split it into smaller chunks instead of generating all at once.
|
||||
</Accordion>
|
||||
</AccordionGroup>
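One way to split is to pack whole sentences into chunks under a character budget, so no chunk ends mid-sentence. A hypothetical sketch (the 400-character budget and the sentence regex are illustrative, not Voicebox's internal chunking):

```python
import re

def chunk_text(text: str, max_chars: int = 400) -> list[str]:
    """Greedily pack sentences into chunks of at most max_chars characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Generate each chunk separately, then assemble the results on a Story timeline.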
|
||||
|
||||
### MLX "Failed to load the default metallib" (Apple Silicon)
|
||||
|
||||
**Symptoms:**
|
||||
- Generation fails with "library not found" or "metallib" errors
|
||||
- Server logs reference missing Metal shader libraries
|
||||
|
||||
**Solutions:**
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Rebuild the Server Binary">
|
||||
```bash
|
||||
just build-server
|
||||
```
|
||||
|
||||
The build script bundles MLX Metal shader libraries on Apple Silicon automatically.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Reinstall MLX Dependencies">
|
||||
```bash
|
||||
pip install -r backend/requirements-mlx.txt
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Verify Backend Detection">
|
||||
Check Settings → Server Status. Should show **Backend: MLX** on Apple Silicon. If it shows **Backend: PYTORCH**, MLX isn't installed correctly.
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
## Audio Issues
|
||||
|
||||
### No Audio Playback
|
||||
|
||||
**Symptoms:**
|
||||
- Generated audio won't play
|
||||
- Playback button doesn't respond
|
||||
|
||||
**Solutions:**
|
||||
- Check system audio settings
|
||||
- Ensure audio output device is connected
|
||||
- Try exporting and playing in a media player
|
||||
|
||||
### Crackling or Distorted Audio
|
||||
|
||||
**Symptoms:**
|
||||
- Audio has static or distortion
|
||||
- Clipping sounds
|
||||
|
||||
**Solutions:**
|
||||
- Check if your input samples have distortion
|
||||
- Reduce playback volume
|
||||
- Re-generate with cleaner voice samples
|
||||
|
||||
## Development Issues
|
||||
|
||||
### Backend Won't Start in Dev Mode
|
||||
|
||||
**Symptoms:**
|
||||
- `just dev-backend` or `just dev` fails
|
||||
- Import errors or module not found
|
||||
|
||||
**Solutions:**
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Python Version">
|
||||
Ensure Python 3.11 or higher:
|
||||
|
||||
```bash
|
||||
python --version
|
||||
```
|
||||
|
||||
If not, install Python 3.11+ and recreate the virtual environment.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Virtual Environment">
|
||||
Ensure venv is activated:
|
||||
|
||||
```bash
|
||||
# macOS/Linux
|
||||
source backend/venv/bin/activate
|
||||
|
||||
# Windows
|
||||
backend\venv\Scripts\activate
|
||||
```
|
||||
|
||||
You should see `(venv)` in your prompt.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Dependencies">
|
||||
Reinstall dependencies — easiest via `just`:
|
||||
|
||||
```bash
|
||||
just setup
|
||||
```
|
||||
|
||||
Or manually:
|
||||
|
||||
```bash
|
||||
cd backend
|
||||
pip install -r requirements.txt
|
||||
pip install --no-deps chatterbox-tts
|
||||
pip install --no-deps hume-tada
|
||||
pip install git+https://github.com/QwenLM/Qwen3-TTS.git
|
||||
```
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
### Tauri Build Fails
|
||||
|
||||
**Symptoms:**
|
||||
- `bun run tauri build` fails
|
||||
- Rust compilation errors
|
||||
|
||||
**Solutions:**
|
||||
|
||||
```bash
|
||||
# Clean build artifacts
|
||||
cd tauri/src-tauri
|
||||
cargo clean
|
||||
|
||||
# Update Rust
|
||||
rustup update
|
||||
|
||||
# Try building again
|
||||
cd ../..
|
||||
bun run tauri build
|
||||
```
|
||||
|
||||
### OpenAPI Client Generation Fails
|
||||
|
||||
**Symptoms:**
|
||||
- `./scripts/generate-api.sh` fails
|
||||
- "Failed to fetch schema" error
|
||||
|
||||
**Solutions:**
|
||||
|
||||
<Steps>
|
||||
<Step title="Ensure Backend is Running">
|
||||
```bash
|
||||
curl http://localhost:17493/openapi.json
|
||||
```
|
||||
|
||||
Should return JSON. If not, start the backend.
|
||||
</Step>
|
||||
|
||||
<Step title="Check Port">
|
||||
Ensure nothing else is using port 17493
|
||||
</Step>
|
||||
|
||||
<Step title="Regenerate Manually">
|
||||
```bash
|
||||
cd backend
|
||||
source venv/bin/activate
|
||||
uvicorn main:app --reload --port 17493
|
||||
|
||||
# In another terminal
|
||||
./scripts/generate-api.sh
|
||||
```
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
## Database Issues
|
||||
|
||||
### "Database is locked" Error
|
||||
|
||||
**Symptoms:**
|
||||
- Profile or generation operations fail
|
||||
- SQLite lock errors
|
||||
|
||||
**Solutions:**
|
||||
- Close all Voicebox instances
|
||||
- Delete the lock file:
|
||||
```bash
|
||||
# macOS
|
||||
rm ~/Library/Application\ Support/sh.voicebox.app/data/voicebox.db-shm
|
||||
rm ~/Library/Application\ Support/sh.voicebox.app/data/voicebox.db-wal
|
||||
```
|
||||
|
||||
### Corrupted Database
|
||||
|
||||
**Symptoms:**
|
||||
- App crashes on launch
|
||||
- Data missing or corrupted
|
||||
|
||||
**Solutions:**
|
||||
|
||||
<Callout type="warn">
|
||||
This will delete all your voice profiles and generation history. Export important profiles first if possible.
|
||||
</Callout>
|
||||
|
||||
```bash
|
||||
# macOS
|
||||
rm ~/Library/Application\ Support/sh.voicebox.app/data/voicebox.db
|
||||
|
||||
# Windows
|
||||
del %APPDATA%\sh.voicebox.app\data\voicebox.db
|
||||
```
|
||||
|
||||
Restart the app to create a fresh database.
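Before wiping the database, you can confirm it is actually corrupt with SQLite's built-in integrity check; Python's standard library is enough (the path below is the macOS location from this guide):

```python
import os
import sqlite3

def check_integrity(db_path: str) -> str:
    """Run PRAGMA integrity_check; a healthy database returns 'ok'."""
    conn = sqlite3.connect(db_path)
    try:
        (result,) = conn.execute("PRAGMA integrity_check").fetchone()
    finally:
        conn.close()
    return result

if __name__ == "__main__":
    path = os.path.expanduser(
        "~/Library/Application Support/sh.voicebox.app/data/voicebox.db"
    )
    print(check_integrity(path))
```

Anything other than `ok` means the file really is damaged; otherwise look for another cause before deleting your profiles.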
|
||||
|
||||
## Model Issues
|
||||
|
||||
### Model Download Fails
|
||||
|
||||
**Symptoms:**
|
||||
- "Failed to download model" error
|
||||
- Stuck at "Downloading..."
|
||||
|
||||
**Solutions:**
|
||||
- Check your internet connection
|
||||
- Check HuggingFace Hub status
|
||||
- Try using a VPN if HuggingFace is blocked in your region
|
||||
- Manually download via the HuggingFace CLI and place in the cache directory:
|
||||
|
||||
```bash
|
||||
pip install huggingface_hub
|
||||
huggingface-cli download Qwen/Qwen3-TTS-12Hz-1.7B-Base
|
||||
```
|
||||
|
||||
### Wrong Model Version
|
||||
|
||||
**Symptoms:**
|
||||
- Generation quality suddenly degraded
|
||||
- Different voice output
|
||||
|
||||
**Solutions:**
|
||||
Clear the model cache and re-download. Replace the `Qwen*` glob with the engine org prefix for other engines (`ResembleAI*` for Chatterbox, `HumeAI*` for TADA, `hexgrad*` for Kokoro, etc.) or use `DELETE /models/{name}` via the API.
|
||||
|
||||
```bash
|
||||
# macOS / Linux
|
||||
rm -rf ~/.cache/huggingface/hub/models--Qwen*
|
||||
|
||||
# Windows
|
||||
powershell -Command "Remove-Item -Recurse -Force $env:USERPROFILE\.cache\huggingface\hub\models--Qwen*"
|
||||
```
|
||||
|
||||
## Performance Issues
|
||||
|
||||
### Slow Generation on GPU
|
||||
|
||||
**Symptoms:**
|
||||
- Generation slower than expected
|
||||
- GPU not being utilized
|
||||
|
||||
**Solutions:**
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Verify CUDA Installation">
|
||||
```bash
|
||||
nvidia-smi
|
||||
```
|
||||
|
||||
Should show your GPU. If not, install CUDA drivers.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Check GPU Selection">
|
||||
If you have multiple GPUs, ensure Voicebox is using the right one.
|
||||
|
||||
Settings → Generation → GPU Device
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Update GPU Drivers">
|
||||
Outdated drivers can cause performance issues. Update to the latest NVIDIA drivers.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Apple Silicon: Confirm MLX Backend">
|
||||
Check Settings → Server Status. Should show **Backend: MLX** on Apple Silicon — MLX is 4–5× faster than PyTorch here. If it shows **Backend: PYTORCH**, reinstall MLX:
|
||||
|
||||
```bash
|
||||
pip install -r backend/requirements-mlx.txt
|
||||
```
|
||||
|
||||
GPU availability should read "Metal (Apple Silicon via MLX)".
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
### High Memory Usage
|
||||
|
||||
**Symptoms:**
|
||||
- App uses excessive RAM
|
||||
- System becomes sluggish
|
||||
|
||||
**Solutions:**
|
||||
- Close unused voice profiles
|
||||
- Clear generation history
|
||||
- Restart the app periodically
|
||||
|
||||
## Update Issues
|
||||
|
||||
### "Update Check Failed"
|
||||
|
||||
**Solutions:**
|
||||
- Confirm your internet connection — updates are fetched from GitHub releases.
|
||||
- Ensure `github.com` is accessible and not blocked by a firewall or proxy.
|
||||
- As a fallback, download the latest release from GitHub and install manually.
|
||||
|
||||
### "Invalid Signature" Error
|
||||
|
||||
**Solutions:**
|
||||
- Re-download the installer — the signature may have been corrupted in transit.
|
||||
- Verify the `.sig` file matches the installer; if it doesn't, file an issue.
|
||||
|
||||
## Remote Mode Issues
|
||||
|
||||
### Can't Connect to Remote Server
|
||||
|
||||
**Symptoms:**
|
||||
- "Connection refused" error
|
||||
- Remote server not found
|
||||
|
||||
**Solutions:**
|
||||
|
||||
<Steps>
|
||||
<Step title="Check Server Status">
|
||||
Ensure the remote server is running:
|
||||
|
||||
```bash
|
||||
curl http://<server-ip>:17493/health
|
||||
```
|
||||
</Step>
|
||||
|
||||
<Step title="Check Firewall">
|
||||
Ensure port 17493 is open on the remote server:
|
||||
|
||||
```bash
|
||||
# Allow port on Ubuntu/Debian
|
||||
sudo ufw allow 17493
|
||||
```
|
||||
</Step>
|
||||
|
||||
<Step title="Verify Network">
|
||||
- Ensure both machines are on the same network (for local servers)
|
||||
- Use IP address instead of hostname
|
||||
- Try pinging the server: `ping <server-ip>`
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
## Still Having Issues?
|
||||
|
||||
If you're still experiencing problems:
|
||||
|
||||
1. **Check GitHub Issues:** [github.com/jamiepine/voicebox/issues](https://github.com/jamiepine/voicebox/issues)
|
||||
2. **Open a New Issue:** Provide:
|
||||
- Operating system and version
|
||||
- Voicebox version
|
||||
- Steps to reproduce
|
||||
- Error messages or logs
|
||||
3. **Join Discord:** [discord.gg/voicebox](https://discord.gg/voicebox) (coming soon)
|
||||
|
||||
## Diagnostic Information
|
||||
|
||||
When reporting issues, include this information:
|
||||
|
||||
```bash
|
||||
# Voicebox version
|
||||
# Check Help → About in the app
|
||||
|
||||
# Operating system
|
||||
uname -a # macOS/Linux
|
||||
systeminfo # Windows
|
||||
|
||||
# Python version (for dev issues)
|
||||
python --version
|
||||
|
||||
# GPU info (if generation issues)
|
||||
nvidia-smi # NVIDIA GPUs
|
||||
```
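The same details can be collected in one step with a short script (the optional torch import only succeeds in a development environment):

```python
import platform
import sys

def diagnostics() -> dict[str, str]:
    """Collect the basic environment details worth pasting into a bug report."""
    info = {
        "os": f"{platform.system()} {platform.release()}",
        "machine": platform.machine(),
        "python": sys.version.split()[0],
    }
    try:
        import torch  # optional: only present in dev environments
        info["torch"] = torch.__version__
        info["cuda_available"] = str(torch.cuda.is_available())
    except ImportError:
        info["torch"] = "not installed"
    return info

if __name__ == "__main__":
    for key, value in diagnostics().items():
        print(f"{key}: {value}")
```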
|
||||
118
docs/content/docs/overview/voice-cloning.mdx
Normal file
@@ -0,0 +1,118 @@
|
||||
---
|
||||
title: "Voice Cloning"
|
||||
description: "Clone any voice from a few seconds of reference audio"
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Voicebox can replicate a specific person's voice from a short audio sample — known as **zero-shot voice cloning**. You provide 10-30 seconds of clear speech, the model extracts a voice embedding, and from then on you can generate any text in that voice.
|
||||
|
||||
Five engines in 0.4 support cloning:
|
||||
|
||||
| Engine | Languages | Strengths |
|
||||
| --------------------------- | --------- | -------------------------------------------------------------------------- |
|
||||
| **Qwen3-TTS** (0.6B / 1.7B) | 10 | High-quality multilingual; accepts natural-language delivery instructions alongside the reference audio |
|
||||
| **Chatterbox Multilingual** | 23 | Broadest language coverage — Arabic, Hindi, Swahili, Hebrew, more |
|
||||
| **Chatterbox Turbo** | English | Fast 350M model with paralinguistic emotion tags (`[laugh]`, `[sigh]`) |
|
||||
| **LuxTTS** | English | Lightweight (~1 GB VRAM), 48 kHz output, 150x realtime on CPU |
|
||||
| **TADA** (1B / 3B) | 10 | Speech-language model with 700s+ coherent long-form generation |
|
||||
|
||||
<Callout type="info">
|
||||
Don't want to record audio? Use a curated voice from Kokoro or Qwen CustomVoice instead — see [Preset Voices](/overview/preset-voices).
|
||||
</Callout>
|
||||
|
||||
## How It Works
|
||||
|
||||
<Steps>
|
||||
<Step title="Upload or Record Sample">
|
||||
Provide 10-30 seconds of clear speech from the target voice
|
||||
</Step>
|
||||
<Step title="Engine Analysis">
|
||||
The selected engine analyzes vocal characteristics, tone, and speaking patterns
|
||||
</Step>
|
||||
<Step title="Voice Profile Created">
|
||||
A voice embedding is generated and stored with your profile
|
||||
</Step>
|
||||
<Step title="Generate Speech">
|
||||
Use the profile to generate any text in the cloned voice
|
||||
</Step>
|
||||
</Steps>
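In API terms, the last two steps boil down to a profile ID plus a text payload sent to the local backend. The sketch below builds such a request; the `/generate` path and the field names are illustrative placeholders, so consult the generated OpenAPI reference for the real routes.

```python
import json
import urllib.request

BASE = "http://localhost:17493"  # local Voicebox backend

def build_generate_request(profile_id: str, text: str) -> urllib.request.Request:
    """Build (but don't send) a generation request; the route is illustrative."""
    payload = json.dumps({"profile_id": profile_id, "text": text}).encode()
    return urllib.request.Request(
        f"{BASE}/generate",  # placeholder path, not necessarily the real route
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = build_generate_request("my-profile", "Hello from a cloned voice.")
    with urllib.request.urlopen(req) as resp:  # requires the backend to be running
        print(resp.status)
```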
|
||||
|
||||
## Choosing an Engine for Cloning
|
||||
|
||||
Different engines suit different use cases. The profile grid greys out unsupported engines so you can switch easily.
|
||||
|
||||
| If you want… | Pick |
|
||||
| -------------------------------------------------- | --------------------- |
|
||||
| Best overall quality on a few common languages | **Qwen3-TTS 1.7B** |
|
||||
| Faster generation, slightly lower quality | **Qwen3-TTS 0.6B** |
|
||||
| Languages outside Qwen's 10 (Arabic, Hindi, etc.) | **Chatterbox Multilingual** |
|
||||
| Expressive English with `[laugh]` `[sigh]` tags | **Chatterbox Turbo** |
|
||||
| CPU-only or GPU-light setup, English | **LuxTTS** |
|
||||
| Long-form generation (audiobooks, full chapters) | **TADA 3B** |
|
||||
|
||||
## Best Practices
|
||||
|
||||
### Sample Quality
|
||||
|
||||
<Cards>
|
||||
<Card title="Do">
|
||||
- Use 10-30 seconds of audio
|
||||
- Clear, consistent speaking
|
||||
- Minimal background noise
|
||||
- Natural speaking pace
|
||||
</Card>
|
||||
<Card title="Don't">
|
||||
- Very short clips (< 5 seconds)
|
||||
- Heavy background noise
|
||||
- Music or overlapping voices
|
||||
- Heavily processed audio
|
||||
</Card>
|
||||
</Cards>
|
||||
|
||||
### Multiple Samples
|
||||
|
||||
Adding multiple samples from the same speaker can improve quality:
|
||||
|
||||
- Different speaking styles (casual, formal)
|
||||
- Different emotions (happy, serious)
|
||||
- Different recording conditions
|
||||
|
||||
<Callout type="info">
|
||||
The model learns a more robust representation from diverse samples, which is especially helpful for distinctive voices it might otherwise smooth over.
|
||||
</Callout>
|
||||
|
||||
## Supported Languages by Engine
|
||||
|
||||
- **Qwen3-TTS** — English, Chinese, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian (10)
|
||||
- **Chatterbox Multilingual** — Arabic, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Norwegian, Polish, Portuguese, Russian, Spanish, Swahili, Swedish, Turkish (23)
|
||||
- **Chatterbox Turbo** — English
|
||||
- **LuxTTS** — English
|
||||
- **TADA 3B** — 10 languages; **TADA 1B** — English only
|
||||
|
||||
For complete language tables and engine-specific notes, see the [TTS Engines developer guide](/developer/tts-engines).
|
||||
|
||||
## Limitations
|
||||
|
||||
<Callout type="warn">
|
||||
Voice cloning should only be used with consent. Ensure you have permission to clone someone's voice. See the project's [SECURITY.md](https://github.com/jamiepine/voicebox/blob/main/SECURITY.md) and your local laws on synthetic voice content.
|
||||
</Callout>
|
||||
|
||||
- Quality depends on sample clarity — noisy samples produce noisy clones
|
||||
- Works best with consistent speaking tone within a sample
|
||||
- May struggle with extreme accents or speech impediments
|
||||
- Background noise reduces quality and can introduce artifacts
|
||||
|
||||
## Next Steps
|
||||
|
||||
<Cards>
|
||||
<Card title="Creating Voice Profiles" href="/overview/creating-voice-profiles">
|
||||
Step-by-step guide to creating profiles
|
||||
</Card>
|
||||
<Card title="Preset Voices" href="/overview/preset-voices">
|
||||
Use built-in voices instead of cloning
|
||||
</Card>
|
||||
<Card title="Generating Speech" href="/overview/generating-speech">
|
||||
Use a profile to generate audio
|
||||
</Card>
|
||||
</Cards>
|
||||
1
docs/lib/cn.ts
Normal file
@@ -0,0 +1 @@
|
||||
export { twMerge as cn } from 'tailwind-merge';
|
||||
9
docs/lib/layout.shared.tsx
Normal file
@@ -0,0 +1,9 @@
|
||||
import type { BaseLayoutProps } from 'fumadocs-ui/layouts/shared';
|
||||
|
||||
export function baseOptions(): BaseLayoutProps {
|
||||
return {
|
||||
nav: {
|
||||
title: 'Voicebox',
|
||||
},
|
||||
};
|
||||
}
|
||||
5
docs/lib/openapi.ts
Normal file
@@ -0,0 +1,5 @@
|
||||
import { createOpenAPI } from 'fumadocs-openapi/server';
|
||||
|
||||
export const openapi = createOpenAPI({
|
||||
input: ['./openapi.json'],
|
||||
});
|
||||
27
docs/lib/source.ts
Normal file
@@ -0,0 +1,27 @@
|
||||
import { type InferPageType, loader } from 'fumadocs-core/source';
|
||||
import { lucideIconsPlugin } from 'fumadocs-core/source/lucide-icons';
|
||||
import { docs } from '@/.source';
|
||||
|
||||
// See https://fumadocs.dev/docs/headless/source-api for more info
|
||||
export const source = loader({
|
||||
baseUrl: '/',
|
||||
source: docs.toFumadocsSource(),
|
||||
plugins: [lucideIconsPlugin()],
|
||||
});
|
||||
|
||||
export function getPageImage(page: InferPageType<typeof source>) {
|
||||
const segments = [...page.slugs, 'image.png'];
|
||||
|
||||
return {
|
||||
segments,
|
||||
url: `/og/docs/${segments.join('/')}`,
|
||||
};
|
||||
}
|
||||
|
||||
export async function getLLMText(page: InferPageType<typeof source>) {
|
||||
const processed = await page.data.getText('processed');
|
||||
|
||||
return `# ${page.data.title}
|
||||
|
||||
${processed}`;
|
||||
}
|
||||
53
docs/mdx-components.tsx
Normal file
@@ -0,0 +1,53 @@
|
||||
import { Callout } from 'fumadocs-ui/components/callout';
|
||||
import { Card, Cards } from 'fumadocs-ui/components/card';
|
||||
import { File, Files, Folder } from 'fumadocs-ui/components/files';
|
||||
import { Step, Steps } from 'fumadocs-ui/components/steps';
|
||||
import { Tab, Tabs } from 'fumadocs-ui/components/tabs';
|
||||
import defaultMdxComponents from 'fumadocs-ui/mdx';
|
||||
import type { MDXComponents } from 'mdx/types';
|
||||
import type { ReactNode } from 'react';
|
||||
import { APIPage } from '@/components/api-page';
|
||||
|
||||
// Simple accordion using native HTML details/summary
|
||||
function AccordionGroup({ children }: { children: ReactNode }) {
|
||||
return <div className="my-6 space-y-2">{children}</div>;
|
||||
}
|
||||
|
||||
function Accordion({ title, children }: { title: string; children: ReactNode }) {
|
||||
return (
|
||||
<details className="group border rounded-lg p-4">
|
||||
<summary className="cursor-pointer font-semibold list-none">
|
||||
<span className="group-open:rotate-90 transition-transform inline-block mr-2">▶</span>
|
||||
{title}
|
||||
</summary>
|
||||
<div className="mt-4 pl-6">{children}</div>
|
||||
</details>
|
||||
);
|
||||
}
|
||||
|
||||
export function getMDXComponents(components?: MDXComponents): MDXComponents {
|
||||
return {
|
||||
...defaultMdxComponents,
|
||||
// Layout components
|
||||
Card,
|
||||
Cards,
|
||||
// Files
|
||||
Files,
|
||||
Folder,
|
||||
File,
|
||||
// Callouts
|
||||
Callout,
|
||||
// Tabs
|
||||
Tabs,
|
||||
Tab,
|
||||
// Steps
|
||||
Steps,
|
||||
Step,
|
||||
// Accordion (native HTML-based)
|
||||
AccordionGroup,
|
||||
Accordion,
|
||||
// OpenAPI component
|
||||
APIPage,
|
||||
...components,
|
||||
};
|
||||
}
|
||||
25
docs/next.config.mjs
Normal file
@@ -0,0 +1,25 @@
|
||||
import { createMDX } from 'fumadocs-mdx/next';
|
||||
|
||||
const withMDX = createMDX();
|
||||
|
||||
/** @type {import('next').NextConfig} */
|
||||
const config = {
|
||||
reactStrictMode: true,
|
||||
async rewrites() {
|
||||
return [
|
||||
{
|
||||
source: '/docs/:path*.mdx',
|
||||
destination: '/llms.mdx/docs/:path*',
|
||||
},
|
||||
];
|
||||
},
|
||||
webpack: (config) => {
|
||||
config.experiments = {
|
||||
...config.experiments,
|
||||
topLevelAwait: true,
|
||||
};
|
||||
return config;
|
||||
},
|
||||
};
|
||||
|
||||
export default withMDX(config);
|
||||
1441
docs/openapi.json
Normal file
File diff suppressed because it is too large
35
docs/package.json
Normal file
@@ -0,0 +1,35 @@
|
||||
{
|
||||
"name": "example-next-mdx",
|
||||
"version": "0.0.0",
|
||||
"private": true,
|
||||
"scripts": {
|
||||
"build": "fumadocs-mdx && next build",
|
||||
"dev": "fumadocs-mdx && next dev",
|
||||
"start": "next start",
|
||||
"postinstall": "fumadocs-mdx"
|
||||
},
|
||||
"dependencies": {
|
||||
"@radix-ui/react-popover": "^1.1.15",
|
||||
"class-variance-authority": "^0.7.1",
|
||||
"fumadocs-core": "^16.4.11",
|
||||
"fumadocs-mdx": "13",
|
||||
"fumadocs-openapi": "^10.2.7",
|
||||
"fumadocs-ui": "^16.4.11",
|
||||
"lucide-react": "^0.546.0",
|
||||
"next": "^16.1.6",
|
||||
"react": "^19.2.0",
|
||||
"react-dom": "^19.2.0",
|
||||
"shiki": "^3.22.0",
|
||||
"tailwind-merge": "^3.5.0"
|
||||
},
|
||||
"devDependencies": {
|
||||
"@tailwindcss/postcss": "^4.1.15",
|
||||
"@types/mdx": "^2.0.13",
|
||||
"@types/node": "^24.9.1",
|
||||
"@types/react": "^19.2.2",
|
||||
"@types/react-dom": "^19.2.2",
|
||||
"postcss": "^8.5.6",
|
||||
"tailwindcss": "^4.1.15",
|
||||
"typescript": "^5.9.3"
|
||||
}
|
||||
}
|
||||
758
docs/plans/DOCKER_DEPLOYMENT.md
Normal file
@@ -0,0 +1,758 @@
|
||||
# Docker Deployment Guide
|
||||
|
||||
**Status:** In Development for v0.2.0
|
||||
**Requested By:** Reddit community ([thread](https://reddit.com/r/LocalLLaMA/...))
|
||||
|
||||
## Overview
|
||||
|
||||
Docker support makes Voicebox easier to deploy, especially for:
|
||||
|
||||
- **Consistent Environments**: Same setup across dev/staging/prod
|
||||
- **GPU Passthrough**: Easy NVIDIA/AMD GPU access
|
||||
- **Server Deployments**: Run on headless Linux servers
|
||||
- **Multi-User Setups**: Isolate instances per user/team
|
||||
- **Cloud Platforms**: Deploy to AWS, GCP, Azure, DigitalOcean
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Using Pre-Built Images (Recommended)
|
||||
|
||||
```bash
|
||||
# CPU-only version
|
||||
docker run -p 8000:8000 -v voicebox-data:/app/data \
|
||||
ghcr.io/jamiepine/voicebox:latest
|
||||
|
||||
# NVIDIA GPU version
|
||||
docker run --gpus all -p 8000:8000 -v voicebox-data:/app/data \
|
||||
ghcr.io/jamiepine/voicebox:latest-cuda
|
||||
|
||||
# AMD GPU version (experimental)
|
||||
docker run --device=/dev/kfd --device=/dev/dri -p 8000:8000 \
|
||||
-v voicebox-data:/app/data \
|
||||
ghcr.io/jamiepine/voicebox:latest-rocm
|
||||
```
|
||||
|
||||
Then open: `http://localhost:8000`
|
||||
|
||||
### Using Docker Compose (Easiest)
|
||||
|
||||
Create `docker-compose.yml`:
|
||||
|
||||
```yaml
|
||||
version: '3.8'
|
||||
|
||||
services:
|
||||
voicebox:
|
||||
image: ghcr.io/jamiepine/voicebox:latest-cuda
|
||||
ports:
|
||||
- "8000:8000"
|
||||
volumes:
|
||||
- voicebox-data:/app/data
|
||||
- huggingface-cache:/root/.cache/huggingface
|
||||
environment:
|
||||
- GPU_MEMORY_FRACTION=0.8 # Use 80% of GPU memory
|
||||
- TTS_MODE=local
|
||||
- WHISPER_MODE=local
|
||||
deploy:
|
||||
resources:
|
||||
reservations:
|
||||
devices:
|
||||
- driver: nvidia
|
||||
count: 1
|
||||
capabilities: [gpu]
|
||||
|
||||
volumes:
|
||||
voicebox-data:
|
||||
huggingface-cache:
|
||||
```
|
||||
|
||||
Run:
|
||||
```bash
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
## Building From Source
|
||||
|
||||
### Basic Dockerfile
|
||||
|
||||
```dockerfile
|
||||
# Dockerfile
|
||||
FROM python:3.11-slim
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
# Install system dependencies
|
||||
RUN apt-get update && apt-get install -y \
|
||||
git \
|
||||
build-essential \
|
||||
ffmpeg \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
# Copy application
|
||||
COPY backend/ /app/backend/
|
||||
COPY requirements.txt /app/
|
||||
|
||||
# Install Python dependencies
|
||||
RUN pip install --no-cache-dir -r requirements.txt
|
||||
RUN pip install --no-cache-dir git+https://github.com/QwenLM/Qwen3-TTS.git
|
||||
|
||||
# Create data directory
|
||||
RUN mkdir -p /app/data
|
||||
|
||||
# Expose port
|
||||
EXPOSE 8000
|
||||
|
||||
# Run server
|
||||
CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
|
||||
```
|
||||
|
||||
Build and run:
|
||||
```bash
|
||||
docker build -t voicebox .
|
||||
docker run -p 8000:8000 -v $(pwd)/data:/app/data voicebox
|
||||
```
|
||||
|
||||
### Multi-Stage Build (Optimized)

Smaller image size by separating build and runtime:

```dockerfile
# Dockerfile.optimized
# Stage 1: Build dependencies
FROM python:3.11-slim AS builder

WORKDIR /build

RUN apt-get update && apt-get install -y \
    git build-essential && \
    rm -rf /var/lib/apt/lists/*

COPY backend/requirements.txt .
RUN pip install --no-cache-dir --target=/build/packages \
    -r requirements.txt

RUN pip install --no-cache-dir --target=/build/packages \
    git+https://github.com/QwenLM/Qwen3-TTS.git

# Stage 2: Runtime
FROM python:3.11-slim

WORKDIR /app

# Install only runtime dependencies
RUN apt-get update && apt-get install -y \
    ffmpeg \
    && rm -rf /var/lib/apt/lists/*

# Copy installed packages from builder
COPY --from=builder /build/packages /usr/local/lib/python3.11/site-packages/

# Copy application code
COPY backend/ /app/backend/

# Create data directory
RUN mkdir -p /app/data

EXPOSE 8000

CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Build:

```bash
docker build -f Dockerfile.optimized -t voicebox:slim .
```

## GPU Support

### NVIDIA GPUs (CUDA)

**Dockerfile:**

```dockerfile
FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04

# Install Python
RUN apt-get update && apt-get install -y \
    python3.11 python3-pip git ffmpeg && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Install PyTorch with CUDA support
COPY backend/requirements.txt .
RUN pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Install other dependencies
RUN pip3 install -r requirements.txt
RUN pip3 install git+https://github.com/QwenLM/Qwen3-TTS.git

COPY backend/ /app/backend/

EXPOSE 8000
CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

**Run with GPU:**

```bash
docker run --gpus all -p 8000:8000 \
  -v voicebox-data:/app/data \
  voicebox:cuda
```

**Docker Compose with GPU:**

```yaml
services:
  voicebox:
    image: voicebox:cuda
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

### AMD GPUs (ROCm) - Experimental

**Dockerfile:**

```dockerfile
FROM rocm/dev-ubuntu-22.04:6.0

# Install Python
RUN apt-get update && apt-get install -y \
    python3.11 python3-pip git ffmpeg && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Install PyTorch with ROCm support
COPY backend/requirements.txt .
RUN pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0

# Install other dependencies
RUN pip3 install -r requirements.txt
RUN pip3 install git+https://github.com/QwenLM/Qwen3-TTS.git

# Set ROCm environment variables
ENV HSA_OVERRIDE_GFX_VERSION=10.3.0
ENV ROCM_PATH=/opt/rocm

COPY backend/ /app/backend/

EXPOSE 8000
CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

**Run with AMD GPU:**

```bash
docker run --device=/dev/kfd --device=/dev/dri \
  --group-add video --ipc=host --cap-add=SYS_PTRACE \
  --security-opt seccomp=unconfined \
  -p 8000:8000 -v voicebox-data:/app/data \
  voicebox:rocm
```

**Note:** ROCm support varies by GPU model. Works best on Linux. See [AMD ROCm docs](https://rocm.docs.amd.com) for compatibility.

## Volume Mounts

### Essential Volumes

```bash
# voicebox-data: profiles, generations, history
# huggingface-cache: downloaded models
docker run -v voicebox-data:/app/data \
  -v huggingface-cache:/root/.cache/huggingface \
  -p 8000:8000 voicebox
```

### Development Volume Mounts

For development with hot-reload:

```bash
# Mount the backend source for live code changes
docker run -v $(pwd)/backend:/app/backend \
  -v voicebox-data:/app/data \
  -e RELOAD=true \
  -p 8000:8000 voicebox
```

### Custom Model Storage

Use an external model directory:

```bash
docker run -v /path/to/models:/models \
  -e MODELS_DIR=/models \
  -v voicebox-data:/app/data \
  -p 8000:8000 voicebox
```

## Environment Variables

Configure Voicebox via environment variables:

```bash
docker run -e TTS_MODE=local \
  -e WHISPER_MODE=openai-api \
  -e OPENAI_API_KEY=sk-... \
  -e GPU_MEMORY_FRACTION=0.8 \
  -e LOG_LEVEL=info \
  -p 8000:8000 voicebox
```

### Available Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `TTS_MODE` | `local` | TTS provider: `local`, `remote` |
| `TTS_REMOTE_URL` | - | URL for remote TTS server |
| `WHISPER_MODE` | `local` | Whisper provider: `local`, `openai-api`, `remote` |
| `WHISPER_REMOTE_URL` | - | URL for remote Whisper server |
| `OPENAI_API_KEY` | - | OpenAI API key (if using OpenAI Whisper) |
| `GPU_MEMORY_FRACTION` | `0.9` | Fraction of GPU memory to use (0.0-1.0) |
| `DATA_DIR` | `/app/data` | Directory for profiles/generations |
| `MODELS_DIR` | `/app/models` | Directory for local models |
| `LOG_LEVEL` | `info` | Logging level: `debug`, `info`, `warning`, `error` |
| `RELOAD` | `false` | Enable hot-reload for development |

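The variables in the table above can be read with a small settings helper. This is a hedged sketch using only the standard library; the actual backend may structure its configuration differently (for example via pydantic-settings), and the helper names here are illustrative:

```python
import os


def _get_float(name: str, default: float) -> float:
    """Read a float env var, falling back to the default on bad input."""
    try:
        return float(os.environ.get(name, default))
    except ValueError:
        return default


def load_settings() -> dict:
    """Collect Voicebox-style settings from the environment.

    Names and defaults mirror the table above; GPU_MEMORY_FRACTION is
    clamped to the documented 0.0-1.0 range.
    """
    return {
        "tts_mode": os.environ.get("TTS_MODE", "local"),
        "whisper_mode": os.environ.get("WHISPER_MODE", "local"),
        "gpu_memory_fraction": min(1.0, max(0.0, _get_float("GPU_MEMORY_FRACTION", 0.9))),
        "data_dir": os.environ.get("DATA_DIR", "/app/data"),
        "models_dir": os.environ.get("MODELS_DIR", "/app/models"),
        "log_level": os.environ.get("LOG_LEVEL", "info"),
        "reload": os.environ.get("RELOAD", "false").lower() in ("1", "true", "yes"),
    }
```

Clamping out-of-range fractions and tolerating malformed values keeps a container boot from failing on a typo in `docker run -e`.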
## Complete Docker Compose Examples

### Production Deployment

```yaml
# docker-compose.prod.yml
version: '3.8'

services:
  voicebox:
    image: ghcr.io/jamiepine/voicebox:latest-cuda
    container_name: voicebox
    restart: unless-stopped
    ports:
      - "8000:8000"
    volumes:
      - voicebox-data:/app/data
      - huggingface-cache:/root/.cache/huggingface
    environment:
      - TTS_MODE=local
      - WHISPER_MODE=local
      - GPU_MEMORY_FRACTION=0.8
      - LOG_LEVEL=info
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

volumes:
  voicebox-data:
    driver: local
  huggingface-cache:
    driver: local
```

Run:

```bash
docker compose -f docker-compose.prod.yml up -d
```

### Development Setup

```yaml
# docker-compose.dev.yml
version: '3.8'

services:
  voicebox:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    volumes:
      - ./backend:/app/backend:ro
      - voicebox-data:/app/data
      - huggingface-cache:/root/.cache/huggingface
    environment:
      - RELOAD=true
      - LOG_LEVEL=debug
      - TTS_MODE=local
    command: uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload

volumes:
  voicebox-data:
  huggingface-cache:
```

### Multi-Service Stack

Full stack with reverse proxy and monitoring:

```yaml
# docker-compose.stack.yml
version: '3.8'

services:
  # Main Voicebox app
  voicebox:
    image: ghcr.io/jamiepine/voicebox:latest-cuda
    restart: unless-stopped
    volumes:
      - voicebox-data:/app/data
      - huggingface-cache:/root/.cache/huggingface
    environment:
      - TTS_MODE=local
      - WHISPER_MODE=local
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

  # Nginx reverse proxy
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./ssl:/etc/nginx/ssl:ro
    depends_on:
      - voicebox

  # Prometheus monitoring (optional)
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus

volumes:
  voicebox-data:
  huggingface-cache:
  prometheus-data:
```

## Cloud Deployment

### AWS EC2

1. **Launch GPU Instance** (g4dn.xlarge or p3.2xlarge)
2. **Install Docker + nvidia-docker:**

   ```bash
   # Ubuntu 22.04 (the commands below use apt; adapt the repo setup for Amazon Linux)
   sudo apt-get update && sudo apt-get install -y docker.io
   sudo systemctl start docker
   distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
   curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
   curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
     sudo tee /etc/apt/sources.list.d/nvidia-docker.list
   sudo apt-get update && sudo apt-get install -y nvidia-docker2
   sudo systemctl restart docker
   ```

3. **Deploy:**

   ```bash
   docker run --gpus all -d -p 80:8000 \
     -v voicebox-data:/app/data \
     --restart unless-stopped \
     ghcr.io/jamiepine/voicebox:latest-cuda
   ```

### DigitalOcean

Use a GPU Droplet + Docker:

```bash
# Create droplet via CLI
doctl compute droplet create voicebox \
  --size gpu-h100x1-80gb \
  --image ubuntu-22-04-x64 \
  --region nyc3

# SSH and deploy (install the NVIDIA Container Toolkit first, as in the EC2 example)
ssh root@<droplet-ip>
curl -fsSL https://get.docker.com -o get-docker.sh
sh get-docker.sh
docker run --gpus all -d -p 80:8000 voicebox:cuda
```

### Google Cloud Run (CPU-only)

```bash
# Build and push
docker build -t gcr.io/your-project/voicebox .
docker push gcr.io/your-project/voicebox

# Deploy to Cloud Run
gcloud run deploy voicebox \
  --image gcr.io/your-project/voicebox \
  --platform managed \
  --region us-central1 \
  --memory 4Gi \
  --cpu 2 \
  --port 8000
```

### Fly.io

Create `fly.toml`:

```toml
app = "voicebox"

[build]
image = "ghcr.io/jamiepine/voicebox:latest"

[[services]]
http_checks = []
internal_port = 8000
protocol = "tcp"

[[services.ports]]
port = 80
handlers = ["http"]

[[services.ports]]
port = 443
handlers = ["tls", "http"]

[mounts]
source = "voicebox_data"
destination = "/app/data"
```

Deploy:

```bash
fly launch
fly deploy
```

## Troubleshooting

### GPU Not Detected

**Check NVIDIA Docker:**

```bash
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```

If this fails, reinstall nvidia-docker2.

**Check AMD ROCm:**

```bash
docker run --rm --device=/dev/kfd --device=/dev/dri rocm/dev-ubuntu-22.04:6.0 rocminfo
```

### Permission Errors

Container can't write to volumes:

```bash
# Fix permissions
docker run --user $(id -u):$(id -g) -v $(pwd)/data:/app/data voicebox
```

### Out of Memory

Reduce GPU memory usage:

```bash
docker run -e GPU_MEMORY_FRACTION=0.5 voicebox
```

Or use CPU-only:

```bash
docker run -e DEVICE=cpu voicebox
```

### Model Download Fails

Ensure the HuggingFace cache is writable:

```bash
docker run -v huggingface-cache:/root/.cache/huggingface voicebox
```

Or use the host cache:

```bash
docker run -v ~/.cache/huggingface:/root/.cache/huggingface voicebox
```

### Port Already in Use

Change the host port:

```bash
docker run -p 8080:8000 voicebox  # Use port 8080 instead
```

## Security Best Practices

### 1. Don't Run as Root

Create a non-root user in the Dockerfile:

```dockerfile
RUN useradd -m -u 1000 voicebox
USER voicebox
```

### 2. Use Secrets for API Keys

Don't put API keys in docker-compose.yml:

```bash
# Use Docker secrets
echo "sk-your-key" | docker secret create openai_key -

docker service create \
  --secret openai_key \
  -e OPENAI_API_KEY_FILE=/run/secrets/openai_key \
  voicebox
```

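On the application side, a `*_FILE` variable is conventionally resolved by reading the mounted secret file before falling back to the plain variable. A minimal sketch, assuming the backend honors `OPENAI_API_KEY_FILE` as shown above (the helper name is illustrative):

```python
import os
from typing import Optional


def read_secret(name: str) -> Optional[str]:
    """Resolve NAME from NAME_FILE (a Docker secret path) or the plain env var.

    The _FILE variant wins, so the key never has to appear in the
    container's environment or in `docker inspect` output.
    """
    path = os.environ.get(f"{name}_FILE")
    if path and os.path.exists(path):
        with open(path) as f:
            return f.read().strip()
    return os.environ.get(name)
```

Docker mounts each secret at `/run/secrets/<secret-name>`, so pointing `OPENAI_API_KEY_FILE` there is all the wiring the container needs.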
### 3. Network Isolation

Use internal networks for multi-container setups:

```yaml
services:
  voicebox:
    networks:
      - internal
  nginx:
    networks:
      - internal
      - external
    ports:
      - "80:80"

networks:
  internal:
    internal: true
  external:
```

### 4. Resource Limits

Prevent resource exhaustion:

```yaml
services:
  voicebox:
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 8G
        reservations:
          cpus: '2'
          memory: 4G
```

## Performance Tuning

### GPU Memory Management

```bash
# Use 80% of GPU memory (default 90%)
docker run -e GPU_MEMORY_FRACTION=0.8 voicebox

# Allow GPU memory growth (prevents OOM)
docker run -e TF_FORCE_GPU_ALLOW_GROWTH=true voicebox
```

### Model Caching

Pre-download models to a volume:

```bash
# Download models first
docker run --rm -v huggingface-cache:/root/.cache/huggingface \
  voicebox python -c "
from transformers import WhisperProcessor, WhisperForConditionalGeneration
WhisperProcessor.from_pretrained('openai/whisper-base')
WhisperForConditionalGeneration.from_pretrained('openai/whisper-base')
"

# Then run normally
docker run -v huggingface-cache:/root/.cache/huggingface voicebox
```

### Multi-Worker Setup

Use uvicorn workers for better throughput. Note that each worker loads its own copy of the models, so GPU memory usage scales with the worker count:

```dockerfile
CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
```

## Monitoring

### Health Checks

Built-in health endpoint:

```bash
curl http://localhost:8000/health
```

Docker health check:

```yaml
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
  interval: 30s
  timeout: 10s
  retries: 3
```

### Prometheus Metrics

Add a metrics exporter:

```python
# backend/main.py
from prometheus_fastapi_instrumentator import Instrumentator

Instrumentator().instrument(app).expose(app)
```

Then scrape `/metrics` with Prometheus.

### Logs

View container logs:

```bash
docker logs -f voicebox

# Or with compose
docker compose logs -f voicebox
```

## Next Steps

- [ ] Publish official images to GitHub Container Registry
- [ ] Add Kubernetes Helm charts
- [ ] Create Docker Desktop extension
- [ ] Add automated vulnerability scanning
- [ ] Support ARM64 builds for Raspberry Pi / Apple Silicon

## Contributing

Help improve Docker support:

1. Test on different platforms (AMD GPU, ARM64, etc.)
2. Submit Dockerfile optimizations
3. Share deployment configurations
4. Report issues: [GitHub Issues](https://github.com/jamiepine/voicebox/issues)

## Resources

- [Docker Documentation](https://docs.docker.com)
- [NVIDIA Container Toolkit](https://github.com/NVIDIA/nvidia-docker)
- [AMD ROCm Docker](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html)
- [Docker Compose Reference](https://docs.docker.com/compose/compose-file/)

235
docs/plans/OPENAI_SUPPORT.md
Normal file
@@ -0,0 +1,235 @@

# OpenAI API Compatibility

**Status:** Planned for v0.2.0

**Issue:** [#10 OpenAI API compatibility](https://github.com/jamiepine/voicebox/issues/10)

## Overview

This feature exposes OpenAI-compatible endpoints from Voicebox, allowing any tool, library, or application that speaks the OpenAI Audio API to use Voicebox as a drop-in local replacement.

```mermaid
flowchart LR
    subgraph clients [External Clients]
        SDK[OpenAI SDK]
        Curl[curl / HTTP]
        Apps[Third-party Apps]
    end

    subgraph voicebox [Voicebox Server]
        OpenAI["/v1/audio/* endpoints"]
        TTS[TTSModel]
        Whisper[WhisperModel]
        Profiles[Voice Profiles]
    end

    SDK --> OpenAI
    Curl --> OpenAI
    Apps --> OpenAI
    OpenAI --> TTS
    OpenAI --> Whisper
    OpenAI --> Profiles
```

## Use Cases

- **OpenAI SDK users**: `openai.audio.speech.create()` works with Voicebox
- **LLM frameworks**: LangChain, AutoGen, etc. can use Voicebox for TTS
- **Shell scripts**: `curl` commands copy-pasted from OpenAI docs work
- **Existing integrations**: Any tool expecting OpenAI's API works without code changes

## Endpoints to Implement

### 1. `POST /v1/audio/speech` (TTS)

OpenAI spec: https://platform.openai.com/docs/api-reference/audio/createSpeech

**Request:**

```json
{
  "model": "tts-1",
  "input": "Hello world!",
  "voice": "alloy",
  "response_format": "mp3",
  "speed": 1.0
}
```

**Response:** Audio file (mp3, wav, opus, aac, flac, pcm)

**Voice Mapping Strategy:**

- `voice` parameter maps to Voicebox profile names (case-insensitive)
- If no match, use a configurable default profile
- Support special syntax: `voice: "profile:uuid"` for explicit profile ID

### 2. `POST /v1/audio/transcriptions` (Whisper)

OpenAI spec: https://platform.openai.com/docs/api-reference/audio/createTranscription

**Request:** (multipart/form-data)

- `file`: Audio file
- `model`: "whisper-1"
- `language`: Optional language hint
- `response_format`: json, text, srt, verbose_json, vtt

**Response:**

```json
{
  "text": "Hello world!"
}
```

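For the `srt` and `vtt` response formats, most of the work is timestamp formatting over the transcriber's segment list. A hedged sketch; the segment dict shape (`start`, `end`, `text`) is an assumption, since Whisper implementations expose segments slightly differently:

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render [{'start': float, 'end': float, 'text': str}, ...] as SRT."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{i}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

VTT differs mainly in the header line (`WEBVTT`) and a `.` instead of `,` as the millisecond separator, so the same helper can back both formats.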
## Implementation Details

### New File: `backend/openai_compat.py`

Create a dedicated module with an APIRouter for OpenAI-compatible endpoints:

```python
from fastapi import APIRouter, Depends, UploadFile, File, Form, HTTPException
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from typing import Literal, Optional
# Session and get_db come from the app's existing database module

router = APIRouter(prefix="/v1/audio", tags=["OpenAI Compatible"])

class SpeechRequest(BaseModel):
    model: str = "tts-1"
    input: str
    voice: str = "alloy"
    response_format: Literal["mp3", "wav", "opus", "aac", "flac", "pcm"] = "mp3"
    speed: float = 1.0

@router.post("/speech")
async def create_speech(request: SpeechRequest, db: Session = Depends(get_db)):
    # 1. Map voice name to profile
    # 2. Generate audio using existing TTSModel
    # 3. Convert to requested format
    # 4. Return audio stream
    ...

@router.post("/transcriptions")
async def create_transcription(
    file: UploadFile = File(...),
    model: str = Form("whisper-1"),
    language: Optional[str] = Form(None),
    response_format: str = Form("json"),
):
    # 1. Save uploaded file
    # 2. Transcribe using existing WhisperModel
    # 3. Return in requested format
    ...
```

### Voice Profile Resolution

Add a helper in [backend/profiles.py](backend/profiles.py):

```python
async def resolve_voice_for_openai(voice: str, db: Session) -> Optional[VoiceProfile]:
    """
    Resolve OpenAI voice parameter to a Voicebox profile.

    Priority:
    1. Exact profile name match (case-insensitive)
    2. Profile ID match (if voice starts with "profile:")
    3. Default profile from config
    4. First available profile
    """
    ...
```

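The priority chain above can be sketched without a database by treating profiles as plain dicts. A hedged, in-memory version; `resolve_voice` and the dict shape are illustrative stand-ins for the planned SQLAlchemy-backed helper:

```python
from typing import Optional


def resolve_voice(
    voice: str,
    profiles: list[dict],
    default_id: Optional[str] = None,
) -> Optional[dict]:
    """In-memory sketch of the resolution priority described above."""
    # 1. Exact profile name match (case-insensitive)
    for p in profiles:
        if p["name"].lower() == voice.lower():
            return p
    # 2. Explicit "profile:<id>" syntax
    if voice.startswith("profile:"):
        wanted = voice.split(":", 1)[1]
        for p in profiles:
            if p["id"] == wanted:
                return p
        return None  # an explicit ID that does not exist should not fall through
    # 3. Configured default profile
    if default_id is not None:
        for p in profiles:
            if p["id"] == default_id:
                return p
    # 4. First available profile
    return profiles[0] if profiles else None
```

The real helper would replace the list scans with database queries, but the branch order is the part worth pinning down early, since it defines what `voice: "alloy"` does on a fresh install.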
### Audio Format Conversion

Add conversion utilities in [backend/utils/audio.py](backend/utils/audio.py):

```python
def convert_audio_format(
    audio: np.ndarray,
    sample_rate: int,
    target_format: str,  # mp3, wav, opus, aac, flac, pcm
) -> bytes:
    """Convert audio to target format using ffmpeg or pydub."""
    ...
```

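The codec-free targets (`wav`, `pcm`) need no external tooling at all. A hedged sketch of just those two cases using only the standard library; the compressed formats (mp3, opus, aac, flac) would still go through ffmpeg or pydub as the stub above suggests, and plain float lists stand in for the `np.ndarray` input:

```python
import io
import struct
import wave


def floats_to_pcm16(samples: list[float]) -> bytes:
    """Clamp float samples to [-1, 1] and pack as little-endian 16-bit PCM."""
    clamped = (max(-1.0, min(1.0, s)) for s in samples)
    return b"".join(struct.pack("<h", int(s * 32767)) for s in clamped)


def to_wav(samples: list[float], sample_rate: int) -> bytes:
    """Wrap raw PCM samples in a mono WAV container."""
    pcm = floats_to_pcm16(samples)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return buf.getvalue()
```

Since OpenAI's `pcm` format is raw 24 kHz 16-bit little-endian audio, `floats_to_pcm16` alone covers that branch, and `to_wav` covers `wav`.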
### Configuration

Add to [backend/config.py](backend/config.py):

```python
# OpenAI API Compatibility
OPENAI_COMPAT_ENABLED = True
OPENAI_COMPAT_DEFAULT_VOICE = None  # Profile ID or name for default voice
OPENAI_COMPAT_REQUIRE_AUTH = False  # Require API key validation
OPENAI_COMPAT_API_KEY = None  # If set, validate against this
```

### Integration with main.py

In [backend/main.py](backend/main.py), include the router:

```python
from . import openai_compat

# Add OpenAI-compatible routes
if config.OPENAI_COMPAT_ENABLED:
    app.include_router(openai_compat.router)
```

## Streaming Support (Future Enhancement)

Initial implementation returns complete audio. Streaming can be added later (this assumes a `stream: bool = False` field is added to `SpeechRequest`):

```python
@router.post("/speech")
async def create_speech(request: SpeechRequest):
    if request.stream:
        return StreamingResponse(
            generate_audio_chunks(request),
            media_type=f"audio/{request.response_format}"
        )
    ...
```

## Testing

Example usage after implementation:

```bash
# TTS with curl
curl http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "input": "Hello!", "voice": "MyProfile"}' \
  --output speech.mp3

# Transcription
curl http://localhost:8000/v1/audio/transcriptions \
  -F file=@audio.mp3 \
  -F model="whisper-1"
```

With the OpenAI Python SDK:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
response = client.audio.speech.create(
    model="tts-1",
    voice="MyProfile",
    input="Hello world!"
)
response.stream_to_file("output.mp3")
```

## Security Considerations

- Optional API key validation (for shared deployments)
- Rate limiting on endpoints
- Input length limits (same as existing `/generate` endpoint)

## Dependencies

- `pydub` or `ffmpeg-python` for audio format conversion (mp3, opus, etc.)
- No changes to existing TTS/Whisper model code

5
docs/postcss.config.mjs
Normal file
@@ -0,0 +1,5 @@

export default {
  plugins: {
    '@tailwindcss/postcss': {},
  },
};

BIN
docs/public/images/app-screenshot-1.webp
Normal file
Binary file not shown.
After Width: | Height: | Size: 134 KiB

BIN
docs/public/images/app-screenshot-2.webp
Normal file
Binary file not shown.
After Width: | Height: | Size: 129 KiB

BIN
docs/public/images/app-screenshot-3.webp
Normal file
Binary file not shown.
After Width: | Height: | Size: 108 KiB

BIN
docs/public/logo/icon-dark.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 10 KiB

BIN
docs/public/logo/icon-light.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 10 KiB

10
docs/scripts/generate-openapi.ts
Normal file
@@ -0,0 +1,10 @@

import { generateFiles } from 'fumadocs-openapi';
import { openapi } from '../lib/openapi';

await generateFiles({
  input: openapi,
  output: 'content/docs/api-reference',
  groupBy: 'tag',
});

console.log('✓ OpenAPI documentation generated in content/docs/api-reference/');

22
docs/source.config.ts
Normal file
@@ -0,0 +1,22 @@

import { defineConfig, defineDocs, frontmatterSchema, metaSchema } from 'fumadocs-mdx/config';

// You can customise Zod schemas for frontmatter and `meta.json` here
// see https://fumadocs.dev/docs/mdx/collections
export const docs = defineDocs({
  dir: 'content/docs',
  docs: {
    schema: frontmatterSchema,
    postprocess: {
      includeProcessedMarkdown: true,
    },
  },
  meta: {
    schema: metaSchema,
  },
});

export default defineConfig({
  mdxOptions: {
    // MDX options
  },
});

36
docs/tsconfig.json
Normal file
@@ -0,0 +1,36 @@

{
  "compilerOptions": {
    "baseUrl": ".",
    "target": "ESNext",
    "lib": ["dom", "dom.iterable", "esnext"],
    "allowJs": true,
    "skipLibCheck": true,
    "strict": true,
    "forceConsistentCasingInFileNames": true,
    "noEmit": true,
    "esModuleInterop": true,
    "module": "esnext",
    "moduleResolution": "bundler",
    "resolveJsonModule": true,
    "isolatedModules": true,
    "jsx": "react-jsx",
    "incremental": true,
    "paths": {
      "@/*": ["./*"],
      "@/.source": [".source"]
    },
    "plugins": [
      {
        "name": "next"
      }
    ]
  },
  "include": [
    "next-env.d.ts",
    "**/*.ts",
    "**/*.tsx",
    ".next/types/**/*.ts",
    ".next/dev/types/**/*.ts"
  ],
  "exclude": ["node_modules"]
}