Initial commit

2026-04-24 19:18:15 +08:00
commit fbcbe08696
555 changed files with 96692 additions and 0 deletions


@@ -0,0 +1,120 @@
---
name: add-tts-engine
description: Use this skill to add a new TTS engine to Voicebox. It walks through dependency research, backend implementation, frontend wiring, PyInstaller bundling, and frozen-build testing. Always start with Phase 0 (dependency audit) before writing any code.
---
# Add TTS Engine
## Goal
Integrate a new text-to-speech engine into Voicebox end-to-end: dependency research, backend protocol implementation, frontend UI wiring, PyInstaller bundling, and frozen-build verification. The user should only need to test the final build locally.
## Reference Doc
The full phased guide lives at `docs/content/docs/developer/tts-engines.mdx`. **Read this file in its entirety before starting.** It contains:
- Phase 0: Dependency research (mandatory before writing code)
- Phase 1: Backend implementation (`TTSBackend` protocol)
- Phase 2: Route and service integration (usually zero changes)
- Phase 3: Frontend integration (5 files)
- Phase 4: Dependencies (`requirements.txt`, justfile, CI, Docker)
- Phase 5: PyInstaller bundling (`build_binary.py` + `server.py`)
- Phase 6: Common upstream workarounds
- Implementation checklist (gate between phases)
## Workflow
### 1. Read the guide
```bash
# Read the full TTS engines doc
cat docs/content/docs/developer/tts-engines.mdx
```
Internalize all phases, especially Phase 0 and Phase 5. Shipping v0.2.3 took three patch releases because Phase 0 was skipped.
### 2. Dependency research (Phase 0)
Clone the model library into a temporary directory and audit it. Do NOT skip this.
```bash
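mkdir -p /tmp/engine-research && cd /tmp/engine-research
git clone <model-library-url>
# Throwaway venv for the CPU load/generate test
python3 -m venv .venv && source .venv/bin/activate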
```
Run the grep searches from Phase 0.2 in the guide against the cloned source and its transitive dependencies. Produce a written dependency audit covering:
1. PyPI vs non-PyPI packages
2. PyInstaller directives needed (`--collect-all`, `--copy-metadata`, `--hidden-import`)
3. Runtime data files that must be bundled
4. Native library paths that need env var overrides in frozen builds
5. Monkey-patches needed (`torch.load`, float64, MPS, HF token)
6. Sample rate
7. Model download method (`from_pretrained` vs `snapshot_download` + `from_local`)
Test model loading and generation on CPU in the throwaway venv before proceeding.
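The pattern search for the audit can also be scripted. The sketch below is a stand-in under stated assumptions, not the grep commands from Phase 0.2 of the guide: the pattern list is taken from the Key Lessons table in this skill, and `RISK_PATTERNS`, `scan_text`, and `scan_tree` are hypothetical names.

```python
import re
from pathlib import Path

# Assumed patterns, drawn from the Key Lessons table; extend with the guide's own searches.
RISK_PATTERNS = {
    "getsource": re.compile(r"inspect\.getsource|@typechecked"),
    "metadata": re.compile(r"importlib\.metadata"),
    "torch_load": re.compile(r"torch\.load\("),
    "from_numpy": re.compile(r"torch\.from_numpy\("),
    "hf_token": re.compile(r"token\s*=\s*True"),
}


def scan_text(text: str) -> dict[str, list[int]]:
    """Return {pattern name: [1-based line numbers]} for one source file's text."""
    hits: dict[str, list[int]] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in RISK_PATTERNS.items():
            if pattern.search(line):
                hits.setdefault(name, []).append(lineno)
    return hits


def scan_tree(root: str) -> dict[str, dict[str, list[int]]]:
    """Scan every .py file under root; keep only files with at least one hit."""
    report = {}
    for path in sorted(Path(root).rglob("*.py")):
        hits = scan_text(path.read_text(encoding="utf-8", errors="ignore"))
        if hits:
            report[str(path)] = hits
    return report
```

Point `scan_tree` at the cloned library first, then at the venv's `site-packages` directory to cover transitive dependencies. A hit is only a prompt to read the code; it does not replace the written audit.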
### 3. Implement (Phases 1-4)
Follow the guide's phases in order. Key files to modify:
**Backend (Phase 1):**
- Create `backend/backends/<engine>_backend.py`
- Register in `backend/backends/__init__.py` (ModelConfig + TTS_ENGINES + factory)
- Update regex in `backend/models.py`
**Frontend (Phase 3):**
- `app/src/lib/api/types.ts` — engine union type
- `app/src/lib/constants/languages.ts` — ENGINE_LANGUAGES
- `app/src/components/Generation/EngineModelSelector.tsx` — ENGINE_OPTIONS, ENGINE_DESCRIPTIONS
- `app/src/lib/hooks/useGenerationForm.ts` — Zod schema, model-name mapping
- `app/src/components/ServerSettings/ModelManagement.tsx` — MODEL_DESCRIPTIONS
**Dependencies (Phase 4):**
- `backend/requirements.txt`
- `justfile` (setup-python, setup-python-release targets)
- `.github/workflows/release.yml`
- `Dockerfile` (if applicable)
### 4. PyInstaller bundling (Phase 5)
Register the engine in `backend/build_binary.py`:
- `--hidden-import` for the backend module and model package
- `--collect-all` for packages using `inspect.getsource`, shipping data files, or native libraries
- `--copy-metadata` for packages using `importlib.metadata`
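One way to keep the three directive types straight is to derive the flags from the Phase 0 audit lists. This is a minimal sketch: the helper name `pyinstaller_flags` and the package names in the usage note are hypothetical, and how `backend/build_binary.py` actually assembles its arguments is not shown here.

```python
def pyinstaller_flags(hidden_imports=(), collect_all=(), copy_metadata=()):
    """Flatten package names into PyInstaller CLI arguments, one flag per package."""
    args: list[str] = []
    for flag, names in (
        ("--hidden-import", hidden_imports),
        ("--collect-all", collect_all),
        ("--copy-metadata", copy_metadata),
    ):
        for name in names:
            args += [flag, name]
    return args
```

For a hypothetical engine package `example_tts`, `pyinstaller_flags(hidden_imports=["backend.backends.example_backend"], collect_all=["example_tts"])` yields the argument list to append to the existing PyInstaller invocation.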
If the engine has native data paths, add `os.environ.setdefault()` in `backend/server.py` inside the `if getattr(sys, 'frozen', False):` block.
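The frozen-build override might look roughly like the sketch below. It assumes PyInstaller's `sys._MEIPASS` bundle directory is the right base path; the function name `set_frozen_data_paths` and the variable `EXAMPLE_DATA_PATH` are hypothetical, so take the real variable name from the library's source during Phase 0.

```python
import os
import sys
from pathlib import Path


def set_frozen_data_paths(relative_paths: dict[str, str]) -> dict[str, str]:
    """In a frozen build, point env vars at bundled data without overriding user values."""
    if not getattr(sys, "frozen", False):
        return {}  # dev mode: leave the environment untouched
    # PyInstaller sets sys._MEIPASS to the bundle directory; fall back to the exe folder.
    bundle_dir = Path(getattr(sys, "_MEIPASS", Path(sys.executable).parent))
    applied = {}
    for var, relative in relative_paths.items():
        # setdefault keeps any value the user already exported.
        applied[var] = os.environ.setdefault(var, str(bundle_dir / relative))
    return applied
```

Using `setdefault` rather than plain assignment means a user who exports the variable to point at a system install still wins over the bundled copy.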
### 5. Verify in dev mode
```bash
just dev
```
Test the full chain: model download → load → generate → voice cloning.
### 6. Use the checklist
Walk through the Implementation Checklist at the bottom of `tts-engines.mdx`. Every item must be checked before handing the build to the user.
## Key Lessons (from v0.2.3)
These are the most common failure modes. Phase 0 research catches all of them:
| Pattern | Symptom in Frozen Build | Fix |
|---------|------------------------|-----|
| `@typechecked` / `inspect.getsource()` | "could not get source code" | `--collect-all <package>` |
| Package ships pretrained model files | `FileNotFoundError` for `.pth.tar`, `.yaml` | `--collect-all <package>` |
| C library with hardcoded system paths | `FileNotFoundError` for `/usr/share/...` | `--collect-all` + env var in `server.py` |
| `importlib.metadata.version()` | "No package metadata found" | `--copy-metadata <package>` |
| `torch.load` without `map_location` | CUDA device not available on CPU build | Monkey-patch `torch.load` |
| `torch.from_numpy` on float64 data | dtype mismatch RuntimeError | Cast to `.float()` |
| `token=True` in HF download calls | Auth failure without stored HF token | Use `snapshot_download(token=None)` + `from_local()` |
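The `torch.load` row can be handled with a small wrapper that supplies a default keyword argument. The sketch below is generic and stdlib-only; `with_default_kwarg` is a hypothetical helper, and applying it to `torch.load` (shown in the comment) assumes upstream code does not pass `map_location` positionally.

```python
import functools


def with_default_kwarg(func, name, value):
    """Wrap func so that keyword `name` defaults to `value` when the caller omits it."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Only fills the gap; an explicit keyword from the caller is left alone.
        kwargs.setdefault(name, value)
        return func(*args, **kwargs)

    return wrapper


# In a backend module, before the upstream library loads its checkpoint
# (requires torch to be installed):
#   import torch
#   torch.load = with_default_kwarg(torch.load, "map_location", "cpu")
```

Keep the original function around if the patch should be undone after model load, so that other engines see an unmodified `torch.load`.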
## Notes
- The route and service layers have zero per-engine dispatch points. `main.py` requires zero changes.
- The model config registry in `backends/__init__.py` handles all dispatch automatically.
- Use `get_torch_device()` and `model_load_progress()` from `backends/base.py` — don't reimplement device detection or progress tracking.
- Always test with a **clean HuggingFace cache** (no pre-downloaded models from dev).
- Do NOT push or create a release. Hand the build to the user for local testing.


@@ -0,0 +1,94 @@
---
name: draft-release-notes
description: Use this skill to draft or update the [Unreleased] section of CHANGELOG.md from the actual changes since the last tag. Run this at any point during development to keep a working copy of the release narrative. Does NOT bump versions or create tags.
---
# Draft Release Notes
## Goal
Update the `[Unreleased]` section at the top of `CHANGELOG.md` with a narrative release story based on the real changes since the last tag. This is a **non-destructive working copy** — run it as many times as you want during development.
## Workflow
1. **Identify the last release tag and gather changes.**
```bash
LAST_TAG=$(git tag --list "v*" --sort=-v:refname | head -n 1)
echo "Last tag: $LAST_TAG"
```
Then collect raw material from three sources:
a. **Commit log since last tag:**
```bash
git log --oneline "$LAST_TAG"..HEAD
```
b. **GitHub-generated release notes preview** (PR titles, new contributors):
```bash
gh api "repos/{owner}/{repo}/releases/generate-notes" \
-f tag_name="vNEXT" \
-f target_commitish="$(git rev-parse HEAD)" \
-f previous_tag_name="$LAST_TAG" \
--jq '.body'
```
c. **Diff stat for theme analysis:**
```bash
git diff --stat "$LAST_TAG"..HEAD
```
2. **Draft the release narrative.**
Write markdown for the `[Unreleased]` section following the format below. Do not include the `## [Unreleased]` heading itself — just the body content.
3. **Update CHANGELOG.md.**
Replace everything between `## [Unreleased]` and the next `## [` heading with the new draft. Preserve the HTML comment header and all existing release sections below.
The `[Unreleased]` section must always exist and always be the first section after the header comments.
4. **Do NOT commit, tag, or bump versions.** Just leave the file modified in the working tree.
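The replacement in step 3 can be sketched as a single regex substitution. This assumes the headings are written exactly as `## [Unreleased]` and `## [X.Y.Z] ...`, and that at least one stamped section follows (otherwise everything to the end of the file is treated as the section body); `replace_unreleased` is a hypothetical helper name.

```python
import re

# Heading line, then the lazy body up to the next "## [" heading or end of file.
UNRELEASED_RE = re.compile(
    r"(^## \[Unreleased\][^\n]*\n)(.*?)(?=^## \[|\Z)",
    re.MULTILINE | re.DOTALL,
)


def replace_unreleased(changelog: str, draft: str) -> str:
    """Swap the body of the [Unreleased] section for `draft`, leaving the rest untouched."""
    body = draft.strip()

    def _rebuild(match: re.Match) -> str:
        # Blank line before and after the body; a bare blank line when there is no draft.
        block = f"\n{body}\n\n" if body else "\n"
        return match.group(1) + block

    new_text, count = UNRELEASED_RE.subn(_rebuild, changelog, count=1)
    if count != 1:
        raise ValueError("CHANGELOG.md has no '## [Unreleased]' heading")
    return new_text
```

Because everything before the heading is outside the match, the HTML comment header survives unchanged, and passing an empty draft leaves just the heading, which matches the "no changes" case.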
## Release Story Format
Structure the `[Unreleased]` section like this:
```markdown
## [Unreleased]
<One strong opening paragraph: what this release is about and why it matters.
Tie it to concrete shipped changes. No vague hype.>
<One paragraph on major technical shifts, if applicable.>
### <Feature/Theme Group>
- Bullet points with specifics
- Reference PRs where available: ([#123](https://github.com/jamiepine/voicebox/pull/123))
### <Another Group>
- ...
### Bug Fixes
- ...
```
### Style Guidelines
- **Factual and specific.** Every claim should trace to a real commit or PR.
- **Narrative over list.** Lead with paragraphs that tell the story, then support with bullets.
- **Group by theme, not by commit.** Cluster related changes under descriptive headings.
- **Reference PRs** where they exist, but don't fabricate them.
- **Skip trivial chores** (typo fixes, CI tweaks) unless they're the bulk of the release.
- **Match the voice of existing releases** — look at the v0.2.1 and v0.2.3 entries in CHANGELOG.md for tone reference.
## When There Are No Changes
If `git log "$LAST_TAG"..HEAD` is empty, leave the `[Unreleased]` section empty (just the heading) and tell the user there's nothing to draft.
## Notes
- This skill only touches the `[Unreleased]` section. It never modifies stamped release sections.
- The agent can be asked to run this skill at any point — mid-feature, before a PR, or right before cutting a release.
- The `release-bump` skill depends on this draft being up to date before it finalizes.


@@ -0,0 +1,124 @@
---
name: release-bump
description: Use this skill to finalize a release. It stamps the [Unreleased] changelog section with a version and date, runs bumpversion to update all version files, and creates the release commit and tag. Only run this when you're ready to ship.
---
# Release Bump
## Goal
Finalize the changelog draft, bump the version across all tracked files, and create a tagged release commit. After this skill runs, the repo has a clean release commit and tag ready to push.
## Prerequisites
- `gh` CLI installed and authenticated (`gh auth status`).
- `bumpversion` installed (`pip install bumpversion` or available in the project venv).
- The `[Unreleased]` section of `CHANGELOG.md` should already contain the release narrative. If it's empty or stale, run the `draft-release-notes` skill first.
## Workflow
1. **Verify the working tree is clean** (except `CHANGELOG.md` which may have the draft).
```bash
git status --porcelain
```
Only `CHANGELOG.md` (and optionally `.agents/` files) should be modified. If there are other uncommitted changes, stop and ask the user to commit or stash them first.
2. **Determine the bump level.**
Ask the user if not specified: `patch`, `minor`, or `major`. Check the current version:
```bash
grep '^current_version' .bumpversion.cfg
```
3. **Stamp the changelog.**
Read the current `[Unreleased]` content from `CHANGELOG.md`. Compute the new version (based on bump level and current version). Then:
a. Replace the `## [Unreleased]` section body with an empty placeholder.
b. Insert a new stamped section immediately after `## [Unreleased]`:
```markdown
## [Unreleased]
## [X.Y.Z] - YYYY-MM-DD
<the content that was in [Unreleased]>
```
c. Update the reference links at the bottom of the file:
- Change the `[Unreleased]` link to compare against the new tag
- Add a new link for the new version
```markdown
[Unreleased]: https://github.com/jamiepine/voicebox/compare/vX.Y.Z...HEAD
[X.Y.Z]: https://github.com/jamiepine/voicebox/compare/vPREVIOUS...vX.Y.Z
```
4. **Stage the changelog.**
```bash
git add CHANGELOG.md
```
5. **Run bumpversion.**
```bash
bumpversion --allow-dirty <patch|minor|major>
```
The `--allow-dirty` flag is needed because `CHANGELOG.md` is already staged. bumpversion will:
- Update version strings in all tracked files (see `.bumpversion.cfg`)
- Create a commit with message `Bump version: X.Y.Z -> A.B.C`
- Create a tag `vA.B.C`
The staged `CHANGELOG.md` will be included in this commit automatically.
6. **Verify results.**
```bash
git show --name-only --stat HEAD
git tag --list "v*" --sort=-v:refname | head -n 5
```
Confirm the commit contains:
- `CHANGELOG.md`
- `.bumpversion.cfg`
- `tauri/src-tauri/tauri.conf.json`
- `tauri/src-tauri/Cargo.toml`
- `package.json`
- `app/package.json`
- `tauri/package.json`
- `landing/package.json`
- `web/package.json`
- `backend/__init__.py`
Confirm the new tag exists.
7. **Do NOT push** unless the user explicitly asks. Report the tag name and suggest:
```
Ready to push. When you're ready:
git push origin main --follow-tags
```
## Version Calculation Reference
Given current version `X.Y.Z`:
- `patch` -> `X.Y.(Z+1)`
- `minor` -> `X.(Y+1).0`
- `major` -> `(X+1).0.0`
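The same rules as a small function, useful for computing the heading before bumpversion runs. This is a sketch that assumes a plain `X.Y.Z` version with no pre-release suffix; `bump_version` is a hypothetical helper, not part of bumpversion.

```python
def bump_version(current: str, level: str) -> str:
    """Apply a patch/minor/major bump to an 'X.Y.Z' version string."""
    major, minor, patch = (int(part) for part in current.split("."))
    if level == "patch":
        patch += 1
    elif level == "minor":
        minor, patch = minor + 1, 0
    elif level == "major":
        major, minor, patch = major + 1, 0, 0
    else:
        raise ValueError(f"unknown bump level: {level!r}")
    return f"{major}.{minor}.{patch}"
```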
## Error Recovery
- If bumpversion fails, the tag won't exist. Fix the issue and re-run — bumpversion is idempotent as long as the tag doesn't already exist.
- If you need to undo a release commit (before pushing): `git tag -d vX.Y.Z && git reset --soft HEAD~1`
- Never amend a release commit that has been pushed.
## Notes
- When the tag is pushed, the release CI (`.github/workflows/release.yml`) automatically extracts the matching version section from `CHANGELOG.md` and uses it as the GitHub Release body. No manual copy-paste needed.
- The release commit message is controlled by `.bumpversion.cfg` (`Bump version: X.Y.Z -> A.B.C`). Do not override it.
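- If you need to manually update the GitHub Release body after the fact: `gh release edit vX.Y.Z --notes-file <(sed -n '/## \[X\.Y\.Z\]/,/## \[/p' CHANGELOG.md | sed '$d')`. The trailing `sed '$d'` drops the next section's heading and, unlike `head -n -1` (a GNU extension that BSD `head` may reject), also works with the stock tools on macOS.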
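For checking locally what the release body would contain, the section extraction can be sketched in Python. This is an illustration only, not the code the release workflow runs; `extract_release_notes` is a hypothetical name, and it assumes stamped headings of the form `## [X.Y.Z] - YYYY-MM-DD` with reference links at the bottom of the file.

```python
import re


def extract_release_notes(changelog: str, version: str) -> str:
    """Return the body of the '## [version]' section, without its heading."""
    pattern = re.compile(
        r"^## \[" + re.escape(version) + r"\][^\n]*\n"  # the stamped heading line
        r"(.*?)"                                         # section body (lazy)
        r"(?=^## \[|^\[[^\]\n]+\]: |\Z)",                # next heading, link refs, or EOF
        re.MULTILINE | re.DOTALL,
    )
    match = pattern.search(changelog)
    if match is None:
        raise KeyError(f"no changelog section for version {version}")
    return match.group(1).strip()
```

Unlike the `sed` one-liner, this version escapes the dots in the version string and stops at the reference-link block, so it also works for the last section in the file.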


@@ -0,0 +1,299 @@
---
name: triage-prs
description: Use this skill to triage the open PR queue before a release. Classifies every open PR into must-merge, candidate, superseded, or deferred; writes a working triage doc; and runs the merge loop end-to-end. Designed for the pre-release "PR speedrun" pass where a solo maintainer wants to clear the inbound backlog in a single session.
---
# Triage PRs
## Goal
Turn a backlog of open PRs into a shipped set of merges in a single focused session. Produce a tracked, resumable plan (`<VERSION>_PR_TRIAGE.md`), then work it — rebasing where needed, merging in isolation-safe batches, applying post-merge follow-ups, and closing superseded or partially-applicable PRs with credit to their authors.
This skill pairs with `draft-release-notes` and `release-bump`: triage first, then draft notes against the new main, then cut the release.
## When to use
- Before a minor or major release when 10+ open PRs have accumulated
- When you want to unblock merging without losing the narrative of what's landing
- When you know you can't personally review every PR deeply, but need to land the critical subset fast
## Prerequisites
- `gh` CLI authenticated against the repo
- A dedicated worktree for PR review (avoid contaminating `main` with checkouts of contributor branches)
- Clarity on the target version — the triage doc is named after it (e.g. `0.4.0_PR_TRIAGE.md`)
## Workflow
### 1. Set up an isolated PR-review worktree
```bash
git worktree list # check for stale ones first
git worktree prune
git worktree add ../voicebox-pr-review -b pr-review-<VERSION> main
```
Keep the main worktree for release-prep work (changelog drafts, direct-to-main follow-ups). Keep the review worktree for `gh pr checkout` — each checkout moves HEAD to a contributor branch, which you don't want to do in the main worktree.
### 2. Gather metadata for every open PR
```bash
gh pr list --state open --limit 50 --json \
number,title,author,isDraft,mergeable,mergeStateStatus,files,additions,deletions,reviewDecision,statusCheckRollup,maintainerCanModify \
--jq '.[] | {num: .number, title, author: .author.login, mergeable, state: .mergeStateStatus, canModify: .maintainerCanModify, changes: "+\(.additions)/-\(.deletions)", files: [.files[].path]}'
```
You want, for each PR:
- Size (`+additions/-deletions`)
- Mergeable state (`CLEAN`, `UNSTABLE`, `DIRTY` = conflicts, `UNKNOWN` = GitHub still computing)
- Whether maintainer edits are allowed on the branch (needed later if you rebase for the author)
- File paths touched (helps spot overlaps between PRs)
`UNKNOWN` is common right after a push to main — just try the merge and see.
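Spotting file overlaps by eye gets unreliable past a dozen PRs. A minimal sketch, assuming the input is the parsed JSON from `gh pr list --json number,files` (each entry in `files` has a `path` key, as used in the `--jq` filter above); `overlapping_prs` is a hypothetical helper name.

```python
from collections import defaultdict
from itertools import combinations


def overlapping_prs(prs: list[dict]) -> dict[tuple[int, int], list[str]]:
    """Map each pair of PR numbers to the file paths both PRs touch."""
    prs_by_path: dict[str, set[int]] = defaultdict(set)
    for pr in prs:
        for changed in pr.get("files", []):
            prs_by_path[changed["path"]].add(pr["number"])

    shared: dict[tuple[int, int], set[str]] = defaultdict(set)
    for path, numbers in prs_by_path.items():
        # Every pair of PRs touching the same path is a potential conflict.
        for low, high in combinations(sorted(numbers), 2):
            shared[(low, high)].add(path)
    return {pair: sorted(paths) for pair, paths in shared.items()}
```

Pairs that appear in the result should not go into the same merge batch; merge one, let GitHub recompute, then look at the other.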
### 3. Classify into tiers
Sort each PR into exactly one bucket:
**Tier 1 — Merge:** small, mergeable, fixes a real bug, clean CI, low review cost. One-liners, dependency relaxations, targeted safety hardening. These are the easy wins.
**Tier 2 — Candidate, review:** medium size (50-200 lines), touches more surface area, looks sound but needs a closer read. New user-facing features that fit the product direction.
**Supersede:** the fix or feature is already covered by something merged. Close with a comment pointing to the superseding PR. Check carefully — "similar title" isn't proof; compare the actual diffs.
**Defer to next release:** big features, dirty conflicts, draft PRs, anything touching the release pipeline in ways that would introduce risk. Don't merge these in a speedrun — they need dedicated focus.
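A mechanical first pass can pre-sort the queue before the human read. The sketch below is a rough heuristic under assumptions: the 50-line cutoff for Tier 1 is inferred from the Tier 2 range above, the input is one entry from the `gh pr list --json` output, and `first_pass_tier` is a hypothetical name. It cannot detect superseded PRs or judge whether a fix is real; those calls stay manual.

```python
def first_pass_tier(pr: dict) -> str:
    """Suggest 'tier-1', 'tier-2', or 'defer' from size, draft state, and merge state."""
    size = pr["additions"] + pr["deletions"]
    state = pr.get("mergeStateStatus")
    # Drafts, conflicted branches, and large changes wait for the next release.
    if pr.get("isDraft") or state == "DIRTY" or size > 200:
        return "defer"
    # Small and clean: candidate for the easy-win bucket.
    if size < 50 and state == "CLEAN":
        return "tier-1"
    # Everything else needs a closer read.
    return "tier-2"
```

Treat the result as a starting column in the triage doc, then move PRs between buckets after reading the actual diffs.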
### 4. Write the triage doc
Create `<VERSION>_PR_TRIAGE.md` in the PR-review worktree root. Structure:
```markdown
# <Repo> <VERSION> — PR Triage
Working doc for tracking which open PRs land in <VERSION>. Delete after release cut.
Last updated: <DATE>
## Progress
**Tier 1: 0 / N merged**
**Tier 2: 0 / M handled**
**Supersede triage: pending**
---
## Merge for <VERSION> — critical bug fixes
| PR | Status | Size | What it fixes | Why must-have |
|---|---|---|---|---|
| [#123](url) | [ ] | +5/-0 | ... | ... |
## Strong candidate — needs a quick review
| PR | Status | Size | Summary |
|---|---|---|---|
## Close as superseded
| PR | Status | Reason |
|---|---|---|
## Defer to <NEXT_VERSION>
- [#xxx](url) ... — reason
---
## Order of attack
1. Close superseded PRs (one-liner comments)
2. Merge tier-1 in dependency-free batches — check file paths don't overlap
3. Review tier-2 individually
4. Rerun `draft-release-notes` to pick up everything
5. Run `release-bump`
```
The **Progress** header is the most important part — it's your scoreboard and lets you resume cleanly if the session gets interrupted.
### 5. Work the loop — per PR
For each PR in the tier-1 / tier-2 list:
**a. Checkout in the review worktree:**
```bash
cd ../voicebox-pr-review
git checkout pr-review-<VERSION> # reset to neutral base
gh pr checkout <N>
```
**b. Read the *actual* commit, not `main..HEAD`:**
```bash
git show HEAD # the PR's actual changes
git show --stat HEAD # files touched + line counts
```
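**Do NOT review via `git diff main..HEAD`** if the PR branch is older than main. That diff includes *every commit that landed on main after the PR was forked* as `-` (deletion) lines. A 3-line PR can look like a 700-line revert. This is the single easiest way to misjudge a PR. For a PR with several commits, where `git show HEAD` covers only the tip, `gh pr diff <N>` or the three-dot form `git diff main...HEAD` shows just the PR's own changes.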
**c. Evaluate concerns:** correctness, scope, interaction with already-merged work, version compatibility (e.g. can't use an API that requires a dependency version we don't yet pin).
**d. Rebase if the branch is behind main:**
```bash
git fetch origin main
git rebase origin/main
```
This is **essential** before squash-merging. GitHub's squash computes `diff(PR-head, merge-base)` — on a stale branch, that diff includes reverting every in-between commit. Rebasing moves the merge-base forward so the squash is clean.
**e. If maintainer edits are allowed, push the rebase back to the contributor's fork:**
```bash
git remote add <author> https://github.com/<author>/<repo>.git
git fetch <author> <branch> # get their ref first
git push <author> HEAD:<branch> --force-with-lease
```
This keeps GitHub's PR UI in sync with the rebased state and makes the merge clean from the GitHub side.
**f. Merge:**
```bash
gh pr merge <N> --squash
```
**g. Update the triage doc** — flip the checkbox to `✅ merged <sha>` (use the short SHA from `gh pr view <N> --json mergeCommit --jq '.mergeCommit.oid[0:7]'`). Update the Progress header.
### 6. Batch tiny fixes
PRs with ≤5 line changes, clean CI, non-overlapping file paths, and obviously-correct intent (e.g. one-line dependency relax, env var add, import path fix) can be merged in a single loop without the review-per-PR ceremony:
```bash
for pr in 425 384 416 429; do
echo "=== Merging PR $pr ==="
gh pr merge $pr --squash
done
```
Verify afterward that each landed cleanly:
```bash
for pr in 425 384 416 429; do
gh pr view $pr --json state,mergeCommit --jq "{pr: $pr, state, sha: .mergeCommit.oid[0:7]}"
done
```
### 7. Post-merge follow-ups
Sometimes a PR is worth merging despite a known minor issue (e.g. incomplete dtype map, stale sentinel cleanup). Don't block the merge; apply the follow-up as a normal branch + PR right after:
```bash
cd <main-worktree>
git pull --ff-only origin main
git checkout -b fix/<short-name>
# edit...
git commit -m "fix(<area>): <one-liner>"
git push -u origin fix/<short-name>
gh pr create --title "..." --body "Follow-up to #<N>. ..."
```
Record both SHAs in the triage doc (`✅ merged <pr-sha> + follow-up <pr>`).
**Direct-to-main exception:** only under an explicit, scoped policy (e.g. "release speedrun"). Don't default to it.
### 8. Supersede: close with a credit-pointing comment
```bash
gh pr close <N> --comment "Closing — superseded by merged #<M> which landed <brief description>. Thanks!"
```
Check the diffs first — "similar title" is not enough. If the PR is *partially* superseded (the diagnosis is right but only half the changes are still needed), do a partial-apply instead.
### 9. Partial-apply pattern
When a PR has both valuable and questionable changes bundled:
```bash
cd <main-worktree>
git pull --ff-only origin main
# Cherry-pick specific files from the PR branch
git checkout <pr-commit-sha> -- <file1> <file2>
# Review the staged changes, adjust as needed
git diff --cached
# Apply any surgical edits to files you don't want to bulk-replace
# (e.g. the PR's file predates a recent main commit you need to preserve)
# Commit with a trailer crediting the original author
git commit -m "$(cat <<'EOF'
<subject>

<body explaining what was kept vs dropped>

Co-Authored-By: <author> <noreply@github.com>
EOF
)"
git push ... # branch + PR, unless under the direct-to-main exception
```
Then close the PR with a comment explaining what was applied and what was dropped, referencing the commit SHA.
### 10. Keep the doc current
Every merge, every close, every follow-up → update `<VERSION>_PR_TRIAGE.md`. The doc is your session log. If you're interrupted and resume tomorrow, the doc is the only source of truth for "where am I."
### 11. When triage is done
- Every PR in the doc has a terminal status (✅ merged / ✅ closed / deferred)
- Progress header shows N/N for each tier
- Next skill to run is `draft-release-notes` (to regenerate `[Unreleased]` against the new main), then `release-bump`
You can delete the triage doc after the release ships, or keep it in version history as a record.
## Gotchas
- **`main..HEAD` on a stale branch lies.** It shows everything main gained since the branch split as deletions. Always review via `git show HEAD` for the PR's actual commit.
- **Squash-merging an unrebased branch reverts in-between work.** The squash computes `diff(PR-head, merge-base)`. Rebase moves the merge-base forward.
- **`mergeable=UNKNOWN`** is transient — GitHub is recomputing after a push. Just try the merge.
- **Route ordering matters (FastAPI and similar):** `DELETE /history/failed` must be registered *before* `DELETE /history/{id}`, or the parameterized path will consume `"failed"` as an ID.
- **Apple's `-weak_framework` overrides `-framework`** for the same framework, regardless of order — use it via `cargo:rustc-link-arg=-Wl,-weak_framework,Name` when a dependency hard-links something optional.
- **Dependency version floors constrain what you can apply.** Before accepting a kwarg rename like `torch_dtype=` → `dtype=`, check the min-version pin supports it. Sometimes the right move is to cherry-pick half the PR.
- **`cpal::Stream` and similar `!Send` audio types** can't cross `await` points or `spawn_blocking`. Sometimes a "not-ideal but correct" sync wait is the best available fix; flag but don't block.
- **PyTorch nightly builds are not shippable for releases** — non-deterministic, can regress between runs. If a PR suggests switching to nightly to fix a GPU issue, prefer `TORCH_CUDA_ARCH_LIST=...+PTX` or wait for stable support instead.
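The route-ordering gotcha is easy to reproduce without a web framework. The toy router below is an illustration only, not FastAPI's implementation: `FirstMatchRouter` is a hypothetical class that resolves requests in registration order, which is the behavior that makes the ordering matter.

```python
import re


class FirstMatchRouter:
    """Toy router: the first registered route whose pattern matches wins."""

    def __init__(self):
        self._routes = []  # (method, compiled pattern, handler name), in registration order

    def add(self, method: str, path: str, name: str) -> None:
        # Turn "/history/{id}" into a regex with one named group per path parameter.
        pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", path)
        self._routes.append((method, re.compile(f"^{pattern}$"), name))

    def resolve(self, method: str, path: str):
        for route_method, pattern, name in self._routes:
            match = pattern.match(path)
            if route_method == method and match:
                return name, match.groupdict()
        return None


# Parameterized route first: "failed" is swallowed as an id.
wrong = FirstMatchRouter()
wrong.add("DELETE", "/history/{id}", "delete_one")
wrong.add("DELETE", "/history/failed", "delete_failed")

# Literal route first: both paths reach the intended handler.
right = FirstMatchRouter()
right.add("DELETE", "/history/failed", "delete_failed")
right.add("DELETE", "/history/{id}", "delete_one")
```

When reviewing a PR that adds a literal path next to a parameterized one, check the registration order in the diff rather than relying on the handler names.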
## Canonical commands reference
```bash
# Bulk PR metadata
gh pr list --state open --limit 50 --json number,title,author,mergeable,mergeStateStatus,additions,deletions,maintainerCanModify,files
# Detailed single-PR view
gh pr view <N> --json body,author,headRefName,baseRefName,mergeable,maintainerCanModify,files,statusCheckRollup
# The actual commit, not the branch-vs-main diff
git show HEAD
git show --stat HEAD
gh pr diff <N>
# Rebase contributor branch onto current main
git fetch origin main && git rebase origin/main
# Push rebase back to contributor fork (maintainerCanModify=true required)
git remote add <author> https://github.com/<author>/<repo>.git
git fetch <author> <branch>
git push <author> HEAD:<branch> --force-with-lease
# Merge
gh pr merge <N> --squash
# Confirm merge SHA for triage doc
gh pr view <N> --json state,mergeCommit --jq '{state, sha: .mergeCommit.oid[0:7]}'
# Close superseded
gh pr close <N> --comment "Closing — superseded by merged #<M>. Thanks!"
```
## Notes
- **Never review a stale branch via `main..HEAD`.** This is the single most important line in this skill.
- **The triage doc is the session state.** Lose the doc, lose the session. Update it after every action.
- **Credit contributors even on partial-applies.** Use `Co-Authored-By:` trailers and close comments that link to the applied commit.
- **Don't let perfect be the enemy of shipped.** A fix that goes from "broken" to "works with a minor known issue" is a strict improvement. Flag the issue, file a follow-up, merge the fix.