talkforge

Short AI-narrated promo videos with a lip-synced avatar of your face,
your cloned voice, and custom data-viz scenes.

💸 New to Runpod? Sign up via this referral link and get a random credit bonus between $5 and $500 — enough to render anywhere from ~30 to ~3,000 talkforge promos before paying a cent. Runpod is the only paid component of this pipeline (~$0.10–$0.20 per typical 45-second video, measured on the RTX-4090-class GPUs we use).

Driven end-to-end by Claude Code through the /marketing-video skill — you point Claude at a product URL, it does the research, drafts the script, designs the scenes, runs the pipeline, and hands back an mp4. The manual setup below is for the underlying tools; day-to-day you don't run any of it by hand.

talkforge is the pipeline; /marketing-video is the entry point. The skill (in ~/.claude/skills/marketing-video/, part of the solana.new toolkit) handles the research, questioning, and script drafting; talkforge handles the narration synthesis, lip-sync, Remotion render, and final mux.

The output is a single self-contained .mp4. A full render typically costs ~$0.10–$0.20 in cloud GPU and takes ~7 minutes wall time on a modest VPS. Re-rendering after a text or asset tweak costs the same or less, because a content-hash cache skips anything unchanged.

Where to start?

First time? Drop your user files → install the subprojects → render once manually to verify the rig.

Driving a single video? From Claude Code: /marketing-video <product url> — the skill handles research, drafting, render, and delivery.

Producing a series? Also bootstrap the Claude Project (one-time, ~10 min) — your ideation surface that drafts briefs to feed Claude Code.

Table of contents

See it end-to-end
How you actually use this (Claude Code)
What's interesting about it
Project layout
Required user-provided files
Setup
Quick start: running the pipeline manually
Common tweaks
How the pipeline flows
Adapting for your project
Cost reference
Dependencies summary
Files you'll iterate on
Troubleshooting
Running a content series
See also

See it end-to-end

The pipeline turns one selfie into a talking-head promo with your cloned voice + custom data-viz scenes:

1. Your selfie
A normal headshot; a phone camera is fine.

2. AI-generated podcast scene
Cinematic studio shot generated from your selfie via ChatGPT image gen (prompt below). This becomes portrait.jpg.

3. Talkforge output
InfiniteTalk lip-syncs your cloned voice (OpenVoice v2) onto the portrait; Remotion covers the avatar with data-viz scenes between the on-camera bookends. View script

ChatGPT image-gen prompt used for step 2

Use the character and clothing from the attached reference image. Create a realistic cinematic podcast studio scene with the same man seated at a modern wooden desk, speaking into a large professional black studio microphone mounted on a boom arm. Preserve his facial identity, hairstyle, light stubble, plain black crew-neck shirt, calm attentive expression, and direct eye contact with the camera.

The desk should be clean and uncluttered, with no audio mixer and no headphones. The man's forearms should rest naturally on the desk, with relaxed, believable hand posture, one hand lightly resting over the other.

The environment is a refined modern content-creator studio with dark acoustic wall panels or vertical slatted panels, a warm wooden desk, minimal modern decor, and a premium YouTube podcast aesthetic. Use cyberpunk-inspired but tasteful lighting: moody blue, cyan, purple, and magenta neon accents, soft cinematic key lighting on the face, subtle rim lighting, shallow depth of field, realistic photography, high detail, and a polished professional composition.

Keep the background clean and minimal — avoid clutter, busy shelves, and repeated branding. Place at most one tasteful focal element behind the subject (e.g. a single glowing neon wall sign with your brand's logo or symbol, or one large piece of geometric wall art). No screens, no framed photos, no plaques, no duplicate logos. Subtle decor only: a small plant, a few books, minimal geometric objects. The background should support the subject, never compete with them.

Modern podcast studio, clean desk, realistic hands, cinematic lighting, dark premium atmosphere, single tasteful neon focal element, shallow depth of field, warm wood and black materials, professional YouTube creator setup, 16:9 aspect ratio.

Tip: Where the prompt says "your brand's logo or symbol", swap in whatever fits your product — a neon "$" for a fintech, a stylized neuron for an AI tool, a logomark, even just a colored geometric shape. Leaving it generic gives the model creative latitude; specifying it gets you on-brand. The original generated portrait above used a Solana "S" logo.

Reuse the prompt with your own selfie. The pipeline doesn't care whether your portrait.jpg is a real photo or AI-generated — InfiniteTalk lip-syncs both equally well.

How you actually use this (Claude Code)

You:    /marketing-video Make a promo video for example.com
Claude: → loads the marketing-video skill from
          ~/.claude/skills/marketing-video/
        → screenshots the site with Playwright
        → reads the copy, identifies the hook + key features + CTA
        → asks 2-4 clarifying questions (audience, distribution,
          tone, must-haves) via AskUserQuestion
        → drafts a 45-55s narration in casual dev-to-dev voice,
          structured as 2 on-camera bookends + 4-6 voiceover middle
          segments
        → pastes the draft script back for your approval
You:    Approves (or tweaks)
Claude: → forks the example Remotion project to <product>-promo/
        → builds the data-viz scenes matching the product
        → invokes the talkforge pipeline: generate.py → 3-6 Runpod
          lip-sync calls + 5-6 local still-portrait builds
          (~7 min, ~$0.10-$0.20)
        → retimes scene boundaries to match measured chunk durations
        → renders Remotion + mixes SFX
        → uploads the final mp4 to litterbox + sends you the URL
You:    Watches, says "the stats segment sounds soft, EN-AU + EQ"
Claude: → swaps OpenVoice settings, busts the audio cache for that
          one segment, re-runs (~1 min, $0)
        → uploads the new version

The skill knows about story structure, research, and asking the right questions. talkforge (this repo) knows about TTS, lip-sync, scene timing, and caching. The skill calls into talkforge as its render backend.

The full operational runbook (recipe, cache invariants, gotchas, when to ask vs proceed) lives in CLAUDE.md. That file is loaded at the start of every Claude Code session and is the source of truth for how the pipeline is orchestrated. Read it if you want to understand the agent's playbook, or edit it to change Claude's behavior on future runs.

What the pipeline does under the hood

┌──────────────────┐    ┌──────────────────────┐    ┌────────────────────┐
│ Script segments  │───►│ TTS (local)          │───►│ Lip-sync (Runpod)  │
│ (claude-drafted) │    │  Kokoro or OpenVoice │    │  InfiniteTalk      │
└──────────────────┘    └──────────────────────┘    └─────────┬──────────┘
                                                              ▼
┌──────────────────┐    ┌──────────────────────┐    ┌────────────────────┐
│ Remotion scenes  │───►│ Remotion render      │◄───│ narration.mp4      │
│ (claude-coded)   │    │  (data viz / B-roll) │    │  (talking head)    │
└──────────────────┘    └──────────────────────┘    └────────────────────┘
                                  │
                                  ▼
                       ┌──────────────────────┐
                       │ ffmpeg mix + SFX     │
                       └─────────┬────────────┘
                                 ▼
                          final mp4

What's interesting about it

Per-segment mode switching. Mark each line of your script as either "video" (the viewer sees the avatar's mouth move; pay for lip-sync) or "audio" (voiceover only; the avatar is hidden by your data viz, lip-sync skipped, $0 spent on that segment). A typical bookend video has 2–4 "video" segments and the rest "audio", so all of the cloud-GPU cost goes to the small portion the viewer actually sees.
Two-layer content-hash cache. Edit one sentence, and only that segment re-renders. Swap the portrait, and the audio waveforms cache-hit while only the mp4s rebuild. Edit voice settings, and the audio + mp4 both auto-bust via a worker-script hash. Cache makes iteration genuinely cheap.
Pluggable TTS engines. Kokoro for fast/polished/stock voices, OpenVoice v2 for cloning your own voice. Swap with a single config flag.
Remotion for the visuals. Code-defined React scenes, deterministic rendering, no editor required. Comes with one example project (Solana Bytes promo) that shows how to wire data viz over the narration.
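
A minimal sketch of how a two-layer content-hash cache of this kind can be keyed. The function names (`audio_key`, `video_key`) and the exact fields folded into each hash are assumptions for illustration, not the actual generate.py implementation; the point is only that the mp4 key depends on the audio key, while the audio key never sees the portrait.

```python
import hashlib


def _digest(*parts: bytes) -> str:
    """Hash length-prefixed parts so field boundaries stay unambiguous."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


def audio_key(text: str, engine: str, worker_source: bytes) -> str:
    """Layer 1 (hypothetical): only inputs that affect the waveform."""
    return _digest(text.encode(), engine.encode(), worker_source)


def video_key(audio: str, portrait: bytes) -> str:
    """Layer 2 (hypothetical): the audio key plus the portrait bytes."""
    return _digest(audio.encode(), portrait)


# Stand-ins for the worker script source and the portrait file contents.
worker_v1 = b"SPEAKER_KEY = 'EN-US'"
worker_v2 = b"SPEAKER_KEY = 'EN-AU'"

a1 = audio_key("Your opening line.", "openvoice", worker_v1)
v1 = video_key(a1, b"portrait-bytes-A")

# Swapping the portrait leaves the audio key alone; only the mp4 key changes.
v2 = video_key(a1, b"portrait-bytes-B")

# Editing the worker script changes the audio key, and the mp4 key with it.
a3 = audio_key("Your opening line.", "openvoice", worker_v2)
v3 = video_key(a3, b"portrait-bytes-A")

print("portrait swap busts mp4 only:", v1 != v2)
print("worker edit busts audio and mp4:", a1 != a3 and v1 != v3)
```

Chaining the keys this way is what lets a portrait swap reuse approved audio takes: the waveform cache is looked up before the portrait is ever considered.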

Project layout

talkforge/
├── InfiniteTalk/         # Narration generator (TTS + Runpod lip-sync orchestration)
│   ├── generate.py       #   ← main script
│   ├── README.md         #   ← cache layers, modes, Runpod deployment
│   ├── voice_reference.wav  # 12-30s sample of the cloned voice
│   ├── portrait.jpg      # the face the avatar uses
│   ├── _work/            # per-chunk intermediates + cache
│   └── out.mp4           # raw narration (avatar video with audio)
│
├── videos/               # Each video lives in its own subdir
│   ├── solana-bytes-promo/   #   ← example Remotion project (committed)
│   │   ├── src/              #     React/TSX scenes
│   │   ├── scripts/          #     SFX mix script
│   │   ├── public/           #     audio SFX, screenshots
│   │   └── out/              #     final mp4 (rendered, gitignored)
│   └── <your-project>/       #   ← your videos go here (auto-gitignored)
│
└── openvoice-test/       # OpenVoice v2 + MeloTTS voice cloning (its own venv)
    ├── synth_worker.py   # long-lived TTS subprocess
    ├── README.md         # setup notes, patches, troubleshooting
    └── venv/             # isolated Python env (~720 MB, not in git)

Required user-provided files

The repo ships as a template, not a working video. Before your first render, drop these unique-to-you files into place. None of them are committed (all gitignored).

Path	What	Format	How to get it
`InfiniteTalk/portrait.jpg`	The face the avatar uses	JPG or PNG (update `IMAGE_PATH` in `generate.py` for PNG). Aspect ratio close to `OUTPUT_WIDTH × OUTPUT_HEIGHT` (default 720×400 ≈ 16:9). Square works too — the pipeline center-crops.	Take a clean front-facing photo. Good lighting, neutral expression, head/shoulders framing.
`InfiniteTalk/voice_reference.wav`	Sample of the voice you want cloned	WAV, mono, 24 kHz, 16-bit, 12-30 seconds. Read the script in `docs/voice-reference-script.md`*; it's designed to cover all phonemes the cloning model needs.	Record on your phone, convert to the right format with `ffmpeg -i input.m4a -ac 1 -ar 24000 -sample_fmt s16 voice_reference.wav`
`InfiniteTalk/.env`	Your Runpod credentials	`RUNPOD_API_KEY=<your-key>` + `RUNPOD_ENDPOINT_ID=<your-endpoint-id>`	Copy from `InfiniteTalk/.env.example`. Get the API key from your Runpod Settings page; get the endpoint ID by deploying the InfiniteTalk Runpod Hub template (1-click, see InfiniteTalk/README.md → Deploying your own Runpod endpoint). New to Runpod? Sign up here (referral link, get a $5-$500 random bonus = ~30-3,000 free videos). Both values only needed if your script has `"video"` mode segments.

* If docs/voice-reference-script.md doesn't exist yet, paste a 25-30 second monologue with varied vowels/consonants — see voice-a-style-guide.md for what the voice should sound like.
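
Before the first render it can help to confirm that voice_reference.wav really matches the format in the table above (mono, 24 kHz, 16-bit, 12-30 seconds). The sketch below uses only Python's standard `wave` module; `check_voice_reference` is a hypothetical helper, not something the repo ships, and the demo file is generated silence rather than a real recording.

```python
import os
import tempfile
import wave


def check_voice_reference(path):
    """Return a list of problems; an empty list means the file matches the spec."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        sample_bytes = wav.getsampwidth()
        seconds = wav.getnframes() / rate

    problems = []
    if channels != 1:
        problems.append(f"expected mono, got {channels} channels")
    if rate != 24000:
        problems.append(f"expected 24000 Hz, got {rate} Hz")
    if sample_bytes != 2:
        problems.append(f"expected 16-bit samples, got {sample_bytes * 8}-bit")
    if not 12 <= seconds <= 30:
        problems.append(f"expected 12-30 s, got {seconds:.1f} s")
    return problems


# Demo: write 13 s of silence in the expected format, then validate it.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "voice_reference.wav")
    with wave.open(demo_path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)  # 2 bytes per sample = 16-bit
        out.setframerate(24000)
        out.writeframes(b"\x00\x00" * 24000 * 13)
    demo_problems = check_voice_reference(demo_path)

print(demo_problems or "voice_reference.wav matches the expected format")
```

Point `check_voice_reference` at `InfiniteTalk/voice_reference.wav` after running the ffmpeg conversion; any entries in the returned list name the property that still needs fixing.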

What's already there for you

The repo ships with one example Remotion project wired up to the pipeline, so you can render a sample video on day one without writing React from scratch:

videos/solana-bytes-promo/ — the original Solana Bytes account-decoder promo (V1 in the Solana Concepts series). Committed as the canonical reference for the Remotion conventions this pipeline uses.

Your own videos go in videos/<your-project>/ and are automatically gitignored — fork the example, drop screenshots in public/screenshots/, edit the scenes, render. The .gitignore has a videos/* pattern with a !videos/solana-bytes-promo/ exception, so the example stays tracked while your personal videos don't bloat the repo.

Setup

1. InfiniteTalk (narration generator)

cd InfiniteTalk
pip install --user --break-system-packages python-dotenv requests
echo "RUNPOD_API_KEY=<your-key>" > .env
# Drop a portrait at portrait.jpg (square or 16:9 — will be cropped to
# OUTPUT_WIDTH × OUTPUT_HEIGHT, default 720×400)
# Drop a 12-30s voice sample at voice_reference.wav (24 kHz mono recommended)

You'll also need to deploy a Runpod endpoint that speaks the InfiniteTalk API the script expects — see InfiniteTalk/README.md → Deploying your own Runpod endpoint.

Put your endpoint ID in InfiniteTalk/.env as RUNPOD_ENDPOINT_ID=<your-endpoint-id> alongside the API key.
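
Since the two Runpod values are only needed when the script contains "video" segments, a preflight check can fail fast before any TTS work starts. This is a sketch under that assumption: `missing_runpod_settings` is a hypothetical helper, and it reads from an environment mapping rather than parsing `.env` itself (generate.py is described as using python-dotenv for that).

```python
import os

REQUIRED_FOR_VIDEO = ("RUNPOD_API_KEY", "RUNPOD_ENDPOINT_ID")


def missing_runpod_settings(segments, env=None):
    """Return the Runpod variables that are required but unset or empty."""
    env = os.environ if env is None else env
    needs_runpod = any(mode == "video" for mode, _text in segments)
    if not needs_runpod:
        return []  # audio-only scripts never call Runpod
    return [name for name in REQUIRED_FOR_VIDEO if not env.get(name)]


bookend_script = [
    ("video", "Your opening line (on-camera)."),
    ("audio", "Voiceover line that plays under your data viz."),
]
voiceover_only = [("audio", "Voiceover line that plays under your data viz.")]

# Only the API key is set, so the endpoint ID is reported as missing.
partial_env = {"RUNPOD_API_KEY": "example-key"}
print(missing_runpod_settings(bookend_script, env=partial_env))
print(missing_runpod_settings(voiceover_only, env={}))
```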

2. Remotion (visuals)

cd videos/<your-project>
npm install

videos/solana-bytes-promo/ is the included example. Use it as a template for your own project — see Adapting for your project.

3. OpenVoice (voice cloning) — optional

The OpenVoice v2 ecosystem has tight version pins that conflict with modern Python/torch. We isolate it in its own venv with patched requirements — see openvoice-test/README.md for the full recipe (~15 min, includes a 117 MB checkpoint download).

If you don't need voice cloning, set VOICE_ENGINE = "kokoro" in generate.py and skip OpenVoice setup entirely.

4. Kokoro (stock voices) — optional

pip install --user --break-system-packages kokoro
# 10+ stock voices — see KOKORO_VOICE constant in generate.py.
# Fast, polished, not cloneable.

Quick start: running the pipeline manually

Day-to-day, Claude Code orchestrates these steps for you. This section is for: (a) initial bring-up, (b) debugging a stuck render, (c) running the pipeline outside Claude.

Assumes all three subprojects are set up (see Setup above) and that InfiniteTalk/.env holds a RUNPOD_API_KEY and RUNPOD_ENDPOINT_ID for a deployed InfiniteTalk endpoint (see InfiniteTalk/README.md → Deploying your own Runpod endpoint).

# 1. Generate narration (TTS + lip-sync). ~7 min, ~$0.10-$0.20 in Runpod
#    for a typical bookend video (3-6 visible video chunks + 5-6 voiceover).
cd InfiniteTalk
python3 generate.py
# Produces out.mp4 — the avatar talking.

# 2. Wire that narration into your Remotion composition.
cp out.mp4 ../videos/<your-project>/public/avatar/narration.mp4

# 3. Render the visuals over the narration.
cd ../videos/<your-project>
npx remotion render Master out/master-silent.mp4

# 4. Mix narration + SFX onto the final master (script lives in
#    videos/solana-bytes-promo/scripts/mix-audio-narration.ts as a template).
npx tsx scripts/mix-audio-narration.ts
# Final video: out/master.mp4

Common tweaks

Change the narration text

Edit InfiniteTalk/generate.py → SCRIPT_SEGMENTS list of (mode, text) tuples:

SCRIPT_SEGMENTS = [
    ("video", "Your opening line (on-camera)."),
    ("audio", "Voiceover line that plays under your data viz."),
    ("audio", "Another voiceover line."),
    ("video", "Your closing line (on-camera)."),
]

Re-run the pipeline. The cache keeps unchanged segments — only modified ones re-render. If you only tweak an "audio" segment, Runpod isn't called at all.

Swap the avatar

Drop a new portrait at InfiniteTalk/portrait.jpg. Re-run; the cache key includes the image hash, so all mp4 chunks rebuild (3-6 Runpod calls for the visible avatar segments + local still-video rebuilds for voiceover). The approved audio waveforms cache-hit, so no voice re-roll.

Re-roll one segment

rm InfiniteTalk/_work/job_005_audio.key   # bust audio cache for that seg
python3 InfiniteTalk/generate.py
# Only that one segment re-synthesizes. Useful when TTS produces a bad
# take for a single segment but the rest sound right.
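
The same cache bust can be scripted. The sketch below assumes that every job's key file follows the `job_NNN_audio.key` pattern seen in the command above (zero-padded to three digits); that generalization and the `bust_audio_cache` helper are assumptions, so check the contents of `_work/` before relying on it.

```python
import tempfile
from pathlib import Path


def bust_audio_cache(work_dir, job_index):
    """Delete one job's audio cache key; return True if a file was removed."""
    # ASSUMPTION: key files are named job_<3-digit index>_audio.key
    key_file = Path(work_dir) / f"job_{job_index:03d}_audio.key"
    if key_file.exists():
        key_file.unlink()
        return True
    return False


# Demo against a throwaway directory standing in for InfiniteTalk/_work/.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "job_005_audio.key").write_text("placeholder-hash")
    removed_first = bust_audio_cache(tmp, 5)   # key existed, so it is removed
    removed_again = bust_audio_cache(tmp, 5)   # nothing left to remove

print(removed_first, removed_again)
```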

Switch TTS engine

Edit InfiniteTalk/generate.py: set VOICE_ENGINE = "kokoro" or "openvoice". Caches are keyed by engine — switching back and forth is instant (each engine keeps its own cache).

Change cloned-voice settings (OpenVoice)

Edit openvoice-test/synth_worker.py:

SPEAKER_KEY — "EN-US" / "EN-AU" / "EN-BR" / "EN-Default" / "EN_INDIA"
ENABLE_WATERMARK — False recovers presence (default)
PRESENCE_EQ — ffmpeg filter chain applied post-synthesis

Any edit auto-invalidates audio + mp4 caches via a script-source hash.

How the pipeline flows

SCRIPT_SEGMENTS in generate.py
    │
    │ build_jobs (clause-pack video to ≤3s sub-chunks):
    │
    ▼
┌────────────────────────────────┐
│ Per-job dispatch               │
│   audio mode → 1 wav synth     │
│   video mode → N wav sub-chunks│
└────────────────────────────────┘
    │
    │ for each job:
    │
    ▼
┌─────────────────────┐    ┌───────────────────┐
│ TTS synth           │    │ Cache check       │
│   kokoro CLI        │◄─► │  (text/voice/img/ │
│   or OpenVoice      │    │   worker hash)    │
│   worker subprocess │    └───────────────────┘
└─────────────────────┘
    │
    ▼
┌───────────────┐    ┌──────────────────────┐
│ audio mode    │    │ video mode           │
│ → still-video │    │ → Runpod InfiniteTalk│
│   (ffmpeg)    │    │   (lip-sync ~2 min)  │
└───────────────┘    └──────────────────────┘
    │                          │
    └────────────┬─────────────┘
                 ▼
        concat → out.mp4
                 │
                 ▼
        public/avatar/narration.mp4
                 │
                 ▼
        ┌────────────────┐
        │ Remotion render│  → master-silent.mp4
        └────────────────┘
                 │
                 ▼
        ┌────────────────┐
        │ ffmpeg mix SFX │  → master.mp4
        └────────────────┘

Adapting for your project

The example Remotion project (videos/solana-bytes-promo/) shows the pattern, but it's specific to one product. To repoint the pipeline at your project:

Fork the Remotion subdir — cp -r videos/solana-bytes-promo videos/<your-project> and edit:
- src/Master.tsx — SCENES map and scene composition
- src/scenes/*.tsx — replace each scene's content
- src/data/*.ts — your product's data
- public/screenshots/ — your product's screenshots
- src/theme.ts — colors and fonts
Edit SCRIPT_SEGMENTS in generate.py to match your scene structure: "video" for visible avatar moments, "audio" for voiceover under data viz.
Update SCENES frame boundaries in Master.tsx to land at the measured chunk endings (run generate.py once, check the printed per-chunk durations, multiply by fps to get frames).
Update SFX timing in scripts/mix-audio-narration.ts to land on your scene cuts.
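
The "multiply by fps" step above can be sketched as follows. The durations and the 30 fps value are made-up illustrations (use the numbers generate.py prints and your composition's real fps), and `scene_boundaries` is a hypothetical helper. It accumulates time in seconds and rounds once per boundary, so rounding error does not build up across chunks.

```python
def scene_boundaries(durations_s, fps=30):
    """Turn per-chunk durations (seconds) into cumulative end frames."""
    boundaries = []
    elapsed = 0.0
    for seconds in durations_s:
        elapsed += seconds
        # Round the cumulative time, not each chunk, to avoid drift.
        boundaries.append(round(elapsed * fps))
    return boundaries


# Illustrative durations only; replace with the measured per-chunk values.
example_durations = [2.84, 3.1, 2.5]
print(scene_boundaries(example_durations, fps=30))
```

Each returned number is the frame at which that chunk ends, which is the value a scene boundary in Master.tsx needs to land on.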

The InfiniteTalk and openvoice-test subdirs stay as-is — they're project-agnostic. The pipeline doesn't know or care what your scenes contain.

Cost reference

Operation	Time	Cost
Full re-render, 3-6 visible video chunks + 5-6 voiceover, fresh inputs	~7 min	~$0.10–$0.20
Full re-render, cached audio (e.g. image-only swap)	~5 min	~$0.10–$0.20 (same Runpod count)
Single audio segment changed	~1 min	$0 (no Runpod)
Single visible video segment changed	~3 min	~$0.03
Same inputs (cache fully hits)	<30 s	$0
Remotion render	~1 min	$0
SFX mix	<10 s	$0

Runpod is the only paid component. Each "video"-mode sub-chunk ≤3 s costs ~$0.03 on an RTX-4090-class GPU (≈60 s of GPU time at Runpod's serverless rate). Audio-mode segments are local-only. Picking a beefier GPU (A100, H100) makes the lip-sync render faster but doesn't improve the output for this model — stick with 4090/L40-class for the best cost/quality trade.

The chunk count per render isn't fixed: generate.py greedy-packs clauses into sub-chunks of up to 3 s each, so every comma inside a "video"-mode segment is a potential chunk boundary. A natural-language outro and URL typically yield 4–6 visible chunks total (≈$0.12–$0.18); a bookend pair with no commas in either segment runs 3 chunks (≈$0.10).
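
To get a feel for how punctuation drives the bill, here is a rough estimator. It is a sketch, not the real `build_jobs` logic: the 3 s cap and the ~$0.03 per chunk come from this README, but the character-based duration estimate (`CHARS_PER_SECOND`) and the clause splitter are assumptions, so treat the output as a ballpark only.

```python
import re

MAX_CHUNK_S = 3.0        # from the README: sub-chunks are capped at 3 s
COST_PER_CHUNK = 0.03    # from the README: ~$0.03 per "video" sub-chunk
CHARS_PER_SECOND = 15.0  # ASSUMPTION: crude speaking rate for estimating duration


def split_clauses(text):
    """Split on commas and sentence punctuation, keeping the punctuation."""
    return [c.strip() for c in re.findall(r"[^,.;!?]+[,.;!?]?", text) if c.strip()]


def pack_chunks(text):
    """Greedily merge adjacent clauses while the estimate stays within the cap.

    A single clause longer than the cap is kept whole in this sketch.
    """
    chunks, current = [], ""
    for clause in split_clauses(text):
        candidate = f"{current} {clause}".strip()
        if current and len(candidate) / CHARS_PER_SECOND > MAX_CHUNK_S:
            chunks.append(current)
            current = clause
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def estimate_cost(segments):
    """Count sub-chunks across "video" segments only; "audio" segments are free."""
    n = sum(len(pack_chunks(text)) for mode, text in segments if mode == "video")
    return n, round(n * COST_PER_CHUNK, 2)


segments = [
    ("video", "Hey, this is a quick look at our decoder, built for Solana devs."),
    ("audio", "Voiceover line that plays under your data viz."),
    ("video", "Try it today at example dot com."),
]
chunk_count, dollars = estimate_cost(segments)
print(f"{chunk_count} visible chunks, about ${dollars:.2f}")
```

Here the opening line packs into two chunks and the closing line into one, which matches the 3-chunk, roughly $0.10 case described above.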

💸 New to Runpod? Sign up via this referral link and get a random credit bonus between $5 and $500. At ~$0.15 per typical promo render, that's anywhere from ~30 to ~3,000 free videos before you pay anything.

Dependencies summary

System (all subprojects):

Python 3.12+
Node.js 18+
ffmpeg with libx264, AAC, libsoxr (default Ubuntu build is fine)

Python:

python-dotenv, requests (for generate.py)
kokoro (only if using kokoro engine)
Everything else lives in openvoice-test/venv/ (isolated)

Node:

Remotion 4.x + plugins (in your Remotion subproject's package.json)
tsx for the SFX mix script

Cloud:

Runpod account + a deployed InfiniteTalk endpoint (see InfiniteTalk/README.md)

Files you'll iterate on

What	Where
Narration script	`InfiniteTalk/generate.py` → `SCRIPT_SEGMENTS`
TTS engine + Kokoro voice	`InfiniteTalk/generate.py` → `VOICE_ENGINE`, `KOKORO_VOICE`
OpenVoice config (speaker / EQ / wavmark)	`openvoice-test/synth_worker.py`
Runpod endpoint ID + API key	`InfiniteTalk/.env` → `RUNPOD_ENDPOINT_ID`, `RUNPOD_API_KEY`
Scene timing (frame boundaries)	`<your-project>/src/Master.tsx` → `SCENES`
Scene visuals	`<your-project>/src/scenes/*.tsx`
Compose duration	`<your-project>/src/Root.tsx` → `durationInFrames`
SFX events	`<your-project>/scripts/mix-audio-narration.ts`
Portrait	`InfiniteTalk/portrait.jpg`
Voice reference	`InfiniteTalk/voice_reference.wav`

Troubleshooting

"Could not load this library: ...libcudart.so..." — torchaudio version mismatch with system torch. Pin torchaudio to match (e.g. pip install "torchaudio==2.10.*" for torch 2.10.x).
"You need to install fugashi" — MeloTTS trying to import Japanese support. The patched melo/text/cleaner.py lazy-loads per language; if you reinstalled MeloTTS, re-apply the patch (see openvoice-test/README.md).
"averaged_perceptron_tagger_eng not found" — python -c "import nltk; nltk.download('averaged_perceptron_tagger_eng')" from inside the OpenVoice venv.
Mp4 has wrong audio after config change — check that you're on the latest generate.py; the mp4 cache key now folds in the worker source hash, but older builds didn't.
Lip-sync drifts late in a chunk — the chunk exceeded Runpod's 3.24 s cap; shorten the spoken text or split the segment.
Voice sounds too soft — adjust SPEAKER_KEY or PRESENCE_EQ in openvoice-test/synth_worker.py (see Change cloned-voice settings (OpenVoice) under Common tweaks).
{"error": "..."} from Runpod — the deployed endpoint isn't speaking the InfiniteTalk API the script expects. See Deploying your own Runpod endpoint for the contract.

Running a content series

If you're producing multiple videos around a theme (e.g. "Solana Concepts"), the natural setup is a two-surface workflow that separates creative ideation from pipeline execution:

A Claude Project (web) owns the ideation surface — backlog tracking, concept selection, script drafting, "what next" loops. Lightweight, works on mobile, persistent context across sessions.
This repo + Claude Code owns the execution surface — turns an approved brief into pixels via the /marketing-video skill.

One-time setup (~10 min)

Bootstrap the web Project:

Open claude.ai/projects → Create Project. Name it whatever fits (e.g. "Solana Concepts — Video Series").
Custom instructions → paste the entire block from docs/claude-project-instructions.md (everything inside the ``` block under Custom Instructions).
Project knowledge → upload these files from this repo (drag-and-drop into the sidebar):
- README.md — pipeline overview
- CLAUDE.md — operational constraints
- docs/series-brief-template.md — required brief format
- docs/voice-a-style-guide.md — voice characteristics + TTS rules
- docs/concepts-covered.md — backlog tracking
- One finished SRT (e.g. videos/solana-bytes-promo/out/solana-bytes-master.srt) as a length/rhythm reference
Smoke test: in a new conversation in the Project, ask "What concept should we cover next?" — it should suggest 2-3 untouched topics from the backlog with one-line angles for each. If it asks "what's this project about?", the custom instructions didn't paste correctly.

Per-video workflow

1. (web)        Open the Project → "What concept next?"
                → ideation loop → fills brief template
                → outputs a ready-to-paste brief

2. (terminal)   Clear Claude Code context (/clear or new session)
                so the agent loads CLAUDE.md fresh

3. (terminal)   /marketing-video <paste the entire brief>

4. (terminal)   Watch Claude execute the runbook:
                  ✓ load CLAUDE.md (you'll see it cite the runbook)
                  ✓ validate brief against series-brief-template.md
                  ✓ if brief is incomplete, AskUserQuestion for gaps
                  ✓ create videos/<new-name>/ by forking the example
                  ✓ run generate.py → retime SCENES → render → mix
                  ✓ upload mp4 + send you the URL
                  ✓ append shipped entry to docs/concepts-covered.md

5. (web)        Refresh the concepts-covered.md upload in the
                Project so the next ideation session sees what
                shipped (in-place edit in the Knowledge sidebar)

Sanity checks during execution

Good signs that Claude is following the runbook:

References CLAUDE.md or the style guide by name in its narration
Quotes costs in $0.03/chunk or $0.10-$0.20/video (the latest numbers in CLAUDE.md). If you see "$0.25" or "$0.75", your uploaded docs are stale — refresh them.
Names the new project videos/<concept>-promo/ per the layout convention
Hyphenates 2-3 letter acronyms in SCRIPT_SEGMENTS (e.g. P-D-A, I-D) without being told to
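
The acronym rule in the last item can be approximated with a regular expression, which is handy for spot-checking a draft script. This is only a guess at the convention: the authoritative rule lives in docs/voice-a-style-guide.md, and `hyphenate_acronyms` is a hypothetical helper that treats any standalone run of 2-3 capital letters as an acronym.

```python
import re

# Standalone all-caps tokens of exactly 2 or 3 letters (e.g. PDA, ID).
ACRONYM = re.compile(r"\b[A-Z]{2,3}\b")


def hyphenate_acronyms(text):
    """Spell out short acronyms letter by letter so TTS reads them aloud."""
    return ACRONYM.sub(lambda m: "-".join(m.group()), text)


line = "A PDA is derived from a program ID."
print(hyphenate_acronyms(line))
```

Already-hyphenated text such as "P-D-A" is left untouched, because each letter is then a one-character token that the pattern does not match.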

Red flags that the docs need tightening (tell me what happened, I'll fix the docs):

Asks "where should I put the new video?" — layout convention isn't surfacing
Forgets to hyphenate acronyms — style guide isn't reaching Claude
Tries to run commands from the wrong cwd — recipe step needs absolute paths
Asks 4+ pipeline-mechanics questions — CLAUDE.md isn't being read

When the Project knowledge gets stale

The repo changes faster than you'll refresh Project Knowledge. As a rule, refresh after meaningful commits to:

docs/concepts-covered.md (every shipped video)
CLAUDE.md or docs/voice-a-style-guide.md (when conventions change)
README.md (when costs / structure / setup change)

Pull the latest from raw.githubusercontent.com/moviendome/Talkforge/main/ or git pull and re-upload via the Project's Knowledge sidebar.
