Series

Building an AI Podcast Index

An eight-part build-along: a locally running tool that ingests a YouTube podcast channel, extracts guests and topics, lets you clip-search by intent, and generates questions for future episodes — using uv, FastAPI, Vite + React, and a provider-switchable LLM client.

8 parts

  1. Building an AI Podcast Index: the Project, the Stack, and What You'll Have at the End

    An eight-part build-along: a local tool that ingests a YouTube podcast channel, extracts guests and topics with Claude, lets you clip-search by intent, and generates grounded questions for future episodes. Post 1: the demo, the architecture, and one paragraph per stack choice.

    10 min read
  2. Ingesting YouTube transcripts: the YT Data API path, with yt-dlp + Whisper as an honest fallback

    YouTube's Data API is the right primary path for transcripts. yt-dlp + local Whisper is the fallback when captions are missing — used carefully, documented honestly, and quota-aware.

    13 min read
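
    To make the fallback concrete, here's a minimal sketch of the chain that post builds, assuming the documented yt-dlp and openai-whisper Python entry points; `fetch_captions_from_data_api` is a hypothetical stand-in for the post's quota-aware Data API client:

    ```python
    import yt_dlp
    import whisper

    def fetch_captions_from_data_api(video_id: str) -> str | None:
        """Hypothetical: return caption text via the YouTube Data API, or None."""
        ...  # OAuth + captions.list / captions.download, quota-aware

    def transcribe_with_whisper(video_id: str) -> str:
        """Fallback: download audio with yt-dlp, transcribe locally with Whisper."""
        opts = {
            "format": "bestaudio/best",
            "outtmpl": f"/tmp/{video_id}.%(ext)s",
            "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}],
        }
        with yt_dlp.YoutubeDL(opts) as ydl:
            ydl.download([f"https://www.youtube.com/watch?v={video_id}"])
        model = whisper.load_model("base")  # small enough to run on a laptop
        return model.transcribe(f"/tmp/{video_id}.mp3")["text"]

    def get_transcript(video_id: str) -> str:
        captions = fetch_captions_from_data_api(video_id)
        return captions if captions is not None else transcribe_with_whisper(video_id)
    ```
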
  3. Structured extraction with Pydantic + Claude: guests, topics, and quotes from raw transcripts

    Schema-first prompting with Pydantic + Anthropic tool use, tiered routing between Haiku and Sonnet, prompt caching for the system block, and a single retry that feeds the validation error back into the prompt.

    9 min read
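
    A minimal sketch of that schema-first loop, using the Anthropic SDK's tool-use API; the `EpisodeExtraction` fields, prompt wording, and model id are illustrative rather than the post's exact code:

    ```python
    from anthropic import Anthropic
    from pydantic import BaseModel, ValidationError

    class EpisodeExtraction(BaseModel):
        guests: list[str]
        topics: list[str]
        notable_quotes: list[str]

    client = Anthropic()

    TOOL = {
        "name": "record_extraction",
        "description": "Record structured data extracted from a podcast transcript.",
        "input_schema": EpisodeExtraction.model_json_schema(),
    }

    def extract(transcript: str) -> EpisodeExtraction:
        # The post also marks the system block with cache_control for prompt caching.
        messages = [{"role": "user", "content": f"Extract guests, topics, and quotes:\n\n{transcript}"}]
        for _ in range(2):  # one call plus the single retry
            response = client.messages.create(
                model="claude-3-5-haiku-latest",
                max_tokens=1024,
                tools=[TOOL],
                tool_choice={"type": "tool", "name": "record_extraction"},
                messages=messages,
            )
            block = next(b for b in response.content if b.type == "tool_use")
            try:
                return EpisodeExtraction.model_validate(block.input)
            except ValidationError as err:
                # Feed the validation error back so the model can correct itself.
                messages.append({"role": "assistant", "content": response.content})
                messages.append({
                    "role": "user",
                    "content": [{
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "is_error": True,
                        "content": f"Validation failed:\n{err}\nCall the tool again with corrected input.",
                    }],
                })
        raise RuntimeError("extraction failed after one retry")
    ```
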
  4. Entity resolution for guests: fuzzy matching first, LLM disambiguation second

    The same person shows up as 'Bibhusan Bista', 'Bibhusan B.', and 'B. Bista' across three episodes. Don't ask the LLM first — try cheap deterministic matching, then escalate only the ambiguous cases.

    8 min read
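
    The shape of that two-stage matcher, as a sketch; the thresholds and the `llm_same_person` helper are illustrative assumptions, with rapidfuzz standing in as the deterministic scorer:

    ```python
    from rapidfuzz import fuzz

    AUTO_MERGE = 92     # above this score, merge without asking the LLM
    AUTO_DISTINCT = 60  # below this, treat as different people

    def normalize(name: str) -> str:
        return " ".join(name.lower().replace(".", "").split())

    def same_guest(a: str, b: str) -> bool:
        score = fuzz.token_sort_ratio(normalize(a), normalize(b))
        if score >= AUTO_MERGE:
            return True
        if score <= AUTO_DISTINCT:
            return False
        # Only the ambiguous middle band pays for an LLM call.
        return llm_same_person(a, b)

    def llm_same_person(a: str, b: str) -> bool:
        """Hypothetical stand-in for the post's LLM disambiguation prompt."""
        ...
    ```
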
  5. Building a provider-switching LLM client: one interface, three providers, task-tier routing

    A 60-line adapter package lets you swap Anthropic, OpenAI, and local Ollama via an env var, route cheap classification to Haiku and synthesis to Opus, and add prompt caching without touching call sites.

    9 min read
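
    A sketch of the adapter shape; the class names, tier labels, and model ids are illustrative, and the OpenAI adapter is elided for space:

    ```python
    import os
    from typing import Protocol

    class LLMClient(Protocol):
        def complete(self, prompt: str, tier: str = "cheap") -> str: ...

    class AnthropicClient:
        # Illustrative tier map: cheap classification vs. heavier synthesis.
        TIERS = {"cheap": "claude-3-5-haiku-latest", "smart": "claude-3-opus-latest"}

        def complete(self, prompt: str, tier: str = "cheap") -> str:
            from anthropic import Anthropic  # lazy import keeps other providers optional
            resp = Anthropic().messages.create(
                model=self.TIERS[tier],
                max_tokens=1024,
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.content[0].text

    class OllamaClient:
        TIERS = {"cheap": "llama3.2", "smart": "llama3.3"}

        def complete(self, prompt: str, tier: str = "cheap") -> str:
            import ollama
            resp = ollama.chat(model=self.TIERS[tier],
                               messages=[{"role": "user", "content": prompt}])
            return resp["message"]["content"]

    # Call sites never name a provider or a model id:
    def get_client() -> LLMClient:
        provider = os.environ.get("LLM_PROVIDER", "anthropic")
        return {"anthropic": AnthropicClient, "ollama": OllamaClient}[provider]()
    ```

    Call sites then read `get_client().complete(prompt, tier="cheap")`, which is what lets model swaps and prompt caching happen in one place.
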
  6. Search without embeddings: Postgres tsvector, LLM rerank, and 30-second clips

    For a few hundred podcast episodes, Postgres full-text search plus an LLM rerank beats embedding-based RAG on both quality and operational simplicity. No vector DB, no embedding pipeline.

    13 min read
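
    In outline, the candidate query plus rerank looks something like this; the table and column names are illustrative, and it assumes a precomputed `tsv` tsvector column behind a GIN index:

    ```python
    import psycopg2

    CANDIDATES_SQL = """
        SELECT id, episode_id, start_seconds, text,
               ts_rank(tsv, plainto_tsquery('english', %(q)s)) AS rank
        FROM clips
        WHERE tsv @@ plainto_tsquery('english', %(q)s)
        ORDER BY rank DESC
        LIMIT 50
    """

    def search_clips(conn, query: str) -> list[dict]:
        with conn.cursor() as cur:
            cur.execute(CANDIDATES_SQL, {"q": query})
            rows = cur.fetchall()
        candidates = [
            {"id": r[0], "episode": r[1], "start": r[2], "text": r[3]} for r in rows
        ]
        # Hand the short list to a cheap-tier LLM to order by intent; keep the top 10.
        return llm_rerank(query, candidates)[:10]

    def llm_rerank(query: str, candidates: list[dict]) -> list[dict]:
        """Hypothetical stand-in for the post's rerank prompt."""
        ...
    ```
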
  7. The React side: guest pages, search UI, and codegen'd types

    A Vite + React SPA with three real pages — popular guests, guest detail, search — wired to FastAPI through codegen'd TypeScript types from a shared Pydantic schema. No UI framework, five components, type-safe end to end.

    13 min read
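
    The Python half of that contract, sketched; the route path and model fields are illustrative, with openapi-typescript as one example of the codegen step:

    ```python
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class GuestSummary(BaseModel):
        id: int
        name: str
        episode_count: int

    @app.get("/api/guests/popular", response_model=list[GuestSummary])
    def popular_guests() -> list[GuestSummary]:
        ...  # query Postgres for the most frequent guests

    # The frontend generates its types from the same schema, e.g.:
    #   npx openapi-typescript http://localhost:8000/openapi.json -o src/api-types.ts
    ```
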
  8. The question generator, the cron job, and shipping it locally

    Grounded question generation is one prompt away. Wrap the project with a local cron, a one-line backup, and a popular-guests landing query — and the podcast index runs on its own.

    9 min read
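
    A sketch of the generator; `client` is the tier-routing adapter from part 5, and the prompt wording, module path, and cron entry are illustrative:

    ```python
    def generate_questions(client, guest_name: str,
                           quotes: list[str], topics: list[str]) -> str:
        # Ground the prompt in what the guest actually said on earlier episodes.
        grounding = "\n".join(f"- {q}" for q in quotes)
        prompt = (
            f"You are preparing a host to interview {guest_name} again.\n"
            f"Topics they have covered: {', '.join(topics)}.\n"
            f"Things they actually said:\n{grounding}\n\n"
            "Write five follow-up questions. Each must build on one of the quotes "
            "or topics above; do not invent positions the guest has not taken."
        )
        return client.complete(prompt, tier="smart")

    # Illustrative crontab entry: re-ingest new episodes nightly at 02:00.
    #   0 2 * * * cd ~/podcast-index && uv run python -m app.ingest
    ```
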