Series

Building an AI Podcast Index

An eight-part build-along: a locally running tool that ingests a YouTube podcast channel, extracts guests and topics, lets you clip-search by intent, and generates questions for future episodes — using uv, FastAPI, Vite + React, and a provider-switchable LLM client.

8 parts

  1. Building an AI Podcast Index: the Project, the Stack, and What You'll Have at the End

    An eight-part build-along: a local tool that ingests a YouTube podcast channel, extracts guests and topics with Claude, lets you clip-search by intent, and generates grounded questions for future episodes. Post 1: the demo, the architecture, and one paragraph per stack choice.

    10 min read
  2. Ingesting YouTube transcripts: the YT Data API path, with yt-dlp + Whisper as an honest fallback

    YouTube's Data API is the right primary path for transcripts. yt-dlp + local Whisper is the fallback when captions are missing — used carefully, documented honestly, and quota-aware.

    13 min read
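
    To make the fallback concrete, here's a minimal sketch of the chain that post builds, assuming the documented yt-dlp and openai-whisper Python entry points; `fetch_captions_from_data_api` is a hypothetical stand-in for the post's quota-aware Data API client:

    ```python
    import yt_dlp
    import whisper

    def fetch_captions_from_data_api(video_id: str) -> str | None:
        """Hypothetical: return caption text via the YouTube Data API, or None."""
        ...  # OAuth + captions.list / captions.download, quota-aware

    def transcribe_with_whisper(video_id: str) -> str:
        """Fallback: download audio with yt-dlp, transcribe locally with Whisper."""
        opts = {
            "format": "bestaudio/best",
            "outtmpl": f"/tmp/{video_id}.%(ext)s",
            "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}],
        }
        with yt_dlp.YoutubeDL(opts) as ydl:
            ydl.download([f"https://www.youtube.com/watch?v={video_id}"])
        model = whisper.load_model("base")  # small enough to run on a laptop
        return model.transcribe(f"/tmp/{video_id}.mp3")["text"]

    def get_transcript(video_id: str) -> str:
        captions = fetch_captions_from_data_api(video_id)
        return captions if captions is not None else transcribe_with_whisper(video_id)
    ```
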
  3. Structured extraction with Pydantic + Claude: guests, topics, and quotes from raw transcripts

    Schema-first prompting with Pydantic + Anthropic tool use, tiered routing between Haiku and Sonnet, prompt caching for the system block, and a single retry that feeds the validation error back into the prompt.

    9 min read
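
    A minimal sketch of that schema-first loop, using the Anthropic SDK's tool-use API; the `EpisodeExtraction` fields, prompt wording, and model id are illustrative rather than the post's exact code:

    ```python
    from anthropic import Anthropic
    from pydantic import BaseModel, ValidationError

    class EpisodeExtraction(BaseModel):
        guests: list[str]
        topics: list[str]
        notable_quotes: list[str]

    client = Anthropic()

    TOOL = {
        "name": "record_extraction",
        "description": "Record structured data extracted from a podcast transcript.",
        "input_schema": EpisodeExtraction.model_json_schema(),
    }

    def extract(transcript: str) -> EpisodeExtraction:
        # The post also marks the system block with cache_control for prompt caching.
        messages = [{"role": "user", "content": f"Extract guests, topics, and quotes:\n\n{transcript}"}]
        for _ in range(2):  # one call plus the single retry
            response = client.messages.create(
                model="claude-3-5-haiku-latest",
                max_tokens=1024,
                tools=[TOOL],
                tool_choice={"type": "tool", "name": "record_extraction"},
                messages=messages,
            )
            block = next(b for b in response.content if b.type == "tool_use")
            try:
                return EpisodeExtraction.model_validate(block.input)
            except ValidationError as err:
                # Feed the validation error back so the model can correct itself.
                messages.append({"role": "assistant", "content": response.content})
                messages.append({
                    "role": "user",
                    "content": [{
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "is_error": True,
                        "content": f"Validation failed:\n{err}\nCall the tool again with corrected input.",
                    }],
                })
        raise RuntimeError("extraction failed after one retry")
    ```
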
  4. Entity resolution for guests: fuzzy matching first, LLM disambiguation second

    The same person shows up as 'Bibhusan Bista', 'Bibhusan B.', and 'B. Bista' across three episodes. Don't ask the LLM first — try cheap deterministic matching, then escalate only the ambiguous cases.

    8 min read
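
    The shape of that two-stage matcher, as a sketch; the thresholds and the `llm_same_person` helper are illustrative assumptions, with rapidfuzz standing in as the deterministic scorer:

    ```python
    from rapidfuzz import fuzz

    AUTO_MERGE = 92     # above this score, merge without asking the LLM
    AUTO_DISTINCT = 60  # below this, treat as different people

    def normalize(name: str) -> str:
        return " ".join(name.lower().replace(".", "").split())

    def same_guest(a: str, b: str) -> bool:
        score = fuzz.token_sort_ratio(normalize(a), normalize(b))
        if score >= AUTO_MERGE:
            return True
        if score <= AUTO_DISTINCT:
            return False
        # Only the ambiguous middle band pays for an LLM call.
        return llm_same_person(a, b)

    def llm_same_person(a: str, b: str) -> bool:
        """Hypothetical stand-in for the post's LLM disambiguation prompt."""
        ...
    ```
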
  5. Building a provider-switching LLM client: one interface, three providers, task-tier routing

    A 60-line adapter package lets you swap Anthropic, OpenAI, and local Ollama via an env var, route cheap classification to Haiku and synthesis to Opus, and add prompt caching without touching call sites.

    9 min read
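
    A sketch of the adapter shape; the class names, tier labels, and model ids are illustrative, and the OpenAI adapter is elided for space:

    ```python
    import os
    from typing import Protocol

    class LLMClient(Protocol):
        def complete(self, prompt: str, tier: str = "cheap") -> str: ...

    class AnthropicClient:
        # Illustrative tier map: cheap classification vs. heavier synthesis.
        TIERS = {"cheap": "claude-3-5-haiku-latest", "smart": "claude-3-opus-latest"}

        def complete(self, prompt: str, tier: str = "cheap") -> str:
            from anthropic import Anthropic  # lazy import keeps other providers optional
            resp = Anthropic().messages.create(
                model=self.TIERS[tier],
                max_tokens=1024,
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.content[0].text

    class OllamaClient:
        TIERS = {"cheap": "llama3.2", "smart": "llama3.3"}

        def complete(self, prompt: str, tier: str = "cheap") -> str:
            import ollama
            resp = ollama.chat(model=self.TIERS[tier],
                               messages=[{"role": "user", "content": prompt}])
            return resp["message"]["content"]

    # Call sites never name a provider or a model id:
    def get_client() -> LLMClient:
        provider = os.environ.get("LLM_PROVIDER", "anthropic")
        return {"anthropic": AnthropicClient, "ollama": OllamaClient}[provider]()
    ```

    Call sites then read `get_client().complete(prompt, tier="cheap")`, which is what lets model swaps and prompt caching happen in one place.
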
  6. Search without embeddings: Postgres tsvector, LLM rerank, and 30-second clips

    For a few hundred podcast episodes, Postgres full-text search plus an LLM rerank beats embedding-based RAG on both quality and operational simplicity. No vector DB, no embedding pipeline.

    13 min read
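
    In outline, the candidate query plus rerank looks something like this; the table and column names are illustrative, and it assumes a precomputed `tsv` tsvector column behind a GIN index:

    ```python
    import psycopg2

    CANDIDATES_SQL = """
        SELECT id, episode_id, start_seconds, text,
               ts_rank(tsv, plainto_tsquery('english', %(q)s)) AS rank
        FROM clips
        WHERE tsv @@ plainto_tsquery('english', %(q)s)
        ORDER BY rank DESC
        LIMIT 50
    """

    def search_clips(conn, query: str) -> list[dict]:
        with conn.cursor() as cur:
            cur.execute(CANDIDATES_SQL, {"q": query})
            rows = cur.fetchall()
        candidates = [
            {"id": r[0], "episode": r[1], "start": r[2], "text": r[3]} for r in rows
        ]
        # Hand the short list to a cheap-tier LLM to order by intent; keep the top 10.
        return llm_rerank(query, candidates)[:10]

    def llm_rerank(query: str, candidates: list[dict]) -> list[dict]:
        """Hypothetical stand-in for the post's rerank prompt."""
        ...
    ```
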
  7. The React side: guest pages, search UI, and codegen'd types

    A Vite + React SPA with three real pages — popular guests, guest detail, search — wired to FastAPI through codegen'd TypeScript types from a shared Pydantic schema. No UI framework, five components, type-safe end to end.

    13 min read
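
    The Python half of that contract, sketched; the route path and model fields are illustrative, with openapi-typescript as one example of the codegen step:

    ```python
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class GuestSummary(BaseModel):
        id: int
        name: str
        episode_count: int

    @app.get("/api/guests/popular", response_model=list[GuestSummary])
    def popular_guests() -> list[GuestSummary]:
        ...  # query Postgres for the most frequent guests

    # The frontend generates its types from the same schema, e.g.:
    #   npx openapi-typescript http://localhost:8000/openapi.json -o src/api-types.ts
    ```
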
  8. The question generator, the cron job, and shipping it locally

    Grounded question generation is one prompt away. Wrap the project with a local cron, a one-line backup, and a popular-guests landing query — and the podcast index runs on its own.

    9 min read
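
    A sketch of the generator; `client` is the tier-routing adapter from part 5, and the prompt wording, module path, and cron entry are illustrative:

    ```python
    def generate_questions(client, guest_name: str,
                           quotes: list[str], topics: list[str]) -> str:
        # Ground the prompt in what the guest actually said on earlier episodes.
        grounding = "\n".join(f"- {q}" for q in quotes)
        prompt = (
            f"You are preparing a host to interview {guest_name} again.\n"
            f"Topics they have covered: {', '.join(topics)}.\n"
            f"Things they actually said:\n{grounding}\n\n"
            "Write five follow-up questions. Each must build on one of the quotes "
            "or topics above; do not invent positions the guest has not taken."
        )
        return client.complete(prompt, tier="smart")

    # Illustrative crontab entry: re-ingest new episodes nightly at 02:00.
    #   0 2 * * * cd ~/podcast-index && uv run python -m app.ingest
    ```
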