# Podroma API Guide for Agents

This document is for AI agents and developer tools that want to
programmatically read podcast insights from **podroma.com**.

Podroma converts long-form YouTube podcasts into structured insights:
up to 10 dense, MECE "what's actually worth knowing" takeaways from each
episode, written in the speakers' own words with named attribution. You
can also ask follow-up questions about an episode via the Q&A endpoint.

> Podroma surfaces a transformative **synthesis**: distilled insights in
> the speakers' own framing, not verbatim transcripts. Raw transcripts
> are not available through the API.

All endpoints below are public — no API key, no authentication.

Base URL: `https://podroma.com`

## Core service

Given a YouTube podcast URL, Podroma:

1. Generates up to 10 dense, MECE key insights in the speakers' own words
   (via Claude)
2. Categorizes the episode (Tech & AI, Business & Finance, Health, …)
3. Caches the result so future requests are instant
4. Lets you ask follow-up questions about the episode (Q&A)

Each analyzed episode lives at a stable URL: `https://podroma.com/podcast/{id}`.

## Key endpoints

### 1. Cache check (free, fast)

`GET /api/check?platform=youtube&id={video_id_or_url}`

Tells you whether a video has already been transcribed and analyzed.
Use this **before** kicking off a transcribe call to avoid burning your
daily quota.

**Request**

```http
GET /api/check?platform=youtube&id=dQw4w9WgXcQ
```

The `id` parameter accepts a bare 11-character YouTube id **or** a full
`https://www.youtube.com/watch?v=...` URL.

**Response**

```json
{
  "cached": true,
  "id": 73,
  "ready": true,
  "synthesis_url": "https://podroma.com/podcast/73"
}
```

`cached: false` means Podroma has never processed this video.
`cached: true` with `ready: false` means processing is in progress (the
synthesis hasn't finished).
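As a minimal sketch of the cache check, the helpers below build the request URL and map a response onto the three states described above. The function names and the state labels (`unseen`, `in_progress`, `ready`) are illustrative assumptions, not part of the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://podroma.com"


def build_check_url(video: str) -> str:
    """Build the cache-check URL for a bare video id or a full YouTube URL."""
    # urlencode percent-escapes a full URL so it survives as one query value.
    query = urlencode({"platform": "youtube", "id": video})
    return f"{BASE_URL}/api/check?{query}"


def classify_check(body: dict) -> str:
    """Map a /api/check response onto a state label (labels are hypothetical)."""
    if not body.get("cached"):
        return "unseen"
    return "ready" if body.get("ready") else "in_progress"


def check_cache(video: str, timeout: float = 10.0) -> dict:
    """Perform the GET request and decode the JSON body (network call)."""
    with urlopen(build_check_url(video), timeout=timeout) as resp:
        return json.load(resp)
```

An agent would typically call `classify_check(check_cache(video_id))` and only move on to the transcribe calls when the result is not `ready`.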

### 2. Transcribe (create a new episode)

`POST /api/episodes/create`

Idempotent — if the URL has already been transcribed, this returns the
existing `id` without doing any work and without consuming a transcribe
slot. Otherwise it inserts a placeholder row and returns the new `id`.

**Request**

```http
POST /api/episodes/create
Content-Type: application/json

{
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
}
```

**Response**

```json
{
  "id": 142,
  "alreadyAnalyzed": false
}
```

After receiving the `id`, **trigger the actual transcribe work** by
calling:

`POST /api/library/item/{id}/regenerate`

This call is the one that consumes a transcribe slot against the rate
limits. Processing takes roughly 60–120 seconds for a typical 1–2 hour
podcast. The call returns `{ "ok": true }` on success, or
`{ "ok": true, "alreadyRegenerated": true }` if the row was already done.

Then poll the read endpoint (next section) until `ready: true`.
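The two POST calls above could be prepared like this with the standard library. This is a sketch under assumptions: the helper names are hypothetical, the regenerate call is assumed to accept an empty body, and `alreadyAnalyzed: true` is assumed to mean that no regenerate call is needed.

```python
import json
from urllib.request import Request

BASE_URL = "https://podroma.com"


def create_request(youtube_url: str) -> Request:
    """Prepare POST /api/episodes/create with the documented JSON body."""
    payload = json.dumps({"url": youtube_url}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/episodes/create",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def regenerate_request(episode_id: int) -> Request:
    """Prepare POST /api/library/item/{id}/regenerate (empty body is an assumption)."""
    return Request(
        f"{BASE_URL}/api/library/item/{episode_id}/regenerate",
        data=b"",
        method="POST",
    )


def needs_regenerate(create_body: dict) -> bool:
    """Assumption: alreadyAnalyzed=true means the transcribe work is already done."""
    return not create_body.get("alreadyAnalyzed", False)
```

Each `Request` can be passed to `urllib.request.urlopen` to actually send it; keeping construction separate from sending makes the payloads easy to inspect before spending a transcribe slot.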

### 3. Fetch episode

`GET /api/library/item/{id}`

Returns the cached metadata + synthesis (the distilled insights).

> **Note:** Podroma does **not** expose raw or speaker-labeled
> transcripts via the API. We surface the transformative synthesis
> (dense key insights in the speakers' own words), not verbatim
> reproductions of the source. To explore what was said, use the Q&A
> endpoint below. It answers from the transcript server-side without
> returning it.

**Request**

```http
GET /api/library/item/73
```

**Response**

```json
{
  "id": 73,
  "title": "Joe Rogan Experience #2440 - Matt Damon & Ben Affleck",
  "author": "PowerfulJRE",
  "source_url": "https://www.youtube.com/watch?v=...",
  "source_type": "youtube",
  "created_at": 1733600000000,
  "category": "Entertainment & Culture",
  "synthesis_markdown": "## 1. ...\n\n...",
  "ready": true,
  "video_id": "abc123def45",
  "thumbnail_url": "https://i.ytimg.com/vi/abc123def45/maxresdefault.jpg"
}
```

The `synthesis_markdown` field uses GitHub-flavored markdown with
numbered `## N. <title>` headings — one per insight.
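Because each insight starts with a `## N. <title>` heading, the markdown can be split into individual insights with a small regular expression. The sketch below assumes headings appear at the start of a line exactly in that form; the function name and the returned dictionary keys are illustrative.

```python
import re

# Matches lines such as "## 3. Some insight title".
HEADING = re.compile(r"^## (\d+)\. (.+)$", re.MULTILINE)


def split_insights(markdown):
    """Split synthesis_markdown into a list of {number, title, body} dicts."""
    if not markdown:
        # synthesis_markdown is null while the episode is still being analyzed.
        return []
    matches = list(HEADING.finditer(markdown))
    insights = []
    for i, match in enumerate(matches):
        # The body runs from the end of this heading to the start of the next.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        insights.append({
            "number": int(match.group(1)),
            "title": match.group(2).strip(),
            "body": markdown[match.end():end].strip(),
        })
    return insights
```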

### 4. Ask a question (Q&A)

`POST /api/library/item/{id}/chat`

Streams a Claude response grounded in the episode's content. Use this to
dig into specifics: the model answers from the transcript server-side
but won't dump it verbatim (it's a Q&A assistant, not a transcript
export). The response is `text/plain`, streamed token by token as raw
chunked text with no SSE wrapper.

**Request**

```http
POST /api/library/item/73/chat
Content-Type: application/json

{
  "messages": [
    { "role": "user", "content": "What did Matt say about Ben's directing process?" }
  ]
}
```

`messages` is the full conversation history (user + assistant turns).
Append your new user question at the end on each call.

**Response**

```
text/plain; charset=utf-8
(streaming chunks of the answer)
```
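A minimal sketch of a Q&A client follows. It keeps the full conversation history, appends the new question, and decodes the raw byte stream incrementally so that a multi-byte UTF-8 character split across chunks is not corrupted. All function names here are hypothetical; only the endpoint path and the `messages` shape come from this guide.

```python
import codecs
import json
from urllib.request import Request, urlopen

BASE_URL = "https://podroma.com"


def extend_history(history: list, question: str) -> list:
    """Return a new message list with the user question appended at the end."""
    return history + [{"role": "user", "content": question}]


def chat_request(episode_id: int, messages: list) -> Request:
    """Prepare POST /api/library/item/{id}/chat with the full history."""
    payload = json.dumps({"messages": messages}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/library/item/{episode_id}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def iter_text(stream, chunk_size: int = 1024):
    """Yield decoded text pieces from a binary stream as they arrive."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    while True:
        # read1 returns whatever is available instead of waiting for a full chunk.
        chunk = stream.read1(chunk_size)
        if not chunk:
            break
        text = decoder.decode(chunk)
        if text:
            yield text
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


def ask(episode_id: int, history: list, question: str) -> list:
    """Send one question and return the history including the answer (network call)."""
    messages = extend_history(history, question)
    with urlopen(chat_request(episode_id, messages)) as resp:
        answer = "".join(iter_text(resp))
    return messages + [{"role": "assistant", "content": answer}]
```

Passing the list returned by `ask` back in as `history` on the next call preserves the user and assistant turns the endpoint expects.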

## Rate limits

Enforced per source IP and per session cookie:

| Limit                          | Value |
|--------------------------------|-------|
| Concurrent transcribes per IP  | 2     |
| Daily transcribes per IP       | 25    |
| Daily transcribes per session  | 10    |
| Daily Q&A questions per IP     | 50    |

A "transcribe" counts when `POST /api/library/item/{id}/regenerate`
actually does work — cache hits do not count.

When you exceed a limit, the API returns **HTTP 429** with this body:

```json
{
  "error": "Daily transcription limit reached (25 per IP per 24h). Try again tomorrow.",
  "scope": "daily_ip",
  "action": "transcribe",
  "limit": 25,
  "retry_after_seconds": 1800
}
```

Response headers include `Retry-After`, `X-RateLimit-Scope`,
`X-RateLimit-Action`, and `X-RateLimit-Limit`. The `scope` field tells
you which limit fired (`concurrent_ip`, `daily_ip`, or `daily_session`)
so you can branch accordingly.

The session cookie (`podroma_sid`) is set automatically by Podroma's
middleware on the first response. Send it back on subsequent requests
if you want the session-level limit to apply (otherwise only the per-IP
limits gate you).
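One way to branch on a rate-limit response is sketched below. The policy is an assumption made for illustration: a `concurrent_ip` hit is treated as retryable after the advertised delay, while the daily scopes stop further attempts. The sketch also assumes `Retry-After` carries whole seconds and falls back to `retry_after_seconds` from the body; the function names and the returned dictionary shape are hypothetical.

```python
def retry_delay(headers: dict, body: dict, default: float = 60.0) -> float:
    """Pick a wait time from the Retry-After header, then the body, then a default."""
    lowered = {key.lower(): value for key, value in headers.items()}
    raw = lowered.get("retry-after")
    if raw is not None and str(raw).strip().isdigit():
        return float(raw)
    value = body.get("retry_after_seconds")
    if isinstance(value, (int, float)):
        return float(value)
    return default


def plan_retry(status: int, headers: dict, body: dict) -> dict:
    """Decide what to do next based on the status code and the scope field."""
    if status != 429:
        return {"action": "proceed", "wait_seconds": 0.0, "scope": None}
    scope = body.get("scope")
    wait = retry_delay(headers, body)
    # Assumed policy: concurrency limits clear once in-flight jobs finish,
    # so retry; daily limits mean the quota for this window is used up.
    action = "retry" if scope == "concurrent_ip" else "stop"
    return {"action": action, "wait_seconds": wait, "scope": scope}
```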

## Typical agent flow

```text
1. GET  /api/check?platform=youtube&id=<VID>
   → if cached && ready, GOTO 4

2. POST /api/episodes/create { url: "<youtube url>" }
   → get { id }

3. POST /api/library/item/{id}/regenerate
   → wait ~60-120s, then poll step 4

4. GET  /api/library/item/{id}
   → if ready=true, use synthesis_markdown

5. POST /api/library/item/{id}/chat { messages: [...] }
   → ask follow-up questions (answered from the transcript server-side)
```
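Steps 1 to 4 of the flow above can be sketched as a single function. To keep the example independent of any HTTP library, it takes a hypothetical `http(method, path, body)` callable that returns the decoded JSON response; that callable, the function name, and the polling defaults are all assumptions.

```python
import time
from typing import Callable, Optional

# Hypothetical transport: (method, path, json_body_or_None) -> decoded JSON dict.
Http = Callable[[str, str, Optional[dict]], dict]


def fetch_synthesis(http: Http, video_id: str, poll_every: float = 10.0,
                    max_wait: float = 300.0, sleep=time.sleep) -> str:
    """Return synthesis_markdown for a video, transcribing it first if needed."""
    # Step 1: cache check.
    check = http("GET", f"/api/check?platform=youtube&id={video_id}", None)
    if check.get("cached") and check.get("ready"):
        episode_id = check["id"]
    else:
        # Step 2: create (idempotent) to obtain the episode id.
        url = f"https://www.youtube.com/watch?v={video_id}"
        episode_id = http("POST", "/api/episodes/create", {"url": url})["id"]
        # Step 3: trigger the transcribe work.
        http("POST", f"/api/library/item/{episode_id}/regenerate", None)

    # Step 4: poll the read endpoint until the synthesis is ready.
    waited = 0.0
    while True:
        item = http("GET", f"/api/library/item/{episode_id}", None)
        if item.get("ready"):
            return item["synthesis_markdown"]
        if waited >= max_wait:
            raise TimeoutError(f"episode {episode_id} not ready after {max_wait}s")
        sleep(poll_every)
        waited += poll_every
```

Injecting the transport and the `sleep` function keeps the flow testable without network access and leaves retry and rate-limit handling to the caller's HTTP layer.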

## Constraints

- **Maximum video duration: 4 hours.** Longer episodes are rejected at
  `POST /api/episodes/create` with HTTP 400 and `scope: "duration_cap"`
  in the body, before any transcribe slot is consumed.
- **YouTube only.** Spotify and Apple Podcasts aren't supported.
- **Auto-captions required.** Videos without YouTube captions can't be
  transcribed (we don't run our own ASR).
- **No deletion endpoint.** Episodes are public once analyzed.
- **No webhooks.** Poll the GET endpoint for status changes.

### Duration cap error shape

```json
{
  "error": "We can analyze podcast episodes up to 4 hours long. This episode is longer than that — try a shorter episode from the same podcast.",
  "scope": "duration_cap",
  "max_duration_sec": 14400,
  "video_duration_sec": 21630
}
```
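A client might turn this error shape into a short, actionable message. The helper below is a sketch; its name and the wording of the message are illustrative.

```python
def describe_duration_cap(status: int, body: dict):
    """Return a message for a duration-cap rejection, or None for anything else."""
    if status != 400 or body.get("scope") != "duration_cap":
        return None
    over = body["video_duration_sec"] - body["max_duration_sec"]
    hours, rest = divmod(over, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"episode exceeds the cap by {hours}h {minutes}m {seconds}s"
```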

## Schema gotchas

- `created_at` is a Unix epoch in **milliseconds**, not seconds.
- `synthesis_markdown` is null while the episode is still being analyzed
  (poll until `ready: true`).
- `category` may be null for rows that pre-date the categorizer; this
  doesn't affect the synthesis content.
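These gotchas can be handled in one normalization step, sketched below. The function name, the output keys, and the `"Uncategorized"` fallback label are assumptions chosen for illustration.

```python
from datetime import datetime, timezone


def normalize_episode(item: dict) -> dict:
    """Convert a raw episode payload into safer Python values."""
    # created_at is in milliseconds, so divide by 1000 before converting.
    created = datetime.fromtimestamp(item["created_at"] / 1000, tz=timezone.utc)
    return {
        "id": item["id"],
        "created": created,
        # category can be null on older rows; the fallback label is hypothetical.
        "category": item.get("category") or "Uncategorized",
        # Only expose the synthesis once the episode reports ready.
        "synthesis": item.get("synthesis_markdown") if item.get("ready") else None,
    }
```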

## Contact

Have questions or bug reports, or need a higher rate limit? Email
[insights@podroma.com](mailto:insights@podroma.com).

## MCP server

Podroma is also available as a **Model Context Protocol** server — the
same capabilities as this REST API, consumable natively by Claude
(claude.ai connectors, Claude Code, Claude Desktop), Cursor, and any
other MCP client:

```
https://podroma.com/api/mcp/mcp     (Streamable HTTP, no auth)
```

Tools: `list_recent_episodes`, `get_episode_insights`,
`search_podcasters`, `list_podcaster_episodes`, `ask_episode` (50/day
per IP), `analyze_episode` (25/day per IP). Human-friendly setup
instructions: https://podroma.com/mcp
