# POST /api/render/preview

Realtime, low-latency preview endpoint. Use it for live editors where a
human (or an LLM) is iteratively tuning a template and needs sub-second
visual feedback before committing to a paid render.

> Previews are **free**: they never consume render credits. They are
> encoded as WebP at quality 70 and cached for one hour by content hash.
> Layout matches `/generate` (no watermark, no overlay), at reduced quality.

```http
POST https://audome.io/api/render/preview
Authorization: Bearer YOUR_API_TOKEN
Content-Type: application/json
```

## Request body

| Field            | Type    | Required | Description                                            |
| ---------------- | ------- | -------- | ------------------------------------------------------ |
| `projectId`    | string  | yes      | The template ID returned when you save a project.      |
| `data`         | object  | no       | Dynamic values; same flat `"key.text"`/`"key.img"` shape as `/generate`. |
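
As an illustration of the shape described above, the sketch below builds a request body. The helper name `buildPreviewBody`, the project ID, and the layer names `title` and `avatar` are hypothetical; substitute the keys your own template defines.

```js
// Hypothetical helper: builds the JSON body for POST /api/render/preview.
function buildPreviewBody(projectId, data) {
  if (typeof projectId !== 'string' || projectId.length === 0) {
    throw new TypeError('projectId is required');
  }
  const body = { projectId };
  // `data` is optional, so only include it when the caller supplied values.
  if (data !== undefined) body.data = data;
  return JSON.stringify(body);
}

// Example values are placeholders, not real IDs or layer names.
const exampleBody = buildPreviewBody('YOUR_PROJECT_ID', {
  'title.text': 'Summer sale',
  'avatar.img': 'https://example.com/avatar.png',
});
```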

## Response

```json
{
  "success": true,
  "previewUrl": "https://audome.io/api/images/renders/<userId>/preview-<hash>.webp",
  "expiresAt": 1746640000000,
  "watermarked": false,
  "lowRes": true,
  "cached": false,
  "rateLimit": {
    "remaining": 29,
    "resetAt": 1746640060000
  },
  "generationTime": 1240
}
```

| Field             | Type     | Description                                                          |
| ----------------- | -------- | -------------------------------------------------------------------- |
| `previewUrl`    | string   | CDN URL to a WebP image. Drop into `<img>`.                        |
| `expiresAt`     | number   | Unix ms when the cached entry expires (1 h after creation).          |
| `watermarked`   | boolean  | Always `false` — preview output is not watermarked.                 |
| `lowRes`        | boolean  | Reduced quality vs `/generate` (WebP @ Q70 instead of full PNG).    |
| `cached`        | boolean  | `true` if the result was served from the in-memory cache.          |
| `rateLimit`     | object   | `remaining` calls in the active minute window + `resetAt` Unix ms. |
| `generationTime`| number   | Server-side render time in ms (omitted on cache hit).                |
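
A client that stores preview responses may want to check them before reuse. The sketch below is one way to read `expiresAt`, `cached`, and `generationTime`; the helper names `isPreviewFresh` and `describePreview` are hypothetical and not part of the API.

```js
// Hypothetical helper: true while a stored preview response can still be shown.
// `expiresAt` is Unix ms, so it compares directly against Date.now().
function isPreviewFresh(response, now = Date.now()) {
  return Boolean(response && response.success) && now < response.expiresAt;
}

// Hypothetical helper: short status text for an editor UI.
function describePreview(response) {
  // generationTime is omitted on cache hits, so only read it on a miss.
  return response.cached
    ? 'served from cache'
    : `rendered in ${response.generationTime} ms`;
}
```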

## Caching

The cache key is `sha256(projectId + canonical(data) + editor_data.updated_at + WxH)`.
`canonical(data)` recursively sorts object keys, so `{ a:1, b:2 }` and
`{ b:2, a:1 }` produce identical cache keys.

- TTL: **1 hour** per entry.
- Capacity: **1 000** entries (LRU eviction).
- Cache invalidates automatically when the master template is saved
  (the `updated_at` column of `render_projects` participates in the key).
- Cache hits return in **< 50 ms**. Cache misses take **800-2000 ms**.
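
The key-sorting step can be sketched on the client as follows. This is an illustration of recursive canonicalization, not the server's implementation: the function names are hypothetical, and the SHA-256 step and the server-only inputs (`updated_at`, `WxH`) are left out because the canonical string alone is enough to compare two payloads locally.

```js
// Hypothetical helper: returns a copy of `value` with object keys sorted
// recursively, so key order no longer affects the serialized form.
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === 'object') {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = canonicalize(value[key]);
    }
    return sorted;
  }
  return value; // primitives pass through unchanged
}

// Serializes `data` in canonical form; a missing `data` is treated as {}.
function canonicalJson(data) {
  return JSON.stringify(canonicalize(data ?? {}));
}

// Both calls return '{"a":1,"b":2}'.
canonicalJson({ a: 1, b: 2 });
canonicalJson({ b: 2, a: 1 });
```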

## Rate limits

| Window  | Limit        |
| ------- | ------------ |
| 1 min   | 30 requests  |
| 1 hour  | 200 requests |

Limits are enforced with a token-bucket algorithm, per Audome account.
Exceeding either limit returns `429`:

```json
{
  "error": "rate_limited",
  "retryAfter": 12,
  "rateLimit": { "remaining": 0, "resetAt": 1746640060000 }
}
```

The response also sets these headers (`Retry-After` is in seconds,
`X-RateLimit-Reset` is Unix ms):

```http
Retry-After: 12
X-RateLimit-Limit-Minute: 30
X-RateLimit-Limit-Hour: 200
X-RateLimit-Reset: 1746640060000
```
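
A client can turn a `429` into a wait time in several ways. The sketch below assumes a preference order of `Retry-After` header, then `retryAfter` in the body, then `rateLimit.resetAt`; the function name `retryDelayMs` and the 1-second fallback are hypothetical choices, not documented behavior.

```js
// Hypothetical helper: milliseconds to wait before retrying a 429 response.
// Pass the header as `res.headers.get('Retry-After')` and the parsed JSON body.
function retryDelayMs(retryAfterHeader, body = {}, now = Date.now()) {
  // Retry-After and body.retryAfter are in seconds.
  const headerSeconds = Number.parseFloat(retryAfterHeader);
  if (Number.isFinite(headerSeconds) && headerSeconds >= 0) {
    return headerSeconds * 1000;
  }
  if (typeof body.retryAfter === 'number') return body.retryAfter * 1000;
  // rateLimit.resetAt is Unix ms, so subtract the current time.
  const resetAt = body.rateLimit?.resetAt;
  if (typeof resetAt === 'number') return Math.max(0, resetAt - now);
  return 1000; // assumption: arbitrary fallback when no hint is present
}
```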

## Concurrency: previews never block paid renders

Previews are processed by a separate Puppeteer queue capped at
`PREVIEW_MAX_CONCURRENCY` (default `1`). The queue runs **next to**
`/api/render/generate` and never holds a render slot, so a flood of
preview requests cannot delay a paying customer's final render.

If the preview queue is saturated, the request waits up to 30 s and then
fails with `503` (`preview_busy`).

## Differences vs /api/render/generate

| Aspect              | `/preview`               | `/generate`                |
| ------------------- | ------------------------- | --------------------------- |
| Credits charged     | 0                         | 1 per call                  |
| Output format       | WebP @ Q70                | PNG / JPEG / WebP           |
| Output storage      | CDN (1 h TTL)             | CDN (permanent)             |
| Watermark           | never                     | never                       |
| Cache               | content-hashed, 1 h       | none                        |
| Latency target      | < 50 ms cached, 1-2 s cold| 1.5 - 4 s                   |
| Webhook supported   | no                        | yes                         |
| Rate limit          | 30/min, 200/hour          | account-wide 120/min        |

## Recommended client behavior

1. **Debounce 500-800 ms** between user input and the preview call.
   Avoid firing a request on every keystroke.
2. **Hash the payload locally** and skip the network call entirely if it
   is identical to the last in-flight request.
3. **Use `AbortController`** to cancel an in-flight request when the user
   keeps typing — the server caches the eventual result, but you do not
   want stale UI.
4. **Show the rate limit**: when `rateLimit.remaining < 5` consider
   pausing previews and falling back to "click to preview".
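
Steps 2 and 4 can be sketched as a small gate placed in front of the network call. The name `createPreviewGate` and its methods are hypothetical. For brevity it compares payloads with plain `JSON.stringify`, which is sensitive to key order; a real client might hash a canonicalized payload instead.

```js
// Hypothetical helper: skips duplicate payloads and pauses when the
// rate-limit budget runs low.
function createPreviewGate(minRemaining = 5) {
  let lastKey = null;
  let paused = false;
  return {
    // True when previews are not paused and the payload differs from the last one sent.
    shouldSend(payload) {
      const key = JSON.stringify(payload);
      if (paused || key === lastKey) return false;
      lastKey = key;
      return true;
    },
    // Call with each response's `rateLimit` object.
    noteRateLimit(rateLimit) {
      paused = rateLimit.remaining < minRemaining;
    },
    isPaused() {
      return paused;
    },
  };
}
```

When `isPaused()` returns `true`, the editor can switch to a "click to preview" button until a later response reports a healthier `remaining` value.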

## Example: live editor wiring

```js
const token = 'YOUR_API_TOKEN'; // placeholder: load this from your own config
const debounceMs = 600;
let timer;
let inflight;

function schedulePreview(state) {
  // Restart the debounce window and cancel any request still in flight.
  clearTimeout(timer);
  if (inflight) inflight.abort();
  timer = setTimeout(async () => {
    const controller = new AbortController();
    inflight = controller;
    try {
      const res = await fetch('https://audome.io/api/render/preview', {
        method: 'POST',
        signal: controller.signal,
        headers: {
          Authorization: `Bearer ${token}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          projectId: state.projectId,
          data: state.data,
        }),
      });
      if (res.status === 429) {
        const { retryAfter } = await res.json();
        console.warn(`Rate limited, retry in ${retryAfter}s`);
        return;
      }
      if (!res.ok) {
        // Other failures (400, 401, 403, 404, 500, 503) carry no previewUrl.
        console.error(`Preview failed with HTTP ${res.status}`);
        return;
      }
      const { previewUrl, rateLimit } = await res.json();
      document.querySelector('#preview').src = previewUrl;
      if (rateLimit.remaining < 5) {
        // pause auto-preview to avoid hitting the limit
      }
    } catch (e) {
      // Aborts are expected while the user keeps typing; log anything else.
      if (e.name !== 'AbortError') console.error(e);
    }
  }, debounceMs);
}
```

## Error codes

| Status | `error` value         | Meaning                                              |
| ------ | --------------------- | ---------------------------------------------------- |
| 400    | `missing_projectId` | Body did not include `projectId`.                  |
| 401    | `unauthorized`      | Bearer token missing or invalid.                     |
| 403    | `forbidden`         | Token does not own `projectId`.                    |
| 404    | `project_not_found` | The template ID does not exist or was deleted.       |
| 429    | `rate_limited`      | Account exceeded preview rate limits.                |
| 500    | `render_failed`     | Internal error. Includes `requestId` for support.  |
| 503    | `preview_busy`      | Preview queue saturated. Retry after `retryAfter`. |
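
One way to act on this table is to reduce each error response to a small summary. The sketch below is hypothetical: the name `describePreviewError` is invented, and treating only `429` and `503` as automatically retryable is an assumption, since the table does not say whether `500` is safe to retry.

```js
// Assumption: only 429 and 503 are worth retrying automatically.
const RETRYABLE_STATUSES = new Set([429, 503]);

// Hypothetical helper: summarizes an error response for the caller.
function describePreviewError(status, body = {}) {
  return {
    code: body.error ?? 'unknown',
    retryable: RETRYABLE_STATUSES.has(status),
    // retryAfter is only read when the body actually carries a number.
    retryAfterSeconds: typeof body.retryAfter === 'number' ? body.retryAfter : null,
    // requestId is documented for render_failed; keep it for support tickets.
    requestId: body.requestId ?? null,
  };
}
```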

## Notes for AI agents wiring this up

- This endpoint is **not a substitute for `/generate`** when you need a
  permanent, full-resolution asset. Preview output is lower-resolution
  WebP @ Q70 with a 1-hour CDN TTL. Use `/generate` once the user
  confirms the design.
- The `cached: true` field lets you skip showing a loading spinner on
  fast paths — treat it as "this came from L1 cache".
- The `generationTime` field is useful telemetry for monitoring the
  render pool's warmth.
