POST /api/render/preview

Realtime, low-latency preview endpoint. Use it for live editors where a human (or an LLM) is iteratively tuning a template and you want sub-second visual feedback before committing to a paid render.

Previews are free: they never consume render credits. They are encoded as WebP at quality 70 and cached for one hour by content hash. The rendered layout matches /generate (no watermark, no overlay); only the encoding quality is reduced.

POST https://audome.io/api/render/preview
Authorization: Bearer YOUR_API_TOKEN
Content-Type: application/json

Request body

Field	Type	Required	Description
`projectId`	string	yes	The template ID returned when you save a project.
`data`	object	no	Dynamic values; same flat `"key.text"`/`"key.img"` shape as `/generate`.
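
As a rough illustration of the flat `data` shape, the sketch below flattens a nested object into `"key.text"`/`"key.img"` entries. The helper name `toFlatData` and the layer names `title` and `logo` are hypothetical; use the keys defined in your own template.

```javascript
// Hypothetical helper: turns { layer: { prop: value } } into { "layer.prop": value }.
function toFlatData(layers) {
  const flat = {};
  for (const [layer, props] of Object.entries(layers)) {
    for (const [prop, value] of Object.entries(props)) {
      flat[`${layer}.${prop}`] = value;
    }
  }
  return flat;
}

// Layer names below are placeholders, not keys the API defines.
const requestBody = {
  projectId: 'YOUR_PROJECT_ID',
  data: toFlatData({
    title: { text: 'Hello' },
    logo: { img: 'https://example.com/logo.png' },
  }),
};

console.log(JSON.stringify(requestBody.data));
// {"title.text":"Hello","logo.img":"https://example.com/logo.png"}
```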

Response

{
  "success": true,
  "previewUrl": "https://audome.io/api/images/renders/<userId>/preview-<hash>.webp",
  "expiresAt": 1746640000000,
  "watermarked": false,
  "lowRes": true,
  "cached": false,
  "rateLimit": {
    "remaining": 29,
    "resetAt": 1746640060000
  },
  "generationTime": 1240
}

Field	Type	Description
`previewUrl`	string	CDN URL to a WebP image. Drop into `<img>`.
`expiresAt`	number	Unix ms when the cached entry expires (1 h after creation).
`watermarked`	boolean	Always `false` — preview output is not watermarked.
`lowRes`	boolean	Reduced quality vs `/generate` (WebP @ Q70 instead of full PNG).
`cached`	boolean	`true` if the result was served from the in-memory cache.
`rateLimit`	object	Calls `remaining` in the current one-minute window, plus `resetAt` (Unix ms) when that window resets.
`generationTime`	number	Server-side render time in ms (omitted on cache hit).
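
A client might use these fields to decide whether a stored preview can still be shown. The sketch below assumes a URL should not be reused once `expiresAt` has passed; `isPreviewFresh` is a hypothetical helper, not part of the API.

```javascript
// Returns true when the response succeeded and its cached entry has not expired yet.
// Assumption: a preview URL should be treated as stale once expiresAt is reached.
function isPreviewFresh(preview, now = Date.now()) {
  return (
    preview.success === true &&
    typeof preview.previewUrl === 'string' &&
    now < preview.expiresAt
  );
}

const sample = {
  success: true,
  previewUrl: 'https://audome.io/api/images/renders/<userId>/preview-<hash>.webp',
  expiresAt: 1746640000000,
};

console.log(isPreviewFresh(sample, 1746639999999)); // true
console.log(isPreviewFresh(sample, 1746640000000)); // false
```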

Caching

The cache key is sha256(projectId + canonical(data) + editor_data.updated_at + WxH). canonical(data) recursively sorts object keys, so { a:1, b:2 } and { b:2, a:1 } produce identical cache keys.

TTL: 1 hour per entry.
Capacity: 1 000 entries (LRU eviction).
Invalidation: automatic when the master template is saved (the updated_at column of render_projects participates in the key).

Cache hits return in < 50 ms. Cache misses take 800-2000 ms.
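
The key canonicalization described above can be sketched as follows. This is an illustration, not the server's code: the `|` separator, the field order, and the names `canonicalize` and `cacheKeyInput` are assumptions, and the final SHA-256 step is left out so the example has no dependencies.

```javascript
// Recursively sort object keys so logically equal payloads serialize identically.
// Arrays keep their order, since element order is meaningful.
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === 'object') {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = canonicalize(value[key]);
    }
    return sorted;
  }
  return value;
}

// Hypothetical helper: builds the string that would then be hashed with SHA-256.
function cacheKeyInput({ projectId, data = {}, updatedAt, width, height }) {
  return [
    projectId,
    JSON.stringify(canonicalize(data)),
    updatedAt,
    `${width}x${height}`,
  ].join('|');
}

const common = { projectId: 'p1', updatedAt: 1746640000000, width: 1200, height: 630 };
const keyA = cacheKeyInput({ ...common, data: { a: 1, b: 2 } });
const keyB = cacheKeyInput({ ...common, data: { b: 2, a: 1 } });

console.log(keyA === keyB); // true
```

Because `updatedAt` is part of the input, saving the template changes every key derived from it, which is how old entries stop being hit without an explicit purge.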

Rate limits

Window	Limit
1 min	30 requests
1 hour	200 requests

Limits are enforced per Audome account using a token-bucket algorithm. When you exceed either limit, the response is 429:

{
  "error": "rate_limited",
  "retryAfter": 12,
  "rateLimit": { "remaining": 0, "resetAt": 1746640060000 }
}

The response also sets these headers:

Retry-After: 12
X-RateLimit-Limit-Minute: 30
X-RateLimit-Limit-Hour: 200
X-RateLimit-Reset: 1746640060000
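
One way a client could turn a 429 response into a wait time is sketched below. The preference order (JSON `retryAfter`, then the `Retry-After` header, then `rateLimit.resetAt`) and the one-second fallback are assumptions made for this example; `retryDelayMs` is a hypothetical helper.

```javascript
// Returns how long to wait, in milliseconds, before retrying after a 429.
// body: parsed JSON body (may be null); retryAfterHeader: raw header value (may be null).
function retryDelayMs(body, retryAfterHeader, now = Date.now()) {
  // 1. retryAfter from the JSON body, in seconds.
  if (Number.isFinite(body?.retryAfter)) return body.retryAfter * 1000;

  // 2. Retry-After header, also in seconds.
  const headerSeconds = Number.parseInt(retryAfterHeader ?? '', 10);
  if (Number.isFinite(headerSeconds)) return headerSeconds * 1000;

  // 3. Time remaining until the window resets (resetAt is Unix ms).
  const resetAt = body?.rateLimit?.resetAt;
  if (Number.isFinite(resetAt)) return Math.max(0, resetAt - now);

  return 1000; // assumed fallback when the response carries no hint
}

const limited = {
  error: 'rate_limited',
  retryAfter: 12,
  rateLimit: { remaining: 0, resetAt: 1746640060000 },
};

console.log(retryDelayMs(limited, '12')); // 12000
console.log(retryDelayMs({ rateLimit: { resetAt: 1746640060000 } }, null, 1746640048000)); // 12000
```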

Concurrency: previews never block paid renders

Previews are processed by a separate Puppeteer queue capped at PREVIEW_MAX_CONCURRENCY (default 1). The queue runs alongside /api/render/generate and never holds a render slot, so a flood of preview requests cannot delay a paying customer's final render.

If the preview queue is saturated, the request waits in the queue for up to 30 s before timing out with 503 (`preview_busy`).

Differences vs /api/render/generate

Aspect	`/preview`	`/generate`
Credits charged	0	1 per call
Output format	WebP @ Q70	PNG / JPEG / WebP
Output storage	CDN (1 h TTL)	CDN (permanent)
Watermark	never	never
Cache	content-hashed, 1 h	none
Latency target	< 50 ms cached, 0.8-2 s cold	1.5-4 s
Webhook supported	no	yes
Rate limit	30/min, 200/hour	account-wide 120/min

Recommended client behavior

Debounce 500-800 ms between user input and the preview call. Avoid firing a request on every keystroke.

Hash the payload locally and skip the network call entirely if it is identical to the last in-flight request.

Use `AbortController` to cancel an in-flight request when the user keeps typing; the server caches the eventual result, but you do not want stale UI.

Show the rate limit: when rateLimit.remaining < 5, consider pausing previews and falling back to "click to preview".
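
The duplicate-skipping and rate-limit advice above could be combined in a small gate like the one below. This is a sketch under assumptions: `PreviewGate` is a hypothetical class, the fingerprint is any stable string for the payload (for example canonical JSON), and pausing until `resetAt` is one possible policy.

```javascript
// Decides whether a preview request should be sent at all.
class PreviewGate {
  constructor(lowWatermark = 5) {
    this.lowWatermark = lowWatermark; // pause when fewer calls than this remain
    this.lastFingerprint = null;      // fingerprint of the last payload we sent
    this.pausedUntil = 0;             // Unix ms; 0 means not paused
  }

  shouldSend(fingerprint, now = Date.now()) {
    if (now < this.pausedUntil) return false;               // auto-preview is paused
    if (fingerprint === this.lastFingerprint) return false; // payload unchanged
    this.lastFingerprint = fingerprint;
    return true;
  }

  // Feed the rateLimit object from each response back into the gate.
  recordRateLimit({ remaining, resetAt }) {
    if (remaining < this.lowWatermark) this.pausedUntil = resetAt;
  }
}

const gate = new PreviewGate();
const fingerprint = JSON.stringify({ projectId: 'p1', data: { 'title.text': 'Hello' } });

console.log(gate.shouldSend(fingerprint, 1000)); // true
console.log(gate.shouldSend(fingerprint, 1001)); // false (same payload)
gate.recordRateLimit({ remaining: 3, resetAt: 5000 });
console.log(gate.shouldSend(fingerprint + '!', 2000)); // false (paused)
console.log(gate.shouldSend(fingerprint + '!', 5000)); // true (window reset)
```

While the gate is paused, the editor can show a "click to preview" button that bypasses the automatic path.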

Example: live editor wiring

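// Assumes `token` holds your API token and an <img id="preview"> element exists.
const token = 'YOUR_API_TOKEN';
const debounceMs = 600;
let timer;
let inflight;

function schedulePreview(state) {
  // Restart the debounce window and cancel any request still in flight.
  clearTimeout(timer);
  if (inflight) inflight.abort();
  timer = setTimeout(async () => {
    const controller = new AbortController();
    inflight = controller;
    try {
      const res = await fetch('https://audome.io/api/render/preview', {
        method: 'POST',
        signal: controller.signal,
        headers: {
          Authorization: `Bearer ${token}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          projectId: state.projectId,
          data: state.data,
        }),
      });
      if (res.status === 429) {
        const { retryAfter } = await res.json();
        console.warn(`Rate limited, retry in ${retryAfter}s`);
        return;
      }
      if (!res.ok) {
        // Any other 4xx/5xx: see the error codes table.
        const { error } = await res.json();
        console.error(`Preview failed (${res.status}): ${error}`);
        return;
      }
      const { previewUrl, rateLimit } = await res.json();
      document.querySelector('#preview').src = previewUrl;
      if (rateLimit.remaining < 5) {
        // pause auto-preview to avoid hitting the limit
      }
    } catch (e) {
      // Aborted requests are expected while the user keeps typing.
      if (e.name !== 'AbortError') console.error(e);
    } finally {
      // Clear the handle only if no newer request has replaced it.
      if (inflight === controller) inflight = null;
    }
  }, debounceMs);
}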

Error codes

Status	`error` value	Meaning
400	`missing_projectId`	Body did not include `projectId`.
401	`unauthorized`	Bearer token missing or invalid.
403	`forbidden`	Token does not own `projectId`.
404	`project_not_found`	The template ID does not exist or was deleted.
429	`rate_limited`	Account exceeded preview rate limits.
503	`preview_busy`	Preview queue saturated. Retry after `retryAfter` seconds.
500	`render_failed`	Internal error. Includes `requestId` for support.
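
A client might map these statuses to a handful of actions, as in the sketch below. The action names and the one-second default are invented for illustration; `classifyPreviewError` is a hypothetical helper, not part of the API.

```javascript
// Maps an error response to a coarse client-side action.
function classifyPreviewError(status, body = {}) {
  switch (status) {
    case 429: // rate_limited
    case 503: // preview_busy
      return {
        action: 'retry',
        // retryAfter is in seconds; fall back to 1 s if it is missing (assumption).
        afterMs: Number.isFinite(body.retryAfter) ? body.retryAfter * 1000 : 1000,
      };
    case 401: // unauthorized
    case 403: // forbidden
      return { action: 'reauthenticate' };
    case 400: // missing_projectId
    case 404: // project_not_found
      return { action: 'fix_request' };
    default:  // 500 render_failed and anything unexpected
      return { action: 'report', requestId: body.requestId };
  }
}

console.log(classifyPreviewError(429, { error: 'rate_limited', retryAfter: 12 }).afterMs); // 12000
console.log(classifyPreviewError(404, { error: 'project_not_found' }).action); // fix_request
```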

Notes for AI agents wiring this up

This endpoint is not a substitute for `/generate` when you need a permanent, full-resolution asset. Preview output is lower-resolution WebP @ Q70 with a 1-hour CDN TTL. Use /generate once the user confirms the design.

The cached: true field lets you skip showing a loading spinner on fast paths; treat it as "this came from L1 cache".

The generationTime field is useful telemetry for monitoring the render pool's warmth.