POST /api/render/preview
Realtime, low-latency preview endpoint. Use it for live editors where a human (or an LLM) is iteratively tuning a template and you want sub-second visual feedback before committing to a paid render.
Previews are free: they never consume render credits. They are encoded as WebP at quality 70 and cached for one hour by content hash. The rendered design matches /generate (no watermark, no overlay); only the encoding is lower quality.
POST https://audome.io/api/render/preview
Authorization: Bearer YOUR_API_TOKEN
Content-Type: application/json
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| projectId | string | yes | The template ID returned when you save a project. |
| data | object | no | Dynamic values; same flat "key.text"/"key.img" shape as /generate. |
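As a rough illustration of the two fields above, the sketch below builds a request body and leaves `data` out when it is not supplied. The helper name `buildPreviewBody` and the layer names `headline` and `avatar` are hypothetical; real keys come from your own template.

```javascript
// Build the JSON body for POST /api/render/preview.
// `data` is optional, so it is omitted entirely when not provided.
function buildPreviewBody(projectId, data) {
  if (typeof projectId !== 'string' || projectId.length === 0) {
    // The server would answer 400 missing_projectId; fail early instead.
    throw new TypeError('projectId is required');
  }
  const body = { projectId };
  if (data !== undefined) body.data = data;
  return JSON.stringify(body);
}

// Hypothetical layer names; use the keys defined in your template.
const exampleBody = buildPreviewBody('YOUR_PROJECT_ID', {
  'headline.text': 'Spring sale',
  'avatar.img': 'https://example.com/avatar.png',
});
```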
Response
{
  "success": true,
  "previewUrl": "https://audome.io/api/images/renders/<userId>/preview-<hash>.webp",
  "expiresAt": 1746640000000,
  "watermarked": false,
  "lowRes": true,
  "cached": false,
  "rateLimit": {
    "remaining": 29,
    "resetAt": 1746640060000
  },
  "generationTime": 1240
}
| Field | Type | Description |
|---|---|---|
| previewUrl | string | CDN URL to a WebP image. Drop into <img>. |
| expiresAt | number | Unix ms when the cached entry expires (1 h after creation). |
| watermarked | boolean | Always false; preview output is not watermarked. |
| lowRes | boolean | Reduced quality vs /generate (WebP @ Q70 instead of full PNG). |
| cached | boolean | true if the result was served from the in-memory cache. |
| rateLimit | object | remaining: calls left in the active minute window. resetAt: Unix ms when that window resets. |
| generationTime | number | Server-side render time in ms (omitted on cache hit). |
Caching
The cache key is sha256(projectId + canonical(data) + editor_data.updated_at + WxH). canonical(data) recursively sorts object keys, so { a:1, b:2 } and { b:2, a:1 } produce identical cache keys.
- TTL: 1 hour per entry.
- Capacity: 1 000 entries (LRU eviction).
- The cache invalidates automatically when the master template is saved (the updated_at column of render_projects participates in the key).
- Cache hits return in < 50 ms. Cache misses take 800-2000 ms.
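The key-sorting step can be sketched as follows. This is an illustration of the idea, not the server's code: serializing the sorted value with `JSON.stringify` is an assumption, and the final sha256 step is left out. The names `canonical` and `canonicalJson` are only used for this example.

```javascript
// Recursively sort object keys so that key order does not affect the output.
function canonical(value) {
  if (Array.isArray(value)) return value.map(canonical); // array order is kept
  if (value !== null && typeof value === 'object') {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = canonical(value[key]);
    }
    return sorted;
  }
  return value; // strings, numbers, booleans, null
}

// Assumed serialization: plain JSON of the key-sorted value.
const canonicalJson = (data) => JSON.stringify(canonical(data));

const sameKey = canonicalJson({ a: 1, b: 2 }) === canonicalJson({ b: 2, a: 1 });
console.log(sameKey); // true
```

A client can use the same normalization to predict when two payloads will map to the same cached preview.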
Rate limits
| Window | Limit |
|---|---|
| 1 min | 30 requests |
| 1 hour | 200 requests |
Limits use a token-bucket algorithm and apply per Audome account. When you exceed either limit, the response is HTTP 429:
{
  "error": "rate_limited",
  "retryAfter": 12,
  "rateLimit": { "remaining": 0, "resetAt": 1746640060000 }
}
The response also sets these headers:
Retry-After: 12
X-RateLimit-Limit-Minute: 30
X-RateLimit-Limit-Hour: 200
X-RateLimit-Reset: 1746640060000
Concurrency: previews never block paid renders
Previews are processed by a separate Puppeteer queue capped at PREVIEW_MAX_CONCURRENCY (default 1). The queue runs next to /api/render/generate and never holds a render slot, so a flood of preview requests cannot delay a paying customer's final render.
If the preview queue is saturated, the request waits in the queue (up to 30 s) before timing out with 503 (preview_busy).
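To make the queueing behavior concrete, here is a minimal sketch of a concurrency-capped queue with a maximum wait time. It is not Audome's implementation; the class name `BoundedQueue`, its methods, and the render queue capacity of 4 are assumptions for illustration. Only the preview cap of 1 and the 30 s wait come from the text above.

```javascript
// Minimal concurrency-limited queue with a maximum wait time.
class BoundedQueue {
  constructor(maxConcurrency = 1, maxWaitMs = 30_000) {
    this.maxConcurrency = maxConcurrency;
    this.maxWaitMs = maxWaitMs;
    this.active = 0;
    this.waiting = []; // FIFO of queued entries
  }

  // Run `task` when a slot is free; reject if it waits longer than maxWaitMs.
  run(task) {
    return new Promise((resolve, reject) => {
      const entry = { timer: null, start: null };
      entry.start = () => {
        clearTimeout(entry.timer);
        this.active += 1;
        Promise.resolve()
          .then(task)
          .then(resolve, reject)
          .finally(() => {
            this.active -= 1;
            this.next();
          });
      };
      if (this.active < this.maxConcurrency) {
        entry.start();
        return;
      }
      entry.timer = setTimeout(() => {
        // Still queued after maxWaitMs: give up, mirroring the 503 case.
        this.waiting = this.waiting.filter((e) => e !== entry);
        const err = new Error('preview_busy');
        err.status = 503;
        reject(err);
      }, this.maxWaitMs);
      this.waiting.push(entry);
    });
  }

  // Start the oldest queued entry, if any.
  next() {
    const entry = this.waiting.shift();
    if (entry) entry.start();
  }
}

// Two independent queues: a busy preview queue never occupies a render slot.
const previewQueue = new BoundedQueue(1, 30_000);
const renderQueue = new BoundedQueue(4, 30_000); // capacity 4 is hypothetical
```

Because the two queues share no state, saturating `previewQueue` leaves `renderQueue` untouched, which is the isolation property described above.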
Differences vs /api/render/generate
| Aspect | /preview | /generate |
|---|---|---|
| Credits charged | 0 | 1 per call |
| Output format | WebP @ Q70 | PNG / JPEG / WebP |
| Output storage | CDN (1 h TTL) | CDN (permanent) |
| Watermark | never | never |
| Cache | content-hashed, 1 h | none |
| Latency target | < 50 ms cached, 1-2 s cold | 1.5 - 4 s |
| Webhook supported | no | yes |
| Rate limit | 30/min, 200/hour | account-wide 120/min |
Recommended client behavior
- Debounce 500-800 ms between user input and the preview call. Avoid firing a request on every keystroke.
- Hash the payload locally and skip the network call entirely if it is identical to the last in-flight request.
- Use `AbortController` to cancel an in-flight request when the user keeps typing. The server caches the eventual result, but you do not want stale UI.
- Show the rate limit: when `rateLimit.remaining < 5`, consider pausing previews and falling back to "click to preview".
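The duplicate-payload and rate-limit recommendations can be combined into one small gate, sketched below. Instead of a hash it compares a key-sorted JSON string, which serves the same purpose for an equality check. `createPreviewGate`, `sortKeys`, and the return values `'send'`, `'skip'`, and `'manual'` are hypothetical names for this example.

```javascript
// Sort object keys recursively so equal payloads produce equal strings.
function sortKeys(value) {
  if (Array.isArray(value)) return value.map(sortKeys);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.keys(value).sort().map((k) => [k, sortKeys(value[k])])
    );
  }
  return value;
}

// Decides whether an automatic preview request should be sent.
function createPreviewGate(lowWatermark = 5) {
  let lastKey = null;
  let remaining = Infinity;

  return {
    // Returns 'send', 'skip' (same payload as last time) or
    // 'manual' (budget is low, fall back to "click to preview").
    check(projectId, data) {
      const key = JSON.stringify([projectId, sortKeys(data)]);
      if (key === lastKey) return 'skip';
      if (remaining < lowWatermark) return 'manual';
      lastKey = key;
      return 'send';
    },
    // Feed in the rateLimit object from each response.
    update(rateLimit) {
      remaining = rateLimit.remaining;
    },
    // Call when a request was aborted or failed, so the payload can be resent.
    forget() {
      lastKey = null;
    },
  };
}
```

Calling `forget()` after an abort matters: otherwise a user who returns to an earlier state would be told to skip a request whose result never reached the UI.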
Example: live editor wiring
const debounceMs = 600;
const token = 'YOUR_API_TOKEN'; // replace with your API token
let timer;
let inflight;

function schedulePreview(state) {
  // Restart the debounce window and cancel any request still in flight.
  clearTimeout(timer);
  if (inflight) inflight.abort();
  timer = setTimeout(async () => {
    const controller = new AbortController();
    inflight = controller;
    try {
      const res = await fetch('https://audome.io/api/render/preview', {
        method: 'POST',
        signal: controller.signal,
        headers: {
          Authorization: `Bearer ${token}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          projectId: state.projectId,
          data: state.data,
        }),
      });
      if (res.status === 429) {
        const { retryAfter } = await res.json();
        console.warn(`Rate limited, retry in ${retryAfter}s`);
        return;
      }
      // Any other non-2xx status has no previewUrl to display.
      if (!res.ok) throw new Error(`Preview failed with HTTP ${res.status}`);
      const { previewUrl, rateLimit } = await res.json();
      document.querySelector('#preview').src = previewUrl;
      if (rateLimit.remaining < 5) {
        // pause auto-preview to avoid hitting the limit
      }
    } catch (e) {
      // Aborts are expected while the user keeps typing; log everything else.
      if (e.name !== 'AbortError') console.error('Preview request failed', e);
    }
  }, debounceMs);
}
Error codes
| Status | error value | Meaning |
|---|---|---|
| 400 | missing_projectId | Body did not include projectId. |
| 401 | unauthorized | Bearer token missing or invalid. |
| 403 | forbidden | Token does not own projectId. |
| 404 | project_not_found | The template ID does not exist or was deleted. |
| 429 | rate_limited | Account exceeded preview rate limits. |
| 503 | preview_busy | Preview queue saturated. Retry after retryAfter. |
| 500 | render_failed | Internal error. Includes requestId for support. |
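One way to act on this table is a helper that decides whether and when to retry. The sketch below treats only 429 and 503 as retryable and reads `retryAfter` (seconds) from the JSON body, falling back to the `Retry-After` header. The function name `retryDelayMs`, the 1 s default when no hint is present, and the choice not to retry 500 are assumptions, not documented behavior.

```javascript
// Returns the wait before retrying, in ms, or null when a retry will not help.
function retryDelayMs(status, body = {}, retryAfterHeader = null) {
  const retryable = status === 429 || status === 503;
  if (!retryable) return null;

  // Prefer the body's retryAfter (seconds), then the Retry-After header value.
  const raw = body?.retryAfter ?? retryAfterHeader;
  const seconds = raw == null || raw === '' ? NaN : Number(raw);
  if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;

  return 1000; // assumed fallback when the server gives no hint
}

console.log(retryDelayMs(429, { error: 'rate_limited', retryAfter: 12 })); // 12000
console.log(retryDelayMs(404, { error: 'project_not_found' })); // null
```

With `fetch`, the third argument would typically be `res.headers.get('Retry-After')`.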
Notes for AI agents wiring this up
- This endpoint is not a substitute for `/generate` when you need a permanent, full-resolution asset. Preview output is lower-resolution WebP @ Q70 with a 1-hour CDN TTL. Use /generate once the user confirms the design.
- The `cached: true` field lets you skip showing a loading spinner on fast paths; treat it as "this came from L1 cache".
- The `generationTime` field is useful telemetry for monitoring the render pool's warmth.