POST /api/render/preview

Realtime, low-latency preview endpoint. Use it for live editors where a human (or an LLM) is iteratively tuning a template and you want sub-second visual feedback before committing to a paid render.

Previews are free: they never consume render credits. They are encoded as WebP at quality 70 and cached for one hour by content hash. The rendered content matches /generate (no watermark, no overlay); only the encoding is lower quality.
```http
POST https://audome.io/api/render/preview
Authorization: Bearer YOUR_API_TOKEN
Content-Type: application/json
```

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| projectId | string | yes | The template ID returned when you save a project. |
| data | object | no | Dynamic values; same flat "key.text"/"key.img" shape as /generate. |
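As a rough illustration of the flat `data` shape, the sketch below turns a nested description of template layers into "key.text"/"key.img" entries. The helper name `buildPreviewBody` and the layer names `title` and `hero` are hypothetical; only the flat key pattern comes from the table above.

```javascript
// Hypothetical helper: flattens { layer: { text, img } } into the flat
// "layer.text" / "layer.img" keys that the request body expects.
function buildPreviewBody(projectId, layers) {
  const data = {};
  for (const [key, value] of Object.entries(layers)) {
    if (value.text !== undefined) data[`${key}.text`] = value.text;
    if (value.img !== undefined) data[`${key}.img`] = value.img;
  }
  return { projectId, data };
}

// Example layer names are made up for illustration.
const body = buildPreviewBody('YOUR_PROJECT_ID', {
  title: { text: 'Summer sale' },
  hero: { img: 'https://example.com/hero.png' },
});
console.log(JSON.stringify(body));
```

The resulting object can be passed to `JSON.stringify` and sent as the request body.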

Response

```json
{
  "success": true,
  "previewUrl": "https://audome.io/api/images/renders/<userId>/preview-<hash>.webp",
  "expiresAt": 1746640000000,
  "watermarked": false,
  "lowRes": true,
  "cached": false,
  "rateLimit": {
    "remaining": 29,
    "resetAt": 1746640060000
  },
  "generationTime": 1240
}
```

| Field | Type | Description |
| --- | --- | --- |
| previewUrl | string | CDN URL to a WebP image. Drop into `<img>`. |
| expiresAt | number | Unix ms when the cached entry expires (1 h after creation). |
| watermarked | boolean | Always false; preview output is not watermarked. |
| lowRes | boolean | Reduced quality vs /generate (WebP @ Q70 instead of full PNG). |
| cached | boolean | true if the result was served from the in-memory cache. |
| rateLimit | object | `remaining` calls in the active minute window, plus `resetAt` in Unix ms. |
| generationTime | number | Server-side render time in ms (omitted on cache hit). |

Caching

The cache key is sha256(projectId + canonical(data) + editor_data.updated_at + WxH). canonical(data) recursively sorts object keys, so { a:1, b:2 } and { b:2, a:1 } produce identical cache keys.
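The canonicalization step can be sketched as follows, assuming Node.js. This is an illustration of the idea, not the server's actual code: the `|` separator, the argument order, and the function names are assumptions, since the exact concatenation format is not documented here.

```javascript
const { createHash } = require('crypto');

// Recursively sort object keys so that key order does not affect the output.
function canonical(value) {
  if (Array.isArray(value)) return value.map(canonical);
  if (value !== null && typeof value === 'object') {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = canonical(value[key]);
    }
    return sorted;
  }
  return value;
}

// Assumption: the parts are joined with '|' before hashing.
function previewCacheKey(projectId, data, updatedAt, width, height) {
  const parts = [
    projectId,
    JSON.stringify(canonical(data ?? {})),
    String(updatedAt),
    `${width}x${height}`,
  ];
  return createHash('sha256').update(parts.join('|')).digest('hex');
}
```

With this sketch, `previewCacheKey('p', { a: 1, b: 2 }, t, w, h)` and `previewCacheKey('p', { b: 2, a: 1 }, t, w, h)` return the same hex digest, while a changed `updatedAt` yields a different one.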

  • TTL: 1 hour per entry.
  • Capacity: 1 000 entries (LRU eviction).
  • Cache invalidates automatically when the master template is saved (the updated_at column of render_projects participates in the key).
  • Cache hits return in < 50 ms. Cache misses take 800-2000 ms.

Rate limits

| Window | Limit |
| --- | --- |
| 1 min | 30 requests |
| 1 hour | 200 requests |

Limits use a token-bucket algorithm and apply per Audome account. When you exceed either limit, the response is 429:

```json
{
  "error": "rate_limited",
  "retryAfter": 12,
  "rateLimit": { "remaining": 0, "resetAt": 1746640060000 }
}
```

The response also sets these headers:

```http
Retry-After: 12
X-RateLimit-Limit-Minute: 30
X-RateLimit-Limit-Hour: 200
X-RateLimit-Reset: 1746640060000
```
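A client can combine the body and the headers to decide how long to wait. The helper below is a minimal sketch: the name `retryDelayMs`, the order of precedence, and the 1 s fallback are assumptions. It also treats `Retry-After` and `retryAfter` as seconds, which matches the example values above but is not stated explicitly.

```javascript
// Returns how long to wait (in ms) before retrying a 429 response.
// `headers` is anything with a get(name) method, such as fetch's Headers.
function retryDelayMs(headers, body = {}, now = Date.now()) {
  // Only the delay-in-seconds form of Retry-After is handled here.
  const headerSeconds = Number(headers.get('retry-after'));
  if (Number.isFinite(headerSeconds) && headerSeconds > 0) {
    return headerSeconds * 1000;
  }
  if (Number.isFinite(body.retryAfter) && body.retryAfter > 0) {
    return body.retryAfter * 1000;
  }
  // Fall back to the window reset timestamp (Unix ms).
  const resetAt = body.rateLimit?.resetAt ?? Number(headers.get('x-ratelimit-reset'));
  if (Number.isFinite(resetAt) && resetAt > now) return resetAt - now;
  return 1000; // assumption: pause 1 s when nothing usable is present
}
```

Typical use is `setTimeout(retry, retryDelayMs(res.headers, await res.json()))` after a 429.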

Concurrency: previews never block paid renders

Previews are processed by a separate Puppeteer queue capped at PREVIEW_MAX_CONCURRENCY (default 1). The queue runs next to /api/render/generate and never holds a render slot, so a flood of preview requests cannot delay a paying customer's final render.

If the preview queue is saturated, the request waits in the queue (up to 30 s) before timing out with 503.
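The queueing behavior described here can be pictured with a small bounded-concurrency queue. This is a hypothetical sketch, not Audome's implementation: the class name `PreviewQueue` and its internals are invented for illustration, and only the concurrency cap, the 30 s wait, and the 503 outcome come from the text above.

```javascript
// Hypothetical sketch of a queue with a concurrency cap and a wait timeout.
class PreviewQueue {
  constructor(maxConcurrency = 1, maxWaitMs = 30_000) {
    this.maxConcurrency = maxConcurrency;
    this.maxWaitMs = maxWaitMs;
    this.active = 0;
    this.waiting = []; // FIFO of { resolve, timer }
  }

  // Runs `job` once a slot is free; rejects with a 503-style error if no
  // slot opens within maxWaitMs.
  async run(job) {
    await this.acquire();
    try {
      return await job();
    } finally {
      this.release();
    }
  }

  acquire() {
    if (this.active < this.maxConcurrency) {
      this.active += 1;
      return Promise.resolve();
    }
    return new Promise((resolve, reject) => {
      const entry = { resolve };
      entry.timer = setTimeout(() => {
        // Timed out while waiting: leave the queue and fail the request.
        this.waiting.splice(this.waiting.indexOf(entry), 1);
        const err = new Error('preview_busy');
        err.status = 503;
        reject(err);
      }, this.maxWaitMs);
      this.waiting.push(entry);
    });
  }

  release() {
    const next = this.waiting.shift();
    if (next) {
      // Hand the slot straight to the next waiter; `active` is unchanged.
      clearTimeout(next.timer);
      next.resolve();
    } else {
      this.active -= 1;
    }
  }
}
```

Because this queue keeps its own `active` counter, work submitted to it never occupies a slot in any other queue, which is the property the section describes.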

Differences vs /api/render/generate

| Aspect | /preview | /generate |
| --- | --- | --- |
| Credits charged | 0 | 1 per call |
| Output format | WebP @ Q70 | PNG / JPEG / WebP |
| Output storage | CDN (1 h TTL) | CDN (permanent) |
| Watermark | never | never |
| Cache | content-hashed, 1 h | none |
| Latency target | < 50 ms cached, 1-2 s cold | 1.5 - 4 s |
| Webhook supported | no | yes |
| Rate limit | 30/min, 200/hour | account-wide 120/min |

Best practices

  1. Debounce 500-800 ms between user input and the preview call. Avoid firing a request on every keystroke.
  2. Hash the payload locally and skip the network call entirely if it is identical to the last in-flight request.
  3. Use `AbortController` to cancel an in-flight request when the user keeps typing. The server caches the eventual result, but you do not want stale UI.
  4. Show the rate limit: when rateLimit.remaining < 5, consider pausing previews and falling back to "click to preview".
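The duplicate-payload check and the rate-limit pause can be sketched as one small gate in front of the preview call. The helper name `createPreviewGate` is hypothetical, and the sketch fingerprints the payload with `JSON.stringify` for brevity, which is sensitive to key order; a real hash of a canonical serialization would catch more duplicates.

```javascript
// Hypothetical helper: decides whether a preview call is worth sending.
function createPreviewGate(minRemaining = 5) {
  let lastFingerprint = null;
  let pausedUntil = 0; // Unix ms; 0 means auto-preview is not paused

  return {
    // True if auto-preview is not paused and the payload differs from the
    // last one that was let through.
    shouldSend(payload, now = Date.now()) {
      if (now < pausedUntil) return false;
      const fingerprint = JSON.stringify(payload);
      if (fingerprint === lastFingerprint) return false;
      lastFingerprint = fingerprint;
      return true;
    },
    // Feed the rateLimit object from each response back in.
    noteRateLimit(rateLimit) {
      pausedUntil = rateLimit.remaining < minRemaining ? rateLimit.resetAt : 0;
    },
  };
}
```

Call `shouldSend({ projectId, data })` before each request and `noteRateLimit(response.rateLimit)` after each response; while the gate is paused, the UI can offer a manual "click to preview" button instead.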

Example: live editor wiring

```js
const debounceMs = 600;
const token = 'YOUR_API_TOKEN'; // placeholder; load this from your own config
let timer;
let inflight;

function schedulePreview(state) {
  clearTimeout(timer);
  // Cancel any request still in flight so a stale result cannot reach the UI.
  if (inflight) inflight.abort();
  timer = setTimeout(async () => {
    inflight = new AbortController();
    try {
      const res = await fetch('https://audome.io/api/render/preview', {
        method: 'POST',
        signal: inflight.signal,
        headers: {
          Authorization: `Bearer ${token}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          projectId: state.projectId,
          data: state.data,
        }),
      });
      if (res.status === 429) {
        const { retryAfter } = await res.json();
        console.warn(`Rate limited, retry in ${retryAfter}s`);
        return;
      }
      if (!res.ok) {
        // Any other error status: see the error codes table.
        console.error(`Preview failed with status ${res.status}`);
        return;
      }
      const { previewUrl, rateLimit } = await res.json();
      document.querySelector('#preview').src = previewUrl;
      if (rateLimit.remaining < 5) {
        // pause auto-preview to avoid hitting the limit
      }
    } catch (e) {
      // Aborts are expected while the user keeps typing; log anything else.
      if (e.name !== 'AbortError') console.error(e);
    }
  }, debounceMs);
}
```

Error codes

| Status | error value | Meaning |
| --- | --- | --- |
| 400 | missing_projectId | Body did not include projectId. |
| 401 | unauthorized | Bearer token missing or invalid. |
| 403 | forbidden | Token does not own projectId. |
| 404 | project_not_found | The template ID does not exist or was deleted. |
| 429 | rate_limited | Account exceeded preview rate limits. |
| 503 | preview_busy | Preview queue saturated. Retry after retryAfter. |
| 500 | render_failed | Internal error. Includes requestId for support. |
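One way a client might act on this table is to separate errors worth retrying from those that need a fix on the caller's side. The sketch below is an assumption-laden illustration: the function name `describePreviewError` is hypothetical, and treating only 429 and 503 as retryable is a choice made here, not something the API specifies.

```javascript
// Assumption: only the two "try again later" statuses are retried.
const RETRYABLE_STATUSES = new Set([429, 503]);

// Summarizes an error response into the fields a caller usually needs.
function describePreviewError(status, body = {}) {
  return {
    code: body.error ?? 'unknown',
    retryable: RETRYABLE_STATUSES.has(status),
    // retryAfter appears on 429 and 503; requestId on 500.
    retryAfter: body.retryAfter ?? null,
    requestId: body.requestId ?? null,
  };
}
```

A caller can then schedule a retry when `retryable` is true and surface `code` (plus `requestId`, when present) to the user or to logs otherwise.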

Notes for AI agents wiring this up

  • This endpoint is not a substitute for `/generate` when you need a permanent, full-resolution asset. Preview output is lower-resolution WebP @ Q70 with a 1-hour CDN TTL. Use /generate once the user confirms the design.
  • The cached: true field lets you skip showing a loading spinner on fast paths; treat it as "this came from L1 cache".
  • The generationTime field is useful telemetry for monitoring the render pool's warmth.