The Browser in the Middle
You need to generate images from data. Social cards, product listings, certificates, promotional graphics. The established approach: write HTML and CSS, render it in a browser, screenshot the result.
HTMLCSSToImage (HCTI) is the hosted version of that idea. Send HTML and CSS to their API, they render it in headless Chrome, you get an image back. No Puppeteer to manage, no Chrome to deploy. It’s a good product that solves a real pain — hosting and scaling headless browsers is nobody’s idea of fun.
But there’s a browser in the middle. And browsers bring baggage.
How HCTI Works
HCTI’s API takes an HTML string and optional CSS. Their servers load it into headless Chrome, wait for fonts and assets to load, render the page, and return a screenshot as PNG, JPEG, or WebP.
This means every image you generate goes through a full browser rendering pipeline — HTML parser, CSS layout engine, font rasterizer, compositor. For simple compositions like a social card with a title, subtitle, and background color, you’re spinning up an engine designed to render Gmail and Google Docs. It works. It’s also more machinery than the task requires.
HCTI imposes a 30-second timeout on renders. If your HTML references external fonts or images that load slowly, the render can time out. The HTML and CSS payload is capped at 50 MB — generous for most use cases, but a hard wall if you’re embedding base64-encoded assets.
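To make the request flow concrete, here is a minimal TypeScript sketch of building an HCTI render request. It assumes HCTI's documented `POST https://hcti.io/v1/image` endpoint with HTTP Basic auth (user ID and API key) and a JSON body containing `html` and `css`; check their current docs before relying on it. `buildHctiRequest` is a hypothetical helper. Its size guard treats the 50 MB cap as 50 × 1024 × 1024 bytes of combined HTML and CSS, which is an assumption about how the limit is measured.

```typescript
// Hypothetical helper: builds (but does not send) an HCTI render request.
// Endpoint and auth scheme are assumptions based on HCTI's public docs.
const HCTI_ENDPOINT = "https://hcti.io/v1/image";
const ASSUMED_MAX_PAYLOAD_BYTES = 50 * 1024 * 1024;

interface HctiRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildHctiRequest(
  html: string,
  css: string,
  userId: string,
  apiKey: string,
  maxPayloadBytes: number = ASSUMED_MAX_PAYLOAD_BYTES,
): HctiRequest {
  // Measure the UTF-8 size up front: base64-embedded assets inflate it quickly.
  const payloadBytes = new TextEncoder().encode(html + css).length;
  if (payloadBytes > maxPayloadBytes) {
    throw new Error(
      `HTML + CSS is ${payloadBytes} bytes, over the ${maxPayloadBytes}-byte limit`,
    );
  }
  return {
    url: HCTI_ENDPOINT,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // HTTP Basic auth: base64("userId:apiKey"). btoa is fine for ASCII credentials.
      Authorization: `Basic ${btoa(`${userId}:${apiKey}`)}`,
    },
    body: JSON.stringify({ html, css }),
  };
}

const request = buildHctiRequest(
  "<div class='card'>Hello</div>",
  ".card { font-size: 48px; padding: 40px; }",
  "your-user-id",
  "your-api-key",
);
// Sending it is one call, e.g. fetch(request.url, request). The JSON response
// is expected to contain a `url` that points at the rendered image.
```

Keeping the builder separate from the network call makes it easy to check the payload size before a render is ever attempted.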
The deeper issue isn’t the limits. It’s the rendering model. Browsers are nondeterministic image generators. Font loading is a race condition. CSS rendering shifts between Chrome versions. A layout that looks correct in Chrome 120 might shift by a pixel in Chrome 125. If you need identical output across time, a browser is the wrong foundation.
The Layer-Based Alternative
The Iteration Layer Image Generation API skips the browser entirely. Instead of writing HTML and hoping Chrome renders it the way you expect, you describe the image as a stack of layers — backgrounds, images, text, shapes, QR codes, barcodes — and the API composes them directly.
No DOM. No CSS cascade. No browser engine. You define where each element goes, what it looks like, and in what order it stacks. The API renders the composition and returns the image.
This is a fundamentally different model. HCTI gives you the full power of CSS — flexbox, grid, media queries, animations (frozen at screenshot time). Iteration Layer gives you explicit positioning and typed layers. Less flexible for arbitrary web layouts. More predictable for structured image generation.
What the Layer Model Looks Like
Here’s a product listing image — product photo with AI background removal, title, price, and an EAN-13 barcode — built with the TypeScript SDK:
import { IterationLayer } from "iterationlayer";

const client = new IterationLayer({ apiKey: "YOUR_API_KEY" });

const result = await client.generate({
  width_in_px: 800,
  height_in_px: 800,
  format: "png",
  // Layers are listed bottom to top.
  layers: [
    // White canvas behind everything else
    { type: "solid-color-background", color: "#ffffff" },
    // Product photo, cropped to the subject with its background removed
    { type: "static-image", url: "https://example.com/product.jpg",
      x_in_px: 50, y_in_px: 50, width_in_px: 700, height_in_px: 500,
      smart_crop: true, remove_background: true },
    // Title; the ** markers render as bold
    { type: "text", text: "**Wireless Headphones Pro**",
      x_in_px: 50, y_in_px: 580, width_in_px: 700, height_in_px: 60,
      font_family: "Inter", font_size_in_px: 28, color: "#1a1a1a" },
    // Price
    { type: "text", text: "$299.00",
      x_in_px: 50, y_in_px: 650, width_in_px: 700, height_in_px: 40,
      font_family: "Inter", font_size_in_px: 22, color: "#666666" },
    // EAN-13 barcode
    { type: "barcode", value: "5901234123457", format: "ean13",
      x_in_px: 50, y_in_px: 720, width_in_px: 200, height_in_px: 50,
      fg_hex_color: "#333333", bg_hex_color: "#ffffff" },
  ],
});
Five layers, one request, one deterministic image. The **Wireless Headphones Pro** renders as bold text — text layers support Markdown formatting for inline bold and italic without needing separate font weight layers.
The equivalent in HCTI would be an HTML template with CSS positioning and a separate barcode generation library (or a barcode image embedded from another service), plus the hope that the fonts load before Chrome takes the screenshot.
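Because the request body is plain data, it can be produced from a product record. The sketch below shows one way to do that. The `Layer` type is inferred from the fields used in the example request rather than taken from the SDK, and `Product`, `isValidEan13`, and `buildProductCard` are hypothetical names. The EAN-13 check is the standard checksum (alternating weights of 1 and 3 over the first twelve digits), run locally so a mistyped code fails before any API call.

```typescript
// Layer shapes inferred from the example request above; not official SDK types.
type Box = { x_in_px: number; y_in_px: number; width_in_px: number; height_in_px: number };

type Layer =
  | { type: "solid-color-background"; color: string }
  | ({ type: "static-image"; url: string; smart_crop?: boolean; remove_background?: boolean } & Box)
  | ({ type: "text"; text: string; font_family: string; font_size_in_px: number; color: string } & Box)
  | ({ type: "barcode"; value: string; format: "ean13"; fg_hex_color: string; bg_hex_color: string } & Box);

interface Product {
  name: string;
  priceLabel: string;
  photoUrl: string;
  ean13: string;
}

// EAN-13 checksum: weight the first 12 digits 1,3,1,3,... and compare the
// derived check digit with the 13th digit.
function isValidEan13(code: string): boolean {
  if (!/^\d{13}$/.test(code)) return false;
  const digits = code.split("").map(Number);
  const sum = digits
    .slice(0, 12)
    .reduce((acc, digit, i) => acc + digit * (i % 2 === 0 ? 1 : 3), 0);
  return (10 - (sum % 10)) % 10 === digits[12];
}

function buildProductCard(product: Product) {
  if (!isValidEan13(product.ean13)) {
    throw new Error(`Invalid EAN-13: ${product.ean13}`);
  }
  const layers: Layer[] = [
    { type: "solid-color-background", color: "#ffffff" },
    { type: "static-image", url: product.photoUrl,
      x_in_px: 50, y_in_px: 50, width_in_px: 700, height_in_px: 500,
      smart_crop: true, remove_background: true },
    // Wrap the name in ** so the text layer renders it bold.
    { type: "text", text: `**${product.name}**`,
      x_in_px: 50, y_in_px: 580, width_in_px: 700, height_in_px: 60,
      font_family: "Inter", font_size_in_px: 28, color: "#1a1a1a" },
    { type: "text", text: product.priceLabel,
      x_in_px: 50, y_in_px: 650, width_in_px: 700, height_in_px: 40,
      font_family: "Inter", font_size_in_px: 22, color: "#666666" },
    { type: "barcode", value: product.ean13, format: "ean13",
      x_in_px: 50, y_in_px: 720, width_in_px: 200, height_in_px: 50,
      fg_hex_color: "#333333", bg_hex_color: "#ffffff" },
  ];
  return { width_in_px: 800, height_in_px: 800, format: "png" as const, layers };
}

const payload = buildProductCard({
  name: "Wireless Headphones Pro",
  priceLabel: "$299.00",
  photoUrl: "https://example.com/product.jpg",
  ean13: "5901234123457",
});
```

The resulting `payload` has the same shape as the object passed to `client.generate` in the example, so generating a catalog becomes a loop over product records.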
Fonts: Hosted vs. Bundled
Fonts in browser-based rendering are a known pain point. HCTI renders whatever Chrome has installed or whatever you load via @font-face. That means you host font files, reference them in your CSS, and the browser fetches them during the render. If the font CDN is slow, your render is slow. If the font fails to load, Chrome falls back to a system font and your image looks wrong.
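A common mitigation in browser-based pipelines is to inline the font into the CSS as a base64 data URI, so the render no longer depends on a font CDN. The sketch below builds such an `@font-face` rule. `inlineFontFace` and `estimateBase64Length` are hypothetical helpers, and the base64 string in the usage line is a truncated placeholder, not a real font. The tradeoff is payload size: base64 grows the file by roughly a third, which counts against any limit on the HTML and CSS you send.

```typescript
// Hypothetical helper: emits an @font-face rule with the font embedded as a
// data URI, so the browser never fetches the font over the network.
function inlineFontFace(family: string, base64Woff2: string, weight: number = 400): string {
  return [
    "@font-face {",
    `  font-family: "${family}";`,
    `  font-weight: ${weight};`,
    `  src: url("data:font/woff2;base64,${base64Woff2}") format("woff2");`,
    "}",
  ].join("\n");
}

// Padded base64 encodes every 3 input bytes as 4 output characters.
function estimateBase64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Placeholder data only; a real call would pass the full encoded font file.
const brandFontCss = inlineFontFace("Brand Sans", "d09GMgABAAAA", 700);
```

By that estimate, a 300 KB font file adds about 400 KB to every request that embeds it.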
Iteration Layer bundles 98 fonts — Inter, Roboto, Open Sans, Noto Sans (JP, KR, SC, TC, Arabic, Telugu, and more), Playfair Display, and dozens of others. You reference them by name. No hosting, no CDN, no race conditions. The font is always available because it ships with the rendering engine.
For CJK text, Arabic, Telugu, and other scripts that require specific font support, this matters. Setting up Noto Sans Japanese in a headless Chrome container means installing the font in the container’s OS font directory and rebuilding the image. With Iteration Layer, you set font_family: "Noto Sans JP" and it works.
You can still provide custom fonts via URL if the bundled set doesn’t cover your brand typeface. But for most use cases, the bundled fonts eliminate a category of problems.
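With fonts referenced by name, choosing a typeface per string is ordinary application code. As a rough sketch, the helper below switches a title layer to "Noto Sans JP" when the text contains Japanese characters and uses "Inter" otherwise. `pickFontFamily` and `titleLayer` are hypothetical, the layer fields mirror the earlier example, and the character test is a deliberately crude heuristic: kanji share code points with Chinese hanzi, so real locale handling would need more than a regex.

```typescript
// Hiragana, Katakana, and CJK Unified Ideographs. Crude by design: ideographs
// are shared with Chinese, so this cannot tell Japanese from Chinese text.
const JAPANESE_CHARS = /[\u3040-\u30ff\u4e00-\u9fff]/;

function pickFontFamily(text: string): "Noto Sans JP" | "Inter" {
  return JAPANESE_CHARS.test(text) ? "Noto Sans JP" : "Inter";
}

// Builds a title text layer using the field names from the earlier example.
function titleLayer(text: string) {
  return {
    type: "text" as const,
    text,
    x_in_px: 50,
    y_in_px: 580,
    width_in_px: 700,
    height_in_px: 60,
    font_family: pickFontFamily(text),
    font_size_in_px: 28,
    color: "#1a1a1a",
  };
}

const japaneseTitle = titleLayer("ワイヤレスヘッドホン");
const englishTitle = titleLayer("Wireless Headphones Pro");
```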
Built-In Features HCTI Can’t Do Natively
HCTI renders HTML. If HTML can’t do it, HCTI can’t do it — unless you generate the asset elsewhere and embed it.
QR codes and barcodes. Iteration Layer has dedicated layer types for QR codes and six barcode formats (EAN-13, EAN-8, Codabar, Code 128, Code 39, ITF). One layer definition, no external library. With HCTI, you’d generate the barcode with a JavaScript library, render it to an inline SVG or canvas in your HTML, and hope it renders cleanly in Chrome.
AI-powered image processing. Image layers support smart_crop (object-detection-based cropping) and remove_background (AI segmentation) as flags on the layer itself. Upload a product photo with a cluttered background, and the API removes it during composition. With HCTI, you’d need to pre-process the image through a background removal service before embedding it in your HTML.
Auto-scaling text. Text layers scale the font size down automatically to fit within the defined bounding box. If a product name is too long for the allocated space, the text shrinks to fit. In HTML/CSS, you’d need JavaScript to measure text and adjust font size iteratively — which adds complexity and increases render time.
Markdown in text. Text layers parse **bold** and *italic* inline, so a single text block can mix formatting without being split into multiple elements.
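To show what the manual version of auto-scaling involves, here is a minimal sketch of the measure-and-shrink loop you would otherwise write yourself. It is not Iteration Layer's algorithm, which this article does not describe. `fitFontSize` is a hypothetical helper that handles a single line only, and the measuring function is injected: in a browser it would typically wrap canvas `measureText`, while the stand-in here just assumes each character is half the font size wide.

```typescript
type MeasureWidth = (text: string, fontSizePx: number) => number;

// Steps the font size down one pixel at a time until the text fits the box
// width, and never goes below minSizePx.
function fitFontSize(
  text: string,
  maxWidthPx: number,
  startSizePx: number,
  minSizePx: number,
  measureWidth: MeasureWidth,
): number {
  for (let size = startSizePx; size > minSizePx; size -= 1) {
    if (measureWidth(text, size) <= maxWidthPx) return size;
  }
  return minSizePx;
}

// Stand-in measurement for illustration: every glyph is 0.5 × font size wide.
const roughMeasure: MeasureWidth = (text, fontSizePx) => text.length * fontSizePx * 0.5;

const fittedSize = fitFontSize("Wireless Headphones Pro Max Limited Edition", 700, 48, 12, roughMeasure);
```

Each iteration costs a layout measurement, which is where the extra render time in a browser-based pipeline comes from.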
Determinism
This is the core technical difference. Browser rendering is inherently nondeterministic. Chrome’s text shaping engine, subpixel rendering, anti-aliasing, and layout calculations can produce slightly different output depending on the Chrome version, OS, GPU, and display scaling factor.
For many use cases, “close enough” is fine. But if you’re generating images that need to be pixel-identical across runs — compliance documents, product images with strict brand guidelines, automated visual regression testing — browser rendering introduces variables you can’t fully control.
Iteration Layer’s rendering pipeline has no browser. Same input, same output, every time. The fonts are bundled, the composition is deterministic, and there are no external variables that shift the result.
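One way to check a determinism claim for any renderer is to render the same input twice and compare the results byte for byte. The helper below is a small sketch of that comparison; `firstDifference` is a hypothetical name. Byte equality is a stricter test than pixel equality, since an encoder could in principle embed metadata such as timestamps, so a mismatch here is a prompt to compare decoded pixels rather than proof that the image changed.

```typescript
// Returns the index of the first differing byte, or -1 when the two buffers
// are identical in both length and content.
function firstDifference(a: Uint8Array, b: Uint8Array): number {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    if (a[i] !== b[i]) return i;
  }
  return a.length === b.length ? -1 : shared;
}

// Usage idea: fetch or generate the same image twice, then
//   const diffAt = firstDifference(firstRenderBytes, secondRenderBytes);
//   if (diffAt !== -1) console.warn(`Renders diverge at byte ${diffAt}`);
```

Storing a hash of a known-good render and comparing against it over time extends the same idea to drift across renderer versions.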
Side-by-Side
| Capability | HTMLCSSToImage | Iteration Layer |
|---|---|---|
| Rendering engine | Headless Chrome | Native layer compositor |
| Input format | HTML + CSS | JSON layer stack |
| Layout model | CSS (flexbox, grid, etc.) | Explicit positioning |
| Bundled fonts | System fonts only | 98 bundled fonts |
| Custom fonts | Via @font-face (hosted) | Via URL or bundled |
| QR codes | Requires JS library in HTML | Built-in layer type |
| Barcodes | Requires JS library in HTML | 6 formats built-in |
| Background removal | Not available | Built-in on image layers |
| Smart crop | Not available | Built-in on image layers |
| Auto-scaling text | Requires JS measurement | Built-in |
| Rich text | Full HTML/CSS | Markdown (bold, italic) |
| Render timeout | 30 seconds | No browser timeout |
| Max payload | 50 MB | No HTML payload limit |
| Deterministic output | No (browser-dependent) | Yes |
| Data residency | US-hosted | EU-hosted (Frankfurt) |
When HCTI Makes Sense
HCTI is the right choice when your templates are genuinely complex HTML. If you have existing CSS-heavy designs — layouts with flexbox nesting, CSS grid, custom animations frozen at a frame, or rich HTML content like tables with dynamic row counts — HCTI lets you render them without rethinking the layout as layers.
It’s also a good fit if your team thinks in HTML/CSS and the templates are already built. Rewriting a complex HTML template as a layer stack has a real cost, and HCTI lets you ship without that migration.
The tradeoff is clear: HCTI gives you the full browser rendering engine with all its power and all its unpredictability. Iteration Layer gives you a constrained but deterministic model that covers the use cases where most programmatic images live — positioned compositions of text, images, shapes, and data-driven elements.
When to Use Iteration Layer
If your images are structured compositions — social cards, product listings, certificates, personalized email banners, event graphics — the layer model is simpler and more predictable. You don’t need flexbox to position a title at y: 180. You don’t need @font-face when the font is bundled. You don’t need a separate barcode library when the API generates them natively.
The layer model also scales without browser overhead. Each image is a single API call with a JSON body. No Chrome instances to manage, no rendering timeouts to work around, no font loading to debug. Your infrastructure stays the same whether you generate 10 images or 10,000.
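When every image is one independent request, a batch is a matter of mapping over payloads with a cap on how many calls are in flight. The helper below is a generic sketch of that pattern and is not part of any SDK. `mapWithConcurrency` is a hypothetical name, and the right limit depends on rate limits that this article does not state.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the order of the input array.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the list is exhausted.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}

// Usage idea, with `payloads` being an array of request bodies:
//   const images = await mapWithConcurrency(payloads, 8, (p) => client.generate(p));
```

If any single call rejects, the whole batch rejects, so a production version would likely wrap the worker with retry or per-item error capture.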
Get Started
Check the docs for the full layer reference, font catalog, and SDK guides. The TypeScript and Python SDKs handle authentication and response parsing — your integration is a few lines of code.
Image Generation is one API in a composable suite — chain it with Document Extraction to turn parsed data into visual assets, or with Image Transformation to post-process the output. Same auth, same credit pool.
Sign up for a free account at iterationlayer.com/image-generation — no credit card required.