The Complete Guide to Programmatic Image Generation

Why Generate Images Programmatically

Every application that displays content eventually needs to generate images. Open Graph cards for social sharing. Certificates for course completions. Product listing images for e-commerce. Weekly report summaries for Slack. Personalized email banners.

These images share a common pattern: a fixed layout with dynamic data. The template stays the same — background, text positions, logo placement. The data changes — page title, recipient name, product photo, metric values.

Manual creation doesn’t scale. A designer can make 20 OG images by hand. They can’t make 10,000 product listing images that update every time a price changes. Programmatic generation turns image creation into a function call: pass data in, get an image out.

This guide covers the three main approaches to programmatic image generation, their tradeoffs, and how to build production-ready image generation pipelines.

Approach 1: Canvas APIs

The lowest-level approach. Libraries like node-canvas (Node.js), Pillow (Python), and the browser’s native Canvas API let you draw directly onto a pixel buffer.

import { createCanvas } from "canvas";

const canvas = createCanvas(1200, 630);
const ctx = canvas.getContext("2d");

// Background
ctx.fillStyle = "#1a1a2e";
ctx.fillRect(0, 0, 1200, 630);

// Accent line
ctx.fillStyle = "#e94560";
ctx.fillRect(0, 0, 1200, 4);

// Title text
ctx.font = "bold 48px Inter";
ctx.fillStyle = "#ffffff";
ctx.fillText("Your Blog Post Title", 60, 300);

// Author text
ctx.font = "24px Inter";
ctx.fillStyle = "#a0a0b0";
ctx.fillText("by Author Name", 60, 360);

const buffer = canvas.toBuffer("image/png");

Advantages:

  • Full pixel-level control
  • No external services. The only dependency is the canvas library, although node-canvas itself relies on native bindings (Cairo).
  • Fast rendering for simple compositions

Problems:

  • No text wrapping. You measure text width manually and break lines yourself.
  • No automatic font variant switching. Bold and italic require registering separate font files and resetting ctx.font between draw calls.
  • Font rendering quality varies by platform. What looks crisp on macOS may look different on Linux.
  • Positioning is entirely manual. Every element needs explicit x/y coordinates calculated from scratch.
  • No smart cropping or image analysis. You load an image and draw it at coordinates — if the subject is off-center, too bad.

Canvas works for simple compositions — a colored rectangle with some text. It becomes painful quickly when you need text wrapping, multiple font weights, or dynamic image placement.
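To make the text wrapping cost concrete, here is a minimal sketch of the greedy word wrap you end up writing by hand. The function name wrapText and the measure callback are illustrative, not part of any canvas library; with node-canvas you would pass (s) => ctx.measureText(s).width as the callback.

```typescript
// Greedy word wrap: a sketch of the manual line breaking canvas requires.
// `measure` returns the rendered width of a string in pixels.
type MeasureFn = (text: string) => number;

function wrapText(text: string, maxWidth: number, measure: MeasureFn): string[] {
  const words = text.split(/\s+/).filter((word) => word.length > 0);
  const lines: string[] = [];
  let current = "";

  for (const word of words) {
    const candidate = current === "" ? word : `${current} ${word}`;
    if (current !== "" && measure(candidate) > maxWidth) {
      // The word does not fit: close the current line and start a new one.
      lines.push(current);
      current = word;
    } else {
      // Fits, or it is the first word on the line (kept even if too wide).
      current = candidate;
    }
  }

  if (current !== "") {
    lines.push(current);
  }
  return lines;
}

// Usage with a stand-in measure function (10px per character).
const wrappedTitle = wrapText("Your Blog Post Title", 100, (s) => s.length * 10);
```

Each returned line still has to be drawn separately, for example with ctx.fillText(line, x, y + i * lineHeight), and the sketch does not break a single word that is wider than the container.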

Approach 2: Headless Browsers (Puppeteer/Playwright)

The most popular approach for teams already working with web technologies. Render HTML/CSS as a page, take a screenshot.

import puppeteer from "puppeteer";

const browser = await puppeteer.launch();
const page = await browser.newPage();

await page.setViewport({ width: 1200, height: 630 });
await page.setContent(`
  <div style="
    width: 1200px;
    height: 630px;
    background: #1a1a2e;
    display: flex;
    flex-direction: column;
    justify-content: center;
    padding: 60px;
    font-family: Inter, sans-serif;
    border-top: 4px solid #e94560;
  ">
    <h1 style="color: #fff; font-size: 48px; margin: 0;">
      Your Blog Post Title
    </h1>
    <p style="color: #a0a0b0; font-size: 24px; margin-top: 20px;">
      by Author Name
    </p>
  </div>
`);

const screenshot = await page.screenshot({ type: "png" });
await browser.close();

Advantages:

  • Use HTML/CSS — familiar to every web developer
  • Flexbox and Grid handle layout automatically
  • Text wrapping, font weights, and rich formatting work natively
  • CSS custom fonts via @font-face

Problems:

  • Heavy infrastructure. Puppeteer needs Chromium — a 200MB+ download. Docker images balloon to over 1GB. ARM builds require special Chromium builds.
  • Slow. Browser launch takes 1-3 seconds. Page rendering adds more. For batch generation of thousands of images, the overhead adds up.
  • Memory-hungry. Each browser instance uses 50-200MB of RAM. Concurrent generation requires either pooling (complex) or sequential processing (slow).
  • CSS rendering inconsistencies. Fonts render differently across Chromium versions. Custom fonts sometimes fail to load before the screenshot. Color profiles vary.
  • Zombie processes. Browser instances that don’t close properly accumulate. In production, you need cleanup logic, timeouts, and process monitoring.
  • No image intelligence. HTML renders images statically. If a product photo has the subject off-center, the CSS crop won’t detect and adapt.

Puppeteer is the default choice because everyone knows HTML. But it’s essentially running a full browser engine to generate a single image. The ratio of infrastructure complexity to output is high.
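The cleanup logic and timeouts mentioned above usually come down to a try/finally around every render. The sketch below shows that shape with a structural Closable interface standing in for the browser, so it runs without Chromium. The names renderWithCleanup and withTimeout are hypothetical helpers, not Puppeteer APIs.

```typescript
// Structural stand-in for a Puppeteer/Playwright browser: anything closable.
interface Closable {
  close(): Promise<void>;
}

// Rejects if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Launches a browser, runs `render`, and always closes the browser,
// even when rendering throws or times out.
async function renderWithCleanup<B extends Closable, T>(
  launch: () => Promise<B>,
  render: (browser: B) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const browser = await launch();
  try {
    return await withTimeout(render(browser), timeoutMs);
  } finally {
    await browser.close();
  }
}
```

With Puppeteer, launch would be () => puppeteer.launch() and render would open a page and call page.screenshot(). A production setup would likely add process monitoring on top, since close() itself can hang.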

Approach 3: Layer-Based APIs

The API approach skips both manual pixel drawing and browser rendering. You describe the image as a stack of layers — backgrounds, rectangles, text, images — and the API composes them into a final image.

The Image Generation API works this way:

import { IterationLayer } from "iterationlayer";

const client = new IterationLayer({ apiKey: "YOUR_API_KEY" });

const { data: { buffer: imageBase64 } } = await client.generateImage({
  dimensions: { width: 1200, height: 630 },
  output_format: "png",
  fonts: [
    {
      name: "Inter",
      weight: "Bold",
      style: "normal",
      file: { type: "url", name: "Inter-Bold.ttf", url: "https://your-cdn.com/fonts/Inter-Bold.ttf" },
    },
    {
      name: "Inter",
      weight: "Regular",
      style: "normal",
      file: { type: "url", name: "Inter-Regular.ttf", url: "https://your-cdn.com/fonts/Inter-Regular.ttf" },
    },
  ],
  layers: [
    {
      type: "solid-color-background",
      index: 0,
      hex_color: "#1a1a2e",
      opacity: 100,
    },
    {
      type: "rectangle",
      index: 1,
      hex_color: "#e94560",
      position: { x: 0, y: 0 },
      dimensions: { width: 1200, height: 4 },
      opacity: 100,
    },
    {
      type: "text",
      index: 2,
      text: "Your Blog Post Title",
      font_name: "Inter",
      font_weight: "Bold",
      font_size_in_px: 48,
      text_color: "#ffffff",
      text_align: "left",
      position: { x: 60, y: 200 },
      dimensions: { width: 1080, height: 200 },
      is_splitting_lines: true,
      opacity: 100,
    },
    {
      type: "text",
      index: 3,
      text: "by Author Name",
      font_name: "Inter",
      font_weight: "Regular",
      font_size_in_px: 24,
      text_color: "#a0a0b0",
      text_align: "left",
      position: { x: 60, y: 430 },
      dimensions: { width: 1080, height: 40 },
      opacity: 100,
    },
  ],
});

const imageBuffer = Buffer.from(imageBase64, "base64");

Advantages:

  • No server infrastructure. No browser, no canvas library, no native dependencies.
  • Deterministic rendering. The same input always produces the same output, regardless of platform.
  • Text wrapping built in. The is_splitting_lines property handles line breaking within the container dimensions.
  • Markdown bold and italic. Write **bold** or *italic* in the text field — the API switches font variants automatically.
  • Smart cropping on image layers. The should_use_smart_cropping flag uses AI to detect subjects and crop intelligently.
  • Custom fonts. Upload any TTF, OTF, WOFF, or WOFF2 file.
  • Scales automatically. No server capacity to manage.

Tradeoffs:

  • Less layout flexibility than CSS. No flexbox, no grid, no auto-layout. You position elements with explicit coordinates.
  • Requires thinking in layers rather than in a document flow model.
  • Network dependency. Every image requires an API call.

Comparing the Three Approaches

| Criterion | Canvas | Puppeteer | Layer-based API |
| --- | --- | --- | --- |
| Infrastructure | Canvas library | Chromium (200MB+) | None (API call) |
| Text wrapping | Manual | Automatic (CSS) | Automatic (is_splitting_lines) |
| Rich text (bold/italic) | Manual font switching | HTML/CSS | Markdown syntax |
| Custom fonts | Platform-dependent | @font-face (sometimes fails) | Upload TTF/OTF/WOFF/WOFF2 |
| Smart image cropping | Not available | Not available | AI-powered (should_use_smart_cropping) |
| Rendering speed | Fast | Slow (1-3s startup) | Fast (API response time) |
| Memory per image | Low | 50-200MB (browser) | None (server-side) |
| Determinism | Platform-dependent | Chromium version-dependent | Deterministic |
| Deployment | Native bindings | Chromium in Docker | HTTP client |

For simple colored rectangles with text, canvas works. For complex layouts where you already have HTML templates, Puppeteer works if you can handle the infrastructure. For everything else — especially at scale — the API approach is simpler.

Layer Types

The API provides layer types that compose into any layout:

solid-color-background — fills the entire canvas with a single color. The foundation of most compositions.

rectangle — a positioned, colored rectangle. Use for backgrounds, badges, dividers, borders, and decorative elements. Supports rotation and angled edges for diagonal designs.

text — rendered text with full typography control. Properties include font name, weight, style, size, color, alignment (horizontal and vertical), auto-wrapping, and paragraph spacing. Supports markdown **bold** and *italic* for inline formatting.

static-image — a positioned image from a URL or base64 input. Placed at specific coordinates with specific dimensions. The should_use_smart_cropping flag uses AI object detection to center the image on its subject before fitting it into the container.

image-overlay — a full-canvas image overlay. Used for textures, gradients, watermarks, and visual effects. Applied at a specified opacity.

Template Design Patterns

The Reusable Template Function

The most common pattern: a function that takes dynamic data and returns the API request body.

const buildOgImageRequest = (data: {
  title: string;
  authorName: string;
  authorPhotoUrl?: string;
}) => ({
  dimensions: { width: 1200, height: 630 },
  output_format: "png" as const,
  fonts: [
    {
      name: "Inter",
      weight: "Bold",
      style: "normal",
      file: { type: "url", name: "Inter-Bold.ttf", url: "https://your-cdn.com/fonts/Inter-Bold.ttf" },
    },
    {
      name: "Inter",
      weight: "Regular",
      style: "normal",
      file: { type: "url", name: "Inter-Regular.ttf", url: "https://your-cdn.com/fonts/Inter-Regular.ttf" },
    },
  ],
  layers: [
    { type: "solid-color-background", index: 0, hex_color: "#1a1a2e", opacity: 100 },
    { type: "rectangle", index: 1, hex_color: "#e94560", position: { x: 0, y: 0 }, dimensions: { width: 1200, height: 4 }, opacity: 100 },
    {
      type: "text",
      index: 2,
      text: data.title,
      font_name: "Inter",
      font_weight: "Bold",
      font_size_in_px: 48,
      text_color: "#ffffff",
      text_align: "left",
      is_splitting_lines: true,
      position: { x: 60, y: 180 },
      dimensions: { width: 800, height: 250 },
      opacity: 100,
    },
    {
      type: "text",
      index: 3,
      text: `by ${data.authorName}`,
      font_name: "Inter",
      font_weight: "Regular",
      font_size_in_px: 22,
      text_color: "#a0a0b0",
      text_align: "left",
      position: { x: 60, y: 460 },
      dimensions: { width: 800, height: 35 },
      opacity: 100,
    },
    ...(data.authorPhotoUrl
      ? [{
          type: "static-image" as const,
          index: 4,
          file: { type: "url" as const, name: "author.jpg", url: data.authorPhotoUrl },
          position: { x: 950, y: 250 },
          dimensions: { width: 180, height: 180 },
          should_use_smart_cropping: true,
          opacity: 100,
        }]
      : []),
  ],
});

The function handles conditional layers (author photo only if provided) and returns a plain object ready for JSON.stringify. Call it for every page, every blog post, every product — swap the data, get a unique image.

Multi-Platform Templates

Different platforms need different dimensions. Build a template per platform, or build one template that adapts:

const PLATFORM_CONFIGS_BY_PLATFORM: Record<string, { width: number; height: number; titleSize: number }> = {
  og: { width: 1200, height: 630, titleSize: 48 },
  twitter: { width: 1200, height: 675, titleSize: 48 },
  instagram: { width: 1080, height: 1080, titleSize: 56 },
  linkedin: { width: 1200, height: 627, titleSize: 44 },
};

For each platform, adjust the canvas dimensions and font sizes. The layout structure stays the same.
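As a rough sketch of how such a config might feed a template, the function below derives the canvas size, title size, and text box from the platform entry. The 60px margin and the 40% title box height are assumptions chosen for illustration, and the fonts array is left out for brevity.

```typescript
type PlatformConfig = { width: number; height: number; titleSize: number };

// Same values as the config above, repeated so the sketch is self-contained.
const PLATFORM_CONFIGS: Record<string, PlatformConfig> = {
  og: { width: 1200, height: 630, titleSize: 48 },
  twitter: { width: 1200, height: 675, titleSize: 48 },
  instagram: { width: 1080, height: 1080, titleSize: 56 },
  linkedin: { width: 1200, height: 627, titleSize: 44 },
};

const MARGIN_IN_PX = 60; // assumed margin, not an API default

function buildTitleCardRequest(platform: string, title: string) {
  const config: PlatformConfig | undefined = PLATFORM_CONFIGS[platform];
  if (config === undefined) {
    throw new Error(`Unknown platform: ${platform}`);
  }

  // Assumption: the title box takes 40% of the canvas height, vertically centered.
  const titleBoxHeight = Math.round(config.height * 0.4);

  return {
    dimensions: { width: config.width, height: config.height },
    output_format: "png" as const,
    layers: [
      { type: "solid-color-background" as const, index: 0, hex_color: "#1a1a2e", opacity: 100 },
      {
        type: "text" as const,
        index: 1,
        text: title,
        font_name: "Inter",
        font_weight: "Bold",
        font_size_in_px: config.titleSize,
        text_color: "#ffffff",
        text_align: "left",
        is_splitting_lines: true,
        position: { x: MARGIN_IN_PX, y: Math.round((config.height - titleBoxHeight) / 2) },
        dimensions: { width: config.width - 2 * MARGIN_IN_PX, height: titleBoxHeight },
        opacity: 100,
      },
    ],
  };
}
```

Deriving positions from the canvas size keeps one template usable across all four platforms. Layouts that differ structurally (for example, a photo beside the title on wide formats only) are probably easier to maintain as separate templates.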

The Conditional Layer Pattern

Many templates need layers that appear only when certain data exists. An author photo that’s optional. A “SALE” badge that only shows when a discount is active. A category tag that varies by content type.

Handle this by conditionally spreading layers into the array:

const buildProductCardRequest = (product: {
  name: string;
  price: string;
  imageUrl: string;
  isOnSale: boolean;
  discountLabel?: string;
}) => ({
  dimensions: { width: 1080, height: 1080 },
  output_format: "png" as const,
  fonts: [
    { name: "Inter", weight: "Bold", style: "normal", file: { type: "url", name: "Inter-Bold.ttf", url: "..." } },
    { name: "Inter", weight: "Regular", style: "normal", file: { type: "url", name: "Inter-Regular.ttf", url: "..." } },
  ],
  layers: [
    { type: "solid-color-background", index: 0, hex_color: "#ffffff", opacity: 100 },
    {
      type: "static-image",
      index: 1,
      file: { type: "url", name: "product.jpg", url: product.imageUrl },
      position: { x: 40, y: 40 },
      dimensions: { width: 1000, height: 700 },
      should_use_smart_cropping: true,
      opacity: 100,
    },
    {
      type: "text",
      index: 2,
      text: product.name,
      font_name: "Inter",
      font_weight: "Bold",
      font_size_in_px: 36,
      text_color: "#111111",
      text_align: "left",
      position: { x: 40, y: 770 },
      dimensions: { width: 800, height: 80 },
      is_splitting_lines: true,
      opacity: 100,
    },
    {
      type: "text",
      index: 3,
      text: product.price,
      font_name: "Inter",
      font_weight: "Regular",
      font_size_in_px: 28,
      text_color: "#666666",
      text_align: "left",
      position: { x: 40, y: 870 },
      dimensions: { width: 400, height: 50 },
      opacity: 100,
    },
    ...(product.isOnSale && product.discountLabel
      ? [
          {
            type: "rectangle" as const,
            index: 4,
            hex_color: "#e94560",
            position: { x: 830, y: 860 },
            dimensions: { width: 210, height: 50 },
            opacity: 100,
          },
          {
            type: "text" as const,
            index: 5,
            text: product.discountLabel,
            font_name: "Inter",
            font_weight: "Bold",
            font_size_in_px: 22,
            text_color: "#ffffff",
            text_align: "center",
            position: { x: 830, y: 865 },
            dimensions: { width: 210, height: 40 },
            opacity: 100,
          },
        ]
      : []),
  ],
});

The sale badge — a red rectangle with white text — only appears when both isOnSale is true and a discountLabel exists. The rest of the layers render regardless. This pattern keeps your template functions declarative rather than branching through if-else trees.
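Hardcoded index values work while there is a single optional group. With several optional groups, one option is to leave the index out of each layer and assign it from array order afterwards. The helper below is a sketch of that idea. It assumes index only expresses stacking order, which the examples above suggest but do not confirm, and withSequentialIndexes is a hypothetical name.

```typescript
// Drops falsy entries and assigns `index` from the remaining array order.
function withSequentialIndexes<T extends { type: string }>(
  layers: ReadonlyArray<T | null | undefined | false>,
): Array<T & { index: number }> {
  return layers
    .filter((layer): layer is T => Boolean(layer))
    .map((layer, position) => ({ ...layer, index: position }));
}

// Usage: optional layers are written inline with `condition && layer`.
function buildBadgeLayers(isOnSale: boolean) {
  return withSequentialIndexes([
    { type: "solid-color-background", hex_color: "#ffffff", opacity: 100 },
    isOnSale && { type: "rectangle", hex_color: "#e94560", opacity: 100 },
    { type: "text", text: "Product name", opacity: 100 },
  ]);
}
```

The layers stay declarative, and adding or removing an optional layer never requires renumbering the ones after it.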

Batch Generation

When generating images for a catalog, feed, or dataset:

const items = await database.getAllProducts();

const BATCH_SIZE = 10;
const results = [];

for (let i = 0; i < items.length; i += BATCH_SIZE) {
  const batch = items.slice(i, i + BATCH_SIZE);
  const batchResults = await Promise.all(
    batch.map(async (item) => {
      // Template function from the conditional layer pattern above
      const requestBody = buildProductCardRequest(item);
      const { data: { buffer } } = await client.generateImage(requestBody);

      return { id: item.id, buffer: Buffer.from(buffer, "base64") };
    })
  );
  results.push(...batchResults);
}

Batch size controls concurrency. Process 10-20 images in parallel per batch to balance throughput and resource usage.

For error handling in batch pipelines, wrap each individual generation in a try-catch so a single failure doesn’t halt the entire run:

const batchResults = await Promise.all(
  batch.map(async (item) => {
    try {
      // Template function from the conditional layer pattern above
      const requestBody = buildProductCardRequest(item);
      const { data: { buffer } } = await client.generateImage(requestBody);

      return { id: item.id, status: "success" as const, buffer: Buffer.from(buffer, "base64") };
    } catch (error) {
      return { id: item.id, status: "failed" as const, error: String(error) };
    }
  })
);

After the batch completes, filter by status to separate successes from failures. Retry failed items in a subsequent batch, or log them for manual review.
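One way to sketch the filter-and-retry step: run every pending item, split the results by status, and feed only the failed ids into the next attempt. The generate callback stands in for the API call, and generateWithRetries is a hypothetical helper. The sketch retries immediately, so a real pipeline would probably add a delay between attempts and keep the BATCH_SIZE loop around it.

```typescript
type GenerationResult<T> =
  | { id: string; status: "success"; value: T }
  | { id: string; status: "failed"; error: string };

// Runs `generate` for every id, then retries only the failed ids,
// up to `maxAttempts` attempts per id in total.
async function generateWithRetries<T>(
  ids: string[],
  generate: (id: string) => Promise<T>,
  maxAttempts: number,
): Promise<GenerationResult<T>[]> {
  const succeeded: GenerationResult<T>[] = [];
  let lastFailures: GenerationResult<T>[] = [];
  let pending = ids;

  for (let attempt = 1; attempt <= maxAttempts && pending.length > 0; attempt += 1) {
    const results = await Promise.all(
      pending.map(async (id): Promise<GenerationResult<T>> => {
        try {
          return { id, status: "success", value: await generate(id) };
        } catch (error) {
          return { id, status: "failed", error: String(error) };
        }
      }),
    );

    // Separate successes from failures; only failures go into the next round.
    succeeded.push(...results.filter((result) => result.status === "success"));
    lastFailures = results.filter((result) => result.status === "failed");
    pending = lastFailures.map((result) => result.id);
  }

  // Items that still failed after the last attempt are returned for logging.
  return [...succeeded, ...lastFailures];
}
```

Anything still marked failed in the returned array has exhausted its attempts and can go to a log or a manual review queue.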

Typography

Custom Fonts

The API accepts TTF, OTF, WOFF, and WOFF2 font files. Each font variant — Regular, Bold, Italic, Bold Italic — is a separate file:

const fonts = [
  { name: "Brand Font", weight: "Regular", style: "normal", file: { type: "url", name: "brand-regular.ttf", url: "..." } },
  { name: "Brand Font", weight: "Bold", style: "normal", file: { type: "url", name: "brand-bold.ttf", url: "..." } },
  { name: "Brand Font", weight: "Regular", style: "italic", file: { type: "url", name: "brand-italic.ttf", url: "..." } },
];

Reference fonts in text layers by font_name and font_weight. The API matches the correct variant.

Markdown Formatting

Text layers support inline markdown. Write **bold text** and the API renders it using the Bold font weight. Write *italic text* and it uses the italic style. This works within the same text block — no separate layers for different formatting.

{
  type: "text",
  text: "Check out our **new feature** for *faster* processing",
  font_name: "Inter",
  font_weight: "Regular",
  // Bold and italic variants auto-switch based on markdown
}

For this to work, the Bold and Italic font variants must be included in the fonts array.
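A small pre-flight check can catch a missing variant before the request goes out. The sketch below is a heuristic, not the API's own validation: it assumes bold maps to weight "Bold" with style "normal" and italic to weight "Regular" with style "italic", mirroring the fonts array shown earlier, and it ignores combined bold-italic text. The name findMissingFontVariants is hypothetical.

```typescript
type FontEntry = { name: string; weight: string; style: string };

// Returns the font variants the markdown in `text` needs but `fonts` lacks.
function findMissingFontVariants(text: string, fontName: string, fonts: FontEntry[]): string[] {
  const hasVariant = (weight: string, style: string): boolean =>
    fonts.some((font) => font.name === fontName && font.weight === weight && font.style === style);

  // **bold**: a pair of double asterisks around non-asterisk text.
  const usesBold = /\*\*[^*]+\*\*/.test(text);
  // *italic*: single asterisks that are not part of a ** pair.
  const usesItalic = /(^|[^*])\*[^*]+\*([^*]|$)/.test(text);

  const missing: string[] = [];
  if (usesBold && !hasVariant("Bold", "normal")) {
    missing.push("Bold/normal");
  }
  if (usesItalic && !hasVariant("Regular", "italic")) {
    missing.push("Regular/italic");
  }
  return missing;
}

// Usage: only the Regular variant is registered, so both variants are reported.
const missingVariants = findMissingFontVariants(
  "Check out our **new feature** for *faster* processing",
  "Inter",
  [{ name: "Inter", weight: "Regular", style: "normal" }],
);
```

Running a check like this in tests or at template build time turns a rendering surprise into an explicit error message.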

Text Wrapping

The is_splitting_lines property (default: true) automatically wraps text within the container dimensions. Set a position and dimensions for the text container, and the text flows within it.

For text that should not wrap (single-line labels, prices), set is_splitting_lines: false.

Handling Variable-Length Text

Dynamic text creates a layout problem: a 10-word title fits neatly, but a 30-word title overflows its container. There are two approaches.

Approach 1: Fixed font size with a tall container. Set is_splitting_lines: true and give the text container enough height to accommodate long titles. Short titles leave whitespace below, which is acceptable in most designs.

Approach 2: Dynamic font size selection. Choose the font size based on text length before calling the API:

const getTitleFontSizeInPx = (title: string): number => {
  const characterCount = title.length;

  if (characterCount <= 40) {
    return 56;
  }

  if (characterCount <= 80) {
    return 44;
  }

  return 36;
};

This keeps short titles prominent and long titles readable. The thresholds depend on your container width and font choice — test with real data to find the right breakpoints.
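If fixed thresholds feel too arbitrary, a rough alternative is to estimate whether the text fits the container at each candidate size. The sketch below assumes an average character width of 0.55 times the font size and a line height of 1.2 times the font size. Both ratios are guesses that vary by font, so treat the result as a starting point rather than a guarantee.

```typescript
type Box = { width: number; height: number };

// Picks the largest candidate size whose estimated wrapped height fits the box.
function pickFontSizeInPx(
  text: string,
  box: Box,
  candidateSizes: number[] = [56, 44, 36],
  avgCharWidthRatio = 0.55, // assumed average glyph width relative to font size
  lineHeightRatio = 1.2, // assumed line height relative to font size
): number {
  if (candidateSizes.length === 0) {
    throw new Error("At least one candidate size is required");
  }

  const largestFirst = [...candidateSizes].sort((a, b) => b - a);
  for (const size of largestFirst) {
    const charsPerLine = Math.max(1, Math.floor(box.width / (size * avgCharWidthRatio)));
    const lineCount = Math.ceil(text.length / charsPerLine);
    if (lineCount * size * lineHeightRatio <= box.height) {
      return size;
    }
  }

  // Nothing fits: fall back to the smallest candidate.
  return Math.min(...candidateSizes);
}
```

For the 800x250 title container used earlier, a 20-character title comes out at 56px under these assumptions, while a 200-character title drops to 36px.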

Smart Cropping

The should_use_smart_cropping flag on static-image and image-overlay layers uses AI object detection to find the main subject in an image before fitting it into the container.

Without smart crop: the image is center-cropped to fit the container dimensions. If the subject is off-center, it may be cut off.

With smart crop: the AI finds faces, products, and focal points, then positions the crop to keep the subject centered within the container.

This is particularly valuable for:

  • Profile pictures — faces are always centered regardless of the original photo composition
  • Product images — products stay in frame even when supplier photos have inconsistent compositions
  • Editorial images — the focal point of the image is preserved when cropping to different aspect ratios
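
The difference between the two modes comes down to where the crop rectangle is placed. The sketch below illustrates the geometry only and is not the API's actual algorithm: it takes a focal point (which subject detection would supply) and positions the largest crop with the container's aspect ratio around it, clamped to the image bounds. All names are illustrative.

```typescript
type Size = { width: number; height: number };
type Point = { x: number; y: number };
type Rect = Point & Size;

const clamp = (value: number, min: number, max: number): number =>
  Math.min(Math.max(value, min), max);

// Largest rectangle with the container's aspect ratio that fits in the source,
// positioned so `focus` is as close to its center as the image bounds allow.
function cropAround(source: Size, container: Size, focus: Point): Rect {
  const scale = Math.min(source.width / container.width, source.height / container.height);
  const width = container.width * scale;
  const height = container.height * scale;

  return {
    x: clamp(focus.x - width / 2, 0, source.width - width),
    y: clamp(focus.y - height / 2, 0, source.height - height),
    width,
    height,
  };
}

// Plain center crop: the focal point is simply the middle of the image.
function centerCrop(source: Size, container: Size): Rect {
  return cropAround(source, container, { x: source.width / 2, y: source.height / 2 });
}
```

For a 2000x1000 photo cropped into a square container, centerCrop keeps the middle 1000 pixels. If the subject sits at x = 1700, cropAround shifts the window to the right edge so the subject stays in frame.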

Output Formats

The API supports four output formats:

  • PNG — lossless, supports transparency. Best for images with text where sharp edges matter. Larger file sizes.
  • JPEG — lossy, no transparency. Smaller files. Best for photo-heavy compositions and email.
  • WebP — modern format with lossy and lossless modes and transparency support. Smaller files than JPEG or PNG at comparable quality. Best for web delivery.
  • AVIF — best compression, supports transparency and HDR. Slightly less browser support than WebP.

For most use cases: PNG for images that need transparency or pixel-perfect text, JPEG for email and legacy systems, WebP for web delivery.
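
That rule of thumb can be encoded as a small helper so every template picks its format the same way. This is a sketch of the guidance above, not an API feature. The format strings other than "png" are assumed to follow the same lowercase convention, so check them against the docs.

```typescript
// Assumed identifiers: only "png" appears in the examples above.
type OutputFormat = "png" | "jpeg" | "webp" | "avif";
type DeliveryTarget = "web" | "email" | "legacy";

function chooseOutputFormat(options: {
  target: DeliveryTarget;
  needsTransparency?: boolean;
  needsPixelPerfectText?: boolean;
}): OutputFormat {
  // Transparency or pixel-perfect text: lossless PNG.
  if (options.needsTransparency || options.needsPixelPerfectText) {
    return "png";
  }
  // Email clients and legacy systems: JPEG for broad compatibility.
  if (options.target === "email" || options.target === "legacy") {
    return "jpeg";
  }
  // Everything else is web delivery: WebP.
  return "webp";
}
```

AVIF is left out of the decision on purpose. Given its narrower browser support, it probably belongs behind an explicit opt-in rather than a default.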

Common Use Cases

OG images — generate per-page social sharing cards with the page title, description, and optionally an image. 1200x630 for most platforms.

Certificates — branded completion certificates with recipient name, course title, date, and signature. Custom serif fonts for a formal look.

Product listing images — product photo with smart crop, price badge, brand logo, and status overlays (SALE, NEW, SOLD OUT).

Email banners — personalized per-recipient banners that bypass email client CSS limitations. JPEG output for universal compatibility.

Report cards — key metrics rendered as shareable images for Slack, email, and documentation. Consistent rendering across every client.

Social media graphics — platform-specific dimensions (1080x1080 Instagram, 1200x675 Twitter) generated from the same data with different templates.

Real estate graphics — property photos with status badges, price, and agent branding. Smart crop handles inconsistent MLS photos.

Migration from Puppeteer

If you’re currently using Puppeteer for image generation, the migration path is straightforward:

  1. Map your HTML elements to layers. A div with a background color becomes a solid-color-background or rectangle layer. An h1 becomes a text layer. An img becomes a static-image layer.

  2. Convert CSS positions to coordinates. Flexbox and grid positions become explicit x/y coordinates and width/height dimensions.

  3. Replace font loading with font uploads. @font-face declarations become entries in the fonts array.

  4. Remove the browser infrastructure. Delete the Puppeteer dependency, the Chromium Docker layer, the browser pool, and the cleanup logic.

The API call replaces the browser launch, page creation, HTML rendering, and screenshot capture. What was 50 lines of Puppeteer setup becomes one HTTP request.
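
Step 2 is where most of the work goes, so here is a minimal sketch of replacing a vertically centered flex column with explicit coordinates. It mirrors the Puppeteer example's justify-content: center, 60px padding, and 20px gap. The element heights are assumptions you would take from your own design, and stackVertically is a hypothetical helper.

```typescript
type Size = { width: number; height: number };
type StackItem = { id: string; height: number };
type PlacedItem = StackItem & { x: number; y: number; width: number };

// Emulates a flex column with `justify-content: center`: items are stacked
// top to bottom with a fixed gap, and the whole stack is vertically centered.
function stackVertically(
  canvas: Size,
  paddingInPx: number,
  gapInPx: number,
  items: StackItem[],
): PlacedItem[] {
  const contentHeight = items.reduce((sum, item) => sum + item.height, 0);
  const totalHeight = contentHeight + gapInPx * Math.max(0, items.length - 1);

  const placed: PlacedItem[] = [];
  let y = Math.round((canvas.height - totalHeight) / 2);
  for (const item of items) {
    placed.push({ ...item, x: paddingInPx, y, width: canvas.width - 2 * paddingInPx });
    y += item.height + gapInPx;
  }
  return placed;
}

// Usage: a title box and an author line on a 1200x630 canvas.
const placedItems = stackVertically({ width: 1200, height: 630 }, 60, 20, [
  { id: "title", height: 60 },
  { id: "author", height: 30 },
]);
```

Each placed item maps directly onto a layer's position and dimensions, so the computed values can be pasted into the template or calculated at request time.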

Get Started

Check the docs for the full layer reference, font handling guide, and output format options. The TypeScript and Python SDKs handle authentication and response parsing.

Sign up for a free account — no credit card required. Start with an OG image template and expand to your other image generation needs from there.
