Generate Images from Prompts Inside Claude and Cursor with MCP


Your AI Assistant Can Make Images Now

You’re writing a blog post. You need an OG image. You open Figma, find the template, update the title, export, upload. Fifteen minutes for a rectangle with text on it.

Or you’re building a landing page. The designer is busy. You need a placeholder hero graphic that actually looks like the final thing — not a gray box with “image here” in it.

MCP changes this. Model Context Protocol lets AI assistants like Claude Desktop and Cursor call external APIs as tools. Connect the Image Generation API as an MCP server, and your assistant can compose layered images from a conversation. Describe what you want, and it builds the API request, calls the endpoint, and returns the image.

No Figma. No Photoshop. No code. Just a prompt and a result.

What MCP Is (and Isn’t)

MCP is an open standard that lets AI assistants discover and call external tools. It’s a plugin system — you register a service, and the assistant can use it when your request matches.

It’s not a magic image generator that hallucinates pixels. The Image Generation API uses a layer-based composition model. Each image is a stack of layers — backgrounds, shapes, text, images, overlays — rendered in order. The AI assistant’s job is to translate your description into that layer structure and call the API.

This means the output is deterministic. Same layers, same result. Every time.

Setting Up in Claude Code

Claude Code supports MCP servers natively. Add the Iteration Layer server with a single command:

claude mcp add iterationlayer --transport streamablehttp https://api.iterationlayer.com/mcp

The first time you use an Iteration Layer tool, a browser window opens for OAuth authentication. Log in, authorize access, and you’re connected. No API keys to manage.

To verify the server is available, start a conversation and ask Claude Code what MCP tools it has access to. You should see generate_image listed among the available tools.

Setting Up in Cursor

Add to your .cursor/mcp.json:

{
  "mcpServers": {
    "iterationlayer": {
      "type": "streamablehttp",
      "url": "https://api.iterationlayer.com/mcp"
    }
  }
}

Save and restart. The Image Generation tool is now available in your Cursor AI conversations. Authentication works the same way — OAuth in the browser on first use.

The MCP server exposes a tool that maps to POST https://api.iterationlayer.com/image-generation/v1/generate. When the assistant calls it, it sends a JSON body with fonts, dimensions, layers, and output_format. You never see the raw request unless you ask — the assistant handles the structure, and you describe what you want in plain language.
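If you do ask to see it, the request is plain JSON. Here is a minimal sketch of that body in Python — the top-level keys (`fonts`, `dimensions`, `layers`, `output_format`) and the layer type names come from this post, while the nested property names and values are illustrative assumptions, not the definitive schema:

```python
import json

# Sketch of the JSON body the MCP tool sends to
# POST https://api.iterationlayer.com/image-generation/v1/generate
body = {
    "fonts": [],  # custom fonts, if any (see the Custom Fonts section)
    "dimensions": {"width": 1200, "height": 630},
    "output_format": "png",
    "layers": [
        # rendered bottom-to-top: background first, text on top
        {"type": "solid-color-background", "hex_color": "#1a1a2e"},
        {"type": "text", "text": "Hello, MCP", "hex_color": "#ffffff"},
    ],
}

print(json.dumps(body, indent=2))
```

The assistant assembles exactly this kind of structure from your plain-language description.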

A Real Conversation

Here’s what using it actually looks like. You type:

“Generate an OG image for my blog post titled ‘Why We Moved to Postgres’. Dark background, white text, red accent line at the top. 1200x630.”

The assistant builds a layer stack:

  1. A solid-color-background layer — dark navy (#1a1a2e)
  2. A rectangle layer — thin red bar (#e94560) across the top, 4px tall
  3. A text layer — your title in white, large font, positioned with breathing room

It calls the API with that structure and returns the generated image. The whole exchange takes seconds.
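For the curious, that three-layer stack might look like this as data. The layer type names and hex colors come from the example above; property names like `font_size` and the exact coordinates are assumptions for illustration:

```python
# Illustrative layer stack for the 1200x630 OG image described above,
# ordered bottom-to-top: background, accent bar, title text.
layers = [
    {"type": "solid-color-background", "hex_color": "#1a1a2e"},
    {"type": "rectangle", "hex_color": "#e94560",
     "x": 0, "y": 0, "width": 1200, "height": 4},
    {"type": "text", "text": "Why We Moved to Postgres",
     "hex_color": "#ffffff", "font_size": 64, "x": 80, "y": 260},
]

# Later layers render above earlier ones, so the text is never
# hidden by the background fill.
for layer in layers:
    print(layer["type"])
```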

You can iterate from there:

“Make the title bigger. Add the site name ‘acme.dev’ in smaller text at the bottom right.”

The assistant adjusts the font size on the title layer, adds a second text layer for the site name, and regenerates. Same workflow a designer would follow in Figma — but through conversation.

More Conversations, More Use Cases

The examples above cover OG images, but the same conversational pattern works for any composition. Here are a few more real interactions.

Adding a photo with smart cropping:

“Add the team photo from https://example.com/team.jpg on the right side. Make sure nobody’s face gets cut off.”

The assistant adds a static-image layer with should_use_smart_cropping enabled. The API detects the people in the photo and positions the crop so faces stay visible — even if the original photo has off-center framing.
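As data, that layer might look like the sketch below. The `static-image` type and `should_use_smart_cropping` flag are named in this post; the placement properties are illustrative assumptions:

```python
# A static-image layer with smart cropping enabled, roughly as the
# assistant would add it to the existing layer stack.
photo_layer = {
    "type": "static-image",
    "url": "https://example.com/team.jpg",
    "x": 640, "y": 80,            # right-hand side of a 1200px canvas
    "width": 480, "height": 470,
    "should_use_smart_cropping": True,  # keep detected faces in frame
}
```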

Building an event banner:

“Create a 1200x600 banner for our meetup. Black background. ‘TypeScript Berlin’ in white at the top. ‘March 12, 2026 — 7pm’ in gray below it. Put a thin orange line between the two.”

The assistant generates a four-layer composition: background, title text, divider rectangle, and date text, all positioned with appropriate spacing. You see the result and decide:

“Move the date closer to the line. Make the orange brighter.”

Two adjustments, one regeneration. The assistant updates the y coordinate of the date layer and changes the rectangle’s hex_color. Iterating on a design through conversation is faster than clicking through Figma panels when you already know what you want.

Using an image overlay for texture:

“Take the OG image we just made and add a subtle paper texture over it. Use this URL for the texture: https://example.com/paper-grain.png. Keep it subtle — maybe 15% opacity.”

The assistant adds an image-overlay layer at the top of the stack with opacity: 15. The texture blends over the entire canvas without obscuring the text or accent elements underneath.
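That overlay layer is a one-liner in the request. The `image-overlay` type and the opacity value come from the conversation above; the property names are assumptions:

```python
# The texture as a top-of-stack overlay layer. At 15% opacity the
# grain blends over the canvas without obscuring the text beneath.
overlay_layer = {
    "type": "image-overlay",
    "url": "https://example.com/paper-grain.png",
    "opacity": 15,  # percent
}
```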

The Layer Model

The Image Generation API composes images from layer types, stacked by index:

  • solid-color-background — a full-canvas fill. Your base layer.
  • rectangle — colored shapes for accents, borders, dividers, cards. Supports rotation and angled edges for diagonal designs.
  • text — rendered text with custom fonts, alignment, and markdown support (bold and italic). Auto-wraps lines within a bounding box.
  • static-image — photos, logos, icons. Placed at specific coordinates with specific dimensions. Supports AI-powered smart cropping that detects the subject and keeps it centered.
  • image-overlay — a full-canvas image overlay with opacity control. Useful for textures, gradients, and watermarks.

Layers render bottom-to-top by index: a background at index 0, shapes at indices 1 and 2, text at index 3, an overlay at index 4. The mental model is the same as any design tool — Figma, Photoshop, Sketch. Except the “design tool” is your AI assistant.

What You Can Generate

OG images and social cards. The most common use case. Every blog post, every landing page, every product update needs a 1200x630 image. Define a template in conversation, swap the title for each post.

Event graphics. Conference talk cards, meetup announcements, workshop banners. Background + event name + speaker name + date. Four layers, one API call.

Email headers. Branded rectangles with campaign-specific text. Generate a batch by asking the assistant to loop over your campaign titles.
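The batch case is easy to take beyond conversation, too. Here is a hedged sketch of looping over campaign titles to build one payload each — the endpoint and top-level keys come from this post, while the layer property names and dimensions are illustrative:

```python
# One request payload per campaign title. Each payload would be POSTed
# to https://api.iterationlayer.com/image-generation/v1/generate.
campaign_titles = ["Spring Launch", "Beta Invite", "Year in Review"]

def header_payload(title: str) -> dict:
    """Branded rectangle plus campaign-specific text."""
    return {
        "dimensions": {"width": 1200, "height": 300},
        "output_format": "png",
        "layers": [
            {"type": "solid-color-background", "hex_color": "#1a1a2e"},
            {"type": "text", "text": title, "hex_color": "#ffffff"},
        ],
    }

payloads = [header_payload(t) for t in campaign_titles]
```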

Placeholder graphics. Need a hero image for a prototype? Describe the layout and let the assistant generate it. When the real design arrives, you swap it out.

Custom Fonts

The API supports TTF, OTF, WOFF, and WOFF2 font files. You reference them by URL in the request, and text layers use them by name and weight:

{
  "fonts": [
    {
      "name": "Inter",
      "weight": "Regular",
      "style": "normal",
      "file": { "type": "url", "name": "Inter-Regular.ttf", "url": "https://example.com/fonts/Inter-Regular.ttf" }
    },
    {
      "name": "Inter",
      "weight": "Bold",
      "style": "normal",
      "file": { "type": "url", "name": "Inter-Bold.ttf", "url": "https://example.com/fonts/Inter-Bold.ttf" }
    }
  ]
}

Text layers that use **bold markdown** automatically render with the Bold variant. This means you can mix weights within a single text block — useful for highlighting a keyword in a title.

When you tell the assistant “use Inter for the text, bold the product name,” it knows to include both font weights and wrap the product name in **double asterisks** in the text layer.
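Paired with the two Inter entries above, the resulting text layer might look like this. The `**bold**` markdown behavior is described in this post; the property names (`font_name`, `text`) are assumptions for illustration:

```python
# A title layer mixing weights within one text block: the markdown-bold
# span renders with the Bold font variant, the rest with Regular.
title_layer = {
    "type": "text",
    "font_name": "Inter",
    "text": "Introducing **Acme Deploy** for teams",
    "hex_color": "#ffffff",
}
```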

MCP vs Direct API Calls

MCP is for ad-hoc work. You need one image, right now, and you don’t want to write code. The assistant handles the API structure, the layer composition, the font references. You describe, it generates.

For production pipelines — generating OG images at build time, creating social cards from a CMS, batch-producing event graphics — use the API directly. The request body is JSON. Any language that can make an HTTP POST can generate images.

The two approaches complement each other. Prototype in conversation with MCP, then extract the layer structure into your codebase for automation.
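A build-time version of that extraction might look like the sketch below, using only the standard library. The endpoint URL comes from this post; authentication headers are omitted and would follow your account's API credentials, and the payload properties are the same illustrative assumptions as above:

```python
import json
import urllib.request

API_URL = "https://api.iterationlayer.com/image-generation/v1/generate"

def generate(payload: dict) -> bytes:
    """POST a layer structure and return the generated image bytes."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # network call at build time
        return resp.read()

# A layer structure extracted from an MCP conversation:
og_payload = {
    "dimensions": {"width": 1200, "height": 630},
    "output_format": "png",
    "layers": [{"type": "solid-color-background", "hex_color": "#1a1a2e"}],
}

# image_bytes = generate(og_payload)
# open("og.png", "wb").write(image_bytes)
```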

Combining Multiple MCP Servers

The Image Generation MCP server works alongside other MCP servers. If you also have the Image Transformation and Image Upscaling servers configured, you can chain operations in a single conversation:

“Generate a 1200x630 OG image with a dark blue background and white title ‘API Reference’. Then convert it to WebP.”

The assistant calls the Image Generation API first, then pipes the result through the Image Transformation API for format conversion. Two API calls, one conversation, no manual file handling.

“The icon I want to use is only 200x200. Upscale it to 4x first, then add it to the banner.”

The assistant calls the Image Upscaling API to bring the icon to 800x800, then uses the upscaled version as a static-image layer in the generation request.

What This Isn’t

This is not DALL-E or Midjourney. The API does not generate images from abstract prompts like “a cat wearing a top hat.” It composes structured layers — backgrounds, shapes, text, and existing images — into a final output.

Think of it as programmatic Figma, not generative AI art. The output is precise, branded, and reproducible. The AI assistant’s role is translating your intent into the layer structure — not hallucinating pixels.

Get Started

Sign up for a free account at iterationlayer.com — no credit card required. Run the claude mcp add command above, authenticate in the browser, and generate your first image from your next Claude Code or Cursor conversation.

The docs cover every layer type, every property, and every output format. Start with an OG image — it’s the simplest template and the most immediately useful.
