Why We Built Iteration Layer


Content Processing Is a Mess

If you’ve built anything that touches documents or images, you know the drill. You need to extract data from PDFs, so you duct-tape together an OCR library and a regex parser. You need thumbnails, so you spin up ImageMagick in a Docker container. You need to generate reports or ebooks, so you wrestle with PDF libraries that treat a simple table like a research problem.

Each tool solves one narrow problem. Each one breaks in its own way. And the glue code connecting them — the format conversions, the error handling, the retry logic — that’s where the real complexity lives. Not in the business logic you actually care about.

We’ve been on both sides of this. Before Iteration Layer, we built an AI-driven book publishing company. That meant building the entire content pipeline from scratch: parsing manuscripts, generating book covers programmatically, processing product images for Amazon, rendering marketing graphics for launches. Every piece worked. Every piece also broke in its own creative way at 2 AM before a release.

We maintained the Sharp pipeline that worked perfectly until someone uploaded a CMYK TIFF. We wrote the template engine that couldn’t handle a font weight it hadn’t seen before. We built the manuscript-to-EPUB converter that choked on tables. And every time we fixed one thing, we thought: this should just be an API call.

So we built it.

One Pipeline, Four APIs

Content processing follows a natural lifecycle: you ingest raw content, you transform it, and you generate new output from it. Instead of building one monolithic platform that tries to do everything, we built four focused APIs that map to this lifecycle.

Document Extraction handles ingestion. Give it a PDF, a Word document, a scanned image — it gives you structured JSON. You define a schema describing the fields you want, and the parser extracts them with confidence scores so you know when to trust the result and when to flag it for review. No OCR setup, no template configuration, no regex. It handles field types like text, numbers, dates, addresses, IBANs, currencies, arrays, and more out of the box.
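To make the schema-plus-confidence idea concrete, here's a minimal sketch in Python. The schema shape, field names, and response format below are illustrative assumptions, not the exact SDK surface — the point is the pattern: you declare fields, and per-field confidence scores tell you what to trust and what to route to human review.

```python
# Illustrative sketch only: schema shape, field names, and response
# format are assumptions, not the actual Iteration Layer SDK.

REVIEW_THRESHOLD = 0.85

# A schema describing the fields you want extracted from an invoice.
invoice_schema = {
    "invoice_number": {"type": "text"},
    "issue_date": {"type": "date"},
    "total": {"type": "currency"},
    "iban": {"type": "iban"},
}

def fields_needing_review(extraction: dict, threshold: float = REVIEW_THRESHOLD) -> list[str]:
    """Return field names whose confidence score falls below the threshold."""
    return [
        name for name, field in extraction.items()
        if field["confidence"] < threshold
    ]

# Simulated structured-JSON response with per-field confidence scores.
extraction = {
    "invoice_number": {"value": "INV-2024-001", "confidence": 0.98},
    "issue_date": {"value": "2024-03-02", "confidence": 0.95},
    "total": {"value": "1,240.00 EUR", "confidence": 0.91},
    "iban": {"value": "DE89 3704 0044 0532 0130 00", "confidence": 0.72},
}

print(fields_needing_review(extraction))  # → ['iban']
```

The threshold is yours to tune: lower it for low-stakes data, raise it when a wrong IBAN costs real money.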

Image Transformation covers the middle of the pipeline — the part where you need to upscale, resize, crop, convert formats, adjust quality, or chain multiple operations together. Define up to 30 operations in a single request: upscale to 4x resolution, resize to 800x600, convert to WebP, compress to 85% quality, smart-crop around the detected subject. One API call instead of a Sharp pipeline you have to host and scale yourself. Need an image under 500 KB for email? Tell the API the target file size and it figures out the optimal quality and dimension tradeoffs.
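A sketch of what a chained request could look like — the operation names and payload shape here are illustrative assumptions, not the exact wire format:

```python
# Illustrative sketch only: operation names and payload shape are
# assumptions, not the actual Iteration Layer request format.

MAX_OPERATIONS = 30  # the API accepts up to 30 chained operations per request

def build_transform_request(source_url: str, operations: list[dict]) -> dict:
    """Bundle an ordered chain of operations into a single request."""
    if len(operations) > MAX_OPERATIONS:
        raise ValueError(f"at most {MAX_OPERATIONS} operations per request")
    return {"source": source_url, "operations": operations}

request = build_transform_request(
    "https://example.com/product.tiff",
    [
        {"op": "upscale", "factor": 4},
        {"op": "resize", "width": 800, "height": 600},
        {"op": "smart_crop"},                 # crop around the detected subject
        {"op": "convert", "format": "webp"},
        {"op": "compress", "quality": 85},
    ],
)
```

Operations run in the order you list them, so one request replaces a whole self-hosted processing pipeline.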

Image Generation turns templates into pixels. Write your template in HTML and CSS — technologies you already know — inject dynamic data, and get back a PNG, JPEG, or PDF. Social cards, certificates, OG images, report graphics. Anything you’d design in Figma and then manually export fifty times, you can now render programmatically.
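Here's the template idea in miniature, using Python's standard-library templating as a stand-in — the render-request shape is an illustrative assumption, but the core point holds: the template is plain HTML and CSS with data injected in.

```python
from string import Template

# Illustrative sketch only: the render-request shape is an assumption;
# the template itself is just HTML/CSS with injected data.

card = Template("""
<div style="width:1200px;height:630px;font-family:sans-serif;">
  <h1>$title</h1>
  <p>$author &middot; $year</p>
</div>
""")

html = card.substitute(
    title="Why We Built Iteration Layer",
    author="Iteration Layer",
    year="2024",
)

# A render request pairs the markup with an output format and dimensions.
render_request = {"html": html, "format": "png", "width": 1200, "height": 630}
```

Design the card once, then render it fifty times with fifty different titles — no Figma exports, no screenshots.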

Document Generation closes the loop on the output side. Feed it structured data and get back a polished PDF, DOCX, EPUB, or PPTX. Contracts, reports, ebooks, slide decks — generated from templates, populated with your data, ready to ship.
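A minimal sketch of the shape of such a request — the template id, field names, and payload layout are illustrative assumptions:

```python
# Illustrative sketch only: template id, field names, and payload
# layout are assumptions, not the actual Iteration Layer request format.

SUPPORTED_FORMATS = {"pdf", "docx", "epub", "pptx"}

def build_document_request(template_id: str, data: dict, output_format: str) -> dict:
    """Pair structured data with a template and a target output format."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {output_format}")
    return {"template": template_id, "data": data, "format": output_format}

request = build_document_request(
    "quarterly-report",                        # hypothetical template id
    {"quarter": "Q1 2024", "highlights": ["Launched four APIs"]},
    "pdf",
)
```

The same data with a different format string gives you the same report as a DOCX or a slide deck.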

Each API works independently. You can use Document Extraction without ever touching image processing. You can generate documents without parsing a single one first. But the real leverage comes when you chain them together.

Composable by Design

The output of one API flows naturally into the input of the next. Parse a supplier catalog, feed each product into Image Generation, and you have marketplace-ready listing images — no glue code, no format juggling. Parse manuscripts, transform the cover art, generate the final EPUB. Extract article text, transform the hero image, generate a social card.
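The chaining pattern, sketched with stubbed stand-ins for the real APIs (the function bodies below are fakes for illustration — the shape of the data flow is the point):

```python
# Stubbed stand-ins for two Iteration Layer APIs, showing that one
# step's structured output is the next step's input. The bodies are
# fakes for illustration; only the data flow mirrors the real thing.

def extract_catalog(pdf_bytes: bytes) -> list[dict]:
    """Stand-in for Document Extraction: PDF in, structured products out."""
    return [{"name": "Desk Lamp", "price": "29.00 EUR"}]

def render_listing_image(product: dict) -> dict:
    """Stand-in for Image Generation: product data in, render request out."""
    return {
        "html": f"<h1>{product['name']}</h1><p>{product['price']}</p>",
        "format": "png",
    }

# The pipeline: structured JSON flows straight from one API into the next.
listing_images = [render_listing_image(p) for p in extract_catalog(b"%PDF-...")]
```

No format juggling in between: the extraction step emits plain JSON, and the generation step consumes it as-is.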

This is the Unix philosophy applied to content processing: small, focused tools that compose into workflows you haven’t imagined yet.

That composability is a deliberate choice, and it cuts against the grain. Monolithic platforms lock you into one vendor’s idea of a workflow. If their OCR is good but their image processing is mediocre, tough luck — you’re stuck with both or neither.

We think APIs should work like building blocks. Snap them together however you want. Combine ours with your own services or third-party tools. The output is always standard JSON or image data — nothing proprietary, nothing locked in.

This also means you pay for what you use. Need parsing this month but not image generation? You only pay for parsing. A spike in document parsing doesn’t affect your image generation capacity. Start with one API, add another when the need arises, swap one out without touching the rest.

MCP: APIs as Agent Tools

Every Iteration Layer API ships as an MCP server from day one. If you work with Claude, Cursor, or any MCP-compatible client, our APIs show up as tools your agent can call directly.
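For orientation, here's what wiring an MCP server into a client configuration generally looks like — the server name, package, and environment variable below are hypothetical placeholders, not our published package names:

```python
import json

# Illustrative sketch only: the server name, package name, and env var
# are hypothetical placeholders, not published Iteration Layer values.

mcp_config = {
    "mcpServers": {
        "iteration-layer": {
            "command": "npx",
            "args": ["-y", "@iteration-layer/mcp"],      # hypothetical package
            "env": {"ITERATION_LAYER_API_KEY": "<your-key>"},
        }
    }
}

print(json.dumps(mcp_config, indent=2))
```

Once the client loads the server, the APIs appear in the agent's tool list with no integration code on your side.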

That changes the workflow fundamentally. Instead of writing integration code, you describe what you want. “Parse this supplier catalog and generate a product card for each item.” The agent discovers the Document Extraction tool, calls it with the right schema, takes the structured output, and feeds it into Image Generation — all without you specifying the sequence.

We’ve been using this internally while building the platform, and it’s the kind of thing that feels like a gimmick until you try it. Once you’ve watched an agent chain three API calls to solve a task you were about to spend an hour scripting, it’s hard to go back.

Building Blocks for What’s Next

The four APIs we’re launching cover the core of the content lifecycle, but we see them as the foundation, not the finish line.

Our vision is a library of focused, composable APIs that cover every step of content processing — from raw input to polished output. Each one does one thing well. Each one connects to everything else. The same way Unix pipes let you chain grep | sort | uniq into something no single tool could do, we want you to be able to chain Ingest, Transform, and Generate steps into workflows that solve your specific problem.

That means more APIs, more integrations, more field types, more output formats. But always following the same principles — focused scope, standard I/O, and the ability to snap into any pipeline.

We’re building infrastructure that disappears. You shouldn’t have to think about OCR libraries, image processing servers, or headless browser farms. You should think about your product, your users, and the content workflow that connects them. The plumbing is our problem.

Get Started

Pick the API that solves your most immediate problem — Document Extraction, Image Transformation, Image Generation, or Document Generation. We also publish TypeScript and Python SDKs, so your next npm install or pip install is all the setup you need — and if you’re building with AI agents, the SDKs make our APIs discoverable as tools out of the box.

Sign up for a free account — no credit card required. As new needs come up, add another API to the chain. They all use the same authentication, the same error format, and the same response structure, so adding a second or third takes minutes.

We built this because we needed it. We think you might too.
