Document Generation vs DocRaptor: HTML-to-PDF or Structured Content?


The Best HTML-to-PDF Engine You Can Buy

You need to generate documents from data. Contracts, reports, invoices, proposals. The established approach: write HTML and CSS, feed it to a rendering engine, get a PDF back.

DocRaptor is the hosted version of the best engine available for that job — Prince XML. Prince has been the gold standard for HTML-to-PDF conversion for over a decade. It supports CSS Paged Media, which means proper page breaks, headers, footers, page numbers, and all the typographic niceties that browser-based PDF generators botch. The output quality is genuinely excellent.

But it’s still HTML-to-PDF. You write HTML, you style it with CSS, and you get a PDF. That model carries assumptions worth examining.

How DocRaptor Works

DocRaptor wraps the Prince XML engine behind an API. Send HTML (or a URL pointing to an HTML page), and Prince renders it into a PDF using the CSS Paged Media specification. For paginated documents, the result is usually better than what you’d get from headless Chrome or wkhtmltopdf, because Prince implements far more of the paged media spec than browser engines do.

CSS Paged Media gives you @page rules to define page size and margins, page-break-before and page-break-after to control where content splits, widows and orphans to keep a lone line of a paragraph from being stranded at the top or bottom of a page, and named pages for different layouts within one document. Prince supports most of this spec, and DocRaptor makes Prince available without running it on your own servers.
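To make those rules concrete, here is a minimal sketch of a print stylesheet embedded in a TypeScript template, the kind of HTML you might hand to a paged-media engine. The names `buildReportHtml` and `escapeHtml`, the class names, and the page name `wide` are illustrative assumptions, not part of DocRaptor's or Prince's API. The CSS properties are the ones named above.

```typescript
// Illustrative only: the helper names, class names, and the page name "wide"
// are made up for this sketch.
const printCss = `
  /* Default page box: size and margins */
  @page { size: Letter; margin: 1in; }

  /* A named page for sections that need a landscape layout */
  @page wide { size: Letter landscape; }
  .wide-table { page: wide; }

  /* Start every chapter on a fresh page */
  .chapter { page-break-before: always; }

  /* Keep at least two lines of a paragraph together at a page boundary */
  p { orphans: 2; widows: 2; }
`;

// Escape text before placing it inside HTML markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// bodyHtml is assumed to be trusted, already-escaped markup.
function buildReportHtml(title: string, bodyHtml: string): string {
  return `<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>${escapeHtml(title)}</title>
    <style>${printCss}</style>
  </head>
  <body>${bodyHtml}</body>
</html>`;
}
```

None of these rules affect how the page looks on screen, which is why this kind of stylesheet has to be checked against the PDF output itself.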

The catch: CSS Paged Media is its own world. It is CSS, but much of it behaves differently from anything you’d write for a screen. Browsers implement only part of the @page rule, and only when printing. Properties like prince-bookmark-level are Prince-specific extensions. You end up writing CSS that targets one engine, and debugging it means iterating against Prince’s rendering behavior, not your browser’s DevTools.

For teams that already know CSS Paged Media, DocRaptor is a straightforward choice. For everyone else, there’s a learning curve that’s easy to underestimate.

The Output Format Problem

DocRaptor produces PDF. It also supports Excel export (XLS and XLSX), which is useful for spreadsheet-style data. But that’s the full list.

If a client needs the contract as a Word document so legal can redline it — you need a different tool. If you’re publishing content that should be available as an EPUB — different tool again. If the quarterly report needs to ship as a slide deck for the board meeting — yet another tool.

This is the reality of document generation in most organizations. PDF is one format among several, and the others aren’t going away. Building your document pipeline around HTML-to-PDF means you’ve solved one output format and left the rest as separate engineering problems.

Each additional format typically means a separate library, a separate template system, and a separate set of rendering bugs to maintain. The contract template that looks perfect in PDF needs to be re-implemented for DOCX using a Word template library that has completely different layout semantics.

Structured Content, Four Outputs

The Iteration Layer Document Generation API takes a different approach. Instead of starting with HTML and rendering it to one format, you start with structured JSON blocks and the API renders them to any of four formats: PDF, DOCX, EPUB, and PPTX.

The content model is built from block types — headlines, paragraphs, tables, images, grids, table of contents, page breaks, and more. You define the content once, choose your output format, and the API handles the rendering differences between formats.

This means the same content definition that produces a polished PDF also produces a properly formatted Word document — with real Word styles, not a PDF crammed into a .docx container. The EPUB output is a valid ebook with reflowable text and proper chapter structure. The PPTX output is an actual PowerPoint file with editable slides.

One content model. Four outputs. No re-implementation.

What the Block Model Looks Like

Here’s a service agreement — with a table of contents, headers, footers, and page numbers — built with the TypeScript SDK:

import { IterationLayer } from "iterationlayer";

const client = new IterationLayer({ apiKey: "YOUR_API_KEY" });

const result = await client.generateDocument({
  format: "docx",
  document: {
    metadata: { title: "Service Agreement", author: "Legal Dept" },
    page: {
      size: { preset: "Letter" },
      margins: { top_in_pt: 72, right_in_pt: 72, bottom_in_pt: 72, left_in_pt: 72 },
    },
    styles: { /* ... */ },
    header: [
      { type: "paragraph", markdown: "**Acme Corp** — Confidential" },
    ],
    footer: [
      { type: "page-number", text_alignment: "center" },
    ],
    content: [
      { type: "headline", level: "h1", text: "Service Agreement" },
      { type: "table-of-contents", levels: ["h1", "h2"], leader: "dots" },
      { type: "headline", level: "h2", text: "1. Scope of Services" },
      { type: "paragraph", markdown: "Provider agrees to deliver..." },
      { type: "headline", level: "h2", text: "2. Payment Terms" },
      { type: "paragraph", markdown: "Client shall pay **$5,000/month**..." },
    ],
  },
});

Change format: "docx" to format: "pdf" and you get a PDF. Change it to format: "epub" and you get an ebook. The content stays identical. No CSS to rewrite, no template to rebuild.
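The format switch can be sketched as a loop over the four formats named above. This is a minimal sketch: `DocumentDefinition`, `GenerateRequest`, and `buildRequests` are hypothetical names for illustration. Only the `format` and `document` fields mirror the example above, and the SDK's actual types may differ.

```typescript
type OutputFormat = "pdf" | "docx" | "epub" | "pptx";

// Hypothetical, loosely typed stand-in for the document object in the example above.
interface DocumentDefinition {
  metadata: { title: string; author: string };
  content: Array<Record<string, unknown>>;
}

// Hypothetical request shape: one format plus the shared document.
interface GenerateRequest {
  format: OutputFormat;
  document: DocumentDefinition;
}

const ALL_FORMATS: OutputFormat[] = ["pdf", "docx", "epub", "pptx"];

// Build one request per format from a single, shared content definition.
function buildRequests(
  document: DocumentDefinition,
  formats: OutputFormat[] = ALL_FORMATS,
): GenerateRequest[] {
  return formats.map((format) => ({ format, document }));
}

const agreement: DocumentDefinition = {
  metadata: { title: "Service Agreement", author: "Legal Dept" },
  content: [
    { type: "headline", level: "h1", text: "Service Agreement" },
    { type: "paragraph", markdown: "Provider agrees to deliver..." },
  ],
};

// Each request differs only in its format field.
const requests = buildRequests(agreement);
```

Each entry in `requests` could then be passed to `client.generateDocument(...)` as in the example above, assuming the SDK accepts the same object for every format.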

The table-of-contents block scans the headline blocks and generates entries with dot leaders automatically. In DocRaptor, you’d build this manually — creating anchor links, styling the dots with CSS leader functions (which Prince supports but few developers have used), and maintaining the list as sections change.
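Conceptually, a table-of-contents block derives its entries from the headline blocks around it. The sketch below illustrates that idea only. It is not the API's implementation, and `collectTocEntries` and the `TocEntry` shape are hypothetical. The block fields (`type`, `level`, `text`, `markdown`) follow the example above.

```typescript
type HeadlineLevel = "h1" | "h2" | "h3";

type Block =
  | { type: "headline"; level: HeadlineLevel; text: string }
  | { type: "paragraph"; markdown: string };

// Hypothetical entry shape, used only in this sketch.
interface TocEntry {
  level: HeadlineLevel;
  text: string;
}

// Walk the blocks in document order and keep headlines at the requested levels.
function collectTocEntries(blocks: Block[], levels: HeadlineLevel[]): TocEntry[] {
  const entries: TocEntry[] = [];
  for (const block of blocks) {
    if (block.type === "headline" && levels.indexOf(block.level) !== -1) {
      entries.push({ level: block.level, text: block.text });
    }
  }
  return entries;
}

const content: Block[] = [
  { type: "headline", level: "h1", text: "Service Agreement" },
  { type: "headline", level: "h2", text: "1. Scope of Services" },
  { type: "paragraph", markdown: "Provider agrees to deliver..." },
  { type: "headline", level: "h3", text: "1.1 Deliverables" },
  { type: "headline", level: "h2", text: "2. Payment Terms" },
];

const tocEntries = collectTocEntries(content, ["h1", "h2"]);
```

Because the entries are derived from the content, adding or renaming a section changes the table of contents without a second edit.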

Tables That Actually Work

If you’ve ever built a table in HTML and then tried to make it paginate correctly in a PDF, you know the pain. Table rows that split across pages. Header rows that don’t repeat. Column widths that shift when content wraps.

The Iteration Layer table block handles column spans, row spans, separate header and body styling, configurable borders, and automatic header row repetition across page breaks. You define the table structure in JSON — rows, cells, spans — and the rendering engine handles pagination correctly in every output format.

In DOCX output, the table is a native Word table that the recipient can edit. In PDF, the headers repeat on every page. In EPUB, the table reflows for different screen sizes. Same definition, format-appropriate rendering.
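As a rough illustration of defining a table as rows, cells, and spans, the sketch below maps invoice line items to a table-shaped object. The field names here (`header`, `body`, `text`, `column_span`) and the `lineItemsToTable` helper are assumptions made for this example. This article does not show the real table block schema, so check the block reference in the docs before relying on any of them.

```typescript
// HYPOTHETICAL shapes: the real table block's field names are not shown in this article.
interface SketchCell {
  text: string;
  column_span?: number;
}

interface SketchTableBlock {
  type: "table";
  header: SketchCell[][];
  body: SketchCell[][];
}

interface LineItem {
  description: string;
  quantity: number;
  unitPrice: number;
}

// Map structured data to rows and cells, then append a total row.
function lineItemsToTable(items: LineItem[]): SketchTableBlock {
  const body: SketchCell[][] = items.map((item) => [
    { text: item.description },
    { text: String(item.quantity) },
    { text: (item.quantity * item.unitPrice).toFixed(2) },
  ]);

  const total = items.reduce((sum, item) => sum + item.quantity * item.unitPrice, 0);

  // The label cell spans the first two columns so the total sits under "Amount".
  body.push([{ text: "Total", column_span: 2 }, { text: total.toFixed(2) }]);

  return {
    type: "table",
    header: [[{ text: "Description" }, { text: "Qty" }, { text: "Amount" }]],
    body,
  };
}

const invoiceTable = lineItemsToTable([
  { description: "Consulting", quantity: 2, unitPrice: 1500 },
  { description: "Hosting", quantity: 10, unitPrice: 99.5 },
]);
```

The number of body rows follows the data, so a longer invoice needs no template change. Pagination is then the renderer's concern.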

Grid Layout for Multi-Column Content

Documents aren’t always single-column. Brochures, fact sheets, and reports often need content side by side — an image next to a text block, or three columns of statistics.

The grid block provides a 12-column layout system. Place content blocks in columns, and the API renders them side by side. This works across all four output formats without you worrying about how DOCX handles columns differently from PDF.
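The arithmetic behind a 12-column grid is simple enough to sketch. The helpers below only compute and check column spans. `evenSpans` and `fitsGrid` are hypothetical names, and the grid block's real field names are not shown in this article, so none are assumed here.

```typescript
const GRID_COLUMNS = 12;

function isPositiveInteger(n: number): boolean {
  return n >= 1 && Math.floor(n) === n;
}

// Split the grid into equal spans, e.g. three columns of statistics.
function evenSpans(columnCount: number): number[] {
  if (!isPositiveInteger(columnCount) || GRID_COLUMNS % columnCount !== 0) {
    throw new Error(`Cannot split ${GRID_COLUMNS} grid columns evenly into ${columnCount}`);
  }
  const spans: number[] = [];
  for (let i = 0; i < columnCount; i++) {
    spans.push(GRID_COLUMNS / columnCount);
  }
  return spans;
}

// Check that a set of spans is valid and fits within one grid row.
function fitsGrid(spans: number[]): boolean {
  const total = spans.reduce((sum, span) => sum + span, 0);
  return spans.every(isPositiveInteger) && total <= GRID_COLUMNS;
}

const threeStats = evenSpans(3);        // three equal columns
const imageAndText = [4, 8];            // narrow image next to a wider text block
const imageAndTextFits = fitsGrid(imageAndText);
```

An image next to a text block might use spans of 4 and 8. Three columns of statistics would use 4 each.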

In CSS Paged Media, multi-column layout is possible but fragile. Prince supports CSS columns and floats, but getting content to break correctly across pages while maintaining column alignment is a debugging exercise that can eat hours.

Headers, Footers, and Page Numbers

DocRaptor handles headers and footers through Prince’s @page margin boxes — CSS constructs like @top-center and @bottom-right that position content in the page margins. It works, but the syntax is unfamiliar even to experienced CSS developers. Dynamic content like page numbers uses CSS counters (counter(page)), and conditional content per page requires named page rules.
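For readers who have not seen margin boxes before, here is a minimal sketch of the CSS involved, generated from TypeScript. The helper names `cssString` and `headerFooterCss` are made up for this example. The `@top-center` and `@bottom-center` margin boxes and the `counter(page)` and `counter(pages)` counters come from the CSS Paged Media specification.

```typescript
// Quote a value as a CSS string, escaping backslashes and double quotes.
function cssString(value: string): string {
  return `"${value.replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`;
}

// Build an @page rule with header text at the top and a page counter at the bottom.
function headerFooterCss(headerText: string): string {
  return [
    "@page {",
    `  @top-center { content: ${cssString(headerText)}; }`,
    '  @bottom-center { content: "Page " counter(page) " of " counter(pages); }',
    "}",
  ].join("\n");
}

const footerCss = headerFooterCss("Acme Corp, Confidential");
```

The generated rule sits in the stylesheet rather than in the HTML body, so header and footer content is maintained separately from the document content.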

Iteration Layer’s headers and footers are blocks — the same block types you use in the document body. A paragraph block in the header renders as header text. A page-number block renders the current page number. You control alignment with text_alignment, not CSS margin box positioning.

The difference is legibility. Compare Prince’s @page { @bottom-center { content: counter(page); } } with Iteration Layer’s { type: "page-number", text_alignment: "center" }. Both produce the same result. The first reads like a CSS specification, the second like a document description.

Side-by-Side

| Capability | DocRaptor | Iteration Layer |
| --- | --- | --- |
| Rendering engine | Prince XML | Native document renderer |
| Input format | HTML + CSS | JSON block structure |
| Layout model | CSS Paged Media | Block types + 12-column grid |
| PDF output | Yes | Yes |
| DOCX output | No | Yes (native Word) |
| EPUB output | No | Yes |
| PPTX output | No | Yes |
| Table of contents | Manual HTML + CSS leaders | Auto-generated from headlines |
| Tables | HTML tables (pagination issues) | Column/row spans, repeating headers |
| Headers/footers | @page margin boxes | Block-based, same as content |
| Page numbers | CSS counters | Dedicated block type |
| Multi-column | CSS columns/floats | 12-column grid |
| Markdown in content | No (raw HTML) | Yes, in paragraph blocks |
| Data residency | US-hosted | EU-hosted (Frankfurt) |

When DocRaptor Makes Sense

DocRaptor is the right choice when PDF is your only output format and your team already knows CSS Paged Media. If you have existing HTML templates that Prince renders correctly, DocRaptor lets you generate PDFs without rethinking your template architecture.

It’s also the right fit if you need pixel-perfect control over PDF typography. Prince’s rendering quality is hard to beat — it handles kerning, ligatures, and typographic details at a level that dedicated document renderers still aspire to. For print-quality PDF output where the visual design is paramount, Prince through DocRaptor remains the industry standard.

DocRaptor starts at $15/month for 125 documents, with higher tiers for more volume. The pricing is straightforward and well-established.

When to Use Iteration Layer

If your documents need to ship in more than one format — PDF for archival, DOCX for collaboration, EPUB for distribution, PPTX for presentations — the block model eliminates the need to maintain separate template systems for each output.

If your team doesn’t know CSS Paged Media and doesn’t want to learn it, structured JSON blocks are a shorter path to production. You describe the document’s content and structure, not its visual rendering in a CSS specification that only one engine understands.

The block model also simplifies dynamic documents. Generating a contract from structured data means mapping your data to blocks — no HTML string templating, no worrying about escaping user content in HTML, no CSS that breaks when a table has more rows than you planned for. The content model handles the layout concerns that CSS Paged Media pushes onto you.
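Mapping data to blocks can be sketched in a few lines. The `Contract` and `ContractSection` types and the `contractToBlocks` function are hypothetical and exist only for this example. The headline and paragraph block fields follow the service agreement example earlier in this article.

```typescript
type Block =
  | { type: "headline"; level: "h1" | "h2"; text: string }
  | { type: "paragraph"; markdown: string };

// Hypothetical input data, e.g. loaded from a database or CRM.
interface ContractSection {
  title: string;
  body: string;
}

interface Contract {
  title: string;
  sections: ContractSection[];
}

// Turn structured contract data into a flat list of content blocks.
function contractToBlocks(contract: Contract): Block[] {
  const blocks: Block[] = [{ type: "headline", level: "h1", text: contract.title }];
  contract.sections.forEach((section, index) => {
    // Number each section heading: "1. ...", "2. ...", and so on.
    blocks.push({ type: "headline", level: "h2", text: `${index + 1}. ${section.title}` });
    blocks.push({ type: "paragraph", markdown: section.body });
  });
  return blocks;
}

const contractBlocks = contractToBlocks({
  title: "Service Agreement",
  sections: [
    { title: "Scope of Services", body: "Provider agrees to deliver..." },
    { title: "Payment Terms", body: "Client shall pay **$5,000/month**..." },
  ],
});
```

The resulting array could be used as the `content` field of the document object, with the section count driven entirely by the data.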

Get Started

Check the docs for the full block reference, style options, and SDK guides. The TypeScript and Python SDKs handle authentication and response parsing — your integration is a single function call that returns a document buffer ready to save or stream.

Document Generation is part of a composable suite — chain it with Document Extraction to parse documents and automatically generate formatted output, all through one credit pool.
