Document to Markdown

Convert documents and images to clean markdown


Convert PDFs, Office files, images, emails, tables, and public URLs into clean markdown for RAG systems, agents, extraction, and review.

Try for Free

Zero data retention
Made & hosted in the EU
$65 free trial credits

See Document to Markdown in action

Start from a real implementation pattern, not blank docs. See the input, runnable code, and structured output your workflow can use next.

Read the docs

Output Preview

# Northwind Accounting Services GmbH

## Invoice INV-2024-0042

Pappelallee 18  
10437 Berlin  
accounts@northwind.example  
DE813529441

**Invoice Date:** 2024-03-15  
**Due Date:** 2024-04-14  
**Payment Terms:** Net 30

## Bill To

Acme Retail Europe  
Finance Team  
Nieuwezijds Voorburgwal 21  
1012 RC Amsterdam

| Description | Hours | Rate | Amount |
| --- | ---: | ---: | ---: |
| Month-end close automation workshop | 6 | USD 120.00 | USD 720.00 |
| Invoice schema rollout and testing | 4 | USD 120.00 | USD 480.00 |
| Vendor onboarding playbook update | 2 | USD 95.00 | USD 190.00 |

**Subtotal:** USD 1,390.00  
**Tax (0%):** USD 0.00  
**Total Due:** USD 1,390.00

Please remit payment within 30 days via bank transfer using reference **INV-2024-0042**. IBAN: DE42 1001 0010 0987 6543 21.

Output Preview

# Elena Vasquez

## Senior Software Engineer

Berlin, Germany  
elena.vasquez@email.com  
+49 170 1234567  
github.com/elenavasquez

## Professional Summary

Distributed systems engineer with 8 years of experience across Elixir, Kubernetes, and event-driven platforms. Built internal developer tooling, event-driven data pipelines, and high-throughput APIs for B2B SaaS teams.

## Professional Experience

- TechFlow GmbH — Senior Software Engineer (2022-Present) — led event ingestion platform handling 120M jobs/month
- DataBridge AG — Software Engineer (2018-2022) — migrated reporting from batch scripts to a self-serve data platform
- NordStack Labs — Platform Engineer (2016-2018) — built deployment tooling for regulated cloud workloads

## Selected Achievements

- Cut incident recovery time by 42% through automated failover runbooks
- Reduced p95 request latency from 480ms to 170ms on a multi-tenant API
- Mentored 6 engineers into senior and staff-level platform roles

## Education

M.Sc. Computer Science, Technical University of Munich

## Skills

Elixir, Python, Go, Kubernetes, PostgreSQL, Kafka, Terraform, AWS

Output Preview

# Custom Burger Receipt

121 Seventh Street  
San Francisco, CA 94103  
(415) 252-2634

**WiFi:** SOMA250  
**Password:** 59632  
**Order:** 9007  
**Order Time:** Apr'28'09 03:09PM

| Item | Amount |
| --- | ---: |
| Veggie Burger | USD 5.99 |
| Bleu Cheese | USD 1.49 |
| 1 Bal 1/2 | USD 3.79 |
| Cash | USD 13.00 |

**Subtotal:** USD 11.27  
**Tax:** USD 1.07  
**Payment:** USD 12.34  
**Change Due:** USD 0.66

Output Preview

# Executive Meeting Memo

**To:** COO, CFO, VP Operations  
**From:** Operations Office  
**Date:** 2024-11-05

**Subject:** Q1 readiness, staffing posture, and warehouse expansion

## Executive Summary

Leadership reviewed expansion readiness, staffing constraints, and Q1 operating targets. The team agreed to unlock warehousing capacity now while keeping fixed headcount growth disciplined through the first quarter.

## Key Decisions

- Approve warehouse expansion in Frankfurt with a January 12 start date
- Freeze discretionary hiring until February board review
- Track Q1 gross margin weekly against the revised operating plan

## Action Register

| Action | Owner | Due |
| --- | --- | --- |
| Finalize racking vendor shortlist | Logistics | 2024-11-12 |
| Publish revised hiring guardrails | People Ops | 2024-11-15 |
| Send weekly margin dashboard to leadership | Finance | Fridays |

Input Preview

\documentclass[11pt]{article}
\usepackage[margin=1in]{geometry}
\usepackage{booktabs}
\usepackage{hyperref}
\title{Document Processing Benchmark Notes}
\author{Nadia Keller}
\date{March 2026}
\begin{document}
\maketitle

\begin{abstract}
We compared structured extraction, markdown conversion, and spreadsheet generation workflows across invoices, scanned warehouse sheets, and compliance packets.
\end{abstract}

\section{Scope}
The benchmark covered 120 source files spanning PDF, DOCX, JPEG, and LaTeX inputs. We measured field accuracy, table retention, and handoff effort for downstream automation.

\section{Summary Table}
\begin{tabular}{lrr}
\toprule
Workflow & Accuracy & Median Runtime \\
\midrule
Invoice extraction & 98.4\% & 1.8s \\
Markdown conversion & 96.9\% & 1.2s \\
Sheet generation & 100.0\% & 0.7s \\
\bottomrule
\end{tabular}

\section{Key Findings}
Structured document APIs reduce glue code, preserve tabular content better than OCR-only pipelines, and shorten review time for finance operations.

\begin{itemize}
  \item OCR-only pipelines lost row groupings in 17\% of warehouse tables.
  \item Markdown output remained suitable for LLM ingestion without custom cleanup.
  \item Spreadsheet generation removed a manual CSV reformatting step from the finance workflow.
\end{itemize}

\section{Next Steps}
Extend the benchmark to receipts, insurance packets, and multi-file extraction. Publish the evaluation harness after the April review.
\end{document}

Output Preview

# Document Processing Benchmark Notes

**Author:** Nadia Keller  
**Date:** March 2026

## Abstract

We compared structured extraction, markdown conversion, and spreadsheet generation workflows across invoices, scanned warehouse sheets, and compliance packets.

## Scope

The benchmark covered 120 source files spanning PDF, DOCX, JPEG, and LaTeX inputs. We measured field accuracy, table retention, and handoff effort for downstream automation.

## Summary Table

| Workflow | Accuracy | Median Runtime |
| --- | ---: | ---: |
| Invoice extraction | 98.4% | 1.8s |
| Markdown conversion | 96.9% | 1.2s |
| Sheet generation | 100.0% | 0.7s |

## Key Findings

- OCR-only pipelines lost row groupings in 17% of warehouse tables.
- Markdown output remained suitable for LLM ingestion without custom cleanup.
- Spreadsheet generation removed a manual CSV reformatting step from the finance workflow.

## Next Steps

Extend the benchmark to receipts, insurance packets, and multi-file extraction. Publish the evaluation harness after the April review.

Request (Bash)

curl -X POST \
  https://api.iterationlayer.com/document-to-markdown/v1/convert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "file": {
    "type": "url",
    "name": "accounts-payable-invoice.pdf",
    "url": "https://iterationlayer.com/code-samples/accounts-payable-invoice.pdf"
  }
}'

Response

{
  "success": true,
  "data": {
    "name": "accounts-payable-invoice.pdf",
    "mime_type": "application/pdf",
    "markdown": "# Northwind Accounting Services GmbH\n\n## Invoice INV-2024-0042\n\nPappelallee 18\n10437 Berlin\naccounts@northwind.example\nDE813529441\n\n**Invoice Date:** 2024-03-15\n**Due Date:** 2024-04-14\n**Payment Terms:** Net 30\n\n## Bill To\n\nAcme Retail Europe\nFinance Team\nNieuwezijds Voorburgwal 21\n1012 RC Amsterdam\n\n| Description | Hours | Rate | Amount |\n|---|---:|---:|---:|\n| Month-end close automation workshop | 6 | USD 120.00 | USD 720.00 |\n| Invoice schema rollout and testing | 4 | USD 120.00 | USD 480.00 |\n| Vendor onboarding playbook update | 2 | USD 95.00 | USD 190.00 |\n\n**Subtotal:** USD 1,390.00\n**Tax (0%):** USD 0.00\n**Total Due:** USD 1,390.00"
  }
}

Request (TypeScript)

import { IterationLayer } from "iterationlayer";

const client = new IterationLayer({
  apiKey: "YOUR_API_KEY",
});

const result = await client.convertDocumentToMarkdown({
  file: {
    type: "url",
    name: "accounts-payable-invoice.pdf",
    url: "https://iterationlayer.com/code-samples/accounts-payable-invoice.pdf",
  },
});


Request (Python)

from iterationlayer import IterationLayer

client = IterationLayer(
    api_key="YOUR_API_KEY"
)

result = client.convert_document_to_markdown(
    file={
        "type": "url",
        "name": "accounts-payable-invoice.pdf",
        "url": "https://iterationlayer.com/code-samples/accounts-payable-invoice.pdf",
    }
)


Request (Go)

import il "github.com/iterationlayer/sdk-go"

client := il.NewClient("YOUR_API_KEY")

result, err := client.ConvertDocumentToMarkdown(il.ConvertDocumentToMarkdownRequest{
  File: il.NewFileFromURL(
    "accounts-payable-invoice.pdf",
    "https://iterationlayer.com/code-samples/accounts-payable-invoice.pdf",
  ),
})


Use the same workflow from code, agents, or n8n

When an automation moves from prototype to production, you should not have to rebuild it for every environment. Iteration Layer lets scripts, agents, and n8n workflows call the same European AI workflow runtime.

Fits into your existing stack

Native SDKs for TypeScript, Python, and Go. OpenAPI spec for everything else. MCP server for AI agents and Claude Code skills. n8n integration for visual workflows.
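
If your language has no SDK, the request from the curl example can be assembled with any HTTP client. The sketch below does that with Python's standard library and only builds the request object without sending it. The endpoint, headers, and body fields are copied from the curl example on this page; the helper name `build_convert_request` is made up for illustration.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.iterationlayer.com/document-to-markdown/v1/convert"


def build_convert_request(api_key: str, name: str, url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = {"file": {"type": "url", "name": name, "url": url}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_convert_request(
    "YOUR_API_KEY",
    "accounts-payable-invoice.pdf",
    "https://iterationlayer.com/code-samples/accounts-payable-invoice.pdf",
)
```

Passing the returned object to `urllib.request.urlopen` would send it. Timeouts, retries, and error handling are left out of this sketch.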

EU AI workflow runtime

Run document, image, and file steps through one EU-hosted workflow layer with shared API conventions and billing.

Agent-ready by design

Expose the same document and image actions to MCP tools and Claude Code skills, then reuse the API contract when workflows graduate into scripts or automations.

Verified n8n node

Install the verified Iteration Layer node in n8n, then route documents and generated files through the same provider from visual workflows.

Install official n8n Node Read the Guide

Three steps to your first conversion

Send your document

Send any document by URL or as base64: PDF, Office, EPUB, LaTeX, email, images, public website URLs, and more. Every supported format goes through the same endpoint.

We parse, OCR, and describe

The document is parsed, scanned pages are run through OCR, and tables are extracted. Image files also receive a natural language description of their visual content.

Get clean markdown

Receive a JSON result with the file name, MIME type, and extracted markdown. HTML links are rendered as numbered references, and image files also include a plain-language description field.
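
A response shaped like the one in the example request can be unpacked in a few lines. This sketch relies on the `success`, `data.name`, `data.mime_type`, and `data.markdown` fields from that example. How a failed conversion is reported is not shown on this page, so the `success` check and the `ValueError` are assumptions, and `Conversion` is a name invented for the sketch.

```python
import json
from dataclasses import dataclass


@dataclass
class Conversion:
    name: str
    mime_type: str
    markdown: str


def parse_conversion(response_text: str) -> Conversion:
    """Turn the JSON response body into a small typed record."""
    payload = json.loads(response_text)
    # Assumption: a falsy "success" flag marks a failed conversion.
    if not payload.get("success"):
        raise ValueError("conversion was not successful")
    data = payload["data"]
    return Conversion(data["name"], data["mime_type"], data["markdown"])


# Shortened copy of the example response from this page.
sample_response = json.dumps({
    "success": True,
    "data": {
        "name": "accounts-payable-invoice.pdf",
        "mime_type": "application/pdf",
        "markdown": "# Northwind Accounting Services GmbH\n\n## Invoice INV-2024-0042",
    },
})

conversion = parse_conversion(sample_response)
first_heading = conversion.markdown.splitlines()[0]
```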

Intelligent Parsing

The API automatically selects the best parsing approach for your document. Dense tables, multi-column layouts, and mixed content are handled without any configuration.

Top-tier extraction quality

Strong extraction accuracy across real workflow files: forms, invoices, scans, tables, charts, and photos. Document to Markdown scored 0.93 in our benchmark, placing second overall.

Clean Markdown Output

Headings, paragraphs, tables, lists, and link references are preserved as clean markdown syntax. Website pages drop common navigation and footer boilerplate before output.
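
Because tables come back as plain markdown pipe syntax, downstream code can read them without a dedicated parser. The following is a minimal sketch, not part of the API: `parse_markdown_table` is a hypothetical helper that assumes simple pipe tables like the invoice example on this page and does not handle escaped pipes inside cells.

```python
from decimal import Decimal


def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Collect the data rows of every pipe table as header-keyed dicts."""
    rows: list[dict[str, str]] = []
    header = None
    for line in markdown.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            header = None  # a non-table line ends the current table
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first table line is the header row
        elif all(cell and set(cell) <= set("-:") for cell in cells):
            continue  # alignment row such as |---|---:|
        else:
            rows.append(dict(zip(header, cells)))
    return rows


invoice_markdown = """
| Description | Hours | Rate | Amount |
|---|---:|---:|---:|
| Month-end close automation workshop | 6 | USD 120.00 | USD 720.00 |
| Invoice schema rollout and testing | 4 | USD 120.00 | USD 480.00 |
| Vendor onboarding playbook update | 2 | USD 95.00 | USD 190.00 |
"""

rows = parse_markdown_table(invoice_markdown)
# Sum the line items to cross-check the invoice subtotal.
total = sum(Decimal(row["Amount"].removeprefix("USD ")) for row in rows)
```

Here `total` can be compared against the subtotal printed on the invoice as a quick sanity check on the conversion.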

Deep Content Understanding

Images and scanned documents aren't treated as pixel grids to OCR. The API understands what they depict — product photos, charts, diagrams — and returns a natural language description alongside the extracted text.

Built-In OCR

Scanned PDFs and image files are automatically run through OCR. You get readable markdown regardless of whether the source is text or pixels.

All Document Formats

40+ file formats plus public website URLs — PDF, DOCX, PPTX, ODT, ODS, XLSX, EPUB, LaTeX, EML, Jupyter, images, and more — all handled by the same endpoint. No format-specific setup or pre-processing required.

No Model Training

Your documents are never used to train or improve AI models. This is guaranteed for all plans — not gated behind an enterprise contract.

Real-world pipelines, ready to ship

Each recipe chains multiple APIs into a complete workflow. Pick one, tweak it, and deploy — or use it as a starting point for your own pipeline.

Convert Contract to Markdown

Convert a contract PDF to clean markdown for clause extraction or LLM analysis.

Convert Document for Knowledge Base

Convert external documents — specs, contracts, reports — to markdown for knowledge base ingestion.

Convert Document for RAG Ingestion

Convert a document to clean markdown suitable for chunking and embedding in a RAG pipeline.

Convert Invoice to Markdown

Convert a PDF invoice to clean markdown for LLM processing or document pipelines.

Convert Resume to Markdown

Convert a resume PDF to clean markdown for LLM parsing or candidate pipelines.

Convert Document to Markdown

Convert PDF, DOCX, HTML, or image documents to clean, structured Markdown.

Preprocess Document for LLM Classification

Convert a document to markdown and classify it with an LLM in a single pipeline.
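
For the RAG and knowledge base recipes, one common next step is splitting the markdown into chunks at heading boundaries before embedding. The sketch below shows one way to do that; it is an illustration rather than the recipe's actual code, `chunk_by_heading` is a hypothetical helper, and it would mis-split `#` lines that appear inside fenced code blocks.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def chunk_by_heading(markdown: str) -> list[dict[str, str]]:
    """Split markdown into one chunk per heading, keeping the heading as the title."""
    chunks: list[dict[str, str]] = []
    title, body = "", []

    def flush() -> None:
        # Emit the section collected so far, skipping empty leading content.
        text = "\n".join(body).strip()
        if title or text:
            chunks.append({"title": title, "text": text})

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, body = match.group(1).strip(), []
        else:
            body.append(line)
    flush()
    return chunks


memo = """# Executive Meeting Memo

**Date:** 2024-11-05

## Executive Summary

Leadership reviewed expansion readiness, staffing constraints, and Q1 operating targets.

## Key Decisions

- Approve warehouse expansion in Frankfurt with a January 12 start date
- Freeze discretionary hiring until February board review
"""

chunks = chunk_by_heading(memo)
titles = [chunk["title"] for chunk in chunks]
```

Keeping the heading as a separate `title` field lets you prepend it to each chunk at embedding time or store it as metadata.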

Browse all recipes

European by design

Your data is processed on EU-hosted infrastructure and never stored beyond temporary logs. Zero data retention, GDPR-compliant workflows, and a Data Processing Agreement are available for every customer. Learn more about our security practices.

EU-hosted core processing

Application and processing infrastructure runs in Europe, with provider-scope ISO 27001 and BSI C5 evidence documented for procurement reviews.

Zero data retention

Customer files and processing results are not stored after the request. Usage logs are retained for 90 days and automatically deleted.

Clear answers for security teams

Give reviewers the answers they need up front: where files are processed, what is retained, which subprocessors are involved, and how AI inputs, outputs, review gates, and audit records move through each workflow.

Trust and Compliance, EU Hosting, Responsible AI Use, Security Controls

Pricing

Start usage-based. Switch to a subscription when your volume becomes predictable.

Pay as you go

Usage-based

$0.033 to $0.022 / credit

Graduated pricing. Your effective rate decreases automatically as monthly usage grows.

No monthly commitment
Pay only for credits used
Automatic volume discounts as usage grows
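
To see how graduated pricing lowers the effective rate, here is a small worked sketch. Only the first rate ($0.033) and the last rate ($0.022) come from this page. The tier boundaries and the middle rate below are hypothetical placeholders chosen for the arithmetic, not the actual price list.

```python
from decimal import Decimal

# HYPOTHETICAL tiers: boundaries and the 0.028 rate are invented for illustration.
# Each entry is (credits in this tier, price per credit); None means "no upper limit".
TIERS = [
    (1_000, Decimal("0.033")),
    (9_000, Decimal("0.028")),
    (None, Decimal("0.022")),
]


def graduated_cost(credits: int) -> Decimal:
    """Charge each tier's rate only for the credits that fall inside that tier."""
    remaining, total = credits, Decimal("0")
    for size, rate in TIERS:
        used = remaining if size is None else min(remaining, size)
        total += used * rate
        remaining -= used
        if remaining == 0:
            break
    return total


small_month = graduated_cost(500)      # stays entirely in the first tier
large_month = graduated_cost(15_000)   # spans all three tiers
effective_rate = large_month / 15_000  # average price per credit for the large month
```

Under these placeholder tiers the large month averages out below the first-tier rate, which is the effect the graduated model describes.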

Subscriptions

Predictable volume

From $29.99 /month

Fixed recurring credit packs with lower effective credit prices for steady usage.

Lower effective per-credit prices
Fixed recurring credit packs
Predictable monthly budget

Compare Plans

All APIs included
Free trial credits per API
Project-based budget caps
Auto overage billing

Compare pricing options

Still evaluating?

Compare Iteration Layer against the biggest alternatives at a glance, then open the full head-to-head pages when you want the details.

| Feature | Iteration Layer | LlamaParse | Mistral Document AI | Unstructured |
| --- | --- | --- | --- | --- |
| Markdown output | Clean markdown: Returns well-structured markdown with preserved headings, tables, and lists from any document | Clean markdown: Returns clean markdown optimized for LLM context and RAG pipeline ingestion | Clean markdown: High-quality markdown with strong structure preservation and table support | Element-based: Returns structured JSON elements that can be reassembled into markdown; designed for chunking, not direct markdown output |
| Table preservation | Markdown tables: Tables are extracted and rendered as clean markdown table syntax | Markdown tables: Strong table-to-markdown conversion with multi-column layout support | Markdown tables: Strong table-to-markdown conversion as part of Mistral's document understanding | Element-based: Tables extracted as structured elements with HTML table support for complex layouts |
| Image description | Yes: Returns a natural language description of image content alongside OCR markdown for image files | No: Text extraction only, no semantic description of visual image content | No: Text extraction only, no semantic description of visual image content | No: Text extraction only, no semantic description of visual image content |
| Supported input formats | 40+ formats: Process 40+ formats (PDF, Office, EPUB, RTF, LaTeX, email, Jupyter, images, and more) in a single API endpoint | PDF, images: Primarily supports PDF and image input formats | PDF, images: Supports PDF and image files for OCR and markdown conversion | 64+ formats: Supports 64+ file types including email, code files, and specialized formats |

See how we compare to our competitors

LlamaParse, Mistral Document AI, Unstructured, Reducto, Nanonets, DocuPipe, Extend, PDF.ai, AWS Textract, Azure Document Intelligence, Google Document AI, OlmOCR, PaddleOCR, Tesseract

Frequently asked questions

How accurate is the extraction quality?

Our OCR benchmark shows strong extraction accuracy, reliability, and performance across 41 real workflow files, including forms, invoices, scans, tables, charts, and photos.

What file formats are supported?

The API accepts 40+ file formats including PDF, DOCX, PPTX, ODT, ODS, XLSX, EPUB, CSV, TSV, HTML, LaTeX, EML, Jupyter notebooks, and all common image formats. Scanned documents are processed with built-in OCR.

What is the difference between this and Document Extraction?

Document to Markdown runs only the ingestion step — it converts files to clean markdown. Document Extraction builds on this by also applying a schema to extract specific fields as structured JSON. Use Document to Markdown when you want the content itself; use Document Extraction when you want specific named values.

Why does the markdown include an image description?

For image files, the API runs both OCR (to extract any text) and a vision model (to describe the visual content). The description is returned as a separate field so you can use it in your own downstream processing.

How many files can I send per request?

Up to 20 files per request. Each file gets its own result in the response array. The order of results matches the order of the input files.
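
If you have more than 20 files, you can split them into batches and rely on the ordering guarantee to match results back to inputs. The sketch below shows only that bookkeeping. The multi-file request body is not shown on this page, so no request is built here, and the helper names and stand-in results are hypothetical.

```python
from typing import Iterator

MAX_FILES_PER_REQUEST = 20  # limit stated in the FAQ answer


def batched(files: list[dict], size: int = MAX_FILES_PER_REQUEST) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` files, preserving order."""
    for start in range(0, len(files), size):
        yield files[start:start + size]


def pair_with_inputs(files: list[dict], results: list[dict]) -> list[tuple[dict, dict]]:
    """Match each result to its input by position, as the ordering guarantee allows."""
    if len(files) != len(results):
        raise ValueError("expected one result per input file")
    return list(zip(files, results))


files = [
    {"type": "url", "name": f"doc-{i}.pdf", "url": f"https://example.com/doc-{i}.pdf"}
    for i in range(45)
]
batch_sizes = [len(batch) for batch in batched(files)]

# Stand-in results in input order; a real call would return these from the API.
first_batch = next(batched(files))
fake_results = [{"name": f["name"], "markdown": "..."} for f in first_batch]
pairs = pair_with_inputs(first_batch, fake_results)
```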

Is the output suitable for LLMs?

Yes. The markdown format is the same one the Document Extraction API uses internally as input to LLM extraction. Tables, structure, content, and numbered link references are preserved in a way that models read reliably.

Is Document to Markdown GDPR-compliant?

Yes. Files are processed on EU infrastructure, handled in memory, and not retained after processing. See our security practices and GDPR and AI Act overview for the compliance context.

Build your first workflow in minutes

Chain our APIs into a workflow you can test with your own data. Free trial credits included.
