Convert any document or image to clean markdown with a single API call. Send one file and receive the markdown output — no schema, no extraction, just clean text ready for LLM pipelines or downstream processing.
Key Features
- Multi-Format Support — Convert PDF, DOCX, XLSX, images, HTML, Markdown, CSV, JSON, plain text, and public website URLs.
- Built-In OCR — Scanned PDFs and image files are processed through OCR automatically. No separate step required.
- Image Descriptions — For image files and useful referenced images, OCR text and a plain-language visual description are included as structured image context.
- Website Image Context — Public website URL inputs include useful referenced images in the markdown context while skipping logos, tracking pixels, decorative graphics, and semantically unhelpful images.
- LLM-Grade Output — The markdown format is the same used internally by the Document Extraction API. Tables, structure, numbered link references, and layout are preserved for reliable LLM consumption.
Overview
The Document to Markdown API converts a document or public website page to clean markdown. You send one file (as base64 or by URL) or a public website URL and receive a JSON object with the result.
Endpoint: POST /document-to-markdown/v1/convert
Limits:
- Max file size: 50 MB
Supported File Formats
- Documents: PDF, DOCX, PPTX, ODT, EPUB, RTF
- Spreadsheets: XLSX, XLS, ODS, CSV, TSV
- Email: EML, MSG (headers, body, and attachment extraction)
- Notebooks: Jupyter (.ipynb)
- Academic & Publishing: LaTeX (.tex, .latex), BibTeX (.bib), Typst (.typst, .typ)
- Markup & Text: HTML, Markdown, JSON, XML, YAML, TOML, RST, Org, Djot, MDX, TXT
- Images: PNG, JPEG, GIF, WebP, AVIF, HEIF, BMP, TIFF, JP2, PNM/PBM/PGM/PPM, SVG
How It Works
Every conversion runs the same ingestion pipeline used by Document Extraction:
- Parse — the file format is detected and validated.
- Ingest — the file is converted to markdown using the appropriate processor:
  - PDF — pages are rendered to images and run through OCR.
  - Images (PNG, JPEG, GIF, WebP, AVIF, HEIF, BMP, TIFF, JP2, PNM) — OCR extracts text and a vision model generates a description of the visual content.
  - Office documents (DOCX, PPTX, ODT, ODS, XLSX/XLS) — content is extracted and normalized to markdown with formatting, tables, lists, and footnotes preserved.
  - EPUB — chapters are extracted and converted to markdown via the HTML pipeline.
  - LaTeX — converted to markdown: headings, formatting, lists, tables, math equations, and code blocks.
  - Jupyter Notebooks — code and markdown cells are extracted with outputs.
  - RTF — converted to markdown with bold, italic, strikethrough, Unicode, and special characters.
  - Email (EML, MSG) — headers and body are parsed into structured markdown. Attachments are extracted, ingested through the pipeline, and returned separately.
  - CSV/TSV — converted to markdown tables with auto-detected delimiters.
  - HTML and website pages — converted through the HTML ingestion path before markdown output, with navigation/footer boilerplate removed, links rendered as numbered references, and useful image references annotated with OCR and visual description context.
  - Text and markup formats (Markdown, JSON, XML, YAML, TOML, RST, Org, Djot, MDX, BibTeX, Typst, TXT) — returned as-is for direct LLM consumption.
- Return — the result is returned as a JSON object.
There is no LLM extraction step. The API stops after ingestion.
How Nested Files Work
Some file formats contain other files — emails have attachments, archives have entries. When the API encounters a container format, it extracts the nested files and ingests each one through the same pipeline.
Currently supported containers: EML and MSG (email attachments).
The response includes:
- markdown — the container’s own content (email headers and body), plus an “Attachments” section listing filenames
- nested_files — an array of ingested nested files, each with name and markdown (and description for image files)
Each nested file is billed as its own document. An email with a 3-page PDF attachment costs 1 credit (email) + 3 credits (PDF pages) = 4 credits total.
For Document Extraction, nested file markdown is appended to the container’s markdown so the LLM sees the full content — email body and all attachments — as one combined context.
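The combined context described above can be approximated on the client side. The following is a minimal sketch, assuming the parsed result carries `markdown` and `nested_files` as listed; `combine_markdown` and the "Attachment" heading layout are hypothetical choices for illustration, not the format Document Extraction uses internally.

```python
from typing import Any, Dict


def combine_markdown(data: Dict[str, Any]) -> str:
    """Join a container's markdown with the markdown of its nested files."""
    parts = [data.get("markdown", "")]
    for nested in data.get("nested_files", []):
        # The heading format is an illustrative choice, not the API's own layout.
        section = f"## Attachment: {nested['name']}\n\n{nested['markdown']}"
        # Image attachments may also carry a description field.
        if nested.get("description"):
            section += f"\n\nDescription: {nested['description']}"
        parts.append(section)
    # Skip empty parts so a container without a body adds no blank section.
    return "\n\n".join(part for part in parts if part)


example = {
    "markdown": "Subject: Invoice\n\nPlease find the invoice attached.",
    "nested_files": [
        {"name": "invoice.pdf", "markdown": "# Invoice\n\nTotal: $1,400.00"},
    ],
}
combined = combine_markdown(example)
```

Results without `nested_files` pass through unchanged, so the same helper can be applied to every response.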
Request Format
# cURL
curl -X POST \
  https://api.iterationlayer.com/document-to-markdown/v1/convert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "file": {
      "type": "base64",
      "name": "invoice.pdf",
      "base64": "<base64-encoded-file>"
    }
  }'

// TypeScript SDK
import { IterationLayer } from "iterationlayer";

const client = new IterationLayer({
  apiKey: "YOUR_API_KEY",
});

const result = await client.convertDocumentToMarkdown({
  file: {
    type: "base64",
    name: "invoice.pdf",
    base64: new Uint8Array([/* file bytes */]),
  },
});

# Python SDK
from iterationlayer import IterationLayer

client = IterationLayer(api_key="YOUR_API_KEY")

result = client.convert_document_to_markdown(
    file={
        "type": "base64",
        "name": "invoice.pdf",
        "base64": b"...",
    }
)

// Go SDK
import il "github.com/iterationlayer/sdk-go"

client := il.NewClient("YOUR_API_KEY")

result, err := client.ConvertDocumentToMarkdown(il.ConvertDocumentToMarkdownRequest{
	File: il.FileInput{Type: "base64", Name: "invoice.pdf", Base64: []byte{ /* file bytes */ }},
})

Each example returns the same response:

{
  "success": true,
  "data": {
    "name": "invoice.pdf",
    "mime_type": "application/pdf",
    "markdown": "# Invoice\n\n**Invoice Number:** INV-2024-0042\n\n**Date:** 2024-03-15\n\n| Description | Qty | Unit Price | Total |\n|---|---|---|---|\n| Consulting | 10h | $100.00 | $1,000.00 |\n| Support | 5h | $80.00 | $400.00 |\n\n**Total: $1,400.00**"
  }
}

Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | FileInput | Yes | The file to convert. |
| webhook_url | string | No | HTTPS URL to receive results asynchronously. If provided, returns 201 immediately. See Webhooks. |
Async Mode
Add a webhook_url parameter to process the request in the background. The API returns a 201 response immediately and delivers the result to your webhook URL when processing completes. See Webhooks for payload format and retry behavior.
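As an illustration of switching a request to async mode, here is a minimal sketch that adds `webhook_url` to an existing request body. The helper name `with_webhook` and the example URLs are hypothetical; the only rule taken from this page is that the webhook URL must be HTTPS.

```python
from urllib.parse import urlparse


def with_webhook(body: dict, webhook_url: str) -> dict:
    """Return a copy of the request body with webhook_url set."""
    parsed = urlparse(webhook_url)
    # The parameter table requires an HTTPS URL, so reject anything else early.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("webhook_url must be an absolute HTTPS URL")
    # Copy instead of mutating so the synchronous body stays reusable.
    return {**body, "webhook_url": webhook_url}


sync_body = {
    "file": {
        "type": "url",
        "url": "https://example.com/report.pdf",
        "name": "report.pdf",
    }
}
async_body = with_webhook(sync_body, "https://example.com/hooks/markdown")
```

Validating the scheme locally is only a convenience; the API remains the authority on which webhook URLs it accepts.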
FileInput
The file is either a base64-encoded binary or a URL reference.
| Parameter | Type | Required | Description |
|---|---|---|---|
| type | "base64" or "url" | Yes | Input method. |
| name | string | Required for base64, optional for url | File name including extension. URL inputs without a name may be resolved as website pages. |
| base64 | string | When type = "base64" | Base64-encoded file content. |
| url | string | When type = "url" | Public HTTPS URL to fetch the file from. HTTP is not accepted. |
| fetch_options | object | No | Website retrieval options for public URL inputs. |
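The two input methods in the table can be built with small helpers. This is a sketch under stated assumptions: `base64_file_input`, `url_file_input`, and `MAX_FILE_BYTES` are hypothetical names, and the size check assumes the 50 MB limit applies to the raw file bytes with MB read as MiB, which this page does not specify.

```python
import base64
from typing import Optional

# Assumption: the 50 MB limit is checked against raw bytes, with MB read as MiB.
MAX_FILE_BYTES = 50 * 1024 * 1024


def base64_file_input(name: str, content: bytes) -> dict:
    """Build a FileInput object for inline base64 content."""
    if len(content) > MAX_FILE_BYTES:
        raise ValueError(f"{name} exceeds the assumed {MAX_FILE_BYTES}-byte limit")
    return {
        "type": "base64",
        "name": name,
        # The JSON body needs text, so decode the encoded bytes to an ASCII string.
        "base64": base64.b64encode(content).decode("ascii"),
    }


def url_file_input(url: str, name: Optional[str] = None) -> dict:
    """Build a FileInput object for a URL; without a name it may resolve as a website page."""
    if not url.startswith("https://"):
        raise ValueError("only HTTPS URLs are accepted")
    file_input = {"type": "url", "url": url}
    if name is not None:
        file_input["name"] = name
    return file_input


pdf_input = base64_file_input("invoice.pdf", b"%PDF-1.7 example bytes")
page_input = url_file_input("https://example.com/docs/api-reference")
```

Either result can be used as the value of `file` in the request body.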
For websites, the API always checks the site’s robots.txt before fetching the page. If the URL is disallowed for the effective User-Agent, the request fails with 400 Bad Request and the page is not fetched. Sitemap entries in robots.txt are recorded as metadata only; they are not traversed.
Website URL fetches are rate-limited per destination host. The service automatically respects robots.txt crawl-delay hints, slows down after upstream 429 Too Many Requests responses, and uses upstream Retry-After or rate-limit reset headers when provided. This protects public sites and may delay later requests to the same host; it is not configurable through fetch_options.
If a website URL fetch looks like a WAF or security challenge, the API automatically retries the same URL in an isolated Chromium browser session. Detection covers common Cloudflare, Akamai, AWS WAF/CloudFront, Imperva/Incapsula, DataDome, PerimeterX, Sucuri, F5/BIG-IP, hCaptcha, reCAPTCHA, and generic verification pages. Successful browser fallback is recorded under metadata.security; failed fallback returns 400 Bad Request with a clear security-challenge message.
Fetch Options
fetch_options lets you control retrieval behavior for a URL input treated as a website page. Robots compliance and per-host rate limiting are mandatory and are not configurable.
| Field | Type | Required | Description |
|---|---|---|---|
| locale | string | No | BCP 47 locale tag sent as Accept-Language (e.g., "en-US", "de-DE", "fr"). |
| user_agent | string | No | Custom User-Agent header string (1–500 characters). |
| auth | object | No | Authentication for website URL inputs. Supports bearer tokens, HTTP Basic auth, and a custom auth header. Secret values are not returned in metadata. |
| headers | object | No | Additional request headers for website URL inputs. Header names and values are validated; unsafe headers such as Cookie, Set-Cookie, Host, Content-Length, hop-by-hop headers, and browser-controlled Sec-* headers are rejected. |
| timeout_ms | integer | No | Website fetch timeout in milliseconds. Must be between 1000 and 60000. |
| should_render_javascript | boolean | No | Use Chromium browser rendering before conversion. Default: false. |
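The constraints in the table can be checked before a request is sent. The sketch below is a hypothetical client-side validator, not the service's own validation: `validate_fetch_options` and `REJECTED_HEADERS` are illustrative names, and the hop-by-hop entries are an assumption about which headers the service places in that group.

```python
from typing import List

# Headers named in the table, plus common hop-by-hop headers
# (assumption: the service's exact hop-by-hop list is not given on this page).
REJECTED_HEADERS = {
    "cookie", "set-cookie", "host", "content-length",
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}


def validate_fetch_options(options: dict) -> List[str]:
    """Return a list of problems; an empty list means the options look valid."""
    problems = []
    timeout = options.get("timeout_ms")
    if timeout is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(timeout, int) and not isinstance(timeout, bool)
        if not is_int or not 1000 <= timeout <= 60000:
            problems.append("timeout_ms must be an integer between 1000 and 60000")
    user_agent = options.get("user_agent")
    if user_agent is not None and not 1 <= len(user_agent) <= 500:
        problems.append("user_agent must be 1-500 characters")
    for header_name in options.get("headers", {}):
        lowered = header_name.lower()
        if lowered in REJECTED_HEADERS or lowered.startswith("sec-"):
            problems.append(f"header {header_name} is not allowed")
    return problems


issues = validate_fetch_options({"timeout_ms": 500, "headers": {"Cookie": "a=b"}})
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.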
Custom Headers
Custom headers are sent on direct fetches and, when JavaScript rendering is enabled or security-challenge fallback is needed, as Chromium request headers where supported by the browser. Each Chromium fetch uses an isolated browser context that is disposed after the request; cookies and session state are not persisted across URLs or requests.
Fetch options control retrieval behavior, not billing.
Auth Examples
Use one auth shape per request.
{ "auth": { "type": "bearer", "token": "..." } }
{ "auth": { "type": "basic", "username": "...", "password": "..." } }
{ "auth": { "type": "custom_header", "name": "x-api-key", "value": "..." } }
Website URL Example
To convert a public website page to markdown, send a URL input without a filename. Add fetch_options only when the page needs explicit retrieval controls.
{
  "file": {
    "type": "url",
    "url": "https://example.com/docs/api-reference",
    "fetch_options": {
      "should_render_javascript": true
    }
  }
}
Response Format
The response is a JSON object. The conversion result is returned in its data field, which contains:
| Field | Type | Description |
|---|---|---|
| name | string | File name from the request. |
| mime_type | string | Detected MIME type of the file. |
| markdown | string | Extracted markdown content. HTML and website inputs include numbered link references when links are present. Useful image references may be followed by structured <image> context blocks. Empty string if no text was found. |
| description | string | Plain-language description of the image content. Present only for image files (PNG, JPEG, GIF, WebP). |
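Reading these fields from a raw response body might look like the sketch below. It assumes the `success` and `data` envelope shown in the request examples; `read_conversion` is a hypothetical helper, and the shape of a failed response is not documented here, so the failure branch is a guess.

```python
import json
from typing import Optional, Tuple


def read_conversion(response_text: str) -> Tuple[str, Optional[str]]:
    """Return (markdown, description) from a raw JSON response body."""
    payload = json.loads(response_text)
    # Assumption: failures set "success" to false; the error body format is not shown here.
    if not payload.get("success"):
        raise RuntimeError("conversion failed")
    data = payload["data"]
    # description is present only for image files, so fall back to None.
    return data["markdown"], data.get("description")


raw = json.dumps({
    "success": True,
    "data": {
        "name": "product-photo.png",
        "mime_type": "image/png",
        "markdown": "Sale",
        "description": "A white ceramic mug.",
    },
})
markdown, description = read_conversion(raw)
```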
Image Files
For image files, the response includes both markdown (OCR output) and description (vision model output). The description field describes what the image depicts — suitable for use as alt text, for downstream search indexing, or as context in LLM prompts.
{
  "name": "product-photo.png",
  "mime_type": "image/png",
  "markdown": "Sale — 30% off all items",
  "description": "A product photograph of a white ceramic mug on a wooden table. The mug has a minimalist design with no text or logo. Natural lighting from the left."
}
Image Context in Markdown
When a document, HTML file, website page, or container attachment includes a useful image, the markdown preserves the image reference and appends a structured context block directly after it. The block links the OCR text and visual description to the image URL used in the markdown.

<image>
<url>/assets/product-photo.png</url>
<description>
A product photograph of a white ceramic mug on a wooden table. The mug has a minimalist design with no text or logo. Natural lighting from the left.
</description>
<contents>
Sale — 30% off all items
</contents>
</image>
If only OCR text or only a visual description is available, the unavailable section is omitted. If neither is available, only the markdown image reference is returned.
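Downstream code may want the image context as structured data rather than inline text. The following is a minimal regex-based sketch, assuming blocks follow the layout shown above and that the description and contents never contain a literal `</image>` tag; `ImageContext` and `parse_image_contexts` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Capture the URL and everything else up to the closing tag of each block.
IMAGE_BLOCK = re.compile(r"<image>\s*<url>(.*?)</url>(.*?)</image>", re.DOTALL)


@dataclass
class ImageContext:
    url: str
    description: Optional[str]
    contents: Optional[str]


def _section(body: str, tag: str) -> Optional[str]:
    """Return the trimmed text inside <tag>...</tag>, or None if the section is omitted."""
    match = re.search(rf"<{tag}>\s*(.*?)\s*</{tag}>", body, re.DOTALL)
    return match.group(1) if match else None


def parse_image_contexts(markdown: str) -> List[ImageContext]:
    """Extract every <image> context block from converted markdown."""
    return [
        ImageContext(url.strip(), _section(rest, "description"), _section(rest, "contents"))
        for url, rest in IMAGE_BLOCK.findall(markdown)
    ]


sample = """
<image>
<url>/assets/product-photo.png</url>
<description>
A white ceramic mug on a wooden table.
</description>
<contents>
Sale: 30% off all items
</contents>
</image>
"""
contexts = parse_image_contexts(sample)
```

Omitted sections come back as `None`, which mirrors the rule that an unavailable section is left out of the block.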
Recipes
For complete, runnable examples see the Recipes page.
- Convert Invoice to Markdown — Convert a PDF invoice to structured markdown.
- Convert Contract to Markdown — Extract contract text and clauses as clean markdown.
- Convert Resume to Markdown — Convert a resume PDF to structured markdown for downstream processing.
Error Responses
| Status | Description |
|---|---|
| 400 | Invalid request (missing or invalid file parameter) |
| 401 | Missing or invalid API key |
| 422 | Processing error (file could not be parsed or ingested) |
| 429 | Rate limit exceeded |
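A client can use the table to decide whether a failed request is worth repeating. The sketch below treats only 429 as retryable and waits with capped exponential backoff; the backoff policy, its default values, and the helper names are assumptions for illustration, since this page does not document a recommended retry strategy.

```python
# Descriptions copied from the error table above.
ERROR_DESCRIPTIONS = {
    400: "Invalid request (missing or invalid file parameter)",
    401: "Missing or invalid API key",
    422: "Processing error (file could not be parsed or ingested)",
    429: "Rate limit exceeded",
}


def should_retry(status: int) -> bool:
    """Only rate-limit errors are treated as transient in this sketch."""
    return status == 429


def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Capped exponential backoff: base * 2**attempt, never more than cap."""
    return min(cap, base * (2 ** attempt))


def describe_error(status: int) -> str:
    """Look up the documented description, with a fallback for other codes."""
    return ERROR_DESCRIPTIONS.get(status, f"Unexpected status {status}")


delays = [backoff_seconds(attempt) for attempt in range(4)]
```

Errors such as 400, 401, and 422 point at the request or the file itself, so repeating the same call unchanged is unlikely to help.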