Document to Markdown
Convert any document or image to clean markdown with a single API call. Send one file and receive the markdown output — no schema, no extraction, just clean text ready for LLM pipelines or downstream processing.
Key Features
- Multi-Format Support — Convert PDF, DOCX, XLSX, images, HTML, Markdown, CSV, JSON, and plain text.
- Built-In OCR — Scanned PDFs and image files are processed through OCR automatically. No separate step required.
- Image Descriptions — For image files, the response includes a plain-language description of the visual content alongside the extracted markdown.
- LLM-Grade Output — The markdown format is the same one the Document Extraction API uses internally. Tables, structure, and layout are preserved for reliable LLM consumption.
Overview
The Document to Markdown API converts a document to clean markdown. You send one file (base64 or URL) and receive a JSON object with the result.
Endpoint: POST /document-to-markdown/v1/convert
Supported formats: PDF, DOCX, XLSX/XLS, CSV, TXT, Markdown, JSON, HTML, PNG, JPEG, GIF, WebP, SVG
Limits:
- Max file size: 50 MB
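
A client can reject files that fall outside the supported formats or the size limit before spending a request on them. The following is a minimal sketch, not part of any official SDK: `check_convertible` is a hypothetical helper, the extension spellings are inferred from the format list above, and it assumes that "50 MB" means 50 × 1024 × 1024 bytes of raw (not base64-encoded) content.

```python
import os

# Extensions inferred from the "Supported formats" list; the exact spellings
# the API accepts (for example .jpg vs .jpeg, .md for Markdown) are assumptions.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".xlsx", ".xls", ".csv", ".txt", ".md", ".json",
    ".html", ".png", ".jpg", ".jpeg", ".gif", ".webp", ".svg",
}

# Assumption: the limit is measured on the raw file, in binary megabytes.
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024


def check_convertible(file_name: str, size_in_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = os.path.splitext(file_name)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_in_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(
            f"file is {size_in_bytes} bytes, limit is {MAX_FILE_SIZE_BYTES}"
        )
    return problems
```

For a file on disk, `check_convertible(path, os.path.getsize(path))` gives the same check. The server remains the authority: a file that passes this check can still be rejected during parsing.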
How It Works
Every conversion runs the same ingestion pipeline used by Document Extraction:
- Parse — the file format is detected and validated.
- Ingest — the file is converted to markdown using the appropriate processor:
  - PDF — pages are rendered to images and run through OCR.
  - Images — OCR extracts text and a vision model generates a description of the visual content.
  - Office documents — DOCX, XLSX, and CSV content is extracted and normalized to markdown.
  - Text formats — HTML, Markdown, JSON, CSV, and plain text are cleaned and returned as-is.
- Return — the result is returned as a JSON object.
There is no LLM extraction step. The API stops after ingestion.
Request Format
cURL

    curl -X POST \
      https://api.iterationlayer.com/document-to-markdown/v1/convert \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "file": {
          "type": "base64",
          "name": "invoice.pdf",
          "base64": "<base64-encoded-file>"
        }
      }'

TypeScript

    import { IterationLayer } from "iterationlayer";

    const client = new IterationLayer({
      apiKey: "YOUR_API_KEY",
    });

    const result = await client.convertToMarkdown({
      file: {
        type: "base64",
        name: "invoice.pdf",
        base64: "<base64-encoded-file>",
      },
    });

Python

    from iterationlayer import IterationLayer

    client = IterationLayer(api_key="YOUR_API_KEY")

    result = client.convert_to_markdown(
        file={
            "type": "base64",
            "name": "invoice.pdf",
            "base64": "<base64-encoded-file>",
        }
    )

Go

    import il "github.com/iterationlayer/sdk-go"

    client := il.NewClient("YOUR_API_KEY")

    result, err := client.ConvertToMarkdown(il.ConvertRequest{
        File: il.NewFileFromBase64(
            "invoice.pdf",
            "<base64-encoded-file>",
        ),
    })

Response (identical for all four examples)

    {
      "success": true,
      "data": {
        "name": "invoice.pdf",
        "mime_type": "application/pdf",
        "markdown": "# Invoice\n\n**Invoice Number:** INV-2024-0042\n\n**Date:** 2024-03-15\n\n| Description | Qty | Unit Price | Total |\n|---|---|---|---|\n| Consulting | 10h | $100.00 | $1,000.00 |\n| Support | 5h | $80.00 | $400.00 |\n\n**Total: $1,400.00**"
      }
    }

Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | FileInput | Yes | The file to convert. |
FileInput
The file is either a base64-encoded binary or a URL reference.
| Parameter | Type | Required | Description |
|---|---|---|---|
| type | "base64" \| "url" | Yes | Input method. |
| name | string | Yes | File name including extension. Used to detect the format. |
| base64 | string | When type = "base64" | Base64-encoded file content. |
| url | string | When type = "url" | Public URL to fetch the file from. |
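
The two FileInput variants can also be built without an SDK. The sketch below uses only the Python standard library and takes the endpoint URL and headers from the cURL example; the helper names (`file_input_from_bytes`, `file_input_from_url`, `build_convert_request`) are hypothetical and chosen for illustration.

```python
import base64
import json
import urllib.request

# Endpoint taken from the cURL example in this document.
CONVERT_URL = "https://api.iterationlayer.com/document-to-markdown/v1/convert"


def file_input_from_bytes(name: str, content: bytes) -> dict:
    """Build a FileInput with type "base64" from raw file bytes."""
    return {
        "type": "base64",
        "name": name,
        "base64": base64.b64encode(content).decode("ascii"),
    }


def file_input_from_url(name: str, url: str) -> dict:
    """Build a FileInput with type "url" pointing at a publicly reachable file."""
    return {"type": "url", "name": name, "url": url}


def build_convert_request(file_input: dict, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for the convert endpoint."""
    body = json.dumps({"file": file_input}).encode("utf-8")
    return urllib.request.Request(
        CONVERT_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` sends it; the response body can then be decoded with `json.load`. Keeping construction separate from sending makes the payload easy to inspect in tests without touching the network.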
Response Format
The response is a JSON object. In the example response above, the conversion result sits under data, with these fields:
| Field | Type | Description |
|---|---|---|
| name | string | File name from the request. |
| mime_type | string | Detected MIME type of the file. |
| markdown | string | Extracted markdown content. Empty string if no text was found. |
| description | string | Plain-language description of the image content. Present only for image files (PNG, JPEG, GIF, WebP). |
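
Because description is only present for image files, client code should treat it as optional. A minimal parsing sketch follows; `ConversionResult` and `parse_conversion_response` are hypothetical names, and the function assumes the fields sit under "data" as in the request example, falling back to the top level when no envelope is present.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConversionResult:
    name: str
    mime_type: str
    markdown: str
    description: Optional[str] = None  # only returned for image files


def parse_conversion_response(raw_json: str) -> ConversionResult:
    """Turn a response body into a typed result."""
    payload = json.loads(raw_json)
    # Assumption: results are wrapped in "data"; accept a bare object as well.
    data = payload.get("data", payload)
    return ConversionResult(
        name=data["name"],
        mime_type=data["mime_type"],
        markdown=data["markdown"],
        description=data.get("description"),
    )
```

An empty markdown string is a valid result (no text was found), so callers should check `result.markdown == ""` explicitly instead of treating it as an error.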
Image Files
For image files, the response includes both markdown (OCR output) and description (vision model output). The description field describes what the image depicts — suitable for use as alt text, for downstream search indexing, or as context in LLM prompts.
    {
      "name": "product-photo.png",
      "mime_type": "image/png",
      "markdown": "Sale — 30% off all items",
      "description": "A product photograph of a white ceramic mug on a wooden table. The mug has a minimalist design with no text or logo. Natural lighting from the left."
    }

Recipes
For complete, runnable examples see the Recipes page.
- Convert Invoice to Markdown — Convert a PDF invoice to structured markdown.
- Convert Contract to Markdown — Extract contract text and clauses as clean markdown.
- Convert Resume to Markdown — Convert a resume PDF to structured markdown for downstream processing.
Error Responses
| Status | Code | Description |
|---|---|---|
| 400 | bad_request | Missing or invalid file parameter. |
| 401 | unauthorized | Missing or invalid API key. |
| 422 | processing_error | The file could not be parsed or ingested. |
| 429 | rate_limited | Request rate limit exceeded. |
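
Of the statuses above, only 429 is worth retrying unchanged; 400, 401, and 422 will fail again until the request itself is fixed. The sketch below shows one way to encode that, assuming exponential backoff is acceptable. The documentation does not say whether a 429 response carries a Retry-After header, so none is read here. `convert_with_retry` and `ConversionError` are hypothetical names, and `send` stands in for whatever function performs the HTTP call and returns a `(status, body)` pair.

```python
import time
from typing import Callable, Tuple

# Only rate limiting is treated as transient; this is a design assumption.
RETRYABLE_STATUSES = {429}


class ConversionError(Exception):
    """Raised when the API answers with a non-2xx status that is not retried."""

    def __init__(self, status: int, body: str):
        super().__init__(f"conversion failed with HTTP {status}")
        self.status = status
        self.body = body


def convert_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send() until it returns a 2xx status, retrying only on 429."""
    for attempt in range(max_attempts):
        status, body = send()
        if 200 <= status < 300:
            return body
        is_last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUSES or is_last_attempt:
            raise ConversionError(status, body)
        # Wait 1x, 2x, 4x, ... the base delay between attempts.
        sleep(base_delay_seconds * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the backoff schedule can be tested without real waiting; production code can leave it at the default.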