How to Convert Documents to Markdown for RAG Pipelines with the Iteration Layer Document to Markdown API

Request (cURL)

curl -X POST https://api.iterationlayer.com/document-to-markdown/v1/convert \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "file": {
      "type": "url",
      "name": "product-manual.pdf",
      "url": "https://example.com/docs/product-manual-v3.pdf"
    }
  }'

Response

{
  "success": true,
  "data": {
    "name": "product-manual.pdf",
    "mime_type": "application/pdf",
    "markdown": "# Product Manual v3\n\n## Installation\n\nDownload the latest release from the releases page. Unpack the archive and run the installer.\n\n### System Requirements\n\n| Component | Minimum | Recommended |\n|---|---|---|\n| CPU | 2 cores | 4 cores |\n| RAM | 4 GB | 8 GB |\n| Disk | 2 GB | 10 GB |\n\n## Configuration\n\nThe configuration file is located at `/etc/app/config.yaml`. The following options are available:\n\n- **port** — HTTP server port (default: 8080)\n- **log_level** — Logging verbosity: debug, info, warn, error\n- **max_connections** — Maximum concurrent connections (default: 100)\n\n## API Reference\n\n### Authentication\n\nAll API calls require a Bearer token in the Authorization header.\n\n### Endpoints\n\n| Method | Path | Description |\n|---|---|---|\n| GET | /health | Health check |\n| POST | /process | Submit a processing job |\n| GET | /jobs/:id | Get job status |"
  }
}
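
The response wraps the converted document in a `success`/`data` envelope. As a minimal sketch, the `markdown` field could be pulled out of a raw response body like this before chunking. It uses only the Python standard library and assumes the envelope shape shown above. The `extract_markdown` helper is a hypothetical name, and because this recipe does not show what a failed response looks like, the sketch only checks the `success` flag.

```python
import json


def extract_markdown(response_body: str) -> str:
    """Return the markdown string from a Document to Markdown response body.

    Assumes the envelope shown above:
    {"success": ..., "data": {"markdown": ...}}
    """
    payload = json.loads(response_body)
    if not payload.get("success"):
        # The error format is not documented in this recipe,
        # so surface the raw payload instead of guessing at its fields.
        raise RuntimeError(f"conversion failed: {payload!r}")
    return payload["data"]["markdown"]


# Trimmed-down sample body with the same shape as the response above.
sample = json.dumps({
    "success": True,
    "data": {
        "name": "product-manual.pdf",
        "mime_type": "application/pdf",
        "markdown": "# Product Manual v3\n\n## Installation\n\nUnpack the archive.",
    },
})
markdown = extract_markdown(sample)
```

Failing loudly when `success` is not true keeps an error payload from being chunked and embedded as if it were document text.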

Request (TypeScript)

import { IterationLayer } from "iterationlayer";

const client = new IterationLayer({ apiKey: "YOUR_API_KEY" });

const result = await client.convertDocumentToMarkdown({
  file: {
    type: "url",
    name: "product-manual.pdf",
    url: "https://example.com/docs/product-manual-v3.pdf",
  },
});

// Split markdown into chunks at heading boundaries
const chunks = result.markdown.split(/(?=^## )/m);

Request (Python)

import re
from iterationlayer import IterationLayer

client = IterationLayer(api_key="YOUR_API_KEY")

result = client.convert_document_to_markdown(
    file={
        "type": "url",
        "name": "product-manual.pdf",
        "url": "https://example.com/docs/product-manual-v3.pdf",
    }
)

# Split markdown into chunks at heading boundaries
chunks = re.split(r"(?=^## )", result["markdown"], flags=re.MULTILINE)
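
Splitting at `## ` headings gives one chunk per section, but each chunk then loses the document title it came from. The sketch below, which is one possible approach rather than part of the SDK, applies the same split and attaches the title and section heading to every chunk so they can be stored as retrieval metadata. The `Chunk` class and `chunk_by_h2` function are hypothetical names introduced for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    title: str    # document title taken from the first "# " heading, or ""
    heading: str  # the "## " section heading, or "" for text before the first one
    text: str     # chunk body, including its own heading line


def chunk_by_h2(markdown: str) -> list[Chunk]:
    """Split markdown at "## " headings and attach heading metadata."""
    title_match = re.search(r"^# (.+)$", markdown, flags=re.MULTILINE)
    title = title_match.group(1).strip() if title_match else ""

    chunks = []
    # The zero-width lookahead keeps each "## " heading with its section.
    for part in re.split(r"(?=^## )", markdown, flags=re.MULTILINE):
        part = part.strip()
        if not part:
            continue  # skip the empty piece produced when the text starts with "## "
        first_line = part.splitlines()[0]
        heading = first_line[3:].strip() if first_line.startswith("## ") else ""
        chunks.append(Chunk(title=title, heading=heading, text=part))
    return chunks


sample_markdown = (
    "# Product Manual v3\n\n"
    "## Installation\n\nRun the installer.\n\n"
    "### System Requirements\n\n4 GB RAM.\n\n"
    "## Configuration\n\nEdit config.yaml."
)
sections = chunk_by_h2(sample_markdown)
```

Subsections such as `### System Requirements` stay inside their parent section. One caveat: a line starting with `## ` inside a fenced code block would also be treated as a boundary, so documents containing such blocks may need a fence-aware splitter.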

Request (Go)

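package main

import (
    "fmt"
    "log"
    "strings"

    il "github.com/iterationlayer/sdk-go"
)

func main() {
    client := il.NewClient("YOUR_API_KEY")

    result, err := client.ConvertDocumentToMarkdown(il.ConvertDocumentToMarkdownRequest{
        File: il.FileInput{Type: "url", Name: "product-manual.pdf", Url: "https://example.com/docs/product-manual-v3.pdf"},
    })
    if err != nil {
        log.Fatalf("convert document: %v", err)
    }

    // Split markdown into chunks at heading boundaries.
    // strings.Split consumes the separator, so chunks after the first lose their "## " prefix.
    chunks := strings.Split(result.Markdown, "\n## ")
    fmt.Printf("%d chunks\n", len(chunks))
}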

Template

{
  "name": "Convert Document for RAG Ingestion",
  "nodes": [
    {
      "parameters": {
        "content": "## Convert Document for RAG Ingestion\n\nAI teams building retrieval-augmented generation pipelines use this recipe to convert source documents into clean markdown that chunks well and produces high-quality embeddings.\n\n**Note:** This workflow uses the Iteration Layer community node (`n8n-nodes-iterationlayer`). Install it via Settings > Community Nodes on self-hosted n8n, or add it directly on n8n Cloud with Verified Community Nodes enabled.",
        "height": 280,
        "width": 500,
        "color": 2
      },
      "type": "n8n-nodes-base.stickyNote",
      "typeVersion": 1,
      "position": [
        200,
        40
      ],
      "id": "4dce764c-c876-46e6-b345-073db354eff4",
      "name": "Overview"
    },
    {
      "parameters": {
        "content": "### Step 1: Convert Document to Markdown\nResource: **Document to Markdown**\n\nConfigure the Document to Markdown parameters below, then connect your credentials.",
        "height": 160,
        "width": 300,
        "color": 6
      },
      "type": "n8n-nodes-base.stickyNote",
      "typeVersion": 1,
      "position": [
        475,
        100
      ],
      "id": "b8aee214-9501-4c49-b56a-5f00a2bb2f41",
      "name": "Step 1 Note"
    },
    {
      "parameters": {},
      "type": "n8n-nodes-base.manualTrigger",
      "typeVersion": 1,
      "position": [
        250,
        300
      ],
      "id": "641f3a90-4201-4e73-8faa-526100f62954",
      "name": "Manual Trigger"
    },
    {
      "parameters": {
        "resource": "documentToMarkdown",
        "fileInputMode": "url",
        "fileName": "product-manual.pdf",
        "fileUrl": "https://example.com/docs/product-manual-v3.pdf"
      },
      "type": "n8n-nodes-iterationlayer.iterationLayer",
      "typeVersion": 1,
      "position": [
        500,
        300
      ],
      "id": "2da87e7d-179a-453c-b7ab-2175662c07b6",
      "name": "Convert Document to Markdown",
      "credentials": {
        "iterationLayerApi": {
          "id": "1",
          "name": "Iteration Layer API"
        }
      }
    }
  ],
  "connections": {
    "Manual Trigger": {
      "main": [
        [
          {
            "node": "Convert Document to Markdown",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "settings": {
    "executionOrder": "v1"
  }
}

Prompt

Convert the document at [file URL] to markdown for RAG ingestion. Use the convert_document_to_markdown tool with the file URL.
