The Problem: Your AI Assistant Can’t Read Your Files
You’re in Claude Code or Cursor. You have a PDF contract you need to review. Or a scanned invoice you need to understand. Or a DOCX specification from a client.
Your assistant can read code, search files, and write implementations. But hand it a PDF and it’s stuck. The raw bytes are meaningless. A scanned document is just pixels. Even a text-based PDF requires extraction before the model can reason about its content.
The workaround most people use: open the file in a viewer, copy the text, paste it into the chat. This works for simple text PDFs. It falls apart for scanned documents, multi-column layouts, tables, and images. And it completely fails for formats like XLSX or HTML where “copy the text” loses all structure.
MCP (the Model Context Protocol) fixes this. Connect the Iteration Layer MCP server, and your assistant can convert documents in any supported format (PDF, image, DOCX, XLSX, CSV, TXT, HTML) to clean markdown in real time. The content becomes part of the conversation.
Setting Up in Claude Code
One command:
claude mcp add iterationlayer --transport http https://api.iterationlayer.com/mcp

The first time you use an Iteration Layer tool, a browser window opens for OAuth authentication. Log in, authorize access, and you’re connected. No API keys to manage, no configuration files to edit.
To verify the connection, ask Claude Code what MCP tools it has access to. You should see convert_document_to_markdown listed among the available tools.
Setting Up in Cursor
Add to your .cursor/mcp.json:
{
  "mcpServers": {
    "iterationlayer": {
      "type": "http",
      "url": "https://api.iterationlayer.com/mcp"
    }
  }
}

Save and restart. The tool is now available in your Cursor AI conversations.
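If the project already has a .cursor/mcp.json with other servers in it, pasting the snippet over the file would drop them. The following is a minimal sketch of merging the entry instead. The helper name add_mcp_server is hypothetical and not part of Cursor or Iteration Layer; only the file layout shown above is taken from this article.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, url: str) -> dict:
    """Add or update one HTTP MCP server entry, keeping any existing servers.

    Hypothetical helper for illustration; it only edits the JSON file.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)

    # Keep whatever servers are already configured and add ours alongside.
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"type": "http", "url": url}

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_mcp_server(Path(".cursor/mcp.json"), "iterationlayer", "https://api.iterationlayer.com/mcp") from the project root should leave other entries under mcpServers untouched.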
What This Unlocks
Once connected, your assistant can read documents conversationally. No copy-paste, no file viewer, no preprocessing.
Reading a PDF specification:
Convert this PDF to markdown so I can review the API specification.
The assistant calls convert_document_to_markdown, sends the file, and gets back clean markdown with headings, tables, and structure preserved. It can then answer questions about the content, summarize sections, or compare it against your implementation.
Understanding a scanned document:
This is a scanned contract. Convert it to text so I can review the terms.
The tool runs OCR automatically. Scanned pages become readable text with tables and structure intact. Your assistant can then summarize the key clauses, identify dates, or flag specific terms.
Reading a spreadsheet:
Convert this Excel file to markdown. I need to understand the data structure.
XLSX files are converted to markdown tables. The assistant can then analyze column patterns, suggest database schemas, or write import code based on the actual data.
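As a rough illustration of the "suggest database schemas" step, here is a sketch of the kind of code an assistant might write once it has the table rows. It assumes the markdown table has already been split into rows of strings with the header first. The function names and the three-type mapping (INTEGER, REAL, TEXT) are illustrative choices, not something the tool produces.

```python
def infer_sql_type(values: list[str]) -> str:
    """Pick the narrowest of INTEGER, REAL, TEXT that fits every non-empty value."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "TEXT"
    for sql_type, caster in (("INTEGER", int), ("REAL", float)):
        try:
            for value in non_empty:
                caster(value)
            return sql_type
        except ValueError:
            continue  # at least one value does not fit; try the next type
    return "TEXT"


def suggest_schema(table_name: str, rows: list[list[str]]) -> str:
    """Build a CREATE TABLE statement from a header row plus data rows."""
    header, *data = rows
    columns = []
    for index, name in enumerate(header):
        values = [row[index] for row in data if index < len(row)]
        column = name.strip().lower().replace(" ", "_")
        columns.append(f"  {column} {infer_sql_type(values)}")
    return f"CREATE TABLE {table_name} (\n" + ",\n".join(columns) + "\n);"


# Made-up sample rows, standing in for a converted spreadsheet.
sample_rows = [
    ["Order ID", "Customer", "Total"],
    ["1001", "Acme", "249.90"],
    ["1002", "Globex", "80"],
]
print(suggest_schema("orders", sample_rows))
```

For the sample rows this yields order_id INTEGER, customer TEXT, and total REAL. Real data would need more care with dates, currency symbols, and thousands separators.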
Describing an image:
What’s in this diagram?
For image files, the tool returns both OCR-extracted text and a description field — a natural language description of what the image shows. A system architecture diagram becomes “A three-tier architecture with a load balancer, two application servers, and a primary-replica database pair.” The assistant can reason about the image without multimodal capabilities.
Why This Matters for Developers
The document-to-markdown tool is especially useful during development. A few scenarios where it helps:
Implementing a document spec. A client sends a 40-page PDF specification. Instead of reading it separately and manually referencing it while coding, you convert it to markdown and the assistant works from the spec directly. “Implement the authentication flow described in section 3.2” becomes a concrete, grounded instruction.
Processing sample data. You’re building a data pipeline and need to understand the shape of incoming documents. Drop a sample invoice, receipt, or report into the conversation. The assistant reads the converted markdown and can immediately write parsing code, suggest database schemas, or identify edge cases.
Debugging OCR issues. If your pipeline processes scanned documents, you can quickly check what the OCR output looks like for a specific file. “Convert this scan and show me the markdown” gives you instant visibility into what your pipeline sees.
Code review with context. Reviewing a PR that processes contracts? Convert a sample contract to markdown and the assistant can verify whether the code handles all the fields that actually appear in the document.
How Tool Discovery Works
When your AI assistant starts a conversation, it queries all connected MCP servers to discover available tools. The Iteration Layer MCP server advertises convert_document_to_markdown along with its description, accepted file formats, and input schema.
The assistant reads this description and knows when the tool is relevant. If you mention a document, ask to read a file, or upload a PDF, the assistant recognizes that convert_document_to_markdown is the right tool. You don’t need to say “use the document to markdown tool” — the assistant figures it out from context.
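At the protocol level, discovery is a JSON-RPC exchange: the client sends a tools/list request, and each tool in the result carries a name, a description, and an inputSchema. The sketch below shows that shape. The sample response is illustrative only; its description text and empty schema are placeholders, not the Iteration Layer server's actual output.

```python
import json
from typing import Optional

# JSON-RPC request an MCP client sends to list a server's tools.
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Illustrative response only: the description and schema are placeholders,
# not what the real server returns.
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "convert_document_to_markdown",
                "description": "Convert a document to markdown.",
                "inputSchema": {"type": "object", "properties": {}},
            }
        ]
    },
}


def find_tool(response: dict, name: str) -> Optional[dict]:
    """Return the advertised tool entry with the given name, if any."""
    for tool in response.get("result", {}).get("tools", []):
        if tool.get("name") == name:
            return tool
    return None


tool = find_tool(sample_response, "convert_document_to_markdown")
print(json.dumps(list_request))
print(tool["description"] if tool else "tool not advertised")
```

The description is what the model reads when deciding whether a tool fits the request, which is why you rarely need to name the tool yourself.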
The tool accepts files as:
- URL — publicly accessible file URLs
- Base64 — inline file content, useful for local files the assistant can read from disk
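To make the two input styles concrete, here is a sketch of how a client might build the tools/call request for each. The JSON-RPC envelope (method tools/call with name and arguments) follows the MCP specification. The argument field names (file, type, url, base64, name) are assumptions for illustration; the real names come from the input schema the server advertises.

```python
import base64
from pathlib import Path


def file_argument(source: str) -> dict:
    """Describe a file by URL or as inline base64.

    Field names here are hypothetical; check the tool's input schema.
    """
    if source.startswith(("http://", "https://")):
        return {"type": "url", "url": source}
    path = Path(source)
    return {
        "type": "base64",
        "name": path.name,
        "base64": base64.b64encode(path.read_bytes()).decode("ascii"),
    }


def build_tool_call(request_id: int, source: str) -> dict:
    """Build an MCP tools/call request for convert_document_to_markdown."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "convert_document_to_markdown",
            "arguments": {"file": file_argument(source)},
        },
    }
```

In practice the assistant builds this request for you. The sketch only shows why a local file has to be read and encoded first, while a public URL can be passed along as-is.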
Combining with Other Tools
The Iteration Layer MCP server exposes more than just document conversion. Once connected, your assistant also gains access to:
- extract_document — extract specific typed fields with confidence scores
- transform_image — resize, crop, convert, compress images
- generate_image — create images from layer compositions
- generate_document — produce PDFs from structured content
- generate_sheet — create CSV, XLSX, or Markdown spreadsheets
These tools compose naturally in conversation. Convert a document to markdown, then ask the assistant to extract specific fields. Or convert a scanned receipt, then generate a spreadsheet from the data. The assistant chains the tools as needed.
Example: Document to spreadsheet
Convert this PDF invoice to markdown, then create a CSV with the line items.
The assistant calls convert_document_to_markdown to read the invoice, parses the line items from the markdown, then calls generate_sheet to produce a CSV. Two tool calls, one conversation.
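The middle step, pulling line items out of the markdown, is ordinary text processing. As a sketch of what it might look like if done by hand, the code below parses the first pipe table in a markdown string and writes it out as CSV using the standard library. The sample invoice is made up, and the parser is deliberately simple: it does not handle escaped pipes or multiple tables.

```python
import csv
import io


def parse_markdown_table(markdown: str) -> list[list[str]]:
    """Return the rows of the first pipe table in the text, header first."""
    rows: list[list[str]] = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            if rows:
                break  # the table has ended
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the separator row directly under the header (e.g. |---|--:|).
        if len(rows) == 1 and all(c and set(c) <= set("-: ") for c in cells):
            continue
        rows.append(cells)
    return rows


def rows_to_csv(rows: list[list[str]]) -> str:
    """Serialize rows as CSV text, quoting cells only where needed."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up sample standing in for converted invoice markdown.
invoice_markdown = """\
# Invoice 1001

| Item | Qty | Unit price |
|------|----:|-----------:|
| Widget | 2 | 9.50 |
| Cable, 2 m | 1 | 4.00 |

Total: 23.00
"""

print(rows_to_csv(parse_markdown_table(invoice_markdown)))
```

The csv module quotes the "Cable, 2 m" cell automatically because it contains a comma, which is the main reason to use a CSV writer rather than joining cells with commas.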
Example: Diagram to description
Convert this architecture diagram, then update the README with the description.
The assistant calls convert_document_to_markdown to get the image description, then edits the README with the generated text.
What Gets Converted
The tool handles every format the underlying API supports:
- PDFs — text and scanned, with built-in OCR for scanned pages
- DOCX — Word documents with structure preserved
- XLSX — Excel spreadsheets rendered as markdown tables
- CSV — tabular data converted to markdown tables
- TXT — plain text passed through
- HTML — markup converted to markdown syntax
- PNG, JPEG, GIF, WebP — OCR for text plus the description field
For image inputs, the response includes a description field alongside the markdown — a natural language description of the visual content. This is generated by a vision model and describes what the image depicts, not just what text appears in it.
Privacy and Data Handling
All document processing happens on EU-hosted servers. No file content is stored beyond temporary 90-day operational logs. The MCP connection uses OAuth — your credentials are never sent to the MCP server directly.
Get Started
Sign up at iterationlayer.com — no credit card required. Run the claude mcp add command above, authenticate in the browser, and start reading documents from your next Claude Code or Cursor session.
The free tier includes 25 conversions. Enough to try every file format and see how the markdown output compares to what you’ve been getting from copy-paste.