Extract structured data from any document
Send a PDF, image, or document — get structured JSON back. Define the fields you need, and the API extracts them with confidence scores.
No credit card required — start with free trial credits
What's included
Schema-Driven Extraction
Define your fields using 17 field types, including dates, IBANs, currencies, addresses, and nested arrays, and get structured JSON back. No prompt engineering, no output parsing.
Built-In Trust Scores
Every extracted value includes a confidence score and a verbatim source citation from the document. Route low-confidence results to human review.
Multi-File Merge
Send up to 20 files per request — PDFs, images, spreadsheets, Word docs — and get one unified extraction across all of them.
MCP (Model Context Protocol)
Connect directly to AI agents. Use our APIs as tools in Claude, GPT, and other LLM-powered workflows.
Zapier
Connect to 5,000+ apps with Zapier integration.
n8n
Build automated workflows with n8n integration.
How it works
Define a schema
Describe the fields you want to extract using our schema format. Each field has a name, a type, and an optional description to guide the extraction.
- 17 field types including text, currency, date, IBAN, and address
- Nested arrays for line items, tables, and repeating sections
- Optional descriptions to clarify ambiguous fields
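As a rough illustration of the schema format described above, the sketch below builds the same structure as a plain Python dictionary. The `field` helper is hypothetical (it is not part of any SDK), and only the type names that appear in the Quick Start (`TEXT`, `CURRENCY_AMOUNT`, `ARRAY`) are used; the identifiers for the other field types should be taken from the API documentation.

```python
import json


def field(name, type_, description=None, fields=None):
    """Hypothetical helper: build one schema field as a plain dict.

    The description and nested fields are only included when provided.
    """
    entry = {"name": name, "type": type_}
    if description is not None:
        entry["description"] = description
    if fields is not None:
        entry["fields"] = fields
    return entry


schema = {
    "fields": [
        field("invoice_number", "TEXT", "The invoice number printed in the header"),
        field("total_amount", "CURRENCY_AMOUNT", "The grand total including tax"),
        # An ARRAY field carries its own nested field list for repeating rows.
        field("line_items", "ARRAY", "One entry per billed item", fields=[
            field("description", "TEXT"),
            field("amount", "CURRENCY_AMOUNT"),
        ]),
    ]
}

print(json.dumps(schema, indent=2))
```

Descriptions are left off the nested fields here to show that they are optional; adding one is most useful when a document contains several similar values, such as a net total and a gross total.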
Send your documents
Upload PDFs, images, or office documents via URL or base64. Send up to 20 files per request — they are combined into a single extraction result.
- PDF, Word, Excel, images, and scanned documents
- Up to 20 files combined into one structured result
- Built-in OCR for scanned pages and photos
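The two input modes mentioned above might be prepared along these lines. The URL entry mirrors the request body in the Quick Start. The base64 entry is an assumption: the `"base64"` type value and key name are illustrative guesses, so check the API reference for the exact shape before relying on it. Both helper functions are hypothetical.

```python
import base64


def file_from_url(name, url):
    """Build a file entry in the shape used by the Quick Start request body."""
    return {"type": "url", "name": name, "url": url}


def file_from_bytes(name, data):
    """Build an inline file entry.

    ASSUMPTION: the "base64" type and key are illustrative, not confirmed
    by this page. Only the encoding step itself is standard.
    """
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "base64", "name": name, "base64": encoded}


files = [
    file_from_url("invoice.pdf", "https://example.com/invoice.pdf"),
    # Placeholder bytes (a PNG signature) standing in for a real scanned image.
    file_from_bytes("receipt.png", b"\x89PNG\r\n\x1a\n"),
]
```

A URL entry keeps the request small, while an inline entry avoids having to host the file anywhere first.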
Get structured data
Receive JSON with extracted fields, confidence scores, and source citations. Every field includes provenance so you know exactly where the value came from.
- Confidence scores between 0 and 1 for every field
- Source citations linking each value to its location in the document
- Missing fields return null with a confidence score of 0
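One way to act on these scores is to sort extracted fields into accepted, needs-review, and missing buckets. The sketch below assumes a response shape that this page does not specify: a dictionary mapping each field name to `value`, `confidence`, and `citation` keys. That shape, the sample values, and the 0.8 threshold are all made up for illustration.

```python
REVIEW_THRESHOLD = 0.8  # illustrative cutoff; tune it against your own documents


def triage(fields, threshold=REVIEW_THRESHOLD):
    """Split extracted fields into accepted values, review items, and missing names.

    ASSUMPTION: each field is a dict with "value", "confidence", and
    "citation" keys. The real response layout may differ.
    """
    accepted, review, missing = {}, {}, []
    for name, extracted in fields.items():
        if extracted["value"] is None:
            # Missing fields come back as null with a confidence of 0.
            missing.append(name)
        elif extracted["confidence"] >= threshold:
            accepted[name] = extracted["value"]
        else:
            # Keep the whole entry so a reviewer can see the citation.
            review[name] = extracted
    return accepted, review, missing


# Made-up sample data, not real API output.
sample = {
    "invoice_number": {"value": "INV-001", "confidence": 0.97, "citation": "Invoice No. INV-001"},
    "total_amount": {"value": "118.00", "confidence": 0.55, "citation": "Total 118,00"},
    "due_date": {"value": None, "confidence": 0.0, "citation": None},
}

accepted, review, missing = triage(sample)
```

Treating missing fields separately from low-confidence ones lets a review queue distinguish "the value may be wrong" from "the value was not found at all".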
Quick Start
One API call, one credit deducted. Chains naturally with our other APIs — pipe the output of one into the next without glue code. You'll be up and running in minutes.
- Full OpenAPI 3.1 specification available for code generation and IDE integration.
- MCP server support for seamless integration with AI agents and tools.
- Comprehensive documentation with examples for every field type and edge case.

curl -X POST https://api.iterationlayer.com/document-extraction/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [{ "type": "url", "name": "invoice.pdf", "url": "https://example.com/invoice.pdf" }],
    "schema": {
      "fields": [
        { "name": "invoice_number", "type": "TEXT", "description": "The invoice number" },
        { "name": "total_amount", "type": "CURRENCY_AMOUNT", "description": "The total amount" },
        { "name": "line_items", "type": "ARRAY", "description": "Line items", "fields": [
          { "name": "description", "type": "TEXT", "description": "Item description" },
          { "name": "amount", "type": "CURRENCY_AMOUNT", "description": "Item amount" }
        ]}
      ]
    }
  }'

import { IterationLayer } from "iterationlayer";

const client = new IterationLayer({ apiKey: "YOUR_API_KEY" });

const result = await client.extract({
  files: [{ type: "url", name: "invoice.pdf", url: "https://example.com/invoice.pdf" }],
  schema: {
    fields: [
      { type: "TEXT", name: "invoice_number", description: "The invoice number" },
      { type: "CURRENCY_AMOUNT", name: "total_amount", description: "The total amount" },
      { type: "ARRAY", name: "line_items", description: "Line items", fields: [
        { type: "TEXT", name: "description", description: "Item description" },
        { type: "CURRENCY_AMOUNT", name: "amount", description: "Item amount" },
      ]},
    ],
  },
});

from iterationlayer import IterationLayer

client = IterationLayer(api_key="YOUR_API_KEY")

result = client.extract(
    files=[{"type": "url", "name": "invoice.pdf", "url": "https://example.com/invoice.pdf"}],
    schema={
        "fields": [
            {"type": "TEXT", "name": "invoice_number", "description": "The invoice number"},
            {"type": "CURRENCY_AMOUNT", "name": "total_amount", "description": "The total amount"},
            {"type": "ARRAY", "name": "line_items", "description": "Line items", "fields": [
                {"type": "TEXT", "name": "description", "description": "Item description"},
                {"type": "CURRENCY_AMOUNT", "name": "amount", "description": "Item amount"},
            ]},
        ],
    },
)

import il "github.com/iterationlayer/sdk-go"

client := il.NewClient("YOUR_API_KEY")

result, err := client.Extract(il.ExtractRequest{
	Files: []il.FileInput{
		il.NewFileFromURL("invoice.pdf", "https://example.com/invoice.pdf"),
	},
	Schema: il.ExtractionSchema{
		"invoice_number": il.NewTextFieldConfig("invoice_number", "The invoice number"),
		"total_amount":   il.NewCurrencyAmountFieldConfig("total_amount", "The total amount"),
	},
})

See it in action
Ready-to-use workflows for the most common data processing tasks.
Automate Invoice Processing
Extract line items, totals, and vendor details from invoices into structured JSON for accounting workflows.
Digitize Academic Papers
Extract titles, authors, abstracts, and citations from academic papers into structured JSON for research workflows.
Extract Contract Clauses
Extract parties, dates, and clauses from contracts into structured JSON for legal review workflows.
Extract Product Catalog Data
Extract product names, SKUs, prices, and specifications from catalog documents into structured JSON for e-commerce workflows.
Extract Real Estate Listings
Extract property addresses, prices, room counts, and features from listing documents into structured JSON for MLS and property platforms.
Extract Rental Application Data
Extract applicant details, employment history, income, and references from rental application forms into structured JSON for tenant screening.
Onboard Employees
Merge an employment contract, ID document, and tax form into a single employee onboarding record.
Onboard Suppliers
Merge a supplier application, bank details, and tax certificate into a single structured supplier profile.
Parse Receipts and Expenses
Extract merchant details, dates, and line items from receipts into structured JSON for expense tracking workflows.
Parse Resumes and CVs
Extract candidate details, skills, and work experience from resumes into structured JSON for recruiting workflows.
Process Customs Declarations
Merge a commercial invoice, packing list, and bill of lading into a unified customs declaration.
Process Medical Records
Extract patient details, diagnoses, and medications from medical records into structured JSON for healthcare workflows.
Scrape Structured Web Data
Extract page titles, headings, links, and content from web pages into structured JSON for data collection workflows.
Privacy by default
We built Iteration Layer with privacy by design. Your data is processed in the EU and never stored beyond temporary logs. Learn more about our security practices.
No data storage
We don't store your files or processing results. Logs are automatically deleted after 30 days.
EU-hosted infrastructure
All processing runs on servers located in the European Union. Your data never leaves the EU.
GDPR-compliant by design
Full compliance with EU data protection regulations. Data Processing Agreement available for all customers.
Pricing
Start with free trial credits. No credit card required.
Developer
For individuals & small projects
- 1,000 credits / month: 1,000 image transformations, 500 document generations, 500 image generations, 100 document extractions
- All APIs included
- Free trial credits per API
- Email support
- Budget caps per key
Startup
Save 40%
For growing teams
- 5,000 credits / month: 5,000 image transformations, 2,500 document generations, 2,500 image generations, 500 document extractions
- All APIs included
- Free trial credits per API
- Priority support
- Budget caps per key
Business
Save 47%
For high-volume workloads
- 15,000 credits / month: 15,000 image transformations, 7,500 document generations, 7,500 image generations, 1,500 document extractions
- All APIs included
- Free trial credits per API
- Priority support
- Budget caps per key
Frequently asked questions
What file formats are supported?
The API accepts PDF, DOCX, XLSX, CSV, TXT, HTML, PNG, JPEG, GIF, and WebP. Scanned documents are processed with built-in OCR.
How does schema-based extraction work?
You define a schema describing the fields you want (name, type, description). The API uses AI to locate and extract those fields from the document.
What are confidence scores?
Every extracted field includes a confidence score between 0 and 1, indicating how certain the API is about the result. Use these to build human review flows.
How many files can I send per request?
You can send up to 20 files per request. All files are combined into a single extraction result, with fields pulled from across all documents. Each file can be up to 50 MB, and the request as a whole can be up to 200 MB.
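These limits can be checked on the client before uploading, which avoids sending a request that would be rejected. The sketch below is a minimal pre-flight check, assuming that 1 MB means 1,000,000 bytes (the stricter reading; the page does not say which definition applies). The `check_limits` function is hypothetical and not part of any SDK.

```python
MAX_FILES = 20
# ASSUMPTION: 1 MB is read as 1,000,000 bytes, the more conservative interpretation.
MAX_FILE_BYTES = 50 * 1_000_000
MAX_TOTAL_BYTES = 200 * 1_000_000


def check_limits(sizes):
    """Return a list of human-readable problems for a batch of file sizes in bytes."""
    problems = []
    if len(sizes) > MAX_FILES:
        problems.append(f"{len(sizes)} files exceeds the limit of {MAX_FILES}")
    for index, size in enumerate(sizes):
        if size > MAX_FILE_BYTES:
            problems.append(f"file {index} is {size} bytes, over the per-file limit")
    if sum(sizes) > MAX_TOTAL_BYTES:
        problems.append(f"total of {sum(sizes)} bytes is over the request limit")
    return problems


# Three 10 MB files pass; five 45 MB files each fit but exceed the total.
ok = check_limits([10_000_000] * 3)
too_big_total = check_limits([45_000_000] * 5)
```

Returning every problem at once, rather than stopping at the first, makes it easier to tell a user exactly which files to drop or split.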
Does it handle scanned documents?
Yes. The API includes built-in OCR for scanned documents and images. No separate OCR step is needed.
What happens when a field isn't found?
Missing fields return null with a confidence score of 0. You can use confidence thresholds to decide when to flag documents for manual review.