LlamaIndex is commonly used for RAG, document agents, and private-data workflows. Iteration Layer fits before and after LlamaIndex: convert messy files into clean Markdown before indexing, extract structured fields for business logic, and generate deliverables after the agent has made a decision.
Recommended Path
Use Iteration Layer as an ingestion and tool layer around LlamaIndex:
- Convert PDFs, DOCX files, images, HTML, spreadsheets, and text files to Markdown with Document to Markdown.
- Extract structured fields with Document Extraction.
- Generate reports, spreadsheets, and images from the agent result.
Install
pip install llama-index-core iterationlayer
Convert Before Indexing
from iterationlayer import IterationLayer
from llama_index.core import Document, VectorStoreIndex
client = IterationLayer(api_key="YOUR_API_KEY")
converted = client.convert_document_to_markdown(
    file={
        "type": "url",
        "name": "policy.pdf",
        "url": "https://example.com/policy.pdf",
    }
)
documents = [Document(text=converted["markdown"], metadata={"source": "policy.pdf"})]
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
Add Structured Extraction
Use extraction when the workflow needs reliable fields, not just retrieved passages:
result = client.extract_document(
    files=[
        {
            "type": "url",
            "name": "contract.pdf",
            "url": "https://example.com/contract.pdf",
        }
    ],
    schema={
        "fields": [
            {"name": "counterparty", "type": "TEXT", "description": "Contract counterparty"},
            {"name": "renewal_date", "type": "DATE", "description": "Renewal date"},
            {"name": "termination_notice_days", "type": "INTEGER", "description": "Notice period in days"},
        ]
    },
)
Production Guidance
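Extracted fields such as `renewal_date` and `termination_notice_days` can drive plain business logic without a model in the loop. The following is a minimal sketch, not part of either library: the function names, the 30-day review window, and the assumption that a DATE field arrives as an ISO 8601 string (YYYY-MM-DD) are all illustrative. How the values are read out of the extraction result depends on the response shape, so the helpers take plain values.

```python
from datetime import date, timedelta


def notice_deadline(renewal_date: str, termination_notice_days: int) -> date:
    """Return the last day on which termination notice can still be given.

    Assumes renewal_date is an ISO 8601 date string (YYYY-MM-DD).
    """
    if termination_notice_days < 0:
        raise ValueError("termination_notice_days must be non-negative")
    return date.fromisoformat(renewal_date) - timedelta(days=termination_notice_days)


def needs_review(renewal_date: str, termination_notice_days: int, today: date) -> bool:
    """Flag contracts whose notice deadline falls within the next 30 days."""
    deadline = notice_deadline(renewal_date, termination_notice_days)
    return today <= deadline <= today + timedelta(days=30)
```

Keeping this logic in ordinary functions means the same contract produces the same deadline on every run, and the rule can be unit-tested separately from retrieval.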
Use Iteration Layer before indexing when files need OCR, layout-aware Markdown, or structured extraction that should be consistent across runs. Store the converted Markdown, extraction result, citations, and source metadata alongside your LlamaIndex documents so downstream answers can be audited.
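One way to keep answers auditable is to attach a small provenance record to each document's metadata. The sketch below is an assumption about how such a record might look, not a prescribed format: `build_audit_metadata` and its field names are hypothetical, and the extraction result is treated as an opaque dictionary. Values are kept flat (strings only), since flat metadata is generally the safest shape for vector stores.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def build_audit_metadata(
    source_name: str,
    source_url: str,
    markdown: str,
    extraction: Optional[dict] = None,
) -> dict:
    """Build a flat, JSON-serializable provenance record for one converted file."""
    return {
        "source": source_name,
        "source_url": source_url,
        # Hash of the exact Markdown that was indexed, so an answer can be
        # traced back to one specific conversion output.
        "markdown_sha256": hashlib.sha256(markdown.encode("utf-8")).hexdigest(),
        # Serialize the extraction result to a string to keep the metadata flat.
        "extraction_json": json.dumps(extraction or {}, sort_keys=True),
        # UTC timestamp of when this record was built.
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
```

The returned dictionary could be passed as `metadata=` when constructing the `Document`, in place of the single `source` key used earlier. Citations, if available, could be stored the same way as an additional serialized field.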
For production RAG pipelines, run document conversion and extraction as deterministic ingestion steps before retrieval. Use LlamaIndex agents for interactive workflows where the model should decide which document or generated output to request next.
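Treating conversion as a deterministic ingestion step usually means converting each file once and reusing the stored result on later runs. The sketch below illustrates that idea with a file-based cache; `convert_once`, the cache layout, and the injected `convert` callable are hypothetical. In practice `convert` could wrap the `client.convert_document_to_markdown` call shown earlier. Keying on the URL is a simplification: if the content behind a URL can change, a hash of the file bytes is likely a better key.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def convert_once(source_url: str, convert: Callable[[str], str], cache_dir: Path) -> str:
    """Return Markdown for source_url, calling convert only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Derive a stable file name from the URL.
    key = hashlib.sha256(source_url.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.json"

    if cache_file.exists():
        # Reuse the stored conversion so re-indexing sees identical text.
        return json.loads(cache_file.read_text(encoding="utf-8"))["markdown"]

    markdown = convert(source_url)
    cache_file.write_text(
        json.dumps({"source_url": source_url, "markdown": markdown}),
        encoding="utf-8",
    )
    return markdown
```

With this shape, rebuilding the index after a schema or chunking change does not trigger new conversions, and the cached files double as the stored Markdown for audits.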