Not All Fields Are Created Equal
When you send a schema with 15 fields to the Document Extraction API, the parser doesn’t just extract them one by one in the order you listed them. That would be slow and wasteful — some fields depend on others, some can be extracted together, and some need special handling.
Instead, the parser builds an extraction plan before it processes a single field. This plan groups fields by type and effort, resolves dependencies between CALCULATED fields and their sources, and handles ARRAY pagination for large tables. The result is faster, more accurate extraction.
This post walks through how the planner works. None of this is required knowledge to use the API — planning is fully automatic. But if you’re evaluating the parser for production use, understanding the internals helps you design better schemas and predict extraction behavior.
Step 1: Dependency Resolution
CALCULATED fields depend on other fields. A computed_total that sums subtotal and tax_amount can’t be computed until both source fields are extracted. A chained calculation where tax_amount = line_total × tax_rate and line_total = unit_price × quantity has a three-level dependency chain.
The planner builds a dependency graph from your schema. It identifies which fields are independent (no dependencies) and which fields must wait for others. The independent fields are scheduled first. Dependent fields are scheduled after their sources resolve.
Independent: invoice_number, vendor_name, invoice_date, subtotal, tax_amount
Dependent: computed_total (waits for subtotal, tax_amount)
Circular dependencies — where field A depends on B and B depends on A — are detected at plan time and rejected with a clear error, before any extraction starts.
How Dependency Resolution Actually Works
The planner uses topological sorting — the same algorithm package managers use to determine install order. Each CALCULATED field declares its source_field_names, which creates edges in a directed graph. The planner walks this graph and produces an ordered list of extraction phases.
Consider this schema:
```typescript
const schema = {
  fields: [
    { name: "unit_price", type: "CURRENCY_AMOUNT", description: "Price per unit" },
    { name: "quantity", type: "INTEGER", description: "Number of units" },
    { name: "line_total", type: "CALCULATED", operation: "multiply", source_field_names: ["unit_price", "quantity"] },
    { name: "tax_rate", type: "DECIMAL", description: "Tax rate as a decimal" },
    { name: "tax_amount", type: "CALCULATED", operation: "multiply", source_field_names: ["line_total", "tax_rate"] },
    { name: "grand_total", type: "CALCULATED", operation: "sum", source_field_names: ["line_total", "tax_amount"] },
  ],
};
```
The dependency graph has four levels. Level 0: unit_price, quantity, tax_rate (no dependencies — these are extracted from the document). Level 1: line_total (depends on unit_price and quantity). Level 2: tax_amount (depends on line_total and tax_rate). Level 3: grand_total (depends on line_total and tax_amount).
The planner resolves this in exactly four phases. It doesn’t matter what order you list the fields in your schema — the planner reorders them based on the dependency graph. You could put grand_total first and unit_price last, and the extraction order would be identical.
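The level assignment can be sketched with a small depth-first topological sort. This is illustrative code under the assumption that each CALCULATED field declares its source_field_names, not the parser’s actual implementation; it also shows how a circular dependency surfaces as a plan-time error.

```typescript
type FieldDef = { name: string; source_field_names?: string[] };

// Assign each field a level: 0 for fields with no dependencies,
// otherwise one more than its deepest source. Fields at the same
// level land in the same extraction phase.
function planPhases(fields: FieldDef[]): string[][] {
  const deps = new Map<string, string[]>();
  for (const f of fields) deps.set(f.name, f.source_field_names ?? []);
  const level = new Map<string, number>();

  const resolve = (name: string, seen: Set<string>): number => {
    if (level.has(name)) return level.get(name)!;
    if (seen.has(name)) throw new Error(`Circular dependency involving "${name}"`);
    seen.add(name);
    const sources = deps.get(name) ?? [];
    const lvl = sources.length === 0 ? 0 : 1 + Math.max(...sources.map((s) => resolve(s, seen)));
    seen.delete(name);
    level.set(name, lvl);
    return lvl;
  };

  for (const f of fields) resolve(f.name, new Set());
  const phases: string[][] = [];
  for (const [name, lvl] of level) (phases[lvl] ??= []).push(name);
  return phases;
}

const phases = planPhases([
  { name: "unit_price" },
  { name: "quantity" },
  { name: "tax_rate" },
  { name: "line_total", source_field_names: ["unit_price", "quantity"] },
  { name: "tax_amount", source_field_names: ["line_total", "tax_rate"] },
  { name: "grand_total", source_field_names: ["line_total", "tax_amount"] },
]);
// phases[0] is ["unit_price", "quantity", "tax_rate"]; phases[3] is ["grand_total"]
```

Reordering the input array changes nothing: the phases fall out of the graph, not the listing order.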
Step 2: Effort Grouping
Not all field types cost the same to extract. A TEXT field that reads a clearly labeled invoice number is straightforward. An ARRAY field that needs to paginate through a 50-row line items table is substantially more work.
The planner groups fields by effort:
- Simple fields — TEXT, INTEGER, DECIMAL, BOOLEAN, DATE, DATETIME, TIME, EMAIL, ENUM, COUNTRY, CURRENCY_CODE. These are typically found in fixed positions with clear labels.
- Structured fields — ADDRESS, CURRENCY_AMOUNT, IBAN. These require understanding context and structure beyond simple text extraction.
- Complex fields — ARRAY. These involve identifying table boundaries, understanding column headers, and potentially paginating through multi-page tables.
- Computed fields — CALCULATED. These don’t require extraction at all — they’re computed from other field values after extraction.
The planner processes simple fields first (they’re fast and often provide context for harder extractions), then structured fields, then complex fields, and finally computed fields.
This ordering is deliberate. Simple fields like invoice number, date, and vendor name provide context that helps the parser locate more complex data. If the parser already knows this is an invoice from “Acme Corp” dated “2026-01-15”, it has better context for interpreting the line items table and computing totals.
Within each effort group, fields that don’t depend on each other can be extracted concurrently. The planner identifies these opportunities and batches independent fields together. This is why a schema with 15 simple fields doesn’t take 15 times as long as a schema with one — the planner groups them into a single extraction pass.
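A tier-assignment helper consistent with the grouping above might look like this. The tier names and the stable sort are illustrative assumptions, not the planner’s real data structures.

```typescript
// Field types per effort tier, mirroring the groups described above.
const SIMPLE = new Set(["TEXT", "INTEGER", "DECIMAL", "BOOLEAN", "DATE", "DATETIME",
  "TIME", "EMAIL", "ENUM", "COUNTRY", "CURRENCY_CODE"]);
const STRUCTURED = new Set(["ADDRESS", "CURRENCY_AMOUNT", "IBAN"]);

function effortTier(type: string): "simple" | "structured" | "complex" | "computed" {
  if (SIMPLE.has(type)) return "simple";
  if (STRUCTURED.has(type)) return "structured";
  if (type === "ARRAY") return "complex";
  return "computed"; // CALCULATED
}

const tierOrder = ["simple", "structured", "complex", "computed"];
const fields = [
  { name: "line_items", type: "ARRAY" },
  { name: "invoice_number", type: "TEXT" },
  { name: "vendor_address", type: "ADDRESS" },
  { name: "computed_total", type: "CALCULATED" },
];

// A stable sort by tier reproduces the processing order:
// simple, then structured, then complex, then computed.
const planned = [...fields].sort(
  (a, b) => tierOrder.indexOf(effortTier(a.type)) - tierOrder.indexOf(effortTier(b.type)),
);
// planned: invoice_number, vendor_address, line_items, computed_total
```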
Step 3: ARRAY Pagination
ARRAY fields handle tabular data — invoice line items, director lists, transaction records. A table might span multiple pages. A table might have 200 rows.
The planner doesn’t try to extract all rows at once. It paginates through the table, extracting chunks of rows and assembling them into the final result. This handles arbitrarily long tables without hitting context limits or sacrificing accuracy on individual rows.
The pagination is automatic. You define the ARRAY field with its nested schema, and the planner figures out how to chunk the extraction based on table size and document structure.
The planner handles several edge cases in table extraction. Tables that split across page boundaries — where a row starts on one page and continues on the next — are reassembled into complete rows. Tables with merged cells or multi-line cell content are normalized into a consistent row structure. Tables with subtotal rows interspersed between data rows are handled without the subtotal rows contaminating the extracted data.
For very long tables — 100+ rows across multiple pages — the planner also manages context windows. Each chunk includes enough surrounding context (column headers, page identifiers) to maintain extraction accuracy without repeating the entire table on every pass.
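The chunking loop can be sketched as follows, with extractRows standing in for the per-chunk model call the parser would make, and a chunk size chosen purely for illustration.

```typescript
type Row = Record<string, string>;

// Extract a long table in fixed-size chunks and assemble the results.
// Each pass would carry column headers and page context, never the
// entire table; chunkSize = 25 is an illustrative assumption.
function extractTable(
  totalRows: number,
  extractRows: (offset: number, limit: number) => Row[],
  chunkSize = 25,
): Row[] {
  const rows: Row[] = [];
  for (let offset = 0; offset < totalRows; offset += chunkSize) {
    rows.push(...extractRows(offset, Math.min(chunkSize, totalRows - offset)));
  }
  return rows;
}

// A fake extractor that "reads" sequential line numbers:
const rows = extractTable(60, (offset, limit) =>
  Array.from({ length: limit }, (_, i) => ({ line: String(offset + i + 1) })),
);
// 60 rows assembled from chunks of 25, 25, and 10
```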
Step 4: Source Coordination
When your extraction request includes multiple files (multi-file extraction supports up to 20), the planner coordinates extraction across all sources. Fields might appear in different files — an invoice number in one document and a corresponding purchase order number in another.
The planner tracks which source file each field value came from, and the response includes a source field so you know the provenance of every extracted value.
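As a sketch, a multi-file result with provenance might be shaped like this. The property names here are assumptions based on the description above, not the documented response schema; check the API reference for the exact shape.

```typescript
// Hypothetical result shape: each extracted value carries the file
// it came from, so provenance survives multi-file extraction.
interface ExtractedValue {
  value: string | number | boolean | null;
  source: string; // which input file the value was extracted from
}

const result: Record<string, ExtractedValue> = {
  invoice_number: { value: "INV-2041", source: "invoice.pdf" },
  po_number: { value: "PO-7788", source: "purchase_order.pdf" },
};
```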
What This Means for Schema Design
Understanding the planner helps you write better schemas:
Name your CALCULATED field sources explicitly. The planner resolves dependencies by field name. If your CALCULATED field references subtotal, make sure there’s a field named exactly subtotal in your schema.
Use ARRAY for any repeating data. Don’t try to extract individual rows as separate fields (lineItem1Description, lineItem2Description). Use an ARRAY field with a nested schema. The planner handles tables of any length.
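For example, a single ARRAY field with a nested row schema replaces a run of lineItemN fields. The nesting key used here (item_schema) is an assumption for illustration; use whatever key the schema reference specifies.

```typescript
// One ARRAY field covers every row, however many there are.
const lineItems = {
  name: "line_items",
  type: "ARRAY",
  description: "All line items on the invoice",
  item_schema: [
    { name: "description", type: "TEXT", description: "Item description" },
    { name: "quantity", type: "INTEGER", description: "Units ordered" },
    { name: "unit_price", type: "CURRENCY_AMOUNT", description: "Price per unit" },
  ],
};
```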
Keep descriptions specific. The field description helps the parser locate the right data. “Invoice total” is good. “The total amount due for this invoice including tax and shipping” is better — it distinguishes the total from subtotals and other numeric values.
Mark critical fields as required. Field order in your schema doesn’t matter (the planner reorders internally), but setting is_required: true on critical fields tells the parser to prioritize accuracy on them.
Write precise descriptions for ambiguous fields. The parser uses your field descriptions to locate data. A document might contain five different dates — invoice date, due date, delivery date, order date, payment date. “Date” as a description forces the parser to guess. “Invoice issue date” narrows it down. The planner doesn’t resolve ambiguity in descriptions, but precise descriptions reduce the chances of mismatched extractions downstream.
A Complete Planning Example
Here’s what the planner does with a typical invoice schema containing 12 fields:
Schema input:
invoice_number (TEXT), vendor_name (TEXT), invoice_date (DATE),
due_date (DATE), currency (CURRENCY_CODE), vendor_address (ADDRESS),
line_items (ARRAY), subtotal (CURRENCY_AMOUNT), tax_rate (DECIMAL),
tax_amount (CURRENCY_AMOUNT), printed_total (CURRENCY_AMOUNT),
computed_total (CALCULATED: sum of subtotal + tax_amount)
Planner output:
Phase 1 (simple): invoice_number, vendor_name, invoice_date, due_date, currency, tax_rate
Phase 2 (structured): vendor_address, subtotal, tax_amount, printed_total
Phase 3 (complex): line_items
Phase 4 (computed): computed_total
Six simple fields, including the DECIMAL tax_rate, are extracted together in the first pass. Four structured fields (three CURRENCY_AMOUNT fields and one ADDRESS) go next. The ARRAY field gets its own pass with pagination support. Finally, the CALCULATED field is computed from the already-extracted subtotal and tax_amount. Four phases instead of twelve sequential extractions.
Performance Implications
The extraction plan runs before any field is processed. The planning itself is fast — it’s graph traversal, not AI inference. The benefit is that the subsequent extraction is more efficient because fields are grouped and ordered optimally.
For a typical 10-field schema on a single document, the planning overhead is negligible. For a complex schema with ARRAY fields, CALCULATED chains, and 20 files, the planning step saves significant time by avoiding redundant work and ordering operations correctly.
What’s Next
Every value the extraction plan produces can feed directly into Image Generation or Document Generation — same auth, same credit pool.
Get Started
The extraction planner is automatic — you don’t configure it. Design your schema, send your documents, and the parser handles the planning internally. Check the docs for schema design best practices and field type reference.
The TypeScript and Python SDKs handle request building and response parsing. Sign up for a free account — no credit card required.