Iteration Layer

The Document Intake Contract Nobody Designs Until It Breaks

The Workflow Starts Before Extraction

Most document automation diagrams start too late.

They show a file entering an extraction step, a JSON response coming back, and a downstream system receiving clean fields. That diagram skips the part where many production failures are born: intake.

The file did not appear from nowhere. It came from an upload form, email inbox, webhook, shared drive, customer portal, automation tool, or support ticket. It arrived with a tenant, a source, a filename, a sender, a case, an expected purpose, and a set of assumptions. If those assumptions are not captured at the boundary, every later step has to guess.

Extraction then becomes responsible for too much. It has to infer document type, decide whether the file belongs to the current workflow, handle duplicates, explain wrong uploads, group attachments, pick schemas, and route review. That is not a content-processing problem. It is an intake contract problem.

Designing intake as a contract sounds bureaucratic. It is the opposite. It makes the rest of the workflow simpler because every step knows what it is allowed to assume.

The same boundary shows up in large document packet workflows, where a single upload may contain several evidence sets.

Create a Job Before You Process the File

A simpler way to think about intake is this: create the job before you process the file.

A file is input. The workflow record is the job that explains why the file matters.

That job should exist before any processing call happens. It does not need to be complicated. It should identify who owns the file, what workflow it belongs to, where it came from, what the workflow expected, and what should happen next.

Those fields decide behavior.

If the tenant is missing, you may not know which data boundary applies. If the workflow is missing, you do not know which schema or review queue to use. If the source channel is missing, you lose evidence about trust. If the expected document type is missing, a wrong upload may look like a generic extraction failure.

The job becomes the place where state accumulates: accepted, rejected, extracted, needs review, approved, generated, exported, failed, retried. Without that record, state leaks into filenames, logs, temporary tables, webhook payloads, and reviewer notes.
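As a sketch, a minimal job record might look like the following. The field and state names are illustrative assumptions, not a fixed API; the point is that state transitions land on one record instead of leaking elsewhere.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class JobState(str, Enum):
    ACCEPTED = "accepted"
    REJECTED = "rejected"
    EXTRACTED = "extracted"
    NEEDS_REVIEW = "needs_review"
    APPROVED = "approved"
    GENERATED = "generated"
    EXPORTED = "exported"
    FAILED = "failed"

@dataclass
class IntakeJob:
    tenant_id: str              # who owns the file (data boundary)
    workflow: str               # which schema and review queue apply
    source_channel: str         # upload form, email, webhook, portal...
    expected_doc_type: str      # what the workflow asked for
    file_id: str
    state: JobState = JobState.ACCEPTED
    rejection_reason: Optional[str] = None
    history: list = field(default_factory=list)

    def transition(self, new_state: JobState, reason: Optional[str] = None) -> None:
        # State accumulates on the job record, not in filenames or logs.
        self.history.append((self.state, new_state, reason))
        self.state = new_state
        self.rejection_reason = reason
```

Everything downstream, including support and review, reads this record instead of reconstructing the story from log lines.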

Leaking state that way works for demos. It breaks the moment users upload real files.

Stop Bad Inputs Early

Some failures should happen before content processing starts.

Unsupported file type. Missing tenant. Wrong workflow. Duplicate upload. Password-protected PDF. File too large for this product path. Attachment from an unknown sender. Required companion file missing. Document type does not match what the user was asked to provide.

If these checks happen after extraction, the system wastes work and creates confusing state. A reviewer may see a task for a file that never should have been processed. A user may see a vague failure after waiting. Support may only find an error that says the downstream step could not parse a value.

Early validation should produce explicit rejection reasons. unsupported_file_type is better than failed. wrong_document_type is better than empty_fields. duplicate_upload is better than running the same workflow twice and hoping idempotency catches it later.

Validation should not become a wall that rejects every unusual but legitimate document. Many businesses receive messy files because their customers, suppliers, or partners do not share the same systems. A strict validator can block real work.

The practical design is not accept everything or reject everything. It is accept, reject, or route to review. Intake should know which path a file took and why.
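A minimal sketch of that three-way decision, assuming illustrative limits and a precomputed file hash for duplicate detection:

```python
SUPPORTED_TYPES = {"pdf", "docx", "png", "jpg"}   # illustrative, not exhaustive
MAX_BYTES = 20_000_000                            # assumed product limit

def validate_intake(ext, size_bytes, tenant_id, expected_type,
                    declared_type, file_hash, seen_hashes):
    """Return (decision, reason); decision is accept, reject, or review."""
    if not tenant_id:
        return "reject", "missing_tenant"
    if ext.lower() not in SUPPORTED_TYPES:
        return "reject", "unsupported_file_type"
    if size_bytes > MAX_BYTES:
        return "reject", "file_too_large"
    if file_hash in seen_hashes:
        return "reject", "duplicate_upload"
    if declared_type and declared_type != expected_type:
        # Unusual but possibly legitimate: route to a human, don't auto-reject.
        return "review", "wrong_document_type"
    return "accept", None
```

Note that the type mismatch routes to review rather than rejection: that is the escape hatch for messy but legitimate documents.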

Treat Source as a Signal

Not all files deserve the same automation path.

A PDF arriving from a verified accounting integration is different from a PDF uploaded through a public form. An email attachment from a known supplier is different from an attachment forwarded by an unknown address. A document uploaded by an authenticated admin is different from one uploaded by an end customer during onboarding.

The content may look similar. The workflow risk is not the same.

Source trust can affect which schema runs, which confidence thresholds apply, whether a human must approve the document type, whether generated outputs can be sent automatically, how duplicates are detected, and how long your application retains the source file.

For example, an invoice pulled from a verified supplier portal might go straight to field extraction and accounting review. An invoice-looking file from a public upload form might first need tenant matching and duplicate checks. A contract from a sales ops system might be trusted enough to generate a renewal notice after extraction. A contract uploaded by a customer might require review before any generated output leaves the system.

This is not about distrusting users by default. It is about keeping workflow risk attached to the source context. Intake is where that context is still available.
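One way to keep that context attached is a policy table keyed by source channel. The channel names and thresholds below are assumptions for illustration; the useful property is that an unknown source falls back to the strictest path rather than the most permissive one.

```python
# Illustrative trust policy per source channel.
TRUST_POLICY = {
    "verified_integration": {"min_confidence": 0.80, "confirm_type": False, "auto_send": True},
    "known_supplier_email": {"min_confidence": 0.90, "confirm_type": False, "auto_send": False},
    "public_upload_form":   {"min_confidence": 0.95, "confirm_type": True,  "auto_send": False},
}

def policy_for(source_channel: str) -> dict:
    # Unknown sources get the strictest treatment by default.
    return TRUST_POLICY.get(source_channel, TRUST_POLICY["public_upload_form"])
```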

Decide Whether Files Belong Together

Many document workflows receive more than one file at a time.

An email contains four attachments. A user uploads a zip file. A supplier portal asks for contract, bank proof, and tax certificate. A claims workflow receives photos, invoices, and a report. The system must decide whether these files are independent jobs, one packet, or a mix.

That decision should not be left to extraction.

Grouping should answer which files belong together, which file is primary, which files are supporting evidence, which required files are missing, and whether the workflow can proceed partially. Without that, the processing layer receives a pile of files and has to infer business structure from content alone.

Sometimes related files should be processed together because they support one output object. A base contract and amendment may need shared context. A bank letter and supplier form may both contribute to one supplier record. Sometimes files should be split because they represent separate business objects. Ten invoices in one email are usually ten invoice jobs, not one giant invoice packet.

The tradeoff is context versus isolation. Grouping related evidence can improve workflow decisions, but grouping unrelated files creates noise and makes review harder. Intake should define the boundary before extraction starts.
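Both halves of that tradeoff can be made explicit at intake. The sketch below assumes files arrive tagged with a role (how roles get assigned is a separate classification problem); one function assembles a packet and reports missing required roles, the other splits independent business objects into separate jobs.

```python
def assemble_packet(files, required_roles):
    """files: list of (file_id, role) pairs; roles are illustrative labels
    like 'contract', 'bank_proof', 'tax_certificate'."""
    by_role = {}
    for file_id, role in files:
        by_role.setdefault(role, []).append(file_id)
    missing = [r for r in required_roles if r not in by_role]
    return {"files": by_role, "missing": missing, "can_proceed": not missing}

def split_independent(files):
    # Ten invoices in one email are ten jobs, not one packet.
    return [[f] for f in files]
```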

Intake Chooses the First Operation

Once intake has the job context, it should decide the next operation.

Not every accepted file should go to field extraction. A long policy document may need Markdown conversion for search or review. A wrong upload may need rejection. A low-trust file may need classification or human review before processing. A workflow with existing structured data may skip extraction and generate a document or spreadsheet.

This decision is where many systems become tangled. They send every file to the same parser and then try to recover intent from the result. If the output is empty, maybe the document was wrong. If fields are low confidence, maybe the file was unreadable. If the schema does not fit, maybe the workflow should have converted the document first.

The processing step should not have to infer product intent from a filename.

The intake contract should produce a clear next action: extract fields, convert to Markdown, reject, review, generate, or wait for more files. That action can change as the job state changes, but it should always be explicit.
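A minimal dispatch sketch, with job keys that are assumptions for illustration. The ordering encodes the priorities from this section: rejection and missing files are resolved before any content call, and structured data skips extraction entirely.

```python
def next_operation(job: dict) -> str:
    """Map job context to one explicit next action."""
    if job.get("rejected"):
        return "reject"
    if job.get("missing_files"):
        return "wait_for_files"
    if job.get("needs_type_confirmation"):
        return "review"
    if job.get("has_structured_data"):
        return "generate"            # skip extraction entirely
    if job.get("needs_readable_context"):
        return "convert_to_markdown"
    return "extract_fields"
```

Because the decision is a pure function of job state, it can be re-run whenever the state changes, which is exactly the "action can change, but stays explicit" property described above.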

Review Needs Intake Context

Human review is often the right fallback. It is also painful when intake data is missing.

A reviewer should not receive a task that only says “processing failed.” They need to know what was expected, what arrived, where it came from, which fields are uncertain, whether the source is trusted, whether companion files are missing, and what decision they are being asked to make.

There is a difference between reviewing document type, reviewing extracted fields, approving a generated output, and resolving a duplicate. If all of those states collapse into one generic review queue, reviewers become the hidden workflow engine.

Good intake keeps review narrow. It routes the right task to the right person with the smallest necessary context. A reviewer may only need to confirm that a file is a bank letter. Another may need to correct a total amount. Another may need to decide whether a packet is complete. Those are different jobs.

The point is not to remove humans. It is to stop using humans to compensate for missing state.
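One way to keep review narrow is to build each task kind with only the context it needs. The task shapes and field names below are illustrative assumptions:

```python
def build_review_task(job: dict, kind: str) -> dict:
    """kind: confirm_type | correct_fields | approve_output | resolve_duplicate"""
    task = {
        "kind": kind,
        "tenant": job["tenant_id"],
        "source": job["source_channel"],
        "expected_type": job["expected_doc_type"],
    }
    if kind == "confirm_type":
        # Reviewer only confirms the document type, nothing else.
        task["question"] = "Is this the expected document type?"
        task["detected_type"] = job.get("detected_type")
    elif kind == "correct_fields":
        task["uncertain_fields"] = job.get("uncertain_fields", [])
    elif kind == "resolve_duplicate":
        task["duplicate_of"] = job.get("duplicate_of")
    return task
```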

Where Iteration Layer Fits

Iteration Layer handles the content operations after your application has established the intake contract.

Once the job says the workflow needs typed fields, Document Extraction can run the selected schema. When the job needs readable context, Document to Markdown can convert the document. When approved data needs to become a PDF, DOCX, EPUB, PPTX, spreadsheet, CSV, or Markdown output, Document Generation and Sheet Generation handle that side of the workflow.

Your application still owns the job object, routing state, rejection reasons, review decisions, grouping, and retention behavior. That split is important. Content-processing APIs should process content. They should not silently become the place where tenant boundaries, source trust, business rules, and product states are invented.

The benefit of a shared API surface is that intake can choose different operations without sending every branch through a different integration pattern. The workflow remains explicit, and the processing calls stay focused.

Design the Contract Before the Parser

Take one upload endpoint and describe the job it creates before any processing happens.

Can you name the tenant, workflow, source channel, expected document type, submitted document type, source trust level, grouping ID, retention policy, rejection reasons, and next operation? Can support see why a file stopped? Can a reviewer tell what decision they need to make? Can the system retry safely without creating duplicates?

If the answer is no, extraction will become the place where every missing decision lands.

That may work while documents are clean and volume is low. It will not hold when customers upload the wrong file, send five attachments instead of one, retry after a timeout, or ask why a generated output used the wrong value.

The workflow starts before the API call. Design the intake contract first.
