Iteration Layer
Menu

Processing Client Documents Securely: A Field Guide for Technical Agencies

17 min read

The Security Questionnaire That Started This

You land a new enterprise client. The SOW is signed. The kickoff call goes well. Then legal sends over a 47-page security questionnaire, and question 14 reads:

“List all sub-processors that will have access to customer data, including the nature of processing, data categories involved, and the jurisdiction of processing infrastructure.”

You look at your stack. The PDF parsing goes through a US-based extraction API. The image resizing runs through a CDN provider with edge nodes in 40 countries. The document generation uses a SaaS tool that stores templates in AWS us-east-1. And the OCR step — you’re not actually sure where that runs. The vendor’s docs don’t say.

This is the moment most technical agencies realize that data security isn’t just an engineering problem. It’s a supply chain problem. Every third-party service that touches client data is a link in the chain, and your client’s legal team is going to pull on every one.

Security work feels like friction when the client is waiting for a demo to become a workflow. But it can also be what makes the workflow approvable. Stanford’s 2026 Enterprise AI Playbook found that security requirements often started as barriers and later enabled projects to handle sensitive data. The agency version is straightforward: the more sensitive the client documents, the more valuable a controlled processing path becomes.

This guide covers the practical security considerations for agencies processing client documents through third-party APIs — not as a theoretical compliance exercise, but as the concrete decisions and trade-offs that determine whether you pass that security questionnaire or spend six weeks in back-and-forth with legal.

The Data Flow You Need to Map

Secure agency document processing data flow from client upload through API processing, storage, delivery, audit logs, and sub-processors

Before you can answer any security question, you need a clear picture of where client data goes. For a typical agency document workflow, that looks something like this:

  1. Ingestion. The client uploads a document (contract, invoice, report, manuscript). Where does the file land? Your server? An object store? A managed upload service?
  2. Processing. The document gets parsed, transformed, converted, or some combination. Which services touch the raw file? Which services see the extracted content?
  3. Storage. Where do intermediate results live? Where does the final output go? How long are files retained at each step?
  4. Delivery. How does the processed result get back to the client or into the client’s system? What happens to the file after delivery?

For each step, you need to know:

Most agencies can answer these questions for their own infrastructure. The gap is almost always in the third-party processing step.

Zero Retention: What It Actually Means

“We don’t store your data” is a claim many API providers make. But the implementation details matter enormously.

What zero retention should mean:

What it often actually means:

The difference matters. “Temporarily stored for processing” means the file exists on a disk that could be accessed, backed up, or compromised. Even if deletion happens automatically after 5 minutes, during those 5 minutes the file is at rest on infrastructure you don’t control.

Questions to ask your processing provider:

Question Good answer Concerning answer
Is the file written to disk during processing? No, processed entirely in memory “Temporarily written for processing”
Are request payloads logged? Only metadata (endpoint, status, timing) “We log requests for debugging”
Is there any caching of input files? No file caching between requests “Files are cached for 1 hour”
Can support staff access uploaded files? No, files are never persisted “Only with customer permission”
Are failed request bodies captured by error tracking? No, error reports contain only metadata “We use Sentry/Datadog with full payload capture”

Iteration Layer’s approach: files are processed in memory and immediately discarded. No files are written to persistent storage. Logs auto-delete after 90 days and contain only request metadata — never file contents or extracted data. Full details are on the security page.

DPA Chains: Your Obligations and Your Vendors’

If you’re an EU-based agency — or if you process data through multi-tenant document pipelinesGDPR compliance creates a chain of data processing agreements that flows from the data subject through the controller (your client), to you (processor), and to your sub-processors (API providers, cloud infrastructure, SaaS tools).

The chain looks like this:

Data Subject (end user)
    └── Data Controller (your client)
          └── Data Processor (your agency)
                ├── Sub-processor A (document extraction API)
                ├── Sub-processor B (image processing API)
                ├── Sub-processor C (cloud hosting provider)
                └── Sub-processor D (email delivery service)

Your obligations as a processor:

The practical problem: many API providers don’t offer DPAs, or only offer them on enterprise plans. Some offer a generic DPA that doesn’t specify the data categories or processing purposes relevant to your use case. And some don’t respond to DPA requests at all.

What a proper sub-processor DPA should include:

Element Why it matters
Subject matter and purpose of processing Defines scope — “image transformation” vs. vague “data processing”
Categories of personal data Specifies what’s covered — document contents, metadata, etc.
Duration of processing How long the sub-processor has access
Jurisdiction of processing Where infrastructure is located
Technical and organizational measures Encryption, access controls, incident response
Sub-processor’s own sub-processors The chain continues — cloud providers, CDNs, monitoring tools
Audit rights Your right (and your client’s right) to verify compliance
Data deletion obligations What happens after the processing relationship ends
Breach notification timeline GDPR requires 72 hours to the supervisory authority

Iteration Layer’s approach: a DPA is available to all customers — no enterprise plan required. It specifies document and image processing as the purpose, covers all personal data contained in uploaded files, and lists the full sub-processor chain. See the Data Processing Agreement for the complete terms.

Sub-Processor Transparency

This is where most vendor evaluations get stuck. You ask “who are your sub-processors?” and get one of three answers:

  1. A complete list with company names, purposes, and jurisdictions. This is what you need.
  2. A vague reference to “industry-standard cloud providers.” This tells you nothing.
  3. No answer. The vendor either doesn’t track their sub-processors or doesn’t want to disclose them.

Sub-processor transparency matters because your DPA with your client typically requires you to list all sub-processors. If your image processing API uses three undisclosed services under the hood — say, an AI model provider for smart cropping, a CDN for temporary file hosting, and a monitoring service that captures request data — you can’t honestly answer your client’s security questionnaire.

A practical sub-processor evaluation checklist:

The notification requirement is particularly important. GDPR Art. 28(2) requires that processors inform controllers of intended changes to sub-processors, giving the controller the opportunity to object. If your sub-processor (the API provider) adds a new AI service to their pipeline without telling you, and that service processes personal data, you’re in violation of your own DPA with your client — even though you did nothing wrong.

Encryption: In Transit, at Rest, and the Gaps Between

Encryption is table stakes. Every reputable API provider encrypts data in transit (TLS) and at rest (AES-256 or equivalent). But the interesting question is what happens between those two states.

In transit: your files are encrypted over HTTPS between your server and the API. Standard TLS 1.2 or 1.3. Nothing unusual here, and any provider without this is not worth considering.

At rest: if the provider stores files (even temporarily), they should be encrypted on disk. But remember the zero-retention discussion above — if files are never written to disk, encryption at rest is irrelevant because there’s nothing at rest to encrypt. This is a stronger guarantee than encrypted storage.

The gaps:

Questions for your provider:

Area Question
Transit What TLS version is enforced?
At rest Are files written to disk? If so, what encryption?
In memory Are requests processed in isolation?
Internal Is service-to-service communication encrypted?
Logging Do logs contain file content or extracted data?
Backups Are processing results included in backups?

Audit Trails: What You Can Prove

When your client asks “can you demonstrate that document X was processed on date Y and deleted immediately after?” you need an audit trail. But audit trails for document processing are tricky because of a tension: you need to prove processing happened without retaining the processed data.

What a good audit trail includes:

What it should NOT include:

How to build your own audit layer on top of an API:

Log the request metadata on your side before and after the API call. You control what goes into your audit log. A minimal audit entry looks like:

{
  "timestamp": "2026-04-15T09:23:41Z",
  "operation": "document_extraction",
  "client_id": "acme-corp",
  "project_id": "invoice-processing-q2",
  "file_name": "invoice-2026-0412.pdf",
  "file_size_in_bytes": 284731,
  "status": "success",
  "processing_duration_in_ms": 1847,
  "provider": "iterationlayer",
  "data_jurisdiction": "EU"
}

This gives you proof of processing without retaining any document content. Store these logs according to your retention policy (typically aligned with your client’s data retention requirements), and you can answer audit questions years later without ever having kept the original files.

Jurisdiction: Where Your Data Actually Goes

For EU agencies processing EU client data, jurisdiction is not a nice-to-have. It’s a legal requirement under GDPR, and a practical requirement for most enterprise security questionnaires.

The jurisdictional landscape:

Provider type Typical jurisdiction GDPR implication
US cloud giants (AWS, GCP, Azure) US by default, EU regions available Requires Standard Contractual Clauses (SCCs) or adequacy decision. US region is the default for most services.
US SaaS APIs US only Data crosses the Atlantic. SCCs required. Post-Schrems II, supplementary measures may be needed.
EU-hosted APIs EU No international transfer. Simplest compliance path.
Self-hosted / on-premise Your infrastructure Full control, but you bear the full operational burden.

The practical difference: when you use a US-hosted service, you need SCCs in your DPA, you need to assess whether US surveillance laws (FISA 702, EO 12333) create risks for the specific data categories you’re processing, and you need to document supplementary measures if applicable. This isn’t impossible — thousands of companies do it. But it’s additional legal overhead for every vendor in your stack.

When you use an EU-hosted service, none of that applies. No international transfer means no SCCs, no transfer impact assessment, no supplementary measures. The security questionnaire answer is simple: “All processing occurs within the EU. No personal data leaves EU jurisdiction.”

Iteration Layer runs all processing on EU infrastructure. No data crosses EU borders during processing. Full details are on the security page.

Answering the Security Questionnaire

Let’s go back to that 47-page questionnaire. Here are the questions agencies encounter most frequently, and how to structure your answers when you’re using a third-party processing API.

“List all sub-processors”

How to answer: provide a table with the provider name, purpose, data categories, and jurisdiction. Be specific about what each provider does.

Sub-processor Purpose Data categories Jurisdiction
Iteration Layer Document extraction, image transformation, document generation Document content, image files, metadata EU
[Your hosting provider] Application hosting Application data, user accounts [Jurisdiction]
[Your email provider] Transactional email Email addresses, notification content [Jurisdiction]

“How is data encrypted?”

How to answer:

“What is the data retention period?”

How to answer: distinguish between your own retention and your sub-processors’ retention.

“Where is data processed geographically?”

How to answer: list each processing step and its jurisdiction.

“How are security incidents handled?”

How to answer: describe your incident response process AND your sub-processor notification chain.

“Can we audit your processing infrastructure?”

How to answer: distinguish between what you can audit directly and what is covered by your sub-processor’s DPA.

For Iteration Layer specifically, the security page and DPA cover the technical and organizational measures in detail.

The Composability Advantage for Security

Here’s something that doesn’t show up on security questionnaires but has a real impact on your security posture: vendor count.

A typical document processing workflow might involve:

That’s five vendors, five DPAs, five sub-processor entries, five sets of security documentation to review, five breach notification chains to manage, and five vendors to re-evaluate every time your client renews.

Iteration Layer covers document extraction, document-to-markdown conversion, website extraction, image transformation, image generation, document generation, and sheet generation in a single platform. One vendor, one DPA, one sub-processor entry, one security review.

This isn’t just an administrative convenience. Fewer vendors means:

When your client’s legal team asks “how many third parties touch our data?” the answer “one processing platform for all document and image operations” is meaningfully different from “five specialized services, each with their own data handling policies.”

A Practical Security Checklist for Agency Projects

Use this checklist when setting up a new client project that involves document processing:

Before the project starts

During implementation

Before going live

Ongoing

The Trade-Offs of Self-Hosting

Some agencies choose to self-host document processing to maintain full control over client data. That’s a valid choice, and there are scenarios where it’s the right one:

But self-hosting comes with its own security obligations:

For most agency workloads, the security trade-off favors an API with strong data handling practices over self-hosted infrastructure with a growing maintenance burden. The API provider specializes in secure processing. Your agency specializes in building products for clients. Each plays to their strength.

Getting Started

If you’re evaluating processing APIs for an agency project, start with the security documentation:

Both are publicly available without requiring a sales call or NDA. Read them before your client’s legal team asks — it’s easier to answer the security questionnaire when you’ve already reviewed the documentation.

From there, start the 7-day trial, test the APIs with non-sensitive sample data, and evaluate whether the data handling meets your client’s requirements. The SDKs are available for TypeScript, Python, and Go — same APIs, same security model, same zero-retention guarantee regardless of which language you use.

The security questionnaire doesn’t have to be the bottleneck. With the right vendor choices and a clear data flow map, it becomes a documentation exercise instead of a scramble.

Related reading

Learn how to turn the same pattern into production-ready document, image, and automation workflows.

Try with your own data

Start the trial and run this in minutes.