Quillect — Document AI. Any document in. Clean structured data out.

Turn any inbound document — invoices, contracts, forms, statements, KYC packs — into clean, validated, structured data. OCR, layout analysis, schema-constrained LLM extraction, and rule-based validation — with a human controlling the exceptions, not every page.

Built on Amazon Textract / Azure Document Intelligence / Google Document AI / docTR for OCR, Claude / GPT / open-weight models for extraction, and your systems of record for output. Bring your own model (BYOM): a vendor-agnostic alternative to single-vendor intelligent document processing (IDP) suites.

Document AI fails when it is graded as “mostly right.” A 95%-accurate invoice field is a wrong payment one time in twenty. Quillect treats extraction as a measured pipeline — every field carries a confidence score, validation rules catch what models miss, and anything below threshold routes to a human before it ever reaches a system of record.
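
A minimal sketch of that routing rule, assuming each extracted field arrives as a value with a confidence between 0 and 1. The names `Field` and `route`, and the 0.90 threshold, are illustrative assumptions rather than Quillect's actual API or defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    value: str
    confidence: float  # assumed to lie in [0.0, 1.0]


def route(fields: list[Field], threshold: float = 0.90) -> tuple[str, list[str]]:
    """Return ("auto", []) when every field clears the threshold,
    otherwise ("review", [names of the fields that did not])."""
    low = [f.name for f in fields if f.confidence < threshold]
    return ("review", low) if low else ("auto", [])


invoice = [
    Field("invoice_number", "INV-1042", 0.99),
    Field("total", "1180.00", 0.97),
    Field("due_date", "2024-07-31", 0.62),  # e.g. a smudged date on a scan
]
print(route(invoice))  # → ('review', ['due_date'])
```

One low-confidence field is enough to hold the whole record back, which is the conservative choice for documents that move money. A per-field threshold (stricter on amounts than on free-text notes) would be a natural refinement.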

The pipeline

Each stage is a pluggable component. We swap implementations to fit the customer's document mix, volume, and compliance constraints.

  1. Ingestion. Documents arrive from anywhere — monitored email inbox, REST API, SFTP drop, scanner, or web upload. PDFs, images, scans, and office files are normalised to a common internal format.
  2. Classification. An agent identifies the document type (invoice, purchase order, contract, bank statement, ID document…) and routes it to the right extraction schema. Unknown types go to triage, not the bin.
  3. OCR & layout analysis. Amazon Textract, Azure Document Intelligence, Google Document AI, or self-hosted docTR recovers text, tables, key-value pairs, and reading order, from clean PDFs through to crumpled scans.
  4. Schema-constrained extraction. An LLM (Claude or GPT, or an open-weight model on-prem) maps the recognised content onto a defined target schema — constrained to the fields, types, and enums the schema allows. The model cannot invent a field.
  5. Validation. Business rules run on every record: arithmetic checks (line items sum to total), format checks (GSTIN / PAN / IBAN patterns), cross-references against master data, and per-field confidence thresholds.
  6. Human-in-the-loop exceptions. Only records that fail validation or fall below the confidence threshold land in a review queue, side-by-side with the source page. Reviewers correct or approve each record, and that feedback sharpens the schema and prompts.
  7. Structured output. Clean, validated records are pushed to the systems that need them — ERP, accounting, CRM, data warehouse — as JSON, via API, or as a governed table on your Data Platform.
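
Stages 4 and 5 can be sketched as follows. This is an illustration under stated assumptions, not Quillect's implementation: the `Invoice` schema, the `Currency` enum, and the functions `parse_invoice`, `iban_ok`, and `validate` are hypothetical names, and the raw dictionary stands in for whatever the extraction model returns. The IBAN check goes one step beyond a pattern match by also applying the standard mod-97 checksum.

```python
import re
from dataclasses import dataclass, fields
from decimal import Decimal
from enum import Enum


class Currency(Enum):
    EUR = "EUR"
    INR = "INR"
    USD = "USD"


@dataclass(frozen=True)
class Invoice:  # hypothetical target schema
    invoice_number: str
    currency: Currency
    line_amounts: tuple[Decimal, ...]
    total: Decimal
    iban: str


def parse_invoice(raw: dict) -> Invoice:
    """Stage 4: accept only the fields and enum values the schema allows."""
    allowed = {f.name for f in fields(Invoice)}
    unknown, missing = set(raw) - allowed, allowed - set(raw)
    if unknown or missing:
        raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
    return Invoice(
        invoice_number=str(raw["invoice_number"]),
        currency=Currency(raw["currency"]),  # ValueError if outside the enum
        line_amounts=tuple(Decimal(a) for a in raw["line_amounts"]),
        total=Decimal(raw["total"]),
        iban=str(raw["iban"]).replace(" ", "").upper(),
    )


IBAN_SHAPE = re.compile(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}")


def iban_ok(iban: str) -> bool:
    """Format check plus the mod-97 checksum used by IBANs."""
    if not IBAN_SHAPE.fullmatch(iban):
        return False
    rearranged = iban[4:] + iban[:4]  # move country code and check digits to the end
    digits = "".join(str(int(ch, 36)) for ch in rearranged)  # A=10 ... Z=35
    return int(digits) % 97 == 1


def validate(inv: Invoice) -> list[str]:
    """Stage 5: return rule failures; an empty list means the record passes."""
    errors = []
    if sum(inv.line_amounts, Decimal("0")) != inv.total:
        errors.append("line items do not sum to total")
    if not iban_ok(inv.iban):
        errors.append("IBAN fails format or checksum")
    return errors


raw = {
    "invoice_number": "INV-1042",
    "currency": "EUR",
    "line_amounts": ["1000.00", "180.00"],
    "total": "1180.00",
    "iban": "GB82 WEST 1234 5698 7654 32",  # widely published example IBAN
}
print(validate(parse_invoice(raw)))  # → []
```

Amounts are parsed as `Decimal` from strings so that the arithmetic check compares exact values rather than binary floats. A record with any entry in the returned list would go to the review queue in stage 6.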

Why “human controls the exceptions” matters

Fully manual processing does not scale; fully automatic processing is not trustworthy on documents that move money or carry legal weight. Quillect's design point is straight-through automation on the confident majority, with human judgement concentrated where it earns its keep.

What it does well — and what it doesn't

Quillect is honest about its operating envelope.

It does well

  • High-volume, repeating document types — invoices, POs, remittances, statements, forms.
  • Mixed-quality inputs — clean digital PDFs through to photographed and scanned pages.
  • Table and line-item extraction with arithmetic validation.
  • Straight-through processing of the confident majority, with measured exception rates.

It doesn't pretend to do

  • Legal interpretation — it extracts contract terms; it does not advise on them.
  • Decisions on low-confidence records — those route to a human by design.
  • One-off, never-seen-again document types where building a schema costs more than reading by hand.
  • Anything involving sensitive personal data without the proper authorisation and residency context.

How it compares to single-vendor IDP

The cloud vendors ship capable document suites — Amazon Textract, Azure Document Intelligence, Google Document AI. They are strong OCR and form engines. Quillect uses them as components rather than as the whole stack.
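
One way to treat an OCR engine as a swappable component is to hide it behind a small interface. The sketch below is an assumption about how that could look, not Quillect's code: `OcrEngine`, `OcrPage`, `run_ocr`, and both stub engines are hypothetical, and the stubs simply decode bytes where a real adapter would call the vendor's SDK.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class OcrPage:
    text: str
    engine: str


class OcrEngine(Protocol):
    """Hypothetical interface that every OCR backend is wrapped to satisfy."""

    name: str

    def recognise(self, document: bytes) -> OcrPage: ...


class StubCloudEngine:
    """Stand-in for a cloud OCR adapter; a real one would call the vendor SDK."""

    name = "stub-cloud"

    def recognise(self, document: bytes) -> OcrPage:
        return OcrPage(text=document.decode("utf-8"), engine=self.name)


class StubSelfHostedEngine:
    """Stand-in for a self-hosted OCR adapter."""

    name = "stub-self-hosted"

    def recognise(self, document: bytes) -> OcrPage:
        return OcrPage(text=document.decode("utf-8").strip(), engine=self.name)


def run_ocr(document: bytes, engine: OcrEngine) -> OcrPage:
    # The pipeline depends only on the interface, so engines are interchangeable.
    return engine.recognise(document)


for engine in (StubCloudEngine(), StubSelfHostedEngine()):
    print(run_ocr(b"Invoice INV-1042", engine))
```

Because downstream stages see only `OcrPage`, changing the engine for a given document type or residency constraint is a configuration choice rather than a rewrite.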

Quillect is for organisations that want:

Where it fits

Quillect is the document-to-data front door for the rest of the stack. Extracted records land on the Data Platform we build, feed AI-Powered BI and downstream agents, are governed by Responsible AI controls, and have their extraction quality assured by AI Eval Service. Start with a GenAI Readiness Assessment to scope your document mix and volumes.

Related resources

Quillect is one of six production-ready accelerators we run. See document AI live and explore the full accelerator suite.