Use case · US tax returns

US tax return extraction. Line-level fidelity.

Lending, mortgage, audit, and high-net-worth underwriting all depend on US tax returns being parsed accurately, line by line. fluex extracts every numeric line of a 1040, every Form 1099, and the standard schedules (C, E, K-1) into a normalized schema in under five seconds, with a confidence score per field and a complete audit trail.

Why US tax returns are a pain

A complete US tax return is rarely one form. It's a 1040 plus an unpredictable mix of schedules (A, B, C, D, E, K-1) and 1099s (NEC, MISC, INT, DIV, B, R), sometimes 30+ pages of forms with field codes that humans can read at a glance and OCR systems consistently misread. Lenders and underwriters need every line accurate; a misread Schedule C net profit can turn an approval into a denial. Manual data entry takes hours per return; legacy OCR systems reach acceptable accuracy on the 1040 itself but fall over on schedules.

How fluex does it

fluex extracts US tax forms into a flat normalized schema keyed by IRS line number, so line_22_total_income on a 1040 is the same field whether the customer filed with TurboTax, with H&R Block, or on paper. Every line is returned with a confidence score, a bounding box reference back to the source page, and a validation result against IRS arithmetic rules. Schedule attachments are parsed and linked to their parent form automatically.
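To make the shape of such a record concrete, here is a minimal sketch of a flat, key-indexed field structure. It is an illustration only, not the fluex schema: the `ExtractedField` class, its attribute names, and the `index_by_key` helper are hypothetical, and the confidence and bounding-box numbers are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class ExtractedField:
    """One extracted form line (hypothetical shape, not the fluex schema)."""
    key: str                                  # flat key, e.g. "agi_line_11"
    value: str                                # value as read from the form
    confidence: float                         # 0.0 to 1.0
    page: int                                 # 1-based source page
    bbox: Tuple[float, float, float, float]   # (x0, y0, x1, y1) on that page
    validation: Optional[str] = None          # e.g. "ok", or the failed rule


def index_by_key(fields: Iterable[ExtractedField]) -> Dict[str, ExtractedField]:
    """Build the flat key -> field mapping, rejecting duplicate keys."""
    flat: Dict[str, ExtractedField] = {}
    for field in fields:
        if field.key in flat:
            raise ValueError(f"duplicate key: {field.key}")
        flat[field.key] = field
    return flat


# Placeholder confidence and bbox values, for illustration only.
fields = [
    ExtractedField("agi_line_11", "287420.00", 0.99, 1, (0.71, 0.42, 0.93, 0.44), "ok"),
    ExtractedField("taxable_income_line_15", "256140.00", 0.98, 1, (0.71, 0.55, 0.93, 0.57), "ok"),
]
flat = index_by_key(fields)
print(flat["agi_line_11"].value, flat["agi_line_11"].page)
```

Keeping the structure flat means a downstream model can look up a line by key without knowing which tax software produced the return.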

Sample extraction output

doc_type: Form 1040 (2024)
filing_status: Married filing jointly
primary_taxpayer: Jordan T. Hall
ssn: ***-**-1234
agi_line_11: US$ 287,420.00
taxable_income_line_15: US$ 256,140.00
total_tax_line_24: US$ 51,228.00
federal_tax_withheld_line_25a: US$ 48,150.00
refund_line_34: US$ 0.00
amount_owed_line_37: US$ 3,078.00
schedules_attached: Schedule A, Schedule B, Schedule D, Schedule E
forms_1099: 1099-INT, 1099-DIV, 1099-B (×2)
arithmetic_check: ✓ all lines balance
confidence: 0.97 → auto-approved
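The last two rows of the sample suggest a routing decision: a document that balances and scores high enough is approved without review. A minimal sketch of that decision might look like the following. The 0.95 threshold, the `route` function, and the "manual-review" label are assumptions made for illustration; the source only shows that 0.97 with a passing arithmetic check was auto-approved.

```python
AUTO_APPROVE_THRESHOLD = 0.95  # assumed value, not a documented fluex default


def route(confidence: float, arithmetic_ok: bool,
          threshold: float = AUTO_APPROVE_THRESHOLD) -> str:
    """Return 'auto-approved' or 'manual-review' for one extracted document."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    # Both conditions must hold: a failed arithmetic check sends even a
    # high-confidence document to a human reviewer.
    if arithmetic_ok and confidence >= threshold:
        return "auto-approved"
    return "manual-review"


print(route(0.97, arithmetic_ok=True))
print(route(0.97, arithmetic_ok=False))
```

Treating the arithmetic check as a hard gate, rather than folding it into the score, keeps the two signals independently auditable.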

What you get out of the box

Every line, every schedule

Full 1040, Schedules A/B/C/D/E/K-1, and the 1099 family. Returned as flat JSON keyed by IRS line number.

Arithmetic validation

IRS line arithmetic rules are checked automatically. Inconsistencies are flagged before the document reaches your model.
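As an illustration of one such rule, the sketch below checks the refund and amount-owed lines against total tax and payments, using the field names from the sample output. It is a simplified sketch, not the fluex rule set: it assumes withholding on line 25a is the only payment, whereas a real 1040 sums all payments on line 33. The `parse_usd` and `check_balance` helpers are hypothetical.

```python
from decimal import Decimal
from typing import Dict, List


def parse_usd(text: str) -> Decimal:
    """Parse a string such as 'US$ 3,078.00' into a Decimal."""
    return Decimal(text.replace("US$", "").replace(",", "").strip())


def check_balance(fields: Dict[str, str]) -> List[str]:
    """Return a list of inconsistencies; an empty list means the lines balance."""
    total_tax = parse_usd(fields["total_tax_line_24"])
    # Simplifying assumption: withholding is the only payment made.
    payments = parse_usd(fields["federal_tax_withheld_line_25a"])
    refund = parse_usd(fields["refund_line_34"])
    owed = parse_usd(fields["amount_owed_line_37"])

    expected_owed = max(total_tax - payments, Decimal("0"))
    expected_refund = max(payments - total_tax, Decimal("0"))

    problems = []
    if owed != expected_owed:
        problems.append(f"amount owed is {owed}, expected {expected_owed}")
    if refund != expected_refund:
        problems.append(f"refund is {refund}, expected {expected_refund}")
    return problems


sample = {
    "total_tax_line_24": "US$ 51,228.00",
    "federal_tax_withheld_line_25a": "US$ 48,150.00",
    "refund_line_34": "US$ 0.00",
    "amount_owed_line_37": "US$ 3,078.00",
}
print(check_balance(sample))
```

`Decimal` is used instead of `float` so that dollar amounts compare exactly, with no rounding drift.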

Multi-year stitching

Process two or three years of returns at once and get a stitched view ready for underwriting: income trend, AGI history, and deduction patterns.
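A stitched AGI history can be pictured as per-year values sorted by tax year, each with its year-over-year change. The sketch below shows one way to compute that. The `stitch_agi` function, its input shape, and the 2022 and 2023 figures are hypothetical; only the 2024 AGI comes from the sample output.

```python
from decimal import Decimal
from typing import Dict, List


def stitch_agi(returns: List[Dict]) -> List[Dict]:
    """Sort returns by tax year and attach the year-over-year AGI change in percent."""
    history = []
    previous = None
    for item in sorted(returns, key=lambda r: r["tax_year"]):
        agi = Decimal(item["agi"])
        change = None  # no prior year to compare against
        if previous is not None and previous != 0:
            change = ((agi - previous) / previous * 100).quantize(Decimal("0.1"))
        history.append({"tax_year": item["tax_year"], "agi": agi,
                        "yoy_change_pct": change})
        previous = agi
    return history


# 2022 and 2023 values are made up for illustration.
returns = [
    {"tax_year": 2024, "agi": "287420.00"},
    {"tax_year": 2022, "agi": "250000.00"},
    {"tax_year": 2023, "agi": "275000.00"},
]
for row in stitch_agi(returns):
    print(row["tax_year"], row["agi"], row["yoy_change_pct"])
```

Sorting inside the function means the returns can be uploaded in any order.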

PII-aware audit trail

SSNs and dependent identifiers are redacted in metadata by default. Configurable retention from 0 to 7 years.
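The masking shown in the sample output (***-**-1234) can be sketched as a recursive pass over a metadata record. This is an illustrative approach, not a description of how fluex implements redaction: the `mask_ssn` and `redact` helpers are hypothetical, and the pattern only catches SSNs written with dashes.

```python
import re

# Matches dashed SSNs such as 123-45-6789 and captures the last four digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")


def mask_ssn(text: str) -> str:
    """Replace each dashed SSN with a mask that keeps only the last four digits."""
    return SSN_PATTERN.sub(lambda m: f"***-**-{m.group(1)}", text)


def redact(metadata):
    """Recursively mask SSNs in every string inside dicts and lists."""
    if isinstance(metadata, dict):
        return {key: redact(value) for key, value in metadata.items()}
    if isinstance(metadata, list):
        return [redact(value) for value in metadata]
    if isinstance(metadata, str):
        return mask_ssn(metadata)
    return metadata  # numbers, booleans, None pass through unchanged


event = {
    "action": "field_viewed",
    "detail": "ssn 123-45-1234 read from page 1",
    "dependents": ["987-65-4321"],
    "page": 1,
}
print(redact(event))
```

The function returns a new structure and leaves the input untouched, so the unredacted record never has to be mutated in place.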

Integration patterns

For mortgage lenders and underwriters who need fast tax-return parsing, fluex's async mode handles 50 to 100 page returns in under 30 seconds end to end. Direct integrations exist for Encompass, Blend, and lender CRMs. The REST API can also deliver structured JSON straight into your underwriting model.
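On the client side, an async job is typically consumed with a submit-then-poll loop. The sketch below shows only the polling half, and it is deliberately generic: it takes a caller-supplied `fetch_status` function rather than naming any endpoint, and the status values "done" and "failed" and the "result" key are assumptions, not the documented fluex API.

```python
import time
from typing import Callable, Dict


def wait_for_result(fetch_status: Callable[[], Dict],
                    timeout_s: float = 30.0,
                    interval_s: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> Dict:
    """Poll fetch_status() until the job finishes, fails, or times out."""
    deadline = clock() + timeout_s
    while True:
        job = fetch_status()
        if job["status"] == "done":        # assumed status value
            return job["result"]
        if job["status"] == "failed":      # assumed status value
            raise RuntimeError(job.get("error", "extraction failed"))
        if clock() >= deadline:
            raise TimeoutError(f"no result after {timeout_s} seconds")
        sleep(interval_s)


# Stand-in for a real status call: the second poll reports completion.
responses = iter([
    {"status": "processing"},
    {"status": "done", "result": {"agi_line_11": "287420.00"}},
])
result = wait_for_result(lambda: next(responses), sleep=lambda seconds: None)
print(result)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; in production the defaults apply.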

Compliance & trust

Tax returns are highly sensitive. fluex stores them encrypted at rest with per-tenant keys and configurable retention (default 90 days, can be 0). PII is redacted in audit metadata by default. See our trust page for the full posture: encryption, tenant isolation, sub-processors, GDPR DPA, CCPA, SOC 2 Type II in progress, and HIPAA BAA on Enterprise for healthcare-adjacent workflows.

Get started

Pay-per-page pricing means you can start an evaluation today without an annual commit. Most teams ship their first tax-return extraction into production within a week.