Document Classification
Borrower packages, indexed on intake.
LendPipe classifies every file in a borrower package by type, period, and entity — across 70+ document categories — and flags what's missing or duplicated, so review starts with an organized file instead of a folder of PDFs.
What it does
Intake that turns a 30-file zip into a complete, organized credit file.
A commercial borrower package arrives as a mess. Twenty-eight files in a Dropbox folder, named `scan0042.pdf`, `IMG_4821.pdf`, `tax_return_FINAL_v3.pdf`, and `financials (2).xlsx`. Some are statements; some are tax returns; some are the same document scanned twice because the borrower thought they hadn't sent it. Three documents are missing entirely, but you won't know which until an analyst spends 45 minutes opening every file and matching them against the checklist.
LendPipe's intake layer reads every file on upload, identifies what it is, and routes it to the right place in the deal file. A scanned PDF that says "Form 1120-S" at the top is classified as a corporate tax return, the period is detected from the form headers, the borrower entity is matched against the application, and the document is renamed to a structured naming convention. Bank statements are grouped by account and period. Personal financial statements are attributed to the right guarantor. Duplicates are detected and consolidated. Missing documents are surfaced against your institution's required-document checklist before an analyst opens the deal. What the analyst sees is an organized credit file, not a folder of cryptically named PDFs.
Capabilities
What document intake handles
70+ document categories
Corporate and personal tax returns by form type, audited and compiled financials, interim statements, bank statements, PFS, rent rolls, K-1s, operating statements, A/R aging, COI, and the supporting schedules your institution accepts.
Period and entity detection
Each document is tagged with its reporting period (FY2024, Q3 2024, October statement) and attributed to the correct entity — operating company, holding entity, guarantor, or affiliated borrower.
Duplicate and version detection
The same document uploaded twice — under different filenames, or after a borrower re-sent it — is detected and consolidated. Later versions of the same document are linked to earlier versions for audit trail.
Missing-document checklist
Required documents for the product type are checked against the uploaded package. Missing items are surfaced as a structured checklist before analyst review begins, with the option to message the borrower directly from the platform.
Multi-entity routing
Documents for the operating company, real estate holding company, guarantor, and affiliated entities are routed to the right entity's file. Multi-entity deals show a complete picture rather than a flat list of PDFs.
Renaming and audit trail
Files are renamed to a structured convention (entity / type / period) while the original filename is preserved in the audit trail. Examiner reviews and committee questions can be answered without reorganizing the folder.
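As a rough illustration of the entity / type / period convention, the sketch below builds a structured filename and keeps the original name alongside it. The separator characters, the `ClassifiedDocument` fields, and the `structured_name` helper are assumptions made for this example, not LendPipe's actual naming scheme.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassifiedDocument:
    """Hypothetical record for one classified file."""
    original_filename: str  # preserved so the audit trail can show it
    entity: str
    doc_type: str
    period: str


def slug(text: str) -> str:
    """Collapse every run of non-alphanumeric characters into one hyphen."""
    return re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-")


def structured_name(doc: ClassifiedDocument) -> str:
    """Build an entity_type_period filename, keeping the original extension."""
    _, dot, ext = doc.original_filename.rpartition(".")
    suffix = f".{ext.lower()}" if dot else ""
    parts = [slug(doc.entity), slug(doc.doc_type), slug(doc.period)]
    return "_".join(parts) + suffix


doc = ClassifiedDocument(
    original_filename="IMG_4821.pdf",
    entity="Robert Chen",
    doc_type="Personal Financial Statement",
    period="October 2024",
)
# The audit entry pairs the original name with the structured one.
audit_entry = {"original": doc.original_filename, "renamed": structured_name(doc)}
print(audit_entry)
```

Keeping both names in one record means the rename never has to be reversed to answer "which file did the borrower actually send?".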
How it works
From folder of PDFs to organized credit file
- 01
Files arrive in any format and any structure
Borrowers upload through the portal, brokers forward email packages, your team drops files from a network share, or an integration pushes documents via API. The intake layer accepts whatever arrives — including the zip-of-zips that brokers love and the multi-page composite PDFs that nobody loves.
- 02
Each file classified by content, not filename
Classification is content-based: the document is read and matched against the 70+ category taxonomy by what it actually contains, not by what it's called. `IMG_4821.pdf` becomes "Personal Financial Statement — Robert Chen — October 2024". Multi-document PDFs are split into their constituent documents before classification.
- 03
Period, entity, and ownership context applied
Period is extracted from the document's own headers and dates. Entity attribution uses the legal name on the document, cross-referenced with the application's entity list. Guarantor PFS is attributed to the right individual; K-1s flow to the correct partner; multi-entity tax returns route to the correct entity in the deal structure.
- 04
Checklist reconciled and file presented
The package is reconciled against your institution's required-document checklist for the product type. Required-and-present items are checked off; required-and-missing items are surfaced with the option to request them from the borrower. Duplicates are consolidated. The organized credit file is presented to the analyst as their starting point.
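The reconciliation in step 04 can be pictured as a comparison between two lists of (document type, period) pairs. The following is a minimal sketch under that assumption; the `reconcile` function, the tuple shape, and the sample checklist are hypothetical and only illustrate the present / missing / duplicate / extra split described above.

```python
from collections import Counter


def reconcile(required, uploaded):
    """Compare required (doc_type, period) pairs against the classified uploads."""
    counts = Counter(uploaded)
    return {
        # Required items that appear at least once in the package.
        "present": [item for item in required if counts[item] > 0],
        # Required items with no matching upload.
        "missing": [item for item in required if counts[item] == 0],
        # Anything uploaded more than once, reported once.
        "duplicates": sorted(item for item, n in counts.items() if n > 1),
        # Uploads that satisfy no requirement; kept, not discarded.
        "extras": sorted(set(uploaded) - set(required)),
    }


required = [
    ("Form 1120-S", "FY2024"),
    ("Personal Financial Statement", "October 2024"),
    ("Bank Statement", "October 2024"),
]
uploaded = [
    ("Form 1120-S", "FY2024"),
    ("Form 1120-S", "FY2024"),  # the borrower re-sent the return
    ("Personal Financial Statement", "October 2024"),
    ("Rent Roll", "Q3 2024"),   # not on this checklist
]
result = reconcile(required, uploaded)
print(result["missing"])
```

In this sample the bank statement is the one missing item, the tax return shows up as a duplicate, and the rent roll is retained as an extra without satisfying any requirement.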
What you get back
The credit file your analyst opens
Organized document tree
By entity, by document type, by period. Each file renamed to a structured convention, with the original filename preserved in metadata. Replaces the network-share folder of cryptically named PDFs.
Required-document checklist
Your institution's checklist, reconciled against the uploaded package. Required-and-present items are checked off; missing items are surfaced with the option to request them from the borrower in one click.
Duplicate and version report
Every duplicate detected, every later version linked to the earlier version. Examiner audit trail of how the credit file evolved during underwriting.
Routing to downstream analysis
Classified documents are passed to the right downstream surface — bank statements to bank statement analysis, tax returns and financials to spreading, application data to diligence. No second upload, no separate "send to spreading" step.
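One way to picture the routing described above is a lookup from document category to downstream surface. This is a sketch only: the category keys, surface names, and the `analyst_review` fallback are assumptions made for the example, and `broker_note.pdf` is an invented filename.

```python
from collections import defaultdict

# Hypothetical category and surface identifiers; the real taxonomy is not shown here.
ROUTES = {
    "bank_statement": "bank_statement_analysis",
    "tax_return": "spreading",
    "financial_statement": "spreading",
    "application": "diligence",
}
FALLBACK = "analyst_review"  # assumption: unrouted categories stay visible to the analyst


def route_package(documents):
    """Group (filename, category) pairs by the surface that receives them."""
    by_surface = defaultdict(list)
    for filename, category in documents:
        by_surface[ROUTES.get(category, FALLBACK)].append(filename)
    return dict(by_surface)


package = [
    ("scan0042.pdf", "bank_statement"),
    ("tax_return_FINAL_v3.pdf", "tax_return"),
    ("financials (2).xlsx", "financial_statement"),
    ("broker_note.pdf", "other"),
]
routed = route_package(package)
print(routed)
```

Because routing is keyed on the classification result, the same upload feeds every downstream surface without a second upload step.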
Built for lenders
Intake built for the messy reality of commercial borrower packages
Commercial loan packages don't arrive in a tidy taxonomy. They arrive as broker-forwarded emails with seventeen attachments, as borrower portal uploads from people who don't use computers professionally, as scanned packets from a CPA who hand-stapled the tax return to the financial statements. The intake step is where analyst time leaks fastest at community banks and SBA lenders — opening files, comparing filenames against the deal checklist, deciding whether `tax_return_FINAL_v3.pdf` is actually the final version. LendPipe is built for that reality. Classification reads the document itself, not the filename. Duplicates that exist because the borrower re-sent something don't double-count. Multi-entity packages route correctly without the analyst manually attributing every PDF. The intake layer doesn't replace the analyst's judgment about whether the package is sufficient — but it eliminates the 45 minutes of file-opening that delays the moment when judgment can be applied.
Common questions
What lenders ask before they switch
How does classification handle documents with no clear file type — scans, photos, multi-document PDFs?
Classification is content-based, so a scan that begins with the Form 1120-S header is classified as a corporate tax return regardless of its filename. Phone photos are read with lower extraction confidence than digital PDFs, and the classification verdict carries the confidence level. Multi-document PDFs — for example, a tax return immediately followed by financial statements in the same file — are detected and split into their constituent documents before classification. Each split document carries a reference back to the original composite file in the audit trail.
Can we customize the document taxonomy for our institution?
Yes. The 70+ category taxonomy is a starting point, not a fixed schema. Categories your institution doesn't use can be hidden; categories specific to your portfolio — particular SBA forms, industry-specific licensure documents, your standard guarantee forms — can be added. The required-document checklist by product type is also configured to your institution; SBA 7(a) checklists, owner-occupied CRE checklists, and equipment-finance checklists all run from the same intake layer with different required sets.
What happens when the borrower sends a document that isn't on our checklist?
It's classified, added to the credit file, and visible to the analyst — it just doesn't satisfy a checklist requirement. Extra documents are common (and often useful) at the bottom of a broker forward; the intake layer preserves them rather than discarding them. If a document is recognized as a category your institution doesn't use, it's classified as "other" with a content description, available for analyst review without driving any required-doc logic.
How are entity and ownership relationships resolved across documents?
On a multi-entity deal, which is common in commercial lending, entity attribution starts from the application's disclosed entity list. Each document's legal name is matched against the disclosed entities; documents that name a previously undeclared entity (a new affiliate, an unmentioned related party) are flagged for analyst review rather than silently added as new entities. Guarantor PFS documents are attributed to named guarantors; K-1s route to the right partner based on the partnership return. The audit trail shows which documents drove which entity attribution.
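The match-or-flag behavior can be sketched as a normalized name lookup against the disclosed entity list. Everything here is illustrative: the `normalize` and `attribute` helpers, the suffix list, and the entity names are assumptions, and a production matcher would need to handle far more name variation than this.

```python
import re

# Assumption: a small set of legal suffixes to ignore when comparing names.
SUFFIXES = {"llc", "inc", "corp", "co", "ltd", "lp", "llp"}


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and remove common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def attribute(legal_name: str, disclosed: list[str]) -> dict:
    """Match a document's legal name to a disclosed entity, or flag it."""
    index = {normalize(entity): entity for entity in disclosed}
    match = index.get(normalize(legal_name))
    if match is None:
        # Undeclared entities are never created automatically.
        return {"entity": None, "status": "flag_for_analyst_review"}
    return {"entity": match, "status": "matched"}


disclosed = ["Chen Holdings LLC", "Chen Manufacturing Inc", "Robert Chen"]
matched = attribute("CHEN HOLDINGS, LLC", disclosed)
flagged = attribute("Chen Family Trust", disclosed)
print(matched, flagged)
```

Stripping suffixes makes the match tolerant of formatting differences, at the cost of treating two entities that differ only by suffix as the same; a real implementation would have to decide how to handle that case.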
What about ongoing borrower documents — annual reviews, covenant testing, renewals?
The intake layer handles ongoing document arrivals the same way it handles new originations. Annual financial statements for an existing borrower are classified, period-tagged, and attached to the borrower's record — and the annual-review or renewal workflow picks them up. Covenant-testing documents (compliance certificates, quarterly financials) are recognized and routed to the covenant monitoring surface. The intake taxonomy is the same; the downstream routing depends on which workflow is active for the borrower.
Does intake make decisions about document sufficiency, or just classify?
Intake classifies and reconciles against the checklist. It does not make underwriting sufficiency judgments: whether a particular PFS is detailed enough, whether interim financials are recent enough to rely on, or whether a tax return is a draft or a filed copy. Those are analyst judgments. The intake layer surfaces the documents and their attributes; the analyst decides whether the package is underwritable.
See it run on a real borrower file.
Walk through one of your own deals — document drop to committee-ready output, end to end.
Book a 10-minute demo