Auto-splitting document API, for multi-page file processing
Seamless batch segmentation for streamlined workflows and high‑precision data capture
Enhance processing speed through intelligent boundary detection that isolates multi-page files into discrete records
Try it for free
4.8/5 (30+ reviews)
Trusted by top-tier teams worldwide
Without
Auto-split
Generalist LLMs are slow at structural tasks and need heavy resource overhead to deliver accurate splitting results
Prone to missing boundaries in long files
You pay for thousands of tokens just to find a "page break"
Rarely signals when it is "unsure" of a document type
Trained on the open web; lacks deep "document-type" nuance
With
Auto-split
+95%
Document-level split accuracy
Handles 1-page receipts and 50-page contracts.
Scans and slices a 100-page batch in milliseconds
Built-in metrics to trigger human review only when needed
Trained on millions of real-world business documents
Add “Split” to your document workflow in seconds
Available for every plan
From Mindee’s platform, create a new pre‑processing model by clicking the “Split” utility
You will find it at the bottom of the user interface. If you are already familiar with this type of pre-processing model, you can use it directly; see the Documentation for more details.
Custom to your needs
Enter the document categories that correspond to your needs
Before final pre-processing, you need to define appropriate categories. Be sure to manually add an “Undefined” category. If a file doesn’t match any of your main document categories, it will be placed in the “Undefined” category.
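The fallback to “Undefined” can be pictured with a few lines of Python. This is an illustration only, not Mindee's implementation: the `resolve_category` helper, the sample category names, and the case-insensitive matching are all assumptions made for the example.

```python
from typing import Iterable

UNDEFINED = "Undefined"  # catch-all category, added manually as described above


def resolve_category(predicted: str, categories: Iterable[str]) -> str:
    """Return the configured category matching `predicted`, else "Undefined".

    Hypothetical helper: matching is case-insensitive and ignores
    surrounding whitespace, which is an assumption for this sketch.
    """
    lookup = {name.strip().lower(): name for name in categories}
    return lookup.get(predicted.strip().lower(), UNDEFINED)


# Example categories (assumed for illustration).
configured = ["Invoice", "Receipt", "Contract", UNDEFINED]
print(resolve_category("invoice", configured))
print(resolve_category("Boarding pass", configured))
```

Keeping the catch-all explicit means unexpected files are still visible for review instead of being forced into the closest category.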
PDF, HEIC, PNG, JPEG... Multiple formats
Upload your documents without friction: universal PDF and image support
Accelerate ingestion with native support for PDFs and all image formats. From high-res scans to mobile captures, the Mindee API handles any input, ensuring your data is always ready for extraction.
Full document processing stack
Find all your files categorized in standard JSON format, ready for extraction based on categories
Pre-processing via auto-split can then be combined with other Mindee API features to further improve granularity or to extract data directly based on each document's category.
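As a rough sketch of what "ready for extraction based on categories" could look like downstream, the Python below groups split documents by category so each group can be sent to a matching extraction step. The JSON shape (`documents`, `category`, `pages`) is a hypothetical payload invented for this example, not Mindee's actual response schema; check the Documentation for the real field names.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Hypothetical payload: the field names are assumptions, not Mindee's schema.
SAMPLE_RESULT = """
{
  "documents": [
    {"category": "Invoice", "pages": [0, 1]},
    {"category": "Receipt", "pages": [2]},
    {"category": "Invoice", "pages": [3, 4, 5]},
    {"category": "Undefined", "pages": [6]}
  ]
}
"""


def group_by_category(raw_json: str) -> Dict[str, List[List[int]]]:
    """Map each category to the page lists of the documents assigned to it."""
    payload = json.loads(raw_json)
    grouped = defaultdict(list)
    for document in payload["documents"]:
        grouped[document["category"]].append(document["pages"])
    return dict(grouped)


grouped = group_by_category(SAMPLE_RESULT)
for category, page_sets in grouped.items():
    print(category, page_sets)
```

Each category's page sets could then be handed to the extraction model that fits it, while the "Undefined" group is set aside for manual review.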
Use auto-split and more to optimize your document workflow
Capture
Pre-processing
Data extraction
Enrichment
Validation
Developers and technical teams already use it!
Add modern AI-based Mindee OCR API to your product, in minutes.
Mindee is an integrated document processing platform backed by reliable AI technology. The service has an intuitive and user-friendly interface and provides highly accurate results extracting data from various document types, especially financial receipts and invoices, which are relatively complex and require specialized optical character recognition (OCR) services. The platform provides seamless integration with our current data processing workflows through customizable APIs, allowing for efficient data extraction and automation.
Amar A.
Mindee is a software that helps us to convert all of our physical business data like bills, invoices, warranty cards, calendar, receipts received to us into digital documents that can be stored in our drive and can be uploaded in different type of Excel sheets so that all the updates can be maintained and a proper analytics of transactions can be kept by the financial team
Shiv K.
Mindee is a web based tool that help us in scanning and reading different type of documents like identity cards, invoices, proposal plans etc and extract all the information with its AI and then it provides all the information and data associated with these documents a structured way.
Gaurav K.
Excellent. In addition to their great product, the sales team has always been proactive on how they could help us leverage the maximum results from their product. It was like having an additional product manager on our side
Jeff B.
Mindee works reliably and delivers good performance. The OCR data is accurate, and the API is stable. It works like a charm.
Manuel B.
+15M documents processed monthly
Start auto-splitting files and extracting data
+500 active users
14-day free trial
No credit card

FAQ: learn more about Mindee's API
What is automated document splitting?
Automated document splitting is a pre-processing technology that analyzes multi-page file uploads (like a 50‑page PDF) and automatically breaks them down into separate, logical documents. Instead of a human manually reviewing a file to see where one invoice ends and another begins, the AI detects document boundaries—based on layout changes, page numbering, or content shifts—to split the batch into distinct, standalone records ready for data extraction.
What are examples of automated document splitting?
In practice, auto-splitting helps accounts payable departments that frequently receive bulk PDF attachments containing dozens of invoices and credit notes from a single vendor, each of which must be processed as an individual record.
It is equally essential for two-way matching and reconciliation workflows, where a single scan might bundle a purchase order (PO) with its corresponding delivery note; the API identifies the boundary between these two distinct records so they can be cross-referenced automatically for audit purposes.
For customer onboarding, this technology allows a new client to upload a single "onboarding packet" containing their ID, a utility bill, and a signed contract, which the system then splits and routes to specialized extraction models for instant verification.
Similarly, in vehicle fleet management, auto-splitting enables the seamless digitization of maintenance folders where insurance certificates, logbooks, and repair invoices are often scanned together, ensuring each document is correctly identified and filed under the right vehicle asset without any manual sorting.
You can check more real-life examples of how companies leverage this technology by visiting customer stories.
How does automated document splitting work?
Mindee’s splitting solution uses a multi-layered approach to identify document boundaries:
- Visual continuity analysis: The AI looks for visual cues, such as consistent headers, logos, or page footers (e.g., "Page 1 of 3").
- Logical boundary detection: It identifies "breaking points," such as a new invoice number, a different date, or a sudden change in document layout (e.g., shifting from a legal contract to a utility bill).
- Batch decomposition: Once the boundaries are confirmed, the system "cuts" the multi-page file into separate digital records.
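To make the three layers concrete, here is a deliberately simplified Python sketch that works on already-extracted page text. It is not Mindee's algorithm: the regular expressions, the `split_pages` function, and the rule that a "Page 1 of N" footer or a changed invoice number starts a new record are assumptions chosen for illustration. A production model would also weigh layout and visual signals.

```python
import re
from typing import List, Optional

# Assumed text patterns for this sketch; real documents vary widely.
PAGE_FOOTER = re.compile(r"Page\s+(\d+)\s+of\s+(\d+)", re.IGNORECASE)
INVOICE_NO = re.compile(r"Invoice\s*(?:No\.|#)\s*([A-Z0-9-]+)", re.IGNORECASE)


def invoice_number(text: str) -> Optional[str]:
    """Return the invoice number found on a page, if any."""
    match = INVOICE_NO.search(text)
    return match.group(1) if match else None


def split_pages(pages: List[str]) -> List[List[int]]:
    """Group page indexes into documents using footer and invoice-number cues."""
    documents: List[List[int]] = []
    current_invoice: Optional[str] = None
    for index, text in enumerate(pages):
        footer = PAGE_FOOTER.search(text)
        number = invoice_number(text)
        if index == 0:
            is_new = True
        elif footer:
            # Visual continuity: an explicit counter restarts at page 1.
            is_new = int(footer.group(1)) == 1
        else:
            # Logical boundary: a different invoice number appears.
            is_new = number is not None and number != current_invoice
        if is_new:
            documents.append([index])  # batch decomposition: open a new record
            current_invoice = number
        else:
            documents[-1].append(index)
            current_invoice = number or current_invoice
    return documents


batch = [
    "Invoice No. A-100\nPage 1 of 2",
    "Totals\nPage 2 of 2",
    "Invoice No. A-101\nAmount due",
    "Utility bill\nPage 1 of 1",
]
print(split_pages(batch))
```

Rules like these break down quickly on scans without footers or with inconsistent numbering, which is where a trained model earns its keep.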
How to auto-split multiple files in a large document at once?
Manually splitting PDFs is a massive productivity killer. To automate this, you should implement an API with native splitting capabilities rather than trying to build custom logic in Python.
Mindee’s auto-splitting feature handles this within the API call itself.
When you upload a batch file, the solution detects the record boundaries and provides a structured output of separated documents. This allows developers to build "one-click" upload features where users can drop an entire day's worth of paperwork into a single box, and the system handles the sorting, splitting, and extraction in the background.
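On the client side, the structured output can be applied to the uploaded batch with very little code. The sketch below assumes the result has already been reduced to inclusive `(first_page, last_page)` ranges; that shape, and the `slice_batch` helper itself, are hypothetical and not taken from Mindee's API. It also checks that the ranges cover every page exactly once before any record is created.

```python
from typing import List, Sequence, Tuple


def slice_batch(
    pages: Sequence[str], ranges: Sequence[Tuple[int, int]]
) -> List[List[str]]:
    """Cut `pages` into records using inclusive (first, last) page ranges.

    Raises ValueError if the ranges leave gaps, overlap, or run past
    the end of the batch.
    """
    records: List[List[str]] = []
    expected_start = 0
    for first, last in ranges:
        if first != expected_start or last < first:
            raise ValueError(f"unexpected range ({first}, {last})")
        records.append(list(pages[first:last + 1]))
        expected_start = last + 1
    if expected_start != len(pages):
        raise ValueError("ranges do not cover the whole batch")
    return records


# Placeholder page contents standing in for a scanned batch.
batch = ["invoice p1", "invoice p2", "purchase order p1", "delivery note p1"]
records = slice_batch(batch, [(0, 1), (2, 2), (3, 3)])
print(records)
```

Validating coverage up front means a missing or overlapping range surfaces as an error rather than as a silently dropped page.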

