How Polygons Work in Document AI

The Mindee Team

September 29, 2025

When extracting data from documents, rectangles often aren’t enough. A bounding box can tell you roughly where a field is, but what if the text is skewed, rotated, or sitting inside an irregular area like a signature box or a stamp? That’s where polygons come in.

Polygons give document AI systems the ability to describe shapes with precision, making them critical for reliable data extraction and visualization.

What Are Polygons in Document Processing?

A polygon is a series of points (x, y coordinates) connected to form a closed shape. Where a rectangle can only frame a region, a polygon can trace its outline, including irregular or rotated regions in a document.

Bounding box vs. polygon:

  • A bounding box is always rectangular.
  • A polygon can follow the exact contour of the text or object, regardless of angle or shape.
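
To make the difference concrete, here is a minimal sketch that takes a tilted four-point polygon, computes the smallest axis-aligned box around it, and compares the two areas with the shoelace formula. The function names and the sample coordinates are made up for illustration and do not come from any particular library.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bounding_box(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) of the smallest axis-aligned box."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(polygon: List[Point]) -> float:
    """Area of a simple (non self-intersecting) polygon via the shoelace formula."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to close the shape.
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Illustrative data: a square rotated by 45 degrees (a diamond).
tilted = [(2.0, 0.0), (4.0, 2.0), (2.0, 4.0), (0.0, 2.0)]

x_min, y_min, x_max, y_max = bounding_box(tilted)
box_area = (x_max - x_min) * (y_max - y_min)
shape_area = polygon_area(tilted)

print(f"bounding box area: {box_area}, polygon area: {shape_area}")
```

For this diamond, the box covers twice the area of the shape itself, so half of the box is background rather than content.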

Example use cases:

  • Outlining a tilted total amount on a receipt.
  • Capturing a round seal or logo on an official document.
  • Pinpointing the exact shape of a handwritten signature.

Why Use Polygons Instead of Bounding Boxes?

Bounding box around the text: simple, but less precise when the text is tilted or irregular.

Polygons around the text (illustrated on the sample word "example"): they follow the shape and angle of the text for better accuracy.

Polygons add value because they:

  • Provide higher accuracy in locating text or elements.
  • Handle rotated or skewed scans better than rectangles.
  • Preserve the context of complex layouts such as multi-column PDFs or tables.
  • Help with visual validation, so users can see exactly what part of the document was extracted.
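
One way to see why polygons cope with rotation is that the tilt can be read directly from the points. The sketch below estimates the angle of a polygon's first edge. It assumes the coordinates are already in pixels and that the first two points run along the direction of the text, which is an assumption about point ordering and not something guaranteed by every API. The function name `tilt_degrees` is hypothetical.

```python
import math
from typing import List, Tuple


def tilt_degrees(polygon_px: List[Tuple[float, float]]) -> float:
    """Angle of the polygon's first edge relative to the horizontal, in degrees.

    Image coordinates grow downward, so a positive result means the edge
    slopes down toward the right on screen.
    """
    (x0, y0), (x1, y1) = polygon_px[0], polygon_px[1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


# Illustrative pixel polygons: one level, one rotated by 45 degrees.
level = [(100, 200), (300, 200), (300, 240), (100, 240)]
tilted = [(100, 200), (300, 400), (260, 440), (60, 240)]

print(tilt_degrees(level), tilt_degrees(tilted))
```

A bounding box carries no such information: the same rectangle would enclose the text whichever way it leans.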

How Polygons Work Technically

Polygons are stored as arrays of [x, y] points, usually normalized between 0 and 1: x is relative to the document’s width and y to its height, so the same coordinates stay valid at any image resolution.

Here’s an example JSON snippet:

"locations": [
  {
    "page": 0,
    "polygon": [
      [0.3145, 0.574667],
      [0.4499749485051495, 0.4162655217478252],
      [0.4243856094390561, 0.394379902809719],
      [0.2889106609339066, 0.5527813810618938]
    ]
  }
]
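
To use normalized coordinates on an actual image, each point is multiplied by the page size. The sketch below parses the snippet above (wrapped in braces so it is valid JSON) and converts it to pixel coordinates. The page size of 1200 by 1600 pixels and the helper name `to_pixels` are assumptions for illustration only.

```python
import json
from typing import List, Sequence, Tuple

# The snippet from the article, wrapped in an object so it parses as JSON.
raw = """
{
  "locations": [
    {
      "page": 0,
      "polygon": [
        [0.3145, 0.574667],
        [0.4499749485051495, 0.4162655217478252],
        [0.4243856094390561, 0.394379902809719],
        [0.2889106609339066, 0.5527813810618938]
      ]
    }
  ]
}
"""


def to_pixels(
    polygon: Sequence[Sequence[float]], width: int, height: int
) -> List[Tuple[int, int]]:
    """Scale normalized [x, y] points to integer pixel coordinates."""
    return [(round(x * width), round(y * height)) for x, y in polygon]


location = json.loads(raw)["locations"][0]

# Hypothetical page size; in practice, use the real image dimensions.
pixels = to_pixels(location["polygon"], width=1200, height=1600)
print(pixels)
```

Keeping the polygon normalized until the last moment means the same response can be drawn on a thumbnail or on a full-resolution scan without being recomputed.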

Using Polygons with Mindee

Mindee returns polygons for every extracted field. This means you not only get the value, but also where it came from in the document.

Benefits of Mindee’s polygon data

  • Available across catalog models (invoices, receipts, ID cards, etc.).
  • Used in custom models when you define your own fields.
  • Easy to overlay on PDFs/images for validation.

With these coordinates, you can overlay the polygon on the document using libraries like OpenCV or Matplotlib, creating a visual highlight around the extracted field.
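
The overlay step can also be sketched without any imaging dependency. Instead of OpenCV or Matplotlib, the example below swaps in a plain SVG overlay built with the standard library, which can be layered on top of the page image in a browser. The function name `polygon_to_svg`, the colors, and the sample polygon are all illustrative choices.

```python
from typing import List, Tuple


def polygon_to_svg(
    polygon: List[Tuple[float, float]], width: int, height: int
) -> str:
    """Build an SVG the same size as the page image with one highlighted polygon.

    `polygon` holds normalized (x, y) points; they are scaled to pixels here.
    """
    points = " ".join(f"{x * width:.1f},{y * height:.1f}" for x, y in polygon)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}" viewBox="0 0 {width} {height}">'
        f'<polygon points="{points}" fill="yellow" fill-opacity="0.3" '
        f'stroke="red" stroke-width="2"/>'
        f"</svg>"
    )


# Illustrative field location on a hypothetical 800 x 1000 pixel page.
field_polygon = [(0.25, 0.5), (0.75, 0.5), (0.75, 0.6), (0.25, 0.6)]
overlay = polygon_to_svg(field_polygon, width=800, height=1000)
print(overlay)
```

With an imaging library, the same scaled points would be passed to that library's polygon-drawing routine; the scaling logic stays identical.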

Real-World Applications with Mindee

  • Expense management: highlight VAT and totals on receipts for quick validation.
  • KYC processes: crop out and verify ID card photos or MRZ areas.
  • Fraud detection: detect tampered seals or altered signature boxes.
  • Automation workflows: validate that extracted data falls within expected document zones.
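
The zone check in the last item can be sketched as a simple containment test. The example below treats a field as "in the zone" when the average of its polygon's vertices falls inside an expected rectangle. The zone, the rule, and the function names are hypothetical, and the vertex average is only a cheap approximation of the true centroid.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Zone = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), normalized


def vertex_centroid(polygon: List[Point]) -> Point:
    """Average of the vertices; a cheap stand-in for the true area centroid."""
    n = len(polygon)
    return sum(x for x, _ in polygon) / n, sum(y for _, y in polygon) / n


def in_expected_zone(polygon: List[Point], zone: Zone) -> bool:
    """True when the polygon's vertex centroid lies inside the zone rectangle."""
    x_min, y_min, x_max, y_max = zone
    cx, cy = vertex_centroid(polygon)
    return x_min <= cx <= x_max and y_min <= cy <= y_max


# Hypothetical rule: a receipt total should sit in the lower half of the page.
LOWER_HALF: Zone = (0.0, 0.5, 1.0, 1.0)

total_polygon = [(0.6, 0.8), (0.9, 0.8), (0.9, 0.85), (0.6, 0.85)]
header_polygon = [(0.1, 0.05), (0.5, 0.05), (0.5, 0.1), (0.1, 0.1)]

print(in_expected_zone(total_polygon, LOWER_HALF))
print(in_expected_zone(header_polygon, LOWER_HALF))
```

A field that fails such a check is not necessarily wrong, but it is a reasonable candidate for manual review.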

Polygons are the backbone of precision in document AI. They make it possible to identify, extract, and validate information with accuracy that simple rectangles can’t match.

With Mindee, polygons can be included in every prediction. Developers and businesses not only capture text, but also gain the context of its exact location, making data extraction more transparent and reliable.
