Parsing receipts with Mindee’s Machine Learning API

Published on:
Sep 2, 2020

Doug Sillars

Anyone who has filed an expense report can tell you: receipt tracking and expense logging are a headache. Enter Mindee’s receipt parsing API, which uses deep learning to parse your receipt details automatically and accurately, in about a second.

In this tutorial, we will walk through the steps to use Mindee’s Receipt Parsing API. Let’s get started!

API Prerequisites

  1. You’ll need a free Mindee account. Sign up and confirm your email to log in.
  2. A receipt. Look in your bag/wallet for a recent one, or do a Google Image search for a receipt and download a few to test with.

Setting up the API

Log in to your Mindee account and access your Expense Receipt API environment by clicking the Expense Receipts card:

To activate the API, click the “Try for Free” button to access our generous free tier. You’ll land on the dashboard page, where you can quickly see your API usage (you have none right now, but that will change). The left navigation has links to “Documentation”, “Credentials” and “Live Interface”. The Documentation tab has all of the technical details you’ll need to build against the receipts API endpoint, and the Live Interface is an interactive demo. Rather than try out the demo, we want to build with the API, so click on “Credentials” to create an API token.

Add a new token. In this example, I’ve named it “Tutorial”:

Click “Add New Key” and you’ll be able to see your API token.

Now, we are ready to make an API call. In this example, we’ll be using cURL.

curl -X POST \
 https://api.mindee.net/products/expense_receipts/v2/predict \
 -H 'X-Inferuser-Token: {apiToken}' \
 -F file=@/path/to/your/file.png

Simply replace {apiToken} with your new API token and /path/to/your/file.png with the path to your receipt.

NOTE: You can also copy this code right from the documentation tab of the API with your API token inserted for you.
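If you would rather call the endpoint from code than from the terminal, the same request can be approximated in Python. The sketch below uses only the standard library; the URL, the X-Inferuser-Token header and the file field name come from the cURL sample, while the helper names (build_multipart, build_request, parse_receipt) are made up for this illustration and are not part of any Mindee client.

```python
import json
import mimetypes
import os
import urllib.request
import uuid

# Endpoint taken from the cURL sample in this tutorial.
API_URL = "https://api.mindee.net/products/expense_receipts/v2/predict"


def build_multipart(field_name, filename, payload):
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {mime_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def build_request(api_token, filename, payload):
    """Build (but do not send) the POST request for the receipts endpoint."""
    body, content_type = build_multipart("file", filename, payload)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"X-Inferuser-Token": api_token, "Content-Type": content_type},
    )


def parse_receipt(api_token, path):
    """Upload the receipt stored at `path` and return the decoded JSON response."""
    with open(path, "rb") as handle:
        request = build_request(api_token, os.path.basename(path), handle.read())
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling parse_receipt with your API token and the path to your receipt should return the same JSON that the cURL command prints. Note that the boundary is generated per request and placed in the Content-Type header by the code, which is what cURL does for you when you pass -F.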

In this example, I used a receipt from the grocery store in Köln airport (my last business trip in 2020):

Pasting the cURL sample into my terminal, I hit enter and about a second later, I received a JSON response with the receipt details. The full JSON can be accessed here. Since the response is quite verbose, we will walk through the various fields section by section.

API Response: Parsing Results

Call & Documents sections:

The first two sections of the response contain information about the API call made:

"call": {
    "endpoint": {
        "name": "expense_receipts",
        "version": "2.1"
    },
    "finished_at": "2020-08-29T18:01:22+00:00",
    "id": "e47d8654-0df7-4839-a282-2c04bf293886",
    "n_documents": 1,
    "n_inputs": 1,
    "processing_time": 1.087,
    "started_at": "2020-08-29T18:01:21+00:00"
},
"documents": [
    {
    "id": "66d9adc6-76cf-4c42-8622-3dadb660ac32",
    "name": "IMG_20200301_073354.jpg"
    }
],

The call section tells us that we ran on the expense_receipts endpoint (version 2.1), uploading one document that is one page long, and that the file was processed in just over 1 second (processing_time is 1.087). The documents section gives the Mindee id for the file, and the filename.
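As a small illustration of reading the call section, the sketch below compares the two timestamps with the reported processing_time. The values are copied from the response above; the function name wall_clock_seconds is made up for this example.

```python
from datetime import datetime

# Fields copied from the "call" section of the response above.
call = {
    "finished_at": "2020-08-29T18:01:22+00:00",
    "n_documents": 1,
    "n_inputs": 1,
    "processing_time": 1.087,
    "started_at": "2020-08-29T18:01:21+00:00",
}


def wall_clock_seconds(call_section):
    """Seconds between started_at and finished_at (timestamps have 1 s resolution)."""
    started = datetime.fromisoformat(call_section["started_at"])
    finished = datetime.fromisoformat(call_section["finished_at"])
    return (finished - started).total_seconds()


print(wall_clock_seconds(call))  # 1.0
print(call["processing_time"])   # 1.087
```

The timestamps are only precise to the second, so processing_time is the better field to log if you want to track latency.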

Predictions:

Now we are getting to the exciting stuff. The Predictions section is broken into several sections. Some of these identify fields printed on the receipt, and others use machine learning to deduce information from the receipt. Let’s go through each section:

Category

"category": {
     "probability": 0.51,
     "value": "miscellaneous"
},

The API makes a prediction on the type of purchase. In this case, it is 51% sure it is miscellaneous. The possible categories are [toll, food, parking, transport, accommodation, gasoline, miscellaneous].
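A 51% prediction is one you may not want to accept blindly. The sketch below shows one way an application could decide when to ask the user instead; the 0.8 cutoff and the function name choose_category are arbitrary choices for this example, not values recommended by the API.

```python
# Categories listed in this tutorial for the receipts endpoint.
KNOWN_CATEGORIES = {
    "toll", "food", "parking", "transport",
    "accommodation", "gasoline", "miscellaneous",
}


def choose_category(category, min_probability=0.8):
    """Return the predicted category, or None when a person should pick one.

    min_probability is an application-level cutoff (assumed, not from the API).
    """
    value = category.get("value")
    if value not in KNOWN_CATEGORIES:
        return None
    if category.get("probability", 0.0) < min_probability:
        return None
    return value


prediction = {"probability": 0.51, "value": "miscellaneous"}
print(choose_category(prediction))                       # None
print(choose_category(prediction, min_probability=0.5))  # miscellaneous
```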

Date

Identified from text on the receipt and converted into ISO format. This purchase was made on February 3, 2020, and the model is 99% confident in that choice. The segmentation bounding box provides 4 (x,y) coordinates indicating where the date was pulled from the receipt [(0,0) is the upper left corner, (1,1) is the bottom right corner].

"date": {
    "iso": "2020-02-03",
    "probability": 0.99,
    "raw": "03-02-2020",
    "segmentation": {
        "bounding_box": [
            [0.64, 0.661],
            [0.801, 0.661],
            [0.801, 0.686],
            [0.64, 0.686]
        ]
    }
},
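Because the bounding box coordinates are relative to the image, they have to be scaled by the image dimensions before you can crop or highlight the field. Here is a minimal sketch of that conversion; the 1000 x 2000 pixel image size is invented for the example, and to_pixel_box is a hypothetical helper name.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert relative (x, y) corners into a (left, top, right, bottom) pixel box."""
    xs = [round(x * image_width) for x, _ in bounding_box]
    ys = [round(y * image_height) for _, y in bounding_box]
    return min(xs), min(ys), max(xs), max(ys)


# Bounding box of the date field, copied from the response above.
date_box = [[0.64, 0.661], [0.801, 0.661], [0.801, 0.686], [0.64, 0.686]]

# Assumed image size of 1000 x 2000 pixels (not taken from the real receipt).
print(to_pixel_box(date_box, 1000, 2000))  # (640, 1322, 801, 1372)
```

The (left, top, right, bottom) order matches what most imaging libraries expect for a crop rectangle, with (0, 0) at the upper left corner.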

Locale

Using data from the receipt, the API can predict where the purchase was made, the language and the currency. Check the documentation for the currently supported locales. At the time of writing, support is centered on Europe and North America.

"locale": {
    "country": "DE",
    "currency": "EUR",
    "language": "de",
    "probability": 0.77,
    "value": "de-DE"
},

In the case of my receipt, it is 77% confident that the purchase is in German, made in Germany, and in euros.

Merchant

"merchant": {
    "name": "REWE",
    "probability": 0.91,
    "segmentation": {
        "bounding_box": [
            [0.279, 0.135],
            [0.719, 0.135],
            [0.719, 0.23],
            [0.279, 0.23]
        ]
    }
},

The API correctly predicted (it was 91% sure) that it was a REWE store. Again, four (x,y) points mark the location of the text naming the Merchant on the image.

Orientation

"orientation": {
   "degrees":0,
   "probability": 0.99
},

Did the document require rotation before parsing? Rotation is measured in 90 degree increments [0, 90, 180, 270]. In this case, it did not require any rotation.

Taxes

"taxes": [],

If any taxes are identified on the receipt, they will appear here. In this case, no taxes were found (which is the correct result for this receipt).

Time

"time": {
    "iso": "15:50",
    "probability": 0.99,
    "raw": "15:50",
    "segmentation": {
        "bounding_box": [
            [0.649, 0.898],
            [0.732, 0.898],
            [0.732, 0.925],
            [0.649, 0.925]
        ]
    }
},

Time the receipt was printed, confidence, and the (x,y) coordinates that bound the field in the image.
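Since both the date and the time come back in ISO format, they can be combined into a single timestamp. The sketch below does this with Python's standard library; purchase_datetime is a made-up helper name, and the result carries no time zone because the response shown here does not include one.

```python
from datetime import datetime


def purchase_datetime(date_field, time_field):
    """Join the ISO date and ISO time predictions into one naive datetime."""
    return datetime.fromisoformat(f"{date_field['iso']}T{time_field['iso']}")


# "iso" values copied from the date and time sections of the response.
print(purchase_datetime({"iso": "2020-02-03"}, {"iso": "15:50"}))
# 2020-02-03 15:50:00
```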

Total

Perhaps the most important part of the receipt, the total spent, along with confidence and the box indicating the location on the receipt.

"total": {
    "amount": 17.74,
    "probability": 0.99,
    "segmentation": {
        "bounding_box": [
            [0.663, 0.589],
            [0.765, 0.589],
            [0.765, 0.617],
            [0.663, 0.617]
        ]
    }
},

Summary:

In just over 1 second, a receipt was uploaded, parsed and the response returned to the end user. We know that €17.74 was spent at a REWE in Germany on February 3, 2020 at 15:50. Using the bounding boxes, we can have the user validate the values, and then input this data into an expense management system.
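To give an idea of what the hand-off to an expense management system could look like, here is a sketch that flattens the fields covered in this post into a single CSV row. The column names and the to_expense_row helper are invented for the example. How the prediction object is reached inside the full response is not shown in this post, so the function simply takes it as an argument.

```python
import csv
import io


def to_expense_row(prediction):
    """Pick the values an expense tool typically needs out of one prediction."""
    return {
        "merchant": prediction["merchant"]["name"],
        "date": prediction["date"]["iso"],
        "time": prediction["time"]["iso"],
        "amount": prediction["total"]["amount"],
        "currency": prediction["locale"]["currency"],
        "category": prediction["category"]["value"],
    }


# Values copied from the sections above, trimmed to the keys that are used.
prediction = {
    "category": {"value": "miscellaneous"},
    "date": {"iso": "2020-02-03"},
    "locale": {"currency": "EUR"},
    "merchant": {"name": "REWE"},
    "time": {"iso": "15:50"},
    "total": {"amount": 17.74},
}

row = to_expense_row(prediction)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue())
```

In a real application you would show these values to the user for validation first, using the bounding boxes to highlight where each one was read.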

Conclusion

Using the Mindee receipt parsing API, you can quickly validate receipts, allowing for faster, more accurate (and less painful) expense management for your users. If you have questions, please reach out to us in the chat widget in the bottom right.
