Document AI in Google Cloud Platform

Automatic document processing and entity extraction

Vikram Shinde
Level Up Coding

Does your company have a lot of documents or PDF files that your employees have to check and enter into a database manually?

If so, you are on the right blog!

Photo by Scott Graham on Unsplash

Introduction

Most businesses are now sitting on document goldmines: contracts, PDFs, emails, customer feedback, patterns. The volume of these documents keeps growing over time.

Below are some examples of businesses that hold millions of contract documents. They need to read these documents at different times in their lifecycle to analyze them. Doing this manually takes a lot of processing time and is error-prone.

  • Mortgage Providers
  • Insurance Companies
  • Schools

All of these documents are unstructured data.

Document AI

Document AI, in beta, offers a scalable, serverless platform to automatically classify, extract, and enrich data from your scanned documents. It converts unstructured data into structured data.

Internally, it uses the same deep machine learning technology that powers Google Search, Google Assistant, and the Natural Language Processing API to derive valuable insights from your unstructured documents.
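To make "unstructured to structured" concrete, here is a minimal sketch in Python. It assumes a simplified, JSON-style form-parser response in which each form field points into the document's full text through text segments (`startIndex` / `endIndex`). The sample text, the offsets, and the helper names `anchor_text` and `form_fields_to_dict` are made up for illustration and are not taken from the repository.

```python
def anchor_text(full_text, layout):
    """Return the text a layout element points to via its text segments."""
    segments = layout.get("textAnchor", {}).get("textSegments", [])
    # Offsets arrive as strings in JSON; a missing startIndex means 0.
    return "".join(
        full_text[int(seg.get("startIndex", 0)):int(seg["endIndex"])]
        for seg in segments
    ).strip()


def form_fields_to_dict(document):
    """Flatten every form field on every page into {name: value}."""
    text = document["text"]
    record = {}
    for page in document.get("pages", []):
        for field in page.get("formFields", []):
            name = anchor_text(text, field["fieldName"]).rstrip(":")
            record[name] = anchor_text(text, field["fieldValue"])
    return record


def segment(start, end):
    """Build a one-segment layout (hypothetical helper for the sample below)."""
    return {"textAnchor": {"textSegments": [{"startIndex": str(start), "endIndex": str(end)}]}}


# Hypothetical response for a two-field form; offsets index into "text".
sample_document = {
    "text": "Name: Jane Doe\nCity: Pune\n",
    "pages": [{
        "formFields": [
            {"fieldName": segment(0, 5), "fieldValue": segment(6, 14)},
            {"fieldName": segment(15, 20), "fieldValue": segment(21, 25)},
        ]
    }],
}

print(form_fields_to_dict(sample_document))
```

The result is a plain dictionary keyed by field name, which is the kind of structured record that can be written to a database.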

Use Case

The organisation has multiple field offices, and the head office runs the HR / Payroll system. Field managers fill in a form with each new joiner’s details and send it to the head office. An operator enters the details into the system manually, and only then are the employee’s email, training, access card, laptop, and other formalities sorted out.

This end-to-end process takes days, and until it completes the employee sits idle. In my opinion, new joiners (whether freshers or experienced hires) are always eager to demonstrate their talent and skills, so they should not sit idle in their early days.

In this post, we will see how to automate document processing and the end-to-end new-joiner process:

  • Field managers fill in the form and upload the scanned copy.
  • The application extracts the data from the document and stores it in the database.
  • It then notifies the other processes of the new joiner’s employee_id.
  • Other processes (ID cards, laptop, desk, etc.) fetch the details from the database using the employee_id.

Architecture

Components

The following Serverless components are used in this architecture. This means that you will pay per use, without any up-front costs. Also, no servers need to be configured or maintained.

  1. A front-end app on Cloud Run uploads the scanned document.
  2. The document is stored in Google Cloud Storage.
  3. The upload triggers a Cloud Function.
  4. The Cloud Function calls Document AI to extract the entities.
  5. The Cloud Function reads the response, generates the employee_id and email, and stores the data in Cloud Firestore.
  6. The new-joiner notification is sent to a Cloud Pub/Sub topic.
  7. This topic has multiple subscribers: Desk Service, Laptop Service, ID Card Service. These services fetch the details from Firestore.
  8. A service deployed on Cloud Run exposes an endpoint to GET the details of an employee from Cloud Firestore.
  9. All components log to Stackdriver.
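
Steps 3 to 6 can be sketched as a single handler. This is a minimal illustration, not the code from the repository: the Document AI, Firestore, and Pub/Sub calls are passed in as plain callables (`extract`, `store`, `publish`) so the flow can be read and run without any cloud client. The ID scheme, the `example.com` domain, and the field names `first_name` / `last_name` are assumptions. The only platform details relied on are that a Cloud Storage event carries the `bucket` and `name` of the uploaded object, and that Pub/Sub message data is sent as bytes.

```python
import json
import uuid


def build_employee(fields, domain="example.com"):
    """Derive employee_id and email from extracted fields (hypothetical scheme)."""
    first = fields["first_name"].strip().lower()
    last = fields["last_name"].strip().lower()
    employee_id = uuid.uuid4().hex[:8]
    return {**fields, "employee_id": employee_id, "email": f"{first}.{last}@{domain}"}


def process_upload(event, extract, store, publish):
    """Handle one Cloud Storage event: extract, store, then notify."""
    gcs_uri = f"gs://{event['bucket']}/{event['name']}"
    fields = extract(gcs_uri)                   # step 4: Document AI call
    employee = build_employee(fields)           # step 5: add id and email
    store(employee["employee_id"], employee)    # step 5: Firestore write
    message = json.dumps({"employee_id": employee["employee_id"]})
    publish(message.encode("utf-8"))            # step 6: Pub/Sub notification
    return employee


# Stand-ins for the cloud services, used only to exercise the flow locally.
database, outbox = {}, []
result = process_upload(
    {"bucket": "uploads", "name": "sample.pdf"},
    extract=lambda uri: {"first_name": "Jane", "last_name": "Doe"},
    store=database.__setitem__,
    publish=outbox.append,
)
print(result["email"], len(outbox))
```

Publishing only the employee_id keeps the message small; each subscriber looks up the full record itself, as in step 7.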

Deploy this Document AI app using Terraform

Setup

To complete this guide, you’ll need the following:

  • Terraform: This guide uses Terraform to deploy resources.
  • Git: Git is used to clone the example code and trigger new deployments.
  • GCP: You will need a GCP account with billing enabled.

Create GCP Project

Create a GCP project for this tutorial.

Select Firestore mode

  • Go to Firestore
  • Select Native Mode
  • Select a Location (e.g. United States)
  • Click on “Create Database”

Create Service Account

  • Create a service account.
  • Assign it the Editor role.
  • Download the key and rename it to terraform-key.json.

Clone the Repository

Clone the following repository containing the sample code, then switch to the terraform directory:

$ git clone https://github.com/vikramshinde12/document-ai-in-gcp.git

Next, copy the terraform.tfvars.example file to terraform.tfvars. You will need to replace the value of the project variables.

Then, copy the service account key terraform-key.json to this folder.

Run the following command to set the Google credentials for Terraform.

$ export GOOGLE_CLOUD_KEYFILE_JSON=terraform-key.json
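
Before running Terraform, it can help to confirm that the key file is a real service account key. The following is a small optional sketch, not part of the repository: the helper name `check_keyfile` is hypothetical, and the demo runs against a throwaway fake key so that it works anywhere. In practice you would point it at terraform-key.json.

```python
import json
import os
import tempfile

# Fields present in a downloaded service account key file.
REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email"}


def check_keyfile(path):
    """Return the project_id if the file looks like a service account key."""
    with open(path, encoding="utf-8") as handle:
        key = json.load(handle)
    missing = REQUIRED_KEYS - set(key)
    if missing:
        raise ValueError(f"key file is missing fields: {sorted(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']!r}")
    return key["project_id"]


# Demo with a fake key file (all values below are placeholders).
fake_key = {
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "not-a-real-key",
    "client_email": "terraform@my-project.iam.gserviceaccount.com",
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(fake_key, tmp)
checked_project = check_keyfile(tmp.name)
os.remove(tmp.name)
print(checked_project)
```

The returned project ID should match the project you set in terraform.tfvars.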

Execute Terraform scripts

First, initialize Terraform.

$ terraform init

Then, review the execution plan.

$ terraform plan

Now, apply the changes to GCP.

$ terraform apply

Click on the URL; this opens the sample application.

Access the application

Upload the sample file through the frontend app. The example (sample.pdf) is available in the repository.

Document AI extracts the entities, and the Cloud Function adds them to Firestore.

Sample form and its entry in Firestore.

Call the API using Postman.

The subscribers (e.g. ID Card Service, Desk Service, Laptop Service) are notified of the new joiner and fetch the details using the API.
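
A subscriber's side of this exchange might look like the sketch below. It assumes a push subscription, where Pub/Sub delivers a JSON envelope whose `message.data` field is base64-encoded, and it assumes the payload is a JSON object carrying `employee_id`. The base URL and the `/employees/<id>` route are hypothetical, and the HTTP call is passed in as `fetch` so the example runs without a deployed service.

```python
import base64
import json


def employee_id_from_push(envelope):
    """Decode a Pub/Sub push envelope and return the employee_id it carries."""
    data = base64.b64decode(envelope["message"]["data"])
    return json.loads(data)["employee_id"]


def handle_new_joiner(envelope, fetch, base_url="https://employee-api.example.com"):
    """Look up the new joiner's details; `fetch` performs the HTTP GET."""
    employee_id = employee_id_from_push(envelope)
    # Hypothetical route; the real path depends on the deployed service.
    return fetch(f"{base_url}/employees/{employee_id}")


# Local demo: a hand-built envelope and a fake fetch that echoes the URL.
payload = json.dumps({"employee_id": "a1b2c3d4"}).encode("utf-8")
envelope = {"message": {"data": base64.b64encode(payload).decode("ascii")}}
response = handle_new_joiner(envelope, fetch=lambda url: {"requested": url})
print(response)
```

Each service (ID card, desk, laptop) can reuse the same decoding step and differ only in what it does with the fetched details.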

Cloud Function output

Clean Up

First, permanently delete the resources created by Terraform:

$ terraform destroy

Next, delete the GCP project created for this tutorial, along with all of its resources:

$ gcloud projects delete [project_id]

Conclusion

In this way, we have automated end-to-end document processing on Google Cloud Platform.

I have also shared the code for the Docker containers and the Cloud Function on GitHub.
