Textman: Text Extraction with FastAPI and Pytesseract

In Africa and some other parts of the world, some businesses still keep a large volume of their data in physical files that have not been digitized. This significantly hinders progress in these companies because valuable insights that could help them make better decisions are locked away in these files.

Just imagine what insights would be uncovered if this data were digitized and analyzed. Fortunately, that is what we will be doing in this article. We will create a web app that takes in a picture containing text and extracts the text from it using a cool library called Pytesseract.

Setup

Let’s get started by creating a virtual environment and installing the packages we will need for our app.


# create env (including python so that pip installs into this env)
conda create --name textman python

# activate it
conda activate textman

NB: Windows users can refer to this article to install Tesseract: https://medium.com/quantrium-tech/installing-and-using-tesseract-4-on-windows-10-4f7930313f82

# on Ubuntu install tesseract first 
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev


# and then install pytesseract
pip install pytesseract
pip install "fastapi[all]"
pip install opencv-python
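
Before moving on, it can help to confirm that the Tesseract binary is actually visible to Python, since pytesseract is only a wrapper around the tesseract command-line program. The check below is a minimal sketch using only the standard library; the helper name tesseract_available is made up for this illustration.

```python
import shutil


def tesseract_available(binary: str = "tesseract") -> bool:
    """Return True if the given executable can be found on the PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("tesseract found at", shutil.which("tesseract"))
    else:
        print("tesseract not found; install it before running the app")
```

If the binary is installed but not on the PATH (a common situation on Windows), pytesseract can be pointed at it by setting pytesseract.pytesseract.tesseract_cmd to the full path of the executable.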

Now let’s create a directory called textman and add the following files and folders:

.
├── app.py
├── README.md
├── requirements.txt
└── templates
    └── index.html

Build the app

Let’s now add content to the app.py file:

from fastapi import FastAPI, Request
from fastapi.templating import Jinja2Templates
import numpy as np
import cv2
import pytesseract


app = FastAPI()
templates = Jinja2Templates(directory="templates")


@app.get("/")
def home(request: Request):
    # Render the page with the upload form
    return templates.TemplateResponse("index.html", {"request": request})


def read_img(img):
    # Run Tesseract OCR on the image and return the recognized text
    text = pytesseract.image_to_string(img)
    return text


@app.post("/extract_text")
async def extract_text(request: Request):
    label = ""
    # Read the uploaded file from the submitted form
    form = await request.form()
    contents = await form["upload_file"].read()
    # Decode the raw bytes into an OpenCV image
    file_bytes = np.frombuffer(contents, dtype=np.uint8)
    frame = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)
    label = read_img(frame)

    # Re-render the page, this time with the extracted text
    return templates.TemplateResponse("index.html", {"request": request, "label": label})

First, we imported our libraries, instantiated an app, and specified our template directory so FastAPI can find our index.html. We then created a home route that displays our index.html file. We also created a function, read_img(), that takes in an image and extracts the text from it in just one line of code. Cool, right? I know! Finally, we created a POST route that gets the image file from the form we will create in index.html and uses OpenCV to convert it into the right format for Pytesseract.

Our markup in index.html will now look like the following:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link rel="preconnect" href="https://fonts.gstatic.com" />
<link
href="https://fonts.googleapis.com/css2?family=Montserrat:wght@300&display=swap"
rel="stylesheet"
/>
<link
href="https://unpkg.com/tailwindcss@^2/dist/tailwind.min.css"
rel="stylesheet"
/>
<title>Image To Text</title>
<style>
body {
content: "By boadzie Daniel";
margin: 0;
min-height: 100vh;
background-color: #ffffff;
background-image: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 80 80' width='80' height='80'%3E%3Cpath fill='%2335323a' fill-opacity='0.17' d='M14 16H9v-2h5V9.87a4 4 0 1 1 2 0V14h5v2h-5v15.95A10 10 0 0 0 23.66 27l-3.46-2 8.2-2.2-2.9 5a12 12 0 0 1-21 0l-2.89-5 8.2 2.2-3.47 2A10 10 0 0 0 14 31.95V16zm40 40h-5v-2h5v-4.13a4 4 0 1 1 2 0V54h5v2h-5v15.95A10 10 0 0 0 63.66 67l-3.47-2 8.2-2.2-2.88 5a12 12 0 0 1-21.02 0l-2.88-5 8.2 2.2-3.47 2A10 10 0 0 0 54 71.95V56zm-39 6a2 2 0 1 1 0-4 2 2 0 0 1 0 4zm40-40a2 2 0 1 1 0-4 2 2 0 0 1 0 4zM15 8a2 2 0 1 0 0-4 2 2 0 0 0 0 4zm40 40a2 2 0 1 0 0-4 2 2 0 0 0 0 4z'%3E%3C/path%3E%3C/svg%3E");
font-family: "Montserrat", sans-serif;
font-weight: 700;
text-align: center;
display: flex;
align-items: center;
justify-content: center;
}
</style>
</head>
<body>
<section class="container text-gray-500 mx-auto px-4 py-4 flex flex-col">
<div>
<h3 class="text-6xl font-bold">Textman</h3>
<p class="mt-2 italic text-2xl text-left lg:text-center font-semibold">
Upload a picture and have the text extracted from it.
</p>
</div>
<form method="post" action="/extract_text" enctype="multipart/form-data">
<div
class="flex w-full h-40 items-center justify-center bg-grey-lighter"
>
<label
class="w-64 flex flex-col items-center px-4 py-4 bg-green-600 text-white rounded-lg shadow-lg tracking-wide uppercase border border-blue cursor-pointer hover:bg-blue hover:text-white"
>
<svg
class="w-8 h-8"
fill="#fff"
xmlns="http://www.w3.org/2000/svg"
viewBox="0 0 20 20"
>
<path
d="M16.88 9.1A4 4 0 0 1 16 17H5a5 5
0 0 1-1-9.9V7a3 3 0 0 1 4.52-2.59A4.98 4.98 0 0 1 17
8c0 .38-.04.74-.12 1.1zM11 11h3l-4-4-4 4h3v3h2v-3z"
/>
</svg>
<span class="mt-2 text-base leading-normal">Select a file</span>
<input type="file" name="upload_file" class="hidden" />
</label>
</div>
<div>
<input
class="text-center w-30 text-white bg-blue-500 hover:bg-blue-400 border-0 py-1 px-2 focus:outline-none hover:bg-red-600 rounded text-lg"
type="submit"
value="Extract text"
/>
</div>
</form>
{% if label %}
<div
class="w-1/2 mr-auto ml-auto rounded-lg mt-4 bg-gray-600 text-white py-4 px-4"
>
<p class="text-lg">{{label}}</p>
</div>
{% endif %}
</section>
</body>
</html>

We are using Tailwind CSS, a cool utility-first CSS library, and Hero Patterns for the cool background.

Our logic in the template is simple. We first check whether there is a label and, if so, display it using the Jinja2 template engine.

The only thing we are left with now is to run our app. To do so, add the following to app.py:
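
To see the conditional in isolation, the same {% if label %} pattern can be rendered directly with Jinja2, outside FastAPI. This is a small sketch; the template string is a trimmed-down stand-in for index.html, not the real file.

```python
from jinja2 import Template

# A one-line stand-in for the result box in index.html
snippet = Template("{% if label %}<p>{{ label }}</p>{% endif %}")

with_text = snippet.render(label="Hello world")
without_text = snippet.render(label="")
no_label = snippet.render()  # label not passed at all, as in the home route

print(repr(with_text))     # '<p>Hello world</p>'
print(repr(without_text))  # ''
print(repr(no_label))      # ''
```

This is why the result box stays hidden on the first page load: the home route passes no label, and Jinja2 treats a missing or empty value as false.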

# app.py
# import uvicorn at the top of app.py
import uvicorn

# then add the following to the bottom of app.py
if __name__ == "__main__":
    uvicorn.run("app:app", host="127.0.0.1", port=8000, reload=True)

Finally, run the app with:

python app.py

If all goes well, the app will be served at http://127.0.0.1:8000, where you should see the Textman upload page.
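
The endpoint can also be exercised without a browser once the server is running. The sketch below assumes the requests package is installed (it is not part of the setup above) and that the app is listening on 127.0.0.1:8000; the helper names and the default file name sample.png are hypothetical.

```python
import os

import requests

URL = "http://127.0.0.1:8000/extract_text"


def build_upload_request(image_bytes: bytes, filename: str = "sample.png") -> requests.PreparedRequest:
    """Build (but do not send) the multipart POST that the HTML form submits."""
    # The field name must match the <input name="upload_file"> in index.html
    files = {"upload_file": (filename, image_bytes, "application/octet-stream")}
    return requests.Request("POST", URL, files=files).prepare()


def extract_text_page(image_path: str) -> str:
    """Send an image to the running app and return the rendered HTML page."""
    with open(image_path, "rb") as fh:
        prepared = build_upload_request(fh.read(), os.path.basename(image_path))
    with requests.Session() as session:
        response = session.send(prepared, timeout=30)
    response.raise_for_status()
    return response.text
```

Calling extract_text_page with the path to a local image returns the full HTML page, because the route renders index.html; the extracted text sits inside the result box of that page.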

In just a few lines of code, we’ve created an awesome text extractor. How cool is that? The code for this app is available here.

Conclusion

Machine learning has numerous applications in industry. Text extraction is just one of them, and it can transform a business and simplify lives. I hope this article motivates you to try out these cool technologies and perhaps change a life. Thanks for reading!

Data Scientist, AI, Software Engineer. CTO at Zummit Africa Inc. https://www.boadzie.me/

Boadzie Daniel
