Why straighten a document?
You need to recover data from a previously scanned document. Naturally, you immediately think of using OCR (see this article, which shows how to use Tesseract)… but here is the catch: even if your document is of good quality, it is not aligned correctly. It is askew, and your OCR engine cannot interpret its content.
This is quite normal: you must first straighten the document so that the text runs horizontally before applying OCR. In this article we will see how to do that with Python and the deskew library.
The Python deskew library
What’s great about the Python universe is that there is a library for almost everything. A quick search will certainly turn up quite a few candidates begging to be tried.
I’ll show you one of the simplest: the deskew library, which you can find here.
Installation is quick and easy (the underlying code is also quick to read, if you have time to look at it). Note that this library relies on another, much better-known library, scikit-image, which is also very useful for working with image data.
pip install deskew
Goal
The goal is rather simple: we need to retrieve the textual information from the following image:
To check what I said in the introduction, let’s try running Tesseract directly on this image:
try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract

image1 = Image.open('img_35h.jpg')
pytesseract.image_to_string(image1, lang='fra')
''
Indeed, Tesseract sees absolutely nothing!
Straighten the image with deskew
First of all, let’s import the necessary libraries:
import numpy as np
from skimage import io
from skimage.transform import rotate
from skimage.color import rgb2gray
from deskew import determine_skew
from matplotlib import pyplot as plt
Then create a small function that will allow us to straighten the image:
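def deskew(_img):
    # Load the image and build a grayscale copy for angle detection
    image = io.imread(_img)
    grayscale = rgb2gray(image)
    # Estimate the skew angle (in degrees)
    angle = determine_skew(grayscale)
    # rotate() returns floats in [0, 1], so scale back to 0-255 before casting
    rotated = rotate(image, angle, resize=True) * 255
    return rotated.astype(np.uint8)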
This function is quite simple. The deskew library encapsulates all the calls needed to straighten the image… except perhaps the conversion to grayscale, which is an essential step for detecting the tilt angle.
Next, let’s create a small function that uses matplotlib to display the image before and after straightening:
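To make the two less obvious steps of deskew() more concrete, here is a minimal pure-Python sketch of what happens to a single pixel: the grayscale conversion, and the rescaling behind the "* 255". It assumes an 8-bit RGB input and the luminance weights given in the scikit-image documentation for rgb2gray. The helper names to_gray and to_uint8 are hypothetical and exist only for this illustration.

```python
# Illustration only: to_gray and to_uint8 are hypothetical helpers,
# not part of deskew or scikit-image.

# Luminance weights documented for skimage.color.rgb2gray
WEIGHTS = (0.2125, 0.7154, 0.0721)


def to_gray(pixel):
    """Convert one 8-bit (R, G, B) pixel to a gray level in [0, 1]."""
    r, g, b = (channel / 255 for channel in pixel)
    return WEIGHTS[0] * r + WEIGHTS[1] * g + WEIGHTS[2] * b


def to_uint8(value):
    """Scale a [0, 1] float back to 0-255, truncating like astype(np.uint8)."""
    return int(value * 255)


print(to_gray((0, 255, 0)))  # pure green contributes most to the gray level
print(to_uint8(0.5))         # prints 127
```

In the real function these operations are applied to whole arrays at once, which is why the result of rotate() has to be multiplied by 255 before being cast back to np.uint8.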
def display_avant_apres(_original):
    # Left: the original image ("avant" = before)
    plt.subplot(1, 2, 1)
    plt.imshow(io.imread(_original))
    # Right: the straightened image ("apres" = after)
    plt.subplot(1, 2, 2)
    plt.imshow(deskew(_original))
The result
I took a few tilted images to test the result. A first test with a tilt of 35 degrees clockwise:
display_avant_apres('img_35h.jpg')
Then with an inclination of 15 degrees counterclockwise:
display_avant_apres('img_15ah.jpg')
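Because deskew() calls rotate() with resize=True, the straightened image is placed on a canvas large enough to hold the whole rotated page, so the output generally does not have the same dimensions as the input. The sketch below estimates that size from the bounding box of a rotated rectangle. It is an approximation under that assumption, not scikit-image's own code (whose result may differ by a pixel), and rotated_canvas_size is a hypothetical helper name.

```python
import math


def rotated_canvas_size(width, height, angle_degrees):
    """Estimate the (width, height) of the canvas needed to hold a
    width x height image rotated by angle_degrees.

    Hypothetical helper for illustration; not part of scikit-image.
    """
    theta = math.radians(angle_degrees)
    cos_t = abs(math.cos(theta))
    sin_t = abs(math.sin(theta))
    # Bounding box of the rotated rectangle
    new_width = width * cos_t + height * sin_t
    new_height = width * sin_t + height * cos_t
    return round(new_width), round(new_height)


# A 1000 x 500 scan tilted by 35 degrees needs a noticeably larger canvas
print(rotated_canvas_size(1000, 500, 35))
```

This is worth keeping in mind if a later step expects a fixed image size: you may want to crop or resize the result after straightening.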
The result seems rather conclusive, doesn’t it? Let’s now try running Tesseract on the straightened image:
io.imsave('output.png', deskew('img_35h.jpg'))
image1 = Image.open('output.png')
pytesseract.image_to_string(image1, lang='fra')
'Bonjour'
And there you go! The result is the expected one.
Want to automate this type of task with the Blue Prism RPA tool? Take a look at this article.
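Depending on your Tesseract and pytesseract versions, the string returned by image_to_string may carry trailing whitespace such as a newline or a form-feed character, so you may not get exactly 'Bonjour' back. Here is a minimal sketch of a cleanup step. The clean_ocr_text helper is hypothetical, and the sample string is an assumed raw output, not one taken from the run above.

```python
def clean_ocr_text(raw_text):
    """Strip leading and trailing whitespace, including newlines and form feeds.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    return raw_text.strip()


sample = 'Bonjour\n\x0c'  # assumed raw OCR output, for illustration only
print(repr(clean_ocr_text(sample)))  # prints 'Bonjour'
```

In practice you would pass the return value of pytesseract.image_to_string through this helper before comparing or storing it.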
The video below shows the solution in action: