Image processing (part 4) Image Transformations

In the previous posts, we saw that images are really 2-dimensional (black and white / grayscale) or 3-dimensional (color) matrices, and how to analyze them quickly with histograms. We then used this knowledge to carry out thresholding operations on images. In this article, we will see how to perform some basic transformations with scikit-image: rotating an image, and changing its scale and size.

This very simple article will then take us to image filters, then by extension to convolutional neural networks (parts 5 and 6).

Rotation

Before seeing how to rotate an image with scikit-image, I invite you to go to the skimage site to see all the image transformations the library offers. There are a lot of them, and we will of course only cover a few in this article. So let’s start with rotation.

First of all let’s import the right Python libraries:

import matplotlib.pyplot as plt
from skimage.io import imread, imshow
from skimage import transform
import numpy as np
import math

Then let’s create a very simple image (4 × 4) made up of black and white pixels, and display it with the imshow function. We could use any type of image of course, but I find a very simple example more meaningful, especially when showing the resulting matrix.

image_test = np.array([[1,0,0,0], [0,1,0,0], [0,1,1,0], [0,0,0,1]])
imshow(image_test, cmap=plt.get_cmap('gray'))

A simple call to the transform.rotate() function then performs the rotation by the desired angle, given in degrees and applied counter-clockwise (90° below):

rotated = transform.rotate(image_test, angle=90, preserve_range=True)
print(rotated)
array([[0, 0, 0, 1],
       [0, 0, 1, 0],
       [0, 1, 1, 0],
       [0, 0, 0, 0]], dtype=int32)
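
For a quarter turn specifically, no interpolation is needed, so a pure NumPy cross-check is possible. The sketch below is not part of the original walkthrough: it uses np.rot90, which turns the array counter-clockwise just like a positive angle in transform.rotate, and gives an exact reference to compare the interpolated result against.

```python
import numpy as np

image_test = np.array([[1, 0, 0, 0],
                       [0, 1, 0, 0],
                       [0, 1, 1, 0],
                       [0, 0, 0, 1]])

# One exact counter-clockwise quarter turn:
# the last column of the input becomes the first row of the output.
reference = np.rot90(image_test, k=1)
print(reference)
```

Comparing against such a reference is a quick way to spot border pixels that were lost to interpolation or to an integer cast.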

Note the preserve_range parameter: when it is set to True, the output keeps the value range of the input matrix / image instead of being rescaled during the conversion to floating point.
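
To make the effect of preserve_range more concrete, here is a minimal sketch on a hypothetical 8-bit image (values 0 or 255, an assumption chosen for illustration; the article's own image uses 0 and 1). It compares the default behaviour with preserve_range=True.

```python
import numpy as np
from skimage import transform

# Hypothetical 8-bit test image: pixels are either 0 or 255.
image_u8 = np.array([[255, 0, 0, 0],
                     [0, 255, 0, 0],
                     [0, 255, 255, 0],
                     [0, 0, 0, 255]], dtype=np.uint8)

# Default: the image is converted to float and rescaled to the [0, 1] range.
scaled = transform.rotate(image_u8, angle=90)

# preserve_range=True: the result is still float, but the 0..255 range is kept.
kept = transform.rotate(image_u8, angle=90, preserve_range=True)

print(scaled.dtype, scaled.max())
print(kept.dtype, kept.max())
```

In both cases the result is a floating-point array, so a cast such as astype('int32') is still needed if integer pixels are expected afterwards.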

Let’s visualize the result:

_, axes = plt.subplots(ncols=2)
axes[0].imshow(image_test, cmap=plt.get_cmap('gray'))
axes[1].imshow(rotated, cmap=plt.get_cmap('gray'))

Image resizing

Changing the size of an image is of course a common operation. Let’s say we want to increase the size of our image from 4 × 4 to 6 × 6. To do that, we use the transform.resize() function, passing the tuple (6, 6) as the output_shape parameter.

resized_img = transform.resize(image=image_test, output_shape=(6,6), preserve_range=True)

Let’s take a look at the resulting matrix (image):

print(resized_img)
array([[0.72222222, 0.5       , 0.13888889, 0.02777778, 0.        , 0.        ],
       [0.5       , 0.5       , 0.41666667, 0.08333333, 0.        , 0.        ],
       [0.16666667, 0.5       , 0.86111111, 0.30555556, 0.08333333, 0.02777778],
       [0.16666667, 0.5       , 0.97222222, 0.86111111, 0.41666667, 0.13888889],
       [0.08333333, 0.25      , 0.5       , 0.5       , 0.5       , 0.5       ],
       [0.02777778, 0.08333333, 0.16666667, 0.16666667, 0.5       , 0.72222222]])

Checking the shape of the matrix confirms that we now have a 6 × 6 image.

resized_img.shape
(6, 6)

You have probably noticed that we now have decimal numbers in the matrix (and therefore gray levels). This is due to the interpolation performed during the enlargement. Let’s go back to pure black and white to get our real enlargement. To do that, we convert the values > 0.5 to 1 and the others to 0:

intresized = resized_img > 0.5
imshow(intresized.astype('int32'), cmap=plt.get_cmap('gray'))

We can now see our stretched image more clearly, with the middle shape and the top-left shape.
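
A possible alternative to thresholding after the fact, not the approach used above, is to ask transform.resize for nearest-neighbour interpolation with order=0. Each output pixel then copies one input pixel, so no intermediate gray levels appear. The sketch below assumes the same 4 × 4 test image.

```python
import numpy as np
from skimage import transform

image_test = np.array([[1, 0, 0, 0],
                       [0, 1, 0, 0],
                       [0, 1, 1, 0],
                       [0, 0, 0, 1]])

# order=0 selects nearest-neighbour interpolation: values are copied, not blended.
nearest = transform.resize(image_test, output_shape=(6, 6), order=0,
                           anti_aliasing=False, preserve_range=True)

print(nearest.astype('int32'))
```

The trade-off is a blockier result; the interpolate-then-threshold route shown above can give smoother edges on larger images.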

Rescaling

To change the magnification of an image, we can use transform.rescale(). Below we use a ratio of 1/2:

image_rescaled = transform.rescale(image=image_test, scale=1.0 / 2.0, anti_aliasing=False, preserve_range=True)
_, axes = plt.subplots(ncols=2)
axes[0].imshow(image_test, cmap=plt.get_cmap('gray'))
axes[1].imshow(image_rescaled, cmap=plt.get_cmap('gray'))
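
Rescaling and resizing are closely related: a scale factor simply implies an output shape. The sketch below, which assumes the same 4 × 4 test image, computes that shape by hand and checks that transform.resize with it gives the same result as transform.rescale with the factor.

```python
import numpy as np
from skimage import transform

image_test = np.array([[1, 0, 0, 0],
                       [0, 1, 0, 0],
                       [0, 1, 1, 0],
                       [0, 0, 0, 1]])

scale = 0.5
# Output shape implied by the scale factor: each side is multiplied by it.
target_shape = tuple(int(round(side * scale)) for side in image_test.shape)

rescaled = transform.rescale(image_test, scale=scale,
                             anti_aliasing=False, preserve_range=True)
resized = transform.resize(image_test, output_shape=target_shape,
                           anti_aliasing=False, preserve_range=True)

print(target_shape, rescaled.shape)
print(np.allclose(rescaled, resized))
```

In practice, rescale is the convenient choice when the ratio matters, and resize when the final size in pixels matters.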

Euclidean transformations

The scikit-image library really allows you to perform a large number of transformations on images. I find the SimilarityTransform() transformation particularly useful, as it combines a Euclidean transformation (rotation and translation) with scaling.
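
To see what such a transformation does to a coordinate, it can be written as a 3 × 3 matrix in homogeneous coordinates. The sketch below builds that matrix with NumPy only; similarity_matrix is a hypothetical helper written for illustration, and the parameter values are arbitrary. As far as I know, a scikit-image transform object exposes its own matrix through the params attribute, which can be compared with this one.

```python
import math
import numpy as np

def similarity_matrix(scale, rotation, translation):
    """Hypothetical helper: scale and rotate about the origin, then translate."""
    tx, ty = translation
    c, s = math.cos(rotation), math.sin(rotation)
    return np.array([[scale * c, -scale * s, tx],
                     [scale * s,  scale * c, ty],
                     [0.0,        0.0,       1.0]])

matrix = similarity_matrix(scale=2.0, rotation=math.pi / 2, translation=(3.0, 4.0))

# Apply it to the point (x, y) = (1, 0), written in homogeneous coordinates.
x, y, _ = matrix @ np.array([1.0, 0.0, 1.0])
print(x, y)
```

The point is first rotated by a quarter turn and doubled in length, which takes (1, 0) to (0, 2), and then shifted by the translation (3, 4).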

Let’s take a real image this time. This one is not straight, so we will straighten it and then shift it, all with this single transformation:

image = imread('book.jpg')
imshow(image)
tr = transform.SimilarityTransform(scale=1.5, rotation=math.pi/20, translation=(-40, -250))
plt.figure(figsize=(8, 5))
image_tr_1 = transform.warp(image, tr)
plt.imshow(image_tr_1)

In fact, the rotation is applied about the origin of the image (its top-left corner) rather than its center, which creates an offset that we correct with the translation parameter above.
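
Rather than finding the translation by trial and error, the offset can be derived: move the center to the origin, rotate, then move it back. The sketch below does this with plain NumPy matrices; the helper functions and the 400 × 300 image size are assumptions made for illustration.

```python
import math
import numpy as np

def translation(tx, ty):
    """Hypothetical helper: homogeneous translation matrix."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

def rotation(angle):
    """Hypothetical helper: homogeneous rotation about the origin (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

height, width = 300, 400                      # assumed image size
cx, cy = (width - 1) / 2, (height - 1) / 2    # center in (x, y) pixel coordinates
angle = math.pi / 20

about_origin = rotation(angle)
# Read right to left: shift the center to the origin, rotate, shift back.
about_center = translation(cx, cy) @ rotation(angle) @ translation(-cx, -cy)

center = np.array([cx, cy, 1.0])
print(about_origin @ center)      # the center is displaced
print(about_center @ center)      # the center stays where it was
print(about_center[:2, 2])        # translation that compensates the offset
```

One point to keep in mind when reusing such a matrix: transform.warp interprets the transform it receives as the inverse map, from output coordinates back to input coordinates, so the direction of the rotation and translation has to be chosen accordingly.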

That concludes this article on basic image transformations. In the next few articles, we will discuss filters, and convolution filters in particular, which will help us better understand how so-called convolutional neural networks have become so ubiquitous in the world of image processing.

Benoit Cayla

In more than 15 years, I have built up solid experience in various integration projects (data & applications). I have worked in nine different companies and successively adopted the vision of the service provider, the customer and the software editor. This experience, which made me almost omniscient in my field, naturally led me to be involved in large-scale projects around the digitalization of business processes, mainly in sectors such as insurance and finance. Really passionate about AI (Machine Learning, NLP and Deep Learning), I joined Blue Prism in 2019 as a pre-sales solution consultant, where I can combine my subject matter skills with automation to help my customers automate complex business processes more efficiently. In parallel with my professional activity, I run a blog aimed at showing how to understand and analyze data as simply as possible: datacorner.fr. Learning, convincing by argument and passing on my knowledge could be my characteristic triptych.
