2D Matrix Transformations for Computer Vision | by Javier Martínez Ojeda | May, 2023


When you want to scale, rotate, or translate an image, the most direct approach is to apply the transformation to each of its pixels individually. Applying the same transformation to every pixel produces a new set of pixels, which together form the transformed version of the original image.

This can be seen in the following image, which shows a square formed by 4 pixels/vertices. To rotate the square about the origin, we rotate each of the pixels that form it individually about the origin. Once every pixel has been rotated, the same square appears again, now rotated.

Example of the rotation of a 2D square. Image by author
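The per-vertex rotation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the square's coordinates and the 90-degree angle are made up for the example, and `rotate_point` is a hypothetical helper that uses the standard formula for a counterclockwise rotation about the origin.

```python
import math

def rotate_point(x, y, angle_deg):
    """Rotate the point (x, y) counterclockwise about the origin."""
    theta = math.radians(angle_deg)
    new_x = x * math.cos(theta) - y * math.sin(theta)
    new_y = x * math.sin(theta) + y * math.cos(theta)
    return new_x, new_y

# Hypothetical square defined by its 4 vertices
square = [(1, 1), (2, 1), (2, 2), (1, 2)]

# Apply the same rotation to every vertex; rounding hides tiny
# floating-point errors such as cos(90 degrees) not being exactly 0
rotated = [
    tuple(round(c, 6) for c in rotate_point(x, y, 90))
    for x, y in square
]
print(rotated)
```

Each vertex is transformed independently with the same function, yet the four results still form a square of the same size, now rotated about the origin.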

Therefore, to apply any type of transformation to an image, you iterate over its pixels and apply the transformation to each one. But how do you apply the transformation to a single pixel?

To answer this, you must first understand how an image is represented. A 2D image is represented as a 2D matrix, such that the pixel in the upper left corner corresponds to the element with indices (0, 0) of the matrix. Likewise, for an image of size n×m, the pixel in the lower right corner has indices (n-1, m-1), since indexing starts at 0. The indices of each matrix element therefore represent the position of its pixel in the image. The value of each element, in turn, represents the intensity of its pixel; in an 8-bit grayscale image it ranges from 0 to 255, with 0 being black and 255 white.

2D matrix representation of a 2D image. Photo of the giraffes by Gustav Schwiering on Unsplash. Image by author.
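As a minimal sketch of this representation, the snippet below stores a tiny grayscale image as a nested list. The pixel values are invented for illustration, and a real pipeline would more likely use an array library; plain lists are used here only to keep the example dependency-free.

```python
# Hypothetical 3x4 grayscale image: 3 rows, 4 columns, intensities 0 to 255
image = [
    [0,   64, 128, 255],
    [32,  96, 160, 224],
    [255, 128, 64,  16],
]

rows = len(image)       # number of rows
cols = len(image[0])    # number of columns

# With zero-based indexing, the corners are (0, 0) and (rows - 1, cols - 1)
top_left = image[0][0]
bottom_right = image[rows - 1][cols - 1]

print(rows, cols, top_left, bottom_right)
```

Here the upper left pixel is black (0), and the last valid index pair is (2, 3) rather than (3, 4).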

The indices of each pixel within the matrix are very important, since they indicate the position the pixel occupies in the image. Because a transformation consists of changing the positions of the pixels, the task is to map each pixel from its original indices in the matrix (original_x, original_y) to new indices (new_x, new_y), as shown below.

Transformation from original to new indices. Image by author
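The idea of moving every pixel from its original indices to new ones can be sketched as a loop. The function `remap` and its `mapping` argument are hypothetical names introduced for this example, and the sketch assumes that x is the column index and y is the row index. Pixels whose new position falls outside the image are simply dropped, which is one possible choice among several.

```python
def remap(image, mapping, fill=0):
    """Move each pixel to the indices returned by mapping(x, y)."""
    rows, cols = len(image), len(image[0])
    output = [[fill] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            new_x, new_y = mapping(x, y)
            # Keep only pixels that land inside the output image
            if 0 <= new_x < cols and 0 <= new_y < rows:
                output[new_y][new_x] = image[y][x]
    return output

# Hypothetical 2x3 image
image = [
    [10, 20, 30],
    [40, 50, 60],
]

# Example mapping: shift every pixel one column to the right
shifted = remap(image, lambda x, y: (x + 1, y))
print(shifted)
```

The pixel values never change; only their positions do. Positions that no pixel maps to keep the `fill` value.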

This mapping is carried out by means of a matrix: multiplying this matrix by the original position vector, transposed so that it is written as a column, yields the new position vector. This matrix is called the Transformation Matrix, as it transforms the original position vector into the new one.

Calculation of new position using a transformation matrix and the previous position. Image by author
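The multiplication can be written out explicitly. The sketch below assumes a 2x2 transformation matrix and uses a made-up scaling matrix as the example; `apply_transform` is a hypothetical helper, not a function from any library.

```python
def apply_transform(matrix, point):
    """Multiply a 2x2 matrix by the column vector (x, y)."""
    (a, b), (c, d) = matrix
    x, y = point
    new_x = a * x + b * y
    new_y = c * x + d * y
    return new_x, new_y

# Hypothetical transformation matrix: scale x by 2 and y by 3
scale = [
    [2, 0],
    [0, 3],
]

original = (4, 5)
new = apply_transform(scale, original)
print(new)  # (8, 15)
```

Each row of the matrix produces one coordinate of the new position: the first row gives new_x and the second gives new_y.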


