
Transformation matrices

In the previous article, we saw how to compose multiple transformations by applying them in sequence:

```wgsl
const star = array<vec2f, 30>(
    // Outer star points
    vec2f(0.000000, -0.197140), vec2f(0.309011, -0.425297), vec2f(0.187467, -0.060929),
    vec2f(0.187467, -0.060929), vec2f(0.500000, 0.162443), vec2f(0.115866, 0.159500),
    vec2f(0.115866, 0.159500), vec2f(0.000000, 0.525707), vec2f(-0.115866, 0.159500),
    vec2f(-0.115866, 0.159500), vec2f(-0.500000, 0.162443), vec2f(-0.187467, -0.060929),
    vec2f(-0.187467, -0.060929), vec2f(-0.309011, -0.425297), vec2f(0.000000, -0.197140),
    // Inner star points
    vec2f(0.000000, -0.197140), vec2f(0.000000, 0.000000), vec2f(0.187467, -0.060929),
    vec2f(0.187467, -0.060929), vec2f(0.000000, 0.000000), vec2f(0.115866, 0.159500),
    vec2f(0.115866, 0.159500), vec2f(0.000000, 0.000000), vec2f(-0.115866, 0.159500),
    vec2f(-0.115866, 0.159500), vec2f(0.000000, 0.000000), vec2f(-0.187467, -0.060929),
    vec2f(-0.187467, -0.060929), vec2f(0.000000, 0.000000), vec2f(0.000000, -0.197140)
);

@vertex
fn vs(@builtin(vertex_index) index: u32) -> @builtin(position) vec4f {
    // Step 1: Scale
    let sX = 0.5;
    let sY = 0.5;
    var transformed = vec2f(
        star[index].x * sX,
        star[index].y * sY
    );

    // Step 2: Rotation
    let theta = radians(45.0);
    let cosT = cos(theta);
    let sinT = sin(theta);
    transformed = vec2f(
        transformed.x * cosT - transformed.y * sinT,
        transformed.x * sinT + transformed.y * cosT
    );

    // Step 3: Translation
    let tX = 0.2;
    let tY = -0.1;
    transformed += vec2f(tX, tY);

    return vec4f(transformed, 0.0, 1.0);
}
```

The implementation, however, is not very flexible because it is limited to the composition of scale, rotation, and translation. If we wanted to apply different transformations, or change their order, we would be forced to modify the source code each time.

In this article, we will define geometric transformations using matrices, a tool from linear algebra. Because matrices represent both transformations and their compositions uniformly, they make the code generic and reusable for any sequence of transformations.

The final code is available in the `04-matrix` branch of the GitHub repository.

## Matrices

A rectangular matrix $M = (m_{ij})$ of dimensions $m \times n$ is a "table" of real numbers:

$$M = \begin{bmatrix} m_{11} & m_{12} & \cdots & m_{1n} \\ m_{21} & m_{22} & \cdots & m_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ m_{m1} & m_{m2} & \cdots & m_{mn} \end{bmatrix}$$

where $m$ is the number of rows, $n$ the number of columns, and $m_{ij}$ are the elements of the matrix.

A matrix of dimensions $1 \times n$ is called a row matrix:

$$\begin{bmatrix} m_{11} & m_{12} & \cdots & m_{1n} \end{bmatrix}$$

while a matrix of dimensions $m \times 1$ is called a column matrix:

$$\begin{bmatrix} m_{11} \\ m_{21} \\ \vdots \\ m_{m1} \end{bmatrix}$$

Finally, a matrix of dimensions $n \times n$ is called a square matrix of order $n$:

$$\begin{bmatrix} m_{11} & m_{12} & m_{13} & \cdots & m_{1n} \\ m_{21} & m_{22} & m_{23} & \cdots & m_{2n} \\ m_{31} & m_{32} & m_{33} & \cdots & m_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ m_{n1} & m_{n2} & m_{n3} & \cdots & m_{nn} \end{bmatrix}$$

Below are shown, in order, a $2 \times 3$ rectangular matrix, a square matrix of order $2$, and a row matrix:

$$\begin{bmatrix} \pi & -e & 1 \\ 0 & 5 & -\frac{1}{2} \end{bmatrix} , \quad \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} , \quad \begin{bmatrix} -1 & 0 & 5 \end{bmatrix}$$

## Matrix product

Given a matrix $A = (a_{ij})$ of dimensions $m \times n$ and a matrix $B = (b_{ij})$ of dimensions $n \times p$, the product of $A$ and $B$ is the matrix $C = (c_{ij})$ of dimensions $m \times p$ defined as:

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$

In other words, the element $c_{ij}$ is obtained by combining the $i$-th row of matrix $A$ with the $j$-th column of matrix $B$. For example:

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix}$$

*Product of two 2×2 matrices.*

where the individual elements are:

$$\begin{align*} c_{11} &= a_{11} b_{11} + a_{12} b_{21} \\ c_{12} &= a_{11} b_{12} + a_{12} b_{22} \\ c_{21} &= a_{21} b_{11} + a_{22} b_{21} \\ c_{22} &= a_{21} b_{12} + a_{22} b_{22} \end{align*}$$

The product is defined only between "compatible" matrices, that is, when the number of columns of the first matrix equals the number of rows of the second. The result has the number of rows of the first matrix and the number of columns of the second. A simple way to remember this rule:

$$[m \times n] \cdot [n \times p] = [m \times p]$$
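To make the row-by-column rule concrete, here is a minimal Python sketch (the helper name `matmul` is ours, not from the article) that multiplies two matrices stored as lists of rows:

```python
def matmul(A, B):
    """Product of an m×n matrix A and an n×p matrix B (lists of rows)."""
    m, n, p = len(A), len(B), len(B[0])
    assert len(A[0]) == n, "columns of A must equal rows of B"
    # c_ij combines the i-th row of A with the j-th column of B
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]        # 2×3
B = [[1, 0],
     [0, 1],
     [1, 1]]           # 3×2
print(matmul(A, B))    # a 2×2 result: [[4, 5], [10, 11]]
```

Note how the dimension rule shows up directly in the code: the inner sum ranges over `n`, which must match between the two operands.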

## Transposition

Given a matrix $M = (m_{ij})$ of dimensions $m \times n$, we call the matrix $M^T = (m_{ji})$ of dimensions $n \times m$, obtained by swapping rows with columns, the transpose of $M$. For example:

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}^T = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}$$

The transpose of a product of two matrices equals the product of the individual transposes in reverse order:

$$(A \cdot B)^T = B^T \cdot A^T$$
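The identity $(A \cdot B)^T = B^T \cdot A^T$ is easy to check numerically; a quick Python sketch (helper names are ours):

```python
def matmul(A, B):
    # Row-by-column product of two matrices stored as lists of rows
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(M):
    # Swap rows with columns
    return [list(col) for col in zip(*M)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 6]]
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True
```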

A matrix $M$ of dimensions $n \times n$ is called symmetric if $M = M^T$, that is, if it coincides with its transpose:

$$\begin{bmatrix} 3 & 4 \\ 4 & 5 \end{bmatrix}^T = \begin{bmatrix} 3 & 4 \\ 4 & 5 \end{bmatrix}$$

The elements are symmetric with respect to the main diagonal ($\searrow$).

## Identity matrix

The identity matrix is a square matrix in which all elements of the main diagonal equal $1$, while the others are $0$:

$$I_n = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}$$

The identity matrix is the neutral element with respect to the product operation, that is:

$$A \cdot I_n = I_n \cdot A = A$$

where $A$ is a square matrix of order $n$.

## Inverse matrix

A square matrix $M$ of order $n$ is said to be invertible if there exists a square matrix $M^{-1}$ of order $n$ such that:

$$M^{-1} \cdot M = M \cdot M^{-1} = I_n$$

In that case, $M^{-1}$ is called the inverse of $M$. A non-invertible matrix is also called singular.
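For order $2$, the inverse has a well-known closed form, $M^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$ (standard linear algebra, not derived in this article). A Python sketch that verifies $M^{-1} \cdot M = I_2$:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse2x2(M):
    # Closed-form inverse of a 2×2 matrix; fails when the determinant is zero
    (a, b), (c, d) = M
    det = a * d - b * c
    assert det != 0, "singular matrix: no inverse exists"
    return [[ d / det, -b / det],
            [-c / det,  a / det]]

M = [[3.0, 4.0],
     [2.0, 3.0]]                 # det = 3·3 - 4·2 = 1
print(matmul(inverse2x2(M), M))  # [[1.0, 0.0], [0.0, 1.0]]
```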

## Point representation

In the following sections, we will use matrices to define transformation functions. For these transformations, expressed in matrix form, to act on points, we need to represent the points themselves in a format compatible with matrix operations.

A point is represented by its coordinates, which we can denote using a vector, a row matrix, or a column matrix:

$$(x, y) , \quad \begin{bmatrix} x & y \end{bmatrix} , \quad \begin{bmatrix} x \\ y \end{bmatrix}$$

All three notations denote the same concept and differ only in form, so we will treat them as equivalent.

## Linear transformations

A linear transformation is a function of the form:

$$f(x, y) = (a x + b y, \; c x + d y)$$

where $a, b, c, d$ are real coefficients. The name derives from the fact that the transformed coordinates are a linear combination of the original point's coordinates. A linear transformation can be expressed as the product of a square matrix of order $2$ and a column matrix:

$$f(x, y) = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Alternatively, it can be defined as the product of a row matrix and a square matrix of order $2$:

$$f(x, y) = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & c \\ b & d \end{bmatrix}$$

In the following sections, we will describe points using column matrices, so we will use the first form.
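In code, applying the matrix form to a point is a direct transcription of the definition; a Python sketch (the helper name `apply` is ours):

```python
def apply(M, p):
    """Compute M·p with M a 2×2 matrix (list of rows) and p a point (x, y)."""
    (a, b), (c, d) = M
    x, y = p
    return (a * x + b * y, c * x + d * y)

M = [[2, 1],
     [0, 3]]               # a=2, b=1, c=0, d=3
print(apply(M, (1, 1)))    # (2·1 + 1·1, 0·1 + 3·1) = (3, 3)
```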

Linear transformations preserve parallelism between lines, but not angles and areas: parallel lines remain parallel after the transformation, but angles formed by pairs of lines may change and geometric figures may undergo a variation in area.

*A square transformed by scale, shear, rotation, and translation. Parallelism between the sides is preserved, but not the angles or the area. The dashed figure is the original.*

In other words, if we take a parallelogram, after a linear transformation it will remain a parallelogram, but it might no longer be a rectangle or a square. The directions of lines are modified, and distances are rescaled in a non-uniform way, causing angle distortion and area variation.

## Scale

The scale transformation

$$f_S(x, y) = (x \cdot s_x, \; y \cdot s_y)$$

can be written as:

$$f_S(x, y) = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Performing the matrix product yields:

$$\begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} s_x \cdot x + 0 \cdot y \\ 0 \cdot x + s_y \cdot y \end{bmatrix} = \begin{bmatrix} s_x \cdot x \\ s_y \cdot y \end{bmatrix}$$

## Rotation

The rotation

$$f_R(x, y) = (x \cdot \cos (\theta) - y \cdot \sin (\theta), \; x \cdot \sin (\theta) + y \cdot \cos (\theta))$$

can be written as:

$$f_R(x, y) = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Performing the matrix product yields:

$$\begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \cos(\theta) \cdot x - \sin(\theta) \cdot y \\ \sin(\theta) \cdot x + \cos(\theta) \cdot y \end{bmatrix}$$
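As a sanity check, rotating the point $(1, 0)$ by $90°$ should yield (up to floating-point error) $(0, 1)$; a Python sketch, with helper names that are ours:

```python
import math

def rotation_matrix(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s,  c]]

def apply(M, p):
    # M·p with M a 2×2 matrix (list of rows) and p a point (x, y)
    (a, b), (c, d) = M
    x, y = p
    return (a * x + b * y, c * x + d * y)

x, y = apply(rotation_matrix(math.radians(90)), (1.0, 0.0))
print(round(x, 9), round(y, 9))  # 0.0 1.0
```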

## Shear

The shear transformation

$$f_{SH}(x, y) = (x + sh_x \cdot y, \; y + sh_y \cdot x)$$

can be written as:

$$f_{SH}(x, y) = \begin{bmatrix} 1 & sh_x \\ sh_y & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Performing the matrix product yields:

$$\begin{bmatrix} 1 & sh_x \\ sh_y & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \cdot x + sh_x \cdot y \\ sh_y \cdot x + 1 \cdot y \end{bmatrix} = \begin{bmatrix} x + sh_x \cdot y \\ y + sh_y \cdot x \end{bmatrix}$$
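A horizontal shear ($sh_x = 0.5$, $sh_y = 0$) displaces a point proportionally to its $y$ coordinate; a Python sketch with illustrative values of our choosing:

```python
def shear_matrix(shx, shy):
    return [[1, shx],
            [shy, 1]]

def apply(M, p):
    (a, b), (c, d) = M
    x, y = p
    return (a * x + b * y, c * x + d * y)

# A point at height y = 1 is pushed 0.5 to the right; its y is unchanged
print(apply(shear_matrix(0.5, 0.0), (0.0, 1.0)))  # (0.5, 1.0)
```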

## Matrices in WGSL

The WGSL language supports matrices of dimensions between $2 \times 2$ and $4 \times 4$ with the primitive types `matCxRf`, where $C$ is the number of columns and $R$ the number of rows. The notation is reversed compared to mathematical convention, which indicates the number of rows first and then the number of columns. For example, the type `mat2x4f` represents a $4 \times 2$ matrix (four rows and two columns):

$$\begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \\ m_{31} & m_{32} \\ m_{41} & m_{42} \end{bmatrix}$$

The binary encoding of `matCxRf` types is column-major: this means that `matCxRf` is structured in memory as an array of $C$ vectors of length $R$. For example, `mat2x4f` is conceptually equivalent to `array<vec4f, 2>`:

$$[(m_{11}, m_{21}, m_{31}, m_{41}), \, (m_{12}, m_{22}, m_{32}, m_{42})]$$

Care must be taken not to confuse the mathematical notation of matrices, which is row-major, with the binary representation in WGSL, which is column-major. To better understand this concept, let's examine a non-symmetric matrix, such as the shear matrix:

$$shearMatrix(sh_x, sh_y) = \begin{bmatrix} 1 & sh_x \\ sh_y & 1 \end{bmatrix}$$

The `mat2x2f` constructor follows the column-major representation and therefore accepts elements by column, not by row:

```wgsl
fn shearMatrix(shx: f32, shy: f32) -> mat2x2f {
    return mat2x2f(1, shy, shx, 1);
}

fn shearMatrixTransposed(shx: f32, shy: f32) -> mat2x2f {
    return mat2x2f(
        1, shx,
        shy, 1
    );
}
```

The indentation of the `shearMatrixTransposed` function visually mirrors the mathematical notation, which is row-major, but the result in memory will be the transposed matrix:

$$shearMatrixTransposed(sh_x, sh_y) = \begin{bmatrix} 1 & sh_y \\ sh_x & 1 \end{bmatrix}$$

If we had considered a symmetric matrix, there would have been no difference, because, by definition, the transpose of a symmetric matrix coincides with the matrix itself.
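The difference between the two constructors comes down to flattening order. As an illustration (in Python rather than WGSL), flattening the row-major math notation in column-major order reproduces the argument order that `mat2x2f` expects:

```python
# Shear matrix in row-major math notation:
#   [ 1    shx ]
#   [ shy  1   ]
rows = [[1, "shx"],
        ["shy", 1]]

# Column-major flattening: first column top-to-bottom, then second column
column_major = [rows[r][c] for c in range(2) for r in range(2)]
print(column_major)  # [1, 'shy', 'shx', 1], matching mat2x2f(1, shy, shx, 1)
```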

## Product in WGSL

The product in WGSL is defined both between matrices and between matrices and vectors. We can update our example shader by replacing the rotation and scale transformations with matrices:

```wgsl
fn scaleMatrix(sX: f32, sY: f32) -> mat2x2f {
    return mat2x2f(
        sX, 0,
        0, sY
    );
}

fn rotationMatrix(theta: f32) -> mat2x2f {
    let cosT = cos(theta);
    let sinT = sin(theta);
    return mat2x2f(
        cosT, sinT,
        -sinT, cosT
    );
}

@vertex
fn vs(@builtin(vertex_index) index: u32) -> @builtin(position) vec4f {
    // Step 1: Scale
    var transformed = scaleMatrix(0.5, 0.5) * star[index];

    // Step 2: Rotation
    transformed = rotationMatrix(radians(45.0)) * transformed;

    // Step 3: Translation
    let tX = 0.2;
    let tY = -0.1;
    transformed += vec2f(tX, tY);

    return vec4f(transformed, 0.0, 1.0);
}
```

We can already notice a first advantage of using matrices: applying two different transformations, scale and rotation, comes down to the same operation in both cases, a matrix product.

The `scaleMatrix` and `rotationMatrix` functions build the matrices to apply in the form $f(x, y) = M \cdot p$, where $p$ is a column vector. The shader code follows this convention and applies the products in the form `M * p` rather than `p * M`.

In general, in WGSL vectors are considered neither rows nor columns; however, both expressions `M * p` and `p * M` are valid. For the product to always be well-defined, vectors are interpreted as rows or columns depending on the side on which they appear in the product with a matrix. The two expressions are computed as follows:

$$\begin{align*} M \cdot p &= \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} ax + by \\ cx + dy \end{bmatrix} \\ p \cdot M &= \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} ax + cy & bx + dy \end{bmatrix} \end{align*}$$

The two results do not differ merely in being a row vector rather than a column vector (irrelevant in WGSL), but the elements themselves are different! For this reason, it is essential to establish a convention and always apply it consistently.
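The difference is easy to see numerically; a Python sketch computing both products for the same matrix and vector (helper names are ours):

```python
def mat_vec(M, p):
    # M · p, with p interpreted as a column vector
    return [sum(M[i][k] * p[k] for k in range(2)) for i in range(2)]

def vec_mat(p, M):
    # p · M, with p interpreted as a row vector
    return [sum(p[k] * M[k][j] for k in range(2)) for j in range(2)]

M = [[1, 2],
     [3, 4]]
p = [1, 1]
print(mat_vec(M, p))  # [3, 7]  ->  (a·x + b·y, c·x + d·y)
print(vec_mat(p, M))  # [4, 6]  ->  (a·x + c·y, b·x + d·y)
```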

As in the code example shown, in the following sections we will always use matrices in the form `M * p`, paying attention to the column-major layout in the `matCxRf` constructors.

## Composing transformations

The updated shader version implements the formula:

$$R \cdot (S \cdot p) + t$$

where $R$ is the rotation matrix, $S$ is the scale matrix, $p$ is the vertex to transform, and $t$ is the translation vector. The logical sequence of steps is:

  1. The point $p$ is scaled via matrix product with $S$, obtaining a first intermediate point.
  2. The computed point is rotated via matrix product with $R$, obtaining a second intermediate point.
  3. Finally, the second intermediate point is translated via vector addition with $t$, obtaining the final result.

In particular, the transformations $R$ and $S$ are performed in two separate steps; let's now see how to unify them. In general, the matrix product is associative, that is:

$$A \cdot B \cdot C = A \cdot (B \cdot C) = (A \cdot B) \cdot C$$

Leveraging this property in the shader formula ($p$ is a column matrix), we can write:

$$(R \cdot S) \cdot p + t$$

Setting $W = R \cdot S$, we can simplify to:

$$W \cdot p + t$$

The result is significant: the single matrix $W$ is equivalent to applying $S$ and then $R$ in sequence, that is, it represents their composition. At this point, we can further simplify the shader code:

```wgsl
fn scaleMatrix(sX: f32, sY: f32) -> mat2x2f {
    return mat2x2f(
        sX, 0,
        0, sY
    );
}

fn rotationMatrix(theta: f32) -> mat2x2f {
    let cosT = cos(theta);
    let sinT = sin(theta);
    return mat2x2f(
        cosT, sinT,
        -sinT, cosT
    );
}

@vertex
fn vs(@builtin(vertex_index) index: u32) -> @builtin(position) vec4f {
    let worldTransform = rotationMatrix(radians(45.0)) * scaleMatrix(0.5, 0.5);

    // Step 1: Scale + Rotation
    var transformed = worldTransform * star[index];

    // Step 2: Translation
    let tX = 0.2;
    let tY = -0.1;
    transformed += vec2f(tX, tY);

    return vec4f(transformed, 0.0, 1.0);
}
```

The advantage of this approach may seem limited in our example, since the overall number of operations does not change. However, in real cases, `worldTransform` is usually a shader parameter and, consequently, is not recomputed for each vertex.

Generalizing, suppose we want to apply, in order, a series of linear transformations $M_1, M_2, \ldots, M_n$ to a point $p$:

$$M_n \cdot \ldots \cdot (M_3 \cdot (M_2 \cdot (M_1 \cdot p)))$$

Repeatedly leveraging the associative property, we can represent the composition through a single matrix $W = M_n \cdot M_{n-1} \cdot \ldots \cdot M_1$ and use it directly:

$$W \cdot p$$

By doing so, the single matrix $W$ encodes the original sequence, abstracting the vertex shader from the type, number, and order of transformations to apply. 😮😱
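We can verify numerically that the pre-multiplied matrix $W$ gives the same result as applying the transformations one by one; a Python sketch mirroring the shader's scale-then-rotate sequence (helper names are ours):

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]

def mat_vec(M, p):
    return [sum(M[i][k] * p[k] for k in range(2)) for i in range(2)]

S = [[0.5, 0.0],
     [0.0, 0.5]]                               # scale
theta = math.radians(45)
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]      # rotation

p = [0.3, 0.7]
step_by_step = mat_vec(R, mat_vec(S, p))       # R · (S · p)
W = matmul(R, S)                               # compose once...
combined = mat_vec(W, p)                       # ...then apply: (R · S) · p
print(all(abs(a - b) < 1e-12 for a, b in zip(step_by_step, combined)))  # True
```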

Unlike multiplication of real numbers, multiplication of square matrices is generally not commutative, that is, $A \cdot B \neq B \cdot A$. The order in which transformations are performed matters, as we saw intuitively in the previous article. It is important to note that the order in which the matrices appear in the definition of $W$ is the reverse of the order in which the transformations are "applied": the product $M_n \cdot M_{n-1} \cdot \ldots \cdot M_1$ is equivalent to transforming first with $M_1$, then with $M_2$, and so on.
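A quick Python check of non-commutativity, using a non-uniform scale and a shear (the specific matrices are ours, chosen for illustration):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]

S = [[2, 0],
     [0, 1]]            # non-uniform scale along x
SH = [[1, 1],
      [0, 1]]           # horizontal shear

print(matmul(S, SH))    # [[2, 2], [0, 1]]  (shear first, then scale)
print(matmul(SH, S))    # [[2, 1], [0, 1]]  (scale first, then shear)
```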

## Conclusions

In this article, we introduced matrices and some of their properties. We saw that scale, rotation, and shear are linear transformations and can therefore be expressed in the form $M \cdot p$, where $M$ is a square matrix of order $2$.

So far, however, we have intentionally left translation aside. As we will see, it is not possible to represent a translation in the plane with square matrices of order $2$!

How to solve this problem? The answer in the next article, dedicated to projective spaces and homogeneous coordinates.